Sunday, October 08, 2006

people and web pages

Disclaimer 1 - Digressing from usual topics
Disclaimer 2 - One of my Sunday night (impending Monday morning) insanities...

I was just thinking about web pages and wondering whether the same equations come into play when assigning authority to a page as when we assign authority to a person in our society. Let's think about it:
1) Sphere of influence - People who have a larger sphere of influence (a successful social worker like Mother Teresa, or a successful entrepreneur like the Ambanis) are the people who are weighted higher. You know them because they matter to a lot of people. Similarly, web pages that are linked to from lots of other pages are perceived to be important by the PageRank algorithm.

2) "Known to be good" - At some point these people earned a name, and the brand name continues - especially true in India, where a well-known family is always a well-known family. It is easy to know who they are because we have always known who they are (example - we always knew Princess Diana). Similarly, we have known good web pages - Wikipedia, etc.

3) Junk - Though every person is useful to at least one person (dude - himself, dudette - herself), these are the people whom everyone else detests - they could be thieves, robbers or beggars (believe it or not, in today's world poverty is a sin). While it is easy to identify the last category, it is difficult to identify the conmen. And yes, this category of people maps to our junk pages - some of them harmful, some of them useless. Some are easily identifiable - some you will never identify.

4) Middle class - This is my kinda category - these are the people who are not exactly junk (they believe so) and definitely not the most influential people in society (and they know so). They might or might not be important to the small set of people who know them. What elevates them from junk, we don't know - is it the fact that they are good (2 am friends), or is it a larger sphere of influence (big gangs), or is it some latent attribute we are unaware of? Are they still junk? We don't know that either. And that is the category where most web pages fall - middle class.

And therein lies the challenge of a good ranking algorithm - how do we order all these middle class pages... tough job if you ask me.
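
For the curious, the "sphere of influence" idea from point 1 can be sketched in a few lines of Python. This is only the simplified textbook form of PageRank (power iteration with a damping factor), not whatever a real search engine runs, and the tiny "society" of pages below (celebrity, friend_a, friend_b, loner) is entirely made up for illustration. The damping value of 0.85 and the 50 iterations are common textbook choices, assumed here rather than taken from anywhere in this post.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    damping and iterations are assumed textbook defaults.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with everyone equally important.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share, linked to or not.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its authority among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its authority evenly.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical mini web: everybody links to the celebrity.
links = {
    "celebrity": ["friend_a"],
    "friend_a": ["celebrity"],
    "friend_b": ["celebrity", "friend_a"],
    "loner": ["celebrity"],
}

ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

The page everyone points to comes out on top, and the pages nobody links to (friend_b and loner) end up with only the baseline share. That handles categories 1 and 3 of this toy example nicely; the middle class is where the scores bunch together and the ordering gets hard.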


Anonymous said...

Hmm, is that a complete set? I mean does the union represent a Universal set?

Ankur Gupta said...

It would be a universal set, since the definition of the fourth category is everything that has not been covered by categories 1, 2 and 3.

But one important category that has been left out in the domain of this mapping is the set of spam pages.

What is the set of people that would form the co-domain of the set of spam pages?
Maybe people like me, whom no one is interested in listening to, but the poor people don't have a choice. :P