views:

167

answers:

3

I've looked through a number of resources from similar questions asked on this site. The most helpful so far have been this discussion and the resources linked here: PageRank Explained.

While this provides a detailed overview, I'm looking for something a bit more specific. I realize there are other factors in play, and there have been multiple changes to the algorithm since its inception, but a good indication of the value passed by each link is this: PageRank divided by the total number of pages linked. So if a site (page) has a PR of 8 and links to 20 sites, the value passed to each site is 8 / 20. At least that is what I am led to believe. I know that PageRank is a value between 1 and 10 on a logarithmic scale, meaning that going from PR 1 to PR 2 is significantly less difficult than going from PR 9 to PR 10. Here's where I am confused: how would one calculate the amount of PR transferred to each link? I'm simplifying things a great deal, but a PR 10 page with around 10 outbound links should still be passing more value per link than a PR 5 page with 2 outbound links, whereas simple division gives 1 versus 2.5. What is the best way to understand the proper math behind this at a simple level?

+1  A: 

First, it's worth noting that PageRank as currently implemented is far different from the original idea in the paper, and because it changes all the time, even the other information in that SO question isn't entirely reliable. But I imagine the fundamentals are similar.

I think the PageRank is divided before conversion to the logarithmic scale, so if a page has a (logarithmic) PageRank of P and n > 0 outbound links, the PR transferred to each linked page would be (somewhat less than, because of the decay factor) P - log_10 n. So with 10 links the PR passed would drop by 1, with 100 links it would drop by 2, and so on. Of course, if n is 0 then no PageRank is given to other pages; it's just wasted.
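
To make that concrete, here is a rough sketch of the arithmetic. It assumes the displayed PR is simply log base 10 of some raw score and it ignores the decay factor entirely; the base, the function name and its parameters are my own assumptions for illustration, not anything Google has published.

```python
import math


def passed_log_pr(log_pr, n_links, base=10.0):
    """Log-scale value each outbound link would receive.

    Assumption: displayed PR = log_base(raw score). The decay factor is
    ignored, so the real figure would be somewhat lower.
    """
    if n_links <= 0:
        raise ValueError("a page with no outbound links passes nothing")
    raw = base ** log_pr          # undo the logarithmic scale
    share = raw / n_links         # raw score is split evenly across links
    return math.log(share, base)  # back to the logarithmic scale


# The two pages from the question: PR 10 with 10 links, PR 5 with 2 links.
big = passed_log_pr(10, 10)   # roughly 10 - log10(10)
small = passed_log_pr(5, 2)   # roughly 5 - log10(2)
print(big, small)
```

Under these assumptions the PR 10 page with 10 links still passes far more per link than the PR 5 page with 2 links, which matches the intuition in the question and is what dividing the logarithmic number directly gets wrong.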

Charles
True - PageRank is drastically different NOW. So you are saying the value given is different from what I've seen, where PR / n links is the value given to each, and instead the formula (granted this is a bit of guesswork) is PR - log_10(n links)?
Steve
I don't think the logarithmic PageRank was ever divided by the number of links, rather the raw PageRank was. The formula I gave does the right thing with the logarithmic PR.
Charles
Thanks for clarifying, Charles. I appreciate all input on the best way to handle this.
Steve
A: 

There's the book Google's PageRank and Beyond: The Science of Search Engine Rankings.

lhf
Had not seen this before, looks fantastic! Will take a look and hopefully draw some valuable insights from it. Thanks!
Steve
A: 

I think this is the best explanation of PageRank that I've read:

http://www.rose-hulman.edu/~bryan/googleFinalVersionFixed.pdf

At its heart it's an eigenvalue problem.

The title's a bit dated, though. Google's market cap was $167.5B at the end of trading today - 6.7 times the value cited by the paper.

Toby Segaran's "Programming Collective Intelligence" also discusses PageRank.
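
On the eigenvalue point above, a minimal power-iteration sketch may help. This follows the textbook formulation, not whatever Google runs today. The function name, the damping value of 0.85 and the choice to spread the score of pages with no outbound links evenly over all pages are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate raw PageRank by repeated redistribution of scores.

    links: dict mapping each page to the list of pages it links to.
    Assumptions: damping factor 0.85, and pages with no outbound links
    spread their score evenly over every page.
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Raw score is divided evenly among outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # No outbound links: spread the score over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: a links to b and c, b links to c, c links to a.
ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
print(ranks)
```

The scores this converges to are the entries of the dominant eigenvector of the link matrix, which is the sense in which the whole thing is an eigenvalue problem. These are raw scores that sum to 1, not values on the 1 to 10 logarithmic scale.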

duffymo