Google PageRank Calculation Source Code

Wed 13 April 2005


Author: 月光 | Search Engines

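Lately I have been quite interested in Google's PageRank, and I have long wanted to know how to obtain a page's PR value without using the Google Toolbar. After a lot of searching, I found the following code: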

    <?php
    /**
        This code is released unto the public domain
    */
    //header("Content-Type: text/plain; charset=utf-8");
    define('GOOGLE_MAGIC', 0xE6359A60);

    // NOTE: this code relies on 32-bit integer arithmetic; on 64-bit PHP
    // builds the results may differ unless values are masked to 32 bits.

    // unsigned shift right: shift $a right by $b bits, filling with zeros
    function zeroFill($a, $b)
    {
        $z = 0x80000000; // sign bit of a 32-bit integer
        if ($z & $a)
        {
            // negative value: shift once, clear the sign bit, restore bit 30
            $a = ($a>>1);
            $a &= (~$z);
            $a |= 0x40000000;
            $a = ($a>>($b-1));
        }
        else
        {
            $a = ($a>>$b);
        }
        return $a;
    }

    // mix three 32-bit values; returns the mixed values as array($a, $b, $c)
    function mix($a,$b,$c) {
      $a -= $b; $a -= $c; $a ^= (zeroFill($c,13));
      $b -= $c; $b -= $a; $b ^= ($a<<8);
      $c -= $a; $c -= $b; $c ^= (zeroFill($b,13));
      $a -= $b; $a -= $c; $a ^= (zeroFill($c,12));
      $b -= $c; $b -= $a; $b ^= ($a<<16);
      $c -= $a; $c -= $b; $c ^= (zeroFill($b,5));
      $a -= $b; $a -= $c; $a ^= (zeroFill($c,3));
      $b -= $c; $b -= $a; $b ^= ($a<<10);
      $c -= $a; $c -= $b; $c ^= (zeroFill($b,15));

      return array($a,$b,$c);
    }

    // compute the checksum of $url, given as an array of byte values
    function GoogleCH($url, $length=null, $init=GOOGLE_MAGIC) {
        if(is_null($length)) {
            $length = count($url);
        }
        $a = $b = 0x9E3779B9;
        $c = $init;
        $k = 0;
        $len = $length;
        // consume the input 12 bytes at a time, as three little-endian words
        while($len >= 12) {
            $a += ($url[$k+0] +($url[$k+1]<<8) +($url[$k+2]<<16) +($url[$k+3]<<24));
            $b += ($url[$k+4] +($url[$k+5]<<8) +($url[$k+6]<<16) +($url[$k+7]<<24));
            $c += ($url[$k+8] +($url[$k+9]<<8) +($url[$k+10]<<16)+($url[$k+11]<<24));
            $mix = mix($a,$b,$c);
            $a = $mix[0]; $b = $mix[1]; $c = $mix[2];
            $k += 12;
            $len -= 12;
        }

        $c += $length;
        switch($len)              /* all the case statements fall through */
        {
            case 11: $c+=($url[$k+10]<<24);
            case 10: $c+=($url[$k+9]<<16);
            case 9 : $c+=($url[$k+8]<<8);
              /* the first byte of c is reserved for the length */
            case 8 : $b+=($url[$k+7]<<24);
            case 7 : $b+=($url[$k+6]<<16);
            case 6 : $b+=($url[$k+5]<<8);
            case 5 : $b+=($url[$k+4]);
            case 4 : $a+=($url[$k+3]<<24);
            case 3 : $a+=($url[$k+2]<<16);
            case 2 : $a+=($url[$k+1]<<8);
            case 1 : $a+=($url[$k+0]);
             /* case 0: nothing left to add */
        }
        $mix = mix($a,$b,$c);
        /*-------------------------------------------- report the result */
        return $mix[2];
    }

    //converts a string into an array of integers containing the numeric value of the char
    function strord($string) {
        $result = array();
        for($i=0;$i<strlen($string);$i++) {
            $result[$i] = ord($string[$i]);
        }
        return $result;
    }

    // http://www.example.com/ - Checksum: 6540747202
    $url = 'info:'.$_GET['url'];
    $ch = GoogleCH(strord($url));
    $url = 'info:'.urlencode($_GET['url']);
    echo file_get_contents("http://www.google.com/search?client=navclient-auto&ch=6$ch&ie=UTF-8&oe=UTF-8&features=Rank&q=$url");
    /* use curl to send the user agent
    $curl = curl_init("http://www.google.com/search?client=navclient-auto&ch=6$ch&ie=UTF-8&oe=UTF-8&features=Rank&q=$url");
    curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; GoogleToolbar 2.0.110-big; Windows 2000 5.0)");
    curl_exec($curl);
    */
    ?>
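The `zeroFill` helper in the listing emulates an unsigned right shift on signed 32-bit integers. As a rough sketch only, assuming a 64-bit PHP build, the same idea could be expressed by masking values to their low 32 bits before shifting. The names `wrap32`, `urshift32` and `mixStep` are hypothetical and do not appear in the code I found; this is an illustration of the technique, not a drop-in replacement for the script.

```php
<?php
// Hypothetical helpers, not part of the original script.
// Assumption: a 64-bit PHP build, where integers do not wrap at 32 bits.

// Keep only the low 32 bits, emulating unsigned 32-bit wrap-around.
function wrap32(int $x): int
{
    return $x & 0xFFFFFFFF;
}

// Unsigned (zero-fill) right shift of a 32-bit value by $bits positions.
function urshift32(int $x, int $bits): int
{
    return wrap32($x) >> $bits;
}

// The first line of mix(), rewritten with the helpers above:
//   $a -= $b; $a -= $c; $a ^= zeroFill($c, 13);
function mixStep(int $a, int $b, int $c): int
{
    $a = wrap32($a - $b - $c);
    return $a ^ urshift32($c, 13);
}

echo urshift32(-1, 28), "\n";          // 15
echo urshift32(0x80000000, 31), "\n";  // 1
```

Masking first means the sign bit never gets copied into the high bits during the shift, which is the effect `zeroFill` achieves with its special case for negative values.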





Category: 月光博客2005