Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

mysql - Implementing the Hacker News ranking algorithm in SQL

Here's how Paul Graham describes the ranking algorithm for Hacker News:

News.YC's is just

(p - 1) / (t + 2)^1.5

where p = points and t = age in hours
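
As a quick sanity check on the formula, here is a minimal Python sketch of it. The function name `hn_score` and the `gravity` parameter name are my own labels, not anything from Paul Graham's description; only the arithmetic comes from the quote above.

```python
def hn_score(points: int, age_hours: float, gravity: float = 1.5) -> float:
    """Score from the quoted formula: (p - 1) / (t + 2) ** gravity.

    `gravity` is a hypothetical name for the 1.5 exponent.
    """
    return (points - 1) / (age_hours + 2) ** gravity


# A 2-hour-old post with 5 points: (5 - 1) / (2 + 2) ** 1.5 = 4 / 8
print(hn_score(5, 2))
# The same 5 points after 14 hours: (5 - 1) / (14 + 2) ** 1.5 = 4 / 64
print(hn_score(5, 14))
```

The same number of points is worth an eighth as much after 14 hours as after 2, which is why the ordering has to be recomputed (or refreshed) as time passes rather than stored once.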

I'd like to do that in pure MySQL, given the following tables:

  • Table Posts with fields postID (index) and postTime (timestamp).
  • Table Votes with fields voteID (index), postID, and vote (integer, 0 or 1).

The idea of the vote field is that votes can be rescinded. For the purposes of the ranking, vote=0 is equivalent to no vote at all. (All votes are upvotes, no such thing as downvotes.)

The question is how to construct a query that returns the top N postIDs, sorted by Paul Graham's formula. There are approximately 100k posts altogether, so if you think caching the scores (or something similar) will be needed, I'd love to hear advice about that.

(Obviously this is not rocket science and I can certainly figure it out but I figured someone who eats SQL for breakfast, lunch, and dinner could just rattle it off. And it seems valuable to have available on StackOverflow.)


1 Reply


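Untested. Note that the inner join leaves out posts that have no rows in Votes at all:

  -- Total the votes per post (rescinded votes have vote = 0 and add nothing),
  -- then order by (p - 1) / (t + 2)^1.5, where t is the post's age in hours.
  SELECT x.*
    FROM Posts x
    JOIN (SELECT p.postID,
                 SUM(v.vote) AS points
            FROM Posts p
            JOIN Votes v ON v.postID = p.postID
        GROUP BY p.postID) y ON y.postID = x.postID
ORDER BY (y.points - 1)/POW(((UNIX_TIMESTAMP(NOW()) - UNIX_TIMESTAMP(x.postTime))/3600)+2, 1.5) DESC
   LIMIT n  -- replace n with the number of posts to return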
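
Since the query above is untested, one way to check the ranking logic on a few rows is to run the same shape of query against an in-memory SQLite database from Python. This is only a sketch under several assumptions: SQLite stands in for MySQL, postTime is stored as integer Unix seconds, the current time is passed in as a parameter instead of calling NOW(), and the helper names (`build_demo_db`, `top_posts`) and sample rows are made up for illustration. It also uses a LEFT JOIN with COALESCE, a variation on the answer's query, so that posts with no votes still get a (negative) score.

```python
import sqlite3

HOUR = 3600
NOW = 1_700_000_000  # arbitrary "current" Unix time for the demo


def build_demo_db(now):
    """Create an in-memory database holding three hypothetical posts."""
    conn = sqlite3.connect(":memory:")
    # Register POW explicitly, since not every SQLite build ships math functions.
    conn.create_function("POW", 2, lambda base, exp: base ** exp)
    conn.executescript("""
        CREATE TABLE Posts (postID INTEGER PRIMARY KEY, postTime INTEGER);
        CREATE TABLE Votes (voteID INTEGER PRIMARY KEY, postID INTEGER, vote INTEGER);
    """)
    conn.executemany("INSERT INTO Posts VALUES (?, ?)", [
        (1, now - 10 * HOUR),  # older post with many votes
        (2, now - 2 * HOUR),   # recent post with a few votes
        (3, now - 1 * HOUR),   # newest post, no votes yet
    ])
    # Nine votes for post 1; three votes plus one rescinded vote for post 2.
    votes = [(1, 1)] * 9 + [(2, 1)] * 3 + [(2, 0)]
    conn.executemany("INSERT INTO Votes (postID, vote) VALUES (?, ?)", votes)
    return conn


def top_posts(conn, now, n):
    """Return (postID, score) rows for the n highest-scoring posts at time `now`."""
    query = """
        SELECT p.postID,
               (COALESCE(SUM(v.vote), 0) - 1)
                 / POW((? - p.postTime) / 3600.0 + 2, 1.5) AS score
          FROM Posts p
          LEFT JOIN Votes v ON v.postID = p.postID
         GROUP BY p.postID, p.postTime
         ORDER BY score DESC
         LIMIT ?
    """
    return conn.execute(query, (now, n)).fetchall()


conn = build_demo_db(NOW)
for post_id, score in top_posts(conn, NOW, 2):
    print(post_id, round(score, 4))
```

With these sample rows, post 2 (3 points, 2 hours old) ranks ahead of post 1 (9 points, 10 hours old), and the rescinded vote on post 2 does not count toward its points.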
