Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
580 views
in Technique[技术] by (71.8m points)

sorting - Hot content algorithm / score with time decay

I have been reading + researching on algorithms and formulas to work out a score for my user submitted content to display currently hot / trending items higher up the list, however i'll admit i'm a little over my head here.

I'll give some background on what i'm after... users upload audio to my site, audios have several actions:

  • Played
  • Downloaded
  • Liked
  • Favorited

Ideally i want an algorithm where I can update an audios score each time a new activity is logged (played, download etc...), also a download action is worth more than a play, like more than a download and a favourite more than a like.

If possible i would like for audios older than 1 week to drop off quite sharply from the list to give newer content more of a chance of trending.

I have read about reddits algorithm which looked good, but i'm in over my head on how to tweak it to make use of my multiple variables, and to drop off older articles after around 7 days.

Some articles that we're interesting:

Any help is appreciated!

Paul

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Reddits old formula and a little drop off

Basically you can use Reddit's formula. Since your system only supports upvotes you could weight them, resulting in something like this:

def hotness(track)
    s = track.playedCount
    s = s + 2*track.downloadCount
    s = s + 3*track.likeCount
    s = s + 4*track.favCount
    baseScore = log(max(s,1))

    timeDiff = (now - track.uploaded).toWeeks

    if(timeDiff > 1)
        x = timeDiff - 1
        baseScore = baseScore * exp(-8*x*x)

    return baseScore

The factor exp(-8*x*x) will give you your desired drop off:

Exponential drop off

The basics behind

You can use any function that goes to zero faster than your score goes up. Since we use log on our score, even a linear function can get multiplied (as long as your score doesn't grow exponentially).

So all you need is a function that returns 1 as long as you don't want to modify the score, and drops afterwards. Our example above forms that function:

multiplier(x) = x > 1 ? exp(-8*x*x) : 1

You can vary the multiplier if you want less steep curves. varying multiplier

Example in C++

Lets say that the probability for a given track to be played in a given hour is 50%, download 10%, like 1% and favorite 0.1%. Then the following C++ program will give you an estimate for your scores behavior:

#include <iostream>
#include <fstream>
#include <random>
#include <ctime>
#include <cmath>

struct track{
    track() : uploadTime(0),playCount(0),downCount(0),likeCount(0),faveCount(0){}
    std::time_t uploadTime;    
    unsigned int playCount;
    unsigned int downCount;
    unsigned int likeCount;
    unsigned int faveCount;    
    void addPlay(unsigned int n = 1){ playCount += n;}
    void addDown(unsigned int n = 1){ downCount += n;}
    void addLike(unsigned int n = 1){ likeCount += n;}
    void addFave(unsigned int n = 1){ faveCount += n;}
    unsigned int baseScore(){
        return  playCount +
            2 * downCount +
            3 * likeCount +
            4 * faveCount;
    }
};

int main(){
    track test;
    const unsigned int dayLength = 24 * 3600;
    const unsigned int weekLength = dayLength * 7;    

    std::mt19937 gen(std::time(0));
    std::bernoulli_distribution playProb(0.5);
    std::bernoulli_distribution downProb(0.1);
    std::bernoulli_distribution likeProb(0.01);
    std::bernoulli_distribution faveProb(0.001);

    std::ofstream fakeRecord("fakeRecord.dat");
    std::ofstream fakeRecordDecay("fakeRecordDecay.dat");
    for(unsigned int i = 0; i < weekLength * 3; i += 3600){
        test.addPlay(playProb(gen));
        test.addDown(downProb(gen));
        test.addLike(likeProb(gen));
        test.addFave(faveProb(gen));    

        double baseScore = std::log(std::max<unsigned int>(1,test.baseScore()));
        double timePoint = static_cast<double>(i)/weekLength;        

        fakeRecord << timePoint << " " << baseScore << std::endl;
        if(timePoint > 1){
            double x = timePoint - 1;
            fakeRecordDecay << timePoint << " " << (baseScore * std::exp(-8*x*x)) << std::endl;
        }
        else
            fakeRecordDecay << timePoint << " " << baseScore << std::endl;
    }
    return 0;
}

Result:

Decay

This should be sufficient for you.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...