Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

javascript - How to successfully implement a Markov model for generating the next word of a sentence?

I am working on a JavaScript program that takes text and uses it to generate sentences that seem to make sense at first glance.

I'm implementing a Markov model.

For example, I have:

[{word:"hello", prob: 0.5}, {word: "world", prob: 0.25},...]

My model is much more complex and I'm not going to explain every detail.

What I want to know is: given the probability of each word occurring, how can one build the sentence generator in JavaScript?

What I currently have seems to do that, but on closer inspection it's just random.

What I tried was to compare the prob value of each word in my table with a randomly selected value from 0 to 1.

For example, I would pick

 randomValue = Math.Random().toFixed(2)

using toFixed to get values like 0.33 instead of 0.3455343.... I would then compare it with the prob value of every word and see if it matches.

Once it matches I pick that word.

What is the correct way to at least get words picked according to their probability, rather than what I did, which seems to be just random selection?

  asked by Nassims, translated from SO

1 Reply

I am not overly familiar with the Markov model, but I feel like I could lend a hand here, especially considering that there are no answers so far.

First, the code you provided:

randomValue = Math.Random().toFixed(2)

has a couple of issues.

The "R" in random should be lowercase, and toFixed(2) returns a string, not a number.

The correct version of that line is:

var randomValue = Number(Math.random().toFixed(2));
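To make the string-versus-number point concrete, here is a small sketch. The variable names asString and asNumber are only for illustration and are not from your code:

```javascript
// toFixed returns a string, so a strict comparison with a number never matches.
var asString = (0.3455343).toFixed(2);
var asNumber = Number(asString);

console.log(typeof asString);    // "string"
console.log(asString === 0.35);  // false: a string is never strictly equal to a number
console.log(asNumber === 0.35);  // true
```

Even after converting back to a number, checking a random value for an exact match against a prob value will rarely succeed, so exact matching is probably not what you want here.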

That being said, to pick the next word based purely on the highest probability, you wouldn't need to use that line of code anyway.

You'd do something like this:

var nextWordProbabilities = [{word:"hello", prob: 0.5}, {word: "world", prob: 0.25}];

nextWordProbabilities.sort(function(a, b){
  if(a.prob < b.prob)return 1;
  if(a.prob > b.prob)return -1;
  return 0;
});
var nextWord = nextWordProbabilities[0].word;

If you then want to add a little randomness, so that you don't always end up with exactly the highest-probability word but sometimes one that is close to it, you could add this after the previous code block:

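var TENDENCY_TOWARDS_MOST_PROBABLE_WORDS = .5;
// Walk the sorted list from most to least probable and stop at the first
// word that passes the coin flip. If none passes, nextWord keeps its
// current value (the most probable word).
for(var i = 0; i < nextWordProbabilities.length; i++){
    if(Math.random() < TENDENCY_TOWARDS_MOST_PROBABLE_WORDS){
        nextWord = nextWordProbabilities[i].word;
        break;
    }
}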
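If what you actually want is for each word to come up with a frequency proportional to its prob value, a different and fairly common technique is cumulative-sum (sometimes called roulette-wheel) sampling. The sketch below is only an illustration under some assumptions: pickWeighted and its optional rng parameter are hypothetical names, the "there" entry is made up so the probabilities sum to 1, and the list is assumed to be non-empty.

```javascript
// Hypothetical helper: returns one word, chosen with a chance proportional
// to its prob. The optional rng argument (default Math.random) is an
// assumption added so the function can be tested with fixed values.
function pickWeighted(candidates, rng) {
  rng = rng || Math.random;
  // Sum the weights so the probs do not have to add up to exactly 1.
  var total = 0;
  for (var i = 0; i < candidates.length; i++) {
    total += candidates[i].prob;
  }
  // Scale a random number from [0, 1) into [0, total).
  var threshold = rng() * total;
  // Walk the running sum until it passes the threshold.
  var runningSum = 0;
  for (var j = 0; j < candidates.length; j++) {
    runningSum += candidates[j].prob;
    if (threshold < runningSum) {
      return candidates[j].word;
    }
  }
  // Fallback for floating-point rounding: return the last entry.
  return candidates[candidates.length - 1].word;
}

var sampleProbabilities = [
  {word: "hello", prob: 0.5},
  {word: "world", prob: 0.25},
  {word: "there", prob: 0.25}  // made-up entry for illustration
];
var sampledWord = pickWeighted(sampleProbabilities);
```

With this approach no sorting is needed, and there is no exact matching against a rounded random value: over many calls "hello" should come back roughly half the time and the other two roughly a quarter of the time each.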

I'm also not sure how you're determining when to end a sentence.

If you're not just generating a set number of words in a row, it might be a good idea to end the sentence when even the most probable word isn't very probable, like so:

if(nextWordProbabilities[0].prob < .2){
    //end the sentence
}
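Putting the pieces together, a whole-sentence loop could look roughly like the sketch below. I don't know how your model is actually structured, so everything here is an assumption: the model object (each word mapped to a list of possible next words), the generateSentence function, and its minProb and maxWords parameters are all hypothetical.

```javascript
// Hypothetical model shape: each word maps to its possible next words.
var model = {
  hello: [{word: "world", prob: 0.9}, {word: "there", prob: 0.1}],
  world: [{word: "again", prob: 0.15}, {word: "hello", prob: 0.1}],
  there: []
};

// Builds a sentence by always taking the most probable next word. Stops when
// no candidate reaches minProb, or when maxWords words have been produced.
function generateSentence(model, startWord, minProb, maxWords) {
  var words = [startWord];
  var current = startWord;
  while (words.length < maxWords) {
    var known = Object.prototype.hasOwnProperty.call(model, current);
    // Copy the list so sorting does not reorder the model itself.
    var candidates = known ? model[current].slice() : [];
    candidates.sort(function (a, b) { return b.prob - a.prob; });
    if (candidates.length === 0 || candidates[0].prob < minProb) {
      break;  // end the sentence
    }
    current = candidates[0].word;
    words.push(current);
  }
  return words.join(" ");
}

var sentence = generateSentence(model, "hello", 0.2, 10);
console.log(sentence);  // "hello world"
```

The maxWords limit is there as a safety net, since a model with cycles (here "world" can lead back to "hello") could otherwise loop for a long time when the threshold is low.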

Hope this is helpful.
