Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
536 views
in Technique[技术] by (71.8m points)

mysql - how to check if a string looks randomized, or human generated and pronouncable?

For the purpose of identifying [possible] bot-generated usernames.

Suppose you have a username like "bilbomoothof" .. it may be nonsense, but it still contains pronouncable sounds and so appears human-generated.

I accept that it could have been randomly generated from a dictionary of syllables, or word parts, but let's assume for a moment that the bot in question is a bit rubbish.

  1. Suppose you have a username like "sdfgbhm342r3f", to a human this is clearly a random string. But can this be identified programatically?
  2. Are there any algorithms available (similar to Soundex, etc..) that can identify pronounceable sounds within a string like this?

Solutions applicable in PHP/MySQL most appreciated.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I guess you could think of something like that if you could restrict yourself to pronounceable sounds in english. For me (I am French), words like szczepan or wawrzyniec are unpronounceable and certainly have a certain randomness.

But they are actually Polish first names (meaning steven and lawrence)...


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...