Python's NLTK includes cmudict, which returns the phonemes of words it recognizes.
For example, 'see' -> [u'S', u'IY1']. For words it does not recognize, it raises an error.
For example, 'seasee' -> error.
import nltk
arpabet = nltk.corpus.cmudict.dict()
for word in ('s', 'see', 'sea', 'compute', 'comput', 'seesea'):
    try:
        print arpabet[word][0]
    except Exception as e:
        print e
#Output
[u'EH1', u'S']
[u'S', u'IY1']
[u'S', u'IY1']
[u'K', u'AH0', u'M', u'P', u'Y', u'UW1', u'T']
'comput'
'seesea'
Is there any module that doesn't have this limitation and can find or guess the phonemes of any real or made-up word?
If there is none, is there a way I can program it myself?
I am thinking about looping over increasingly long portions of the word.
For example, in 'seasee', the first iteration takes 's', the next takes 'se', the third takes 'sea', and so on, looking each one up in cmudict.
The problem is that I don't know how to tell which match is the right one to use.
For example, both 's' and 'sea' in 'seasee' return valid phonemes.
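One possible way to settle the ambiguity between 's' and 'sea' is to always prefer the longest chunk that matches, then continue from where that chunk ends. The sketch below is only an illustration of that idea, not a known-good solution. The function name `guess_phonemes` and the small `toy_lexicon` are made up for this example; `toy_lexicon` stands in for `nltk.corpus.cmudict.dict()` (same shape: word -> list of pronunciations) so the code runs without the corpus.

```python
def guess_phonemes(word, lexicon):
    """Greedy longest-match split of `word` into chunks found in `lexicon`.

    `lexicon` maps a lowercase string to a list of pronunciations, the same
    shape as nltk.corpus.cmudict.dict(). Returns a flat phoneme list, or
    None if some position has no matching chunk at all.
    """
    phonemes = []
    start = 0
    while start < len(word):
        # Try the longest remaining chunk first, shrinking until one matches.
        for end in range(len(word), start, -1):
            chunk = word[start:end]
            if chunk in lexicon:
                phonemes.extend(lexicon[chunk][0])  # first listed pronunciation
                start = end
                break
        else:
            return None  # no chunk starting at this position is in the lexicon
    return phonemes


# Hypothetical stand-in for cmudict, using entries shown in the output above.
toy_lexicon = {
    "s": [["EH1", "S"]],
    "see": [["S", "IY1"]],
    "sea": [["S", "IY1"]],
}

print(guess_phonemes("seasee", toy_lexicon))  # ['S', 'IY1', 'S', 'IY1']
```

Here 'seasee' is split as 'sea' + 'see' because 'sea' is the longest match at the start. Gluing dictionary pronunciations together is still only a guess at how a made-up word would sound.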
Work in progress:
import nltk
arpabet = nltk.corpus.cmudict.dict()
for word in ('s', 'see', 'sea', 'compute', 'comput', 'seesea', 'darfasasawwa'):
    try:
        phone = arpabet[word][0]
    except:
        # Word not in cmudict: look up each prefix of increasing length
        try:
            counter = 0
            for i in word:
                substring = word[0:1+counter]
                counter += 1
                try:
                    print substring, arpabet[substring][0]
                except Exception as e:
                    print e
        except Exception as e:
            print e
#Output
c [u'S', u'IY1']
co [u'K', u'OW1']
com [u'K', u'AA1', u'M']
comp [u'K', u'AA1', u'M', u'P']
compu [u'K', u'AA1', u'M', u'P', u'Y', u'UW0']
comput 'comput'
s [u'EH1', u'S']
se [u'S', u'AW2', u'TH', u'IY1', u'S', u'T']
see [u'S', u'IY1']
sees [u'S', u'IY1', u'Z']
seese [u'S', u'IY1', u'Z']
seesea 'seesea'
d [u'D', u'IY1']
da [u'D', u'AA1']
dar [u'D', u'AA1', u'R']
darf 'darf'
darfa 'darfa'
darfas 'darfas'
darfasa 'darfasa'
darfasas 'darfasas'
darfasasa 'darfasasa'
darfasasaw 'darfasasaw'
darfasasaww 'darfasasaww'
darfasasawwa 'darfasasawwa'
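The output shows that scanning prefixes alone is not enough: 'seese' matches, but what is left after it may not. One idea, sketched here under assumptions, is to search for a split of the whole word into the fewest dictionary chunks using dynamic programming. The name `fewest_chunks` and the `toy_lexicon` are hypothetical; the toy entries copy pronunciations from the output above and stand in for the real `arpabet` dictionary.

```python
def fewest_chunks(word, lexicon):
    """Split `word` into the fewest chunks that all appear in `lexicon`.

    Returns the list of chunks, or None if no complete split exists.
    """
    # best[i] holds the shortest chunk list covering word[:i], or None.
    best = [None] * (len(word) + 1)
    best[0] = []
    for end in range(1, len(word) + 1):
        for start in range(end):
            chunk = word[start:end]
            if best[start] is not None and chunk in lexicon:
                candidate = best[start] + [chunk]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[len(word)]


# Hypothetical stand-in for cmudict, using entries shown in the output above.
toy_lexicon = {
    "s": [["EH1", "S"]],
    "see": [["S", "IY1"]],
    "sees": [["S", "IY1", "Z"]],
    "seese": [["S", "IY1", "Z"]],
    "sea": [["S", "IY1"]],
}

chunks = fewest_chunks("seesea", toy_lexicon)
print(chunks)  # ['see', 'sea']

# Join the first pronunciation of each chunk into one phoneme list.
phonemes = [p for chunk in chunks for p in toy_lexicon[chunk][0]]
print(phonemes)  # ['S', 'IY1', 'S', 'IY1']
```

With this toy lexicon, taking the longest prefix 'seese' would leave 'a' with no match, while the search above finds 'see' + 'sea'. Words such as 'darfasasawwa' would still depend on single letters, which cmudict pronounces as letter names (for example 'd' -> D IY1), so the result may not sound natural.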
Asked by KubiK888; translated from Stack Overflow.