Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Pig Latin Translator

So, I have a basic Pig Latin translator that only works for one word.

def Translate(Phrase):
    Subscript = 0
    # Advance to the first vowel; stop at the end if the word has none.
    # (The tests must be combined with "and", or equivalently "not in":
    # joined with "or" the condition is always true and the loop runs past the end.)
    while Subscript < len(Phrase) and Phrase[Subscript] not in "aeiou":
        Subscript += 1
    # Move the leading consonants to the end and append "ay".
    return Phrase[Subscript:] + Phrase[:Subscript] + "ay"

Can someone please help me edit this translator so that it handles more than one word? Thank you.

1 Reply

Here's a Pig Latin dialect that takes into account how the words are pronounced:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re

sentences = ["Pig qoph an egg.",
             "Quiet European rhythms.",
             "My nth happy hour.",
             "Herb unit -- a dynasty heir."]
for sent in sentences:
    entsay = " ".join(["".join(map(to_piglatin, re.split(r"(\W+)", nonws)))
                       for nonws in sent.split()])
    print(u'"{}" → "{}"'.format(sent, entsay))

Output

"Pig qoph an egg." → "igpay ophqay anway eggway."
"Quiet European rhythms." → "ietquay uropeaneay ythmsrhay."
"My nth happy hour." → "ymay nthway appyhay hourway."
"Herb unit -- a dynasty heir." → "herbway itunay -- away ynastyday heirway."

Note:

  • the "-way" suffix is used for words that start with a vowel sound
  • "qu" in "quiet" is treated as a unit
  • "European" and "unit" start with a consonant sound
  • "y" in "rhythms" and "dynasty" is a vowel
  • "nth", "hour", "herb" and "heir" start with a vowel sound

where to_piglatin() is:

from nltk.corpus import cmudict # $ pip install nltk
# $ python -c "import nltk; nltk.download('cmudict')"

def to_piglatin(word, pronunciations=cmudict.dict()):
    word = word.lower() #NOTE: ignore Unicode casefold
    i = 0
    # find out whether the word starts with a vowel sound using
    # the pronunciations dictionary (vowel phonemes end in a stress digit)
    for syllables in pronunciations.get(word, []):
        for i, syl in enumerate(syllables):
            isvowel = syl[-1].isdigit()
            if isvowel:
                break
        else: # no vowels
            assert 0
        if i == 0: # starts with a vowel
            return word + "way"
        elif "y" in word: # allow 'y' as a vowel for known words
            return to_piglatin_naive(word, vowels="aeiouy", start=i)
        break # use only the first pronunciation
    return to_piglatin_naive(word, start=i)

def to_piglatin_naive(word, vowels="aeiou", start=0):
    word = word.lower()
    i = 0
    for i, c in enumerate(word[start:], start=start):
        if c in vowels:
            break
    else: # no vowel in the word
        i += 1
    return word[i:] + word[:i] + "w"*(i == 0) + "ay"*word.isalnum()
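
The vowel test in to_piglatin() relies on the convention that, in CMUdict-style transcriptions, vowel phonemes end in a stress digit while consonants do not. Here is a minimal sketch of that check in isolation. The entries in SAMPLE_PRONUNCIATIONS are hand-written for illustration and may not match the real dictionary exactly, and first_vowel_sound() is a hypothetical helper name, not part of nltk.

```python
# Hand-written entries in the CMUdict (ARPAbet) style, for illustration only;
# the real dictionary may list different or additional pronunciations.
SAMPLE_PRONUNCIATIONS = {
    "hour": ["AW1", "ER0"],
    "unit": ["Y", "UW1", "N", "AH0", "T"],
    "quiet": ["K", "W", "AY1", "AH0", "T"],
}


def first_vowel_sound(phonemes):
    """Return the index of the first vowel phoneme, or None if there is none.

    A vowel phoneme carries a trailing stress digit (0, 1 or 2);
    consonant phonemes have no digit.
    """
    for index, phoneme in enumerate(phonemes):
        if phoneme[-1].isdigit():
            return index
    return None


for word, phonemes in SAMPLE_PRONUNCIATIONS.items():
    # Index 0 means the word starts with a vowel sound.
    print(word, first_vowel_sound(phonemes))
```

With these sample entries, "hour" gives 0 (vowel sound first, despite the leading "h"), "unit" gives 1 (the leading "y" sound is a consonant) and "quiet" gives 2 (the "k" and "w" sounds of "qu" come before the vowel).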

To split the text into sentences and words, you could use the nltk tokenizers. The code can also be modified to respect letter case (uppercase/lowercase) and to handle contractions.
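
As a rough sketch of the letter-case idea, the snippet below translates a whole phrase and restores the capitalisation of each word afterwards. It is deliberately independent of the pronunciation dictionary: naive_piglatin(), match_case() and translate_text() are hypothetical names introduced here, and the spelling-based rule is a simplification, not the dialect shown above. Contractions are not handled; "don't" would be split at the apostrophe.

```python
import re


def naive_piglatin(word, vowels="aeiou"):
    """Spelling-based rule: move leading consonants to the end, add a suffix."""
    lower = word.lower()
    for i, c in enumerate(lower):
        if c in vowels:
            break
    else:
        i = len(lower)  # no vowel letter: keep the letters in place
    suffix = "way" if i == 0 else "ay"
    return lower[i:] + lower[:i] + suffix


def match_case(original, translated):
    """Copy the capitalisation pattern of the original word."""
    if original.isupper() and len(original) > 1:
        return translated.upper()
    if original[:1].isupper():
        return translated.capitalize()
    return translated


def translate_text(text):
    # The capturing group makes re.split() return the separators as well,
    # so whitespace and punctuation are preserved in the result.
    pieces = re.split(r"(\W+)", text)
    return "".join(
        match_case(p, naive_piglatin(p)) if p.isalpha() else p
        for p in pieces
    )


print(translate_text("Pig latin is FUN, right?"))
# prints: Igpay atinlay isway UNFAY, ightray?
```

Only purely alphabetic pieces are translated, so numbers and punctuation pass through unchanged.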

