Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

windows - Python - Encoding string - Swedish Letters

I'm having some trouble with Python's raw_input function (Python 2.6). For some reason, raw_input does not accept the converted string that swedify() produces, and this gives me an encoding error. I'm aware of the encoding problem; that's why I wrote swedify() to begin with. Here's what I'm trying to do:

elif cmd in ('help', 'hjälp', 'info'):
    buffert += 'Just nu är programmet relativt begränsat,\nDe funktioner du har att använda är:\n'
    buffert += ' * historik :: skriver ut all din historik\n'
    buffert += ' * ändra <något> :: ändrar något i databasen, följande finns att ändra:\n'
    print swedify(buffert)

This works just fine: it prints the Swedish characters to the console just as I want. But when I try (in the same code, with the same \x?? byte values) to print this piece:

core['goalDistance'] = raw_input(swedify('Hur långt i kilometer är ditt mål: '))
core['goalTime'] = raw_input(swedify('Vad är ditt mål i minuter att springa ' +  core['goalDistance'] + 'km på: '))

Then I get this:

C:\Users\Anon>python löp.py
Traceback (most recent call last):
  File "l÷p.py", line 92, in <module>
    core['goalDistance'] = raw_input(swedify('Hur l├Ñngt i kilometer ├ñr ditt m├Ñl: '))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 5: ordinal not in range(128)

Now I've googled around and found some "solutions", but none of them work. Some said that I have to create a batch script that runs chcp ??? at the beginning, but that's not a clean solution IMO.

Here is swedify:

def swedify(inp):
    try:
        return inp.decode('utf-8')
    except:
        return '(!Dec:) ' + str(inp)

Any ideas on how to get raw_input to accept the return value of swedify()? I've tried from encodings import getencoder, getdecoder and others, but nothing helped.

1 Reply

You mention that an encoding error motivated you to write swedify in the first place, and that the solutions you found revolve around chcp, which is a Windows command.

On *nix systems with UTF-8 terminals, swedify is not necessary:

>>> raw_input('Hur långt i kilometer är ditt mål: ')
Hur långt i kilometer är ditt mål: 100
'100'
>>> a = raw_input('Hur långt i kilometer är ditt mål: ')
Hur långt i kilometer är ditt mål: 200
>>> a
'200'

FWIW, when I do use swedify, I get the same error you do:

>>> def swedify(inp):
...     try:
...         return inp.decode('utf-8')
...     except:
...         return '(!Dec:) ' + str(inp)
... 
>>> swedify('Hur långt i kilometer är ditt mål: ')
u'Hur l\xe5ngt i kilometer \xe4r ditt m\xe5l: '
>>> raw_input(swedify('Hur långt i kilometer är ditt mål: '))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 5: ordinal not in range(128)

Your swedify function returns a unicode object. In Python 2, the built-in raw_input does not cope with a unicode prompt that contains non-ASCII characters: as the traceback shows, the prompt is implicitly encoded with the 'ascii' codec, and that conversion fails.

>>> raw_input("å")
åeee
'eee'
>>> raw_input(u"å")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 0: ordinal not in range(128)
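
One possible workaround in Python 2, offered as a sketch rather than a confirmed fix: encode the unicode prompt to the console's encoding yourself before handing it to raw_input, so that no implicit ASCII conversion takes place. The helper names console_encoding and encode_prompt are hypothetical, and the sketch assumes that sys.stdout.encoding reports the console code page (for example cp850 on a Swedish Windows console).

```python
import sys


def console_encoding(stream=None, fallback='utf-8'):
    """Best-effort guess of the console encoding (assumes the stream reports it)."""
    if stream is None:
        stream = sys.stdout
    # `encoding` can be missing or None, e.g. when output is redirected.
    return getattr(stream, 'encoding', None) or fallback


def encode_prompt(text, encoding=None):
    """Return `text` as a byte string in `encoding`.

    Characters the encoding cannot represent are replaced with '?'
    instead of raising UnicodeEncodeError.
    """
    if isinstance(text, bytes):
        # Already a byte string (a plain `str` in Python 2): pass it through.
        return text
    if encoding is None:
        encoding = console_encoding()
    return text.encode(encoding, 'replace')


# Python 2 usage (not executed here, because raw_input waits for input):
#   answer = raw_input(encode_prompt(swedify(prompt_bytes)))
```

With an explicit encoding the result is predictable: encode_prompt(u'm\xe5l', 'utf-8') gives the bytes 'm\xc3\xa5l', while the 'ascii' codec yields 'm?l' instead of an exception.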

You might want to try this in Python 3. See this Python bug.
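
For comparison, a minimal sketch of the same two prompts in Python 3, where str is already Unicode and input() replaces raw_input, so no swedify step is involved. The function name ask_goals and its injectable read parameter are hypothetical, added only so the example can be exercised without a console. Whether the characters display correctly still depends on the console's code page.

```python
def ask_goals(read=input):
    """Ask the two questions from the original script and return the answers."""
    core = {}
    # In Python 3 the prompt is a Unicode str; input() accepts it directly.
    core['goalDistance'] = read('Hur långt i kilometer är ditt mål: ')
    core['goalTime'] = read(
        'Vad är ditt mål i minuter att springa '
        + core['goalDistance'] + 'km på: '
    )
    return core
```

Calling ask_goals() with no argument prompts on the console; passing a stand-in function for read makes it possible to check the prompts without typing anything.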

Also of interest: How to read Unicode input and compare Unicode strings in Python?.

UPDATE According to this blog post there is a way to set the system's default encoding. This might be worth a try.
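
Before changing anything, it may help to check what Python currently reports. The sketch below only reads the settings; describe_encodings is a hypothetical helper name. On Python 2 the default encoding is 'ascii' unless it has been changed, which matches the codec named in the traceback, while Python 3 reports 'utf-8'.

```python
import sys


def describe_encodings():
    """Collect the encodings Python reports for implicit conversions and the console."""
    return {
        # Used for implicit unicode <-> str conversions in Python 2.
        'default': sys.getdefaultencoding(),
        # May be None or missing when the streams are redirected.
        'stdout': getattr(sys.stdout, 'encoding', None),
        'stdin': getattr(sys.stdin, 'encoding', None),
    }


if __name__ == '__main__':
    print(describe_encodings())
```

If 'stdout' shows a code page such as cp850 while 'default' shows 'ascii', the mismatch between the two is a likely source of the error.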

