Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


unicode - Printing a string prints 'u' before the string in Python?

'u' before elements in printed list? I didn't type u in my code.

hobbies = []

#prompt user three times for hobbies
for i in range(3):
    hobby = raw_input('Enter a hobby:')
    hobbies.append(hobby)

#print list stored in hobbies
print hobbies

When I run this, it prints the list but it is formatted like this:

Enter a hobby: Painting
Enter a hobby: Stargazing
Enter a hobby: Reading
[u'Painting', u'Stargazing', u'Reading']
None

Where did those 'u' come from before each of the elements of the list?



1 Reply


I think what you're actually surprised by here is that printing a single string doesn't do the same thing as printing a list of strings—and this is true whether they're Unicode or not:

>>> hobby1 = u'Dizziness'
>>> hobby2 = u'Vértigo'
>>> hobbies = [hobby1, hobby2]
>>> print hobby1
Dizziness
>>> print hobbies
[u'Dizziness', u'V\xe9rtigo']

Even without the u, you've got those extra quotes, not to mention that backslash escape. And if you try the same thing with str byte strings instead of unicode strings, you'll still have the quotes and escapes (plus you might have mojibake characters if your source file and your terminal have different encodings… but forget that part).


In Python, every object can have two different representations: the end-user-friendly representation, str, and the programmer-friendly representation, repr. For byte strings, those representations are Painting and 'Painting', respectively. And for Unicode strings, they're Painting and u'Painting'.
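
As a minimal sketch of the two representations, assuming a hypothetical Hobby class that is not part of the original question, a class can define both forms itself:

```python
class Hobby(object):
    """Hypothetical class, used only to illustrate str vs. repr."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        # End-user-friendly form: just the name.
        return self.name

    def __repr__(self):
        # Programmer-friendly form: resembles the code that would recreate it.
        return 'Hobby(%r)' % (self.name,)


painting = Hobby('Painting')
friendly = str(painting)   # 'Painting'
debug = repr(painting)     # "Hobby('Painting')"
```

Built-in strings follow the same pattern: str gives the bare text, and repr gives the quoted literal.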

The print statement uses the str, so print hobby1 prints out Dizziness, with no quotes (or u, if it's Unicode).

However, the str of a list uses the repr of each of its elements, not the str. So, when you print hobbies, each element has quotes around it (and a u if it's Unicode).
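
One way to check this for yourself is to rebuild the list's str by hand from the repr of each element. This is only a sketch using plain strings (the variable names from_reprs and from_strs are made up for illustration); with Python 2 unicode strings each repr would also carry the u prefix:

```python
hobbies = ['Painting', 'Stargazing', 'Reading']

# Build the bracketed form by hand from the repr of each element.
from_reprs = '[' + ', '.join(repr(h) for h in hobbies) + ']'

# It matches the str of the whole list, quotes included.
assert str(hobbies) == from_reprs

# Using str on each element instead drops the quotes.
from_strs = '[' + ', '.join(str(h) for h in hobbies) + ']'
```

Here from_reprs is ['Painting', 'Stargazing', 'Reading'], while from_strs is [Painting, Stargazing, Reading].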

This may seem weird at first, but it's an intentional design decision, and it makes sense once you get used to it. For one thing, it would be ambiguous to print out [foo, bar, baz]: is that a list of three strings, or a list of two strings, one of which has a comma in the middle of it? But, more importantly, a list is already not a user-friendly thing, no matter how you print it out. My hobbies are [Painting, Stargazing] would look just as ugly as My hobbies are ['Painting', 'Stargazing']. When you want to show a list to an end-user, you always want to format it explicitly in some way that makes sense.

Often, what you want is as simple as this:

>>> print 'Hobbies:', ', '.join(hobbies)
Hobbies: Painting, Stargazing

Or, for Unicode strings:

>>> print u'Hobbies:', u', '.join(hobbies)
Hobbies: Painting, Stargazing
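
If you format lists in more than one place, the join can be wrapped in a small helper. This is just a sketch: format_hobbies and the "(none)" placeholder for an empty list are assumptions for illustration, not anything from the question:

```python
def format_hobbies(hobbies):
    """Return a one-line, end-user-friendly summary (hypothetical helper)."""
    if not hobbies:
        # Avoid a dangling "Hobbies: " when the list is empty.
        return 'Hobbies: (none)'
    # join uses the strings themselves, so no quotes or u prefixes appear.
    return 'Hobbies: ' + ', '.join(hobbies)


line = format_hobbies(['Painting', 'Stargazing', 'Reading'])
```

Printing line shows Hobbies: Painting, Stargazing, Reading, with no brackets or quotes.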
