Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

unicode - Writing UTF-8 String to MySQL with Python

I am trying to push user account data from an Active Directory to our MySQL-Server. This works flawlessly but somehow the strings end up showing an encoded version of umlauts and other special characters.

The Active Directory returns a string using this sample format: M\xc3\xbcller

This actually is the UTF-8 encoding for Müller, but I want to write Müller to my database, not M\xc3\xbcller.

I tried converting the string with this line, but it results in the same string in the database: tempEntry[1] = tempEntry[1].decode("utf-8")

If I run print "M\xc3\xbcller".decode("utf-8") in the Python console, the output is correct.

Is there any way to insert this string the right way? A web developer needs the data in exactly this format; I don't know why he cannot convert the string directly in PHP.

Additional info: I am using MySQLdb; the table and column collation is utf8_general_ci (character set utf8).


1 Reply


As @marr75 suggests, make sure you set charset='utf8' on your connections. Setting use_unicode=True is not strictly necessary as it is implied by setting the charset.

Then make sure you are passing unicode objects to your db connection, as it will encode them using the charset you passed to the connection. If you pass a UTF-8-encoded byte string instead, it will be doubly encoded when it reaches the database.
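To make the double-encoding symptom concrete, here is a minimal sketch in plain Python with no database involved. It assumes one common way the problem arises: the UTF-8 bytes get misread as Latin-1 and are then encoded to UTF-8 a second time. The variable names are illustrative only.

```python
# UTF-8 bytes for "Mueller" with u-umlaut, as returned by the directory
raw = b"M\xc3\xbcller"

# Correct path: decode once to a text (unicode) object
text = raw.decode("utf-8")

# Encoding the text object once gives back the original bytes
single_encoded = text.encode("utf-8")

# Simulated mistake (an assumption about the mechanism): the bytes are
# treated as Latin-1 text and encoded to UTF-8 again
double_encoded = raw.decode("latin-1").encode("utf-8")

# single_encoded is b"M\xc3\xbcller" (7 bytes)
# double_encoded is b"M\xc3\x83\xc2\xbcller" (9 bytes), which displays
# as mojibake when read back as UTF-8
```

If values read back from the table look like the second form, the data was most likely encoded twice somewhere between the script and the database.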

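So, with MySQLdb, something like:

import MySQLdb

# charset='utf8' makes the driver encode unicode parameters as UTF-8
conn = MySQLdb.connect(host="localhost", user="root", passwd="", db="", charset="utf8")
data_from_ldap = b"M\xc3\xbcller"  # UTF-8 bytes as returned by the directory
name = data_from_ldap.decode("utf8")  # decode once to a unicode object
cursor = conn.cursor()
cursor.execute(u"INSERT INTO mytable SET name = %s", (name,))
conn.commit()  # MySQLdb does not autocommit by default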
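Since the question decodes a single field of tempEntry, it may help to decode every byte-string field before building the query parameters. The sketch below is one possible way to do that; to_text and decode_entry are hypothetical helper names, and the sample entry is made up for illustration.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def decode_entry(entry, encoding="utf-8"):
    """Decode every byte-string field of one directory entry."""
    return [to_text(field, encoding) for field in entry]


# Hypothetical entry: login name, surname, and a numeric id
temp_entry = [b"jmueller", b"M\xc3\xbcller", 42]

# Tuple of text objects, ready to pass as the second argument of
# cursor.execute()
params = tuple(decode_entry(temp_entry))
```

Non-string fields such as the numeric id pass through unchanged, so the same helper can be applied to a whole row.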

You may also try forcing the connection to use utf8 by passing the init_command param, though I'm unsure if this is required. A few minutes of testing should help you decide.

conn = MySQLdb.connect(charset='utf8', init_command='SET NAMES UTF8')

Also, and this is barely worth mentioning as 4.1 is so old, make sure you are using MySQL >= 4.1.

