Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
305 views
in Technique[技术] by (71.8m points)

python - Why does locale.getpreferredencoding() return 'ANSI_X3.4-1968' instead of 'UTF-8'?

Whenever I try to read UTF-8 encoded text files, using open(file_name, encoding='utf-8'), I always get an error saying ASCII codec can't decode some characters (eg. when using for line in f: print(line))

Python 3.5.3 (default, Jan 19 2017, 14:11:04)
[GCC 6.3.0 20170118] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.getpreferredencoding()
'ANSI_X3.4-1968'
>>> import sys
>>> sys.getfilesystemencoding()
'ascii'
>>>

and locale command prints:

locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE=en_HK.UTF-8
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I had a similar problem. For me, initially the environtment variable LANG was not set (you can check this by running env)

$ python3 -c 'import locale; print(locale.getdefaultlocale())'
(None, None)
$ python3 -c 'import locale; print(locale.getpreferredencoding())'
ANSI_X3.4-1968

The available locales for me was (on a fresh Ubuntu 18.04 Docker image):

$ locale -a
C
C.UTF-8
POSIX

So i picked the utf-8 one:

$ export LANG="C.UTF-8"

And then things work

$ python3 -c 'import locale; print(locale.getdefaultlocale())'
('en_US', 'UTF-8')
$ python3 -c 'import locale; print(locale.getpreferredencoding())'
UTF-8

If you pick a locale that is not avaiable, such as

export LANG="en_US.UTF-8"

it will not work:

$ python3 -c 'import locale; print(locale.getdefaultlocale())'
('en_US', 'UTF-8')
$ python3 -c 'import locale; print(locale.getpreferredencoding())'
ANSI_X3.4-1968

and this is why locale is giving the error messages:

locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...