Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
736 views
in Technique[技术] by (71.8m points)

unicode - Python: How can I replace full-width characters with half-width characters?

If this was PHP, I would probably do something like this:

function no_more_half_widths($string){
  $foo = array('1','2','3','4','5','6','7','8','9','10')
  $bar = array('1','2','3','4','5','6','7','8','9','10')
  return str_replace($foo, $bar, $string)
}

I have tried the .translate function in python and it indicates that the arrays are not of the same size. I assume this is due to the fact that the individual characters are encoded in utf-8. Any suggestions?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The built-in unicodedata module can do it:

>>> import unicodedata
>>> foo = u'1234567890'
>>> unicodedata.normalize('NFKC', foo)
u'1234567890'

The “NFKC” stands for “Normalization Form KC [Compatibility Decomposition, followed by Canonical Composition]”, and replaces full-width characters by half-width ones, which are Unicode equivalent.

Note that it also normalizes all sorts of other things at the same time, like separate accent marks and Roman numeral symbols.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...