Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


unicode - How to set up vim properly for editing in utf-8

I've run into problems a few times because vim's encoding was set to latin1 by default; I didn't notice and assumed it was using utf-8. Now that I have noticed, I'd like to set up vim so that it does the right thing in all obvious cases and uses utf-8 by default.

What I'd like to avoid:

  • Forcing a file saved in some other encoding that would have worked before my changes to open as utf-8, resulting in gibberish.
  • Forcing a terminal that doesn't support multibyte characters (like the Windows XP one) to try to display them anyway, resulting in gibberish.
  • Interfering with other programs' ability to read or edit the files. (I have a perhaps unjustified aversion to using a BOM by default, because I am unclear on how likely it is to mess up other programs.)
  • Other issues that I don't know enough about to guess at (but hopefully you do!)

What I've got so far:

if has("multi_byte")
  if &termencoding == ""
    let &termencoding = &encoding
  endif
  set encoding=utf-8                     " better default than latin1
  setglobal fileencoding=utf-8           " change default file encoding when writing new files
  "setglobal bomb                        " use a BOM when writing new files
  set fileencodings=ucs-bom,utf-8,latin1 " order to check for encodings when reading files
endif

This is taken, slightly modified, from the vim wiki. I moved the bomb from setglobal fileencoding to its own statement because otherwise it doesn't actually work. I also commented out that line because of my uncertainty about BOMs.

What I'm looking for:

  • Possible pitfalls to avoid that I missed
  • Problems with the existing code
  • Links to anywhere this has been discussed / set out already

Ultimately, I'd like this to result in a no-thought-required copy/paste snippet that will set up vim for utf-8-by-default that will work across platforms.

EDIT: I've marked my own answer as accepted for now; as far as I can tell, it works okay and accounts for everything it can reasonably account for. But it's not set in stone; if you have any new information, please feel free to answer!


1 Reply


In response to sehe, I'll have a go at answering my own question! I removed the updates I made to the original question and moved them to this answer. This is probably the better way to do it.

The answer:

if has("multi_byte")
  if &termencoding == ""
    let &termencoding = &encoding
  endif
  set encoding=utf-8                     " better default than latin1
  setglobal fileencoding=utf-8           " change default file encoding when writing new files
endif

I removed the bomb line because according to the BOM Wikipedia page it is not needed when using utf-8 and in fact defeats ASCII backwards compatibility. As long as ucs-bom is first in fileencodings, vim will be able to detect and handle existing files with BOMs, so it is not needed for that either.
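If a file that already carries a BOM does turn up, it can be inspected and stripped per buffer. This is a minimal sketch, assuming the file was opened with ucs-bom first in fileencodings so that vim detected the BOM and set the bomb option for that buffer. The commands are meant to be typed at the : prompt, not placed in the vimrc:

```vim
" Show whether the current buffer will be written with a BOM
" (prints 'bomb' or 'nobomb').
setlocal bomb?

" Turn the BOM off for this buffer only, then write the file back
" so it is saved without the BOM.
setlocal nobomb
write
```

Because bomb is local to the buffer, this leaves the global default (no BOM for new files) untouched.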

I removed the fileencodings line because it is not needed in this case. From the Vim docs: When 'encoding' is set to a Unicode encoding, and 'fileencodings' was not set yet, the default for 'fileencodings' is changed.
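One way to confirm this on a given install is to ask vim for the effective value after encoding has been set. This is a small check, not part of the snippet; the value you see may differ if a plugin or a system-wide vimrc has already set fileencodings:

```vim
" With a Unicode 'encoding' and no explicit 'fileencodings' setting,
" the 'fileencodings' help entry gives the default as
" ucs-bom,utf-8,default,latin1.
set encoding=utf-8

" Print the current value and, thanks to :verbose, where it was last set.
verbose set fileencodings?
```

If the output names a script as the last place the option was set, something else is overriding the default and the explicit fileencodings line from the question may still be worth keeping.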

I am using setglobal fileencoding (as opposed to set fileencoding) because, when reading a file, fileencoding is set automatically based on fileencodings, so the global value only matters for new files. According to the docs again:

For a new file the global value of 'fileencoding' is used.
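The difference between the two values can be checked from inside any buffer. This is a rough sketch, to be typed at the : prompt; the latin1 in the last command is only an example of an encoding you might need to force when detection guesses wrong:

```vim
" Global value: what a brand-new file will be written as.
setglobal fileencoding?

" Local value: what vim detected for the file in the current buffer.
setlocal fileencoding?

" Reopen the current file with an explicit encoding, bypassing
" the 'fileencodings' detection order for this one read.
edit ++enc=latin1
```

The ++enc argument affects only that single read, so it addresses the first concern in the question (an existing non-utf-8 file opening as gibberish) without changing any defaults.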

