Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

visual c++ - Why are certain Unicode characters causing std::wcout to fail in a console app?

Consider the following code snippet, compiled as a Console Application on MS Visual Studio 2010/2012 and executed on Win7:

#include "stdafx.h"
#include <iostream>
#include <string>


const std::wstring test = L"hello\xf021test!";

int _tmain(int argc, _TCHAR* argv[])
{
    std::wcout << test << std::endl;
    std::wcout << L"This doesn't print either" << std::endl;

    return 0;
}

The first wcout statement outputs "hello" (instead of something like "hello?test!"). The second wcout statement outputs nothing.

It's as if 0xf021 (and possibly other Unicode characters) causes wcout to fail.

This particular Unicode character, 0xf021 (encoded as UTF-16), is part of the "Private Use Area" in the Basic Multilingual Plane. I've noticed that Windows Console applications do not have extensive support for Unicode characters, but typically each character is at least represented by a default character (e.g. "?"), even if there is no support for rendering a particular glyph.

What is causing the wcout stream to choke? Is there a way to reset it after it enters this state?

1 Reply

wcout, or to be precise, a wfilebuf instance it uses internally, converts wide characters to narrow characters, then writes those to the file (in your case, to stdout). The conversion is performed by the codecvt facet in the stream's locale; by default, that just does wctomb_s, converting to the system default ANSI codepage, aka CP_ACP.
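
To make the conversion step concrete, here is a minimal sketch that asks whether a single wide character survives a wide-to-narrow conversion. It uses the portable std::wcrtomb as a stand-in for the wctomb_s call described above, which is an assumption of this sketch rather than what the runtime literally does. The names is_representable and replace_unrepresentable are hypothetical helpers, and the result depends on whichever C locale is active when they are called.

```cpp
#include <climits>   // MB_LEN_MAX
#include <cstddef>   // std::size_t
#include <cwchar>    // std::wcrtomb, std::mbstate_t
#include <string>

// Returns true if `wc` can be converted to the narrow encoding of the
// currently active C locale. std::wcrtomb stands in for wctomb_s here
// (an assumption of this sketch).
bool is_representable(wchar_t wc) {
    std::mbstate_t state = std::mbstate_t();  // fresh conversion state per call
    char buffer[MB_LEN_MAX];
    return std::wcrtomb(buffer, wc, &state) != static_cast<std::size_t>(-1);
}

// Hypothetical helper: replaces every character that cannot be converted
// with L'?', so a later wide-to-narrow conversion of the result cannot
// fail on those characters.
std::wstring replace_unrepresentable(std::wstring text) {
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (!is_representable(text[i])) {
            text[i] = L'?';
        }
    }
    return text;
}
```

Passing a string through replace_unrepresentable before handing it to wcout would trade the lost characters for question marks instead of a failed stream. Whether that trade is acceptable depends on the application.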

Apparently, character '\xf021' is not representable in the default codepage configured on your system. So the conversion fails, and failbit is set in the stream. Once failbit is set, all subsequent calls fail immediately.
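
As for resetting the stream: the standard way to leave a failed state is the stream's clear() member. The sketch below shows the mechanics on an in-memory std::wostringstream with a simulated failure, because a real conversion failure depends on the platform and locale. The names write_or_reset and demo_reset are hypothetical, and the simulated failbit is an assumption standing in for the failed conversion.

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes `text` to `out`. If the stream is in a failed
// state afterwards, clears the error flags so that later writes are
// attempted again. Returns whether the write succeeded.
bool write_or_reset(std::wostream& out, const std::wstring& text) {
    out << text;
    if (out.fail()) {   // true when failbit or badbit is set
        out.clear();    // reset the stream state to goodbit
        return false;
    }
    return true;
}

// Demonstrates the state handling with a simulated failure.
std::wstring demo_reset() {
    std::wostringstream out;
    out.setstate(std::ios_base::failbit);  // simulate the failed write
    out << L"dropped";                     // no output while the stream is failed
    write_or_reset(out, L"dropped too");   // detects the failure and clears it
    out << L"printed";                     // the stream accepts output again
    return out.str();
}
```

The same call applies to the console stream: std::wcout.clear() makes later writes go through again. It only resets the flags, though, so writing the same unrepresentable character would put the stream back into the failed state.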

I do not know of any way to get wcout to successfully print arbitrary Unicode characters to console. wprintf works though, with a little tweak:

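#include <fcntl.h>   // _O_U16TEXT
#include <io.h>      // _setmode
#include <stdio.h>   // wprintf, _fileno
#include <tchar.h>   // _tmain, _TCHAR
#include <string>

const std::wstring test = L"hello\xf021test!";

int _tmain(int argc, _TCHAR* argv[])
{
  // Put stdout into UTF-16 text mode so wide characters are written
  // without being converted to the ANSI code page.
  _setmode(_fileno(stdout), _O_U16TEXT);
  // Pass the text as an argument rather than as the format string.
  wprintf(L"%ls", test.c_str());

  return 0;
}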
