Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
335 views
in Technique[技术] by (71.8m points)

c++ - Do C++11 regular expressions work with UTF-8 strings?

If I want to use C++11's regular expressions with unicode strings, will they work with char* as UTF-8 or do I have to convert them to a wchar_t* string?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You would need to test your compiler and the system you are using, but in theory, it will be supported if your system has a UTF-8 locale. The following test returned true for me on Clang/OS X.

bool test_unicode()
{
    std::locale old;
    std::locale::global(std::locale("en_US.UTF-8"));

    std::regex pattern("[[:alpha:]]+", std::regex_constants::extended);
    bool result = std::regex_match(std::string("abcdéfg"), pattern);

    std::locale::global(old);

    return result;
}

NOTE: This was compiled in a file what was UTF-8 encoded.


Just to be safe I also used a string with the explicit hex versions. It worked also.

bool test_unicode2()
{
    std::locale old;
    std::locale::global(std::locale("en_US.UTF-8"));

    std::regex pattern("[[:alpha:]]+", std::regex_constants::extended);
    bool result = std::regex_match(std::string("abcdxC3xA9""fg"), pattern);

    std::locale::global(old);

    return result;
}

Update test_unicode() still works for me

$ file regex-test.cpp 
regex-test.cpp: UTF-8 Unicode c program text

$ g++ --version
Configured with: --prefix=/Applications/Xcode-8.2.1.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 8.0.0 (clang-800.0.42.1)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode-8.2.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...