Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c - How to properly add hex escapes into a string-literal?

In a C string literal, you can embed a byte directly with a hex escape.

char str[] = "abcde"; // 'a', 'b', 'c', 'd', 'e', 0x00
char str2[] = "abc\x12\x34"; // 'a', 'b', 'c', 0x12, 0x34, 0x00

Both arrays occupy 6 bytes in memory. The problem appears when the character that follows a hex escape is itself a hex digit [a-fA-F0-9].

//I want: 'a', 'b', 'c', 0x12, 'e', 0x00
//Error: the hex value is out of range, because the trailing e is treated as part of the escape, making it 0x12e
char problem[] = "abc\x12e";

One possible workaround is to patch the byte after the definition.

//This works, but it is a bad idea
char solution[6] = "abcde";
solution[3] = 0x12;

This works, but it fails as soon as the array is declared const.

//This will not work
const char solution[6] = "abcde";
solution[3] = 0x12; //Compilation error!

How do I properly place e after \x12 without triggering the error?


Why am I asking? To build a UTF-8 string as a constant, you have to write the bytes of any character outside the ASCII range as hex values.



1 Reply


Use 3 octal digits:

char problem[] = "abc\022e";

or split your string:

char problem[] = "abc\x12" "e";
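
As a quick sanity check, the two fixes can be compared against the bytes written out one by one. This is a minimal sketch, not part of the original answer: the names `expected`, `with_octal`, `with_split` and `check_fixes` are made up for illustration, and the function is meant to be called from your own `main`.

```c
#include <assert.h>
#include <string.h>

/* Reference bytes spelled out individually: 'a','b','c',0x12,'e',0x00. */
static const char expected[] = { 'a', 'b', 'c', 0x12, 'e', 0x00 };

/* Fix 1: three-digit octal escape (octal 022 is hex 0x12). */
static const char with_octal[] = "abc\022e";

/* Fix 2: end the literal right after the hex escape and start a new one. */
static const char with_split[] = "abc\x12" "e";

/* Hypothetical helper: asserts that both fixes yield the expected bytes. */
void check_fixes(void)
{
    assert(sizeof with_octal == sizeof expected);
    assert(sizeof with_split == sizeof expected);
    assert(memcmp(with_octal, expected, sizeof expected) == 0);
    assert(memcmp(with_split, expected, sizeof expected) == 0);
}
```

Both arrays are still 6 bytes long, and because the value is fixed at definition time, the same literals can initialize a const array.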

Why these work:

  • Unlike hex escapes, the standard limits an octal escape to at most 3 digits.

    6.4.4.4 Character constants

    ...

    octal-escape-sequence:
         \ octal-digit
         \ octal-digit octal-digit
         \ octal-digit octal-digit octal-digit
    

    ...

    hexadecimal-escape-sequence:
        \x hexadecimal-digit
        hexadecimal-escape-sequence hexadecimal-digit
    
  • String literal concatenation happens in a later translation phase than escape sequence conversion, so each literal's escapes are resolved before the pieces are joined.

    5.1.1.2 Translation phases

    ...

    5. Each source character set member and escape sequence in character constants and string literals is converted to the corresponding member of the execution character set; if there is no corresponding member, it is converted to an implementation-defined member other than the null (wide) character.

    6. Adjacent string literal tokens are concatenated.
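
Both rules can be observed directly in code. The sketch below is only an illustration under assumed names (`octal_then_digit`, `utf8_word` and `check_escape_rules` are hypothetical): it checks that an octal escape stops after three digits, and applies the split-literal fix to the UTF-8 case from the question, using the word "réel" (é is 0xC3 0xA9 in UTF-8) as an assumed example.

```c
#include <assert.h>
#include <string.h>

/* An octal escape ends after three digits, so the trailing '3' is a
   separate character: the bytes are 0x12, '3', 0x00. */
static const char octal_then_digit[] = "\0223";

/* "r\xC3\xA9el" would pull the 'e' into the hex escape. Splitting the
   literal ends the escape first; the pieces are joined afterwards. */
static const char utf8_word[] = "r\xC3\xA9" "el";

/* Hypothetical helper: asserts the byte layout of both arrays. */
void check_escape_rules(void)
{
    assert(sizeof octal_then_digit == 3);
    assert(octal_then_digit[0] == 0x12);
    assert(octal_then_digit[1] == '3');

    assert(strlen(utf8_word) == 5);
    assert((unsigned char)utf8_word[1] == 0xC3);
    assert((unsigned char)utf8_word[2] == 0xA9);
    assert(utf8_word[3] == 'e');
    assert(utf8_word[4] == 'l');
}
```

The casts to `unsigned char` are there because plain `char` may be signed, in which case bytes above 0x7F would compare as negative values.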



...