Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


encoding - What is "=C2=A0" in MIME encoded, quoted-printable text?

This is an example raw email I am trying to parse:

MIME-version: 1.0
Content-type: text/html; charset=UTF-8
Content-transfer-encoding: quoted-printable
X-Mailer: Verizon Webmail
X-Originating-IP: [x.x.x.x]

=C2=A0test testing testing 123

What is =C2=A0? I have tried a half dozen quoted-printable parsers, but none handle this correctly. How would one properly parse this in C#?

Honestly, for now, I'm coding:

//TODO WTF
encoded = encoded.Replace("=C2=A0", "");

I'm doing this because I can't figure out why that text shows up, seemingly at random, within the MIME content when it isn't supposed to render as anything. Just removing it gives me the desired effect, but WHY?!

To be clear, I know that (=[0-9A-F]{2}) is an encoded character. But in this case, it seemingly represents NOTHING.


1 Reply


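=C2=A0 represents the two bytes C2 A0. Since the charset is UTF-8, those two bytes together decode to the single code point U+00A0, the non-breaking space (NO-BREAK SPACE). So it does represent something: a space, which renders as blank. Quoted-printable decoding should yield bytes first, and only then should the charset from the Content-type header be applied. A parser that instead maps each =XX escape to its own character produces "Â" followed by a non-breaking space.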

See UTF-8 (Wikipedia).
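
As for doing this in C#, here is a minimal sketch of the bytes-first approach. The `QuotedPrintable` class and its `Decode` method are hypothetical names made up for illustration, not part of the .NET class library, and the sketch assumes the body has already been separated from the headers and that the charset has been read from the Content-type header.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not a .NET library type.
public static class QuotedPrintable
{
    // Decodes a quoted-printable body to raw bytes, then interprets
    // those bytes with the charset declared in the Content-type header.
    public static string Decode(string input, Encoding charset)
    {
        using (var bytes = new MemoryStream())
        {
            for (int i = 0; i < input.Length; i++)
            {
                char c = input[i];
                if (c != '=')
                {
                    // Literal characters in quoted-printable are ASCII
                    // (assumption: anything above 0x7F would be truncated here).
                    bytes.WriteByte((byte)c);
                    continue;
                }

                // "=" directly before a line break is a soft line break: emit nothing.
                if (i + 1 < input.Length && input[i + 1] == '\n')
                {
                    i += 1;
                    continue;
                }
                if (i + 2 < input.Length && input[i + 1] == '\r' && input[i + 2] == '\n')
                {
                    i += 2;
                    continue;
                }

                // "=XX" is ONE raw byte written as two hex digits, not one character.
                if (i + 2 < input.Length
                    && Uri.IsHexDigit(input[i + 1])
                    && Uri.IsHexDigit(input[i + 2]))
                {
                    bytes.WriteByte(Convert.ToByte(input.Substring(i + 1, 2), 16));
                    i += 2;
                    continue;
                }

                // Malformed escape: keep the '=' as-is.
                bytes.WriteByte((byte)'=');
            }

            // Only now are the bytes turned into characters.
            return charset.GetString(bytes.ToArray());
        }
    }
}
```

With this, `QuotedPrintable.Decode("=C2=A0test testing testing 123", Encoding.UTF8)` returns a string whose first character is `'\u00A0'`, followed by `test testing testing 123`. If an ordinary space is wanted instead, `text.Replace('\u00A0', ' ')` on the decoded string is safer than stripping `=C2=A0` from the encoded text, since it works on characters rather than on one particular byte sequence.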

