Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


escaping - How to replace a double backslash with a single backslash in Python?

I have a string that contains double backslashes. I want to replace them with single backslashes so that the Unicode escape codes can be parsed correctly.

(Pdb) p fetched_page
'<p style="text-align:center;" align="center"><strong><span style="font-family:'Times New Roman', serif;font-size:115%;">Chapter 0<\\/span><\\/strong><\\/p>
<p><span style="font-family:'Times New Roman', serif;font-size:115%;">Chapter 0 in \\u201cDreaming in Code\\u201d give a brief description of programming in its early years and how and why programmers are still struggling today...'

Inside of this string, you can see escaped unicode character codes, such as:

\\u201c

I want to turn this into:

\u201c

Attempt 1:

fetched_page.replace('\\\\', '\\')

But this doesn't work; it searches for quadruple backslashes.

Attempt 2:

fetched_page.replace('\\', '\')

But this results in an end of line error.

Attempt 3:

fetched_page.decode('string_escape')

But this had no effect on the text. All the double backslashes remained as double backslashes.


1 Reply


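You can try codecs.escape_decode, which should decode the escape sequences. Note that it returns a tuple of the decoded data and the length consumed, so take the first element.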
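
A minimal sketch of how that might look, assuming Python 3 (the question itself appears to use Python 2) and a short made-up sample in place of the real fetched_page. codecs.escape_decode is an undocumented function, so treat this as an illustration rather than a guaranteed API. The follow-up unicode_escape step is an addition here for turning the remaining \u201c sequences into actual characters.

```python
import codecs

# Hypothetical sample standing in for fetched_page: the raw string literal
# keeps two literal backslash characters in front of each "u201c"/"u201d".
raw = r'Chapter 0 in \\u201cDreaming in Code\\u201d'

# escape_decode works on bytes and returns (decoded_bytes, length_consumed).
# Each doubled backslash collapses to a single backslash.
single, _ = codecs.escape_decode(raw.encode('ascii'))

# Optional second step: interpret the now single-backslash \uXXXX escapes.
text = single.decode('unicode_escape')

print(single)
print(text)
```

Keep in mind that the interactive prompt and pdb's p command show the repr of a string, in which every real backslash is printed doubled, so it is worth checking with print() whether the data truly contains two backslashes before converting it.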

