Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Can't concat bytes to str (Converting to Python3)

I'm trying to convert my Python 2 code to Python 3, but I am receiving the following error:

Traceback (most recent call last):
  File "markovtest.py", line 73, in <module>
    get_all_tweets("quit_cryan")
  File "markovtest.py", line 41, in get_all_tweets
    outtweets = [(tweet.text.encode("utf-8") + str(b" ")) for tweet in alltweets]
  File "markovtest.py", line 41, in <listcomp>
    outtweets = [(tweet.text.encode("utf-8") + str(b" ")) for tweet in alltweets]
TypeError: can't concat bytes to str

The problem is in this list comprehension:

outtweets = [(tweet.text.encode("utf-8") + " ") for tweet in alltweets]

I have tried changing encode to decode or removing the encode parameter altogether but I cannot figure it out. Any help would be appreciated.


1 Reply


Python 3 has several different 'string' types. The two that matter for this error are str, which holds Unicode text, and bytes, which holds raw binary data.

You are trying to concatenate a bytes object (basically an immutable sequence of byte values) with a Unicode string (str). Python 3 never converts between the two implicitly, so this cannot (easily) be done.
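As a small illustration of the difference (a sketch only; the sample text is made up for this example): a str holds Unicode characters, a bytes object holds integers in the range 0-255, and converting between them is always an explicit encode or decode call.

```python
text = "café"                 # str: 4 Unicode characters
data = text.encode("utf-8")   # bytes: 5 bytes, because "é" takes two bytes in UTF-8

print(len(text), len(data))   # lengths differ: characters vs. bytes
print(data[0])                # indexing bytes yields an int (the byte value of "c")
print(data.decode("utf-8") == text)  # decoding with the same codec round-trips

try:
    data[0] = 67              # bytes is immutable, so item assignment fails
except TypeError as exc:
    print("TypeError:", exc)
```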

The problem in your code snippet is that the tweet text, most likely a str, is converted to bytes by the encode method. That works fine, but the error occurs when you then try to concatenate the space " " (a str) to the bytes object. You can either remove the encode call and do the concatenation with strings (and maybe encode later), or make the space a bytes object by putting a b before the quotes: b" ".

Let's take a look at your options:

In [1]: type("foo")
Out[1]: str

In [2]: type("foo".encode("utf-8"))
Out[2]: bytes

In [3]: "foo" + " "  # str + str
Out[3]: 'foo '

In [4]: "foo".encode("utf-8") + " "  # bytes + str
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-5c7b745d9739> in <module>()
----> 1 "foo".encode("utf-8") + " "

TypeError: can't concat bytes to str

For your problem, the simplest solution is probably to make the space a bytes literal, as below. I hope this helps.

In [5]: "foo".encode("utf-8") + b" "  # bytes + bytes
Out[5]: b'foo '
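Applied to the list comprehension from the question, the two options might look like the sketch below. The Tweet class and the sample data are hypothetical stand-ins for the objects in alltweets; only the .text attribute is taken from the question.

```python
from collections import namedtuple

# Hypothetical stand-in for the tweet objects; only .text is used here.
Tweet = namedtuple("Tweet", ["text"])
alltweets = [Tweet("hello world"), Tweet("naïve café")]

# Option 1: stay in str throughout and encode later, only if bytes are needed.
outtweets_str = [tweet.text + " " for tweet in alltweets]

# Option 2: encode first, then concatenate with a bytes literal.
outtweets_bytes = [tweet.text.encode("utf-8") + b" " for tweet in alltweets]

print(outtweets_str)
print(outtweets_bytes)
```

If the results are later joined with other strings or written to a file opened in text mode, option 1 will likely need fewer changes elsewhere in the script.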
