Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - What is the quickest way to add rows to a pandas dataframe built from a list?

I am trying to build a DataFrame of Twitter data. Using the Twitter API, I have a list of tweet objects (tweets) and want to populate a DataFrame with various fields from those objects, plus the results of some other functions applied to the text. My current method uses a list comprehension for each column, iterating through all the tweets each time.

import numpy as np
import pandas as pd

df = pd.DataFrame(data=[tweet.all_text for tweet in tweets], columns=["tweets"])

df.loc[:, 'id'] = np.array([tweet.id for tweet in tweets])
df.loc[:, 'len_tweet'] = np.array([len(tweet.all_text) for tweet in tweets])
df.loc[:, 'date_created'] = np.array([tweet.created_at_datetime for tweet in tweets])
df.loc[:, 'author'] = np.array([tweet.name for tweet in tweets])
df.loc[:, 'clean_tweet'] = np.array([self.clean_tweet_eng(tweet) for tweet in df.tweets])
df.loc[:, 'clean_stopwords_tweet'] = np.array([self.stopwords_clean(tweet) for tweet in df.tweets])

etc...
            

As I scale up the number of tweets, this becomes very slow.

I have looked at two other methods: creating the dataframe through iteratively adding elements to a dictionary, and building up the dataframe one row at a time using iterrows to only cycle through the list of tweets once. Both seem to be slower.

What is the fastest way to achieve this?

Question from: https://stackoverflow.com/questions/66049573/what-is-the-quickest-way-to-add-rows-to-a-pandas-dataframe-built-from-a-list

1 Reply

I think the simplest way would be to convert the list of tweet objects into a single list of dictionaries, then load the data into the DataFrame once:

import pandas as pd

list_of_dicts = [{'name': 'jon', 'age': 30}, {'name': 'paul', 'age': 26}]
df = pd.DataFrame(list_of_dicts)
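Applied to the tweets from the question, the same idea might look roughly like the sketch below. The attribute names (all_text, id, created_at_datetime, name) and the column names are taken from the question; FakeTweet is a hypothetical stand-in for the real API object, and clean_text is only a placeholder for the asker's clean_tweet_eng, whose actual behavior is not shown.

```python
from dataclasses import dataclass
from datetime import datetime

import pandas as pd


@dataclass
class FakeTweet:
    """Hypothetical stand-in for the tweet objects returned by the API."""
    id: int
    all_text: str
    created_at_datetime: datetime
    name: str


def clean_text(text: str) -> str:
    """Placeholder for the asker's clean_tweet_eng (assumed, not the real logic)."""
    return text.strip().lower()


COLUMNS = ["tweets", "id", "len_tweet", "date_created", "author", "clean_tweet"]


def tweets_to_frame(tweets) -> pd.DataFrame:
    # One pass over the list: each tweet becomes one dict, i.e. one row.
    rows = [
        {
            "tweets": t.all_text,
            "id": t.id,
            "len_tweet": len(t.all_text),
            "date_created": t.created_at_datetime,
            "author": t.name,
            "clean_tweet": clean_text(t.all_text),
        }
        for t in tweets
    ]
    # Build the DataFrame once; passing columns fixes the column order
    # and still yields the expected columns when the list is empty.
    return pd.DataFrame(rows, columns=COLUMNS)


tweets = [
    FakeTweet(1, "Hello World ", datetime(2021, 2, 4, 12, 0), "jon"),
    FakeTweet(2, "Pandas is FAST", datetime(2021, 2, 4, 12, 5), "paul"),
]
df = tweets_to_frame(tweets)
```

This loops over the tweets once and constructs the DataFrame once, instead of running a separate comprehension and column assignment per field. Whether it is measurably faster will depend on how expensive the cleaning functions are, so it is worth timing on real data.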

