Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
279 views
in Technique[技术] by (71.8m points)

python - Pandas recalculate index after a concatenation

I have a problem where I produce a pandas dataframe by concatenating along the row axis (stacking vertically).

Each of the constituent dataframes has an autogenerated index (ascending numbers).

After concatenation, my index is screwed up: it counts up to n (where n is the shape[0] of the corresponding dataframe), and restarts at zero at the next dataframe.

I am trying to "re-calculate the index, given the current order", or "re-index" (or so I thought). Turns out that isn't exactly what DataFrame.reindex seems to be doing.


Here is what I tried to do:

train_df = pd.concat(train_class_df_list)
train_df = train_df.reindex(index=[i for i in range(train_df.shape[0])])

It failed with "cannot reindex from a duplicate axis." I don't want to change the order of my data... just need to delete the old index and set up a new one, with the order of rows preserved.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

If your index is autogenerated and you don't want to keep it, you can use the ignore_index option. `

train_df = pd.concat(train_class_df_list, ignore_index=True)

This will autogenerate a new index for you, and my guess is that this is exactly what you are after.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...