Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - How to concatenate pairs of row elements into a new column in a pandas dataframe?

I have a DataFrame whose columns are coordinates (e.g. x1, y1, x2, y2, ...). The coordinate columns start at the 8th column (the preceding ones are irrelevant to the question).
My real data is larger, but here's a small sample:

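import pandas as pd

start_column = 8
df = pd.DataFrame(columns = ['x1','y1','x2','y2'],
                 data = [(0,0,1,0),(0,1,2,3),(-1,-2,None,None)])
# Prepend seven filler columns c1..c7 so the coordinates start at the 8th column
for i in range(7):
    df.insert(0,'c'+str(7-i),'x')
df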

I want to create a new column in the DataFrame as a list of xy pairs, as in: df["coordinates"]=[[x1,y1],[x2,y2],[x3,y3]....]

What I've tried so far:

for row in df.iterrows():
   for i in range(1,total_count_of_xy_rows):
      df["coordinates"] = df[["x{}".format(i),"y{}".format(i)]].values.tolist()
   print(df)

Is there a better way to do this?


1 Reply


You can create the new column by using .apply with axis=1 to run a list comprehension over each row:

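start_column = 8
# Pair each x column with the y column after it: [('x1', 'y1'), ('x2', 'y2')]
coordinates_list = list(zip(df.columns[(start_column-1):-1:2], df.columns[start_column::2]))
# For each row, keep only the (x, y) pairs where neither value is missing
df['coordinates'] = df.apply(
    lambda row: [(row[x], row[y]) for x, y in coordinates_list
                 if not (pd.isna(row[x]) or pd.isna(row[y]))],
    axis=1)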
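
To see what the two column slices pick out, here is a minimal sketch that uses only the column labels. The names columns, x_cols and y_cols are introduced here for illustration, and the layout (c1..c7 followed by x1, y1, x2, y2) is assumed to match the example in the question:

```python
import pandas as pd

start_column = 8  # 1-based position of the first coordinate column

# Labels only, in the assumed layout: c1..c7, then x1, y1, x2, y2
columns = pd.Index([f"c{i}" for i in range(1, 8)] + ["x1", "y1", "x2", "y2"])

# Every second label starting at x1, stopping before the last label
x_cols = columns[(start_column - 1):-1:2]
# Every second label starting at y1
y_cols = columns[start_column::2]

coordinates_list = list(zip(x_cols, y_cols))
print(coordinates_list)  # [('x1', 'y1'), ('x2', 'y2')]
```

Because zip stops at the shorter of its two inputs, a trailing x column without a matching y column would simply be dropped.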

Using this example input, with the coordinate columns starting from the 8th column, as you stated in a comment:

df = pd.DataFrame(columns = ['x1','y1','x2','y2'],
                 data = [(0,0,1,0),(0,1,2,3),(-1,-2,None,None)])
for i in range(start_column-1):
    df.insert(0,'c'+str(start_column-1-i),'x')
df

  c1 c2 c3 c4 c5 c6 c7  x1  y1   x2   y2
0  x  x  x  x  x  x  x   0   0  1.0  0.0
1  x  x  x  x  x  x  x   0   1  2.0  3.0
2  x  x  x  x  x  x  x  -1  -2  NaN  NaN

This will produce this output:

  c1 c2 c3 c4 c5 c6 c7  x1  y1   x2   y2           coordinates
0  x  x  x  x  x  x  x   0   0  1.0  0.0  [(0, 0), (1.0, 0.0)]
1  x  x  x  x  x  x  x   0   1  2.0  3.0  [(0, 1), (2.0, 3.0)]
2  x  x  x  x  x  x  x  -1  -2  NaN  NaN            [(-1, -2)]

This handles rows with different numbers of coordinates, because any pair with a missing x or y value is skipped. Hope that helps!
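
If you want [x, y] lists rather than tuples, as written in the question, one possible alternative is to reshape the coordinate block with NumPy instead of calling .apply. This is only a sketch, not part of the approach above: it assumes every column from start_column onward is a coordinate column in x, y order, and that converting all values to float is acceptable (so 0 becomes 0.0). The names coords and pairs are made up for this example.

```python
import numpy as np
import pandas as pd

start_column = 8

# Rebuild the example input: seven filler columns, then x1, y1, x2, y2
df = pd.DataFrame(columns=['x1', 'y1', 'x2', 'y2'],
                  data=[(0, 0, 1, 0), (0, 1, 2, 3), (-1, -2, None, None)])
for i in range(start_column - 1):
    df.insert(0, 'c' + str(start_column - 1 - i), 'x')

# Take the coordinate block as a float array of shape (n_rows, 2 * n_pairs)
coords = df.iloc[:, start_column - 1:].to_numpy(dtype=float)
# View it as (n_rows, n_pairs, 2) so each innermost entry is one [x, y] pair
pairs = coords.reshape(len(df), -1, 2)

# Keep only the pairs with no missing value, as plain Python lists
df['coordinates'] = [
    [pair.tolist() for pair in row if not np.isnan(pair).any()]
    for row in pairs
]
print(df['coordinates'].tolist())
```

The trade-off is that integer coordinates come back as floats, whereas the .apply version keeps each value's original type.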

