Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
449 views
in Technique[技术] by (71.8m points)

python - Explode column of list to multiple rows

I want to expand the list in a certain column (in the example column_x) to multiple rows.

So

df = pd.DataFrame({'column_a': ['a_1', 'a_2'], 
                   'column_b': ['b_1', 'b_2'], 
                   'column_x': [['c_1', 'c_2'], ['d_1', 'd_2']]
                  })

shall be transformed from

    column_a    column_b    column_x
0   a_1         b_1         [c_1, c_2]
1   a_2         b_2         [d_1, d_2]

to

    column_a    column_b    column_x
0   a_1         b_1         c_1
1   a_1         b_1         c_2
2   a_2         b_2         d_1
3   a_2         b_2         d_2

The code I have so far does exactly this, and it does it fast.

lens = [len(item) for item in df['column_x']]
pd.DataFrame( {"column_a" : np.repeat(df['column_a'].values, lens), 
               "column_b" : np.repeat(df['column_b'].values, lens), 
               "column_x" : np.concatenate(df['column_x'].values)})

However, I have lots of columns. Is there a neat and elegant solution for repeating the whole data frame without specifying each column again?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Pandas >= 0.25

Pandas can do this in a single function call via df.explode.

df.explode('column_x')

  column_a column_b column_x
0      a_1      b_1      c_1
0      a_1      b_1      c_2
1      a_2      b_2      d_1
1      a_2      b_2      d_2

Note that you can only explode a Series/DataFrame on one column.


Pandas < 0.25

Call np.repeat along the 0th axis for every column besides column_x.

df1 = pd.DataFrame(
    df.drop('column_x', 1).values.repeat(df['column_x'].str.len(), axis=0),
    columns=df.columns.difference(['column_x'])
)
df1['column_x'] = np.concatenate(df['column_x'].values)

df1

  column_a column_b column_x
0      a_1      b_1      c_1
1      a_1      b_1      c_2
2      a_2      b_2      d_1
3      a_2      b_2      d_2

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...