python - Pandas iterate over DataFrame row pairs

Question

posted Oct 24, 2021 in Technique by 深蓝 (71.8m points)

How can I iterate over pairs of rows of a Pandas DataFrame?

For example:

import pandas as pd

content = [(1, 2, [1, 3]), (3, 4, [2, 4]), (5, 6, [6, 9]), (7, 8, [9, 10])]
df = pd.DataFrame(content, columns=["a", "b", "interval"])
print(df)

output:

   a  b interval
0  1  2   [1, 3]
1  3  4   [2, 4]
2  5  6   [6, 9]
3  7  8  [9, 10]

Now I would like to do something like

for (indx1, row1), (indx2, row2) in df.?:
    print("row1:\n", row1)
    print("row2:\n", row2)
    print("\n")

which should output

row1:
a    1
b    2
interval    [1,3]
Name: 0, dtype: int64
row2:
a    3
b    4
interval    [2,4]
Name: 1, dtype: int64

row1:
a    3
b    4
interval    [2,4]
Name: 1, dtype: int64
row2:
a    5
b    6
interval    [6,9]
Name: 2, dtype: int64

row1:
a    5
b    6
interval    [6,9]
Name: 2, dtype: int64
row2:
a    7
b    8
interval    [9,10]
Name: 3, dtype: int64

Is there a built-in way to achieve this? I looked at df.groupby(df.index // 2) and df.itertuples, but neither seems to do what I want.

Edit: The overall goal is to get a list of bools indicating whether the intervals in column "interval" overlap. In the above example the list would be

overlaps = [True, False, False]

So one bool for each pair.

1 Reply

深蓝 · Answer 1 · 2021-10-23T17:55:41+0000

Shift the DataFrame up by one row and concatenate it back onto the original with axis=1, so that each interval and the next interval end up in the same row:

df_merged = pd.concat([df, df.shift(-1).add_prefix('next_')], axis=1)
df_merged
#Out:
   a  b interval     next_a     next_b    next_interval
0  1  2   [1, 3]        3.0        4.0           [2, 4]
1  3  4   [2, 4]        5.0        6.0           [6, 9]
2  5  6   [6, 9]        7.0        8.0          [9, 10]
3  7  8  [9, 10]        NaN        NaN              NaN

Define an intersects function that works with your list representation of intervals, and apply it to the merged DataFrame, skipping the last row, where next_interval is NaN:

def intersects(left, right):
    # Assumes left starts no later than right; touching endpoints do not count as overlap.
    return left[1] > right[0]

df_merged[:-1].apply(lambda x: intersects(x.interval, x.next_interval), axis=1)
#Out:
0     True
1    False
2    False
dtype: bool
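
If you would rather iterate over the row pairs directly, as the question's loop suggests, one option is to zip iterrows() over the DataFrame with iterrows() over the same DataFrame offset by one row. The following is only a minimal sketch: row_pairs is a hypothetical helper name, not a pandas method, and intersects is the same test as above, so it still assumes the rows are ordered by interval start.

```python
import pandas as pd

content = [(1, 2, [1, 3]), (3, 4, [2, 4]), (5, 6, [6, 9]), (7, 8, [9, 10])]
df = pd.DataFrame(content, columns=["a", "b", "interval"])


def intersects(left, right):
    # Same test as above: assumes left starts no later than right.
    return left[1] > right[0]


def row_pairs(frame):
    """Yield ((idx1, row1), (idx2, row2)) for each pair of consecutive rows.

    Hypothetical helper, not part of pandas.
    """
    # All rows but the last, paired with all rows but the first.
    return zip(frame.iloc[:-1].iterrows(), frame.iloc[1:].iterrows())


# One bool per consecutive pair of rows.
overlaps = [
    intersects(row1["interval"], row2["interval"])
    for (idx1, row1), (idx2, row2) in row_pairs(df)
]
print(overlaps)  # [True, False, False]
```

iterrows() builds a Series for every row, so this is likely to be slower than the shift/concat approach on large DataFrames, but it keeps both the index and the full row available inside the loop.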
