Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
224 views
in Technique[技术] by (71.8m points)

python - Get index of a row of a pandas dataframe as an integer

Assume an easy dataframe, for example

    A         B
0   1  0.810743
1   2  0.595866
2   3  0.154888
3   4  0.472721
4   5  0.894525
5   6  0.978174
6   7  0.859449
7   8  0.541247
8   9  0.232302
9  10  0.276566

How can I retrieve an index value of a row, given a condition? For example: dfb = df[df['A']==5].index.values.astype(int) returns [4], but what I would like to get is just 4. This is causing me troubles later in the code.

Based on some conditions, I want to have a record of the indexes where that condition is fulfilled, and then select rows between.

I tried

dfb = df[df['A']==5].index.values.astype(int)
dfbb = df[df['A']==8].index.values.astype(int)
df.loc[dfb:dfbb,'B']

for a desired output

    A         B
4   5  0.894525
5   6  0.978174
6   7  0.859449

but I get TypeError: '[4]' is an invalid key

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The easier is add [0] - select first value of list with one element:

dfb = df[df['A']==5].index.values.astype(int)[0]
dfbb = df[df['A']==8].index.values.astype(int)[0]

dfb = int(df[df['A']==5].index[0])
dfbb = int(df[df['A']==8].index[0])

But if possible some values not match, error is raised, because first value not exist.

Solution is use next with iter for get default parameetr if values not matched:

dfb = next(iter(df[df['A']==5].index), 'no match')
print (dfb)
4

dfb = next(iter(df[df['A']==50].index), 'no match')
print (dfb)
no match

Then it seems need substract 1:

print (df.loc[dfb:dfbb-1,'B'])
4    0.894525
5    0.978174
6    0.859449
Name: B, dtype: float64

Another solution with boolean indexing or query:

print (df[(df['A'] >= 5) & (df['A'] < 8)])
   A         B
4  5  0.894525
5  6  0.978174
6  7  0.859449

print (df.loc[(df['A'] >= 5) & (df['A'] < 8), 'B'])
4    0.894525
5    0.978174
6    0.859449
Name: B, dtype: float64

print (df.query('A >= 5 and A < 8'))
   A         B
4  5  0.894525
5  6  0.978174
6  7  0.859449

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...