I am facing an issue with the following dataset:
item price
1 1706
2 210
3 1664
4 103
5 103
6 314
7 1664
8 57
9 140
10 1628
11 688
12 180
13 604
14 86
15 180
16 86
17 1616
18 832
19 1038
20 57
21 2343
22 151
23 328
24 328
25 57
26 86
27 1706
28 604
29 609
30 86
31 0
32 57
33 302
34 328
I want to have a cumulative sum column which "resets" each time it reaches the threshold (read not exceed it, it is fine to have a big gap between the last cumsum number and the threshold as long as it does not exceed it).
I have tried the following code:
threshold = (7.17*1728)*0.75 #this is equal to 9292.32
df['cumsum'] = df.groupby((df['price'].cumsum()) // threshold)['price'].cumsum()
This output the following:
item price cumsum
1 1706 1706
2 210 1916
3 1664 3580
4 103 3683
5 103 3786
6 314 4100
7 1664 5764
8 57 5821
9 140 5961
10 1628 7589
11 688 8277
12 180 8757
13 604 9061
14 86 9147
15 180 9327 #exceeds threshold
16 86 9413 #
17 1616 1616
18 832 2448
19 1038 3486
20 57 3543
21 2343 5886
22 151 6037
23 328 6365
24 328 6693
25 57 6750
26 86 6836
27 1706 8542
28 604 9146
29 609 9755 #exceeds threshold same below
30 86 9841 #
31 0 9841 #
32 57 9898 #
33 302 10200 #
34 328 328
My expected result would be the following instead (for the first part for example):
item price cumsum
1 1706 1706
2 210 1916
3 1664 3580
4 103 3683
5 103 3786
6 314 4100
7 1664 5764
8 57 5821
9 140 5961
10 1628 7589
11 688 8277
12 180 8757
13 604 9061
14 86 9147
15 180 180 #
16 86 266 #
What do I need to change in order to get this result? also i would appreciate any explanation as to why the above code does not work.
Thank you in advance.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…