Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - replicating data in same dataFrame

I want to replicate rows within the same dataframe when a certain condition is fulfilled. The dataframe:

    Hour,Wage
    1,15
    2,17
    4,20
    10,25
    15,26
    16,30
    17,40
    19,15

I want to loop through the dataframe and replicate rows wherever the difference between consecutive row.hour values is greater than 4.

Expected Output:

    Hour,Wage
    1,15
    2,17
    4,20
    10,25
    15,26
    16,30
    17,40
    19,15
    2,17
    4,20

I want to replicate rows while iterating through all the rows, whenever the difference between consecutive row.hour values is greater than 4. For example, row.hour[0] = 1 and row.hour[1] = 2, so the difference is 1, but row.hour[2] = 4 and row.hour[3] = 10, so the difference is 6, which is greater than 4. I want to replicate the data above the index where this condition (greater than 4) is fulfilled.

I can replicate the whole dataframe with **df = pd.concat([df]*2, ignore_index=False)**, but nothing is replicated when I run it inside an if statement.

I tried the code below but nothing is happening.

    for i in range(0, len(df) - 1):
        if (df.iloc[i, 0] - df.iloc[i + 1, 0]) > 4:
            df = pd.concat([df] * 2, ignore_index=False)
Question from: https://stackoverflow.com/questions/65646732/replicating-data-in-same-dataframe


1 Reply


My understanding is: you want to compare the 'Hour' values of two successive rows. If the difference is > 4, you want to add the row just before that pair to the DF again. If that is what you want, try this:

Create a DF:

    import pandas as pd

    j = pd.DataFrame({'Hour': [1, 2, 4, 10, 15, 16, 17, 19],
                      'Wage': [15, 17, 20, 25, 26, 30, 40, 15]})
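
Before writing any function, it may help to look at the gaps between successive 'Hour' values directly. The sketch below rebuilds the same frame and uses Series.diff() for that. The names gaps and question_style are only illustrative. It also suggests why the loop in the question never fires: it subtracts the next hour from the current one, which is negative whenever the hours are ascending, as they are in this sample.

```python
import pandas as pd

j = pd.DataFrame({'Hour': [1, 2, 4, 10, 15, 16, 17, 19],
                  'Wage': [15, 17, 20, 25, 26, 30, 40, 15]})

# Gap between each row and the row before it (NaN for the first row).
gaps = j['Hour'].diff()

# Same subtraction order as the loop in the question: current minus next.
# For ascending hours this is never positive, so "> 4" is never true.
question_style = j['Hour'] - j['Hour'].shift(-1)

print(gaps.tolist())
print((gaps > 4).tolist())
print((question_style > 4).any())
```

With this sample, the only gaps above 4 are 4 to 10 and 10 to 15.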

Define a function:

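    def f1(d):
        dn = d.copy()
        # Stop two rows early: each step looks at rows x+1 and x+2.
        for x in range(len(d) - 2):
            # Is there a gap of more than 4 hours between the next two rows?
            if abs(d.iloc[x + 1].Hour - d.iloc[x + 2].Hour) > 4:
                # A fractional label places the copy of row x right after row x.
                idx = x + 0.5
                dn.loc[idx] = d.iloc[x]['Hour'], d.iloc[x]['Wage']

        # Sort the fractional labels into position, then renumber from 0.
        dn = dn.sort_index().reset_index(drop=True)
        return dn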

Call the function passing your DF:

    nd = f1(j)


       Hour  Wage
    0     1    15
    1     2    17
    2     2    17
    3     4    20
    4     4    20
    5    10    25
    6    15    26
    7    16    30
    8    17    40
    9    19    15
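
If you would rather avoid the explicit loop, the same selection can probably be expressed with diff() and shift(). This is only a sketch under the same reading of the question as f1 above. The names gap, mask, in_place and appended are made up for illustration, and it assumes the default 0..n-1 index. The appended variant puts the copied rows at the end, which is the layout shown in the question's expected output for this sample.

```python
import pandas as pd

j = pd.DataFrame({'Hour': [1, 2, 4, 10, 15, 16, 17, 19],
                  'Wage': [15, 17, 20, 25, 26, 30, 40, 15]})

# Absolute gap between each row and the row before it.
gap = j['Hour'].diff().abs()

# f1 copies row x when the gap between rows x+1 and x+2 exceeds 4,
# so shift that gap up two positions to line it up with row x.
# The NaN values at the end compare as False.
mask = gap.shift(-2) > 4

# Same layout as f1: every selected row appears twice, in place.
repeats = 1 + mask.to_numpy().astype(int)
in_place = j.loc[j.index.repeat(repeats)].reset_index(drop=True)

# Layout from the question: selected rows appended after the original rows.
appended = pd.concat([j, j[mask]], ignore_index=True)

print(in_place)
print(appended)
```

Both results are built from the original frame in one step, so nothing is modified while it is being scanned.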
