Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
132 views
in Technique[技术] by (71.8m points)

python - How to add dummies to Pandas DataFrame?

I have a data_df that looks like:

   price vehicleType  yearOfRegistration    gearbox  powerPS  model  kilometer fuelType       brand notRepairedDamage  postalCode
0  18300       coupe                2011    manuell      190    NaN     125000   diesel        audi                ja       66954
1   9800         suv                2004  automatik      163  grand     125000   diesel        jeep               NaN       90480
2   1500  kleinwagen                2001    manuell       75   golf     150000   benzin  volkswagen              nein       91074
3   3600  kleinwagen                2008    manuell       69  fabia      90000   diesel       skoda              nein       60437
4    650   limousine                1995    manuell      102    3er     150000   benzin         bmw                ja       33775

Tried to convert classification columns (vehicleType) to dummies ("one hot encoding"):

columns = [ 'vehicleType' ] #, 'gearbox', 'model', 'fuelType', 'brand', 'notRepairedDamage' ]
for column in columns:
  dummies = pd.get_dummies(data_df[column], prefix=column)
  data_df.drop(columns=[column], inplace=True)
  data_df = data_df.add(dummies, axis='columns')

But the original data is missing:

  brand fuelType gearbox  kilometer model notRepairedDamage  ...  vehicleType_coupe  vehicleType_kleinwagen  vehicleType_kombi  vehicleType_limousine  vehicleType_suv  yearOfRegistration
0   NaN      NaN     NaN        NaN   NaN               NaN  ...                NaN                     NaN                NaN                    NaN              NaN                 NaN
1   NaN      NaN     NaN        NaN   NaN               NaN  ...                NaN                     NaN                NaN                    NaN              NaN                 NaN
2   NaN      NaN     NaN        NaN   NaN               NaN  ...                NaN                     NaN                NaN                    NaN              NaN                 NaN
3   NaN      NaN     NaN        NaN   NaN               NaN  ...                NaN                     NaN                NaN                    NaN              NaN                 NaN
4   NaN      NaN     NaN        NaN   NaN               NaN  ...                NaN                     NaN                NaN                    NaN              NaN                 NaN

So, how to replace a given column with the dummies?

question from:https://stackoverflow.com/questions/65600279/create-a-column-from-another-column-information

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)
# Get one hot encoding of columns 'vehicleType'
one_hot = pd.get_dummies(data_df['vehicleType'])
# Drop column as it is now encoded
data_df = data_df.drop('vehicleType',axis = 1)
# Join the encoded df
data_df = data_df.join(one_hot)
data_df 

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...