Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python 3.x - KeyError: "None of [Index(['23/01/2020' ......, dtype='object', length=9050)] are in the [columns]"

I am learning pandas and matplotlib on my own, using a public dataset fetched through an API (the URL is in the code below).

I'm using Colab, and this is my code:

import datetime 
import io
import json
import pandas as pd
import requests
import matplotlib.pyplot as plt

confirm_resp = requests.get('https://api.data.gov.hk/v2/filter?q=%7B%22resource%22%3A%22http%3A%2F%2Fwww.chp.gov.hk%2Ffiles%2Fmisc%2Fenhanced_sur_covid_19_eng.csv%22%2C%22section%22%3A1%2C%22format%22%3A%22json%22%7D').content

confirm_df = pd.read_json(io.StringIO(confirm_resp.decode('utf-8')))
confirm_df.columns = confirm_df.columns.str.replace(" ", "_")
pd.to_datetime(confirm_df['Report_date'])
confirm_df.columns = ['Case_no', 'Report_date', 'Onset_date', 'Gender', 'Age', 
'Name_of_hospital_admitted', 'Status', 'Resident', 'Case_classification', 'Confirmed_probable']
confirm_df = confirm_df.drop('Name_of_hospital_admitted', axis = 1)
confirm_df.head()

And this is what the DataFrame looks like:

Case_no  Report_date  Onset_date  Gender  Age  Status      Resident         Case_classification  Confirmed_probable
1        23/01/2020   21/01/2020  M       39   Discharged  Non-HK resident  Imported case        Confirmed
2        23/01/2020   18/01/2020  M       56   Discharged  HK resident      Imported case        Confirmed
3        24/01/2020   20/01/2020  F       62   Discharged  Non-HK resident  Imported case        Confirmed
4        24/01/2020   23/01/2020  F       62   Discharged  Non-HK resident  Imported case        Confirmed
5        24/01/2020   23/01/2020  M       63   Discharged  Non-HK resident  Imported case        Confirmed


1 Reply


First, you need to assign the converted date back to the column:

confirm_df['Report_date'] = pd.to_datetime(confirm_df['Report_date'])
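To see why the assignment matters, here is a small sketch on a toy frame. The name demo_df and the third date are made up for illustration. Passing format='%d/%m/%Y' is my own addition, not part of the fix above: the dates in the question are day-first, and spelling the format out avoids any ambiguity for values such as 03/02/2020.

```python
import pandas as pd

# Toy frame standing in for confirm_df; the third date is hypothetical.
demo_df = pd.DataFrame({"Report_date": ["23/01/2020", "24/01/2020", "03/02/2020"]})

# to_datetime returns a new Series; without assignment the column is unchanged.
pd.to_datetime(demo_df["Report_date"], format="%d/%m/%Y")
still_text = not pd.api.types.is_datetime64_any_dtype(demo_df["Report_date"])

# Assigning the result back replaces the text column with real datetimes.
demo_df["Report_date"] = pd.to_datetime(demo_df["Report_date"], format="%d/%m/%Y")
now_datetime = pd.api.types.is_datetime64_any_dtype(demo_df["Report_date"])

# With the explicit day-first format, 03/02/2020 is read as 3 February.
months = demo_df["Report_date"].dt.month.tolist()
print(still_text, now_datetime, months)
```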

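Second, when plot is called on a DataFrame, pass only the column names as x and y (1). Passing the columns themselves, e.g. confirm_df['Report_date'], makes pandas look up every date string as a column label, which is presumably what raised the KeyError. With column names, the call looks like this: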

confirm_df.plot(x='Report_date', y='Case_classification')
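The KeyError from the title can be reproduced without plotting at all. The sketch below uses a toy frame (demo_df is a made-up name, with rows copied from the preview in the question) and indexes it once by column name and once by the column's values. As far as I can tell, plot performs a similar lookup on whatever is passed as x and y, so treat this as an illustration of the mechanism rather than of pandas internals.

```python
import pandas as pd

# Toy frame standing in for confirm_df (rows taken from the question's preview).
demo_df = pd.DataFrame({
    "Report_date": ["23/01/2020", "23/01/2020", "24/01/2020"],
    "Case_classification": ["Imported case", "Imported case", "Imported case"],
})

# Selecting by column NAME works: the label exists in demo_df.columns.
by_name = demo_df["Report_date"]

# Selecting by the column's VALUES fails: pandas treats each date string
# as a column label, finds none of them, and raises KeyError.
try:
    demo_df[demo_df["Report_date"]]
    error_message = None
except KeyError as exc:
    error_message = str(exc)

print(error_message)
```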

However, the plot call above still raises an error, because 'Case_classification' holds text labels rather than numeric data.

You are trying to plot datetime values against categorical data, so a plain plot won't work. Counting the cases per date and classification first, then drawing a bar chart, could work (2):

# I used only first 15 examples here, full dataset is kinda messy
confirm_df.iloc[:15, :].groupby(['Report_date', 'Case_classification']).size().unstack().plot.bar()
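To make the chained call easier to follow, here is a sketch of the table it builds before the bars are drawn. The rows and the "Local case" label are made up for illustration, and fill_value=0 is my addition so that missing combinations show as 0 instead of NaN.

```python
import pandas as pd

# Toy rows; the dates and the "Local case" label are hypothetical.
demo_df = pd.DataFrame({
    "Report_date": pd.to_datetime(
        ["23/01/2020", "23/01/2020", "24/01/2020", "24/01/2020", "24/01/2020"],
        format="%d/%m/%Y",
    ),
    "Case_classification": [
        "Imported case", "Imported case", "Imported case", "Local case", "Local case",
    ],
})

# size() counts rows per (date, classification) pair;
# unstack() pivots the classifications into columns, one row per date.
counts = (
    demo_df.groupby(["Report_date", "Case_classification"])
    .size()
    .unstack(fill_value=0)
)

print(counts)
# counts.plot.bar() would then draw one group of bars per date (needs matplotlib).
```

Each row of counts is one report date and each column one classification, which is the numeric shape that plot.bar() expects.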


(1) pandas.DataFrame.plot

(2) How to plot categorical variable against a date column in Python

