Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Pandas format datetime with many different date types

I am trying to normalize the column 'Data' so that every date follows the same pattern.

The formats I have are:

1/30/20 16:00
1/31/2020 23:59
2020-02-02T23:43:02

Here is the code for the dataframe.

import requests
import pandas as pd
import numpy as np
url = "https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_daily_reports"
csv_only  = [i.split("=")[1][1:-1] for i in requests.get(url).text.split(" ") if '.csv' in i and 'title' in i]

combo = [pd.read_csv(url.replace("github","raw.githubusercontent").replace("/tree/","/")+"/"+f) for f in csv_only]

one_df = pd.concat(combo,ignore_index=True)

one_df["País"] = one_df["Country/Region"].fillna(one_df["Country_Region"])
one_df["Data"] = one_df["Last Update"].fillna(one_df["Last_Update"])

I tried adding the code below, but it doesn't give the result I wanted:

pd.to_datetime(one_df['Data'])
one_df.style.format({"Data": lambda t: t.strftime("%m/%d/%Y")})

Any help?

UPDATE

This is the complete code, but it doesn't work: the except branch prints many values, in several different date formats.

import requests
import pandas as pd
import numpy as np
from datetime import datetime
url = "https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_daily_reports"
csv_only  = [i.split("=")[1][1:-1] for i in requests.get(url).text.split(" ") if '.csv' in i and 'title' in i]

combo = [pd.read_csv(url.replace("github","raw.githubusercontent").replace("/tree/","/")+"/"+f) for f in csv_only]

one_df = pd.concat(combo,ignore_index=True)

df = pd.DataFrame()
DATE_FORMATS = ["%m/%d/%y %H:%M", "%m/%d/%Y %H:%M", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%d %H:%M:%S", "%Y-%m-%d %H:%M:%S", "%Y-%m-%d  %H:%M:%S"]

df["Região"] = one_df["Province/State"].fillna(one_df["Admin2"])
df["País"] = one_df["Country/Region"].fillna(one_df["Country_Region"])
df["Data"] = one_df["Last Update"].fillna(one_df["Last_Update"])
df["Confirmados"] = one_df["Confirmed"]
df["Mortes"] = one_df["Deaths"]
df["Recuperados"] = one_df["Recovered"]

def parse(x_):
    for fmt in DATE_FORMATS :
        try:
            tmp = datetime.strptime(x_, fmt).strftime("%m/%d/%Y")
            return tmp
        except ValueError:
            print(x_)

pd.to_datetime(df['Data'])
df['Data'] = df['Data'].apply(lambda x: parse(x))

#df['Data'].strftime('%m/%d/%Y')
#df['Data'] = df['Data'].map(lambda x: x.strftime('%m/%d/%Y') if x else '')

df.to_excel(r'C:\Users\guilh\Downloads\Covid2\Covid-19.xlsx', index=False,  encoding="utf8")
print(df)

1 Reply

from datetime import datetime
import pandas as pd

You could store all the possible formats in a list:

DATE_FORMATS = ["%Y-%m-%d %H:%M:%S", "%Y-%m-%dT%H:%M:%S", "%m/%d/%y %H:%M", "%m/%d/%Y %H:%M"]

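Define a function that loops through the formats and tries each one until a value parses. (This also fixes a bug in the question's code: printing inside the except block reports a value every time a single format fails, so the failure should be recorded only after the for loop has tried every format.)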

issues = set()
def parse(x_):
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(x_, fmt).strftime("%m/%d/%Y")
        except ValueError:
            pass
    issues.add(x_)


sample = ["1/30/20 16:00", "1/31/2020 23:59", "2020-02-02T23:43:02"]

df = pd.DataFrame({'data': sample})
df['data'] = df['data'].apply(parse)

# Every value should have parsed: no nulls in the column and nothing recorded in `issues`.
assert df['data'].isna().sum() == len(issues) == 0, "Issues observed, nulls observed in dataframe"

print(df)

Output

         data
0  01/30/2020
1  01/31/2020
2  02/02/2020
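
One thing to watch with the real data: datetime.strptime raises TypeError, not ValueError, when it is handed a non-string such as the NaN left by an empty cell, so parse() above would stop on such a row. Below is a minimal sketch of a guarded variant. The name parse_safe, the whitespace stripping, and the extra sample values are assumptions made for illustration, not part of the original answer.

```python
from datetime import datetime

DATE_FORMATS = ["%Y-%m-%d %H:%M:%S", "%Y-%m-%dT%H:%M:%S", "%m/%d/%y %H:%M", "%m/%d/%Y %H:%M"]
issues = set()  # raw strings that matched none of the formats


def parse_safe(value):
    """Return the date as MM/DD/YYYY, or None if the value cannot be parsed."""
    if not isinstance(value, str):
        # e.g. NaN from a missing cell; strptime would raise TypeError here
        return None
    text = value.strip()  # assumption: surrounding whitespace is not meaningful
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%m/%d/%Y")
        except ValueError:
            continue  # this format did not match, try the next one
    issues.add(value)  # reached only after every format has failed
    return None


# Hypothetical inputs: two known formats, a missing value, an unknown format.
values = ["1/30/20 16:00", " 2020-02-02T23:43:02 ", float("nan"), "30.01.2020"]
results = [parse_safe(v) for v in values]
print(results)
print(issues)
```

With pandas it would be used the same way as before, for example df['data'].apply(parse_safe); whether missing values should be skipped silently or also recorded in issues is a choice that depends on the data.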

If apply() comes across a date format that hasn't been defined in the list, parse() falls through the loop without returning anything, so that row ends up as None in the column and the raw string is added to issues.
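
As an alternative sketch (not what the answer above does), the same idea can stay inside pandas: call pd.to_datetime once per format with errors="coerce", so rows that do not match become NaT, then fill those gaps from the next format. Note that pd.to_datetime returns a new Series instead of changing the column in place, so its result has to be assigned. The helper name parse_mixed and the "not a date" sample row are made up for illustration.

```python
import pandas as pd

# The formats from the question; order only matters if two formats could match the same string.
DATE_FORMATS = ["%m/%d/%y %H:%M", "%m/%d/%Y %H:%M", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%d %H:%M:%S"]


def parse_mixed(series):
    """Parse a column of date strings, trying each format on the rows still unparsed."""
    result = pd.to_datetime(series, format=DATE_FORMATS[0], errors="coerce")
    for fmt in DATE_FORMATS[1:]:
        # Rows already parsed are kept; NaT rows are filled from this format's attempt.
        result = result.fillna(pd.to_datetime(series, format=fmt, errors="coerce"))
    return result


sample = pd.Series(["1/30/20 16:00", "1/31/2020 23:59", "2020-02-02T23:43:02", "not a date"])

dates = parse_mixed(sample)                 # datetime column, NaT where nothing matched
formatted = dates.dt.strftime("%m/%d/%Y")   # strings in the target pattern
unparsed = sample[dates.isna()]             # original values that need a new format

print(formatted)
print(unparsed)
```

Keeping the parsed column as real datetimes and formatting only at the end means the values can still be sorted or filtered by date before they are written out.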

