Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Python Pandas - Dataset with many columns - want to iterate over each column, add row values to new list only from fields that are not null

I am inheriting a dataset of website logs that adds a new pair of columns for every page visited. For example, if someone went to 2 pages on our website we'd have something like: visit_id, url_1, visit_datetime_1, url_2, visit_datetime_2. The problem is that some people visit just one page, and some visit 14. I want to simplify this. See below for my current format and desired output. I guess I just don't understand how to go through each column when the number of fields is not always consistent (but the column names WILL be consistent: visit_id is a unique identifier, then url_x and visit_datetime_x for each page). I'm stumped.

Just to be clear below, visit_id 1000 visited 3 pages, 2000 visited 1 page, and 3000 visited 2 pages.

[Image: current format and desired output]

I've just never done anything like this before in Pandas and I'm just at a roadblock. I've gotten this far, which isn't far, but at least shows I'm trying. All help is appreciated.


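import pandas as pd

visit_ids = []
urls = []
visit_datetimes = []

# read_excel already returns a DataFrame, so no pd.DataFrame() wrapper is needed
df = pd.read_excel('data.xlsx', engine='openpyxl')

# items() yields (column name, Series) pairs; iteritems() was removed in pandas 2.0
for colname, values in df.items():
    pass  # do something to add to list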

Question from: https://stackoverflow.com/questions/66045484/python-pandas-dataset-with-many-columns-want-to-iterate-over-each-column-ad


1 Reply


You can split each column name on its last underscore to get a MultiIndex of (field name, page number), then reshape with DataFrame.stack:

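import pandas as pd

df = pd.read_excel('data.xlsx', engine='openpyxl')

# Move visit_id into the index so it is left out of the reshape
df1 = df.set_index('visit_id')
# Split on the last underscore only: 'visit_datetime_2' -> ('visit_datetime', '2')
df1.columns = df1.columns.str.rsplit('_', n=1, expand=True)

# Move the page-number level into the rows: one row per visit_id and page
df1 = df1.stack().reset_index()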
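
To make the reshape concrete, here is a minimal sketch on a small in-memory frame instead of data.xlsx. The URLs and timestamps are made up for illustration; only the column naming pattern and the page counts (1000 visited 3 pages, 2000 visited 1, 3000 visited 2) come from the question. The explicit dropna, the 'page' column name, and the final sort are assumptions added here, not part of the answer above: older pandas versions drop all-null rows inside stack() by default, while newer ones may keep them, so dropping them explicitly should give the same result either way.

```python
import pandas as pd

# Made-up sample mirroring the question: 1000 saw 3 pages, 2000 saw 1, 3000 saw 2
df = pd.DataFrame({
    'visit_id': [1000, 2000, 3000],
    'url_1': ['/home', '/home', '/pricing'],
    'visit_datetime_1': ['2021-02-01 09:00', '2021-02-01 10:00', '2021-02-02 08:30'],
    'url_2': ['/pricing', None, '/contact'],
    'visit_datetime_2': ['2021-02-01 09:05', None, '2021-02-02 08:45'],
    'url_3': ['/contact', None, None],
    'visit_datetime_3': ['2021-02-01 09:10', None, None],
})

# Same two steps as the answer: index by visit_id, split names on the last '_'
df1 = df.set_index('visit_id')
df1.columns = df1.columns.str.rsplit('_', n=1, expand=True)

long_df = (
    df1.stack()                               # page number moves from columns to rows
       .dropna(how='all')                     # drop pages that were never visited
       .reset_index()
       .rename(columns={'level_1': 'page'})   # 'page' is a name chosen for this sketch
)

# The page numbers come out of the split as strings; convert them so that
# page 10 sorts after page 2 when a visit has many pages
long_df['page'] = long_df['page'].astype(int)
long_df = long_df.sort_values(['visit_id', 'page'], ignore_index=True)

print(long_df)
```

If plain lists are still needed, as in the question, long_df['visit_id'].tolist(), long_df['url'].tolist() and long_df['visit_datetime'].tolist() give them directly from the reshaped frame.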
