Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


css - extracting data using bs4 from columns with same td class name

I'm trying to scrape a table from this URL: https://finance.yahoo.com/quote/AAPL/history?p=AAPL. This is my code:

import pandas as pd
import numpy as np
import requests
from bs4 import BeautifulSoup
from random import randint
url=('https://finance.yahoo.com/quote/AAPL/history?p=AAPL')
r=requests.get(url)
r
soup=BeautifulSoup(r.text, 'html.parser')
date=[]
t=soup.find_all(class_="W(100%) M(0)")
for i in t:
    you=i.find_all('td',class_='Py(10px) Ta(start) Pend(10px)')

I have no problem getting the date column. But when I run the code below for the second column, it returns the data for all six remaining columns:

for i in t:
    u=i.find_all(class_='Py(10px) Pstart(10px)')
    for k in u:
        print(k.text)

I want to get each column individually, one at a time: open, high, low, close, etc. How can I accomplish this using bs4?



1 Reply


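The remaining six cells in each row share the same class, so a class filter cannot tell them apart. Instead, you could loop over the table rows and pick each cell by its position, e.g. select_one('td:nth-of-type(1)') for the first cell, select_one('td:nth-of-type(2)') for the second, and so on.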
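
To see the difference in isolation, here is a minimal sketch that runs on a small hand-written HTML snippet instead of the live page. The markup is an assumption that only mimics the layout described in the question (one date cell with its own class, then numeric cells sharing one class); the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Made-up markup (an assumption, not copied from the real page): one date
# cell with its own class, then numeric cells that all share one class.
html = """
<table>
  <tr>
    <td class="Py(10px) Ta(start) Pend(10px)">Dec 31, 2020</td>
    <td class="Py(10px) Pstart(10px)">134.08</td>
    <td class="Py(10px) Pstart(10px)">134.74</td>
    <td class="Py(10px) Pstart(10px)">131.72</td>
  </tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")
row = soup.find("tr")

# Filtering by class returns every cell that carries that class string.
by_class = [td.text for td in row.find_all(class_="Py(10px) Pstart(10px)")]

# Filtering by position returns exactly one cell: the second <td> in the row.
second = row.select_one("td:nth-of-type(2)").text

print(by_class)
print(second)
```

The class filter hands back all three numeric cells at once, while the positional selector returns only the one you asked for.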

Example

import requests
from bs4 import BeautifulSoup

url = 'https://finance.yahoo.com/quote/AAPL/history?p=AAPL'
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')

# Find the table rows by their class string.
t = soup.find_all(class_="BdT Bdc($seperatorColor) Ta(end) Fz(s) Whs(nw)")
for i in t:
    # Skip rows that do not have a third cell.
    if i.select_one('td:nth-of-type(3)'):
        # Pick each cell by its position within the row.
        date = i.select_one('td:nth-of-type(1)').text
        start = i.select_one('td:nth-of-type(2)').text
        high = i.select_one('td:nth-of-type(3)').text
        low = i.select_one('td:nth-of-type(4)').text
        close = i.select_one('td:nth-of-type(5)').text
        adjClose = i.select_one('td:nth-of-type(6)').text
        volume = i.select_one('td:nth-of-type(7)').text
        print(date, start, high, low, close, adjClose, volume)

Output

Dec 31, 2020 134.08 134.74 131.72 132.69 132.69 98,990,400
Dec 30, 2020 135.58 135.99 133.40 133.72 133.72 96,452,100
Dec 29, 2020 138.05 138.79 134.34 134.87 134.87 121,047,300
Dec 28, 2020 133.99 137.34 133.51 136.69 136.69 124,486,200
Dec 24, 2020 131.32 133.46 131.10 131.97 131.97 54,930,100
Dec 23, 2020 132.16 132.43 130.78 130.96 130.96 88,223,700
Dec 22, 2020 131.61 134.41 129.65 131.88 131.88 168,904,800
Dec 21, 2020 125.02 128.31 123.45 128.23 128.23 121,251,600
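
If you want each column as its own list rather than printed row by row, one option is to collect the cells of every full row and then transpose. The sketch below is only an illustration under a few assumptions: it works on inline sample HTML (two rows taken from the output above plus a placeholder short row) rather than the live page, the helper `columns_from_rows` and the names in `COLUMNS` are hypothetical, and it takes cells by position with `find_all('td')` instead of `nth-of-type`.

```python
from bs4 import BeautifulSoup

# Hypothetical column names, in the order the cells appear in a row.
COLUMNS = ["date", "open", "high", "low", "close", "adj_close", "volume"]


def columns_from_rows(html):
    """Return {column name: list of cell texts} using only 7-cell rows."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td", recursive=False)]
        # Keep full rows only; shorter rows are skipped.
        if len(cells) == len(COLUMNS):
            rows.append(cells)
    # zip(*rows) transposes the list of rows into one tuple per column.
    return {name: list(col) for name, col in zip(COLUMNS, zip(*rows))}


# Sample markup: the first two rows reuse values from the output above,
# the last row is a made-up short row that should be ignored.
sample = """
<table>
  <tr><td>Dec 31, 2020</td><td>134.08</td><td>134.74</td><td>131.72</td><td>132.69</td><td>132.69</td><td>98,990,400</td></tr>
  <tr><td>Dec 30, 2020</td><td>135.58</td><td>135.99</td><td>133.40</td><td>133.72</td><td>133.72</td><td>96,452,100</td></tr>
  <tr><td>placeholder</td><td>short row</td></tr>
</table>
"""

columns = columns_from_rows(sample)
print(columns["open"])
print(columns["volume"])
```

Each entry of `columns` can then be used on its own, or the whole dict can be passed to `pd.DataFrame(columns)` if you prefer a table. Note that the values are still strings, so the numbers would need converting (and the thousands separators in the volume removed) before doing any arithmetic.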
