Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Scraping star ratings with BeautifulSoup

I am trying to scrape star ratings from this website using BeautifulSoup:

https://www.vezeeta.com/ar/%D8%AF%D9%83%D8%AA%D9%88%D8%B1/%D9%83%D9%84-%D8%A7%D9%84%D8%AA%D8%AE%D8%B5%D8%B5%D8%A7%D8%AA/%D9%85%D8%B5%D8%B1

I have tried the following code, but it did not work:

import requests 
from bs4 import BeautifulSoup
import pandas as pd 


def extract(pages):
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0"}
    url = f'https://www.vezeeta.com/ar/%D8%AF%D9%83%D8%AA%D9%88%D8%B1/%D9%83%D9%84-%D8%A7%D9%84%D8%AA%D8%AE%D8%B5%D8%B5%D8%A7%D8%AA/%D9%85%D8%B5%D8%B1?page={pages}'
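    # headers must be passed by keyword; the second positional argument of requests.get is params
    r = requests.get(url, headers=headers)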
    soup = BeautifulSoup(r.content, 'html.parser')
    return soup 

def transform(soup):
    divs = soup.find_all('div', class_ = "cPqVnh")
    for item in divs:
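        # find() returns None when nothing matches, so .text raises AttributeError
        try:
            Doc_Name = item.find('span', class_ = 'kLULsT').text.strip()
        except AttributeError:
            Doc_Name = "Na"

        try:
            Breif_description = item.find('h3', class_ = 'iKpeGL').text.strip()
        except AttributeError:
            Breif_description = 'Na'

        # KeyError covers a matching span that has no data-testid attribute
        try:
            star_rating = item.find("span", class_="kviwPW").attrs["data-testid"]
        except (AttributeError, KeyError):
            star_rating = 'Na'

        # take the text, not the Tag object, so the CSV gets a plain string
        try:
            num_of_raters = item.find('span', class_ = 'RFlHo').text.strip()
        except AttributeError:
            num_of_raters = 'Na'

        try:
            specialization = item.find('a', class_ = 'aEFQT').text.strip()
        except AttributeError:
            specialization = 'Na'

        try:
            address = item.find('span', class_ = 'hcumo').text.strip()
        except AttributeError:
            address = 'Na'

        try:
            price = item.find("span",attrs={"itemprop":"priceRange"}).text.strip()
        except AttributeError:
            price = 'Na'

        # matches a span carrying any one of the listed classes
        try:
            waiting_time = item.find("span", {"class":["jtwJzn", "iaCxfW", "iWZJjx"]}).text.strip()
        except AttributeError:
            waiting_time = 'Na'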
            
    
        
        job = {
            'Doc_Name': Doc_Name,
            'Breif_description': Breif_description,
            'star_rating': star_rating,
            'num_of_raters': num_of_raters,
            "specialization": specialization,
            "address": address,
            "price": price,
            'waiting_time': waiting_time,
        }
        job_list.append(job)
    return 

job_list = []

for i in range(1,3):
    c = extract(i)
    transform(c)
    
df = pd.DataFrame(job_list)
df.to_csv("Vezeeta.csv")
Question from: https://stackoverflow.com/questions/65929172/scraping-star-ratings-with-beautifulsoup
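
The repeated try/except blocks in transform all guard against the same thing: find() returning None when no element matches. A minimal sketch of two helpers that could replace them is below. The names text_or_default and attr_or_default are hypothetical and not part of BeautifulSoup. They assume only that the node is either None or a tag object offering get_text() and get(), as bs4 tags do.

```python
def text_or_default(node, default="Na"):
    """Return the stripped text of a tag, or `default` when find() gave None."""
    if node is None:
        return default
    return node.get_text().strip()


def attr_or_default(node, name, default="Na"):
    """Return attribute `name` of a tag, or `default` when the tag or attribute is missing."""
    if node is None:
        return default
    return node.get(name, default)


# With no matching element, both helpers fall back to the default value.
print(text_or_default(None))
print(attr_or_default(None, "data-testid"))
```

Inside the loop, a lookup would then read, for example, star_rating = attr_or_default(item.find("span", class_="kviwPW"), "data-testid"). This only tidies the error handling. Whether the class names above still match the live page is a separate question that the sketch does not answer.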


