hyperlink - Scrapy: Follow link to get additional Item data?

Question

Welcome To Ask or Share your Answers For Others

hyperlink - Scrapy: Follow link to get additional Item data?

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

hyperlink - Scrapy: Follow link to get additional Item data?

I don't have a specific code issue I'm just not sure how to approach the following problem logistically with the Scrapy framework:

The structure of the data I want to scrape is typically a table row for each item. Straightforward enough, right?

Ultimately I want to scrape the Title, Due Date, and Details for each row. Title and Due Date are immediately available on the page...

BUT the Details themselves aren't in the table -- but rather, a link to the page containing the details (if that doesn't make sense here's a table):

|-------------------------------------------------|
|             Title              |    Due Date    |
|-------------------------------------------------|
| Job Title (Clickable Link)     |    1/1/2012    |
| Other Job (Link)               |    3/2/2012    |
|--------------------------------|----------------|

I'm afraid I still don't know how to logistically pass the item around with callbacks and requests, even after reading through the CrawlSpider section of the Scrapy documentation.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-17T00:08:53+0000

Please, first read the docs to understand what i say.

The answer:

To scrape additional fields which are on other pages, in a parse method extract URL of the page with additional info, create and return from that parse method a Request object with that URL and pass already extracted data via its meta parameter.

how do i merge results from target page to current page in scrapy?

Categories

hyperlink - Scrapy: Follow link to get additional Item data?

hyperlink - Scrapy: Follow link to get additional Item data?

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags