I collect a URL from a Python command and then insert it into start_urls:
from flask import Flask, jsonify, request
import scrapy
import subprocess

class ClassSpider(scrapy.Spider):
    name = 'mySpider'
    #start_urls = []
    #pages = 0
    news = []

    def __init__(self, url, nbrPage):
        self.pages = nbrPage
        self.start_urls = []
        self.start_urls.append(url)

    def parse(self, response):
        ...

    def run(self):
        subprocess.check_output(['scrapy', 'crawl', 'mySpider', '-a', f'url={self.start_urls}', '-a', f'nbrPage={self.pages}'])
        return self.news

app = Flask(__name__)
data = []

@app.route('/', methods=['POST'])
def getNews():
    mySpiderClass = ClassSpider(request.json['url'], 2)
    return jsonify({'data': mySpiderClass.run()})

if __name__ == "__main__":
    app.run(debug=True)
I got this error:
    raise NotSupported("Unsupported URL scheme '%s': %s" %
scrapy.exceptions.NotSupported: Unsupported URL scheme '': no handler available for that scheme
When I add
    print('my urls List: ' + str(self.start_urls))
it prints a list of URLs, for example: my urls List: ['www.googole.com']
Any help would be appreciated.
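For reference, here is a minimal sketch of what the error message seems to point at, using only the standard library. A URL written without http:// or https:// parses with an empty scheme, which matches the '' in the message. The helper name ensure_scheme and the default of http are assumptions made for illustration and are not part of Scrapy. The sketch also shows that formatting the whole list into the -a argument passes the text of the list, brackets and quotes included, rather than the URL itself.

```python
from urllib.parse import urlparse


def ensure_scheme(url, default="http"):
    """Return url unchanged if it has a scheme, else prepend default://.

    Hypothetical helper for illustration; not part of Scrapy.
    """
    url = url.strip()
    if urlparse(url).scheme:
        return url
    return f"{default}://{url}"


raw = "www.googole.com"  # the value printed in the question

# Without a scheme, urlparse reports an empty string,
# which matches the '' shown in the error message.
print(repr(urlparse(raw).scheme))

# After normalization the scheme is present.
fixed = ensure_scheme(raw)
print(fixed, urlparse(fixed).scheme)

# Formatting the list itself embeds its repr, brackets and quotes included.
start_urls = [raw]
print(f"url={start_urls}")

# Passing a single normalized element gives a plain URL argument instead.
print(f"url={ensure_scheme(start_urls[0])}")
```

In the spider, the same normalization could probably be applied in __init__ before the URL is appended to start_urls, and the subprocess call could pass a single URL rather than the list.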