
Why does Scrapy only ever scrape the first page, and how should the code be changed?

The spider always stops after scraping the first page of data. Is there something wrong with my code?
Spider code

import scrapy
import json
from douyu.items import DouyuItem


class MeinvSpider(scrapy.Spider):
    name = 'meinv'
    allowed_domains = ['capi.douyucdn.cn']
    
    offset = 0
    url = "http://capi.douyucdn.cn/api/v1/getVerticalRoom?limit=20&offset="
    start_urls = [url + str(offset)]

    def parse(self, response):
        res = json.loads(response.text)
        for each in res['data']:
            item = DouyuItem()
            item["nickname"] = each["nickname"]
            item["imagelink"] = each["vertical_src"] 
            yield item
            
        self.offset += 20
        yield scrapy.Request(self.url+str(self.offset),callable=self.parse)

Image pipeline code

# Load the project settings
import scrapy
from scrapy.utils.project import get_project_settings
from scrapy.pipelines.images import ImagesPipeline
import os
from scrapy.exceptions import DropItem

class DouyuPipeline(ImagesPipeline):

    # Read the value configured in the settings file
    IMAGES_STORE = get_project_settings().get("IMAGES_STORE")

    # Get the image link and send the download request
    def get_media_requests(self,item,info):
        image_url = item["imagelink"]
        yield scrapy.Request(image_url,meta={
            "item":item
        })
    
    # Process the downloaded image
    def item_completed(self,results,item,info):
        # ok indicates whether the download succeeded
        image_paths = [x["path"] for ok, x in results if ok]
        
        if not image_paths:
            raise DropItem("Item contains no images")
        
        #os.rename(self.IMAGES_STORE + image_path[0], self.IMAGES_STORE + item["nickname"] + ",jpg")
        item["imagePath"] = image_paths[0]

        return item
    
Answer
茍活

The keyword argument passed to Request is misspelled: it should be callback, not callable.
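
To make the fix concrete, here is a minimal sketch of the pagination step. It uses a hypothetical stand-in class, FakeRequest, in place of scrapy.Request so that it runs without Scrapy installed; only the url and callback parameters are modeled. The stop condition (an empty "data" list means no more pages) and the example.com image URL are assumptions for illustration, not something the original post states. In the real spider the last line of parse would read `yield scrapy.Request(self.url + str(self.offset), callback=self.parse)`.

```python
import json
from dataclasses import dataclass
from typing import Callable, Iterator, Optional, Union

BASE_URL = "http://capi.douyucdn.cn/api/v1/getVerticalRoom?limit=20&offset="
PAGE_SIZE = 20


@dataclass
class FakeRequest:
    """Hypothetical stand-in for scrapy.Request (url and callback only)."""
    url: str
    callback: Optional[Callable] = None


def parse(body: str, offset: int) -> Iterator[Union[dict, FakeRequest]]:
    """Yield one item per room, then a request for the next page."""
    rooms = json.loads(body)["data"]
    if not rooms:
        # Assumption: an empty "data" list means there are no more pages.
        return
    for each in rooms:
        yield {"nickname": each["nickname"], "imagelink": each["vertical_src"]}
    # The keyword must be `callback`; an unknown keyword such as
    # `callable` makes the constructor raise TypeError.
    yield FakeRequest(BASE_URL + str(offset + PAGE_SIZE), callback=parse)


# Sample payload with made-up values, shaped like the fields the spider reads.
page = json.dumps(
    {"data": [{"nickname": "a", "vertical_src": "http://example.com/a.jpg"}]}
)
results = list(parse(page, offset=0))
print(results[-1].url)  # the next-page request, with offset advanced by 20

# The misspelled keyword fails at construction time.
try:
    FakeRequest(BASE_URL + "20", callable=parse)
except TypeError as exc:
    print("TypeError:", exc)
```

With the original `callable=` spelling, the items from the first page are yielded normally and the error only occurs when the generator reaches the Request line, which would explain why the crawl stops after page one.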

March 4, 2018, 22:01