A preliminary study of Scrapy downloader middleware
2022-07-03 22:42:00 【Keep a low profile】
A first look at downloader middleware. It is fairly complicated,
mainly because of the ways it can change requests and responses. Without any interception it is much simpler.
Enable it in settings.py:
DOWNLOADER_MIDDLEWARES = {
    'test_middle_demo.middlewares.TestMiddleDemoDownloaderMiddleware': 543,
}
from scrapy import signals

class TestMiddleDemoDownloaderMiddleware:
    @classmethod
    def from_crawler(cls, crawler):
        # This method is used by Scrapy to create your spiders.
        s = cls()
        crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
        return s
First, from_crawler connects the spider_opened signal to the spider_opened method below, so the two work together:
    def spider_opened(self, spider):
        spider.logger.info('Spider opened: %s' % spider.name)
        print('1. The crawler is running')
    def process_request(self, request, spider):
        # Called for each request that goes through the downloader
        # middleware.
        # Must either:
        # - return None: continue processing this request
        # - or return a Response object
        # - or return a Request object
        # - or raise IgnoreRequest: process_exception() methods of
        #   installed downloader middleware will be called
        print('2. Request arrived:', request.url, request.headers)
        return None
    # return None: no interception; the request continues on to the next
    #   middleware and finally to the downloader.
    # return Response: the response is returned directly; the remaining
    #   middleware and the downloader are not executed for this request.
    # return Request: the request object goes back to the engine, the engine
    #   hands it to the scheduler, and the normal flow continues from there.
    def process_response(self, request, response, spider):
        # Called with the response returned from the downloader.
        # Must either:
        # - return a Response object  # pass the response up, toward the engine
        # - return a Request object   # hand the request to the engine, then the scheduler
        # - or raise IgnoreRequest
        print('3. Response arrived:', response.status, response.headers)
        return response
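The return-value rules for process_request described above can be illustrated with a small pure-Python model. This is only a sketch, not Scrapy's implementation: the `Request` and `Response` dataclasses, the three middleware classes, `fake_download` and `run_chain` are all hypothetical stand-ins invented for this example, and the real hooks also receive a `spider` argument.

```python
# Simplified model of how a middleware chain reacts to the value returned
# by process_request. NOT Scrapy's code: every name here is a stand-in.
from dataclasses import dataclass


@dataclass
class Request:
    url: str


@dataclass
class Response:
    url: str
    body: str


class PassThrough:
    """Returns None, so the request continues down the chain."""
    def process_request(self, request):
        return None


class CacheHit:
    """Returns a Response, so the downloader is never reached."""
    def process_request(self, request):
        return Response(request.url, "cached body")


class Redirect:
    """Returns a new Request, which would go back to the scheduler."""
    def process_request(self, request):
        if request.url.endswith("/old"):
            return Request(request.url[:-len("/old")] + "/new")
        return None


def fake_download(request):
    # Stand-in for the real downloader.
    return Response(request.url, "downloaded body")


def run_chain(middlewares, request):
    """Return ("response", Response) or ("reschedule", Request)."""
    for mw in middlewares:
        result = mw.process_request(request)
        if result is None:
            continue  # no interception: try the next middleware
        if isinstance(result, Response):
            return "response", result  # short-circuit: skip the downloader
        if isinstance(result, Request):
            return "reschedule", result  # hand the new request back
        raise TypeError("process_request must return None, Response or Request")
    return "response", fake_download(request)


print(run_chain([PassThrough()], Request("https://example.com/a")))
print(run_chain([PassThrough(), CacheHit()], Request("https://example.com/a")))
print(run_chain([Redirect()], Request("https://example.com/old")))
```

The first call reaches the fake downloader, the second is answered by `CacheHit` without downloading, and the third yields a new request that a real engine would pass back to the scheduler.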
import scrapy
from bs4 import BeautifulSoup

class TestMSpider(scrapy.Spider):
    name = 'test_m'
    allowed_domains = ['baidu.com']
    start_urls = ['https://www.baidu.com/']

    def parse(self, response, **kwargs):
        print('4. The response finally reaches the spider; parse the page here')
        soup = BeautifulSoup(response.text, 'lxml')
        title = soup.find('title').text
        print(title)
Run the spider and the numbered messages above are printed as each stage is reached.
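For example, suppose several downloader middlewares are enabled, as in the following code.
Key point: the numbers 100 and 200 are each middleware's distance from the engine. The smaller the number, the closer the middleware sits to the engine.
Requests and responses move along this chain in a straight line: a request travels outward from the engine through 100 and then 200 to the downloader, and the response comes back through 200 and then 100.
So the prints appear in the order 1, 3, 4, 2.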
DOWNLOADER_MIDDLEWARES = {
    'test_middle_demo.middlewares.TestMiddleDemoDownloaderMiddleware_01': 100,
    'test_middle_demo.middlewares.TestMiddleDemoDownloaderMiddleware_02': 200,
}
class TestMiddleDemoDownloaderMiddleware_01:
    def process_request(self, request, spider):
        print(1)
        return None

    def process_response(self, request, response, spider):
        print(2)
        return response

class TestMiddleDemoDownloaderMiddleware_02:
    def process_request(self, request, spider):
        print(3)
        return None

    def process_response(self, request, response, spider):
        print(4)
        return response
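The 1, 3, 4, 2 order can be reproduced without Scrapy by a toy model of the chain. This is a sketch of the ordering rule only, under the assumption that process_request hooks run in increasing priority order and process_response hooks in decreasing order. The `Middleware01`/`Middleware02` classes, the `priorities` dict and `run` are hypothetical names for this example, and the hooks omit the `spider` argument for brevity.

```python
# Toy model of middleware ordering only; not Scrapy's code.
calls = []


class Middleware01:
    def process_request(self, request):
        calls.append(1)

    def process_response(self, request, response):
        calls.append(2)
        return response


class Middleware02:
    def process_request(self, request):
        calls.append(3)

    def process_response(self, request, response):
        calls.append(4)
        return response


# Same shape as DOWNLOADER_MIDDLEWARES, but keyed by object instead of path.
# The insertion order is deliberately scrambled: only the numbers matter.
priorities = {Middleware02(): 200, Middleware01(): 100}


def run(priorities, request):
    # Sort by priority: the smallest number sits closest to the engine.
    chain = sorted(priorities, key=priorities.get)
    for mw in chain:                      # outward: 100, then 200
        mw.process_request(request)
    response = f"response for {request}"  # stand-in for the real download
    for mw in reversed(chain):            # back inward: 200, then 100
        response = mw.process_response(request, response)
    return response


run(priorities, "https://www.baidu.com/")
print(calls)  # [1, 3, 4, 2]
```

Swapping the two priority numbers in `priorities` would make the model record 3, 1, 2, 4 instead, which is a quick way to check one's understanding of the rule.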