Why automatic IP allocation fails with a crawler proxy
2022-06-24 06:21:00 【User 6172015】
Recently, a friend ran into a problem while using a crawler proxy: after sending requests through the proxy, each HTTP request was not automatically assigned a different proxy IP. Instead, all requests kept using the same proxy IP for a fixed 20 seconds before switching to a new one. What causes this? Part of the code the friend provided is shown below:
# -*- coding: utf-8 -*-
import random
import time

import requests

# Target page to visit
targetUrl = "http://httpbin.org/ip"
# HTTPS target page to visit
# targetUrl = "https://httpbin.org/ip"

# Proxy server (product website: www.16yun.cn)
proxyHost = "t.16yun.cn"
proxyPort = "31111"

# Proxy authentication credentials
proxyUser = "username"
proxyPass = "password"

proxyMeta = "http://%(user)s:%(pass)s@%(host)s:%(port)s" % {
    "host": proxyHost,
    "port": proxyPort,
    "user": proxyUser,
    "pass": proxyPass,
}

# Route both http and https requests through the HTTP proxy
proxies = {
    "http": proxyMeta,
    "https": proxyMeta,
}

# Set the IP-switching header (generated only once, before the loop)
tunnel = random.randint(1, 10000)
headers = {
    'Connection': 'keep-alive',
    'Accept-Language': 'zh',
    "Proxy-Tunnel": str(tunnel)
}

for i in range(100):
    resp = requests.get(targetUrl, proxies=proxies, headers=headers)
    print(resp.status_code)
    print(resp.text)
    time.sleep(0.2)

After debugging and analysis, the code above turned out to have two main problems:
1. 'Connection': 'keep-alive' needs to be turned off
keep-alive is a convention between the client and the server. When keep-alive is enabled, the server does not close the TCP connection after returning a response, and the client does not close it after receiving the response either; the same connection is reused for the next HTTP request. This keeps the TCP connection open, so the crawler proxy's automatic IP switching fails. As a result, one proxy IP stays in use until its 20-second validity period expires, at which point the TCP connection is closed and a new proxy IP is assigned.
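The first fix can be sketched as a small change to the request headers. This is a minimal sketch, assuming the proxy behaves as described above and releases the IP once the TCP connection is closed; `build_headers` is a hypothetical helper name, not part of the original code.

```python
def build_headers(tunnel):
    """Return request headers that ask for the connection to be closed.

    `tunnel` is the value sent in the Proxy-Tunnel header, as in the
    original code. `build_headers` itself is a hypothetical helper.
    """
    return {
        # "close" asks for the TCP connection to be closed after the response,
        # so the next request has to open a new connection to the proxy.
        "Connection": "close",
        "Accept-Language": "zh",
        "Proxy-Tunnel": str(tunnel),
    }
```

With the original script this would be used as `requests.get(targetUrl, proxies=proxies, headers=build_headers(tunnel))`.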
2. The tunnel parameter is set incorrectly
tunnel is the parameter that controls proxy IP switching. The crawler proxy checks the tunnel value: a different value causes the HTTP request to be forwarded through a new, randomly assigned proxy IP, while the same value causes requests to be forwarded through the same proxy IP. So, to have every HTTP request forwarded through a different proxy IP, tunnel = random.randint(1,10000) should be executed inside the for loop, so that each HTTP request carries a different tunnel value.
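Putting both points together, the request loop might look like the sketch below. It is an illustration under the assumptions stated in the text, not the vendor's reference code: `new_tunnel_headers` and `fetch_with_rotating_ip` are hypothetical names, and the `get` argument stands in for `requests.get` so the loop logic can be read (and tried) without a live proxy.

```python
import random


def new_tunnel_headers():
    """Build headers with a fresh Proxy-Tunnel value for one request."""
    return {
        "Connection": "close",  # point 1: do not keep the connection alive
        "Accept-Language": "zh",
        # point 2: draw a new tunnel value every time this is called
        "Proxy-Tunnel": str(random.randint(1, 10000)),
    }


def fetch_with_rotating_ip(get, url, proxies, count):
    """Call `get` `count` times, building new headers before each call.

    `get` is any callable with the same keyword arguments as requests.get.
    """
    responses = []
    for _ in range(count):
        headers = new_tunnel_headers()
        responses.append(get(url, proxies=proxies, headers=headers))
    return responses
```

In the original script this would be called as `fetch_with_rotating_ip(requests.get, targetUrl, proxies, 100)`. Because the values are random, two consecutive requests can occasionally draw the same tunnel number; an incrementing counter would avoid that if strict uniqueness matters.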