Getting started with crawlers (1) -- requests (1)
2022-07-28 15:37:00 【WHJ226】
Contents
1. Installing the requests module
3. Year-end bonus: scraping pretty pictures
The requests library uses blocking network requests. In other words, after sending a request, the program must wait for the response before it can continue with the tasks that follow.
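To make the blocking behaviour concrete, here is a minimal sketch. It assumes a throwaway local server built with the standard library that deliberately answers slowly; `SlowHandler` and `DELAY_SECONDS` are hypothetical names used only for this demo and are not part of requests. The call to `session.get` does not return until the reply arrives, so the measured time is at least the server's delay.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

DELAY_SECONDS = 0.5  # hypothetical delay, chosen only for this demo


class SlowHandler(BaseHTTPRequestHandler):
    """Local stand-in for a slow website (an assumption of this sketch)."""

    def do_GET(self):
        time.sleep(DELAY_SECONDS)  # pretend the server needs time to answer
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


# Port 0 lets the operating system pick a free port.
server = HTTPServer(("127.0.0.1", 0), SlowHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/"

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for the local demo

start = time.perf_counter()
response = session.get(url, timeout=5)  # execution pauses here until the reply arrives
elapsed = time.perf_counter() - start

print(f"status={response.status_code}, waited {elapsed:.2f}s")

server.shutdown()
server.server_close()
```

Passing `timeout` is a reasonable habit with blocking calls: without it, a server that never answers could keep the program waiting indefinitely.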
1. Installing the requests module
This tutorial uses the PyCharm 2022.1.1 development environment.
1.1 pip install requests
Click Terminal.

Type pip install requests and press Enter. I have already installed it, so pip reports that the requirement is already satisfied.
1.2 Installing through PyCharm
After the installation completes, a success message is displayed.
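Whichever installation route you take, you can confirm the result from Python itself. The snippet below is a small sketch that only checks whether the interpreter can find the package; the variable name `installed` is illustrative.

```python
import importlib.util

# find_spec returns None when the package cannot be found by this interpreter.
installed = importlib.util.find_spec("requests") is not None

if installed:
    import requests

    print("requests version:", requests.__version__)
else:
    print("requests is not installed; run: pip install requests")
```

If the check fails inside PyCharm even though pip reported success, the package was probably installed for a different interpreter than the one the project uses.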
2. requests in practice
Take Sogou as an example :
import requests  # Import the module
url = 'https://www.sogou.com/'  # URL to request
response = requests.get(url)  # Send a GET request and receive the response
response.encoding = 'utf-8'  # Encoding used to decode response.text
print('The response content is :', response.content)  # Response body as raw bytes
print('The response text is :', response.text)  # Response body decoded to str
print('The response headers are :', response.headers)  # Headers returned by the server
print('The request object is :', response.request)  # PreparedRequest object, e.g. <PreparedRequest [GET]>
print('The encoding is :', response.encoding)  # Encoding in use
print('The requested URL is :', response.url)  # Final URL of the response
print('The cookies are :', response.cookies)  # Cookies set by the server
print('The status code is :', response.status_code)  # 200 means success, 404 means the page was not found
print('The response type is :', type(response))  # Type of the response object
print('The content type is :', type(response.content))  # bytes
print('The text type is :', type(response.text))  # str
The output is as follows:
The response content is : b'<!DOCTYPE html><html lang="cn"><head><meta name="viewport" content="width=device-width,minimum-scale=1,maximum-scale=1,user-scalable=no"><script>window._speedMark = new Date(); window.lead_ip = \'123.147.244.130\';\n window.now = 1653966907968;</script><script type="text/javascript">/*file=static/js/resourceErrorReport.js*/!function(a){var n=(new Date).getTime(),r=a.location.protocol;function c(e,t){var o=(new Date).getTime()-n;(new Image).src=["//pb.sogou.com/pv.gif?uigs_productid=wapapp&type=resource-error&stype=",e,"×tamp=",o,"&protocol=",r,"&host=",encodeURIComponent(a.location.host),"&path=",encodeURIComponent(a.location.pathname),"&resource=",encodeURIComponent(t)].join("")}function e(e){if((e=e||a.event)&&"error"===e.type){var t=e.srcElement?e.srcElement:e.target;if(t){var o,n,r=t.tagName;"LINK"===r?(n="css",(o=t.getAttribute("href"))&&o.match(/\\.css($|\\?)/)&&c(n,o)):"SCRIPT"===r&&(n="js",(o=t.getAttribute("src"))&&o.match(/\\.js($|\\?)/)&&c(n,o))}}}r&&(r=r.substring(0,r.length-1)),a.addEventListener?a.addEventListener("error",e,!0):a.attachEvent&&a.attachEvent("onerror",e)}(window);</script><meta charset="utf-8"><link rel="dns-prefetch" href="//img01.sogoucdn.com"><link rel="dns-prefetch" href="//img02.sogoucdn.com"><link rel="dns-prefetch" href="//img03.sogoucdn.com"><link rel="dns-prefetch" href="//img04.sogoucdn.com"><link rel="dns-prefetch" href="//dlweb.sogoucdn.com"><title>\xe6\x90\x9c\xe7\x8b\x97\xe6\x90\x9c\xe7\xb4\xa2\xe5\xbc\x95\xe6\x93\x8e - \xe4\xb8\x8a\xe7\xbd\x91\xe4\xbb\x8e\xe6\x90\x9c\xe7\x8b\x97\xe5\xbc\x80\xe5\xa7\x8b</title><link rel="shortcut icon" href="/images/logo/new/favicon.ico?v=4" type="image/x-icon"><meta http-equiv="X-UA-Compatible" content="IE=Edge"><link rel="search" type="application/opensearchdescription+xml" href="/content-search.xml" title="\xe6\x90\x9c\xe7\x8b\x97\xe6\x90\x9c\xe7\xb4\xa2"><meta name="keywords" 
content="\xe6\x90\x9c\xe7\x8b\x97\xe6\x90\x9c\xe7\xb4\xa2,\xe7\xbd\x91\xe9\xa1\xb5\xe6\x90\x9c\xe7\xb4\xa2,\xe5\xbe\xae\xe4\xbf\xa1\xe6\x90\x9c\xe7\xb4\xa2,\xe8\xa7\x86\xe9\xa2\x91\xe6\x90\x9c\xe7\xb4\xa2,\xe5\x9b\xbe\xe7\x89\x87\xe6\x90\x9c\xe7\xb4\xa2,\xe9\x9f\xb3\xe4\xb9\x90\xe6\x90\x9c\xe7\xb4\xa2,\xe6\x96\xb0\xe9\x97\xbb\xe6\x90\x9c\xe7\xb4\xa2,\xe8\xbd\xaf\xe4\xbb\xb6\xe6\x90\x9c\xe7\xb4\xa2,\xe9\x97\xae\xe7\xad\x94\xe6\x90\x9c\xe7\xb4\xa2,\xe7\x99\xbe\xe7\xa7\x91\xe6\x90\x9c\xe7\xb4\xa2,\xe8\xb4\xad\xe7\x89\xa9\xe6\x90\x9c\xe7\xb4\xa2"><meta name="description" content="\xe6\x90\x9c\xe7\x8b\x97\xe6\x90\x9c\xe7\xb4\xa2\xe6\x98\xaf\xe5\x85\xa8\xe7\x90\x83\xe7\xac\xac\xe4\xb8\x89\xe4\xbb\xa3\xe4\xba\x92\xe5\x8a\xa8\xe5\xbc\x8f\xe6\x90\x9c\xe7\xb4\xa2\xe5\xbc\x95\xe6\x93\x8e\xef\xbc\x8c\xe6\x94\xaf\xe6\x8c\x81\xe5\xbe\xae\xe4\xbf\xa1\xe5\x85\xac\xe4\xbc\x97\xe5\x8f\xb7\xe5\x92\x8c\xe6\x96\x87\xe7\xab\xa0\xe6\x90\x9c\xe7\xb4\xa2\xe3\x80\x81\xe7\x9f\xa5\xe4\xb9\x8e\xe6\x90\x9c\xe7\xb4\xa2\xe3\x80\x81\xe8\x8b\xb1\xe6\x96\x87\xe6\x90\x9c\xe7\xb4\xa2\xe5\x8f\x8a\xe7\xbf\xbb\xe8\xaf\x91\xe7\xad\x89\xef\xbc\x8c\xe9\x80\x9a\xe8\xbf\x87\xe8\x87\xaa\xe4\xb8\xbb\xe7\xa0\x94\xe5\x8f\x91\xe7\x9a\x84\xe4\xba\xba\xe5\xb7\xa5\xe6\x99\xba\xe8\x83\xbd\xe7\xae\x97\xe6\xb3\x95\xe4\xb8\xba\xe7\x94\xa8\xe6\x88\xb7\xe6\x8f\x90\xe4\xbe\x9b\xe4\xb8\x93\xe4\xb8\x9a\xe3\x80\x81\xe7\xb2\xbe\xe5\x87\x86\xe3\x80\x81\xe4\xbe\xbf\xe6\x8d\xb7\xe7\x9a\x84\xe6\x90\x9c\xe7\xb4\xa2\xe6\x9c\x8d\xe5\x8a\xa1\xe3\x80\x82"><link rel="stylesheet" type="text/css" href="//dlweb.sogoucdn.com/pcsearch/web/index/css/index_style_39e6e10.css"><style>.wrapper .suggestion{border:1px solid #e8e8e8;width:653px;-moz-box-shadow:0 1px 8px rgba(0,0,0,.1);-webkit-box-shadow:0 1px 8px rgba(0,0,0,.1);box-shadow:0 1px 8px rgba(0,0,0,.1);border-top-left-radius:0;border-top-right-radius:0;border-bottom-right-radius:2px;border-bottom-left-radius:2px;top:43px}.wrapper .suglist{width:206px}.wrapper .suglist 
.keyword{color:#7a77c8}.big-scn .suggestion{width:820px}.big-scn .suglist{width:236px}.wrapper .suglist{padding:4px 0}input[type=text]::-ms-clear{display:none}</style><!-- indexSnippetToHeader start --> <!-- indexSnippetToHeader end --></head><body color-style="white"><div class="wrapper " id="wrap"><div class="header"> <div class="top-nav"><ul><li class="cur"><span>\xe7\xbd\x91\xe9\xa1\xb5</span></li><li><a onclick="st(this,\'73141200\',\'weixin\')" href="http://weixin.sogou.com/" uigs-id="nav_weixin" id="weixinch">\xe5\xbe\xae\xe4\xbf\xa1</a></li><li><a onclick="st(this,\'40051200\',\'zhihu\')" href="http://zhihu.sogou.com/" uigs-id="nav_zhihu" id="zhihu">\xe7\x9f\xa5\xe4\xb9\x8e</a></li><li><a onclick="st(this,\'40030500\',\'pic\')" href="http://pic.sogou.com" uigs-id="nav_pic" id="pic">\xe5\x9b\xbe\xe7\x89\x87</a></li><li><a onclick="st(this,\'40030600\',\'video\')" href="https://v.sogou.com/" uigs-id="nav_v" id="video">\xe8\xa7\x86\xe9\xa2\x91</a></li><li><a href="http://mingyi.sogou.com?fr=common_index_nav" uigs-id="nav_mingyi" id="mingyi" onclick="st(this,\'\',\'myingyi\')">\xe5\x8c\xbb\xe7\x96\x97</a></li><li><a href="http://hanyu.sogou.com?fr=pcweb_index_nav" uigs-id="nav_hanyu" id="hanyu" onclick="st(this,\'\',\'hanyu\')">\xe6\xb1\x89\xe8\xaf\xad</a></li><li><a href="http://fanyi.sogou.com?fr=common_index_nav_pc" uigs-id="nav_fanyi" id="fanyi" onclick="st(this,\'\',\'fanyi\')">\xe7\xbf\xbb\xe8\xaf\x91</a></li><li><a onclick="st(this,\'web2ww\',\'wenwen\')" href="https://wenwen.sogou.com/?ch=websearch" uigs-id="nav_wenwen" id="index_more_wenwen">\xe9\x97\xae\xe9\x97\xae</a></li><li><a onclick="st(this,\'web2ww\',\'baike\')" href="http://baike.sogou.com/Home.v" uigs-id="nav_baike" id="index_baike">\xe7\x99\xbe\xe7\xa7\x91</a></li><li><a onclick="st(this,\'40031000\')" href="http://map.sogou.com" uigs-id="nav_map" id="map">\xe5\x9c\xb0\xe5\x9b\xbe</a></li><li class="show-more"><a href="javascript:void(0);" id="more-product">\xe6\x9b\xb4\xe5\xa4\x9a<i 
class="m-arr"></i></a><div class="pos-more" id="products-box" style="top:40px"><span class="ico-san"></span><a onclick="st(this,\'40031500\')" href="http://gouwu.sogou.com/" uigs-id="nav_gouwu" id="index_more_gouwu">\xe8\xb4\xad\xe7\x89\xa9</a><a onclick="st(this)" href="http://zhishi.sogou.com" uigs-id="nav_zhishi" id="index_more_zhishi">\xe7\x9f\xa5\xe8\xaf\x86</a><a onclick="st(this,\'40051205\')" href="http://as.sogou.com/" uigs-id="nav_app" id="index_more_appli">\xe5\xba\x94\xe7\x94\xa8</a><a href="https://baike.sogou.com/kexue/home.htm" uigs-id="nav_science" id="science">\xe7\xa7\x91\xe5\xad\xa6</a><span class="all"><a onclick="st(this,\'40051206\')" href="http://www.sogou.com/docs/more.htm?v=1" uigs-id="nav_all" target="_blank">\xe5\x85\xa8\xe9\x83\xa8</a></span></div></li></ul></div><div class="user-box"> <a href="javascript:void(0)" id="cniil_wza" style="float:left;text-decoration:none;color:#000;opacity:.75;padding-right:20px;margin-right:20px;border-right:1px solid #e7e7e7;line-height:14px;position:relative;top:5px">\xe6\x97\xa0\xe9\x9a\x9c\xe7\xa2\x8d</a> <div class="local-weather" id="local-weather"><div class="wea-box" id="cur-weather" style="display:none"></div> <div class="pos-more" id="detail-weather" style="top:40px;left:-80px"></div> </div><span class="line" id="user-box-line" style="display:none"></span><div class="user-enter"> <a href="javascript:void(0);" class="enter" id="loginBtn">\xe7\x99\xbb\xe5\xbd\x95</a> </div></div></div><div class="content" id="content"><div class="pos-header" id="top-float-bar"><div class="part-one"></div><div class="part-two" id="card-tab-layer"><div class="c-top" id="top-card-tab"></div></div></div><div class="logo2" id="logo-s"><span></span></div><div class="logo" id="logo-l"><span></span></div> <div class="search-box querybox-focus" id="search-box"><form action="/web" name="sf" id="sf"><span class="sec-input-box"><input type="text" class="sec-input active" name="query" id="query" maxlength="100" len="80" 
autocomplete="off"></span><span class="enter-input"><input type="submit" value="\xe6\x90\x9c\xe7\x8b\x97\xe6\x90\x9c\xe7\xb4\xa2" id="stb"></span><input type="hidden" name="_asf" value="www.sogou.com"> <input type="hidden" name="_ast"> <input type="hidden" name="w" value="01019900"> <input type="hidden" name="p" value="40040100"> <input type="hidden" name="ie" value="utf8"> <input type="hidden" name="from" value="index-nologin"> <input type="hidden" name="s_from" value="index"><div class="keywords-tips" id="keywordsTips" style="display:none"><i></i><p>\xe2\x80\x9c<strong id="keywordsTipsStrong">369</strong>\xe2\x80\x9d\xe5\x90\x8e\xe9\x9d\xa2\xe7\x9a\x84\xe6\x96\x87\xe5\xad\x97\xe8\xa2\xab\xe5\xbf\xbd\xe7\x95\xa5\xef\xbc\x8c\xe6\x90\x9c\xe7\x8b\x97\xe7\x9a\x84\xe6\x9f\xa5\xe8\xaf\xa2\xe9\x99\x90\xe5\x88\xb6\xe5\x9c\xa840\xe4\xb8\xaa\xe6\xb1\x89\xe5\xad\x97\xe4\xbb\xa5\xe5\x86\x85\xe3\x80\x82</p></div></form></div> </div><div class="card-box" id="card-box" style="display:none"><div class="card-box2" id="card-box2"><div class="c-top" id="card-tab-box"><a href="javascript:void(0);" uigs-id="settings_close-card" id="close-card" class="shezhi"></a></div><div class="c-main" id="card-content"></div></div></div><div class="loog-more" id="scroll-more" style="display:none"><a href="javascript:void(0);" uigs-id="scroll-more">\xe6\xbb\x9a\xe5\x8a\xa8\xe6\x9f\xa5\xe7\x9c\x8b\xe6\x9b\xb4\xe5\xa4\x9a<br><span class="ico_san"></span></a></div><div class="ft" id="footer" style="display:none" ><a href="http://b.sogou.com/" target="_blank" uigs-id="footer_tuiguang">\xe4\xbc\x81\xe4\xb8\x9a\xe6\x8e\xa8\xe5\xb9\xbf</a><span class="line"></span><a href="http://www.sogou.com/docs/terms.htm?v=1" target="_blank" uigs-id="footer_disclaimer">\xe5\x85\x8d\xe8\xb4\xa3\xe5\xa3\xb0\xe6\x98\x8e</a><span class="line"></span><a href="http://fankui.help.sogou.com/index.php/web/web/index/type/4" target="_blank" 
uigs-id="footer_feedback">\xe6\x84\x8f\xe8\xa7\x81\xe5\x8f\x8d\xe9\xa6\x88\xe5\x8f\x8a\xe6\x8a\x95\xe8\xaf\x89</a><span class="line"></span><a href="http://corp.sogou.com/private.html" target="_blank" uigs-id="footer_private">\xe9\x9a\x90\xe7\xa7\x81\xe6\x94\xbf\xe7\xad\x96</a><br><span class="g">\xe8\x8d\xaf\xe5\x93\x81\xe5\x8c\xbb\xe7\x96\x97\xe5\x99\xa8\xe6\xa2\xb0\xe7\xbd\x91\xe7\xbb\x9c\xe4\xbf\xa1\xe6\x81\xaf\xe6\x9c\x8d\xe5\x8a\xa1\xe5\xa4\x87\xe6\xa1\x88\xef\xbc\x9a\xef\xbc\x88\xe4\xba\xac\xef\xbc\x89\xe7\xbd\x91\xe8\x8d\xaf\xe6\xa2\xb0\xe4\xbf\xa1\xe6\x81\xaf\xe5\xa4\x87\xe5\xad\x97\xef\xbc\x882021\xef\xbc\x89\xe7\xac\xac00047\xe5\x8f\xb7</span> / <span class="g">\xe4\xba\x92\xe8\x81\x94\xe7\xbd\x91\xe8\x8d\xaf\xe5\x93\x81\xe4\xbf\xa1\xe6\x81\xaf\xe6\x9c\x8d\xe5\x8a\xa1\xe8\xb5\x84\xe6\xa0\xbc\xe8\xaf\x81\xe4\xb9\xa6(\xe9\x9d\x9e\xe7\xbb\x8f\xe8\x90\xa5\xe6\x80\xa7)\xef\xbc\x9a(\xe4\xba\xac)-\xe9\x9d\x9e\xe7\xbb\x8f\xe8\x90\xa5\xe6\x80\xa7-2018-0311</span><br>© 2004-2022 Sogou.com / <a href="http://www.12377.cn" class="g" target="_blank">\xe7\xbd\x91\xe4\xb8\x8a\xe6\x9c\x89\xe5\xae\xb3\xe4\xbf\xa1\xe6\x81\xaf\xe4\xb8\xbe\xe6\x8a\xa5\xe4\xb8\x93\xe5\x8c\xba</a> / <span class="g">\xe4\xba\xac\xe7\xbd\x91\xe6\x96\x87(2019)6117-724\xe5\x8f\xb7</span> / <a class="g" href="https://beian.miit.gov.cn/" target="_blank">\xe4\xba\xacICP\xe8\xaf\x81050897\xe5\x8f\xb7</a> / <a class="g" href="https://beian.miit.gov.cn/" target="_blank">\xe4\xba\xacICP\xe5\xa4\x8711001839\xe5\x8f\xb7-1</a> / <a href="http://www.beian.gov.cn/portal/registerSystemInfo?recordcode=11000002000025" class="ba" target="_blank">\xe4\xba\xac\xe5\x85\xac\xe7\xbd\x91\xe5\xae\x89\xe5\xa4\x8711000002000025\xe5\x8f\xb7</a></div> <div class="ft-v1" id="QRcode-footer" style="padding-bottom:28px"><div class="ft-info"><a uigs-id="mid_pinyin" href="http://pinyin.sogou.com/" target="_blank"><i class="i1"></i>\xe6\x90\x9c\xe7\x8b\x97\xe8\xbe\x93\xe5\x85\xa5\xe6\xb3\x95</a><span class="line"></span><a 
uigs-id="mid_liulanqi" href="http://ie.sogou.com/" target="_blank"><i class="i2"></i>\xe6\xb5\x8f\xe8\xa7\x88\xe5\x99\xa8</a><span class="line"></span><a uigs-id="mid_daohang" href="http://123.sogou.com/" target="_blank"><i class="i3"></i>\xe7\xbd\x91\xe5\x9d\x80\xe5\xaf\xbc\xe8\x88\xaa</a><br><a href="http://b.sogou.com/" target="_blank" class="g">\xe4\xbc\x81\xe4\xb8\x9a\xe6\x8e\xa8\xe5\xb9\xbf</a> - <a href="http://www.sogou.com/docs/terms.htm?v=1" target="_blank" class="g">\xe5\x85\x8d\xe8\xb4\xa3\xe5\xa3\xb0\xe6\x98\x8e</a> - <a href="http://fankui.help.sogou.com/index.php/web/web/index/type/4" target="_blank" class="g">\xe6\x84\x8f\xe8\xa7\x81\xe5\x8f\x8d\xe9\xa6\x88\xe5\x8f\x8a\xe6\x8a\x95\xe8\xaf\x89</a> - <a href="http://corp.sogou.com/private.html" target="_blank" class="g" uigs-id="footer_private">\xe9\x9a\x90\xe7\xa7\x81\xe6\x94\xbf\xe7\xad\x96</a><br><span class="g">\xe8\x8d\xaf\xe5\x93\x81\xe5\x8c\xbb\xe7\x96\x97\xe5\x99\xa8\xe6\xa2\xb0\xe7\xbd\x91\xe7\xbb\x9c\xe4\xbf\xa1\xe6\x81\xaf\xe6\x9c\x8d\xe5\x8a\xa1\xe5\xa4\x87\xe6\xa1\x88\xef\xbc\x9a\xef\xbc\x88\xe4\xba\xac\xef\xbc\x89\xe7\xbd\x91\xe8\x8d\xaf\xe6\xa2\xb0\xe4\xbf\xa1\xe6\x81\xaf\xe5\xa4\x87\xe5\xad\x97\xef\xbc\x882021\xef\xbc\x89\xe7\xac\xac00047\xe5\x8f\xb7</span> / <span class="g">\xe4\xba\x92\xe8\x81\x94\xe7\xbd\x91\xe8\x8d\xaf\xe5\x93\x81\xe4\xbf\xa1\xe6\x81\xaf\xe6\x9c\x8d\xe5\x8a\xa1\xe8\xb5\x84\xe6\xa0\xbc\xe8\xaf\x81\xe4\xb9\xa6(\xe9\x9d\x9e\xe7\xbb\x8f\xe8\x90\xa5\xe6\x80\xa7)\xef\xbc\x9a(\xe4\xba\xac)-\xe9\x9d\x9e\xe7\xbb\x8f\xe8\x90\xa5\xe6\x80\xa7-2018-0311</span><br>© 2004-2022 Sogou.com / <a href="http://www.12377.cn" class="g" target="_blank">\xe7\xbd\x91\xe4\xb8\x8a\xe6\x9c\x89\xe5\xae\xb3\xe4\xbf\xa1\xe6\x81\xaf\xe4\xb8\xbe\xe6\x8a\xa5\xe4\xb8\x93\xe5\x8c\xba</a> / <span class="g">\xe4\xba\xac\xe7\xbd\x91\xe6\x96\x87(2019)6117-724\xe5\x8f\xb7</span> / <a class="g" href="https://beian.miit.gov.cn/" target="_blank">\xe4\xba\xacICP\xe8\xaf\x81050897\xe5\x8f\xb7</a> / <a class="g" 
href="https://beian.miit.gov.cn/" target="_blank">\xe4\xba\xacICP\xe5\xa4\x8711001839\xe5\x8f\xb7-1</a> / <a href="http://www.beian.gov.cn/portal/registerSystemInfo?recordcode=11000002000025" class="ba" target="_blank">\xe4\xba\xac\xe5\x85\xac\xe7\xbd\x91\xe5\xae\x89\xe5\xa4\x8711000002000025\xe5\x8f\xb7</a></div> <div class="fit-older"></div> </div> <div class="kuozhan" id="QRcode-box" style="display:none"><a href="javascript:void(0);" id="miniQRcode"></a><span id="QRcode"></span></div><a href="javascript:void(0);" class="back-top" id="back-top"></a></div> <script>var SugPara, uigs_para, msBrowserName = navigator.userAgent.toLowerCase(),msIsSe = false,msIsMSearch = false, hasDoodle = false, queryinput = document.getElementById(\'query\');</script><script>/*file=static/js/indexjs.js*/function indexjsInit(e,o,n,t,s,u,i){var r={puid:t,cards:s,cards_sw:u,uigs_cookie:"SUID,sct,SUV"};function c(){try{window.external.metasearch("make_connection","www.google.com.hk")}catch(e){}}uigs_para={uigs_productid:"webapp",type:"webindex_new",stype:e?"login":"nologin",scrnwi:screen.width,scrnhi:screen.height,uigs_pbtag:"A",uigs_cookie:"SUID,sct",protocol:"https:"==location.protocol.toLowerCase()?"https":"http"},e&&(uigs_para=Object.assign(uigs_para,r)),window.loginCardConfig={},SugPara={queryboxid:"search-box",enableSug:!0,sugType:"web",domain:"w.sugg.sogou.com",productId:"web",sugFormName:"sf",inputid:"query",submitId:"stb",suggestRid:"01015002",normalRid:"01019900",useParent:1,sugglocation:"index",showVr:!0,showHotwords:!0,suggAbtestObject:o},/se 2\\.x/i.test(msBrowserName)&&(msIsSe=!0),/metasr/i.test(msBrowserName)&&(msIsMSearch=!0),queryinput&&msIsSe&&msIsMSearch&&(queryinput.addEventListener?(queryinput.addEventListener("keypress",c,!1),queryinput.addEventListener("keydown",c,!1)):queryinput.attachEvent?(queryinput.attachEvent("onkeypress",c),queryinput.attachEvent("onkeydown",c)):(queryinput.onkeypress=c,queryinput.onkeydown=c)),window.m_s_index=function(){var 
e=document.sf.query,o=Math.round(1e3*((new Date).getTime()+Math.random()));e.focus(),new RegExp("kw=([^&]+)").test(location.search)&&0==e.value.length&&(e.value=decodeURIComponent(RegExp.$1)),document.cookie.indexOf("SUV=")<0&&(document.cookie="SUV="+o+";path=/;expires=Sun, 29 July 2026 00:00:00 UTC;domain="+function(){var e=document.domain;return e.indexOf("sogou.com")==e.length-9?".sogou.com":e.indexOf("soso.com")==e.length-8?".soso.com":-1!=e.indexOf("sogo.com")?".sogo.com":void 0}()),n&&((new Image).src="//pb6.sogou.com/v6")},window.st=function(e,o,n,t){var s=document.sf.query,u=encodeURIComponent(s.value),i={news:"http://news.sogou.com/news?ie=utf8&query=",web:"web?ie=utf8&query=",weixin:"http://weixin.sogou.com/weixin?type=2&ie=utf8&query=",zhihu:"http://zhihu.sogou.com/zhihu?ie=utf8&query=",pic:"http://pic.sogou.com/pics?ie=utf8&query=",video:"https://v.sogou.com/v?ie=utf8&query=",myingyi:"https://www.sogou.com/web?m2web=mingyi.sogou.com&ie=utf8&query=",overseas:"http://english.sogou.com?b_o_e=1&ie=utf8&fr=pcweb_index_nav&query=",scholar:"http://scholar.sogou.com?ie=utf8&fr=common_index_nav&query=",fanyi:"http://fanyi.sogou.com/?fr=common_index_nav_pc&ie=utf8&keyword=",wenwen:"http://wenwen.sogou.com/s/?ch=websearch&w=",hanyu:"https://hanyu.sogou.com/?query=",science:"https://baike.sogou.com/kexue/home.htm?query="},r=i[n]||e.href;function c(e){return-1<e.indexOf("?")?"&":"?"}s&&""!==s.value&&(["hanyu"].includes(n)?r=r.match(/.*(?=\\?query\\=)/)[0]+{hanyu:{index:"",result:"result"}}[n].result+"?query="+u:i[n]?r=i[n]+u:0<r.indexOf("kw=")?r=r.replace(new RegExp("kw=[^&$]*"),"kw="+u):r+=c(r)+"kw="+u),o&&(r+=c(r)+"p="+o),t&&0<t.length&&(r+="#"+t),!s||""!=s.value||"wenwen"!=n&&"science"!=n||(r=e.href),e.href=r},window.cid=function(e,o){var n=document.sf.query,t=encodeURIComponent(n.value);t?"web2ww"===o?e.href+="s/?cid=web2ww&w="+t:"web2bk"===o&&(e.href+="Search.e?sp=S"+t+"&cid=web2bk"):e.href+="?cid="+o},window.m_s_index()}indexjsInit(false, 
{"suggestHistoryStrategy1":"","suggestHistoryStrategy2":"0|1|2|3|4|5|6|7|8","suggHistoryAbtest":""}, true, \'invaliduser\', \'\', \'\');</script><script src="//dlweb.sogoucdn.com/pcsearch/web/index/js/suggbase_b9937f7.js"></script> <script src="//dlweb.sogoucdn.com/pcsearch/js/common/widget/index_login_b1cc5cb.js"></script><script src="//account.sogou.com/static/api/passport-async.js"></script> <script src="//dlweb.sogoucdn.com/pcsearch/web/index/js/searchbase_453304b.js"></script> <script defer="defer" async type="text/javascript" src="//dlweb.sogoucdn.com/barrier_free/pc/wzaV15/aria.js?appid=c4d5562ec7daa12a5a351cbe1a292da1" charset="utf-8"></script></body></html><!--zly-->'
The response text is : <!DOCTYPE html><html lang="cn"><head><meta name="viewport" content="width=device-width,minimum-scale=1,maximum-scale=1,user-scalable=no"><script>window._speedMark = new Date(); window.lead_ip = '123.147.244.130';
window.now = 1653966907968;</script><script type="text/javascript">/*file=static/js/resourceErrorReport.js*/!function(a){var n=(new Date).getTime(),r=a.location.protocol;function c(e,t){var o=(new Date).getTime()-n;(new Image).src=["//pb.sogou.com/pv.gif?uigs_productid=wapapp&type=resource-error&stype=",e,"×tamp=",o,"&protocol=",r,"&host=",encodeURIComponent(a.location.host),"&path=",encodeURIComponent(a.location.pathname),"&resource=",encodeURIComponent(t)].join("")}function e(e){if((e=e||a.event)&&"error"===e.type){var t=e.srcElement?e.srcElement:e.target;if(t){var o,n,r=t.tagName;"LINK"===r?(n="css",(o=t.getAttribute("href"))&&o.match(/\.css($|\?)/)&&c(n,o)):"SCRIPT"===r&&(n="js",(o=t.getAttribute("src"))&&o.match(/\.js($|\?)/)&&c(n,o))}}}r&&(r=r.substring(0,r.length-1)),a.addEventListener?a.addEventListener("error",e,!0):a.attachEvent&&a.attachEvent("onerror",e)}(window);</script><meta charset="utf-8"><link rel="dns-prefetch" href="//img01.sogoucdn.com"><link rel="dns-prefetch" href="//img02.sogoucdn.com"><link rel="dns-prefetch" href="//img03.sogoucdn.com"><link rel="dns-prefetch" href="//img04.sogoucdn.com"><link rel="dns-prefetch" href="//dlweb.sogoucdn.com"><title> Sogou search engine - The Internet starts with Sogou </title><link rel="shortcut icon" href="/images/logo/new/favicon.ico?v=4" type="image/x-icon"><meta http-equiv="X-UA-Compatible" content="IE=Edge"><link rel="search" type="application/opensearchdescription+xml" href="/content-search.xml" title=" Sogou search "><meta name="keywords" content=" Sogou search , Web search , WeChat search , Video search , Image search , Music search , News search , Software search , Q & a search , Encyclopedia Search , Shopping search "><meta name="description" content=" Sogou search is the third generation of interactive search engine in the world , Support WeChat official account and article search 、 Zhihu search 、 English search and Translation , Provide professional services for users through self-developed 
artificial intelligence algorithms 、 accurate 、 Convenient search service ."><link rel="stylesheet" type="text/css" href="//dlweb.sogoucdn.com/pcsearch/web/index/css/index_style_39e6e10.css"><style>.wrapper .suggestion{border:1px solid #e8e8e8;width:653px;-moz-box-shadow:0 1px 8px rgba(0,0,0,.1);-webkit-box-shadow:0 1px 8px rgba(0,0,0,.1);box-shadow:0 1px 8px rgba(0,0,0,.1);border-top-left-radius:0;border-top-right-radius:0;border-bottom-right-radius:2px;border-bottom-left-radius:2px;top:43px}.wrapper .suglist{width:206px}.wrapper .suglist .keyword{color:#7a77c8}.big-scn .suggestion{width:820px}.big-scn .suglist{width:236px}.wrapper .suglist{padding:4px 0}input[type=text]::-ms-clear{display:none}</style><!-- indexSnippetToHeader start --> <!-- indexSnippetToHeader end --></head><body color-style="white"><div class="wrapper " id="wrap"><div class="header"> <div class="top-nav"><ul><li class="cur"><span> Webpage </span></li><li><a onclick="st(this,'73141200','weixin')" href="http://weixin.sogou.com/" uigs-id="nav_weixin" id="weixinch"> WeChat </a></li><li><a onclick="st(this,'40051200','zhihu')" href="http://zhihu.sogou.com/" uigs-id="nav_zhihu" id="zhihu"> You know </a></li><li><a onclick="st(this,'40030500','pic')" href="http://pic.sogou.com" uigs-id="nav_pic" id="pic"> picture </a></li><li><a onclick="st(this,'40030600','video')" href="https://v.sogou.com/" uigs-id="nav_v" id="video"> video </a></li><li><a href="http://mingyi.sogou.com?fr=common_index_nav" uigs-id="nav_mingyi" id="mingyi" onclick="st(this,'','myingyi')"> Medical care </a></li><li><a href="http://hanyu.sogou.com?fr=pcweb_index_nav" uigs-id="nav_hanyu" id="hanyu" onclick="st(this,'','hanyu')"> chinese </a></li><li><a href="http://fanyi.sogou.com?fr=common_index_nav_pc" uigs-id="nav_fanyi" id="fanyi" onclick="st(this,'','fanyi')"> translate </a></li><li><a onclick="st(this,'web2ww','wenwen')" href="https://wenwen.sogou.com/?ch=websearch" uigs-id="nav_wenwen" id="index_more_wenwen"> ask 
</a></li><li><a onclick="st(this,'web2ww','baike')" href="http://baike.sogou.com/Home.v" uigs-id="nav_baike" id="index_baike"> Encyclopedias </a></li><li><a onclick="st(this,'40031000')" href="http://map.sogou.com" uigs-id="nav_map" id="map"> Map </a></li><li class="show-more"><a href="javascript:void(0);" id="more-product"> more <i class="m-arr"></i></a><div class="pos-more" id="products-box" style="top:40px"><span class="ico-san"></span><a onclick="st(this,'40031500')" href="http://gouwu.sogou.com/" uigs-id="nav_gouwu" id="index_more_gouwu"> shopping </a><a onclick="st(this)" href="http://zhishi.sogou.com" uigs-id="nav_zhishi" id="index_more_zhishi"> knowledge </a><a onclick="st(this,'40051205')" href="http://as.sogou.com/" uigs-id="nav_app" id="index_more_appli"> application </a><a href="https://baike.sogou.com/kexue/home.htm" uigs-id="nav_science" id="science"> science </a><span class="all"><a onclick="st(this,'40051206')" href="http://www.sogou.com/docs/more.htm?v=1" uigs-id="nav_all" target="_blank"> All </a></span></div></li></ul></div><div class="user-box"> <a href="javascript:void(0)" id="cniil_wza" style="float:left;text-decoration:none;color:#000;opacity:.75;padding-right:20px;margin-right:20px;border-right:1px solid #e7e7e7;line-height:14px;position:relative;top:5px"> Barrier free </a> <div class="local-weather" id="local-weather"><div class="wea-box" id="cur-weather" style="display:none"></div> <div class="pos-more" id="detail-weather" style="top:40px;left:-80px"></div> </div><span class="line" id="user-box-line" style="display:none"></span><div class="user-enter"> <a href="javascript:void(0);" class="enter" id="loginBtn"> Sign in </a> </div></div></div><div class="content" id="content"><div class="pos-header" id="top-float-bar"><div class="part-one"></div><div class="part-two" id="card-tab-layer"><div class="c-top" id="top-card-tab"></div></div></div><div class="logo2" id="logo-s"><span></span></div><div class="logo" id="logo-l"><span></span></div> 
<div class="search-box querybox-focus" id="search-box"><form action="/web" name="sf" id="sf"><span class="sec-input-box"><input type="text" class="sec-input active" name="query" id="query" maxlength="100" len="80" autocomplete="off"></span><span class="enter-input"><input type="submit" value=" Sogou search " id="stb"></span><input type="hidden" name="_asf" value="www.sogou.com"> <input type="hidden" name="_ast"> <input type="hidden" name="w" value="01019900"> <input type="hidden" name="p" value="40040100"> <input type="hidden" name="ie" value="utf8"> <input type="hidden" name="from" value="index-nologin"> <input type="hidden" name="s_from" value="index"><div class="keywords-tips" id="keywordsTips" style="display:none"><i></i><p>“<strong id="keywordsTipsStrong">369</strong>” The following text is ignored , Sogou's queries are limited to 40 Within characters .</p></div></form></div> </div><div class="card-box" id="card-box" style="display:none"><div class="card-box2" id="card-box2"><div class="c-top" id="card-tab-box"><a href="javascript:void(0);" uigs-id="settings_close-card" id="close-card" class="shezhi"></a></div><div class="c-main" id="card-content"></div></div></div><div class="loog-more" id="scroll-more" style="display:none"><a href="javascript:void(0);" uigs-id="scroll-more"> Scroll to see more <br><span class="ico_san"></span></a></div><div class="ft" id="footer" style="display:none" ><a href="http://b.sogou.com/" target="_blank" uigs-id="footer_tuiguang"> Enterprise promotion </a><span class="line"></span><a href="http://www.sogou.com/docs/terms.htm?v=1" target="_blank" uigs-id="footer_disclaimer"> disclaimer </a><span class="line"></span><a href="http://fankui.help.sogou.com/index.php/web/web/index/type/4" target="_blank" uigs-id="footer_feedback"> Feedback and complaints </a><span class="line"></span><a href="http://corp.sogou.com/private.html" target="_blank" uigs-id="footer_private"> Privacy policy </a><br><span class="g"> Filing of network information 
service for drugs and medical devices :( Beijing ) The information of network medicine and equipment is prepared (2021) The first 00047 Number </span> / <span class="g"> Qualification certificate of Internet drug information service ( Non operating ):( Beijing )- Non operating -2018-0311</span><br>© 2004-2022 Sogou.com / <a href="http://www.12377.cn" class="g" target="_blank"> Online harmful information reporting area </a> / <span class="g"> Jingwangwen (2019)6117-724 Number </span> / <a class="g" href="https://beian.miit.gov.cn/" target="_blank"> Beijing ICP Prove 050897 Number </a> / <a class="g" href="https://beian.miit.gov.cn/" target="_blank"> Beijing ICP To prepare 11001839 Number -1</a> / <a href="http://www.beian.gov.cn/portal/registerSystemInfo?recordcode=11000002000025" class="ba" target="_blank"> Beijing public network security 11000002000025 Number </a></div> <div class="ft-v1" id="QRcode-footer" style="padding-bottom:28px"><div class="ft-info"><a uigs-id="mid_pinyin" href="http://pinyin.sogou.com/" target="_blank"><i class="i1"></i> Sogou input method </a><span class="line"></span><a uigs-id="mid_liulanqi" href="http://ie.sogou.com/" target="_blank"><i class="i2"></i> browser </a><span class="line"></span><a uigs-id="mid_daohang" href="http://123.sogou.com/" target="_blank"><i class="i3"></i> Website navigation </a><br><a href="http://b.sogou.com/" target="_blank" class="g"> Enterprise promotion </a> - <a href="http://www.sogou.com/docs/terms.htm?v=1" target="_blank" class="g"> disclaimer </a> - <a href="http://fankui.help.sogou.com/index.php/web/web/index/type/4" target="_blank" class="g"> Feedback and complaints </a> - <a href="http://corp.sogou.com/private.html" target="_blank" class="g" uigs-id="footer_private"> Privacy policy </a><br><span class="g"> Filing of network information service for drugs and medical devices :( Beijing ) The information of network medicine and equipment is prepared (2021) The first 00047 Number </span> / <span class="g"> 
Qualification certificate of Internet drug information service ( Non operating ):( Beijing )- Non operating -2018-0311</span><br>© 2004-2022 Sogou.com / <a href="http://www.12377.cn" class="g" target="_blank"> Online harmful information reporting area </a> / <span class="g"> Jingwangwen (2019)6117-724 Number </span> / <a class="g" href="https://beian.miit.gov.cn/" target="_blank"> Beijing ICP Prove 050897 Number </a> / <a class="g" href="https://beian.miit.gov.cn/" target="_blank"> Beijing ICP To prepare 11001839 Number -1</a> / <a href="http://www.beian.gov.cn/portal/registerSystemInfo?recordcode=11000002000025" class="ba" target="_blank"> Beijing public network security 11000002000025 Number </a></div> <div class="fit-older"></div> </div> <div class="kuozhan" id="QRcode-box" style="display:none"><a href="javascript:void(0);" id="miniQRcode"></a><span id="QRcode"></span></div><a href="javascript:void(0);" class="back-top" id="back-top"></a></div> <script>var SugPara, uigs_para, msBrowserName = navigator.userAgent.toLowerCase(),msIsSe = false,msIsMSearch = false, hasDoodle = false, queryinput = document.getElementById('query');</script><script>/*file=static/js/indexjs.js*/function indexjsInit(e,o,n,t,s,u,i){var r={puid:t,cards:s,cards_sw:u,uigs_cookie:"SUID,sct,SUV"};function c(){try{window.external.metasearch("make_connection","www.google.com.hk")}catch(e){}}uigs_para={uigs_productid:"webapp",type:"webindex_new",stype:e?"login":"nologin",scrnwi:screen.width,scrnhi:screen.height,uigs_pbtag:"A",uigs_cookie:"SUID,sct",protocol:"https:"==location.protocol.toLowerCase()?"https":"http"},e&&(uigs_para=Object.assign(uigs_para,r)),window.loginCardConfig={},SugPara={queryboxid:"search-box",enableSug:!0,sugType:"web",domain:"w.sugg.sogou.com",productId:"web",sugFormName:"sf",inputid:"query",submitId:"stb",suggestRid:"01015002",normalRid:"01019900",useParent:1,sugglocation:"index",showVr:!0,showHotwords:!0,suggAbtestObject:o},/se 
2\.x/i.test(msBrowserName)&&(msIsSe=!0),/metasr/i.test(msBrowserName)&&(msIsMSearch=!0),queryinput&&msIsSe&&msIsMSearch&&(queryinput.addEventListener?(queryinput.addEventListener("keypress",c,!1),queryinput.addEventListener("keydown",c,!1)):queryinput.attachEvent?(queryinput.attachEvent("onkeypress",c),queryinput.attachEvent("onkeydown",c)):(queryinput.onkeypress=c,queryinput.onkeydown=c)),window.m_s_index=function(){var e=document.sf.query,o=Math.round(1e3*((new Date).getTime()+Math.random()));e.focus(),new RegExp("kw=([^&]+)").test(location.search)&&0==e.value.length&&(e.value=decodeURIComponent(RegExp.$1)),document.cookie.indexOf("SUV=")<0&&(document.cookie="SUV="+o+";path=/;expires=Sun, 29 July 2026 00:00:00 UTC;domain="+function(){var e=document.domain;return e.indexOf("sogou.com")==e.length-9?".sogou.com":e.indexOf("soso.com")==e.length-8?".soso.com":-1!=e.indexOf("sogo.com")?".sogo.com":void 0}()),n&&((new Image).src="//pb6.sogou.com/v6")},window.st=function(e,o,n,t){var s=document.sf.query,u=encodeURIComponent(s.value),i={news:"http://news.sogou.com/news?ie=utf8&query=",web:"web?ie=utf8&query=",weixin:"http://weixin.sogou.com/weixin?type=2&ie=utf8&query=",zhihu:"http://zhihu.sogou.com/zhihu?ie=utf8&query=",pic:"http://pic.sogou.com/pics?ie=utf8&query=",video:"https://v.sogou.com/v?ie=utf8&query=",myingyi:"https://www.sogou.com/web?m2web=mingyi.sogou.com&ie=utf8&query=",overseas:"http://english.sogou.com?b_o_e=1&ie=utf8&fr=pcweb_index_nav&query=",scholar:"http://scholar.sogou.com?ie=utf8&fr=common_index_nav&query=",fanyi:"http://fanyi.sogou.com/?fr=common_index_nav_pc&ie=utf8&keyword=",wenwen:"http://wenwen.sogou.com/s/?ch=websearch&w=",hanyu:"https://hanyu.sogou.com/?query=",science:"https://baike.sogou.com/kexue/home.htm?query="},r=i[n]||e.href;function 
c(e){return-1<e.indexOf("?")?"&":"?"}s&&""!==s.value&&(["hanyu"].includes(n)?r=r.match(/.*(?=\?query\=)/)[0]+{hanyu:{index:"",result:"result"}}[n].result+"?query="+u:i[n]?r=i[n]+u:0<r.indexOf("kw=")?r=r.replace(new RegExp("kw=[^&$]*"),"kw="+u):r+=c(r)+"kw="+u),o&&(r+=c(r)+"p="+o),t&&0<t.length&&(r+="#"+t),!s||""!=s.value||"wenwen"!=n&&"science"!=n||(r=e.href),e.href=r},window.cid=function(e,o){var n=document.sf.query,t=encodeURIComponent(n.value);t?"web2ww"===o?e.href+="s/?cid=web2ww&w="+t:"web2bk"===o&&(e.href+="Search.e?sp=S"+t+"&cid=web2bk"):e.href+="?cid="+o},window.m_s_index()}indexjsInit(false, {"suggestHistoryStrategy1":"","suggestHistoryStrategy2":"0|1|2|3|4|5|6|7|8","suggHistoryAbtest":""}, true, 'invaliduser', '', '');</script><script src="//dlweb.sogoucdn.com/pcsearch/web/index/js/suggbase_b9937f7.js"></script> <script src="//dlweb.sogoucdn.com/pcsearch/js/common/widget/index_login_b1cc5cb.js"></script><script src="//account.sogou.com/static/api/passport-async.js"></script> <script src="//dlweb.sogoucdn.com/pcsearch/web/index/js/searchbase_453304b.js"></script> <script defer="defer" async type="text/javascript" src="//dlweb.sogoucdn.com/barrier_free/pc/wzaV15/aria.js?appid=c4d5562ec7daa12a5a351cbe1a292da1" charset="utf-8"></script></body></html><!--zly-->
The request header is : {'Server': 'nginx', 'Date': 'Tue, 31 May 2022 03:15:08 GMT', 'Content-Type': 'text/html; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Vary': 'Accept-Encoding', 'Set-Cookie': 'ABTEST=7|1653966908|v17; expires=Thu, 30-Jun-22 03:15:08 GMT; path=/, IPLOC=CN5000; expires=Wed, 31-May-23 03:15:08 GMT; domain=.sogou.com; path=/, SUID=82F4937B364A910A000000006295883C; expires=Mon, 26-May-2042 03:15:08 GMT; domain=.sogou.com; path=/, black_passportid=; path=/; expires=Thu, 01 Jan 1970 00:00:00 GMT; domain=.sogou.com', 'P3P': 'CP="CURa ADMa DEVa PSAo PSDo OUR BUS UNI PUR INT DEM STA PRE COM NAV OTC NOI DSP COR", CP="CURa ADMa DEVa PSAo PSDo OUR BUS UNI PUR INT DEM STA PRE COM NAV OTC NOI DSP COR", CP="CURa ADMa DEVa PSAo PSDo OUR BUS UNI PUR INT DEM STA PRE COM NAV OTC NOI DSP COR"', 'Pragma': 'No-cache', 'Cache-Control': 'max-age=0', 'Expires': 'Tue, 31 May 2022 03:15:08 GMT', 'UUID': 'a09a4fc6-1144-4ddb-a028-df355ff57969', 'Content-Encoding': 'gzip'}
The request method is : <PreparedRequest [GET]>
The encoding method is : utf-8
Request URL url by : https://www.sogou.com/
cookies by : <RequestsCookieJar[<Cookie IPLOC=CN5000 for .sogou.com/>, <Cookie SUID=82F4937B364A910A000000006295883C for .sogou.com/>, <Cookie ABTEST=7|1653966908|v17 for www.sogou.com/>]>
Status code for : 200
The response type is : <class 'requests.models.Response'>
The content response type is : <class 'bytes'>
The text response type is : <class 'str'>
2.1 Get request method
Take Sogou as an example:
Right-click a blank area of the page and choose "Inspect", or press F12.

The developer tools panel opens: (1) click Network, (2) refresh the page, (3) select the first entry under Name, www.sogou.com.
The selected entry shows the request URL, the request method, the request headers, and other information.


2.2 Add request header
Adding request headers disguises the crawler as a browser, which gets it past simple anti-crawling checks.
Take Sogou as an example: type "Jackie Chan" into the search box, run the search, and press F12 to open the following page:

Note the URL and the request method (GET), then write a crawler.
First, without any request headers:
# Without request headers:
import requests

# Search-result URL copied from the browser
url = "https://www.sogou.com/web?query=%E6%88%90%E9%BE%99&_ast=1653967846&_asf=www.sogou.com&w=01029901&p=40040100&dp=1&cid=&s_from=result_up&sut=674&sst0=1653967851220&lkt=0%2C0%2C0&sugsuv=1653292431916060&sugtime=1653967851220"
response = requests.get(url)
print(response.text)  # Print the page source
The result is as follows:
<!DOCTYPE HTML>
<html>
<head>
<meta charset="utf-8">
<link rel="shortcut icon" href="//www.sogou.com/images/logo/new/favicon.ico?v=4" type="image/x-icon">
<title> Sogou search </title>
<link rel="stylesheet" href="static/css/anti.min.css?v=1"/>
<script src="//dlweb.sogoucdn.com/common/lib/jquery/jquery-1.11.0.min.js"></script>
<script src="static/js/antispider.min.js?v=3"></script>
<script>
var domain = getDomain();
window.imgCode = -1;
(function() {
function checkSNUID() {
var cookieArr = document.cookie.split('; '),
count = 0;
for(var i = 0, len = cookieArr.length; i < len; i++) {
if (cookieArr[i].indexOf('SNUID=') > -1) {
count++;
}
}
return count > 1;
}
if(checkSNUID()) {
var date = new Date(), expires;
date.setTime(date.getTime() -100000);
expires = date.toGMTString();
document.cookie = 'SNUID=1;path=/;expires=' + expires;
document.cookie = 'SNUID=1;path=/;expires=' + expires + ';domain=.www.sogo.com';
document.cookie = 'SNUID=1;path=/;expires=' + expires + ';domain=.weixin.sogo.com';
document.cookie = 'SNUID=1;path=/;expires=' + expires + ';domain=.sogo.com';
document.cookie = 'SNUID=1;path=/;expires=' + expires + ';domain=.www.sogou.com';
document.cookie = 'SNUID=1;path=/;expires=' + expires + ';domain=.weixin.sogou.com';
document.cookie = 'SNUID=1;path=/;expires=' + expires + ';domain=.sogou.com';
document.cookie = 'SNUID=1;path=/;expires=' + expires + ';domain=.snapshot.sogoucdn.com';
/*document.cookie = 'SNUID=1;path=/;expires=' + expires + ';domain=.zhinan.sogou.com';
document.cookie = 'SNUID=1;path=/;expires=' + expires + ';domain=.gouwu.sogou.com';
document.cookie = 'SNUID=1;path=/;expires=' + expires + ';domain=.ishop.sogou.com';*/
sendLog('delSNUID');
}
if(getCookie('seccodeRight') === 'success') {
sendLog('verifyLoop');
setCookie('seccodeRight', 1, getUTCString(-1), location.hostname, '/');
}
if(getCookie('refresh')) {
sendLog('refresh');
}
})();
function setImgCode(code) {
try {
var t = new Date().getTime() - imgRequestTime.getTime();
sendLog('imgCost',"cost="+t);
} catch (e) {
}
window.imgCode = code;
}
sendLog('index');
function changeImg2() {
if(window.event) {
window.event.returnValue=false
}
}
var suuid = "9321d62d-f547-4a1e-a150-c9527e7c82de";var auuid = "c918ed45-9536-4d74-b27b-6ca583856d4a"; </script>
</head>
<body>
<div class="header">
<div class="logo">
<a href="/">
<img width="180" height="60" src="static/images/logo_180x60.png" srcset="static/images/[email protected] 2x">
</a>
</div>
<div class="other"><span class="s1"> There was an error in your access </span><span class="s2"><a href="/"> Back to the home page >></a></span></div>
</div>
<div class="content-box">
<p class="ip-time-p">IP:123.147.244.130<br> Access time :2022.05.31 14:37:58<br>SourceVerifyCode:c9527e7c82de<br>From:www.sogou.com</p>
<p class="p2"> Hello, user , Our system has detected that there are abnormal access requests in your network .<br> This verification code is used to confirm that these requests are your normal behavior and not sent by an automated program , Need your help to verify .</p>
<p class="p3"><label for="seccodeInput"> Verification Code :</label></p>
<form name="authform" method="POST" id="seccodeForm" action="/">
<p class="p4">
<input type=text name="c" value="" placeholder=" Please enter the verification code " id="seccodeInput" autocomplete="off">
<input type="hidden" name="tc" id="tc" value="">
<input type="hidden" name="r" id="from" value="%2Fweb%3Fquery%3D%E6%88%90%E9%BE%99%26_ast%3D1653967846%26_asf%3Dwww.sogou.com%26w%3D01029901%26p%3D40040100%26dp%3D1%26cid%3D%26s_from%3Dresult_up%26sut%3D674%26sst0%3D1653967851220%26lkt%3D0%2C0%2C0%26sugsuv%3D1653292431916060%26sugtime%3D1653967851220" >
<input type="hidden" name="p" id="product" value="web_gd" >
<input type="hidden" name="m" value="f9ab5bf7a9587003b95025fada8f5ce5" > <span class="s1">
<script>imgRequestTime=new Date();</script>
<a onclick="changeImg2();" href="javascript:void(0)">
<img id="seccodeImage" onload="setImgCode(1)" onerror="setImgCode(0)" src="util/seccode.php?tc=1653979078" width="100" height="40" alt=" Please enter the verification code in the figure " title=" Please enter the verification code in the figure ">
</a>
</span>
<a href="javascript:void(0);" id="change-img" onclick="changeImg2();" style="padding-left:50px;"> In a </a>
<span class="s2" id="error-tips" style="display: none;"></span>
</p>
</form>
<p class="p5">
<a href="javascript:void(0);" id="submit"> Submit </a>
<span> The problem was not solved after submission ? welcome <a href="http://fankui.help.sogou.com/index.php/web/web/index?type=10&anti_time=1653979078&domain=www.sogou.com" target="_blank"> feedback </a>.</span>
<!--span> The problem was not solved after submission ? welcome <a href="http://fankui.help.sogou.com/index.php/web/web/index?type=10&anti_time=1653979078&domain=www.sogou.com&verifycode=c9527e7c82de" target="_blank"> feedback </a>.</span-->
</p>
</div>
<div id="ft"><a href="http://fuwu.sogou.com/" target="_blank"> Enterprise promotion </a><a href="http://corp.sogou.com/" target="_blank"> About Sogou </a><a href="/docs/terms.htm?v=1" target="_blank"> disclaimer </a><a href="http://fankui.help.sogou.com/index.php/web/web/index?type=10&anti_time=1653979078&domain=www.sogou.com" target="_blank"> Feedback </a><br> © 2022<span id="footer-year"></span> Sogou Inc. - <a href="http://www.miibeian.gov.cn" target="_blank" class="g"> Beijing ICP Prove 050897 Number </a> - Beijing public network security 1100<span class="ba">00000025 Number </span></div>
<script src="static/js/index.min.js?v=0.1.5"></script>
</body>
</html>
<!--zly-->
Clearly something is wrong: the response contains nothing about "Jackie Chan". Instead, Sogou returned the verification-code page it shows for abnormal access.
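A crawler can detect this situation instead of silently saving the wrong page. The sketch below is a heuristic, not documented Sogou behavior: it assumes the verification page can be recognized by two strings that appear in the output above (`antispider` and `seccodeImage`), and the function name `looks_like_verification_page` is made up for illustration.

```python
def looks_like_verification_page(html: str) -> bool:
    """Return True if the HTML contains markers seen on the verification page."""
    # Assumption: these markers come from the blocked response shown above;
    # the site may change them at any time.
    markers = ("antispider", "seccodeImage")
    return any(marker in html for marker in markers)


blocked_sample = '<script src="static/js/antispider.min.js?v=3"></script>'
normal_sample = "<html><title>search results</title></html>"
print(looks_like_verification_page(blocked_sample))  # True
print(looks_like_verification_page(normal_sample))   # False
```

In a real crawler the check would be applied to `response.text` right after the request.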
Solution:
1. Get the request headers:

2. Add the request headers:
import requests

# Request headers: a browser User-Agent disguises the crawler
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.41 Safari/537.36'
}
# Same search-result URL as before
url = "https://www.sogou.com/web?query=%E6%88%90%E9%BE%99&_ast=1653967846&_asf=www.sogou.com&w=01029901&p=40040100&dp=1&cid=&s_from=result_up&sut=674&sst0=1653967851220&lkt=0%2C0%2C0&sugsuv=1653292431916060&sugtime=1653967851220"
response = requests.get(url=url, headers=headers)
print(response.text)  # Print the page source
The result is as follows:
The output is too long to paste here, so a screenshot of the result is shown instead:


The page source is retrieved successfully.
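When several requests need the same disguise, the headers can be set once on a `requests.Session`. This is a minimal sketch, assuming the same User-Agent string as above; the request is only prepared, not sent, to show that the header is attached:

```python
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/101.0.4951.41 Safari/537.36"
}

session = requests.Session()
session.headers.update(headers)  # every request made through this session carries the header

prepared = session.prepare_request(
    requests.Request("GET", "https://www.sogou.com/web", params={"query": "成龙"})
)
print(prepared.headers["User-Agent"])
# To actually send it: response = session.send(prepared, timeout=10)
```

A session also keeps cookies between requests, which may matter for sites that set them on the first visit.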
However, the URL is very long. How can it be shortened?
# Original URL:
url ='https://www.sogou.com/web?query=%E6%88%90%E9%BE%99&_ast=1653985908&_asf=www.sogou.com&w=01029901&p=40040108&dp=1&cid=&s_from=result_up&sut=916&sst0=1653985949798&lkt=0%2C0%2C0&sugsuv=1653292431916060&sugtime=1653985949798'
# Simplified URL:
url ='https://www.sogou.com/web?query=%E6%88%90%E9%BE%99'
# Or, equivalently (成龙 is "Jackie Chan"):
url = 'https://www.sogou.com/web?query=成龙'
Delete everything after the query parameter in the address bar and press Enter: the same page loads, so this is a simplified version of the URL.

What if we want to search for something else?
First, look at how the original page is requested:

The response is controlled by the query parameter (query: Jackie Chan), so change the code as follows:
import requests

query = input('Enter the name of a star: ')
# Request headers: a browser User-Agent disguises the crawler
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.41 Safari/537.36'
}
# The f-string inserts the variable query into the URL
url = f"https://www.sogou.com/web?query={query}"
response = requests.get(url=url, headers=headers)  # The headers get past the simple anti-crawling check
print(response)
print(response.text)  # Print the page source
The result is as follows:

Success.
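Instead of building the URL by hand with an f-string, the keyword can be passed through the `params` argument so that `requests` percent-encodes it. The sketch below works under that assumption; `build_search_request` is a hypothetical helper name, and the request is prepared without being sent:

```python
import requests

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/101.0.4951.41 Safari/537.36"
}


def build_search_request(query: str) -> requests.PreparedRequest:
    """Prepare a Sogou search request; requests encodes the query parameter."""
    request = requests.Request(
        "GET", "https://www.sogou.com/web", params={"query": query}, headers=HEADERS
    )
    return request.prepare()


prepared = build_search_request("成龙")
print(prepared.url)  # https://www.sogou.com/web?query=%E6%88%90%E9%BE%99
```

To send it, something like `requests.Session().send(prepared, timeout=10)` followed by `response.raise_for_status()` would do; the timeout value is an arbitrary choice.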
3. Year-end bonus: capturing beautiful pictures
Hands-on practice:
First, get the download address of the picture:

There is no second step. Finally, download the picture:
import requests

url = 'https://i01piccdn.sogoucdn.com/a2df911ea958c157'
response = requests.get(url)
# Create liuyifei.jpg in the current directory and open it in binary write mode as f
with open('liuyifei.jpg', 'wb') as f:
    f.write(response.content)  # response.content holds the raw image bytes
print("Download successful!")
The result is as follows:
