English translation is too difficult? I wrote two translation scripts with crawler in a rage
2022-07-07 07:23:00 【Hall owner a Niu】
Personal profile
- About the author: Hello everyone, I'm Daniel
- Personal home page: Hall owner a Niu
- Support me: like, bookmark, and leave a comment.
- Series column: Python web crawlers
- Motto: My life so far has been full of failures, but that doesn't stop me from moving forward!
Preface
Here it comes! As a programmer, I can't stand being unable to translate an English sentence, so a script had to be written!
Baidu Translate (easy)
Analysis
Open Baidu Translate, press F12, and switch to the All tab of the Network panel. As you type the text you want to translate, a request named sug appears in the list. Its URL is the endpoint we need, and its only form parameter is kw.
Code
import requests

# Suggestion endpoint found in the Network panel
post_url = 'https://fanyi.baidu.com/sug'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36'
}
word = input('Enter the text to translate (any language): ')
data = {
    'kw': word
}
response = requests.post(url=post_url, data=data, headers=headers)
dic_obj = response.json()  # parse the JSON response into a dict
print(dic_obj['data'][0]['v'])
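The script above prints only the first entry and raises an IndexError when the endpoint returns no suggestions. A minimal sketch of a more defensive parser follows. It assumes the response has the shape {'data': [{'k': ..., 'v': ...}, ...]}: 'v' is the field used above, while 'k' (the matched word) is an assumption the post does not confirm. The helper name format_suggestions and the sample data are made up for illustration.

```python
def format_suggestions(payload):
    """Return one 'word: meaning' line per suggestion in a sug-style response."""
    lines = []
    for entry in payload.get('data') or []:
        meaning = entry.get('v', '')
        word = entry.get('k', '')  # 'k' is an assumed field name
        lines.append(f'{word}: {meaning}' if word else meaning)
    return lines


# Invented sample payload, not a real response from the endpoint
sample = {'data': [{'k': 'dog', 'v': 'n. a domestic animal'}, {'v': 'no key here'}]}
for line in format_suggestions(sample):
    print(line)

# An empty result yields an empty list instead of an IndexError
print(format_suggestions({'data': []}))
```

In the script, format_suggestions(response.json()) could replace the direct dic_obj['data'][0]['v'] lookup.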
result
Youdao Translate (hard)
Analysis (JS reverse engineering)
Press F12 to open the developer tools. Under the XHR filter of the Network panel (where AJAX requests are listed), find the request shown in the figure below.
Then look at its parameters:
Comparing the two screenshots, i is clearly the sentence we want to translate. The parameters marked in green change from request to request, so we have to generate them ourselves. lts looks like a 13-digit (millisecond) timestamp. salt is one digit longer than lts and its first 13 digits are identical, so it is presumably the timestamp with a random digit appended (adding extra digits or characters to a value before hashing it is called salting). Both parameters could be reproduced in pure Python, but to avoid unnecessary trouble, and for readers who are not sure how, we will locate the JS statements that produce them and run that JS from Python.
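For readers who prefer the pure-Python route mentioned above, here is a minimal sketch of generating the two values. It assumes, as the analysis suggests, that lts is the current time in milliseconds and salt is lts followed by one random digit; the function name make_lts_and_salt is hypothetical.

```python
import random
import time


def make_lts_and_salt():
    """Return (lts, salt): a millisecond timestamp and that timestamp plus one random digit."""
    lts = str(int(time.time() * 1000))      # 13 digits for current dates
    salt = lts + str(random.randint(0, 9))  # append one digit in 0..9
    return lts, salt


lts, salt = make_lts_and_salt()
print(lts, salt)
```

This mirrors the JS expressions (new Date).getTime() and parseInt(10 * Math.random(), 10) without needing a JS runtime.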
sign is 32 characters long at a glance, so it is probably produced by some hashing or encryption routine. MD5 and RSA are the most common choices on sites like this, so let's run a global search through the JS to reverse it:
The search turns up an old friend, MD5, along with the code that builds the parameters. In the JS shown in the figure, r is the timestamp, i is the salted timestamp, and sign is the MD5 hash of the string in the parentheses. To find out where e comes from, set a breakpoint and debug.
e turns out to be the text we want to translate, so every parameter is now accounted for. The simplest approach would be to compute sign with md5 from Python's hashlib module, but we will skip that here and raise the difficulty a little to practice JS reverse engineering. I extracted the MD5 JS code into a file and put it on a network drive; download it and use it in the code.
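The hashlib route mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the site's confirmed algorithm: the concatenation order (client + text + salt + secret) is assumed, and the secret string lives in the site's JS and is not reproduced in this post, so it appears here only as a placeholder. The function name make_sign is hypothetical.

```python
import hashlib


def make_sign(text, salt, client='fanyideskweb', secret='<secret-from-the-site-js>'):
    """Return the MD5 hex digest of client + text + salt + secret.

    The concatenation order and the secret are assumptions; check them
    against the JS found by the global search before relying on this.
    """
    raw = client + text + salt + secret
    return hashlib.md5(raw.encode('utf-8')).hexdigest()


sign = make_sign('hello', '16266621991925')
print(len(sign))  # an MD5 hex digest always has 32 characters
```

With the real secret filled in, this would replace the execjs call, at the cost of skipping the JS reverse-engineering practice.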
Link: https://pan.baidu.com/s/1aV1tEo35Oyw4TUExhJoXUA
Extraction code: waan
Also, to get past the anti-crawling checks, send Cookie and Referer headers in addition to User-Agent.
Code
import json

import execjs  # runs JavaScript from Python
import jsonpath
import requests


class Youdao():
    def __init__(self, msg):
        # Translation endpoint
        self.url = 'https://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'
        # Request headers: User-Agent plus Cookie and Referer
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
            # Replace the cookie with the one from your own browser session
            'Cookie': 'OUTFOX_SEARCH_USER_ID=[email protected]; OUTFOX_SEARCH_USER_ID_NCOO=39238000.072458096; JSESSIONID=aaak-QLUNaabh_wFWK8Qx; ___rl__test__cookies=1626662199192',
            'Referer': 'https://fanyi.youdao.com/'
        }
        self.msg = msg
        self.Formdata = None

    def js_Formdata(self):
        # Millisecond timestamp, as a string
        r = execjs.eval('"" + (new Date).getTime()')
        # Salt: the timestamp with one random digit appended
        i = r + str(execjs.eval('parseInt(10 * Math.random(), 10)'))
        # Load the extracted MD5 script
        with open('./youdao.js', 'r', encoding='utf-8') as f:
            ctx = execjs.compile(f.read())
        # Call getsign in youdao.js with the text to translate and the salt
        sign = ctx.call('getsign', self.msg, i)
        self.Formdata = {
            'i': self.msg,
            'from': 'AUTO',
            'to': 'AUTO',
            'smartresult': 'dict',
            'client': 'fanyideskweb',
            'salt': i,
            'sign': sign,
            'lts': r,
            'bv': 'f46e446c6db49492797b7d03ea1e82da',
            'doctype': 'json',
            'version': '2.1',
            'keyfrom': 'fanyi.web',
            'action': 'FY_BY_REALTlME',
        }

    def response(self):
        resp = requests.post(url=self.url, data=self.Formdata, headers=self.headers).text
        data = json.loads(resp)  # parse the JSON response into a dict
        # Extract the results with jsonpath
        if "translateResult" in data:
            k = jsonpath.jsonpath(data, '$..translateResult')[0][0][0]['tgt']
            print(k)
        if "smartResult" in data:
            print("Other translations:")
            lst = jsonpath.jsonpath(data, '$..entries')[0]
            for k in lst[1:]:
                k = k.replace("\r\n", "")
                print(k)

    def main(self):
        # Build the form data
        self.js_Formdata()
        # print(self.Formdata)
        # Send the request and print the response
        self.response()


if __name__ == '__main__':
    msg = input('Enter the word or sentence to translate: ')
    youdao = Youdao(msg)
    youdao.main()
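The extraction step in the response method can also be written with plain dictionary access, which makes it easy to try out without sending a request. The sketch below assumes the response shape implied by the jsonpath expressions in the script: translateResult is a nested list whose innermost item has a 'tgt' key, and smartResult holds an 'entries' list. The sample dict is invented for illustration, and the function name extract_translation is hypothetical.

```python
def extract_translation(data):
    """Return (main_translation, other_translations) from a parsed response dict."""
    main = None
    if data.get('translateResult'):
        main = data['translateResult'][0][0]['tgt']
    others = []
    entries = (data.get('smartResult') or {}).get('entries') or []
    for entry in entries[1:]:  # the first entry is skipped, as in the script above
        others.append(entry.replace('\r\n', ''))
    return main, others


# Invented sample in the assumed response shape, not real output from the site
sample = {
    'translateResult': [[{'src': 'hello', 'tgt': '你好'}]],
    'smartResult': {'entries': ['', 'n. greeting\r\n', 'int. hi\r\n']},
}
print(extract_translation(sample))
```

Returning the values instead of printing them keeps the parsing separate from the output, so the same function works whether the results go to the console or elsewhere.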
result
Conclusion
If you found this post helpful, please like, bookmark, and comment!