English translation too hard? In a fit of rage I wrote two translation scripts with a web crawler
2022-07-07 07:23:00 【Hall owner a Niu】
Personal profile
- About the author: Hello everyone, I'm Daniel
- Home page: Hall owner a Niu
- Support me: like, bookmark, and leave a comment.
- Series column: Python web crawlers
- Motto: So far my whole life has been about failure, but that doesn't stop me from moving forward!

Preface
Here it comes! As a programmer, I can't stand being unable to translate an English sentence, so a script has to be arranged!
Baidu Translate (easy)
Analysis
Open Baidu Translate, press F12 to open the developer tools, and select the All filter in the Network panel. As you type the text you want to translate, a request named sug appears in the list. Its URL is the interface we will call, and its form parameter is kw.

Code
import requests

post_url = 'https://fanyi.baidu.com/sug'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36'
}
word = input('Enter the text you want to translate (any language): ')
data = {
    'kw': word
}
response = requests.post(url=post_url, data=data, headers=headers)
dic_obj = response.json()  # parse the JSON response body into a dict
print(dic_obj['data'][0]['v'])
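The last line of the script indexes data[0] directly, which raises an IndexError if the interface returns no suggestions. Below is a minimal sketch of a more defensive parse. It assumes the reply has the shape the script already relies on (a data list of objects with a v field). The function name, the k field, and the sample payloads are assumptions for illustration, not captured output.

```python
def first_suggestion(payload):
    """Return the 'v' text of the first suggestion, or None if there is none."""
    items = payload.get("data") or []
    if not items:
        return None
    return items[0].get("v")


# Hypothetical payloads shaped like the dict returned by response.json()
sample = {"data": [{"k": "hello", "v": "int. sample translation text"}]}
empty = {"data": []}

print(first_suggestion(sample))
print(first_suggestion(empty))
```

In the script, this would replace the final print with a check for None before printing.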
Result


Youdao Translate (hard)
Analysis (JS reverse engineering)
Press F12 to open the developer tools and, in the Network panel, select the XHR filter (where AJAX requests are listed) to find the interface shown in the figure below.
Then look at the parameters:

Comparing the two screenshots shows that i is the sentence we want to translate, while the parameters underlined in green change from request to request and need to be handled. lts is clearly a 13-digit timestamp. salt is one digit longer than lts and shares its first 13 digits, so it is presumably a salted timestamp (appending extra digits or characters to a value before encrypting it is called salting). Both parameters could be simulated directly in Python, but to avoid unnecessary trouble, and for readers who are not comfortable doing that, we simply locate the JS statements that produce them and run that JS from Python.
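The two parameters can also be produced in pure Python rather than by running JS. This is a minimal sketch, assuming lts is the current Unix time in milliseconds and salt is lts with one random digit appended, as the comparison of the two requests suggests. The function name is hypothetical.

```python
import random
import time


def make_lts_and_salt():
    """Return (lts, salt): a millisecond timestamp string and the same value plus one random digit."""
    lts = str(int(time.time() * 1000))
    salt = lts + str(random.randint(0, 9))
    return lts, salt


lts, salt = make_lts_and_salt()
print(lts, salt)
```

Generating both values in one call keeps them consistent, since salt has to start with the same 13 digits as lts.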
The sign value is 32 characters long at a glance, so it is probably produced by some encryption algorithm; the most common are MD5 and RSA. Let's do a global search through the JS to reverse it:

The search turns up an old friend, MD5, together with the code that generates the parameters. In the JS shown in the figure, r is the timestamp, i is the salted timestamp, and sign is the MD5 of the string in parentheses. Where e comes from can be found by setting a breakpoint and debugging.
You can see that e is the text we want to translate, so all the parameters are now clear. The simplest approach would be to compute sign with the md5 function in Python's hashlib module, but we won't do that here. To raise the difficulty and practice JS reverse engineering, I extracted the MD5 JS code into a file and put it on a network drive. You can download it yourself and use it in the code.
Link: https://pan.baidu.com/s/1aV1tEo35Oyw4TUExhJoXUA
Extraction code: waan
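For comparison, the hashlib route mentioned above takes only a few lines. This is a sketch under assumptions: the exact pieces that the site's JS concatenates before hashing are visible only in the screenshot, so the client and secret arguments, their order, and the function name here are placeholders to be replaced with whatever the JS actually uses.

```python
import hashlib


def make_sign(client, text, salt, secret):
    """MD5 hex digest of the concatenated parts (the order is an assumption)."""
    raw = client + text + salt + secret
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


# Placeholder values, not the real client id or secret
sign = make_sign("client-id", "hello", "16266621991923", "secret")
print(len(sign))  # an MD5 hex digest is always 32 characters
```

The 32-character length matches what was observed for sign in the request parameters.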
Also, to get past the site's anti-scraping checks, send Cookie and Referer in addition to User-Agent.
Code
import requests
import execjs  # runs JS code from Python
import json
import jsonpath


class Youdao():
    def __init__(self, msg):
        # url
        self.url = 'https://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'
        # headers (copy the Cookie from your own browser session)
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
            'Cookie': 'OUTFOX_SEARCH_USER_ID=[email protected]; OUTFOX_SEARCH_USER_ID_NCOO=39238000.072458096; JSESSIONID=aaak-QLUNaabh_wFWK8Qx; ___rl__test__cookies=1626662199192',
            'Referer': 'https://fanyi.youdao.com/'
        }
        self.msg = msg
        self.Formdata = None

    def js_Formdata(self):
        # timestamp in milliseconds, as a string
        r = execjs.eval('"" + (new Date).getTime()')
        # salted timestamp: the timestamp followed by one random digit
        i = r + str(execjs.eval('parseInt(10 * Math.random(), 10)'))
        with open('./youdao.js', 'r', encoding='utf-8') as f:
            ctx = execjs.compile(f.read())
        # call getsign in youdao.js with the text to translate and the salted timestamp
        sign = ctx.call('getsign', self.msg, i)
        self.Formdata = {
            'i': self.msg,
            'from': 'AUTO',
            'to': 'AUTO',
            'smartresult': 'dict',
            'client': 'fanyideskweb',
            'salt': i,
            'sign': sign,
            'lts': r,
            'bv': 'f46e446c6db49492797b7d03ea1e82da',
            'doctype': 'json',
            'version': '2.1',
            'keyfrom': 'fanyi.web',
            'action': 'FY_BY_REALTlME',
        }

    def response(self):
        resp = requests.post(url=self.url, data=self.Formdata, headers=self.headers).text
        data = json.loads(resp)  # parse the JSON text into a dict
        # extract the data with jsonpath
        if "translateResult" in data:
            k = jsonpath.jsonpath(data, '$..translateResult')[0][0][0]['tgt']
            print(k)
            print("Other translations:")
        if "smartResult" in data:
            lst = jsonpath.jsonpath(data, '$..entries')[0]
            for k in lst[1:]:
                k = k.replace("\r\n", "")
                print(k)

    def main(self):
        # build the form data
        self.js_Formdata()
        # print(self.Formdata)
        # send the request and handle the response
        self.response()


if __name__ == '__main__':
    msg = input('Enter the word or sentence you want to translate: ')
    youdao = Youdao(msg)
    youdao.main()
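The jsonpath lookups in response() can also be written as plain dictionary access, which makes the assumed shape of the reply explicit. This is a sketch, assuming translateResult is a list of lists of objects with a tgt field and that entries sits directly under smartResult (the script only shows that an entries key exists somewhere in the reply). The function name, the src field, and the sample payload are invented for illustration.

```python
def parse_reply(data):
    """Return (main_translation, other_entries) from a decoded reply dict."""
    main = None
    if data.get("translateResult"):
        main = data["translateResult"][0][0]["tgt"]
    entries = (data.get("smartResult") or {}).get("entries") or []
    # Skip the first entry, as the script does, and strip line breaks
    others = [e.replace("\r\n", "") for e in entries[1:]]
    return main, others


# Hypothetical payload shaped like the decoded JSON reply
sample = {
    "translateResult": [[{"src": "hello", "tgt": "sample target text"}]],
    "smartResult": {"entries": ["", "n. greeting\r\n", "int. hi\r\n"]},
}
main, others = parse_reply(sample)
print(main)
print(others)
```

Returning the values instead of printing them also makes the parsing step easy to check without sending a request.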
Result


Conclusion
If you found this post useful, please like, bookmark, and comment!