
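Python Study Notes: Exploring Web Scraping

Baidu Translate

Imitation

Analyzing Baidu Dictionary:

  1. Open the developer tools with F12 and look at the Network tab.
  2. Try typing the word girl; a request is sent after every letter typed.
  3. The request URL is: https://fanyi.baidu.com/sug
  4. Request method: POST
  5. Under Network > All > Headers, the Form Data is kw: girl, so the key in the dictionary is kw.
  6. Check the response format: the response is JSON, so the json package is needed.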
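Steps 5 and 6 can be tried offline before touching the network. Here is a minimal sketch: the helper names build_form_body and parse_suggestions are my own, and SAMPLE_RESPONSE is a made-up payload that only assumes the response carries a data list of entries with k and v fields.

```python
import json
from urllib import parse


def build_form_body(word):
    """URL-encode the single form field 'kw' as bytes for a POST body."""
    return parse.urlencode({"kw": word}).encode("utf-8")


def parse_suggestions(raw_json):
    """Map each suggested key 'k' to its value 'v' from the 'data' list."""
    payload = json.loads(raw_json)
    return {item["k"]: item["v"] for item in payload.get("data", [])}


# Hypothetical sample response, made up for illustration only;
# it is not captured from the real endpoint.
SAMPLE_RESPONSE = json.dumps({
    "data": [
        {"k": "girl", "v": "n. girl (sample value)"},
        {"k": "girlfriend", "v": "n. girlfriend (sample value)"},
    ]
})

if __name__ == "__main__":
    print(build_form_body("girl"))
    print(parse_suggestions(SAMPLE_RESPONSE))
```

Keeping the encoding and the parsing in separate functions makes it easy to check each step on its own before sending a real request.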

Ahh, I have been slacking off way too much these past few days!!!


from urllib import request, parse, error
import json

baseURL = 'https://fanyi.baidu.com/sug'

try:
    word = input('INPUT:')
    # The endpoint expects a form-encoded POST body with the key 'kw'
    dat = parse.urlencode({"kw": word}).encode()
    with request.urlopen(baseURL, data=dat) as f:
        # print('Status:', f.status, f.reason)
        ret = f.read().decode()
    ret = json.loads(ret)
    # print(ret['data'])
    ls = [a['k'] for a in ret['data']]
    if word in ls:
        # Print the entry that matches the word, not just the first entry
        print('%s: %s' % (word, ret['data'][ls.index(word)]['v']))
        print('Similar expressions:')
    else:
        print('NOT FOUND. Did you mean:')
    print(ls)
except error.URLError as e:
    print(e)
# Nice!

# Upd on 8.11
import requests
import json

BaseURL = 'https://fanyi.baidu.com/sug'

try:
    word = input('INPUT:')
    # requests form-encodes the dict passed as data
    r = requests.post(BaseURL, data={'kw': word})
    r.raise_for_status()
    js = json.loads(r.text)
    print(js)
except requests.RequestException as e:
    print(e)

# requests is even nicer!

Baidu Translate API

Capturing the traffic shows that the translated content comes from https://fanyi.baidu.com/v2transapi?from=en&to=zh

Let’s try!

Damn, it turns out to be encrypted... I can't be bothered to dig into it, so let's look at this API instead.