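Today an old classmate asked me about a few uses of Python's urllib for web crawling, and we happened to run into an error along the way. So here is a short record of it…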
Runtime environment
Operating system: Windows 10
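Symptoms
I tried the urllib crawler code my old classmate sent me, whose goal is to send JSON in an HTTP request. (It was described as a GET request, but urllib sends a POST whenever data is supplied.) Sending the request under Python 3.6.2 raised an error.
Error message: "can't concat str to bytes."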
The failing code is as follows:
# send http
import urllib.request
The_id = "1"
# name = "233666888"
department = "233666888"
position = "233666888"
phone = "233666888"
email = "233666888"
data = {}
data['id'] = The_id
# data['name'] = name
data['department'] = department
data['position'] = position
data['phone'] = phone
data['email'] = email
my_headers = {'Content-Type': 'application/json'}
# url = 'http://172.19.237.1:8091/web/index.jsp'
url = 'http://httpbin.org/post'
# data is still a dict of str here, not bytes; urlopen() below raises the TypeError
my_request = urllib.request.Request(url,data = data,headers = my_headers)
my_responese = urllib.request.urlopen(my_request)
my_html = my_responese.read().decode('utf-8')
print(my_html)
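Solution
The request body that urllib sends must be bytes, and bytes cannot be concatenated with str. Passing the dict directly leaves str data in the body, which triggers the error.
Converting the dict with urllib.parse.urlencode(data).encode(encoding='UTF8') before passing it in fixes this.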
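The type mismatch and the conversion can be checked in isolation, without sending anything. The snippet below is only a minimal sketch: the two-field dict is a shortened stand-in for the data dict in this post, and the names query and mismatch are made up for illustration.

```python
import urllib.parse

# Shortened stand-in for the data dict used in this post.
data = {"id": "1", "department": "233666888"}

# bytes and str cannot be joined with "+"; Python raises a TypeError.
try:
    b"body: " + "id"
except TypeError as exc:
    mismatch = exc  # keep a reference; "exc" is cleared after the except block
    print(type(exc).__name__, exc)

# urlencode() builds a form-encoded str; encode() turns that str into bytes.
query = urllib.parse.urlencode(data)
params = query.encode("utf-8")
print(type(query).__name__, query)
print(type(params).__name__, params)
```

The bytes object in params is what can safely be passed as the data argument of urllib.request.Request.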
Modify that part of the code as follows:
# url = 'http://172.19.237.1:8091/web/index.jsp'
url = 'http://httpbin.org/post'
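import urllib.parse  # explicit import for urlencode
# Form-encode the dict into a str, then encode that str to bytes for the request body
params = urllib.parse.urlencode(data).encode(encoding='UTF8')
my_request = urllib.request.Request(url,data = params,headers = my_headers)
my_response = urllib.request.urlopen(my_request)
my_html = my_response.read().decode('utf-8')
print(my_html)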
That resolves the problem!
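One thing worth noting: urlencode produces a form-encoded body (id=1&department=...), not JSON, even though the header declares application/json. If the server really expects JSON, a possible variant is to serialize the dict with json.dumps and encode the result to bytes. This is only a sketch under that assumption; build_json_request is a hypothetical helper name, not part of urllib, and the dict is shortened for brevity.

```python
import json
import urllib.request


def build_json_request(url, payload):
    """Return a Request whose body is the JSON-encoded payload as bytes."""
    body = json.dumps(payload).encode("utf-8")  # str -> bytes
    headers = {"Content-Type": "application/json"}
    # Because data is supplied, urllib will send this as a POST.
    return urllib.request.Request(url, data=body, headers=headers)


# Shortened stand-in for the data dict used in this post.
data = {"id": "1", "department": "233666888"}
request = build_json_request("http://httpbin.org/post", data)
print(request.get_method())
print(request.data)

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Which of the two encodings is right depends on what the receiving server parses; the urlencode version above matches a form handler, while this one matches a JSON handler.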