Python crawler scraping Wikipedia fails with [Errno 65] No route to host


The code is as follows:

# -*- coding: utf-8 -*-
import bs4
import re
import requests
from bs4 import BeautifulSoup

def work(html):
    soup = BeautifulSoup(html, 'html.parser')
    print(soup.prettify())

use_data = {}
use_data['url'] = r'https://zh.wikipedia.org/zh/\%E9\%A2\%9C\%E8\%89\%B2\%E5\%88\%97\%E8\%A1\%A8'
proxy = {"http": "http://72.46.135.119:21071",
         "https": "https://72.46.135.119:21071"}  # shadowsocks server address
# response = requests.get(use_data['url'])
response = requests.get(use_data['url'], proxies=proxy, verify=False)
print type(requests.get(use_data['url']).text)  # check the encoding
response.encoding = 'gbk'
work(response.text)

With the proxy disabled (the `proxy` lines commented out), executing `response = requests.get(use_data['url'])` raises:

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='zh.wikipedia.org', port=443): Max retries exceeded with url: /zh/%5C%E9%5C%A2%5C%9C%5C%E8%5C%89%5C%B2%5C%E5%5C%88%5C%97%5C%E8%5C%A1%5C%A8 (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x109c3e810>: Failed to establish a new connection: [Errno 65] No route to host',))
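Note that the `%5C` sequences in that traceback are percent-encoded backslashes: the raw string `r'...\%E9...'` keeps each `\` literally, so requests encodes them into the path. A minimal sketch of building the URL without them, assuming the target is the 颜色列表 ("list of colors") article that the original escapes spell out:

```python
from urllib.parse import quote

# Percent-encode the article title directly instead of hand-writing
# %XX escapes; this avoids the stray backslashes entirely.
title = '颜色列表'  # "list of colors"
url = 'https://zh.wikipedia.org/zh/' + quote(title)
print(url)  # https://zh.wikipedia.org/zh/%E9%A2%9C%E8%89%B2%E5%88%97%E8%A1%A8
```

(On Python 2, the equivalent is `urllib.quote(title)` with a UTF-8 encoded byte string.)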

I then tried enabling the proxy, pointing it at a shadowsocks server I bought, but it fails with a proxy connection error. Can a Python crawler use a shadowsocks server as its proxy? If not, what is a convenient way to proxy requests to Wikipedia? Thanks.

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='zh.wikipedia.org', port=443): Max retries exceeded with url: /zh/%5C%E9%5C%A2%5C%9C%5C%E8%5C%89%5C%B2%5C%E5%5C%88%5C%97%5C%E8%5C%A1%5C%A8 (Caused by ProxyError('Cannot connect to proxy.', error(54, 'Connection reset by peer')))
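One likely reason for the `ProxyError`: a shadowsocks server speaks its own encrypted protocol, not HTTP, so it cannot be used directly as an `http://host:port` proxy. The usual setup is to run the local shadowsocks client (ss-local), which exposes a SOCKS5 listener, and point requests at that. A sketch under those assumptions (the default local port 1080 and the `pip install requests[socks]` extra are assumptions about a typical setup, not something in the original post):

```python
import requests  # SOCKS support requires: pip install requests[socks]

def socks_proxies(port=1080):
    # Point at the local shadowsocks client (ss-local), not the remote
    # server; 'socks5h' makes DNS resolution also go through the proxy.
    addr = 'socks5h://127.0.0.1:%d' % port
    return {'http': addr, 'https': addr}

def fetch(url, port=1080):
    return requests.get(url, proxies=socks_proxies(port), timeout=10)

# Example (needs a running ss-local on 127.0.0.1:1080):
# response = fetch('https://zh.wikipedia.org/zh/%E9%A2%9C%E8%89%B2%E5%88%97%E8%A1%A8')
```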

Source: 编橙之家 article.
