Python crawler scraping Wikipedia fails with [Errno 65] No route to host
The code is as follows:

# -*- coding: utf-8 -*-
import bs4
import re
import requests
from bs4 import BeautifulSoup

def work(html):
    soup = BeautifulSoup(html, 'html.parser')
    print(soup.prettify())

use_data = {}
use_data['url'] = r'https://zh.wikipedia.org/zh/\%E9\%A2\%9C\%E8\%89\%B2\%E5\%88\%97\%E8\%A1\%A8'
proxy = {"http": "http://72.46.135.119:21071", "https": "https://72.46.135.119:21071"}  # shadowsocks server address
# response = requests.get(use_data['url'])
response = requests.get(use_data['url'], proxies=proxy, verify=False)
print type(requests.get(use_data['url']).text)  # check the encoding
response.encoding = 'gbk'
work(response.text)
With the proxy disabled (the proxy line commented out), running response = requests.get(use_data['url']) raises this error:
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='zh.wikipedia.org', port=443): Max retries exceeded with url: /zh/%5C%E9%5C%A2%5C%9C%5C%E8%5C%89%5C%B2%5C%E5%5C%88%5C%97%5C%E8%5C%A1%5C%A8 (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x109c3e810>: Failed to establish a new connection: [Errno 65] No route to host',))
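A side note on the URL in this traceback: the `%5C` sequences in the path are percent-encoded backslashes, which come from the `\%` pairs in the raw string. The sketch below (Python 3, standard library only; `strip_backslashes` is a hypothetical helper name) removes them and decodes the page title as a sanity check. It only cleans up the request path. `No route to host` is a network-level failure, so this change alone is not expected to make the request succeed.

```python
from urllib.parse import unquote, urlsplit

# The URL exactly as written in the question, with a backslash before each escape.
raw_url = r'https://zh.wikipedia.org/zh/\%E9\%A2\%9C\%E8\%89\%B2\%E5\%88\%97\%E8\%A1\%A8'


def strip_backslashes(url):
    """Remove the stray backslashes so the percent-escapes are sent unchanged."""
    return url.replace('\\', '')


clean_url = strip_backslashes(raw_url)

# Decode the last path segment to confirm it is the intended page title.
title = unquote(urlsplit(clean_url).path.rsplit('/', 1)[-1])

print(clean_url)
print(title)
```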
I then tried enabling the proxy, using a shadowsocks server I bought myself, but it fails with an error saying it cannot connect to the proxy:
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='zh.wikipedia.org', port=443): Max retries exceeded with url: /zh/%5C%E9%5C%A2%5C%9C%5C%E8%5C%89%5C%B2%5C%E5%5C%88%5C%97%5C%E8%5C%A1%5C%A8 (Caused by ProxyError('Cannot connect to proxy.', error(54, 'Connection reset by peer')))
Can a Python crawler use a shadowsocks server as its proxy? If not, what is a suitable and convenient way to proxy requests to Wikipedia? Thanks.
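One possible direction, sketched under assumptions: a shadowsocks server speaks its own encrypted protocol rather than HTTP proxy, so pointing `proxies` directly at the server address is likely what gets the connection reset. The usual setup is to run a shadowsocks client locally and route requests through the SOCKS5 port it exposes. The address `127.0.0.1:1080` below is an assumption (a common local default, not taken from the question), `build_proxies` and `fetch` are hypothetical helper names, and SOCKS support in requests needs the extra dependency installed with `pip install requests[socks]`.

```python
# Assumption: a shadowsocks client is running on this machine and exposes
# a SOCKS5 proxy on 127.0.0.1:1080. Adjust to match the local client config.
LOCAL_SOCKS = 'socks5h://127.0.0.1:1080'  # socks5h: DNS is resolved via the proxy


def build_proxies(socks_url=LOCAL_SOCKS):
    """Return a requests-style proxies mapping that sends HTTP and HTTPS via SOCKS5."""
    return {'http': socks_url, 'https': socks_url}


def fetch(url, proxies=None, timeout=10):
    """Fetch a page through the given proxies and return its text as UTF-8."""
    # Imported here so build_proxies stays usable without requests installed.
    import requests

    response = requests.get(url, proxies=proxies, timeout=timeout)
    response.raise_for_status()
    response.encoding = 'utf-8'  # Wikipedia serves UTF-8, not GBK
    return response.text
```

With the local client running, `fetch(url, proxies=build_proxies())` would replace the `requests.get(...)` call in the question, and the returned text can be passed to `work` as before.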
Article from 编橙之家.