Python web scraping with requests; page-collection example: scraping person information from Sogou; requests automatically...

Shared by Byrx.net on 2023-03-26 01:03:25. Comments (29)

1. A first look at web scraping: using requests

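About requests:

Requests supports HTTP keep-alive and connection pooling, cookie-based sessions, file uploads, automatic detection of the response content encoding, internationalized URLs, and automatic encoding of POST data. Within a Session, requests reuses connections automatically (persistent connections, keep-alive).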
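
The connection reuse and cookie handling described above apply to requests made through a `Session` object. The following is a minimal sketch, not part of the original article: the helper name `make_session`, the `example-crawler/0.1` User-Agent string, and the pool size are illustrative assumptions.

```python
import requests
from requests.adapters import HTTPAdapter


def make_session(pool_size: int = 10) -> requests.Session:
    """Create a Session whose connection pool is shared by all of its requests."""
    session = requests.Session()
    # Headers set here are sent with every request made through this session.
    # The User-Agent value is a hypothetical placeholder.
    session.headers.update({"User-Agent": "example-crawler/0.1"})
    # The adapter owns the underlying connection pool; pool_size is an
    # illustrative value, not a recommendation.
    adapter = HTTPAdapter(pool_connections=pool_size, pool_maxsize=pool_size)
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session
```

With a session from `make_session()`, two successive calls such as `session.get('https://www.sogou.com/', timeout=10)` can share one open connection to the host, and cookies set by the server are kept in `session.cookies` and sent back on later requests.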

# Import the module
import requests
# Target URL
url = 'https://www.sogou.com/'
response = requests.get(url=url)  # Send the request and receive the response
# Read the response body as text
page_text = response.text
# Print it
print(page_text)
# Save it to a local file
with open('sogou.html', 'w', encoding='utf-8') as fp:
    fp.write(page_text)
print("done")
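
A slightly more defensive variant of the script above might set a timeout, treat HTTP error statuses as failures, and choose the text encoding explicitly before reading `response.text`. This is only a sketch under those assumptions; the function names `pick_encoding`, `fetch_page`, and `save_page` are hypothetical and do not come from the original article.

```python
from pathlib import Path
from typing import Optional

import requests


def pick_encoding(declared: Optional[str], apparent: Optional[str]) -> str:
    """Choose the encoding used to decode the response body.

    requests falls back to ISO-8859-1 for text responses whose headers carry
    no charset, so in that case the encoding guessed from the body is preferred.
    """
    if declared is None or declared.lower() == "iso-8859-1":
        return apparent or "utf-8"
    return declared


def fetch_page(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its text, raising on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    # Raise an exception for 4xx/5xx responses instead of saving an error page.
    response.raise_for_status()
    response.encoding = pick_encoding(response.encoding, response.apparent_encoding)
    return response.text


def save_page(text: str, path: str) -> Path:
    """Write the page text to disk as UTF-8 and return the resulting path."""
    target = Path(path)
    target.write_text(text, encoding="utf-8")
    return target
```

A call such as `save_page(fetch_page('https://www.sogou.com/'), 'sogou.html')` would then reproduce the original script's behavior, with a `requests.HTTPError` raised if the server answers with an error status.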

2. Page-collection example: scraping person information from Sogou

# Import the module and set the URL
import requests
url = 'https://www.sogou.com/web'

# Send a browser-like User-Agent so the request is less likely to be flagged as a crawler
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36'
}

# Prompt for the search term
name = input("Enter a person's name: ")

# Build the query parameters, mimicking the request a real browser sends
param = {
    'type': 'getpinyin',
    'query': name
}

# Send the request and receive the response
response = requests.get(url, params=param, headers=headers)

# Read the response body as text
page_txt = response.text

# Save the page
filename = name + '.html'
with open(filename, 'w', encoding='utf-8') as fp:
    fp.write(page_txt)
    print("succeed")
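
To see what the `params` argument does to the URL, and to guard against names that are not valid file names, something like the following could be used. It is a sketch only: `SEARCH_URL`, `build_search_url`, and `safe_filename` are hypothetical names introduced for illustration, and the set of replaced characters is an assumption about common file-system restrictions.

```python
import re

import requests

SEARCH_URL = "https://www.sogou.com/web"


def build_search_url(name: str) -> str:
    """Return the URL requests would send for this query, without any network I/O."""
    request = requests.Request(
        "GET", SEARCH_URL, params={"type": "getpinyin", "query": name}
    )
    # prepare() applies the same URL encoding that requests.get() performs.
    return request.prepare().url


def safe_filename(name: str, suffix: str = ".html") -> str:
    """Replace characters that are commonly invalid in file names with underscores."""
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', "_", name).strip("_")
    # Fall back to a fixed stem if nothing usable is left.
    return (cleaned or "result") + suffix
```

Non-ASCII names are percent-encoded as UTF-8 in the query string, so the search term does not need to be encoded by hand before being passed in `params`.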
