Using BeautifulSoup in Python to parse HTML and get a site's Baidu index count

Shared by Byrx.net on 2019-03-23 11:03:46, comments (538)


BeautifulSoup makes parsing HTML very convenient. You mainly use its find() and findAll() methods to locate specific elements on a page.
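To make the difference between the two methods concrete, here is a minimal sketch. It assumes Python 3 and the newer bs4 package (installed with pip install beautifulsoup4), where findAll() is spelled find_all(); the article's own code targets the older BeautifulSoup 3 on Python 2. The sample HTML is made up for illustration.

```python
# Assumption: Python 3 with the bs4 package; SAMPLE_HTML is invented test data.
from bs4 import BeautifulSoup

SAMPLE_HTML = """
<div>
  <p class="site_tip">Found <strong>1234</strong> results</p>
  <p class="other">first</p>
  <p class="other">second</p>
</div>
"""

# "html.parser" is the parser bundled with Python, so nothing else is needed.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns the first matching tag, or None when nothing matches.
tip = soup.find("p", {"class": "site_tip"})
strong_text = tip.find("strong").string

# find_all() returns a list of every matching tag.
others = [p.string for p in soup.find_all("p", {"class": "other"})]

# A missing element yields None, so check before chaining further calls.
missing = soup.find("p", {"class": "no_such_class"})

print(strong_text, others, missing)
```

Because find() can return None, any code that chains a second call onto its result should check for None first.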

Installing BeautifulSoup

Install it from the command line with:

easy_install BeautifulSoup

Using BeautifulSoup

As an example, we will get the number of pages Baidu has indexed for a site, using the following Python code:

# -*- coding: cp936 -*-
import re
import urllib

from BeautifulSoup import BeautifulSoup


def get_baidu_records_count(host):
    url = 'http://www.baidu.com/s?wd=site%3A' + host
    data = urllib.urlopen(url)
    html = data.read()
    soup = BeautifulSoup(html)
    # Use find() to locate the p tag whose class is site_tip
    siteTipP = soup.find('p', {'class': 'site_tip'})
    if not siteTipP:
        return 0
    # Find the first strong tag inside that p tag
    strong = siteTipP.find('strong')
    if strong is None or strong.string is None:
        return 0
    # Use .string to get the text content of the strong tag
    text = strong.string
    numPattern = re.compile(r'\d+')
    m = numPattern.search(text)
    if m is None:
        return 0
    strCn = m.group(0)
    return int(strCn)


if __name__ == '__main__':
    host = 'OutOfMemory.CN'
    print '%s has %d pages indexed by Baidu' % (host, get_baidu_records_count(host))

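Running the program prints the number of pages Baidu has indexed for OutOfMemory.CN. Unfortunately the count is quite low at the moment, so there is still work to do!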
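The digit-extraction step at the end of get_baidu_records_count can be tried on its own, without any network access. The sketch below uses a hypothetical helper, extract_count, and assumes the count text may contain thousands separators such as 1,234; that is an assumption about the page, not something the article states. A plain \d+ pattern would stop at the first comma in that case.

```python
import re


def extract_count(text):
    """Return the first integer found in text, or 0 if there is none.

    Hypothetical helper: commas are treated as thousands separators.
    """
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return 0
    return int(match.group(0).replace(",", ""))


print(extract_count("1,234"))
print(extract_count("about 56 pages"))
print(extract_count("none"))
```

Returning 0 when no digits are found mirrors how the original function returns 0 when the site_tip paragraph is missing.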
