Extracting all links from a web page with Python's HTMLParser

Shared by Byrx.net on 2019-03-23 11:03:50. Comments (423)


Example code using Python's HTMLParser:

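# Python 2 code: the HTMLParser module and urllib.urlopen are Python 2 names
import HTMLParser, urllib

class linkParser(HTMLParser.HTMLParser):
    def __init__(self):
        HTMLParser.HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Collect the href of every <a> tag; skip anchors without an href
        if tag == 'a':
            href = dict(attrs).get('href')
            if href:
                self.links.append(href)

# Read at most 200000 bytes of the page
htmlSource = urllib.urlopen("http://www.sharejs.com").read(200000)
p = linkParser()
p.feed(htmlSource)
for link in p.links:
    print link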
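The example above is written for Python 2. As a rough sketch of the same idea on Python 3, where the parser lives in `html.parser`, something like the following should work. The names `LinkParser`, `extract_links`, `base_url`, and `sample_html` are illustrative and not part of the original post, and the sketch parses an inline HTML string rather than fetching a live page so that it runs without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url  # hypothetical option, not in the original post
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # attrs is a list of (name, value) pairs; anchors may lack an href
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against base_url (unchanged if base_url is empty)
            self.links.append(urljoin(self.base_url, href))


def extract_links(html_text, base_url=""):
    """Return all link targets found in html_text, in document order."""
    parser = LinkParser(base_url)
    parser.feed(html_text)
    parser.close()
    return parser.links


# Inline sample page used instead of a network request
sample_html = (
    '<p><a href="/about">About</a> '
    '<a name="top">no href</a> '
    '<a href="http://www.sharejs.com/">Home</a></p>'
)
links = extract_links(sample_html, base_url="http://www.sharejs.com")
for link in links:
    print(link)
```

To run this against a live page, the HTML could be fetched with `urllib.request.urlopen(url).read()` and decoded to `str` before being passed to `extract_links`. The right encoding depends on the page, so that step is left out of the sketch.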
