[Python] Subdomain lookup script
Tags: domain
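Scripting is something you learn by doing: write enough of it and it starts to come naturally. Here is a subdomain lookup script I wrote myself. It is about as low-end as they come.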
#coding:utf-8
import re
import urllib2
from bs4 import BeautifulSoup

key = raw_input("please input top domain: ")
print "Query starting..."
title = []
domainlist = []
for n in xrange(1, 66):
    # Bing result offset: 1 for the first page, then 20, 30, ... 650
    if n != 1:
        n *= 10
    url = "https://cn.bing.com/search?q=domain:" + key + "&first=%s" % n
    try:
        req = urllib2.Request(url)
        resp = urllib2.urlopen(req).read()
        # Use BeautifulSoup to collect the result titles
        bsObj = BeautifulSoup(resp, "lxml")
        getList = bsObj.find_all("h2", {"class": ""})
        for t in getList:
            title.append(t.get_text())
        # Use a regex to collect the subdomains
        regex = re.compile('<cite>(.*?)</cite>').findall(resp)
        for i in regex:
            # Remove the scheme as a prefix; str.strip('https://') strips a set
            # of characters and can eat the start of the hostname
            domainlist.append(re.sub(r'^https?://', '', i.strip()).split('/')[0])
        # Print the titles and subdomains found so far
        for (i, j) in zip(title, domainlist):
            print "%-50s%-30s" % (i, j)
    except Exception, e:
        print e
print "Query finished..."
# Remove duplicate subdomains
domainlists = list(set(domainlist))
# Save the subdomains
with open('subdomain.txt', 'a') as fw:
    for line in domainlists:
        fw.write(line + '\n')
[Screenshot: the script running]
[Screenshot: the run results]
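The host-extraction and de-duplication step of the script can be tried on its own, without sending any requests. The following is only a minimal sketch in Python 3 (the script above is Python 2). The names `extract_hosts`, `CITE_RE` and `TAG_RE` are hypothetical helpers made up for this illustration, and it assumes the search result page puts each result URL inside a `<cite>` element, as the script's regex does.

```python
import re
from urllib.parse import urlsplit

# Assumption: each result URL sits inside a <cite> element, as in the script above.
CITE_RE = re.compile(r"<cite>(.*?)</cite>", re.S)
TAG_RE = re.compile(r"<[^>]+>")


def extract_hosts(html):
    """Return the unique hostnames found in <cite> elements, in first-seen order."""
    hosts = []
    seen = set()
    for raw in CITE_RE.findall(html):
        # Drop inline markup such as <strong> that may wrap part of the URL.
        text = TAG_RE.sub("", raw).strip()
        if not text:
            continue
        # Keep only the first whitespace-separated token.
        text = text.split()[0]
        # urlsplit only recognises a host when a scheme is present.
        if "://" not in text:
            text = "http://" + text
        host = urlsplit(text).hostname
        if host and host not in seen:
            seen.add(host)
            hosts.append(host)
    return hosts


if __name__ == "__main__":
    sample = (
        "<cite>https://shop.example.com/cart</cite>"
        "<cite>http://<strong>blog</strong>.example.com</cite>"
        "<cite>shop.example.com/other</cite>"
    )
    for host in extract_hosts(sample):
        print(host)
```

Parsing the URL is used here instead of `str.strip`, because `strip` removes a set of characters from both ends rather than a prefix: `"https://shop.example.com".strip("https://")` gives `"op.example.com"`. Tracking seen hosts in a set also keeps the output in the order the hosts were first found, which `list(set(...))` does not guarantee.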
Original article: https://www.cnblogs.com/peterpan0707007/p/8831183.html