Python code for downloading Tencent Books (腾讯读书) content as a TXT file


The Python code below uses Tencent Books (腾讯读书) as an example: it fetches a book's content, converts it to a TXT file, and downloads it.

It relies on Python 2 standard-library modules such as urllib2.
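urllib2 exists only in Python 2; in Python 3 the same urlopen function lives in urllib.request. Below is a minimal sketch of a fetch helper that runs on both. It assumes the pages are GBK-encoded (the charset the script declares in its output HTML). The name `fetch`, the `encoding` default, and the timeout value are illustrative and not part of the original script.

```python
try:
    from urllib2 import urlopen          # Python 2
except ImportError:
    from urllib.request import urlopen   # Python 3


def fetch(url, encoding="gbk", timeout=10):
    """Download `url` and return its body decoded to text.

    `encoding` is an assumption (GBK); bytes that cannot be decoded
    are replaced instead of raising an error.
    """
    response = urlopen(url, timeout=timeout)
    try:
        raw = response.read()
    finally:
        response.close()
    return raw.decode(encoding, "replace")
```

Decoding right after the download means the regular expressions operate on text rather than raw bytes, which is what Python 3's re module expects when the pattern is a str.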

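import re, os, urllib2

# Index page of the book; the chapter pages live under the same path.
url = 'http://book.qq.com/s/book/0/22/22707/'
# Chapter links look like <url><digits>.shtml
page_re = re.escape(url) + r'\d+\.shtml'

data = urllib2.urlopen(url).read()
pages = re.findall(page_re, data)

txt = []
for count, page in enumerate(pages, 1):
    print "downloading [%d/%d], %s" % (count, len(pages), page)
    html = urllib2.urlopen(page).read()
    # Grab the first <div id="content" ...>...</div> block (non-greedy match).
    m = re.findall(re.escape('<div id="content"') + '.*?' + re.escape('</div>'), html, re.DOTALL)
    if m:
        # Skip pages without a match; appending the empty list would break join() below.
        txt.append(m[0])

# Wrap the collected chapters in a minimal GBK-encoded HTML page.
f = open('downqq.html', 'wb')
f.write("""<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
             "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head><meta http-equiv="Content-Type" content="text/html;charset=GBK"><title></title></head>
<body>""")
f.write('\r\n\r\n\r\n'.join(txt))
f.write('</body></html>')
f.close()
print("DONE!")
# Open the result with the default application (this form works on Windows).
os.system("downqq.html")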
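The script writes the extracted `<div id="content">` blocks into an HTML file, so the markup is still present. To end up with plain text, each block can be stripped of its tags first. The following is a rough sketch, not the original author's method: the helper name `html_to_text` is hypothetical, and treating `<br>` and `</p>` as line breaks is an assumption about how the pages are marked up.

```python
import re

try:
    from html import unescape              # Python 3.4+
except ImportError:
    from HTMLParser import HTMLParser      # Python 2
    unescape = HTMLParser().unescape


def html_to_text(fragment):
    """Convert an HTML fragment (already decoded to text) into plain text."""
    # Assumed line-break markers: <br>, <br/>, and closing </p> tags.
    text = re.sub(r'<br\s*/?>|</p\s*>', u'\n', fragment, flags=re.I)
    # Drop every remaining tag.
    text = re.sub(r'<[^>]+>', u'', text)
    # Decode entities such as &amp; and turn non-breaking spaces into spaces.
    text = unescape(text).replace(u'\xa0', u' ')
    # Trim each line and discard the empty ones.
    lines = [line.strip() for line in text.splitlines()]
    return u'\n'.join(line for line in lines if line)
```

Applying `html_to_text` to each extracted block and writing the joined result to a `.txt` file would give a text-only download instead of an HTML page. A regex-based stripper like this is only adequate for simple, well-formed fragments; nested or malformed markup would call for a real HTML parser.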
                    

An article from 编橙之家.
