Python crawler: scraping Toutiao (今日头条) with Selenium (AJAX asynchronous loading)

Shared by Byrx.net on 2019-08-31 02:08:34. Comments (521)


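from selenium import webdriver
from selenium.webdriver.common.by import By
from lxml import etree
from pyquery import PyQuery as pq
import time

driver = webdriver.Chrome()
driver.maximize_window()
driver.get('https://www.toutiao.com/')
# Implicit wait: element lookups retry for up to 10 seconds (set once per session).
driver.implicitly_wait(10)
# Open the "科技" (Technology) channel.
driver.find_element(By.LINK_TEXT, '科技').click()

# Scroll down in 500 px steps so the AJAX feed loads more items.
for x in range(3):
    js = "var q=document.documentElement.scrollTop=" + str(x * 500)
    driver.execute_script(js)
    time.sleep(2)
time.sleep(5)

# Parse the rendered HTML: pyquery parses and re-serializes it, lxml provides XPath.
page = driver.page_source
doc = pq(page)
doc = etree.HTML(str(doc))
contents = doc.xpath('//div[@class="wcommonFeed"]/ul/li')
print(contents)

# Append each title to toutiao.txt; items without a title are skipped.
for x in contents:
    title = x.xpath('div/div[1]/div/div[1]/a/text()')
    if title:
        title = title[0]
        with open('toutiao.txt', 'a+', encoding='utf8') as f:
            f.write(title + '\n')
        print(title)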
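The script above scrolls a fixed three times and relies on fixed sleeps. One possible alternative, sketched here under assumptions, is to keep scrolling until the page height stops growing. The helper name `scroll_until_stable` and its parameters `pause` and `max_rounds` are hypothetical and not part of the original script. It only uses `driver.execute_script`, and it assumes the feed extends the document height each time new items are loaded, which may not hold for every page.

```python
import time


def scroll_until_stable(driver, pause=2.0, max_rounds=10):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `driver` is any object with an execute_script(js) method, such as a
    Selenium WebDriver. Returns the number of scrolls that grew the page.
    """
    get_height = "return document.documentElement.scrollHeight"
    last_height = driver.execute_script(get_height)
    loaded = 0
    for _ in range(max_rounds):
        # Jump to the current bottom of the page to trigger the next AJAX load.
        driver.execute_script(
            "window.scrollTo(0, document.documentElement.scrollHeight);"
        )
        # Give the asynchronous request time to finish (hypothetical default).
        time.sleep(pause)
        new_height = driver.execute_script(get_height)
        if new_height == last_height:
            # No new content arrived, so stop scrolling.
            break
        last_height = new_height
        loaded += 1
    return loaded
```

In the script, a call such as `scroll_until_stable(driver)` could stand in for the `for x in range(3)` loop, with `max_rounds` capping the total number of scrolls so an endless feed does not keep the browser busy indefinitely.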
