Python re: problem storing scraped data

Shared by Byrx.net on 2019-03-23 06:03:01. Comments (348)

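I enjoy reading Zhihu Daily, so I want to scrape its stories and keep them for later reading.
But I ran into a page like this one: http://daily.zhihu.com/story/4692091
When I scrape it and save the result to the database, only the first entry is stored.
I need both the titles and the contents; I am using scrapy and re.compile.
How can I match each title with its content and store all of them in the database?
I am still learning Python...
Scraping code: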

        ......
        item = ShenhuifuItem()
        sites = response.body
        i = sites
        items = []
        item['bid'] = re.compile('(\d+)').findall(response.url)[0]
        item['title'] = re.compile(r'<h2 class="question-title">(.*?)</h2>').findall(i)
        item['content'] = re.compile(r'<div class="content">(.*?)</div>', re.DOTALL).findall(i)
        item['author'] = re.compile(ur'<span class="author">(.*?)</span>').findall(i)
        for title in item['title']:
            item['title'] = title
        for content in item['content']:
            item['content'] = content
        for author in item['author']:
            if "，" in author:
                item['author'] = author[:-1]
            else:
                item['author'] = author
        items.append(item)
        yield item
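
One likely cause of the single stored row: each loop above overwrites the same field on one shared item, and only that one item is yielded. A minimal sketch of one possible fix follows, pairing the matches with zip and producing one record per question. It is written for Python 3 and uses plain dicts so it runs without scrapy. The function name parse_story and the dict layout are hypothetical, and it assumes every question on the page has exactly one content block and one author, appearing in the same order; pages where a question has several answers would need a different grouping.

```python
import re

# Same patterns as in the question; compiled once at module level.
ID_RE = re.compile(r'(\d+)')
TITLE_RE = re.compile(r'<h2 class="question-title">(.*?)</h2>', re.DOTALL)
CONTENT_RE = re.compile(r'<div class="content">(.*?)</div>', re.DOTALL)
AUTHOR_RE = re.compile(r'<span class="author">(.*?)</span>', re.DOTALL)


def parse_story(url, html):
    """Return one dict per (title, content, author) triple found in html.

    Hypothetical helper: assumes the three lists line up one-to-one.
    """
    bid = ID_RE.findall(url)[0]
    titles = TITLE_RE.findall(html)
    contents = CONTENT_RE.findall(html)
    authors = AUTHOR_RE.findall(html)

    rows = []
    # zip stops at the shortest list, so unmatched leftovers are dropped.
    for title, content, author in zip(titles, contents, authors):
        # Drop a trailing full-width comma after the author name.
        if author.endswith('，'):
            author = author[:-1]
        rows.append({
            'bid': bid,
            'title': title.strip(),
            'content': content.strip(),
            'author': author,
        })
    return rows
```

Inside the spider callback, something like `for row in parse_story(response.url, response.text): yield ShenhuifuItem(**row)` would then emit one item per question, assuming ShenhuifuItem declares these four fields; the item pipeline stores each yielded item separately.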
