Using spaCy for NLP in Python
Original: http://mp.weixin.qq.com/s/sqa-Ca2oXhvcPHJKg9PuVg
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The big grey dog ate all of the chocolate, but fortunately he wasn't sick!")

# Split on whitespace only
print(doc.text.split())

# token.orth_ returns each token's text, so punctuation comes out as separate tokens
print([token.orth_ for token in doc])

# Attributes ending in an underscore return strings; the same names without it return integer IDs
print([(token, token.orth_, token.orth) for token in doc])

# Tokenize, dropping punctuation and whitespace
print([token.orth_ for token in doc if not (token.is_punct or token.is_space)])

# Lemmatization: reduce words to their base form
practice = "practice practiced practicing"
nlp_practice = nlp(practice)
print([word.lemma_ for word in nlp_practice])

# Part-of-speech tagging: .pos_ gives the coarse-grained tag, .tag_ the fine-grained tag
doc2 = nlp("Conor's dog's toy was hidden under the man's sofa in the woman's house")
pos_tags = [(i, i.tag_) for i in doc2]
print(pos_tags)

# 's is tagged POS; use that tag to extract owners and the things they own
owners_possessions = []
for i in pos_tags:
    if i[1] == "POS":
        owner = i[0].nbor(-1)
        possession = i[0].nbor(1)
        owners_possessions.append((owner, possession))
print(owners_possessions)

# The same extraction as a list comprehension
print([(i[0].nbor(-1), i[0].nbor(1)) for i in pos_tags if i[1] == "POS"])

# Named entity recognition. PERSON is self-explanatory; NORP is nationalities or
# religious or political groups; GPE marks locations (cities, countries, etc.);
# DATE marks a specific date or date range; ORDINAL marks a word or number
# that expresses some kind of order.
wiki_obama = """Barack Obama is an American politician who served as the 44th President of the United States from 2009 to 2017. He is the first African American to have served as president, as well as the first born outside the contiguous United States."""
nlp_obama = nlp(wiki_obama)
print([(i, i.label_, i.label) for i in nlp_obama.ents])

# Split the text into sentences
for ix, sent in enumerate(nlp_obama.sents, 1):
    print("Sentence number {}:{}".format(ix, sent))
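The owner/possession step is easy to check without loading a model. The sketch below is an illustration only, not part of the original post: it replaces the spaCy tokens with a hand-written list of (word, tag) pairs, and the tags are typed in by hand as an assumption about what a tagger would return. The helper name `owners_and_possessions` is hypothetical. The logic is the same as above: find every token tagged POS and pair the word before it with the word after it.

```python
# Hand-labelled (word, fine-grained tag) pairs standing in for spaCy output.
# ASSUMPTION: these tags are written by hand for illustration, not produced by a model.
tagged = [
    ("Conor", "NNP"), ("'s", "POS"), ("dog", "NN"), ("'s", "POS"),
    ("toy", "NN"), ("was", "VBD"), ("hidden", "VBN"), ("under", "IN"),
    ("the", "DT"), ("man", "NN"), ("'s", "POS"), ("sofa", "NN"),
]


def owners_and_possessions(pairs):
    """Return (owner, possession) for every possessive marker tagged POS."""
    result = []
    for index, (_, tag) in enumerate(pairs):
        # Skip markers at either edge, where one of the neighbours is missing.
        if tag == "POS" and 0 < index < len(pairs) - 1:
            result.append((pairs[index - 1][0], pairs[index + 1][0]))
    return result


print(owners_and_possessions(tagged))
```

The explicit edge check is a design choice for this plain-list version: it keeps the function from indexing outside the list when a possessive marker is the first or last item.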