Python: handling XML parse errors caused by special characters
global:[2017-06-05 10:27:48.662313] [DEBUG] 输出fmsg_content <msg fromusername="li2571" encryptusername="v1_d[email protected]stranger" fromnickname="??? ? ?? ??????e??" content="我是" fullpy="?????????????" shortpy="?????????????" imagestatus="3" scene="3" country="CN" province="Hubei" city="Jingzhou" sign="〢铷裹爱,请深爱.、.|‖|‖▍.*" percard="1" sex="2" alias="LiLizouli713" weibo="" weibonickname="" albumflag="0" albumstyle="0" albumbgimgid="" snsflag="17" snsbgimgid="http://szmmsns.qpic.cn/mmsns/aicQlel8roa2oJPnj8q8Gf1ibVnDX1x5HD23xde644eAP8x0E5qtm69hGQ5e6GOquAkiaku39cAte8/0" snsbgobjectid="12548372440867024976" mhash="3247e9c6ea7921d63e672c5ede4e206e" mfullhash="3247e9c6ea7921d63e672c5ede4e206e" bigheadimgurl="http://wx.qlogo.cn/mmhead/ver_1/kjW4HogEYibLpboXT4mUDTUV9BhRnXAt0C4DW7JvQUY3Tia8yF8ibBBGF7wRv9vaaFdFcLne8GybjLlsVaTrrKNrP2Zjjlxtp9vGKEdcgCiaB44/0" smallheadimgurl="http://wx.qlogo.cn/mmhead/ver_1/kjW4HogEYibLpboXT4mUDTUV9BhRnXAt0C4DW7JvQUY3Tia8yF8ibBBGF7wRv9vaaFdFcLne8GybjLlsVaTrrKNrP2Zjjlxtp9vGKEdcgCiaB44/96" ticket="v2_8655444fac8ef7e3a277aeee973c6038a[email protected]stranger" opcode="2" googlecontact="" qrticket="" chatroomusername="" sourceusername="" sourcenickname=""><brandlist count="0" ver="652744432"></brandlist></msg>
global:[2017-06-05 10:27:48.662493] [DEBUG] shuchuadezhi .、. # this is fmsg_content[300:305]
global:[2017-06-05 10:27:48.662711] [ERROR] process_wechat_msg not well-formed (invalid token): line 1, column 301
import xml.etree.ElementTree as ET  # cElementTree is deprecated since Python 3.3 and removed in 3.9
xml_tree = ET.fromstring(fmsg_content)
Running this raises an error:
global:[2017-06-05 10:27:48.662711] [ERROR] process_wechat_msg not well-formed (invalid token): line 1, column 301
Logging the slice around the reported column, fmsg_content[300:305], shows the special symbols .、.
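The symbols visible in the log are legal in XML on their own, so the invalid token is most likely an invisible control character sitting between them. The following is a minimal sketch for inspecting such a slice. The helper name `describe_chars` and the sample string are hypothetical stand-ins, not taken from the original message.

```python
import unicodedata


def describe_chars(text):
    """Return (repr, code point, Unicode category) for each character."""
    return [
        (repr(ch), "U+%04X" % ord(ch), unicodedata.category(ch))
        for ch in text
    ]


# Hypothetical stand-in for fmsg_content[300:305]; the real slice may differ.
snippet = ".\x02、\x03."

for row in describe_chars(snippet):
    # Characters in category "Cc" are control characters.
    print(row)
```

Printing the `repr` and the category makes characters visible that a plain log line silently swallows.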
First attempt:
# parser = ET.XMLParser(encoding='utf-8')
# xml_tree = ET.fromstring(fmsg_content, parser=parser)
Result:
*** Error in `/usr/bin/python3': double free or corruption (!prev): 0x00000000012ae500 ***
A nasty pitfall: the interpreter itself crashes.
Final solution:
import re
# Strip the control characters that XML 1.0 forbids (tab, LF and CR are kept)
fmsg_content = re.sub(u"[\x00-\x08\x0b-\x0c\x0e-\x1f]+", u"", fmsg_content)
xml_tree = ET.fromstring(fmsg_content)
Once the illegal characters are stripped, the parse no longer fails.
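The cleanup step can be wrapped in a small helper so that every message goes through it before parsing. This is a sketch only: `parse_xml_lenient` and the sample message are hypothetical, and the helper assumes the input is already a `str`, not bytes.

```python
import re
import xml.etree.ElementTree as ET

# Control characters that XML 1.0 forbids: everything below 0x20
# except tab (0x09), line feed (0x0A) and carriage return (0x0D).
_ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def parse_xml_lenient(text):
    """Remove forbidden control characters, then parse the XML string."""
    cleaned = _ILLEGAL_XML_CHARS.sub("", text)
    return ET.fromstring(cleaned)


# Hypothetical message with control characters hidden inside an attribute.
sample = '<msg sign="a\x01.\x02b" sex="2"></msg>'

root = parse_xml_lenient(sample)
print(root.get("sign"))
```

Note that this drops the offending characters instead of escaping them, so the parsed attribute values are not byte-for-byte identical to the input. For display fields such as a signature that is usually acceptable.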