[Python3 Crawler] Recognizing CAPTCHAs with Yundama


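I originally set out to learn tesserocr for CAPTCHA recognition, but its recognition rate was not high enough, so I ended up learning how to use the Yundama (云打码) platform instead.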

The steps are as follows:

1. First register an account, then go to http://www.yundama.com/apidoc/YDM_SDK.html and download the PythonHTTP example.

2. Unzip the download and you will see several files. I am using Python 3.5, so I opened YDMHTTPDemo3.x.

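3. Edit the following fields in that file. The username and password are those of your own account, while appid and appkey have to be looked up in the developer console. The first time you use the platform you also need to create a new piece of software there before an appid and appkey exist.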

In the developer console, the "software code" is the appid and the "communication key" is the appkey.
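Rather than typing the four values straight into the demo file, one option is to read them from environment variables. The following is only a minimal sketch: the helper `load_ydm_credentials` and the variable names (`YDM_USERNAME` and so on) are hypothetical and are not part of the Yundama SDK.

```python
import os


def load_ydm_credentials(env=None):
    """Collect the four values the demo asks for from environment variables.

    The variable names are made up for this sketch; choose whatever suits you.
    """
    env = os.environ if env is None else env
    names = {
        'username': 'YDM_USERNAME',
        'password': 'YDM_PASSWORD',
        'appid': 'YDM_APPID',    # the "software code" from the developer console
        'appkey': 'YDM_APPKEY',  # the "communication key" from the developer console
    }
    # Fail early with a readable message if anything is missing or empty.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError('missing settings: ' + ', '.join(missing))
    return {field: env[var] for field, var in names.items()}
```

The returned dictionary can then be unpacked into whatever expects these four fields, which keeps the password and key out of the source file.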

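4. After filling in all the details, run the code. Most likely it will return 1007. Looking this up on the error code page (http://www.yundama.com/apidoc/YDM_ErrorCode.html) shows that the cause is an account with no balance.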

Top up the account in the user dashboard, run the code again, and the recognition result appears.
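To avoid looking the code up by hand each time, a small lookup can turn a returned code into a hint. This is a sketch under assumptions: the helper `explain_code` is hypothetical, the table holds only codes that appear in this post (1007 from the run above, plus the two values the YDMHttp wrapper class in this post returns on its own), and matching on the absolute value is a guess based on the wrapper treating negative `ret` values as errors.

```python
# Only codes that appear in this post; everything else is reported as unknown.
KNOWN_CODES = {
    1007: 'account has no balance; top up in the user dashboard',
    3003: 'no result arrived before the timeout (returned by the wrapper itself)',
    9001: 'empty response from the server (returned by the wrapper itself)',
}


def explain_code(code):
    """Return a short hint for a code, ignoring its sign."""
    hint = KNOWN_CODES.get(abs(code))
    if hint is None:
        return 'unknown code {}; check the error code page'.format(code)
    return hint


print(explain_code(1007))
```

For any code not listed here, the error code page linked above remains the place to look.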

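After these steps the Yundama platform is ready for recognizing CAPTCHAs. For convenience, you can create a YDMDemo.py that holds the account details, so that callers only need to pass in the CAPTCHA image: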

import json
import time

import requests


class YDMHttp:
    apiurl = 'http://api.yundama.com/api.php'
    username = ''
    password = ''
    appid = ''
    appkey = ''

    def __init__(self, username, password, appid, appkey):
        self.username = username
        self.password = password
        self.appid = str(appid)
        self.appkey = appkey

    def request(self, fields, files=None):
        # POST the form fields (and optional files) and parse the JSON reply
        response = self.post_url(self.apiurl, fields, files)
        response = json.loads(response)
        return response

    def balance(self):
        data = {'method': 'balance', 'username': self.username, 'password': self.password, 'appid': self.appid,
                'appkey': self.appkey}
        response = self.request(data)
        if response:
            if response['ret'] and response['ret'] < 0:
                return response['ret']
            else:
                return response['balance']
        else:
            return -9001

    def login(self):
        data = {'method': 'login', 'username': self.username, 'password': self.password, 'appid': self.appid,
                'appkey': self.appkey}
        response = self.request(data)
        if response:
            if response['ret'] and response['ret'] < 0:
                return response['ret']
            else:
                return response['uid']
        else:
            return -9001

    def upload(self, filename, codetype, timeout):
        data = {'method': 'upload', 'username': self.username, 'password': self.password, 'appid': self.appid,
                'appkey': self.appkey, 'codetype': str(codetype), 'timeout': str(timeout)}
        file = {'file': filename}
        response = self.request(data, file)
        if response:
            if response['ret'] and response['ret'] < 0:
                return response['ret']
            else:
                return response['cid']
        else:
            return -9001

    def result(self, cid):
        data = {'method': 'result', 'username': self.username, 'password': self.password, 'appid': self.appid,
                'appkey': self.appkey, 'cid': str(cid)}
        response = self.request(data)
        return response and response['text'] or ''

    def decode(self, filename, codetype, timeout):
        # Upload the image, then poll once per second until a result arrives
        cid = self.upload(filename, codetype, timeout)
        if cid > 0:
            for i in range(0, timeout):
                result = self.result(cid)
                if result != '':
                    return cid, result
                else:
                    time.sleep(1)
            return -3003, ''
        else:
            return cid, ''

    def report(self, cid):
        data = {'method': 'report', 'username': self.username, 'password': self.password, 'appid': self.appid,
                'appkey': self.appkey, 'cid': str(cid), 'flag': '0'}
        response = self.request(data)
        if response:
            return response['ret']
        else:
            return -9001

    def post_url(self, url, fields, files=None):
        # Open the files only for the duration of the upload so the handles get closed
        opened = {key: open(path, 'rb') for key, path in (files or {}).items()}
        try:
            res = requests.post(url, files=opened, data=fields)
        finally:
            for handle in opened.values():
                handle.close()
        return res.text


def use_ydm(filename):
    username = ''  # username
    password = ''  # password
    app_id = 1  # software ID
    app_key = ''  # software key
    code_type = 1004  # CAPTCHA type
    timeout = 60  # timeout in seconds
    yundama = YDMHttp(username, password, app_id, app_key)  # initialize
    balance = yundama.balance()  # query the balance
    print('Your remaining balance is {}'.format(balance))
    cid, result = yundama.decode(filename, code_type, timeout)  # start recognition
    print('The recognition result is {}'.format(result))
    return result
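The `decode` method uploads the image and then asks for the result once per second until some text comes back or the timeout runs out. The same polling pattern can be tried in isolation, without touching the network. In this sketch `poll_for_text` and `fake_result` are hypothetical names, and `fake_result` merely stands in for `YDMHttp.result(cid)` with made-up answers.

```python
import time


def poll_for_text(fetch, timeout, interval=1.0):
    """Call fetch() up to `timeout` times, pausing `interval` seconds between tries.

    Returns the first non-empty string, or '' if none arrives in time.
    """
    for _ in range(timeout):
        text = fetch()
        if text != '':
            return text
        time.sleep(interval)
    return ''


# Stand-in for YDMHttp.result(cid): empty twice, then a made-up answer.
answers = iter(['', '', 'ab12'])


def fake_result():
    return next(answers, '')


print(poll_for_text(fake_result, timeout=5, interval=0.01))  # prints ab12
```

Because the number of attempts equals the timeout and each attempt waits one second in the original, the timeout passed to `decode` is roughly the longest time a caller will wait for an answer.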

