Simple data scraping with Python


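This attempt failed. Even with headers and parameters identical to what Firefox shows, it does not work: the request returns a web page, but the page contains no data. I tried disabling JavaScript, after which a cookie appeared (with JavaScript enabled there is no cookie), and scraped with that cookie, still without success. When I checked again some time later, the cookie's content had not changed. It is a bit discouraging, but I am posting it anyway as a small task to come back to.

If any expert happens to pass by, I would appreciate your advice.

The scripts for the other two sites do work. I will post that code directly later and skip the walkthrough.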


Target site: http://www.geomag.bgs.ac.uk/data_service/models_compass/igrf_form.shtml

First, handle the conversion from a date to decimal years. This is a rough conversion with day-level precision only.

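def date2dy(year, month, day):
    # Days in each month of a common year
    months = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    oneyear = 365
    # Leap year: divisible by 400, or divisible by 4 but not by 100
    if year % 100 == 0:
        if year % 400 == 0:
            months[1] = 29
            oneyear = 366
    else:
        if year % 4 == 0:
            months[1] = 29
            oneyear = 366
    # Add up the full months before the given month
    days = 0
    i = 1
    while i < month:
        days = days + months[i - 1]  # the list is 0-based, i counts from January = 1
        i = i + 1
    days = days + day - 1
    # Divide by the actual length of this year, not a fixed 366
    return year + days / oneyear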
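As a cross-check on the conversion above, the same quantity can be computed with the standard datetime module, which takes care of leap years itself. This is only a sketch; the name decimal_year is made up for this example and is not part of the original script.

```python
from datetime import date


def decimal_year(year, month, day):
    """Rough decimal year with day-level precision (hypothetical helper)."""
    start = date(year, 1, 1)
    # Whole days elapsed since 1 January of the same year
    days_elapsed = (date(year, month, day) - start).days
    # 365 or 366, taken from the calendar itself
    days_in_year = (date(year + 1, 1, 1) - start).days
    return year + days_elapsed / days_in_year


print(round(decimal_year(2016, 12, 1), 2))
```

For 2016-12-01 this gives 335 elapsed days out of 366, which rounds to 2016.92.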

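The first small goal is to fetch the data for 2016-12-01.

Open the Firefox developer tools (F12) and switch to the Network tab.

Submit the form; the resulting request appears in the list.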

The useful pieces are the request headers, the request URL, and the parameters. Copy them into the program and give it a try.
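One way to confirm that the parameters really match is to compare the script's form fields with the raw request body shown in the Firefox Network tab. The sketch below uses only the standard library; diff_form is a hypothetical helper name, and the sample strings are placeholders rather than values captured from the site.

```python
from urllib.parse import parse_qsl, urlencode


def diff_form(mine, browser_body):
    """Return {field: (my_value, browser_value)} for every field that differs."""
    # keep_blank_values=True so empty fields such as "place=" are not dropped
    theirs = dict(parse_qsl(browser_body, keep_blank_values=True))
    keys = sorted(set(mine) | set(theirs))
    return {k: (mine.get(k), theirs.get(k)) for k in keys if mine.get(k) != theirs.get(k)}


# Placeholder data, not taken from a real capture
my_fields = {"date": "2016.92", "alt": "150", "place": ""}
print(urlencode(my_fields))  # the body the script would send
print(diff_form(my_fields, "date=2016.92&alt=150&place=&sv=n"))
```

An empty dictionary means both sides carry the same fields and values; anything else lists the fields to look at.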

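I spent a bit more than a day on this part and still could not get the data. I'm such a noob.jpg

Here is the code for now, in case an expert passes by and can offer advice.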

#!/usr/bin/python
import os
import sys

import requests

web_url = 'http://www.geomag.bgs.ac.uk/data_service/models_compass/igrf_form.shtml'
request_url = 'http://www.geomag.bgs.ac.uk/cgi-bin/igrfsynth'
# Save the raw response in the script's directory
filepath = os.path.join(sys.path[0], 'data_igrf_raw_.html')

# Headers copied from the Firefox request. Content-Length is not hard-coded
# because requests computes it from the body.
headers = {
    'Host': 'www.geomag.bgs.ac.uk',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; rv:53.0) Gecko/20100101 Firefox/53.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
    'Accept-Encoding': 'gzip, deflate',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Referer': web_url,
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1'
}
payload = {
    'name': '-',  # your name and email address
    'coord': '1',  # '1': Geodetic '2': Geocentric
    'date': '2016.92',  # decimal years
    'alt': '150',  # Altitude
    'place': '',
    'degmin': 'y',  # Position Coordinates: 'y': In Degrees and Minutes 'n': In Decimal Degrees
    'latd': '60',  # latitude degrees (degrees negative for south)
    'latm': '0',  # latitude minutes
    'lond': '120',  # longitude degrees (degrees negative for west)
    'lonm': '0',  # longitude minutes
    'tot': 'y',  # Total Intensity (F)
    'dec': 'y',  # Declination (D)
    'inc': 'y',  # Inclination (I)
    'hor': 'y',  # Horizontal Intensity (H)
    'nor': 'y',  # North Component (X)
    'eas': 'y',  # East Component (Y)
    'ver': 'y',  # Vertical Component (Z)
    'map': '0',  # Include a Map of the Location: '0': NO '1': YES
    'sv': 'n'
}
# If Secular Variation (rate of change) is needed, set 'sv': 'y'

r = requests.post(request_url, data=payload, headers=headers)
with open(filepath, 'w', encoding='utf-8') as fid:
    fid.write(r.text)
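Since a cookie came up during the investigation, one thing that may be worth trying is to open the form page first and then submit the form through the same cookie jar, so that whatever cookies the server sets are sent back automatically. The sketch below uses only the standard library instead of requests; build_post and fetch_with_cookies are hypothetical names, and it has not been confirmed that this makes the site return data.

```python
import http.cookiejar
import urllib.parse
import urllib.request

FORM_URL = "http://www.geomag.bgs.ac.uk/data_service/models_compass/igrf_form.shtml"
POST_URL = "http://www.geomag.bgs.ac.uk/cgi-bin/igrfsynth"


def build_post(fields):
    """Build (but do not send) the form submission for the given fields."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(
        POST_URL,
        data=body,  # supplying a body makes urllib send a POST
        headers={
            "Referer": FORM_URL,
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def fetch_with_cookies(fields, timeout=30):
    """Visit the form page, then submit, sharing one cookie jar (untested guess)."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # First request: only there to pick up any cookies the server sets
    opener.open(FORM_URL, timeout=timeout).close()
    # Second request: the actual form submission, with those cookies attached
    with opener.open(build_post(fields), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling fetch_with_cookies(payload) with the same payload dictionary as in the script returns the response text, which can then be written to a file as before.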
