Python Data Mining - Regression - Neural Networks


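Concepts:

Neural network: short for artificial neural network (ANN), a mathematical or computational model that imitates the structure and function of a biological neural network (the central nervous system of animals, especially the brain).

Biological neural network: the nerve cell is the basic unit of the nervous system. It is called a biological neuron, or simply a neuron.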
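To make the analogy concrete, here is a minimal sketch of a single artificial neuron: it takes a weighted sum of its inputs, adds a bias, and passes the result through an activation function. The function name `neuron` and all numeric values are hypothetical, chosen only for illustration.

```python
import math

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)

# Hypothetical values, for illustration only.
inputs = [1.0, 2.0]
weights = [0.5, -0.25]
bias = 0.5

# z = 1.0*0.5 + 2.0*(-0.25) + 0.5 = 0.5, then tanh is applied to z
output = neuron(inputs, weights, bias, math.tanh)
print(output)
```

A full network stacks many such units in layers, with the outputs of one layer feeding the inputs of the next.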


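A network typically has three to five layers.

First, load the data and build the independent variables (inputs) and the dependent variable (output).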

    import pandas
    from pandas import read_csv

    data = read_csv(
        "C:\\Users\\Jw\\Desktop\\python_work\\Python数据挖掘实战课程课件\\4.5\\data.csv",
        encoding='utf8'
    )
    data = data.dropna()

    # Nominal columns, expanded below into dummy (0/1) variables
    dummyColumns = [
        'Gender', 'Home Ownership', 'Internet Connection', 'Marital Status',
        'Movie Selector', 'Prerec Format', 'TV Signal']

    for column in dummyColumns:
        data[column] = data[column].astype('category')

    dummiesData = pandas.get_dummies(
        data,
        columns=dummyColumns,
        prefix=dummyColumns,
        prefix_sep=" ",
        drop_first=True
    )

    # Education level is ordinal, so it is mapped to a rank instead of dummies
    educationLevelDict = {
        'Post-Doc': 9,
        'Doctorate': 8,
        "Master's Degree": 7,
        "Bachelor's Degree": 6,
        "Associate's Degree": 5,
        'Some College': 4,
        'Trade School': 3,
        'High School': 2,
        'Grade School': 1
    }

    dummiesData['Education Level Map'] = dummiesData['Education Level'].map(educationLevelDict)

    # Frequency columns are ordinal as well
    freqMap = {
        'Never': 0,
        'Rarely': 1,
        'Monthly': 2,
        'Weekly': 3,
        'Daily': 4
    }
    dummiesData['PPV Freq Map'] = dummiesData['PPV Freq'].map(freqMap)
    dummiesData['Theater Freq Map'] = dummiesData['Theater Freq'].map(freqMap)
    dummiesData['TV Movie Freq Map'] = dummiesData['TV Movie Freq'].map(freqMap)
    dummiesData['Prerec Buying Freq Map'] = dummiesData['Prerec Buying Freq'].map(freqMap)
    dummiesData['Prerec Renting Freq Map'] = dummiesData['Prerec Renting Freq'].map(freqMap)
    dummiesData['Prerec Viewing Freq Map'] = dummiesData['Prerec Viewing Freq'].map(freqMap)

    # Input features: numeric columns, ordinal ranks, and dummy variables
    dummiesSelect = [
        'Age', 'Num Bathrooms', 'Num Bedrooms', 'Num Cars', 'Num Children', 'Num TVs',
        'Education Level Map', 'PPV Freq Map', 'Theater Freq Map', 'TV Movie Freq Map',
        'Prerec Buying Freq Map', 'Prerec Renting Freq Map', 'Prerec Viewing Freq Map',
        'Gender Male',
        'Internet Connection DSL', 'Internet Connection Dial-Up',
        'Internet Connection IDSN', 'Internet Connection No Internet Connection',
        'Internet Connection Other',
        'Marital Status Married', 'Marital Status Never Married',
        'Marital Status Other', 'Marital Status Separated',
        'Movie Selector Me', 'Movie Selector Other', 'Movie Selector Spouse/Partner',
        'Prerec Format DVD', 'Prerec Format Laserdisk', 'Prerec Format Other',
        'Prerec Format VHS', 'Prerec Format Video CD',
        'TV Signal Analog antennae', 'TV Signal Cable',
        'TV Signal Digital Satellite', "TV Signal Don't watch TV"
    ]

    inputData = dummiesData[dummiesSelect]

    # Target: whether the home is rented
    outputData = dummiesData[['Home Ownership Rent']]
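One thing worth checking in the mapping step: `Series.map` with a dictionary yields NaN for any label that is missing from the dictionary, and because `dropna()` runs before the mapping, such gaps would pass through unnoticed. The sketch below uses a made-up sample (the label 'Sometimes' is hypothetical and not taken from the course data) to show how unmapped labels can be listed.

```python
import pandas as pd

freq_map = {'Never': 0, 'Rarely': 1, 'Monthly': 2, 'Weekly': 3, 'Daily': 4}

# Hypothetical sample; 'Sometimes' is deliberately absent from freq_map.
sample = pd.Series(['Never', 'Weekly', 'Sometimes'])

mapped = sample.map(freq_map)              # labels not in the dict become NaN
unmapped = sample[mapped.isna()].unique()  # the labels that failed to map
print(list(unmapped))
```

If the list is non-empty, either the dictionary needs another entry or those rows need to be handled before training.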

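Import the MLPClassifier class from sklearn.neural_network, then train and score the model several times.

activation="relu" sets the activation function; relu is also the default. It fills the role that the S-shaped (sigmoid) function plays in classic networks. hidden_layer_sizes sets the number of neurons in each hidden layer, not the number of layers: a single integer, as in the loop below, means one hidden layer with that many neurons.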
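In scikit-learn, the i-th element of hidden_layer_sizes is the number of neurons in the i-th hidden layer. The pure-Python sketch below works out the weight-matrix shapes this implies. The helper name `weight_shapes` is hypothetical, and the sizes assume the 35 input columns listed in dummiesSelect and a single output unit for a binary target.

```python
def weight_shapes(n_inputs, hidden_layer_sizes, n_outputs):
    """Return the (rows, cols) shape of each weight matrix in the network."""
    # A bare integer is treated as one hidden layer of that size.
    if isinstance(hidden_layer_sizes, int):
        hidden_layer_sizes = (hidden_layer_sizes,)
    sizes = [n_inputs, *hidden_layer_sizes, n_outputs]
    # One weight matrix connects each pair of consecutive layers.
    return list(zip(sizes[:-1], sizes[1:]))

one_hidden = weight_shapes(35, 10, 1)       # input -> 10 neurons -> output
two_hidden = weight_shapes(35, (10, 5), 1)  # input -> 10 -> 5 -> output
print(one_hidden)
print(two_hidden)
```

So raising the integer makes the single hidden layer wider; adding elements to the tuple makes the network deeper.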

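activation: the activation function

  √ relu: the rectified linear unit. It is often preferred over logistic and tanh; the rationale given here is that it behaves more like a biological neuron (either inactive, or responding gradually once active)

  √ logistic: the logistic (sigmoid) function

  √ tanh: the hyperbolic tangent function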
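The three options can be written out directly. This is a plain-Python sketch of the standard definitions, not scikit-learn's internal code.

```python
import math

def relu(z):
    """Zero for negative input, identity for positive input."""
    return max(0.0, z)

def logistic(z):
    """S-shaped curve with outputs between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    """S-shaped curve with outputs between -1 and 1."""
    return math.tanh(z)

# Compare the three functions at a few sample points.
for z in (-2.0, 0.0, 2.0):
    print(z, relu(z), round(logistic(z), 4), round(tanh(z), 4))
```

relu stays at zero for negative inputs and grows linearly afterwards, while logistic and tanh both flatten out for large positive or negative inputs.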

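    from sklearn.neural_network import MLPClassifier

    # MLPClassifier expects a 1-D target array
    target = outputData.values.ravel()

    for l in range(1, 11):
        ANNModel = MLPClassifier(
            activation='relu',      # activation function (the default)
            hidden_layer_sizes=l    # one hidden layer with l neurons
        )

        ANNModel.fit(inputData, target)

        # Accuracy measured on the training data itself
        score = ANNModel.score(inputData, target)
        print(str(l) + ", " + str(score))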
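The loop above scores each model on the same rows it was trained on, which tends to overstate accuracy. A common alternative is to hold out part of the data for scoring. The sketch below assumes scikit-learn is installed and uses synthetic data from `make_classification` as a stand-in for inputData and outputData; the split ratio, layer size, and iteration limit are illustrative choices, not values from the course.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for inputData / outputData (an assumption, not the course data).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Keep 25% of the rows aside; the model never sees them during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = MLPClassifier(hidden_layer_sizes=(5,), max_iter=1000, random_state=0)
model.fit(X_train, y_train)

train_score = model.score(X_train, y_train)  # accuracy on rows used for training
test_score = model.score(X_test, y_test)     # accuracy on held-out rows
print(train_score, test_score)
```

Comparing the two scores for each hidden-layer size gives a more honest basis for choosing among the ten models.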

Predicting on new data

    newData = read_csv(
        "C:\\Users\\Jw\\Desktop\\python_work\\Python数据挖掘实战课程课件\\4.4\\newData.csv",
        encoding='utf-8'
    )

    # Reuse the training categories so the dummy columns match the training data
    for column in dummyColumns:
        newData[column] = newData[column].astype(
            pandas.CategoricalDtype(categories=data[column].cat.categories)
        )

    newData = newData.dropna()

    newData['Education Level Map'] = newData['Education Level'].map(educationLevelDict)
    newData['PPV Freq Map'] = newData['PPV Freq'].map(freqMap)
    newData['Theater Freq Map'] = newData['Theater Freq'].map(freqMap)
    newData['TV Movie Freq Map'] = newData['TV Movie Freq'].map(freqMap)
    newData['Prerec Buying Freq Map'] = newData['Prerec Buying Freq'].map(freqMap)
    newData['Prerec Renting Freq Map'] = newData['Prerec Renting Freq'].map(freqMap)
    newData['Prerec Viewing Freq Map'] = newData['Prerec Viewing Freq'].map(freqMap)

    dummiesNewData = pandas.get_dummies(
        newData,
        columns=dummyColumns,
        prefix=dummyColumns,
        prefix_sep=" ",
        drop_first=True
    )

    inputNewData = dummiesNewData[dummiesSelect]

    # Predict on the new rows, using the same feature columns as in training
    ANNModel.predict(inputNewData)
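The category step matters because dummy columns are derived from the categories each column knows about. If the new data happens to lack a level that appeared in training, categories inferred from the new data alone would produce fewer columns than the model expects. The pure-Python sketch below illustrates the idea only; it is not how pandas implements get_dummies, the helper `one_hot` is hypothetical, and the 'Female' and 'Male' values are made up.

```python
def one_hot(values, categories, prefix, drop_first=True):
    """Encode values as 0/1 columns, one per category (minus the first if dropped)."""
    kept = categories[1:] if drop_first else categories
    return {
        f"{prefix} {cat}": [1 if v == cat else 0 for v in values]
        for cat in kept
    }

# Hypothetical values, for illustration only.
train_values = ["Female", "Male", "Male"]
new_values = ["Female", "Female"]  # the 'Male' level never occurs here

train_categories = sorted(set(train_values))

# Categories inferred from the new data alone: the 'Gender Male' column is lost.
inferred = one_hot(new_values, sorted(set(new_values)), "Gender")

# Categories reused from training: the column set matches what the model saw.
aligned = one_hot(new_values, train_categories, "Gender")
print(inferred)
print(aligned)
```

With aligned categories, every column named in dummiesSelect exists in the new data, even when a level is absent from it.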
