Python pandas: reading data, writing files, creating and writing a txt file with Python, selecting data with pandas


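Selecting data with pandas: iloc and loc do not work the same way. iloc selects by integer position, while loc selects by the row's index label. A loc slice includes its end label, whereas an iloc slice excludes its end position.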
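The difference is easiest to see when the index labels do not coincide with the row positions. The following is a minimal sketch on a hypothetical toy frame (the `df` name, the 1-based index, and the four rows are assumptions for illustration, not the author's data):

```python
import pandas as pd

# Hypothetical toy frame: labels run 1..4, so they differ from positions 0..3.
df = pd.DataFrame({"BP": [410, 447, 449, 452]}, index=[1, 2, 3, 4])

by_label = df.loc[1:3]        # labels 1, 2 and 3; the end label is included
by_position = df.iloc[1:3]    # positions 1 and 2; the end position is excluded
one_column = df.iloc[0:3, 0]  # first three rows of the first column (BP)

print(by_label)
print(by_position)
print(one_column.tolist())
```

The same slice `1:3` returns three rows with loc and two rows with iloc, which is why the two are not interchangeable once the index is anything other than the default 0-based range.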

>>> import pandas as pd
>>> import os
>>> os.chdir("D:\\")
>>> d = pd.read_csv("GWAS_water.qassoc", delimiter=r"\s+")
>>> d.loc[1:3]
   CHR SNP   BP  NMISS    BETA      SE       R2      T       P
1    1   .  447     44  0.1800  0.1783  0.02369  1.009  0.3185
2    1   .  449     44  0.2785  0.2473  0.02931  1.126  0.2665
3    1   .  452     44  0.1800  0.1783  0.02369  1.009  0.3185
>>> d.loc[0:3]
   CHR SNP   BP  NMISS    BETA      SE       R2      T       P
0    1   .  410     44  0.2157  0.1772  0.03406  1.217  0.2304
1    1   .  447     44  0.1800  0.1783  0.02369  1.009  0.3185
2    1   .  449     44  0.2785  0.2473  0.02931  1.126  0.2665
3    1   .  452     44  0.1800  0.1783  0.02369  1.009  0.3185
>>> d.iloc[0:3]
   CHR SNP   BP  NMISS    BETA      SE       R2      T       P
0    1   .  410     44  0.2157  0.1772  0.03406  1.217  0.2304
1    1   .  447     44  0.1800  0.1783  0.02369  1.009  0.3185
2    1   .  449     44  0.2785  0.2473  0.02931  1.126  0.2665
>>> d.iloc[1:3, 2]
1    447
2    449
Name: BP, dtype: int64
>>> d.iloc[0:3, 2]
0    410
1    447
2    449
Name: BP, dtype: int64
>>> d.head()
   CHR SNP   BP  NMISS    BETA      SE       R2       T       P
0    1   .  410     44  0.2157  0.1772  0.03406  1.2170  0.2304
1    1   .  447     44  0.1800  0.1783  0.02369  1.0090  0.3185
2    1   .  449     44  0.2785  0.2473  0.02931  1.1260  0.2665
3    1   .  452     44  0.1800  0.1783  0.02369  1.0090  0.3185
4    1   .  462     44  0.2548  0.2744  0.02012  0.9286  0.3584
>>> d.tail(3)
        CHR SNP        BP  NMISS    BETA      SE       R2       T      P
418704   12   .  19345588     44 -0.2207  0.2558  0.01743 -0.8631  0.393
418705   12   .  19345598     44 -0.2207  0.2558  0.01743 -0.8631  0.393
418706   12   .  19345611     44 -0.2207  0.2558  0.01743 -0.8631  0.393
>>> d.describe()
                 CHR            BP     NMISS          BETA            SE
count  418707.000000  4.187070e+05  418707.0  4.186820e+05  418682.00000
mean        5.805738  1.442822e+07      44.0 -4.271777e-03       0.21433
std         3.392930  8.933882e+06       0.0  2.330019e-01       0.05190
min         1.000000  4.100000e+02      44.0 -1.610000e+00       0.10130
25%         3.000000  7.345860e+06      44.0 -1.638000e-01       0.17320
50%         5.000000  1.371612e+07      44.0 -1.826000e-16       0.20670
75%         9.000000  2.051322e+07      44.0  1.391000e-01       0.25010
max        12.000000  4.238896e+07      44.0  1.467000e+00       0.67580

                  R2             T             P
count  418682.000000  4.186820e+05  4.186820e+05
mean        0.026268 -1.910774e-02  4.772397e-01
std         0.035903  1.095115e+00  2.944290e-01
min         0.000000 -5.582000e+00  2.034000e-08
25%         0.002969 -7.955000e-01  2.179000e-01
50%         0.012930 -8.468000e-16  4.624000e-01
75%         0.035910  6.712000e-01  7.254000e-01
max         0.531200  6.898000e+00  1.000000e+00
>>> d.sort_values(by="P").iloc[0:15]
        CHR SNP        BP  NMISS    BETA      SE      R2      T             P
42870     1   .  32316680     44  1.1870  0.1721  0.5312  6.898  2.034000e-08
29301     1   .  22184568     44  1.1870  0.1721  0.5312  6.898  2.034000e-08
29302     1   .  22184590     44  1.1870  0.1721  0.5312  6.898  2.034000e-08
29306     1   .  22184654     44  1.1870  0.1721  0.5312  6.898  2.034000e-08
29305     1   .  22184628     44  1.1870  0.1721  0.5312  6.898  2.034000e-08
29304     1   .  22184624     44  1.1870  0.1721  0.5312  6.898  2.034000e-08
112212    3   .  14365699     44  1.4670  0.2255  0.5018  6.504  7.490000e-08
29254     1   .  22167448     44  1.0780  0.1723  0.4822  6.254  1.713000e-07
69291     2   .   9480651     44  1.1140  0.1829  0.4690  6.091  2.939000e-07
29299     1   .  22180991     44  0.8527  0.1458  0.4488  5.848  6.574000e-07
101391    3   .   6959715     44  0.6782  0.1166  0.4462  5.817  7.285000e-07
29333     1   .  22198267     44  0.9252  0.1616  0.4383  5.724  9.888000e-07
195513    5   .  20178388     44  1.0350  0.1817  0.4359  5.697  1.082000e-06
29295     1   .  22180901     44  0.7469  0.1320  0.4324  5.657  1.236000e-06
29300     1   .  22181119     44  0.7469  0.1320  0.4324  5.657  1.236000e-06
>>> sort_D = d.sort_values(by="P").iloc[0:5]
>>> m_D = d.dropna()  # remove rows containing NA
>>> sort_C = d.sort_values(["P", "CHR", "BP"])
>>> sort_C.to_csv(file_name, sep='\t', encoding='utf-8')  # file_name: path of the output file
>>> d.sort_values(by="C", ascending=True)  # "C" stands for a column name; it must exist in d, e.g. "CHR"
>>> sort_D.to_csv("result.txt", sep=" ")
>>> sort_D.to_csv("result_no_index.txt", sep=" ", index=False)
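The read, clean, sort and write steps from the session can be tried without the original GWAS file. The sketch below is only an illustration under stated assumptions: the four rows in `raw` are made-up sample data in the same whitespace-delimited layout, the NA row is invented to give `dropna()` something to remove, and the output goes to a temporary directory rather than `D:\`.

```python
import io
import os
import tempfile

import pandas as pd

# Illustrative rows only (assumption): same column layout as the .qassoc file.
raw = """CHR SNP BP NMISS BETA SE R2 T P
1 . 410 44 0.2157 0.1772 0.03406 1.217 0.2304
1 . 447 44 0.1800 0.1783 0.02369 1.009 0.3185
1 . 449 44 NA NA NA NA NA
1 . 32316680 44 1.1870 0.1721 0.5312 6.898 2.034e-08
"""

# A raw string keeps the regex \s+ from being treated as a string escape.
d = pd.read_csv(io.StringIO(raw), sep=r"\s+")

clean = d.dropna()                         # drop rows with missing values
top = clean.sort_values(by="P").iloc[0:2]  # the two smallest p-values

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "result_no_index.txt")
    # Tab-separated output; index=False leaves out the row-index column.
    top.to_csv(out_path, sep="\t", index=False)
    with open(out_path) as handle:
        written = handle.read()

print(written)
```

Writing with `index=False` is usually what you want when the index is just the default row number, since otherwise the file gains an unnamed first column.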


References:

for m, i in enumerate(list(range(1, 10))):
    for n, j in enumerate(list(range(m + 1, 10))):
        print(i * j)


http://stackoverflow.com/questions/25943208/using-pandas-read-csv-on-an-open-file-twice


https://github.com/lijin-THU/notes-python



This article comes from the "R和Python应用" blog; please keep this attribution: http://matrix6ro.blog.51cto.com/1746429/1891793
