1. apply()
pandas.DataFrame.apply
apply() is available on both Series and DataFrame objects.
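DataFrame.apply(self, func, axis=0, raw=False, result_type=None, args=(), **kwds)
- func: the function to apply to each column or row
- axis: 0 (the default) applies func to each column; 1 applies func to each row
- raw: False (the default) passes each row or column to func as a Series; True passes it as an ndarray
- result_type: how list-like results are returned; it only takes effect when axis=1, and 'expand' spreads each result across columns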
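To make the axis and raw parameters concrete, here is a minimal sketch using the same small frame as the examples in this section. The variable names (col_sums, got_series, and so on) are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])

# axis=0 (default): func is called once per column,
# so the result is indexed by column name.
col_sums = df.apply(lambda col: col.sum(), axis=0)

# axis=1: func is called once per row,
# so the result is indexed by row label.
row_sums = df.apply(lambda row: row.sum(), axis=1)

# raw=False (default) hands func a Series; raw=True hands it an ndarray.
got_series = df.apply(lambda x: isinstance(x, pd.Series), axis=0)
got_ndarray = df.apply(lambda x: isinstance(x, np.ndarray), axis=0, raw=True)

print(col_sums)
print(row_sums)
```

Passing raw=True can be worthwhile when func is a NumPy reduction, since it skips building a Series for every row or column.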
>>> df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])
>>> df
   A  B
0  4  9
1  4  9
2  4  9
>>> df.apply(np.sum, axis=0)
A    12
B    27
dtype: int64
>>> df.apply(np.sum, axis=1)
0    13
1    13
2    13
dtype: int64
>>> df.apply(lambda x: [1, 2], axis=1)
0    [1, 2]
1    [1, 2]
2    [1, 2]
dtype: object
>>> df.apply(lambda x: [1, 2], axis=1, result_type='expand')
   0  1
0  1  2
1  1  2
2  1  2
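Besides 'expand', there are two other ways to shape a row-wise result. The sketch below shows them on the same frame: returning a Series from func, and result_type='broadcast'. The column labels 'foo' and 'bar' are placeholders chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])

# Returning a Series from func makes its index the new column names.
named = df.apply(lambda row: pd.Series([1, 2], index=['foo', 'bar']), axis=1)

# result_type='broadcast' keeps the original shape, index and column names.
broadcast = df.apply(lambda row: [1, 2], axis=1, result_type='broadcast')

print(named)
print(broadcast)
```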
# Advanced usage
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randn(6, 4), index=list('abcdef'), columns=list('ABCD'))
>>> df
          A         B         C         D
a  1.728403  0.837979  0.802319  0.290152
b -0.975462  1.083493 -0.233180 -0.431483
c -0.142911 -0.277469 -0.249412  0.152072
d -0.398880 -0.441956  0.374148  0.984257
e -0.042556  0.156787 -0.689351  1.551284
f  0.741325  0.557786 -2.353452  0.042494
>>> df['E'] = df['D'].apply(lambda x: x * 2)
>>> df
          A         B         C         D         E
a  1.728403  0.837979  0.802319  0.290152  0.580305
b -0.975462  1.083493 -0.233180 -0.431483 -0.862965
c -0.142911 -0.277469 -0.249412  0.152072  0.304143
d -0.398880 -0.441956  0.374148  0.984257  1.968514
e -0.042556  0.156787 -0.689351  1.551284  3.102569
f  0.741325  0.557786 -2.353452  0.042494  0.084989
>>> df['F'] = df.apply(lambda row: max(row['B'], row['C']), axis=1)
>>> df
          A         B         C         D         E         F
a  1.728403  0.837979  0.802319  0.290152  0.580305  0.837979
b -0.975462  1.083493 -0.233180 -0.431483 -0.862965  1.083493
c -0.142911 -0.277469 -0.249412  0.152072  0.304143 -0.249412
d -0.398880 -0.441956  0.374148  0.984257  1.968514  0.374148
e -0.042556  0.156787 -0.689351  1.551284  3.102569  0.156787
f  0.741325  0.557786 -2.353452  0.042494  0.084989  0.557786
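>>> def get_res(a, b):
...     print([a * 3, b - 2])  # debug print; its output is omitted below
...     return a * 3, b - 2
...
>>> new_cols = df.apply(lambda row: pd.Series(get_res(row['A'], row['B']), index=['H', 'I']), axis=1, result_type='expand')
>>> df = pd.concat([df, new_cols], axis=1)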
>>> df
          A         B         C  ...         F         H         I
a  1.728403  0.837979  0.802319  ...  0.837979  5.185209 -1.162021
b -0.975462  1.083493 -0.233180  ...  1.083493 -2.926385 -0.916507
c -0.142911 -0.277469 -0.249412  ... -0.249412 -0.428732 -2.277469
d -0.398880 -0.441956  0.374148  ...  0.374148 -1.196640 -2.441956
e -0.042556  0.156787 -0.689351  ...  0.156787 -0.127668 -1.843213
f  0.741325  0.557786 -2.353452  ...  0.557786  2.223976 -1.442214
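The two-column example above can probably be written more compactly. The sketch below is one possible variant, not the only way: it assigns the expanded result directly to two new columns, and then recomputes the same values column-wise for comparison. It uses a seeded generator so it is reproducible, which means the numbers differ from the table above. The names rng, h_vec and i_vec are illustrative.

```python
import numpy as np
import pandas as pd

# Fixed seed so the sketch is reproducible; values differ from the table above.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((6, 4)),
                  index=list('abcdef'), columns=list('ABCD'))

def get_res(a, b):
    return a * 3, b - 2

# result_type='expand' spreads each returned tuple over two columns
# (labelled 0 and 1), which are assigned to the new columns H and I.
df[['H', 'I']] = df.apply(lambda row: get_res(row['A'], row['B']),
                          axis=1, result_type='expand')

# The same values computed column-wise, without a Python call per row.
h_vec = df['A'] * 3
i_vec = df['B'] - 2

print(df)
```

When the per-row logic is plain arithmetic like this, the column-wise form is usually the faster choice. Row-wise apply is mainly useful when the function cannot be expressed with column operations.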
2. map()
pandas.Series.map
Series.map() is a Series method: it maps each value of the Series through a dict, a Series, or a function.
Series.map(self, arg, na_action=None)
>>> s = pd.Series(['cat', 'dog', np.nan, 'rabbit'])
>>> s
0       cat
1       dog
2       NaN
3    rabbit
dtype: object
>>> s.map({'cat': 'kitten', 'dog': 'puppy'})
0    kitten
1     puppy
2       NaN
3       NaN
dtype: object
>>> s.map('I am a {}'.format, na_action='ignore')
0       I am a cat
1       I am a dog
2              NaN
3    I am a rabbit
dtype: object
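The examples above raise two practical questions: what na_action='ignore' actually changes, and how to keep values that have no entry in the dict. The following is a minimal sketch of both. The fillna step is one common workaround, offered here as a suggestion rather than as part of map() itself.

```python
import numpy as np
import pandas as pd

s = pd.Series(['cat', 'dog', np.nan, 'rabbit'])

# Without na_action='ignore' the function also receives NaN
# and formats it as text.
formatted_all = s.map('I am a {}'.format)

# With na_action='ignore' NaN is passed through untouched.
formatted_skip = s.map('I am a {}'.format, na_action='ignore')

# Keys missing from the dict map to NaN; fillna(s) puts the
# original value back for those rows.
mapping = {'cat': 'kitten', 'dog': 'puppy'}
kept = s.map(mapping).fillna(s)

print(formatted_all)
print(formatted_skip)
print(kept)
```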
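3. applymap()
pandas.DataFrame.applymap
DataFrame.applymap(self, func)
applymap() is a DataFrame method that applies func to every element. Since pandas 2.1 it is deprecated in favor of DataFrame.map.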
>>> df = pd.DataFrame([[1, 2.12], [3.356, 4.567]])
>>> df
       0      1
0  1.000  2.120
1  3.356  4.567
>>> df.applymap(lambda x: len(str(x)))
   0  1
0  3  4
1  5  5
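Because newer pandas releases (2.1 and later) provide DataFrame.map as the replacement for applymap, code that has to run on both old and new versions may need a small fallback. The sketch below is one way to do that, assuming a simple attribute check is acceptable. The name elementwise is illustrative. It also shows that the same result can be built from apply() and Series.map().

```python
import pandas as pd

df = pd.DataFrame([[1, 2.12], [3.356, 4.567]])

# DataFrame.map exists from pandas 2.1 on; older versions only have applymap.
elementwise = df.map if hasattr(pd.DataFrame, 'map') else df.applymap
lengths = elementwise(lambda x: len(str(x)))

# The same result, built column by column with Series.map.
lengths_by_column = df.apply(lambda col: col.map(lambda x: len(str(x))))

print(lengths)
```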