Suppose I have a pandas DataFrame with a MultiIndex, like the one below, taken from the
documentation.
- import numpy as np
- import pandas as pd
- arrays = [np.array(['bar','bar','baz','baz','foo','foo','qux','qux']),np.array(['one','two','one','two','one','two','one','two'])]
- df = pd.DataFrame(np.random.randn(8,4),index=arrays)
It looks like this:
- 0 1 2 3
- bar one -0.096648 -0.080298 0.859359 -0.030288
- two 0.043107 -0.431791 1.923893 -1.544845
- baz one 0.639951 -0.008833 -0.227000 0.042315
- two 0.705281 0.446257 -1.108522 0.471676
- foo one -0.579483 -2.261138 -0.826789 1.543524
- two -0.358526 1.416211 1.589617 0.284130
- qux one 0.498149 -0.296404 0.127512 -0.224526
- two -0.286687 -0.040473 1.443701 1.025008
Now I only want the rows whose second MultiIndex level contains "ne".
Is there a way to slice a MultiIndex by a (partially) contained string?
Solution
You can apply a boolean mask like this:
- df = df.iloc[df.index.get_level_values(1).str.contains('ne')]
This returns:
- bar one -0.143200 0.523617 0.376458 -2.091154
- baz one -0.198220 1.234587 -0.232862 -0.510039
- foo one -0.426127 0.594426 0.457331 -0.459682
- qux one -0.875160 -0.157073 -0.540459 -1.792235
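Because the frame above is built from random numbers, the values differ on every run. The following is a minimal sketch of the same mask on fixed data, so the selection is easy to check. The names `demo`, `mask`, and `selected` are illustrative and not part of the original answer. Plain boolean indexing (`demo[mask]`) is used instead of `iloc`, and `regex=False` is an assumption that the substring should be matched literally.

```python
import numpy as np
import pandas as pd

# Same index layout as in the question; values are fixed so the result is reproducible.
arrays = [
    np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
    np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']),
]
demo = pd.DataFrame(np.arange(32).reshape(8, 4), index=arrays)

# Boolean mask: True where the second-level label contains 'ne'.
# regex=False treats 'ne' as a literal substring rather than a pattern.
mask = demo.index.get_level_values(1).str.contains('ne', regex=False)

# Boolean indexing keeps only the rows where the mask is True.
selected = demo[mask]
print(selected)
```

Only the four rows labelled `one` remain, one for each first-level key.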
Edit:
You can also combine logical masks across multiple levels, for example:
- df = df.iloc[(df.index.get_level_values(0).str.contains('ba')) | (df.index.get_level_values(1).str.contains('ne'))]
This yields:
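As a reproducible illustration of the combined mask, here is a small sketch on fixed data rather than the random frame from the question. The names `demo`, `level0`, `level1`, and `combined` are hypothetical, and `regex=False` is again an assumption that literal substring matching is wanted.

```python
import numpy as np
import pandas as pd

# Same index layout as in the question, with deterministic values.
arrays = [
    np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
    np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']),
]
demo = pd.DataFrame(np.arange(32).reshape(8, 4), index=arrays)

# One boolean mask per index level.
level0 = demo.index.get_level_values(0).str.contains('ba', regex=False)
level1 = demo.index.get_level_values(1).str.contains('ne', regex=False)

# Element-wise OR: keep a row if either level matches.
combined = demo[level0 | level1]
print(combined.index.tolist())
```

With this data the result keeps all `bar` and `baz` rows plus `foo one` and `qux one`, six rows in total. Replacing `|` with `&` would keep only the rows where both conditions hold.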