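I'm trying to count how often certain patterns occur in a DataFrame, based on its index. Any help would be much appreciated.
I manually added a column as the index, and I need the number of occurrences of each pattern based on the index and the column d.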
Sample dataset:
a (index)   d
pattern 1   test
pattern 1   test
pattern 1   test2
pattern 2   test3
pattern 2   test
pattern 2   test
Expected output:
I'm looking to build a DataFrame like the one below from the sample data above:
pattern     test  test2  test3
----------------------------------
pattern 1   2     1      0
pattern 2   2     0      1
If you reset the index, this is a very simple groupby:
In [18]: df
Out[18]:
               d
a
pattern 1   test
pattern 1   test
pattern 1  test2
pattern 2  test3
pattern 2   test
pattern 2   test
In [19]: df.reset_index().groupby(['a', 'd']).apply(len).reset_index()
Out[19]:
           a      d  0
0  pattern 1   test  2
1  pattern 1  test2  1
2  pattern 2   test  2
3  pattern 2  test3  1
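
The groupby above returns the counts in long form, one row per (a, d) pair, while the question asks for one column per value of d. One way to get that layout is to pivot the d level into columns. The following is only a sketch, not part of the original answer: it rebuilds the sample data by hand (an assumption about how df was created), uses size() in place of apply(len), and then calls unstack. pd.crosstab is shown as an alternative that should give the same table.

```python
import pandas as pd

# Rebuild the sample data from the question, with 'a' as the index.
# (Assumption: the real df has this shape; adjust names as needed.)
df = pd.DataFrame(
    {
        "a": ["pattern 1"] * 3 + ["pattern 2"] * 3,
        "d": ["test", "test", "test2", "test3", "test", "test"],
    }
).set_index("a")

# Move the index back into a regular column so it can be grouped on.
flat = df.reset_index()

# Count the rows for each (a, d) pair, then pivot 'd' into columns.
# Pairs that never occur are filled with 0.
wide = flat.groupby(["a", "d"]).size().unstack(fill_value=0)

# Alternative: crosstab builds the same frequency table in one call.
wide_ct = pd.crosstab(flat["a"], flat["d"])

print(wide)
```

Here wide has one row per value of a and one column per value of d, which matches the layout in the question. If a plain column is preferred over an index, wide.reset_index() can be applied afterwards.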