Python - Grouping dataframes in pandas efficiently?
I have the following dataframe in pandas, where there's a unique index (employee) for each row and a group label, type:
    import pandas

    df = pandas.DataFrame({"employee": ["a", "b", "c", "d"],
                           "type": ["x", "y", "y", "y"],
                           "value": [10, 20, 30, 40]})
    df = df.set_index("employee")

I want to group the employees by type and calculate a statistic for each type. How can I do this and get a final dataframe that is type x statistic, for example type x (mean of types)? I tried using groupby:
    g = df.groupby(lambda x: df.ix[x]["type"])
    result = g.mean()

This is inefficient since it references the index ix of df for each row. Is there a better way?
Like @sza says, you can use:
    In [11]: g = df.groupby("type")

    In [12]: g.mean()
    Out[12]:
          value
    type
    x        10
    y        30

See the groupby docs for more...
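For anyone who wants to try this end to end, here is a minimal self-contained sketch that uses the dataframe from the question. The names `means` and `stats` are just illustrative, and the `agg` call with `min` and `max` is an assumption about what "a statistic for each type" might look like beyond the mean; it is not something the question asked for specifically.

```python
import pandas as pd

# Same data as in the question, indexed by employee.
df = pd.DataFrame(
    {
        "employee": ["a", "b", "c", "d"],
        "type": ["x", "y", "y", "y"],
        "value": [10, 20, 30, 40],
    }
).set_index("employee")

# Group by the "type" column directly; no per-row index lookup is needed.
means = df.groupby("type")["value"].mean()

# Several statistics at once: one row per type, one column per statistic.
# (The choice of statistics here is illustrative.)
stats = df.groupby("type")["value"].agg(["mean", "min", "max"])

print(means)
print(stats)
```

Passing the column name lets pandas do the grouping on the whole column at once, instead of calling a Python function for every index label as the lambda version does.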