Pandas Aggregate Function

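Aggregate functions in Pandas compute summary statistics over data. They are most often applied to grouped data, but they also work directly on Series and DataFrame objects.

This is useful for tasks such as calculating the mean, sum, count, and other statistics for different groups in a dataset.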

Syntax

The basic syntax of the aggregate function is:

df.aggregate(func, axis=0, *args, **kwargs)

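Here,

func - the aggregation to apply: a function, a function name as a string such as 'sum' or 'mean', a list of these, or a dictionary mapping column names to them.
axis - the axis along which to aggregate: 0 (the default) applies the function to each column, 1 applies it to each row.
*args and **kwargs - additional arguments that are passed on to the aggregation function.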
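
To make the effect of the axis argument concrete, here is a minimal sketch. The DataFrame and its column names x and y are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical numeric-only DataFrame, used purely for illustration
df = pd.DataFrame({'x': [1, 2, 3], 'y': [10, 20, 30]})

# axis=0 (default): aggregate down each column, one result per column
col_totals = df.aggregate('sum', axis=0)
print(col_totals)

# axis=1: aggregate across each row, one result per row
row_totals = df.aggregate('sum', axis=1)
print(row_totals)
```

With axis=0 the result is indexed by column name; with axis=1 it is indexed by the original row labels.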

Apply a Single Aggregate Function

Here is how we can apply a single aggregate function in Pandas.

import pandas as pd

data = {
    'Category': ['A', 'A', 'B', 'B', 'A', 'B'],
    'Value': [10, 15, 20, 25, 30, 35]
}

df = pd.DataFrame(data)

# calculate total sum of the Value column
total_sum = df['Value'].aggregate('sum')
print("Total Sum:", total_sum)

# calculate the mean of the Value column
average_value = df['Value'].aggregate('mean')
print("Average Value:", average_value)

# calculate the maximum value in the Value column
max_value = df['Value'].aggregate('max')
print("Maximum Value:", max_value)

Output

Total Sum: 135
Average Value: 22.5
Maximum Value: 35

Here,

df['Value'].aggregate('sum') - calculates the total sum of the Value column in the df DataFrame.
df['Value'].aggregate('mean') - calculates the mean of the Value column in the df DataFrame.
df['Value'].aggregate('max') - calculates the maximum value in the Value column.
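
As a small sketch of how these calls relate to the rest of the Pandas API: agg() is an alias for aggregate(), and for a single function name the result matches the dedicated Series method. The variable names below are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'Value': [10, 15, 20, 25, 30, 35]})

# agg() is an alias for aggregate()
total_via_agg = df['Value'].agg('sum')

# the dedicated Series method computes the same total
total_via_method = df['Value'].sum()

print(total_via_agg == total_via_method)
```

For a single statistic the direct method is usually shorter; aggregate() becomes more useful when several functions are applied at once.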

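Apply Multiple Aggregate Functions in Pandas

We can also apply multiple aggregation functions to one or more columns using the aggregate() function (or its alias agg()) in Pandas. For example,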

import pandas as pd

data = {
    'Category': ['A', 'A', 'B', 'B', 'A', 'B'],
    'Value': [10, 15, 20, 25, 30, 35]
}

df = pd.DataFrame(data)

# applying multiple aggregation functions to a single column
result = df.groupby('Category')['Value'].agg(['sum', 'mean', 'max', 'min'])
print(result)

Output

          sum       mean  max  min
Category                           
A          55  18.333333   30   10
B          80  26.666667   35   20

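In the above example, we grouped the data by the Category column and then used agg() to apply several aggregation functions (sum, mean, max, and min) to the Value column.

The resulting DataFrame has one row per category and one column per aggregation function.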
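
A list of functions can also be passed without grouping first. The following sketch applies the same four functions directly to the Value column; the variable name summary is illustrative.

```python
import pandas as pd

data = {
    'Category': ['A', 'A', 'B', 'B', 'A', 'B'],
    'Value': [10, 15, 20, 25, 30, 35]
}

df = pd.DataFrame(data)

# applying several functions to an ungrouped Series
# returns a Series indexed by the function names
summary = df['Value'].agg(['sum', 'mean', 'max', 'min'])
print(summary)
```

Here the statistics describe the whole column rather than each category.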

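Apply Different Aggregation Functions

In Pandas, we can apply different aggregation functions to different columns by passing a dictionary to the aggregate() function. For example,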

import pandas as pd

data = {
    'Category': ['A', 'A', 'B', 'B', 'A', 'B'],
    'Value1': [10, 15, 20, 25, 30, 35],
    'Value2': [5, 8, 12, 15, 18, 21]
}

df = pd.DataFrame(data)

agg_funcs = {
    # apply 'sum' to the Value1 column
    'Value1': 'sum',

    # apply 'mean' and 'max' to the Value2 column
    'Value2': ['mean', 'max']
}

result = df.groupby('Category').aggregate(agg_funcs)
print(result)

Output

         Value1     Value2
            sum       mean max
Category
A            55  10.333333  18
B            80  16.000000  21

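Here, we grouped the data by the Category column and then used aggregate() with a dictionary to apply different aggregation functions to different columns.

The resulting DataFrame has one row per category and a two-level column header: the source column on top and the aggregation function beneath it.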
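
If single-level column names are preferred, one option is named aggregation, sketched below under the assumption of pandas 0.25 or later. The output names value1_total, value2_mean, and value2_max are made up for this example.

```python
import pandas as pd

data = {
    'Category': ['A', 'A', 'B', 'B', 'A', 'B'],
    'Value1': [10, 15, 20, 25, 30, 35],
    'Value2': [5, 8, 12, 15, 18, 21]
}

df = pd.DataFrame(data)

# named aggregation: output_name=(source_column, function)
# the output names here are hypothetical, chosen for illustration
result = df.groupby('Category').agg(
    value1_total=('Value1', 'sum'),
    value2_mean=('Value2', 'mean'),
    value2_max=('Value2', 'max')
)
print(result)
```

Each keyword becomes a column name in the result, so no multi-level header is produced.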
