The std() method computes the standard deviation of a given set of numbers along the specified axis.
Example
import numpy as np
# create an array
array1 = np.array([0, 1, 2, 3, 4, 5, 6, 7])
# calculate the standard deviation of the array
deviation = np.std(array1)
print(deviation)
# Output: 2.29128784747792
std() Syntax
The syntax of std() is:
numpy.std(array, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>, where=<no value>)
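std() Arguments
The std() method takes the following arguments:
array - the array containing the numbers whose standard deviation is required (can be array_like)
axis (optional) - the axis or axes along which the standard deviation is computed (int or tuple of int)
dtype (optional) - the data type to use when computing the standard deviation (datatype)
out (optional) - the output array in which to store the result (ndarray)
ddof (optional) - the delta degrees of freedom (int)
keepdims (optional) - specifies whether the reduced axes are kept in the result as dimensions of size one (bool)
where (optional) - the elements to include in the standard deviation (array of bool)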
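Notes
The default values are:
axis = None, i.e. the array is flattened and the standard deviation of the entire array is computed.
dtype = None, i.e. float64 is used for integer input; for floating-point input, the result has the same data type as the input.
By default, keepdims and where are not passed.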
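To make the default dtype behaviour concrete, the following sketch checks the data type of the result for an integer array and for a float32 array. The variable names are illustrative and do not come from the original examples.

```python
import numpy as np

int_array = np.array([1, 2, 3, 4])
float32_array = np.array([1, 2, 3, 4], dtype=np.float32)

# integer input is computed in float64 by default
int_result = np.std(int_array)
# floating-point input keeps its own data type
float32_result = np.std(float32_array)

print(int_result.dtype)
print(float32_result.dtype)
```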
std() Return Value
The std() method returns the standard deviation of the array.
Example 1: Find the Standard Deviation of an ndArray
import numpy as np
# create an array
array1 = np.array([[[0, 1], [2, 3]],
                   [[4, 5], [6, 7]]])
# find the standard deviation of entire array
deviation1 = np.std(array1)
# find the standard deviation across axis 0 (slice-wise standard deviation)
deviation2 = np.std(array1, 0)
# find the standard deviation across axis 0 and 1
deviation3 = np.std(array1, (0, 1))
print('\nStandard Deviation of the entire array:', deviation1)
print('\nStandard Deviation across axis 0:\n', deviation2)
print('\nStandard Deviation across axis 0 and 1', deviation3)
Output
Standard Deviation of the entire array: 2.29128784747792
Standard Deviation across axis 0:
 [[2. 2.]
 [2. 2.]]
Standard Deviation across axis 0 and 1 [2.23606798 2.23606798]
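When the axis argument is not specified, np.std(array1) computes the standard deviation of the entire array.
With axis=0, the standard deviation is computed across the two 2D slices of the 3D array, so each element of the 2x2 result is the standard deviation of the two values at that position.
With axis=(0, 1), the standard deviation is computed over the first two axes together. The result is a 1D array with one value for each index along the last axis, each computed from the four elements that share that index.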
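As a rough check of how each axis setting reduces the 3D array, the sketch below prints the shape of each result and compares axis=(0, 1) with an equivalent reshape. The names merged and same are illustrative only.

```python
import numpy as np

array1 = np.array([[[0, 1], [2, 3]],
                   [[4, 5], [6, 7]]])

# each call reduces the listed axes and keeps the rest
print(np.std(array1).shape)               # no axes left
print(np.std(array1, axis=0).shape)       # axes 1 and 2 remain
print(np.std(array1, axis=(0, 1)).shape)  # only the last axis remains

# axis=(0, 1) matches merging the first two axes, then reducing the merged axis
merged = array1.reshape(-1, array1.shape[-1])
same = np.allclose(np.std(array1, axis=(0, 1)), np.std(merged, axis=0))
print(same)
```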
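Example 2: Specify the Data Type of the Standard Deviation of an ndArray
We can use the dtype argument to specify the data type used for the computation and for the result.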
import numpy as np
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])
# by default int is converted to float
result1 = np.std(array1)
# pass dtype to specify integer output
result2 = np.std(array1, dtype = int)
print('Float deviation:', result1)
print('Integer deviation:', result2)
Output
Float deviation: 1.707825127659933
Integer deviation: 1
Note: Using a lower-precision dtype, such as int, can lead to a loss of accuracy.
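The same concern applies in the other direction: for large single-precision arrays, passing a higher-precision dtype can reduce accumulated rounding error. The sketch below uses an illustrative array (not part of the original examples) whose values are chosen so that the exact standard deviation is very close to 0.45. The printed float32 value may vary slightly between NumPy versions and platforms.

```python
import numpy as np

# 2 x 262144 single-precision array: first row all 1.0, second row all 0.1
values = np.zeros((2, 512 * 512), dtype=np.float32)
values[0, :] = 1.0
values[1, :] = 0.1

# computed in float32 by default, since the input is float32
std_single = np.std(values)
# computed in float64 by passing a higher-precision dtype
std_double = np.std(values, dtype=np.float64)

print('float32 computation:', std_single)
print('float64 computation:', std_double)
```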
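Example 3: Use the Optional keepdims Argument
If keepdims is set to True, the reduced axes are kept in the result as dimensions of size one, so the result has the same number of dimensions as the original array.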
import numpy as np
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])
# keepdims defaults to False
result1 = np.std(array1, axis = 0)
# pass keepdims as True
result2 = np.std(array1, axis = 0, keepdims = True)
print('Original Array Dimension:', array1.ndim)
print('Standard Deviation without keepdims:', result1, 'Dimensions', result1.ndim)
print('Standard Deviation with keepdims:', result2, 'Dimensions', result2.ndim)
Output
Original Array Dimension: 2
Standard Deviation without keepdims: [1.5 1.5 1.5] Dimensions 1
Standard Deviation with keepdims: [[1.5 1.5 1.5]] Dimensions 2
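One situation where keepdims=True helps is broadcasting the result back against the original array. The following is a minimal sketch that standardizes each row; the names row_mean, row_std and standardized are illustrative, and it assumes that no row has zero standard deviation.

```python
import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# per-row statistics, kept as shape (2, 1) so they broadcast against (2, 3)
row_mean = np.mean(array1, axis=1, keepdims=True)
row_std = np.std(array1, axis=1, keepdims=True)

# subtract each row's mean and divide by each row's standard deviation
standardized = (array1 - row_mean) / row_std

print(row_std.shape)
print(standardized)
```

Without keepdims=True, the statistics would have shape (2,), which does not broadcast against the (2, 3) array along the rows.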
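Example 4: Use the where Argument to Find the Standard Deviation of a Filtered Array
We can use the where argument to filter the array and compute the standard deviation of only the selected elements.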
import numpy as np
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])
# standard deviation of entire array
result1 = np.std(array1)
# standard deviation of only even elements
result2 = np.std(array1, where = (array1%2==0))
# standard deviation of numbers greater than 3
result3 = np.std(array1, where = (array1 > 3))
print('Standard Deviation of entire array:', result1)
print('Standard Deviation of only even elements:', result2)
print('Standard Deviation of numbers greater than 3:', result3)
Output
Standard Deviation of entire array: 1.707825127659933
Standard Deviation of only even elements: 1.632993161855452
Standard Deviation of numbers greater than 3: 0.816496580927726
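For a reduction over the whole array, the where argument should give the same result as selecting the elements with a boolean mask first. The sketch below compares the two approaches; mask, with_where and with_indexing are illustrative names.

```python
import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

mask = array1 % 2 == 0

# let np.std() skip the elements where the mask is False
with_where = np.std(array1, where=mask)
# select the elements first, then compute the standard deviation
with_indexing = np.std(array1[mask])

print(np.isclose(with_where, with_indexing))
```

Boolean indexing flattens the selection into a 1D array, whereas where leaves the array shape intact, so it can also be combined with the axis argument.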
Example 5: Use out to Store the Result in a Desired Location
The out argument lets us specify an output array in which the result is stored.
import numpy as np
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])
# create an output array
output = np.zeros(3)
# compute standard deviation and store the result in the output array
np.std(array1, out = output, axis = 0)
print('Standard Deviation:', output)
Output
Standard Deviation: [1.5 1.5 1.5]
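The output array must have the shape of the expected result, so the buffer size depends on the axis being reduced. As a small sketch, the row-wise case needs a buffer of shape (2,) instead of (3,); row_output and result are illustrative names.

```python
import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# reducing axis 1 of a (2, 3) array leaves 2 values, so the buffer has shape (2,)
row_output = np.zeros(2)
result = np.std(array1, axis=1, out=row_output)

print('Row-wise standard deviation:', row_output)
# the returned array shares the buffer's memory
print(np.shares_memory(result, row_output))
```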
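Frequently Asked Questions
What is standard deviation?
Standard deviation is a measure of how spread out the data is relative to the mean. In our case, it measures how far the values of the given array are spread around the array's mean.
Mathematically, where N is the number of elements,
std = sqrt(sum((arr - arr.mean())**2) / N)
In NumPy,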
import numpy as np
array1 = np.array([2, 4, 6, 8, 10])
# calculate standard deviation using np.std()
deviation1 = np.std(array1)
# calculate standard deviation without using np.std()
mean = np.mean(array1)
diff_squared = (array1 - mean) ** 2
variance = np.mean(diff_squared)
deviation2 = np.sqrt(variance)
print('Standard Deviation with np.std():', deviation1)
print('Standard Deviation without np.std():', deviation2)
Output
Standard Deviation with np.std(): 2.8284271247461903
Standard Deviation without np.std(): 2.8284271247461903
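What is the ddof argument in np.std() for?
The ddof (Delta Degrees of Freedom) argument of np.std() adjusts the divisor used to compute the standard deviation. Its default value is 0, which corresponds to dividing by N, the number of elements.
In terms of the formula for std above,
std = sqrt(sum((arr - arr.mean())**2) / (N - ddof))
Let's look at an example.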
import numpy as np
array1 = np.array([1, 2, 3, 4, 5])
# calculate standard deviation with the default ddof=0
deviation1 = np.std(array1)
# calculate standard deviation with ddof=1
deviation2 = np.std(array1, ddof=1)
print('Standard Deviation (default ddof=0):', deviation1)
print('Standard Deviation (ddof=1):', deviation2)
Output
Standard Deviation (default ddof=0): 1.4142135623730951
Standard Deviation (ddof=1): 1.5811388300841898
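The two settings correspond to what are usually called the population standard deviation (ddof=0) and the sample standard deviation (ddof=1). As a cross-check that is not part of the original examples, the sketch below compares both against Python's standard-library statistics module, where pstdev() divides by N and stdev() divides by N - 1.

```python
import statistics
import numpy as np

data = [1, 2, 3, 4, 5]

# population standard deviation: divisor N
population_np = np.std(data)
population_py = statistics.pstdev(data)

# sample standard deviation: divisor N - 1
sample_np = np.std(data, ddof=1)
sample_py = statistics.stdev(data)

print(np.isclose(population_np, population_py))
print(np.isclose(sample_np, sample_py))
```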