In NumPy, the loadtxt() method loads data from a text file.
Example
import numpy as np
# load text from a file
array1 = np.loadtxt('file.txt')
print(array1)
'''
Output
[[0. 1.]
[2. 3.]]
'''
Note: We assume there is a text file named file.txt that contains the numbers 0 through 3.
loadtxt() Syntax
The syntax of loadtxt() is:
numpy.loadtxt(fname, dtype=<class 'float'>, comments='#', delimiter=None, converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0, encoding='bytes', max_rows=None, *, quotechar=None, like=None)
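loadtxt() Arguments
The loadtxt() method takes the following arguments:
fname - the file to read (file, str, path, generator, or list of str)
dtype (optional) - the data type of the output array
comments (optional) - the character(s) that mark the start of a comment (str, sequence of str, or None)
delimiter (optional) - the character that separates values (str)
converters (optional) - the function(s) used for custom parsing (dict or callable)
skiprows (optional) - the number of lines to skip at the start (int)
usecols (optional) - the columns to read (int or sequence)
unpack (optional) - if True, unpacks the columns into separate arrays
ndmin (optional) - the minimum number of dimensions of the array (int)
encoding (optional) - the encoding used to decode the input file (str)
max_rows (optional) - the number of rows to read (int)
quotechar (optional) - the character that marks the start and end of a quoted item
like (optional) - a reference object used to create arrays that are not NumPy arrays (array_like)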
Notes:
delimiter can only be a single character.
ndmin can only be 0, 1, or 2.
max_rows does not count comment lines and empty lines toward the limit.
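The notes on max_rows and ndmin can be checked with a short sketch. It assumes NumPy 1.23 or later, where comment lines and empty lines are not counted toward max_rows; the sample text and variable names are made up for illustration.

```python
from io import StringIO
import numpy as np

# three data rows, with a comment line and an empty line after the first one
text = '1 2\n# a comment line\n\n3 4\n5 6'

# only rows that contain data count toward max_rows (assumes NumPy 1.23+)
rows = np.loadtxt(StringIO(text), max_rows=2)
print(rows)

# ndmin outside 0, 1, 2 is rejected with a ValueError
try:
    np.loadtxt(StringIO(text), ndmin=3)
    ndmin_rejected = False
except ValueError:
    ndmin_rejected = True
print('ndmin = 3 rejected:', ndmin_rejected)
```

Note that skiprows behaves differently: it counts every line, including comments and empty lines.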
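loadtxt() Return Value
The loadtxt() method returns an array containing the data from the text file.
Example 1: Create an Array Using loadtxt()
The online compiler used for these examples does not support file operations, so the examples use the StringIO class instead. This class gets around that restriction by letting a string be treated as a file-like object.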
# StringIO behaves like a file object
from io import StringIO
file1 = StringIO('0 1 2\n3 4 5\n6 7 8')
# import numpy
import numpy as np
# load from file
array1 = np.loadtxt(file1)
print(array1)
Output
[[0. 1. 2.]
 [3. 4. 5.]
 [6. 7. 8.]]
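Example 2: Use the dtype Argument to Specify the Data Type
The dtype argument specifies the data type of the NumPy array that is created. By default, the data type is float, but it can be changed to any compatible data type.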
import numpy as np
# create two file objects using StringIO
from io import StringIO
file1 = StringIO(' 1 2\n3 4\n5 6')
file2 = StringIO(' 1 2\n3 4\n5 6')
# load from file
array1 = np.loadtxt(file1)
print("Default type:\n", array1)
# load from file to create an int array
array2 = np.loadtxt(file2, dtype = int)
print("\nInteger type:\n", array2)
Output
Default type:
 [[1. 2.]
 [3. 4.]
 [5. 6.]]

Integer type:
 [[1 2]
 [3 4]
 [5 6]]
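dtype can also describe columns of different types. The following is a minimal sketch using a structured data type; the field names id and score and the sample data are assumptions made for illustration.

```python
from io import StringIO
import numpy as np

# first column holds integers, second column holds floats
file1 = StringIO('1 2.5\n2 4.0\n3 5.5')

# hypothetical field names; each field gets its own type
record_type = [('id', int), ('score', float)]

records = np.loadtxt(file1, dtype=record_type)

# each field is accessed by name and keeps its own data type
print(records['id'])
print(records['score'])
```

With a structured data type, each row becomes one record, so the result here is a one-dimensional array of three records.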
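Example 3: Use the comments Argument to Ignore Lines in the File
The comments argument specifies the character that starts a comment, so that comments are ignored when the array is created.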
import numpy as np
# create three file objects using StringIO
from io import StringIO
file1 = StringIO(' 1 2\n#skip this line\n3 4\n5 6')
file2 = StringIO('1 2 3 4 5 6?skip the second half of this line')
file3 = StringIO('1 2 3 4 5 6%skip the second half of this line')
# load from the file if comments start with #
array1 = np.loadtxt(file1, comments = '#')
print('Array1:', array1)
# load from the file and ignore all the characters after ?
array2 = np.loadtxt(file2, comments = '?')
print('Array2:', array2)
# load from the file and ignore all the characters after *
array3 = np.loadtxt(file3, comments = '*')
print('Array3:', array3)
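Output
Array1: [[1. 2.]
 [3. 4.]
 [5. 6.]]
Array2: [1. 2. 3. 4. 5. 6.]
ValueError: could not convert string '6%skip' to float64 at row 0, column 6.
Here, array1 and array2 load correctly, but array3 raises an error. The file uses % as the comment marker, while the code passes * as the comments argument. As a result, loadtxt() treats the text after % as data and fails when it tries to convert '6%skip' to a number.
The comments argument is useful when a text file contains extra information or metadata lines that are not part of the data we want to load.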
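The error above goes away once comments matches the marker actually used in the file. The sketch below shows that fix, and also passes a list of markers for a file that mixes two of them; the second file and the names mixed and array4 are assumptions made for illustration.

```python
from io import StringIO
import numpy as np

# same content as file3 above, now read with the matching marker
file3 = StringIO('1 2 3 4 5 6%skip the second half of this line')
array3 = np.loadtxt(file3, comments='%')
print('Array3:', array3)

# hypothetical file that uses both # and % as comment markers
mixed = StringIO('# header line\n1 2 % trailing note\n3 4')
array4 = np.loadtxt(mixed, comments=['#', '%'])
print('Array4:', array4)
```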
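Example 4: Use the delimiter Argument to Separate Data Entries
The delimiter argument specifies the character that separates data entries in the input file. By default, delimiter = None, which means that whitespace is treated as the delimiter.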
import numpy as np
# create two file objects using StringIO
from io import StringIO
file1 = StringIO('1 2 3 4 5 6')
file2 = StringIO('1,2,3,4,5,6')
# load from file
# by default white-space acts as delimiter
array1 = np.loadtxt(file1)
print('Array1:', array1)
# load from file with commas as delimiter
array2 = np.loadtxt(file2, delimiter = ',')
print('Array2:', array2)
Output
Array1: [1. 2. 3. 4. 5. 6.]
Array2: [1. 2. 3. 4. 5. 6.]
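The same idea applies to other single-character separators and to files with several rows. A small sketch, with sample data made up for illustration:

```python
from io import StringIO
import numpy as np

# semicolon-separated values
file1 = StringIO('1;2;3\n4;5;6')
array1 = np.loadtxt(file1, delimiter=';')
print('Semicolon:\n', array1)

# tab-separated values
file2 = StringIO('1\t2\t3\n4\t5\t6')
array2 = np.loadtxt(file2, delimiter='\t')
print('Tab:\n', array2)
```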
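Example 5: Use the converters Argument to Parse Input
The converters argument converts and parses the contents of the input file while the NumPy array is created.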
import numpy as np
# create a file object using StringIO
from io import StringIO
file1 = StringIO('1 2 3 4 5')
# converter that squares each value
def square(n):
    return int(n)**2
# load from file and square
array1 = np.loadtxt(file1, converters = square)
print('Array1:', array1)
# reset file pointer
file1.seek(0)
# use lambda function as converter
array2 = np.loadtxt(file1, converters = lambda i:int(i)**2)
print('Array2:', array2)
# reset file pointer
file1.seek(0)
# use converter only on element at index 2
array3 = np.loadtxt(file1, converters = {2:square})
print('Array3:', array3)
Output
Array1: [ 1.  4.  9. 16. 25.]
Array2: [ 1.  4.  9. 16. 25.]
Array3: [1. 2. 9. 4. 5.]
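Converters are most useful when a column holds text that float() cannot parse directly. The sketch below turns percentages such as 10% into fractions; the helper parse_percent and the sample data are assumptions made for illustration. Depending on the NumPy version and the encoding setting, a converter may receive bytes or str, so the helper accepts both.

```python
from io import StringIO
import numpy as np

def parse_percent(value):
    # the converter may be handed bytes or str, so normalise to str first
    if isinstance(value, bytes):
        value = value.decode()
    return float(value.rstrip('%')) / 100

# first column holds percentages, second column holds plain numbers
file1 = StringIO('10% 1\n25% 2\n50% 3')

# apply the converter to column 0 only
table = np.loadtxt(file1, converters={0: parse_percent})
print(table)
```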
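Example 6: Use the skiprows Argument to Skip Rows
The skiprows argument skips the given number of lines at the start of the file before the contents are read into the NumPy array.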
import numpy as np
# create a file object using StringIO
from io import StringIO
file1 = StringIO('Col1 Col2\n1 20\n4 50\n9 81')
# load from file and skip the 1st row
array1 = np.loadtxt(file1, skiprows = 1)
print('Array1:\n', array1)
Output
Array1:
 [[ 1. 20.]
 [ 4. 50.]
 [ 9. 81.]]
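skiprows discards the header. If the column names are needed, one possible approach is to read the header line from the file object first and let loadtxt() continue from the current position. This is a sketch of that alternative, not part of loadtxt() itself; the names header and columns are made up for illustration.

```python
from io import StringIO
import numpy as np

file1 = StringIO('Col1 Col2\n1 20\n4 50\n9 81')

# consume the header line ourselves and keep the column names
header = file1.readline().split()

# loadtxt() reads from the current position, so no skiprows is needed
data = np.loadtxt(file1)

# pair each column name with its column of values
columns = dict(zip(header, data.T))
print(header)
print(columns['Col2'])
```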
Example 7: Use the usecols Argument to Read Specific Columns
The usecols argument reads only the specified columns of the file into the NumPy array.
import numpy as np
# StringIO behaves like a file object
from io import StringIO
file1 = StringIO('1 2 3\n4 5 6\n7 8 9')
# load from file
array1 = np.loadtxt(file1)
# reset file pointer
file1.seek(0)
# load from file and read only 1st and 3rd column
array2 = np.loadtxt(file1, usecols = [0,2])
print('Whole Array:\n', array1)
print('Array using Column 0 and 2:\n', array2)
Output
Whole Array:
 [[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
Array using Column 0 and 2:
 [[1. 3.]
 [4. 6.]
 [7. 9.]]
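usecols also accepts a single integer. A short sketch, with the variable name second_column chosen for illustration, showing that the result is then a one-dimensional array:

```python
from io import StringIO
import numpy as np

file1 = StringIO('1 2 3\n4 5 6\n7 8 9')

# a single int selects one column and returns a 1-D array
second_column = np.loadtxt(file1, usecols=1)
print(second_column)
print(second_column.shape)
```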
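Example 8: Use the unpack Argument
The unpack argument is a boolean flag that specifies whether the loaded data should be unpacked.
When it is True, each column is returned as its own array; when it is False, the whole file is returned as a single array.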
import numpy as np
# create two file objects using StringIO
from io import StringIO
file1 = StringIO('1 2 3\n4 5 6\n7 8 9')
file2 = StringIO('1 2 3\n4 5 6\n7 8 9')
# load from file
array1 = np.loadtxt(file1)
print('Array1:\n', array1)
# load from file and unpack it
x, y, z = np.loadtxt(file2, unpack = True)
print('Unpacked Values:\n', x, y, z)
Output
Array1:
 [[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
Unpacked Values:
 [1. 4. 7.] [2. 5. 8.] [3. 6. 9.]
Note: The values are unpacked column by column.
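Unpacking by column amounts to transposing the loaded array. The sketch below checks that, and combines unpack with usecols to pull out two columns directly; the variable names are chosen for illustration.

```python
from io import StringIO
import numpy as np

text = '1 2 3\n4 5 6\n7 8 9'

# unpack = True gives the same values as transposing the full array
unpacked = np.loadtxt(StringIO(text), unpack=True)
transposed = np.loadtxt(StringIO(text)).T
same = np.array_equal(unpacked, transposed)
print('Same as transpose:', same)

# unpack only the 1st and 3rd columns
x, z = np.loadtxt(StringIO(text), usecols=(0, 2), unpack=True)
print(x, z)
```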
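Example 9: Use the ndmin Argument to Specify the Minimum Number of Dimensions
The ndmin argument specifies the minimum number of dimensions of the created array. By default, ndmin = 0, i.e., the created array is not forced to have any minimum number of dimensions.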
import numpy as np
# create three file objects using StringIO
from io import StringIO
file1 = StringIO('1 2 3 4 5 ')
file2 = StringIO('1 2 3 4 5')
file3 = StringIO('1 2 3 4 5')
# load from file
array1 = np.loadtxt(file1)
print('Array1:\n', array1)
# load from file in an array of 1 dimension
array2 = np.loadtxt(file2, ndmin = 1)
print('Array2:\n',array2)
# load from file in an array of 2 dimensions
array3 = np.loadtxt(file3, ndmin = 2)
print('Array3:\n', array3)
Output
Array1:
 [1. 2. 3. 4. 5.]
Array2:
 [1. 2. 3. 4. 5.]
Array3:
 [[1. 2. 3. 4. 5.]]
Note: ndmin can only range from 0 to 2.
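The effect of ndmin is easiest to see in the shapes. The sketch below loads a single-row file and a single-column file with each allowed value; the sample strings and the shapes dictionary are made up for illustration.

```python
from io import StringIO
import numpy as np

single_row = '1 2 3'
single_column = '1\n2\n3'

# collect the (row, column) shapes for every allowed ndmin value
shapes = {}
for ndmin in (0, 1, 2):
    row = np.loadtxt(StringIO(single_row), ndmin=ndmin)
    col = np.loadtxt(StringIO(single_column), ndmin=ndmin)
    shapes[ndmin] = (row.shape, col.shape)
    print(ndmin, row.shape, col.shape)
```

With ndmin = 2, the row and column layouts stay distinguishable, which helps when later code expects a two-dimensional array regardless of how many rows the file has.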
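Example 10: Use the max_rows Argument to Specify the Maximum Number of Rows
The max_rows argument specifies the maximum number of rows to read from the file.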
import numpy as np
# create three file objects using StringIO
from io import StringIO
file1 = StringIO('1 2\n3 4\n5 6')
file2 = StringIO('1 2\n3 4\n5 6')
file3 = StringIO('1 2\n3 4\n5 6')
# load all rows from the file
array1 = np.loadtxt(file1)
print('Original array:\n', array1)
# load 2 rows from file
array2 = np.loadtxt(file2, max_rows = 2)
print('Array with 2 max rows:\n', array2)
# load 1 row from file after skipping the first row
array3 = np.loadtxt(file3, skiprows = 1, max_rows = 1)
print('Array with 1 skipped row and 1 max row:\n',array3)
Output
Original array:
 [[1. 2.]
 [3. 4.]
 [5. 6.]]
Array with 2 max rows:
 [[1. 2.]
 [3. 4.]]
Array with 1 skipped row and 1 max row:
 [3. 4.]
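Example 11: Use the quotechar Argument to Specify Quotes
The quotechar argument specifies the character that marks the start and end of a quoted item.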
import numpy as np
# create a file object using StringIO
from io import StringIO
file1 = StringIO('Today is a !good day!')
# load text from file
array1 = np.loadtxt(file1, dtype = 'U20', quotechar = '!')
print('Array1:\n', array1)
Output
Array1:
 ['Today' 'is' 'a' 'good day']
Here, whitespace is used as the delimiter. Although there is a space between good and day, the ! characters mark good day as a single element.
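A more common use of quotechar is reading comma-separated data where a quoted field itself contains a comma. A minimal sketch, assuming NumPy 1.23 or later (where quotechar is available); the field names name and value and the sample data are made up for illustration.

```python
from io import StringIO
import numpy as np

# the quoted names contain the delimiter itself
file1 = StringIO('"Smith, J",10.5\n"Lee, K",7.25')

# hypothetical structured type: a text field and a float field
record_type = np.dtype([('name', 'U12'), ('value', float)])

records = np.loadtxt(file1, dtype=record_type, delimiter=',', quotechar='"')
print(records['name'])
print(records['value'])
```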