Python example: reading a txt file

I/O modes

Mode	r	r+	w	w+	a	a+
Read	+	+		+		+
Write		+	+	+	+	+
Create			+	+	+	+
Truncate			+	+
Position at start	+	+	+	+
Position at end					+	+
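The table can be checked with a small experiment. The following is a minimal sketch in Python 3 syntax (the listings in this post use Python 2); the scratch file name demo.txt and the variable names are assumptions made for illustration only.

```python
import os
import tempfile

# Hypothetical scratch file; the name demo.txt is only for illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# 'w' creates the file (or truncates an existing one) and writes from the start.
with open(path, "w") as f:
    f.write("first\n")

# 'a' keeps the existing content and writes at the end.
with open(path, "a") as f:
    f.write("second\n")

# 'r+' allows reading and writing without truncating; the position starts at 0.
with open(path, "r+") as f:
    before = f.read()       # reads both lines, leaving the position at the end
    f.write("third\n")      # so this write lands after the existing content

with open(path, "r") as f:
    after_r_plus = f.read()

# 'w+' also allows reading, but the old content is gone as soon as the file opens.
with open(path, "w+") as f:
    after_w_plus = f.read()

print(repr(before))
print(repr(after_r_plus))
print(repr(after_w_plus))
```

The last read comes back empty because w+ truncates on open, which is the practical difference between r+ and w+ in the table.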

File-reading example

For example, read the data in the following file:

5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor

The code is as follows:

# -*- coding: utf-8 -*-
# Python 2 syntax (print statements).
change = {'Iris-setosa': 0, 'Iris-versicolor': 1}   # class name -> numeric label
data = list()
class read(object):
    def read_data(self):
        with open('D:\\testIris.txt', 'r') as f:
            for line in f.readlines():
                print 'raw line',line,'end'
                print 'raw line type',type(line)
                line = line.strip()
                print 'line after strip',line
                print 'line type after strip',type(line)
                line = line.split(',')
                print 'line after split',line
                print 'line type after split',type(line)
                print 'element type after split', type(line[0])
                line = [change[x] if x in change else x for x in line]   # replace the class name in the last column with 0 or 1
                print 'line after replacement',line
                print 'line type after replacement',type(line)
                line = map(float,line)             # convert every element from str to float
                print 'line after float conversion',line
                print 'line type after float conversion',type(line)
                print 'element type after float conversion', type(line[0])
                data.append(line)                # append the row to the data list
            print 'data after append',data
            print 'data type after append',type(data)
        return data

if __name__ == '__main__':
    myread = read()       # read_data is an instance method, so create an instance first
    one = myread.read_data()

Output:

raw line 5.1,3.5,1.4,0.2,Iris-setosa
end
raw line type <type 'str'>
line after strip 5.1,3.5,1.4,0.2,Iris-setosa
line type after strip <type 'str'>
line after split ['5.1', '3.5', '1.4', '0.2', 'Iris-setosa']
line type after split <type 'list'>
element type after split <type 'str'>
line after replacement ['5.1', '3.5', '1.4', '0.2', 0]
line type after replacement <type 'list'>
line after float conversion [5.1, 3.5, 1.4, 0.2, 0.0]
line type after float conversion <type 'list'>
element type after float conversion <type 'float'>
raw line 7.0,3.2,4.7,1.4,Iris-versicolor end
raw line type <type 'str'>
line after strip 7.0,3.2,4.7,1.4,Iris-versicolor
line type after strip <type 'str'>
line after split ['7.0', '3.2', '4.7', '1.4', 'Iris-versicolor']
line type after split <type 'list'>
element type after split <type 'str'>
line after replacement ['7.0', '3.2', '4.7', '1.4', 1]
line type after replacement <type 'list'>
line after float conversion [7.0, 3.2, 4.7, 1.4, 1.0]
line type after float conversion <type 'list'>
element type after float conversion <type 'float'>
data after append [[5.1, 3.5, 1.4, 0.2, 0.0], [7.0, 3.2, 4.7, 1.4, 1.0]]
data type after append <type 'list'>

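Note: when reading a file on Windows, writing the path as D:\testIris.txt raises IOError: [Errno 22] invalid mode ('r') or filename, because Python interprets \t in an ordinary string literal as a tab character. Change the path to D:\\testIris.txt or D:/testIris.txt.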
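The escape problem can be seen without touching the file system. The sketch below compares the spellings of the path; the raw-string form (the r prefix) is an addition not mentioned in the original note, and the variable names are illustrative.

```python
# In an ordinary string literal, \t is the escape sequence for a tab character.
broken = 'D:\testIris.txt'
escaped = 'D:\\testIris.txt'    # backslash escaped by doubling it
raw = r'D:\testIris.txt'        # raw string: backslashes are kept literally
forward = 'D:/testIris.txt'     # forward slashes need no escaping

print('\t' in broken)           # the broken path contains a tab
print('\t' in raw)              # the raw string does not
print(escaped == raw)           # both spell the same path
print(len(broken), len(raw))    # the broken path is one character shorter
```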

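Analysis of the output:

First group of prints: the first line of the file ends with a newline and the second does not, so `end` is printed on a new line after the first raw line but on the same line after the second (the second line's trailing newline was removed by hand, as in the sample file above). Each line is read as a string.
Second group of prints: strip() removes leading and trailing spaces and newlines; the result is still a string.
Third group of prints: split(',') splits the string on commas and returns a list, but every element of the list is still a str.
Fourth group of prints: the line after each class name found in the `change` dictionary (tested with `in`) has been replaced by its numeric label.
Fifth group of prints: every element of the list is converted to float, so the quotes disappear from the output; all elements are now float rather than str.
Sixth print: each converted line is appended to data, producing a list of lists.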
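The same pipeline (strip, split, replace, convert, append) can be condensed into one function. This is a sketch in Python 3 syntax, not the author's listing: the name parse_lines and the blank-line check are assumptions added here, and the input is a list of strings instead of a file so the example runs anywhere.

```python
# Label mapping copied from the listing above.
change = {'Iris-setosa': 0, 'Iris-versicolor': 1}

def parse_lines(lines):
    """Turn comma-separated text lines into rows of floats (hypothetical helper)."""
    rows = []
    for line in lines:
        line = line.strip()                            # drop surrounding whitespace and the newline
        if not line:                                   # skip blank lines (an added safeguard)
            continue
        fields = line.split(',')                       # list of str
        fields = [change.get(x, x) for x in fields]    # class name -> 0 or 1
        rows.append([float(x) for x in fields])        # list of float
    return rows

# The first sample line keeps its newline, the second has none, as in the file.
sample = ["5.1,3.5,1.4,0.2,Iris-setosa\n", "7.0,3.2,4.7,1.4,Iris-versicolor"]
rows = parse_lines(sample)
print(rows)
```

A list comprehension replaces map(float, line) because in Python 3 map returns an iterator, which cannot be indexed the way the Python 2 listing does.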

Deleting a column from a 2-D list

# -*- coding: utf-8 -*-
import numpy as np

data = [[1,2,3,4],[5,6,7,8],[9,10,11,12]]
print 'data:\n',data
print 'data_type',type(data)

# raw_data = np.array(data)    # optional here: np.delete accepts a list and returns a numpy.ndarray
test_delete = np.delete(data, 1 ,axis=1)   # remove the column at index 1
print 'test_delete\n',test_delete
print 'test_delete_type',type(test_delete)

raw_data = np.array(data)  # required for 2-D slicing; on a plain list it raises TypeError: list indices must be integers, not tuple
test_slice = raw_data[:,:-1]               # keep every column except the last
print 'test_slice\n',test_slice
print 'test_slice_type',type(test_slice)

Output:

data:
[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
data_type <type 'list'>
test_delete
[[ 1  3  4]
 [ 5  7  8]
 [ 9 11 12]]
test_delete_type <type 'numpy.ndarray'>
test_slice
[[ 1  2  3]
 [ 5  6  7]
 [ 9 10 11]]
test_slice_type <type 'numpy.ndarray'>
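If the result should stay a plain list, the column can also be dropped without numpy. The following is a sketch in Python 3 syntax; delete_column is a hypothetical helper name, not something from the original post.

```python
def delete_column(rows, index):
    """Return a new 2-D list without the column at `index` (hypothetical helper)."""
    return [row[:index] + row[index + 1:] for row in rows]

data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]

without_second = delete_column(data, 1)      # same column as np.delete(data, 1, axis=1)
without_last = [row[:-1] for row in data]    # same columns as raw_data[:, :-1]

print(without_second)
print(without_last)
print(type(without_second))
```

Both results are ordinary lists of lists, and data itself is left unchanged.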

Deleting a row from a 2-D list

# -*- coding: utf-8 -*-
import numpy as np

data = [[1,2,3,4],[5,6,7,8],[9,10,11,12]]
print 'data:\n',data
print 'data_type',type(data)

# raw_data = np.array(data)    # optional here: np.delete accepts a list and returns a numpy.ndarray
test_delete = np.delete(data, 1 ,axis=0)   # remove the row at index 1
print 'test_delete\n',test_delete
print 'test_delete_type',type(test_delete)

raw_data = np.array(data)  # required for 2-D slicing; on a plain list it raises TypeError: list indices must be integers, not tuple
test_slice = raw_data[:-1,:]               # keep every row except the last
print 'test_slice\n',test_slice
print 'test_slice_type',type(test_slice)

# Pure-list alternative: join the rows before and after index 1
partone = data[:1]
print partone
parttwo = data[2:]
test_sum = partone + parttwo
print 'test_sum\n',test_sum
print 'test_sum_type',type(test_sum)

Output:

data:
[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
data_type <type 'list'>
test_delete
[[ 1  2  3  4]
 [ 9 10 11 12]]
test_delete_type <type 'numpy.ndarray'>
test_slice
[[1 2 3 4]
 [5 6 7 8]]
test_slice_type <type 'numpy.ndarray'>
[[1, 2, 3, 4]]
test_sum
[[1, 2, 3, 4], [9, 10, 11, 12]]
test_sum_type <type 'list'>
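The partone + parttwo step generalizes to any row index. The following is a small sketch in Python 3 syntax; delete_row is a hypothetical helper name introduced here, and the del variant is shown only for contrast because it modifies the list in place.

```python
def delete_row(rows, index):
    """Return a new outer list without the row at `index` (hypothetical helper).

    The remaining inner lists are shared with `rows`, not copied.
    """
    return rows[:index] + rows[index + 1:]

data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]

kept = delete_row(data, 1)      # data itself still has three rows

in_place = list(data)           # shallow copy of the outer list
del in_place[1]                 # del removes the row from the list it is applied to

print(kept)
print(in_place)
print(len(data))
```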

References

How to open a file for both reading and writing?