Reading data in Python

Published: 2022-01-10 08:09:56

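⑴ How to read a large file quickly in Python

Python offers several ways to read a file: read, readline, readlines and xreadlines. Of these, the file object itself and xreadlines (Python 2 only) can be used as iterators that yield one line at a time, which makes them the efficient choice for large files.

For the test, first create a large file of roughly 1 GB with the following program: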

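import os.path
import time

open('messages', 'a').close()  # make sure the file exists before getsize() is called
# Keep appending lines until the file reaches about 1 GB
while os.path.getsize('messages') < 1000000000:
    f = open('messages', 'a')
    f.write('this is a file\n')
    f.close()

print('file creation completed')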

The loop checks the file size and stops once the file reaches about 1 GB. Creating the file takes several minutes.

The test code is as follows:

# 22s
start_time = time.time()
f = open('messages', 'r')
for i in f:
    # as written, the break below stops the loop after the first line
    end_time = time.time()
    print(end_time - start_time)
    break
f.close()

# 22s
start_time = time.time()
f = open('messages', 'r')
for i in f.xreadlines():  # xreadlines() exists only in Python 2
    end_time = time.time()
    print(end_time - start_time)
    break
f.close()

# readlines() loads every line into one list
start_time = time.time()
f = open('messages', 'r')
k = f.readlines()
f.close()
end_time = time.time()
print(end_time - start_time)

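With the iterators, the two timings are about the same, roughly 22 seconds, and memory consumption stays low.
Reading the whole file at once took about 40 s and consumed a great deal of memory, roughly 1 GB.

In fact, even with an iterator, some memory is unavoidably used while each line is processed (printed, for example). That memory can be released once the line is no longer needed, which is the main reason an iterator saves memory.
When all the data is read at once, by contrast, it stays in memory for as long as the result is referenced, so the process may run out of memory and hang.

In practice, it is best to write for i in f directly: the file object f is itself an iterator that yields one line at a time, much like repeated calls to f.readline.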
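
As a minimal sketch of the advice above in Python 3 syntax: the first helper iterates over the file object line by line, and the second reads fixed-size chunks, which helps when a file has no line breaks. The helper names count_lines and count_bytes, the chunk size, and the small temporary file that stands in for the 1 GB messages file are illustrative assumptions, not part of the original test.

```python
import os
import tempfile


def count_lines(path):
    """Iterate over the file object so only one line is held in memory at a time."""
    total = 0
    with open(path, "r", encoding="utf-8") as f:
        for _ in f:
            total += 1
    return total


def count_bytes(path, chunk_size=1024 * 1024):
    """Read fixed-size binary chunks; memory use is bounded by chunk_size."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty bytes object marks end of file
                break
            total += len(chunk)
    return total


# Small stand-in for the 1 GB 'messages' file, so the sketch runs quickly.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8", newline="\n"
) as tmp:
    tmp.write("this is a file\n" * 1000)
    sample_path = tmp.name

line_total = count_lines(sample_path)
byte_total = count_bytes(sample_path)
os.remove(sample_path)
print(line_total, byte_total)
```

Both helpers use a with block, so the file is closed even if an exception interrupts the loop.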

⑵ Reading data from a txt file into a list in Python

list1, list2, list3....: do you know in advance how many lines there are?

a = open('myfile.txt')
lines = a.readlines()
a.close()
lists = []  # a single list holding every row is enough
for line in lines:
    lists.append(line.split())
print(lists)

⑶ Reading space-separated data from a file in Python

1. Open Visual Studio Code 1.40.2.
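
For the task named in the heading, a minimal sketch might look like the following. It assumes every field is numeric; the helper name read_rows and the sample data are hypothetical, and io.StringIO stands in for a real file opened with open().

```python
import io


def read_rows(lines):
    """Split each non-empty line on whitespace and convert the fields to float."""
    rows = []
    for line in lines:
        fields = line.split()  # no argument: splits on any run of spaces or tabs
        if fields:             # skip blank lines
            rows.append([float(x) for x in fields])
    return rows


# io.StringIO stands in for open('data.txt'); both yield lines when iterated.
sample = io.StringIO("1 2 3\n4.5  6 7\n\n8 9 10\n")
rows = read_rows(sample)
print(rows)
```

Calling split() without an argument also tolerates repeated spaces and trailing newlines, which split(' ') does not.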

⑷ How to read the contents of a file in Python

# _*_ coding: utf-8 _*_

import pandas as pd


# Read the whole file into one string
def get_contends(path):
    with open(path) as file_object:
        contends = file_object.read()
    return contends


# Turn the text into a list of records: the content of each
# innermost [...] group that contains a double quote
def get_contends_arr(contends):
    contends_arr_new = []
    contends_arr = str(contends).split(']')
    for i in range(len(contends_arr)):
        if '[' in contends_arr[i]:
            index = contends_arr[i].rfind('[')
            temp_str = contends_arr[i][index + 1:]
            if '"' in temp_str:
                contends_arr_new.append(temp_str.replace('"', ''))
    return contends_arr_new


if __name__ == '__main__':
    path = 'event.txt'
    contends = get_contends(path)
    contends_arr = get_contends_arr(contends)
    contents = []
    for content in contends_arr:
        contents.append(content.split(','))
    # Assumes every record has exactly four comma-separated fields
    df = pd.DataFrame(contents, columns=['shelf_code', 'robotid', 'event', 'time'])
    print(df)
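
The bracket handling in get_contends_arr can also be expressed with a regular expression. The sketch below is an alternative, not the answer's own method; the helper name extract_records and the sample string are hypothetical, since the real layout of event.txt is not shown.

```python
import re


def extract_records(text):
    """Return the content of each innermost [...] group that contains a double quote."""
    records = []
    # [^\[\]]* matches only text without brackets, so outer groups are skipped
    for inner in re.findall(r"\[([^\[\]]*)\]", text):
        if '"' in inner:
            records.append(inner.replace('"', ""))
    return records


# Hypothetical sample input with two quoted records inside nested brackets.
sample = '[["A1,r7,arrive,10:00"],["B2,r3,leave,10:05"]]'
records = [r.split(",") for r in extract_records(sample)]
print(records)
```

Unlike the split-based version, the pattern ignores a trailing group that has an opening bracket but no closing one.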

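(4) Further reading on reading data in Python:

Python control statements

1. The if statement runs a block when its condition holds. It is often combined with else and elif (equivalent to else if).

2. The for statement walks over an iterable such as a list, string, dictionary or set and handles each element in turn.

3. The while statement runs a block repeatedly as long as its condition is true.

4. The try statement, combined with except and finally, handles exceptions raised while the program runs.

5. The class statement defines a type.

6. The def statement defines a function or a method of a type.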
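
The six statements listed above can be seen together in one short file-reading sketch. The class name Stats and the sample data are hypothetical, and io.StringIO stands in for an open file.

```python
import io


class Stats:                       # class: defines a type
    def __init__(self):            # def: defines a method
        self.values = []
        self.bad = 0

    def add(self, text):
        try:                       # try/except: handle a failed conversion
            self.values.append(float(text))
        except ValueError:
            self.bad += 1


stats = Stats()
source = io.StringIO("1.5\nabc\n\n2\n")  # stands in for an open file

while True:                        # while: loop until readline() signals end of file
    line = source.readline()
    if not line:                   # if: an empty string means end of file
        break
    elif line.strip():             # elif: ignore blank lines
        stats.add(line.strip())

total = 0.0
for value in stats.values:         # for: visit each collected value in turn
    total += value

print(total, stats.bad)
```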

⑸ How to read data from a file in Python

with open('f:/C.txt') as fid:
    for line in fid:
        line = line.split()
        print(line[1])

⑹ How to read all the data in a txt file in Python

f = open("a.txt")
print(f.read())
f.close()
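
In Python 3 the same read is often written with a with block or with pathlib, so the file is closed automatically. The sketch below is one way to do it; the temporary directory and the sample text are assumptions added so that it runs on its own.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical sample file, created here so the sketch is self-contained.
    path = Path(tmp_dir) / "a.txt"
    path.write_text("first line\nsecond line\n", encoding="utf-8")

    # Option 1: a with block closes the file even if read() raises.
    with open(path, encoding="utf-8") as f:
        text = f.read()

    # Option 2: pathlib opens, reads and closes the file in one call.
    same_text = path.read_text(encoding="utf-8")

print(text)
```

Passing encoding explicitly avoids depending on the platform's default encoding.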

⑺ Reading and printing data with a Python program

class StepTime:
    def __init__(self, name):
        self.name = name
        self.values = []

    def close(self):
        # Report a step whose values sum to zero
        if sum(self.values) == 0.0:
            print("all zero: " + self.name)

    def put(self, value):
        self.values.append(float(value))
        if len(self.values) == 1:  # first value recorded for this step
            print("not zero: " + self.name)


lasttime = None
for line in open("filename", "rt"):
    if line.startswith("step"):
        # A new step begins: finish the previous one first
        if lasttime:
            lasttime.close()
        name = line[len('step time='):].strip()
        lasttime = StepTime(name)
    else:
        lasttime.put(line[line.find("=") + 1:].strip())
lasttime.close()

That's all there is to it. The StepTime class is essentially a simple state machine.

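⑻ How to read data from a web page in Python

Use a parsing module such as Beautiful Soup:

Beautiful Soup is an HTML/XML parser written in Python. It copes well with malformed markup and builds a parse tree;
it provides simple, commonly used operations for navigating, searching and modifying the parse tree;
download the page's HTML with urllib or urllib2 (recommended; in Python 3 this functionality lives in urllib.request), then parse that HTML with Beautiful Soup;

then use Beautiful Soup's search methods or regular expressions to pick out the content you want and process it, for example: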


from bs4 import BeautifulSoup  # assumes Beautiful Soup 4 is installed

html = '<html><head><title>test</title></head><body><p>testbody</p></body></html>'
soup = BeautifulSoup(html, 'html.parser')
soup.contents[0].name
# 'html'
soup.contents[0].contents[0].name
# 'head'
head = soup.contents[0].contents[0]
head.parent.name
# 'html'
head.next_element
# <title>test</title>
