I have a very long txt file containing geographic coordinates. The format for each row looks like this:
501418.209 5314160.484 512.216
501418.215 5314160.471 512.186
501418.188 5314160.513 512.216
So the values are separated by a blank (" ") and each row ends with a line break (\n).
I need to import that file into a list. So far I have only managed to import it as a string and then tried to convert it into a list. Unfortunately, I have no idea how I can keep the formatting of the txt file, as I need to perform calculations on each row.
My solution so far for importing the txt file into a string variable:
fileobj = file(source, 'r')
data = ""
for line in fileobj.readlines():
    linevals = line.strip().split(" ")
    data += "%s %s %s\n" % (linevals[0], linevals[1], linevals[2])
print type(data)
And my solution for importing it as a list, which didn't work:
fileobj = file(source, 'r')
data = []
for line in fileobj.readlines():
    linevals = line.strip().split(" ")
    data.append(linevals)
On Stack Overflow I found lots of solutions that suggested the eval function, but that didn't work as I need the whole row as one list element. Hope that was clear. Any solutions for this problem? I'm pretty new to Python, and this has been bothering me for quite some time now. Thank you!
You don't need eval or anything other than simply splitting each row and casting to float:
with open(source) as f:
    for row in f:
        print(map(float, row.split()))  # Python 3: print(list(map(float, row.split())))
[501418.209, 5314160.484, 512.216]
[501418.215, 5314160.471, 512.186]
[501418.188, 5314160.513, 512.216]
If you want all rows in a single list:
with open(source) as f:
    data = [map(float, row.split()) for row in f]  # Python 3: list(map(float, row.split()))
print(data)
[[501418.209, 5314160.484, 512.216], [501418.215, 5314160.471, 512.186], [501418.188, 5314160.513, 512.216]]
Or using the csv module:
import csv
with open(source) as f:
    data = [map(float, row) for row in csv.reader(f, delimiter=" ")]
print(data)
If you want a flat list of all data:
with open(source) as f:
    data = []
    for row in f:
        data.extend(map(float, row.split()))
If you are doing a lot of work on the data you may find numpy useful:
import numpy as np
data = np.genfromtxt(source,delimiter=" ").flatten()
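Since you need to perform calculations on each row, it may be handier to keep the two-dimensional shape instead of flattening. A minimal sketch, assuming the three columns are easting, northing and height as in the sample data:
import numpy as np

coords = np.genfromtxt(source, delimiter=" ")  # one row per line, one column per value

heights = coords[:, 2]       # third column of every row
print(heights.mean())        # example per-column calculation
print(coords - coords[0])    # example per-row calculation: offsets from the first point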
Related
I have lists of float numbers in Python.
I would like to save some lists in txt files and then read them back list by list, exactly as they were written. But after I wrote the following code I discovered that str(i) turns every number into a separate string, and I cannot read the lists back the way they were written.
P.S. I have 10,000 lists of results, so I would like to save every list on one line.
result = [-0.33434,0.4434, 4343....]
with open("out.txt", w) as out:
for i in result:
out.write(str(i)+' ')
out.write("\n")
Updated Answer
For multiple lists, put them all inside a single list and write that to the pkl/txt file.
import numpy as np

results = [
    [1.456, 2.245, -3.441],
    [4.53, 4.55, 1.22],
]
np.savetxt("results.txt", results)  # write

# read
result = np.loadtxt("results.txt")
print(result.tolist())
If you want to have names associated with your lists, consider using a dict. But then you can only use pickle.
import pickle

results = {
    "result1": [1.456, 2.245, -3.441],
    "result2": [4.53, 4.55, 1.22],
}
# write
with open("results.pkl", "wb") as resultFile:
    pickle.dump(results, resultFile)
# read
with open("results.pkl", "rb") as resultFile:
    result = pickle.load(resultFile)
print(result)
Original Answer
Use pickle or numpy for this, as they are better suited to this task.
Using pickle:
import pickle

result1 = [1.456, 2.245, -3.441]
# write
with open("result1.pkl", "wb") as resultFile:
    pickle.dump(result1, resultFile)
# read
with open("result1.pkl", "rb") as resultFile:
    result = pickle.load(resultFile)
print(result)
Using numpy:
import numpy as np

result1 = [1.456, 2.245, -3.441]
np.savetxt("result1.txt", result1)  # write

# read
result = np.loadtxt("result1.txt")
print(result.tolist())
If you don't want to use a library to do this, you can simply join each list by a known separator and then write the resulting string to the file.
Here, I assume you have a list-of-lists called allResults, which is of the form
allResults = [
    [-0.332, 434, 0.4434, 4865],
    [9.456, -0.540, -7.06540, 5.05453],
    # ... and so on
]
separator = ","
with open("out.txt", "w") as out_file:
for l in allResults:
out_string = separator.join(str(x) for x in l) + "\n"
out_file.write(out_string)
Now, out.txt contains:
-0.332,434,0.4434,4865
9.456,-0.54,-7.0654,5.05453
Then to read the file, you can read each line, split it by your separator, convert each element of the split string to a float, and put that new list in your list of lists:
all_lists = []
with open("out.txt", "r") as in_file:
    for line in in_file:
        new_list = [float(x) for x in line.split(separator)]
        all_lists.append(new_list)
And now you have your list of lists back:
all_lists: [[-0.332, 434.0, 0.4434, 4865.0], [9.456, -0.54, -7.0654, 5.05453]]
If you want to read it again, read each line and split it on ' ' (space). This gives you the string representations of the float numbers; afterwards you can use float(num) to convert each one back to a number.
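A minimal sketch of that read-back, assuming the file was written one list per line as out.txt above (all_results is just an assumed name):
all_results = []
with open("out.txt") as in_file:
    for line in in_file:
        # split on whitespace and convert each piece back to a float
        numbers = [float(num) for num in line.split()]
        all_results.append(numbers)
print(all_results)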
I have the following output from a csv file:
word1|word2|word3|word4|word5|word6|01:12|word8
word1|word2|word3|word4|word5|word6|03:12|word8
word1|word2|word3|word4|word5|word6|01:12|word8
What I need to do is change the time string so it looks like 00:01:12.
My idea is to extract the list item [7] and add "00:" as a string to the front.
import csv

with open('temp', 'r') as f:
    reader = csv.reader(f, delimiter="|")
    for row in reader:
        fixed_time = (str("00:") + row[7])
        begin = row[:6]
        end = row[:8]
        print begin + fixed_time + end
I get this error message:
TypeError: can only concatenate list (not "str") to list.
I also had a look at this post:
how to change [1,2,3,4] to '1234' using python
I need to know whether my approach to the solution is the right way, or whether I need to use split or something else for this.
Thanks for any help.
The line that's throwing the exception is
print begin + fixed_time +end
because begin and end are both lists and fixed_time is a string. Whenever you take a slice of a list (that's the row[:6] and row[:8] parts), a list is returned. If you just want to print it out, you can do
print begin, fixed_time, end
and you won't get an error.
Corrected code:
I'm opening a new file for writing (I'm calling it 'final', but you can call it whatever you want), and I'm just writing everything to it with the one modification. It's easiest to just change the one element of the row that holds the time (row[6] here), and use '|'.join to write a pipe character between each column.
import csv

with open('temp', 'r') as f, open('final', 'w') as fw:
    reader = csv.reader(f, delimiter="|")
    for row in reader:
        # just change the element in the row to have the extra zeros
        row[6] = '00:' + row[6]
        # write the row back out, separated by | characters, and a new line
        fw.write('|'.join(row) + '\n')
You can use a regex for that:
>>> txt = """\
... word1|word2|word3|word4|word5|word6|01:12|word8
... word1|word2|word3|word4|word5|word6|03:12|word8
... word1|word2|word3|word4|word5|word6|01:12|word8"""
>>> import re
>>> print(re.sub(r'\|(\d\d:\d\d)\|', r'|00:\1|', txt))
word1|word2|word3|word4|word5|word6|00:01:12|word8
word1|word2|word3|word4|word5|word6|00:03:12|word8
word1|word2|word3|word4|word5|word6|00:01:12|word8
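If the rows live in a file rather than a string (the question reads them from a file called temp), a minimal sketch of applying the same substitution line by line; temp_fixed is an assumed output file name:
import re

with open('temp') as f_in, open('temp_fixed', 'w') as f_out:
    for line in f_in:
        # prepend "00:" to any mm:ss field that sits between pipes
        f_out.write(re.sub(r'\|(\d\d:\d\d)\|', r'|00:\1|', line))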
I have a .xls file that I convert to .csv, and then I read the .csv until one specific line that contains the word clientegen; I want to get that row and put it in an array.
This is my code so far:
import xlrd
import csv

def main():
    print "Converts xls to csv and reads csv"
    wb = xlrd.open_workbook('ejemplo.xls')
    sh = wb.sheet_by_name('Hoja1')
    archivo_csv = open('fichero_csv.csv', 'wb')
    wr = csv.writer(archivo_csv, quoting=csv.QUOTE_ALL)
    for rownum in xrange(sh.nrows):
        wr.writerow(sh.row_values(rownum))
    archivo_csv.close()
    f = open('fichero_csv.csv', 'r')
    for lines in f:
        print lines

if __name__ == '__main__':
    main()
This prints me:
[... a lot of more stuff ...]
"marco 4","","","","","","","","","","","","","","",""
"","","","","","","","","","","","","","","",""
"","","","","","","","","","","","","","","",""
"clientegen","maier","embega","Jegan ","tapa pure","cil HUF","carcHUF","tecla NSS","M1 NSS","M2 nss","M3 nss","doble nss","tapon","sagola","clip volvo","pillar"
"pz/bast","33.0","40.0","34.0","26.0","80.0","88.0","18.0","16.0","8.0","6.0","34.0","252.0","6.0","28.0","20.0"
"bast/Barra","5.0","3.0","6.0","8.0","10.0","4.0","10.0","10.0","10.0","10.0","8.0","4.0","6.0","10.0","6.0"
[... a lot of more stuff ...]
The thing I want to do is take that clientegen line and save the content of the row in a new string array named finalarray, for example:
finalarray = ["maier", "embega", "Jegan", "tapa pure", "cil HUF", "carcHUF", "tecla NSS", "M1 NSS", "M2 nss", "M3 nss", "doble nss", "tapon", "sagola", "clip volvo", "pillar"]
I'm not very familiar with Python file reading, so I would appreciate it if someone could give me a hand finding that line, getting those values and putting them in an array. Thanks in advance.
If you swap this for loop out for your for loop, it should do the trick:
for rownum in xrange(sh.nrows):
    row = sh.row_values(rownum)
    if row[0] == "clientegen":  # check if "clientegen" is the first element of the row
        finalarray = row[1:]    # if so, keep the values after the label as `finalarray`
    wr.writerow(row)
If there will ever be more than one "clientegen" line, we can adjust this code to save all of them.
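For example, a minimal sketch of that adjustment, collecting the values of every matching row into a list of lists (all_matches is an assumed name):
all_matches = []
for rownum in xrange(sh.nrows):
    row = sh.row_values(rownum)
    if row[0] == "clientegen":
        all_matches.append(row[1:])  # keep only the values after the label
    wr.writerow(row)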
If you are just looking for the line that contains clientegen, then you could try:
finalarray = list()
with open("fichero_csv.csv") as f:
    for line in f:  # loop through all the lines
        values = [v.strip('"') for v in line.strip().split(",")]  # split on commas and drop the quotes
        if "clientegen" in values:  # check to see if your word is in the list
            finalarray = values[1:]  # if so, keep the values after the label in your finalarray
            break  # stop checking any further
I want to get data from a table in a text file into a python array. The text file that I am using as an input has 7 columns and 31 rows. Here is an example of the first two rows:
10672 34.332875 5.360831 0.00004035881220 0.00000515052523 4.52E-07 6.5E-07
12709 40.837833 19.429158 0.00012010938453 -0.00000506426720 7.76E-06 2.9E-07
The code that I have tried to write isn't working as it is not reading one line at a time when it goes through the for loop.
data = []
f = open('hyadeserr.txt', 'r')
while True:
    eof = "no"
    array = []
    for i in range(7):
        line = f.readline()
        word = line.split()
        if len(word) == 0:
            eof = "yes"
        else:
            array.append(float(word[0]))
    print array
    if eof == "yes": break
    data.append(array)
Any help would be greatly appreciated.
A file with space-separated values is just a dialect of the classic comma-separated values (CSV) file where the delimiter is a space (' '), possibly followed by more spaces, which can be ignored.
Happily, Python comes with a csv.reader class that understands dialects.
You should use this:
Example:
#!/usr/bin/env python
import csv

csv.register_dialect('ssv', delimiter=' ', skipinitialspace=True)

data = []
with open('hyadeserr.txt', 'r') as f:
    reader = csv.reader(f, 'ssv')
    for row in reader:
        floats = [float(column) for column in row]
        data.append(floats)

print data
If you don't want to use csv here, since you don't really need it:
data = []
with open("hyadeserr.txt") as file:
    for line in file:
        data.append([float(f) for f in line.strip().split()])
Or, if you know for sure that the only extra chars are spaces and line ending \n, you can turn the last line into:
data.append([float(f) for f in line[:-1].split()])
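Since the file is a purely numeric table, numpy is another option if it is available; a minimal sketch:
import numpy as np

data = np.loadtxt('hyadeserr.txt')  # 2-D array of floats, one row per line
print data.shape                    # (31, 7) for the file described above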
Before, I was not able to get the split to work. Now it is working, but it only performs the calculation on the last list of the list of lists. I need it to calculate the efficiency for each of the players, not just the last one in the file.
I am thinking a while loop before the calculation might solve my problem, but I am open to suggestions.
def get_data_list(file_name):
    data_file = open(file_name, "r")
    data_list = []
    for line_str in data_file:
        # strip end-of-line, split on commas, and append items to list
        data_list = line_str.strip().split(',')
        gp = int(data_list[6])
        mins = int(data_list[7])
        pts = int(data_list[8])
        oreb = int(data_list[9])
        dreb = int(data_list[10])
        reb = int(data_list[11])
        asts = int(data_list[12])
        stl = int(data_list[13])
        blk = int(data_list[14])
        to = int(data_list[15])
        pf = int(data_list[16])
        fga = int(data_list[17])
        fgm = int(data_list[18])
        fta = int(data_list[19])
        ftm = int(data_list[20])
        tpa = int(data_list[21])
        tpm = int(data_list[22])
        efficiency = ((pts+reb+asts+stl+blk)-((fga-fgm)+(fta-ftm)+to))/gp
        data_list.append(efficiency)
    return data_list
file_name1 = input("File name: ")
result_list = get_data_list (file_name1)
print(result_list)
Thanks in advance for your help.
You're redefining data_list in each iteration:
data_list = []
for line_str in data_file:
    # strip end-of-line, split on commas, and append items to list
    data_list = line_str.strip().split(',')
Try changing the first data_list to something like data = []. Also, you can use with when opening your file so that things like closing are handled properly:
def get_data_list(file_name):
    with open(file_name, "r") as data_file:
        data = []
        for line_str in data_file:
            # strip end-of-line, split on commas, and append items to list
            data_list = line_str.strip().split(',')
            # Your definitions here...
            gp = int(data_list[6])
            # ...
            efficiency = ((pts+reb+asts+stl+blk)-((fga-fgm)+(fta-ftm)+to))/gp
            data_list.append(efficiency)
            data.append(data_list)
    return data
However you could also look into the csv module - it looks like you're dealing with comma-separated values, and that module provides a very nice interface for handling them.
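For example, a minimal sketch of the same function using csv.reader; the column indices are taken from the code above:
import csv

def get_data_list(file_name):
    data = []
    with open(file_name, "r") as data_file:
        for row in csv.reader(data_file):
            # same column layout as above: the per-game stats are integer columns 6-22
            gp, pts, reb, asts, stl, blk = (int(row[i]) for i in (6, 8, 11, 12, 13, 14))
            to, fga, fgm, fta, ftm = (int(row[i]) for i in (15, 17, 18, 19, 20))
            efficiency = ((pts + reb + asts + stl + blk) - ((fga - fgm) + (fta - ftm) + to)) / gp
            data.append(row + [efficiency])
    return data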