The data in the CSV file is organized like this:
Lisbon,Madrid,600
Madrid,Paris,650
import csv

with open('cidades.csv', 'r') as f:  # only 'r', since we just read the file
    read = csv.DictReader(f)
    list1 = []
    for line in read:
        list1.append=tuple(line.values())
I want to store all of these tuples in a list and then use the distance value to calculate the shortest path between two cities.
It raises an error when appending the tuple to the list: AttributeError: 'list' object attribute 'append' is read-only
Of course:
list1.append=tuple(line.values())
Here you're trying to redefine the append method of the list as a tuple. Fortunately for you, list members are protected against redefinition, which explains the error.
This is a typo: you should have done:
list1.append(tuple(line.values()))
Better: get rid of the loop and read list1 in one go using a list comprehension:
with open('cidades.csv', 'r') as f:
    list1 = [tuple(line.values()) for line in csv.DictReader(f)]
But this approach still makes me dubious: reading the values loses information about the column order (unless you're using Python 3.6, which uses OrderedDict for csv rows and fixes this issue). I would drop DictReader and rely on the order of your csv using csv.reader, like this:
with open('cidades.csv', 'r') as f:
    reader = csv.reader(f)
    next(reader)  # skip the title line, we don't need it
    list1 = [tuple(line) for line in reader]
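Assuming the file also has a header row (which DictReader needs), list1 would come out as [('Lisbon', 'Madrid', '600'), ('Madrid', 'Paris', '650')]. Since the stated goal is a shortest-path calculation, here is a minimal sketch of turning that list into an adjacency dict; the dict layout is my own assumption, not part of the question:
graph = {}
for origin, destination, distance in list1:
    graph.setdefault(origin, {})[destination] = int(distance)  # store distances as ints
# graph == {'Lisbon': {'Madrid': 600}, 'Madrid': {'Paris': 650}}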
I have a text file data.txt that contains 2 rows of text.
first_row_1 first_row_2 first_row_3
second_row_1 second_row_2 second_row_3
I would like to read the second row of the text file and convert its contents into a list of strings in Python. The list should look like this:
txt_list_str=['second_row_1','second_row_2','second_row_3']
Here is my attempted code:
import csv

with open('data.txt', newline='') as f:
    reader = csv.reader(f)
    row1 = next(reader)
    row2 = next(reader)
    my_list = row2.split(" ")
I got the error AttributeError: 'list' object has no attribute 'split'
I am using python v3.
EDIT: Thanks for all the answers. I am sure all of them work. But can someone tell me what is wrong with my own attempted code? Thanks.
The reason your code doesn't work is that you are trying to use split on a list, when it is meant to be used on a string. Because the file contains no commas, csv.reader returns each line as a single-element list, so in your example you would use row2[0] to access the first (and only) element of the list and split that:
my_list = row2[0].split(" ")
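Put together, a minimal sketch of the corrected attempt might look like this (same file name as in the question):
import csv

with open('data.txt', newline='') as f:
    reader = csv.reader(f)
    next(reader)                  # skip the first row
    row2 = next(reader)           # e.g. ['second_row_1 second_row_2 second_row_3']
    my_list = row2[0].split(" ")  # split the single string field on spaces

print(my_list)
# ['second_row_1', 'second_row_2', 'second_row_3']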
Alternatively, if you have access to the numpy library you can use loadtxt.
import numpy as np
f = np.loadtxt("data.txt", dtype=str, skiprows=1)
print(f)
# ['second_row_1' 'second_row_2' 'second_row_3']
The result of this is an array as opposed to a list. You could simply cast the array to a list if you require one:
print(list(f))
#['second_row_1', 'second_row_2', 'second_row_3']
Use open() to open the file.
E.g.
>>> fp = open('temp.txt')
The file object is itself an iterator over lines, so use next() to advance through it; call it once to skip the first line.
>>> next(fp)
'first_row_1 first_row_2 first_row_3\n'
Get the second line into a variable.
>>> second_line = next(fp)
>>> second_line
'second_row_1 second_row_2 second_row_3'
Use the split() string method to get the items as a list. split() takes one optional argument; if it is not given, it splits on whitespace.
>>> second_line.split()
['second_row_1', 'second_row_2', 'second_row_3']
And finally close the file.
fp.close()
Note: there are a number of ways to get this output.
But you should attempt it yourself first, as DavidG said in a comment.
with open("file.txt", "r") as f:
next(f) # skipping first line; will work without this too
for line in f:
txt_list_str = line.split()
print(txt_list_str)
Output
['second_row_1', 'second_row_2', 'second_row_3']
I just started to learn Python. I need to store the data from a CSV file in a list of tuples: a tuple for the values on each row, and a list to hold all the rows.
The function I have a problem with is the one that filters the list, i.e. it creates a copy of the list containing only the rows that meet a criterion. I have successfully appended all the tuples into a list, but when I need to append the tuples into a new list, it doesn't work.
def filterRecord():
    global filtered
    filtered = list()
    try:
        if int(elem[2]) >= int(command[-1]):  # the condition
            # if I print(elem) here, all results are correct
            filtered.append(tuple(elem))  # tuples do not get added to the list
            # len(filtered) is 0
    except ValueError:
        pass
def main():
    infile = open('file.csv')
    L = list()
    for line in infile:
        parseLine()  # a function that stores each row into a tuple
    for line in stdin:
        command = line.split()  # process user input, multiple lines
        for elem in L:
            if command == 0:
                filterRecord()
If I run it, the program doesn't respond. If I force stop it, the traceback always points to for line in stdin.
Also, I am not allowed to use the csv module in this program.
I think you need to import sys and use for line in sys.stdin
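A minimal sketch of that fix (the command handling is just a placeholder):
import sys

for line in sys.stdin:       # reads lines until EOF
    command = line.split()
    print(command)           # placeholder for the real processing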
You should use python's built-in library to parse csv files (unless this is something like a homework assignment): https://docs.python.org/2/library/csv.html.
You can then do something like:
import csv
with open ('file.csv', 'r') as f:
reader = csv.DictReader(f, delimiter=",")
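If the csv module were allowed, a minimal sketch of the filtering step could use csv.reader with the same int(elem[2]) condition as in the question; the threshold value here is only an example:
import csv

threshold = 100  # example value; in the question it comes from the user's command
filtered = []
with open('file.csv', 'r') as f:
    for row in csv.reader(f):
        try:
            if int(row[2]) >= threshold:
                filtered.append(tuple(row))
        except ValueError:
            pass  # skip rows where the third column is not a number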
I have a csv file that has each line formatted with the line name followed by 11 pieces of data. Here is an example of a line.
CW1,0,-0.38,2.04,1.34,0.76,1.07,0.98,0.81,0.92,0.70,0.64
There are 12 lines in total, each with a unique name and data.
What I would like to do is extract the first cell from each line and use that to name the corresponding data, either as a variable equal to a list containing that line's data, or maybe as a dictionary, with the first cell being the key.
I am new to working with input files, so the farthest I have gotten is reading the file in using the stock example from the documentation:
import csv

path = r'data.csv'
with open(path, 'rb') as csvFile:
    reader = csv.reader(csvFile, delimiter=' ')
    for row in reader:
        print(row[0])
I am failing to figure out how to assign each row to a new variable, especially when I am not sure what the variable names will be (this is because the csv file will be created by a user other than myself).
The destination for this data is a tool that I have written. It accepts lists as input such as...
CW1 = [0,-0.38,2.04,1.34,0.76,1.07,0.98,0.81,0.92,0.70,0.64]
so this would be the ideal end result. If it is easier, or considered better, for the file-reading output to be in another format, I can certainly re-write my tool to work with that data type.
As Scironic said in their answer, it is best to use a dict for this sort of thing.
However, be aware that dict objects do not have any "order" - the order of the rows will be lost if you use one. If this is a problem, you can use an OrderedDict instead (which is just what it sounds like: a dict that "remembers" the order of its contents):
import csv
from collections import OrderedDict as od

data = od()  # an ordered dict remembers the order of the rows in the csv file
with open(path, 'rb') as csvFile:
    reader = csv.reader(csvFile, delimiter=' ')
    for row in reader:
        data[row[0]] = row[1:]  # slice the row into 0 (first item) and 1: (remaining)
Now if you go looping through your data object, the contents will be in the same order as in the csv file:
for d in data.values():
    myspecialtool(*d)
You need to use a dict for these kinds of things (dynamic variables):
import csv

path = r'data.csv'
data = {}
with open(path, 'rb') as csvFile:
    reader = csv.reader(csvFile, delimiter=' ')
    for row in reader:
        data[row[0]] = row[1:]
Dicts are especially useful for dynamic variables and are the best way to store things like this. To access a row you just need:
data['CW1']
This solution also means that if you add any extra rows in with new names, you won't have to change anything.
If you are desperate to have the variable names in the global namespace and not within a dict, use exec (N.B. IF ANY OF THIS USES INPUT FROM OUTSIDE SOURCES, USING EXEC/EVAL CAN BE HIGHLY DANGEROUS (rm * level) SO MAKE SURE ALL INPUT IS CONTROLLED AND UNDERSTOOD BY YOURSELF).
with open(path, 'rb') as csvFile:
    reader = csv.reader(csvFile, delimiter=' ')
    for row in reader:
        exec("{} = {}".format(row[0], row[1:]))
In Python, you can use slicing: row[1:] will contain the row except for the first element, so you could do:
>>> d = {}
>>> with open("f") as f:
...     c = csv.reader(f, delimiter=',')
...     for r in c:
...         d[r[0]] = map(int, r[1:])
...
>>> d
{'var1': [1, 3, 1], 'var2': [3, 0, -1]}
Regarding variable variables, check How do I do variable variables in Python? or How to get a variable name as a string in Python?. I would stick to dictionary though.
An alternative to using the proper csv library could be as follows:
path = r'data.csv'
csvRows = open(path, "r").readlines()
dataRows = [[float(col) for col in row.rstrip("\n").split(",")[1:]] for row in csvRows]
for dataRow in dataRows:  # where dataRow is a list of numbers
    print(dataRow)
You could then call your function where the print statement is.
This reads the whole file in and produces a list of lines with trailing newlines. It then strips each newline, splits each row into a list of strings, skips the initial column, and calls float() on each entry, resulting in a list of lists. Whether that is acceptable depends on how important the first column is.
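If the first column does matter, a minimal sketch of the same no-csv-module approach that keeps it as a dict key might look like this (variable names are my own):
path = r'data.csv'
data = {}
with open(path, "r") as f:
    for row in f:
        cols = row.rstrip("\n").split(",")
        data[cols[0]] = [float(col) for col in cols[1:]]
# data['CW1'] == [0.0, -0.38, 2.04, 1.34, 0.76, 1.07, 0.98, 0.81, 0.92, 0.7, 0.64]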
In the following short program:
data = []
f = open('C:/tsg3.txt', 'r').read().split("\t")
for i in range(0, len(f)-1):
    [GeneID, Sym, Alias, Xref, Chromo, Cyto, Full_name, Gene_type, Desc, Nuc_seq, Pro_seq] = f[i]
I am getting a ValueError (need more than 4 values to unpack).
Obviously, I am doing something wrong since I am relatively new to Python.
Any help would be appreciated. I'm using Python 3.3.2.
Thanks.
You split the whole file by tabs, resulting in a single list of strings.
You then loop over that list, assigning f[i] (individual strings) to a long list of variables. Judging from the error message, one of those strings is only 4 characters long: unpacking a string assigns its individual characters, and that fails here because the number of characters doesn't match the number of variables.
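A minimal sketch of the difference, using made-up tab-separated contents rather than the real tsg3.txt:
text = "a1\tb1\tc1\na2\tb2\tc2\n"

whole = text.split("\t")
# ['a1', 'b1', 'c1\na2', 'b2', 'c2\n']  - the rows run together into one flat list

per_line = [line.split("\t") for line in text.splitlines()]
# [['a1', 'b1', 'c1'], ['a2', 'b2', 'c2']]  - one list of columns per row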
Most likely, you want to process a tab-delimited file. Use the csv module for such tasks:
import csv

with open('C:/tsg3.txt', 'rb') as f:
    reader = csv.reader(f, delimiter='\t')
    for row in reader:
        # `row` is a list of columns.
        pass
Because the file has headers, you can also use a csv.DictReader and use dictionaries instead (keyed with the headers):
with open('C:/tsg3.txt', 'rb') as f:
    reader = csv.DictReader(f, delimiter='\t')
    for row in reader:
        # `row` is a dictionary of columns, keyed by the headers.
        pass
Not all rows have all values; some appear to be missing Nucleotide_Sequence and Protein_Sequence columns.
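If those short rows cause trouble, one minimal sketch is to pad each row out to the expected width before unpacking; the count of eleven simply matches the eleven names unpacked in the code above:
import csv

expected = 11
with open('C:/tsg3.txt') as f:
    reader = csv.reader(f, delimiter='\t')
    for row in reader:
        row = row + [''] * (expected - len(row))  # pad short rows with empty strings
        # unpacking row into the eleven variable names will now always match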
For future reference, you can loop directly over a Python list; there is no need to use indices with range():
for i in f:
    # do something with the individual element of `f` assigned to `i` on each iteration
    pass
I have about 40 million lines of text to parse through and I want to treat each line as a split string and then ask for multiple slices (or subscripts, whatever they are called) using a list of numbers I generate in a method.
# ...
other_file = open('output.txt', 'w')
list = [1, 4, 5, 7, ...]
for line in open(input_file):
    other_file.write(line.split(',')[i for i in list])
The subscript can't take the generator I have shown, but I want to pull multiple entries out of the split line without having to iterate through the index list for every line.
I apologize, I know this is a simple answer but I just can't think of it. It's so late!
The csv module can help you:
import csv

reader = csv.reader(open(input_file, 'r'))
writer = csv.writer(open(output_file, 'w'))
fields = (1, 4, 5, 7, ...)
for row in reader:
    writer.writerow([row[i] for i in fields])
For further improvements, open files with context managers
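A minimal sketch of the same idea with context managers (the newline handling assumes Python 3):
import csv

fields = (1, 4, 5, 7)
with open(input_file, 'r', newline='') as inp, open(output_file, 'w', newline='') as out:
    reader = csv.reader(inp)
    writer = csv.writer(out)
    for row in reader:
        writer.writerow([row[i] for i in fields])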
Don't use list as a variable name - remember there is a builtin called list
other_file = open('output.txt', 'w')
lst = [1, 4, 5, 7, ...]
for line in open(input_file):
    fields = line.split(',')
    other_file.write(",".join(fields[i] for i in lst) + "\n")
For further improvement use context managers to open/close the files for you
from operator import itemgetter
from csv import reader, writer
fields = 1,4,5,7
row_filter = itemgetter(*fields)
with open('inp.txt', 'r') as inp:
    with open('out.txt', 'w') as out:
        writer(out).writerows(map(row_filter, reader(inp)))