Hello, I want to write my list to a .csv file.
This is my code:
def writeCsv(self, content):
    filename = 'data.csv'
    f = open(filename, 'w')
    header = 'index;title;img;link;views;brand;\n'
    f.write(header)
    #print(len(content))
    i = 0
    for c in content:
        f.write(c['index'] + ";" + c['title'] + ';' + c['img'] + ';' + c['link'] + ';' + c['views'] + ";\n")
        #i += 1
        #print(i)
    f.close()
My problem is that len(content) returns 72 but the loop only runs 21 times. (I print i on every iteration of the loop, and my .csv file only has 21 lines.)
Is there some limit or parameter of the write() function that I am missing?
Update: I used Sayse's solution but added encoding='utf-8'. The problem was an illegal character in line 22.
As noted in the comments, the only thing that could cause this is malformed data (probably line 22) combined with your code catching a broad exception.
Regardless, you should just use the csv module's DictWriter:
from csv import DictWriter

def writeCsv(self, content):
    filename = 'data.csv'
    with open(filename, 'w') as f:
        field_names = ["index","title","img","link","views","brand"]
        dict_writer = DictWriter(f, field_names, delimiter=";")
        dict_writer.writeheader()
        dict_writer.writerows(content)
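The update in the question mentions adding encoding='utf-8'. A minimal sketch of how that might be combined with DictWriter, assuming Python 3 and that every item in content is a dict. The function name write_rows and the restval/extrasaction settings are illustrative choices, not part of the original answer:

```python
import csv

FIELD_NAMES = ["index", "title", "img", "link", "views", "brand"]

def write_rows(filename, content):
    """Write a list of dicts to a semicolon-delimited file as UTF-8."""
    # newline='' lets the csv module control the line endings itself
    with open(filename, "w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(
            f,
            FIELD_NAMES,
            delimiter=";",
            restval="",             # fill columns that a row does not have
            extrasaction="ignore",  # drop keys that are not in FIELD_NAMES
        )
        writer.writeheader()
        writer.writerows(content)
```

With an explicit encoding, a character that the platform's default encoding cannot represent no longer aborts the write partway through the list.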
Try this perhaps:
def writeCsv(self, content):
    filename = 'data.csv'
    f = open(filename, 'w')
    header = 'index;title;img;link;views;brand'
    f.write(header)
    #print(len(content))
    i = 0
    for c in content:
        try:
            f.write(";\n"+";".join([c[k] for k in header.split(";")]))
        except KeyError:
            print(c)
        i += 1
        print(i)
    f.write(";")
    f.close()
Using the header for your keys is cleaner in my opinion, and wrapping the explicit key access in error handling could help you get past some snags. Also, given how you are writing your output file, you will have an empty line at the end of it. Presuming that you have amalgamated your data from a number of similar files, you likely have empty elements in your list.
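If empty or incomplete elements are the suspicion, one way to check before writing anything is to split the list up front. This is only a sketch; split_valid_rows is a hypothetical helper, and it assumes each element is a dict (or something falsy such as None or an empty dict):

```python
def split_valid_rows(content, required_keys):
    """Separate rows that have every required key from rows that do not."""
    valid, rejected = [], []
    for row in content:
        # An empty dict (or None) fails this check as well
        if row and all(key in row for key in required_keys):
            valid.append(row)
        else:
            rejected.append(row)
    return valid, rejected
```

Printing the rejected list shows which elements would have raised KeyError in the loop above.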
def generate_daily_totals(input_filename, output_filename):
    """result in the creation of a file blahout.txt containing the two lines"""
    with open(input_filename, 'r') as reader, open(output_filename, 'w') as writer: #updated
        for line in reader: #updated
            pieces = line.split(',')
            date = pieces[0]
            rainfall = pieces[1:] #each data in a line
            total_rainfall = 0
            for data in rainfall:
                pure_data = data.rstrip()
                total_rainfall = total_rainfall + float(pure_data)
            writer.write(date + "=" + '{:.2f}'.format(total_rainfall) + '\n') #updated
            #print(date, "=", '{:.2f}'.format(total_rainfall)) #two decimal point format,

generate_daily_totals('data60.txt', 'totals60.txt')
checker = open('totals60.txt')
print(checker.read())
checker.close()
The original program, which only reads a file, runs well, but I am required to change it so that it writes the result to a file. I am confused because the write method accepts only strings; does that mean only the print call can be replaced by write? This is the first time I have tried to use the write method. Thanks!
EDIT: the code above has been updated following blhsing's instructions, which helped a lot! But it still does not run correctly, because the for loop gets skipped for some reason. Suggestions would be appreciated!
expected output:
2006-04-10 = 1399.46
2006-04-11 = 2822.36
2006-04-12 = 2803.81
2006-04-13 = 1622.71
2006-04-14 = 3119.60
2006-04-15 = 2256.14
2006-04-16 = 3120.05
2006-04-20 = 1488.00
You should open both the input file for reading, and the output file for writing, so change:
with open(input_filename, 'w') as writer:
    for line in writer: # error not readable
to:
with open(input_filename, 'r') as reader, open(output_filename, 'w') as writer:
    for line in reader:
Also, unlike the print function, the write method of a file object does not automatically add a trailing newline character to the output, so you would have to add it on your own.
Change:
writer.write(date + "=" + '{:.2f}'.format(total_rainfall))
to:
writer.write(date + "=" + '{:.2f}'.format(total_rainfall) + '\n')
or you can use print with the outputting file object specified as the file argument:
print(date, "=", '{:.2f}'.format(total_rainfall), file=writer)
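The difference between the two calls can be seen without touching the disk. The sketch below uses io.StringIO objects in place of real files, and the input values are made up purely for illustration; daily_total_line is a hypothetical helper name:

```python
import io

def daily_total_line(line):
    """Turn 'date,v1,v2,...' into 'date=total' with two decimals."""
    pieces = line.strip().split(",")
    total = sum(float(value) for value in pieces[1:])
    return "{}={:.2f}".format(pieces[0], total)

# In-memory stand-ins for the input and output files (sample data)
reader = io.StringIO("2006-04-10,1.5,2.25\n2006-04-11,3\n")
writer = io.StringIO()
for line in reader:
    # write() only accepts a string and needs the newline added explicitly
    writer.write(daily_total_line(line) + "\n")

printed = io.StringIO()
print("2006-04-10=3.75", file=printed)  # print() appends "\n" itself
```

Both approaches end up producing the same text; the choice is mostly a matter of whether you prefer to build the full string yourself.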
I am using Python 2.4.4 (old machine, can't do anything about it) on a UNIX machine. I am extremely new to Python and programming, and have never used a UNIX machine before. This is what I am trying to do:
1. Extract a single sequence from a FASTA file (proteins + nucleotides) to a temporary text file.
2. Give this temporary file to a program called 'threader'.
3. Append the output from threader (called tempresult.out) to a file called results.out.
4. Remove the temporary file.
5. Remove the tempresult.out file.
6. Repeat using the next FASTA sequence.
Here is my code so far:
import os
from itertools import groupby

input_file = open('controls.txt', 'r')
output_file = open('results.out', 'a')

def fasta_parser(fasta_name):
    input = fasta_name
    parse = (x[1] for x in groupby(input, lambda line: line[0] == ">"))
    for header in parse:
        header = header.next()[0:].strip()
        seq = "\n".join(s.strip() for s in parse.next())
        yield (header, '\n', seq)

parsedfile = fasta_parser(input_file)
mylist = list(parsedfile)
index = 0
while index < len(mylist):
    temp_file = open('temp.txt', 'a+')
    temp_file.write(' '.join(mylist[index]))
    os.system('threader' + ' temp.txt' + ' tempresult.out' + ' structures.txt')
    os.remove('temp.txt')
    f = open('tempresult.out', 'r')
    data = str(f.read())
    output_file.write(data)
    os.remove('tempresult.out')
    index +=1
output_file.close()
temp_file.close()
input_file.close()
When I run this script I get the error 'Segmentation Fault'. From what I gather this is to do with me messing with memory I shouldn't be messing with (???). I assume it is something to do with the temporary files but I have no idea how I would get around this.
Any help would be much appreciated!
Thanks!
Update 1:
Threader works fine when I give it the same sequence multiple times like this:
import os

input_file = open('control.txt', 'r')
output_file = open('results.out', 'a')
x=0
while x<3:
    os.system('threader' + ' control.txt' + ' tempresult.out' + ' structures.txt')
    f = open('tempresult.out', 'r')
    data = str(f.read())
    output_file.write(data)
    os.remove('result.out')
    x += 1
output_file.close()
input_file.close()
Update 2: In case someone else gets this error: I forgot to close temp.txt before invoking the threader program.
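For readers on a current interpreter, the fix from Update 2 might be sketched as follows. This assumes Python 3.5 or later (subprocess.run does not exist on the 2.4.4 machine from the question), and the external program is replaced by a Python one-liner that simply echoes the file, since threader itself is not available here. run_on_sequence and echo_file are hypothetical names:

```python
import os
import subprocess
import sys
import tempfile

def run_on_sequence(text, command):
    """Write text to a temp file, close it, then pass its path to command."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        # Leaving the with-block flushes and closes the file before
        # the external program gets to read it.
        with os.fdopen(fd, "w") as temp_file:
            temp_file.write(text)
        result = subprocess.run(
            command + [path], stdout=subprocess.PIPE, universal_newlines=True
        )
        return result.stdout
    finally:
        os.remove(path)

# Stand-in for the external program: a Python one-liner that echoes the file
echo_file = [sys.executable, "-c", "import sys; print(open(sys.argv[1]).read())"]
print(run_on_sequence(">seq1\nACGT", echo_file))
```

The important part is only that the write happens inside a with-block (or is followed by an explicit close()) before the other program is started.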
Suppose I have many different text files in the same directory, with the content structured as shown below:
File a.txt:
HEADER_X;HEADER_Y;HEADER_Z
a_value;a_value;a_value
a_value;a_value;a_value
File b.txt:
HEADER_X;HEADER_Y;HEADER_Z
b_value;b_value;b_value
b_value;b_value;b_value
File c.txt:
HEADER_X;HEADER_Y;HEADER_Z
c_value;c_value;c_value
c_value;c_value;c_value
File d.txt: ...
I'd like to merge all of the txt files into one by appending the content of each file after the final row of the previous file. See below:
File combined.txt:
HEADER_X;HEADER_Y;HEADER_Z
a_value;a_value;a_value
a_value;a_value;a_value
b_value;b_value;b_value
b_value;b_value;b_value
c_value;c_value;c_value
c_value;c_value;c_value
...
How can I do this in Python?
Assumptions:
- all txt files are located in the same folder
- all txt files have same headers
- all txt files have same number of columns
- all txt files have different number of rows
Use the csv module. Something like this:
import csv

# Python 3 file modes; on Python 2 use 'ab' and 'rb' instead of newline=''
with open('output.csv', 'a', newline='') as output:
    writer = csv.writer(output, delimiter=";")
    with open('a.txt', newline='') as csvfile:
        reader = csv.reader(csvfile, delimiter=";")
        next(reader)  # this is to skip the header
        for row in reader:
            writer.writerow(row)
If you didn't want to hardcode every file (say you have many more than three) you could do:
import csv
from os import listdir
from os.path import isfile, join

mypath = '.'  # folder that holds the .txt files (placeholder)
onlyfiles = [f for f in listdir(mypath)
             if isfile(join(mypath, f)) and f.endswith('.txt')]
with open('output.csv', 'a', newline='') as output:
    writer = csv.writer(output, delimiter=";")
    for aFile in onlyfiles:
        with open(join(mypath, aFile), newline='') as csvfile:
            reader = csv.reader(csvfile, delimiter=";")
            next(reader, None)  # this is to skip the header
            for row in reader:
                writer.writerow(row)
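The snippets above skip the header of every file, so the combined file ends up without one. A sketch that keeps the header exactly once, as in the expected combined.txt, could look like this. It assumes Python 3 and that all files share the same header, as stated in the question; merge_files is a hypothetical name:

```python
import csv

def merge_files(paths, out_path, delimiter=";"):
    """Concatenate delimited files, writing the shared header only once."""
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter=delimiter)
        header_written = False
        for path in paths:
            with open(path, newline="") as source:
                reader = csv.reader(source, delimiter=delimiter)
                header = next(reader, None)  # None for an empty file
                if header is None:
                    continue
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                writer.writerows(reader)
```

It could be called as merge_files(sorted(glob.glob("*.txt")), "combined.csv") after importing glob; giving the output a different extension keeps it from being picked up as an input on a later run.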
I managed to do something that seems to work (at least in the cases I tested).
This will parse all files, get all the headers and format the values on each line of each file to add ";" according to the headers present/absent on that file.
headers = []
values = []
files = ("csv0.txt", "csv1.txt")#put the files you want to parse here

#read the files a first time, just to get the headers
for file_name in files:
    file = open(file_name, 'r')
    first_line = True
    for line in file:
        if first_line:
            first_line = False
            for header in line.strip().split(";"):
                if header not in headers:
                    headers.append(header)
        else:
            break
    file.close()
headers = sorted(headers)

#read a second time to get the values
file_number = 0
for file_name in files:
    file = open(file_name, 'r')
    file_headers = []
    first_line = True
    corresponding_indexes = []
    values.append([])
    for line in file:
        if first_line:
            first_line = False
            index = 0
            for header in line.strip().split(";"):
                while headers[index] != header:
                    index += 1
                corresponding_indexes.append(index)
        else:
            line_values = line.strip().split(";")
            current_index = 0
            values_str = ""
            for value in line_values:
                #this part write the values with ";" added for the headers not in this file
                while current_index not in corresponding_indexes:
                    current_index += 1
                    values_str += ";"
                values_str += value + ";"
                current_index += 1
            values_str = values_str[:-1] #we remove the last ";" (useless)
            values[file_number].append(values_str)
    file_number += 1
    file.close()

#and now we write the output file with all headers and values
headers_str = ""
for header in headers:
    headers_str += header + ";"
headers_str = headers_str[:-1]
output_file = open("output.txt", 'w')
output_file.write(headers_str + "\n")
for file_values in values:
    for values_line in file_values:
        output_file.write(values_line + "\n")
output_file.close()
If you have any questions, feel free to ask.
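The same idea (collect the union of all headers, then leave a cell empty wherever a file lacks that column) can probably be expressed more compactly with csv.DictReader and csv.DictWriter. The following is a sketch under the assumptions that Python 3 is used and that no data row has more fields than its header; merge_with_all_headers is a hypothetical name:

```python
import csv

def merge_with_all_headers(paths, out_path, delimiter=";"):
    """Merge files whose headers differ; absent columns are left empty."""
    rows, headers = [], []
    for path in paths:
        with open(path, newline="") as source:
            reader = csv.DictReader(source, delimiter=delimiter)
            # fieldnames is None for an empty file
            for name in reader.fieldnames or []:
                if name not in headers:
                    headers.append(name)
            rows.extend(reader)
    headers.sort()
    with open(out_path, "w", newline="") as out:
        # restval fills in the columns a given row does not have
        writer = csv.DictWriter(out, headers, delimiter=delimiter, restval="")
        writer.writeheader()
        writer.writerows(rows)
```

Because each row is looked up by column name, this version does not depend on the columns appearing in sorted order within a file.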
Here is my code for reading individual cells of one CSV file. But I want to read multiple CSV files, one by one, from a .txt file in which the CSV file paths are listed.
import csv

ifile = open ("C:\Users\BKA4ABT\Desktop\Test_Specification\RDBI.csv", "rb")
data = list(csv.reader(ifile, delimiter = ';'))
REQ = []
RES = []
n = len(data)
for i in range(n):
    x = data[i][1]
    y = data[i][2]
    REQ.append (x)
    RES.append (y)
    i += 1
for j in range(2,n):
    try:
        if REQ[j] != '' and RES[j]!= '': # ignore blank cell
            print REQ[j], ' ', RES[j]
    except:
        pass
    j += 1
And the CSV file paths are stored in a .txt file like this:
C:\Desktop\Test_Specification\RDBI.csv
C:\Desktop\Test_Specification\ECUreset.csv
C:\Desktop\Test_Specification\RDTC.csv
and so on.
You can read stuff stored in files into variables. And you can use variables with strings in them anywhere you can use a literal string. So...
with open('mytxtfile.txt', 'r') as txt_file:
    for line in txt_file:
        file_name = line.strip()  # strip() removes the trailing newline
        ifile = open(file_name, 'rb')
        # ... the rest of your code goes here
Maybe we can fix this up a little...
import csv

with open('mytxtfile.txt', 'r') as txt_file:
    for line in txt_file:
        file_name = line.strip()
        # Python 2: files for the csv module are opened in binary mode
        with open(file_name, 'rb') as ifile:
            csv_file = csv.reader(ifile, delimiter=';')
            next(csv_file, None)  # skip header row
            for record in csv_file:
                req = record[1]
                res = record[2]
                if len(req + res):
                    print req, ' ', res
You just need to add a loop that reads the file containing your list of file paths, before your first open statement. For example:
from __future__ import with_statement

with open("myfile_which_contains_file_path.txt") as f:
    for line in f:
        ifile = open(line.strip(), 'rb')  # strip the trailing newline
        # here the rest of your code
You need to use a raw string because your path contains backslashes (\).
import csv

# Raw string: the backslashes are kept literally.
# file_list should be the path of the .txt file that lists the CSV paths.
file_list = r"C:\Users\BKA4ABT\Desktop\Test_Specification\RDBI.csv"
with open(file_list) as f:
    for line in f:
        with open(line.strip(), 'rb') as the_file:
            reader = csv.reader(the_file, delimiter=';')
            for row in reader:
                req,res = row[1:3]
                if req and res:
                    print('{0} {1}'.format(req, res))
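The answers in this thread target Python 2. On Python 3 the same loop might be written as the generator below. It is a sketch that assumes one header row per CSV file and that the request and response cells are in the second and third columns, as in the question; iter_request_response is a hypothetical name:

```python
import csv

def iter_request_response(list_path, delimiter=";"):
    """Yield (request, response) pairs from every CSV named in list_path."""
    with open(list_path) as path_list:
        for line in path_list:
            csv_path = line.strip()
            if not csv_path:
                continue  # skip blank lines in the list file
            with open(csv_path, newline="") as csv_file:
                reader = csv.reader(csv_file, delimiter=delimiter)
                next(reader, None)  # skip the header row
                for row in reader:
                    if len(row) < 3:
                        continue  # row too short to hold both cells
                    req, res = row[1:3]
                    if req and res:
                        yield req, res
```

Iterating over iter_request_response("mytxtfile.txt") and printing each pair reproduces the output of the original script across all listed files.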
My Python program loops through a bunch of CSV files, reads them, and writes specific columns from each file to another CSV file. While the program runs, I can see the files being written correctly, but once the program has finished, all the files I've just written become empty.
The solution in all the other similar threads seems to be closing the file you write to properly, but I can't figure out what I'm doing wrong. Anyone?
import os
import csv

def ensure_dir(f):
    d = os.path.dirname(f)
    if not os.path.exists(d):
        os.makedirs(d)

readpath = os.path.join("d:\\", "project")
savepath=os.path.join("d:\\", "save")
ensure_dir(savepath)
contents_1=os.listdir(readpath)
for i in contents_1[1:len(contents_1)]:
    readpath_2=os.path.join(readpath, i)
    if os.path.isdir(readpath_2)== True :
        contents_2=os.listdir(readpath_2)
        for i in contents_2:
            readpath_3=os.path.join(readpath_2, i)
            if os.path.isfile(readpath_3)== True :
                savefile=savepath + "\\" + i
                savefile = open(savefile, 'wb')
                writer = csv.writer(savefile, delimiter=';')
                readfile=open(readpath_3, 'rb')
                reader = csv.reader(readfile, delimiter=';')
                try:
                    for row in reader:
                        writer.writerow([row[0], row[3]])
                except:
                    print(i)
                finally:
                    savefile.close()
                    readfile.close()
savefile=savepath + "\\" + i is the error. If both "d:\\project\a\x.csv" and "d:\\project\b\x.csv" exist, then you will write to savepath + "\\" + i more than once. If the second path has an empty "x.csv", it will overwrite the earlier result with an empty file.
Try this instead:
import os
import csv

def ensure_dir(f):
    d = os.path.dirname(f)
    if not os.path.exists(d):
        os.makedirs(d)

readpath = os.path.join("d:\\", "project")
savepath = os.path.join("d:\\", "save")
ensure_dir(savepath)
for dname in os.listdir(readpath)[1:]:
    readpath_2 = os.path.join(readpath, dname)
    if not os.path.isdir(readpath_2):
        continue
    for fname in os.listdir(readpath_2):
        fullfname = os.path.join(readpath_2, fname)
        if not os.path.isfile(fullfname):
            continue
        # prefix with the directory name so equal file names cannot collide
        savefile = open(savepath + "\\" + dname + "_" + fname, 'wb')
        writer = csv.writer(savefile, delimiter=';')
        readfile = open(fullfname, 'rb')
        reader = csv.reader(readfile, delimiter=';')
        try:
            for row in reader:
                writer.writerow([row[0], row[3]])
        except:
            print(fullfname)
        finally:
            savefile.close()
            readfile.close()
This code could be greatly improved with os.walk
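As a rough idea of what an os.walk version might look like, here is a sketch for Python 3. It is not the original poster's code: copy_columns is a hypothetical name, it also processes files that sit directly in the top folder, it silently skips rows that are too short, and it assumes save_root is not located inside read_root:

```python
import csv
import os

def copy_columns(read_root, save_root, columns=(0, 3), delimiter=";"):
    """Copy selected columns of every file under read_root into save_root."""
    os.makedirs(save_root, exist_ok=True)
    for folder, _subdirs, file_names in os.walk(read_root):
        # Prefix with the relative folder so equal file names do not collide
        relative = os.path.relpath(folder, read_root)
        prefix = "" if relative == "." else relative.replace(os.sep, "_") + "_"
        for name in file_names:
            source_path = os.path.join(folder, name)
            target_path = os.path.join(save_root, prefix + name)
            with open(source_path, newline="") as source, \
                    open(target_path, "w", newline="") as target:
                writer = csv.writer(target, delimiter=delimiter)
                for row in csv.reader(source, delimiter=delimiter):
                    if len(row) > max(columns):
                        writer.writerow([row[i] for i in columns])
```

Using with-blocks for both files also removes the need for the try/finally pair, since the files are closed when the block is left.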
Quoting from the Python documentation:
If csvfile is a file object, it must be opened with the ‘b’ flag on platforms where that makes a difference.
Change the 'w' and 'r' flags to 'wb' and 'rb'.
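The quoted advice applies to Python 2. On Python 3 the csv documentation instead recommends opening the file in text mode with newline=''. A minimal sketch of the Python 3 counterpart, with write_pairs as a hypothetical helper name:

```python
import csv

def write_pairs(path, rows, delimiter=";"):
    """Python 3 counterpart of opening with 'wb' for the csv module."""
    # Text mode plus newline='' stops Python from translating the
    # '\r\n' line endings that csv.writer emits by default.
    with open(path, "w", newline="") as target:
        csv.writer(target, delimiter=delimiter).writerows(rows)
```

Without newline='', the same code tends to produce an extra blank line after every row on Windows.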
(1) Your outer loop AND your inner loop both use i as the loop variable. This has no hope of (a) being understood by a human (b) working properly.
(2) except: print(i) ... What??? I'd suggest you remove the try/except and fix any bugs that you come across.