I am trying to add extra columns to a csv file after processing an input csv file, but I am getting an extra new line added after each line in the output.
What's missing or wrong in the code below?
import csv

with open('test.csv', 'r') as infile:
    with open('test_out.csv', 'w') as outfile:
        reader = csv.reader(infile, delimiter=',')
        writer = csv.writer(outfile, delimiter=',')
        for row in reader:
            colad = row[5].rstrip('0123456789./ ')
            if colad == row[5]:
                col2ad = row[11]
            else:
                col2ad = row[5].split(' ')[-1]
            writer.writerow([row[0], colad, col2ad] + row[1:])
I am processing a huge csv file, so I would like to get rid of those extra lines.
I had the same problem on Windows (your OS as well, I presume?). The csv module and Windows in combination produce \r\r\n at the end of each line (so: a double newline).
You need to open the output file in binary mode:
with open('test_out.csv', 'wb') as outfile:
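Note that opening in binary mode is the Python 2 fix. On Python 3, csv.writer expects a text-mode file, and the documented way to avoid the extra blank lines is to open the output with newline='' instead. A sketch of the question's loop adapted that way (same column logic, only the open calls change):

import csv

with open('test.csv', 'r', newline='') as infile, \
        open('test_out.csv', 'w', newline='') as outfile:
    reader = csv.reader(infile, delimiter=',')
    writer = csv.writer(outfile, delimiter=',')
    for row in reader:
        # same row manipulation as in the question
        colad = row[5].rstrip('0123456789./ ')
        col2ad = row[11] if colad == row[5] else row[5].split(' ')[-1]
        writer.writerow([row[0], colad, col2ad] + row[1:])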
See also these other answers:
Python's CSV writer produces wrong line terminator
CSV in Python adding an extra carriage return
I have 200 files with dates in the file name. I would like to add the date from the file name as a new column in each file.
I created a macro in Python:
import pandas as pd
import os
import openpyxl
import csv

os.chdir(r'\\\\\\\')

for file_name in os.listdir(r'\\\\\\'):
    with open(file_name, 'r') as csvinput:
        reader = csv.reader(csvinput)
        all = []
        row = next(reader)
        row.append('FileName')
        all.append(row)
        for row in reader:
            row.append(file_name)
            all.append(row)

with open(file_name, 'w') as csvoutput:
    writer = csv.writer(csvoutput, lineterminator='\n')
    writer.writerows(all)

if file_name.endswith('.csv'):
    workbook = openpyxl.load_workbook(file_name)
    workbook.save(file_name)

csv_filename = pd.read_csv(r'\\\\\\')
csv_data = pd.read_csv(csv_filename, header=0)
csv_data['filename'] = csv_filename
Right now I see "InvalidFileException: File is not a zip file" and only the first file has the added column with the file name.
Can you please advise what I am doing wrong? BTW, I'm using Python 3.4.
Many thanks,
Lukasz
First problem: this section
with open(file_name, 'w') as csvoutput:
    writer = csv.writer(csvoutput, lineterminator='\n')
    writer.writerows(all)
should be indented so that it is inside the for loop. At the moment it is only executed once, after the loop has finished. This is why you only get one output file.
Second problem: the exception is probably caused by openpyxl.load_workbook(file_name). openpyxl can only open actual Excel files (which are zip archives with a different extension), not CSV files. Why do you want to open and save the file again anyway? I think you can just remove those three lines.
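For reference, a minimal sketch of the corrected loop under those two suggestions: the write block moved inside the for loop, the openpyxl and pandas lines dropped, and (my addition) a guard so only .csv files are touched. The folder path here is a hypothetical placeholder for the redacted path in the question:

import csv
import os

os.chdir(r'C:\data\csvfiles')  # hypothetical folder; replace with the real (redacted) path

for file_name in os.listdir('.'):
    if not file_name.endswith('.csv'):
        continue  # added guard: skip anything that is not a CSV
    with open(file_name, 'r', newline='') as csvinput:
        reader = csv.reader(csvinput)
        rows = [next(reader) + ['FileName']]   # header row plus the new column name
        for row in reader:
            rows.append(row + [file_name])     # data rows plus the file name
    # the write now happens inside the loop, once per file
    with open(file_name, 'w', newline='') as csvoutput:
        writer = csv.writer(csvoutput, lineterminator='\n')
        writer.writerows(rows)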
The raw ECG data that I have is in csv format. I need to convert it into a .txt file that contains only the ECG data. Can I get some help with the Python code for this?
csv_file = 'ECG_data_125Hz_Simulator_Patch_Normal_Sinus.csv'
txt_file = 'ECG_data_125Hz_Simulator_Patch_Normal_Sinus.txt'

import csv

with open(txt_file, "w") as my_output_file:
    with open(csv_file, "r") as my_input_file:
        # need to write data to the output file
    my_output_file.close()
The input ECG data looks like this:
[image: Raw_ECG_data]
What worked for me
import csv

csv_file = 'FL_insurance_sample.csv'
txt_file = 'ECG_data_125Hz_Simulator_Patch_Normal_Sinus.txt'

with open(txt_file, "w") as my_output_file:
    with open(csv_file, "r") as my_input_file:
        [my_output_file.write(" ".join(row) + '\n') for row in csv.reader(my_input_file)]
    my_output_file.close()
A few things:
You can open multiple files with the same context manager (with statement):
with open(csv_file, 'r') as input_file, open(txt_file, 'w') as output_file:
    ...
When using a context manager to handle files, there's no need to close the file explicitly; that's what the with statement does. It says "with the file open, do the following", so once the block ends, the file is closed.
You could do something like:
with open(csv_file, 'r') as input_file, open(txt_file, 'w') as output_file:
    for line in input_file:
        output_file.write(line)
... But as @MEdwin says, a csv can simply be renamed and the commas will no longer act as separators; it will just become a normal .txt file. You can rename a file in Python using os.rename():
import os

os.rename('file.csv', 'file.txt')
Finally, if you want to remove certain columns from the csv when writing to the txt file, you can use .split(). This lets you split each line on an identifier such as a comma into a list of strings. For example:
"Hello, this is a test".split(',')
>>> ["Hello", "this is a test"]
You can then just write certain indices from the list to the new file.
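For example, a small sketch of that idea, reusing the csv_file and txt_file names from above (the column indices are purely illustrative):

with open(csv_file, 'r') as input_file, open(txt_file, 'w') as output_file:
    for line in input_file:
        fields = line.rstrip('\n').split(',')
        # keep only selected columns, here the first and third
        output_file.write(' '.join([fields[0], fields[2]]) + '\n')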
For more info on deleting columns en masse, see this post
I must be missing something very simple here, but I've been hitting my head against the wall for a while and don't understand where the error is. I am trying to open a csv file and read the data. I am detecting the delimiter, then reading in the data with this code:
with open(filepath, 'r') as csvfile:
    dialect = csv.Sniffer().sniff(csvfile.read())
    delimiter = repr(dialect.delimiter)[1:-1]
    csvdata = [line.split(delimiter) for line in csvfile.readlines()]
However, my csvfile is being read as having no length. If I run:
print(sum(1 for line in csvfile))
The result is zero. If I run:
print(sum(1 for line in open(filepath, 'r')))
Then I get five lines, as expected. I've checked for name clashes by changing csvfile to other random names, but this does not change the result. Am I missing a step somewhere?
You need to move the file pointer back to the start of the file after sniffing it. You don't need to read the whole file in to do that, just enough to include a few rows:
import csv

with open(filepath, 'r') as f_input:
    dialect = csv.Sniffer().sniff(f_input.read(2048))
    f_input.seek(0)
    csv_input = csv.reader(f_input, dialect)
    csv_data = list(csv_input)
Also, the csv.reader() will do the splitting for you.
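As an aside, dialect.delimiter is already a plain one-character string, so the repr(...)[1:-1] round-trip in the question isn't needed; if you do want to split manually, line.split(dialect.delimiter) works directly.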
I'm trying to import a csv file which has # as the delimiter and \r\n as the line break. Inside one column there is data which also contains newlines, but as \n.
I'm able to read one line after another without problems, but I've got stuck using the csv lib (Python 3).
The below example throws a
Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode?
Is it possible to use the csv lib with multiple newline characters?
Thanks!
import csv

with open('../database.csv', newline='\r\n') as csvfile:
    file = csv.reader(csvfile, delimiter='#', quotechar='"')
    for row in file:
        print(row[3])
database.csv:
2202187#"645cc14115dbfcc4defb916280e8b3a1"#"cd2d3e434fb587db2e5c2134740b8192"#"{
Age = 22;
Salary = 242;
}
Please try this code. According to the Python 3.5.4 documentation, with newline=None, common line endings like '\r\n' are replaced by '\n'.
import csv

with open('../database.csv', newline=None) as csvfile:
    file = csv.reader(csvfile, delimiter='#', quotechar='"')
    for row in file:
        print(row[3])
I've replaced newline='\r\n' with newline=None.
You could also use the 'rU' mode, but it is deprecated.
...
with open('../database.csv', 'rU') as csvfile:
    ...
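As an alternative, the csv documentation recommends opening the file with newline='' so the reader does its own newline handling, including newlines embedded in quoted fields; the same snippet under that approach:

import csv

with open('../database.csv', newline='') as csvfile:
    file = csv.reader(csvfile, delimiter='#', quotechar='"')
    for row in file:
        print(row[3])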
import csv
in_txt = csv.reader(open(post.text, "rb"), delimiter = '\t')
out_csv = csv.writer("C:\Users\sptechsoft\Documents\source3.csv", 'wb')
out_csv.writerows(in_txt)
When executing the above code I am getting an IO error, and I need to save the CSV in a separate folder.
You don't need to open the file before passing it to csv.reader.
You can pass the file directly to csv.reader and it will work:
import csv
in_txt = csv.reader("post.text", "rb", delimiter = '\t')
out_csv = csv.writer("C:\Users\sptechsoft\Documents\source3.csv", 'wb')
out_csv.writerows(in_txt)
Try the following:
import csv
with open(post.text, "rb") as f_input, open(r"C:\Users\sptechsoft\Documents\source3.csv", "wb") as f_output:
    in_csv = csv.reader(f_input, delimiter='\t')
    out_csv = csv.writer(f_output)
    out_csv.writerows(in_csv)
csv.reader() and csv.writer() need either a list or a file object; they cannot open the file for you. Using with ensures the files are correctly closed automatically afterwards.
Also do not forget to prefix your path string with r to disable any string escaping due to the backslashes.
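If the IO error turns out to be caused by the destination folder not existing (an assumption, since the question doesn't show the traceback), the folder can be created first, for example:

import os

output_path = r"C:\Users\sptechsoft\Documents\source3.csv"
output_dir = os.path.dirname(output_path)
# create the destination folder if it does not exist yet
if not os.path.isdir(output_dir):
    os.makedirs(output_dir)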