I have this script:
import csv
import unicodedata
with open('output.csv', 'a', encoding='cp1252') as csvfile:
    writer = csv.writer(csvfile)
    with open('input.csv', 'r', encoding='cp1252') as csvfile:
        for row in csv.reader(csvfile):
            name_array = u''.join([c for c in unicodedata.normalize('NFKD', row[0].lower()) if (c.isalnum() or c.isspace()) if not unicodedata.combining(c)]).split()
            writer.writerow(name_array)
It creates a name breakdown from a CSV list of names. It works fine, but the output has an empty row after each name breakdown.
Sample input.csv:
"Lastname, Firstname Secondname"
"Lastname1 Lastname2, Firstname1"
Sample output.csv:
lastname,firstname,secondname
##### empty row ####
lastname1,lastname2,firstname1
How do I remove the empty row?
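In your csv.writer, pass the keyword argument lineterminator='\n', which should eliminate the extra empty line. Alternatively, open the output file with newline='', as the csv module documentation recommends, so that the '\r\n' the writer emits is not translated into '\r\r\n' on Windows.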
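As a minimal sketch of the newline='' variant (the temporary directory and the sample rows are only placeholders for illustration), the raw bytes written by csv.writer can be inspected to confirm that each row ends in a single '\r\n':

```python
import csv
import os
import tempfile

# Sample rows standing in for the name breakdowns from the question.
rows = [['lastname', 'firstname', 'secondname'],
        ['lastname1', 'lastname2', 'firstname1']]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'output.csv')  # hypothetical output location
    # newline='' stops the text layer from translating the '\r\n'
    # that csv.writer already appends to every row.
    with open(path, 'w', newline='', encoding='cp1252') as csvfile:
        csv.writer(csvfile).writerows(rows)
    # Read the file back as bytes to see the actual line endings.
    with open(path, 'rb') as f:
        raw = f.read()

print(raw)
```

If the file were opened without newline='' on Windows, the same bytes would contain '\r\r\n', which many programs display as an empty row.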
I want to append a column from the 'b.csv' file to the 'a.csv' file, but it only adds a single character and not the whole string. I tried searching on Google but found no answer. I want to put the column under the heading "number". This is my code:
from csv import reader, writer
f = open('b.csv')
default_text = f.read()
with open('a.csv', 'r') as read_obj, \
        open('output_1.csv', 'w', newline='') as write_obj:
    csv_reader = reader(read_obj)
    csv_writer = writer(write_obj)
    for row in csv_reader:
        row.append(default_text[8])
        csv_writer.writerow(row)
This is the info in 'a.csv'
name,age,course,school,number
Leo,18,BSIT,STI
Rommel,23,BSIT,STI
Gaby,33,BSIT,STI
Ranel,31,BSIT,STI
This is the info in 'b.csv'
1212121
1094534
1345684
1093245
You can concatenate the rows read from both CSV files and pass them straight to the writer:
import csv
from operator import concat
with open(r'a.csv') as f1, \
        open(r'b.csv') as f2, \
        open(r'output_1.csv', 'w', newline='') as out:
    f1_reader = csv.reader(f1)
    f2_reader = csv.reader(f2)
    writer = csv.writer(out)
    writer.writerow(next(f1_reader))  # write column names
    writer.writerows(map(concat, f1_reader, f2_reader))
So we initialize csv.reader() for both CSV files and csv.writer() for the output. As the first file (a.csv) contains the column names, we read them using next() and pass them to .writerow() to write them into the output without any modification. Then, using map(), we iterate over both readers simultaneously, applying operator.concat(), which concatenates the rows returned by the two readers. We can pass the result directly to .writerows() and let it consume the iterator returned by map().
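To see what map() and operator.concat() produce here, the same steps can be run on in-memory data. This is only a sketch: io.StringIO stands in for the real files, and the two data rows are taken from the question.

```python
import csv
import io
from operator import concat

# In-memory stand-ins for a.csv and b.csv.
a_csv = io.StringIO("name,age,course,school,number\n"
                    "Leo,18,BSIT,STI\n"
                    "Rommel,23,BSIT,STI\n")
b_csv = io.StringIO("1212121\n1094534\n")

a_reader = csv.reader(a_csv)
b_reader = csv.reader(b_csv)

header = next(a_reader)  # consume the column names from the first file
# Each row is a list, so concat(row_a, row_b) is simply row_a + row_b.
merged = list(map(concat, a_reader, b_reader))

print(header)
print(merged)
```

Note that map() stops as soon as the shorter of the two readers is exhausted, so any extra rows in the longer file are silently dropped.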
If pandas cannot be used, the Table helper from the convtools library is a convenient alternative.
from convtools.contrib.tables import Table
from convtools import conversion as c
(
    Table.from_csv("tmp/1.csv", header=True)
    # this step wouldn't be needed if your first file didn't have the
    # empty "number" column
    .drop("number")
    .zip(Table.from_csv("tmp/2.csv", header=["number"]))
    .into_csv("tmp/results.csv")
)
I want to replace the header row of a CSV file, text.csv.
header_list = ['column_1', 'column_2', 'column_3']
The header will look like this;
column_1, column_2, column_3
Here is my code;
import csv
with open('text.csv', 'w') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(header_list)
The header of the csv file was replaced correctly. However, the rest of the rows in the csv file were deleted. How do I replace only the header leaving the other rows intact?
I am using python v3.6
Here is a proper way to do it using the csv module.
csv.DictReader reads each row of a CSV file into a dict. It takes an optional fieldnames argument; when it is set, the reader uses those names as the keys and treats the file's original header as an ordinary data row. So all you need to do is read your CSV file with csv.DictReader and write the data with csv.DictWriter. Write the new header, then drop the first row from the reader, because it contains the old header. It makes sense to write the new data to a separate file.
import csv
header = ["column_1", "column_2", "column_3"]
with open('text.csv', 'r') as fp:
    reader = csv.DictReader(fp, fieldnames=header)
    # use newline='' to avoid adding new CR at end of line
    with open('output.csv', 'w', newline='') as fh:
        writer = csv.DictWriter(fh, fieldnames=reader.fieldnames)
        writer.writeheader()
        header_mapping = next(reader)
        writer.writerows(reader)
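The reason the first row has to be dropped can be seen in a small in-memory sketch. The sample data below is made up for illustration; io.StringIO stands in for text.csv.

```python
import csv
import io

# A tiny stand-in for text.csv: an old header followed by one data row.
src = io.StringIO("a,b,c\n1,2,3\n")

# With fieldnames given, DictReader does not consume the first line as a
# header; it comes back as an ordinary row keyed by the new names.
reader = csv.DictReader(src, fieldnames=["column_1", "column_2", "column_3"])
rows = list(reader)

print(rows[0])  # the old header, mapped onto the new column names
print(rows[1])  # the first real data row
```

Skipping rows[0] (which is what the next(reader) call does in the answer) leaves only the data rows to be written under the new header.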
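Use this, which rewrites the file in place with the new header instead of the old first line:
import csv
header_list = ['column_1', 'column_2', 'column_3']
mystring = ",".join(header_list)
def replace_header(filename, line):
    with open(filename, 'r+') as csvfile:
        csvfile.readline()  # skip the old header
        content = csvfile.read()  # keep the remaining rows
        csvfile.seek(0, 0)
        csvfile.write(line.rstrip('\r\n') + '\n' + content)
        csvfile.truncate()  # drop leftover text if the new header is shorter
replace_header("text.csv", mystring)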
I'm trying to make the first row/header lowercase, in multiple csv files in a directory using python. The code and error are below. Is there any way to fix the code or some other way?
import csv
import glob
path = (r'C:\Users\Documents')
for fname in glob(path):
    with open(fname, newline='') as f:
        reader = csv.reader(f)
        row1 = next(reader)
        for row1 in reader:
            data = [row1.lower() for row1 in row1]
            os.rename(row1, data)
The error is:
TypeError: rename: src should be string, bytes or os.PathLike, not list
I think you're getting rows and columns mixed-up. Here's some untested code that does what you want, I think:
import csv
from glob import glob
path = (r'C:\Users\Documents\*.csv') # Note wildcard character added for glob().
for fname in glob(path):
    with open(fname, newline='') as f:
        reader = csv.reader(f)
        header = next(reader)  # Get the header row.
        header = [column.lower() for column in header]  # Lowercase the headings.
        rows = [header] + list(reader)  # Read the rest of the rows.
    with open(fname, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerows(rows)  # Write new header & original rows back to file.
I have written code that applies the given regex to every postcode in the 'import_data.csv' file. It then generates a new CSV file, 'failed_validation.csv', which contains all the postcodes that fail validation. Both files have the following structure:
row_id postcode
134534 AABC 123
243534 AACD 4PQ
534345 QpCD 3DR
... ...
Following is my code:
import csv
import re
regex = r"(GIR\s0AA)|((([A-PR-UWYZ][0-9][0-9]?)|(([A-PR-UWYZ][A-HK-Y][0-9]((BR|FY|HA|HD|HG|HR|HS|HX|JE|LD|SM|SR|WC|WN|ZE)[0-9])[0-9])|([A-PR-UWYZ][A-HK-Y](AB|LL|SO)[0-9])|(WC[0-9][A-Z])|(([A-PR-UWYZ][0-9][A-HJKPSTUW])|([A-PR-UWYZ][A-HK-Y][0-9][ABEHMNPRVWXY]))))\s[0-9][ABD-HJLNP-UW-Z]{2})"
codes = []
with open('../import_data.csv','r') as f:
    r = csv.reader(f, delimiter=',')
    for row in r:
        if not(re.findall(regex, row[1])):
            codes.append([row[0],row[1]])
with open('failed_validation.csv','w',newline='') as fp:
    a = csv.writer(fp)
    a.writerows(codes)
The code works fine but what I actually want is the postcodes in the new file need to be ordered as per the row_id, in ascending numeric order. I know how to generate a new file with Python, but I don't know how to order the data inside that file in ascending numeric order.
This will do it and preserve the header row:
import csv
import re
regex = r"(GIR\s0AA)|((([A-PR-UWYZ][0-9][0-9]?)|(([A-PR-UWYZ][A-HK-Y][0-9]((BR|FY|HA|HD|HG|HR|HS|HX|JE|LD|SM|SR|WC|WN|ZE)[0-9])[0-9])|([A-PR-UWYZ][A-HK-Y](AB|LL|SO)[0-9])|(WC[0-9][A-Z])|(([A-PR-UWYZ][0-9][A-HJKPSTUW])|([A-PR-UWYZ][A-HK-Y][0-9][ABEHMNPRVWXY]))))\s[0-9][ABD-HJLNP-UW-Z]{2})"
codes = []
with open('import_data.csv', 'r', newline='') as fp:
    reader = csv.reader(fp, delimiter=',')
    header = next(reader)
    for row in reader:
        if not re.findall(regex, row[1]):
            codes.append([row[0],row[1]])
with open('failed_validation.csv', 'w', newline='') as fp:
    writer = csv.writer(fp)
    writer.writerow(header)
    # row_id is read as a string, so convert it to int for a numeric sort
    writer.writerows(sorted(codes, key=lambda row: int(row[0])))
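Sort your codes list before writing to the file. The header row fails the regex too, so it is the first entry in codes:
header = codes[0]
# row_id is read as a string, so convert it to int for a numeric sort
codes = sorted(codes[1:], key=lambda row: int(row[0]))
with open('failed_validation.csv','w',newline='') as fp:
    a = csv.writer(fp)
    a.writerow(header)
    a.writerows(codes)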
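Because csv.reader returns every field as a string, sorting the rows without a key compares the row_id values as text rather than as numbers. A minimal sketch with made-up rows shows the difference:

```python
# Hypothetical failed rows: [row_id, postcode], both strings as read by csv.reader.
codes = [['1000', 'AABC 123'], ['243', 'AACD 4PQ'], ['99', 'QpCD 3DR']]

# Default sort: compares the row_id strings character by character.
as_text = sorted(codes)

# Numeric sort: convert row_id to int just for the comparison.
as_numbers = sorted(codes, key=lambda row: int(row[0]))

print([row[0] for row in as_text])     # ['1000', '243', '99']
print([row[0] for row in as_numbers])  # ['99', '243', '1000']
```

The two orders only agree when all row_id values have the same number of digits.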
I have some data that needs to be written to a CSV file. The data is as follows
A ,B ,C
a1,a2 ,b1 ,c1
a2,a4 ,b3 ,ct
The first column has comma inside it. The entire data is in a list that I'd like to write to a CSV file, delimited by commas and without disturbing the data in column A. How can I do that? Mentioning delimiter = ',' splits it into four columns on the whole.
Just use the csv.writer from the csv module.
import csv
data = [['A','B','C'],
        ['a1,a2','b1','c1'],
        ['a2,a4','b3','ct']]
fname = "myfile.csv"
# In Python 3, open in text mode with newline='' (use 'wb' only in Python 2)
with open(fname, 'w', newline='') as f:
    writer = csv.writer(f)
    for row in data:
        writer.writerow(row)
https://docs.python.org/library/csv.html#csv.writer
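As a quick check that the embedded comma survives, the rows can be written to an in-memory buffer and read back. This is a sketch only; io.StringIO is used here in place of a real file.

```python
import csv
import io

data = [['A', 'B', 'C'],
        ['a1,a2', 'b1', 'c1'],
        ['a2,a4', 'b3', 'ct']]

# Write to an in-memory buffer instead of myfile.csv.
buffer = io.StringIO(newline='')
csv.writer(buffer).writerows(data)
text = buffer.getvalue()

# Read the text back: quoted fields are parsed as a single column.
round_trip = list(csv.reader(io.StringIO(text, newline='')))

print(text)
print(round_trip == data)
```

With the default QUOTE_MINIMAL setting, the writer quotes only the fields that contain the delimiter, the quote character, or a line break, so the first column stays intact while the others are written bare.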
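Without the csv module, you have to quote any field that contains a comma yourself, because the ',' in the first column is part of your data. This works as long as no field contains a double quote or a newline:
with open('myfile.csv', 'w') as f:
    for row in data:
        f.write(','.join('"%s"' % field if ',' in field else field for field in row))
        f.write('\n')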
You could try the below.
Code:
import csv
import re
with open('infile.csv', 'r') as f:
    lst = []
    for line in f:
        lst.append(re.findall(r',?(\S+)', line))
with open('outfile.csv', 'w', newline='') as w:
    writer = csv.writer(w)
    for row in lst:
        writer.writerow(row)
Output:
A,B,C
"a1,a2",b1,c1
"a2,a4",b3,ct