Separate data with a comma CSV Python - python

I have some data that needs to be written to a CSV file. The data is as follows:
A ,B ,C
a1,a2 ,b1 ,c1
a2,a4 ,b3 ,ct
The first column has a comma inside it. The entire data set is in a list that I'd like to write to a CSV file, delimited by commas and without disturbing the data in column A. How can I do that? Specifying delimiter=',' splits each data row into four columns.

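Just use csv.writer from the csv module. It quotes any field that contains the delimiter, so column A stays intact.
import csv

data = [['A', 'B', 'C'],
        ['a1,a2', 'b1', 'c1'],
        ['a2,a4', 'b3', 'ct']]
fname = "myfile.csv"
# Python 3: text mode with newline='' (on Python 2, use mode 'wb' instead)
with open(fname, 'w', newline='') as f:
    writer = csv.writer(f)
    for row in data:
        writer.writerow(row)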
https://docs.python.org/library/csv.html#csv.writer
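To see what csv.writer does with the embedded comma, here is a minimal sketch that writes the same rows to an in-memory buffer and reads them back. Using io.StringIO instead of a real file is my substitution, purely so the result is easy to inspect; the variable names are illustrative.

```python
import csv
import io

rows = [["A", "B", "C"],
        ["a1,a2", "b1", "c1"],
        ["a2,a4", "b3", "ct"]]

# Write to an in-memory buffer (assumption: stands in for a real file).
buf = io.StringIO()
csv.writer(buf).writerows(rows)
text = buf.getvalue()

# Read the text back; each quoted field comes back as a single value.
round_trip = list(csv.reader(io.StringIO(text)))
print(text)
print(round_trip == rows)
```

The fields containing a comma are wrapped in double quotes on output, which is why the reader can reassemble three columns per row.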

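If you only need the rows written out as text, you can skip the csv module and join the fields yourself. Note that the comma inside column A is written unquoted, so a CSV parser reading the file back will see four fields per data row:
with open('myfile.csv', 'w') as f:
    for row in data:
        f.write(', '.join(row))
        f.write('\n')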

You could try the below.
Code:
import csv
import re
with open('infile.csv', 'r') as f:
    lst = []
    for line in f:
        lst.append(re.findall(r',?(\S+)', line))
with open('outfile.csv', 'w', newline='') as w:
    writer = csv.writer(w)
    for row in lst:
        writer.writerow(row)
Output:
A,B,C
"a1,a2",b1,c1
"a2,a4",b3,ct

Related

Replace csv header without deleting the other rows

I want to replace the header row of a CSV file, text.csv.
header_list = ['column_1', 'column_2', 'column_3']
The header should look like this:
column_1, column_2, column_3
Here is my code:
import csv
with open('text.csv', 'w') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(header_list)
The header of the csv file was replaced correctly. However, the rest of the rows in the csv file were deleted. How do I replace only the header leaving the other rows intact?
I am using python v3.6
Here is a proper way to do it using the csv module.
csv.DictReader reads each row of a CSV file into a dict. It takes an optional fieldnames argument; when set, it applies your custom header and treats the file's original header as an ordinary data row. So all you need to do is read your CSV file with csv.DictReader and write the data with csv.DictWriter. You have to drop the first row from the reader, because it contains the old header, and write the new header instead. It makes sense to write the new data to a separate file, though.
import csv
header = ["column_1", "column_2", "column_3"]
with open('text.csv', 'r') as fp:
    reader = csv.DictReader(fp, fieldnames=header)
    # use newline='' to avoid adding new CR at end of line
    with open('output.csv', 'w', newline='') as fh:
        writer = csv.DictWriter(fh, fieldnames=reader.fieldnames)
        writer.writeheader()
        header_mapping = next(reader)
        writer.writerows(reader)
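The same read-then-rewrite idea can also be sketched with plain csv.reader and csv.writer. This is only an illustration: the function name replace_header is hypothetical, and io.StringIO buffers stand in for the real input and output files.

```python
import csv
import io

def replace_header(src, dst, new_header):
    """Copy CSV rows from src to dst, swapping the first row for new_header.

    Hypothetical helper for illustration; src and dst are open text streams.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst)
    next(reader, None)           # discard the old header row, if there is one
    writer.writerow(new_header)  # write the replacement header
    writer.writerows(reader)     # copy the remaining rows unchanged

# In-memory example (assumption: stands in for text.csv and output.csv).
src = io.StringIO("a,b,c\r\n1,2,3\r\n4,5,6\r\n")
dst = io.StringIO()
replace_header(src, dst, ["column_1", "column_2", "column_3"])
result = dst.getvalue()
print(result)
```

With real files you would open the input with newline='' and the output with 'w', newline='', then pass both file objects to the function.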
Use this:
import csv
header_list = ['column_1', 'column_2', 'column_3']
mystring = ",".join(header_list)

def line_prepender(filename, line):
    with open(filename, 'r+') as csvfile:
        csvfile.readline()  # skip the old header so it is not kept as a data row
        content = csvfile.read()
        csvfile.seek(0, 0)
        csvfile.write(line.rstrip('\r\n') + '\n' + content)
        csvfile.truncate()  # drop leftover bytes if the new header is shorter

line_prepender("text.csv", mystring)

Convert from CSV to array in Python

I have a CSV file containing the following.
0.000264,0.000352,0.000087,0.000549
0.00016,0.000223,0.000011,0.000142
0.008853,0.006519,0.002043,0.009819
0.002076,0.001686,0.000959,0.003107
0.000599,0.000133,0.000113,0.000466
0.002264,0.001927,0.00079,0.003815
0.002761,0.00288,0.001261,0.006851
0.000723,0.000617,0.000794,0.002189
I want to convert the values into an array in Python and keep the same order (rows and columns). How can I achieve this?
I have tried different functions but ended up with errors.
You should use the csv module:
import csv
results = []
with open("input.csv") as csvfile:
    reader = csv.reader(csvfile, quoting=csv.QUOTE_NONNUMERIC)  # change contents to floats
    for row in reader:  # each row is a list
        results.append(row)
This gives:
[[0.000264, 0.000352, 8.7e-05, 0.000549],
[0.00016, 0.000223, 1.1e-05, 0.000142],
[0.008853, 0.006519, 0.002043, 0.009819],
[0.002076, 0.001686, 0.000959, 0.003107],
[0.000599, 0.000133, 0.000113, 0.000466],
[0.002264, 0.001927, 0.00079, 0.003815],
[0.002761, 0.00288, 0.001261, 0.006851],
[0.000723, 0.000617, 0.000794, 0.002189]]
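One thing worth knowing about quoting=csv.QUOTE_NONNUMERIC: the reader converts every unquoted field to float, so an unquoted text field (a header row, for instance) raises ValueError, while a quoted field stays a string. The sketch below shows both cases; the sample strings are made up for illustration and io.StringIO stands in for a file.

```python
import csv
import io

# Purely numeric input: every field becomes a float.
numeric = "0.000264,0.000352\n0.00016,0.000223\n"
rows = list(csv.reader(io.StringIO(numeric), quoting=csv.QUOTE_NONNUMERIC))

# A quoted field is left as a string; unquoted fields are still converted.
mixed = '"label",1.5\n'
mixed_rows = list(csv.reader(io.StringIO(mixed), quoting=csv.QUOTE_NONNUMERIC))

# An unquoted non-numeric field cannot be converted and raises ValueError.
try:
    list(csv.reader(io.StringIO("label,1.5\n"), quoting=csv.QUOTE_NONNUMERIC))
    failed = False
except ValueError:
    failed = True

print(rows, mixed_rows, failed)
```

If your file has an unquoted header line, skip it before handing the rest to the reader, or convert the fields with float() yourself.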
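If your file contains only plain, unquoted numbers, you can also parse it without the csv module:
with open('input.csv') as f:
    # one inner list per line keeps the row/column layout; blank lines are skipped
    output = [[float(s) for s in line.split(',')] for line in f if line.strip()]
print(output)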
The csv module was created to do just this. The following usage is adapted from the Python docs.
import csv
rows = []
with open('file.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in reader:
        rows.append(row)  # add data to a list or other data structure
The delimiter is the character that separates data entries, and the quotechar is the character that encloses fields containing the delimiter or other special characters.
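As a quick illustration of how delimiter and quotechar interact, here is a small sketch. The sample line is invented for the example, and io.StringIO is used in place of a file.

```python
import csv
import io

# The middle field is wrapped in '|' so its comma is not treated as a separator.
line = "1,|hello, world|,3\n"
reader = csv.reader(io.StringIO(line), delimiter=',', quotechar='|')
row = next(reader)
print(row)
```

Fields always come back as strings here; convert them with float() or int() if you need numbers.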

Create subset of large CSV file and write to new CSV file

I would like to create a subset of a large CSV file using the rows that have "DOT" in the 4th column, and write them to a new file.
This is the code I currently have:
import csv
outfile = open('DOT.csv','w')
with open('Service_Requests_2015_-_Present.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f)
    for row in reader:
        if row[3] == "DOT":
            outfile.write(row)
outfile.close()
The error is:
outfile.write(row)
TypeError: must be str, not list
How can I manipulate row so that I can simply call write(row)? If that is not possible, what is the easiest alternative?
You can combine your two open statements, as the with statement accepts multiple context managers, and let a csv.writer handle the output:
import csv
infile = 'Service_Requests_2015_-_Present.csv'
outfile = 'DOT.csv'
with open(infile, newline='', encoding='utf-8') as f, open(outfile, 'w', newline='') as o:
    reader = csv.reader(f)
    writer = csv.writer(o, delimiter=',')  # adjust as necessary
    for row in reader:
        if row[3] == "DOT":
            writer.writerow(row)
# no need for close statements
print('Done')
Make your outfile a csv.writer and use writerow instead of write.
outcsv = csv.writer(outfile, ...other_options...)
...
outcsv.writerow(row)
That is how I would do it. Alternatively, join the fields yourself:
outfile.write(",".join(row) + "\n")  # comma delimited; fields containing commas are not quoted
In the code above you are passing a list to the file object's write method, which only accepts strings; that is what raises "TypeError: must be str, not list". You can convert the list to a string before writing it, for example outfile.write(str(row)), although that writes the list's Python representation rather than a CSV row.
Or use csv.writer:
import csv

def csv_writer(input_path, out_path):
    # 'a' appends to the output file; newline='' lets csv.writer control line endings
    with open(out_path, 'a', newline='') as outfile:
        writer = csv.writer(outfile)
        with open(input_path, newline='', encoding='utf-8') as f:
            reader = csv.reader(f)
            for row in reader:
                if row[3] == "DOT":
                    writer.writerow(row)

input_path = 'Service_Requests_2015_-_Present.csv'
out_path = 'DOT.csv'
csv_writer(input_path, out_path)
[This code is for Python 3. In Python 2.7, the open function does not take a newline argument, so passing it raises a TypeError.]
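The filtering step itself can be sketched as a small function that works on any pair of text streams. The name filter_rows, its parameters, and the sample data are hypothetical, chosen for illustration; the length check is an extra guard I added so that short rows do not raise IndexError.

```python
import csv
import io

def filter_rows(src, dst, column, value):
    """Copy rows whose field at index `column` equals `value`; return the count.

    Hypothetical helper for illustration; src and dst are open text streams.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst)
    kept = 0
    for row in reader:
        # Skip rows that are too short to have the requested column.
        if len(row) > column and row[column] == value:
            writer.writerow(row)
            kept += 1
    return kept

# In-memory example (assumption: stands in for the large input file).
src = io.StringIO('1,x,y,DOT\n2,x,y,NYPD\n3,"a,b",y,DOT\nshort\n')
dst = io.StringIO()
kept = filter_rows(src, dst, 3, "DOT")
filtered = dst.getvalue()
print(kept)
print(filtered)
```

Because rows are processed one at a time, this approach does not load the whole file into memory, which matters for a large input.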

How to write a list of path names on separate rows in csv file?

I have a list of pathnames:
li = [u"C:\\temp\\fileA.shp", u"C:\\temp\\fileB.shp", u"C:\\temp\\fileC.shp"]
I am trying to write each path on a separate line in a txt file. This is what I have done so far:
import csv
li = [u"C:\\temp\\fileA.shp", u"C:\\temp\\fileB.shp", u"C:\\temp\\fileC.shp"]
with open(r'C:\temp\myfile.csv', "wb") as f:
    wr = csv.writer(f, delimiter=',', quoting=csv.QUOTE_NONE)
    wr.writerows([li])
Which yields a list of files on the same row:
C:\temp\fileA.shp,C:\temp\fileB.shp,C:\temp\fileC.shp
How can I tweak this so that the pathnames are each on their own row? The following is what I am after:
C:\temp\fileA.shp
C:\temp\fileB.shp
C:\temp\fileC.shp
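writerows treats each item of the outer list as one row, and [li] contains a single item, so all the paths end up on one row. Wrap each path in its own list instead:
import csv
li = [u"C:\\temp\\fileA.shp", u"C:\\temp\\fileB.shp", u"C:\\temp\\fileC.shp"]
# "wb" matches the Python 2 code in the question; on Python 3 use open(..., "w", newline="")
with open(r'C:\temp\myfile.txt', "wb") as f:
    wr = csv.writer(f, delimiter=',', quoting=csv.QUOTE_NONE)
    wr.writerows([[path] for path in li])
Now each path is written as a one-column row, followed by a line terminator.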
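For Python 3, the one-path-per-row idea can be sketched as follows. The buffer is an io.StringIO standing in for the output file (my assumption, so the result can be inspected), and the variable names are illustrative.

```python
import csv
import io

paths = ["C:\\temp\\fileA.shp", "C:\\temp\\fileB.shp", "C:\\temp\\fileC.shp"]

buf = io.StringIO()
writer = csv.writer(buf)
# Each path becomes its own single-column row.
writer.writerows([p] for p in paths)
output = buf.getvalue()
print(output)
```

csv.writer ends each row with "\r\n" by default; pass lineterminator="\n" if you want plain newlines.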

How to Sort a Column in an Excel sheet

So I understand how sorting works in Python. If I put...
a = ["alpha2A", "hotel2A", "bravo2C", "alpha2B", "tango3B", "alpha3A", "zulu.A1", "foxtrot8F", "zulu.B1"]
a.sort()
print a
I will get...
['alpha2A', 'alpha2B', 'alpha3A', 'bravo2C', 'foxtrot8F', 'hotel2A', 'tango3B', 'zulu.A1', 'zulu.B1']
However, I want to sort a column in a Excel sheet so I tried...
isv = open("case_name.csv", "w+")
a = (["case_name.csv"[2]])
a.sort()
print a
And got a return of...
['s']
I understand that it is returning the 3rd letter in the file name but how do I make it sort and return the entire column of the Excel sheet?
Update: New Code
import csv
import operator
with open('case_name.csv') as infile:
    data = list(csv.reader(infile, dialect=csv.excel_tab))
data.sort(key=operator.itemgetter(2))
with open('case_name_sorted.csv', 'w') as outfile:
    writer = csv.writer(outfile, dialect='excel')
    writer.writerows(data)
print(sum(1 for row in data if len(row) < 3))
And it returns
data = list(csv.reader(infile, dialect=csv.excel_tab))
_csv.Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode?
import csv
import operator
# read the data from the source file
with open('case_name.csv') as infile:
    data = list(csv.reader(infile, dialect='excel'))
# sort a list of sublists by the item in index 2
data.sort(key=operator.itemgetter(2))
# time to write the results into a file
with open('case_name_sorted.csv', 'w') as outfile:
    writer = csv.writer(outfile, dialect='excel')
    writer.writerows(data)
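If the sheet has a header row, you probably want to keep it on top rather than sort it in with the data. Here is a minimal sketch of that variation; the sample text and column names are invented for illustration, and io.StringIO stands in for the real files.

```python
import csv
import io
import operator

# Assumed sample data: a header row followed by three data rows.
text = "id,owner,case\n1,x,zulu.A1\n2,y,alpha2B\n3,z,bravo2C\n"

rows = list(csv.reader(io.StringIO(text)))
header, body = rows[0], rows[1:]

# Sort only the data rows by the third column (index 2).
body.sort(key=operator.itemgetter(2))

out = io.StringIO()
writer = csv.writer(out)
writer.writerow(header)
writer.writerows(body)
sorted_text = out.getvalue()
print(sorted_text)
```

The sort is a plain string comparison, the same as a.sort() in the question, so "10" sorts before "9" unless you convert the key to a number.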
