Sorting csv data - python

The task is to sort a csv file and save the result in a new csv file. With the code below (Below1), I seem to get the right result when I open the new csv file. However, when I run the code on the homework interface, it prints the wrong output (Below2). Can anyone identify why it doesn't work? I also have the correct solution to the task (Below3), but I don't understand why mine doesn't work.
Below1:
import csv
def sort_records(csv_filename, new_filename):
    file = open(csv_filename)
    lines = file.readlines()
    newfile = open(new_filename, "w")
    header = lines[0]
    newfile.write(header)
    lines.remove(header)
    lines.sort(key=lambda x: x[0])
    for item in lines:
        newfile.write(item)
    file.close()
    newfile.close()
Below2:
city/month,Jan,Feb,Mar,Apr
Brisbane,31.3,40.2,37.9,29
Darwin,34,34,33.2,34.5Melbourne,41.2,35.5,37.4,29.3
Below3:
import csv
def sort_records(csv_filename, new_filename):
    csv_file = open(csv_filename)
    reader = csv.reader(csv_file)
    header = next(reader)
    data2d = list(reader)
    data2d.sort()
    csv_file.close()
    new_file = open(new_filename, "w")
    writer = csv.writer(new_file)
    writer.writerow(header)
    writer.writerows(data2d)
    new_file.close()
The original csv file:
city/month,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec
Melbourne,41.2,35.5,37.4,29.3,23.9,16.8,18.2,25.7,22.3,33.5,36.9,41.1
Brisbane,31.3,40.2,37.9,29,30,26.7,26.7,28.8,31.2,34.1,31.1,31.2
Darwin,34,34,33.2,34.5,34.8,33.9,32,34.3,36.1,35.4,37,35.5
Perth,41.9,41.5,42.4,36,26.9,24.5,23.8,24.3,27.6,30.7,39.8,44.2
Adelaide,42.1,38.1,39.7,33.5,26.3,16.5,21.4,30.4,30.2,34.9,37.1,42.2
Canberra,35.8,29.6,35.1,26.5,22.4,15.3,15.7,21.9,22.1,30.8,33.4,35
Hobart,35.5,34.1,30.7,26,20.9,15.1,17.5,21.7,20.9,24.2,30.1,33.4
Sydney,30.6,29,35.1,27.1,28.6,20.7,23.4,27.7,28.6,34.8,26.4,30.2

There is no need for additional modules in this case. Open the input file for reading (readonly) and the output file for writing.
Write the first line (column descriptions) from input to output. Then sort the list returned from readlines and write to output.
Like this:
ORIGINAL = 'original.csv'
NEW = 'new.csv'
with open(ORIGINAL) as original, open(NEW, 'w') as new:
    new.write(next(original))
    new.writelines(sorted(original.readlines(), key=lambda x: x.split(',')[0]))

I've run your code on my machine and it works as it's supposed to. Is it possible to print out the CSV file from this homework interface before sorting?
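One possible explanation, judging by the merged Darwin/Melbourne line in the output above: if the last line of the input file has no trailing newline, it still has none after sorting, so whichever line is written after it gets glued onto it. The sketch below assumes that this is the cause and strips line endings before sorting, then adds one to every line on output. The name `sort_records_safe` is made up for illustration.

```python
def sort_records_safe(csv_filename, new_filename):
    # Read all lines without their line endings, so a missing final
    # newline in the input cannot end up in the middle of the output.
    with open(csv_filename) as infile:
        lines = infile.read().splitlines()
    header, rows = lines[0], lines[1:]
    # Sort by the first comma-separated field (the city name).
    rows.sort(key=lambda line: line.split(',')[0])
    with open(new_filename, 'w') as outfile:
        for line in [header] + rows:
            outfile.write(line + '\n')
```

The csv-based solution (Below3) avoids the problem for the same reason: the reader drops the line endings and the writer adds a fresh one to every row.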

Related

getting a specific row of multiple csv's and writing to a new csv

Have had a good search but can't quite find what I'm looking for. I have a number of csv files printed by a CFD simulation. The goal of my python script was to:
get the final row of each csv and
add the rows to a new file with the filename added to the start of each row
Currently I have
if file.endswith(".csv"):
    with open(file, 'r') as f:
        tsrNum = file.translate(None, '.csv')
        print(tsrNum + ', ' + ', '.join(list(csv.reader(f))[-1]))
This prints the correct values to the terminal, but I have to manually copy and paste them into a new file.
Can somebody help with the last step? I'm not familiar enough with the syntax of Python; learning it properly is on my to-do list once I finish this CFD project, as so far it's been fantastic whenever I've managed to implement it correctly. I tried using loops and csv.DictWriter, but with little success.
EDIT
I couldn't get the posted solution working. Here's the code a guy helped me make
import csv
import os
import glob
# get a list of the input files, all CSVs in current folder
infiles = glob.glob("*.csv")
# open output file
ofile = open('outputFile.csv', "w")
# column names
fieldnames = ['parameter','time', 'cp', 'cd']
# access it as a dictionary for easy access
writer = csv.DictWriter(ofile, fieldnames=fieldnames)
# output file header
writer.writeheader()
# iterate through list of input files
for ifilename in infiles:
    # open input file
    ifile = open(ifilename, "rb+")
    # access it as a dictionary for easy access
    reader = csv.DictReader(ifile)
    # get the rows in reverse order
    rows = list(reader)
    rows.reverse()
    # get the last row
    row = rows[0]
    # output row to output csv
    writer.writerow({'parameter': ifilename.translate(None, '.csv'), 'time': row['time'], 'cp': row['cp'], 'cd': row['cd']})
    # close input file
    ifile.close()
# close output file
ofile.close()
Split your problem into smaller pieces:
looping over the directory
getting the last line
writing to your new csv.
I have tried to be very verbose; you could do something like this:
import os
def get_last_line_of_this_file(filename):
    line = ''  # returned unchanged if the file is empty
    with open(filename) as f:
        for line in f:
            pass
    return line
def get_all_csv_filenames(directory):
    for filename in os.listdir(directory):
        if filename.endswith('.csv'):
            # yield the full path so the file can be opened from any working directory
            yield os.path.join(directory, filename)
def write_your_new_csv_file(new_filename):
    with open(new_filename, 'w') as writer:
        for filename in get_all_csv_filenames('now the path to your dir'):
            last_line = get_last_line_of_this_file(filename)
            # comma-separate the name from the row and end every row with exactly one newline
            writer.write(filename + ',' + last_line.rstrip('\n') + '\n')
if __name__ == '__main__':
    write_your_new_csv_file('your_created_filename.csv')
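The same three steps can also be written with the csv module, so that the file name lands in its own column and quoting is handled for you. This is only a sketch for Python 3: the function name `collect_last_rows` and the choice to use the file name without its extension as the first field are assumptions, not something taken from the posts above.

```python
import csv
import glob
import os


def collect_last_rows(directory, output_path):
    """Write '<name>,<fields of last row>' for every CSV in directory."""
    out_abs = os.path.abspath(output_path)
    with open(output_path, 'w', newline='') as out:
        writer = csv.writer(out)
        for path in sorted(glob.glob(os.path.join(directory, '*.csv'))):
            if os.path.abspath(path) == out_abs:
                continue  # skip the output file if it sits in the same folder
            last_row = None
            with open(path, newline='') as f:
                for row in csv.reader(f):
                    if row:  # ignore blank lines
                        last_row = row
            if last_row is not None:
                name = os.path.splitext(os.path.basename(path))[0]
                writer.writerow([name] + last_row)
```

`os.path.splitext` is used here instead of the Python 2 idiom `translate(None, '.csv')`, which deletes every '.', 'c', 's' and 'v' character from the name rather than only the extension.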

Reading data from one CSV and displaying parsed data on to another CSV file

I am very new to Python. I am trying to read a csv file and write the result to another csv file, but I only want to write selected rows from the input file to the output file. Below is the code I have written so far. It reads every row from the input file, 1.csv, and writes it to the output file, out.csv. How can I tweak this code so that the output file contains only the rows where column 8 starts with READ and column 10 is not equal to 0000? Both conditions must be met. Also, this block of code handles a single csv file. Can anyone please tell me how to do this for, say, 10000 csv files? Finally, when I execute the code I see blank lines between the rows in out.csv. How can I remove them?
import csv
f1 = open("1.csv", "r")
reader = csv.reader(f1)
header = reader.next()
f2 = open("out.csv", "w")
writer = csv.writer(f2)
writer.writerow(header)
for row in reader:
    writer.writerow(row)
f1.close()
f2.close()
Something like:
import os
import csv
import glob
class CSVReadWriter(object):
    def munge(self, filename, suffix):
        # splitext, not split: 'data.csv' -> ('data', '.csv')
        name, ext = os.path.splitext(filename)
        return '{0}{1}{2}'.format(name, suffix, ext)
    def is_valid(self, row):
        # zero-based indexes: keep rows that start with READ and are not 0000
        return row[8].startswith('READ') and row[10] != '0000'
    def filter_csv(self, fin, fout):
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        writer.writerow(next(reader))  # header
        for row in reader:
            if self.is_valid(row):
                writer.writerow(row)
    def read_write(self, iname, suffix):
        # Python 2 file modes; on Python 3 use open(iname, newline='')
        # and open(oname, 'w', newline='') instead
        with open(iname, 'rb') as fin:
            oname = self.munge(iname, suffix)
            with open(oname, 'wb') as fout:
                self.filter_csv(fin, fout)
work_directory = r"C:\Temp\Data"
# glob needs a pattern, not just the directory name
for filename in glob.glob(os.path.join(work_directory, '*.csv')):
    csvrw = CSVReadWriter()
    csvrw.read_write(filename, '_out')
I've made it a class so that you can override the munge and is_valid methods to suit different cases. Being a class also means that you can store state more easily, for example if you wanted to output lines between certain criteria.
The blank lines between rows that you mention come from \r\n (carriage return plus line feed) line endings: the csv writer emits \r\n itself, and a text-mode file on Windows then expands the \n again. Opening the output file with 'wb' on Python 2, or with newline='' on Python 3, should resolve it.
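For Python 3, the filtering step and the blank-line fix can be sketched as follows. The columns are taken as zero-based indexes 8 and 10, as in the answer above; the question does not say whether it counts columns from 0 or 1, so adjust as needed. `filter_rows` is a hypothetical helper name.

```python
import csv


def filter_rows(in_path, out_path, read_col=8, code_col=10):
    """Copy the header plus rows whose read_col starts with 'READ'
    and whose code_col is not '0000'."""
    # newline='' lets the csv module control line endings, which avoids
    # the blank lines between rows that show up on Windows.
    with open(in_path, newline='') as fin, \
            open(out_path, 'w', newline='') as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        header = next(reader, None)
        if header is None:
            return  # empty input file, nothing to copy
        writer.writerow(header)
        for row in reader:
            if len(row) <= max(read_col, code_col):
                continue  # skip rows that are too short to test
            if row[read_col].startswith('READ') and row[code_col] != '0000':
                writer.writerow(row)
```

To process many files, call the function once per path returned by `glob.glob`, building each output name from the input name.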

How to save the output of the print statements to a CSV file?

I have written the following to isolate a very specific part of a file:
for line in open('120301.KAP'):
    rec = line.strip()
    if rec.startswith('PLY'):
        print line
The output appears as such
PLY/1,48.107478621032,-69.733975000000
PLY/2,48.163516399836,-70.032838888053
PLY/3,48.270000002883,-70.032838888053
PLY/4,48.270000002883,-69.712824977522
PLY/5,48.192379262383,-69.711801581207
PLY/6,48.191666671083,-69.532840015422
PLY/7,48.033358898628,-69.532840015422
PLY/8,48.033359033880,-69.733975000000
PLY/9,48.107478621032,-69.733975000000
Ideally what I am hoping for is the output to create a CSV file with just the coordinates. The PLY/1, PLY/2, etc. does not need to stay.
Is this doable? If not, at least can the print statements result in a new text file with the same name as the KAP file?
You can use the csv module:
import csv
with open('120301.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    for line in open('120301.KAP'):
        rec = line.strip()
        if rec.startswith('PLY'):
            # [1:] drops the PLY/n label and keeps only the two coordinates
            writer.writerow(rec.split(',')[1:])
In a similar way, csv.reader can easily read records from your input file.
https://docs.python.org/3/library/csv.html?highlight=csv#module-contents
If you are using Python 2.x, you should open the file in binary mode:
import csv
with open('120301.csv', 'wb') as file:
    writer = csv.writer(file)
    for line in open('120301.KAP'):
        rec = line.strip()
        if rec.startswith('PLY'):
            # [1:] drops the PLY/n label and keeps only the two coordinates
            writer.writerow(rec.split(',')[1:])
You could open the output file at the beginning of your code and then just add a write statement after the print line.
Something like this:
filename = '120301.txt'  # output file name, chosen here only as an example
target = open(filename, 'w')
for line in open('120301.KAP'):
    rec = line.strip()
    if rec.startswith('PLY'):
        print(line)
        target.write(rec + "\n")  # rec has no line ending, so add one
target.close()
This is totally doable!
Here is a link to the docs for writing/reading CSV:
https://docs.python.org/2/library/csv.html
You could also just make your own CSV with the regular file reading/writing functions.
file = open('data', 'r')
output = open('output.csv', 'w')
output.write('your infos')  # add a comma between the values you output
The simplest way is to redirect stdout to a file:
for i in range(10):
    print(str(i) + "," + str(i*2))
will output:
0,0
1,2
2,4
3,6
4,8
5,10
6,12
7,14
8,16
9,18
If you run it as python myprog.py > myout.txt, the results go to myout.txt.
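If you would rather do the redirection inside the script than on the command line, Python 3 provides `contextlib.redirect_stdout`. A minimal sketch that reuses the loop above; the function name `write_pairs` and its parameters are placeholders:

```python
import contextlib


def write_pairs(path, count=10):
    with open(path, 'w') as out:
        # Everything printed inside this block goes to the file
        # instead of the terminal.
        with contextlib.redirect_stdout(out):
            for i in range(count):
                print(str(i) + "," + str(i * 2))
```

For a single call, `print(..., file=out)` has the same effect without the context manager.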

Write CSV Python

I want to write a csv file (outfile) from another csv file (infile). The data in infile looks like this: OF0A0C,00,D0,0F11AFCB. I want outfile to contain the same as infile, but I get this instead: "\r \n 0,F,0,A,0,C,","0,0,","D,0,","0,F,1,1,A,F,C,B \r \n
My code looks like this:
with open ("from_baryon.csv", "r") as inFile:
    with open (self.filename, "a") as outFile:
        for line in inFile:
            OutFile = csv.writer (outFile)
            OutFile.writerow (line)
After writing, I want to save the data of every row in a list like this: Data = [[length_of_all_data],[length_data_row_1,datarow1],[length_data_row_2,datarow1datarow2],[length_data_row_3,datarow1datarow3]]
I am confused about how to build a list like that. Thank you.
A few issues -
You should read the input csv file with the csv module's csv.reader() instead of iterating over its lines. When you iterate over the file (for line in inFile:), each line comes back as a string, and OutFile.writerow(line) then treats that string as a sequence of fields, so it writes each character into a separate column.
You do not need to create a separate OutFile = csv.writer (outFile) for every line.
Example code -
with open ("from_baryon.csv", "r") as inFile:
    with open (self.filename, "a") as outFile:
        out_file = csv.writer (outFile)
        in_reader = csv.reader(inFile)
        for row in in_reader:
            out_file.writerow(row)
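To see why the original loop produced one character per column: `writerow` iterates over whatever it is given, and iterating over a string yields single characters. A small illustration with an in-memory buffer, using the sample line from the question (the variable names are only for this demo):

```python
import csv
import io

line = "OF0A0C,00,D0,0F11AFCB\n"

# Passing the raw string: every character becomes its own field.
buf = io.StringIO()
csv.writer(buf).writerow(line)
as_string = buf.getvalue()

# Passing a parsed row: the four fields are written back unchanged.
buf = io.StringIO()
row = next(csv.reader([line]))
csv.writer(buf).writerow(row)
as_row = buf.getvalue()
```

Here `as_string` begins with `O,F,0,A,0,C,` and contains quoted commas, much like the output in the question, while `as_row` holds the original four fields.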
EDIT: For the second issue, added in the update, you can create a list and a counter to keep track of the total length. Example -
with open ("from_baryon.csv", "r") as inFile:
    with open (self.filename, "a") as outFile:
        out_file = csv.writer (outFile)
        in_reader = csv.reader(inFile)
        data = []
        lencount = 0
        for row in in_reader:
            out_file.writerow(row)
            tlen = len(''.join(row))
            data.append([tlen] + row)
            lencount += tlen
        data.insert(0,[lencount])

re.sub for a csv file

I am receiving an error on this code: "TypeError: expected string or buffer". I looked around and found out that the error occurs because I am passing re.sub a list, and it does not take lists. However, I wasn't able to figure out how to change my line from the csv file into something that it would read.
I am trying to change all the periods in a csv file into commas. Here is my code:
import csv
import re
in_file = open("/test.csv", "rb")
reader = csv.reader(in_file)
out_file = open("/out.csv", "wb")
writer = csv.writer(out_file)
for row in reader:
    newrow = re.sub(r"(\.)+", ",", row)
    writer.writerow(newrow)
in_file.close()
out_file.close()
I'm sorry if this has already been answered somewhere. There were certainly a lot of answers regarding this error, but I couldn't make any of them work with my csv file. Also, as a side note, this was originally an .xlsb Excel file that I converted into csv in order to be able to work with it. Was that necessary?
You could use a list comprehension to apply your substitution to each item in row:
for row in reader:
    newrow = [re.sub(r"(\.)+", ",", item) for item in row]
    writer.writerow(newrow)
for row in reader does not return a single string to parse; it returns a list of the elements in that row, so you have to go through that list and process each item individually, just as @Trii showed you:
[re.sub(r'(\.)+', ',', s) for s in row]
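If a plain replacement is enough, `str.replace` does the job without `re`. The two are not identical, though: the pattern `(\.)+` collapses a run of periods into a single comma, whereas `replace` swaps each period individually. A short sketch of both variants; the helper names are illustrative only:

```python
import re


def periods_to_commas(row):
    """Replace every '.' in every item with ','."""
    return [item.replace('.', ',') for item in row]


def period_runs_to_commas(row):
    """Replace each run of one or more '.' with a single ','."""
    return [re.sub(r"\.+", ",", item) for item in row]
```

Either function takes the list that `csv.reader` yields for a row and returns a new list that can be passed straight to `writer.writerow`.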
In this case, we are using glob to access all the csv files in the directory.
The code below overwrites the source csv file, so there is no need to create an output file.
NOTE:
If you want to write the substituted text to a second file instead, replace write = open(i, 'w') with write = open('secondFile.csv', 'w')
import re
import glob
for i in glob.glob("*.csv"):
    read = open(i, 'r')
    reader = read.read()
    # replace every run of periods in the file's text with a comma
    csvRe = re.sub(r"(\.)+", ",", reader)
    write = open(i, 'w')
    write.write(csvRe)
    read.close()
    write.close()
