printing from csv.reader - python

This should be an easy one, but I'm having a bit of a brain fart. The CSV contains a list of four latitude and longitude pairs. Based on the code, if I print row[0] it prints just the latitudes, and if I print row[1] it prints the longitudes. How do I format the code to print a specific lat/lon pair instead? Say, the second lat/lon pair in the CSV.
import csv
with open('120101.KAP.csv','rb') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        print row[0]

Looping over reader gives you each row. If you wanted to get the second row, use the next() function instead, ignore one and get the second:
reader = csv.reader(csvfile)
next(reader) # ignore
row = next(reader) # second row
print row # print the second row.
You can generalise this by using the itertools.islice() object to do the skipping for you:
from itertools import islice
reader = csv.reader(csvfile)
row = next(islice(reader, rownumber, None)) # skip to index rownumber, read that
print row
Take into account that counting starts at 0, so "second row" is rownumber = 1.
Or you could just read all rows into a list and index into that:
reader = csv.reader(csvfile)
rows = list(reader)
print rows[1] # print the second row
print rows[3] # print the fourth row
Only do this (loading everything into a list) if there are a limited number of rows. Iteration over the reader only produces one row at a time and uses a file buffer for efficient reading, limiting how much memory is used; you could process gigantic CSV files this way.
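For Python 3, the skipping approach can be wrapped in a small helper. This is only a sketch: read_row is a hypothetical name, and it assumes the file was opened in text mode with newline='' rather than 'rb'.

```python
import csv
from itertools import islice


def read_row(csvfile, rownumber):
    """Return the row at zero-based index rownumber, or None if there are too few rows."""
    reader = csv.reader(csvfile)
    # islice(reader, rownumber, None) lazily skips the first rownumber rows;
    # the second argument to next() is returned if nothing is left to read
    return next(islice(reader, rownumber, None), None)
```

With a file opened as csvfile, read_row(csvfile, 1) would return the second lat/lon pair as a list of two strings, without loading the other rows into memory.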

Related

Joining the columns of one CSV to another CSV

So I'm trying to combine column values from one csv to another while saving it into a final csv file. But I want to iterate through all the rows adding the column values of each row to each row of the original csv.
In other words say csv1 has 3 rows.
Row 1: Frog,Rat,Duck
Row 2: Cat,Dog,Cow
Row 3: Moose,Fox,Zebra
And I want to combine 2 more column values from csv2 to each of those rows.
Row 1: Chicken,Pig
Row 2:
Row 3: Bear,Boar
So csv3 would end up looking like.
Row 1: Frog,Rat,Duck,Chicken,Pig
Row 2: Moose,Fox,Zebra,Bear,Boar
But at the same time if there's a row in csv2 that has no values at all I don't want it to copy the row from csv1. In other words that row will not exist at all in the final csv file. I prefer not to use pandas as I have just been using the csv module thus far throughout my code but any method is appreciated.
So far I have come across this method which works if there's only one single row. But when there's more than that it just adds random lines and appends the values all over the place. And it combines both of the columns into one string while adding an extra blank line at the end of the csv for some odd reason.
import csv
f1 = open ("2.csv","r", encoding='utf-8')
with open("3.csv","w", encoding='utf-8', newline='') as f:
    writer = csv.writer(f)
    with open("1.csv","r", encoding='utf-8') as csvfile:
        reader = csv.reader(csvfile, delimiter=",")
        for row in reader:
            row[6] = f1.readline()
            writer.writerow(row)
f1.close()
Using the same example csvs above the results given are.
Frog,Rat,Duck,Chicken,Pig
Cat,Dog,Cow
Moose,Fox,Zebra,Bear,Boar
You can zip together the two files and then iterate through each row. Then you can concatenate the two lists and write the result to a file.
To check if there is an empty row we can compare the set of the row to the set of an empty string.
import csv
new_csv_data = []
EMPTY_ROW = set([""])
with open("1.csv", "r", newline="") as first_file, open("2.csv", "r", newline="") as second_file, open("3.csv", "w", newline="") as out_file:
    first_file_reader = csv.reader(first_file)
    second_file_reader = csv.reader(second_file)
    out_file_writer = csv.writer(out_file)
    # The iterator will stop when the shortest file is finished
    for row_1, row_2 in zip(first_file_reader, second_file_reader):
        # Check if the second row is empty, skipping if it is
        if not row_2 or set(row_2) == EMPTY_ROW:
            continue
        out_file_writer.writerow(row_1 + row_2)
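To try the zip approach without touching any files, here is a rough sketch that feeds the example data from the question through in-memory buffers. The merge_rows helper is a made-up name, and it checks for emptiness with any() instead of the set comparison, so a row of blank or whitespace-only fields is also skipped.

```python
import csv
import io


def merge_rows(first_rows, second_rows):
    """Yield row_1 + row_2 for each pair, skipping pairs whose second row is empty."""
    for row_1, row_2 in zip(first_rows, second_rows):
        # csv.reader yields [] for a blank line and ['', ''] for a line of only commas
        if not any(field.strip() for field in row_2):
            continue
        yield row_1 + row_2


# In-memory stand-ins for 1.csv and 2.csv from the question
csv1 = io.StringIO("Frog,Rat,Duck\nCat,Dog,Cow\nMoose,Fox,Zebra\n")
csv2 = io.StringIO("Chicken,Pig\n\nBear,Boar\n")
merged = list(merge_rows(csv.reader(csv1), csv.reader(csv2)))
for row in merged:
    print(",".join(row))
```

Because zip stops at the shorter input, rows in 1.csv beyond the end of 2.csv are dropped as well, which matches the requirement that a row with nothing to add should not appear in the output.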

How to print specific rows in a CSV files which have a specific value in a specific column?

I am new to Python. I have a CSV file and want to print a specific row from it; I'd appreciate any guidance. For example, in the table below I want to print a row if its record number is 2:
This image shows an example of my case
I have below code as starter which prints out the headers:
with open(filename, "r") as f:
    reader = csv.reader(f, delimiter="\t")
    first = next(reader)
    print(first[0].split(','))
    for row in filename:
        print()
Thanks!
Your example code seems somewhat confused. I presume the file is actually comma-separated, not tab-delimited; otherwise you wouldn't need to do the first[0].split(',').
Assuming that's the case, maybe something like this would work:
import csv
with open(filename, "r") as f:
    reader = csv.reader(f)
    # skip header row
    header = next(reader)
    for row in reader:
        if int(row[0]) == 2:
            print(row)
If you're after a specific row number, you could use enumerate to count rows and print when you get to the correct one.
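A minimal sketch of the enumerate idea, using made-up sample data in place of the real file (row_at and the column names are hypothetical):

```python
import csv
import io


def row_at(reader, wanted):
    """Return the row at zero-based position wanted among the remaining rows, else None."""
    for rownum, row in enumerate(reader):
        if rownum == wanted:
            return row
    return None


# Hypothetical sample data standing in for the real CSV file
sample = io.StringIO("record,name\n1,alpha\n2,beta\n3,gamma\n")
reader = csv.reader(sample)
next(reader)  # skip header row
third = row_at(reader, 2)  # third data row, counting from 0
print(third)
```

Note that this selects by position in the file, not by the value stored in the record number column.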
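In your for loop, check whether the record number, which is the 0th column, equals 2. The csv module returns every field as a string, so compare against '2' (or convert with int()):
for row in reader:
    if row[0] == '2':
        print(row)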

Need help in finding the row of CSV which contains the values in array

I have an array LiveTick = ['ted3m index','US0003m index','USGG3m index'] and I am reading a CSV file book1.csv. I have to find the row which contains the values in csv.
For example, 15th row will contain ted3m index 500 | 600 and 20th row will contain US0003m index 800 | 900 and likewise.
I then have to get the values contained in the row and parse it for each value contained in array LiveTick. How do I proceed? Below is my sample code:
with open('C:\\blp\\book1.csv', 'r') as f:
    reader = csv.reader(f, delimiter=',')
    writer = csv.writer(outf)
    for row in reader:
        for list in LiveTick:
            if list in row:
                print ('Found: {}'.format(row))
You can use pandas; it's pretty fast and will do all the reading, writing and filtering for you out of the box:
import pandas as pd
df = pd.read_csv('C:\\blp\\book1.csv')
filtered_df = df[df['your_column_name'].isin(LiveTick)]
# now you can save it
filtered_df.to_csv('C:\\blp\\book_filtered.csv')
You have the right idea, but there are a few improvements you can make:
Instead of a nested for loop which doesn't short-circuit, use any to compare the first column to multiple values.
Write to your csv as you go along instead of just print. This is memory-efficient, as you hold in memory only one line at any one time.
Define outf as an open object in your with statement.
Do not shadow built-in list. Use another identifier, e.g. i, for elements in LiveTick.
Here's a demo:
import csv
with open('in.csv', 'r', newline='') as f, open('out.csv', 'w', newline='') as outf:
    reader = csv.reader(f, delimiter=',')
    writer = csv.writer(outf, delimiter=',')
    for row in reader:
        # any() stops at the first tick name found in the first column
        if any(i in row[0] for i in LiveTick):
            writer.writerow(row)
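The same filtering logic can be checked without real files. This is just a sketch: filter_rows is a hypothetical helper, the sample rows are invented from the question's description, and the guard for blank lines is an addition.

```python
import csv
import io

LiveTick = ['ted3m index', 'US0003m index', 'USGG3m index']


def filter_rows(reader, writer, ticks):
    """Copy rows whose first column contains any tick name; return how many were written."""
    written = 0
    for row in reader:
        # csv.reader yields [] for blank lines, so check row before indexing it
        if row and any(tick in row[0] for tick in ticks):
            writer.writerow(row)
            written += 1
    return written


# Invented sample data standing in for book1.csv
source = io.StringIO("foo index,1,2\nted3m index,500,600\n\nUS0003m index,800,900\n")
target = io.StringIO()
count = filter_rows(csv.reader(source), csv.writer(target), LiveTick)
print(count)
print(target.getvalue())
```

Since `tick in row[0]` is a substring test, a cell such as "my ted3m index" would also match; use `row[0] in ticks` instead if you need exact matches only.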

CSV reader repeatedly reading 1 line

I have a csv.reader reading a file, but repeatedly reading the same line.
import csv
with open('mydata.csv', 'rb') as f:
    reader = csv.reader(f)
    reader.next()
    for row in reader:
        while i < 10:
            print row
            i=i+1
The code prints the second row (as I want to skip the header) 10 times.
Your code is doing exactly what you told it to do...
(and also, your title is misleading: the reader is reading the row only once, you are simply printing it 10 times)
reader.next() # advances to second line
for row in reader: # loops over remaining lines
    while i < 10: # loops over i
        print row # prints current row - this would be the second row in the first forloop iteration... 10 times, because you loop over i.
        i=i+1 # increments i, so the next rows, i is already >=10, your while-loop only affects the second line.
Why do you have that while loop in the first place?
You could easily do something like:
reader = csv.reader(f)
for rownum, row in enumerate(reader):
    if rownum: #skip first line
        print row
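On Python 3, reader.next() no longer exists and print is a function, so a roughly equivalent version could look like the sketch below. The sample data is made up, and an in-memory buffer stands in for mydata.csv.

```python
import csv
import io

# Made-up sample data standing in for mydata.csv
sample = io.StringIO("name,value\na,1\nb,2\n")
reader = csv.reader(sample)
# Skip the header; the None default avoids StopIteration on an empty file
next(reader, None)
rows = []
for row in reader:
    print(row)  # each remaining row is printed exactly once
    rows.append(row)
```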

How to create a new column in a CSV file using Python by shifting one row

I have a CSV file like the one below. It is a huge file with thousands of records.
input.csv
No;Val;Rec;CSR
0;10;1;1200
0;100;2;1300
0;100;3;1300
0;100;4;1400
0;10;5;1200
0;11;6;1200
I want to create an output.csv file by adding a new column "PSR" after the first column "No". The value of this column depends on the "CSR" column. For the first row, "PSR" shall be zero. From the next record onwards, it depends on the "CSR" value in the previous row. If the present and previous records have the same CSR value, then "PSR" shall be zero. If not, PSR shall have the previous CSR value. For example, the CSR value in the 2nd row is 1300, which differs from the value in the 1st record (1200), so the PSR value for the 2nd row shall be 1200. In the 2nd and 3rd rows the CSR value is the same, so the PSR value for the 3rd row shall be zero. So the new PSR value depends on the CSR value in the present and previous records.
Output.csv
No;PCR;Val;Rec;CSR
0;0;10;1;1200
0;1200;100;2;1300
0;0;100;3;1300
0;1300;100;4;1400
0;1400;10;5;1200
0;0;11;6;1200
My Approach:
Use csv.reader and iterate over the objects in a list. Copy 5th column to 2nd column in list. Shift it one row down.
Then check the values in 2nd and 5th column (PCR and CSR), if both values are same. Replace the PCR value with zero.
I have a problem getting the 1st step coded: I am able to duplicate the column but not able to shift it. The 2nd step is quite straightforward.
Also, I am not sure whether this approach is correct. Any pointers/recommendations would be really helpful.
Note: I am not able to install Pandas on CentOS. So help without this module would be better.
My Code:
with open('input.csv', 'r') as input, open('output.csv', 'w') as output:
    reader = csv.reader(input, delimiter = ';')
    writer = csv.writer(output, delimiter = ';')
    mylist = []
    header = next(reader)
    mylist.append(header)
    for rec in reader:
        mylist.append(rec)
        rec.insert(1, rec[3])
        mylist.append(rec)
    writer.writerows(mylist)
If you're open to non-Python solutions, then awk could be a good option:
awk 'NR==1{$2="PSR;"$2}NR>1{$2=($4==a?0";"$2:+a";"$2);a=$4}1' FS=';' OFS=';' file
No;PSR;Val;Rec;CSR
0;0;10;1;1200
0;1200;100;2;1300
0;0;100;3;1300
0;1300;100;4;1400
0;1400;10;5;1200
0;0;11;6;1200
Awk is distributed with pretty much all Linux distributions and was designed exactly for this kind of task. It will blaze through your file. Add a redirection to the end > output.csv to save the output in a file.
A simple python approach using the same logic:
#!/usr/bin/env python
last = "0"
with open('input.csv') as csv:
    print next(csv).strip().replace(';', ';PSR;', 1)
    for line in csv:
        field = line.strip().split(';')
        if field[3] == last: field.insert(1, "0")
        else: field.insert(1, last)
        last = field[4]
        print ';'.join(field)
Produces the same output:
$ python parse.py
No;PSR;Val;Rec;CSR
0;0;10;1;1200
0;1200;100;2;1300
0;0;100;3;1300
0;1300;100;4;1400
0;1400;10;5;1200
0;0;11;6;1200
Again just redirect the output to save it:
$ python parse.py > output.csv
Just code it as you explained it. Store the previous CSR and refer to it on the next loop through; just be sure to update it.
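import csv
with open('input.csv', 'r') as input, open('output.csv', 'w') as output:
    reader = csv.reader(input, delimiter = ';')
    writer = csv.writer(output, delimiter = ';')
    mylist = []
    header = next(reader)
    header.insert(1, 'PCR')  # add the new column name to the header row
    mylist.append(header)
    prev_csr = None
    for rec in reader:
        csr = rec[3]  # CSR is the last column of the input row
        # '0' for the first row or an unchanged CSR, otherwise the previous CSR
        rec.insert(1, '0' if prev_csr in (None, csr) else prev_csr)
        mylist.append(rec)
        prev_csr = csr
    writer.writerows(mylist)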
with open('input.csv', 'r') as input, open('output.csv', 'w') as output:
    reader = csv.reader(input, delimiter = ';')
    writer = csv.writer(output, delimiter = ';')
    header = next(reader)
    header.insert(1, 'PCR')
    writer.writerow(header)
    prevRow = next(reader)
    prevRow.insert(1, '0')
    writer.writerow(prevRow)
    for row in reader:
        if prevRow[-1] == row[-1]:
            val = '0'
        else:
            val = prevRow[-1]
        row.insert(1,val)
        prevRow = row
        writer.writerow(row)
Or, even easier using the DictReader and DictWriter capabilities of csv:
from csv import DictReader, DictWriter

input_header = ['No','Val','Rec','CSR']
output_header = ['No','PCR','Val','Rec','CSR']
with open('input.csv', 'r', newline='') as in_file, open('output.csv', 'w', newline='') as out_file:
    in_reader = DictReader(in_file, input_header, delimiter=';')
    out_writer = DictWriter(out_file, output_header, delimiter=';')
    next(in_reader) # skip the header
    out_writer.writeheader() # place the output header
    last_csr = None
    for row in in_reader:
        current_csr = row['CSR']
        # 0 for the first row or an unchanged CSR, otherwise the previous CSR
        row['PCR'] = 0 if last_csr in (None, current_csr) else last_csr
        last_csr = current_csr
        out_writer.writerow(row)
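For a huge file it may help to keep the logic in a generator, so only one row is held in memory at a time. The following is a sketch under a few assumptions: add_previous_csr is a made-up name, CSR is assumed to sit at index 3 of each input row, the new column is called PSR as in the question's text, and an in-memory buffer stands in for input.csv.

```python
import csv
import io


def add_previous_csr(rows, csr_index=3, new_name='PSR'):
    """Yield each row with a new second column holding the previous CSR, or '0' if unchanged."""
    header = next(rows, None)
    if header is None:  # empty input: nothing to yield
        return
    yield header[:1] + [new_name] + header[1:]
    previous = None
    for row in rows:
        current = row[csr_index]
        # '0' for the first data row or when CSR did not change
        value = '0' if previous in (None, current) else previous
        yield row[:1] + [value] + row[1:]
        previous = current


# In-memory copy of input.csv from the question
source = io.StringIO(
    "No;Val;Rec;CSR\n0;10;1;1200\n0;100;2;1300\n0;100;3;1300\n"
    "0;100;4;1400\n0;10;5;1200\n0;11;6;1200\n"
)
result = list(add_previous_csr(csv.reader(source, delimiter=';')))
for row in result:
    print(';'.join(row))
```

With real files, the generator can be passed straight to writer.writerows(), which consumes it row by row.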
