How to count number of columns in each row? - python

Each row has a different number of columns, but column A is always the file name and the remaining columns are fields of that file.
Is there any way I could count the number of columns for each row?
import csv
file=('C:/)
with open('C:/Count.csv','w',encoding='cp949',newline='') as testfile:
    csv_writer=csv.writer(testfile)
    for line in file:
        lst=[len(line)]
        csv_writer.writerow(lst)

You can either choose to split on commas or open the file with csv.
I'd recommend the latter. Here's how you can do that:
import csv
file1 = ... # file to read
file2 = ... # file to write
with open(file1, 'r', newline='') as f1, open(file2, 'w', encoding='cp949', newline='') as f2:
    csv_reader = csv.reader(f1)
    csv_writer = csv.writer(f2)
    for row in csv_reader:
        csv_writer.writerow([len([x for x in row if x])]) # counts non-empty cells only
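Open both files simultaneously, iterate over the rows of the file being read, count the non-empty cells in each row, and write that count out. Use len(row) instead if empty cells should be counted too, and subtract 1 if the file name in column A should not be included.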
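If both the total and the non-empty count are useful, they can be computed side by side. This is only a minimal sketch, assuming the first cell of each row is the file name as described in the question; the helper name count_fields and the sample data are made up for illustration.

```python
import csv
import io


def count_fields(lines):
    """Yield (file name, total fields, non-empty fields) for each CSV row.

    Hypothetical helper: assumes the first cell is the file name and
    every remaining cell is a field of that file.
    """
    for row in csv.reader(lines):
        if not row:          # skip completely blank lines
            continue
        fields = row[1:]     # everything after column A
        yield row[0], len(fields), sum(1 for cell in fields if cell)


# Made-up sample standing in for the real file.
sample = io.StringIO("a.txt,x,y,z\nb.txt,x,,\nc.txt\n")
counts = list(count_fields(sample))
for name, total, non_empty in counts:
    print(name, total, non_empty)
```

For a real file, pass the object returned by open(path, newline='') instead of the in-memory sample.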

Related

Beginner deleting columns from CSV (no pandas)

I've just started coding and I'm trying to remove certain columns from a CSV for a project. We aren't supposed to use pandas. For instance, one of the fields I have to delete is called DwTm, but there are about 15 columns I have to get rid of; I only want the first few. Here's what I've got so far:
import csv
FTemp = "D:/tempfile.csv"
FOut = "D:/NewFile.csv"
with open(FTemp, 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    with open(FOut, 'w') as new_file:
        fieldnames = ['Stn_Name', 'Lat', 'Long', 'Prov', 'Tm']
        csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames)
        for line in csv_reader:
            del line['DwTm']
            csv_writer.writerow(line)
When I run this, I get the error
del line['DwTm']
TypeError: list indices must be integers or slices, not str
This is the only method I've found to almost work without using pandas. Any ideas?
The easiest way around this is to use a DictReader to read the file. Like DictWriter, which you are using to write the file, DictReader uses dictionaries for rows, so your approach of deleting keys from the old row then writing to the new file will work as you expect.
import csv
FTemp = "D:/tempfile.csv"
FOut = "D:/NewFile.csv"
with open(FTemp, 'r', newline='') as csv_file:
    # Adjust the list so it matches the column order in the file
    old_fieldnames = ['Stn_Name', 'Lat', 'Long', 'Prov', 'Tm', 'DwTm']
    csv_reader = csv.DictReader(csv_file, fieldnames=old_fieldnames)
    with open(FOut, 'w', newline='') as new_file:
        fieldnames = ['Stn_Name', 'Lat', 'Long', 'Prov', 'Tm']
        csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames)
        for line in csv_reader:
            del line['DwTm']
            csv_writer.writerow(line)
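If the CSV file starts with a header row, DictReader can take the field names from it, and DictWriter can be told to drop every column that is not listed, which avoids one del per unwanted column. This is a sketch under that assumption; the sample data and the in-memory buffers are invented for illustration, and only the column names come from the question.

```python
import csv
import io

# Columns to keep, taken from the question.
KEEP = ['Stn_Name', 'Lat', 'Long', 'Prov', 'Tm']

# Invented sample; the real file would be opened with open(FTemp, newline='').
src = io.StringIO(
    "Stn_Name,Lat,Long,Prov,Tm,DwTm\n"
    "Alpha,49.1,-123.0,BC,5.2,31\n"
)
dst = io.StringIO()

reader = csv.DictReader(src)  # no fieldnames given: the first row is the header
# extrasaction='ignore' makes the writer skip keys that are not in KEEP
writer = csv.DictWriter(dst, fieldnames=KEEP, extrasaction='ignore')
writer.writeheader()
for row in reader:
    writer.writerow(row)

result = dst.getvalue()
print(result)
```

Without extrasaction='ignore', DictWriter raises a ValueError when a row contains keys that are not in fieldnames.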
Below is an example that removes columns by index with csv.reader:
import csv
# We only want to read the 'department' field
# We are not interested in 'name' and 'birthday month'
# Make sure the list items are in descending order, so that deleting
# one column does not shift the indices still to be deleted
NON_INTERESTING_FIELDS_IDX = [2,0]
rows = []
with open('example.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    for row in csv_reader:
        for idx in NON_INTERESTING_FIELDS_IDX:
            del row[idx]
        rows.append(','.join(row))
with open('example_out.csv','w') as out:
    for row in rows:
        out.write(row + '\n')
example.csv
name,department,birthday month
John Smith,Accounting,November
Erica Meyers,IT,March
example_out.csv
department
Accounting
IT
It's possible to simultaneously open the file to read from and the file to write to. Let's say you know the indices of the columns you want to keep, say, 0,2, and 4:
good_cols = (0,2,4)
with open(FTemp, 'r') as fin, open(FOut, 'w') as fout:
    for line in fin:
        line = line.rstrip() #clean up newlines
        temp = line.split(',') #make a list from the line
        data = [temp[x] for x in range(len(temp)) if x in good_cols]
        fout.write(','.join(data) + '\n')
The list comprehension (data) pulls only the columns you want to keep out of each row. Each row is then written to your new file immediately, line by line, using the join method plus a newline at the end of each row.
If you only know the names of the fields you want to keep or remove, it's a bit more involved: you have to extract the indices from the first line of the csv file. It's not much more difficult, though.
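Looking up the indices from the header line could be sketched as follows. It assumes the first line of the file holds the column names; the sample data is invented and the in-memory buffers stand in for the real files.

```python
import csv
import io

# Names of the columns to keep (taken from the question).
keep_names = ['Stn_Name', 'Lat', 'Long']

# Invented sample data; a real run would use open(FTemp, newline='').
fin = io.StringIO(
    "Stn_Name,Lat,Long,Prov,DwTm\n"
    "Alpha,49.1,-123.0,BC,31\n"
)
fout = io.StringIO()

reader = csv.reader(fin)
writer = csv.writer(fout)

header = next(reader)  # first line holds the column names
# Translate each wanted name into its position in the header.
good_cols = [header.index(name) for name in keep_names]

writer.writerow([header[i] for i in good_cols])
for row in reader:
    writer.writerow([row[i] for i in good_cols])

kept = fout.getvalue().splitlines()
print(kept)
```

header.index raises a ValueError if a requested name is missing from the header, which is usually a helpful early failure.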

Adding multiple empty rows to a csv file

I have a csv file with around 500 lines. I want to insert multiple empty rows after each row, i.e. add 5 empty rows after each row, so I tried this:
import csv
with open("new.csv", 'r') as infile:
    read=csv.reader(infile, delimiter=',')
    with open("output1.csv", 'wt') as output:
        outwriter=csv.writer(output, delimiter=',')
        i = 0
        for row in read:
            outwriter.writerow(row)
            i += 1
            outwriter.writerow([])
This creates 3 empty rows but not 5. I am not sure how to add 5 rows after each row. What am I missing here?
Update:
CSV File sample
No,First,Sec,Thir,Fourth
1,A,B,C,D
2,A,B,C,D
3,A,B,C,D
4,A,B,C,D
5,A,B,C,D
6,A,B,C,D
7,A,B,C,D
8,A,B,C,D
Adding the output csv file for answer code
Your code actually only adds one blank line. Use a loop to add as many as you want:
import csv
with open("new.csv", 'r') as infile:
    read=csv.reader(infile, delimiter=',')
    with open("output1.csv", 'wt') as output:
        outwriter=csv.writer(output, delimiter=',')
        for row in read:
            outwriter.writerow(row)
            for i in range(5):
                outwriter.writerow([])
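The following code builds on the answer given by John Anderson. Adding newline='' to the open call for the output file gives exactly the number of empty rows specified in range. Without it, on Windows the '\r\n' line endings written by the csv module are translated to '\r\r\n', which shows up as extra blank rows: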
import csv
with open("new.csv", 'r') as infile:
    read=csv.reader(infile, delimiter=',')
    with open("output1.csv", 'wt',newline='') as output:
        outwriter=csv.writer(output, delimiter=',')
        for row in read:
            outwriter.writerow(row)
            for i in range(5):
                outwriter.writerow([])
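To see why newline='' matters, it can help to look at the raw characters the csv writer produces. This small sketch writes to an in-memory buffer (an illustration choice, not part of the answers above), which performs no newline translation.

```python
import csv
import io

# io.StringIO() does not translate newlines on write, so it shows
# exactly what the csv writer emits.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(['1', 'A'])
writer.writerow([])  # an empty row is just the line terminator

raw = buf.getvalue()
print(repr(raw))
```

The writer already ends every row with '\r\n'. A file opened in text mode without newline='' on Windows translates the '\n' again, which is where the unexpected extra blank rows come from.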

I have a CSV file and I want to extract each row of the CSV file into a different CSV file. How can I do that?

I have a CSV file and I want to extract each row of the CSV file into a different CSV file. How can I do that?
Like this. Each row is saved to its own file, numbered by the row's index:
import csv
with open('file.csv', 'r', newline='') as csv_file:
    rows = csv.reader(csv_file, skipinitialspace=True)
    for i, row in enumerate(rows):
        # newline='' prevents extra blank lines in the output on Windows
        with open('file_{}.csv'.format(i), 'w', newline='') as write_file:
            writer = csv.writer(write_file)
            writer.writerow(row)

Delete a line of a .csv when a certain column has no value (Python)

I have a .csv file and I am trying to erase some rows/lines that have no usable information. I want to delete lines that do not have a value in a certain column. I am kind of new to programming and I could not find a way to do this. Is this possible?
I tried to delete a line if it did not have a certain number in it, but that did not work either.
f = open('C:myfile.csv', 'rb')
lines = f.readlines()
f.close()
filename = 'myfile.csv'
f = open(filename, 'wb')
for line in lines:
    if line != "1":
        f.write(line)
f.close()
here are some sample rows:
0,593 0,250984 -20,523384 -25,406271
0,594 0,250984
0,595 0,250984
0,596 0,250984
0,597 0,250984 -15,793088 -21,286336
0,598 0,250984
0,599 0,908811
0,6 0,893612
0,601 0,784814 -12,130922 -11,825742
0,602 0,909238
0,603 0,25309
0,604 0,38435
0,605 0,602954 -8,316167 -3,43328
0,606 0,642628
0,607 0,39201
0,608 0,384289
0,609 0,251656 -11,825742 -5,874723
So I want to delete the rows when there is no number in the third and fourth column.
You can use Python's csv library to help you do this. Your data appears to be tab delimited, as such the following script should work:
import csv
# Python 3: the csv module needs text-mode files opened with newline=''
with open('input.csv', newline='') as f_input, open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output, delimiter = '\t')
    for row in csv.reader(f_input, delimiter = '\t'):
        if len(row[2]) and len(row[3]):
            csv_output.writerow(row)
Giving you an output.csv file containing:
0,593 0,250984 -20,523384 -25,406271
0,597 0,250984 -15,793088 -21,286336
0,601 0,784814 -12,130922 -11,825742
0,605 0,602954 -8,316167 -3,43328
0,609 0,251656 -11,825742 -5,874723
Note that each of your rows appears to have 4 columns (your data has tabs for the missing entries). Because of this, it is not enough to simply test that the length is 4; you need to test the contents of the two cells.
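The difference between testing the row length and testing the cell contents can be checked on a small in-memory sample. The two rows below are invented in the style of the question's data, assuming the incomplete row keeps its tabs for the missing cells.

```python
import csv
import io

# Two invented rows; the second has tabs for the missing
# third and fourth cells.
data = io.StringIO(
    "0,593\t0,250984\t-20,523384\t-25,406271\n"
    "0,594\t0,250984\t\t\n"
)
rows = list(csv.reader(data, delimiter='\t'))

# Both rows have four cells, so a length test cannot tell them apart.
lengths = [len(row) for row in rows]
# Testing the contents keeps only the row with values in both cells.
complete = [row for row in rows if len(row) >= 4 and row[2] and row[3]]
print(lengths)
print(complete)
```

The extra len(row) >= 4 guard is an addition for safety: it avoids an IndexError if some rows turn out to have no trailing tabs after all.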
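import csv
fn_in = 'test.csv'
fn_out = 'outfile.csv'
# Keeps only rows with exactly 4 fields. This works only if incomplete
# rows really have fewer fields (no tabs left for the missing entries).
with open(fn_in, 'r', newline='') as inp, open(fn_out, 'w', newline='') as out:
    writer = csv.writer(out, delimiter='\t')
    for row in csv.reader(inp, delimiter='\t'):
        if len(row)==4:
            writer.writerow(row)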

reading and parsing a TSV file, then manipulating it for saving as CSV (*efficiently*)

My source data is in a TSV file, 6 columns and greater than 2 million rows.
Here's what I'm trying to accomplish:
1. I need to read the data in 3 of the columns (3, 4, 5) in this source file.
2. The fifth column is an integer. I need to repeat each row's third- and fourth-column data that many times.
3. I want to write the output of #2 to an output file in CSV format.
Below is what I came up with.
My question: is this an efficient way to do it? It seems like it might be intensive when attempted on 2 million rows.
First, I made a sample tab-separated file to work with and called it 'sample.txt'. It's basic and only has four rows:
Row1_Column1 Row1-Column2 Row1-Column3 Row1-Column4 2 Row1-Column6
Row2_Column1 Row2-Column2 Row2-Column3 Row2-Column4 3 Row2-Column6
Row3_Column1 Row3-Column2 Row3-Column3 Row3-Column4 1 Row3-Column6
Row4_Column1 Row4-Column2 Row4-Column3 Row4-Column4 2 Row4-Column6
then I have this code:
import csv
with open('sample.txt','r') as tsv:
    AoA = [line.strip().split('\t') for line in tsv]
for a in AoA:
    count = int(a[4])
    while count > 0:
        with open('sample_new.csv', 'a', newline='') as csvfile:
            csvwriter = csv.writer(csvfile, delimiter=',')
            csvwriter.writerow([a[2], a[3]])
        count = count - 1
You should use the csv module to read the tab-separated value file. Do not read it into memory in one go. Each row you read has all the information you need to write rows to the output CSV file, after all. Keep the output file open throughout.
import csv
with open('sample.txt', newline='') as tsvin, open('new.csv', 'w', newline='') as csvout:
    tsvin = csv.reader(tsvin, delimiter='\t')
    csvout = csv.writer(csvout)
    for row in tsvin:
        count = int(row[4])
        if count > 0:
            csvout.writerows([row[2:4] for _ in range(count)])
or, using the itertools module to do the repeating with itertools.repeat():
from itertools import repeat
import csv
with open('sample.txt', newline='') as tsvin, open('new.csv', 'w', newline='') as csvout:
    tsvin = csv.reader(tsvin, delimiter='\t')
    csvout = csv.writer(csvout)
    for row in tsvin:
        count = int(row[4])
        if count > 0:
            csvout.writerows(repeat(row[2:4], count))
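The repeat-based approach can be tried on the first two sample rows from the question without touching the disk. The in-memory buffers below are an illustration choice; the real script would keep using open() as shown above.

```python
import csv
import io
from itertools import repeat

# First two rows of the question's sample.txt, tab-separated.
tsv = io.StringIO(
    "Row1_Column1\tRow1-Column2\tRow1-Column3\tRow1-Column4\t2\tRow1-Column6\n"
    "Row2_Column1\tRow2-Column2\tRow2-Column3\tRow2-Column4\t3\tRow2-Column6\n"
)
out = io.StringIO()
writer = csv.writer(out)

for row in csv.reader(tsv, delimiter='\t'):
    # repeat() yields nothing for a count of zero or less,
    # so no separate "count > 0" check is needed here.
    writer.writerows(repeat(row[2:4], int(row[4])))

lines = out.getvalue().splitlines()
for line in lines:
    print(line)
```

Only one input row is held in memory at a time, which is what keeps this pattern practical for a file with millions of rows.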
