Adding multiple empty rows to a csv file - python

I have a CSV file with around 500 lines. I want to insert multiple empty rows after each row, i.e. add 5 empty rows after each row, so I tried this:
import csv
with open("new.csv", 'r') as infile:
    read=csv.reader(infile, delimiter=',')
    with open("output1.csv", 'wt') as output:
        outwriter=csv.writer(output, delimiter=',')
        i = 0
        for row in read:
            outwriter.writerow(row)
            i += 1
            outwriter.writerow([])
This creates 3 empty rows, not 5. I am not sure how to add 5 rows after each row. What am I missing here?
Update:
CSV File sample
No,First,Sec,Thir,Fourth
1,A,B,C,D
2,A,B,C,D
3,A,B,C,D
4,A,B,C,D
5,A,B,C,D
6,A,B,C,D
7,A,B,C,D
8,A,B,C,D
Adding the output csv file for answer code

Your code actually only adds one blank line. Use a loop to add as many as you want:
import csv
with open("new.csv", 'r') as infile:
    read=csv.reader(infile, delimiter=',')
    with open("output1.csv", 'wt') as output:
        outwriter=csv.writer(output, delimiter=',')
        for row in read:
            outwriter.writerow(row)
            for i in range(5):
                outwriter.writerow([])

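The following code builds on the answer given by John Anderson. Adding newline='' to the open() call for the output file gives exactly the number of empty rows specified in range(). Without it, on platforms such as Windows the '\r\n' line endings written by csv.writer are translated again to '\r\r\n', which shows up as extra blank rows: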
import csv
with open("new.csv", 'r') as infile:
    read=csv.reader(infile, delimiter=',')
    with open("output1.csv", 'wt',newline='') as output:
        outwriter=csv.writer(output, delimiter=',')
        for row in read:
            outwriter.writerow(row)
            for i in range(5):
                outwriter.writerow([])
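
As a minimal sketch of the same idea, the copy loop can be wrapped in a function that takes the number of blank rows as a parameter. The function name copy_with_blank_rows and the in-memory io.StringIO objects are assumptions for illustration, not part of the answers above. With real files, open the output with newline='' as in the code above.

```python
import csv
import io


def copy_with_blank_rows(infile, outfile, n_blank=5):
    """Copy CSV rows from infile to outfile, adding n_blank empty rows after each."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    for row in reader:
        writer.writerow(row)
        # writerow([]) emits only the line terminator, i.e. one empty row
        for _ in range(n_blank):
            writer.writerow([])


# Hypothetical in-memory data standing in for new.csv and output1.csv
source = io.StringIO("No,First\n1,A\n2,A\n")
target = io.StringIO()
copy_with_blank_rows(source, target, n_blank=5)
print(repr(target.getvalue()))
```

Passing file objects instead of file names keeps the function easy to try out without touching the disk.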

Related

How to add a header to an existing CSV file without replacing the first row?

What I want to do is actually as it is written in the title.
with open(path, "r+", newline='') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    list_of_column_names = []
    num_cols = len(next(csv_reader))
    for i in range(num_cols):
        list_of_column_names.append(i)
fields = list_of_column_names
with open("example.csv", "r+", newline='') as writeFile:
    csvwriter = csv.DictWriter(writeFile, delimiter=',', lineterminator='\n', fieldnames=fields)
    writeFile.seek(0, 0)
    csvwriter.writeheader()
I want to enumerate the columns, which initially don't have any column names. But when I run the code, it replaces the data in the first row. For example:
example.csv:
a,b
c,d
e,f
what I want:
0,1
a,b
c,d
e,f
what happens after running the code:
0,1
c,d
e,f
Is there a way to prevent this from happening?
There's no magical way to insert a line into an existing text file.
The following is how I think of doing this, and your code is already getting steps 2-4. Also, I wouldn't mess with the DictWriter since you're not trying to convert a Python dict to CSV (I can see you using it for writing the header, but that's easy enough to do with the regular reader/writer):
1. open a new file for writing
2. read the first row of your CSV
3. interpret the column indexes as the header
4. write the header
5. write the first row
6. read/write the rest of the rows
7. move the new file back to the old file, overwrite (not shown)
Here's what that looks like in code:
import csv
with open('output.csv', 'w', newline='') as out_f:
    writer = csv.writer(out_f)
    with open('input.csv', newline='') as in_f:
        reader = csv.reader(in_f)
        # Read the first row
        first_row = next(reader)
        # Count the columns in first row; equivalent to your `for i in range(len(first_row)): ...`
        header = [i for i, _ in enumerate(first_row)]
        # Write header and first row
        writer.writerow(header)
        writer.writerow(first_row)
        # Write rest of rows
        for row in reader:
            writer.writerow(row)
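
The last step (moving the new file over the old one) is not shown above. One possible sketch, assuming a temporary file in the same directory is acceptable, uses tempfile.mkstemp and os.replace from the standard library. The function name prepend_index_header is hypothetical and not part of the original answer.

```python
import csv
import os
import tempfile


def prepend_index_header(path):
    """Rewrite the CSV at `path` with a 0..n-1 header row added on top."""
    directory = os.path.dirname(os.path.abspath(path))
    # Keep the temporary file in the same directory so os.replace
    # does not have to move data across filesystems.
    fd, tmp_path = tempfile.mkstemp(suffix=".csv", dir=directory)
    try:
        with os.fdopen(fd, "w", newline="") as out_f, open(path, newline="") as in_f:
            reader = csv.reader(in_f)
            writer = csv.writer(out_f)
            first_row = next(reader, None)
            if first_row is not None:
                writer.writerow(list(range(len(first_row))))  # header: 0, 1, ...
                writer.writerow(first_row)
                writer.writerows(reader)  # copy the remaining rows
        # Overwrite the original only after the new file is complete
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Calling prepend_index_header("example.csv") on the three-row file from the question should leave it with the 0,1 header followed by the original rows.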

How do you import a txt file and create a two column table splitting on \t?

I am trying to import file.txt and read it as a list with two columns.
The format of the txt file is as follows:
1 1.234567
2 2.345678
Thank you
I can open it as a list, but I couldn't split on \t to get two columns.
o=open('file.txt')
csv_o = csv.reader(o)
for line in csv_o:
    print (line)
o.close()
What I get is
['1\t1.234567']
['2\t2.345678']
and What I want is
['1','1.234567']
['2','2.345678']
Use delimiter="\t". For example:
import csv
with open(filename) as csvfile:
    reader = csv.reader(csvfile, delimiter="\t")
    for row in reader:
        print(row)
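
If the two columns should end up as numbers rather than strings, a small sketch along these lines may help. It assumes the first column is always an integer and the second a float, as in the sample. The function name read_two_column_table and the in-memory sample are hypothetical.

```python
import csv
import io


def read_two_column_table(lines):
    """Parse tab-separated lines into a list of (int, float) tuples."""
    reader = csv.reader(lines, delimiter="\t")
    # `if row` skips blank lines, which csv.reader returns as empty lists
    return [(int(row[0]), float(row[1])) for row in reader if row]


# Hypothetical stand-in for file.txt
sample = io.StringIO("1\t1.234567\n2\t2.345678\n")
table = read_two_column_table(sample)
print(table)  # [(1, 1.234567), (2, 2.345678)]
```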

Need help in finding the row of CSV which contains the values in array

I have an array LiveTick = ['ted3m index','US0003m index','USGG3m index'] and I am reading a CSV file book1.csv. I have to find the row which contains the values in csv.
For example, 15th row will contain ted3m index 500 | 600 and 20th row will contain US0003m index 800 | 900 and likewise.
I then have to get the values contained in the row and parse it for each value contained in array LiveTick. How do I proceed? Below is my sample code:
with open('C:\\blp\\book1.csv', 'r') as f:
    reader = csv.reader(f, delimiter=',')
    writer = csv.writer(outf)
    for row in reader:
        for list in LiveTick:
            if list in row:
                print ('Found: {}'.format(row))
You can use pandas. It's pretty fast and will do all the reading, writing and filtering for you out of the box:
import pandas as pd
df = pd.read_csv('C:\\blp\\book1.csv')
filtered_df = df[df['your_column_name'].isin(LiveTick)]
# now you can save it
filtered_df.to_csv('C:\\blp\\book_filtered.csv')
You have the right idea, but there are a few improvements you can make:
Instead of a nested for loop which doesn't short-circuit, use any to compare the first column to multiple values.
Write to your csv as you go along instead of just print. This is memory-efficient, as you hold in memory only one line at any one time.
Define outf as an open object in your with statement.
Do not shadow built-in list. Use another identifier, e.g. i, for elements in LiveTick.
Here's a demo:
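import csv
LiveTick = ['ted3m index', 'US0003m index', 'USGG3m index']
with open('in.csv', 'r', newline='') as f, open('out.csv', 'w', newline='') as outf:
    reader = csv.reader(f, delimiter=',')
    writer = csv.writer(outf, delimiter=',')
    for row in reader:
        # keep rows whose first column contains any of the LiveTick values
        if row and any(i in row[0] for i in LiveTick):
            writer.writerow(row)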
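
To try the any-based filter without real files, here is a minimal sketch that runs on in-memory data. The generator name matching_rows and the sample rows are assumptions for illustration. The substring test on the first column mirrors the demo above.

```python
import csv
import io

LiveTick = ['ted3m index', 'US0003m index', 'USGG3m index']


def matching_rows(rows, tickers):
    """Yield rows whose first column contains any of the tickers."""
    for row in rows:
        # `row and ...` skips blank lines, which csv.reader returns as []
        if row and any(ticker in row[0] for ticker in tickers):
            yield row


# Hypothetical stand-in for book1.csv
sample = io.StringIO(
    "other index,1,2\n"
    "ted3m index,500,600\n"
    "\n"
    "US0003m index,800,900\n"
)
found = list(matching_rows(csv.reader(sample), LiveTick))
print(found)
```

Because matching_rows is a generator, only one row is held in memory at a time when its output is passed straight to writer.writerows.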

Convert from CSV to array in Python

I have a CSV file containing the following.
0.000264,0.000352,0.000087,0.000549
0.00016,0.000223,0.000011,0.000142
0.008853,0.006519,0.002043,0.009819
0.002076,0.001686,0.000959,0.003107
0.000599,0.000133,0.000113,0.000466
0.002264,0.001927,0.00079,0.003815
0.002761,0.00288,0.001261,0.006851
0.000723,0.000617,0.000794,0.002189
I want to convert the values into an array in Python and keep the same order (row and column). How can I achieve this?
I have tried different functions but ended up with errors.
You should use the csv module:
import csv
results = []
with open("input.csv") as csvfile:
    reader = csv.reader(csvfile, quoting=csv.QUOTE_NONNUMERIC) # change contents to floats
    for row in reader: # each row is a list
        results.append(row)
This gives:
[[0.000264, 0.000352, 8.7e-05, 0.000549],
[0.00016, 0.000223, 1.1e-05, 0.000142],
[0.008853, 0.006519, 0.002043, 0.009819],
[0.002076, 0.001686, 0.000959, 0.003107],
[0.000599, 0.000133, 0.000113, 0.000466],
[0.002264, 0.001927, 0.00079, 0.003815],
[0.002761, 0.00288, 0.001261, 0.006851],
[0.000723, 0.000617, 0.000794, 0.002189]]
If your file doesn't contain parentheses
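with open('input.csv') as f:
    # one inner list per line keeps the row/column layout;
    # strip() also copes with a last line that has no trailing newline
    output = [[float(s) for s in line.strip().split(',')] for line in f if line.strip()]
print(output)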
The csv module was created to do just this. The following use of the module is adapted from the Python docs.
import csv
data = []
with open('file.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in reader:
        # add data to list or other data structure
        data.append(row)
The delimiter is the character that separates data entries, and the quotechar is the character used to quote fields that contain special characters such as the delimiter.

Replace column in csv with modified column

I got a csv file with a couple of columns and a header containing 4 rows. The first column contains the timestamp. Unfortunately it also gives milliseconds, but whenever those are at 00, they are not given in the file. It looks like that:
"TOA5","CR1000","CR1000","E9048"
"TIMESTAMP","RECORD","BattV_Avg","PTemp_C_Avg"
"TS","RN","Volts","Deg C"
"","","Avg","Avg"
"2015-08-28 12:40:23.51",1,12.91,32.13
"2015-08-28 12:50:43.23",2,12.9,32.34
"2015-08-28 13:12:22",3,12.91,32.54
As I don't need the milliseconds, I want to get rid of those, as this makes further calculations containing time a bit complicated. My approach so far:
Extract the 19 characters after the opening quote in each line to get a format such as 2015-08-28 12:40:23
timestamp = []
with open(filepath) as f:
    for _ in range(4): #skip 4 header rows
        next(f)
    for line in f:
        time = line[1:20] #Get values for the current line
        timestamp.append(time) #Add values to list
From here on I'm struggling with how to proceed. I want to replace the first column in the csv file with the newly created timestamp list.
I tried creating a dictionary, but I don't know how to use the header caption in row 2 as the key:
d = {}
with open(filepath, newline='') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    for col in csv_reader:
        #use header info from row 2 as key here
        pass
This would import the whole csv file into a dict and I'd then change the TIMESTAMP entry in the dict with the timestamp list above. Is this even possible?
Or is there an easier approach on how to just change the first column in the csv with my new list so that my csv file in the end contains the timestamp just without the millisecond information?
So the first column in my csv should look like this:
"TOA5"
"TIMESTAMP"
"TS"
""
2015-08-28 12:40:23
2015-08-28 12:50:43
2015-08-28 13:12:22
This should do it and preserve the quoting:
import csv
with open(filepath1, newline='') as fin, open(filepath2, 'w', newline='') as fout:
    reader = csv.reader(fin)
    writer = csv.writer(fout, quoting=csv.QUOTE_NONNUMERIC)
    for _ in range(4): # copy first 4 header rows
        writer.writerow(next(reader))
    for row in reader: # process data lines
        row[0] = row[0][:19] # strip fractional seconds from first column
        writer.writerow([row[0], int(row[1])] + list(map(float, row[2:])))
Since a csv.reader returns the columns of each row as a list of strings, it's necessary to convert any which contain numeric values into their actual int or float numeric value before they're written out to prevent them from being quoted.
I believe you can easily create a new csv by iterating over the original csv and replacing the timestamp as you want.
Example -
import csv
with open(filepath, newline='') as csv_file, open('<new file>', 'w', newline='') as outfile:
    csv_reader = csv.reader(csv_file, delimiter=',')
    csv_writer = csv.writer(outfile, delimiter=',')
    for i, row in enumerate(csv_reader): #Enumerating as we only need to change rows after 3rd index.
        if i <= 3:
            csv_writer.writerow(row)
        else:
            # csv.reader has already removed the surrounding quotes, so slice from 0
            csv_writer.writerow([row[0][:19]] + row[1:])
I'm not entirely sure how you parse your csv, but I would do something like this:
time = time.split(".")[0]
If the timestamp has a fractional part it gets removed, and if it doesn't, nothing happens.
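
As a rough sketch of how the split approach could be applied to whole rows, assuming the data rows have already been separated from the four header rows: the helper name strip_fraction and the in-memory sample are hypothetical and only illustrate the idea.

```python
import csv
import io


def strip_fraction(timestamp):
    """Drop everything from the first '.' onward; other strings are unchanged."""
    return timestamp.split(".")[0]


# Hypothetical data rows in the format from the question (header rows omitted)
sample = io.StringIO(
    '"2015-08-28 12:40:23.51",1,12.91,32.13\n'
    '"2015-08-28 13:12:22",3,12.91,32.54\n'
)
cleaned = [[strip_fraction(row[0])] + row[1:] for row in csv.reader(sample)]
print(cleaned)
```

Unlike slicing to a fixed width, this does not depend on the timestamp length, but it does assume the only '.' in the first column is the one before the fractional seconds.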
