Re-arranging the format of csv in Python


I tried what I could to solve this, but the movie titles just won't move up. The problem is in the second block inside the for loop. This is the function I wrote:
def writeFile(filename, movie_titles):
    with open(filename, 'w') as f:
        headers = "No., Title\n"
        f.write(headers)
        i = 0
        for title in movie_titles:
            while i < len(movie_titles[0:]): i = i + 1; f.write(str(i) + '\n')
            f.write(', '+ "%s\n" % title.replace(',', '') + '\n')
        f.close()

Another answer has a more straightforward and pythonic method, but for your specific code, this would solve it:
def writeFile(filename, movie_titles):
    with open(filename, 'w') as f:
        headers = "No., Title\n"
        f.write(headers)
        i = 0
        for title in movie_titles:
            i = i + 1
            # Number and title go out in a single write, with one newline per row.
            f.write(str(i) + ', ' + "%s\n" % title.replace(',', ''))
Note that the final f.close() is not needed. The with statement takes care of that.

You can use enumerate() in for loop to get index. For example:
def writeFile(filename, movie_titles):
    with open(filename, 'w') as f:
        f.write("No., Title\n")
        for i, title in enumerate(movie_titles, 1):
            f.write('{},{}\n'.format(i, title.replace(',', '')))
Note: To create a CSV file, have a look at the csv module.
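As a rough sketch of the csv module route mentioned above: csv.writer quotes any field that contains a comma, so the titles would not need their commas stripped. The function name write_titles_csv and the sample titles are made up for illustration.

```python
import csv
import io


def write_titles_csv(fileobj, movie_titles):
    """Write a numbered list of titles to an open text file object as CSV."""
    writer = csv.writer(fileobj)
    writer.writerow(["No.", "Title"])
    # enumerate(..., 1) numbers the rows starting at 1
    for number, title in enumerate(movie_titles, 1):
        writer.writerow([number, title])


# Demonstration with an in-memory buffer (hypothetical sample data).
# A real file should be opened with open(filename, 'w', newline='').
buffer = io.StringIO()
write_titles_csv(buffer, ["Alien", "Monsters, Inc."])
csv_text = buffer.getvalue()
print(csv_text)
```

The title containing a comma ends up wrapped in double quotes, which spreadsheet programs and csv.reader read back as a single field.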

You got your loops mixed up a bit. Your code enters the for loop to iterate over all the movies, but during the first iteration it runs the whole while loop, which writes every number; only after that has finished does the for loop continue with the titles.
I would suggest something like this:
def writeFile(filename, movie_titles):
    with open(filename, 'w') as f:
        headers = "No., Title\n"
        f.write(headers)
        for i in range(len(movie_titles)):
            f.write(str(i+1) + ',')
            f.write("%s\n" % movie_titles[i].replace(',', ''))
The for loop iterates over all numbers from 0 to the length of the movie list minus 1.
First the number is written; 1 is added so that your numbering starts at 1.
After that the movie title is written. I assumed movie_titles is a list, so you can index it with movie_titles[i]; the highest value of i corresponds to the last element of the list.
You also had too many newlines: you only need one newline per line.
One could probably also write the numbers and the movie names separately, but then you would need to specify which row of the file you are writing to.

Related

Fix compared data and return line number and data

I have written a program to compare the price of vehicle A (v_priceA) with the other vehicle prices in a text file, carprices.txt, which has one price per line.
The result should be a new text file, highprices.txt, containing every price greater than the price of vehicle A, one per line, together with its line number in carprices.txt.
My problem is that I can only generate two separate text files, one with the line numbers of the greater prices and another with the greater prices themselves, instead of one file with each greater price and its line number together. I need to fix this.
Vehicle A price: 2500.50
v_priceA = 2500.50
a_file = 'carprices.txt'
with open(a_file, 'r') as document:
    values = [x for x, value in enumerate(document) if float(value) > v_priceA]
new_file = open('highpriceposition.txt', 'w')
for x in values:
    new_file.write(str(x) + '\n')
new_file.close()
a_file = 'carprices.txt'
with open(a_file, 'r') as document:
    values = [value for value in document if float(value) > v_priceA]
with open('highprice.txt', 'w') as f:
    for x in values:
        f.write(str(x)+'\n')
positionprice.txt
2 2900.00
3 3500.50
5 25000.30
6 45000.50

When you write to the new file with new_file.write(), you need to pass it both the line number and the price. For example:
v_priceA = 2500.50
a_file = 'carprices.txt'
output_file = 'highprices.txt'
with open(a_file, 'r') as document:
    with open(output_file, 'w') as new_file:
        for line, price in enumerate(document):
            if float(price) > v_priceA:
                # See how I pass both in here? price still ends with the
                # newline it had in the input file.
                new_file.write(str(line) + " " + str(price))
It's important to know that whenever you open() a file in Python in write mode ("w"), whatever is in that file is erased before anything is written to it. (There's an append mode, "a", if you're interested.)
See the documentation for open().
Notice how I open the output file only once in the above code? That should help.
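A small sketch of the difference between the two modes, run in a temporary directory so no real file is touched; the file name demo.txt is made up for illustration.

```python
import os
import tempfile

# Work in a temporary directory that is removed afterwards.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")

    with open(path, "w") as f:   # creates the file
        f.write("first\n")
    with open(path, "w") as f:   # "w" truncates: "first" is gone
        f.write("second\n")
    with open(path, "a") as f:   # "a" appends to what is already there
        f.write("third\n")

    with open(path) as f:
        contents = f.read()

print(repr(contents))
```

This is why opening the output file inside a loop with "w" leaves only the last write in the file.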
Now on to how enumerate works. It takes an iterable object and, for each item in that iterable, returns a tuple of (itemIndex, item). With one very important exception, it is basically the succinct equivalent of:
def myEnumerate(iterableParameter):
    i = 0
    outPutList = []
    while i < len(iterableParameter):
        outPutList.append((i, iterableParameter[i]))
        i += 1
    return outPutList
The important exception is that enumerate creates a lazy iterator that produces the pairs one at a time, whereas the function above builds the whole list up front.
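To make the lazy behaviour concrete, here is a short sketch with made-up prices. It also uses the optional start argument of enumerate, which may be handy if the line numbers should begin at 1 rather than 0.

```python
prices = ["2900.00", "1200.00", "3500.50"]  # hypothetical sample data

pairs = enumerate(prices, start=1)   # nothing is computed yet
first = next(pairs)                  # items are produced one at a time
rest = list(pairs)                   # consumes whatever is left

print(first)
print(rest)
```

Once an enumerate object has been consumed it is empty, so iterate over it only once or call enumerate again.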

Changing the contents of a text file and making a new file with same format

I have a big text file with a lot of parts. Every part has 4 lines and next part starts immediately after the last part.
The first line of each part starts with #, the 2nd line is a sequence of characters, the 3rd line is a + and the 4th line is again a sequence of characters.
Small example:
#M00872:462:000000000-D47VR:1:1101:15294:1338 1:N:0:ACATCG
TGCTCGGTGTATGTAAACTTCCGACTTCAACTGTATAGGGATCCAATTTTGACAAAATATTAACGCTTATCGATAAAATTTTGAATTTTGTAACTTGTTTTTGTAATTCTTTAGTTTGTATGTCTGTTGCTATTATGTCTACTATTCTTTCCCCTGCACTGTACCCCCCAATCCCCCCTTTTCTTTTAAAAGTTAACCGATACCGTCGAGATCCGTTCACTAATCGAACGGATCTGTCTCTGTCTCTCTC
+
BAABBADBBBFFGGGGGGGGGGGGGGGHHGHHGH55FB3A3GGH3ADG5FAAFEGHHFFEFHD5AEG1EF511F1?GFH3#BFADGD55F?#GFHFGGFCGG/GHGHHHHHHHDBG4E?FB?BGHHHHHHHHHHHHHHHHHFHHHHHHHHHGHGHGHHHHHFHHHHHGGGGHHHHGGGGHHHHHHHGHGHHHHHHFGHCFGGGHGGGGGGGGFGGEGBFGGGGGGGGGFGGGGFFB9/BFFFFFFFFFF/
I want to change the 2nd and the 4th line of each part and make a new file with similar structure (4 lines for each part). In fact I want to keep the 1st 65 characters (in lines 2 and 4) and remove the rest of characters. The expected output for the small example would look like this:
#M00872:462:000000000-D47VR:1:1101:15294:1338 1:N:0:ACATCG
TGCTCGGTGTATGTAAACTTCCGACTTCAACTGTATAGGGATCCAATTTTGACAAAATATTAACG
+
BAABBADBBBFFGGGGGGGGGGGGGGGHHGHHGH55FB3A3GGH3ADG5FAAFEGHHFFEFHD5A
I wrote the following code:
infile = open("file.fastq", "r")
new_line=[]
for line_number in len(infile.readlines()):
    if line_number ==2 or line_number ==4:
        new_line.append(infile[line_number])
with open('out_file.fastq', 'w') as f:
    for item in new_line:
        f.write("%s\n" % item)
but it does not return what I want. How to fix it to get the expected output?

This code will achieve what you want:
from itertools import islice
# Open the output once, in 'w' mode, so re-running does not append duplicates.
with open('bio.txt', 'r') as infile, open('mod_bio.txt', 'w') as f:
    while True:
        lines_gen = list(islice(infile, 4))  # next 4 lines, [] at end of file
        if not lines_gen:
            break
        a, b, c, d = lines_gen
        b = b.rstrip('\n')[:65] + '\n'
        d = d.rstrip('\n')[:65] + '\n'
        f.write(a + b + c + d)
How does it work?
islice(infile, 4) takes the next four lines from the file each time round the loop, matching the 4-line parts you describe.
We then unpack those four lines into a, b, c and d and slice b and d. Finally we join the four strings and write them to the new file.

I think some itertools.cycle could be nice here:
import itertools
with open("transformed.file.fastq", "w+") as output_file:
    with open("file.fastq", "r") as input_file:
        for i in itertools.cycle((1,2,3,4)):
            line = input_file.readline().strip()
            if not line:
                break
            if i in (2,4):
                line = line[:65]
            output_file.write("{}\n".format(line))

readlines() returns a list with each line of your file. You don't need to build a separate new_line list: iterate directly over the index-value pairs of that list and modify the values at the positions you want. Because every part has 4 lines, the 2nd and 4th lines of each part are the ones whose index modulo 4 is 1 or 3.
By modifying your code, try this:
with open("file.fastq", "r") as infile:
    new_lines = infile.readlines()
for i, t in enumerate(new_lines):
    if i % 4 in (1, 3):
        # Slice without the newline, then put it back.
        new_lines[i] = t.rstrip('\n')[:65] + '\n'
with open('out_file.fastq', 'w') as f:
    for item in new_lines:
        f.write("%s" % item)
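The same idea can be sketched as a function that works on any pair of file objects, which makes it easy to try on a tiny in-memory sample first. The name trim_fastq and the keep parameter are assumptions for illustration, and the demonstration keeps 4 characters instead of 65.

```python
import io


def trim_fastq(infile, outfile, keep=65):
    """Copy 4-line records, truncating lines 2 and 4 to `keep` characters."""
    # zip over the same iterator four times yields one record per step;
    # an incomplete trailing record (fewer than 4 lines) is silently dropped.
    for header, seq, plus, qual in zip(*[iter(infile)] * 4):
        outfile.write(header.rstrip("\n") + "\n")
        outfile.write(seq.rstrip("\n")[:keep] + "\n")
        outfile.write(plus.rstrip("\n") + "\n")
        outfile.write(qual.rstrip("\n")[:keep] + "\n")


# Small in-memory demonstration with made-up records.
source = io.StringIO(
    "#read1\nACGTACGT\n+\nFFFFGGGG\n#read2\nTTTTCCCC\n+\nHHHHIIII\n"
)
result = io.StringIO()
trim_fastq(source, result, keep=4)
trimmed = result.getvalue()
print(trimmed)
```

For the real data, the two StringIO objects would be replaced by files opened with open("file.fastq") and open("out_file.fastq", "w"). Since lines are read one record at a time, the whole file never has to sit in memory.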

Python write text to CSV with two columns

Hi I have a small problem here.
I have a text file with numbers which looks like this
2.131583
2.058964
6.866568
0.996470
6.424396
0.996004
6.421990
And with
fList = [s.strip() for s in open('out.txt').readlines()]
outStr = ''
for i in fList:
    outStr += (i+',')
f = open('text_to_csv.csv', 'w')
f.write(outStr.strip())
f.close()
I am able to generate a CSV and all the data is stored in it, but all in one row.
I would like to have them in two columns.
Is there any easy addition that would make the CSV look like this?
2.131583 2.058964
6.866568 0.996470
6.424396 0.996004

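A better way would be to use the csv module. You can write it like this:
import csv
fList = [s.strip() for s in open('out.txt').readlines()]  # as in the question
# Python 3: text mode with newline=''. On Python 2, open with 'wb' instead.
with open('text_to_csv.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
    for i in range(0, len(fList), 2):
        writer.writerow(fList[i:i+2])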

fList = [s.strip() for s in open('out.txt').readlines()]
outStr = ''
count = 0
for i in fList:
    count += 1
    if count % 2 == 0:  # You can replace 2 with whatever number of columns you need
        outStr += i + '\n'  # last value of the row: end the line
    else:
        outStr += i + ','
f = open('text_to_csv.csv', 'w')
f.write(outStr.rstrip(',\n') + '\n')  # drop a trailing comma left by an incomplete last row
f.close()

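Something like this:
with open('out.txt', 'r') as fList, open('text_to_csv.csv', 'w') as f:
    for i, line in enumerate(fList):
        f.write(line.strip())  # drop the newline that came with the line
        # a tab after the first value of a row, a newline after the second
        f.write('\t' if i % 2 == 0 else '\n')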

If you're not interested in storing the entries from the original file in a new list but just want the output file, you can also do something like this:
fList = [s.strip() for s in open('out.txt').readlines()]
f = open('text_to_csv.csv', 'w')
for i in range(0,len(fList)-1,2):
    f.write(fList[i] + "," + fList[i+1] + "\n")
f.close()

If you have a list (from reading the file) in memory, just reformat the list into what you want:
text='''\
2.131583
2.058964
6.866568
0.996470
6.424396
0.996004
6.421990'''
cols=2
data=text.split() # proxy for a file
print(data)
print('===')
for li in [data[i:i+cols] for i in range(0,len(data),cols)]:
    print(li)
Prints:
['2.131583', '2.058964', '6.866568', '0.996470', '6.424396', '0.996004', '6.421990']
===
['2.131583', '2.058964']
['6.866568', '0.996470']
['6.424396', '0.996004']
['6.421990']
Or, use an N-at-a-time file reading idiom (zip_longest is the Python 3 name; Python 2 calls it izip_longest):
import itertools
cols=2
with open('/tmp/nums.txt') as fin:
    for li in itertools.zip_longest(*[fin]*cols):
        print(li)
# prints
('2.131583\n', '2.058964\n')
('6.866568\n', '0.996470\n')
('6.424396\n', '0.996004\n')
('6.421990', None)
You can combine this into one iterator in, one iterator out if you want a kind of file filter (zip_longest is izip_longest in Python 2):
import itertools
cols=2
with open('/tmp/nums.txt') as fin, open('/tmp/nout.txt','w') as fout:
    for li in itertools.zip_longest(*[fin]*cols):
        fout.write('\t'.join(e.strip() for e in li if e)+'\n')
The output file will now be:
2.131583 2.058964
6.866568 0.996470
6.424396 0.996004
6.421990
If you only want to write complete rows, i.e., to drop any leftover numbers at the end of the file that do not fill a row of cols values:
cols=2
# last number '6.421990' not included since zip is used instead of zip_longest
# (izip and izip_longest from itertools in Python 2)
with open('/tmp/nums.txt') as fin, open('/tmp/nout.txt','w') as fout:
    for li in zip(*[fin]*cols):
        fout.write('\t'.join(e.strip() for e in li)+'\n')
Then the output file is:
2.131583 2.058964
6.866568 0.996470
6.424396 0.996004

I'm not really sure what you mean, but I think your expected output is:
2.131583,2.058964,
6.866568,0.996470,
6.424396,0.996004,
6.421990
My code for this:
with open('out.txt', 'r') as fif, open('text_to_csv.csv', 'w') as fof:
    fList = ','.join([v.strip() if i % 2 else '\n'+v.strip()
                      for i, v in enumerate(fif.readlines())])[1:]
    fof.write(fList)
Interesting points:
If you want to avoid a trailing "," at the end of your file, concatenate the list with the join() function:
flat_string = ','.join([item1,...,])
To put a leading line break before every other item in the list, I enumerate it:
for index, value in enumerate([item1,...,])
The items that start a new row are found with the modulo operator, index % 2.
With an inline if (a conditional expression) you can check this on the fly.
Finally, I remove the redundant line break at the beginning of the string with [1:].
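A variation on the join() idea above, sketched under the assumption that no comma is wanted at the end of a row: slice the list into rows first, then join within each row. The helper name to_rows is hypothetical, and the sample uses five of the numbers from the question.

```python
def to_rows(values, cols=2):
    """Group a flat list into comma-joined rows of `cols` values each."""
    rows = [values[i:i + cols] for i in range(0, len(values), cols)]
    # Joining within each row means no row ends with a stray comma.
    return "\n".join(",".join(row) for row in rows)


numbers = ["2.131583", "2.058964", "6.866568", "0.996470", "6.424396"]
csv_text = to_rows(numbers)
print(csv_text)
```

An incomplete last row is kept as a shorter line rather than dropped.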

How to loop through a csv file in python and output each part of the csv file into a new file?

I have a CSV file, created in Excel, that contains 2000 rows of data. I would like to output the data 100 rows at a time to different text files, but I have no idea how to do this. All I can do is write everything to a single file. I read the CSV data in Python (using PyScripter) and then wrote it to a single file like this:
def read_csv(self):
    with open(self.data, newline='') as f:
        reader = csv.reader(f)
        for row in reader:
            self.content.append(row)
def write_txt(self):
    f = open(self.txtoutput, 'w')
    for row in self.content:
        f.write(', '.join(row) + '\n')
    f.close()
However, I would like each block of 100 rows of the 2000-row data to be written to a different text file. Can anyone point me in the right direction? Note: I am using Python 3.
Thanks in advance.

Iterate over the csv file in chunks of 100 rows at a time and write each chunk to a separate file:
import csv
with open(csv_filename, newline='') as file:
    # assumes nrows % 100 == 0; zip drops any leftover rows at the end
    chunks = zip(*[csv.reader(file)] * 100)
    for i, rows in enumerate(chunks):
        with open("out%d.csv" % (i,), 'w', newline='') as output_file:
            csv.writer(output_file).writerows(rows)
See "What is the most “pythonic” way to iterate over a list in chunks?"

For example: keep a counter that you increase by one for each line, and once it reaches a hundred, close the output file and open a new one.
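The counter approach described above might be sketched as follows. The function name split_rows, the part0.txt, part1.txt, ... naming scheme and the sample rows are all assumptions for illustration; the demonstration uses 2 rows per file instead of 100.

```python
import os
import tempfile


def split_rows(rows, out_dir, chunk_size=100):
    """Write rows to part0.txt, part1.txt, ... with chunk_size rows each."""
    out = None
    paths = []
    count = 0  # rows written to the current file
    for row in rows:
        if out is None or count == chunk_size:
            if out is not None:
                out.close()  # finish the previous file before starting a new one
            path = os.path.join(out_dir, "part%d.txt" % len(paths))
            paths.append(path)
            out = open(path, "w")
            count = 0
        out.write(", ".join(row) + "\n")
        count += 1
    if out is not None:
        out.close()
    return paths


# Demonstration: 5 rows, 2 rows per file, so 3 files are written.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = [[str(n), "row%d" % n] for n in range(5)]
    written = split_rows(sample, tmp_dir, chunk_size=2)
    file_names = [os.path.basename(p) for p in written]
    with open(written[-1]) as f:
        last_file = f.read()

print(file_names)
print(last_file)
```

Unlike slicing a list, this works on any iterable of rows, including a csv.reader, so the whole input does not have to be loaded first.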

Something like the following should work:
def write_txt(self):
    i = 0
    while i < len(self.content):
        # // is integer division, so the suffixes are 0, 1, 2, ... in Python 3
        with open(self.txtoutput + str(i // 100), 'w') as f:
            for row in self.content[i:i+100]:
                f.write(', '.join(row) + '\n')
        i += 100
Since you didn't specify how the different text files should be named, I just appended an increasing number to the end of self.txtoutput.

Something like this:
def write_txt(self):
    f = None
    for index, row in enumerate(self.content):
        if index % 100 == 0:
            if f is not None:
                f.close()  # close the previous chunk's file first
            f = open(self.txtoutput + str(index) + ".txt", 'w')
        f.write(', '.join(row) + '\n')
    if f is not None:
        f.close()

def writeText(self):
    for index, offset in enumerate(range(0, len(self.content), 100)):
        with open(self.txtoutput + '{:03}'.format(index) + '.txt', 'w') as file:
            for eachRow in self.content[offset:offset+100]:
                file.write(', '.join(eachRow) + '\n')
Having no extra variables is fun sometimes. This is a while-less version of @F.J's solution. I formatted the incrementing index with leading zeros so the files sort conveniently in file listings.
A generator-expression solution with a tunable rowCount might look like this (I haven't tested it):
def writeText(self):
    rowCount = 100
    for index, eachGlump in enumerate(self.content[i:i+rowCount] for i in range(0, len(self.content), rowCount)):
        with open(self.txtoutput + '{:03}'.format(index) + '.txt', 'w') as file:
            for eachRow in eachGlump:
                file.write(', '.join(eachRow) + '\n')

Python adding text at the line

I have a file which contains following row:
//hva_SaastonJakaumanMuutos/printData/reallocationAssignment/changeUser/firstName>
I want to add "John" at the end of line.
I have written following code but for some reason it is not working,
def add_text_to_file(self, file, rowTitle, inputText):
    f = open("check_files/"+file+".txt", "r")
    fileList = list(f)
    f.close()
    j = 0
    for row in fileList :
        if fileList[j].find(rowTitle) > 0 :
            fileList[j]=fileList[j].replace("\n","")+inputText+"\n"
            break
        j = j+1
    f = open("check_files/"+file+".txt", "w")
    f.writelines(fileList)
    f.close()
Do you see what I am doing wrong?

str.find returns 0 if the text you are searching for is found at the very beginning of the line, because it returns the index at which the match begins (and -1 when there is no match).
So your condition should be:
if fileList[j].find(rowTitle) >= 0 :
Edit:
The correction above would save the day, but it's better to do things the right way, the Pythonic way.
If you are looking for a substring in a text, you can use the foo in bar comparison. It is True if foo can be found in bar and False otherwise.
You rarely need a counter in Python. The enumerate built-in is your friend here.
You can combine the iteration and the writing and eliminate an unnecessary step.
strip or rstrip is better than replace in your case.
For Python 2.6+, it is better to use the with statement when dealing with files. It takes care of closing the file properly. For Python 2.5, you need from __future__ import with_statement
Refer to PEP8 for commonly preferred naming conventions.
Here is a cleaned-up version:
def add_text_to_file(self, file, row_title, input_text):
    with open("check_files/" + file + ".txt", "r") as infile:
        file_list = infile.readlines()
    with open("check_files/" + file + ".txt", "w") as outfile:
        for row in file_list:
            if row_title in row:
                row = row.rstrip() + input_text + "\n"
            outfile.write(row)
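To see why a "> 0" test can miss a line, here is a small check of str.find against the in operator. The sample row is a shortened, made-up stand-in for the one in the question.

```python
row = "//firstName>\n"  # hypothetical, shortened sample row

at_start = row.find("//firstName")   # match at index 0
missing = row.find("lastName")       # no match, so find returns -1

# A "> 0" test wrongly rejects a match at the very start of the line.
old_check = at_start > 0
new_check = at_start >= 0
in_check = "//firstName" in row

print(at_start, missing, old_check, new_check, in_check)
```

The in operator gives the same answer as the ">= 0" test without any index arithmetic, which is why it is the usual choice when the position itself is not needed.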

You are not giving much information, so even though I wouldn't use the following code (because I'm sure there are better ways), it might help to clear up your problem.
import os.path
def add_text_to_file(self, filename, row_title, input_text):
    # filename should have the .txt extension in it
    filepath = os.path.join("check_files", filename)
    with open(filepath, "r") as f:
        content = f.readlines()
    for j in range(len(content)):
        if row_title in content[j]:
            content[j] = content[j].strip() + input_text + "\n"
            break
    with open(filepath, "w") as f:
        f.writelines(content)
