What I am trying to do is remove the quotes while writing the data to a new CSV file.
I have tried using str.split and str.replace with no luck. Can you guys point me in the right direction?
Current Code:
import csv
import os

def createParam():
    with open('testcsv.csv', 'r') as f:
        reader = csv.reader(f)
        csvList = list(reader)
    for item in csvList:
        os.mkdir(r"C:\Users\jefhill\Desktop\Test Path\\" + item[0])
        with open(r"C:\Users\jefhill\Desktop\Test Path\\" + item[0] + r"\prm.263", "w+") as f:
            csv.writer(f).writerow(item[1:])
            f.close
Data within testcsv.csv:
0116,"139,data1"
0123,"139,data2"
0130,"35,data678"
Data output when the script is run (in each individual file):
"139,data1"
"139,data2"
"35,data678"
Data I would like:
139,data1
139,data2
35,data678
You can use str.replace to replace the " (double quotes) with '' (an empty string).
Then split each line once on the first comma and write everything after it.
with open('outputfile.csv', 'w') as outfile:  # open the result file to be written
    with open('testcsv.csv', 'r') as infile:  # open the input file
        for line in infile:  # iterate through each line in input file
            newline = line.replace('"', '')  # remove the double quotes
            outfile.write(newline.split(',', maxsplit=1)[1])  # write everything after the first comma
You don't need f.close() when you use with open...
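An alternative sketch that keeps the csv module for reading: csv.reader has already removed the quotes from "139,data1" by the time you get the row, so the second field can be written as plain text instead of going through csv.writer (which re-adds the quotes because the field contains a comma). The function name and the in-memory sample below are assumptions for illustration, not part of the original script.

```python
import csv
import io

def unquoted_second_fields(csv_text):
    """Return the second column of each row as plain text lines, without quotes."""
    reader = csv.reader(io.StringIO(csv_text))
    # csv.reader strips the surrounding quotes while parsing each field
    return [row[1] + "\n" for row in reader if len(row) > 1]

# Sample data copied from the question
sample = '0116,"139,data1"\n0123,"139,data2"\n0130,"35,data678"\n'
lines = unquoted_second_fields(sample)
print("".join(lines), end="")
```

In the real script, each of these lines would be passed to f.write() in place of the csv.writer(f).writerow(...) call.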
So I have this crazy long text file made by my crawler, and for some reason it added some spaces in between the links, like this:
https://example.com/asdf.html (note the spaces)
https://example.com/johndoe.php (again)
I want to get rid of the spaces but keep the newlines. Keep in mind that the text file is 4,000+ lines long. I tried to do it myself but realized that I have no idea how to loop through the lines of a file.
It seems you can't directly edit a file in place in Python, so here is my suggestion:
# first get all lines from file
with open('file.txt', 'r') as f:
    lines = f.readlines()
# remove spaces
lines = [line.replace(' ', '') for line in lines]
# finally, write lines in the file
with open('file.txt', 'w') as f:
    f.writelines(lines)
You can open the file, read it line by line, and strip the whitespace from each line:
Python 3.x:
with open('filename') as f:
    for line in f:
        print(line.strip())
Python 2.x:
with open('filename') as f:
    for line in f:
        print line.strip()
This removes the leading and trailing whitespace from each line and prints it; spaces inside a line are left untouched.
Hope it helps!
Read text from file, remove spaces, write text to file:
with open('file.txt', 'r') as f:
    txt = f.read().replace(' ', '')
with open('file.txt', 'w') as f:
    f.write(txt)
In @Leonardo Chirivì's solution, it's unnecessary to create a list to store the file contents when a string is sufficient and more memory efficient. Here .replace(' ', '') is called only once on the whole string, which is more efficient than iterating through a list and calling replace on each line individually.
To avoid opening the file twice:
with open('file.txt', 'r+') as f:
    txt = f.read().replace(' ', '')
    f.seek(0)
    f.write(txt)
    f.truncate()
It would be more efficient to open the file only once. This requires moving the file pointer back to the start of the file after reading, and truncating any content left over after you write back to the file. A drawback of this solution, however, is that it is not as easily readable.
I had something similar that I'd been dealing with.
This is what worked for me (Note: This converts from 2+ spaces into a comma, but if you read below the code block, I explain how you can get rid of ALL whitespaces):
import re
# read the file
with open('C:\\path\\to\\test_file.txt') as f:
    read_file = f.read()
print(type(read_file))  # to confirm that it's a string
read_file = re.sub(r'\s{2,}', ',', read_file)  # find/convert 2+ whitespace into ','
# write the file
with open('C:\\path\\to\\test_file.txt', 'w') as f:
    f.write(read_file)  # write the variable's contents, not the literal text 'read_file'
This let me send the updated data to a CSV file, which suited my needs, but it can work for you as well: instead of converting the whitespace to a comma (','), convert it to an empty string (''). Alternatively, use read_file.replace(' ', '') if you don't want any spaces at all.
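One caveat worth illustrating: \s also matches newline characters, so a pattern such as \s{2,} can swallow a line break that follows trailing spaces. A minimal sketch, assuming the goal is to delete spaces and tabs while keeping every newline; the function name and the positions of the spaces in the sample URLs are made up for illustration.

```python
import re

def remove_horizontal_whitespace(text):
    """Delete spaces and tabs, leaving line breaks untouched."""
    return re.sub(r"[ \t]+", "", text)

# Hypothetical sample: the exact position of the stray spaces is assumed
sample = "https://example.com/ asdf.html\nhttps://example.com/ johndoe.php\n"
cleaned = remove_horizontal_whitespace(sample)

# For comparison: \s{2,} treats "  \n" as one run of whitespace and joins the lines
merged = re.sub(r"\s{2,}", ",", "a  \nb")

print(cleaned, end="")
print(merged)
```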
Let's not forget about adding back the \n to go to the next row.
The complete function would be:
# the function name and default arguments are assumed; only the body was given
def strip_file_lines(str_path, bl_right=False, bl_left=False):
    with open(str_path, 'r') as file:
        str_lines = file.readlines()
    # remove spaces
    if bl_right is True:
        str_lines = [line.rstrip() + '\n' for line in str_lines]
    elif bl_left is True:
        str_lines = [line.lstrip() + '\n' for line in str_lines]
    else:
        str_lines = [line.strip() + '\n' for line in str_lines]
    # Write the file out again
    with open(str_path, 'w') as file:
        file.writelines(str_lines)
I am trying to write a python script to convert rows in a file to json output, where each line contains a json blob.
My code so far is:
with open( "/Users/me/tmp/events.txt" ) as f:
    content = f.readlines()
# strip to remove newlines
lines = [x.strip() for x in content]
i = 1
for line in lines:
    filename = "input" + str(i) + ".json"
    i += 1
    f = open(filename, "w")
    f.write(line)
    f.close()
However, I am running into an issue where if I have an entry in the file that is quoted, for example:
client:"mac"
This will be output as:
"client:""mac"""
Using a second strip on writing to file will give:
client:""mac
But I want to see:
client:"mac"
Is there any way to force Python to read text in the format ' "something" ' without appending extra quotes around it?
Instead of creating an auxiliary list to strip the newline from content, just open the input and output files at the same time. Write to the output file as you iterate through the lines of the input, stripping whatever you deem necessary. Try something like this:
with open('events.txt', 'r') as infile, open('input1.json', 'w') as outfile:
    for line in infile:
        # text mode, so strip() can take a str argument in Python 3;
        # strip('"') removes quotes only at the very start or end of the line
        line = line.strip('"')
        outfile.write(line)
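For reference, the doubled quotes shown in the question match what csv.writer produces when a field contains the quote character, whereas a plain write() stores the text unchanged. Whether a CSV writer was involved in the asker's pipeline is an assumption; the sketch below only demonstrates the two behaviors side by side using in-memory files.

```python
import csv
import io

line = 'client:"mac"'

# Plain write: the text is stored exactly as given
plain = io.StringIO()
plain.write(line + "\n")

# csv.writer: the field contains the quote character, so the whole field
# is wrapped in quotes and every embedded quote is doubled
quoted = io.StringIO()
csv.writer(quoted).writerow([line])

print(repr(plain.getvalue()))
print(repr(quoted.getvalue()))
```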
I want to write a CSV file (outfile) from another CSV file (infile). The data in the infile looks like this: OF0A0C,00,D0,0F11AFCB. I want the outfile to be the same as the infile, but I get this instead: "\r \n 0,F,0,A,0,C,","0,0,","D,0,","0,F,1,1,A,F,C,B \r \n
My code looks like this:
with open ("from_baryon.csv", "r") as inFile:
    with open (self.filename, "a") as outFile:
        for line in inFile:
            OutFile = csv.writer (outFile)
            OutFile.writerow (line)
After writing, I want to save the data of every row in a list like this: Data = [[length_of_all_data],[length_data_row_1,datarow1],[length_data_row_2,datarow1datarow2],[length_data_row_3,datarow1datarow3]]
I am confused about how to build a list like that. Thank you.
A few issues:
You should read the input CSV file using the csv module's csv.reader() instead of iterating over its lines. When you iterate with for line in inFile:, each line comes back as a string, and OutFile.writerow(line) then treats that string as a sequence of characters, so it writes each character into a separate column.
You do not need to create a separate OutFile = csv.writer (outFile) for every line.
Example code -
with open ("from_baryon.csv", "r") as inFile:
    with open (self.filename, "a") as outFile:
        out_file = csv.writer (outFile)
        in_reader = csv.reader(inFile)
        for row in in_reader:
            out_file.writerow(row)
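The difference between passing a string and passing a list to writerow can be seen in isolation. This is a small sketch using in-memory files rather than the files from the question; the variable names are illustrative.

```python
import csv
import io

line = "OF0A0C,00"

# A string is iterated character by character: one column per character,
# and the comma itself becomes a quoted field
as_string = io.StringIO()
csv.writer(as_string).writerow(line)

# A list of fields is written as one column per field
as_list = io.StringIO()
csv.writer(as_list).writerow(line.split(","))

print(repr(as_string.getvalue()))
print(repr(as_list.getvalue()))
```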
EDIT: For the second issue that is updated, you can create a list and a counter to keep track of the complete length. Example -
with open ("from_baryon.csv", "r") as inFile:
    with open (self.filename, "a") as outFile:
        out_file = csv.writer (outFile)
        in_reader = csv.reader(inFile)
        data = []
        lencount = 0
        for row in in_reader:
            out_file.writerow(row)
            tlen = len(''.join(row))
            data.append([tlen] + row)
            lencount += tlen
        data.insert(0,[lencount])
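The list-building part can also be tried on its own, without any files. A minimal sketch of the same logic as a function; the function name and the second sample row are hypothetical, and the length of a row is taken as the number of characters in its joined fields, as in the code above.

```python
def build_length_table(rows):
    """Return [[total_length], [row_length, *fields], ...] for the given rows."""
    data = []
    total = 0
    for row in rows:
        row_len = len("".join(row))   # characters in the row, separators excluded
        data.append([row_len] + row)
        total += row_len
    data.insert(0, [total])           # overall length goes first
    return data

# First row is from the question; the second row is made up for illustration
rows = [["OF0A0C", "00", "D0", "0F11AFCB"], ["AA", "BB"]]
table = build_length_table(rows)
print(table)
```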
I have a script that:
Reads in each line of a file
Finds the '*' character in each line and splits the line here
Rearranges the 3 parts (first to last, and last to first)
Writes the rearranged strings to a .txt file
Problem is, it's finding some new line character or something, and isn't outputting how it should. Have tried stripping newline chars, but there must be something I'm missing.
Thanks in advance for any help!
the script:
## Import packages
import time
import csv
## Make output file
file_output = open('output.txt', 'w')
## Open file and iterate over, rearranging the order of each string
with open('input.csv', 'rb') as f:
    ## Jump to next line (skips file headers)
    next(f)
    ## Split each line, rearrange, and write the new line
    for line in f:
        ## Strip newline chars
        line = line.strip('\n')
        ## Split original string
        category, star, value = line.rpartition("*")
        ##Make new string
        new_string = value+star+category+'\n'
        ## Write new string to file
        file_output.write(new_string)
file_output.close()
## Require input (stops program from immediately quitting)
k = input(" press any key to exit")
Input file (input.csv):
Category*Hash Value
1*FB1124FF6D2D4CD8FECE39B2459ED9D5
1*FB1124FF6D2D4CD8FECE39B2459ED9D5
1*FB1124FF6D2D4CD8FECE39B2459ED9D5
1*34AC061CCCAD7B9D70E8EF286CA2F1EA
Output file (output.txt)
FB1124FF6D2D4CD8FECE39B2459ED9D5
*1
FB1124FF6D2D4CD8FECE39B2459ED9D5
*1
FB1124FF6D2D4CD8FECE39B2459ED9D5
*1
34AC061CCCAD7B9D70E8EF286CA2F1EA
*1
EDIT: Answered. Thanks everyone! Looks all good now! :)
The file output.txt should exist.
The following works with Python 2 on Debian:
## Import packages
import time
import csv
## Make output file
file_output = open('output.txt', 'w')
## Open file and iterate over, rearranging the order of each string
with open('input.csv', 'rb') as f:
    ## Jump to next line (skips file headers)
    next(f)
    ## Split each line, rearrange, and write the new line
    for line in f:
        ## Split original string
        category, star, value = line.rpartition("*")
        ##Make new string
        new_string = value.strip()+star+category+'\n'
        ## Write new string to file
        file_output.write(new_string)
file_output.close()
## Require input (stops program from immediately quitting)
k = input(" press any key to exit")
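I call strip() on the value, which contains the trailing \n, in order to sanitize it. You used strip('\n'), which removes only newline characters and leaves any other trailing whitespace (such as a \r from Windows line endings) in place; calling strip() without an argument removes all surrounding whitespace and does the job.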
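The difference between the two calls can be checked on a single line. This sketch assumes the input file uses Windows line endings (\r\n), which the question does not state explicitly; the hash value is taken from the sample input.

```python
# One line as it would be read from a file with \r\n endings (assumed)
raw = "1*34AC061CCCAD7B9D70E8EF286CA2F1EA\r\n"

only_newline = raw.strip("\n")   # the trailing \r survives
all_whitespace = raw.strip()     # \r and \n are both removed

# Rearranging the fully stripped line gives the intended single-line result
category, star, value = all_whitespace.rpartition("*")
new_string = value + star + category + "\n"

print(repr(only_newline))
print(repr(new_string))
```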
Use a DictWriter
import csv
with open('aster.csv') as f, open('out.txt', 'w') as fout:
    reader = csv.DictReader(f, delimiter='*')
    writer = csv.DictWriter(fout, delimiter='*', fieldnames=['Hash Value','Category'])
    #writer.writeheader()
    for line in reader:
        writer.writerow(line)
Without csv library
with open('aster.csv') as f:
    next(f)
    lines = [line.strip().split('*') for line in f]
with open('out2.txt', 'w') as fout:
    for line in lines:
        fout.write('%s*%s\n' % (line[1], line[0]))