The following code produces a file with content test\\nstring, but I need the file to contain test\nstring. I can't figure out a way to replace the \\ symbol either.
s = "test\nstring"
with open('test.txt', 'w') as f:
    f.write(s)
How can I make sure that the file contains only \n instead of \\n?
Use s = "test\\nstring".
I tried the following code and it worked:
s = "test\\nstring"
with open('test.txt', 'w') as f:
    f.write(s)
and the test.txt file contains
test\nstring
Besides escaping and raw strings, you can encode it with 'string_escape' (Python 2 only; Python 3 has no such codec, and its 'unicode_escape' codec returns bytes rather than str):
s = "test\nstring".encode('string_escape')
with open('test.txt', 'w') as f:
    f.write(s)
Raw strings may help:
s = r"test\nstring"
with open('test.txt', 'w') as f:
    f.write(s)
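To make the difference between these spellings concrete, here is a minimal sketch that writes each variant to a temporary file and reads it back. The helper name `written_text` is made up for this illustration, and the last variant assumes Python 3, where the `unicode_escape` codec returns bytes that have to be decoded again.

```python
import os
import tempfile


def written_text(s):
    """Write s to a temporary file and return what the file then contains."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        # newline="" turns off newline translation so the comparison is exact
        with os.fdopen(fd, "w", newline="") as f:
            f.write(s)
        with open(path, newline="") as f:
            return f.read()
    finally:
        os.remove(path)


real_newline = written_text("test\nstring")  # "\n" is one newline character
escaped = written_text("test\\nstring")      # "\\n" is a backslash followed by "n"
raw = written_text(r"test\nstring")          # a raw string keeps the backslash as typed

# Python 3 route: escape an existing string, then decode the bytes back to str
from_codec = written_text("test\nstring".encode("unicode_escape").decode("ascii"))

print(len(real_newline.splitlines()))  # how many lines the first file has
print(escaped == raw == from_codec)    # whether the three escaped variants agree
print(escaped)
```

The first file ends up with two lines, while the other three contain the literal characters backslash and "n" on a single line.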
Related
I am trying to write a python script to convert rows in a file to json output, where each line contains a json blob.
My code so far is:
with open( "/Users/me/tmp/events.txt" ) as f:
    content = f.readlines()
# strip to remove newlines
lines = [x.strip() for x in content]
i = 1
for line in lines:
    filename = "input" + str(i) + ".json"
    i += 1
    f = open(filename, "w")
    f.write(line)
    f.close()
However, I am running into an issue where if I have an entry in the file that is quoted, for example:
client:"mac"
This will be output as:
"client:""mac"""
Using a second strip on writing to file will give:
client:""mac
But I want to see:
client:"mac"
Is there any way to force Python to read text in the format ' "something" ' without appending extra quotes around it?
Instead of creating an auxiliary list to strip the newline from content, just open the input and output files at the same time. Write to the output file as you iterate through the lines of the input, stripping whatever you deem necessary. Try something like this:
# text mode, so that line is a str and strip('"') accepts a str argument
with open('events.txt', 'r') as infile, open('input1.json', 'w') as outfile:
    for line in infile:
        line = line.strip('"')
        outfile.write(line)
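Plain file reads and writes do not add quotes on their own, so the per-line loop from the question can be kept as long as only the line ending is removed. A minimal sketch, assuming one JSON blob per line; `split_lines` and the temporary demo file are made up for illustration:

```python
import tempfile
from pathlib import Path


def split_lines(src, out_dir):
    """Write each line of src to out_dir/input<N>.json, with N starting at 1."""
    out_dir = Path(out_dir)
    written = []
    with open(src, encoding="utf-8") as infile:
        for i, line in enumerate(infile, start=1):
            target = out_dir / f"input{i}.json"
            # rstrip("\n") drops only the line ending; quotes in the data stay
            target.write_text(line.rstrip("\n"), encoding="utf-8")
            written.append(target)
    return written


# Demo in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "events.txt"
    src.write_text('client:"mac"\nclient:"pc"\n', encoding="utf-8")
    files = split_lines(src, tmp)
    contents = [p.read_text(encoding="utf-8") for p in files]

print(contents)
```

Using `enumerate` replaces the manual counter, and `rstrip("\n")` avoids the side effects of a bare `strip()`, which would also remove leading and trailing spaces.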
I would like to copy certain lines of text from one text file to another. In my current script when I search for a string it copies everything afterwards, how can I copy just a certain part of the text? E.g. only copy lines when it has "tests/file/myword" in it?
current code:
#!/usr/bin/env python
f = open('list1.txt')
f1 = open('output.txt', 'a')
doIHaveToCopyTheLine=False
for line in f.readlines():
    if 'tests/file/myword' in line:
        doIHaveToCopyTheLine=True
    if doIHaveToCopyTheLine:
        f1.write(line)
f1.close()
f.close()
The oneliner:
open("out1.txt", "w").writelines([l for l in open("in.txt").readlines() if "tests/file/myword" in l])
Recommended, using with:
with open("in.txt") as f:
    lines = f.readlines()
    lines = [l for l in lines if "ROW" in l]
    with open("out.txt", "w") as f1:
        f1.writelines(lines)
Using less memory:
with open("in.txt") as f:
    with open("out.txt", "w") as f1:
        for line in f:
            if "ROW" in line:
                f1.write(line)
readlines() reads the entire input file into a list, which wastes memory on large files. Just iterate over the lines in the file. I used 'with' on output.txt so that it is automatically closed when done. I left it off 'list1.txt' because CPython closes that file once the for loop ends and the file object is garbage-collected, although an explicit 'with' is the safer habit.
#!/usr/bin/env python
with open('output.txt', 'a') as f1:
    for line in open('list1.txt'):
        if 'tests/file/myword' in line:
            f1.write(line)
Just a slightly cleaned up way of doing this. This is no more or less performant than ATOzTOA's answer, but there's no reason to do two separate with statements.
with open(path_1, 'a') as file_1, open(path_2, 'r') as file_2:
    for line in file_2:
        if 'tests/file/myword' in line:
            file_1.write(line)
Safe and memory-saving:
with open("out1.txt", "w") as fw, open("in.txt","r") as fr:
    fw.writelines(l for l in fr if "tests/file/myword" in l)
It doesn't create temporary lists (which readlines() and [] would do, a non-starter if the file is huge); everything is done with a generator expression, and using a with block ensures that the files are closed on exit.
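The generator-expression approach can be wrapped in a small reusable function. This is only a sketch: the name `copy_matching_lines` and the demo file names are assumptions, not part of the answers above.

```python
import tempfile
from pathlib import Path


def copy_matching_lines(src, dst, needle):
    """Copy every line of src that contains needle into dst, one line at a time."""
    with open(src) as fr, open(dst, "w") as fw:
        # The generator yields lines lazily, so no full list is built in memory
        fw.writelines(line for line in fr if needle in line)


# Demo in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "in.txt"
    dst = Path(tmp) / "out.txt"
    src.write_text("tests/file/myword a\nother b\ntests/file/myword c\n")
    copy_matching_lines(src, dst, "tests/file/myword")
    result = dst.read_text()

print(result, end="")
```

Only the two lines that contain the search string reach the output file; the line in between is skipped instead of being copied along with everything after the first match.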
f=open('list1.txt')
f1=open('output.txt','a')
for x in f.readlines():
    f1.write(x)
f.close()
f1.close()
This copies every line of list1.txt into output.txt without any filtering. Try it once.
In Python 3.10 and later, parenthesized context managers let you use multiple context managers in one with block:
with (open('list1.txt') as fin, open('output.txt', 'w') as fout):
    fout.write(fin.read())
f = open('list1.txt')
f1 = open('output.txt', 'a')
# doIHaveToCopyTheLine=False
for line in f.readlines():
    if 'tests/file/myword' in line:
        f1.write(line)
f1.close()
f.close()
Now your code will work. Try this one.
I'm having a bad time with character encoding. It's kind of hard to understand why this happens when I open my .txt file:
Questions:
What type of encoding is this? Why does this happen?
How can I rewrite my txt file to use normal accents or even without accents and special chars?
Is there any special library to handle this? I could create a huge function that will replace() all these chars, but I don't know when or which chars will appear in my future txts.
My code:
folder = 'E:\\WinPython\\notebooks\\scripts\\script1\\'
txtFile = folder + 'PROF_SAI_318_210117_310117_orig.txt'
with open(txtFile, 'r') as f:
    with open('PROF_SAI_318_210117_310117_clean.txt', 'w') as g:
        for line in f:
            do_something() # what should I write here to 'clean' my file?
            g.write(line)
print("Ok!")
Output excerpt:
SPLEONARDO SIM\xc3\x83O ESTARLING
GOFLORESTA S/A A\xc3\x87UCAR E ALCOOL
SPFOCO REPRESENTA\xc3\x87\xc3\x95ES E CONSULTORIA
It looks like you are using Notepad++ to display your file. The encoding displayed looks like cp1252:
>>> b'COMUNICA\xc7\xc3O M\xc1QUINAS'.decode('cp1252')
'COMUNICAÇÃO MÁQUINAS'
In Notepad++, on the menu select Encoding->Character sets->Western European->Windows-1252 and your file should display correctly.
Here's an example that converts to UTF-8 (your output excerpt):
>>> b'SPLEONARDO SIM\xc3O ESTARLING'.decode('cp1252')
'SPLEONARDO SIMÃO ESTARLING'
>>> b'SPLEONARDO SIM\xc3O ESTARLING'.decode('cp1252').encode('utf8')
b'SPLEONARDO SIM\xc3\x83O ESTARLING'
For your example code, you can do:
with open(txtFile, 'r', encoding='cp1252') as f:
    with open('PROF_SAI_318_210117_310117_clean.txt', 'w', encoding='utf8') as g:
        for line in f:
            g.write(line)
If your files aren't too large, you can just do:
with open(txtFile, 'r', encoding='cp1252') as f:
    with open('PROF_SAI_318_210117_310117_clean.txt', 'w', encoding='utf8') as g:
        g.write(f.read())
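The question also asks how to drop the accents altogether. One possible approach, not taken from the answer above, is to decompose each character with the standard `unicodedata` module and discard the combining marks. The function name `strip_accents` is made up for this sketch, and the sample bytes are the cp1252 example from the answer.

```python
import unicodedata


def strip_accents(text):
    """Return text with combining accent marks removed."""
    # NFD splits an accented letter into a base letter plus combining marks
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


raw = b"COMUNICA\xc7\xc3O M\xc1QUINAS"  # cp1252-encoded bytes
text = raw.decode("cp1252")             # decode with the file's real encoding first
plain = strip_accents(text)

print(plain)
```

In the file-conversion loop this would replace the `do_something()` placeholder, for example `g.write(strip_accents(line))`. Characters that have no decomposition are left unchanged, so this is not a complete transliteration.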
I have a text file which has quotation marks in the form ' and ". This text file is of the form:
"string1", "string2", 'string3', 'string4', ''string5'', etc.
How can I remove all of the quotations ' and " while leaving the rest of the file the way it is now?
I suspect it should be something like this:
with open('input.txt', 'r') as f, open('output.txt', 'w') as fo:
    for line in f:
        fo.write(line.strip())
Where line.strip() somehow strips the strings of the quotation marks. Is anything else required?
You're close. Instead of str.strip(), try str.replace():
with open('input.txt', 'r') as f, open('output.txt', 'w') as fo:
    for line in f:
        fo.write(line.replace('"', '').replace("'", ""))
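As an alternative to chaining replace() calls, a translation table removes both quote characters in one pass. This is a minimal sketch; `QUOTES` and `remove_quotes` are names made up for the example.

```python
# Translation table whose third argument lists the characters to delete
QUOTES = str.maketrans("", "", "\"'")


def remove_quotes(line):
    """Return line with every single and double quote removed."""
    return line.translate(QUOTES)


sample = "\"string1\", \"string2\", 'string3', ''string5''\n"
cleaned = remove_quotes(sample)

print(cleaned, end="")
```

Inside the file loop this would be `fo.write(remove_quotes(line))`. Everything other than the two quote characters, including the line ending, is left as it is.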
The following is my replace line function:
def replace_line(file_name, num, replaced):
    f = open(file_name, 'r', encoding='utf-8')
    lines = f.readlines()
    lines[num] = replaced
    f.close()
    f = open(file_name, 'w', encoding='utf-8')
    f.writelines(lines)
    f.close()
I am using the following line to run my code:
replace_line('Store.txt', int(line), new)
When I run my code, it replaces that line, but it also removes everything after that line. For example, if this was my list:
To be honest, I'm not sure what was wrong with the original function. But I tried redoing it and this seems to work fine:
def replace_line(file_name, line_num, text):
    with open(file_name, 'r+') as f:
        # splitlines() drops the line endings, so they are added back on write
        lines = f.read().splitlines()
        lines[line_num] = text
        f.seek(0)
        f.write('\n'.join(lines) + '\n')
        f.truncate()
Please note that this overwrites the entire file. If you need to handle large files or are concerned with memory usage, you might want to try another approach.
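One such approach for large files, sketched here under the assumption that the file is UTF-8 text, is to stream the lines into a temporary file and then swap it into place. The name `replace_line_streaming` is hypothetical and line numbers are zero-based, as in the function above.

```python
import os
import tempfile


def replace_line_streaming(file_name, line_num, text):
    """Replace the zero-based line line_num of file_name with text, line by line."""
    dir_name = os.path.dirname(os.path.abspath(file_name))
    # Create the temporary file next to the target so os.replace stays on one disk
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as out, \
                open(file_name, encoding="utf-8") as src:
            for i, line in enumerate(src):
                if i == line_num:
                    # Keep exactly one newline at the end of the new line
                    out.write(text.rstrip("\n") + "\n")
                else:
                    out.write(line)
        os.replace(tmp_path, file_name)  # swap the new file into place
    except BaseException:
        os.remove(tmp_path)
        raise


# Demo in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Store.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("a\nb\nc\n")
    replace_line_streaming(path, 1, "B")
    with open(path, encoding="utf-8") as f:
        result = f.read()

print(result, end="")
```

Only one line is held in memory at a time, and the original file is left untouched if anything fails before the final swap.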