import re
re_for_identificate_1 = r""
with open("data_path/filename_1.txt","r+") as file:
    for line in file:
        #replace with a substring adding a space in the middle
        line = re.sub(re_for_identificate_1, " milesimo", line)
        #replace in txt with the fixed line
Example filename_1.txt :
unmilesimo primero
1001°
dosmilesimos quinto
2005°
tresmilesimos
3000°
nuevemilesimos doceavo
9012°
The correct output file that I need to obtain is this:
Rewritten input filename_1.txt
un milesimo primero
1001°
dos milesimos quinto
2005°
tres milesimos
3000°
nueve milesimos doceavo
9012°
What is the regex that I need, and what is the best way to replace the fixed lines in their original positions in the input file?
You can use file.seek(0) to go back to the beginning of the file, then write the data and truncate the file. Like this:
import re
re_for_identificate_1 = "(?<!^)milesimo"
tmp = ""
with open("data.txt", "r+") as file:
    for line in file:
        line = re.sub(re_for_identificate_1, " milesimo", line)
        tmp += line
    file.seek(0)
    file.write(tmp)
    file.truncate()
The regex you want to use is "(?<!^)milesimo" to replace every instance of "milesimo" with " milesimo" but not at the beginning of a line.
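As a hedged alternative to the lookbehind above, the sketch below only inserts a space when "milesimo" is glued to a preceding word character, so a line that is already spaced correctly is left alone. The names split_milesimo and fix_file are illustrative and do not come from the original post; the file path is whatever you pass in.

```python
import re

# Match "milesimo" only when it directly follows a word character.
PATTERN = re.compile(r"(?<=\w)milesimo")


def split_milesimo(line):
    """Insert a space before 'milesimo' when it is attached to a word."""
    return PATTERN.sub(" milesimo", line)


def fix_file(path):
    """Rewrite the file in place (hypothetical helper)."""
    with open(path, "r+", encoding="utf-8") as handle:
        fixed = [split_milesimo(line) for line in handle]
        handle.seek(0)            # go back to the start of the file
        handle.writelines(fixed)  # overwrite with the fixed lines
        handle.truncate()         # drop anything left over from the old content
```

Calling split_milesimo on a line twice gives the same result as calling it once, which makes it safe to run the script again on a file that was already fixed.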
Related
I am currently learning how to code in Python and I need help with something.
Is there a way to make the script read only the lines that start with Text = .. ?
The text file has a lot of other sentences, but I only want the program to focus on the lines that start with Text = .. and print them out, ignoring the other lines in the text file.
for example,
in text file:
min = 32.421
text = " Hello I am Robin and I am hungry"
max = 233341.42
how I want my output to be:
Hello I am Robin and I am hungry
I want the output to be just the sentence, without the quotation marks and without text =
This is my code so far after reading through comments!
import os
import sys
import glob
from english_words import english_words_set
try:
    print('Finding file...')
    file = glob.glob(sys.argv[1])
    print("Found " + str(len(file)) + " file!")
    print('LOADING NOW...')
    with open(file) as f:
        lines = f.read()
        for line in lines:
            if line.startswith('Text = '):
                res = line.split('"')[1]
                print(res)
You can read the text file and read its lines like so:
# open file
with open('text_file.txt') as f:
    # store the list of lines contained in file
    lines = f.readlines()
for line in lines:
    # find match
    if line.startswith('text ='):
        # store the string inside double quotes
        res = line.split('"')[1]
        print(res)
This should print your expected output.
You can open the file, check whether each line begins with the word "text", and then extract the value like this:
with open("file.txt", "r") as file:  # open the file for reading; change file.txt to the file's path
    for line in file:  # for each line in the file
        if line.startswith("text"):  # checks whether the line starts with "text"
            text = line.strip()  # removes surrounding whitespace from the line
            text = text.replace("text = \"", "")  # removes the part before the string
            text = text.replace("\"", "")  # removes the closing quote
            print(text)
Or you could convert the file from free-form text to a format such as TOML (readable with the standard-library tomllib module in Python 3.11+) or YAML (which needs a third-party package such as PyYAML). Both are simpler to work with than ad hoc text files while keeping your file layout about the same. The parsed content is stored in a dictionary instead of a string.
List comprehensions in python:
https://www.youtube.com/watch?v=3dt4OGnU5sM
Using list comprehension with files:
https://www.youtube.com/watch?v=QHFWb_6fHOw
First learn list comprehensions, then the idea is this:
listOutput = ['''min = 32.421
text = "Hello I am Robin and I am hungry"
max = 233341.42''']
myText = ''.join(listOutput)
indexFirst= myText.find("text") + 8 # add 8 to this index to discard {text = "}
indexLast = myText.find('''"''', indexFirst) # locate next quote since indexFirst position
print(myText[indexFirst:indexLast])
Output:
Hello I am Robin and I am hungry
with open(file) as f:  # file is the path to the text file
    lines = f.read().split("\n")
prefix = "text = "
for line in lines:
    if line.startswith(prefix):
        # replaces the first occurrence of prefix and assigns it to result
        result = line.replace(prefix, '', 1)
        print(result)
Alternatively, you could use result = line.removeprefix(prefix), but removeprefix is only available in Python 3.9 and later.
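The question writes the prefix both as Text = and as text =, so one option is a case-insensitive regular expression that accepts either spelling and captures only what sits between the quotes. This is a sketch, not the only way to do it; the function name extract_text_values and the sample list are assumptions made for illustration.

```python
import re

# Matches lines such as: text = " Hello I am Robin and I am hungry"
TEXT_LINE = re.compile(r'^\s*text\s*=\s*"(.*)"\s*$', re.IGNORECASE)


def extract_text_values(lines):
    """Return the quoted value of every line that starts with 'text ='."""
    values = []
    for line in lines:
        match = TEXT_LINE.match(line)
        if match:
            # group(1) is the content between the quotes; strip stray spaces
            values.append(match.group(1).strip())
    return values


sample = [
    'min = 32.421\n',
    'text = " Hello I am Robin and I am hungry"\n',
    'max = 233341.42\n',
]
for value in extract_text_values(sample):
    print(value)
```

With a real file, the open file object can be passed directly, since iterating over it yields its lines.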
I've learned that we can easily remove blank lines from a file, or strip the blanks from a single string, but how can I remove all blanks at the end of each line in a file?
One way would be to process each line of the file, like:
with open(file) as f:
    for line in f:
        stripped = line.strip()  # store the stripped line
Is this the only way to complete the task?
Possibly the ugliest implementation possible, but here's what I just scratched up:
def strip_str(string):
    last_ind = 0
    split_string = string.split(' ')
    for ind, word in enumerate(split_string):
        if word == '\n':
            return ''.join([split_string[0]] + [ ' {} '.format(x) for x in split_string[1:last_ind]])
        last_ind += 1
I don't know if these count as different ways of accomplishing the task. The first is really just a variation on what you have. The second does the whole file at once, rather than line by line.
Use map to call the 'rstrip' method on each line of the file:
import operator
with open(filename) as f:
    # basically the same as (line.rstrip() for line in f)
    for line in map(operator.methodcaller('rstrip'), f):
        print(line)  # do something with the line
Read the whole file and use re.sub():
import re
with open(filename) as f:
    text = f.read()
    # [ \t] rather than \s, so blank lines are not swallowed along with the trailing blanks
    text = re.sub(r"[ \t]+(?=\n)", "", text)
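Neither approach above writes the result back to disk. A minimal sketch of that last step, assuming the file fits in memory; rstrip_lines and rstrip_file are hypothetical helper names, not part of any library.

```python
def rstrip_lines(text):
    """Strip trailing whitespace from every line, keeping the line breaks."""
    return "\n".join(line.rstrip() for line in text.split("\n"))


def rstrip_file(path):
    """Rewrite the file at `path` in place without trailing whitespace."""
    with open(path, "r+") as handle:
        cleaned = rstrip_lines(handle.read())
        handle.seek(0)         # back to the start before overwriting
        handle.write(cleaned)
        handle.truncate()      # the cleaned text is shorter, so cut the tail
```

Splitting on "\n" and joining again keeps empty lines in place, so only the blanks at the end of each line disappear.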
If you just want to remove spaces, another solution would be:
line.replace(" ", "")
Note that this removes every space in the line, not only the ones at the end.
Given the following example, how can I remove all "a" characters from a file that has the following content:
asdasdasd \n d1233sss \n aaa \n 123
I wrote the following solution but it does not work:
with open("testfisier","r+") as file:
    for line in file:
        for index in range(len(line)):
            if line[index] is "a":line[index].replace("a","")
There weren't any changes because you didn't write the result back to the file.
with open("testfisier", "r+") as file:
    lines = file.readlines()
    file.seek(0)
    for line in lines:
        # Write each line back without its "a" characters.
        file.write(line.replace("a", ""))
    file.truncate()
Or:
with open("testfisier", "r+") as f:
    text = f.read().replace("a", "")
    f.seek(0)
    f.write(text)
    f.truncate()
Try using regexp substitution. For instance, assuming you have read in the string and named it a_string:
import re
a_string = re.sub('a', '', a_string)
This would be one of many possible solutions.
Hope this helps!
You can try this:
import re
data = open("testfisier").read()
final_data = re.sub('a+', '', data)
You can call replace on a long string. There is no need to call it on single characters. Also, replace does not change a string, but returns a new one:
with open("testfisier", "r+") as file:
    text = file.read()
    text = text.replace("a", "") # replace a's in the entire text
    file.seek(0) # move file pointer back to start
    file.write(text)
    file.truncate() # the new text is shorter, so cut off the leftover tail
This is probably a duplicate, but I couldn't find my answer anywhere.
I have a text file and I want to remove a specific character from a specific line.
Here's one example:
#textfile.txt
Hey!
1234/
How are you//?
9/23r
How can I remove the slash from the second line?
The output should be:
#textfile.txt
Hey!
1234
How are you//?
9/23r
I've got no code and no clue on how to do this.
I run python 2.7.14 on Debian.
You can read the file line by line and identify the line you want to modify. Then identify the index (location) of the character you want to remove.
Replace it with an empty string and write the text back into the file line by line.
#opening the .txt file
fp = open("data.txt", "r")
#reading text line by line
text = fp.readlines()
fp.close()
#searching for character to remove (last character before the newline)
char = text[1][-2]
#removing the character by replacing it with an empty string
text[1] = text[1].replace(char, "")
#opening the file in write mode
fw = open("data.txt", "w")
#writing lines one by one
for lines in text:
    fw.write(lines)
#closing the file
fw.close()
A simple solution is to read in the entire file, find the line that you want to change, change it, and write out all of the content again:
filename = 'textfile.txt'
original = '1234/'
replacement = '1234'
# Open file for reading and read all lines into a list
with open(filename) as f:
    lines = f.readlines()
# Find the line number (index) of the original string
index = lines.index(original + '\n')
# Replace this element of the list
lines[index] = replacement + '\n'
# Write out the modified lines to disk
with open(filename, 'w') as f:
    f.writelines(lines)
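If the line is known by its position rather than by its content, the same read-modify-write idea can be sketched with a small helper. The function name remove_char_from_line and the 1-based line_number parameter are assumptions for illustration, and the print call is written for Python 3.

```python
def remove_char_from_line(lines, line_number, char):
    """Return a copy of `lines` with `char` removed from one 1-based line."""
    result = list(lines)          # copy, so the caller's list is untouched
    index = line_number - 1       # convert to a 0-based list index
    result[index] = result[index].replace(char, "")
    return result


lines = ["Hey!\n", "1234/\n", "How are you//?\n", "9/23r\n"]
for line in remove_char_from_line(lines, 2, "/"):
    print(line.rstrip("\n"))
```

The slashes on the other lines are untouched because only the element at the requested index is rewritten.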
I want to replace a line in a file but my code doesn't do what I want. The code doesn't change that line. It seems that the problem is the space between ALS and 4277 characters in the input.txt. I need to keep that space in the file. How can I fix my code?
Part of input.txt:
ALS 4277
Related part of the code:
for lines in fileinput.input('input.txt', inplace=True):
    print(lines.rstrip().replace("ALS"+str(4277), "KLM" + str(4945)))
Desired output:
KLM 4945
Using the same idea that other users have already pointed out, you could also reproduce the same spacing by first matching the spacing and saving it in a variable (spacing in my code):
import re
with open('input.txt') as f:
    lines = f.read()
# re.search finds the pattern anywhere in the text, not only at the very start
match = re.search(r'ALS(\s+)4277', lines)
if match is not None:
    spacing = match.group(1)
    lines = re.sub(r'ALS\s+4277', 'KLM%s4945' % spacing, lines.rstrip())
print(lines)
As the spaces vary you will need to use regex to account for the spaces.
import re
lines = "ALS 4277 "
line = re.sub(r"(ALS\s+4277)", "KLM 4945", lines.rstrip())
print(line)
Try:
with open('input.txt') as f:
    for line in f:
        parts = line.split()
        # only touch lines that consist of exactly these two fields
        if parts == ['ALS', '4277']:
            line = line.replace('ALS', 'KLM').replace('4277', '4945')
        print(line, end='') # as line has '\n'
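Another way to keep whatever spacing the file uses is to capture the whitespace in a group and reuse it in the replacement. A minimal sketch; the name swap_record is hypothetical, and \g<1> is used instead of \1 so the group reference is not confused with the digits that follow it.

```python
import re

# Capture the run of whitespace so it can be reused in the replacement.
PATTERN = re.compile(r"ALS(\s+)4277")


def swap_record(line):
    """Replace 'ALS<spaces>4277' with 'KLM<same spaces>4945'."""
    return PATTERN.sub(r"KLM\g<1>4945", line)


print(swap_record("ALS    4277"))
```

Lines that do not contain the pattern are returned unchanged, so the function can be applied to every line of the file.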