using regex for lines in a file

I have a text file that should be changed like this:
Go through the lines, and wherever you see print var[number or string], change it to print(var[number or string]).
For input:
a = [1,2,3]
print a[1]
output must be:
a = [1,2,3]
print(a[1])
I tried this:
import re
with open('file.txt', 'r') as f:
    data = f.read()
newline = re.sub(r"^print\s(.+)", r"print(\1)", data)
with open('file.txt', 'w') as f:
    f.write(newline)
It only works when print is on the first line. How should I check all the lines and change them?

You need to use the M flag to match the start of a line with ^ instead of the start of the text.
re.sub("^a", "c", "abba\nabba", flags=re.M)
'cbba\ncbba'
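Applied to the question's pattern, the same flag could be used as in the sketch below. The sample string stands in for the contents of file.txt and is an assumption made for illustration, as are the variable names.

```python
import re

# Sample text standing in for the contents of file.txt (assumed for illustration).
data = "a = [1,2,3]\nprint a[1]\nb = 'x'\nprint b\n"

# re.M (re.MULTILINE) makes ^ match at the start of every line,
# so each "print ..." statement is rewritten, not only one on line 1.
converted = re.sub(r"^print\s(.+)", r"print(\1)", data, flags=re.M)
print(converted)
```

Because . does not match a newline by default, the group (.+) stops at the end of each line, so every print statement is wrapped separately.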

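Look at the indentation. In your code, the only line inside the with block is the one that reads the text; you can also put the line that changes the text and the line that writes the new file inside it. Note that ^ still needs the re.M flag to match at the start of each line.
Try this:
import re
with open('file.txt', 'r') as f:
    with open('file-2.txt', 'w') as fout:
        data = f.read()
        # re.M lets ^ match at the start of every line, not just the first
        newline = re.sub(r"^print\s(.+)", r"print(\1)", data, flags=re.M)
        fout.write(newline)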

Related

How to convert multicharacter single line into string in Python

Hello, I have lines like the ones below in a file.
I want to convert everything from Text:0 through 8978 into a single string, and the same for the other part, i.e. from Text:1 through 8978.
Text:0
6786993cc89 70hgsksgoop 869368
7897909086h fhsi799hjdkdh 099h
Gsjdh768hhsj dg9978hhjh98 8978
Text:1
8786993cc89 70hgsksgoop 869368
7897909086h fhsi799hjdkdh 099h
Gsjdh768hhsj dg9978hhjh98 8978
I am getting output as
6
7
G
8
7
G
But I want only the first character of string one and of string two as output:
6
8
Code is :
file = open ('tem.txt','r')
lines = file.readlines()
print(lines)
for line in lines:
    line=line.strip()
    linex=line.replace(' ','')
    print(linex)
    print (linex[0])

I'm not sure exactly what you need, so:
#1. If you only need to print the first character (6), I think your code is right.
#2. If you need to print the first part of the string (before the space), this can help you:
line="6786993cc8970hgsksgoop869368 7897909086hfhsi799hjdkdh099h Gsjdh768hhsjdg9978hhjh988978"
print(line[0])
print(line.split(' ')[0])
EDIT
To read a file....
file = open('file.txt', 'r')
Lines = file.readlines()
file.close()
for line in Lines:
    print(line.split(' ')[0])
New EDIT
First you need to join the lines of each block into one string, and after that get the first element. Try this, please:
file = open ('tem.txt','r')
lines = file.readlines()
file.close()
linesArray = []
lineTemp = ""
for line in lines:
    if 'Text' in line:
        # a new block starts: store the previous one, if any
        if lineTemp:
            linesArray.append(lineTemp)
            lineTemp = ""
    else:
        lineTemp += line.strip()
# store the last block
linesArray.append(lineTemp)
for newline in linesArray:
    print(newline.split(' ')[0][0])

This should work only if you want to view the first character. Essentially, this code will read your text file, convert multiple lines in the text file to one single string and print out the required first character.
with open(r'tem.txt', 'r') as f:
    data = f.readlines()
line = ''.join(data)
print(line[0])
EDITED RESPONSE
Try using regex. Hope this helps.
import re
pattern = re.compile(r'(Text:[0-9]+\s)+')
with open(r'tem.txt', 'r') as f:
    data = f.readlines()
data = [i for i in data if len(i.strip())>0]
line = ' '.join([i.strip() for i in data if len(i)>0]).strip()
occurences = re.findall(pattern, line)
for i in occurences:
    match_i = re.search(i, line)
    start = match_i.end()
    print(line[start])
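The grouping idea from the answers above can also be sketched without a regex. The function name first_chars_per_block and the marker parameter are hypothetical, and the sample list stands in for the result of readlines() on tem.txt.

```python
def first_chars_per_block(lines, marker="Text:"):
    """Join the lines under each marker line into one string and
    return the first character of each joined string."""
    blocks = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(marker):
            # A marker line opens a new block.
            current = []
            blocks.append(current)
        elif current is not None:
            current.append(line)
    joined = [" ".join(block) for block in blocks]
    return [text[0] for text in joined if text]


# Sample data standing in for file.readlines() (assumed for illustration).
sample = [
    "Text:0\n",
    "6786993cc89 70hgsksgoop 869368\n",
    "7897909086h fhsi799hjdkdh 099h\n",
    "Gsjdh768hhsj dg9978hhjh98 8978\n",
    "Text:1\n",
    "8786993cc89 70hgsksgoop 869368\n",
    "7897909086h fhsi799hjdkdh 099h\n",
    "Gsjdh768hhsj dg9978hhjh98 8978\n",
]
print(first_chars_per_block(sample))  # ['6', '8']
```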

reading .txt file in python

I have a problem with a code in python. I want to read a .txt file. I use the code:
f = open('test.txt', 'r') # We need to re-open the file
data = f.read()
print(data)
I would like to read ONLY the first line from this .txt file. I use
f = open('test.txt', 'r') # We need to re-open the file
data = f.readline(1)
print(data)
But only the first letter of the line is shown on screen.
Could you help me read all the letters of the line? (I mean, read the whole first line of the .txt file.)

with open("file.txt") as f:
    print(f.readline())
This opens the file in a with block (a context manager, which closes the file automatically when we are done with it) and reads the first line. It is equivalent to:
f = open("file.txt")
print(f.readline())
f.close()
Your attempt with f.readline(1) won't work because the argument is the maximum number of characters to read from the line, so it returns only the first character.
Second method:
with open("file.txt") as f:
    print(f.readlines()[0])
Or you could also do the above which will get a list of lines and print only the first line.
To read the fifth line, use
with open("file.txt") as f:
    print(f.readlines()[4])
Or:
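with open("file.txt") as f:
    lines = []
    # append() adds each line as one item; += would add it character by character
    lines.append(f.readline())
    lines.append(f.readline())
    lines.append(f.readline())
    lines.append(f.readline())
    lines.append(f.readline())
    print(lines[-1])
The index -1 refers to the last item of the list.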
Learn more:
with statement
files in python
readline method

Your first try is almost there, you should have done the following:
f = open('my_file.txt', 'r')
line = f.readline()
print(line)
f.close()
A safer way to read a file is:
with open('my_file.txt', 'r') as f:
    print(f.readline())
Both ways will print only the first line.
Your error was that you passed 1 to readline, which asks it to read at most 1 character, so you got only a single character. Please refer to https://www.w3schools.com/python/ref_file_readline.asp

I tried this and it works, after your suggestions:
f = open('test.txt', 'r')
data = f.readlines()[0]  # index 0 is the first line; [1] would be the second
print(data)

Use with open(...) instead:
with open("test.txt") as file:
    line = file.readline()
print(line)

Call f.readline() without arguments.
It returns the first line as a string and moves the file position to the start of the second line.
The next call to f.readline() returns the second line and moves the position on to the next one, and so on.
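The behaviour described in these answers can be checked with a small sketch. The temporary file and its two lines are assumptions made so the example is self-contained.

```python
import os
import tempfile

# Create a small throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")
    path = tmp.name

with open(path) as f:
    one_char = f.readline(1)  # size=1: read at most one character
    rest = f.readline()       # continues from the current position
    second = f.readline()     # the next call moves on to line two

os.remove(path)
print(repr(one_char), repr(rest), repr(second))
```

The size argument limits how many characters are read, and each call continues from where the previous one stopped.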

How to read quoted string from File and write it without quotes?

I am trying to write a python script to convert rows in a file to json output, where each line contains a json blob.
My code so far is:
with open( "/Users/me/tmp/events.txt" ) as f:
    content = f.readlines()
# strip to remove newlines
lines = [x.strip() for x in content]
i = 1
for line in lines:
    filename = "input" + str(i) + ".json"
    i += 1
    f = open(filename, "w")
    f.write(line)
    f.close()
However, I am running into an issue where if I have an entry in the file that is quoted, for example:
client:"mac"
This will be output as:
"client:""mac"""
Using a second strip on writing to file will give:
client:""mac
But I want to see:
client:"mac"
Is there any way to force Python to read text in the format ' "something" ' without appending extra quotes around it?

Instead of creating an auxiliary list to strip the newlines from content, just open the input and output files at the same time. Write to the output file as you iterate through the lines of the input, stripping whatever you deem necessary. Try something like this:
# text mode, so that str.strip() can be used on each line
with open('events.txt', 'r') as infile, open('input1.json', 'w') as outfile:
    for line in infile:
        line = line.strip('"')
        outfile.write(line)
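For comparison, here is a minimal sketch of writing each line to its own file with a plain write() and reading it back. The sample lines and the temporary output directory are assumptions for illustration. write() stores the string as given, so the quotes come back unchanged.

```python
import os
import tempfile

# Sample lines standing in for the stripped contents of events.txt (assumed).
lines = ['client:"mac"', '{"event": "login"}']

with tempfile.TemporaryDirectory() as outdir:
    written = []
    for i, line in enumerate(lines, start=1):
        filename = os.path.join(outdir, "input{}.json".format(i))
        with open(filename, "w") as out:
            out.write(line)  # the string is written exactly as it is
        with open(filename) as check:
            written.append(check.read())

print(written)
```

If doubled quotes still appear, one thing worth checking is whether another tool (for example a CSV writer or a spreadsheet viewer) is applying its own quoting; that is a guess, since the question does not say how the output was inspected.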

How to write before & after certain substrings in an Open file in python?

I'm trying to figure out how to read a file, find certain substrings, and edit the input file to write characters before and after each substring, but I'm stuck. I can only figure out how to write to the end of a file, not somewhere in the middle of a line in the middle of the file!
So for example, say I have a text file:
blah blurh blap
then I have code:
f = open('inputFile.txt', 'r+')
for line in f:
    if 'blah' in line:
        f.write('!')
f.close()
The way it is written above, the resulting text would say something like:
blah blurh blap!
but I need a way to figure out for it to say:
!blah! blurh blap
and I can't figure it out and can't find anything online about it. Any ideas?

One way to do this, as mentioned in the comments, is to write to a different, temporary file and then rename it.
This approach uses less memory, although it briefly occupies twice the space on disk.
import os
with open('inputFile.txt', 'r') as inp, open('outfile.txt', 'w') as out:
    for line in inp:
        out.write(line.replace('blah', '!blah!'))
# Windows doesn't let you rename onto an existing file, so remove the old input first
os.unlink('inputFile.txt')
os.rename('outfile.txt', 'inputFile.txt')
Or you can load the file entirely into memory, then rewrite it.
with open('inputFile.txt', 'r') as inp:
    fixed = inp.read().replace('blah', '!blah!')
with open('inputFile.txt', 'w') as out:
    out.write(fixed)
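A variation on the temporary-file approach, sketched under the assumption of Python 3.3 or later: os.replace() overwrites the destination in one step (also on Windows), so the separate unlink is not needed. The helper name replace_in_file is hypothetical, and the demo file is created in a throwaway directory.

```python
import os
import tempfile


def replace_in_file(path, old, new):
    """Rewrite path line by line via a temporary file in the same directory."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as out, open(path) as inp:
            for line in inp:
                out.write(line.replace(old, new))
        # os.replace overwrites the target if it already exists.
        os.replace(tmp_path, path)
    except Exception:
        # Clean up the temporary file if anything went wrong.
        os.remove(tmp_path)
        raise


# Demo on a throwaway file (contents assumed from the question).
with tempfile.TemporaryDirectory() as demo_dir:
    demo = os.path.join(demo_dir, "inputFile.txt")
    with open(demo, "w") as f:
        f.write("blah blurh blap\n")
    replace_in_file(demo, "blah", "!blah!")
    with open(demo) as f:
        result = f.read()

print(result)
```

Creating the temporary file in the same directory as the target keeps both on the same filesystem, which the final replace step relies on.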

Open the file, use replace() to modify the content and save the result to a string. Then you can write the string to your file.
file_name = 'inputFile.txt'
with open(file_name, 'r') as f:
    file_content = f.read().replace('blah', '!blah!')
with open(file_name, 'w') as f:
    f.write(file_content)

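The only way I know to do this sort of thing is to write to a new file and rename it to the old file name at the end. Something like:
import os
import tempfile

def mod_inline(myfilepath):
    # os.tmpnam() was removed in Python 3; mkstemp() creates the temp file safely
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(os.path.abspath(myfilepath)))
    with os.fdopen(fd, 'w') as outfile:
        with open(myfilepath, 'r') as infile:
            for line in infile:
                if 'blah' in line:
                    # put the '!' before the newline, not after it
                    outfile.write(line.rstrip('\n') + '!\n')
                else:
                    outfile.write(line)
    # os.replace overwrites the old file, also on Windows
    os.replace(tmp, myfilepath)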

Input = sample.txt
blah blub blur
test hello world
Code - Read the file, operate on the lines, output to same file
filename = 'sample.txt'
# Read the file
with open(filename) as f:
    file_lines = f.readlines()
# Operate on the lines
char = '!'
replace = 'blah'
for i,line in enumerate(file_lines):
    file_lines[i] = line.replace(replace, '{0}{1}{0}'.format(char, replace))
# Overwrite the file with the new content
with open(filename, "w") as f:
    for line in file_lines:
        f.write(line)
Output - characters surrounding the string
!blah! blub blur
test hello world

Here's an approach with the re module, which allows you to be a little more flexible and define multiple substrings to be surrounded by another string.
Code/Demo:
import re
def surround_keysubs(s, ksubs, char):
    # build an alternation such as 'blah|bar'
    regex = '|'.join(ksubs)
    repl_fun = lambda m: '{}{}{}'.format(char, m.group(), char)
    return re.sub(regex, repl_fun, s)
keysubs = {'blah', 'bar'}
char = '!'
with open('testfile') as f:
    content = surround_keysubs(f.read(), keysubs, char)
with open('testfile', 'w') as out:
    out.write(content)
Demo:
$ cat testfile
blah blurh blap
foo bar buzz
blah blurh blap
$ python surround_keysubs.py
$ cat testfile
!blah! blurh blap
foo !bar! buzz
!blah! blurh blap
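One caveat with joining the substrings directly into a pattern: characters such as . or + have a special meaning in a regex. A hedged sketch of a variant that escapes each substring first follows; the function name surround_literals and the sample text are assumptions made for illustration.

```python
import re


def surround_literals(text, substrings, char):
    # Escape each substring so characters such as '.' or '+' match literally.
    # Longest first, so longer alternatives win over their own prefixes.
    ordered = sorted(substrings, key=len, reverse=True)
    pattern = "|".join(re.escape(s) for s in ordered)
    return re.sub(pattern, lambda m: char + m.group() + char, text)


result = surround_literals("a+b costs 3.50, a costs 1", ["a+b", "3.50"], "!")
print(result)  # !a+b! costs !3.50!, a costs 1
```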

Replace character in line inside a file

I have these different lines with values in a text file
sample1:1
sample2:1
sample3:0
sample4:15
sample5:500
and I want the number after the ":" to be updated sometimes
I know I can split the name by ":" and get a list with 2 values.
f = open("test.txt","r")
lines = f.readlines()
lineSplit = lines[0].split(":",1)
lineSplit[1] #this is the value I want to change
I'm not quite sure how to write the updated lineSplit[1] value back to the file.

You can use the fileinput module, if you're trying to modify the same file:
>>> strs = "sample4:15"
Take advantage of sequence unpacking to store the results in variables after splitting.
>>> sample, value = strs.split(':')
>>> sample
'sample4'
>>> value
'15'
Code:
import fileinput
for line in fileinput.input(filename, inplace=True):
    sample, value = line.split(':')
    value = int(value)  # convert value to int for calculation purposes
    if some_condition:
        # do some calculations on sample and value
        # modify sample, value if required
        pass
    # now write the data (either modified or still the old one) back to the file;
    # with inplace=True, print() output goes into the file
    print("{}:{}".format(sample, value))
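A self-contained sketch of the same fileinput idea is shown below. The temporary file, its contents, and the condition on sample3 are assumptions for illustration; the context-manager form of fileinput.input() requires Python 3.2 or later.

```python
import fileinput
import os
import tempfile

# Throwaway file standing in for test.txt (contents assumed from the question).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("sample1:1\nsample2:1\nsample3:0\n")
    filename = tmp.name

# With inplace=True, anything printed inside the loop is written to the file.
with fileinput.input(filename, inplace=True) as stream:
    for line in stream:
        sample, value = line.rstrip("\n").split(":", 1)
        if sample == "sample3":  # hypothetical condition
            value = int(value) + 41
        print("{}:{}".format(sample, value))

with open(filename) as f:
    updated = f.read()
os.remove(filename)
print(updated)
```

Every line has to be printed back, changed or not, because lines that are not printed are dropped from the rewritten file.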

Strings are immutable, meaning, you can't assign new values inside them by index.
But you can split up the whole file into a list of lines, and change individual lines (strings) entirely. This is what you're doing in lineSplit[1] = A_NEW_INTEGER
with open(filename, 'r') as f:
    lines = f.read().splitlines()
for i, line in enumerate(lines):
    if condition:
        lineSplit = line.split(':')
        lineSplit[1] = str(new_integer)  # join() needs strings
        lines[i] = ':'.join(lineSplit)
with open(filename, 'w') as f:
    f.write('\n'.join(lines))

Maybe something like this (assuming that the first element of each line, before the :, is indeed a key):
from collections import OrderedDict
with open('fin') as fin:
    # strip the newline so it doesn't end up inside the values
    samples = OrderedDict(line.rstrip('\n').split(':', 1) for line in fin)
samples['sample3'] = 'something else'
with open('output', 'w') as fout:  # 'w' is needed to write
    lines = (':'.join(el) + '\n' for el in samples.items())
    fout.writelines(lines)

Another option is to use the csv module (: is the column delimiter in your case).
Assuming there is a test.txt file with the following content:
sample1:1
sample2:1
sample3:0
sample4:15
sample5:500
And you need to increment each value. Here's how you can do it:
import csv
# read the file
with open('test.txt', 'r', newline='') as f:
    reader = csv.reader(f, delimiter=":")
    lines = [line for line in reader]
# write the file
with open('test.txt', 'w', newline='') as f:
    writer = csv.writer(f, delimiter=":")
    for line in lines:
        # edit the data here
        # e.g. increment each value
        line[1] = int(line[1]) + 1
    writer.writerows(lines)
The contents of test.txt are now:
sample1:2
sample2:2
sample3:1
sample4:16
sample5:501
But, anyway, fileinput sounds more logical to use in your case (editing the same file).
Hope that helps.
