How to convert multiple lines into a single string in Python

Hello, I have lines like the ones below in a file.
I want to convert everything from Text:0 up to 8978 into a single string, and likewise for the other part, i.e. from Text:1 up to 8978.
Text:0
6786993cc89 70hgsksgoop 869368
7897909086h fhsi799hjdkdh 099h
Gsjdh768hhsj dg9978hhjh98 8978
Text:1
8786993cc89 70hgsksgoop 869368
7897909086h fhsi799hjdkdh 099h
Gsjdh768hhsj dg9978hhjh98 8978
I am getting output as
6
7
G
8
7
G
But I want only the first character of string one and of string two, i.e.:
6
8
My code is:
file = open('tem.txt', 'r')
lines = file.readlines()
print(lines)
for line in lines:
    line = line.strip()
    linex = line.replace(' ', '')
    print(linex)
    print(linex[0])

I'm not sure exactly what you need, so:
1. If you only need to print the first character (6), I think your code is right.
2. If you need to print the first part of the string (before the space), this may help:
line="6786993cc8970hgsksgoop869368 7897909086hfhsi799hjdkdh099h Gsjdh768hhsjdg9978hhjh988978"
print(line[0])
print(line.split(' ')[0])
EDIT
To read a file....
file = open('file.txt', 'r')
Lines = file.readlines()
file.close()
for line in Lines:
    print(line.split(' ')[0])
New EDIT
First you need to group the lines of the file into blocks; after that you can get the first element of each block. Try this:
file = open('tem.txt', 'r')
lines = file.readlines()
file.close()
linesArray = []
lineTemp = ""
for line in lines:
    if 'Text' in line:
        if lineTemp:
            linesArray.append(lineTemp)
            lineTemp = ""
    else:
        lineTemp += line.strip()
linesArray.append(lineTemp)
for newline in linesArray:
    print(newline.split(' ')[0][0])

This should work only if you want to view the first character. Essentially, this code will read your text file, convert multiple lines in the text file to one single string and print out the required first character.
with open(r'tem.txt', 'r') as f:
    data = f.readlines()
line = ''.join(data)
print(line[0])
EDITED RESPONSE
Try using regex. Hope this helps.
import re
pattern = re.compile(r'(Text:[0-9]+\s)+')
with open(r'tem.txt', 'r') as f:
    data = f.readlines()
data = [i for i in data if len(i.strip())>0]
line = ' '.join([i.strip() for i in data if len(i)>0]).strip()
occurences = re.findall(pattern, line)
for i in occurences:
    match_i = re.search(i, line)
    start = match_i.end()
    print(line[start])
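The grouping idea from the answers above can also be written as a small function. This is only a sketch: group_blocks and its marker parameter are made-up names, and it assumes every block starts with a line beginning with "Text:". The sample list stands in for file.readlines().

```python
def group_blocks(lines, marker="Text:"):
    """Join the lines under each marker line into one string per block."""
    blocks = []
    current = None  # stays None until the first marker line is seen
    for raw in lines:
        line = raw.strip()
        if line.startswith(marker):
            if current is not None:
                blocks.append(current)
            current = ""
        elif current is not None and line:
            # drop the inner spaces so the block becomes one unbroken string
            current += line.replace(" ", "")
    if current is not None:
        blocks.append(current)
    return blocks


# Sample data copied from the question; it stands in for file.readlines().
sample = [
    "Text:0\n",
    "6786993cc89 70hgsksgoop 869368\n",
    "7897909086h fhsi799hjdkdh 099h\n",
    "Gsjdh768hhsj dg9978hhjh98 8978\n",
    "Text:1\n",
    "8786993cc89 70hgsksgoop 869368\n",
    "7897909086h fhsi799hjdkdh 099h\n",
    "Gsjdh768hhsj dg9978hhjh98 8978\n",
]

for block in group_blocks(sample):
    print(block[0])  # first character of each joined block
```

With the sample from the question this prints 6 and then 8. To run it on the real file, pass the result of readlines() instead of sample.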

Related

How to read particular text lines with python? [duplicate]

This question already has answers here:
Python: Searching for text between lines with keywords
(2 answers)
Closed last month.
I want to read particular lines from a text file, e.g. all the content between the two "This contents information" lines.
I have created a script to perform the task, but it's not a good method. Is there a better way to do it?
readText=open("test.txt","r")
wanted_lines = [4,5,6,7]
count = 1
with open('test.txt', 'r') as infile:
    for line in infile:
        line = line.strip()
        if count in wanted_lines:
            print(line)
        else:
            pass
        count += 1
You can compare each line to the sentinel, start outputting once it matches, and stop outputting once it matches again:
with open('test.txt') as infile:
    for output in False, True:
        for line in map(str.rstrip, infile):
            if line == 'This contents information':
                break
            if output:
                print(line)
Demo: https://replit.com/#blhsing/TroubledMysteriousMonitors
You could consider reading the entire text file into a string, and then using a regular expression to extract the contents you want:
import re
with open('test.txt', 'r') as file:
    data = file.read()
contents = re.search(r'^This contents information\n(.*?)\nThis contents information\b', data, flags=re.M|re.S).group(1)
print(contents)
This prints:
City:LK
Country:LL
Postcode:123
You can use split, with "This contents information" as the delimiter.
In the example above, the file will be split into 3 sections, of which we only need to grab the second one (index=1). You can then use .strip() to remove unwanted space.
Code:
with open('test.txt', 'r') as infile:
    text = infile.read()
required_info = text.split("This contents information")[1].strip()
print(required_info)
Output:
City:LK
Country: LL
Postcode:123
Instead of prewriting the line numbers, just have a conditional statement that checks for the data you want.
printline = False  # becomes True once the start marker has been seen
with open('test.txt', 'r') as infile:
    for line in infile:
        line = line.strip()
        if line == "text to look for":
            printline = True
        elif line == "text to end content":
            printline = False
        elif printline:
            print(line)
I think the best method would be to use regex.
import re
text=""
with open('test.txt', 'r') as infile:
    text = infile.read()
# Replace the marker text below with the text that surrounds what you want to find.
# This contents information(.*?)\nThis contents information
# this regex finds everything between these two markers
# example: 'test 123asda test' -> test(.*?)test => ' 123asda '
regex = re.compile(r'This contents information(.*?)\nThis contents information', re.DOTALL)
matches = [m.groups()[0] for m in regex.finditer(text)]
for m in matches:
    print(f'{m.strip()}')
import re
with open("file.txt","r") as f:
    data = f.readlines()
string = "".join(data)  # join all lines into one string
ls = re.split(r"(\n*?)This contents information\n", string)  # split the string at each marker line
for i in range(len(ls)):  # print each piece of the split
    print(ls[i])
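If you would rather not read the whole file into memory, the sentinel approach can be wrapped in a generator. This is a rough sketch: lines_between is a made-up name, and the sample list (including the Name and Other lines) is invented for illustration, since the question does not show the full file.

```python
def lines_between(lines, sentinel="This contents information"):
    """Yield the lines found between the first two sentinel lines."""
    inside = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.strip() == sentinel:
            if inside:
                return  # second sentinel reached: stop reading
            inside = True
        elif inside:
            yield line


# Hypothetical file contents; only the three middle lines come from the answers above.
sample = [
    "Name:XX\n",
    "This contents information\n",
    "City:LK\n",
    "Country:LL\n",
    "Postcode:123\n",
    "This contents information\n",
    "Other:YY\n",
]

for line in lines_between(sample):
    print(line)
```

An open file object can be passed in place of sample, because the function only iterates over its argument line by line.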

how to split file lines based on comma and concatenate new string

I have a file, /tmp/file, with lines like the ones below:
cat /tmp/file
server1,server2
server3,server4
I want to append ".com" to each name on every line so it looks like the output below.
When I convert and split the lines it doesn't work. Please guide me.
newfile = server1.com,server2.com
server3.com,server4.com
with open('/tmp/file', 'r') as file1:
    newline = ''
    for line in file1:
        y = line.split()
        print(y)
    for line in y:
        z = str(y).split(',')
        print(z)
        newline = str(z)+".com".join('' )
        print(newline)
Results :
['server1,server2']
['server3,server4']
["['server3", "server4']"]
["['server3", "server4']"]
Expected :
server1.com,server2.com
server3.com,server4.com
with open('/tmp/file', 'r') as r:
    for line in r:
        newline = line.strip()
        newline = newline.split(',')
        for i in range(len(newline)):
            newline[i] = newline[i] + ".com"
        newline = ",".join(newline)
        print(newline)
This code could be shortened considerably, but it is written this way to make each step easy to follow.
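For reference, a shorter variant of the same idea might look like this. It is a sketch only: add_suffix is a made-up helper name, and the two sample lines stand in for the contents of /tmp/file.

```python
def add_suffix(line, suffix=".com"):
    """Append suffix to every comma-separated name on one line."""
    names = [name.strip() for name in line.strip().split(",")]
    # skip empty fields so a trailing comma does not produce a bare ".com"
    return ",".join(name + suffix for name in names if name)


# Stand-in for iterating over open('/tmp/file').
for line in ["server1,server2\n", "server3,server4\n"]:
    print(add_suffix(line))
```

This prints server1.com,server2.com and server3.com,server4.com, matching the expected output in the question.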

Locate a specific line in a file based on user input then delete a specific number of lines

I'm trying to delete specific lines in a text file. The way I need to go about it is by prompting the user for a string (a phrase that should exist in the file). The file is then searched, and if the string is there, the data on that line and the line number are both stored.
After the phrase has been found, that line and the five following lines are printed out. Now I have to figure out how to delete those six lines without changing any other text in the file, which is my issue.
Any ideas as to how I can delete those six lines?
This was my latest attempt to delete the lines:
file = open('C:\\test\\example.txt', 'a')
locate = "example string"
for i, line in enumerate(file):
    if locate in line:
        line[i] = line.strip()
        i = i+1
        line[i] = line.strip()
        i = i+1
        line[i] = line.strip()
        i = i+1
        line[i] = line.strip()
        i = i + 1
        line[i] = line.strip()
        i = i+1
        line[i] = line.strip()
        break
Usually I would not think it's desirable to overwrite the source file - what if the user does something by mistake? If your project allows, I would write the changes out to a new file.
with open('source.txt', 'r') as ifile:
    with open('output.txt', 'w') as ofile:
        locate = "example string"
        skip_next = 0
        for line in ifile:
            if locate in line:
                skip_next = 5  # also drop the five lines after the match (six in total)
                print(line.rstrip('\n'))
            elif skip_next > 0:
                print(line.rstrip('\n'))
                skip_next -= 1
            else:
                ofile.write(line)
This is also robust to finding the phrase multiple times - it will just start counting lines to remove again.
You can find the occurrences, copy the list items between the occurrences to a new list and then save the new list into the file.
_newData = []
_linesToSkip = 3
with open('data.txt', 'r') as _file:
    data = _file.read().splitlines()
    occurrences = [i for i, x in enumerate(data) if "example string" in x]
    _lastOcurrence = 0
    for ocurrence in occurrences:
        _newData.extend(data[_lastOcurrence : ocurrence])
        _lastOcurrence = ocurrence + _linesToSkip
    _newData.extend(data[_lastOcurrence:])
    # Save new data into the file
There are a couple of points that you clearly misunderstand here:
.strip() removes whitespace or given characters:
>>> print(str.strip.__doc__)
S.strip([chars]) -> str
Return a copy of the string S with leading and trailing
whitespace removed.
If chars is given and not None, remove characters in chars instead.
incrementing i doesn't actually do anything:
>>> for i, _ in enumerate('ignore me'):
...     print(i)
...     i += 10
...
0
1
2
3
4
5
6
7
8
You're assigning to the ith element of the line, which should raise an exception (that you neglected to tell us about)
>>> line = 'some text'
>>> line[i] = line.strip()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
Ultimately...
You have to write to a file if you want to change its contents. Writing to a file that you're reading from is tricky business. Writing to an alternative file, or just storing the file in memory if it's small enough is a much healthier approach.
search_string = 'example'
lines = []
with open('/tmp/fnord.txt', 'r+') as f: #`r+` so we can read *and* write to the file
    for line in f:
        line = line.strip()
        if search_string in line:
            print(line)
            for _ in range(5):
                print(next(f).strip())
        else:
            lines.append(line)
    f.seek(0) # back to the beginning!
    f.truncate() # goodbye, original lines
    for line in lines:
        print(line, file=f) # python2 requires `from __future__ import print_function`
There is a fatal flaw in this approach, though - if the sought after line is any closer than the 6th line from the end, it's going to have problems. I'll leave that as an exercise for the reader.
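One way to sidestep that end-of-file problem is to count down the lines to drop instead of calling next(). Here is a minimal sketch, assuming the file is small enough to hold in memory as a list of lines. remove_matches and its following parameter are made-up names, and the sample data is invented.

```python
def remove_matches(lines, phrase, following=5):
    """Return lines without each matching line and the `following` lines after it."""
    kept = []
    to_skip = 0
    for line in lines:
        if phrase in line:
            to_skip = following + 1  # the matching line itself plus the ones after it
        if to_skip:
            to_skip -= 1  # running out of lines early simply ends the loop
        else:
            kept.append(line)
    return kept


# Invented sample: ten lines, with the third one containing the phrase.
sample = [f"line {n}\n" for n in range(1, 11)]
sample[2] = "example string here\n"

print(remove_matches(sample, "example string"))
```

Because nothing reads ahead, a match within the last five lines just drops whatever is left, with no StopIteration. The returned list can then be written back with writelines().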
You are appending to your file by using open with 'a'. Also, you are not closing your file (bad habit). str.strip() does not delete the line, it removes whitespace by default. Also, this would usually be done in a loop.
This should get you started:
locate = "example string"
n=0
with open('example.txt', 'r+') as f:
    for i,line in enumerate(f):
        if locate in line:
            n = 6
        if n:
            print( line, end='' )
            n-=1
    print( "done" )
Edit:
Read-modify-write solution:
locate = "example string"
filename='example.txt'
removelines=5
with open(filename) as f:
    lines = f.readlines()
with open(filename, 'w') as f:
    n=0
    for line in lines:
        if locate in line:
            n = removelines+1
        if n:
            n-=1
        else:
            f.write(line)

Extracting certain strings from a file using Python

I have a file with some lines. Out of those lines I will choose only the lines which start with xxx. The lines which start with xxx have the following pattern:
xxx:(12:"pqrs",223,"rst",-90)
xxx:(23:"abc",111,"def",-80)
I want to extract only the strings which are in the first pair of double quotes,
i.e. "pqrs" and "abc".
Any help using regex is appreciated.
My code is as follows:
with open("log.txt","r") as f:
         f = f.readlines()
    for line in f:
        line=line.rstrip()
        for phrase in 'xxx:':
            if re.match('^xxx:',line):
                c=line
                break
This code is giving me an error.
Your code is wrongly indented. Your f = f.readlines() has 9 spaces in front while for line in f: has 4 spaces. It should look like below.
import re
list_of_prefixes = ["xxx","aaa"]
resulting_list = []
with open("raw.txt","r") as f:
    f = f.readlines()
for line in f:
    line=line.rstrip()
    for phrase in list_of_prefixes:
        # raw string so the backslashes reach the regex engine unchanged
        pattern = phrase + r':\(\d+:"(\w+)'
        if re.match(pattern, line) is not None:
            resulting_list.append(re.findall(pattern, line)[0])
Well you are heading in the right direction.
If the input is this simple, you can use regex groups.
import re
with open("log.txt","r") as f:
    f = f.readlines()
for line in f:
    line=line.rstrip()
    m = re.match(r'^xxx:\(\d*:("[^"]*")',line)
    if m is not None:
        print(m.group(1))
All the magic is in the regular expression.
^xxx:\(\d*:("[^"]*") means
Start from the beginning of the line, match on "xxx:(<any number of digits>:"<anything but ">"
and because the sequence "<anything but ">" is enclosed in round brackets, it will be available as a group (by calling m.group(1)).
PS: next time, make sure to include the exact error you are getting.
results = []
with open("log.txt","r") as f:
    f = f.readlines()
for line in f:
    if line.startswith("xxx"):
        line = line.split(":") # line[2] will be what is after the second :
        result = line[2].split(",")[0][1:-1] # will be pqrs
        results.append(result)
You want to look for lines that start with xxx,
then split the line on the :. The first thing after the second : is what you want, up to the comma. Your result is that string with the quotes removed. There is no need for regex; Python string functions will be fine.
To check if a line starts with xxx do
line.startswith('xxx')
To find the text in first double-quotes do
re.search(r'"(.*?)"', line).group(1)
(as match.group(1) is the first parenthesized subgroup)
So the code will be
import re
with open("file") as f:
    for line in f:
        if line.startswith('xxx'):
            print(re.search(r'"(.*?)"', line).group(1))
re module docs
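Putting the two checks into one precompiled pattern could look like the sketch below. FIRST_QUOTED and first_quoted are made-up names, and the yyy line in the sample is invented to show a line that gets skipped. The pattern assumes the exact xxx:(<digits>:"..." layout shown in the question.

```python
import re

# Anchored at the start of the line: xxx:( then digits, a colon, and the first quoted string.
FIRST_QUOTED = re.compile(r'^xxx:\(\d+:"([^"]*)"')


def first_quoted(lines):
    """Collect the first double-quoted string from every line starting with xxx."""
    found = []
    for line in lines:
        m = FIRST_QUOTED.match(line)
        if m:
            found.append(m.group(1))
    return found


# Two lines from the question plus one invented line that should be ignored.
sample = [
    'xxx:(12:"pqrs",223,"rst",-90)',
    'yyy:(1:"skip",0,"me",0)',
    'xxx:(23:"abc",111,"def",-80)',
]

print(first_quoted(sample))
```

For the sample this prints ['pqrs', 'abc']. Lines that do not match are skipped silently, so there is no AttributeError from calling .group() on None.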

Return First Letter of Line in File

I am trying to pull the first letter of every line in a file, then print those letters to a new file. I am working step by step, so I first wrote code that pulls the first letter of every line. However, when I added the code to read a specific file, it appears that it is not properly iterating over the entire file's content. Does anyone know why my for loop is not iterating? Or is the issue that it is iterating but not properly adding the letters to 'lines'?
def secret2(m):
    infile = open(m, 'r')
    text = infile.read()
    for line in text:
        lines = text[0]
        for i in range(len(text)):
            if text[i] == '\n':
                lines += text[i+1]
            print(lines)
            return(lines)
    m.close()
Output:
>>> secret2('file.txt')
A
'A'
>>>
Proper output would be:
>>> secret2('file.txt')
'ALICE'
>>>
Your code is iterating over the characters instead of lines. You could print the first character from each line with following code:
def secret2(m):
    with open(m) as infile:
        print(''.join(line[0] for line in infile if line))
You want to treat each line as a single item, so use readlines() instead of read(). Your code should be:
def secret2(m):
    infile = open(m, 'r')
    text = infile.readlines()
    infile.close()
    for j in text:
        print(j[0])
You can use this:
def get_1st_chr(your_file, id_line):
    with open(your_file) as f:
        text_splitted = f.read().splitlines()
    return text_splitted[id_line][0]
Or, if you want the first character of all the lines:
def get_1st_chr(your_file, nb_lines):
    with open(your_file) as f:
        text_splitted = f.read().splitlines()
    for i in range(nb_lines):
        print(text_splitted[i][0])
You could replace 0 with the index of the character you want to print, of course.
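Since the question also asks to write the letters to a new file, here is a sketch that covers both steps. first_letters and write_first_letters are made-up names, and the five sample words are invented so that the result spells the ALICE output from the question.

```python
def first_letters(lines):
    """Return the first character of every non-blank line, joined into one string."""
    return "".join(line[0] for line in lines if line.strip())


def write_first_letters(src_path, dst_path):
    """Read src_path line by line and write the collected letters to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.write(first_letters(src) + "\n")


# Invented sample lines standing in for the contents of file.txt.
sample = ["Apple\n", "Lemon\n", "Iris\n", "Cherry\n", "Elder\n"]
print(first_letters(sample))
```

Blank lines are skipped, so an empty line in the middle of the file does not add a newline character to the result.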
