Print next x lines from string1 until string2 - python

I'm trying to write a function that reads through a text file until it finds a word (say "hello"), then print the next x lines of string starting with string 1 (say "start_description") until string 2 (say "end_description").
hello
start_description 123456 end_description
The function should look like description("hello") and the following output should look like
123456
It's a bit hard to explain. I know how to find the certain word in the text file but I don't know how to print, as said, the next few lines between the two strings (start_description and end_description).
EDIT1:
I found some code which allows printing the next 8, 9, ... lines. But because the text between the two strings is of variable length, that does not work.
EDIT2:
Basically it's the same question as in this post: Python: Print next x lines from text file when hitting string, but the range(8) does not work for me (see EDIT1).
The input file could look like:
HELLO
salut
A: 123456.
BYE
au revoir
A: 789123.
The code should then look like:
import re
def description(word):
    doc = open("filename.txt",'r')
    word = word.upper()
    for line in doc:
        if re.match(word,line):
            #here it should start printing all the text between start_description and end_description, for example 123456
    return output
print description("hello")
123456
print description("bye")
789123

Here's a way using split:
start_desc = 'hello'
end_desc = 'bye'
text = 'hello 12345\nabcd asdf\nqwer qwer erty\n bye'
print(text.split(start_desc)[1].split(end_desc)[0])
The first split will result in:
['', ' 12345\nabcd asdf\nqwer qwer erty\n bye']
So feed the second element to the second split and it will result in:
[' 12345\nabcd asdf\nqwer qwer erty\n ', '']
Use the first element.
You can then use strip() to remove the surrounding whitespace if you wish.
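The same idea can be wrapped in a small helper. This is only a minimal sketch: the name text_between is hypothetical, and it uses str.partition instead of split so that a missing marker can be reported as None rather than raising an IndexError.

```python
def text_between(text, start_marker, end_marker):
    """Return the text between the first start_marker and the next end_marker.

    Returns None when either marker is missing.
    """
    _, found_start, rest = text.partition(start_marker)
    if not found_start:
        return None
    middle, found_end, _ = rest.partition(end_marker)
    if not found_end:
        return None
    return middle.strip()


sample = 'hello\nstart_description 123456 end_description\n'
print(text_between(sample, 'start_description', 'end_description'))
```

Because the text is searched as one string, the description may span any number of lines.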

def description(infilepath, startblock, endblock, word, startdesc, enddesc):
    with open(infilepath) as infile:
        inblock = False
        name = None
        found = False
        answer = []
        for line in infile:
            if found and not inblock: return answer
            if line.strip() != startblock and not inblock: continue
            if line.strip() == startblock: inblock = True
            elif line.strip() == endblock: inblock = False
            if not line.startswith(startdesc):
                name = line.strip()
                continue
            if name is not None and name != word: continue
            if not line.startswith(startdesc): continue
            answer.append(line.strip().lstrip(startdesc).rstrip(enddesc))
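For the sample file from the question's EDIT2, a simpler line-by-line scan may be enough. The sketch below is an assumption-laden illustration, not the method of either answer: find_description is a hypothetical name, the keyword is assumed to sit alone on its own line in upper case, and the markers ("A: " and "." here) are passed in by the caller.

```python
def find_description(lines, word, start_marker, end_marker):
    """Return the text between start_marker and end_marker that follows `word`.

    `lines` is any iterable of strings (an open file works).
    Returns None if the keyword or either marker is not found.
    """
    keyword = word.upper()
    seen_keyword = False
    collecting = False
    collected = []
    for line in lines:
        if not seen_keyword:
            # Skip everything until the keyword line is reached
            seen_keyword = line.strip() == keyword
            continue
        if not collecting:
            # Look for the start marker and keep only what follows it
            _, found, line = line.partition(start_marker)
            if not found:
                continue
            collecting = True
        before, found, _ = line.partition(end_marker)
        collected.append(before)
        if found:
            return ''.join(collected).strip()
    return None


sample_lines = ["HELLO\n", "salut\n", "A: 123456.\n",
                "BYE\n", "au revoir\n", "A: 789123.\n"]
print(find_description(sample_lines, "hello", "A: ", "."))
print(find_description(sample_lines, "bye", "A: ", "."))
```

Since the function collects text until the end marker appears, the description can be of variable length and span several lines.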

Swap quoted word in random position with last word in Python

I have a txt file with lines of text like this, and I want to swap the word in
quotations with the last word that is separated from the sentence with a tab:
it looks like this:
This "is" a person are
She was not "here" right
"The" pencil is not sharpened a
desired output:
This "are" a person is
She was not "right" here
Some ideas:
#1: Use Numpy
Separate all the words by whitespace with numpy -> ['This','"is"','a','person','\t','are']
Problems:
How do I tell python the position of the quoted word
How to convert the list back to normal text. Concatenate all?
#2: Use Regex
Use regex and find the word in ""
with open('readme.txt','r') as x:
    x = x.readlines()
    swap = x[-1]
    re.findall(\"(\w+)\", swap)
Problems:
I don't know how to read the txt file with regex. Most examples I see here assign the entire sentence to a variable.
Is it something like this?
with open('readme.txt') as f:
    lines = f.readlines()
    lines.findall(....)
Thanks guys
You don't really need re for something this trivial.
Assuming you want to rewrite the file:
with open('foo.txt', 'r+') as txt:
    lines = txt.readlines()
    for k, line in enumerate(lines):
        words = line.split()
        for i, word in enumerate(words[:-1]):
            if word[0] == '"' and word[-1] == '"':
                words[i] = f'"{words[-1]}"'
                words[-1] = word[1:-1]
                break
        lines[k] = ' '.join(words[:-1]) + f'\t{words[-1]}'
    txt.seek(0)
    print(*lines, sep='\n', file=txt)
    txt.truncate()
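The per-line logic can also be pulled out into a function and tried on plain strings before any file is rewritten. This is a sketch under assumptions: swap_quoted_with_last is a hypothetical name, and lines with fewer than two words are returned unchanged, a case the loop above does not handle.

```python
def swap_quoted_with_last(line):
    """Swap the first double-quoted word with the last word of the line."""
    words = line.split()
    if len(words) < 2:
        return line.rstrip('\n')  # nothing to swap
    for i, word in enumerate(words[:-1]):
        if len(word) >= 2 and word[0] == '"' and word[-1] == '"':
            # Quote the last word in place, and move the unquoted word to the end
            words[i], words[-1] = f'"{words[-1]}"', word[1:-1]
            break
    # Rejoin with single spaces, keeping a tab before the final word
    return ' '.join(words[:-1]) + '\t' + words[-1]


print(swap_quoted_with_last('This "is" a person\tare\n'))
```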
This is my solution:
import re
regex = r'\"[\s\S]*\"'
file1 = open('test.txt', 'r')
while True:
    # Get next line from file
    line = file1.readline()
    # If line is empty, end of file is reached
    if not line:
        break
    get_tab = line.strip().split('\t')[1]
    print("original: {} mod ----> {}".format(line.strip(), re.sub(regex, get_tab, line.strip().split('\t')[0])))
Try:
import re
pat = re.compile(r'"([^"]*)"(.*\t)(.*)')
with open("your_file.txt", "r") as f_in:
    for line in f_in:
        print(pat.sub(r'"\3"\2\1', line.rstrip()))
Prints:
This "are" a person is
She was not "right" here
"a" pencil is not sharpened The
I guess this is also a way to solve it:
Input readme.txt contents:
This "is" a person are
She was not "here" right
"The" pencil is not sharpened a
Code:
import re
changed_text = []
with open('readme.txt') as x:
    for line in x:
        splitted_text = line.strip().split("\t")  # ['This "is" a person', 'are'] etc.
        match = re.search(r'"(.*)"', line.strip())
        if match:  # If a quote is found
            quoted_text = match.group(1)
            # Replace the quoted form only, so the "is" inside "This" is left alone
            changed_text.append(splitted_text[0].replace(f'"{quoted_text}"', f'"{splitted_text[1]}"') + "\t" + quoted_text)
with open('readme.txt.modified', 'w') as x:
    for line in changed_text:
        print(line)
        x.write(line + "\n")
Result (readme.txt.modified):
This "are" a person is
She was not "right" here
"a" pencil is not sharpened The

My code doesn't return anything from the wordlist

a_file = open(r"C:\Users\lisin\Desktop\Code\Bomb Party\wordlist.txt", "r")
list_of_lists = []
for line in a_file:
    stripped_line = line.strip()
    line_list = stripped_line.split()
    list_of_lists.append(line_list)
a_file.close()
wordlist = list_of_lists
contains = input("? ")
matches = [match for match in wordlist if str(contains) in match]
print(matches)
When I run the code and type in any letters, it returns nothing. The wordlist contains matching words, but nothing is returned. I'm trying to get every word that contains what you input.
Edit: I was not very clear about what I wanted to create. I want to be able to input a string, let's say "ee", and have it return any words that have "ee" in them, like "bee" or "free".
Fixed! It turns out it was making a list of lists and I did not realize that somehow. So I just converted the list into a string and then separated it into a list
def Convert(string):
    li = list(string.split(" "))
    return li
with a_file as myfile:
    x = myfile.read().replace('\n', ' ')
Sorry if I wasn't clear about what I wanted. Thanks anyway
Your problem is how you fill matches. Each match is a list of strings, not a string itself, which is why the condition is never fulfilled.
I assumed you wanted every occurrence of contains in wordlist. To resolve this, you want a list of coordinates as (line, word). Written your way, as a list of tuples of ints:
matches = [(index_line, index_word) for index_line, line_list in enumerate(wordlist) for index_word, word in enumerate(line_list) if word == contains]
I think it's hard to read. I recommend this instead:
matches = []
for index_line, line_list in enumerate(wordlist):
    for index_word, word in enumerate(line_list):
        if word == contains:
            matches.append((index_line, index_word))
print("Occurence(s) of the word can be found at :")
for match in matches:
    print(f" Line {match[0]}, word {match[1]}")
You may also use a function:
def matches(wordlist : list, contains : str) -> list:
    for index_line, line_list in enumerate(wordlist):
        for index_word, word in enumerate(line_list):
            if word == contains:
                yield (index_line, index_word)
print("Occurence(s) of the word can be found at :")
for match in matches(wordlist, contains):
    print(f" Line {match[0]}, word {match[1]}")
enumerate() yields two values for each element: its index in the list (counting from 0) and the element itself. yield is harder to understand, so I'll let you read the answer to this question.
Input :
Hello !
My name is John
And you ?
John too !
contains = "John"
Output :
Occurence(s) of the word can be found at :
Line 1, word 3
Line 3, word 0
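For the substring search described in the question's edit (type "ee", get back "bee" or "free"), a flat list of words and the in operator on each word are enough. A minimal sketch, assuming a hypothetical helper name words_containing and a small inline word list standing in for wordlist.txt:

```python
def words_containing(words, fragment):
    """Return every word that contains `fragment` as a substring."""
    return [word for word in words if fragment in word]


# str.split() with no arguments splits on any whitespace, including newlines,
# so the whole file content becomes one flat list of words
text = "bee\nfree\ntree\ncat\n"
wordlist = text.split()
print(words_containing(wordlist, "ee"))
```

The original comprehension failed because each element of wordlist was itself a list, so `in` tested for an exact list member instead of a substring.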

Go from '\\n' to '\n'

I want to get an input from the console that is a text with several lines. Since I want to do it all in a single input I have to mark '\n' in the middle to signal the different lines.
After I get the input I want to save the text in a matrix where each line is a line from the text with the words separated.
This is my function to do so:
def saveText(text):
    text = text[6:len(text)-6]
    line = 0
    array = [[]]
    cur = ""
    for i in range (len(text)):
        if (text[i] == '\n'):
            line+=1
            array.append([])
        else:
            if ((text[i] == ' ' or text[i] == ',' or text[i] == ';' or text[i] == '.') and cur != ""):
                array[line].append(cur)
                cur = ""
            else:
                cur += text[i]
    return array
However, when I print the variable array it appears as a matrix with only one line, and besides the '\n' are counted as words, they also appear as '\n'.
Can anyone help me with this?
You didn't provide an input string to test with, so I just made my own. You can use .split() to split on new lines and spaces (or anything else that you want).
Edit: I think I understand what you mean now. I think you are trying to have the user input newline characters \n when asking them for input. This isn't possible using input, but there is a workaround. I integrated the answer from that link into the code below.
If you instead wanted the user to manually write \n when getting input from them, then you'd need to change text.splitlines() to text.split('\\n'). You could also replace \\n with \n by using text.replace('\\n', '\n').
But I think it'd be less error-prone to just use the multi-line input as shown below and discussed further in the link above.
lines = []
while True:
    line = input()
    if line:
        lines.append(line)
    else:
        break
input_text = '\n'.join(lines)
def save_text(text):
    lines = text.splitlines()
    matrix = []
    for line in lines:
        matrix.append(line.split(' '))
    return matrix
print(save_text(input_text))
Input from user looks like this:
hello how
are you
doing this fine
day?
outputs:
[['hello', 'how'], ['are', 'you'], ['doing', 'this', 'fine'], ['day?']]
text = "line1: hello wolrd\nline2: test\nline2: i don't know what to write"
lines = [x.split() for x in text.split("\n")]
print(lines)
This will return:
[['line1:', 'hello', 'wolrd'], ['line2:', 'test'], ['line2:', 'i', "don't", 'know', 'what', 'to', 'write']]
The idea is the same as Stuart's solution; it's just a little bit more efficient.
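If the user really does type a backslash followed by n on a single input line, that two-character sequence has to be converted into a real newline before splitting. A small sketch, assuming a hypothetical function name split_typed_newlines and a raw string standing in for what input() would return:

```python
def split_typed_newlines(raw):
    """Turn typed backslash-n sequences into line breaks, then split into words."""
    # '\\n' is two characters: a backslash followed by the letter n
    real_text = raw.replace('\\n', '\n')
    return [line.split() for line in real_text.splitlines()]


typed = r'hello how\nare you'  # a raw string keeps the backslash literally
print(split_typed_newlines(typed))
```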

find words in txt files Python 3

I'd like to create a program in Python 3 that finds how many times specific words appear in txt files and then builds an Excel table with these values.
I wrote this function, but when I call it with my input the program doesn't work. This message appears: unindent does not match any outer indentation level
def wordcount(filename, listwords):
    try:
        file = open( filename, "r")
        read = file.readlines()
        file.close()
        for x in listwords:
            y = x.lower()
            counter = 0
            for z in read:
                line = z.split()
                for ss in line:
                    l = ss.lower()
                    if y == l:
                        counter += 1
            print(y , counter)
Now I try to call the function with a txt file and the word to find
wordcount("aaa.txt" , 'word' )
As output I'd like to see
word 4
Thanks, everybody!
Here is an example you can use to count the lines of a text file that contain a specific word:
def searching(filename,word):
    counter = 0
    with open(filename) as f:
        for line in f:
            if word in line:
                print(word)
                counter += 1
    return counter
x = searching("filename","wordtofind")
print(x)
The output will be the word you are looking for, printed once for every line that contains it, followed by the number of such lines.
As short as possible:
def wordcount(filename, listwords):
    with open(filename) as file_object:
        file_text = file_object.read()
    return {word: file_text.count(word) for word in listwords}
for word, count in wordcount('aaa.txt', ['a', 'list', 'of', 'words']).items():
    print("Count of {}: {}".format(word, count))
Getting back to mij's comment about passing listwords as an actual list: if you pass a string to code that expects a list, Python will iterate over the string one character at a time, which can be confusing if this behaviour is unfamiliar.
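To make the string-versus-list point concrete, here is a hedged sketch that counts whole words case-insensitively and accepts either a single word or a list. The name count_words and the sample sentence are illustrative assumptions. Note that str.count, used above, counts substrings, so 'a' would also be counted inside every word that contains it.

```python
from collections import Counter


def count_words(text, listwords):
    """Count case-insensitive whole-word occurrences of each word in listwords."""
    if isinstance(listwords, str):
        # A bare string would otherwise be iterated one character at a time
        listwords = [listwords]
    # Tokens are split on whitespace only, so punctuation stays attached
    counts = Counter(token.lower() for token in text.split())
    return {word: counts[word.lower()] for word in listwords}


sample = "Word after word the word count grows"
print(list('word'))  # iterating a string yields its characters
print(count_words(sample, 'word'))
```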

replacing values from file python

If a file contains A 2 B 3, I have to replace any A or B in the user input with the values 2 and 3 (example: A banana should turn into 2 banana). So far I have done this:
word=input("Enter string: ")
word=list(word)
with open('mapping.txt') as f:
    key = {}
    for line in f:
        first, second = line.split()
        key[first] = second
        for i in word:
            if first in i:
                word=word.replace(i,key[i])
but the string does not change and nothing is even printed. Could you kindly help me?
The reason it doesn't work is because each time you read the mapping.txt file, you create your dictionary, and at the same time you are checking the replacement words. So the first line from mapping will create one item in the dictionary, and then you check that one item against the string.
You also don't print anything.
You need to create your mapping once and then check the entire dictionary, like this:
mapping = {}
with open('mapping.txt') as f:
    for line in f:
        word, replacement = line.split()
        mapping[word.strip()] = replacement.strip()
user_input = input("Enter string: ")
new_line = ' '.join(mapping.get(word, word) for word in user_input.split())
print(new_line)
When you run that, here is what you'll get:
Enter string: this is A string with a B
this is 2 string with a 3
I think this should work:
#!/usr/local/bin/python3
word=input("Enter string: ")
with open('input.txt') as f:
    key = {}
    for line in f:
        first, second = line.split()
        key[first] = second
for replacement in key:
    word=word.replace(replacement,key[replacement])
print(word)
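One thing to keep in mind with str.replace is that it substitutes every occurrence, including letters inside longer words. If only standalone words should be replaced while the original spacing is kept, a regex with word boundaries is one option. This is a sketch under assumptions: replace_whole_words is a hypothetical name, and the mapping is written inline instead of being read from the file.

```python
import re


def replace_whole_words(text, mapping):
    """Replace each whole word of `text` that is a key of `mapping`."""
    if not mapping:
        return text
    # \b anchors keep 'A' from matching inside a longer word such as 'BANANA'
    pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, mapping)) + r')\b')
    return pattern.sub(lambda match: mapping[match.group(0)], text)


mapping = {'A': '2', 'B': '3'}
print(replace_whole_words('A BANANA and B', mapping))
# For comparison: plain replace also rewrites the letters inside BANANA
print('A BANANA and B'.replace('A', '2').replace('B', '3'))
```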
