Python variable in re.match - python

I am trying to write a function that takes in a key (among other things) and returns the word after this key in the file. The code below works, but only if the key happens to be the first phrase in the file. Could anyone point out where I'm going wrong?
def findmatch(key, split_by, tempsl, filename, temp):
    rx=r''+key+'(.*)'
    f = open(tempsl + filename, 'r', encoding='windows-1252')
    for eachline in f:
        string=re.match(rx, eachline)
        if string:
            return (string.group().split(' ')[split_by])
        else:
            return "didn't work"

Your for loop ends after the first iteration because
if string:
    return (string.group().split(' ')[split_by])
else:
    return "didn't work"
always returns, whichever branch is taken. The function therefore only produces a result when the key is on the first line. So I suggest this:
for eachline in f:
    string=re.match(rx, eachline)
    if string:
        return (string.group().split(' ')[split_by])
else: # the else clause is now part of the for loop (moved to the left)
    return "didn't work"
Also note that re.match only matches at the start of the line. To find the key anywhere in the line, try re.search:
m = re.search('(?<=' + re.escape(key) + r')\w+', eachline)
if m:  # m is None when the key is not in this line
    print(m.group(0))
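Putting both suggestions together, a minimal Python 3 sketch could look like the following. The function name find_word_after and the idea of passing in any iterable of lines (an open file works) are assumptions made for illustration, not part of the original code.

```python
import re


def find_word_after(key, lines):
    """Return the first word that follows `key` in any line, or None."""
    # re.escape keeps characters such as '.' or '(' in the key literal.
    pattern = re.compile(re.escape(key) + r'\s*(\w+)')
    for line in lines:
        # search() scans the whole line; match() only looks at its start.
        match = pattern.search(line)
        if match:
            return match.group(1)
    # Reached only after every line has been checked.
    return None
```

Returning None only after the loop has finished is what lets the search continue past the first line.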

Related

Opening all text file & getting a string in python [duplicate]

I want to check if a string is in a text file. If it is, do X. If it's not, do Y. However, this code always returns True for some reason. Can anyone see what is wrong?
def check():
    datafile = file('example.txt')
    found = False
    for line in datafile:
        if blabla in line:
            found = True
            break
check()
if True:
    print "true"
else:
    print "false"
The reason why you always got True has already been given, so I'll just offer another suggestion:
If your file is not too large, you can read it into a string, and just use that (easier and often faster than reading and checking line per line):
with open('example.txt') as f:
    if 'blabla' in f.read():
        print("true")
Another trick: you can alleviate the possible memory problems by using mmap.mmap() to create a "string-like" object that uses the underlying file (instead of reading the whole file into memory):
import mmap
with open('example.txt') as f:
    s = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    if s.find('blabla') != -1:
        print('true')
NOTE: in Python 3, mmaps behave like bytearray objects rather than strings, so the subsequence you look for with find() has to be a bytes object rather than a string as well, e.g. s.find(b'blabla'):
#!/usr/bin/env python3
import mmap
with open('example.txt', 'rb', 0) as file, \
     mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as s:
    if s.find(b'blabla') != -1:
        print('true')
You could also use regular expressions on mmap e.g., case-insensitive search: if re.search(br'(?i)blabla', s):
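As a rough sketch of that regex idea in Python 3 (the helper name file_contains is made up here, and the empty-file guard is an added assumption, since mmap refuses to map a zero-length file):

```python
import mmap
import re


def file_contains(path, pattern):
    """Return True if the bytes regex `pattern` matches anywhere in the file."""
    with open(path, 'rb') as f:
        # seek() returns the new position, which is the file size here.
        if f.seek(0, 2) == 0:
            return False  # an empty file cannot be mapped and cannot match
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as s:
            # The pattern must be bytes because the mmap exposes bytes.
            return re.search(pattern, s) is not None
```

It would be called as file_contains('example.txt', br'(?i)blabla') for a case-insensitive search.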
As Jeffrey said, you are not checking the value of check(). In addition, your check() function is not returning anything. Note the difference:
def check():
    with open('example.txt') as f:
        datafile = f.readlines()
    found = False # This isn't really necessary
    for line in datafile:
        if 'blabla' in line:
            # found = True # Not necessary
            return True
    return False # Because you finished the search without finding
Then you can test the output of check():
if check():
    print('True')
else:
    print('False')
Here's another way to answer your question: the find method returns the index at which a substring first occurs, or -1 if it is not present.
open('file', 'r').read().find('')
Put the word you want to find inside find(), and replace 'file' with your file name.
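One thing to watch with find, shown as a small Python 3 sketch on a made-up string: the result has to be compared with -1, because an index of 0 is a valid hit but counts as false, while -1 counts as true.

```python
text = "spam eggs blabla"

position = text.find("blabla")  # index of the first occurrence
missing = text.find("ham")      # -1 because "ham" is not in the text
at_start = text.find("spam")    # 0, a valid hit that is falsy in an if test

# Compare against -1 instead of relying on truthiness.
found = position != -1
```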
if True:
    print "true"
This always happens because True is always True.
You want something like this:
if check():
    print "true"
else:
    print "false"
Good luck!
I made a little function for this purpose. It searches for a word in the input file and then adds it to the output file.
def searcher(outf, inf, string):
    with open(outf, 'a') as f1:
        if string in open(inf).read():
            f1.write(string)
outf is the output file
inf is the input file
string is of course, the desired string that you wish to find and add to outf.
Your check function should return the found boolean and use that to determine what to print.
def check():
    datafile = file('example.txt')
    found = False
    for line in datafile:
        if blabla in line:
            found = True
            break
    return found
found = check()
if found:
    print "true"
else:
    print "false"
The second block could also be condensed to:
if check():
    print "true"
else:
    print "false"
Two problems:
1. Your function does not return anything; a function that does not explicitly return anything returns None (which is falsy)
2. True is always True - you are not checking the result of your function
def check(fname, txt):
    with open(fname) as dataf:
        return any(txt in line for line in dataf)
if check('example.txt', 'blabla'):
    print "true"
else:
    print "false"
How to search for text in files and return the path of each file in which it is found:
import os
import re
class Searcher:
    def __init__(self, path, query):
        self.path = path
        if self.path[-1] != '/':
            self.path += '/'
        self.path = self.path.replace('/', '\\')
        self.query = query
        self.searched = {}
    def find(self):
        for root, dirs, files in os.walk( self.path ):
            for file in files:
                if re.match(r'.*?\.txt$', file) is not None:
                    if root[-1] != '\\':
                        root += '\\'
                    f = open(root + file, 'rt')
                    txt = f.read()
                    f.close()
                    count = len( re.findall( self.query, txt ) )
                    if count > 0:
                        self.searched[root + file] = count
    def getResults(self):
        return self.searched
In the main script:
# -*- coding: UTF-8 -*-
import sys
from search import Searcher
path = 'c:\\temp\\'
search = 'search string'
if __name__ == '__main__':
    if len(sys.argv) == 3:
        # create the searcher object and pass it the arguments
        Search = Searcher(sys.argv[1], sys.argv[2])
    else:
        Search = Searcher(path, search)
    # start the search
    Search.find()
    # get the results
    results = Search.getResults()
    # print the results
    print 'Found ', len(results), ' files:'
    for file, count in results.items():
        print 'File: ', file, ' Found entries:' , count
If the user wants to search for a word in a given text file:
fopen = open('logfile.txt',mode='r+')
fread = fopen.readlines()
x = input("Enter the search string: ")
for line in fread:
    if x in line:
        print(line)
def check():
    found = False  # initialise inside the function, otherwise "return found" fails when nothing matches
    datafile = file('example.txt')
    for line in datafile:
        if "blabla" in line:
            found = True
            break
    return found
if check():
    print "true"
else:
    print "false"
def check():
    found = False  # initialise inside the function, otherwise "return found" fails when nothing matches
    datafile = file('example.txt')
    for line in datafile:
        if "blabla" in line:
            found = True
            break
    return found
if check():
    print "found"
else:
    print "not found"
Here's another. It takes an absolute file path and a search string and passes them to word_find(), which calls readlines() on the file inside enumerate() to get a running line number as it goes through the file line by line. It prints every line that contains the string, together with its line number. Cheers.
def word_find(file, word):
    found = False
    with open(file, 'r') as target_file:
        for num, line in enumerate(target_file.readlines(), 1):
            if str(word) in line:
                print(f'<Line {num}> {line}')
                found = True
    if not found:  # report once, after the whole file has been read
        print(f'> {word} not found.')
if __name__ == '__main__':
    file_to_process = '/path/to/file'
    string_to_find = input()
    word_find(file_to_process, string_to_find)
"found" needs to be declared as a global variable inside the function, because the "if else" statement is outside the function. You also don't need "break" to end the loop.
The following should work to find out whether the text file contains the desired string.
with open('text_text.txt') as f:
    datafile = f.readlines()
def check():
    global found
    found = False
    for line in datafile:
        if 'the' in line:
            found = True
check()
if found == True:
    print("True")
else:
    print("False")

read a file line by line and print only after its done

I have been working on a "simple" task for about two hours and still can't find the solution, so here is my question:
I want to search a file line by line. If no result is found, print something at the end; otherwise call a function.
def DeletItemCheckUp():
    import re
    find = True
    itemNumber = input("\n what is the item you want to delet : ")
    fileItem = open('Data_Item', 'r', encoding='Utf-8')
    for line in fileItem:
        sr = re.search(r'^\b%s\b'%itemNumber,(line.split(';')[0]))
        if (sr == None):
            pass
            print("This item don't exist.")
    fileItem.close()
    if (find == True):
        return itemNumber
        DeletItem()
Here are the problems I ran into with my different attempts:
1. It printed "This item don't exist." for every line that did not contain my itemNumber.
2. When there was actually no match found, it would not call DeletItem().
Objective of the code:
Ask for an item to delete and check in a file whether the unique item number exists. If so, call DeletItem() to delete it; otherwise tell the user that this unique item number does not exist.
A few things were overlooked for what you want to achieve. We are going to use a flag (true/false) to record whether we found something, and based on that we will decide whether to call the function or print the message.
def DeletItemCheckUp():
    import re
    find = False # initialize to False
    itemNumber = input("\n what is the item you want to delet : ")
    fileItem = open('Data_Item', 'r', encoding='Utf-8')
    for line in fileItem:
        sr = re.search(r'^\b%s\b'%itemNumber,(line.split(';')[0]))
        if (sr == None):
            continue # do nothing and continue
        else:
            # we found the number, set the flag and break
            find = True
            break # no need to continue searching
    fileItem.close()
    if (find):
        DeletItem() # call the function
    else:
        print("This item don't exist.")
1) Replace the pass with your print("This item doesn't exist"). "pass" means "do nothing."
2) Your DeletItem() comes after the return. Nothing executes after the return, because the function has already returned to the place it was called from. You want
else:
    DeletItem()
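If the flag feels clumsy, the "decide only after the whole file has been read" part can also be written with any(). This is only a sketch in Python 3: the function name item_exists is hypothetical, and it swaps the regex from the question for a plain equality test, assuming the first ';'-separated field holds exactly the item number.

```python
def item_exists(item_number, lines):
    """Return True if the first ';'-separated field of any line equals item_number."""
    # any() stops at the first match and returns False only after
    # every line has been checked, so nothing is reported per line.
    return any(line.split(';')[0].strip() == item_number for line in lines)
```

The caller can then print the "doesn't exist" message once when it returns False, or call the delete function when it returns True.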

Python - print statistics from one file into another

import sys
import pickle
import string
def Menu():
    print ("***********MENU************")
    print ("0. Quit")
    print ("1. Read text file")
    print ("2. Display counts")
    print ("3. Display statistics of word lengths")
    print ("4. Print statistics to file")
def readFile():
    while True:
        fileName = input("Please enter a file name: ")
        if (fileName.lower().endswith(".txt")):
            break
        else:
            print("That was an incorrect file name. Please try again.")
            continue
    return fileName
THE_FILE = ""
myDictionary = 0
def showCounts(fileName):
    numCount = 0
    dotCount = 0
    commaCount = 0
    lineCount = 0
    wordCount = 0
    with open(fileName, 'r') as f:
        for line in f:
            wordCount+=len(line.split())
            lineCount+=1
            for char in line:
                if char.isdigit() == True:
                    numCount+=1
                elif char == '.':
                    dotCount+=1
                elif char == ',':
                    commaCount+=1
    print("Number count: " + str(numCount))
    print("Comma count: " + str(commaCount))
    print("Dot count: " + str(dotCount))
    print("Line count: " + str(lineCount))
    print("Word count: " + str(wordCount))
def showStats(fileName):
    temp1 = []
    temp2 = []
    lengths = []
    myWords = []
    keys = []
    values = []
    count = 0
    with open(fileName, 'r') as f:
        for line in f:
            words = line.split()
            for word in words:
                temp2.append(word)
                temp1.append(len(word))
    for x in temp1:
        if x not in lengths:
            lengths.append(x)
    lengths.sort()
    dictionaryStats = {}
    for x in lengths:
        dictionaryStats[x] = []
    for x in lengths:
        for word in temp2:
            if len(word) == x:
                dictionaryStats[x].append(word)
    for key in dictionaryStats:
        print("Key = " + str(key) + " Total number of words with " + str(key) + " characters = " + str(len(dictionaryStats[key])))
    return dictionaryStats
def printStats(aDictionary):
    aFile = open("statsWords.dat", 'w')
    for key in aDictionary:
        aFile.write(str(key) + " : " + str(aDictionary[key]) + "\n")
    aFile.close()
choice = -1
while choice !=0:
    Menu()
    choice = (int(input("Please choose 1-4 to perform function. Press 0 to exit the program. Thank you. \n")))
    if choice == 0:
        print ("Exit program. Thank you.")
        sys.exit
    elif choice == 1:
        THE_FILE = readFile()
    elif choice == 2:
        showCounts(THE_FILE)
    elif choice == 3:
        showStats(THE_FILE)
    elif choice == 4:
        printStats(myDictionary)
    else:
        print ("Error.")
I'm trying to open a file, have it display the statistics of the word lengths, and then have it make a new file with the statistics of the word lengths. I can read the file and have it display the statistics, but when I print the statistics to file I get an error - "int" object is not iterable. Any ideas? Thanks guys!
Error:
Traceback (most recent call last):
  File "hw4_ThomasConnor.py", line 111, in <module>
    printStats(myDictionary)
  File "hw4_ThomasConnor.py", line 92, in printStats
    for key in aDictionary:
TypeError: 'int' object is not iterable
The problem is that you set myDictionary to 0 at the top of your program, and then are sending it to your file writing function here printStats(myDictionary).
In this function you have this line for key in aDictionary, and since you passed in 0, this is effectively for key in 0 which is where the error comes from.
You need to send the result of the showStats function to your printStats function.
As this is looking like homework, I will leave it at that for now.
Sorry I am confused. in the showStats function I have to somehow say
"send results to printStats function" and then in the printStats
function I have to call the results? How would I do that exactly?
The printStats function is expecting a dictionary to print. This dictionary is generated by the showStats function (in fact, it returns this dictionary).
So you need to send the result of the showStats function to the printStats function.
To save the return value of a function, assign it to a name on the left-hand side (LHS) of the call expression, like this:
>>> def foo(bar):
...     return bar*2
...
>>> def print_results(result):
...     print('The result was: {}'.format(result))
...
>>> result = foo(2) # Save the returned value
Since result is just like any other name in Python, you can pass it to any other function:
>>> print_results(result)
The result was: 4
If we don't want to store the result of the function, and just want to send it to another function, then we can use this syntax:
>>> print_results(foo(2))
The result was: 4
You need to do something similar in your main loop where you execute the functions.
Since the dictionary you want to print is returned by the showStats function, you must call the showStats function first before calling the printStats function. This poses a problem if your user selects 4 before selecting 3 - make sure you find out a work around for this. A simple work around would be to prompt the user to calculate the stats by selecting 3 before selecting 4. Try to think of another way to get around this problem.
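One possible shape for such a workaround, as a rough Python 3 sketch only: start with None instead of 0 and check for it before writing. The names format_stats and stats_or_message are made up for illustration and are not part of the program above; they build the output lines instead of writing a file so the idea is easy to try out.

```python
def format_stats(stats):
    """Return one 'length : words' line per key, in ascending key order."""
    return [f"{length} : {words}" for length, words in sorted(stats.items())]


def stats_or_message(stats):
    # stats stays None until the statistics have been computed (menu option 3).
    if stats is None:
        return ["No statistics yet; choose option 3 first."]
    return format_stats(stats)
```

In the menu loop, option 3 would store the dictionary returned by showStats, and option 4 would pass that stored value on.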
Here:
THE_FILE = ""
myDictionary = 0
you assign an integer to myDictionary, and later you call:
printStats(myDictionary)
Inside printStats you try to iterate over the keys of a dictionary, but myDictionary is still the integer 0, so it fails.

Python Search function in a tab delimited column file

while True:
    try:
        OpenFile=raw_input(str("Please enter a file name: "))
        infile=open(OpenFile,"r")
        contents=infile.readlines()
        infile.close()
        user_input = raw_input(str("Enter A=<animal> for animal search or B=<where lives?> for place of living search: \n"))
        if user_input.startswith("A="):
            def find_animal(user_input,column):
                return next(("\t".join(line) for line in contents
                             if line[column-1]==user_input),None)
            find_animal(user_input[1:])
            print str((find_animal(user_input[1:], "WHO?"))) #"Who?" is the name of the first column.
        else:
            print "Unknown option!"
    except IOError:
        print "File with this name does not exist!"
1. Enter the name of an animal.
2. Program searches for the lines that have this particular name in the first column.
3. Program prints lines that have this name in the first column.
My function can't seem to work properly here. Can you please help me find the mistake(s)? Thank you!
EDIT
def ask_for_filename():
    filename=str(raw_input("Please enter file name: "))
    return filename
def read_data(filename):
    contents=open(filename,"r")
    data=contents.read()
    return data
def column_matches(line, substring, which_column):
    for line in data:
        if column_matches(line, substring, 0):
            print line
Big chunks of code are hard to read and debug. Try splitting your code into smaller functions, for example like this:
def ask_for_filename():
    #left as an exercise
    return filename
def read_data(filename):
    #left as an exercise
    return data
def column_matches(line, substring, which_column):
    #left as an exercise
def show_by_name(name, data):
    for line in data:
        if column_matches(line, name, 0):
            print line
def do_search(data):
    prompt = "Enter A=<animal> for animal search or B=<where lives?> for place of living search: \n"
    user_input = raw_input(prompt)
    if user_input.startswith('A='):
        show_by_name(user_input[2:], data)
# main program
filename = ask_for_filename()
data = read_data(filename)
while True:
    do_search(data)
Test and debug these functions separately until you're sure they work properly. Then write and test the main program.
column_matches() is supposed to return True if a given column (which_column) of a line is equal to substring. For example, column_matches("foo\tbar\tbaz", "bar", 1) is True. To achieve that:
1. split the line by the delimiter, which gives us a list of values
2. get the n-th element of the list
3. compare it with the substring
4. return True if they are equal and False otherwise
Putting it all together:
def column_matches(line, substring, which_column):
    delimiter = '\t'
    columns = line.split(delimiter)
    value = columns[which_column]
    if value == substring:
        return True
    else:
        return False
or, in a more concise and "pythonic" form:
def column_matches(line, substring, which_column):
    return line.split('\t')[which_column] == substring
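When the lines come straight from a file, two details tend to bite: the last column still carries the trailing newline, and short lines raise IndexError. A hedged Python 3 sketch that handles both is below; matching_lines is a hypothetical helper, and the newline stripping and length check are additions that go beyond the version above.

```python
def column_matches(line, substring, which_column):
    """Return True if the given tab-separated column equals substring."""
    # Strip the newline so the last column compares cleanly.
    columns = line.rstrip('\n').split('\t')
    # Lines with too few columns simply do not match.
    return which_column < len(columns) and columns[which_column] == substring


def matching_lines(lines, substring, which_column):
    """Return every line whose chosen column equals substring."""
    return [line for line in lines
            if column_matches(line, substring, which_column)]
```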

Read file error in Python, even though print function is printing the list

I have been trying different ways of writing this code but cannot get past this. Currently the program will run all the way to the write_names(list) function and create the file, and the print function will print the sorted list. The program refuses to get the user input for the search_names() function but it will print anything I ask it to.
The debugger highlights while index < len(list), and the debug I/O only states "read file error". Hopefully someone has an idea what I'm doing wrong.
'# Abstract: This program creates a list of names. The list is printed,
'# sorted, printed again, written to file, and searched.
'#=============================================================================
'#define the main function
def main():
    #try:
    ##open data file for read
    #infile = open('names.txt', 'r')
    #call get_names function
    list = get_names()
    #call print function
    print_names(list)
    #sort list
    list.sort()
    #print sorted list
    print_names(list)
    #write sorted list to new file
    write_names(list)
    #allow user to search list
    search_names(list)
def get_names():
    try:
        infile = open('names.txt', 'r')
        #read file contents into a list
        list = infile.readlines()
        #close file
        infile.close()
        #strip \n from each element
        index = 0
        while index < len(list):
            list[index] = list[index].rstrip('\n')
            index += 1
        return list
    except IOError:
        print 'Read file error'
def print_names(list):
    #print header
    print '******************'
    #print list line by line
    index = 0
    while index < len(list):
        print list[index]
        index += 1
    return
def write_names(list):
    #open file for writing
    outfile = open('sortedNames.txt', 'w')
    #write the list to the file
    for item in list:
        outfile.write(str(item) + '\n')
    #close file
    outfile.close()
def search_names(list):
    #set user test variable
    again = 'Y'
    while again.upper == 'Y':
        #get search from user
        search = raw_input('Enter a name to search for: ')
        #open list for search
        if search in list:
            try:
                item_index = list.index(search)
                print search, 'found.', item_index
            except ValueError:
                print search, 'not found.'
main()
'
Thanks in advance!
Your issue is that upper is a function, and you are not calling it. Your while in search_names() should read:
while again.upper() == 'Y':
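A tiny Python 3 sketch of why the loop body never runs without the parentheses (the variable names are only for illustration):

```python
again = 'y'

# Without parentheses this compares a bound method object with a string,
# which is never equal, so the while condition is False from the start.
without_call = (again.upper == 'Y')

# With parentheses the method runs and returns 'Y'.
with_call = (again.upper() == 'Y')
```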
Instead of:
#strip \n from each element
index = 0
while index < len(list):
    list[index] = list[index].rstrip('\n')
    index += 1
return list
just use this list comprehension:
lines = infile.readlines()
infile.close()
return [ line.strip() for line in lines ]
edit:
It looks like you are using an index and a while loop where a for loop can be used.
Instead of:
while index < len(list):
    print list[index]
    index += 1
use:
# using name_list instead of list
for name in name_list:
    print name
Also, your search_names() function looks flawed:
def search_names(list):
    #set user test variable
    again = 'Y'
    while again.upper == 'Y':
        #get search from user
        search = raw_input('Enter a name to search for: ')
        #open list for search
        if search in list:
            try:
                item_index = list.index(search)
                print search, 'found.', item_index
            except ValueError:
                print search, 'not found.'
Once upper() is actually called, this loop would never exit (again is never reassigned). Try:
def search_names(names_list):
    again = 'Y'
    while again.upper() == 'Y':
        s_name = raw_input('Enter a name to search for: ')
        if s_name in names_list:
            print s_name, 'found.', names_list.index(s_name)
        else:
            print s_name, 'not found.'
        again = raw_input('Search for another name (Y|N)?: ')
or:
def search_names(names_list):
    again = 'Y'
    while again == 'Y':
        s_name = raw_input('Enter a name to search for: ')
        try:
            idx = names_list.index(s_name)
            print s_name, 'found.', idx
        except ValueError:
            print s_name, 'not found.'
        again = raw_input('Search for another name (Y|N)?: ').upper()
Which brings up the issue of when to catch exceptions vs using an if statement.
From MSDN:
The method you choose depends on how often you expect the event to occur. If the event is truly exceptional and is an error (such as an unexpected end-of-file), using exception handling is better because less code is executed in the normal case. If the event happens routinely, using the programmatic method to check for errors is better. In this case, if an exception occurs, the exception will take longer to handle.
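To make the trade-off concrete for the name lookup, here is a small Python 3 sketch of both styles. The function names are mine, not from the question, and both return None when the name is missing.

```python
def find_with_check(names, name):
    """Check first, then look up: scans the list twice on a hit."""
    if name in names:
        return names.index(name)
    return None


def find_with_exception(names, name):
    """Look up directly and handle the failure: one scan on a hit."""
    try:
        return names.index(name)
    except ValueError:
        return None
```

Doing both at once, as in the original search_names(), only repeats the same membership test.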
Comments begin with #, not '# - you are making every other line of your header a docstring.
You are using an index to iterate across lists, which is inefficient - just iterate on the list items.
Calling a variable list is bad because it prevents you from accessing the list() datatype.
Using with is a more reliable replacement for open() .. close()
again.upper is a function reference - you have to call the function, ie again.upper().
You never change the value of again - this will be an infinite loop!
You test if search in list but then do a try..except block which will only fail if it is not in the list (ie you are testing for the same failure twice).
#
# Operate on a list of names
#
def load_names(fname):
    try:
        with open(fname, 'r') as inf:
            return [line.strip() for line in inf]
    except IOError:
        print "Error reading file '{0}'".format(fname)
        return []
def print_names(namelist):
    print '******************'
    print '\n'.join(namelist)
def write_names(namelist, fname):
    with open(fname, 'w') as outf:
        outf.write('\n'.join(namelist))
def search_names(namelist):
    while True:
        lookfor = raw_input('Enter a name to search for (or nothing to quit): ').strip()
        if lookfor:
            try:
                ind = namelist.index(lookfor)
                print("{0} found.".format(lookfor))
            except ValueError:
                print("{0} not found.".format(lookfor))
        else:
            break
def main():
    namelist = load_names('names.txt')
    print_names(namelist)
    namelist.sort()
    print_names(namelist)
    write_names(namelist, 'sorted_names.txt')
    search_names(namelist)
if __name__=="__main__":
    main()
