Python Search function in a tab delimited column file - python

while True:
try:
OpenFile=raw_input(str("Please enter a file name: "))
infile=open(OpenFile,"r")
contents=infile.readlines()
infile.close()
user_input = raw_input(str("Enter A=<animal> for animal search or B=<where lives?> for place of living search: \n"))
if user_input.startswith("A="):
def find_animal(user_input,column):
return next(("\t".join(line) for line in contents
if line[column-1]==user_input),None)
find_animal(user_input[1:])
print str((find_animal(user_input[1:], "WHO?"))) #"Who?" is the name of the first column.
else:
print "Unknown option!"
except IOError:
print "File with this name does not exist!"
1.Enter the name of an animal.
2.Program searches for the lines that have this particular name in the first column.
3.Program prints lines that have this name in the first column.
My function can't seem to work properly here. Can you please help me find the mistake(s)? Thank you!
EDIT
def ask_for_filename():
filename=str(raw_input("Please enter file name: "))
return filename
def read_data(filename):
contents=open(filename,"r")
data=contents.read()
return data
def column_matches(line, substring, which_column):
for line in data:
if column_matches(line, substring, 0):
print line

Big chunks of code are hard to read and debug, try splitting your code into smaller functions, for example like this:
def ask_for_filename():
#left as an exercise
return filename
def read_data(filename):
#left as an exercise
return data
def column_matches(line, substring, which_column):
#left as an exercise
def show_by_name(name, data):
for line in data:
if column_matches(line, name, 0):
print line
def do_search(data):
propmt = "Enter A=<animal> for animal search or B=<where lives?> for place of living search: \n"
user_input = raw_input(prompt)
if user_input.startswith('A='):
show_by_name(user_input[2:], data)
# main program
filename = ask_for_filename()
data = read_data(filename)
while True:
do_search(data)
Test and debug these functions separately until you're sure they work properly. Then write and test the main program.
column_matches() is supposed to return true if some column (which_column) in a line is equal to substring. For example, column_matches("foo\tbar\tbaz", "bar", 1) is True. To achieve that
split a line by a delimiter - this gives us a list of values
get the n-th element of the list
compare it with the substing
return True if they are equal and False otherwise
Putting it all together:
def column_matches(line, substring, which_column):
delimiter = '\t'
columns = line.split(delimiter)
value = columns[which_column]
if value == substring:
return True
else:
return False
or, in a more concise and "pythonic" form:
def column_matches(line, substring, which_column):
return line.split('\t')[which_column] == substring

Related

Counting a desired word in a text file

I have to count the number of times a given word appears in a given text file, this one being the Gettysburg Address. For some reason, it is not counting my input of 'nation' so the output looks as such:
'nation' is found 0 times in the file gettysburg.txt
Here is the code I have currently, could someone point out what I am doing incorrectly?
fname = input("Enter a file name to process:")
find = input("Enter a word to search for:")
text = open(fname, 'r').read()
def processone():
if text is not None:
words = text.lower().split()
return words
else:
return None
def count_word(tokens, token):
count = 0
for element in tokens:
word = element.replace(",", " ")
word = word.replace("."," ")
if word == token:
count += 1
return count
words = processone()
word = find
frequency = count_word(words, word)
print("'"+find+"'", "is found", str(frequency), "times in the file", fname)
My first function splits the file into a string and turns all letters in it lower case. The second one removes the punctuation and is supposed to count the word given in the input.
Taking my first coding class, if you see more flaws in my coding or improvements that could be made, as well as helping find the solution to my problem, feel free.
In the for loop in the count_word() function, you have a return statement at the end of the loop, which exits the function immediately, after only one loop iteration.
You probably want to move the return statement to be outside of the for loop.
as a starter I would suggest you to use print statements and see what variables are printing, that helps to breakdown the problem. For example, print word was showing only first word from the file, which would have explained the problem in your code.
def count_word(tokens, token):
count = 0
for element in tokens:
word = element.replace(",", " ")
word = word.replace("."," ")
print (word)
if word == token:
count += 1
return count
Enter a file name to process:gettysburg.txt
Enter a word to search for:nation
fourscore
'nation' is found 0 times in the file gettysburg.txt
Use code below:
fname = input("Enter a file name to process:")
find = input("Enter a word to search for:")
text = open(fname, 'r').read()
def processone():
if text is not None:
words = text.lower().split()
return words
else:
return None
def count_word(tokens, token):
count = 0
for element in tokens:
word = element.replace(",", " ")
word = word.replace("."," ")
if word == token:
count += 1
return count
words = processone()
word = find
frequency = count_word(words, word)
print("'"+find+"'", "is found", str(frequency), "times in the file", fname)
statement "return" go out statement "for"

Openning all text file & getting a string in python [duplicate]

I want to check if a string is in a text file. If it is, do X. If it's not, do Y. However, this code always returns True for some reason. Can anyone see what is wrong?
def check():
datafile = file('example.txt')
found = False
for line in datafile:
if blabla in line:
found = True
break
check()
if True:
print "true"
else:
print "false"
The reason why you always got True has already been given, so I'll just offer another suggestion:
If your file is not too large, you can read it into a string, and just use that (easier and often faster than reading and checking line per line):
with open('example.txt') as f:
if 'blabla' in f.read():
print("true")
Another trick: you can alleviate the possible memory problems by using mmap.mmap() to create a "string-like" object that uses the underlying file (instead of reading the whole file in memory):
import mmap
with open('example.txt') as f:
s = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
if s.find('blabla') != -1:
print('true')
NOTE: in python 3, mmaps behave like bytearray objects rather than strings, so the subsequence you look for with find() has to be a bytes object rather than a string as well, eg. s.find(b'blabla'):
#!/usr/bin/env python3
import mmap
with open('example.txt', 'rb', 0) as file, \
mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as s:
if s.find(b'blabla') != -1:
print('true')
You could also use regular expressions on mmap e.g., case-insensitive search: if re.search(br'(?i)blabla', s):
As Jeffrey Said, you are not checking the value of check(). In addition, your check() function is not returning anything. Note the difference:
def check():
with open('example.txt') as f:
datafile = f.readlines()
found = False # This isn't really necessary
for line in datafile:
if blabla in line:
# found = True # Not necessary
return True
return False # Because you finished the search without finding
Then you can test the output of check():
if check():
print('True')
else:
print('False')
Here's another way to possibly answer your question using the find function which gives you a literal numerical value of where something truly is
open('file', 'r').read().find('')
in find write the word you want to find
and 'file' stands for your file name
if True:
print "true"
This always happens because True is always True.
You want something like this:
if check():
print "true"
else:
print "false"
Good luck!
I made a little function for this purpose. It searches for a word in the input file and then adds it to the output file.
def searcher(outf, inf, string):
with open(outf, 'a') as f1:
if string in open(inf).read():
f1.write(string)
outf is the output file
inf is the input file
string is of course, the desired string that you wish to find and add to outf.
Your check function should return the found boolean and use that to determine what to print.
def check():
datafile = file('example.txt')
found = False
for line in datafile:
if blabla in line:
found = True
break
return found
found = check()
if found:
print "true"
else:
print "false"
the second block could also be condensed to:
if check():
print "true"
else:
print "false"
Two problems:
Your function does not return anything; a function that does not explicitly return anything returns None (which is falsy)
True is always True - you are not checking the result of your function
.
def check(fname, txt):
with open(fname) as dataf:
return any(txt in line for line in dataf)
if check('example.txt', 'blabla'):
print "true"
else:
print "false"
How to search the text in the file and Returns an file path in which the word is found
(Как искать часть текста в файле и возвращять путь к файлу в котором это слово найдено)
import os
import re
class Searcher:
def __init__(self, path, query):
self.path = path
if self.path[-1] != '/':
self.path += '/'
self.path = self.path.replace('/', '\\')
self.query = query
self.searched = {}
def find(self):
for root, dirs, files in os.walk( self.path ):
for file in files:
if re.match(r'.*?\.txt$', file) is not None:
if root[-1] != '\\':
root += '\\'
f = open(root + file, 'rt')
txt = f.read()
f.close()
count = len( re.findall( self.query, txt ) )
if count > 0:
self.searched[root + file] = count
def getResults(self):
return self.searched
In Main()
# -*- coding: UTF-8 -*-
import sys
from search import Searcher
path = 'c:\\temp\\'
search = 'search string'
if __name__ == '__main__':
if len(sys.argv) == 3:
# создаем объект поисковика и передаем ему аргументы
Search = Searcher(sys.argv[1], sys.argv[2])
else:
Search = Searcher(path, search)
# начать поиск
Search.find()
# получаем результат
results = Search.getResults()
# выводим результат
print 'Found ', len(results), ' files:'
for file, count in results.items():
print 'File: ', file, ' Found entries:' , count
If user wants to search for the word in given text file.
fopen = open('logfile.txt',mode='r+')
fread = fopen.readlines()
x = input("Enter the search string: ")
for line in fread:
if x in line:
print(line)
found = False
def check():
datafile = file('example.txt')
for line in datafile:
if blabla in line:
found = True
break
return found
if check():
print "true"
else:
print "false"
found = False
def check():
datafile = file('example.txt')
for line in datafile:
if "blabla" in line:
found = True
break
return found
if check():
print "found"
else:
print "not found"
Here's another. Takes an absolute file path and a given string and passes it to word_find(), uses readlines() method on the given file within the enumerate() method which gives an iterable count as it traverses line by line, in the end giving you the line with the matching string, plus the given line number. Cheers.
def word_find(file, word):
with open(file, 'r') as target_file:
for num, line in enumerate(target_file.readlines(), 1):
if str(word) in line:
print(f'<Line {num}> {line}')
else:
print(f'> {word} not found.')
if __name__ == '__main__':
file_to_process = '/path/to/file'
string_to_find = input()
word_find(file_to_process, string_to_find)
"found" needs to be created as global variable in the function as "if else" statement is out of the function. You also don't need to use "break" to break the loop code.
The following should work to find out if the text file has desired string.
with open('text_text.txt') as f:
datafile = f.readlines()
def check():
global found
found = False
for line in datafile:
if 'the' in line:
found = True
check()
if found == True:
print("True")
else:
print("False")

read a file line by line and print only after its done

So I am working on doing a "simple" task since like 2h and still can't find the solution, so where is my question :
I want to search in a file, line by line, and if no result is found, at the end print something, else call a function.
def DeletItemCheckUp():
import re
find = True
itemNumber = input("\n what is the item you want to delet : ")
fileItem = open('Data_Item', 'r', encoding='Utf-8')
for line in fileItem:
sr = re.search(r'^\b%s\b'%itemNumber,(line.split(';')[0]))
if (sr == None):
pass
print("This item don't exist.")
fileItem.close()
if (find == True):
return itemNumber
DeletItem()
so here is the problem I have got with different try :
1. Print "This item don't exist." for every line that didn't had my itemNumber.
2. When there was actually no match found, its would not call DeletItem().
objectif of the code :
Ask for a item to delet, check in a file if the unique item number exist, if so, call DeletItem() to delet it, else, tell the user that this unique item number don't exist.
Few overlooks in there to achieve what you ask. We are going to use a flag (true/false) to know when we found something, and based on that we will decide whether to call the function or print/return the number.
def DeletItemCheckUp():
import re
find = False # initialize to False
itemNumber = input("\n what is the item you want to delet : ")
fileItem = open('Data_Item', 'r', encoding='Utf-8')
for line in fileItem:
sr = re.search(r'^\b%s\b'%itemNumber,(line.split(';')[0]))
if (sr == None):
continue # do nothing and continue
else:
# we found the number, set the flag and break
find = True
break # no need to continue searching
fileItem.close()
if (find):
DeletItem() # call the function
else:
print("This item don't exist.")
1) replace the pass with your print('This item doesn't exist'). "Pass" means "do nothing."
2) Your DeleteItem() is after the return. Nothing executes after the return because you have returned to the place the function was called from. You want
else:
DeleteItem()

finding item in text file during 2nd read

In the beginning of my program, i opened a file with f = open("foods.txt", "r+"). Later I called this method i created
def findFood(food):
foodRegex = re.compile(r'(?P<food>\S+)\s+\-.*')
for line in f.readlines():
print line
duplicateFound = re.search(foodRegex, line)
if duplicateFound.group('food') == food:
return duplicateFound
else:
return False
However I run the method again. But my program doesn't work the way I want it to. Specifically
def build_meal_plan():
number_of_items = int(raw_input("How many items would you like to add to your meal plan? "))
count = 0
while number_of_items > 0:
print count
food = raw_input("Enter in food name: ")
print food
if findFood(food):
servings = int(raw_input("Number of servings: "))
else:
print "Food not found! Try again? (y/n): ",
choice = raw_input()
if choice == 'y' or choice == "yes":
number_of_items += 1
else:
return
However during the 2nd run of my findFood method I cannot locate an item i know exists within the .txt file. I am not sure why I cannot find the same item I found in the text file during the first run. My assumption is that you can only go through a txt file once.
Once you call f.readlines(), you are at the end of the file. To return to the start, so you can go through it again, call f.seek(0):
def findFood(food):
foodRegex = re.compile(r'(?P<food>\S+)\s+\-.*')
for line in f.readlines():
...
f.seek(0)
Alternatively, you can import the contents of the file to a list:
def import_file(filename):
with open(filename) as f:
content = [line.strip() for line in f]
return content
And use that instead of referring back to the file.
def findFood(food, data):
foodRegex = re.compile(r'(?P<food>\S+)\s+\-.*')
for line in data:
...
Then you don't need to worry about returning to the start.

Read file error in Python, even though print function is printing the list

I have been trying different ways of writing this code but cannot get past this. Currently the program will run all the way to the write_names(list) function and create the file, and the print function will print the sorted list. The program refuses to get the user input for the search_names() function but it will print anything I ask it to.
Debug highlights: while index < len(list) and in the debug I\O only states "read file error". Hopefully someone has an idea what I'm doing wrong.
'# Abstract: This program creates a list of names. The list is printed,
'# sorted, printed again, written to file, and searched.
'#=============================================================================
'#define the main function
def main():
#try:
##open data file for read
#infile = open('names.txt', 'r')
#call get_names function
list = get_names()
#call print function
print_names(list)
#sort list
list.sort()
#print sorted list
print_names(list)
#write sorted list to new file
write_names(list)
#allow user to search list
search_names(list)
def get_names():
try:
infile = open('names.txt', 'r')
#read file contents into a list
list = infile.readlines()
#close file
infile.close()
#strip \n from each element
index = 0
while index < len(list):
list[index] = list[index].rstrip('\n')
index += 1
return list
except IOError:
print 'Read file error'
def print_names(list):
#print header
print '******************'
#print list line by line
index = 0
while index < len(list):
print list[index]
index += 1
return
def write_names(list):
#open file for writing
outfile = open('sortedNames.txt', 'w')
#write the list to the file
for item in list:
outfile.write(str(item) + '\n')
#close file
outfile.close()
def search_names(list):
#set user test variable
again = 'Y'
while again.upper == 'Y':
#get search from user
search = raw_input('Enter a name to search for: ')
#open list for search
if search in list:
try:
item_index = list.index(search)
print search, 'found.', item_index
except ValueError:
print search, 'not found.'
main()
'
Thanks in advance!
Your issue is that upper is a function, and you are not calling it. Your while in search_names() should read:
while again.upper() == 'Y':
instead of:
#strip \n from each element
index = 0
while index < len(list):
list[index] = list[index].rstrip('\n')
index += 1
return list
just use this list comprehension:
lines = infile.readlines()
infile.close()
return [ line.strip() for line in lines ]
edit:
It looks like you are using an index and a while loop where a for loop can be used.
Instead of:
while index < len(list):
print list[index]
index += 1
use:
# using name_list instead of list
for name in name_list:
print name
also, your search_names() function looks flawed:
def search_names(list):
#set user test variable
again = 'Y'
while again.upper == 'Y':
#get search from user
search = raw_input('Enter a name to search for: ')
#open list for search
if search in list:
try:
item_index = list.index(search)
print search, 'found.', item_index
except ValueError:
print search, 'not found.'
would never exit (again is never reassigned). try:
def search_names(names_list):
again = 'Y'
while again.upper() == 'Y':
s_name = raw_input('Enter a name to search for: ')
if s_name in names_list:
print s_name, 'found.', names_list.index(s_name)
else:
print search, 'not found.'
again = raw_input('Search for another name (Y|N)?: ')
or:
def search_names(names_list):
again = 'Y'
while again == 'Y':
s_name = raw_input('Enter a name to search for: ')
try:
idx = names_list.index(s_name)
print s_name, 'found.', idx
except ValueError:
print search, 'not found.'
again = raw_input('Search for another name (Y|N)?: ').upper()
Which brings up the issue of when to catch exceptions vs using an if statement:
from msdn:
The method you choose depends on how
often you expect the event to occur.
If the event is truly exceptional and
is an error (such as an unexpected
end-of-file), using exception handling
is better because less code is
executed in the normal case. If the
event happens routinely, using the
programmatic method to check for
errors is better. In this case, if an
exception occurs, the exception will
take longer to handle.
Comments begin with #, not '# - you are making every other line of your header a docstring.
You are using an index to iterate across lists, which is inefficient - just iterate on the list items.
Calling a variable list is bad because it prevents you from accessing the list() datatype.
Using with is a more reliable replacement for open() .. close()
again.upper is a function reference - you have to call the function, ie again.upper().
You never change the value of again - this will be an infinite loop!
You test if search in list but then do a try..except block which will only fail if it is not in the list (ie you are testing for the same failure twice).
.
#
# Operate on a list of names
#
def load_names(fname):
try:
with open(fname, 'r') as inf:
return [line.strip() for line in inf]
except IOError:
print "Error reading file '{0}'".format(fname)
return []
def print_names(namelist):
print '******************'
print '\n'.join(namelist)
def write_names(namelist, fname):
with open(fname, 'w') as outf:
outf.write('\n'.join(namelist))
def search_names(namelist):
while True:
lookfor = raw_input('Enter a name to search for (or nothing to quit): ').strip()
if lookfor:
try:
ind = namelist.index(lookfor)
print("{0} found.".format(lookfor))
except ValueError:
print("{0} not found.".format(lookfor))
else:
break
def main():
namelist = load_names('names.txt')
print_names(namelist)
namelist.sort()
print_names(namelist)
write_names(namelist, 'sorted_names.txt')
search_names(namelist)
if __name__=="__main__":
main()

Categories

Resources