Python If == true statement only working on last line of readline - python

My function only reports the last word in a file of words as an anagram (is_anagram is the first helper function). But every word in the file is an anagram of the word I tested, and each one returns True when I call the helper function on its own, outside the main function. I am not sure if it has something to do with \n being part of the string and being counted in the comparison, but I tried adding an if statement to delete it when present and that did not work either. I also checked that the loop runs through each word in the .txt file, and it does.
def is_anagram(string1, string2):
    """Returns True if the two strings are anagrams of each other.
    str, str -> bool"""
    if sorted(string1)==sorted(string2):
        return True
    else:
        return False
def find_anagrams(word):
    final = []
    content = open("small_list.txt")
    content.close
    while True:
        line = content.readline()
        print(line)
        if is_anagram(word, line) == True:
            print("bruh")
            final.append(line)
        elif line == '':
            break
    return final

This is expected, based on the method you use to read a line (file.readline). From the documentation:
f.readline() reads a single line from the file; a newline character
(\n) is left at the end of the string, and is only omitted on the last
line of the file if the file doesn’t end in a newline.
Your line has a trailing newline, but word certainly does not. So, in the end, all you'd need to change is:
line = content.readline().rstrip()
Well, that's all you'd need to change to get it working. Additionally, I'd also recommend using the with...as context manager to handle file I/O. It's good practice, and you'll thank yourself for it.
with open("small_list.txt") as f:
    for line in f:
        if is_anagram(word, line.rstrip()):
            ... # do something here
It's better to use a for loop to iterate over the lines of a file (rather than a while, it's cleaner). Also, there's no need to explicitly call f.close() when you use a context manager (you're not currently doing it, you're only referencing the method without actually calling it).
Incorporating @Christian Dean's suggestion in this answer, you can simplify your anagram function as well - call sorted and return the result in a single line:
def is_anagram(a, b):
    return sorted(a) == sorted(b)
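Putting the pieces of this answer together, a minimal sketch might look like the following. It assumes find_anagrams is changed to accept any iterable of lines (an open file works), and it uses an in-memory io.StringIO with made-up words as a stand-in for small_list.txt so the example runs without a file on disk.

```python
import io


def is_anagram(a, b):
    """Return True if both strings consist of exactly the same characters."""
    return sorted(a) == sorted(b)


def find_anagrams(word, lines):
    """Collect every line that is an anagram of `word`.

    `lines` is any iterable of strings, for example an open file object.
    The trailing newline is stripped before comparing.
    """
    final = []
    for line in lines:
        candidate = line.rstrip()
        if is_anagram(word, candidate):
            final.append(candidate)
    return final


# Hypothetical stand-in for the contents of small_list.txt.
sample = io.StringIO("listen\nsilent\nenlist\ngoogle\n")
print(find_anagrams("listen", sample))  # ['listen', 'silent', 'enlist']
```

With a real file, the call would sit inside the context manager: with open("small_list.txt") as f: find_anagrams(word, f).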

Related

Evaluation of IF statement in Python

Please help me try to understand the evaluation of this script. It must be something simple I'm not understanding. I want to scan a text file and output lines with specific string value.
So for example my txt file will contain simple stuff:
beep
boop
bop
Hey
beep
boop
bop
and the script is looking for the line with "Hey" and trying to output that line
file_path = "C:\\downloads\\test.txt"
i = 0
file_in = open(file_path,"r+") # read & write
for lines in file_in.readlines():
    i+=1
    #if lines.rfind("this entity will be skipped..."):
    if lines.find("Hey"):
        print(i, lines)
file_in.close()
For some reason it outputs every line except the one it found a match on. How is this possible?
It's probably more straightforward to do if "Hey" in lines: instead of if lines.find("Hey"):. If you really want to use find(), you could do this: if lines.find("Hey") != -1:
While Daniel's answer is, of course, correct and should be accepted, I want to fuel my OCD and offer some improvements:
# use a context manager to release the file handle automatically
with open(file_path) as file_in:
    # file objects are iterable and yield lines one at a time;
    # reading the entire file into memory can crash if the file is too big
    # enumerate() is a more readable alternative to a manual counter;
    # start=1 keeps the numbering of the original i += 1 loop
    for i, line in enumerate(file_in, 1): # 'line' should be singular
        if "Hey" in line: # same as Daniel
            print(i, line)
.find(sub) returns an integer - the first index of sub if sub is found, and -1 if it is not.
When "Hey" is present, it is at the first index (0), so that is what .find() returns. Otherwise, "Hey" is not found, so .find() returns -1.
Since Python treats every integer except 0 as True, the conditional evaluates to False when 0 is returned, i.e. when "Hey" is found at the start of the line, and to True when -1 is returned, i.e. when "Hey" is not found. That is why every line except the match is printed.
Change your use of .find() to something which fulfills your if statement the way you want.
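To make the truthiness issue concrete, here is a small sketch. The helper name describe is made up for illustration; it returns the index from .find(), how an if statement would interpret that index, and what the in operator reports for the same line.

```python
def describe(line, needle="Hey"):
    """Return (find index, truthiness of that index, result of `in`)."""
    index = line.find(needle)
    return index, bool(index), needle in line


for line in ["beep\n", "Hey\n", "oh Hey\n"]:
    print(repr(line), describe(line))

# 'beep\n'   -> (-1, True, False)  not found, yet the index is truthy
# 'Hey\n'    -> (0, False, True)   found at index 0, yet the index is falsy
# 'oh Hey\n' -> (3, True, True)    found later in the line
```

Only the third column matches the intent of the original script, which is why "Hey" in line (or an explicit != -1 comparison) is the safer test.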

using readline() in a function to read through a log file will not iterate

In the code below readline() will not increment. I've tried using a value, no value and variable in readline(). When not using a value I don't close the file so that it will iterate but that and the other attempts have not worked.
What happens is just the first byte is displayed over and over again.
If I don't use a function and just place the code in the while loop (without 'line' variable in readline()) it works as expected. It will go through the log file and print out the different hex numbers.
i=0
x=1
def mFinder(line):
    rgps=open('c:/code/gps.log', 'r')
    varr=rgps.readline(line)
    varr=varr[12:14].rstrip()
    rgps.close()
    return varr
while x<900:
    val=mFinder(i)
    i+=1
    x+=1
    print val
    print 'this should change'
It appears you have misunderstood what file.readline() does. Passing in an argument does not tell the method to read a specific numbered line.
The documentation tells you what happens instead:
file.readline([size])
Read one entire line from the file. A trailing newline character is kept in the string (but may be absent when a file ends with an incomplete line). If the size argument is present and non-negative, it is a maximum byte count (including the trailing newline) and an incomplete line may be returned.
Bold emphasis mine, you are passing in a maximum byte count and rgps.readline(1) reads a single byte, not the first line.
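The effect of the size argument can be seen in a short sketch. It uses Python 3 and an in-memory io.StringIO with made-up content instead of the real gps.log; in Python 3 text mode the limit counts characters rather than bytes.

```python
import io

# Hypothetical two-line stand-in for the log file.
log = io.StringIO("first line\nsecond line\n")

pieces = [
    log.readline(1),  # at most one character: 'f'
    log.readline(1),  # continues where the last call stopped: 'i'
    log.readline(),   # rest of the current line: 'rst line\n'
    log.readline(),   # the next full line: 'second line\n'
]
print(pieces)
```

Each call continues from the current position of the same file object, so reopening the file on every call, as mFinder does, always starts again from the beginning.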
You need to keep a reference to the file object around until you are done with it, and repeatedly call readline() on it to get successive lines. You can pass the file object to a function call:
def finder(fileobj):
    line = fileobj.readline()
    return line[12:14].rstrip()
with open('c:/code/gps.log') as rgps:
    x = 0
    while x < 900:
        section = finder(rgps)
        print section
        # do stuff
        x += 1
You can also loop over files directly, because they are iterators:
for line in openfileobject:
or use the next() function to get a next line, as long as you don't mix .readline() calls and iteration (including next()). If you combine this with a generator function, you can leave the file object entirely to a separate function that will read lines and produce sections until you are done:
def read_sections():
    with open('c:/code/gps.log') as rgps:
        for line in rgps:
            yield line[12:14].rstrip()
for section in read_sections():
    # do something with `section`.
    pass
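As a runnable Python 3 variation on the generator above, the sketch below assumes read_sections takes an already open file object and that the slice bounds are parameters; both are choices made here for illustration. The log lines are invented, and itertools.islice stands in for the question's 900-line cap.

```python
import io
from itertools import islice


def read_sections(fileobj, start=12, stop=14):
    """Yield the characters in columns start..stop-1 of every line."""
    for line in fileobj:
        yield line[start:stop].rstrip()


# Hypothetical log content; the real format of gps.log is unknown.
log = io.StringIO(
    "0123456789abAAxx\n"
    "0123456789abBByy\n"
    "0123456789abCCzz\n"
)

# Take at most two sections, the way the question stops after 900.
sections = list(islice(read_sections(log), 2))
print(sections)  # ['AA', 'BB']
```

Because the generator is lazy, the third line is never read when only two sections are requested.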

If statements not catching empty lines in files

I am currently writing a static function where an open file object is passed in as a parameter. It then reads the file, and if the line is empty, it returns False. If the line is not empty, it uses the line in question plus the next three to create a new object of Person class (the class being designed in my module). For some reason, my if statement is not catching newlines, no matter what method I have tried, and I keep getting errors because of it. What am I doing wrong?
@staticmethod
def read_person(fobj):
    p_list = []
    for line in fobj:
        if line.isspace() or line == "\n":
            return False
        else:
            p_list.append(line)
    return Person(p_list[0],p_list[1],p_list[2],p_list[3])
Thanks for your help!
The magic you want is:
if line.strip() == "":
You can get caught up in all the little cases possible in blank line processing. Is it space-newline? space-space-newline? tab-newline? space-tab-newline? Etc.
So, don't check all those cases. Use strip() to remove all left and right whitespace. If you have an empty string remaining, it's a blank line, and Bob's your uncle.
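A quick sketch of the strip() check against several kinds of blank line follows. The helper name is_blank is made up here; the second printed value shows str.isspace() for comparison, which differs on the empty string that readline() returns at end of file.

```python
def is_blank(line):
    """Return True if the line contains nothing but whitespace (or nothing)."""
    return line.strip() == ""


for candidate in ["\n", "  \n", "\t \n", "", "Bob\n"]:
    print(repr(candidate), is_blank(candidate), candidate.isspace())
```

The one case where the two checks disagree is "": is_blank("") is True, while "".isspace() is False.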

find common elements in the strings python

I'm trying to find common elements in the strings reading from a file. And this is what I wrote:
file = open ("words.txt", 'r')
while 1:
    line = file.readlines()
    if len(line) == 0:
        break
    print line
file.close
def com_Letters(*strings):
    return set.intersection(*map(set,strings))
and the result turns out: ['out\n', 'dog\n', 'pingo\n', 'coconut']
I put com_Letters(line), but the result is empty.
There are two problems, but neither one is with com_Letters.
First, this code guarantees that line will always be an empty list:
while 1:
    line = file.readlines()
    if len(line) == 0:
        break
    print line
The first time through the loop, you call readlines(), which will
Read until EOF using readline() and return a list containing the lines thus read.
If the file is empty, that's an empty list, so you'll break.
Otherwise, you'll print out the list, and go back into the loop. At which point readlines() is going to have nothing left to read, since you already read until EOF, so it's guaranteed to be an empty list. Which means you'll break.
Either way, line ends up empty.
It's not clear what you're trying to do with that loop. There's never any good reason to call readlines() repeatedly on the same file. But, even if there were, you'd probably want to accumulate all of the results, rather than just keeping the last (guaranteed-empty) result. Something like this:
line = []  # start with an empty list to accumulate into
while 1:
    new_line = file.readlines()
    if len(new_line) == 0:
        break
    print new_line
    line += new_line
Anyway, if you fix that problem (e.g., by scrapping the whole loop and just using line = file.readlines()), you're calling com_Letters with a single list of strings. That's not particularly useful; it's just a very convoluted way of calling set. If it's not clear why:
Since there's only one argument (a list of strings), *strings ends up as a one-element tuple of that argument.
map(set, strings) on a single-element tuple just calls set on that element and returns a single-element list.
*map(set, strings) explodes that into one argument, the set.
set.intersection(s) is the same thing as s.intersection(), which just returns s itself.
All of this would be easier to see if you broke up some of those complex expressions and printed the intermediate values. Then you'd know exactly where it first goes wrong, instead of just knowing it's somewhere in a long chain of events.
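For instance, splitting the expression into steps could look like this sketch. The function name trace_star_args is invented; its body mirrors the question's com_Letters, and the input is the list printed in the question.

```python
lines = ["out\n", "dog\n", "pingo\n", "coconut"]


def trace_star_args(*strings):
    """Same logic as com_Letters(*strings), with the intermediates returned."""
    as_sets = list(map(set, strings))    # one set per positional argument
    result = set.intersection(*as_sets)  # intersection of those sets
    return strings, as_sets, result


strings, as_sets, result = trace_star_args(lines)
print(len(strings))          # 1: the whole list arrived as a single argument
print(len(as_sets))          # 1: so there is only one set to intersect
print(result == set(lines))  # True: the "intersection" is just set(lines)
```

The result is a set of whole lines rather than of letters, which shows that the list was never unpacked into separate strings.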
A few side notes:
You forgot the () on the file.close, which means you're not actually closing the file. One of the many reasons that with is better is that it means you can't make that mistake.
Use plural names for collections. line sounds like a variable that should have a single line in it, not a variable that should have all of your lines.
The readlines function with no sizehint argument is basically useless. If you're just going to iterate over the lines, you can do that to the file itself. If you really need the lines in a list instead of reading them lazily, list(file) makes your intention clearer—and doesn't mislead you into thinking it might be useful to do repeatedly.
The Pythonic way to check for an empty collection is just if not line:, rather than if len(line) == 0:.
while True is clearer than while 1.
I suggest modifying the function as follows:
def com_Letters(strings):
    return set.intersection(*map(set,strings))
I think the function is treating the argument strings as a list of a list of strings (only one argument passed in this case a single list) and therefore not finding the intersection.
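A short sketch of that modified signature in use, under the assumption that the lines are stripped of their newlines first. The name common_letters is a renamed stand-in for com_Letters, and the input is the list shown in the question.

```python
def common_letters(strings):
    """Return the set of characters shared by every string in `strings`."""
    return set.intersection(*map(set, strings))


lines = ["out\n", "dog\n", "pingo\n", "coconut"]
words = [line.strip() for line in lines]  # drop trailing newlines

print(common_letters(words))  # {'o'}
```

Stripping matters in general: if every line ended in "\n", the newline itself would show up as a common character.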

Homework - Printing lines of a file between two line numbers

Using Python, how do I print the lines of a text file, given a starting and ending line number?
I have come up with a function, but this doesn't work.
def printPart(src, des, varFile):
    returnLines = ""
    for curLine in range(src, des):
        returnLines += linecache.getline(varFile, curLine)
    return returnLines
Since file objects are iterable in Python, you can apply all the functions from itertools to them. Have a look at itertools.islice(). (Since this is homework, I'll leave the details to you.)
I would start with the first line in the file and read each line with readline(), incrementing a counter (count += 1) as I go. Once count reaches the starting line number, start printing. Once it reaches the last line number, call sys.exit().
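One way the itertools.islice() hint could be applied is sketched below. It assumes line numbers are 1-based and that both ends are inclusive, which the question does not specify; the function name lines_between and the sample text are made up for illustration.

```python
import io
from itertools import islice


def lines_between(fileobj, start, end):
    """Return lines numbered start..end (assumed 1-based and inclusive)."""
    # islice uses 0-based indices with an exclusive stop.
    return list(islice(fileobj, start - 1, end))


sample = io.StringIO("one\ntwo\nthree\nfour\nfive\n")
print("".join(lines_between(sample, 2, 4)), end="")
# two
# three
# four
```

islice stops reading once the last requested line has been produced, so the rest of the file is never consumed.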
