For loop inside function executing only once - python

I'm facing a problem I have never encountered before: I invoke a function that contains a for loop many times from another function, but the loop only executes once, and I don't know what I did wrong.
Here is the function that contains the for loop:
def check_neutre(word):
    global f4
    i = 0
    neutre = []
    print(word)
    for ligne in f4:
        neutre.append(ligne.strip())
    print(len(neutre))
    return "done"
f4 is a file object opened at the top of the script.
And here is the function that calls it:
def check_words(words):
    polarite = 0
    exist = False
    for word in words:
        print(check_neutre(word))

check_words(words)
The words variable is a list of words.
The output shows that the loop is executed only once.

You want to loop over each line of the file on every call, but a file object is an iterator: the first call reads it all the way through, and every later call finds nothing left.
You can change to this at the start of the file:
f4_lines = f4.readlines()
and use the global f4_lines instead of f4; a list can be looped over any number of times.
If it is a large file there may be another solution, because this loads the whole file into memory.
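A minimal sketch of that fix, assuming f4 is opened at the top of the script as the question describes (the file name is made up here):
f4 = open("neutre.txt")  # hypothetical file name; the question doesn't give one
f4_lines = f4.readlines()  # read every line into a list, once
f4.close()

def check_neutre(word):
    neutre = []
    print(word)
    for ligne in f4_lines:  # a list can be iterated on every call, unlike the exhausted file object
        neutre.append(ligne.strip())
    print(len(neutre))
    return "done"
For a large file, an alternative is to call f4.seek(0) at the start of each call, rewinding the file object instead of holding all lines in memory.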

I have 2 functions that do the same thing, but when calling both, only one returns values while the other doesn't [duplicate]

This is actually a very basic issue I'm facing.
# create list using append & idiom method to test process time
fln = open('CROSSWD.TXT')

def check_1(fln):
    res = []
    for line in fln:
        word = line.strip()
        res.append(word)  # just create a new list
    return res

def check_2(fln):
    res2 = []
    for line in fln:
        word2 = line.strip()
        res2 += [word2]  # using another way
    return res2

n = check_2(fln)  # this is where the problem occurs: n returns the values
m = check_1(fln)  # but m returns an empty list
# it should call both m & n and print the same length; they work separately, but calling both at once doesn't work. Why?
print(len(n))
print(len(m))
But if I run them separately they work as intended. This is a very basic issue; I hope someone can clarify these basics for me.
The problem is that your first function call reads the file to the end, so there is nothing left for the second call. Moreover, you never close the file.
That's why it is recommended to use a context manager to interact with files, like this:
fln = 'CROSSWD.TXT'

def check_1(fln):
    res = []
    with open(fln) as file:
        for line in file:
            word = line.strip()
            res.append(word)  # just create a new list
    return res

def check_2(fln):
    res2 = []
    with open(fln) as file:
        for line in file:
            word2 = line.strip()
            res2 += [word2]  # using another way
    return res2

if __name__ == "__main__":
    n = check_2(fln)
    m = check_1(fln)
    print(len(n))
    print(len(m))
The file is "used up" by reading it in check_2. The call to check_1 tries to continue stepping through the same file, but the end of the file was already reached by the end of the call to check_2.
To read it twice, call fln = open('CROSSWD.TXT') twice.
Another point: your code neglects to close the file. In a script which exits right after reading a file, you can leave it to the operating system to close the file on exit. But you should still get used to opening files with the context manager pattern, using with and indenting the block that uses the file:
def check_1():
    res = []
    with open('CROSSWD.TXT') as fln:
        for line in fln:
            word = line.strip()
            res.append(word)  # just create a new list
    return res
Open the file twice.
def check_1(fln):
    res = []
    for line in fln:
        word = line.strip()
        res.append(word)  # just create a new list
    return res

def check_2(fln):
    res2 = []
    for line in fln:
        word2 = line.strip()
        res2 += [word2]  # using another way
    return res2

n = check_2(open('CROSSWD.TXT', 'r'))  # each call now gets its own freshly opened file,
m = check_1(open('CROSSWD.TXT', 'r'))  # so both read it from the start
print(len(n))
print(len(m))
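Combining both answers, here is a sketch that opens the file inside each call and closes it with a context manager (read_words is a made-up helper merging check_1 and check_2):
def read_words(path):
    # each call opens its own file object, reads it from the start,
    # and closes it when the with block exits
    with open(path) as f:
        return [line.strip() for line in f]

n = read_words('CROSSWD.TXT')
m = read_words('CROSSWD.TXT')
print(len(n))  # both calls now see the whole file
print(len(m))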

What are the consequences of using a 'while True' loop inside 'with open()' block?

For instance:
import time

def read_file(f):
    with open(f, 'r') as file_to_read:
        while True:
            line = file_to_read.readline()
            if line:
                yield line
            else:
                time.sleep(0.1)
The generator is consumed by another function:
def fun_function(f):
    l = read_file(f)
    for line in l:
        do_fun_stuff()
A use case would be reading an infinitely updating text file like a log where new lines are added every second or so.
As far as I understand, read_file() blocks its consumer until it has a new line to yield. But since nothing should be done unless a new line is present in the file, this seems to be okay in this case. My question is whether there are other reasons not to prefer this blocking pattern (performance, for example).
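One consequence worth spelling out, as a hedged sketch: the file stays open for the generator's entire lifetime, so a long-lived consumer holds the file handle. You can release it explicitly with the generator's close() method, which raises GeneratorExit at the paused yield and lets the with block run its cleanup:
# uses the read_file generator defined in the question above
lines = read_file('some.log')  # 'some.log' is an assumed file name
print(next(lines))  # blocks until the file contains at least one line
lines.close()  # GeneratorExit is raised inside read_file; the with block closes the file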

Re-running a python script from where it was stopped

I am preparing a Python script which contains 4 functions and a main function calling them.
When the script is run, all the functions are executed one by one, but I manually terminated the script after the 2nd function had executed completely.
Now I want the script, when re-run, to skip the first two functions and start with the 3rd one.
One idea I had is to use a file and add an entry for each function when it is executed, then read from that file next time, but it would require too many nested ifs.
Any other ideas?
Why "nested" ifs? If the functions are supposed to run in sequence, you can write the number of the last function to finish to a file, and then have a sequence of non-nested ifs, e.g, something like this pseudo-code
... # read n from file, or 0 if it doesn't exist
if n < 1:
    f1()
    ...  # write 1 to the file
if n < 2:
    f2()
    ...  # write 2 to the file
if n < 3:
    f3()
    ...  # write 3 to the file
...
So indeed one if per function, but no nesting.
If the functions may run in a different order, you can write a different file for each one, or a different line into a single file as you suggested, but I don't understand why it will have to be nested ifs.
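A concrete, runnable sketch of that pseudo-code, assuming a checkpoint file named progress.txt (a made-up name) and stub functions standing in for the script's real ones:
import os

STATE_FILE = "progress.txt"  # hypothetical checkpoint file

def f1(): print("running f1")  # stand-ins for the script's real functions
def f2(): print("running f2")
def f3(): print("running f3")

def read_progress():
    # number of the last function that finished, or 0 on a fresh run
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return int(f.read() or "0")
    return 0

def write_progress(n):
    with open(STATE_FILE, "w") as f:
        f.write(str(n))

n = read_progress()
if n < 1:
    f1()
    write_progress(1)
if n < 2:
    f2()
    write_progress(2)
if n < 3:
    f3()
    write_progress(3)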
Put the function calls in a try/except that catches KeyboardInterrupt:
fn = 0  # read from file
try:
    if fn < 1: f1()
    fn = 1
    if fn < 2: f2()
    fn = 2
    ...
except KeyboardInterrupt:  # user stopped
    ...  # write fn to file
This cuts down on code repetition from writing fn to a file.
If you know what order you want the functions to run in you could do this:
fn = 0  # read from file
fn_reached = fn
functions = [f1, f2, f3, f4]  # list of function objects
try:
    for f in functions[fn:]:
        f()
        fn_reached += 1
except KeyboardInterrupt:
    ...  # write fn_reached to file
If you want the program to reset after a full run, try this:
if fn_reached == len(functions):
    ...  # write 0 to file
Another solution is to use an environment variable. You can define a utility function that checks whether the environment variable is set and calls the right function:
import os

def utility_function():
    function_count = os.environ.get("function_count")
    if not function_count:
        # initial value when no function has been called yet;
        # os.environ values must be strings
        os.environ['function_count'] = '0'
        function_count = '0'
    function_count = int(function_count)
    if function_count == 0:
        func_1()
    elif function_count == 1:
        func_2()
    elif function_count == 2:
        func_3()
    elif function_count == 3:
        func_4()
    else:
        pass
At the end of each function you can update the value of the environment variable (always as a string; see the sketch below). Note that changes to os.environ only live for the current process, so the value must be exported by the surrounding environment for it to survive a re-run. If you still have doubts, let me know.
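For instance, the update at the end of each function might look like this (a sketch using the answer's names):
import os

def func_1():
    # ... the function's real work ...
    os.environ['function_count'] = '1'  # os.environ values must be strings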

Python keeps referring to outdated variables

I have a kind of weird problem; here is my attempt at an explanation:
I'm currently making a program which opens a txt file and then reads a line from that file using linecache.getline(path, number). After the function is done, I then call linecache.clearcache().
If I then change something in the text file, it keeps returning the pre-change line.
Following is the code I'm using (I know it isn't really pretty):
def SR(Path, LineNumber):
    returns = lc.getline(Path, LineNumber)
    x = np.array([])
    y = np.array([])
    words = returns.split()
    for word in words:
        x = np.append([x], [word])
    for i in range(len(x)):
        t = float(x[i])
        y = np.append([y], [t])
    return y
    del x
    del y
    del t
    del words
    lc.clearcache()
Nothing after the return statement will ever be executed. If you want to call clearcache, you need to call it before the return statement.
Also, as a side note, your del statements aren't really going to do anything either, even if they were placed before the return. del effectively just decrements the reference counter in the gc, which will happen when the interpreter exits the function scope anyway.
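A minimal corrected sketch, with clearcache() moved before the return and the unreachable del statements dropped:
import linecache as lc
import numpy as np

def SR(path, line_number):
    line = lc.getline(path, line_number)
    y = np.array([float(word) for word in line.split()])
    lc.clearcache()  # now runs before the function returns, so the cache is actually cleared
    return y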

What's the benefit of using generator in this case?

I'm learning about Python generators from these slides: http://www.dabeaz.com/generators/Generators.pdf
There is an example in them which can be described like this:
You have a log file called log.txt. Write a program to watch its contents; if new lines are added to it, print them. Two solutions:
1. With a generator:
import time

def follow(thefile):
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(0.1)
            continue
        yield line

logfile = open("log.txt")
loglines = follow(logfile)
for line in loglines:
    print line
2. Without a generator:
import time

logfile = open("log.txt")
while True:
    line = logfile.readline()
    if not line:
        time.sleep(0.1)
        continue
    print line
What's the benefit of using a generator here?
If all you have is a hammer, everything looks like a nail
I'd almost like to answer this question with just the above quote. Just because you can does not mean you should, all the time.
But conceptually, the generator version separates functionality: the follow function encapsulates continuously reading from a file while waiting for new input, which frees you to do whatever you want with each new line in your loop. In the second version, the code to read from the file and to print it out is intermingled with the control loop. This may not really be an issue in this small example, but it is something you might want to think about.
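For instance (a sketch reusing the question's follow generator), any consumer can decide what to do with each new line:
# plug whatever processing you like into the loop
for line in follow(open("log.txt")):
    if "ERROR" in line:  # e.g. only react to error lines
        print line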
One benefit is the ability to pass your generator around (say to different functions) and iterate manually by calling .next(). Here is a slightly modified version of your initial generator example:
import time

def follow(file_name):
    with open(file_name, 'rb') as f:
        while True:
            line = f.readline()
            if not line:
                time.sleep(0.1)
                continue
            yield line

loglines = follow("log.txt")
first_line = loglines.next()
second_line = loglines.next()
for line in loglines:
    print line
First of all I opened the file with a context manager (with statement, which auto-closes the file when you're done with it, or on exception). Next, at the bottom I've demonstrated using the .next() method, allowing you to manually step through. This can be useful sometimes if you need to break logic out from a simple for item in gen loop.
A generator function is defined like a normal function, but whenever it needs to produce a value, it does so with the yield keyword rather than return. Its main advantage is that it allows the code to produce a series of values over time, rather than computing them all at once and sending them back in a list. For example:
# A Python program to generate squares from 1
# to 100 using yield, and therefore a generator

# An infinite generator function that yields the
# next square number. It starts with 1
def nextSquare():
    i = 1
    # An infinite loop to generate squares
    while True:
        yield i * i
        i += 1  # the next execution resumes from this point

# Driver code to test the above generator function
for num in nextSquare():
    if num > 100:
        break
    print(num)
return sends a specified value back to its caller, whereas yield can produce a sequence of values. We should use yield when we want to iterate over a sequence but don't want to store the entire sequence in memory.
Ideally most loops are roughly of the form:
for element in get_the_next_value():
    process(element)
However, sometimes (as in your example #2) the loop is actually more complex, as you sometimes get an element and sometimes don't. That means the code for generating an element is mixed up with the code for processing it. It doesn't show too clearly in this example, because the code to generate the next value isn't actually too complex and the processing is just one line, but example #1 separates these two concepts more cleanly.
A better example might be one that processes variable-length paragraphs from a file, with blank lines separating each paragraph: try writing code for that with and without generators and you should see the benefit (a sketch of the generator half follows below).
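For instance, here is a sketch of the generator half of that exercise (paragraphs is a made-up helper; the non-generator version has to interleave this accumulation logic with the processing):
def paragraphs(f):
    # yield each blank-line-separated paragraph as one string
    para = []
    for line in f:
        if line.strip():
            para.append(line)
        elif para:
            yield ''.join(para)
            para = []
    if para:  # final paragraph if the file doesn't end with a blank line
        yield ''.join(para)

with open('log.txt') as f:  # reusing the question's file name for illustration
    for para in paragraphs(f):
        print para  # or any other processing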
While your example might be a bit simple to fully take advantage of generators, I prefer to use generators to encapsulate the generation of any sequence data where there is also some kind of filtering of the data. It keeps the 'what I'm doing with the data' code separated from the 'how I get the data' code.
