I've spent the last few hours with Python's multiprocessing module. The last thing I did was set up a Pool with a few workers to execute a function. Everything was fine and multiprocessing was working.
After I switched to another file to apply my multiprocessing insights to my actual project, the code ran, but the .get() call following apply_async(func, args) never returned anything.
I thought it was due to a faulty implementation, but it turned out to be something else. I went back to the first file where I did my experiments and tried to run it again. Suddenly multiprocessing didn't work there anymore either. Apart from trying to integrate multiprocessing into my project, I didn't really change anything else in the code or the environment. The only thing that may be related to the problem is that I tried to execute an older version of the __main__ program in the same environment, which led to an error because that __main__ file pulled in newer project files to execute.
Here's the code I used for the simple experiment:
import torch.multiprocessing as mp
import torch
import time

# that's code from my project
import data.preprocessor as preprocessor
import eco_neat_t1_cfg as cfg

def f(x):
    return x*x

if __name__ == "__main__":
    p = mp.Pool(1)
    j = p.apply_async(f, (5,))
    r = j.get(timeout=10)
    print(r)
So nothing too complicated here; it should actually work.
However, no result is ever returned. I notice the CPU is busy according to the number of processes invoked, so I assume it is caught in some kind of loop.
I am running on Win10, with Python running in Anaconda Spyder with the IPython console.
When I close Spyder and all Python processes, restart, and execute it again, it works, but only if I am not importing my own code. Once I import eco_neat_t1_cfg as cfg, which is basically only a dataclass with a few other imports, the program won't return anything and gets stuck during execution.
I am getting a Reloaded modules: data, data.preprocessor, data.generator message. This is probably because cfg also imports data.preprocessor. The weird thing is, if I now delete all imports from the source code, save the file, and run it, it works again, although the imports are no longer specified in the code... however, every time I restart Python and Spyder I need to specify the modules again, of course.
Also, the simple code works after deleting all the imports, but the more complicated code I actually want to execute does not, although it was perfectly fine before all this happened. Now I am getting a PicklingError: Can't pickle <class 'neat.genotype.genome.Pickleable_Genome'>: it's not the same object as neat.genotype.genome.Pickleable_Genome. As I said, there were no problems with pickling before. This is the "more complicated" piece of code I pass to apply_async:
def create_net(genome):
    net = feedforward.Net("cpu", genome, 1.0)
    inp = torch.ones(2, 5)
    input = torch.FloatTensor(inp).view(1, -1).squeeze()
    res = net(input)
    return res
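For context, this kind of PicklingError usually means the class object was replaced after the instance was created, typically by a module reload such as Spyder's automatic reloading (the "Reloaded modules: ..." message above). A minimal sketch of the mechanism, assuming a hypothetical sibling file mymod.py that contains nothing but class Thing: pass:

import importlib
import pickle

import mymod  # hypothetical sibling file containing: class Thing: pass

t = mymod.Thing()
importlib.reload(mymod)  # rebinds mymod.Thing to a brand-new class object
try:
    pickle.dumps(t)  # pickle looks up mymod.Thing and finds a different class
except pickle.PicklingError as e:
    print(e)  # Can't pickle <class 'mymod.Thing'>: it's not the same object ...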
What I did was delete all __pycache__ directories in my project, reset the Spyder settings, and update Anaconda and Spyder. I also looked at every piece of the cfg file, but even if I delete every line, including all imports, the program will not work. Nothing helps; I really have no idea what the problem is or how to solve it. Maybe one of you has an idea.
One more thing: if I manually interrupt the execution (CTRL + C), the stack trace shows the processes waiting:
File "C:/Users/klein/Desktop/bak/test/eco_neat_t1/blabla.py", line 72, in <module>
r = j.get(timeout=10)
File "C:\Users\klein\Anaconda3\envs\eco_neat\lib\multiprocessi\pool.py", line 651, in get
self.wait(timeout)
File "C:\Users\klein\Anaconda3\envs\eco_neat\lib\multiprocessing\pool.py", line 648, in wait
self._event.wait(timeout)
File "C:\Users\klein\Anaconda3\envs\eco_neat\lib\threading.py", line 552, in wait
signaled = self._cond.wait(timeout)
Cheers
SOLUTION: The bug was my directory structure, which somehow caused Python to fail. I had both of my __main__ test files in the root/tests/eco_neat_t1 directory. I created two new files, t1_cfg and t1, with the same content in the root directory. Suddenly everything works fine... weird world.
Related
I have two main Python scripts in my project:
sftpConnector.py and fileIterator.py
sftpConnector.py establishes a connection to an SFTP server and downloads some files; it does not call any other script. It has only one function: establisher().
fileIterator.py also has one function, iterator(), and it calls two other scripts: folderGenerator.py and dataProcessor.py.
When called independently, both scripts, sftpConnector.py and fileIterator.py, execute properly without any issues. However, when I try to create a main.py file to call these two scripts one after the other, it does not work.
The main.py looks like this:
import sftpConnector
import fileIterator

if __name__ == 'main':
    folder_name = input('File name:\n')
    print('folder_name')
    sftpConnector.establisher(folder_name)
    fileIterator.iterator(folder_name)
This main.py file runs the first script and function, sftpConnector.establisher(folder_name), but it does not even go into the second script. I am not sure what's going on here.
Also, not sure if this will help, but neither of the scripts returns anything.
Your run guard should be
if __name__ == "__main__":
That's how you ensure that your code only runs when the file is executed directly as a script (versus being imported). You accidentally checked for the wrong name ("main"), and I'm guessing that's why the code in your if block isn't running. It's an easy mistake to make: it's natural to assume that __name__ refers to the name of your file when you run your script directly, but it doesn't; it refers to the environment where the top-level code is run. The line if __name__ == '__main__': is asking the question: "Is this script the top-level one?"
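A quick way to see this for yourself, with a one-line file (demo.py is just a made-up name):

# demo.py
print(__name__)  # prints "__main__" when run as `python demo.py`,
                 # but "demo" when another script does `import demo`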
I'm using subprocess to run a script.R file from test.py. My test.py looks like this:
import subprocess
import pandas as pd
subprocess.call(["/usr/bin/Rscript", "--vanilla", "script.R"]) #line 3
df=pd.read_csv("output.csv") #line 4
script.R looks like this:
library(limma)
df <- read.csv("input.csv")
df <- normalizeCyclicLoess(df)
write.csv(df, "output.csv")
When I run the above file (test.py), I get an error:
FileNotFoundError: [Errno 2] File b'output.csv' does not exist: b'output.csv'
I understand that this error occurs because an output.csv file doesn't exist in my working directory. But I assumed it would be created by script.R, and that this isn't happening probably because Python moves on to line 4 before the execution of line 3 finishes. We used help from here for this, and as mentioned there, we are using call for this. What's going wrong then? Thanks...
EDIT: I noticed that if I don't import any libraries in my R code (like limma above), everything works fine. I can write as lengthy a script as I want, and it proceeds to completion without any error. But as soon as I import any library, subprocess.call(....) gives me a non-zero result (zero means the process completed successfully). As a test, I changed my script.R to just library(limma) (and nothing else; I tried other libraries too and got the same results), and it gave me a non-zero result. Thus, I think there's some problem with importing libraries when running through subprocess. Please note that I'm able to run this R code directly, and nothing is wrong with my code or library. Please give me some hints on what might be going wrong here...
Apologies, my initial answer was completely wrong.
Your subprocess call:
subprocess.call(["/usr/bin/Rscript", "--vanilla", "script.R"])
will return a number: this is the exit code of the process. If it succeeds without error, it will be zero. You could check this to make sure your R code ran correctly.
Have you tried running your R code directly? Does it produce the output.csv? And if so, does it produce it in the right place?
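To check programmatically, here is a small sketch (using subprocess.run, the newer interface; with subprocess.call you would just inspect the returned integer). Capturing stderr should reveal R's own error message, e.g. from a failing library() call:

import subprocess

result = subprocess.run(
    ["/usr/bin/Rscript", "--vanilla", "script.R"],
    capture_output=True, text=True,  # capture_output requires Python 3.7+
)
print("exit code:", result.returncode)  # 0 means the R script finished cleanly
if result.returncode != 0:
    print(result.stderr)  # R's error output, e.g. a library that failed to load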
In Python there is a reload method to reload an imported module, but is there a method to reload the currently running script without restarting it? This would be very helpful when debugging a script and changing the code live while the script is running. In Visual Basic I remember a similar functionality called "apply code changes", but I need it as a function call like refresh() which applies the code changes instantly.
This would work smoothly when an independent function in the script is modified and we need to immediately apply the change without restarting the script.
In short, will:
reload(self)
work?
reload(self) will not work, because reload() works on modules, not on live instances of classes. What you want is some logic external to your main application which checks whether it has to be reloaded. You also have to think about what is needed to re-create your application state upon reload.
Some hints in this direction:
Guido van Rossum once wrote xreload.py; it does a bit more than reload(). You would need a loop that checks for changes every x seconds and applies the reload (see the sketch after these hints).
Also have a look at livecoding, which basically does this. EDIT: I mistook this project for something else (which I can't find now), sorry.
Perhaps this SO question will help you.
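A minimal sketch of such an external check, polling the module's file mtime and reloading when it changes (mymodule and its main() are placeholders for your own code):

import importlib
import os
import time

import mymodule  # placeholder for the module you want to keep fresh

last_mtime = os.path.getmtime(mymodule.__file__)
while True:
    mtime = os.path.getmtime(mymodule.__file__)
    if mtime != last_mtime:  # the source file was saved since the last check
        last_mtime = mtime
        importlib.reload(mymodule)  # re-executes the module's code in place
    mymodule.main()  # one unit of the application's actual work
    time.sleep(1)  # check for changes every second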
Perhaps you mean something like this:
import pdb
import importlib
from os.path import basename

def function():
    print("hello world")

if __name__ == "__main__":
    # import this module, but not as __main__, so everything other than this
    # if statement is executed
    mainmodname = basename(__file__)[:-3]
    module = importlib.import_module(mainmodname)
    while True:
        # reload the module to check for changes
        importlib.reload(module)
        # update the globals of __main__ with any new or changed
        # functions or classes in the reloaded module
        globals().update(vars(module))
        function()
        pdb.set_trace()
Run the code, then change the contents of function and enter c at the prompt to run the next iteration of the loop.
test.py
class Test(object):
    def getTest(self):
        return 'test1'
testReload.py
from test import Test

t = Test()
print(t.getTest())

# change the return value in test.py, then:
import importlib
module = importlib.import_module(Test.__module__)
importlib.reload(module)
from test import Test

t = Test()
print(t.getTest())
Intro
reload is for imported modules. The documentation for reload advises against reloading __main__:
Reloading sys, __main__, builtins and other key modules is not recommended.
To achieve similar behavior for your script, you will need to re-execute the script. This will - for normal scripts - also reset the global state. I propose a solution below.
NOTE
The solution I propose is very powerful and should only be used for code you trust. Automatically executing code from unknown sources can lead to a world of pain. Python is not a good environment for sandboxing unsafe code.
Solution
To programmatically execute a python script from another script you only need the built-in functions open, compile and exec.
Here is an example function that will re-execute the script it is in:
def refresh():
    with open(__file__) as fo:
        source_code = fo.read()
    byte_code = compile(source_code, __file__, "exec")
    exec(byte_code)
The above will, under normal circumstances, also reset any global variables you might have. If you wish to keep these variables, you should check whether or not they have already been set. This can be done with a try-except block catching NameError exceptions, but that can be tedious, so I propose using a flag variable instead.
Here is an example using the flag variable:
in_main = __name__ == "__main__"
first_run = "flag" not in globals()

if in_main and first_run:
    flag = True
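Putting the two pieces together, a sketch of a script that can re-execute itself while keeping selected state across refreshes (run it with python -i, edit the print line, then call refresh() at the prompt):

# sketch.py
if "counter" not in globals():  # flag-style guard: only initialize on the first run
    counter = 0

def refresh():
    with open(__file__) as fo:
        source_code = fo.read()
    byte_code = compile(source_code, __file__, "exec")
    exec(byte_code, globals())  # re-execute in the existing global namespace

counter += 1
print("run number", counter)  # counter survives refreshes; code changes apply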
None of these answers did the job properly for me, so I put together something very messy and very non-Pythonic to do the job myself. Even after running it for several weeks, I am still finding small issues and fixing them. One such issue is if your PWD/CWD changes.
Warning: this is very ugly code. Perhaps someone will make it pretty, but it does work.
Not only does it create a refresh() function that properly reloads your script in a manner such that any exceptions display properly, it also creates refresh_<scriptname> functions for previously loaded scripts, just in case you need to reload those.
Next I would probably add a require portion so that scripts can reload other scripts -- but I'm not trying to build Node.js here.
First, the "one-liner" that you need to insert in any script you want to refresh.
# note: assumes os is already imported in the script
with open(os.path.dirname(__file__) + os.sep + 'refresh.py', 'r') as f: \
    exec(compile(f.read().replace('__BASE__', \
         os.path.basename(__file__).replace('.py', '')).replace('__FILE__', \
         __file__), __file__, 'exec'))
And second, the actual refresh function which should be saved to refresh.py in the same directory. (See, room for improvement already).
def refresh(filepath=__file__, _globals=None, _locals=None):
    print("Reading {}...".format(filepath))
    if _globals is None:
        _globals = globals()
    _globals.update({
        "__file__": filepath,
        "__name__": "__main__",
    })
    with open(filepath, 'rb') as file:
        exec(compile(file.read(), filepath, 'exec'), _globals, _locals)

def refresh___BASE__():
    refresh("__FILE__")
Tested with Python 2.7 and 3.
Take a look at reload. You just need to install the plugin and use reload ./myscript.py, voilà.
If you are running in an interactive session, you could use IPython's autoreload extension:
autoreload reloads modules automatically before entering the execution of code typed at the IPython prompt.
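In an IPython or Spyder console, the extension is enabled with two magic commands:

%load_ext autoreload
%autoreload 2  # mode 2: reload all modules (except those excluded) before running code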
Of course this also works on module level, so you would do something like:
>>> import myscript
>>> myscript.main()
# do some changes in myscript.py
>>> myscript.main()  # is now changed
EDIT: OK, I could swear that the way I'd tested it showed that the getcwd was also causing the exception, but now it appears it's just the file creation. When I move the try-except blocks around, it actually does catch it like you'd think it would. So chalk that up to user error.
Original Question:
I have a script I'm working on that I want to be able to drop a file onto, to have it run with that file as an argument. I checked this question, and I already have the mentioned registry keys (apparently the Python 2.6 installer takes care of that). However, it throws an exception that I can't catch. Running it from the console works correctly, but when I drop a file on it, it throws an exception and then closes the console window. I tried to redirect standard error to a file, but it threw the exception before the redirection occurred in the script. With a little testing and some quick eyesight, I saw that it was throwing an IOError when I tried to create the file to write the error to.
import sys
import os

#os.chdir("C:/Python26/Projects/arguments")
try:
    print sys.argv
    raw_input()
    os.getcwd()
except Exception, e:
    print str(sys.argv) + '\n'  # sys.argv is a list, so convert before concatenating
    print e
    f = open("./myfile.txt", "w")
If I run this from the console with any or no arguments, it behaves as one would expect. If I run it by dropping a file on it, for instance test.txt, it runs and prints the arguments correctly, but when os.getcwd() is called, it throws the exception and does not execute any of the code in the except block, making it difficult for me to find any way to actually keep the exception text on screen. If I uncomment the os.chdir(), the script doesn't fail. If I move that line into the except block, it is never executed.
I'm guessing that running it by dropping a file on it, which according to the other linked question uses the WSH, is somehow messing up its permissions or the cwd, but I don't know how to work around it.
Seeing as this is probably not Python related, but a Windows problem (I for one could not reproduce the error given your code), I'd suggest attaching a debugger to the Python interpreter when it is started. Since you start the interpreter implicitly by a drag&drop action, you need to configure Windows to auto-attach a debugger each time Python starts. If I remember correctly, this article has the needed information to do that (you can substitute another debugger if you are not using Visual Studio).
Apart from that, I would take a snapshot with ProcMon while dragging a file onto your script, to get an idea of what is going on.
As pointed out in my edit above, the errors were caused by the working directory changing to C:\windows\system32, where the script isn't allowed to create files. I don't know how to keep it from changing the working directory when started that way, but I was able to work around it like this:
if len(sys.argv) == 1:
    files = [filename for filename in os.listdir(os.getcwd())
             if filename.endswith(".txt")]
else:
    files = [filename for filename in sys.argv[1:]]
Fixing the working directory can be managed this way, I guess:
exepath = sys.argv[0]
os.chdir(exepath[:exepath.rfind('\\')])
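An equivalent using os.path, which avoids hand-parsing the backslashes (a minor variation, not from the original answer):

import os
import sys

os.chdir(os.path.dirname(os.path.abspath(sys.argv[0])))  # cd to the script's own folder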
I have a fairly simple app built with PyQt4. I wanted to debug one of the functions connected to one of the buttons in my app. However, when I do the following
python -m pdb app.pyw
> break app.pyw:55 # This is where the signal handling function starts.
things don't quite work as I'd hoped. Instead of breaking in the function where I've set the breakpoint and letting me step through it, the debugger enters an infinite loop printing QCoreApplication::exec: The event loop is already running, and I am unable to input anything. Is there a better way to do this?
You need to call QtCore.pyqtRemoveInputHook. I wrap it in my own version of set_trace:
def debug_trace():
    '''Set a tracepoint in the Python debugger that works with Qt'''
    from PyQt4.QtCore import pyqtRemoveInputHook
    # or for Qt5:
    # from PyQt5.QtCore import pyqtRemoveInputHook
    from pdb import set_trace
    pyqtRemoveInputHook()
    set_trace()
When you are done debugging, you can call QtCore.pyqtRestoreInputHook(), ideally while you are still in pdb. Then, after you hit Enter and the console spam starts, keep hitting c (for continue) until the app resumes properly. (I had to hit c several times for some reason; it kept going back into pdb, but after a few times it resumed normally.)
For further info Google "pyqtRemoveInputHook pdb". (Really obvious isn't it? ;P)
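For illustration, a minimal PyQt4 sketch that calls debug_trace() from above inside a button handler (the widget and handler names are made up):

from PyQt4 import QtGui

class Window(QtGui.QWidget):
    def __init__(self):
        super(Window, self).__init__()
        self.button = QtGui.QPushButton("Debug", self)
        self.button.clicked.connect(self.on_clicked)

    def on_clicked(self):
        debug_trace()  # drops into pdb here without the event-loop console spam
        print("step through the handler from the (Pdb) prompt")

if __name__ == "__main__":
    import sys
    app = QtGui.QApplication(sys.argv)
    w = Window()
    w.show()
    sys.exit(app.exec_())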
I had to use a "next" command at the trace point to get outside of that function first. For that I made a modification of the code from mgrandi:
def pyqt_set_trace():
    '''Set a tracepoint in the Python debugger that works with Qt'''
    from PyQt4.QtCore import pyqtRemoveInputHook
    import pdb
    import sys
    pyqtRemoveInputHook()
    # set up the debugger
    debugger = pdb.Pdb()
    debugger.reset()
    # custom next to get outside of function scope
    debugger.do_next(None)  # run the next command
    users_frame = sys._getframe().f_back  # frame where the user invoked `pyqt_set_trace()`
    debugger.interaction(users_frame, None)
This worked for me. I found the solution here: Python (pdb) - Queueing up commands to execute.
In my tests, jamk's solution works, while the previous one, although simpler, does not.
In some situations, for reasons that are unclear to me, I've been able to debug Qt without doing any of this.