Original problem
I am creating an API using Express that queries a SQLite DB and outputs the result as a PDF using the html-pdf module.
The problem is that certain queries might take a long time to process, so I would like to decouple the actual query call from the Node server where Express is running; otherwise the API might slow down if several clients are running heavy queries.
My idea was to move the execution of the SQLite query into a Python script. That script can then be called from the API, so Node itself never has to run the query.
Current problem
After quickly creating a Python script that runs a SQLite query and calling it from my API using child_process.spawn(), I found that Express seems to get an exit-code signal as soon as the Python script starts executing the query.
To confirm this, I created a simple Python script that just sleeps between printing two messages, which isolated the problem.
To reproduce this behavior you can create a Python script like this:
print("test 1")
sleep(1)
print("test 2")
Then call it from Express like this:
router.get('/async', function(req, res, next) {
    // spawn takes the command first and its arguments as an array;
    // "test.py" stands for the script shown above
    var python = child_process.spawn('python3', ['test.py']);
    var output = "";
    python.stdout.on('data', function(data){
        output += data;
        console.log(output);
    });
    python.on('close', function(code){
        if (code !== 0) {
            return res.status(200).send(String(code));
        }
        return res.status(200).send(output);
    });
});
If you then run the Express server and do a GET /async, you will get a "1" as the exit code.
However, if you comment out the sleep(1) line, the server successfully returns
test 1
test 2
as the response.
You can even trigger this using sleep(0).
I have tried flushing stdout before the sleep, piping the result instead of using .on('close'), and calling python with the -u option (unbuffered streams).
None of this has worked, so I'm guessing there's some mechanism baked into Express that closes the request as soon as the spawned process sleeps OR finishes (instead of only when it finishes).
I also found this answer related to using child_process.fork(), but I'm not sure whether it would behave differently; and this one is very similar to my issue but has no answer.
Main question
So my question is: why does the Python script send an exit signal when doing a sleep() (or, in the case of my query script, when running cursor.execute(query))?
If my supposition is correct that Express closes the request when a spawned process sleeps, is this avoidable?
One potential solution I found suggested using ZeroRPC, but I don't see how that would make Express keep the connection open.
The only other option I can think of is something like Kue, so that my Express API only needs to respond with some sort of job ID; Kue would then actually spawn the Python script and wait for its response, and I could fetch the result via some other API endpoint.
Is there something I'm missing?
Edit:
AllTheTime's comment is correct regarding the sleep issue: after I added from time import sleep it worked. However, my SQLite script was still not working.
As it turns out, AllTheTime was indeed correct.
The problem was that my Python script was loading a config.json file, which loaded correctly when the script was called from the console, because the relative path happened to resolve from there.
However, when the script was called from Node, the relative path no longer resolved correctly.
After fixing the path it worked exactly as expected.
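For anyone who hits the same thing: resolving config.json against the script's own location, rather than against the working directory of whatever spawned it, avoids the problem entirely. A minimal sketch:
import json
import os

# Resolve config.json relative to this script file, not relative to the
# current working directory of whichever process spawned us (node, console, ...).
script_dir = os.path.dirname(os.path.abspath(__file__))
with open(os.path.join(script_dir, "config.json")) as f:
    config = json.load(f)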
Related
I'm currently working on a desktop application that uses React for the frontend and runs in an Electron window. Electron needs to be able to communicate with my Python backend. I found an example online that works fine for running a simple Python script from Electron and returning the result to React.
Electron code that waits for signal from React:
ipcMain.on("REACT_TEST_PYTHON", (event, args) => {
    let pyshell = new PythonShell(
        path.join(__dirname, "../background/python/test.py"),
        {
            args: [args.test],
        }
    );
    pyshell.on("message", function (results) {
        mainWindow.webContents.send("ELECTRON_TESTED_PYTHON", { tasks: results });
    });
});
test.py that is run by Electron:
import sys

data = sys.argv[1]

def factorial(x):
    if x == 1:
        return 1
    else:
        return x * factorial(x - 1)

print(factorial(int(data)))
I completely understand how this works, but for my application the Python scripts will not be as simple. Basically, I want Electron to create a Python shell that starts a named task. This Python shell should continue running in the background while the frontend works normally. The part I'm stuck on is figuring out how to access data from this initial Python shell from a different Python shell created by a subsequent signal in Electron. If that doesn't make sense, this is my intended pipeline:
1. User clicks a React button to start "Task1".
2. Electron gets the signal from React and starts a Python shell to run "Task1". This Python shell runs in the background (Electron should not wait for a result to continue processing).
3. Later on, the user clicks a React button to cancel "Task1".
4. Electron gets the signal from React and creates a new Python shell to cancel "Task1". To do this, the new Python shell needs to access data from the original Python shell.
5. This new Python shell should also close the original Python shell so that it doesn't keep trying to run "Task1".
What would be the best way to do this?
Some thoughts I've had on how to do this:
Creating a file where the necessary data for "Task1" could be written. I think I would need the mmap module to get some kind of shared memory between the shells (but I could be wrong). I would also need to figure out how to close the original Python shell (a rough sketch of this idea follows after these thoughts).
Somehow saving a reference to the original Python shell, which would allow Electron to cancel "Task1" through the original shell. I'm not sure this is possible, though, since the original Python shell will still be running; I doubt I could access the shell while it's in the middle of processing.
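For what it's worth, the first idea does not need mmap; a plain PID file is enough, and killing the recorded process also closes the original shell, which covers the last step of the pipeline. A rough sketch of both sides (the file path, the task loop, and the choice of SIGTERM are my assumptions):
# task1.py - the long-running task; records its PID so a later shell can cancel it
import os
import time

PID_FILE = "/tmp/task1.pid"  # assumed location

with open(PID_FILE, "w") as f:
    f.write(str(os.getpid()))
try:
    while True:        # stand-in for the real "Task1" work
        time.sleep(1)
finally:
    if os.path.exists(PID_FILE):
        os.remove(PID_FILE)

# cancel_task1.py - run from the second Python shell to stop the first one
import os
import signal

with open("/tmp/task1.pid") as f:
    pid = int(f.read())
os.kill(pid, signal.SIGTERM)  # terminates the original shell's process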
Thank you for any help or insight you can provide! I apologize if this is a confusing question; please let me know if I can clear anything up.
Trying to use an rpyc server via a Progress script, and have the script do different tasks based on different values returned from the client.
I'm using an rpyc server to automate some tasks on demand from users, and I'm trying to implement the client in a Progress script this way:
1. The Progress script launches.
2. The Progress script calls the rpyc client via cmd to run a function that checks whether the server is live, and returns some sort of value to indicate whether the server is live or not (it doesn't really matter to me what kind of indication is used; I guess different chars like 0 = live, 1 = not live would be preferable).
3. Based on the value returned in the last step, it either notifies the user that the server is down and quits, or proceeds to the rest of the code.
The part I'm struggling with is stage 2: how to call the client in a way that stores the value it returns, and how to actually return that value properly to the script.
I thought about using the -param option, but couldn't figure out how to use it in my scenario, where the value I'm trying to return goes to a script that is already mid-run, rather than just calling another Progress script with the value.
The code of the client that I use for checking if the server is up is:
import rpyc

# host is defined elsewhere in my script
def client_check():
    c = rpyc.connect(host, 18812)

if __name__ == "__main__":
    try:
        client_check()
    except:
        pass  # some_method_of_transferring_the_indication
As for the Progress script, as mentioned, I haven't managed to figure out the right way to call the client and store a value the way I'm trying to.
I guess I could make the server create a file to use as an indicator of its status, and check for the file at the start of the script, but I don't know if that's the right way to do it, and I'd prefer to avoid it if possible.
I am guessing that you are shelling out from the Progress script to run your rpyc script as an external process?
In that case, something along these lines will read the first line of output from that rpyc script:
define variable result as character no-undo format "x(30)".
input through value( "myrpycScript arg1 arg2" ). /* this runs your rpyc script */
import unformatted result. /* this reads the result */
input close.
display result.
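On the Python side, the client then just needs to print a single status line for that IMPORT to read. A minimal sketch of such a client (the 0/1 convention and the port are from the question; the host default is my assumption):
import rpyc

def client_check(host="localhost", port=18812):
    # Returns True if the rpyc server accepts a connection.
    try:
        conn = rpyc.connect(host, port)
        conn.close()
        return True
    except Exception:
        return False

if __name__ == "__main__":
    # Print one line for the Progress script to read: "0" = live, "1" = not live.
    print("0" if client_check() else "1")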
Summary: I have a Python script which collects tweets using the Twitter API, with a PostgreSQL database in the backend that stores all the streamed tweets. I have custom code that overcomes the rate-limit issue, and I made it run 24/7 for months.
Issue: Sometimes streaming breaks and the script sleeps for the given seconds, but that doesn't help. I don't want to have to check on it manually.
def on_error(self, status):  # tweepy method
    self.mailMeIfError(['me <me@localhost>'], 'listen.py <root@localhost>',
                       'Error Occurred in on_error method', str(status))
    time.sleep(300)
    return True
Assume mailMeIfError is a method that takes care of sending me a mail.
I want a simple cron script that always checks the process and restarts the Python script if it is not running, has errored, or has broken. I have gone through some answers on Stack Overflow where they use the process ID. In my case the process ID still exists, because this script sleeps on error.
Thanks in advance.
Using the process ID is much easier and safer. Try using watchdog.
This can all be done in your one script. Cron would need to be configured to start your script periodically, say every minute. At startup your script then just needs to determine whether it is the only copy of itself running on the machine. If it spots another copy running, it silently terminates; otherwise it continues to run.
This behaviour is called a singleton pattern. There are a number of ways to achieve it, for example Python: single instance of program.
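One common way to do that check on Linux is an exclusive lock on a well-known file: the lock vanishes automatically when the process dies, so there is no stale-PID problem. A sketch (the lock path is arbitrary):
import fcntl
import sys

# Hold an exclusive, non-blocking lock for the lifetime of the process.
# If another copy of the script already holds it, exit silently.
lock_file = open("/tmp/listen.lock", "w")
try:
    fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
except OSError:
    sys.exit(0)  # another instance is already running

# ... rest of the streaming script goes here ...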
I need you guys :D
I have a web page; on this page I check some items and pass their values as variables to a Python script.
The problem is:
I need to write a Python script, and in that script I need to put these variables into my predefined shell commands and run them.
It is one gnuplot command and one other shell command.
I have never done anything in Python; can you guys send me some advice?
Thx
I can't fully address your question due to lack of information on the web framework that you are using, but here is some advice and guidance that you should find useful. I had a similar problem that required me to run a shell program with arguments derived from user requests (I was using the Django framework (Python)).
Now there are several factors that you have to consider:
How long will each job take?
What is the load that you are expecting (are there going to be loads of jobs)?
Will there be any side effects from your shell command?
Here is an explanation of why each of these is important.
How long will each job take?
Depending on your framework and browser, there is a limit on how long a connection to the server is kept alive. In other words, you have to make sure that the time for the server to respond to a user request does not exceed the connection timeout set by the server or the browser. If it takes too long, you will get a server connection timeout, i.e. an error response because there was no reply from the server side.
What is the load that you are expecting?
You have probably figured out that if a requested job is huge, it will consume more resources than you would like. Also, if you have multiple requests at the same time, it will take a huge toll on your server. For instance, if you do proceed with using subprocess for your jobs, it will be important to note whether your job is blocking or non-blocking (see the sketch after these notes).
Side effects.
It is important to understand the side effects of your shell process. For instance, if your shell process writes and generates lots of temp files, you will have to consider the permissions that your script has. It is a complex task.
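To make the blocking vs. non-blocking point concrete, here is a generic sketch (not tied to any framework):
import subprocess

# Blocking: run() does not return until the command finishes,
# so the web request stays open for the full duration.
subprocess.run(["sleep", "5"])

# Non-blocking: Popen() returns immediately while the command keeps
# running in the background; poll() returns None until it finishes.
proc = subprocess.Popen(["sleep", "5"])
print(proc.poll())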
So how can this be resolved?
subprocess, which ships with base Python, will allow you to run shell commands from Python. If you want more sophisticated tools, check out the fabric library. For passing arguments, check out optparse and sys.argv.
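For the gnuplot part specifically, those pieces fit together roughly like this (the file names and plot commands are placeholders, not something from the question):
import subprocess
import sys

# Values passed in from the web page, e.g.: python plot.py results.dat plot.png
datafile, outfile = sys.argv[1], sys.argv[2]

# Build the gnuplot invocation with the variables substituted in.
# Passing a list (rather than one shell string) keeps user-supplied
# values from being interpreted by the shell.
script = "set terminal png; set output '%s'; plot '%s' with lines" % (outfile, datafile)
subprocess.run(["gnuplot", "-e", script], check=True)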
If you expect a huge workload or a long processing time, consider setting up a queue system for your jobs. A popular framework like celery is a good example. You may look at gevent and asyncio (Python 3) as well. Generally, instead of returning a response on the fly, you can return a job ID or a URL which the user can come back to later to have a look.
Points to note!
Permissions and security are vital! The last thing you want is for people to execute shell commands that are detrimental to your system.
You can also increase the connection timeout, depending on the framework you are using.
I hope you will find this useful
Cheers,
Biobirdman
Let's say I have a view page(request) which loads page.html.
Now, after successfully loading page.html, I want to automatically run a Python script behind the scenes 10-15 seconds after page.html has loaded. How is that possible?
Also, is it possible to show the status of the script dynamically (running / stopped / syntax error, etc.)?
Running a script from JavaScript is not a clean way to do it, because the user can close the browser, disable JS, etc. Instead you can use django-celery, which lets you run background scripts, and you can check the status of the script dynamically from a middleware. Good luck.
You could add a client-side timeout that AJAXes back to the server 10-15 seconds later. Point it at a different view and execute your script within that view. For example:
function runServerScript() {
    $.get("/yourviewurlhere", function(data) {
        // Do something with the return data
    });
}
setTimeout(runServerScript, 10000);
If you want status to be displayed, the client would have to make multiple requests back to the server.
Celery might come in handy for such use cases. You can start a task (or script, as you call them) from a view, even with a delay, as you want. Sending status reports back to the browser will be harder unless you opt for something like WebSockets, but that's highly experimental right now.
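If you go the Celery route, the 10-15 second delay maps directly onto countdown=. A minimal sketch (the task body and script name are placeholders):
# tasks.py
from celery import shared_task
import subprocess

@shared_task
def run_script():
    # Run the background script and report its exit status.
    result = subprocess.run(["python", "myscript.py"])
    return result.returncode

# In the view, after rendering page.html:
# run_script.apply_async(countdown=10)  # starts roughly 10 seconds later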