Newbie with Azure Batch here.
I have created a pipeline in ADF to run a Python script that calls an API and does some data manipulation. I call this script through a "Custom" activity that runs on Azure Batch.
I tested this locally, and the Python script itself runs and finishes perfectly. Through Azure I can see that the script has run, given the CPU usage on the database. However, the script never finishes, so the pipeline never moves on to the next step:
The script code:
The script does a cURL POST request to an interface that runs some SQL in the background. This should take roughly 15 minutes, and it finishes correctly when run locally.
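The original snippet isn't reproduced above; as a placeholder, here is a minimal sketch of the kind of call described, written with requests rather than shelling out to curl. The URL, payload, and timeout are invented, not the asker's actual values:

    # Hypothetical reconstruction of the POST described above.
    # Endpoint and payload are placeholders.
    import requests

    resp = requests.post(
        "https://example.com/run-sql",    # placeholder endpoint
        json={"job": "daily-refresh"},    # placeholder payload
        timeout=30 * 60,                  # the SQL takes ~15 min; allow 30
    )
    resp.raise_for_status()
    print(resp.status_code, flush=True)   # flush so Batch captures the output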
Looking at my pool in Azure Batch, I can see that I have one node in a running state at 100% CPU.
After 12 hours, I get this message from ADF:
How can I troubleshoot what is going on? Why is my script not finishing?
I regularly have Python scripts that take 8+ hours to complete, which I want to run on a remote server. However, I don't want the hassle of setting up a server, setting up an environment, running the script, and shutting the server down after the script is done, every single time.
Ideally, I'm looking for a CLI product, like Heroku, that spins up a server, runs the script in an environment, and shuts the server down once the script is done.
AWS Lambda functions sound close to what I'm looking for, but they have a runtime limit. Are there other solutions that fit these criteria?
Thanks!
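For concreteness, one do-it-yourself pattern that matches the spin-up/run/shut-down description can be sketched with boto3. Every ID and path below is a placeholder, and it assumes an AMI with Python preinstalled; with InstanceInitiatedShutdownBehavior set to "terminate", the instance removes itself when the script's final shutdown runs:

    # Sketch: launch an EC2 instance that runs one script, then terminates itself.
    # AMI ID, instance type, region, and script path are all placeholders.
    import boto3

    ec2 = boto3.client("ec2", region_name="eu-west-1")
    ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder AMI
        InstanceType="t3.large",
        MinCount=1,
        MaxCount=1,
        # terminate (not just stop) when the machine shuts itself down
        InstanceInitiatedShutdownBehavior="terminate",
        UserData="""#!/bin/bash
    python3 /opt/jobs/long_job.py
    shutdown -h now
    """,
    )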
I have a project in which one of the tests consists of running a process indefinitely in order to collect data on the program's execution.
It's a Python script that runs locally on a Linux machine, but I'd like other people on my team to have access to it as well, because there are specific moments when the process needs to be restarted.
Is there a way to set up a workflow on this machine that stops and restarts the process when dispatched?
You can execute commands on your Linux host via GH Actions and SSH. Take a look at this action.
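As an illustration (not necessarily the action linked above), a manually dispatched workflow using appleboy/ssh-action might look like the sketch below. The secrets, the script path, and the process name are placeholders:

    name: restart-collector
    on: workflow_dispatch        # adds a "Run workflow" button in the Actions tab
    jobs:
      restart:
        runs-on: ubuntu-latest
        steps:
          - uses: appleboy/ssh-action@master   # pin a released tag in practice
            with:
              host: ${{ secrets.SSH_HOST }}
              username: ${{ secrets.SSH_USER }}
              key: ${{ secrets.SSH_KEY }}
              script: |
                pkill -f collector.py || true
                nohup python3 /opt/collector/collector.py >/tmp/collector.log 2>&1 &

Anyone on the team with access to the repository can then trigger the restart from the Actions tab, without needing their own SSH credentials to the machine.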
I have a Python program that runs perfectly as a standalone program.
Time taken: 5 days.
I dockerized the program and executed it with 10% of the dataset.
The container runs and the program executes successfully.
When I use the full dataset (108K records) and build and run the new image:
The container runs for 4 hours and logs each step perfectly.
After 4 hours, no more logging is done.
When I inspect with htop, no resources are being used.
[htop screenshot: system resource use]
docker stats shows it is not using any resources.
[docker stats screenshot]
docker ps shows the container is still running.
[docker ps screenshot]
Kindly let me know what I am doing wrong.
Does Docker have any limits on running a program or logging data?
Are you running Docker directly on Linux, or on OSX/Windows? If the latter, you might be hitting memory limits.
If you're running in the cloud (AWS...), check that the machine has no expiry or anything like that. I recommend trying to run it locally first.
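One quick way to test the memory-limit hypothesis, once the container has stopped, is to check whether it was OOM-killed. The same State fields are visible from the CLI with docker inspect; the sketch below uses the Docker SDK for Python (pip install docker), and the container name is a placeholder:

    # Check whether the container was killed by the OOM killer.
    # "my-job" is a placeholder; use the name or ID from docker ps -a.
    import docker

    client = docker.from_env()
    container = client.containers.get("my-job")
    state = container.attrs["State"]          # same data as docker inspect
    print("OOMKilled:", state["OOMKilled"])
    print("ExitCode:", state["ExitCode"])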
I'm running a Python script using the Task Scheduler. The script gets data from a database using pandas and stores it in csv.gz files.
It runs properly, but there is a difference in data size between when the Task Scheduler runs it automatically and when it is run manually via the Task Scheduler's Run button. The manual run gets all the data, whereas the automated run gets less data.
It runs the same script; I've checked multiple times, but I'm unable to identify the issue.
PS:
Using Windows Server 2008, with pymssql to connect to the database
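Scheduled and manual runs often execute under a different account and working directory, so one first step could be to log both, plus the row count, on every run and compare. A minimal sketch; the log path, format, and the df variable are placeholders for whatever the script already uses:

    # Log who runs the script, from where, and how much data it fetched,
    # so scheduled and manual runs can be compared side by side.
    import getpass
    import logging
    import os

    logging.basicConfig(filename=r"C:\jobs\export_run.log", level=logging.INFO,
                        format="%(asctime)s %(message)s")
    logging.info("user=%s cwd=%s", getpass.getuser(), os.getcwd())

    # ...after the pandas query has produced df...
    # logging.info("rows fetched: %d", len(df))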
I have written a Python script for Twitter automation using Tweepy. When I run it on my own Linux machine as python file.py, it runs successfully, and it keeps on running because I have specified repeated tasks inside the script, which I don't want to stop. But since it is on my local machine, the script might get stopped when my internet connection is off or at night, so I can't keep the script running on my PC all day.
Is there any way, website, or method by which I could deploy my script and have it execute there forever? I have heard of cron jobs in cPanel, which can help with repeated tasks, but in my case I want my script to keep running on the machine until I close it.
Are there any such solutions? Most Twitter bots I see run forever, meaning their scripts are being executed somewhere 24x7. That's what I want to know: how is that possible?
As mentioned by Jon and Vincent, it's better to run the code from a cloud service. But either way, I think what you're looking for is a way to keep the code running even after you close the terminal. This is what worked for me:
nohup python code.py &    # keeps running after the terminal closes; output is appended to nohup.out by default
You can add a systemd .service file (a minimal sketch follows the list), which can have the added benefits of:
logging (compressed logs in a central place, or over the network to a log server)
disallowing access to /tmp and /home directories
restarting the service if it fails
starting the service at boot
setting capabilities (see setcap/getcap), for instance disallowing file access if the process only needs network access
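A minimal sketch of such a unit, assuming the bot script from the question lives at /opt/bot/code.py (the name and all paths are placeholders). Save it as /etc/systemd/system/twitterbot.service, then run systemctl enable --now twitterbot:

    [Unit]
    Description=Twitter automation bot
    After=network-online.target
    Wants=network-online.target

    [Service]
    ExecStart=/usr/bin/python3 /opt/bot/code.py
    # restart the service if it fails
    Restart=on-failure
    # the service gets its own private /tmp, and /home is made inaccessible
    PrivateTmp=true
    ProtectHome=true
    NoNewPrivileges=true
    # stdout/stderr go to the (compressed, centrally collected) journal;
    # view with: journalctl -u twitterbot
    StandardOutput=journal
    StandardError=journal
    # capabilities can also be restricted, e.g. with CapabilityBoundingSet=

    [Install]
    # start at boot, once enabled with: systemctl enable twitterbot
    WantedBy=multi-user.target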