I have a Python program which dynamically moves and renames files into a Hadoop cluster. The files usually range from 10 MB (parsed) up to 1.5 GB (raw data). The move commands can take a while to finish, and from what I can tell Python races through them and none of the move commands get to finish. What is the proper way to have Python wait for the previous command? I store each command in a variable and pass it to os.system. The relevant code is:
os.system(moverawfile)
os.system(renamerawfile)
os.system(moveparsedfile)
os.system(renameparsedfile)
I know the rename commands finish basically instantaneously. Am I not supposed to use os.system? How do I ensure that Python waits for each command to finish before moving on to the next one?
I would suggest that you use run from the subprocess module, as per the Python documentation. It waits for your command to complete before returning.
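For example, a minimal sketch of the four calls above rewritten with subprocess.run (assuming moverawfile and the other variables hold full shell command strings, as in the question):

import subprocess

# shell=True because each command is stored as a full shell string;
# check=True raises CalledProcessError if a command exits non-zero
for cmd in (moverawfile, renamerawfile, moveparsedfile, renameparsedfile):
    subprocess.run(cmd, shell=True, check=True)

Each run call blocks until its command finishes, so the moves and renames complete in order.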
The title doesn't explain much, so here are the details.
I have a function that should run in a separate .py file.
I have a Python file that takes some variables and waits for an event to happen, then continues until it finishes. You can think of it as an event listener (web socket).
This file runs without producing output, just performs some functions, and closes when the event finishes. Running one file is no problem, but I want to run more than 10 of these at the same time, for different purposes but the same event, and this causes problems where some of them don't work or miss the event.
I currently do this by running 10 terminals (cmd or shell), which I think creates problems because of running this much event handling in shells. In the future I might use more than 10 files, maybe 50-100.
So what I tried:
I tried a single-file approach using threading (multithreading), and it didn't work.
My goal
I want to be able to run as many of these files as needed without missing events or slowing down the system. I am open to ideas.
concurrent.futures could be a good option; it executes a piece of code in another thread. See the documentation: https://docs.python.org/3/library/concurrent.futures.html.
The threading library that comes with Python allows you to start the same function multiple times in different threads without having to wait for them to finish. See the documentation: https://docs.python.org/3/library/threading.html.
A similar API in the multiprocessing library allows you to do the same in multiple processes. The documentation is at https://docs.python.org/3/library/multiprocessing.html. One difference is that in Python threads are virtual, all managed within a single interpreter process. With multiprocessing you start several processes, which probably has less impact on performance.
The code you want to run in a process or a thread has to be in a defined function. It seems that this code is in a separate .py file, i.e. a module, so you have to import it first (https://docs.python.org/3/tutorial/modules.html). Then one file manages the threads/processes in a loop, another holds the code listening for the event, and only one terminal is required to start them all.
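As a rough sketch of that layout, assuming the listener code lives in a module named listener.py that exposes a listen(purpose) function (both names are hypothetical):

# main.py - starts many copies of the listener in separate processes
import multiprocessing
import listener  # hypothetical module containing the event-listening code

if __name__ == "__main__":
    processes = []
    for purpose in range(10):  # scale up to 50-100 as needed
        p = multiprocessing.Process(target=listener.listen, args=(purpose,))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()  # wait for every listener to finish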
You can use multithreading.
On this page you will find some very useful examples of what you want (I recommend concurrent.futures, since hand-rolled threading code is an easy place to run into bugs):
https://realpython.com/intro-to-python-threading/#using-a-threadpoolexecutor
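A minimal ThreadPoolExecutor sketch in the spirit of that tutorial, where listen_for_event is a hypothetical stand-in for the function in your .py file:

from concurrent.futures import ThreadPoolExecutor

def listen_for_event(purpose):
    ...  # hypothetical: your event-listening logic goes here

with ThreadPoolExecutor(max_workers=10) as executor:
    executor.map(listen_for_event, range(10))  # run ten listeners concurrently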
I want to store the output of the terminal command top into a file, using Python.
In the terminal, when I type top and hit enter, I get an output that is real time, so it keeps updating. I want to store this into a file for a fixed duration and then stop writing.
import os, time

file = open("data.txt", "w")
file.flush()
os.system("top>>data.txt -n 1")
time.sleep(5)
exit()
file.close()
I have tried using time.sleep() and then exit(), but it doesn't work; the only way top can be stopped is from the terminal, with Ctrl+C.
The process keeps running and data is continuously written to the file, which, as one would guess, is not ideal.
For clarity: I know how to write the output to the file; I just want to stop writing after a fixed period.
os.system will wait for the end of the child process. If you do not want that, the Pythonic way is to use the subprocess module directly:
import subprocess

timeout = 60  # let top run for one minute
file = open("data.txt", "w")
top = subprocess.Popen(["top", "-n", "1"], stdout=file)  # args must all be strings
try:
    top.wait(timeout)  # wait at most timeout seconds
except subprocess.TimeoutExpired:
    top.terminate()  # on timeout, terminate the child
The paranoid way (which is highly recommended for robust code) would be to use the full path of top. I have not done so here, because it may depend on the actual system...
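If you want to resolve that full path at runtime instead of hard-coding it, one way is shutil.which (a sketch, not specific to any particular system):

import shutil

top_path = shutil.which("top")  # e.g. "/usr/bin/top"; None if not on PATH
if top_path is None:
    raise FileNotFoundError("top not found on PATH")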
The issue you could be facing is that os.system runs the command synchronously, so the rest of your script will not run until the command has completed execution.
I think what you want is to execute your console command on another thread, so that the thread running your Python script can continue while the command runs in the background. See run a python program on a new thread for more info.
I'd suggest something like (this is untested):
import os
import time
import multiprocessing

# note: despite the name, this runs os.system in a separate process, not a thread
myThread = multiprocessing.Process(target=os.system, args=("top>>data.txt -n 1",))
myThread.start()
time.sleep(5)           # let top write for five seconds
myThread.terminate()    # then kill the child process
That being said, you may need to consider the thread safety of os.system(); if it is not thread safe, you'll need to find an alternative that is.
Something else worth noting (and that I know little about) is that it may not be ideal to terminate threads in this way; see some of the answers here: Is there any way to kill a Thread?
I made a script which plays a video file by using subprocess.run().
import subprocess
DATA_DIR = 'path\\to\\video\\files'
MEDIA_PLAYER = 'path\\to\\my\\media-player'
# returns path of random video file
p = chooseOne(DATA_DIR)
print('playing {}'.format(p))
# runs chosen path
subprocess.run([MEDIA_PLAYER, p])
But I would like to kill the Python script running this code immediately after opening the child subprocess.
Is this possible? And if not, is there an alternative means of opening an external process using Python which would allow the script to terminate?
Note: I am using Python v3.6
Don't use subprocess.run; use os.execl instead. That makes your media player replace your Python code in the current process, rather than starting a new process.
# the argument after the path becomes the new process's argv[0],
# so the program path is passed twice
os.execl(MEDIA_PLAYER, MEDIA_PLAYER, p)
subprocess.run effectively does the same thing, but forks first so that there are temporarily two processes running your Python script; in one, subprocess.run returns without doing anything else to allow your script to continue. In the other, it immediately uses one of the os.exec* functions (there are 8 different varieties) to execute your media player. In your case, you just want the first process to exit anyway, so save the effort of forking and just use os.execl right away.
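If replacing the process is not an option for some reason, an alternative sketch is to start the player without waiting and then exit normally; this assumes the child process is allowed to outlive the parent, which is the usual behavior:

import subprocess
import sys

subprocess.Popen([MEDIA_PLAYER, p])  # Popen does not wait, unlike subprocess.run
sys.exit(0)  # the media player keeps running after the script exits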
I am using Expect for automation, and I want to execute a Python script from it. But it is not working... This is what I have tried so far:
#!/usr/bin/expect
spawn "./os_fun"
and
#!/usr/bin/expect
spawn "./os_fun.py"
and
#!/usr/bin/expect
spawn python "./os_fun(.py)"
The "os_fun.py" contains the simple code:
#!/usr/bin/python
import os
print os.getcwd()
I would also like to mention that I must use Expect only, not Bash, for the automation part, and I am not supposed to use Pexpect.
When it comes to Expect, you always have to expect something so that Expect will wait for it; otherwise it just proceeds. Simply spawning a process does not make much sense on its own, as Expect doesn't wait for it, which in turn means the user doesn't see the output either.
In your case, you just have to run the code and watch the output until the program completes. I hope my understanding is correct.
#!/usr/bin/expect
spawn python os_fun.py
expect eof; # will wait till 'eof' seen
Here, the expect command will wait until it sees the running program close.
The default timeout is 10 seconds, which can be changed like this:
set timeout 60; # timeout of one minute
I start a script, and I want to start a second one immediately after the first one completes successfully.
The problem is that the first script can take 10 minutes or 10 hours depending on the case, so I do not want to schedule the second script for a fixed start time.
Also, I am developing the script in Python, so a solution where Python controls the cron job would be fine.
Thank you,
You can use a lock file to indicate that the first script is still running.
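A minimal sketch of that idea, assuming both scripts can see a shared path such as /tmp/first_script.lock (the path and the two job functions are hypothetical placeholders):

import os
import time

LOCK = "/tmp/first_script.lock"

# --- in the first script ---
open(LOCK, "w").close()   # create the lock before starting work
try:
    run_long_job()        # hypothetical: the 10-minute-to-10-hour workload
finally:
    os.remove(LOCK)       # always release the lock, even on failure

# --- in the second script ---
while os.path.exists(LOCK):
    time.sleep(60)        # poll once a minute until the first script is done
start_second_job()        # hypothetical: the follow-up work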
You could use the atexit module. As part of the first script, you register a function; from that registered function, at exit, you can call the second script or execute it via a system call.
import atexit

def at_exit():
    print('invoke the second script')

atexit.register(at_exit)
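For instance, a sketch that actually launches the second script from the exit hook (second_script.py is a hypothetical name). Note that atexit handlers also run when the script dies with an unhandled exception, so if you only want to chain on success you need to track that yourself:

import atexit
import subprocess
import sys

def at_exit():
    # run the second script with the same interpreter
    subprocess.run([sys.executable, "second_script.py"])

atexit.register(at_exit)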