I would like to know how many processes the Linux kernel creates during a period of time, usually one minute.
My background: if too many processes get created during a minute, then something is wrong. Most of our legacy code base was moved from shell to Python, but sometimes there are still some shell scripts which are slow because they spawn a lot of processes.
I would like to create a graph from this number. Then I would like to check on which host and why so many processes got created.
I want to implement this with Python.
Answers explaining how to read this from /proc or /sys would be great.
It would be nice if the solution handled the wrap-around which happens when pid_max gets reached.
The limit (maximum number of pids) is /proc/sys/kernel/pid_max. The manual says:
/proc/sys/kernel/pid_max (since Linux 2.5.34)
This file specifies the value at which PIDs wrap around (i.e., the
value in this file is one greater than the maximum PID). The default
value for this file, 32768, results in the same range of PIDs as on
earlier kernels.
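For reference, a quick way to read that limit from Python (a trivial sketch, assuming a standard procfs layout):
# read the kernel's PID wrap-around value; 32768 is the default unless the
# administrator has raised it
with open("/proc/sys/kernel/pid_max") as f:
    pid_max = int(f.read())
print("pid_max is {}".format(pid_max))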
Check /proc/stat: there is a processes field which counts the number of forks since boot (see the proc(5) documentation):
$ grep processes /proc/stat
processes 81579558
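Since that counter only ever increases (it counts forks since boot), PID wrap-around does not affect it, and the per-minute number is just the difference between two readings. A minimal polling sketch (the helper name is my own) could look like this:
import time

def forks_since_boot():
    # the "processes" line in /proc/stat holds the total number of
    # forks (processes/threads created) since the system booted
    with open("/proc/stat") as f:
        for line in f:
            if line.startswith("processes"):
                return int(line.split()[1])
    raise RuntimeError("no 'processes' line found in /proc/stat")

previous = forks_since_boot()
while True:
    time.sleep(60)
    current = forks_since_boot()
    print("processes created in the last minute: {}".format(current - previous))
    previous = current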
I use this under Windows, but maybe you can use it as a starting point:
>>> import subprocess
>>> subprocess.Popen('tasklist')
<subprocess.Popen object at 0x00000268164C3CC0>
>>>
Image Name                     PID Session Name        Session#    Mem Usage
========================= ======== ================ =========== ============
This will give you a table which you can capture with
subprocess.Popen('tasklist', stdout=subprocess.PIPE).communicate()[0]; just count the lines and you'll get the current number of processes. Do it again in 1 minute and see what's changed.
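A rough sketch of that counting idea (Windows only; the offset of 3 for the blank line, header row and separator line is an assumption and may differ on your system):
import subprocess
import time

def count_processes():
    out = subprocess.Popen('tasklist', stdout=subprocess.PIPE).communicate()[0]
    # skip the leading blank line, the header row and the ===== separator
    return len(out.splitlines()) - 3

before = count_processes()
time.sleep(60)
after = count_processes()
print("difference after one minute: {}".format(after - before))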
Instead of manually looking at /proc or /sys, let Linux do it for you:
import subprocess
from time import sleep

# take an initial snapshot of all running PIDs (skip the "PID" header line)
ps = subprocess.Popen(["ps", "-A", "-o", "pid"], stdout=subprocess.PIPE)
pids = [int(x) for x in ps.communicate()[0].split()[1:]]

new_pids_count = 0
elapsed = 0
while elapsed < 60:
    ps = subprocess.Popen(["ps", "-A", "-o", "pid"], stdout=subprocess.PIPE)
    output = [int(x) for x in ps.communicate()[0].split()[1:]]
    for x in output:
        if x not in pids:       # a PID we have not seen before
            new_pids_count += 1
            pids.append(x)
    elapsed += 1
    sleep(1)
Initially, I get all the currently running PIDs, using ps -A -o pid, and put them all in a list.
I repeat this every second to check for newly spawned processes, by comparing the result of running it again against the growing pids list.
Related
I am using multiprocessing to calculate a large mass of data; i.e. I periodically spawn a process so that the total number of processes is equal to the number of CPUs on my machine.
I periodically print out the progress of the entire calculation... but this is inconveniently interspersed with Python's welcome messages from each child!
To be clear, this is a Windows specific problem due to how multiprocessing is handled.
E.g.
> python -q my_script.py
Python Version: 3.7.7 on Windows
Then many subsequent duplicates of the same version message print; one for each child process.
How can I suppress these?
I understand that if you run Python on the command line with a -q flag, it suppresses the welcome message; though I don't know how to translate that into my script.
EDIT:
I tried to include the interpreter flag -q like so:
multiprocessing.set_executable(sys.executable + ' -q')
But to no avail: I receive a FileNotFoundError, which tells me I cannot pass options this way because of how the arguments are checked.
Anyway, here is the relevant section of code (it's an entire function):
def _parallelize(self, buffer, func, cpus):
    ## Number of Parallel Processes ##
    cpus_max = mp.cpu_count()
    cpus = min(cpus_max, cpus) if cpus else int(0.75 * cpus_max)
    ## Total Processes to-do ##
    N = ceil(self.SampleLength / DATA_MAX)  # Number of Child Processes
    print("N: ", N)
    q = mp.Queue()  # Child Process results Queue
    ## Initialize each CPU w/ a Process ##
    for p in range(min(cpus, N)):
        mp.Process(target=func, args=(p, q)).start()
    ## Collect Validation & Start Remaining Processes ##
    for p in tqdm(range(N)):
        n, data = q.get()               # Collects a Result
        i = n * DATA_MAX                # Shifts to Proper Interval
        buffer[i:i + len(data)] = data  # Writes to open HDF5 file
        if p < N - cpus:                # Starts a new Process
            mp.Process(target=func, args=(p + cpus, q)).start()
SECOND EDIT:
I should probably mention that I'm doing everything within an anaconda environment.
The message is printed on interactive startup.
A spawned process does inherit some flags from the parent process.
But looking at the code in multiprocessing it does not seem possible to change these parameters from within the program.
So the easiest way to get rid of the messages should be to add the -q option to the original python invocation that starts your program.
I have confirmed that the -q flag is inherited.
So that should suppress the message for the original process and the children that it spawns.
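A quick way to check that yourself (a small sketch; run it once as python script.py and once as python -q script.py and compare the output):
import multiprocessing as mp
import sys

def report():
    # on Windows each child re-launches the interpreter; multiprocessing
    # passes the parent's interpreter flags along, so -q should show up here
    print(mp.current_process().name, "quiet flag:", sys.flags.quiet)

if __name__ == "__main__":
    report()
    p = mp.Process(target=report)
    p.start()
    p.join()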
Edit:
If you look at the implementation of set_executable, you will see that you cannot add or change arguments that way. :-(
Edit2:
You wrote:
I'm doing everything within an anaconda environment.
Do you mean a virtual environment, or some kind of fancy IDE like Spyder?
If you ever have a Python problem, first try reproducing it in plain CPython, running from the command line. IDE's and fancy environments like anaconda sometimes do weird things when running Python.
I'm running Python 2.7 on the GCE platform to do calculations. The GCE instances boot, install various packages, copy 80 GB of data from a storage bucket and run a "workermaster.py" script with nohup. The workermaster runs in an infinite loop which checks a task-queue bucket for tasks. When the task bucket isn't empty it picks a random file (task) and passes work to a calculation module. If there is nothing to do the workermaster sleeps for a number of seconds and checks the task-list again. The workermaster runs continuously until the instance is terminated (or something breaks!).
Currently this works quite well, but my problem is that my code only runs instances with a single CPU. If I want to scale up calculations I have to create many identical single-CPU instances, and this means there is a large cost overhead for creating many 80 GB disks and transferring the data to them each time, even though each calculation only "reads" one small portion of the data. I want to make everything more efficient and cost effective by making my workermaster capable of using multiple CPUs, but after reading many tutorials and other questions on SO I'm completely confused.
I thought I could just turn the important part of my workermaster code into a function, and then create a pool of processes that "call" it using the multiprocessing module. Once the workermaster loop is running on each CPU, the processes do not need to interact with each other or depend on each other in any way, they just happen to be running on the same instance. The workermaster prints out information about where it is in the calculation and I'm also confused about how it will be possible to tell the "print" statements from each process apart, but I guess that's a few steps from where I am now! My problems/confusion are that:
1) My workermaster "def" doesn't return any value because it just starts an infinite loop, whereas every web example seems to have something in the format myresult = pool.map(.....); and
2) My workermaster "def" doesn't need any arguments/inputs - it just runs, whereas the examples of multiprocessing that I have seen on SO and on the Python Docs seem to have iterables.
In case it is important, the simplified version of the workermaster code is:
# module imports are here
# filepath definitions go here

def workermaster():
    while True:
        tasklist = cloudstoragefunctions.getbucketfiles('<my-task-queue-bucket>')
        if tasklist:
            tasknumber = random.randint(2, len(tasklist))
            assignedtask = tasklist[tasknumber]
            print 'Assigned task is now: ' + assignedtask
            subprocess.call('gsutil -q cp gs://<my-task-queue-bucket>/' + assignedtask + ' "' + taskfilepath + assignedtask + '"', shell=True)
            tasktype = assignedtask.split('#')[0]
            if tasktype == 'Calculation':
                currentcalcid = assignedtask.split('#')[1]
                currentfilenumber = assignedtask.split('#')[2].replace('part', '')
                currentstartfile = assignedtask.split('#')[3]
                currentendfile = assignedtask.split('#')[4].replace('.csv', '')
                calcmodule.docalc(currentcalcid, currentfilenumber, currentstartfile, currentendfile)
            elif tasktype == 'Analysis':
                pass  # set up and run analysis module, etc.
            print ' Operation completed!'
            os.remove(taskfilepath + assignedtask)
        else:
            print 'There are no tasks to be processed. Going to sleep...'
            time.sleep(30)
I'm trying to "call" the function multiple times using the multiprocessing module. I think I need to use the "pool" method, so I've tried this:
import multiprocessing

if __name__ == "__main__":
    p = multiprocessing.Pool()
    pool_output = p.map(workermaster, [])
My understanding from the docs is that the __name__ line is there only as a workaround for doing multiprocessing in Windows (which I am doing for development, but GCE is on Linux). The p = multiprocessing.Pool() line creates a pool of workers equal to the number of system CPUs, as no argument is specified. If the number of CPUs was 1, then I would expect the code to behave as it did before I attempted to use multiprocessing. The last line is the one that I don't understand. I thought that it was telling each of the processors in the pool that the "target" (thing to run) is workermaster. From the docs there appears to be a compulsory argument which is an iterable, but I don't really understand what this is in my case, as workermaster doesn't take any arguments. I've tried passing it an empty list, an empty string and empty brackets (a tuple?) and it doesn't do anything.
Please would it be possible for someone to help me out? There are lots of discussions about using multiprocessing, and this thread Mulitprocess Pools with different functions and this one python code with mulitprocessing only spawns one process each time seem to be close to what I am doing, but they still have iterables as arguments. If there is anything critical that I have left out please advise and I will modify my post - thank you to anyone who can help!
Pool() is useful if you want to run the same function with different arguments.
If you want to run the function only once then use a normal Process().
If you want to run the same function 2 times then you can manually create 2 Process() objects.
If you want to use Pool() to run the function 2 times, then add a list with 2 arguments (even if you don't need arguments), because that is the information Pool() uses to run it 2 times.
But if you run the function 2 times with the same folder then it may run the same task 2 times; if you run it 5 times then it may run the same task 5 times. I don't know if that is what you need.
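For example, something like this sketch (it reuses the workermaster name from the question; the wrapper is only there because Pool.map insists on passing one argument to the function):
import multiprocessing

def workermaster_wrapper(dummy):
    workermaster()  # ignore the argument, just run the infinite loop

if __name__ == "__main__":
    # option 1: one plain Process per CPU, no arguments needed
    procs = [multiprocessing.Process(target=workermaster)
             for _ in range(multiprocessing.cpu_count())]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    # option 2: a Pool runs the function once per item of the iterable
    # p = multiprocessing.Pool(4)
    # p.map(workermaster_wrapper, range(4))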
As for Ctrl+C, I found Catch Ctrl+C / SIGINT and exit multiprocesses gracefully in python on Stack Overflow, but I don't know if it resolves your problem.
I am a novice user of Python multithreading/multiprocessing, so please bear with me.
I would like to solve the following problem and I need some help/suggestions in this regard.
Let me describe in brief:
I would like to start a Python script which does something sequentially in the beginning.
After the sequential part is over, I would like to start some jobs in parallel.
Assume that there are four parallel jobs I want to start.
I would like to also start these jobs on some other machines using "lsf" on the computing cluster. My initial script is also running on an "lsf" machine.
The four jobs which I start on the four machines will perform two logical steps, A and B, one after the other.
When a job is started initially, it starts with logical step A and finishes it.
After every one of the four jobs has finished step A, they should notify the first job which started them. In other words, the main job which started them is waiting for confirmation from these four jobs.
Once the main job receives confirmation from these four jobs, it should notify all four jobs to do logical step B.
Logical step B will automatically terminate the jobs after finishing the task.
The main job waits for all the jobs to finish and later on it should continue with the sequential part.
An example scenario would be:
A Python script running on an "lsf" machine in the cluster starts four "tcl shells" on four "lsf" machines.
In each tcl shell, a script is sourced to do logical step A.
Once step A is done, they should somehow inform the Python script, which is waiting for the acknowledgement.
Once the acknowledgement is received from all four, the Python script informs them to do logical step B.
Logical step B is also a script which is sourced in their tcl shell; this script will also close the tcl shell at the end.
Meanwhile, the Python script is waiting for all four jobs to finish.
After all four jobs are finished, it should continue with the sequential part again and finish later on.
Here are my questions:
I am confused about whether I should use multithreading or multiprocessing. Which one suits this better?
In fact, what is the difference between the two? I read about them but I wasn't able to reach a conclusion.
What is the Python GIL? I also read somewhere that at any one point in time only one thread will execute.
I need some explanation here. It gives me the impression that I can't use threads.
Any suggestions on how I could solve my problem systematically and in a more Pythonic way?
I am looking for some verbal step by step explanation and some pointers to read on each step.
Once the concepts are clear, I would like to code it myself.
Thanks in advance.
In addition to roganjosh's answer, I would include some signaling to start step B after A has finished:
import multiprocessing as mp
import time
import random
import sys

def func_A(process_number, queue, proceed):
    print "Process {} has been created".format(process_number)
    print "Process {} has ended step A".format(process_number)
    sys.stdout.flush()
    queue.put((process_number, "done"))
    proceed.wait()  # wait for the signal to do the second part
    print "Process {} has ended step B".format(process_number)
    sys.stdout.flush()

def multiproc_master():
    queue = mp.Queue()
    proceed = mp.Event()
    processes = [mp.Process(target=func_A, args=(x, queue, proceed)) for x in range(4)]
    for p in processes:
        p.start()
    # block=True waits until there is something available
    results = [queue.get(block=True) for p in processes]
    proceed.set()  # set the continue-flag
    for p in processes:  # wait for all to finish (also on Windows)
        p.join()
    return results

if __name__ == '__main__':
    split_jobs = multiproc_master()
    print split_jobs
1) From the options you listed in your question, you should probably use multiprocessing in this case to leverage multiple CPU cores and compute things in parallel.
2) Going further from point 1: the Global Interpreter Lock (GIL) means that only one thread can actually execute code at any one time.
A simple example for multithreading that pops up often here is having a prompt for user input for, say, an answer to a maths problem. In the background, they want a timer to keep incrementing at one second intervals to register how long the person took to respond. Without multithreading, the program would block whilst waiting for user input and the counter would not increment. In this case, you could have the counter and the input prompt run on different threads so that they appear to be running at the same time. In reality, both threads are sharing the same CPU resource and are constantly passing an object backwards and forwards (the GIL) to grant them individual access to the CPU. This is hopeless if you want to properly process things in parallel. (Note: In reality, you'd just record the time before and after the prompt and calculate the difference rather than bothering with threads.)
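Purely to illustrate that prompt-plus-timer idea, a small sketch (as noted above, in practice you would just time the prompt directly):
import threading
import time

def ticker(stop_event):
    # counts seconds in the background until told to stop
    seconds = 0
    while not stop_event.is_set():
        time.sleep(1)
        seconds += 1
    print("You took roughly {} seconds".format(seconds))

stop_event = threading.Event()
timer_thread = threading.Thread(target=ticker, args=(stop_event,))
timer_thread.start()
answer = raw_input("What is 7 * 8? ")  # use input() on Python 3
stop_event.set()
timer_thread.join()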
3) I have made a really simple example using multiprocessing. In this case, I spawn 4 processes that compute the sum of squares for a randomly chosen range. These processes do not have a shared GIL and therefore execute independently unlike multithreading. In this example, you can see that all processes start and end at slightly different times, but we can aggregate the results of the processes into a single queue object. The parent process will wait for all 4 child processes to return their computations before moving on. You could then repeat the code for func_B (not included in the code).
import multiprocessing as mp
import time
import random
import sys

def func_A(process_number, queue):
    start = time.time()
    print "Process {} has started at {}".format(process_number, start)
    sys.stdout.flush()
    my_calc = sum([x**2 for x in xrange(random.randint(1000000, 3000000))])
    end = time.time()
    print "Process {} has ended at {}".format(process_number, end)
    sys.stdout.flush()
    queue.put((process_number, my_calc))

def multiproc_master():
    queue = mp.Queue()
    processes = [mp.Process(target=func_A, args=(x, queue)) for x in xrange(4)]
    for p in processes:
        p.start()
    # Uncomment the below if you run on Linux (Windows and Linux treat
    # multiprocessing differently, as Windows lacks os.fork())
    #for p in processes:
    #    p.join()
    results = [queue.get() for p in processes]
    return results

if __name__ == '__main__':
    split_jobs = multiproc_master()
    print split_jobs
With the Raspberry Pi system I have to synchronize a Raspbian system command (raspivid -t 20000) with a while loop that continuously reads from a sensor and stores samples in an array. The Raspbian command starts a video recording with the Raspberry Pi camera CSI module, and I have to be sure that it starts at the same instant as the acquisition by the sensor. I have seen many solutions that have confused me, involving modules like multiprocessing, threading, subprocess, etc. So far the only thing that I have understood is that the os.system() function blocks execution of the following Python commands in the script as long as it runs. So if I try:
import os
import numpy as np

os.system("raspivid -t 20000 /home/pi/test.h264")

# memory pre-allocation, supposing I have to save 20000 samples from the
# sensor (1 for each millisecond of the video)
data = np.zeros(20000, dtype="float")
indx = 0
while True:
    # readbysensor() is defined earlier in the script and reads a sample from the sensor
    sens = readbysensor()
    data[indx] = sens
    if indx == 19999:
        break
    else:
        indx += 1
then the while-loop will run only after the os.system() call has finished. But as I wrote above, I need the two processes to be synchronized and to work in parallel. Any suggestion?
Just add an & at the end, to make the process detach to the background:
os.system("raspivid -t 20000 /home/pi/test.h264 &")
According to bash man pages:
If a command is terminated by the control operator &, the shell
executes the command in the background in a subshell. The shell does
not wait for the command to finish, and the return status is 0.
Also, if you want to minimize the time it takes for the loop to start after executing raspivid, you should allocate your data and indx prior to the call:
data = np.zeros(20000, dtype="float")
indx = 0

os.system("raspivid -t 20000 /home/pi/test.h264 &")

while True:
    # ....
Update:
Since we discussed further in the comments, it is clear that there is really no need to start the loop "at the same time" as raspivid (whatever that might mean), because if you are trying to read data from the I2C and make sure you don't miss any data, you will be best off starting the reading operation prior to running raspivid. This way you are certain that in the meantime (however big a delay there is between those two executions) you are not missing any data.
Taking this into consideration, your code could look something like this:
data = np.zeros(20000, dtype="float")
indx = 0

os.system("(sleep 1; raspivid -t 20000 /home/pi/test.h264) &")

while True:
    # ....
This is the simplest version in which we add a delay of 1 second before running raspivid, so we have time to enter our while loop and start waiting for I2C data.
This works, but it is hardly production-quality code. For a better solution, run the data acquisition function in one thread and raspivid in a second thread, preserving the launch order (the reading thread is started first).
Something like this:
import Queue
import threading
import os

# we will store all data in a Queue so we can process
# it at a custom speed, without blocking the reading
q = Queue.Queue()

# thread for getting the data from the sensor
# it puts the data in a Queue for processing
def get_data(q):
    for cnt in xrange(20000):
        # assuming readbysensor() is a
        # blocking function
        sens = readbysensor()
        q.put(sens)

# thread for processing the results
def process_data(q):
    for cnt in xrange(20000):
        data = q.get()
        # do something with data here
        q.task_done()

t_get = threading.Thread(target=get_data, args=(q,))
t_process = threading.Thread(target=process_data, args=(q,))
t_get.start()
t_process.start()

# when everything is set and ready, run the raspivid
os.system("raspivid -t 20000 /home/pi/test.h264 &")

# wait for the threads to finish
t_get.join()
t_process.join()

# at this point all processing is completed
print "We are all done!"
You could rewrite your code as:
import subprocess
import numpy as np
n = 20000
p = subprocess.Popen(["raspivid", "-t", str(n), "/home/pi/test.h264"])
data = np.fromiter(iter(readbysensor, None), dtype=float, count=n)
subprocess.Popen() returns immediately, without waiting for raspivid to end.
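If you also need to be sure the recording itself has finished before you post-process the samples, a small hedged addition:
# the sensor samples have already been collected at this point;
# this just blocks until raspivid exits, so the video file is complete
p.wait()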
Here is my code for a simple multiprocessing task in Python:
from multiprocessing import Process

def myfunc(num):
    tmp = num * num
    print 'squared O/P will be ', tmp
    return(tmp)

a = [i**3 for i in range(5)]  ## just defining a list
task = [Process(target=myfunc, args=(i,)) for i in a]  ## creating processes

for each in task: each.start()  # starting processes <------ problem line
for each in task: each.join()   # waiting for all to finish up
When I run this code, it hangs at a certain point, so to identify the problem I ran it line by line in the Python shell and found that when I call each.start(), the shell pops up a dialogue box saying:
"The program is still running, do you want to kill it?"
and if I select 'yes' the shell closes.
When I replace Process with threading.Thread, the same code runs, but with this nonsense output:
Squared Squared Squared Squared Squared 0 1491625
36496481
Is there any help in this regard? Thanks in advance.
To run my Python code I use the IdleX IDE, which I start from a terminal.
I have an Intel Xeon processor with 4 cores / 8 threads, and 8 GB RAM.
With a little thought I finally found the problem.
This is happening because in Python, the float and int objects are not 'thread-safe', meaning the memory allocated to calculate any function's value by one thread/process can be overwritten by another and hence they show absurd values. This is called a race condition.
To solve this problem, use deque() from the collections module or, even better, use the 'Lock' facility. deque() works with arrays but it's meant for arrays of the same kind (much like MATLAB arrays) and is thread/process safe. 'Lock' avoids race conditions.
So the edit would be:
def myfunc(num):
    lock.acquire()
    .......some code .....
    .......some code......
    lock.release()
That's all.
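For what it's worth, a runnable sketch of that Lock idea with multiprocessing (here the lock only serialises the print calls so their output does not interleave; the __main__ guard also matters on Windows and inside IDEs):
from multiprocessing import Process, Lock

def myfunc(lock, num):
    tmp = num * num
    lock.acquire()
    try:
        print 'squared O/P will be {}'.format(tmp)
    finally:
        lock.release()

if __name__ == '__main__':
    lock = Lock()
    a = [i**3 for i in range(5)]
    task = [Process(target=myfunc, args=(lock, i)) for i in a]
    for each in task:
        each.start()
    for each in task:
        each.join()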
But one problem still persists and that is with the multiprocessing module. Even after calling 'lock', the problem mentioned in the question remains.
Save the code above into a .py file and then run it in a gnome-terminal with
python myfile.py
Where "myfile.py" is the filename you saved to.
I would assume that the IDE you are using is confused somehow by Process()