I have a python generator that returns lots of items, for example:
import itertools
def generate_random_strings():
chars = "ABCDEFGH"
for item in itertools.product(chars, repeat=10):
yield "".join(item)
I then iterate over this and perform various tasks, the issue is that I'm only using one thread/process for this:
my_strings = generate_random_strings()
for string in my_strings:
# do something with string...
print(string)
This works great, I'm getting all my strings, but it's slow. I would like to harness the power of Python multiprocessing to "divide and conquer" this for loop. However, of course, I want each string to be processed only once. While I've found much documentation on multiprocessing, I'm trying to find the most simple solution for this with the least amount of code.
I'm assuming each thread should take a big chunk of items every time and process them before coming back and getting another big chunk etc...
Many thanks,
Most simple solution with least code? multiprocessing context manager.
I assume that you can put "do something with string" into a function called "do_something"
from multiprocessing import Pool as ProcessPool
number_of_processes = 4
with ProcessPool(number_of_processes) as pool:
pool.map(do_something, my_strings)
If you want to get the results of "do_something" back again, easy!
with ProcessPool(number_of_processes) as pool:
results = pool.map(do_something, my_strings)
You'll get them in a list.
Multiprocessing.dummy is a syntactic wrapper for process pools that lets you use the multiprocessing syntax. If you want threads instead of processes, just do this:
from multiprocessing.dummy import Pool as ThreadPool
You may use multiprocessing.
import multiprocessing
def string_fun(string):
# do something with string...
print(string)
my_strings = generate_random_strings()
num_of_threads = 7
pool = multiprocessing.Pool(num_of_threads)
pool.map(string_fun, my_strings)
Assuming you're using the lastest version of Python, you may want to read something about asyncio module. Multithreading is not easy to implement due to GIL lock: "In CPython, the global interpreter lock, or GIL, is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecodes at once. This lock is necessary mainly because CPython's memory management is not thread-safe."
So you can swap on Multiprocessing, or, as reported above, take a look at asycio module.
asyncio — Asynchronous I/O > https://docs.python.org/3/library/asyncio.html
I'll integrate this answer with some code as soon as possible.
Hope it helps,
Hele
As #Hele mentioned, asyncio is best of all, here is an example
Code
#!/usr/bin/python3
# -*- coding: utf-8 -*-
# python 3.7.2
from asyncio import ensure_future, gather, run
import random
alphabet = 'ABCDEFGH'
size = 1000
async def generate():
tasks = list()
result = None
for el in range(1, size):
task = ensure_future(generate_one())
tasks.append(task)
result = await gather(*tasks)
return list(set(result))
async def generate_one():
return ''.join(random.choice(alphabet) for i in range(8))
if __name__ == '__main__':
my_strings = run(generate())
print(my_strings)
Output
['CHABCGDD', 'ACBGAFEB', ...
Of course, you need to improve generate_one, this variant is very slow.
You can see source code here.
Related
Suppose I have the following in Python
# A loop
for i in range(10000):
Do Task A
# B loop
for i in range(10000):
Do Task B
How do I run these loops simultaneously in Python?
If you want concurrency, here's a very simple example:
from multiprocessing import Process
def loop_a():
while 1:
print("a")
def loop_b():
while 1:
print("b")
if __name__ == '__main__':
Process(target=loop_a).start()
Process(target=loop_b).start()
This is just the most basic example I could think of. Be sure to read http://docs.python.org/library/multiprocessing.html to understand what's happening.
If you want to send data back to the program, I'd recommend using a Queue (which in my experience is easiest to use).
You can use a thread instead if you don't mind the global interpreter lock. Processes are more expensive to instantiate but they offer true concurrency.
There are many possible options for what you wanted:
use loop
As many people have pointed out, this is the simplest way.
for i in xrange(10000):
# use xrange instead of range
taskA()
taskB()
Merits: easy to understand and use, no extra library needed.
Drawbacks: taskB must be done after taskA, or otherwise. They can't be running simultaneously.
multiprocess
Another thought would be: run two processes at the same time, python provides multiprocess library, the following is a simple example:
from multiprocessing import Process
p1 = Process(target=taskA, args=(*args, **kwargs))
p2 = Process(target=taskB, args=(*args, **kwargs))
p1.start()
p2.start()
merits: task can be run simultaneously in the background, you can control tasks(end, stop them etc), tasks can exchange data, can be synchronized if they compete the same resources etc.
drawbacks: too heavy!OS will frequently switch between them, they have their own data space even if data is redundant. If you have a lot tasks (say 100 or more), it's not what you want.
threading
threading is like process, just lightweight. check out this post. Their usage is quite similar:
import threading
p1 = threading.Thread(target=taskA, args=(*args, **kwargs))
p2 = threading.Thread(target=taskB, args=(*args, **kwargs))
p1.start()
p2.start()
coroutines
libraries like greenlet and gevent provides something called coroutines, which is supposed to be faster than threading. No examples provided, please google how to use them if you're interested.
merits: more flexible and lightweight
drawbacks: extra library needed, learning curve.
Why do you want to run the two processes at the same time? Is it because you think they will go faster (there is a good chance that they wont). Why not run the tasks in the same loop, e.g.
for i in range(10000):
doTaskA()
doTaskB()
The obvious answer to your question is to use threads - see the python threading module. However threading is a big subject and has many pitfalls, so read up on it before you go down that route.
Alternatively you could run the tasks in separate proccesses, using the python multiprocessing module. If both tasks are CPU intensive this will make better use of multiple cores on your computer.
There are other options such as coroutines, stackless tasklets, greenlets, CSP etc, but Without knowing more about Task A and Task B and why they need to be run at the same time it is impossible to give a more specific answer.
from threading import Thread
def loopA():
for i in range(10000):
#Do task A
def loopB():
for i in range(10000):
#Do task B
threadA = Thread(target = loopA)
threadB = Thread(target = loobB)
threadA.run()
threadB.run()
# Do work indepedent of loopA and loopB
threadA.join()
threadB.join()
You could use threading or multiprocessing.
How about: A loop for i in range(10000): Do Task A, Do Task B ? Without more information i dont have a better answer.
I find that using the "pool" submodule within "multiprocessing" works amazingly for executing multiple processes at once within a Python Script.
See Section: Using a pool of workers
Look carefully at "# launching multiple evaluations asynchronously may use more processes" in the example. Once you understand what those lines are doing, the following example I constructed will make a lot of sense.
import numpy as np
from multiprocessing import Pool
def desired_function(option, processes, data, etc...):
# your code will go here. option allows you to make choices within your script
# to execute desired sections of code for each pool or subprocess.
return result_array # "for example"
result_array = np.zeros("some shape") # This is normally populated by 1 loop, lets try 4.
processes = 4
pool = Pool(processes=processes)
args = (processes, data, etc...) # Arguments to be passed into desired function.
multiple_results = []
for i in range(processes): # Executes each pool w/ option (1-4 in this case).
multiple_results.append(pool.apply_async(param_process, (i+1,)+args)) # Syncs each.
results = np.array(res.get() for res in multiple_results) # Retrieves results after
# every pool is finished!
for i in range(processes):
result_array = result_array + results[i] # Combines all datasets!
The code will basically run the desired function for a set number of processes. You will have to carefully make sure your function can distinguish between each process (hence why I added the variable "option".) Additionally, it doesn't have to be an array that is being populated in the end, but for my example, that's how I used it. Hope this simplifies or helps you better understand the power of multiprocessing in Python!
Let suppose I have complicated function I cant to run on a list :
import concurrent.futures
import random
import numpy as np
inputData = random.sample(range(60000, 1000000), 50)
def complicated_function(x):
"""complicated stuff"""
return x**x
I can process directly in a loop this way (that are how I want the results at the end of my code, with the same order if possible) :
#without parallelization
processedData = [complicated_function(x) for x in inputData]
for parallel processing I look at tutorials and it I made this code :
#with parallelization
processedData2 = []
with concurrent.futures.ThreadPoolExecutor() as e:
fut = [e.submit(complicated_function, inputData[i]) for i in range(len(inputData))]
for r in concurrent.futures.as_completed(fut):
processedData2.append(r.result())
The problem is that watching at my system monitor there is just one core working when this is running.. so obviously this is not working as I need...
Thank a lot in advance for your help !
It is because you are threading, and pythons global interpreter lock doesnt actually run the tasks in parallel across multiple cores, when you use threading.
Instead u can use multiprocessing..Just like ThreadPoolExecutor, there is a ProcessPoolExecutor which will make sure multiple cores are utilised.
try code like this:
import gc
import random
from concurrent.futures import ThreadPoolExecutor
zen = "Special cases aren't special enough to break the rules. "
def abc(length: int):
msg = ''.join(random.sample(zen, length))
print(msg)
del msg
if __name__ == '__main__':
pool = ThreadPoolExecutor(max_workers=8)
while True:
for x in range(256):
pool.submit(abc, random.randint(2, 6))
print('===================================================')
gc.collect()
The code may take about 8MB if it running without ThreadPoolExecutor, or about 30MB using str() instead of ''.join(). But this code keep eating RAM without limit. I thought it is caused by random.sample or something else, but it proved that ''.join() in ThreadPoolExecutor cause this problem.
It confuse me as there is no modules mutually import each other(share zen only), & neither the del or Gc work :(
ps: please notes that infinite loop is not a problem. when you run something like:
while True:
print(1234567)
the memory usage will keeping under a certain line (code above may take not more than 1MB?). The code at the top don't have an increasing list or dict, & the variable has been del at the end of the module. So it should be cleaned up when a thread finish as I think, which obviously not.
pss: let's talk like this: the cause of the problem is anything in ''.join() will not be recycled. As if we change the abc module this way:
tmp = random.sample(zen, length)
msg = ''.join(tmp)
print(msg[:8])
del msg, tmp
Gc works effectively, and the usage keeping about 26MB.
So is there something I missed when using ''.join() or python language has a bug there?
When you run the code without threads each sentence will execute completely, and with that I mean that the gc.collect() will be called after the inner loop has ended.
But when you execute the code with threads a new thread will be called before the most recent one has ended, so the number of new threads will rapidly increase and as there's no limit for the number of threads you will have more threads than your CPU's can handle causing an accumulation of threads.
I have generated permutations with the itertools.permutations function in python. The problem is that the result is very big and I would like to go through it with multiple threads but don't really know how to accomplish that here is what I have so far:
perms = itertools.permutations('1234', r=4)
#I would like to iterate through 'perms' with multiple threads
for perm in perms:
print perm
If the work you want to do with the items from the permutation generator is CPU intensive, you probably want to use processes rather than threads. CPython's Global Interpreter Lock (GIL) makes multithreading of limited value when doing CPU bound work.
Instead, use the multiprocessing module's Pool class, like so:
import multiprocessing
import itertools
def do_stuff(perm):
# whatever
return list(reversed(perm))
if __name__ == "__main__":
with multiprocessing.Pool() as pool: # default is optimal number of processes
results = pool.map(do_stuff, itertools.permutations('1234', r=4))
# do stuff with results
Note that if you will be iterating over results (rather than doing something with it as a list), you can use imap instead of map to get an iterator that you can use to work on the results as they are produced from the worker processes. If it doesn't matter what order the items are returned, you can use imap_unordered to (I think) save a bit of memory.
The if __name__ is "__main__" boilerplate is required on Windows, where the multiprocessing module has to work around the OS's limitations (no fork).
Split the index of the number of perms between threads then use this function to generate the perm from its index in each thread rather than generating all the perms and splitting them between threads.
Assuming your processing function is f(x) you want to do
from multiprocessing import Pool
def f(x):
return x*x
if __name__ == '__main__':
pool = Pool(processes=4) # start 4 worker processes
perms = itertools.permutations('1234', r=4)
for r in pool.map(f, perms):
print (r)
In fact, using threads would not execute the processes in parallel, unless it is IO bound. If it is CPU bound and you have a quad core, then it's the way to go. If you don't have multicore and it is CPU bound, then I'm afraid that making it parallel will not improve your current situation.
Python's futures module makes it easy to split work between threads. In this example, 4 threads will be used, but you can modify that to suit your needs.
from concurrent import futures
def thread_process(perm):
#do something
with futures.ThreadPoolExecutor(max_workers=4) as executor:
for perm in perms:
executor.submit(thread_process, perm)
I'm a multiprocessing newbie,
I know something about threading but I need to increase the speed of this calculation, hopefully with multiprocessing:
Example Description: sends string to a thread, alters string + benchmark test,
send result back for printing.
from threading import Thread
class Alter(Thread):
def __init__(self, word):
Thread.__init__(self)
self.word = word
self.word2 = ''
def run(self):
# Alter string + test processing speed
for i in range(80000):
self.word2 = self.word2 + self.word
# Send a string to be altered
thread1 = Alter('foo')
thread2 = Alter('bar')
thread1.start()
thread2.start()
#wait for both to finish
while thread1.is_alive() == True: pass
while thread2.is_alive() == True: pass
print(thread1.word2)
print(thread2.word2)
This is currently takes about 6 seconds and I need it to go faster.
I have been looking into multiprocessing and cannot find something equivalent to the above code. I think what I am after is pooling but examples I have found have been hard to understand. I would like to take advantage of all cores (8 cores) multiprocessing.cpu_count() but I really just have scraps of useful information on multiprocessing and not enough to duplicate the above code. If anyone can point me in the right direction or better yet, provide an example that would be greatly appreciated. Python 3 please
Just replace threading with multiprocessing and Thread with Process. Threads in Pyton are (almost) never used to gain performance because of the big bad GIL! I explained it in an another SO-post with some links to documentation and a great talk about threading in python.
But the multiprocessing module is intentionally very similar to the threading module. You can almost use it as an drop-in replacement!
The multiprocessing module doesn't AFAIK offer a functionality to enforce the use of a specific amount of cores. It relies on the OS-implementation. You could use the Pool object and limit the worker-onjects to the core-count. Or you could look for an other MPI library like pypar. Under Linux you could use a pipe under the shell to start multiple instances on different cores