Use Python's completely undocumented
ThreadPool to parallelize non-bytecode work on shared memory!
from multiprocessing.pool import ThreadPool

pool = ThreadPool()
pool.map(io_heavy_work, data_in_shared_memory)
In contrast to multiprocessing.Pool, where each process gets a copy of the second argument of map and the main process is responsible for collecting the returned results, the io_heavy_work function here may modify the data instances it receives, since this is threading with shared memory.
There are two main reasons for writing multi-threaded code:
- high-performance shared-memory programs (including games),
- programs that spend a lot of time waiting for many different I/O tasks.
In Python code, due to the GIL which restricts bytecode execution to a single thread, the first point becomes hopeless and only the second one remains.
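A quick way to see the second case in action is with time.sleep as a stand-in for network I/O: sleeping releases the GIL, so the waits overlap instead of adding up. A minimal sketch (the 0.2-second delay and thread count are arbitrary choices for illustration):

```python
import threading
import time

def wait_on_io():
    # time.sleep releases the GIL, much like blocking on a socket would
    time.sleep(0.2)

threads = [threading.Thread(target=wait_on_io) for _ in range(5)]
start = time.monotonic()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start
# the five 0.2s waits overlap, so total time stays near 0.2s, not 1s
```

Had wait_on_io instead run a CPU-bound bytecode loop, the GIL would serialize the threads and you'd see no speedup at all.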
I recently had such a workload, specifically a web-forum crawler. I prototyped all the crawling code in a single-threaded fashion, something like this:
class Subject:
    def __init__(self, url):
        self.url = url

    def update(self):
        # Look for new replies and store them somewhere
        ...

s = Subject(url)
while True:
    s.update()
Now imagine you've got a lot of
urls, so many in fact, that running each in its own thread doesn't make sense.
update spends most of its time waiting for replies from the server.
This is a perfect illustration of the second case I've mentioned earlier.
If Python had a parallel for, we could simply do something like this:
subjects = [Subject(url) for url in all_my_urls]
while True:
    parallel for s in subjects:
        s.update()
Unfortunately, this doesn't exist.
Instead, people are always redirected to
Pool for such operations:
import multiprocessing

pool = multiprocessing.Pool()
subjects = [Subject(url) for url in all_my_urls]
while True:
    pool.map(lambda s: s.update(), subjects)
Well, beautiful! Except that this won't work.
The reason is that every process gets its own, separate copy of
subjects and won't update the objects in the main process!
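You can verify this copy behavior directly. A minimal sketch (Counter and bump are hypothetical names chosen for the demo; note the worker function must be defined at module level, since a process pool pickles its arguments and tasks):

```python
from multiprocessing import Pool

class Counter:
    def __init__(self):
        self.hits = 0

def bump(counter):
    # Runs in a worker process: mutates the pickled *copy* it received.
    counter.hits += 1
    return counter.hits

def demo():
    counters = [Counter() for _ in range(4)]
    with Pool(2) as pool:
        results = pool.map(bump, counters)
    # The workers saw their increments, but the originals in the
    # main process are untouched.
    return results, [c.hits for c in counters]
```

Each worker returns 1, yet every counter in the main process still reads 0 afterwards.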
The usual way of solving this, is to completely restructure your code into a "return the result" and "main collector" way,
which works, but is a nontrivial amount of effort and unnecessary code obfuscation.
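For illustration, that restructuring might look something like this sketch, where workers only return data and the main process applies it to the real objects (fetch here fakes the crawl with a canned reply, and the names are my own, not from the original code):

```python
from multiprocessing import Pool

class Subject:
    def __init__(self, url):
        self.url = url
        self.replies = []

def fetch(url):
    # Worker: do the slow I/O and *return* the result instead of
    # mutating shared state (which a separate process cannot do).
    return url, ["reply from " + url]  # stand-in for real crawling

def collect(subjects):
    with Pool(2) as pool:
        results = dict(pool.map(fetch, [s.url for s in subjects]))
    # Main process: apply the returned results to the real objects.
    for s in subjects:
        s.replies = results[s.url]
    return subjects
```

The crawling logic is now split across two functions and a result-plumbing step, which is exactly the obfuscation complained about above.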
If only we had threads (and thus shared memory) instead of processes, this would just work.
Well, turns out that Python comes with a thread pool! It's just completely undocumented, because nobody bothered to document it so far. Check this out:
from multiprocessing.pool import ThreadPool

pool = ThreadPool()
subjects = [Subject(url) for url in all_my_urls]
while True:
    pool.map(lambda s: s.update(), subjects)
This now actually works, because all threads share the exact same subjects list and the objects inside it.
Beautiful, isn't it?
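To convince yourself the mutations really are visible, here is the mirror image of the earlier process-pool demo, this time with ThreadPool (again with hypothetical demo names):

```python
from multiprocessing.pool import ThreadPool

class Counter:
    def __init__(self):
        self.hits = 0

def bump(counter):
    # Threads mutate the very same object the main thread holds.
    counter.hits += 1

counters = [Counter() for _ in range(4)]
with ThreadPool(2) as pool:
    pool.map(bump, counters)
# every counter in the main thread now reflects the worker's update
```

Note that ThreadPool also sidesteps pickling entirely, which is why passing a lambda to its map works, while multiprocessing.Pool would refuse to pickle one.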