I've been tasked with learning Twisted.
I am also somewhat new to Python in general, but have used other modern programming languages.
In reading over Twisted documentation, I keep running into examples that are
Not complete executable examples
Run in one thread
Coming from other languages, when I use some asynchronous mechanism, there is usually another thread of execution while I carry out some manner of work, then I am notified when that work is completed, and I react to its results.
I do see that it has some built-in asynchronous mechanisms, but none of them provide the user with a means to create custom CPU-bound asynchronous tasks akin to 'Tasks' in C# or 'work' with boost::asio in C++ that would run in parallel to the main thread.
I see that Twisted provides a means to asynchronously wait on IO and do other things on the same thread while waiting, if we are waiting on:
network reads and writes
keyboard input
It also shows me how to:
Do some manner of integration with GUI toolkits to make use of their event loop, but doesn't go into detail.
Schedule tasks using reactor on a timer, but doesn't do that task in parallel to anything else
It talks about async/await, but that is for python 3 only, and I am using python 2.7
I figured some manner of thread pooling must be built into the reactor, but then when I read about the reactor, it says that everything runs on the main thread in reactor.run().
So, I am left confused.
What is the point of deferreds, creating a callback chain and reacting to the results, if we aren't running anything in parallel?
If we are running asynchronous code, how are we making our own custom asynchronous functions? (see keyword async in C#)
In other languages, I might create an async task to count from 1 to 10 while, on the main thread, I count from 'a' to 'z' at the same time. When the task is complete, I would get notified via a callback on a thread from a thread pool. I'd have the option to sync up, if I wanted to, by calling some 'wait' method. While the definition of "asynchronous" only involves posting the task, getting the result, and the callback when it's done, I've never seen it used without doing things in parallel.
I'll address your questions (and statements that seem confusing) one-by-one:
"Examples that are not complete"
Restating what I posted in the comments: see my two previous answers for complete examples ( https://stackoverflow.com/a/30399317/3334178 & https://stackoverflow.com/a/23274411/3334178 ) and go through Krondo's Twisted Introduction
You said you are discounting these because "The examples are the network code in twisted, which has the asynchronicity built in and hidden." I disagree with that assertion and will explain why in the next section.
"Examples are not asynchronous"
When you're talking about "asynchronous programming" in the vein of Python's twisted/tornado/asyncio (or Node.js, or C's select/poll/kqueue), you're talking about a model/pattern of programming that lets the programmer shape their code so that parts of it can run while other parts are blocked (in almost all cases, the blocking is caused by a part of the program having to wait for IO).
These libraries/languages will certainly have ways to do threading and/or multiprocessing, but those are layers grafted on top of the async design - and if that's genuinely what you need (i.e. you have an exclusively CPU-bound workload), the async systems are going to be a bad choice.
Let's use your "hidden away" comment to get into this a bit more
"Network examples are asych, but the asynchronousity is built in and hidden away"
The fundamental element of the async design is that you write your code so it should never block for IO - you've been calling out the network, but really we are talking about network/disk/keyboard/mouse/sound/serial - anything that (for whatever reason) can run slower than the CPU (and that the OS has a file descriptor for).
Also, there isn't anything really "hidden away" about how it functions - async programming always uses non-blocking (status-checking / callback) calls for any IO channel it can operate on. If you dig enough in the Twisted codebase, all the async logic is in plain sight (Krondo's tutorial is really good at giving examples of this).
Let me use the keyboard as an example.
In sync code, you would use an input or a read - and the program would pause waiting for that line (or key) to be typed.
In async code (at least in featureful implementations like Twisted) you fetch the file descriptor for "input" and register it, along with a callback function to be called when the file descriptor changes, with the OS-level async engine (select, poll, kqueue, etc...)
The act of doing that registration - which takes almost no time - LETS YOU run other logic while the keyboard logic waits for the keyboard event to happen (see the stdio.StandardIO(keyboardobj,sys.stdin.fileno()) line from near the end of my example code in https://stackoverflow.com/a/30399317/3334178).
"[This] leads me to believe there is some other means to use deferreds with asynchronous"
Deferreds aren't magic. They are just clever lists of function callbacks. There are numerous clever ways they can be chained together, but in the end they are just a tool to help you take advantage of the logic above.
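To make that concrete, here is a toy sketch of the "list of callbacks" idea - plain Python and my own drastic simplification, not Twisted's actual implementation:

```python
class ToyDeferred(object):
    """A drastically simplified sketch of the idea behind a Deferred:
    an ordered list of callbacks, each fed the previous one's result."""
    def __init__(self):
        self.callbacks = []
        self.called = False
        self.result = None

    def addCallback(self, fn):
        if self.called:
            # already fired: run the new callback immediately
            self.result = fn(self.result)
        else:
            self.callbacks.append(fn)
        return self

    def callback(self, result):
        # "fire" the deferred: run the chain with the initial result
        self.result = result
        self.called = True
        for fn in self.callbacks:
            self.result = fn(self.result)

d = ToyDeferred()
d.addCallback(lambda x: x + 1)
d.addCallback(lambda x: x * 10)
d.callback(2)   # fires the chain: (2 + 1) * 10
print(d.result) # -> 30
```

The real Deferred adds errback chains, nesting and cancellation on top of this, but the core is exactly this list-of-callbacks shape.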
"It also talks about async/await, that is for python 3 only, and I am using python 2.7"
async and await are just the Python 3 way of doing what was done in Python 2 with @defer.inlineCallbacks and yield. These systems are shortcuts that rewire code so that, to the reader, the code looks and acts like sync code, but when it's run the code is morphed into a "register a callback and move on" flow.
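A toy sketch of that "rewiring" (plain Python with invented names; plain callables stand in for the Deferreds a real reactor would hand back):

```python
def run_inline(gen):
    """Toy driver: repeatedly resume a generator, feeding each yielded
    'operation' result back in - a sketch of what inlineCallbacks does
    with Deferreds (here the 'operations' are just plain callables)."""
    result = None
    try:
        while True:
            operation = gen.send(result)  # generator pauses at each yield
            result = operation()          # in real Twisted this step is async
    except StopIteration:
        return result

def fake_fetch():
    # stands in for a network read that would normally return a Deferred
    return "some bytes"

def job():
    data = yield fake_fetch       # reads like blocking code...
    yield (lambda: data.upper())  # ...but each step is a resumable callback

print(run_inline(job()))  # -> SOME BYTES
```

The generator's author writes what looks like straight-line code; the driver is free to suspend it at every yield, which is the whole trick behind both @defer.inlineCallbacks and async/await.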
"when I read about the reactor, it says that everything runs on the main thread in reactor.run()"
Yes, because (as above) async is about not waiting for IO - it's not about threading or multiprocessing.
Your last few questions "point of deferreds" and "how do you make asynchronous" feel like I answered them above - but if not, let me know in the comments, and I'll spell them out.
Also, your comment requesting "an example where we count from 1 to 10 in some deferred function while we count from a to z in the main thread" doesn't make sense when talking about async (both because you talk about a "thread" - which is a different construct - and because those are both (likely) CPU tasks), but I will give you a different example that counts while watching for keyboard input (which is something that definitely DOES make sense when talking about async):
#!/usr/bin/env python
#
# Frankenstein-esk amalgam of example code
# Key of which comes from the Twisted "Chat" example
# (such as: http://twistedmatrix.com/documents/12.0.0/core/examples/chatserver.py)

import sys                           # so I can get at stdin
import os                            # for isatty
import termios, tty                  # access to posix IO settings
from twisted.internet import reactor
from twisted.internet import stdio   # the stdio equiv of listenXXX
from twisted.protocols import basic  # for lineReceiver for keyboard
from twisted.internet import task

class counter(object):
    runs = 0

    @classmethod
    def runEverySecond(cls):
        cls.runs += 1
        print "async counting demo: " + str(cls.runs)

# to set keyboard into cbreak mode - so keys don't require a CR before causing an event
class Cbreaktty(object):
    org_termio = None
    my_termio = None

    def __init__(self, ttyfd):
        if(os.isatty(ttyfd)):
            self.org_termio = (ttyfd, termios.tcgetattr(ttyfd))
            tty.setcbreak(ttyfd)
            print ' Set cbreak mode'
            self.my_termio = (ttyfd, termios.tcgetattr(ttyfd))
        else:
            raise IOError # Not something I can set cbreak on!

    def retToOrgState(self):
        (tty_fd, org) = self.org_termio
        print ' Restoring terminal settings'
        termios.tcsetattr(tty_fd, termios.TCSANOW, org)

class KeyEater(basic.LineReceiver):
    def __init__(self):
        self.setRawMode() # Switch from line mode to "however much I got" mode

    def rawDataReceived(self, data):
        key = str(data).lower()[0]
        if key == 'q':
            reactor.stop()
        else:
            print "--------------"
            print "Press:"
            print " q - to cleanly shutdown"
            print "---------------"

# Custom tailored example for SO:56013998
#
# This code is a mishmash of styles and techniques. Both to provide different examples of how
# something can be done and because I'm lazy. Its been built and tested on OSX and linux,
# it should be portable (other than perhaps terminal cbreak mode). If you want to ask
# questions about this code contact me directly via mail to mike at partialmesh.com
#
# Once running press any key in the window where the script was run and it will give
# instructions.

def main():
    try:
        termstate = Cbreaktty(sys.stdin.fileno())
    except IOError:
        sys.stderr.write("Error: " + sys.argv[0] + " only for use on interactive ttys\n")
        sys.exit(1)
    keyboardobj = KeyEater()
    l = task.LoopingCall(counter.runEverySecond)
    l.start(1.0) # call every second
    stdio.StandardIO(keyboardobj, sys.stdin.fileno())
    reactor.run()
    termstate.retToOrgState()

if __name__ == '__main__':
    main()
(I know technically I didn't use a deferred - but I ran out of time - and this case is a bit too simple to really need one (I don't have a chain of callbacks anywhere, which is what deferreds are for))
Code description:
My code is simple: it starts a ZeroMQHandler (socket-based messaging) from the Logbook library (using pyzmq). The Logger (log) runs throughout the application. Finally, the handler closes the port. The .push_application() and .pop_application() calls are used in place of a with handler.applicationbound(): block and its indentation.
Purpose:
I'm testing this queue based messaging to see if it can be a low-impact asynchronous logging solution. I need to log about 15 000 messages per second. I'd prefer to use Python, but my fallback is writing the logger in C++ and exposing its handles to python.
The issue:
The issue is that if I don't wait a quarter of a second or more after opening the handler (socket), the program executes without any messages getting through (the test program takes less than 0.25 seconds to execute). I interpret this as set-up time needed for the ZeroMQ socket, or something like that. So I'm reaching out to see if anyone has had a similar experience; maybe this is documented somewhere, but I can't seem to figure it out by myself. I want to know why this is needed. Thanks for any input.
My working code looks something like this:
from logbook.queues import ZeroMQHandler
from logbook import Logger
import time
addr='tcp://127.0.0.1:5053'
handler = ZeroMQHandler(addr)
time.sleep(0.25) ################################################# THIS ! ####
log = Logger("myLogbook")
handler.push_application()
log.info("start of program")
foo()
log.info("end of program")
handler.close()
handler.pop_application()
The receiver, running in a different Python kernel (for tests, it gives output to stdout):
from logbook.queues import ZeroMQSubscriber
from logbook import Logger, StreamHandler
import sys
import time
addr='tcp://127.0.0.1:5053'
print("ZeroMQSubscriber begin with address {}".format(addr))
subscriber = ZeroMQSubscriber(addr)
handler = StreamHandler(sys.stdout)
log = Logger("A receiver")
handler.push_application()
try:
    i = 0
    while True:
        i += 1
        record = subscriber.recv(2)
        if not record:
            pass # timeout
        else:
            print("got message!")
            log.handle(record)
except KeyboardInterrupt:
    print("C-C caught, program end after {} iterations".format(i))
    handler.pop_application()
ZeroMQ indeed spends some time creating a Context() instance per se; it then asks the O/S to allocate memory-bound resources and to spawn I/O threads, which also takes some additional time. Next, each Socket() instantiation consumes some add-on overhead time.
It is well documented, in both the native API documentation and in educational resources, that an asynchronous signalling/messaging framework spends some time before any API request actually gets processed inside both the "local" and "remote" Context() instances and is finally marked as deliverable/readable for the "remote" end of one of the ZeroMQ Scalable Formal Communication Archetypes.
This said, there ought to be no surprise that an even more re-wrapped use of the ZeroMQ tools (re-wrapped by another level of abstraction, coded into the logbook.queues.ZeroMQSubscriber and logbook.queues.ZeroMQHandler classes) will only add additional { setup & operational } overheads, so the observed asynchronicity of the service will only grow.
If your application requires any sort of mutual reconfirmation between both ends that they have reached a Ready-To-Operate state (RTO-state), the best approach would be to introduce some sort of smart reconciliation policy, instead of remaining in blind belief, relying on a long-enough .sleep() and hoping things were given enough time to settle down and get into RTO.
In a distributed system it is always better to be explicit than to remain in optimistic hope.
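As a sketch of such a reconciliation policy - here a stdlib socketpair and a thread stand in for the ZeroMQ pair, and the READY token and message contents are invented for illustration:

```python
import socket
import threading

def receiver(sock, messages):
    # signal Ready-To-Operate explicitly, instead of making the
    # sender guess the setup latency with time.sleep()
    sock.sendall(b"READY")
    messages.append(sock.recv(1024))

# a socketpair stands in for the ZeroMQ connection in this sketch
a, b = socket.socketpair()
messages = []
t = threading.Thread(target=receiver, args=(b, messages))
t.start()

# sender blocks until the receiver confirms RTO - no magic sleep needed
ack = a.recv(1024)
assert ack == b"READY"
a.sendall(b"log record 1")
t.join()
print(messages)  # -> [b'log record 1']
```

With ZeroMQ itself the same handshake would typically be carried over a side channel (e.g. a REQ/REP pair) before the first real log record is published.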
Epilogue:
Given your sustained throughput ought to safely go under, and remain under, the expected threshold of <= 66 [us/message] per message dispatched, let me also raise an interest in proper Context() parametrisation, so as to smoothly carry your required workload under realistic hardware and system-wide resource planning.
Default values are not hammered into stone and never ought to be relied upon.
I'm trying to write a simple Python IRC client. So far I can read data, and I can send data back to the server if it is automated. I'm getting the data in a while True, which means that I cannot enter text while at the same time reading data. How can I enter text in the console, that only gets sent when I press enter, while at the same time running an infinite loop?
Basic code structure:
while True:
read data
#here is where I want to write data only if it contains '/r' in it
Another way to do it involves threads.
import threading

# define a thread which takes input
class InputThread(threading.Thread):
    def __init__(self):
        super(InputThread, self).__init__()
        self.daemon = True
        self.last_user_input = None

    def run(self):
        while True:
            self.last_user_input = input('input something: ')
            # do something based on the user input here
            # alternatively, let main do something with
            # self.last_user_input

# main
it = InputThread()
it.start()
while True:
    pass
    # do something
    # do something with it.last_user_input if you feel like it
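A variant of the same idea, sketched in Python 3 with a queue between the threads; the blocking input() call is replaced by an injectable reader so the sketch is self-contained (the fake lines are invented):

```python
import threading
import queue

def input_thread(q, readline):
    # 'readline' stands in for input(); any blocking reader works
    for line in iter(readline, None):
        q.put(line)

q = queue.Queue()
fake_lines = iter(["hello", "/quit", None])  # simulated user input
t = threading.Thread(target=input_thread, args=(q, lambda: next(fake_lines)))
t.daemon = True
t.start()

received = []
while True:
    line = q.get()  # a real IRC loop would use q.get_nowait() between reads
    received.append(line)
    if line == "/quit":
        break
print(received)  # -> ['hello', '/quit']
```

Using a Queue instead of a shared attribute means no input line can be lost or read twice, since each q.get() consumes exactly one entry.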
What you need is an event loop of some kind.
In Python you have a few options to do that, pick one you like:
Twisted https://twistedmatrix.com/trac/
Asyncio https://docs.python.org/3/library/asyncio.html
gevent http://www.gevent.org/
and so on, there are hundreds of frameworks for this, you could also use any of the GUI frameworks like tkinter or PyQt to get a main event loop.
As comments have said above, you can use threads and a few queues to handle this, or an event based loop, or coroutines or a bunch of other architectures. Depending on your target platforms one or the other might be best. For example on windows the console API is totally different to unix ptys. Especially if you later need stuff like colour output and so on, you might want to ask more specific questions.
You can use an async library (see schlenk's answer) or use https://docs.python.org/2/library/select.html
This module provides access to the select() and poll() functions
available in most operating systems, epoll() available on Linux 2.5+
and kqueue() available on most BSD. Note that on Windows, it only
works for sockets; on other operating systems, it also works for other
file types (in particular, on Unix, it works on pipes). It cannot be
used on regular files to determine whether a file has grown since it
was last read.
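A minimal sketch of that select() pattern - a stdlib socketpair stands in for the IRC server connection, and the PING payload is invented (select() on stdin would work the same way on Unix, but not on Windows):

```python
import select
import socket

# a socketpair stands in for the IRC server connection in this sketch
server_side, client_side = socket.socketpair()
server_side.sendall(b"PING :tmi\r\n")

readable = []
# wait (up to 1s) until client_side has data, without blocking on recv
r, w, x = select.select([client_side], [], [], 1.0)
if client_side in r:
    readable.append(client_side.recv(4096))
print(readable)  # -> [b'PING :tmi\r\n']
```

In the real client you would pass both the IRC socket and sys.stdin to select() and handle whichever becomes readable first.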
I want to confirm the following questions:
There is no native coroutine decorator in Python 2. It would be provided by something like [1].
Prior to Python 3.4, all other Python 3 releases require pip install asyncio in order to use import asyncio.coroutine [2].
trollius is the reference port of asyncio and tulip to Python 2 (and the community thinks that's the one to use)?
Thank you.
I have used Python generators as coroutines, using only built-in methods. I have not used coroutines in any other environment, so my approach may be misinformed.
Here's some starter code I wrote that uses generators to send and receive data in a coroutine capacity, without even using Python 3's yield from syntax:
def sleep(timer, action=None):
    ''' wait for time to elapse past a certain timer '''
    yield # first yield cannot accept a message
    now = then = yield
    while now - then < timer:
        now = yield
    if action:
        action()
    else:
        yield "timer finished"

def buttonwait():
    ''' yields button presses '''
    yield
    yield
    while True:
        c = screen.getch()
        if c:
            yield c
        yield
Next, the wait function, which manages the coroutines, sending in the current time and listening for data:
def wait(processes):
    start = time.time()
    for process in processes:
        process.next()
        process.send(start)
    while True:
        now = time.time()
        for process in processes:
            value = process.send(now)
            if value:
                return value
Last, an implementation of those:
def main():
    processes = []
    processes.append(sleep(5))
    processes.append(buttonwait())
    wait(processes)
I used this on a Raspberry Pi with a 2x16 LCD screen to:
respond to button presses
turn off the backlight after a timeout
scroll long text
move motors
send serial commands
It's a bit complicated to get started, knowing where to put the yields and whatnot, but seems reasonably functional once it is.
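For reference, here is a trimmed-down, runnable variant of the same scheme - Python 3 syntax (next() instead of .next()), a fake clock instead of time.time(), and the names timer/wait are my own:

```python
def timer(duration):
    ''' generator coroutine: yields None until 'duration' has elapsed '''
    then = yield          # primed, then sent the start time
    now = then
    while now - then < duration:
        now = yield       # each send() is one scheduler tick
    yield "timer finished"

def wait(processes, clock):
    ''' drive all coroutines with the current time until one yields a value '''
    start = next(clock)
    for p in processes:
        next(p)           # prime the generator up to its first yield
        p.send(start)
    while True:
        now = next(clock)
        for p in processes:
            value = p.send(now)
            if value:
                return value

fake_clock = iter(range(100))  # stands in for repeated time.time() calls
print(wait([timer(5)], fake_clock))  # -> timer finished
```

The structure is the same as the code above: priming each generator, broadcasting the clock, and returning the first non-None value any coroutine yields.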
Judging from your question, I think you've conflated two things: co-routines and async I/O-run loops. They do not depend upon each other.
Because in Python one can send values into a generator, you can create code implementing the co-routine pattern. All that is needed is generator.send(). Your link to David Beazley's code is a good example of how to create a co-routine. It is just a pattern. The yield from construct, new in Python v3.3, allows this pattern to be used even more flexibly. David has an excellent talk on all of the things you can do with co-routines in a synchronous computing world.
Python's asyncio module depends upon the yield from construct. They have created an async co-routine to package this capability and associate it with a run loop. The goal, I believe, is to allow folks to easily build run loop oriented applications. There is still a role for the Twisteds and Tornados of the world.
(I myself am wary of projects like Trollius. They are kind of "Dancing Bear" project. They miss the point. Asyncio is about bringing a straightforward run loop implementation to Python 3 as a standard service. Python 2 already has two or more excellent async I/O libraries. Albeit these libraries are complex. IOW, if you are starting with Python 3 and your needs are straightforward, then use asyncio otherwise use Tornado. [Twisted isn't ported yet.] If you are starting with Python 2, then Twisted or Tornado is probably where you should start. Trollius? A code incompatible version with Python 3? An excellent "Dancing Bear.")
In my book, asyncio is an excellent reason to move your code to Python 3. We live in an asynchronous world and run loops are too important a feature to be project specific.
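A minimal illustration of that run-loop style with today's asyncio (modern Python 3 syntax; the coroutine names and delays are invented):

```python
import asyncio

order = []

async def fetch(name, delay):
    order.append("start " + name)
    await asyncio.sleep(delay)  # yields to the run loop, doesn't block it
    order.append("done " + name)

async def main():
    # both coroutines make progress on ONE thread, interleaved by the loop
    await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))

asyncio.run(main())
print(order)  # -> ['start a', 'start b', 'done b', 'done a']
```

The interleaved order shows the run loop at work: both coroutines start immediately, and each completes when its own wait elapses, all on a single thread.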
Anon,
Andrew
Guys, this is a question about the Python Twisted SSH library.
All the sample code, and even production code, I've seen acting as an SSH client based on twisted.conch.ssh interacts with the server in this mode:
prepare some commands to run remotely;
define call backs;
kick off reactor then suspend for new feedback;
After reactor.run(), I never found people trying to deliver commands to sshd; the script just sits there waiting. I think it would be possible to fork or spawn something to send commands. However, since one of Twisted's strengths is its de-multiplexing mechanism, it doesn't have to fork to process incoming requests when running as a server. May I say it is a reasonable requirement not to fork (as a client script) in order to continuously send requests to the server?
Any thought on this ?
TIA.
joefis' answer is basically sound, but I bet some examples would be helpful. First, there are a few ways you can have some code run right after the reactor starts.
This one is pretty straightforward:
def f():
    print "the reactor is running now"
reactor.callWhenRunning(f)
Another way is to use timed events, although there's probably no reason to do it this way instead of using callWhenRunning:
reactor.callLater(0, f)
You can also use the underlying API which callWhenRunning is implemented in terms of:
reactor.addSystemEventTrigger('after', 'startup', f)
You can also use services. This is a bit more involved, since it involves using twistd(1) (or something else that's going to hook the service system up to the reactor). But you can write a class like this:
from twisted.application.service import Service

class ThingDoer(Service):
    def startService(self):
        print "The reactor is running now."
And then write a .tac file like this:
from twisted.application.service import Application
from thatmodule import ThingDoer
application = Application("Do Things")
ThingDoer().setServiceParent(application)
And finally, you can run this .tac file using twistd(1):
$ twistd -ny thatfile.tac
Of course, this only tells you how to do one thing after the reactor is running, which isn't exactly what you're asking. It's the same idea, though - you define some event handler and ask to receive an event by having that handler called; when it is called, you get to do stuff. The same idea applies to anything you do with Conch.
You can see this in the Conch examples, for example in sshsimpleclient.py we have:
class CatChannel(channel.SSHChannel):
    name = 'session'

    def openFailed(self, reason):
        print 'echo failed', reason

    def channelOpen(self, ignoredData):
        self.data = ''
        d = self.conn.sendRequest(self, 'exec', common.NS('cat'), wantReply=1)
        d.addCallback(self._cbRequest)

    def _cbRequest(self, ignored):
        self.write('hello conch\n')
        self.conn.sendEOF(self)

    def dataReceived(self, data):
        self.data += data

    def closed(self):
        print 'got data from cat: %s' % repr(self.data)
        self.loseConnection()
        reactor.stop()
In this example, channelOpen is the event handler called when a new channel is opened. It sends a request to the server. It gets back a Deferred, to which it attaches a callback. That callback is an event handler which will be called when the request succeeds (in this case, when cat has been executed). _cbRequest is the callback it attaches, and that method takes the next step - writing some bytes to the channel and then closing it. Then there's the dataReceived event handler, which is called when bytes are received over the channel, and the closed event handler, called when the channel is closed.
So you can see four different event handlers here, some of which are starting operations that will eventually trigger a later event handler.
So to get back to your question about doing one thing after another: if you wanted to open two cat channels, one after the other, then the closed event handler could open a new channel (instead of stopping the reactor as it does in this example).
You're trying to put a square peg in a round hole. Everything in Twisted is asynchronous, so you have to think about the sequence of events differently. You can't say "here are 10 operations to be run one after the other" that's serial thinking.
In Twisted you issue the first command and register a callback that will be triggered when it completes. When that callback occurs you issue the 2nd command and register a callback that will be triggered when that completes. And so on and so on.
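A plain-Python sketch of that chaining - the run_command helper is invented and calls its callback immediately, whereas a real reactor would call it later, when the I/O completes:

```python
def run_command(cmd, on_done):
    # stands in for an async operation; a real reactor would invoke
    # on_done later, from its event loop, when the network I/O finishes
    on_done("result of " + cmd)

log = []

def start():
    run_command("cmd1", on_cmd1_done)

def on_cmd1_done(result):
    log.append(result)
    run_command("cmd2", on_cmd2_done)  # issue the 2nd command HERE

def on_cmd2_done(result):
    log.append(result)

start()
print(log)  # -> ['result of cmd1', 'result of cmd2']
```

Deferreds exist precisely to tidy this shape up: instead of naming each next handler inside the previous one, you chain them with addCallback.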
I am using Python 2.6 and the multiprocessing module for multi-threading. Now I would like to have a synchronized dict (where the only atomic operation I really need is the += operator on a value).
Should I wrap the dict with a multiprocessing.sharedctypes.synchronized() call? Or is another way the way to go?
Intro
There seems to be a lot of arm-chair suggestions and no working examples. None of the answers listed here even suggest using multiprocessing, and this is quite a bit disappointing and disturbing. As python lovers we should support our built-in libraries, and while parallel processing and synchronization is never a trivial matter, I believe it can be made trivial with proper design. This is becoming extremely important in modern multi-core architectures and cannot be stressed enough!

That said, I am far from satisfied with the multiprocessing library, as it is still in its infancy, with quite a few pitfalls and bugs, and is geared towards functional programming (which I detest). Currently I still prefer the Pyro module (which is way ahead of its time) over multiprocessing, due to multiprocessing's severe limitation in being unable to share newly created objects while the server is running. The "register" class-method of the manager objects will only actually register an object BEFORE the manager (or its server) is started.

Enough chatter, more code:
Server.py
from multiprocessing.managers import SyncManager

class MyManager(SyncManager):
    pass

syncdict = {}
def get_dict():
    return syncdict

if __name__ == "__main__":
    MyManager.register("syncdict", get_dict)
    manager = MyManager(("127.0.0.1", 5000), authkey="password")
    manager.start()
    raw_input("Press any key to kill server".center(50, "-"))
    manager.shutdown()
In the above code example, Server.py makes use of multiprocessing's SyncManager, which can supply synchronized shared objects. This code will not work running in the interpreter because the multiprocessing library is quite touchy about how to find the "callable" for each registered object. Running Server.py will start a customized SyncManager that shares the syncdict dictionary for use by multiple processes, and it can be connected to by clients either on the same machine or, if run on an IP address other than loopback, from other machines. In this case the server is run on loopback (127.0.0.1) on port 5000. Using the authkey parameter uses secure connections when manipulating syncdict. When any key is pressed, the manager is shut down.
Client.py
from multiprocessing.managers import SyncManager
import sys, time

class MyManager(SyncManager):
    pass

MyManager.register("syncdict")

if __name__ == "__main__":
    manager = MyManager(("127.0.0.1", 5000), authkey="password")
    manager.connect()
    syncdict = manager.syncdict()

    print "dict = %s" % (dir(syncdict))
    key = raw_input("Enter key to update: ")
    inc = float(raw_input("Enter increment: "))
    sleep = float(raw_input("Enter sleep time (sec): "))

    try:
        #if the key doesn't exist create it
        if not syncdict.has_key(key):
            syncdict.update([(key, 0)])
        #increment key value every sleep seconds
        #then print syncdict
        while True:
            syncdict.update([(key, syncdict.get(key) + inc)])
            time.sleep(sleep)
            print "%s" % (syncdict)
    except KeyboardInterrupt:
        print "Killed client"
The client must also create a customized SyncManager, registering "syncdict", this time without passing in a callable to retrieve the shared dict. It then uses the customized SyncManager to connect, using the loopback IP address (127.0.0.1) on port 5000 and an authkey, establishing a secure connection to the manager started in Server.py. It retrieves the shared dict syncdict by calling the registered callable on the manager. It prompts the user for the following:
The key in syncdict to operate on
The amount to increment the value accessed by the key every cycle
The amount of time to sleep per cycle in seconds
The client then checks to see if the key exists. If it doesn't it creates the key on the syncdict. The client then enters an "endless" loop where it updates the key's value by the increment, sleeps the amount specified, and prints the syncdict only to repeat this process until a KeyboardInterrupt occurs (Ctrl+C).
Annoying problems
The Manager's register methods MUST be called before the manager is started otherwise you will get exceptions even though a dir call on the Manager will reveal that it indeed does have the method that was registered.
All manipulations of the dict must be done with methods and not dict assignments (syncdict["blast"] = 2 will fail miserably because of the way multiprocessing shares custom objects)
Using SyncManager's dict method would alleviate annoying problem #2 except that annoying problem #1 prevents the proxy returned by SyncManager.dict() being registered and shared. (SyncManager.dict() can only be called AFTER the manager is started, and register will only work BEFORE the manager is started so SyncManager.dict() is only useful when doing functional programming and passing the proxy to Processes as an argument like the doc examples do)
The server AND the client both have to register even though intuitively it would seem like the client would just be able to figure it out after connecting to the manager (Please add this to your wish-list multiprocessing developers)
Closing
I hope you enjoyed this quite thorough and slightly time-consuming answer as much as I have. I was having a great deal of trouble getting straight in my mind why I was struggling so much with the multiprocessing module where Pyro makes it a breeze, and now, thanks to this answer, I have hit the nail on the head. I hope this is useful to the python community on how to improve the multiprocessing module, as I do believe it has a great deal of promise but in its infancy falls short of what is possible. Despite the annoying problems described, I think this is still quite a viable alternative, and it is pretty simple. You could also use SyncManager.dict() and pass it to Processes as an argument the way the docs show; that would probably be an even simpler solution depending on your requirements, it just feels unnatural to me.
I would dedicate a separate process to maintaining the "shared dict": just use e.g. xmlrpclib to make that tiny amount of code available to the other processes, exposing via xmlrpclib e.g. a function taking key, increment to perform the increment and one taking just the key and returning the value, with semantic details (is there a default value for missing keys, etc, etc) depending on your app's needs.
Then you can use any approach you like to implement the shared-dict dedicated process: all the way from a single-threaded server with a simple dict in memory, to a simple sqlite DB, etc, etc. I suggest you start with code "as simple as you can get away with" (depending on whether you need a persistent shared dict, or persistence is not necessary to you), then measure and optimize as and if needed.
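One self-contained way to sketch this dedicated-process idea (using multiprocessing queues instead of xmlrpclib, to keep it runnable here; the incr/get request protocol is invented for illustration):

```python
import multiprocessing as mp

def dict_owner(requests, replies):
    ''' the one process that owns the dict; all access goes through queues '''
    d = {}
    while True:
        op, key, amount = requests.get()
        if op == "stop":
            break
        if op == "incr":
            d[key] = d.get(key, 0) + amount  # atomic: only this process mutates d
        elif op == "get":
            replies.put(d.get(key, 0))

# NOTE: on platforms using the 'spawn' start method this setup belongs
# under an `if __name__ == "__main__":` guard
requests, replies = mp.Queue(), mp.Queue()
owner = mp.Process(target=dict_owner, args=(requests, replies))
owner.start()

requests.put(("incr", "hits", 3))
requests.put(("incr", "hits", 2))
requests.put(("get", "hits", None))
result = replies.get()
print(result)  # -> 5
requests.put(("stop", None, None))
owner.join()
```

Because only the owner process ever touches the dict, the += is trivially atomic; swapping the queue transport for xmlrpclib (or sqlite persistence) changes nothing in the design.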
In response to the concurrent-write issue: I did some very quick research and found that this article suggests a lock/semaphore solution. (http://effbot.org/zone/thread-synchronization.htm)
While the example isn't specific to a dictionary, I'm pretty sure you could code a class-based wrapper object to help you work with dictionaries based on this idea.
If I had a requirement to implement something like this in a thread safe manner, I'd probably use the Python Semaphore solution. (Assuming my earlier merge technique wouldn't work.) I believe that semaphores generally slow down thread efficiencies due to their blocking nature.
From the site:
A semaphore is a more advanced lock mechanism. A semaphore has an internal counter rather than a lock flag, and it only blocks if more than a given number of threads have attempted to hold the semaphore. Depending on how the semaphore is initialized, this allows multiple threads to access the same code section simultaneously.
semaphore = threading.BoundedSemaphore()
semaphore.acquire() # decrements the counter
# ... access the shared resource; work with dictionary, add item or whatever.
semaphore.release() # increments the counter
Is there a reason that the dictionary needs to be shared in the first place? Could you have each thread maintain their own instance of a dictionary and either merge at the end of the thread processing or periodically use a call-back to merge copies of the individual thread dictionaries together?
I don't know exactly what you are doing, so keep in mind that my written plan may not work verbatim. What I'm suggesting is more of a high-level design idea.
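A sketch of the merge approach described above - each thread fills its own Counter and the copies are merged once the threads finish; the item lists and names are invented:

```python
import threading
from collections import Counter

def worker(items, local_dicts, index):
    local = Counter()            # each thread owns its dict: no locking needed
    for key in items:
        local[key] += 1
    local_dicts[index] = local   # distinct slots per thread, so no contention

local_dicts = [None, None]
t1 = threading.Thread(target=worker, args=(["a", "b", "a"], local_dicts, 0))
t2 = threading.Thread(target=worker, args=(["b", "c"], local_dicts, 1))
t1.start(); t2.start(); t1.join(); t2.join()

merged = sum(local_dicts, Counter())  # merge at the end of thread processing
print(dict(merged))  # -> {'a': 2, 'b': 2, 'c': 1}
```

Since += on counts is associative, merging at the end gives the same totals as a shared synchronized dict would, without any semaphore overhead during the hot loop.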