This is a question about Python's Twisted SSH library (twisted.conch.ssh).
All the sample code, even production code, I have seen acting as an SSH client based on twisted.conch.ssh interacts with the server in the same mode:
prepare some commands to run remotely;
define call backs;
kick off the reactor, then suspend waiting for new feedback.
After reactor.run(), I have never seen anyone try to deliver further commands to sshd; the script just sits there waiting. I think it would be possible to fork or spawn something to send commands. However, one of Twisted's strengths is its demultiplexing mechanism, so it doesn't have to fork to process incoming requests when running as a server. May I say it is a reasonable requirement not to fork (as a client script) in order to continuously send requests to the server?
Any thoughts on this?
TIA.
joefis' answer is basically sound, but I bet some examples would be helpful. First, there are a few ways you can have some code run right after the reactor starts.
This one is pretty straightforward:
def f():
    print "the reactor is running now"
reactor.callWhenRunning(f)
Another way is to use timed events, although there's probably no reason to do it this way instead of using callWhenRunning:
reactor.callLater(0, f)
You can also use the underlying API which callWhenRunning is implemented in terms of:
reactor.addSystemEventTrigger('after', 'startup', f)
You can also use services. This is a bit more involved, since it involves using using twistd(1) (or something else that's going to hook the service system up to the reactor). But you can write a class like this:
from twisted.application.service import Service
class ThingDoer(Service):
    def startService(self):
        print "The reactor is running now."
And then write a .tac file like this:
from twisted.application.service import Application
from thatmodule import ThingDoer
application = Application("Do Things")
ThingDoer().setServiceParent(application)
And finally, you can run this .tac file using twistd(1):
$ twistd -ny thatfile.tac
Of course, this only tells you how to do one thing after the reactor is running, which isn't exactly what you're asking. It's the same idea, though - you define some event handler and ask to receive an event by having that handler called; when it is called, you get to do stuff. The same idea applies to anything you do with Conch.
You can see this in the Conch examples, for example in sshsimpleclient.py we have:
class CatChannel(channel.SSHChannel):
    name = 'session'

    def openFailed(self, reason):
        print 'echo failed', reason

    def channelOpen(self, ignoredData):
        self.data = ''
        d = self.conn.sendRequest(self, 'exec', common.NS('cat'), wantReply=1)
        d.addCallback(self._cbRequest)

    def _cbRequest(self, ignored):
        self.write('hello conch\n')
        self.conn.sendEOF(self)

    def dataReceived(self, data):
        self.data += data

    def closed(self):
        print 'got data from cat: %s' % repr(self.data)
        self.loseConnection()
        reactor.stop()
In this example, channelOpen is the event handler called when a new channel is opened. It sends a request to the server. It gets back a Deferred, to which it attaches a callback. That callback is an event handler which will be called when the request succeeds (in this case, when cat has been executed). _cbRequest is the callback it attaches, and that method takes the next step: writing some bytes to the channel and then closing it. Then there's the dataReceived event handler, which is called when bytes are received over the channel, and the closed event handler, called when the channel is closed.
So you can see four different event handlers here, some of which are starting operations that will eventually trigger a later event handler.
So to get back to your question about doing one thing after another: if you wanted to open two cat channels, one after the other, then the closed event handler could open a new channel (instead of stopping the reactor as it does in this example).
You're trying to put a square peg in a round hole. Everything in Twisted is asynchronous, so you have to think about the sequence of events differently. You can't say "here are 10 operations to be run one after the other"; that's serial thinking.
In Twisted you issue the first command and register a callback that will be triggered when it completes. When that callback fires you issue the second command and register a callback that will be triggered when that completes. And so on, and so on.
Related
I've been tasked with learning Twisted.
I am also somewhat new to Python in general, but have used other modern programming languages.
In reading over Twisted documentation, I keep running into examples that are
Not complete executable examples
Run in one thread
Coming from other languages, when I use some asynchronous mechanism, there is usually another thread of execution while I carry out some manner of work, then I am notified when that work is completed, and I react to its results.
I do see that it has some built in asynchronous mechanisms, but none of them provide the user with a means to create custom CPU bound asynchronous tasks akin to 'Tasks' in C# or 'work' with boost::asio in C++ that would run in parallel to the main thread.
I see that Twisted provides a means to asynchronously wait on IO and do things on the same thread while waiting, if we are waiting on:
network reads and writes
keyboard input
It also shows me how to:
Do some manner of integration with GUI tool kits to make use of their event loop, but doesn't go into detail.
Schedule tasks using reactor on a timer, but doesn't do that task in parallel to anything else
It talks about async/await, but that is for python 3 only, and I am using python 2.7
I figured some manner of thread pooling must be built into the reactor, but then when I read about the reactor, it says that everything runs on the main thread in reactor.run().
So, I am left confused.
What is the point of deferreds, creating a callback chain and reacting to the results, if we aren't running anything in parallel?
If we are running asynchronous code, how are we making our own custom asynchronous functions? (see keyword async in C#)
In other languages, I might create an async task to count from 1 to 10, while on the main thread, I might count from 'a' to 'z' at the same time. When the task is complete I would get notified via a callback on a thread from a threadpool. I'd have the option to sync up, if I wanted to, by calling some 'wait' method. While the definition of "asynchronous" only involves the posting of the task, the getting of the result, and the callback when it's done... I've never seen it used without doing things in parallel.
I'll address your questions (and statements that seem confusing) one-by-one:
"Examples that are not complete"
Restating what I posted in the comments: see my two previous answers for complete examples ( https://stackoverflow.com/a/30399317/3334178 & https://stackoverflow.com/a/23274411/3334178 ) and go through Krondo's Twisted Introduction
You said you are discounting these because "The examples are the network code in twisted, which has the asynchronisity built in and hidden.". I disagree with that assertion and will explain this in the next section.
"Examples are not asynchronous"
When you're talking about "asynchronous programming" in the vein of Python's twisted/tornado/asyncio (or Node.JS, or C's select/poll/epoll), you're talking about a model/pattern of programming that allows the programmer to shape their code so that parts of it can run while other parts are blocked (in almost all cases the blocking is caused by a part of the program having to wait for IO).
These libraries/languages will certainly have ways they can do threading and/or multiprocessing, but those are layers grafted on top of the async design - and if that's genuinely what you need (i.e. you have an exclusively CPU-bound workload) the async systems are going to be a bad choice.
Let's use your "hidden away" comment to get into this a bit more
"Network examples are async, but the asynchronicity is built in and hidden away"
The fundamental element of the async design is that you write your code so it should never block for IO. You've been calling out the network, but really we are talking about network/disk/keyboard/mouse/sound/serial - anything that (for whatever reason) can run slower than the CPU (and that the OS has a file-descriptor for).
Also, there isn't anything really "hidden away" about how it functions - async programming always uses non-blocking (status-checking / callback) calls for any IO channel it can operate on. If you dig enough in the Twisted codebase, all the async logic is in plain sight (Krondo's tutorial is really good for giving examples of this).
Let me use the keyboard as an example.
In sync code, you would use an input or a read - and the program would pause waiting for that line (or key) to be typed.
In async code (at least in featureful implementations like Twisted) you will fetch the file-descriptor for "input" and register it, along with a callback function to be called when the file-descriptor changes, with the OS-level async engine (select, poll, epoll, etc.).
The act of doing that registration - which takes almost no time - LETS YOU run other logic while the keyboard logic waits for the keyboard event to happen (see the stdio.StandardIO(keyboardobj, sys.stdin.fileno()) line from near the end of my example code in https://stackoverflow.com/a/30399317/3334178).
"[This] leads me to believe there is some other means to use deferreds with asynchronous"
Deferreds aren't magic. They are just clever lists of function callbacks. There are numerous clever ways they can be chained together, but in the end they are just a tool to help you take advantage of the logic above.
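To make the "list of callbacks" idea concrete, here is a toy sketch in plain Python 3 (the MiniDeferred class is invented for illustration; Twisted's real Deferred additionally handles errbacks, chained Deferreds, cancellation, and more):

```python
class MiniDeferred:
    """A toy stand-in for twisted.internet.defer.Deferred: just a
    list of callbacks, each fed the previous callback's result."""

    def __init__(self):
        self.callbacks = []
        self.called = False
        self.result = None

    def addCallback(self, fn):
        if self.called:
            # the result already arrived: run the new callback immediately
            self.result = fn(self.result)
        else:
            self.callbacks.append(fn)
        return self  # allow chaining, like the real Deferred

    def callback(self, result):
        # "fire" the deferred: run the whole chain with the initial result
        self.called = True
        self.result = result
        for fn in self.callbacks:
            self.result = fn(self.result)


d = MiniDeferred()
d.addCallback(lambda x: x + 1).addCallback(lambda x: x * 10)
d.callback(4)    # the chain runs now: (4 + 1) * 10
print(d.result)  # -> 50
```

Nothing here runs in parallel: the callbacks simply don't run until the result shows up, which is exactly the property the reactor exploits to do other work in the meantime.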
"It also talks about async/await, that is for python 3 only, and I am using python 2.7"
async and await are just the Python 3 way of doing what was done in Python 2 with @defer.inlineCallbacks and yield. These systems are shortcuts that rewire code so that, to the reader, the code looks and acts like sync code, but when it's run the code is morphed into a "register a callback and move on" flow.
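A rough sketch of that rewiring, in plain Python 3 with invented names (run_inline and sync_looking_task are made up; Twisted's real inlineCallbacks resumes the generator from a Deferred's callback rather than synchronously, and handles errors, return values, and more):

```python
def run_inline(gen):
    """Toy driver: resume the generator each time its 'pending result'
    is ready. In real Twisted, each yielded Deferred would register
    this resumption as a callback instead of running it right away."""
    try:
        pending = next(gen)              # run until the first yield
        while True:
            result = pending()           # in Twisted: wait for the Deferred to fire
            pending = gen.send(result)   # resume the sync-looking code with the result
    except StopIteration:
        pass                             # the generator finished


log = []

def sync_looking_task():
    # reads top-to-bottom like sync code, but each yield is a suspension point
    a = yield (lambda: 2)        # "yield a Deferred", get its result back
    log.append(a)
    b = yield (lambda: a * 3)
    log.append(b)

run_inline(sync_looking_task())
print(log)  # -> [2, 6]
```

The generator never blocks; it just hands control back to the driver at each yield, which is the same trick async/await performs at the language level.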
"when I read about the reactor, it says that everything runs on the main thread in reactor.run()"
Yes, because (as above) async is about not waiting for IO - it's not about threading or multi-processing.
Your last few questions "point of deferreds" and "how do you make asynchronous" feel like I answered them above - but if not, let me know in the comments, and I'll spell them out.
Also, your comment requesting "an example where we count from 1 to 10 in some deferred function while we count from a to z in the main thread?" doesn't make sense when talking about async (both because you talk about a "thread", which is a different construct, and because those are both (likely) CPU tasks), but I will give you a different example that counts while watching for keyboard input (which is something that definitely DOES make sense when talking about async):
#!/usr/bin/env python
#
# Frankenstein-esque amalgam of example code
# Key of which comes from the Twisted "Chat" example
# (such as: http://twistedmatrix.com/documents/12.0.0/core/examples/chatserver.py)

import sys            # so I can get at stdin
import os             # for isatty
import termios, tty   # access to posix IO settings
from twisted.internet import reactor
from twisted.internet import stdio    # the stdio equiv of listenXXX
from twisted.protocols import basic   # for lineReceiver for keyboard
from twisted.internet import task

class counter(object):
    runs = 0

# module-level so task.LoopingCall can reference it directly
def runEverySecond():
    counter.runs += 1
    print "async counting demo: " + str(counter.runs)

# to set keyboard into cbreak mode - so keys don't require a CR before causing an event
class Cbreaktty(object):
    org_termio = None
    my_termio = None

    def __init__(self, ttyfd):
        if os.isatty(ttyfd):
            self.org_termio = (ttyfd, termios.tcgetattr(ttyfd))
            tty.setcbreak(ttyfd)
            print ' Set cbreak mode'
            self.my_termio = (ttyfd, termios.tcgetattr(ttyfd))
        else:
            raise IOError  # Not something I can set cbreak on!

    def retToOrgState(self):
        (ttyfd, org) = self.org_termio  # named so we don't shadow the tty module
        print ' Restoring terminal settings'
        termios.tcsetattr(ttyfd, termios.TCSANOW, org)

class KeyEater(basic.LineReceiver):
    def __init__(self):
        self.setRawMode()  # Switch from line mode to "however much I got" mode

    def rawDataReceived(self, data):
        key = str(data).lower()[0]
        if key == 'q':
            reactor.stop()
        else:
            print "--------------"
            print "Press:"
            print " q - to cleanly shutdown"
            print "---------------"

# Custom tailored example for SO:56013998
#
# This code is a mishmash of styles and techniques. Both to provide different examples of how
# something can be done and because I'm lazy. It's been built and tested on OSX and Linux,
# and it should be portable (other than perhaps terminal cbreak mode). If you want to ask
# questions about this code contact me directly via mail to mike at partialmesh.com
#
# Once running, press any key in the window where the script was run and it will give
# instructions.

def main():
    try:
        termstate = Cbreaktty(sys.stdin.fileno())
    except IOError:
        sys.stderr.write("Error: " + sys.argv[0] + " only for use on interactive ttys\n")
        sys.exit(1)
    keyboardobj = KeyEater()
    l = task.LoopingCall(runEverySecond)
    l.start(1.0)  # call every second
    stdio.StandardIO(keyboardobj, sys.stdin.fileno())
    reactor.run()
    termstate.retToOrgState()

if __name__ == '__main__':
    main()
(I know technically I didn't use a deferred - but I ran out of time - and this case is a bit too simple to really need one: I don't have a chain of callbacks anywhere, which is what deferreds are for.)
I'm writing an application that uses the Python Twisted API (namely WebSocketClientProtocol, WebSocketClientFactory, ReconnectingClientFactory). I want to wrap the client factory into a Reader with the following interface:
class Reader:
    def start(self):
        pass

    def stop(self):
        pass
The start function will be used to open the connection (i.e. connect to the WS API and start reading data), while stop will stop that connection.
My issue is that if I use reactor.run() inside start, the connection starts and everything is OK, but my code never gets past that line (it looks like a blocking call to me) and I cannot execute subsequent lines (including .stop in my tests).
I have tried variants such as reactor.callFromThread(reactor.run) and reactor.callFromThread(reactor.stop), or even explicitly calling Thread(target=...), but none seems to work (they usually don't build the protocol or open the connection at all).
Any help or guidelines on how to implement Reader.start and Reader.stop are welcome.
If you put reactor.run inside Reader.start then Reader will be a difficult component to use alongside other code. Your difficulties are just the first symptom of this.
Calling reactor.run and reactor.stop are the job of code responsible for managing the lifetime of your application. Put those calls somewhere separate from your WebSocket application code. For example:
r = Reader()
r.start()
reactor.run()
Or better yet, implement a twist(d) plugin and let twist(d) manage the reactor for you.
TL;DR: I have a beautifully crafted, continuously running piece of Python code controlling and reading out a physics experiment. Now I want to add an HTTP API.
I have written a module which controls the hardware using USB. I can script several types of autonomously operating experiments, but I'd like to control my running experiment over the internet. I like the idea of an HTTP API, and have implemented a proof-of-concept using Flask's development server.
The experiment runs as a single process claiming the USB connection and periodically (every 16 ms) all data is read out. This process can write hardware settings and commands, and reads data and command responses.
I have a few problems choosing the 'correct' way to communicate with this process. It works if the HTTP server only has a single worker; then I can use Python's multiprocessing.Pipe for communication.
Using more-or-less low-level sockets (or things like zeromq) should work, even for request/response, but I'd have to implement some sort of protocol: send {'cmd': 'set_voltage', 'value': 900} instead of calling hardware.set_voltage(900) (which I can use in the stand-alone scripts).
I could use some sort of RPC, but as far as I know they all (SimpleXMLRPCServer, Pyro) use some sort of event loop for the 'server' - in this case, the process running the experiment - to process requests. But I can't have an event loop waiting for incoming requests; it should be reading out my hardware!
I googled around quite a bit, but however I try to rephrase my question, I end up with Celery as the answer, which mostly fires off one job after another, but isn't really about communicating with a long-running process.
I'm confused. I can get this to work, but I fear I'll be reinventing a few wheels. I just want to launch my app in the terminal, open a web browser from anywhere, and monitor and control my experiment.
Update: The following code is a basic example of using the module:
from pysparc.muonlab.muonlab_ii import MuonlabII
muonlab = MuonlabII()
muonlab.select_lifetime_measurement()
muonlab.set_pmt1_voltage(900)
muonlab.set_pmt1_threshold(500)
lifetimes = []
while True:
    data = muonlab.read_lifetime_data()
    if data:
        print "Muon decays detected with lifetimes", data
        lifetimes.extend(data)
The module lives at https://github.com/HiSPARC/pysparc/tree/master/pysparc/muonlab.
My current implementation of the HTTP API lives at https://github.com/HiSPARC/pysparc/blob/master/bin/muonlab_with_http_api.
I'm pretty happy with the module (with lots of tests) but the HTTP API runs using Flask's single-threaded development server (which the documentation and the internet tells me is a bad idea) and passes dictionaries through a Pipe as some sort of IPC. I'd love to be able to do something like this in the above script:
while True:
    data = muonlab.read_lifetime_data()
    if data:
        print "Muon decays detected with lifetimes", data
        lifetimes.extend(data)
    process_remote_requests()
where process_remote_requests is a fairly short function to call the muonlab instance or return data. Then, in my Flask views, I'd have something like:
muonlab = RemoteMuonlab()

@app.route('/pmt1_voltage', methods=['GET', 'PUT'])
def get_data():
    if request.method == 'PUT':
        voltage = request.form['voltage']
        muonlab.set_pmt1_voltage(voltage)
    else:
        voltage = muonlab.get_pmt1_voltage()
    return jsonify(voltage=voltage)
Getting the measurement data from the app is perhaps less of a problem, since I could store that in SQLite or something else that handles concurrent access.
But... you do have an IO loop; it runs every 16ms.
You can use a BaseHTTPServer-based server such as SimpleXMLRPCServer in such a case; just set the timeout attribute to something small. Basically:

from SimpleXMLRPCServer import SimpleXMLRPCServer
from time import sleep

class XmlRPCApi:
    def do_something(self):
        print "doing something"

server = SimpleXMLRPCServer(("localhost", 8000))
server.register_instance(XmlRPCApi())
server.timeout = 0  # handle_request() returns immediately when no request is waiting

while True:
    sleep(0.016)
    do_normal_thing()
    server.handle_request()
Edit: Python has a built-in WSGI server (wsgiref.simple_server, also built on BaseHTTPServer) capable of serving a Flask app. Since flask.Flask() happens to be a WSGI-compliant application, your process_remote_requests() could look like this:

import wsgiref.simple_server

# app here is just your Flask() application!
remote_server = wsgiref.simple_server.make_server('localhost', 8000, app)
# as before, set timeout to zero so that you can go right back
# to your event loop if there are no requests to handle
remote_server.timeout = 0

def process_remote_requests():
    remote_server.handle_request()
This works well enough if you have only short running requests; but if you need to handle requests that may possibly take longer than your event loop's normal polling interval, or if you need to handle more requests than you have polls per unit of time, then you can't use this approach, exactly.
You don't necessarily need to fork off another process, though. You can potentially get by using a pool of workers in another thread. Roughly:

import threading
import wsgiref.simple_server

remote_server = wsgiref.simple_server.make_server('localhost', 8000, app)

POOL_SIZE = 10  # or some other value.
pool = [threading.Thread(target=remote_server.serve_forever) for dummy in xrange(POOL_SIZE)]
for thread in pool:
    thread.daemon = True
    thread.start()

while True:
    pass  # normal experiment processing here; don't handle requests in this thread.
However, this approach has one major shortcoming: you now have to deal with concurrency! It's not safe to manipulate your program state as freely as you could with the above loop, since you might be concurrently manipulating that same state in the main thread (or another HTTP server thread). It's up to you to know when this is valid, wrapping each resource with some sort of mutex lock or whatever is appropriate.
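For instance, here is a minimal sketch of that locking (the experiment_state dict and the function names are invented for illustration): both the HTTP worker threads and the main experiment loop must hold the lock before touching the shared state.

```python
import threading

# shared state touched by both the HTTP threads and the main loop
experiment_state = {'voltage': 800}
state_lock = threading.Lock()

def set_voltage(value):
    # called from an HTTP worker thread
    with state_lock:
        experiment_state['voltage'] = value

def read_voltage():
    # called from the main experiment loop
    with state_lock:
        return experiment_state['voltage']

set_voltage(900)
print(read_voltage())  # -> 900
```

Keep the critical sections short (just the state access, never the hardware IO itself), or the HTTP threads will stall your 16 ms readout loop.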
According to the source, BaseServer.shutdown() must be called from a different thread than the one the server is running in.
However I am trying to shut down the server with a specific value provided to the server in a web request.
The request handler obviously runs in this thread so it will still deadlock after I have done this:
httpd = BaseHTTPServer.HTTPServer(('', 80), MyHandler)
print("Starting server in thread")
threading.Thread(target=httpd.serve_forever).start()
How can I accomplish what I want? Must I set up a socket or pipe or something (please show me how to do this, if it is the solution), where the main thread can block and wait on the child thread to send a message, at which point it will be able to call shutdown()?
I am currently able to achieve some "kind of works" behavior by calling "httpd.socket.close()" from the request handler. This generates an [Errno 9] Bad file descriptor error and seems to terminate the process on Windows.
But this is clearly not the right way to go about this.
Update ca. Aug. 2013 I have eloped with node.js to fill the need for robust async I/O, but plenty of side projects (e.g. linear algebra research, various command line tool frontends) keep me coming back to python. Looking back on this question, BaseHTTPServer probably has little practical value comparatively to other options (like various microframeworks that Phil mentions).
1. from the BaseHTTPServer docs
To create a server that doesn’t run forever, but until some condition is fulfilled:
def run_while_true(server_class=BaseHTTPServer.HTTPServer,
                   handler_class=BaseHTTPServer.BaseHTTPRequestHandler):
    """
    This assumes that keep_running() is a function of no arguments which
    is tested initially and after each request. If its return value
    is true, the server continues.
    """
    server_address = ('', 8000)
    httpd = server_class(server_address, handler_class)
    while keep_running():
        httpd.handle_request()
Allow some url call to set the condition to terminate, use whatever strikes your fancy there.
Edit: keep_running is any function you choose, could be as simple as:
def keep_running():
    global some_global_var_that_my_request_controller_will_set
    return some_global_var_that_my_request_controller_will_set

you probably want something smarter and without_ridiculously_long_var_names.
2. BaseHTTPServer is usually lower-level than you want to go. There are plenty of micro-frameworks that might suit your needs.
threading.Event is useful for signalling other threads. e.g.,
please_die = threading.Event()
# in handler
please_die.set()
# in main thread
please_die.wait()
httpd.shutdown()
You might use a Queue if you want to send data between threads.
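A minimal sketch combining the two (the request handler is simulated here by a plain thread, and the message text is invented): the handler puts its payload on a Queue and sets the Event; the main thread wakes up, reads the payload, and is then in the right thread to call httpd.shutdown().

```python
import queue
import threading

please_die = threading.Event()
results = queue.Queue()

def fake_handler():
    # stands in for the request-handler thread deciding to shut down
    results.put('shutdown requested by client 42')
    please_die.set()

threading.Thread(target=fake_handler).start()

please_die.wait()       # main thread blocks here until the handler signals
reason = results.get()  # data handed over from the handler thread
print(reason)
# ... now it is safe to call httpd.shutdown() from this (main) thread
```

Event carries only the "it happened" bit; the Queue carries the data. Using both keeps the hand-off race-free without any explicit locking.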
I wrote a small Python application that runs as a daemon. It utilizes threading and queues.
I'm looking for general approaches to altering this application so that I can communicate with it while it's running. Mostly I'd like to be able to monitor its health.
In a nutshell, I'd like to be able to do something like this:
python application.py start # launches the daemon
Later, I'd like to be able to come along and do something like:
python application.py check_queue_size # return info from the daemonized process
To be clear, I don't have any problem implementing the Django-inspired syntax. What I don't have any idea how to do is to send signals to the daemonized process (start), or how to write the daemon to handle and respond to such signals.
Like I said above, I'm looking for general approaches. The only one I can see right now is telling the daemon to constantly log everything that might be needed to a file, but I hope there's a less messy way to go about it.
UPDATE: Wow, a lot of great answers. Thanks so much. I think I'll look at both Pyro and the web.py/Werkzeug approaches, since Twisted is a little more than I want to bite off at this point. The next conceptual challenge, I suppose, is how to go about talking to my worker threads without hanging them up.
Thanks again.
Yet another approach: use Pyro (Python remoting objects).
Pyro basically allows you to publish Python object instances as services that can be called remotely. I have used Pyro for the exact purpose you describe, and I found it to work very well.
By default, a Pyro server daemon accepts connections from everywhere. To limit this, either use a connection validator (see documentation), or supply host='127.0.0.1' to the Daemon constructor to only listen for local connections.
Example code taken from the Pyro documentation:
Server
import Pyro.core

class JokeGen(Pyro.core.ObjBase):
    def __init__(self):
        Pyro.core.ObjBase.__init__(self)

    def joke(self, name):
        return "Sorry " + name + ", I don't know any jokes."

Pyro.core.initServer()
daemon = Pyro.core.Daemon()
uri = daemon.connect(JokeGen(), "jokegen")

print "The daemon runs on port:", daemon.port
print "The object's uri is:", uri

daemon.requestLoop()
Client
import Pyro.core
# you have to change the URI below to match your own host/port.
jokes = Pyro.core.getProxyForURI("PYROLOC://localhost:7766/jokegen")
print jokes.joke("Irmen")
Another similar project is RPyC. I have not tried RPyC.
What about having it run an http server?
It seems crazy, but running a simple web server for administrating your server requires just a few lines using web.py.
You can also consider creating a unix pipe.
Use werkzeug and make your daemon include an HTTP-based WSGI server.
Your daemon has a collection of small WSGI apps to respond with status information.
Your client simply uses urllib2 to make POST or GET requests to localhost:somePort. Your client and server must agree on the port number (and the URLs).
This is very simple to implement and very scalable. Adding new commands is a trivial exercise.
Note that your daemon does not have to respond in HTML (that's often simple, though). Our daemons respond to the WSGI-requests with JSON-encoded status objects.
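As a sketch of what one such small WSGI app might look like (the status fields here are invented), note that a WSGI app can even be exercised without any HTTP server by calling it directly with an environ dict:

```python
import json

def status_app(environ, start_response):
    # one small WSGI app per command; this one reports daemon health
    body = json.dumps({'queue_size': 3, 'alive': True}).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'application/json'),
                              ('Content-Length', str(len(body)))])
    return [body]

# direct call, no HTTP server needed - handy for testing:
captured = {}

def fake_start_response(status, headers):
    captured['status'] = status

response = b''.join(status_app({'REQUEST_METHOD': 'GET'}, fake_start_response))
print(captured['status'], json.loads(response))  # -> 200 OK {'queue_size': 3, 'alive': True}
```

In the daemon you would mount status_app (and its siblings) under werkzeug's run_simple or any other WSGI server, one URL per command.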
I would use twisted with a named pipe or just open up a socket. Take a look at the echo server and client examples. You would need to modify the echo server to check for some string passed by the client and then respond with whatever requested info.
Because of Python's threading issues you are going to have trouble responding to information requests while simultaneously continuing to do whatever the daemon is meant to do anyways. Asynchronous techniques or forking another processes are your only real option.
# your server
from twisted.web import xmlrpc, server
from twisted.internet import reactor

class MyServer(xmlrpc.XMLRPC):
    def xmlrpc_monitor(self, params):
        return server_related_info

if __name__ == '__main__':
    r = MyServer()
    reactor.listenTCP(8080, server.Site(r))
    reactor.run()
client can be written using xmlrpclib, check example code here.
Assuming you're under *nix, you can send signals to a running program with kill from a shell (and analogs in many other environments). To handle them from within python check out the signal module.
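A minimal sketch of the daemon side (POSIX only; SIGUSR1 is an arbitrary choice, and here the process signals itself so the example is self-contained - in practice you would run `kill -USR1 <pid>` from a shell):

```python
import os
import signal

health_requests = []

def on_usr1(signum, frame):
    # in a real daemon: dump queue sizes, write a status file, etc.
    health_requests.append(signum)

signal.signal(signal.SIGUSR1, on_usr1)

os.kill(os.getpid(), signal.SIGUSR1)  # stands in for: kill -USR1 <pid>
print(health_requests)  # -> [10] on Linux (the SIGUSR1 number)
```

Signals are fine for simple "dump your status" pokes, but they carry no payload; for request/response interaction you'll want one of the socket or RPC approaches above.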
You could accomplish this with Pyro (http://pythonhosted.org/Pyro4/), the Python Remote Objects library. It lets you remotely access Python objects. It's easy to implement, has low overhead, and isn't as invasive as Twisted.
You can do this using multiprocessing managers (https://docs.python.org/3/library/multiprocessing.html#managers):
Managers provide a way to create data which can be shared between different processes, including sharing over a network between processes running on different machines. A manager object controls a server process which manages shared objects. Other processes can access the shared objects by using proxies.
Example server:
from multiprocessing.managers import BaseManager

class RemoteOperations:
    def add(self, a, b):
        print('adding in server process!')
        return a + b

    def multiply(self, a, b):
        print('multiplying in server process!')
        return a * b

class RemoteManager(BaseManager):
    pass

RemoteManager.register('RemoteOperations', RemoteOperations)
manager = RemoteManager(address=('', 12345), authkey=b'secret')
manager.get_server().serve_forever()
Example client:
from multiprocessing.managers import BaseManager

class RemoteManager(BaseManager):
    pass

RemoteManager.register('RemoteOperations')
manager = RemoteManager(address=('localhost', 12345), authkey=b'secret')
manager.connect()
remoteops = manager.RemoteOperations()
print(remoteops.add(2, 3))
print(remoteops.multiply(2, 3))