How do I drive Ansible programmatically and concurrently?

I would like to use Ansible to execute a simple job on several remote nodes concurrently. The actual job involves grepping some log files and then post-processing the results on my local host (which has software not available on the remote nodes).
The command line ansible tools don't seem well-suited to this use case because they mix together ansible-generated formatting with the output of the remotely executed command. The Python API seems like it should be capable of this though, since it exposes the output unmodified (apart from some potential unicode mangling that shouldn't be relevant here).
A simplified version of the Python program I've come up with looks like this:
from sys import argv
import ansible.inventory
import ansible.runner

runner = ansible.runner.Runner(
    pattern='*', forks=10,
    module_name="command",
    module_args=(
        """
        sleep 10
        """),
    inventory=ansible.inventory.Inventory(argv[1]),
)
results = runner.run()
Here, sleep 10 stands in for the actual log grepping command - the idea is just to simulate a command that's not going to complete immediately.
However, upon running this, I observe that the amount of time taken seems proportional to the number of hosts in my inventory. Here are the timing results against inventories with 2, 5, and 9 hosts respectively:
exarkun@top:/tmp$ time python howlong.py two-hosts.inventory
real 0m24.285s
user 0m0.216s
sys 0m0.120s
exarkun@top:/tmp$ time python howlong.py five-hosts.inventory
real 0m55.120s
user 0m0.224s
sys 0m0.160s
exarkun@top:/tmp$ time python howlong.py nine-hosts.inventory
real 1m57.272s
user 0m0.360s
sys 0m0.284s
exarkun@top:/tmp$
Some other random observations:
ansible all --forks=10 -i five-hosts.inventory -m command -a "sleep 10" exhibits the same behavior
ansible all -c local --forks=10 -i five-hosts.inventory -m command -a "sleep 10" appears to execute things concurrently (but only works for local-only connections, of course)
ansible all -c paramiko --forks=10 -i five-hosts.inventory -m command -a "sleep 10" appears to execute things concurrently
Perhaps this suggests the problem is with the ssh transport and has nothing to do with using ansible via the Python API as opposed to from the command line.
What is wrong here that prevents the default transport from taking only around ten seconds regardless of the number of hosts in my inventory?

Some investigation reveals that ansible is looking for the hosts in my inventory in ~/.ssh/known_hosts. My configuration has HashKnownHosts enabled, and ansible is never able to find the host entries it is looking for because it doesn't understand the hashed known_hosts entry format.
Whenever ansible's ssh transport can't find the known_hosts entry, it acquires a global lock for the duration of the module's execution. The result of this confluence is that all execution is effectively serialized.
A temporary work-around is to give up some security and disable host key checking by putting host_key_checking = False into ~/.ansible.cfg. Another work-around is to use the paramiko transport (but this is incredibly slow, perhaps tens or hundreds of times slower than the ssh transport, for some reason). Another work-around is to let some unhashed entries get added to the known_hosts file for ansible's ssh transport to find.
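For reference, that first work-around is just a two-line config file; it disables host key verification entirely, so only use it where that trade-off is acceptable:
# ~/.ansible.cfg
[defaults]
host_key_checking = False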

Since you have HashKnownHosts enabled, you should upgrade to the latest version of Ansible. Version 1.3 added support for hashed known_hosts; see the bug tracker and changelog. This should solve your problem without compromising security (as your host_key_checking = False workaround does) or sacrificing speed (as your paramiko workaround does).

With the Ansible 2.0 Python API, I switched off host key checking with
import ansible.constants
ansible.constants.HOST_KEY_CHECKING = False
I managed to speed up Ansible considerably by setting the following on the managed computers. Newer sshd versions default to UseDNS no already, I think, so it might not be needed in your case.
/etc/ssh/sshd_config
----
UseDNS no

Related

Will os.system(f"{var}") expose the variable (print/stdout)?

I have some security concerns regarding Python's os.system.
As I couldn't find an answer I would like to ask for your help.
So I have stored the username & password for my database as environment variables.
Then I want to start a server with a shell statement:
os.system(f"server --host 0.0.0.0 " f"{db_type}://{db_user}:{db_pw}@host")
I removed some parts of the statement as they are not relevant for this question.
My question is:
Will my variables db_user or db_pw get exposed somewhere? I am concerned that os.system will print the whole statement, with the variables in the clear, to stdout.
If so, is there a way to prevent it?
The code will run on an ec2/aws.
I know there are other ways to start a server but I am interested in this specific scenario.
Yes, the contents will be exposed. Not necessarily on stdout/stderr, but they can be seen elsewhere.
Take the example
import os

password = 'secret'
os.system(f"echo {password} && sleep 1000")
This starts the command in a new subshell (as per the documentation). The process will be visible in the running process list; start for example top or htop and search for it.
The process list entry shows the full command line, including the content of the password variable.
This is because the complete f-string argument to os.system is evaluated and substituted first. The resulting string is then passed to sh to start a new subshell.
Since any Unix user can list the machine's processes, it's never a good idea to pass secrets via command-line arguments. Passing them via environment variables isn't much better, as the environment can be inspected via cat /proc/$pid/environ.
The best way would be to pass the data via stdin to the subprocess.
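For example, a minimal sketch using subprocess instead of os.system, assuming the server program actually supports reading the secret from stdin (the --password-stdin flag here is made up for illustration):
import os
import subprocess

# The secret travels over stdin, so it never shows up in the process list.
# "--password-stdin" is an assumed flag; substitute whatever your server supports.
db_pw = os.environ["DB_PW"]
proc = subprocess.Popen(
    ["server", "--host", "0.0.0.0", "--password-stdin"],
    stdin=subprocess.PIPE,
    text=True,
)
proc.communicate(input=db_pw + "\n")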

How would a Python script running on Linux call a routine in a Python script running under Wine?

I have a Python (3) script running on Linux, referred to as the main script, which has to call a routine from a proprietary DLL. So far, I have solved this with Wine using the following construct:
# Main script running on Linux
import subprocess
# [...]
subprocess.Popen('echo "python dll_call.py %s" | wine cmd &' % options, shell = True)
# [...]
The script dll_call.py is executed by a Windows Python (3) interpreter installed under Wine. It dumps the return values into a file, which is then picked up by the waiting main script. This is not exactly reliable, and it is agonizingly slow if I have to do it a few times in a row.
I'd like to start the script dll_call.py once as some sort of simple server that exposes the required routine. At the end of the day, I'd like to have a main script looking somewhat like this:
# Main script running on Linux
import subprocess
# [...]
subprocess.Popen('echo "python dll_call_server.py" | wine cmd &', shell = True)
# [...]
return_values = call_into_dll(options)
How can this be implemented best (if speed is required and security not a concern)?
Thank you @jsbueno and @AustinHastings for your answers and suggestions.
For those having similar problems: Inspired by the mentioned answers, I wrote a small Python module for calling into Windows DLLs from Python on Linux. It is based on IPC between a regular Linux/Unix Python process and a Wine-based Python process. Because I have needed it in too many different use-cases / scenarios, I designed it as a "generic" ctypes module drop-in replacement, which does most of the required plumbing automatically in the background.
Example: Assume you're in Python on Linux, you have Wine installed, and you want to call into msvcrt.dll (the Microsoft C runtime library). You can do the following:
from zugbruecke import ctypes
dll_pow = ctypes.cdll.msvcrt.pow
dll_pow.argtypes = (ctypes.c_double, ctypes.c_double)
dll_pow.restype = ctypes.c_double
print('You should expect "1024.0" to show up here: "%.1f".' % dll_pow(2.0, 10.0))
Source code (LGPL), PyPI package & documentation. It's still a bit rough around the edges (i.e. alpha and insecure), but it does handle most types of parameters (including pointers).
You can use the XML-RPC client and server built into Python's stdlib to do what you want. Just make your Wine-Python expose the desired functions as XML-RPC methods, and make an inter-process call from any other Python program to that.
It also works for calling functions running in Jython or IronPython from CPython, and across Python 2 and Python 3 - the examples included in the module documentation should be enough. Just check the docs: https://docs.python.org/2/library/xmlrpclib.html
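A minimal sketch of that split, assuming a free local port (8765 here) and with the actual DLL call stubbed out:
# dll_call_server.py - run with the Windows Python under Wine
from xmlrpc.server import SimpleXMLRPCServer

def call_into_dll(options):
    # load the proprietary DLL with ctypes and call the real routine here
    return "result for %r" % (options,)

server = SimpleXMLRPCServer(("127.0.0.1", 8765), allow_none=True)
server.register_function(call_into_dll)
server.serve_forever()

# In the main script on Linux, the client side is just:
# from xmlrpc.client import ServerProxy
# return_values = ServerProxy("http://127.0.0.1:8765/").call_into_dll(options)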
If you need the calls to be asynchronous on the client side, or the server side to respond to more than one process, you can find other frameworks over which to build the calls - Celery should also work across several different Pythons while preserving call compatibility, and it is certainly enough performance-wise.
You want to communicate between two processes, where one of them is obscured by being under the control of the WINE engine.
My first thought here is to use a very decoupled form of IPC. There are just too many things that can go wrong with tight coupling and something like WINE involved.
And finally, how can this be made easy for someone new to this kind of stuff?
The obvious answer is to set up a web server. There are plenty of tutorials using plenty of packages in Python to respond to HTTP requests, and to generate HTTP requests.
So, set up a little HTTP responder in your WINE process, listen on some non-standard port (not 8080 or 80), and translate requests into calls to your DLL. If you're clever, you'll interpret web requests (http://localhost:10800/functionname?arg1=foo&arg2=bar) into possibly different DLL calls.
On the other side, create an HTTP client in your non-WINE code and make requests to your server.
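A rough sketch of such a responder using only the stdlib, with the port (10800) and the path-to-DLL-call mapping left as assumptions:
# dll_call_server.py - run with the Windows Python under Wine
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

class DLLHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        args = parse_qs(url.query)
        # translate url.path (e.g. "/functionname") and args into a DLL call here
        body = ("called %s with %r" % (url.path, args)).encode()
        self.send_response(200)
        self.end_headers()
        self.wfile.write(body)

HTTPServer(("127.0.0.1", 10800), DLLHandler).serve_forever()
On the Linux side, urllib.request.urlopen("http://127.0.0.1:10800/functionname?arg1=foo") is enough to act as the client.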

Python (or maybe Linux in general) file operation flow control or file lock

I am using a cluster of computers to do some parallel computation. My home directory is shared across the cluster. On one machine, I have Ruby code that creates bash scripts containing the computation commands and writes the scripts to, say, the ~/q/ directory. The scripts are named *.worker1.sh, *.worker2.sh, etc.
On the other 20 machines, I have 20 Python scripts running (one on each machine) that constantly check the ~/q/ directory and look for jobs that belong to that machine, using Python code like this:
import glob, os

jobs = glob.glob('q/*.worker1.sh')
[os.system('sh ' + job + ' &') for job in jobs]
For some additional control, the Ruby code creates an empty file like workeri.start (i = 1..20) in the q directory after it writes the bash script there; the Python code checks for that 'start' file before it runs the code above. In the bash script, if the command finishes successfully, the bash script creates an empty file like workeri.success; the Python code checks for this file after it runs the code above to make sure the computation finished successfully. If Python finds that the computation finished successfully, it removes the 'start' file from the q directory, so the Ruby code knows the job finished successfully. After all 20 bash scripts have finished, the Ruby code creates new bash scripts, and Python reads and executes the new scripts, and so on.
I know this is not an elegant way to coordinate the computation, but I haven't figured out a better way to communicate between different machines.
Now the question is: I expect the 20 jobs to run somewhat in parallel, so the total time to finish all 20 should not be much longer than the time to finish one. However, these jobs seem to run sequentially, and the total time is much longer than I expected.
I suspect that part of the reason is that multiple scripts are reading and writing the same directory at once, and that Linux or Python locks the directory and only allows one process to operate on it at a time, which makes the code execute serially.
I am not sure if this is the case. If I split the bash scripts into different directories, and let the Python code on different machines read and write different directories, will that solve the problem? Or are there other reasons causing this?
Thanks a lot for any suggestions! Let me know if I didn't explain anything clearly.
Some additional info:
My home directory is at /home/my_group/my_home; here is the mount info for it:
:/vol/my_group on /home/my_group type nfs (rw,nosuid,nodev,noatime,tcp,timeo=600,retrans=2,rsize=65536,wsize=65536,addr=...)
When I say the Python code constantly checks the q directory, I mean a loop like this:
while True:
    if os.path.exists('q/worker1.start'):
        # find the scripts and execute them as I mentioned above
        jobs = glob.glob('q/*.worker1.sh')
        [os.system('sh ' + job + ' &') for job in jobs]
I know this is not an elegant way to coordinate the computation, but I haven't figured out a better way to communicate between different machines.
While this isn't directly what you asked, you should really, really consider fixing your problem at this level: some sort of shared message queue is likely to be a lot simpler to manage and debug than relying on the locking semantics of a particular networked filesystem.
The simplest solution to set up and run, in my experience, is Redis on the machine currently running the Ruby script that creates the jobs. It should literally be as simple as downloading the source, compiling it and starting it up. Once the Redis server is up and running, you change your code to append the computation commands to one or more Redis lists. In Ruby you would use the redis-rb library like this:
require "redis"
redis = Redis.new
# Your other code to build up command lists...
redis.lpush 'commands', command1, command2...
If the computations need to be handled by certain machines, use a list per-machine like this:
redis.lpush 'jobs:machine1', command1
# etc.
Then in your Python code, you can use redis-py to connect to the Redis server and pull jobs off the list like so:
import os
from redis import Redis

r = Redis(host="hostname-of-machine-running-redis")
while r.llen('jobs:machine1'):
    job = r.lpop('jobs:machine1')
    os.system('sh ' + job + ' &')
Of course, you could just as easily pull jobs off the queue and execute them in Ruby:
require 'redis'
redis = Redis.new(:host => 'hostname-of-machine-running-redis')
while redis.llen('jobs:machine1') > 0
  job = redis.lpop('jobs:machine1')
  `sh #{job} &`
end
With some more details about the needs of the computation and the environment it's running in, it would be possible to recommend even simpler approaches to managing it.
Try a while loop? If that doesn't work, on the Python side try using a try statement like so:
try:
    with open("myfile.whatever", "r") as f:
        f.read()
except OSError:
    # do something if it doesn't work, perhaps pass (must be in a loop to constantly check this)
    pass
else:
    pass  # execute your code if successful

can a python script know that another instance of the same script is running... and then talk to it?

I'd like to prevent multiple instances of the same long-running python command-line script from running at the same time, and I'd like the new instance to be able to send data to the original instance before the new instance commits suicide. How can I do this in a cross-platform way?
Specifically, I'd like to enable the following behavior:
"foo.py" is launched from the command line, and it will stay running for a long time-- days or weeks until the machine is rebooted or the parent process kills it.
every few minutes the same script is launched again, but with different command-line parameters
when launched, the script should see if any other instances are running.
if other instances are running, then instance #2 should send its command-line parameters to instance #1, and then instance #2 should exit.
instance #1, if it receives command-line parameters from another script, should spin up a new thread and (using the command-line parameters sent in the step above) start performing the work that instance #2 was going to perform.
So I'm looking for two things: how can a python program know another instance of itself is running, and then how can one python command-line program communicate with another?
Making this more complicated, the same script needs to run on both Windows and Linux, so ideally the solution would use only the Python standard library and not any OS-specific calls. Although if I need to have a Windows codepath and an *nix codepath (and a big if statement in my code to choose one or the other), that's OK if a "same code" solution isn't possible.
I realize I could probably work out a file-based approach (e.g. instance #1 watches a directory for changes and each instance drops a file into that directory when it wants to do work) but I'm a little concerned about cleaning up those files after a non-graceful machine shutdown. I'd ideally be able to use an in-memory solution. But again I'm flexible, if a persistent-file-based approach is the only way to do it, I'm open to that option.
More details: I'm trying to do this because our servers are using a monitoring tool which supports running python scripts to collect monitoring data (e.g. results of a database query or web service call) which the monitoring tool then indexes for later use. Some of these scripts are very expensive to start up but cheap to run after startup (e.g. making a DB connection vs. running a query). So we've chosen to keep them running in an infinite loop until the parent process kills them.
This works great, but on larger servers 100 instances of the same script may be running, even if they're only gathering data every 20 minutes each. This wreaks havoc with RAM, DB connection limits, etc. We want to switch from 100 processes with 1 thread to one process with 100 threads, each executing the work that, previously, one script was doing.
But changing how the scripts are invoked by the monitoring tool is not possible. We need to keep the invocation the same (launch a process with different command-line parameters) but change the scripts to recognize that another one is active, and have the "new" script send its work instructions (from the command-line params) over to the "old" script.
BTW, this is not something I want to do on a one-script basis. Instead, I want to package this behavior into a library which many script authors can leverage-- my goal is to enable script authors to write simple, single-threaded scripts which are unaware of multi-instance issues, and to handle the multi-threading and single-instancing under the covers.
The Alex Martelli approach of setting up a communications channel is the appropriate one. I would use a multiprocessing.connection.Listener to create a listener, with the address family of your choice. Documentation at:
http://docs.python.org/library/multiprocessing.html#multiprocessing-listeners-clients
Rather than using AF_INET (sockets) you may elect to use AF_UNIX for Linux and AF_PIPE for Windows. Hopefully a small "if" wouldn't hurt.
Edit: I guess an example wouldn't hurt. It is a basic one, though.
#!/usr/bin/env python
from multiprocessing.connection import Listener, Client
import socket

def myloop(address):
    try:
        # Try to become the server; this fails if another instance already owns the address.
        listener = Listener(*address)
        conn = listener.accept()
        serve(conn)
    except socket.error:
        # The address is taken, so another instance is running: act as a client instead.
        conn = Client(*address)
        conn.send('this is a client')
        conn.send('close')

def serve(conn):
    while True:
        msg = conn.recv()
        if msg.upper() == 'CLOSE':
            break
        print(msg)
    conn.close()

if __name__ == '__main__':
    address = ('/tmp/testipc', 'AF_UNIX')
    myloop(address)
This works on OS X, so it needs testing with both Linux and (after substituting the right address) Windows. A lot of caveats exist from a security point of view, the main one being that conn.recv unpickles its data, so you are almost always better off with recv_bytes.
The general approach is to have the script, on startup, set up a communication channel in a way that's guaranteed to be exclusive (other attempts to set up the same channel fail in a predictable way) so that further instances of the script can detect the first one's running and talk to it.
Your requirements for cross-platform functionality strongly point towards using a socket as the communication channel in question: you can designate a "well known port" that's reserved for your script, say 12345, and open a socket on that port listening to localhost only (127.0.0.1). If the attempt to open that socket fails, because the port in question is "taken", then you can connect to that port number instead, and that will let you communicate with the existing script.
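A minimal sketch of that check, with the "well known" port from above; the first instance would go on to accept() connections and read parameters from later instances:
import socket
import sys

PORT = 12345  # well-known port reserved for this script

try:
    # Only one process can bind the port; succeeding makes us the primary instance.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", PORT))
    server.listen(5)
except OSError:
    # The port is taken: another instance is running, so hand it our parameters and exit.
    with socket.create_connection(("127.0.0.1", PORT)) as client:
        client.sendall(" ".join(sys.argv[1:]).encode())
    sys.exit(0)
# ... primary instance: keep `server` open, accept() connections, spin up worker threads ...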
If you're not familiar with socket programming, there's a good HOWTO doc here. You can also look at the relevant chapter in Python in a Nutshell (I'm biased about that one, of course;-).
Perhaps try using sockets for communication?
Sounds like your best bet is sticking with a pid file, but have it contain not only the process id but also the port number that the prior instance is listening on. So when starting up, check for the pid file; if it is present, see if a process with that id is running. If so, send your data to it and quit; otherwise overwrite the pid file with the current process's info.
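A rough sketch of that pid-file variant; the file location is an assumption, and the os.kill(pid, 0) liveness check is POSIX-only, so Windows would need a different check:
import os
import socket
import sys

PIDFILE = "/tmp/foo.pid"  # assumed location

def running_instance_port():
    try:
        with open(PIDFILE) as f:
            pid, port = map(int, f.read().split())
        os.kill(pid, 0)  # raises OSError if no process with that id exists
        return port
    except (OSError, ValueError):
        return None

port = running_instance_port()
if port is not None:
    # A prior instance is alive: send it our command-line parameters and quit.
    with socket.create_connection(("127.0.0.1", port)) as conn:
        conn.sendall(" ".join(sys.argv[1:]).encode())
    sys.exit(0)
# Otherwise start listening on a port and overwrite PIDFILE with "<pid> <port>".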

Safely executing user-submitted python code on the server

I am looking into starting a project which involves executing python code that the user enters via a HTML form. I know this can be potentially lethal (exec), but I have seen it done successfully in at least one instance.
I sent an email off to the developers of the Python Challenge and I was told they are using a solution they came up with themselves, and they only let on that they are using "security features provided by the operating system" and that "the operating system [Linux] provides most of the security you need if you know how to use it."
Would anyone know how a safe and secure way to go about doing this? I thought about spawning a new VM for every submission, but that would have way too much overhead and be pert-near impossible to implement efficiently.
On a modern Linux, in addition to chroot(2), you can restrict a process further by using clone(2) instead of fork(2). There are several interesting clone(2) flags:
CLONE_NEWIPC (new namespace for semaphores, shared memory, message queues)
CLONE_NEWNET (new network namespace - nice one)
CLONE_NEWNS (new set of mountpoints)
CLONE_NEWPID (new set of process identifiers)
CLONE_NEWUTS (new hostname, domainname, etc)
Previously this functionality was implemented in OpenVZ and then merged upstream, so there is no need for a patched kernel anymore.
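From Python, the easiest way to experiment with these flags is to wrap the untrusted command in unshare(1) from util-linux; this sketch assumes root (or working user namespaces) and a placeholder untrusted.py:
import subprocess

# Run the untrusted code with fresh IPC, network, mount, PID and UTS namespaces.
subprocess.run(
    ["unshare", "--ipc", "--net", "--mount", "--pid", "--uts", "--fork",
     "python", "untrusted.py"],
    check=True,
)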
http://codepad.org/about has implemented such a system successfully (as a public code pasting/running service!)
codepad.org is an online compiler/interpreter, and a simple collaboration tool. It's a pastebin that executes code for you. [...]
How it works
Code execution is handled by a supervisor based on geordi. The strategy is to run everything under ptrace, with many system calls disallowed or ignored. Compilers and final executables are both executed in a chroot jail, with strict resource limits. The supervisor is written in Haskell.
[...]
When your app is remote code execution, you have to expect security problems. Rather than rely on just the chroot and ptrace supervisor, I've taken some additional precautions:
The supervisor processes run on virtual machines, which are firewalled such that they are incapable of making outgoing connections.
The machines that run the virtual machines are also heavily firewalled, and restored from their source images periodically.
If you run the script as user nobody (on Linux), it can write practically nowhere and read no data that has its permissions set up properly. But it could still cause a DoS attack by, for example:
filling up /tmp
eating all RAM
eating all CPU
Furthermore, outside network connections can be opened, etcetera etcetera. You can probably lock all these down with kernel limits, but you are bound to forget something.
So I think that a virtual machine with no access to the network or the real hard drive would be the only (reasonably) safe route. Perhaps the developers of the Python Challenge use KVM which is, in principle, "provided by the operating system".
For efficiency, you could run all submissions in the same VM. That saves you much overhead, and in the worst-case scenario they only hamper each other, but not your server.
Using chroot (Wikipedia) may be part of the solution, e.g. combined with ulimit and some other common (or custom) tools.
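For the ulimit part, a minimal sketch with the stdlib resource module; the specific limits are arbitrary, and they only bound CPU time, memory and file size, not everything an attacker might try:
import resource
import subprocess

def set_limits():
    # Applied in the child (via preexec_fn) just before exec'ing the untrusted code.
    resource.setrlimit(resource.RLIMIT_CPU, (5, 5))              # 5 seconds of CPU
    resource.setrlimit(resource.RLIMIT_AS, (256 * 2**20,) * 2)   # 256 MiB of address space
    resource.setrlimit(resource.RLIMIT_FSIZE, (1 * 2**20,) * 2)  # 1 MiB max file size

subprocess.run(["python", "untrusted.py"], preexec_fn=set_limits, timeout=30)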
