I need a Python module that supports forwarding command-line arguments to other commands.
argparse makes it easy to parse arguments, but it doesn't provide any "deparsing" tool.
I could just forward sys.argv if I didn't need to delete or change any of the values, but I do.
I can imagine a class that operates on a list of strings the whole time, without losing any information, but I couldn't find one.
Does anybody know of such a tool, or has anyone met a similar problem and found another nice way to handle it?
(Sorry for English :()
If you use the subprocess module to run the commands with delegated arguments you can specify your command as a list of strings that won't be subject to shell parsing (as long as you don't use shell=True). You therefore don't need to bother about quoting concerns the same way you would if you were reconstructing a command line. See https://docs.python.org/2/library/subprocess.html#frequently-used-arguments for further details.
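A minimal sketch of that idea, assuming Python 3.7+ for subprocess.run with capture_output. The helper names rewrite_args and forward are made up for illustration; only subprocess.run itself comes from the standard library.

```python
import subprocess
import sys


def rewrite_args(args, drop=(), replace=None):
    """Return a new argument list with some arguments dropped or replaced."""
    replace = replace or {}
    return [replace.get(arg, arg) for arg in args if arg not in drop]


def forward(command, args):
    """Run `command` with `args` passed as a list, so no shell quoting is involved."""
    return subprocess.run(
        [command, *args],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
```

In a real script you would call something like forward("othercmd", rewrite_args(sys.argv[1:], drop={"--verbose"})). Arguments containing spaces survive untouched because the list is handed to the child process as-is.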
Here's a general example of what I need to do:
For example, I would initiate a back trace by sending the command "bt" to GDB from the program. Then I would search for a word such as "pardrivr" and get the line number associated with it by using regular expressions. Then I would input "f [line_number_of_pardriver]" into GDB. This process would be repeated until the correct information is eventually extracted.
I want to use named pipes in bash or python to accomplish this.
Could someone please provide a simple example of how to do this?
My recommendation is not to do this. Instead there are two more supportable ways to go:
Write your code in Python directly in gdb. Gdb has been extensible in Python for several years now.
Use the gdb MI ("Machine Interface") approach. There are libraries available to parse this already (not sure if there is one in Python but I assume so). This is better than parsing gdb's command-line output because some pains are taken to avoid gratuitous breakage -- this is the preferred way for programs to interact with gdb.
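As a rough sketch of the first option, the backtrace search from the question can be done on frame objects instead of on printed text. This assumes the script is loaded inside gdb (for example with gdb's source command); the function names find_frame and select_frame_by_name are made up, while newest_frame, older, name and select are part of gdb's Python API.

```python
try:
    import gdb  # only importable when running inside a gdb session
except ImportError:
    gdb = None


def find_frame(frame, needle):
    """Walk from `frame` towards the outermost frame and return the first
    frame whose function name contains `needle`, or None if there is none."""
    while frame is not None:
        name = frame.name()  # may be None when no symbol is available
        if name and needle in name:
            return frame
        frame = frame.older()
    return None


def select_frame_by_name(needle):
    """Equivalent of 'bt', grep, then 'f N', without parsing any text."""
    frame = find_frame(gdb.newest_frame(), needle)
    if frame is not None:
        frame.select()
    return frame
```

Inside gdb, select_frame_by_name("pardrivr") would then replace the bt / regex / f N round trip.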
All I can find is this reference:
Is it possible to use POD(plain old documentation) with Python?
which looks like you have to generate a whole separate set of docs to go with code.
I would like to try Python for making cmdline utils, but when I do this with Perl I can embed the docs directly in the source, and use the Pod2Usage module along with Getopt so that any of my scripts can be run like this:
cmd --man
and this triggers the pod system to dump documentation that is embedded in the script in man-page format. It can also generate shorter (synopsis), or medium formats.
It looks like I could use the pydoc code and kind of reverse engineer it to sort-of do the task (at least showing the full documentation), but I am hoping something better already exists.
The python-modargs package lets you create self-documenting command line interfaces. You define a function for each command you want to make available, and the function's docstring becomes the help text for that function. The function's keyword arguments become named arguments and python-modargs will parse inline comments after the keyword arguments to be help text for that argument.
I use python-modargs to generate the command line interface for dexy, here is the module which defines the commands:
https://github.com/ananelson/dexy/blob/027954f9234363d506225d40b675b3d6478994f4/dexy/commands.py#L144
You need to implement a help_command method to get the generated help; it's a one-liner.
I think pydoc may be what you're looking for.
It certainly isn't quite the same as POD, as you have to call pydoc itself (e.g. pydoc myscript.py), but I guess it can be a good starting point.
Of course, you can always add pydoc support to your script by importing pydoc and using its functions/classes.
Check out pydoc's own CLI implementation for the best example.
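For the cmd --man behaviour from the question, one possible sketch keeps the manual in a docstring and pages it with pydoc. The program name cmd, the --man flag and the helper names are assumptions taken from the question or made up; this is not how Pod2Usage works internally, just a Python approximation.

```python
import argparse
import inspect
import pydoc


def build_parser():
    parser = argparse.ArgumentParser(prog="cmd", description="do something useful")
    parser.add_argument("--man", action="store_true",
                        help="show the full manual and exit")
    parser.add_argument("files", nargs="*")
    return parser


def main(argv=None):
    """NAME
        cmd - do something useful

    SYNOPSIS
        cmd [--man] [FILE ...]

    DESCRIPTION
        The long manual text lives here, next to the code it documents.
    """
    args = build_parser().parse_args(argv)
    if args.man:
        pydoc.pager(manual())  # page the embedded manual, man-page style
        return 0
    print("processing", args.files)
    return 0


def manual():
    """Return the embedded manual with its indentation cleaned up."""
    return inspect.getdoc(main)
```

The short form is still available through argparse's own -h/--help, which gives roughly the synopsis-level output.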
I am writing a command-line plugin-based program where the plugins will provide additional functionality on top of whatever I provide.
So for example suppose I wrote a simple script that parsed images and stored them, and that's all I do. Then someone else can write a set of scripts to manipulate the image, putting his scripts in a plugin.
The plugin would be loaded and users can access the plugin by specifying its name in the command line.
It is not uncommon for scripts to want to provide additional options for the user.
So suppose in some years, 20 different plugins have been written.
Now, all of the authors want to allow users to provide options, so the main engine should take the user's options and pass them to the plugin so that it can handle them however it wants.
To keep it uniform, they might agree that certain options should perform a similar operation. Like "-o name" should set the output name to "name". They would then go about implementing their own options and stuff, which the main engine does not know about (of course, it shouldn't know what the plugins do)
I am using the deprecated getopt module, and it will throw exceptions whenever I specify an undefined option. I have heard of optparse and argparse, but I am not sure if these will allow the user to specify any options he wants without the code throwing an exception.
How can I make it so I can specify any command-line option?
argparse lets you partially parse an argument list with the parse_known_args method, returning what was parsed correctly, together with a list of the remaining arguments.
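A small sketch of how the main engine could use it. The option names (-o, a positional plugin name, --radius) are invented for the example; parse_known_args is the real argparse method.

```python
import argparse


def split_args(argv):
    """Parse what the engine knows and return the rest for the plugin."""
    parser = argparse.ArgumentParser(prog="engine")
    parser.add_argument("plugin")             # which plugin to run
    parser.add_argument("-o", dest="output")  # option shared by convention
    known, remaining = parser.parse_known_args(argv)
    return known, remaining
```

With ["blur", "-o", "out.png", "--radius", "3"] the engine gets plugin and output, and the plugin receives the leftover list to parse however it likes.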
The solution you want is probably to treat the command-line arguments as a sort of in-process pipeline, where the position of an option determines which command it belongs to.
command <global options> sub_command <sub_options> new_sub_command <new_sub_options>
Each command shifts options off sys.argv until it finds one it doesn't understand, or one that cannot be a valid option; it then stops parsing arguments, does its job, and returns control to the plugin dispatcher.
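A rough sketch of such a dispatcher, working on a copy of the argument list rather than on sys.argv itself. The spec format (option name mapped to whether it takes a value) and the function names are assumptions made for the example.

```python
def take_options(argv, spec):
    """Pop leading options listed in `spec` off `argv` and stop at the first
    token that is not in `spec`. `spec` maps an option name to True if the
    option takes a value."""
    options = {}
    while argv and argv[0] in spec:
        name = argv.pop(0)
        if spec[name]:
            if not argv:
                raise SystemExit("option %s needs a value" % name)
            options[name] = argv.pop(0)
        else:
            options[name] = True
    return options


def dispatch(argv, global_spec, plugins):
    """Split argv into (command, options) pairs, one per sub-command."""
    argv = list(argv)  # work on a copy
    calls = [("<global>", take_options(argv, global_spec))]
    while argv:
        name = argv.pop(0)
        if name not in plugins:
            raise SystemExit("unknown command: %s" % name)
        calls.append((name, take_options(argv, plugins[name])))
    return calls
```

In a real engine each plugin would own its parsing; the spec dictionaries here only stand in for that.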
I am about to get a bunch of python scripts from an untrusted source.
I'd like to be sure that no part of the code can hurt my system, meaning:
(1) the code is not allowed to import ANY MODULE
(2) the code is not allowed to read or write any data, connect to the network etc
(the purpose of each script is to loop through a list, compute some data from input given to it and return the computed value)
before I execute such code, I'd like to have a script 'examine' it and make sure that there's nothing dangerous there that could hurt my system.
I thought of using the following approach: check that the word 'import' is not used (so we are guaranteed that no modules are imported)
yet, it would still be possible for the user (if desired) to write code to read/write files etc (say, using open).
Then here comes the question:
(1) where can I get a 'global' list of python methods (like open)?
(2) Is there some code that I could add to each script that is sent to me (at the top) that would make some 'global' methods invalid for that script (for example, any use of the keyword open would lead to an exception)?
I know that there are some Python sandboxing solutions, but please try to answer this question, as I feel this is the more relevant approach for my needs.
EDIT: Suppose that I make sure that there is no import in the file, and that no potentially harmful methods (such as open, eval, etc.) are in it. Can I conclude that the file is SAFE? (Can you think of any other 'dangerous' ways that built-in methods can be run?)
This point hasn't been made yet, and should be:
You are not going to be able to secure arbitrary Python code.
A VM is the way to go unless you want security issues up the wazoo.
You can still obfuscate import without using eval:
s = '__imp'
s += 'ort__'
f = globals()['__builtins__'].__dict__[s]
** BOOM **
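To make the point concrete, here is a tiny sketch of the kind of word search the question proposes, run against the snippet above as plain text. The banned-word list and the function name are made up for illustration.

```python
BANNED_WORDS = ("import", "open", "eval", "exec")


def naive_scan(source):
    """Return the banned words that literally appear in the source text."""
    return [word for word in BANNED_WORDS if word in source]


# The obfuscated snippet, kept as text only; it is never executed here.
OBFUSCATED = (
    "s = '__imp'\n"
    "s += 'ort__'\n"
    "f = globals()['__builtins__'].__dict__[s]\n"
)
```

naive_scan(OBFUSCATED) comes back empty, because the word is only assembled at run time.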
Built-in functions.
Keywords.
Note that you'll need to do things like look for both "file" and "open", as both can open files.
Also, as others have noted, this isn't 100% certain to stop someone determined to insert malicious code.
An approach that should work better than string matching is to use the ast module: parse the Python code, do your whitelist filtering on the tree (e.g. allow only basic operations), then compile and run the tree.
See this nice example by Andrew Dalke on manipulating ASTs.
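A minimal sketch of that whitelist idea, assuming Python 3.8+ (where literals are ast.Constant nodes). The set of allowed node types and the function names are assumptions chosen for the example, and passing this check is not a guarantee of safety; for instance it does nothing about scripts that use too much memory or time.

```python
import ast

# Node types the checker accepts. Anything else (calls, attribute access,
# subscripts, imports, function definitions, ...) is rejected.
ALLOWED_NODES = (
    ast.Module, ast.Expr, ast.Assign, ast.AugAssign, ast.For,
    ast.Name, ast.Load, ast.Store, ast.Constant, ast.List,
    ast.BinOp, ast.Add, ast.Sub, ast.Mult, ast.Div,
)


def check_source(source):
    """Parse `source` and raise ValueError on any node outside the whitelist."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("disallowed syntax: " + type(node).__name__)
    return tree


def run_checked(source):
    """Check, compile and run the tree with no builtins; return its variables."""
    tree = check_source(source)
    namespace = {"__builtins__": {}}
    exec(compile(tree, "<untrusted>", "exec"), namespace)
    namespace.pop("__builtins__", None)
    return namespace
```

For example, run_checked("total = 0\nfor x in [1, 2, 3]:\n    total += x * 2\n") returns a namespace where total is 12, while a source containing __import__('os') is rejected at the Call node before anything runs.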
built in functions/keywords:
eval
exec
__import__
open
file
input
execfile
print can be dangerous if you have one of those dumb shells that execute code on seeing certain output
stdin
__builtins__
globals() and locals() must be blocked otherwise they can be used to bypass your rules
There's probably tons of others that I didn't think about.
Unfortunately, crap like this is possible...
object().__reduce__()[0].__globals__["__builtins__"]["eval"]("open('/tmp/l0l0l0l0l0l0l','w').write('pwnd')")
So it turns out that checking keywords, import restrictions, and the symbols that are in scope by default is not enough; you need to verify the entire object graph...
Use a Virtual Machine instead of running it on a system that you are concerned about.
Without a sandboxed environment, it is impossible to prevent a Python file from doing harm to your system aside from not running it.
It is easy to create a Cryptominer, delete/encrypt/overwrite files, run shell commands, and do general harm to your system.
If you are on Linux, you should be able to use docker to sandbox your code.
For more information, see this GitHub issue: https://github.com/raxod502/python-in-a-box/issues/2.
I did come across this on GitHub, so something like it could be used, but it has a lot of limitations.
Another approach would be to create another Python file which parses the original one, removes the bad code, and runs the file. However, that would still be hit-and-miss.
I am working on a series of command line tools which connect to the same server and do related but different things. I'd like users to be able to have a single configuration file where they can place common arguments such as connection information that can be shared across all the tools. Ideally, I'd like something that does the following for me:
If the server address is specified at the command line use this and ignore any other values
If the server address is not specified at the command line but is in a config file that is specified at the command line use this address. Ignore any other values.
If the server address is not specified at the command line or in a config file specified at the command line, but is available in a config file in the user's home directory (say .myapprc), use this value.
If the server address is not specified by any of the above mechanisms, exit with an error message.
The closest I've seen to this is the configparse module, which from what I can tell offers an option parser that will also look at config files, but does not seem to have the notion of "Must be specified somewhere" which I need.
Does anyone know of an existing module that can cover my use case above? If not, a simple extension to optparse, configparse, or some other module I have not reviewed would also be greatly appreciated.
The third-party module configparse is written to extend optparse from the standard Python library. As the optparse docs I pointed to mention, "optparse doesn’t prevent you from implementing required options, but doesn’t give you much help at it either" (though it follows with a couple of URLs that show you ways to do it). Simplest is to use the default value functionality: specify a default value that's not actually a legal value (for something like a server's address, that's pretty easy) -- then, once options are processed, verify that the specified value is legal (which is a good idea anyway!-) and raise the appropriate exception otherwise.
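A short sketch of the sentinel-default trick with plain optparse. The --server option name and the MISSING sentinel are assumptions for the example; a real tool would also validate the address format at the same point.

```python
import optparse

MISSING = "<unset>"  # sentinel that can never be a real server address


def parse_options(argv):
    parser = optparse.OptionParser()
    parser.add_option("--server", dest="server", default=MISSING,
                      help="address of the server (required)")
    options, args = parser.parse_args(argv)
    if options.server == MISSING:
        # prints the usage message to stderr and exits with status 2
        parser.error("--server must be specified")
    return options, args
```

If config-file values are fed in as defaults, the same final check covers the "must be specified somewhere" requirement.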
I've used opster's middleware feature together with SafeConfigParser to achieve a similar (but slightly simpler) effect as you ask. You have to implement the specific logic you described yourself, but it assists you enough to make it relatively painless. An example of opster's middleware use is in its test/test.py example.
Use a dict to store the options for your program.
First parse the option file in the user's home directory and store every option in a dict (configparse or any other module is welcome). Then parse the command line (using any module you want; optparse might fit well). If an argument specifies a config file, parse that file into a dict and update your options from what you read (dict.update is really handy for merging two dicts). Then store all other arguments in another dict and merge them again (dict.update again...).
This way, you are sure that the dict in which you stored the options contains the value you want, whether it was read from the user's file, from the specified config file, or directly from the command line. If it does not contain a required value, exit with an error.
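The layering described above could look roughly like this. The sketch uses argparse and configparser from the Python 3 standard library; the [server] section, the address key and the --config / --address option names are all assumptions made for the example.

```python
import argparse
import configparser
import os


def read_config(path, section="server"):
    """Return one section of an INI file as a dict ({} if it is missing)."""
    parser = configparser.ConfigParser()
    parser.read(path)  # silently skips files that do not exist
    return dict(parser[section]) if parser.has_section(section) else {}


def resolve_options(argv, home_rc="~/.myapprc"):
    cli = argparse.ArgumentParser()
    cli.add_argument("--config")
    cli.add_argument("--address")
    args = cli.parse_args(argv)

    # Lowest priority first; every later update overrides earlier values.
    options = read_config(os.path.expanduser(home_rc))
    if args.config:
        options.update(read_config(args.config))
    options.update({key: value for key, value in vars(args).items()
                    if key != "config" and value is not None})

    if "address" not in options:
        cli.error("server address not given on the command line or in a config file")
    return options
```

Because each tool can call the same function, the shared connection settings only need to be written once in ~/.myapprc.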