Python: parse commandline - python

I'm writing a CLI application in Python which is driven by a rather elaborate commandline language. The idea is very similar to find(1), which arguably has the same property.
Currently, the parser is completely handwritten using a handmade EBNF description language. The problem is that this language is very awkward to use because I have to write everything as python structures. I also feel that my program is still way too bloated because of the parsing.
Is there any lib that features ease of use, and a true description language (input as string/document) for commandline parsing? From the syntax tree, I would like to directly map each item to a class instance. Naturally, I don't want a tokenizer, or at least the tokenizer must map straight from commandline arguments to tokens.
Thanks for all suggestions!
UPDATE: The whole point of my program is to generate objects and pass them through any number of filters (possibly impure/effectful actions) that might or might not output the objects again, or might even output objects of another type. The general idea is obviously gleaned from find(1). An example commandline would be:
~/picdb.py -sqlselect 'select * from pics where dirname like "testdir%"' -tagged JoSo -updateFromFile [ -resx +300 -or -resX +200 -resY +500 ] -printfXml '<jpegfile><src>%fp</src><DateTimeOriginal>%ed</DateTimeOriginal><Manufacturer>%eM</Manufacturer><Model>%em</Model></jpegfile>%NL'

This is a very tricky problem...You can "bind" actions to commandline arguments using argparse quite easily (e.g. create a class, operate on a previously created class ...). Here's a silly example of that...(argument --foo creates an object, argument --bar modifies the object created by --foo).
from argparse import ArgumentParser, Action
class Foo(object):
    def __init__(self, *args):
        self.args = args
    def __str__(self):
        return str(self.args)
class FooAction(Action):
    def __call__(self, parser, namespace, values, option_string=None):
        setattr(namespace, self.dest, Foo(*values))  # Add Foo to the options...
class BarAction(Action):
    def __call__(self, parser, namespace, values, option_string=None):
        FooObj = getattr(namespace, 'foo')  # raises an error if foo isn't in namespace...
        # In this way, BarAction is like a filter on the
        # object created by foo.
        FooObj.args = tuple(list(FooObj.args) + list(values))  # append to the list of args.
parser = ArgumentParser()
parser.add_argument('--foo', nargs='*', action=FooAction, help="Foo!")
parser.add_argument('--bar', nargs='*', action=BarAction, help="Bar! : Must be used after --foo")
namespace = parser.parse_args("--foo Hello World --bar Nice Day".split())
print(namespace)
print(namespace.foo)
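However, this is a little different from yours in that your options are long names with a single dash (-argument). argparse does accept those (add_argument('-tagged') works), although the conventional forms are -a and --argument. Whether that is a deal breaker for you, I'm not sure...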
The next difficulty is dealing with the brackets... [ and ]. If you can treat those as arguments to a different commandline option, you might be OK...You might be able to set up a second parser to parse out the inside portions -- but I've never tried anything like that before... (If anyone else has any ideas about how to deal with the brackets, I'd be very interested to hear them).
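One way to approach the brackets, sketched here under the assumption that [ and ] always arrive as separate arguments (as in the example commandline), is to group the tokens into nested lists before any option parsing happens. The function name group_brackets is made up for this illustration; each inner list could then be handed to a second parser.

```python
def group_brackets(tokens):
    """Turn a flat token list into nested lists, one level per [ ... ] pair."""
    stack = [[]]  # stack[-1] is the group currently being filled
    for token in tokens:
        if token == "[":
            stack.append([])
        elif token == "]":
            if len(stack) == 1:
                raise ValueError("unmatched ']'")
            group = stack.pop()
            stack[-1].append(group)
        else:
            stack[-1].append(token)
    if len(stack) != 1:
        raise ValueError("unmatched '['")
    return stack[0]


argv = ["-tagged", "JoSo", "-updateFromFile",
        "[", "-resx", "+300", "-or", "-resX", "+200", "-resY", "+500", "]"]
nested = group_brackets(argv)
print(nested)
```

This only handles the grouping; deciding what -or means inside a group is still up to the code that consumes the nested lists.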
As far as optparse and getopt are concerned, I'm pretty sure that anything you can do with them, you can do with argparse, which is why I've left them out of the discussion.

There are at least three modules you could try: argparse, optparse (deprecated since 2.7) and getopt. See chapter 15 of the Python standard library manual.

Related

python string to a function call with arguments, without using eval

I have a string stored in a database that stands for a class instance creation, for example module1.CustomHandler(filename="abc.csv", mode="rb"), where CustomHandler is a class defined in module1.
I would like to evaluate this string to create a class instance for a one time use. Right now I am using something like this
statement = r'module1.CustomHandler(filename="abc.csv", mode="rb")' # actually read from db
exec(f'from parent.module import {statement.split(".")[0]}')
func_or_instance = eval(statement) # this is what I need
Only knowledgeable developers can insert such records into the database, so I am not worried about eval running unwanted code. But I've read several posts saying eval is unsafe and there is always a better way. Is there a way I can achieve this without using eval?
You might want to take a look at Python's ast module (the name stands for abstract syntax trees). It is mainly used to process the grammar of the language and to work with code held in strings; the official documentation lists many more functions.
In this case eval() looks like the clearest and most readable solution, but it is safe only under certain conditions: it executes whatever expression the string contains, so it must never see untrusted input. That is the main reason eval is considered unsafe. It will also throw an exception if the string refers to a class or name that has not been imported.
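As a rough sketch of how the ast module could replace eval here: parse the string, check that it has the shape module.Name(...), read the arguments with ast.literal_eval, and look the class up with importlib. The function name instantiate is invented for this example, it assumes the module is importable by its bare name (the question imports from parent.module instead), and it is demonstrated on datetime.timedelta because module1 is not available here.

```python
import ast
import importlib


def instantiate(statement):
    """Build the object described by a string such as 'module.Name(arg, key=value)'."""
    call = ast.parse(statement, mode="eval").body
    if not (isinstance(call, ast.Call)
            and isinstance(call.func, ast.Attribute)
            and isinstance(call.func.value, ast.Name)):
        raise ValueError("expected something of the form module.Name(...)")
    if any(keyword.arg is None for keyword in call.keywords):
        raise ValueError("**kwargs unpacking is not supported")

    # literal_eval only accepts literals (strings, numbers, lists, dicts, ...),
    # so names and nested calls in the arguments are rejected with ValueError.
    args = [ast.literal_eval(node) for node in call.args]
    kwargs = {keyword.arg: ast.literal_eval(keyword.value) for keyword in call.keywords}

    module = importlib.import_module(call.func.value.id)
    factory = getattr(module, call.func.attr)
    return factory(*args, **kwargs)


delta = instantiate('datetime.timedelta(days=1, hours=2)')
print(delta)
```

This still imports whatever module the string names and calls whatever attribute it names, so the records must remain trusted; what it rules out is arbitrary expressions in the arguments.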

python getopt - what's the second returned list?

I'm learning getopt and I don't understand this construction:
opts, argv = getopt.getopt(argv, "s:e:", ["start_date=", "end_date="])
I tried printing opts and argv for debugging. The first variable, opts, is a list of key-value pairs of the options used and their values, which is easy to understand, but I completely fail to understand the purpose of the argv variable. The original documentation says:
the second is the list of program arguments left after the option list
was stripped (this is a trailing slice of args).
So, when I run my script like this: python --argument AAA ggg, the argv variable will hold a list with one element, "ggg". But what for? Could you please give me a use case for that?
That is meant for positional arguments. Often they are used for filenames that refer to some kind of input or output.
Say you're writing a text editor, you could say that all positional arguments refer to files to open in your editor.
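A small illustration of that split, reusing the option spec from the question; the argument values and file names are made up:

```python
import getopt

# Two options followed by two positional arguments (hypothetical values).
argv = ["--start_date", "2020-01-01", "-e", "2020-01-31", "input.txt", "output.txt"]

opts, remaining = getopt.getopt(argv, "s:e:", ["start_date=", "end_date="])

print(opts)       # the recognised options as (name, value) pairs
print(remaining)  # everything left over: the positional arguments
```

Here remaining holds the two file names, which the program is free to interpret however it likes.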
I would not recommend using getopt, by the way. argparse does everything getopt does but does it better, and allows you to do things getopt just can't do. The Python documentation agrees with me:
The getopt module is a parser for command line options whose API is designed to be familiar to users of the C getopt() function. Users who are unfamiliar with the C getopt() function or who would like to write less code and get better help and error messages should consider using the argparse module instead. (source)

Making API from Python docstrings written in PEP8

I've written some code in Python. I tried to follow common guidelines about writing helpful comments at the beginnings of functions. My style is PEP8, e.g.
def __init__(self, f_name=None, list_=None, cut_list=None, n_events=None, description=None):
    """
    Parse an LHCO or ROOT file into a list of Event objects.
    It is possible to initialize an Events class without a LHCO file,
    and later append events to the list.
    Arguments:
    f_name -- Name of an LHCO or ROOT file, including path
    list_ -- A list for initializing Events
    cut_list -- Cuts applied to events and their acceptance
    n_events -- Number of events to read from LHCO file
    description -- Information about events
    """
I want to automatically generate a helpful API from my code. I've found a few options and was looking at Sphinx in particular. It seemed to do what I wanted (though I struggled to make it generate an API, rather than a manual for my program). The drawback, however, was that it has its own expected style for docstrings:
"""
:param x: My parameter
:type x: Its type
"""
Is it really best for me to rewrite all my docstrings with this syntax? They produce a nice API, but I don't like them in the code, and it'll be time consuming if it turns out to be a bad idea. What is standard, best practice? Should I convert? If so, can something do it automatically for me?
The default Sphinx docstring format is quite powerful and is definitely worth the time if you want to generate clean API documentation, or if you need to review your own code months or years from now. So yes, it is a good idea.
If you don't like the default Sphinx reST syntax, you could try writing your docstrings the way NumPy does, e.g.:
def func(arg1, arg2):
    """Summary line.

    Extended description of function.

    Parameters
    ----------
    arg1 : int
        Description of arg1
    arg2 : str
        Description of arg2

    Returns
    -------
    bool
        Description of return value
    """
    return True
There's a Sphinx extension (Napoleon) which allows Sphinx to parse this style (or the Google style, which is even simpler).
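Enabling it is a matter of listing the extension in the project's conf.py. A minimal fragment, assuming autodoc is what pulls the docstrings in:

```python
# conf.py (Sphinx project configuration), only the relevant setting shown
extensions = [
    "sphinx.ext.autodoc",   # pulls documentation out of docstrings
    "sphinx.ext.napoleon",  # understands NumPy- and Google-style sections
]
```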
I think that the Sphinx syntax is pretty lightweight (be glad it's not Javadoc) so in terms of pretty raw code it's not a serious disadvantage.
My IDE, PyCharm, automatically creates skeletons in the Sphinx style when I add a docstring to a function. So there's some developers who know a thing or two about Python (and who also like to push for PEP8 style in other areas a lot) and recommend Sphinx. PyCharm even has a type hinting system used for inference and type checking, which starts by checking the declarations in the docstring.
Here's a regex you can use to make the conversion automatically. Replace
^(\s+)(\w+) -- (.+)$
with
$1:param $2: $3\n$1:type $2:
where $n represents the nth group. Of course you will need to fill out the type yourself.
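The same substitution can be sketched with Python's re module. Two adjustments are assumptions of this sketch rather than part of the regex above: group references are written \1 instead of $1, and the leading \s+ is narrowed to [ \t]+ so that, in multi-line mode, the indentation group cannot swallow a preceding blank line. The function name to_sphinx_fields is made up.

```python
import re

# One "name -- description" line, indented, as in the docstring from the question.
ARGUMENT_LINE = re.compile(r"^([ \t]+)(\w+) -- (.+)$", re.MULTILINE)


def to_sphinx_fields(docstring):
    """Rewrite 'name -- text' lines as ':param name: text' plus an empty ':type name:'."""
    return ARGUMENT_LINE.sub(r"\1:param \2: \3\n\1:type \2:", docstring)


before = (
    "    f_name -- Name of an LHCO or ROOT file, including path\n"
    "    n_events -- Number of events to read from LHCO file\n"
)
after = to_sphinx_fields(before)
print(after)
```

The type fields still come out empty and have to be filled in by hand.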

Hot swapping python code (duck type functions?)

I've been thinking about this far too long and haven't gotten any idea, maybe some of you can help.
I have a folder of python scripts, all of which have the same surrounding body (literally, I generated it from a shell script), but have one chunk that differs from file to file. In other words:
Top piece of code (always the same)
Middle piece of code (changes from file to file)
Bottom piece of code (always the same)
And I realized today that this is a bad idea, for example, if I want to change something from the top or bottom sections, I need to write a shell script to do it. (Not that that's hard, it just seems like it's very bad code wise).
So what I want to do, is have one outer python script that is like this:
Top piece of code
Dynamic function that calls the middle piece of code (based on a parameter)
Bottom piece of code
And then every other python file in the folder can simply be the middle piece of code. However, a normal module wouldn't work here (unless I'm mistaken), because I would get the code I need to execute from the argument, which would be a string, and thus I wouldn't know which function to run until runtime.
So I thought up two more solutions:
I could write up a bunch of if statements, one to run each script based on a certain parameter. I rejected this, as it's even worse than the previous design.
I could use:
os.system("%s scriptName.py" % sys.executable)
which would run the script, but calling python to call python doesn't seem very elegant to me.
So does anyone have any other ideas? Thank you.
If you know the name of the function as a string and the name of module as a string, then you can do
mod = __import__(module_name)
fn = getattr(mod, fn_name)
fn()
Another possible solution is to have each of your repetitive files import the functionality from the main file
from topAndBottom import top, bottom
top()
# do middle stuff
bottom()
In addition to the several answers already posted, consider the Template Method design pattern: make an abstract class such as
class Base(object):
    def top(self): ...
    def bottom(self): ...
    def middle(self): raise NotImplementedError
    def doit(self):
        self.top()
        self.middle()
        self.bottom()
Every pluggable module then makes a class which inherits from this Base and must override middle with the relevant code.
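To make that concrete, here is a small runnable sketch. The log attribute and the Resize subclass are invented for the illustration; in real use top and bottom would hold the shared code and each plugin would supply its own middle.

```python
class Base(object):
    def __init__(self):
        self.log = []  # records what ran, purely for demonstration

    def top(self):
        self.log.append("top")

    def bottom(self):
        self.log.append("bottom")

    def middle(self):
        raise NotImplementedError

    def doit(self):
        self.top()
        self.middle()
        self.bottom()


class Resize(Base):  # hypothetical plugin: only the middle part differs
    def middle(self):
        self.log.append("resize")


job = Resize()
job.doit()
print(job.log)
```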
Perhaps not warranted for this simple case (you do still have to import the right module in order to instantiate its class and call doit on it), but still worth keeping in mind (together with its many Pythonic variations, which I have amply explained in many tech talks now available on youtube) for cases where the number or complexity of "pluggable pieces" keeps growing -- Template Method (despite its horrid name;-) is a solid, well-proven and highly scalable pattern [[sometimes a tad too rigid, but that's exactly what I address in those many tech talks -- and that problem doesn't apply to this specific use case]].
However, a normal module wouldn't work here (unless I'm mistaken), because I would get the code I need to execute from the argument, which would be a string, and thus I wouldn't know which function to run until runtime.
It will work just fine: use the __import__ builtin or, if you have a very complex layout, the imp module (superseded by importlib in Python 3) to import your script. You can then get the function with module.__dict__[funcname], for example.
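A minimal sketch of that lookup using importlib.import_module, which is the documented front end to __import__. The helper name run is made up, and the standard math module stands in for one of the scripts in the folder:

```python
import importlib


def run(module_name, function_name, *args):
    """Import a module by name at runtime and call one of its functions."""
    module = importlib.import_module(module_name)
    function = getattr(module, function_name)
    return function(*args)


# Both names arrive as plain strings, e.g. from sys.argv.
result = run("math", "sqrt", 9)
print(result)
```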
Importing a module (as explained in other answers) is definitely the cleaner way to do this, but if for some reason that doesn't work, as long as you're not doing anything too weird you can use exec. It basically runs the content of another file as if it were included in the current file at the point where exec is called. It's the closest thing Python has to a source statement of the kind included in many shells. As a bare minimum, something like this should work:
exec(open(filename).read(None))
How about this?
def do_thing_one():
    pass
def do_thing_two():
    pass
dispatch = {"one": do_thing_one,
            "two": do_thing_two,
            }
# do something to get your string from the command line (optparse, argv, whatever)
# and put it in variable "mystring"
# do top thing
f = dispatch[mystring]
f()
# do bottom thing

What do you wish you'd known about when you started learning Python? [closed]

I've decided to learn Python 3. For those that have gone before, what did you find most useful along the way and wish you'd known about sooner?
I learned Python back before the 1.5.2 release, so the things that were key for me back then may not be the key things today.
That being said, one thing took me a little while to realize and I now consider it crucial: much functionality that other languages would make intrinsic is actually made available by the standard library and the built-ins.
The language itself is small and simple, but until you're familiar with the built-ins and the "core parts" of the standard library (e.g., nowadays, sys, itertools, collections, copy, ...), you'll be reinventing the wheel over and over. So, the more time you invest in getting familiar with those parts, the smoother your progress will be. Every time you have a task you want to do, that doesn't seem to be directly supported by the language, first ask yourself: what built-ins or modules in the standard library will make the task much simpler, or even do it all for me? Sometimes there won't be any, but more often than not you'll find excellent solutions by proceeding with this mindset.
I wished I didn't know Java.
More functional programming. (see itertools module, list comprehension, map(), reduce() or filter())
List comprehension (makes a list cleanly):
[x for x in y if x > z]
Generator expression (same as a list comprehension, but nothing is evaluated until it is used):
(x for x in y if x > z)
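A short sketch of the difference, with made-up values for y and z:

```python
numbers = [1, 5, 8, 12]
threshold = 4

# The list comprehension builds the whole list immediately.
as_list = [x for x in numbers if x > threshold]

# The generator expression produces values one at a time, on demand.
lazy = (x for x in numbers if x > threshold)
first = next(lazy)   # only now is the first matching value computed
rest = list(lazy)    # consumes whatever is left

print(as_list, first, rest)
```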
Two brain-cramping things. One of which doesn't apply to Python 3.
a = 095
Doesn't work. Why? The leading zero is an octal literal. The 9 is not valid in an octal literal.
def foo( bar=[] ):
    bar.append( 1 )
    return bar
Doesn't work. Why? The mutable default object gets reused.
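A small demonstration of the reuse, next to the usual None-default workaround. The function names are made up for the comparison:

```python
def append_shared(bar=[]):
    # The default list is created once, when the def statement runs.
    bar.append(1)
    return bar


def append_fresh(bar=None):
    # A new list is created on every call that omits the argument.
    if bar is None:
        bar = []
    bar.append(1)
    return bar


first = append_shared()
second = append_shared()
print(first is second, second)           # the same list, grown by both calls
print(append_fresh(), append_fresh())    # two independent lists
```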
What enumerate is for.
That seq = seq.append(item) and seq = seq.sort() both set seq to None.
Using set to remove duplicates.
Pretty much everything in the itertools and collections modules.
How the * and ** prefixes for function arguments work.
How default arguments to functions work internally (i.e. what f.func_defaults is; it is called f.__defaults__ in Python 3).
How (why, really) to design functions so that they are useful in conjunction with map and zip.
The role of __dict__ in classes.
What import actually does.
Learn how to use iPython
It's got Tab completion.
View all the elements in your namespace with 'whos'.
After you import a module, it's easy to view the code:
>>> import os
>>> os?? # this displays the actual source
>>> help() # Python's interactive help. Fantastic!
Most Python modules are well documented; in theory, you could learn iPython and the rest of what you'd need to know could be learned through the same tool.
iPython also has a debug mode, pdb().
Finally, you can even use iPython as a Python-enabled command line. The basic UNIX commands work as %magic methods. Any command that isn't a magic command can still be executed:
>>> os.system('cp file1 file2')
Don't have variable names that are types. For example, don't name a variable "file" or "dict"
Decorators. Writing your own is not something you might want to do right away, but knowing that @staticmethod and @classmethod are available from the beginning (and the difference between what they do) is a real plus.
using help() in the shell on any object, class or path
you can run import code; code.interact(local=locals()) anywhere in your code and it will start a python shell at that exact point
you can run python -i yourscript.py to start a shell at the end of yourscript.py
Most helpful: Dive Into Python. As a commenter points out, if you're learning Python 3, Dive Into Python 3 is more applicable.
Known about sooner: virtualenv.
That a tuple of a single item must end with a comma, or it won't be interpreted as a tuple.
pprint() is very handy (yes, 2 p's)
reload() is useful when you're re-testing a module while making lots of rapid changes to a dependent module (in Python 3 it lives in importlib, as importlib.reload()).
And learn as many common "idioms" as you can, otherwise you'll bang your head looking for a better way to do something when the idiom really is regarded as the best way (e.g. ugly expressions like ' '.join(), or the answer to why there is no isInt(string) function). The answer to the latter is that you can just wrap the usage of a "possible" integer in a try: and then catch the exception if it's not a valid int. The solution works well, but it sounds like a terrible answer when you first encounter it, so you can waste a lot of time convincing yourself it really is a good approach.
Things like these cost me several hours before I accepted that a first draft which felt wrong really was acceptable.
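Both idioms in a few lines; the helper name to_int and its default parameter are made up for the sketch:

```python
def to_int(text, default=None):
    """Return int(text), or default when text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return default


print(to_int("42"))     # a valid integer string
print(to_int("4.2"))    # not an integer, so the default comes back
print(" ".join(["spam", "eggs"]))  # the separator string does the joining
```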
Readings from python.org:
http://wiki.python.org/moin/BeginnerErrorsWithPythonProgramming
http://wiki.python.org/moin/PythonWarts
List comprehensions, if you're coming to Python fresh (not from an earlier version).
Closures. Clean and concise, without having to resort to using a Strategy Pattern unlike languages such as Java
If you learn from a good book, it will not only teach you the language, it will teach you the common idioms. The idioms are valuable.
For example, here is the standard idiom for initializing a class instance with a list:
class Foo(object):
    def __init__(self, lst=None):
        if lst is None:
            self.lst = []
        else:
            self.lst = lst
If you learn this as an idiom from a book, you don't have to learn the hard way why this is the standard idiom. @S.Lott already explained this one: if you make the default value an empty list, that list is created just once (when the def statement is executed) and every default-initialized instance of your class shares the same list instance, which is not what was intended here.
Some idioms protect you from non-intended effects; some help you get best performance out of the language; and some are just small points of style, which help other Python fans understand your code better.
I learned out of the book Learning Python and it introduced me to some of the idioms.
Here's a web page devoted to idioms: http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html
P.S. Python code that follows the best-practice Python idioms often is called "Pythonic" code.
I implemented plenty of recursive directory walks by hand before I learned about os.walk()
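For comparison, a recursive listing with os.walk fits in a few lines. This is only a sketch: the helper name list_files is made up, and the temporary directory exists just to give it something to walk.

```python
import os
import tempfile


def list_files(root):
    """Return the paths of all files below root, in sorted order."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return sorted(found)


# Build a tiny throwaway tree to walk.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    for relative in ("top.txt", os.path.join("a", "b", "deep.txt")):
        open(os.path.join(root, relative), "w").close()
    files = [os.path.relpath(path, root) for path in list_files(root)]

print(files)
```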
Lambda functions
http://www.diveintopython.org/power_of_introspection/lambda_functions.html
One of the coolest things I learned about recently was the commands module:
>>> import commands
>>> commands.getoutput('uptime')
'18:24 up 10:22, 7 users, load averages: 0.37 0.45 0.41'
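It's like os.popen or os.system but without all of the DeprecationWarnings. (The commands module is Python 2 only; in Python 3 the same call is available as subprocess.getoutput.)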
And let's not forget PDB (Python Debugger):
% python -m pdb poop.py
Dropping into interactive mode in IPython
from IPython.Shell import IPShellEmbed
ipshell = IPShellEmbed()
ipshell()
When I started with Python, I started out with main methods from the examples because I didn't know better. After that I found this on how to create a better main method.
Sequential imports overwrite:
If you import two files like this:
from foo import *
from bar import *
If both foo.py and bar.py have a function named fubar(), having imported the files this way, when you call fubar, fubar as defined in bar.py will be executed. The best way to avoid this is to do this:
import foo
import bar
and then call foo.fubar or bar.fubar. This way, you ALWAYS know which file's definition of fubar will be executed.
Maybe a touch more advanced, but I wish I'd known that you don't use threads to take advantage of multiple cores in (C)python. You use the multiprocessing library.
Tab completion and general readline support, including histories, even in the regular python shell.
$ cat ~/.pythonrc.py
#!/usr/bin/env python
try:
    import readline
except ImportError:
    print("Module readline not available.")
else:
    import rlcompleter
    readline.parse_and_bind("tab: complete")
    import os
    histfile = os.path.join(os.environ["HOME"], ".pyhist")
    try:
        readline.read_history_file(histfile)
    except IOError:
        pass
    import atexit
    atexit.register(readline.write_history_file, histfile)
    del os, histfile
and then add a line to your .bashrc
export PYTHONSTARTUP=~/.pythonrc.py
These two things lead to an exploratory programming style of "it looks like this library might do what I want", so then I fire up the python shell and then poke around using tab-completion and the help() command until I find what I need.
Generators and list comprehensions are more useful than you might think. Don't just ignore them.
I wish I had known a functional language well. After playing a bit with Clojure, I realized that lots of Python's functional ideas are borrowed from Lisp or other functional languages.
I wish I'd known right off the bat how to code idiomatically in Python. You can pick up any language you like and start coding in it like it's C, Java, etc. but ideally you'll learn to code in "the spirit" of the language. Python is particularly relevant, as I think it has a definite style of its own.
While I found it a little later in my Python career than I would have liked, this excellent article wraps up many Python idioms and the little tricks that make it special. Several of the things people have mentioned in their answers so far are contained within:
Code Like a Pythonista: Idiomatic Python.
Enjoy!
Pretty printing:
>>> print("%s world" % 'hello')
hello world
%s for string
%d for integer
%f for float
%.xf for exactly x decimal places of a float. If the float has fewer decimals than indicated, zeros are added.
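A few of these specifiers in action, written with Python 3's print function; the values are arbitrary:

```python
text = "%s world" % "hello"     # %s inserts a string
count = "%d items" % 3          # %d inserts an integer
rounded = "%.2f" % 3.14159      # two decimal places, rounded
padded = "%.3f" % 2.5           # fewer decimals than requested: zeros are added

print(text)
print(count)
print(rounded)
print(padded)
```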
I really like list comprehension and all other semifunctional constructs. I wish I had known those when I was in my first Python project.
What I really liked: List comprehensions, closures (and high-order functions), tuples, lambda functions, painless bignumbers.
What I wish I had known about sooner: The fact that using Python idioms in code (e.g. list comprehensions instead of loops over lists) was faster.
That multi-core was the future. Still love Python. It writes a fair bit of my code for me.
Functional programming tools, like all and any
