I have not found any related questions or examples of Python code using f-strings to generate Python code. Is there an underlying problem that I should know about? f-strings are really convenient and seem to be rather efficient for my needs.
I am generating some python scripts that can be used in the command line for automatically processing folders with remote sensing images. I was going to manually write some files by hand but realized is was relatively easy to automate the process by externalizing metadata regarding the expressions.
Program logic:
Extract expressions containing information on the different types of calculations that could be performed on the images (any sort of remote sensing indice)
Iterate through expressions
Create file with expression name (and other standards)
Insert expression information into f-strings
Write f-string to create file
I will also generate some tests automatically once settled on the method to be used. Is there a limit from which f-strings will not handle the code efficiently?
Some people have discussed using Python templates like Jinja2. However, if f-strings are sufficient I do not wish to integrate another external dependency.
from expressions_meta import expressions
for key in expressions.keys():
file_name = '_'.join([key, 'dir', 'cl.py'])
with open(file_name, 'w') as f:
f.write(f"""
import sys
import getopt
from gdal_dir_calc import GDALDirCalc
expression = {expressions[key]}
band_meta = {{}}
[...]
gdal_dir_obj.main()
""")
I might just be overly cautious but I think the topic could address other applications as well.
Any other tips regarding the use of f-strings for Python code generation or another tool?
If you are coming across this question, be sure to review the design of your program.
Most likely you are trying to implement a solution in which you violate DRY principles.
For command line applications:
Instead of generating many specific commands, look into to passing a name argument which in turn can be used with a selection of arguments.
From the standard library, some tools that may be helpful are configparser, shlex, cmd, getopt and argparse. See the standard library documentation on these tools.
Click is an interesting third party package.
Thanks #SergeBallesta for your helpful comments.
Related
So you've got some legacy code lying around in a fairly hefty project. How can you find and delete dead functions?
I've seen these two references: Find unused code and Tool to find unused functions in php project, but they seem specific to C# and PHP, respectively.
Is there a Python tool that'll help you find functions that aren't referenced anywhere else in the source code (notwithstanding reflection/etc.)?
In Python you can find unused code by using dynamic or static code analyzers. Two examples for dynamic analyzers are coverage and figleaf. They have the drawback that you have to run all possible branches of your code in order to find unused parts, but they also have the advantage that you get very reliable results.
Alternatively, you can use static code analyzers that just look at your code, but don't actually run it. They run much faster, but due to Python's dynamic nature the results may contain false positives.
Two tools in this category are pyflakes and vulture. Pyflakes finds unused imports and unused local variables. Vulture finds all kinds of unused and unreachable code. (Full disclosure: I'm the maintainer of Vulture.)
The tools are available in the Python Package Index https://pypi.org/.
I'm not sure if this is helpful, but you might try using the coverage, figleaf or other similar modules, which record which parts of your source code is used as you actually run your scripts/application.
Because of the fairly strict way python code is presented, would it be that hard to build a list of functions based on a regex looking for def function_name(..) ?
And then search for each name and tot up how many times it features in the code. It wouldn't naturally take comments into account but as long as you're having a look at functions with less than two or three instances...
It's a bit Spartan but it sounds like a nice sleepy-weekend task =)
unless you know that your code uses reflection, as you said, I would go for a trivial grep. Do not underestimate the power of the asterisk in vim as well (performs a search of the word you have under your cursor in the file), albeit this is limited only to the file you are currently editing.
Another solution you could implement is to have a very good testsuite (seldomly happens, unfortunately) and then wrap the routine with a deprecation routine. if you get the deprecation output, it means that the routine was called, so it's still used somewhere. This works even for reflection behavior, but of course you can never be sure if you don't trigger the situation when your routine call is performed.
its not only searching function names, but also all the imported packages not in use.
you need to search the code for all the imported packages (including aliases) and search used functions, then create a list of the specific imports from each package (example instead of import os, replace with from os import listdir, getcwd,......)
After reading the Software Carpentry essay on Handling Configuration Files I'm interested in their Method #5: put parameters in a dynamically-loaded code module. Basically I want the power to do calculations within my input files to create my variables.
Based on this SO answer for how to import a string as a module I've written the following function to import a string or oen fileobject or STringIO as a module. I can then access varibales using the . operator:
import imp
def make_module_from_text(reader):
"""make a module from file,StringIO, text etc
Parameters
----------
reader : file_like object
object to get text from
Returns
-------
m: module
text as module
"""
#for making module out of strings/files see https://stackoverflow.com/a/7548190/2530083
mymodule = imp.new_module('mymodule') #may need to randomise the name; not sure
exec reader in mymodule.__dict__
return mymodule
then
import textwrap
reader = textwrap.dedent("""\
import numpy as np
a = np.array([0,4,6,7], dtype=float)
a_normalise = a/a[-1]
""")
mymod = make_module_from_text(reader)
print(mymod.a_normalise)
gives
[ 0. 0.57142857 0.85714286 1. ]
All well and good so far, but having looked around it seems using python eval and exec introduces security holes if I don't trust the input. A common response is "Never use eval orexec; they are evil", but I really like the power and flexibility of executing the code. Using {'__builtins__': None} I don't think will work for me as I will want to import other modules (e.g. import numpy as np in my above code). A number of people (e.g. here) suggest using the ast module but I am not at all clear on how to use it(can ast be used with exec?). Is there simple ways to whitelist/allow specific functionality (e.g. here)? Is there simple ways to blacklist/disallow specific functionality? Is there a magic way to say execute this but don't do anythinh nasty.
Basically what are the options for making sure exec doesn't run any nasty malicious code?
EDIT:
My example above of normalising an array within my input/configuration file is perhaps a bit simplistic as to what computations I would want to perform within my input/configuration file (I could easily write a method/function in my program to do that). But say my program calculates a property at various times. The user needs to specify the times in some way. Should I only accept a list of explicit time values so the user has to do some calculations before preparing the input file? (note: even using a list as configuration variable is not trivial see here). I think that is very limiting. Should I allow start-end-step values and then use numpy.linspace within my program? I think that is limiting too; whatif I want to use numpy.logspace instead? What if I have some function that can accept a list of important time values and then nicely fills in other times to get well spaced time values. Wouldn't it be good for the user to be able to import that function and use it? What if I want to input a list of user defined objects? The thing is, I don't want to code for all these specific cases when the functinality of python is already there for me and my user to use. Once I accept that I do indead want the power and functionality of executing code in my input/configuration file I wonder if there is actually any difference, security wise, in using exec vs using importlib vs imp.load_source and so on. To me there is the limited standard configparser or the all powerful, all dangerous exec. I just wish there was some middle ground with which I could say 'execute this... without stuffing up my computer'.
"Never use eval or exec; they are evil". This is the only answer that works here, I think. There is no fully safe way to use exec/eval on an untrusted string string or file.
The best you can do is to come up with your own language, and either interpret it yourself or turn it into safe Python code before handling it to exec. Be careful to proceed from the ground up --- if you allow the whole Python language minus specific things you thought of as dangerous, it would never be really safe.
For example, you can use the ast module if you want Python-like syntax; and then write a small custom ast interpreter that only recognizes a small subset of all possible nodes. That's the safest solution.
If you are willing to use PyPy, then its sandboxing feature is specifically designed for running untrusted code, so it may be useful in your case. Note that there are some issues with CPython interoperability mentioned that you may need to check.
Additionally, there is a link on this page to an abandoned project called pysandbox, explaining the problems with sandboxing directly within python.
Long story short, a piece of code that I'm working with at work has the line:
from System import System
with a later bit of code of:
desc_ = System()
xmlParser = Parser(desc_.getDocument())
# xmlParser.setEntityBase(self.dtdBase)
for featureXMLfile in featureXmlList.split(","):
print featureXMLfile
xmlParser.parse(featureXMLfile)
feat = desc_.get(featureName)
return feat
Parser is an XML parser in Java (it's included in a different import), but I don't get what the desc_ bit is doing. I mean obviously, it somehow holds the feature that we're trying to pull out, but I don't entirely see where. Is System a standard library in Python or Java, or am I looking at something custom?
Unfortunately, everyone else in my group is out for Christmas Eve vacation, so I can't ask them directly. Thank you for your help. I'm still not horribly familiar with Python.
This isn't from the standard library, so you'll need to check your system (Python has plenty of introspection to help you with that).
You can tell as Python modules in the standard library use lowercase names as per PEP-8, or by searching the library reference.
Note as well that Python has it's own XML parsing tools that will be much nicer to work with in Python than Java's.
Edit: As you have noted in the comments you are using Jython, it seems likely this is Java's System package.
millimoose indicated the correct answer in his comment, but neglected to submit it as an answer, so I'm posting to indicate the correct answer. It was indeed a custom module built by my company. I was able to determine this by typing import System; print(System) into the interpreter.
i am creating ( researching possibility of ) a highly customizable python client and would like to allow users to actually edit the code in another language to customize the running of program. ( analogous to browser which itself coded in c/c++ and run another language html/js ). so my question is , is there any programming language implemented in pure python which i can see as a reference ( or use directly ? ) -- i need simple language ( simple statements and ifs can do )
edit: sorry if i did not make myself clear but what i want is "a language to customize the running of program" , even though pypi seems a great option, what i am looking for is more simple which i can study and extend myself if need arise. my google searches pointing towards xml based langagues. ( BMEL , XForms etc ).
The question isn't completely clear on scope, but I have a hunch that PyPy, embedding other full languages, and similar solutions might be overkill. It sounds like iamgopal may really be interested in something more like Interpreter Pattern or Little Language.
If the language you want to support is really small (see the Interpreter Pattern link), then hand-coding this yourself in Python won't be too hard. You can write a simple parser (Google around; here's one example), then walk the AST and evaluate user expressions.
However, if you expect this to be used for a long time or by many people, it may be worth throwing a real language at the problem. (I'd recommend Python itself if your users are already familiar with basic Python syntax).
Ren'Py is a modification to Python syntax built on top of Python itself, using the language tools in the stdlib.
For your user's sake, don't use an XML based language - XML is an awful basis for a programming language and your users will hate you for it.
Here is a suggestion. Use a strict subset of Python for your language. Use the compiler module to convert their code into an abstract syntax tree and walk the tree to to validate that the code conforms to your subset before converting the AST into python bytecode.
N.B. I just checked the docs and see that the compiler package is deprecated in 2.6 and removed in Python 3.x. Does anyone know why that is?
Numerous template languages such as Cheetah, Django templates, Genshi, Mako, Mighty might serve as an example.
Why not Python itself? With some care you can use eval to run user code.
One of the good thing about interpreted scripting languages is that you don't need another extra scripting language!
PLY (Python Lex-Yacc)
is something of your interest.
Possibly Common Lisp (or any other Lisp) will be the best choice for that task. Because Lisp make it possible to easily extend host language with powerful macroses and construct DSL (domain specific language).
If all you need is simple if statements and expressions, I'm sure it wouldn't be an awful task to parse each line. Something like
if some flag
activate some feature
deactivate some feature
elif some other flag
activate some feature
activate some feature
else
logout
Just write a class which, while parsing takes the first word, checks if it's "if, elif, else," etc, and if so, check a flag and set a flag saying you either are or are not executing until the next conditional. If it's not a conditional, call a function based on the first keyword that would modify the program state in some way.
The class could store some local execution state (are we in an if statement? If so are we executing this branch?) and have another class containing some global application state (flags that are checkable by if statements, etc).
This is probably the wrong thing to do in your situation (it's very prone to bugs, it's dangerous if you don't treat the data in the scripts correctly), but it's at least a start if you do decide to interpret your own mini-language.
Seriously though, if you try this, be very, very, srs careful. Don't give the scripts any functionality that they don't definitely need, because you are almost certainly opening security holes by doing something like this.
Don't say I didn't warn you.
So you've got some legacy code lying around in a fairly hefty project. How can you find and delete dead functions?
I've seen these two references: Find unused code and Tool to find unused functions in php project, but they seem specific to C# and PHP, respectively.
Is there a Python tool that'll help you find functions that aren't referenced anywhere else in the source code (notwithstanding reflection/etc.)?
In Python you can find unused code by using dynamic or static code analyzers. Two examples for dynamic analyzers are coverage and figleaf. They have the drawback that you have to run all possible branches of your code in order to find unused parts, but they also have the advantage that you get very reliable results.
Alternatively, you can use static code analyzers that just look at your code, but don't actually run it. They run much faster, but due to Python's dynamic nature the results may contain false positives.
Two tools in this category are pyflakes and vulture. Pyflakes finds unused imports and unused local variables. Vulture finds all kinds of unused and unreachable code. (Full disclosure: I'm the maintainer of Vulture.)
The tools are available in the Python Package Index https://pypi.org/.
I'm not sure if this is helpful, but you might try using the coverage, figleaf or other similar modules, which record which parts of your source code is used as you actually run your scripts/application.
Because of the fairly strict way python code is presented, would it be that hard to build a list of functions based on a regex looking for def function_name(..) ?
And then search for each name and tot up how many times it features in the code. It wouldn't naturally take comments into account but as long as you're having a look at functions with less than two or three instances...
It's a bit Spartan but it sounds like a nice sleepy-weekend task =)
unless you know that your code uses reflection, as you said, I would go for a trivial grep. Do not underestimate the power of the asterisk in vim as well (performs a search of the word you have under your cursor in the file), albeit this is limited only to the file you are currently editing.
Another solution you could implement is to have a very good testsuite (seldomly happens, unfortunately) and then wrap the routine with a deprecation routine. if you get the deprecation output, it means that the routine was called, so it's still used somewhere. This works even for reflection behavior, but of course you can never be sure if you don't trigger the situation when your routine call is performed.
its not only searching function names, but also all the imported packages not in use.
you need to search the code for all the imported packages (including aliases) and search used functions, then create a list of the specific imports from each package (example instead of import os, replace with from os import listdir, getcwd,......)