How do I get SCons to invoke an external script? - python

I'm trying to use SCons to build a LaTeX document. In particular, I want to get SCons to invoke a Python program that generates a file containing a table that is \input{} into the main document. I've looked over the SCons documentation, but it is not immediately clear to me what I need to do.
What I wish to achieve is essentially what you would get with this makefile:
document.pdf: table.tex
    pdflatex document.tex

table.tex:
    python table_generator.py
How can I express this in scons?

Something along these lines should do:
env.Command('document.tex', '', 'python table_generator.py')
env.PDF('document.pdf', 'document.tex')
It declares that 'document.tex' is generated by calling the Python script, and requests a PDF document to be created from this generated 'document.tex' file.
Note that this is in spirit only; it may require some tweaking. In particular, I'm not certain what semantics you would want for the generation of 'document.tex': should it be generated every time? Only when it doesn't exist? When some other file changes? (In that case you would want to add that dependency as the second argument to Command().)
In addition, the output of Command() can be used as input to PDF() if desired. For clarity, I didn't do that.
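Adapted to the file names in the question, a minimal sketch of a complete SConstruct might look like this (assuming table_generator.py writes table.tex into the current directory; listing the script as the source makes SCons rerun the command when the script changes, and Depends() makes the \input{} relationship explicit):
# SConstruct - minimal sketch
env = Environment()
table = env.Command('table.tex', 'table_generator.py', 'python $SOURCE')
pdf = env.PDF('document.pdf', 'document.tex')
env.Depends(pdf, table)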

In this simple case, the easiest way is to just use the subprocess module:
from subprocess import call
call(["python", "table_generator.py"])
call(["pdflatex", "document.tex"])
Regardless of where in your SConstruct file these lines are placed, they will happen before any of the compiling and linking performed by SCons.
The downside is that these commands will be executed every time you run SCons, rather than only when the files have changed, which is what would happen in your example Makefile. So if those commands take a long time to run, this wouldn't be a good solution.
If you really need to only run these commands when the files have changed, look at the SCons manual section Writing Your Own Builders.
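For instance, a rough sketch of that idea (this assumes, purely for illustration, that table_generator.py writes its table to stdout; with a builder, SCons reruns the command only when the listed sources change):
# SConstruct sketch: wrap the generator in a Builder
table_builder = Builder(action='python $SOURCE > $TARGET')
env = Environment(BUILDERS={'Table': table_builder})
env.Table('table.tex', 'table_generator.py')
env.PDF('document.pdf', 'document.tex')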

Related

Is there any way to change the name of the creator of a module?

I have created a program that automates a task using Python.
For that, I imported a module named Pywhatkit.
But after importing it, when I run the program I first get this note from the creator of Pywhatkit (shown highlighted in the attached image below), and only then does my program run.
This can look unprofessional.
So is there any way to either change the name that appears when running the program, or to remove this part entirely?
Python modules are generally written in Python. If you need to make a modification like this, you can usually just look up where the module is installed and change the relevant bit of the code. Installed modules usually live in your Python version's site-packages directory, in a folder named after the module, where the file __init__.py is the first thing executed; that's a good place to start if you're blindly searching for something to change.
In this particular case, do the following:
Run the console command pip show pywhatkit to find the location of the installed pywhatkit module; it should be on the third-to-last line of that command's output. I'll call this location $pwkdir.
Open the file $pwkdir/pywhatkit/mainfunctions.py in your text editor of choice
Comment out lines 300 through 304, and save the file.
The cause of the output you're seeing is a straightforward print() call, so removing it is easy and harmless.
I found the location that needs to be commented out by running grep -nr 'Hello from the' $pwkdir/pywhatkit (i.e. searching for any usage of the observed phrase, since the first three words are enough to identify it) and reading the code.
You will probably need to do this again every time you reinstall or update this module to a new version.
Note that there are other places within the module where it prints to console. You may wish to search for and comment out those lines as well, or disable printing to stdout before importing the module for the first time.
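For that last option, a minimal sketch of silencing stdout during the import (standard library only; note this discards everything the module prints at import time):
import io
from contextlib import redirect_stdout

with redirect_stdout(io.StringIO()):
    import pywhatkit  # print() calls executed during import are discarded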
Noted. This "Note from the creator" will be removed in the next update. Meanwhile, you can do as suggested in the other answer: editing "mainfunctions.py" and commenting out the last print statement of that file will stop the message from being printed.

How to tell make to track the output of a python script and only run the script if the input file was updated

I know that in the context of creating a C/C++ executable, make automatically takes care of rebuilding the executable only if the dependency files have been updated.
I have been wondering if there is a way to specify the input and output files of a python script to make, such that make can do its magic depending on the status of the input file.
Makefile entry:
update_info:
    python update_info.py info.xml
    # Output file: info.hpp
Basically, I would like make to check whether info.xml has been updated. If it has not, I don't want make to run the update_info.py script.
Just trying to find out if there is an existing way to do it. I can live with it if this cannot be done.
Thanks.
"I have been wondering if there is a way to specify the input and output files of a python script to make, such that make can do its magic depending on the status of the input file."
Of course. At its core, make is a very simple and general system. It defines a format for expressing the prerequisites for building various targets, and for expressing machine-actionable instructions for actually building those targets from their designated prerequisites. Given that information, make figures out which recipes, if any, to execute to build a requested target or to bring it up to date with respect to its direct and indirect dependencies.
Make knows nothing about the nature of the targets and prerequisites other than that they are timestamped files (or so it assumes). Having been designed with building software in mind, make does come with some built-in rules aimed at building programs from source code written in a few select languages, but that's a convenience feature, not a limitation.
If you want to provide for building a file named info.hpp based on another file named info.xml, then info.hpp is the target and info.xml is a prerequisite. A rule along these lines would then be appropriate:
info.hpp: info.xml
    python update_info.py info.xml
Generally speaking, though, it is best to avoid repeating target and prerequisite names in the recipe wherever possible. Instead, use the appropriate automatic make variables, such as $^, which represents the distinct elements of the prerequisite list:
info.hpp: info.xml
    python update_info.py $^
Under some circumstances, you might want the Python script to be a prerequisite, too, since changes to the script could cause it to produce meaningfully different output from the same input.
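For example, a sketch of that variant ($< expands to the first prerequisite only, so the script name is not also passed as an argument):
info.hpp: info.xml update_info.py
    python update_info.py $<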
Figured out a way to do this for existing files which are updated by a script:
The solution is to create a new file by copying the make target file.
update_info: target/info.hpp

target/info.hpp: info.xml
    python update_info.py info.xml
    cp info.hpp target/info.hpp
This way, the current folder can contain a previous version of info.hpp. Make tracks the info.hpp file in the target folder and will only run the script if the input file info.xml has changed.
The side effect of this is that the info.hpp in the current folder is updated only if info.xml has changed, which is what was needed.
If you are using git, you may need to add the target/info.hpp file to the ignore list.
This method is helpful when using make to track files in a repository that are updated in place by a script.

When using Watchman's watchman-make I want to access the names of the changed files

I am writing a watchman command with watchman-make and I'm at a loss when trying to access exactly what was changed in the directory. I want to run my upload.py script and, inside the script, access the filenames of newly created files in /var/spool/cups-pdf/ANONYMOUS.
so far I have
$ watchman-make -p '/var/spool/cups-pdf/ANONYMOUS' --run 'python /home/pi/upload.py'
I'd like to add another argument to python upload.py so I have the exact filepath to the newly created file and can send the new file over to my database in upload.py.
I've been looking at the docs of watchman and the closest thing I can think of to use is a trigger object. Please help!
Solution with watchman-wait:
Assuming a project layout like this:
/posts/_SUBDIR_WITH_POST_NAME_/index.md
/Scripts/convert.sh
And the shell script like this:
#!/bin/bash
# File: convert.sh
SrcDirPath=$(cd "$(dirname "$0")/../"; pwd)
cd "$SrcDirPath"
echo "Converting: $SrcDirPath/$1"
Then we can launch watchman-wait like this:
watchman-wait . --max-events 0 -p 'posts/**/*.md' | while read line; do ./Scripts/convert.sh "$line"; done
When we change the file /posts/_SUBDIR_WITH_POST_NAME_/index.md, the output will look like this:
...
Converting: /Users/.../Angular/dartweb_quickstart/posts/swift-on-android-building-toolchain/index.md
Converting: /Users/.../Angular/dartweb_quickstart/posts/swift-on-android-building-toolchain/index.md
...
watchman-make is intended to be used together with tools that will perform a follow-up query of their own to discover what they want to do as a next step. For example, running the make tool will cause make to stat the various deps to bring things up to date.
That means that your upload.py script needs to know how to do this for itself if you want to use it with watchman.
You have a couple of options, depending on how sophisticated you want things to be:
Use pywatchman to issue an ad-hoc query
If you want to be able to run upload.py whenever you want and have it figure out the right thing (just like make would do) then you can have it ask watchman directly. You can have upload.py use pywatchman (the python watchman client) to do this. pywatchman will get installed if the watchman configure script thinks you have a working python installation. You can also pip install pywatchman. Once you have it available and in your PYTHONPATH:
import os
import pywatchman

client = pywatchman.client()
client.query('watch-project', os.getcwd())
result = client.query('query', os.getcwd(), {
    "since": "n:pi_upload",
    "fields": ["name"]})
print(result["files"])
This snippet uses the since generator with a named cursor to discover the list of files that changed since the last query was issued using that same named cursor. Watchman will remember the associated clock value for you, so you don't need to complicate your script with state tracking. We're using the name pi_upload for the cursor; the name needs to be unique among the watchman clients that might use named cursors, so naming it after your tool is a good idea to avoid potential conflict.
This is probably the most direct way to extract the information you need without requiring that you make more invasive changes to your upload script.
Use pywatchman to initiate a long running subscription
This approach will transform your upload.py script so that it knows how to directly subscribe to watchman, so instead of using watchman-make you'd just directly run upload.py and it would keep running and performing the uploads. This is a bit more invasive and is a bit too much code to try and paste in here. If you're interested in this approach then I'd suggest that you take the code behind watchman-wait as a starting point. You can find it here:
https://github.com/facebook/watchman/blob/master/python/bin/watchman-wait
The key piece of this that you might want to modify is this line:
https://github.com/facebook/watchman/blob/master/python/bin/watchman-wait#L169
which is where it receives the list of files.
Why not triggers?
You could use triggers for this, but we're steering folks away from triggers because they are hard to manage. A trigger will run in the background and have its output go to the watchman log file. It can be difficult to tell if it is running, or to stop it running.
The trigger interface is closer to the unix model and allows you to feed a list of files on stdin.
Speaking of unix, what about watchman-wait?
We also have a command that emits the list of changed files as they change. You could potentially stream the output from watchman-wait in your upload.py. This would make it have some similarities with the subscription approach but do so without directly using the pywatchman client.
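As a rough sketch of that last idea (assuming watchman-wait is on your PATH; send_to_database() is a hypothetical stand-in for your own upload logic):
import subprocess

proc = subprocess.Popen(
    ["watchman-wait", "/var/spool/cups-pdf/ANONYMOUS", "--max-events", "0"],
    stdout=subprocess.PIPE, text=True)

# watchman-wait prints one changed filename per line as events arrive
for line in proc.stdout:
    send_to_database(line.strip())  # hypothetical upload routine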

Is there a way to combine a python project codebase that spans across different files into one file?

The reason I want to do this is that I want to use the tool pyobfuscate to obfuscate my Python code, but pyobfuscate can only obfuscate one file.
I've answered your direct question separately, but let me offer a different solution to what I suspect you're actually trying to do:
Instead of shipping obfuscated source, just ship bytecode files. These are the .pyc files that get created, cached, and used automatically, but you can also create them manually by just using the compileall module in the standard library.
A .pyc file with its .py file missing can be imported just fine. It's not human-readable as-is. It can of course be decompiled into Python source, but the result is… basically the same result you get from running an obfuscator on the original source. So it's slightly better than what you're trying to do, and a whole lot easier.
You can't compile your top-level script this way, but that's easy to work around. Just write a one-liner wrapper script that does nothing but import the real top-level script. If you have if __name__ == '__main__': code in there, you'll also need to move that to a function, and the wrapper becomes a two-liner that imports the module and calls the function; but that's as hard as it gets. Alternatively, you could run pyobfuscate on just the top-level script, but really, there's no reason to do that.
In fact, many of the packager tools can optionally do all of this work for you automatically, except for writing the trivial top-level wrapper. For example, a default py2app build will stick compiled versions of your own modules, along with stdlib and site-packages modules you depend on, into a pythonXY.zip file in the app bundle, and set up the embedded interpreter to use that zipfile as its stdlib.
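A minimal sketch of the bytecode approach (assuming your modules live under src/; legacy=True puts each .pyc next to its .py rather than in __pycache__, so the sources can simply be deleted before shipping):
import compileall

# Compile every .py under src/ to a sibling .pyc file
compileall.compile_dir("src", legacy=True)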
There are definitely ways to turn a tree of modules into a single module. But it's not going to be trivial. The simplest thing I can think of is this:
First, you need a list of modules. This is easy to gather with the find command or a simple Python script that does an os.walk.
Then you need to use grep or Python re to get all of the import statements in each file, and use that to topologically sort the modules. If you only do absolute flat import foo statements at the top level, this is a trivial regex. If you also do absolute package imports, or from foo import bar (or from foo import *), or import at other levels, it's not much trickier. Relative package imports are a bit harder, but not that big of a deal. Of course if you do any dynamic importing, use the imp module, install import hooks, etc., you're out of luck here, but hopefully you don't.
Next you need to replace the actual import statements. With the same assumptions as above, this can be done with a simple sed or re.sub, something like import\s+(\w+) with \1 = sys.modules['\1'].
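For instance, a sketch of that substitution under the same flat-import assumption (the module filename here is illustrative):
import re

source_text = open("mymodule.py").read()
# rewrite "import foo" as "foo = sys.modules['foo']"
source_text = re.sub(r"import\s+(\w+)", r"\1 = sys.modules['\1']", source_text)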
Now, for the hard part: you need to transform each module into something that creates an equivalent module object dynamically. I think what you want to do is escape the entire module source so that it can be put into a triple-quoted string, then do this:
import sys
import types

mod_globals = {}
exec('''
# escaped version of original module source goes here
''', mod_globals)
mod = types.ModuleType(module_name)
mod.__dict__.update(mod_globals)
sys.modules[module_name] = mod
Now just concatenate all of those transformed modules together. The result will be almost equivalent to your original code, except that it's doing the equivalent of import foo; del foo for all of your modules (in dependency order) right at the start, so the startup time could be a little slower.
You can make a tool that:
Reads through your source files and puts all identifiers in a set.
Subtracts all identifiers of recursively searched standard and third-party modules from that set (modules, classes, functions, attributes, parameters).
Subtracts some explicitly excluded identifiers from that set as well, since they may be used in getattr/setattr/exec/eval calls.
Replaces the remaining identifiers with gibberish.
Or you can use this tool I wrote that does exactly that.
To obfuscate multiple files, use it as follows:
For safety, back up your source code and valuable data to an off-line medium.
Put a copy of opy_config.txt in the top directory of your project.
Adapt it to your needs according to the remarks in opy_config.txt.
This file only contains plain Python and is exec’ed, so you can do anything clever in it.
Open a command window, go to the top directory of your project and run opy.py from there.
If the top directory of your project is e.g. ../work/project1 then the obfuscation result will be in ../work/project1_opy.
Further adapt opy_config.txt until you’re satisfied with the result.
Type ‘opy ?’ or ‘python opy.py ?’ (without the quotes) on the command line to display a help text.
I think you can try using the find command with the -exec option.
You can execute all Python scripts in a directory with the following command:
find . -name "*.py" -exec python {} ';'
Hope this helps.
EDIT:
Oh sorry, I overlooked that if you obfuscate the files separately they may not run properly, because the tool renames functions to different names in different files.

Module for Python so that you can generate a free-standing, self-documenting command line utility à la Perl's pod2man/pod2usage?

All I can find is this reference:
Is it possible to use POD (plain old documentation) with Python?
which makes it look like you have to generate a whole separate set of docs to go with the code.
I would like to try Python for making cmdline utils, but when I do this with Perl I can embed the docs directly in the source, and use the Pod2Usage module along with Getopt so that any of my scripts can be run like this:
cmd --man
and this triggers the pod system to dump the documentation embedded in the script in man-page format. It can also generate shorter (synopsis) or medium-length formats.
It looks like I could use the pydoc code and kind of reverse-engineer it to sort of do the task (at least showing the full documentation), but I am hoping something better already exists.
The python-modargs package lets you create self-documenting command line interfaces. You define a function for each command you want to make available, and the function's docstring becomes the help text for that command. The function's keyword arguments become named arguments, and python-modargs parses inline comments after the keyword arguments as the help text for each argument.
I use python-modargs to generate the command line interface for dexy; here is the module which defines the commands:
https://github.com/ananelson/dexy/blob/027954f9234363d506225d40b675b3d6478994f4/dexy/commands.py#L144
You need to implement a help_command method to get the generated help; it's a one-liner.
I think pydoc may be what you're looking for.
It certainly isn't quite the same as POD, as you have to call pydoc itself (e.g. pydoc myscript.py), but I guess it can be a good starting point.
Of course, you can always add pydoc support to your script by importing from it and using its functions/classes.
Check out pydoc's own CLI implementation for the best example.
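For comparison, a minimal sketch of the common standard-library pattern, which gets part of the way toward pod2usage: keep the docs in the module docstring and let argparse serve them via --help (the script name and option here are illustrative):
#!/usr/bin/env python
"""mycmd - do something useful.

The long description lives here in the module docstring, right next to
the code, and doubles as the --help text.
"""
import argparse

parser = argparse.ArgumentParser(
    description=__doc__,
    formatter_class=argparse.RawDescriptionHelpFormatter)
parser.add_argument("--verbose", action="store_true", help="print extra detail")
args = parser.parse_args()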
