Bundling Multiple Python Modules

Bundling Multiple Python Modules - python

I have some Python modules, some of them require more than 20 others.
My question is, if there a tool is which helps me to bundle some Python modules to one big file.
Here a simple example:
HelloWorld.py:
import MyPrinter
MyPrinter.displayMessage("hello")
MyPrinter.py:
def displayMessage(msg):
print msg
should be converted to one file, which contains:
def displayMessage(msg):
print msg
displayMessage("hello")
Ok, I know that this example is a bit bad, but i hope that someone understand what i mean and can help me. And one note: I talk about huge scripts with very much imports, if they were smaller I could do that by myself.
Thanks.

Aassuming you are using Python 2.6 or later, you could package the scripts into a zip file, add a __main__.py and run the zip file directly.
If you really want to collapse everything down to a single file, I expect you're going to have to write it yourself. The source code transformation engine in lib2to3 may help with the task.

You cannot and should not 'convert them into one file'.
If your application consists of several modules, you should just organize it into package.
There is pretty good tutorial on packages here: http://diveintopython3.org/packaging.html
And you should read docs on it here: http://docs.python.org/library/distutils.html

Pip supports bundling. This is an installation format, and will unpack into multiple files. Anything else would be a bad idea, as it would break imports and per-module metadata.

Related

Stitching non-R scripts with knitr, say .py

is it possible to stitch a python script using stitch() with knitr?
I want to produce a simple document by typing:
stitch('someScript.py')
analogous to:
stitch('someScript.R')

As I probably have answered you on Twitter, you can try
library(knitr)
opts_chunk$set(engine = 'python')
stitch('script.py')
But note the Python support in knitr is very limited at the moment. You may or may not be satisfied by it.

Is there a way to combine a python project codebase that spans across different files into one file?

The reason I want to this is I want to use the tool pyobfuscate to obfuscate my python code. Butpyobfuscate can only obfuscate one file.

I've answered your direct question separately, but let me offer a different solution to what I suspect you're actually trying to do:
Instead of shipping obfuscated source, just ship bytecode files. These are the .pyc files that get created, cached, and used automatically, but you can also create them manually by just using the compileall module in the standard library.
A .pyc file with its .py file missing can be imported just fine. It's not human-readable as-is. It can of course be decompiled into Python source, but the result is… basically the same result you get from running an obfuscater on the original source. So, it's slightly better than what you're trying to do, and a whole lot easier.
You can't compile your top-level script this way, but that's easy to work around. Just write a one-liner wrapper script that does nothing but import the real top-level script. If you have if __name__ == '__main__': code in there, you'll also need to move that to a function, and the wrapper becomes a two-liner that imports the module and calls the function… but that's as hard as it gets.) Alternatively, you could run pyobfuscator on just the top-level script, but really, there's no reason to do that.
In fact, many of the packager tools can optionally do all of this work for you automatically, except for writing the trivial top-level wrapper. For example, a default py2app build will stick compiled versions of your own modules, along with stdlib and site-packages modules you depend on, into a pythonXY.zip file in the app bundle, and set up the embedded interpreter to use that zipfile as its stdlib.

There are a definitely ways to turn a tree of modules into a single module. But it's not going to be trivial. The simplest thing I can think of is this:
First, you need a list of modules. This is easy to gather with the find command or a simple Python script that does an os.walk.
Then you need to use grep or Python re to get all of the import statements in each file, and use that to topologically sort the modules. If you only do absolute flat import foo statements at the top level, this is a trivial regex. If you also do absolute package imports, or from foo import bar (or from foo import *), or import at other levels, it's not much trickier. Relative package imports are a bit harder, but not that big of a deal. Of course if you do any dynamic importing, use the imp module, install import hooks, etc., you're out of luck here, but hopefully you don't.
Next you need to replace the actual import statements. With the same assumptions as above, this can be done with a simple sed or re.sub, something like import\s+(\w+) with \1 = sys.modules['\1'].
Now, for the hard part: you need to transform each module into something that creates an equivalent module object dynamically. This is the hard part. I think what you want to do is to escape the entire module code so that it can put into a triple-quoted string, then do this:
import types
mod_globals = {}
exec('''
# escaped version of original module source goes here
''', mod_globals)
mod = types.ModuleType(module_name)
mod.__dict__.update(mod_globals)
sys.modules[module_name] = mod
Now just concatenate all of those transformed modules together. The result will be almost equivalent to your original code, except that it's doing the equivalent of import foo; del foo for all of your modules (in dependency order) right at the start, so the startup time could be a little slower.

You can make a tool that:
Reads through your source files and puts all identifiers in a set.
Subtracts all identifiers from recursively searched standard- and third party modules from that set (modules, classes, functions, attributes, parameters).
Subtracts some explicitly excluded identifiers from that list as well, as they may be used in getattr/setattr/exec/eval
Replaces the remaining identifiers by gibberish
Or you can use this tool I wrote that does exactly that.
To obfuscate multiple files, use it as follows:
For safety, backup your source code and valuable data to an off-line medium.
Put a copy of opy_config.txt in the top directory of your project.
Adapt it to your needs according to the remarks in opy_config.txt.
This file only contains plain Python and is exec’ed, so you can do anything clever in it.
Open a command window, go to the top directory of your project and run opy.py from there.
If the top directory of your project is e.g. ../work/project1 then the obfuscation result will be in ../work/project1_opy.
Further adapt opy_config.txt until you’re satisfied with the result.
Type ‘opy ?’ or ‘python opy.py ?’ (without the quotes) on the command line to display a help text.

I think you can try using the find command with -exec option.
you can execute all python scripts in a directory with the following command.
find . -name "*.py" -exec python {} ';'
Wish this helps.
EDIT:
OH sorry I overlooked that if you obfuscate files seperately they may not run properly, because it renames function names to different names in different files.

compile py2exe from executable

I have seen a similar question on this site but not answered correctly for my requirements. I am reasonably familiar with py2exe.
I'd like to create a program (in python and py2exe) that I can distribute to my customers which would enable them to add their own data (not code, just numbers) and redistribute as a new/amended exe for further distribution (as a single file, so my code + data). I understand this can be done with more than one file.
Is this conceptually possible without my customers installing python? I guess I'm asking how to perform the 'bundlefiles' option?
Many thanks

I think it is possible. I'm not sure how py2exe works, but I know how pyinstaller does and since both does the same it should work similiar.
Namely, one-file flag doesn't really create one file. It looks like that for end user, but when user run app, it unpacks itself and files are stored somewhere physically. You could try to edit some source file (ie numbers.py, or data.py) and pack it again with changed data.
I know it's not the best explanation, you have to think further on your own. I'm just showing you the possible way.

viewing files in python?

I am creating a sort of "Command line" in Python. I already added a few functions, such as changing login/password, executing, etc., But is it possible to browse files in the directory that the main file is in with a command/module, or will I have to make the module myself and use the import command? Same thing with changing directories to view, too.

Browsing files is as easy as using the standard os module. If you want to do something with those files, that's entirely different.
import os
all_files = os.listdir('.') # gets all files in current directory
To change directories you can issue os.chdir('path/to/change/to'). In fact there are plenty of useful functions found in the os module that facilitate the things you're asking about. Making them pretty and user-friendly, however, is up to you!

I'd like to see someone write a a semantic file-browser, i.e. one that auto-generates tags for files according to their input and then allows views and searching accordingly.
Think about it... take an MP3, lookup the lyrics, run it through Zemanta, bam! a PDF file, a OpenOffice file, etc., that'd be pretty kick-butt! probably fairly intensive too, but it'd be pretty dang cool!
Cheers,
-C

Bash alias to Python script -- is it possible?

The particular alias I'm looking to "class up" into a Python script happens to be one that makes use of the cUrl -o (output to file) option. I suppose I could as easily turn it into a BASH function, but someone advised me that I could avoid the quirks and pitfalls of the different versions and "flavors" of BASH by taking my ideas and making them Python scripts.
Coincident with this idea is another notion I had to make a feature of legacy Mac OS (officially known as "OS 9" or "Classic") pertaining to downloads platform-independent: writing the URL to some part of the file visible from one's file navigator {Konqueror, Dolphin, Nautilus, Finder or Explorer}. I know that only a scant few file types support this kind of thing using some other command-line tools (exiv2, wrjpgcom, etc). Which is perfectly fine with me as I only use this alias to download single-page image files such as JPEGs anyways.
I reckon I might as well take full advantage of the power of Python by having the script pass the string which is the source URL of the download (entered by the user and used first by cUrl) to something like exiv2 which could write it to the Comment block, EXIF User Comment block, and (taking as a first and worst example) Windows XP's File Description field. Starting small is sometimes a good way to start.
Hope someone has advice or suggestions.
BZT

The relevant section from the Bash manual states:
Aliases allow a string to be
substituted for a word when it is used
as the first word of a simple command.
So, there should be nothing preventing you from doing e.g.
$ alias geturl="python /some/cool/script.py"
Then you could use it like any other shell command:
$ geturl http://example.com/excitingstuff.jpg
And this would simply call your Python program.

I thought Pycurl might be the answer. Ahh Daniel Sternberg and his innocent presumptions that everybody knows what he does. I asked on the list whether or not pycurl had a "curl -o" analogue, and then asked 'If so: How would one go about coding it/them in a Python script?' His reply was the following:
"curl.setopt(pycurl.WRITEDATA, fp)
possibly combined with:
curl.setopt(pycurl.WRITEFUNCITON, callback) "
...along with Sourceforge links to two revisions of retriever.py. I can barely recall where easy_install put the one I've got; how am I supposed to compare them?
It's pretty apparent this gentleman never had a helpdesk or phone tech support job in the Western Hemisphere, where you have to assume the 'customer' just learned how to use their comb yesterday and be prepared to walk them through everything and anything. One-liners (or three-liners with abstruse links as chasers) don't do it for me.
BZT

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Bundling Multiple Python Modules - python

Pip supports bundling. This is an installation format, and will unpack into multiple files. Anything else would be a bad idea, as it would break imports and per-module metadata.

Related

Stitching non-R scripts with knitr, say .py

Is there a way to combine a python project codebase that spans across different files into one file?

compile py2exe from executable

viewing files in python?

Bash alias to Python script -- is it possible?

Categories

Resources