No default bytecode compilation of Python code on Windows?

No default bytecode compilation of Python code on Windows? - python

I just ran into this SO question and I'm baffled. I'd say I have a fair experience with Python, but only on *nux(-like) OSes and I thought bytecode compilation was a given.
I'm obviously missing something here: was something happening behind the curtain I didn't know about on my OSes, like some configuration defaults ? Is it only on Windows and then, why? Is there any reason why not to compile to bytecode?
Thanks a lot in advance for any opinion, I'm very curious.

Basically, it's offering to pre-compile the Python installation files to bytecode, which would normally happen the first time they're used. As I understand it, it's just intended as a general convenience to avoid that initial compilation on use.
It would potentially save space not compiling all of the files in the standard library, but that would last only as long as you didn't try to use them.

Related

Python directory invocation

I have a directory with several python modules in it. Each module is mutually exclusive of all the others but after lots of trial and error, I have decided that each module horks out when using the multiprocessing functionality in Python. I have used the join() function on each process and its just not working like I want.
What I am really looking for is the ability to drop new mutually exclusive python modules in to the directory and have them invoked when the directory is launched. Does anyone know how to do this?

It sounds to me like you are asking about plugin architecture and sandboxing. Does that sound right?
The plugin component has been done and written about else where. SO has code examples on basic ways to import all the files.
The sandbox part is going to be a harder. Have a look at RestrictedPython and the Restricted Execution docs and the generally older but nevertheless helpful discussion of sandboxing.
If you aren't worried about untrusted code but rather want to isolate errors you could just wrap each module in a generic try/except that handles all errors. This would make debugging hard but would ensure that an error in one module didn't bring down the whole system.
If you aren't worried about untrused code but do need to have each file totally isolated then you might be best off looking into various systems of interprocess communication. I've actually had some luck using Redis for this (which sounds ridiculous but actually has been very easy and effective).
Anyway hopefully some of that helps you. Without more information it's hard to provide more than general thoughts and a guide to better googling.

what is the sole purpose of python being an interpreter? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
What exactly is the sole purpose of python being an interpreter.
It doesn't provide executable files (how a commercial software developer use it?)
If any part of the code had bugs, it doesn't show up unless python
goes to that line at run it. In large projects, all parts of code
doesn't get interpreted every time, so, there would be a lot of
hidden bugs inside project
Every system should have a python installed in it to run those software's...
I am using py2exe, and I find myself puzzled to just look at the executable file size (too large).

First, answers to your questions.
They can use it for parts of their system for which they don't mind the source being visible (e.g. extensions) or they can Open Source their application. They can also use it to develop backend services for something which they're providing as a service (e.g. Youtube). They can also use it for internal tools which they don't plan to release(e.g. with Google).
That's why you need to write tests, exercise discipline and measure test coverage regularly. You sacrifice the compilers ability to check for things and some speed for advantages which I've detailed below.
Yes but it's not too hard to bundle Python along with your app. The entire interpreter + libraries is not that big. Python is pretty much a standard on most UNIX environments today. This is usually not a practical problem. The same issue is there with (say) Java (you need the JVM installed).
py2exe bundles all the modules into a single executable. It will be big. If you want to do compiled programs that are lean, don't use Python. Wrong fit.
Now, a few reasons on why "interpreted".
Faster development time. Programmer time is costlier than computer time so we should optimise for that.
No compilation cycle. Very easy to make incremental changes and check. Quick turnaround.
Introspection and dynamic typing allows certain kinds of coding not possible with some compiled languages like C.
Cross platform. If you have an interpreter for your platform, the program will run there even if it was written on a different platform.

You bring up a few different issues, here are some responses:
1) Technically, Python isn't interpreted (usually) - it is compiled to bytecode and that bytecode is run on a virtual machine.
So Python doesn't provide executables because it runs bytecode, not machine code.
You could just as well ask why Java doesn't produce executables.
The standard advantages of virtual machines apply: A big one being a simplified cross-platform development experience.
You could distribute just the .pyc (compiled bytecode) files if you don't want your source to be available. See this reference.
2) Here, you are talking about dynamic vs. static languages. There are tradeoffs, of course. One disadvantage of dynamic languages, as you mention, is that you get more run-time errors rather than compile-time errors.
There are, of course, corresponding advantages. I'll point you to some resources discussing both sides:
Dynamic type languages versus static type languages
What do people find so appealing about dynamic languages?
http://research.microsoft.com/en-us/um/people/emeijer/Papers/RDL04Meijer.pdf
3) Quite right. Just as you need the Java VM installed to run Java, perl to run Perl, etc.
Regarding your last point:
The whole idea of running in a VM is that you can install that VM once, then run many different apps. By bundlig the whole VM with every app (such as with py2exe), you are going against that concept. So yes, you have to pay the cost in terms of size.

Sole purpose of python is to provide a beautiful language to program in.
Your point #1 and #3 are similar and answer is that professional programmers use py2exe/pyinstaller etc to bundle their programs and distribute, in cases of frameworks/libraries they even don't need to do that.
Your point number #2 is also valid for statically compiled languages, something compiles correctly in C++ doesn't mean it will not crash at run-time or business logic is correct, you anyway need to test each part of your code, so with good unittests and functional tests python is at par with other languages in finding bugs, and as it doesn't need to be comiled and being dynamic means better productivity.
IMO

Python is not an interpreter, but an interpreted language.
This question is more about interpreted language VS compiled languages which has actually no other answer that the usual "it depends of your need".
See Noufal Ibrahim for details, but I'm not sure this question is a good fit for SO.

(1) You can provide installers for Python code (which may install the Python environment). This doesn't prevent you from having commercial code. Note that you can also have Java (also "interpreted" or JIT-compiled) commercial or desktop code and require your users to install a JRE.
(2) Any language, even compiled and strongly type, may produce errors that only show up when you get to that given code (e.g. division by zero). I guess you may be referring to strongly v.s. loosely typed languages. It's not just a matter of compilation, but the fact that strongly-typed languages generally make it easier to find "structural" bugs (e.g. trying to use a string as a number) during the compilation process. In contrast, loosely-typed language often lead to shorter code, which may be easier to manage. What to use really depends on the goal of your application.

Interactivity is good. I find it encourages making small, easily testable functions that build together to make an application.
Unless you are writing simple, statically linked applications, you will usually have some run-time baggage that must be included or installed (mfc, dot net, etc.) for compiled languages. Look at the winsxs folder. Sure, you get to "share" that stuff most of the time, but there is still a lot of space taken up by "needed", if hidden, requirements.
And as far as bugs, run-time bugs will be the same no matter what. Any good programmer would test as much as possible when making changes. This should catch what would be compile time bugs in other languages as well as testing run-time behavior.

A python .exe has to necessarily include a complete copy of the python interpreter. As you say, since it's interpreted it won't touch every line of code until that line is actually run. Some parts may actually invoke a python parse/compile sequence (e.g. eval(), some regexes, etc...). These would fail in a compiled .exe unless the full interpreter was available.

Is it easy to fully decompile python compiled(*.pyc) files?

I wanted to know that how much easy is to decompile python byte code. I have made an application in python whose source I want to be secure. I am using py2exe which basically relies on python compiled files.
Does that secure the code?

Depends on your definition of "fully" (in "fully decompile")...;-). You won't easily get back the original Python source -- but getting the bytecode is easy, and standard library module dis exists exactly to make bytecode easily readable (though it's still bytecode, not full Python source code;-).

Compiling .pyc does not secure the code. They are easily read. See How do I protect Python code?

If you search online you can find decompilers for Python bytecode: there's a free version for downloading but which only handles bytecode up to Python 2.3, and an online service which will decompile up to version 2.6.
There don't appear to be any decompilers yet for more recent versions of Python bytecode, but that's almost certainly just because nobody has felt the need to write one rather than any fundamental difficulty with the bytecode itself.
Some people have tried to protect Python bytecode by modifying the interpreter: there's no particular reason why you can't compile your own interpreter with the different values used for the bytecode: that will prevent simple examination of the code with import dis, but won't stand up long to any determined attack and it all costs money that code be better put into improving the program itself.
In short, if you want to protect your program then use the law to do it: use an appropriate software license and prosecute those who ignore it. Code is expensive to write, but the end result is rarely the valuable part of a software package: data is much more valuable.

Now we can say: VERY EASY decompile, and NOT secure.
For I have use uncompyle6 to decompile my (latest 3.8.0) Python code:
uncompyle6 utils.cpython-38.pyc > utils.py
and decompile effect is NEARLY PERFECT.
origin python vs decompiled python:
-> ALMOST same py code except no comments
Not forget, other can use unpy2exe to
Extract .pyc files from executables created with py2exe
so your method not secure.

Disassembling with python - no easy solution?

I'm trying to create a python script that will disassemble a binary (a Windows exe to be precise) and analyze its code.
I need the ability to take a certain buffer, and extract some sort of struct containing information about the instructions in it.
I've worked with libdisasm in C before, and I found it's interface quite intuitive and comfortable.
The problem is, its Python interface is available only through SWIG, and I can't get it to compile properly under Windows.
At the availability aspect, diStorm provides a nice out-of-the-box interface, but it provides only the Mnemonic of each instruction, and not a binary struct with enumerations defining instruction type and what not.
This is quite uncomfortable for my purpose, and will require a lot of what I see as spent time wrapping the interface to make it fit my needs.
I've also looked at BeaEngine, which does in fact provide the output I need, a struct with binary info concerning each instruction, but its interface is really odd and counter-intuitive, and it crashes pretty much instantly when provided with wrong arguments.
The CTypes sort of ultimate-death-to-your-python crashes.
So, I'd be happy to hear about other solutions, which are a little less time consuming than messing around with djgcc or mingw to make SWIGed libdisasm, or writing an OOP wrapper for diStorm.
If anyone has some guidance as to how to compile SWIGed libdisasm, or better yet, a compiled binary (pyd or dll+py), I'd love to hear/have it. :)
Thanks ahead.

Well, after much meddling around, I managed to compile SWIGed libdisasm!
Unfortunately, it seems to crash python on incorrect (and sometimes correct) usage.
How I did it:
I compiled libdisasm.lib using Visual Studio 6, the only thing you need for this is the source code in whichever libdisasm release you use, and stdint.h and inttypes.h (The Visual C++ compatible version, google it).
I SWIGed the given libdisasm_oop.i file with the following command line
swig -python -shadow -o x86disasm_wrap.c -outdir . libdisasm_oop.i
Used Cygwin to run ./configure in the libdisasm root dir. The only real thing you get from this is config.h
I then created a new DLL project, added x86disasm_wrap.c to it, added the c:\PythonXX\libs and c:\PythonXX\Include folders to the corresponding variables, set to Release configuration (important, either this or do #undef _DEBUG before including python.h).
Also, there is a chance you'll need to fix the path to config.h.
Compiled the DLL project, and named the output _x86disasm.dll.
Place that in the same folder as the SWIG generated x86disasm.py and you're done.
Any suggestions for other, less crashy disasm libs for python?

You might try using ctypes to interface directly with libdisasm instead of going through a SWIG layer. It may be take more development time but AFAIK you should be able to access the underlying functionality using ctypes.

I recommend you look at Pym's disassembly library which is also the backend for Pym's online disassembler.

You can use the distorm library: https://code.google.com/p/distorm/
Here's another build: http://breakingcode.wordpress.com/2009/08/31/using-distorm-with-python-2-6-and-python-3-x-revisited/
There's also BeaEngine: http://www.beaengine.org/
Here's a Windows installer for BeaEngine: http://breakingcode.wordpress.com/2012/04/08/quickpost-installer-for-beaenginepython/

Debugging swig extensions for Python

Is there any other way to debug swig extensions except for doing
gdb python stuff.py
?
I have wrapped the legacy library libkdtree++ and followed all the swig related memory managemant points (borrowed ref vs. own ref, etc.). But still, I am not sure whether my binding is not eating up memory. It would be helpful to be able to just debug step by step each publicized function: starting from Python then going to via the C glue binding into C space, and returning back.
Is there already such a possibility?

gdb 7.0 supports python scripting. It might help you in this particular case.

Well, for debugging, you use a debugger ;-).
When debugging, it may be a good idea to configure Python with '--with-pydebug' and recompile. It does additional checks then.
If you are looking for memory leaks, there is a simple way:
Run your code over and over in a loop, and look for Python's memory consumption.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.