I am executing Python scripts through Python embedding (Python.NET in C#), and I need to make sure these scripts cannot be tampered with. The scripts can be in .pyc (compiled) format.
Is there a way to make these scripts tamper-proof? .pyc files can be easily decompiled, tampered with and recompiled. I looked at signet but I believe it requires the python files to be frozen into an executable.
Any help will be welcome.
What you need is to sign these scripts. Signing a file means producing a signature from a private key and that file; the idea is that producing a valid signature without the private key is computationally infeasible. You also have a public key (which can safely be published) whose only purpose is to verify that the signature matches the file. In your case the C# host would hold the public key and verify each script before handing it to the interpreter. IIRC, this is the same mechanism that Windows uses to trust software (i.e. a developer signs the software with a certificate that Windows trusts, Windows verifies the signature, and the software is then treated as trusted, I think).
This is quite a common cryptographic pattern, so there are many tools that implement it, but one that is particularly good is GPG. It's free and open-source, it has bindings in many languages, it is very well documented, and it handles everything from the creation of your key pair to the signing, and much more. This also means that using GPG is a little bit complex, but I found this post where someone wanted to verify a file they download in C# using GPG, so maybe it's going to be helpful.
Also, notice that in the other post they also use a hash check to ensure that the script was not corrupted during download (i.e. there was a download error). You could detect that with the signature alone, but then you would be unable to tell whether the failure comes from someone deliberately tampering with your code or from a bad download that just needs to be retried.
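As a minimal sketch of the hash check mentioned above (not the code from the linked post), the following compares a file's SHA-256 digest with a digest published alongside it. The function names are hypothetical, and the choice of SHA-256 is an assumption; the original does not say which hash is used.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected_hex):
    """True if the file's digest equals the published hex digest."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())
```

A mismatch here only says the bytes differ from what was published; it is the signature check that ties the file to the holder of the private key.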
I know there is a lot of debate within this topic.
I did some research and looked into some of the questions here, but none matched exactly.
I'm developing my app in Django, using Python 3.7 and I'm not looking to convert my app into a single .exe file, actually it wouldn't be reasonable to do so, if even possible.
However, I have seen some apps developed in javascript that use bytenode to compile code to .jsc
Is there such a thing for python? I know there is .pyc, but for all I know those are just runtime compiled files, not actually a bytecode precompiled script.
I wanted to protect the source code on some files that can compromise the security of the app. After all, deploying my app means deploying a fully fledged python installation with a web port open and an app that works on it.
What do you think, is there a way to do it, does it even make sense to you?
Thank you
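The precompiled (.pyc) files are what you are looking for. They contain bytecode that the interpreter can run even when the original .py file is absent (for that, the .pyc has to sit where the .py would be, not inside __pycache__).
You can build the .pyc files directly using python -m py_compile <filename>. Older Python versions also had a separate .pyo format for optimized bytecode; since Python 3.5 the optimized output is a .pyc with an opt- tag in its name. Running with -O removes assert statements, and -OO additionally removes docstrings, which reduces the file size a little. Identifier names are not removed.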
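The same compilation can be done from code with the standard library's py_compile module. This is a small sketch; compile_module is a hypothetical helper name, and on Python 3.5 and later the optimized result is written under __pycache__ as a .pyc with an opt- tag rather than as a .pyo file.

```python
import py_compile


def compile_module(source_path, optimize=2):
    """Byte-compile `source_path` and return the path of the .pyc written.

    optimize=0 keeps everything, 1 drops assert statements,
    2 additionally drops docstrings (the equivalent of -OO).
    """
    return py_compile.compile(source_path, doraise=True, optimize=optimize)
```

To ship without the source, the resulting file would then be copied next to where the .py used to be and renamed to <module>.pyc.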
Note that it might still be possible to decompile the generated bytecode with enough effort, so don't use it as a security measure.
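To illustrate why a .pyc hides very little, here is a sketch that loads the code object back out of a compiled file with marshal and lists the names and constants it contains. It assumes the 16-byte .pyc header used since Python 3.7; the helper names and the settings.py example are made up for illustration.

```python
import marshal
import py_compile
import tempfile
from pathlib import Path


def names_in_pyc(pyc_path):
    """Return (names, constants) of the module code object stored in a .pyc.

    Assumes the 16-byte header used since Python 3.7 (PEP 552).
    """
    with open(pyc_path, "rb") as fh:
        fh.read(16)              # skip magic number, flags, mtime/size or hash
        code = marshal.load(fh)  # the module's top-level code object
    return code.co_names, code.co_consts


def demo():
    """Compile a tiny module with -OO semantics and inspect the result."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "settings.py"
        src.write_text('SECRET_KEY = "hunter2"\n')
        pyc = py_compile.compile(str(src), doraise=True, optimize=2)
        return names_in_pyc(pyc)
```

Calling demo() returns tuples that still contain the variable name and the string literal, even at the highest optimization level.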
I have a wheel built on MS Windows running in a very restricted environment, (cannot connect to internet). I can copy it to my machine running Linux. Then, I'd like to upload it to private PyPi.
I don't want to use twine. I had too much bad experience with Python infrastructure tools, so would like to avoid them as much as possible, but if this is not reason enough for you, think about it as "learning experience": I just really want to know what API do I need to use in order to put a file on PyPi server.
To spare you some more effort: https://pypiserver.readthedocs.io/en/latest/ I also read this, and there's no useful info here as well.
The only thing I could find in terms of documentation is this: https://www.python.org/dev/peps/pep-0503/ which is useless for my case.
This is the closest I've gotten so far: https://github.com/python/cpython/blob/master/Lib/distutils/command/upload.py#L92 though it still leaves a lot to be desired, as in: what fields are actually necessary and the restrictions on the contents of the fields.
I'd like to (PGP/GPG) sign python code. Yes, I have read this and many other sites that talk about protecting and obfuscating python code - this all is not what I want. I DON'T want to obfuscate code.
I want customers and users to see the code, they could modify code, copy it and make derivative work, I'd like to have the software under the GPLv3.
But I want to have plugins that are "signed", so they can be kind of trusted during execution.
Is this possible in Python? Can I import a library after checking its gpg signing?
What would be easy: check the gpg signing of a file, and then load it via import, else raise an exception. But this only would be possible for single-file-imports, not directory python modules.
It is clear that, if the customer changes the GPG key in the program, or deletes some lines himself in the checking algorithm, all is gone - but this is not the problem.
He could do anything he wants - but this would be silly.
What he wants is trustworthiness.
I want to let him add a third party plugin by copying it into a "plugins" directory, and have the program check the plugin for "trustworthiness" - and then import it.
(So he could run plugins that are not signed, but with his own risk.)
Python's import mechanism already provides all the tools necessary to achieve what you want. You can install different kinds of import hooks to support it.
In particular, you'll probably find it convenient to install a meta path hook that searches for "signed modules" and returns a Loader that is able to perform the imports from this signed format.
A very simple and convenient format for your signed plug-ins would be a zip archive containing:
The code of the plug-in in the form of modules/packages
A PGP signature of the above code
In this way:
Your loader should unpack the zip and check the signature. If it matches, you can safely load the plug-in; if it doesn't, you should ask the user whether to trust the plug-in anyway (or abort).
If the user wants to modify the plug-in, they can simply unpack the zip archive and modify it as they wish.
Imports from zip archives are already implemented in the zipimport module. This means that you don't have to rewrite a loader from scratch.
Actually if you want to reduce the code for the hooks to the minimum you'd simply need to verify the signature and then add the path to the zip archive into sys.path, since python already handles imports from zip archive even without explicitly using zipimport.
Using this design you just have to install these hooks and then you can import the plug-in as if they were normal modules and the verification etc. will be done automatically.
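A minimal sketch of the "verify, then add the zip to sys.path" variant could look like the following. It makes several assumptions that are not in the answer: the signature is a detached file next to the archive (plugin.zip.sig) rather than stored inside it, the gpg command-line tool is on PATH, and the signer's public key is already in the local keyring. load_plugin and gpg_signature_ok are hypothetical names.

```python
import importlib
import subprocess
import sys


def gpg_signature_ok(archive_path, signature_path):
    """Check a detached signature by running the `gpg` command-line tool.

    Assumes `gpg` is installed and the public key is in the keyring;
    exit status 0 means the signature verified.
    """
    try:
        result = subprocess.run(
            ["gpg", "--batch", "--verify", signature_path, archive_path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:              # gpg not installed or not executable
        return False
    return result.returncode == 0


def load_plugin(archive_path, module_name, verify=gpg_signature_ok,
                allow_unsigned=False):
    """Import `module_name` from a zip archive once its signature checks out."""
    signature_path = archive_path + ".sig"
    if not verify(archive_path, signature_path) and not allow_unsigned:
        raise ImportError(f"untrusted plug-in: {archive_path}")
    if archive_path not in sys.path:
        sys.path.insert(0, archive_path)  # Python imports from zips natively
    importlib.invalidate_caches()
    return importlib.import_module(module_name)
```

The verify parameter is there so the check can be swapped for another backend, and allow_unsigned=True corresponds to the user choosing to run an unsigned plug-in at their own risk. A full meta path hook would wrap the same check in a finder instead of calling load_plugin explicitly.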
I know this is an old post, but we've developed a new solution. We were confronted with the same challenge -- to distribute python source code, but to prevent hackers from tampering with the code. The solution we developed was to create a custom loader for our application using signet http://jamercee.github.io/signet/.
What signet does is scan your script and its dependencies, creating SHA-1 hashes. It embeds these hashes into a custom loader, which you deliver to your customer with your script. Your customers run the loader, which re-verifies the hashes before it transfers control to your script for normal execution. If there has been tampering, it emits an error message and refuses to run the tampered code.
Signet is multiplatform and runs on Windows, Unix, Linux, FreeBSD, etc. If you deploy to Windows, the loader building process can even apply your company's code-signing certificate so the loader itself can be verified. It also does PE verification.
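The general idea of hashing a script's files at build time and re-checking them before execution can be sketched in a few lines. This is not signet's implementation: the function names are hypothetical, the manifest is a plain dict rather than something embedded in a compiled loader, and SHA-256 is used here in place of the SHA-1 mentioned above.

```python
import hashlib
from pathlib import Path


def file_sha256(path):
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def build_manifest(paths):
    """Map each path to its digest; done once, at build time."""
    return {str(p): file_sha256(p) for p in paths}


def tampered_files(manifest):
    """Return the paths whose contents changed or which are now missing."""
    bad = []
    for path, expected in manifest.items():
        try:
            if file_sha256(path) != expected:
                bad.append(path)
        except OSError:          # file deleted or unreadable
            bad.append(path)
    return bad
```

A loader built on this would refuse to start when tampered_files returns anything. The check is only as trustworthy as the place the manifest is stored, which is why signet bakes it into a signed executable.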
The code is fully open source including the c++ source code to the default loader template. You can extend the loader to do additional verifications and even take actions if it detects code tampering (like undoing the tampering...).
I'm a self-taught, amateur, purely recreational programmer. I don't understand all the fancy programming lingo, and I certainly don't have any good resources, apart from this website, where I can go for help. (i.e., Please dumb it down for me!) I would imagine my question here is somewhat common, but I honestly couldn't find any answers on Google or this website, probably because I don't know the proper terminology to search for.
~~~
Having said that, I feel I have a pretty solid grasp on the basics of Python. And now, I've created an application that I'd like to share with a friend. My application accesses JPEG image files on my computer using a directory path that I've written into the code itself. However, I'd like my friend to be able to store these image files anywhere on their computer, not necessarily in the file folder that I've been using.
I assume the best way to accomplish this is to allow my friend to choose the directory path for themselves and then to write their chosen directory path to a file at a predetermined location on their computer. My application would then have that file's location prewritten into its code. This way, it would be trivially easy to open the file at the predetermined location, and then that file would point my application to my friend's chosen directory path.
1.) Are any of my intuitions here misguided? Are there better ways of doing this?
2.) If you think my general approach is a reasonable one, then is there a good/common place on the computer where applications typically store their directory paths upon installation?
Any advice - or any recommended resources - would be very much appreciated! Thanks!
Well, the standard way to do this is a lot more complicated and platform-specific:
On traditional Unix, this is pretty simple; you create a text file in some simple format (e.g., the one used by ConfigParser), named, say, ~/.myprogram.cfg, and you write a line to it that looks like image_path=/path/to/images (under a section header, which ConfigParser requires).
On most modern Linux systems, or any other FreeDesktop/XDG-based system, you should (at least for GUI apps) instead use a special directory looked up in the environment as XDG_CONFIG_HOME, falling back to ~/.config, instead of using ~.
On Windows, the standard place to store stuff like this is the Windows Registry (e.g., by using winreg), by creating a key for your program and storing a value with name image_path and value /path/to/images there.
On Mac, the standard place to store stuff like this is in the NSUserDefaults database (e.g., by using PyObjC, which isn't part of the stdlib but historically came bundled with Apple's pre-installed Python) by opening the default domain for your program and adding a value with key image_path and value… well, you probably want a Cocoa bookmark (maybe even a security-scoped one), not a path.
That probably all sounds way, way too complicated.
One option is to use a library that wraps this all up for you. If you're already using a heavy-duty framework like, say, Qt, it probably has functionality built-in to do that. Otherwise, it may take a lot of searching to find something.
A simpler alternative is to just pretend everything is like traditional Unix. That will work on Windows and Mac. It will be slightly annoying on some Windows versions that your config file will be visible in their home directory, but not a huge deal. It means you won't get some of the bonus features that Mac provides, like being able to magically follow the directory if the user moves it somewhere else on his hard drive, or remembering the settings if he reinstalls OS X and migrates his old settings, but again, usually that's fine.
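Here is a minimal sketch of that "pretend everything is traditional Unix" approach using the standard library's configparser. The file name ~/.myprogram.cfg and the image_path key come from the example above; the [paths] section name and the two function names are made up for illustration.

```python
import configparser
from pathlib import Path

CONFIG_PATH = Path.home() / ".myprogram.cfg"


def save_image_path(image_dir, config_path=CONFIG_PATH):
    """Write the chosen image directory to the config file."""
    # interpolation=None so a '%' in a folder name is stored literally
    parser = configparser.ConfigParser(interpolation=None)
    parser["paths"] = {"image_path": str(image_dir)}
    with open(config_path, "w", encoding="utf-8") as fh:
        parser.write(fh)


def load_image_path(config_path=CONFIG_PATH, default=None):
    """Read the image directory back, or return `default` if not set yet."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_path, encoding="utf-8")  # a missing file is skipped
    return parser.get("paths", "image_path", fallback=default)
```

On first run load_image_path() returns None, at which point the program can ask the user for a folder and call save_image_path() with the answer.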
In between the extremes, you can pretend everything is like Linux, using a special, and unobtrusive, location for the files on Windows and Mac just as you do there. Both platforms have APIs to look up special directories, called "application data" on Windows and "application support" on Mac. Using PyWin32 or PyObjC, respectively, these are pretty easy to look up. (For example, see this answer.) Then you just create a subdirectory there named My App on Windows, or com.mydomain.myapp on Mac, and store the file there.
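As a rough sketch of that in-between option without PyWin32 or PyObjC, the per-user directory can be approximated from environment variables and well-known locations. This is an assumption-laden shortcut rather than the official lookup APIs: it relies on the APPDATA variable on Windows, on ~/Library/Application Support on Mac, and on XDG_CONFIG_HOME (falling back to ~/.config) elsewhere. user_config_dir is a hypothetical name.

```python
import os
import sys
from pathlib import Path


def user_config_dir(app_name, platform=sys.platform, env=os.environ, home=None):
    """Best-guess per-user settings directory for `app_name`.

    `platform`, `env` and `home` are parameters only so the logic can be
    exercised without touching the real environment.
    """
    home = Path(home) if home is not None else Path.home()
    if platform.startswith("win"):
        base = Path(env.get("APPDATA") or home / "AppData" / "Roaming")
    elif platform == "darwin":
        base = home / "Library" / "Application Support"
    else:
        base = Path(env.get("XDG_CONFIG_HOME") or home / ".config")
    return base / app_name
```

The program would create this directory on first use (for example with mkdir(parents=True, exist_ok=True)) and keep its config file inside it.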