How to deal with user authentication and wrongful modification in scripting languages?

How to deal with user authentication and wrongful modification in scripting languages? - python

I'm building a centralized desktop application using Python/wxPython. One of the requirements is User authentication, which I'm trying to implement using LDAP (although this is not mandatory).
Users of the system will be mechanical and electrical engineers making budgets, and the biggest problem would be industrial espionage. Its a common problem that leaks occur commonly from the bottom on informal ways, and this could pose problems. The system is set up in such a way that every user has access to all and only the information it needs, so that no one person but the people on top has monetary information on the whole project.
The problem is that, for every way I can think to implement the authentication system, Python's openness makes me think of at least one way of bypassing/getting sensible information from the system, because "compiling" with py2exe is the closest I can get to obfuscation of the code on Windows.
I'm not really trying to hide the code, but rather make the authentication routine secure by itself, make it in such a way that access to the code doesn't mean capability to access the application. One thing I wanted to add, was some sort of code signing to the access routine, so the user can be sure that he is not running a modified client app.
One of the ways I've thought to avoid this is making a C module for the authentication, but I would rather not have to do that.
Of course this question is changing now and is not just "Could anyone point me in the right direction as to how to build a secure authentication system running on Python? Does something like this already exist?", but "How do you harden an scripting (Python) against wrongful modification?"

How malicious are your users? Really.
Exactly how malicious?
If your users are evil sociopaths and can't be trusted with a desktop solution, then don't build a desktop solution. Build a web site.
If your users are ordinary users, they'll screw the environment up by installing viruses, malware and keyloggers from porn sites before they try to (a) learn Python (b) learn how your security works and (c) make a sincere effort at breaking it.
If you actually have desktop security issues (i.e., public safety, military, etc.) then rethink using the desktop.
Otherwise, relax, do the right thing, and don't worry about "scripting".
C++ programs are easier to hack because people are lazy and permit SQL injection.

Possibly:
The user enters their credentials into the desktop client.
The client says to the server: "Hi, my name username and my password is password".
The server checks these.
The server says to the client: "Hi, username. Here is your secret token: ..."
Subsequently the client uses the secret token together with the username to "sign" communications with the server.

Related

Python exe - how can I restrict viewing source and byte code?

I'm making a simple project where I will have a downloadable scraper on an HTML website. The scraper is made in Python and is converted to a .exe file for downloading purposes. Inside the python code, however, I included a Google app password to an email account, because the scraper sends an email and I need the server to login with an available Google account. Whilst .exe files are hard to get source code for, I've seen that there are ways to do so, and I'm wondering, how could I make it so that anyone who has downloaded the scraper.exe file cannot see the email login details that I will be using to send them an email when the scraper needs to? If possible, maybe even block them from accessing any of the .exe source code or bytecode altogether? I'm using the Python libraries bs4 and requests.
Additionally, this is off-topic, however, as it is my first time developing a downloadable file, even whilst converting the Python file to a .exe file, my antivirus picked it up as a suspicious file. This is like a 50 line web scraper and obviously doesn't have any malicious code within it. How can I make the code be less suspicious to antivirus programs?

Sadly even today,there is no perfect solution to this problem.
The ideal usecase is to provide this secret_password from web application,but in your case seems unlikelly since you are building a rather small desktop app.
The best and easiest way is to create a function providing this secret_password in a separate file,and compile this file with Cython,thing that will obcufate your script(and your secret_password) at a very good extend.Will this protect you from lets say Anonymous or a state security agency?No.Here comes the reasonable thinking about how secret and important really your password is and from who you mainly can be harmed.
Finally before compiling you can 'salt' your script or further obscufate it with bcrypt or other libaries.
As for your second question antiviruses and specifically windows don't like programms running without installers and unsigned.
You can use inno setup to create a real life program installer.
If you want to deal with UAC or other issues related to unsigned programms you can sign your programm(will cost money).

Firstly, why is it even sending them an email? Since they'll be running the .exe, it can pop up a window and offer to save the file. If an email must be sent, it can be from the user's gmail rather than yours.
Secondly, using your gmail account in this way may be against the terms of service. You could get your account suspended, and it may technically be a felony in the US. Consult a lawyer if this is a concern.
To your question, there's basically no way to obfuscate the password that will be more than a mild annoyance to anyone with the least interest. At the end of the day, (a) the script runs under the control of the user, potentially in a VM or a container, potentially with network communications captured; and (b) at some point it has to decrypt and send the password. Decoding and following either the script, or the network communications that it makes will be relatively straightforward for anyone who wants to put in quite modest effort.

How to safely pass credentials to jdbc interface in Pyspark [duplicate]

The attack
One possible threat model, in the context of credential storage, is an attacker which has the ability to :
inspect any (user) process memory
read local (user) files
AFAIK, the consensus on this type of attack is that it's impossible to prevent (since the credentials must be stored in memory for the program to actually use them), but there's a couple of techniques to mitigate it:
minimize the amount of time the sensitive data is stored in memory
overwrite the memory as soon as the data is not needed anymore
mangle the data in memory, keep moving it, and other security through obscurity measures
Python in particular
The first technique is easy enough to implement, possibly through a keyring (hopefully kernel space storage)
The second one is not achievable at all without writing a C module, to the best of my knowledge (but I'd love to be proved wrong here, or to have a list of existing modules)
The third one is tricky.
In particular, python being a language with very powerful introspection and reflection capabilities, it's difficult to prevent access to the credentials to anyone which can execute python code in the interpreter process.
There seems to be a consensus that there's no way to enforce private attributes and that attempts at it will at best annoy other programmers who are using your code.
The question
Taking all this into consideration, how does one securely store authentication credentials using python? What are the best practices? Can something be done about the language "everything is public" philosophy? I know "we're all consenting adults here", but should we be forced to choose between sharing our passwords with an attacker and using another language?

There are two very different reasons why you might store authentication credentials:
To authenticate your user: For example, you only allow the user access to the services after the user authenticates to your program
To authenticate the program with another program or service: For example, the user starts your program which then accesses the user's email over the Internet using IMAP.
In the first case, you should never store the password (or an encrypted version of the password). Instead, you should hash the password with a high-quality salt and ensure that the hashing algorithm you use is computationally expensive (to prevent dictionary attacks) such as PBKDF2 or bcrypt. See Salted Password Hashing - Doing it Right for many more details. If you follow this approach, even if the hacker retrieves the salted, slow-hashed token, they can't do very much with it.
In the second case, there are a number of things done to make secret discovery harder (as you outline in your question), such as:
Keeping secrets encrypted until needed, decrypting on demand, then re-encrypting immediately after
Using address space randomization so each time the application runs, the keys are stored at a different address
Using the OS keystores
Using a "hard" language such as C/C++ rather than a VM-based, introspective language such as Java or Python
Such approaches are certainly better than nothing, but a skilled hacker will break it sooner or later.
Tokens
From a theoretical perspective, authentication is the act of proving that the person challenged is who they say they are. Traditionally, this is achieved with a shared secret (the password), but there are other ways to prove yourself, including:
Out-of-band authentication. For example, where I live, when I try to log into my internet bank, I receive a one-time password (OTP) as a SMS on my phone. In this method, I prove I am by virtue of owning a specific telephone number
Security token: To log in to a service, I have to press a button on my token to get a OTP which I then use as my password.
Other devices:
SmartCard, in particular as used by the US DoD where it is called the CAC. Python has a module called pyscard to interface to this
NFC device
And a more complete list here
The commonality between all these approaches is that the end-user controls these devices and the secrets never actually leave the token/card/phone, and certainly are never stored in your program. This makes them much more secure.
Session stealing
However (there is always a however):
Let us suppose you manage to secure the login so the hacker cannot access the security tokens. Now your application is happily interacting with the secured service. Unfortunately, if the hacker can run arbitrary executables on your computer, the hacker can hijack your session for example by injecting additional commands into your valid use of the service. In other words, while you have protected the password, it's entirely irrelevant because the hacker still gains access to the 'secured' resource.
This is a very real threat, as the multiple cross-site scripting attacks have shows (one example is U.S. Bank and Bank of America Websites Vulnerable, but there are countless more).
Secure proxy
As discussed above, there is a fundamental issue in keeping the credentials of an account on a third-party service or system so that the application can log onto it, especially if the only log-on approach is a username and password.
One way to partially mitigate this by delegating the communication to the service to a secure proxy, and develop a secure sign-on approach between the application and proxy. In this approach
The application uses a PKI scheme or two-factor authentication to sign onto the secure proxy
The user adds security credentials to the third-party system to the secure proxy. The credentials are never stored in the application
Later, when the application needs to access the third-party system, it sends a request to the proxy. The proxy logs on using the security credentials and makes the request, returning results to the application.
The disadvantages to this approach are:
The user may not want to trust the secure proxy with the storage of the credentials
The user may not trust the secure proxy with the data flowing through it to the third-party application
The application owner has additional infrastructure and hosting costs for running the proxy
Some answers
So, on to specific answers:
How does one securely store authentication credentials using python?
If storing a password for the application to authenticate the user, use a PBKDF2 algorithm, such as https://www.dlitz.net/software/python-pbkdf2/
If storing a password/security token to access another service, then there is no absolutely secure way.
However, consider switching authentication strategies to, for example the smartcard, using, eg, pyscard. You can use smartcards to both authenticate a user to the application, and also securely authenticate the application to another service with X.509 certs.
Can something be done about the language "everything is public" philosophy? I know "we're all consenting adults here", but should we be forced to choose between sharing our passwords with an attacker and using another language?
IMHO there is nothing wrong with writing a specific module in Python that does it's damnedest to hide the secret information, making it a right bugger for others to reuse (annoying other programmers is its purpose). You could even code large portions in C and link to it. However, don't do this for other modules for obvious reasons.
Ultimately, though, if the hacker has control over the computer, there is no privacy on the computer at all. Theoretical worst-case is that your program is running in a VM, and the hacker has complete access to all memory on the computer, including the BIOS and graphics card, and can step your application though authentication to discover its secrets.
Given no absolute privacy, the rest is just obfuscation, and the level of protection is simply how hard it is obfuscated vs. how much a skilled hacker wants the information. And we all know how that ends, even for custom hardware and billion-dollar products.
Using Python keyring
While this will quite securely manage the key with respect to other applications, all Python applications share access to the tokens. This is not in the slightest bit secure to the type of attack you are worried about.

I'm no expert in this field and am really just looking to solve the same problem that you are, but it looks like something like Hashicorp's Vault might be able to help out quite nicely.
In particular WRT to the problem of storing credentials for 3rd part services. e.g.:
In the modern world of API-driven everything, many systems also support programmatic creation of access credentials. Vault takes advantage of this support through a feature called dynamic secrets: secrets that are generated on-demand, and also support automatic revocation.
For Vault 0.1, Vault supports dynamically generating AWS, SQL, and Consul credentials.
More links:
Github
Vault Website
Use Cases

Disguising username & password on distributed python scripts

This question is a bit far fetched (i don't even know if the way i'm going about doing this is correct).
I have a script that gathers some information on a computer. The intent is to have that script ftp/sftp/any-transfer etc some data to a remote server. This script is intended to be distributed among many people also.
Is it possible to hide the password/user of remote server in the script (or perhaps even the implementation details?). I was thinking of encoding it in some way. Any suggestions?
Also, in compiled languages like java or C, is it safe to just distribute around a compiled version of the code?
Thanks.

The answer is no. You can't put the authentication details into the program and make it impossible for users to get those same authentication details. You can try to obfuscate them, but it is not possible to ensure that they cannot be read.
Compiling the code will not even obfuscate them very much.
One approach to the problem would be to implement a REST web interface and supply each distribution of the program with an API key of some sort. Then set up the program to connect to the interface over SSL using its key and put whatever information it needs there. Then you could track which version is connecting from where and limit each distribution of the program to updating a restricted set of resources on the server. Furthermore you could use server heuristics to guess if an api key has leaked and block an account if that occurs.
Another way would be if all of the hosts/users of the program are trusted, then you could set up user accounts on a server node and each script could authenticate with its own username and password or SSH key. Your server node would then have to restrict access based on what each user is allowed to update. Using SSH key based authentication allows you to avoid leaving the passwords around while still allowing authenticated access to your server.

Just set the name to "username" and password to "password", and then when you give it to your friends, provision an account/credential that's only for them, and tell them to change the script and be done with it. That's the best/easiest way to do this.

to add onto jmh's comments and answer another part of your question, it is possible to decompile the java from the .class byte code and get almost exactly what the .java file contains so that won't help you. C is more difficult to piece back together but again, its certainly possible.

I sometimes compress credentials with zlib and compile to pyo file.
It protect from "open in editor and press ctrl+f" and from not-programmers only.
Sometimes I used PGP cryptography.)

Secure authentication system in python?

I am making a web application in python and I would like to have a secure login system.
I have done login systems many times before by having the user login and then a random string is saved in a cookie which is also saved next to that user in a database which worked fine but it was not very secure.
I believe understand the principles of an advanced system like this but not the specifics:
Use HTTPS for the login page and important pages
Hash the password saved in the database(bcrypt, sha256? use salt?)
Use nonces(encrypted with the page url and ip?)
But apart from those I have no idea how to reliably check if the person logged in is really the user, or how to keep sessions between page requests and multiple open pages securely, etc.
Can I have some directions (preferably specific ones since I am new to this advanced security programming.
I am just trying to accomplish a basic user login-logout to one domain with security, nothing too complicated.

This answer mainly addresses password hashing, and not your other subquestions. For those, my main advice would be don't reinvent the wheel: use existing frameworks that work well with GAE. It offers builtin deployments of Django, but also has a builtin install of WebOb, so various WebOb-based frameworks (Pyramid, Turbogears, etc) should also be considered. All of these will have premade libraries to handle a lot of this for you (eg: many of the WebOb frameworks use Beaker for their cookie-based session handling)
Regarding password hashing... since you indicated in some other comments that you're using Google App Engine, you want to use the SHA512-Crypt password hash.
The other main choices for storing password hashes as securely as possible are BCrypt, PBKDF2, and SCrypt. However, GAE doesn't offer C-accelerated support for these algorithms, so the only way to deploy them is via a pure-python implementation. Unfortunately, their algorithms do way too much bit-fiddling for a pure-python implementation to do a fast enough job to be both secure and responsive. Whereas GAE's implementation of the Python crypt module offers C-accelerated SHA512-Crypt support (at least, every time I've tested it), so it could be run at sufficient strength.
As far as writing actual code goes, you can use the crypt module directly. You'll need to take care of generating your own salt strings when passing them into crypt, and when encrypting new passwords, call crypt.crypt(passwd, "$6$" + salt). The $6$ tells it to use SHA512-Crypt.
Alternately, you can use the Passlib library to handle most of this for you (disclaimer: I'm the author of that library). For quick GAE deployment:
from passlib.context import CryptContext
pwd_context = CryptContext(schemes=["sha512_crypt"],
default="sha512_crypt",
sha512_crypt__default_rounds=45000)
# encrypt password
hash = pwd_context.encrypt("toomanysecrets")
# verify password
ok = pwd_context.verify("wrongpass", hash)
Note: if care about password security, whatever you do, don't use a single HASH(salt+password) algorithm (eg Django, PHPass, etc), as these can be trivially brute-forced.

It's hard to be specific without knowing your setup. However, the one thing you should not do is reinventing the wheel. Security is tricky, if your wheel is lacking something you may not know until it's too late.
I wouldn't be surprised if your web framework came with a module/library/plugin for handling users, logins and sessions. Read its documentation and use it: it was hopefully written by people who know a bit about security.
If you want to know how it's done, study the documentation and source of said module.

Security considerations - office website/portal on GAE

If one needs to create an office website (that serves as a platform for clients/customers/employees) to login and access shared data, what are the security considerations.
to give you some more detail,
The office portal has been developed in django/python and hosted through GAE. Essentially, the end point comes with a login/password to enter into the portal and access data.
I would like to know:
a) what are the things we can do to bring in a high level of security. Essentially the data is critical and hence need to be accessed by authorized people only. So would like to make it such that "The app is as safe as - how safely one keeps his password. Meaning, the only way to enter the system (unauthorized) is through a password leak (by the person) and not in any hackish way." :)
b) can we host the apps on GAE (appspot.com) with https?
c) are there better ways to secure other than passwords (i have heard about ssh keys/certificates). But the ultimate users may not be highly tech savvy.

There is always the choice between usabiity and secutity. The more security features you implent, the more difficult it gets to use it.
can we host the apps on GAE (appspot.com) with https?
Yes, but not on your own domain, only on appspot.com. If you are serving your app off of an own domain, you must direct all secure traffic through your app's appspot domain (on your own domain, you'd have to buy a SSL certificate, and you would need a dedicated IP etc.). If you really have to, there are ways to route SSL traffic over your own domain, but as this requires another server running something like stunnel, it gives attackers another attack target.
If your app has username/password authentication, the app is really as safe as how safely one keeps his password, if you have no bugs in your code that could be exploited. About the "hackish way": on GAE, you don't have to care about server security, the only possible attack target is your code.
These are some strategies for securing your app:
good QA and code review to find critical bugs; Django has already built-in protection against most trivial attacks like XSRF and SQL injection, so look at the parts of your own code that are related to critical data and authentication
think of other authentication methods like client side certificates (easy to use for the end user, most browser support this natively and modern operating systems have a certificate storage; probably not an easy thing to do on GAE)
the weakest point of every secure enviromnent is the user, so you should inform the users about good practices on handling sensitive data and passwords (BTW, requiring a password change every few months does not improves security at all as it usally results in users writing down their passwords as they can't remember it, you loose more security than you gain)
you should have good intrusion detection to lock out an attacker as soon as possible, as example behaviour analysis; Example: if a user from the USA logs in from an IP in Estonia, this is suspicious
network access restrictions: you could block all IP ranges except those from your enterprise of accessing critical data, if a password gets leaked, this minimizes the possible impact
improve end user security: if one of the users have a trojan on their computer that makes screen captures or keylogs, all your security is lost as the attacker could just watch the user while he's vieweing sensitive data; you should have a good security police in your enterprise
force users to access your site over SSL, you should not let the users choose if they prefer security ocer comfort of not

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.