Hello Python Programmers!
I have created a fully-functioning user account system using MongoDB to store the usernames and passwords, along with data associated with that account. It works perfectly and I'm extremely happy with it. You can sign in, sign up, reset password, and more. There's only one problem though now.
I worry that users could decompile the source, extract the MongoDB login key/string, and access the database remotely from outside the application. This is dangerous because it could allow a data leak of the usernames and passwords (along with other data) contained within the database. I am unaware of a method to "obfuscate" the string in a secure way to prevent that, or of other methods of authenticating the connection to the database.
I could use PyArmor for obfuscation, but even that I don't trust enough. I know PyArmor has been deobfuscated before, and as said by a wise programmer, "Anything that can be read by a computer can be read by humans". With that being said, if someone can deobfuscate it, they can get the string.
On top of this, I also don't know how to authenticate a user login. Once the user successfully logs in, two variables marked as "LoggedIn" and "LoggedUser" are changed to their respective values. But, as far as I'm aware from a security perspective, these values could be spoofed to give easy access to any benefits a logged-in user would have... or even to pose as a logged-in user and change that user's values in the database.
If anyone knows ways to protect the string and authenticate the user better, please let me know.
Thanks!
I have built a browser in Python 3.7.2 using the PyQt5 framework. To create a password manager, how can I detect when a username and password have been entered and sent off, and what the username and password are, so that a popup can be shown offering to save them?
The method QWebView::url can help you determine the currently loaded website. You should have a local database with the stored passwords, so with the current domain you can look up any stored password in that database. (Ideally, the stored passwords will be encrypted with a master password.)
As for the HTML insertion part, there are a couple of approaches you can take. I haven't done this in a while, but take a look at the following methods:
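A minimal sketch of the per-domain lookup described above, assuming a local SQLite file as the store. The table layout and the names `open_store` and `find_logins` are hypothetical, and the `secret` column is assumed to hold a value that has already been encrypted elsewhere.

```python
import sqlite3
from urllib.parse import urlparse


def open_store(path=":memory:"):
    """Open (or create) the local credential store (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS logins ("
        "host TEXT NOT NULL, username TEXT NOT NULL, secret BLOB NOT NULL, "
        "PRIMARY KEY (host, username))"
    )
    return conn


def find_logins(conn, page_url):
    """Return (username, secret) rows stored for the host of page_url."""
    host = urlparse(page_url).hostname
    if host is None:
        # Not an absolute URL, so there is no domain to look up.
        return []
    rows = conn.execute(
        "SELECT username, secret FROM logins WHERE host = ? ORDER BY username",
        (host,),
    )
    return rows.fetchall()
```

In the browser, `page_url` would come from the view's current URL converted to a string.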
`QWebView::setHtml`
`QWebFrame::evaluateJavaScript`
`QWebFrame::documentElement()`
With a `QWebFrame` there are many helpful methods that you can use.
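UPDATE: The methods above are deprecated (QtWebKit has been superseded by QtWebEngine, where QWebEnginePage::runJavaScript plays a similar role), but there are still ways to achieve this.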
Note: You should have a robust password-field-detection.
Deciding when it is appropriate to store a new password is a bit more tricky. This is how it seems to work in commercial browsers:
Detect a successful POST request that includes a password-type field (for this you need to monitor and inspect all HTTP requests that happen inside your browser).
Find if there's a stored password for the domain. If there is, prompt users to update the password if it's a different password, otherwise prompt users to store a new password. INSERT/UPDATE the password into the database.
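The two steps above could be sketched roughly as follows. This assumes the form was submitted as an ordinary URL-encoded body and that the names of the password-type fields were already collected from the page. Both function names are hypothetical.

```python
from urllib.parse import parse_qs


def extract_password(form_body, password_field_names):
    """Return the submitted password from a URL-encoded POST body, if any."""
    fields = parse_qs(form_body)
    for name in password_field_names:
        if name in fields:
            return fields[name][0]
    return None


def save_prompt_action(stored_password, submitted_password):
    """Decide which prompt to show after a successful login POST.

    stored_password is None when nothing is saved for this domain and user.
    Returns "store", "update", or None when there is nothing to do.
    """
    if stored_password is None:
        return "store"
    if stored_password != submitted_password:
        return "update"
    return None
```

For example, `save_prompt_action(None, "s3cret")` yields `"store"`, while an unchanged password yields `None`, so no popup is shown.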
I'm writing a CLI using python that pretty much is wrapping around an API for a website. There is authentication for the API, so I need to ask the user for their username and password. I'm not sure how to store these on the system without having them saved in plaintext somewhere. Is there a best practice for something like this?
As an example, a user might call from the command line:
python some_cli.py
And this will prompt them for their username and password if it isn't already saved. I thought about trying to save them with os.putenv or os.environ, but that won't be saved since this process will die and these won't be saved for future processes. The only thing I can think of is to have a file that this information will be saved in and read from.
Use the credentials the user enters to log into the web API that you are wrapping. The API should return a token or a session, just as if you were using it in the browser. Store this token or session somewhere in your CLI program as a variable or store this in a file. This will need to remain as plaintext. Each CLI instance can use this file to make requests to the API when run. You will need to handle expired sessions/tokens too by asking the user for their credentials to re-authenticate.
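A rough sketch of the token file, assuming the token is a plain string and that JSON is an acceptable on-disk format. The file location and the function names are hypothetical. The file stays plaintext, as noted above, so the sketch only restricts who can read it.

```python
import json
import os
from pathlib import Path


def save_token(path, token):
    """Write the token to path; a newly created file is owner-only (0o600)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Passing the mode to os.open means a new file is never briefly
    # world-readable. The mode is largely ignored on Windows, and an
    # already existing file keeps whatever permissions it had.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        json.dump({"token": token}, fh)


def load_token(path):
    """Return the saved token, or None if the file is missing or malformed."""
    try:
        with open(path) as fh:
            data = json.load(fh)
    except (OSError, ValueError):
        return None
    return data.get("token") if isinstance(data, dict) else None
```

When `load_token` returns `None`, or the API rejects the token as expired, the CLI would prompt for credentials again and call `save_token` with the fresh token.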
Generally, passwords are salted and hashed before they are stored on the system's hard disk. It sounds to me as though you're writing a client-side password storage script. Therefore, I would recommend the SHA-2 or bcrypt hashing algorithms to make your passwords unintelligible before storing them. Do not use MD5 or SHA-1 to hash your passwords, as they have known vulnerabilities.
When the user-supplied password and the real password are compared, they are not compared in plaintext. The user-supplied password is first salted, then hashed. The resulting hash is compared with the hash of the "correct" password that is stored on the disk. Using this method, the plaintext password is never stored on the disk. Additionally, since the probability that two different passwords produce the same hash is extraordinarily low, it is considered a much, much safer practice than storing plaintext passwords, because hashes are extremely difficult to reverse even if the attacker knows the hash.
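The salt, hash, and compare flow just described might look like this with only the standard library. Note the substitution: this sketch uses PBKDF2-HMAC-SHA256 from `hashlib` (a slow construction built on SHA-2) rather than a bare SHA-2 digest or bcrypt. The iteration count is an assumed value, not a recommendation from this thread.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; tune for your own hardware


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Return (salt, digest); a fresh random salt is made when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(password, salt, expected_digest, iterations=ITERATIONS):
    """Re-derive the digest from the candidate and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected_digest)
```

Only the salt and digest are written to disk. `hmac.compare_digest` is used instead of `==` so the comparison time does not depend on where the first differing byte is.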
This thread has a couple of interesting implementations of salting and hashing, including a bcrypt implementation: "Salt and hash a password in python".
A secure password storage tutorial may help you on your journey.
Keep in mind that cryptography has its weaknesses. Rainbow table attacks, timing attacks, and known plaintext attacks are all things that must be understood when switching to cryptographic password storage. That being said, cryptography is a highly respected field known to offer good security.
I'd recommend you join Stack Exchange's Cryptography Forum
The attack
One possible threat model, in the context of credential storage, is an attacker who has the ability to:
inspect any (user) process memory
read local (user) files
AFAIK, the consensus on this type of attack is that it's impossible to prevent (since the credentials must be stored in memory for the program to actually use them), but there are a couple of techniques to mitigate it:
minimize the amount of time the sensitive data is stored in memory
overwrite the memory as soon as the data is not needed anymore
mangle the data in memory, keep moving it, and other security through obscurity measures
Python in particular
The first technique is easy enough to implement, possibly through a keyring (hopefully kernel space storage)
The second one is not achievable at all without writing a C module, to the best of my knowledge (but I'd love to be proved wrong here, or to have a list of existing modules)
The third one is tricky.
In particular, Python being a language with very powerful introspection and reflection capabilities, it's difficult to keep the credentials away from anyone who can execute Python code in the interpreter process.
There seems to be a consensus that there's no way to enforce private attributes and that attempts at it will at best annoy other programmers who are using your code.
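To make the point about private attributes concrete, here is a small illustration. The `Credentials` class and `dump_attributes` helper are made up for this example. A double-underscore attribute is only renamed, not protected.

```python
class Credentials:
    def __init__(self, password):
        # Inside the class body, __password is rewritten by the compiler
        # to _Credentials__password (name mangling).
        self.__password = password


def dump_attributes(obj):
    """Return a copy of the instance dict, 'private' names included."""
    return dict(vars(obj))


creds = Credentials("s3cret")

# The unmangled name is not found from outside the class...
found_plain_name = hasattr(creds, "__password")

# ...but the mangled name is readable by any code in the process.
leaked = creds._Credentials__password
```

Here `found_plain_name` is `False` and `leaked` holds the password, and `dump_attributes(creds)` exposes the same value without even knowing the class name.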
The question
Taking all this into consideration, how does one securely store authentication credentials using Python? What are the best practices? Can something be done about the language's "everything is public" philosophy? I know "we're all consenting adults here", but should we be forced to choose between sharing our passwords with an attacker and using another language?
There are two very different reasons why you might store authentication credentials:
To authenticate your user: For example, you only allow the user access to the services after the user authenticates to your program
To authenticate the program with another program or service: For example, the user starts your program which then accesses the user's email over the Internet using IMAP.
In the first case, you should never store the password (or an encrypted version of the password). Instead, you should hash the password with a high-quality salt and ensure that the hashing algorithm you use is computationally expensive (to prevent dictionary attacks) such as PBKDF2 or bcrypt. See Salted Password Hashing - Doing it Right for many more details. If you follow this approach, even if the hacker retrieves the salted, slow-hashed token, they can't do very much with it.
In the second case, there are a number of things done to make secret discovery harder (as you outline in your question), such as:
Keeping secrets encrypted until needed, decrypting on demand, then re-encrypting immediately after
Using address space randomization so each time the application runs, the keys are stored at a different address
Using the OS keystores
Using a "hard" language such as C/C++ rather than a VM-based, introspective language such as Java or Python
Such approaches are certainly better than nothing, but a skilled hacker will break them sooner or later.
Tokens
From a theoretical perspective, authentication is the act of proving that the person challenged is who they say they are. Traditionally, this is achieved with a shared secret (the password), but there are other ways to prove yourself, including:
Out-of-band authentication. For example, where I live, when I try to log into my internet bank, I receive a one-time password (OTP) as an SMS on my phone. In this method, I prove who I am by virtue of owning a specific telephone number
Security token: To log in to a service, I have to press a button on my token to get an OTP which I then use as my password.
Other devices:
SmartCard, in particular as used by the US DoD where it is called the CAC. Python has a module called pyscard to interface to this
NFC device
And a more complete list here
The commonality between all these approaches is that the end-user controls these devices and the secrets never actually leave the token/card/phone, and certainly are never stored in your program. This makes them much more secure.
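For a sense of how the one-time passwords mentioned above are commonly produced, here is a sketch of the standard HOTP/TOTP calculation (RFC 4226 and RFC 6238) using only the standard library. Which algorithm a given bank or token actually uses is not stated here, so treat this as an illustration of the general technique only.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret, counter, digits=6):
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret, now=None, step=30, digits=6):
    """Time-based variant (RFC 6238): HOTP over the number of elapsed steps."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step), digits)
```

With a hardware token, the `secret` lives inside the device and on the verifying server. The application only ever sees the short-lived code, which is why a leaked code is far less damaging than a leaked password.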
Session stealing
However (there is always a however):
Let us suppose you manage to secure the login so the hacker cannot access the security tokens. Now your application is happily interacting with the secured service. Unfortunately, if the hacker can run arbitrary executables on your computer, the hacker can hijack your session for example by injecting additional commands into your valid use of the service. In other words, while you have protected the password, it's entirely irrelevant because the hacker still gains access to the 'secured' resource.
This is a very real threat, as multiple cross-site scripting attacks have shown (one example is U.S. Bank and Bank of America Websites Vulnerable, but there are countless more).
Secure proxy
As discussed above, there is a fundamental issue in keeping the credentials of an account on a third-party service or system so that the application can log onto it, especially if the only log-on approach is a username and password.
One way to partially mitigate this is to delegate the communication with the service to a secure proxy, and to develop a secure sign-on approach between the application and the proxy. In this approach:
The application uses a PKI scheme or two-factor authentication to sign onto the secure proxy
The user adds the security credentials for the third-party system to the secure proxy. The credentials are never stored in the application
Later, when the application needs to access the third-party system, it sends a request to the proxy. The proxy logs on using the security credentials and makes the request, returning results to the application.
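The flow above can be sketched in-process as follows. This is a toy model under loud assumptions: an HMAC over a shared key stands in for the PKI or two-factor sign-on, there is no replay protection, and `SecureProxy`, `sign`, and the `third_party_call` callback are all hypothetical names.

```python
import hashlib
import hmac


def sign(app_key, user, request):
    """Application side: authenticate a request addressed to the proxy."""
    message = f"{user}\n{request}".encode("utf-8")
    return hmac.new(app_key, message, hashlib.sha256).hexdigest()


class SecureProxy:
    """Holds the third-party credentials; the application never sees them."""

    def __init__(self, app_key, third_party_call):
        self._app_key = app_key              # stand-in for PKI / two-factor
        self._credentials = {}               # user -> third-party credentials
        self._third_party_call = third_party_call

    def add_credentials(self, user, credentials):
        """Called by the user directly, not through the application."""
        self._credentials[user] = credentials

    def handle(self, user, request, signature):
        """Check the application's signature, then forward the request."""
        expected = sign(self._app_key, user, request)
        if not hmac.compare_digest(expected, signature):
            raise PermissionError("request signature does not match")
        return self._third_party_call(self._credentials[user], request)
```

The application only calls `sign` and `handle`. The third-party credentials are passed to `third_party_call` inside the proxy and are never returned to the caller.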
The disadvantages to this approach are:
The user may not want to trust the secure proxy with the storage of the credentials
The user may not trust the secure proxy with the data flowing through it to the third-party application
The application owner has additional infrastructure and hosting costs for running the proxy
Some answers
So, on to specific answers:
How does one securely store authentication credentials using python?
If storing a password for the application to authenticate the user, use a PBKDF2 algorithm, such as https://www.dlitz.net/software/python-pbkdf2/
If storing a password/security token to access another service, then there is no absolutely secure way.
However, consider switching authentication strategies to, for example, smartcards, using e.g. pyscard. You can use smartcards both to authenticate a user to the application and to securely authenticate the application to another service with X.509 certs.
Can something be done about the language's "everything is public" philosophy? I know "we're all consenting adults here", but should we be forced to choose between sharing our passwords with an attacker and using another language?
IMHO there is nothing wrong with writing a specific module in Python that does its damnedest to hide the secret information, making it a right bugger for others to reuse (annoying other programmers is its purpose). You could even code large portions in C and link to it. However, don't do this for other modules for obvious reasons.
Ultimately, though, if the hacker has control over the computer, there is no privacy on the computer at all. The theoretical worst case is that your program is running in a VM, and the hacker has complete access to all memory on the computer, including the BIOS and graphics card, and can step your application through authentication to discover its secrets.
Given no absolute privacy, the rest is just obfuscation, and the level of protection is simply how hard it is obfuscated vs. how much a skilled hacker wants the information. And we all know how that ends, even for custom hardware and billion-dollar products.
Using Python keyring
While this will quite securely manage the key with respect to other applications, all Python applications share access to the tokens. This is not in the slightest bit secure against the type of attack you are worried about.
I'm no expert in this field and am really just looking to solve the same problem that you are, but it looks like something like Hashicorp's Vault might be able to help out quite nicely.
In particular WRT the problem of storing credentials for 3rd-party services, e.g.:
In the modern world of API-driven everything, many systems also support programmatic creation of access credentials. Vault takes advantage of this support through a feature called dynamic secrets: secrets that are generated on-demand, and also support automatic revocation.
For Vault 0.1, Vault supports dynamically generating AWS, SQL, and Consul credentials.
More links:
Github
Vault Website
Use Cases
I have some APIs (django-rest-framework) which do basic authentication (Base64). On one client box, there is a cron job, which sends requests to APIs.
Now, I have hardcoded the Base64-encoded username and password on the disk. I know it is not secure. But how can I improve it? Can I use another algorithm instead of Base64?
Thanks
UPDATE
Token authentication involves a key too, so we need to store the key somewhere for the cron job. I am trying to solve the problem of hard-coding the key somewhere for the cron job. If the hard-coding cannot be avoided, I would prefer a stronger encryption algorithm. So, I am thinking about using a strong encryption algorithm to encrypt the username and password and storing them somewhere.
Any comments welcomed. Thanks.
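On the Base64 point raised in this question, a short illustration of why it offers no protection: HTTP Basic authentication only encodes `username:password`, and anyone who reads the stored value can reverse it without a key. The function names below are made up for the example.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def recover_credentials(header_value):
    """Reverse the encoding; note that no key or secret is needed."""
    encoded = header_value.split(" ", 1)[1]
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password
```

Swapping Base64 for a real cipher moves the problem rather than solving it, since the cron job then needs the decryption key stored somewhere it can read.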