I found a Python package for encrypting data and came across this in the Python cryptography documentation:
It is possible to use passwords with Fernet (symmetric key). To do this, you need to run the password through a key derivation function such as PBKDF2HMAC, bcrypt or scrypt.
But it turns out that a password works the same way as a key (you use the password/key to encrypt and decrypt). So why bother using a password instead of the key itself?
I mean, why not just use the key itself:
from cryptography.fernet import Fernet
key = Fernet.generate_key()
token = Fernet(key).encrypt(b"my deep dark secret")
Fernet(key).decrypt(token)
A password is something that can be remembered by a person, whereas a key usually cannot be, because it is long (at least 128 bits, i.e. 32 characters when hex-encoded) and is supposed to be truly random (indistinguishable from random noise). If you want to encrypt something with a key, but this key cannot be transmitted by asymmetric cryptography and instead has to be given over the phone or must never be written down anywhere, then you can't simply generate a random key and use that. You need to have a password/passphrase in place to derive the key from.
Example 1:
A personal password safe like KeePass needs a key for encryption/decryption. A user will not be able to simply remember that key, therefore we have a much shorter password, which can be remembered. Now, the security lies in the fact that a slow key derivation function is used to derive a key from the password, so an attacker still has trouble brute-forcing the key even though it depends on a much shorter password.
Example 2:
You compress a document and use the encryption of the compression software. Now you can send the container via e-mail, but you can't send the password along with it. So, you call the person you've sent the e-mail to and tell them the password. A password is much easier to transmit than a long and random key in this way.
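A minimal sketch of the derivation step described above, using only the standard library. It assumes PBKDF2-HMAC-SHA256 as the slow function and an illustrative iteration count; the function name derive_fernet_style_key is made up for this example. Fernet keys are 32 bytes, URL-safe base64-encoded, so the output has that shape, but nothing here calls the cryptography package itself.

```python
import base64
import hashlib
import os


def derive_fernet_style_key(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Stretch a memorable password into 32 key bytes, base64-encoded.

    The iteration count is an illustrative assumption, not a recommendation.
    """
    raw = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )
    return base64.urlsafe_b64encode(raw)


# The salt is not secret; it has to be stored next to the ciphertext
# so the same key can be derived again for decryption.
salt = os.urandom(16)
key = derive_fernet_style_key("correct horse battery staple", salt)
```

The same password and salt always give the same key, so only the password has to be remembered or read out over the phone.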
I am making a Django project that will be hosted locally in different environments.
I want users to be able to login by just entering a six-digit PIN on a touch screen or keyboard instead of having to type out a lengthy username/password.
I need to store a PIN for users in the DB. I want the PIN to be hashed or encrypted in some way so that it is not visible in the database. The PIN (and therefore its hash) must be unique but it also must be converted to the same value each time. For instance, every time 123456 is entered it needs to be converted to "jhs8d67RandomString34kds" so that no two users can save the same PIN as the DB column will be unique.
I need to know how to change a user-entered integer and hash it to save in the database.
Then I need to know how to compare it when a user enters the PIN.
I really need some examples on how to implement this and not a lesson in telling me why this is "insecure" or won't work.
Any ideas would be greatly appreciated.
Hashing something doesn't make it secure.
All hash functions have collisions; the only difference is the probability.
Integers already have a hash function implemented; just use that.
Note that for security reasons, hashing of strings is randomized in each Python process, so those hashes cannot be used for persistent data.
You can use the hashlib module:
import hashlib
pincode = "123456"
# hashlib works on bytes, so encode the string first
hashlib.md5(pincode.encode()).hexdigest()
'e10adc3949ba59abbe56e057f20f883e'
And then to compare, you can do the same:
if hashlib.md5(pincode.encode()).hexdigest() == 'e10adc3949ba59abbe56e057f20f883e':
    # your code here
...
Or use hashlib.pbkdf2_hmac with salt:
hashlib.pbkdf2_hmac(hash_name, password, salt, iterations, dklen=None)
import hashlib
dk = hashlib.pbkdf2_hmac('sha256', b'password', b'salt', 100000)
dk.hex()
'0394a2ede332c9a13eb82e9b24631604c31df978b4e2f0fbd2c549944f9d79a5'
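Putting the two steps from the question together (hash on save, compare on login), a sketch could look like the following. It assumes one application-wide salt so that the same PIN always maps to the same string, which is what a unique DB column needs; APP_SALT, hash_pin and pin_matches are hypothetical names, not part of Django or hashlib.

```python
import hashlib
import hmac

# Hypothetical application-wide salt; a fixed value keeps the output
# deterministic, so equal PINs collide on the unique column as requested.
APP_SALT = b"hypothetical-app-wide-salt"


def hash_pin(pin: str, salt: bytes = APP_SALT, iterations: int = 100_000) -> str:
    """Return a hex string that is always the same for the same PIN."""
    return hashlib.pbkdf2_hmac("sha256", pin.encode("ascii"), salt, iterations).hex()


def pin_matches(pin: str, stored_hex: str) -> bool:
    """Hash the entered PIN and compare it to the stored value in constant time."""
    return hmac.compare_digest(hash_pin(pin), stored_hex)


stored = hash_pin("123456")        # value to save in the DB column
ok = pin_matches("123456", stored)  # check at login
```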
I am looking at different alternatives for hashing passwords in a Python app. At first I settled on Flask-bcrypt (https://github.com/maxcountryman/flask-bcrypt), but then decided to use Argon2. The most popular Argon2 binding for Python is argon2-cffi (https://github.com/hynek/argon2-cffi).
According to its docs (https://argon2-cffi.readthedocs.io/en/stable/api.html), all I need to do is use 3 methods:
hash to hash a password
verify to compare a password to a hash
check_needs_rehash to see if a password should be rehashed after a change in the hashing parameters
Two things puzzle me.
1) The salt is random, using os.urandom. I thus wonder if the verify method is somehow able to extract the salt from the hash? Or in other words, since I have no say in what the salt is and cannot save it, how can the verify method actually ever compare any password to a password that was hashed with a random salt? Am I supposed to somehow parse the salt from the return value of hash myself, and store it separately from the hashed value? Or is the hash supposed to be stored as is in the docs, untouched, and somehow Argon2 is capable of verifying a password against it? And if indeed Argon2 can extract the salt out of the hash, how is using a salt any safer in that case since a hostile entity who gets a hashed password should then also be able to extract the salt?
2) By default I do not supply any secret to the hash method and instead the password itself seems to be used as a secret. Is this secure? What are the downsides for me not supplying a secret to the hashing method?
1) The salt is random, using os.urandom. I thus wonder if the verify method is somehow able to extract the salt from the hash?
The hash method returns a string that encodes the salt, the parameters, and the password hash itself, as shown in the documentation:
>>> from argon2 import PasswordHasher
>>> ph = PasswordHasher()
>>> hash = ph.hash("s3kr3tp4ssw0rd")
>>> hash
'$argon2id$v=19$m=102400,t=2,p=8$tSm+JOWigOgPZx/g44K5fQ$WDyus6py50bVFIPkjA28lQ'
>>> ph.verify(hash, "s3kr3tp4ssw0rd")
True
The format is summarized in the Argon2 reference implementation; perhaps there are other references. In this case:
$argon2id$...
The hash is Argon2id, which is the specific Argon2 variant that everyone should use (combining the side channel resistance of Argon2i with the more difficult-to-crack Argon2d).
...$v=19$...
The version of the hash is 0x13 (19 decimal), meaning Argon2 v1.3, the version adopted by the Password Hashing Competition.
...$m=102400,t=2,p=8$...
The memory use is 100 MiB (102400 KiB), the time is 2 iterations, and the parallelism is 8 ways.
...$tSm+JOWigOgPZx/g44K5fQ$...
The salt is tSm+JOWigOgPZx/g44K5fQ (base64), or b5 29 be 24 e5 a2 80 e8 0f 67 1f e0 e3 82 b9 7d (hexadecimal).
...$WDyus6py50bVFIPkjA28lQ
The password hash itself is WDyus6py50bVFIPkjA28lQ (base64), or 58 3c ae b3 aa 72 e7 46 d5 14 83 e4 8c 0d bc 95 (hexadecimal).
The verify method takes this string and a candidate password, recomputes the password hash with all the encoded parameters, and compares it to the encoded password hash.
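To make the field breakdown above concrete, here is a small standard-library sketch that splits the example string on "$" and decodes the salt and hash. The helper names are made up for illustration; this is not how argon2-cffi parses hashes internally, only a way to see what is stored.

```python
import base64


def b64decode_unpadded(text: str) -> bytes:
    """Argon2 strings drop the trailing '=' padding; restore it before decoding."""
    return base64.b64decode(text + "=" * (-len(text) % 4))


def parse_argon2_hash(encoded: str) -> dict:
    """Split an encoded Argon2 string into its variant, costs, salt and hash."""
    _, variant, version, params, salt_b64, hash_b64 = encoded.split("$")
    costs = dict(item.split("=") for item in params.split(","))
    return {
        "variant": variant,
        "version": int(version.split("=")[1]),
        "memory_kib": int(costs["m"]),
        "time_cost": int(costs["t"]),
        "parallelism": int(costs["p"]),
        "salt": b64decode_unpadded(salt_b64),
        "hash": b64decode_unpadded(hash_b64),
    }


fields = parse_argon2_hash(
    "$argon2id$v=19$m=102400,t=2,p=8$tSm+JOWigOgPZx/g44K5fQ$WDyus6py50bVFIPkjA28lQ"
)
```

Everything verify needs is in that one string, which is why it can be stored as is.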
And if indeed Argon2 can extract the salt out of the hash, how is using a salt any safer in that case since a hostile entity who gets a hashed password should then also be able to extract the salt?
The purpose of the salt is to mitigate the batch advantage of multi-target attacks by simply being different for each user.
If everyone used the same salt, then an adversary trying to find the first of $n$ passwords given hashes would need to spend only about $1/n$ the cost that an adversary trying to find a single specific password given its hash would have to spend. Alternatively, an adversary could accelerate breaking individual passwords by doing an expensive precomputation (rainbow tables).
But if everyone uses a different salt, then that batch advantage or precomputation advantage goes away.
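A quick illustration of that point, using hashlib.pbkdf2_hmac as a stand-in for Argon2 (an assumption made only to keep the sketch dependency-free). With a shared salt, two users with the same password get the same hash, so one guess tests both; with per-user random salts, the hashes differ.

```python
import hashlib
import os


def slow_hash(password: bytes, salt: bytes) -> bytes:
    # Stand-in for Argon2; the iteration count is kept low for the demo.
    return hashlib.pbkdf2_hmac("sha256", password, salt, 10_000)


shared_salt = b"same-salt-for-all"
alice_shared = slow_hash(b"hunter2", shared_salt)
bob_shared = slow_hash(b"hunter2", shared_salt)

alice_unique = slow_hash(b"hunter2", os.urandom(16))
bob_unique = slow_hash(b"hunter2", os.urandom(16))

same_with_shared_salt = alice_shared == bob_shared    # one guess hits both users
same_with_unique_salt = alice_unique == bob_unique    # each user must be attacked separately
```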
Choosing the salt uniformly at random among 32-byte strings is just an easy way to guarantee every user has a distinct salt. In principle, one could imagine an authority handing out everyone in the world a consecutive number to use as their Argon2 salt, but that system doesn't scale very well—I don't just mean that your application could use the counting authority, but every application in the world would have to use the same counting authority, and I think the Count is too busy at Sesame Street to take on that job.
2) By default I do not supply any secret to the hash method and instead the password itself seems to be used as a secret. Is this secure? What are the downsides for me not supplying a secret to the hashing method?
Generally the password is the secret: if someone knows the password then they're supposed to be able to log in; if they don't know the password, they're supposed to be shown the door!
That said, Argon2 also supports a secret key, which is separate from the salt and separate from the password.
If there is a meaningful security boundary between your password database and your application so that it's plausible an adversary might compromise one but not the other, then the application can pick a uniform random 32-byte string as a secret key, and use that with Argon2 so that the password hash is a secret function of the secret password.
That way, an adversary who dumps the password database but not the application's secret key won't even be able to test a guess for a password because they don't know the secret key needed to compute a password's hash.
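The idea can be sketched without Argon2 at all. The example below is a stand-in, not Argon2's own secret-key input: it keys the password with HMAC under an application secret and then runs PBKDF2 over the result. The names keyed_password_hash and APP_SECRET are hypothetical, and in a real deployment the secret would live outside the password database.

```python
import hashlib
import hmac
import os

# Hypothetical application secret, stored separately from the password database.
APP_SECRET = os.urandom(32)


def keyed_password_hash(password: str, salt: bytes, secret_key: bytes) -> bytes:
    """Make the stored hash depend on a secret the database does not contain."""
    keyed = hmac.new(secret_key, password.encode("utf-8"), hashlib.sha256).digest()
    return hashlib.pbkdf2_hmac("sha256", keyed, salt, 100_000)


salt = os.urandom(16)
stored = keyed_password_hash("s3kr3tp4ssw0rd", salt, APP_SECRET)

# Someone with the database (salt + stored) but the wrong secret computes
# a different value even for the correct password guess.
attacker_try = keyed_password_hash("s3kr3tp4ssw0rd", salt, os.urandom(32))
```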
The output of hash is actually an encoding of the hash, hash parameters, and salt. You don't need to do anything special with it, just store it normally.
Argon2 is a password hashing algorithm. It doesn't (usually) require any secret. This is secure by design. It's possible to use it with a secret value in addition to the password, which should almost never add any security. It's also possible to use it as a key derivation function, which is almost always wasteful. Neither of these things would reduce security, but they're unnecessary so don't bother.
A little late, but pyargon2 is a valid alternative if you want to supply the salt yourself. First install the package:
pip install pyargon2
then use:
from pyargon2 import hash
password = 'a strong password'
salt = 'a unique salt'
hex_encoded_hash = hash(password, salt)
More information:
https://github.com/ultrahorizon/pyargon2
Credit: https://github.com/jwsi
I have an old database where user passwords were hashed with md5 without salt. Now I am converting the project into django and need to update passwords without asking users to log in.
I wrote this hasher:
from django.contrib.auth.hashers import PBKDF2PasswordHasher

class PBKDF2WrappedMD5PasswordHasher(PBKDF2PasswordHasher):
    algorithm = 'pbkdf2_wrapped_md5'

    def encode_md5_hash(self, md5_hash, salt):
        return super().encode(md5_hash, salt)
and I am converting the passwords like this:
from django.utils.crypto import get_random_string

for data in old_user_data:
    hasher = PBKDF2WrappedMD5PasswordHasher()
    random_salt = get_random_string(length=8)
    # data['password'] is e.g. '972131D979FF69F96DDFCC7AE3769B31'
    user.password = hasher.encode_md5_hash(data['password'], random_salt)
but I can't log in with my test user.
Any ideas?
I'm afraid you cannot do what you want with this. Hashing is strictly one-way, so there is no way to convert from one hash to another. You WILL have to update these passwords to the new hash one-by-one as users log in.
A decent strategy for implementing this change is:
Mark all of your existing hashes as md5. You can just use some kind of boolean flag/column, but there is an accepted standard for this: https://passlib.readthedocs.io/en/stable/modular_crypt_format.html
When the user logs in, authenticate them by first checking which type of hash they have, and then calculating that hash. If they are still md5, calculate the md5 to log them in; if they are now using pbkdf2, calculate that hash instead.
After authenticating the password, if they are still flagged as md5, calculate the new format hash and replace it - making sure to now flag this as pbkdf2.
IMPORTANT: You will want to test this thoroughly before you release it into the wild. If you make a mistake, you might destroy the credentials of any user logging in. I would recommend temporarily retaining a copy of the old md5 hashes until you confirm production is stable, but make absolutely certain you destroy this copy completely afterwards. Your users' passwords are not safe as long as the md5 hashes exist anywhere.
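The upgrade-on-login strategy above could look roughly like this. It is a framework-free sketch: the record is a plain dict with a hypothetical "scheme" flag, and make_pbkdf2_record, check_password and login are illustrative names rather than Django APIs.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value


def make_pbkdf2_record(password: str) -> dict:
    """Build a new-format record with a fresh random salt."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return {"scheme": "pbkdf2", "salt": salt.hex(), "hash": digest.hex()}


def check_password(record: dict, password: str) -> bool:
    """Verify against whichever scheme the record is flagged with."""
    if record["scheme"] == "md5":
        candidate = hashlib.md5(password.encode()).hexdigest()
        return hmac.compare_digest(candidate, record["hash"].lower())
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(record["salt"]), ITERATIONS
    )
    return hmac.compare_digest(digest.hex(), record["hash"])


def login(record: dict, password: str) -> bool:
    """Authenticate, then replace a legacy md5 record with the new format."""
    if not check_password(record, password):
        return False
    if record["scheme"] == "md5":
        upgraded = make_pbkdf2_record(password)
        record.clear()
        record.update(upgraded)
    return True


# Legacy record as it might come from the old database (uppercase hex md5).
legacy = {"scheme": "md5", "hash": hashlib.md5(b"hunter2").hexdigest().upper()}
first_login = login(legacy, "hunter2")
```

After the first successful login the record is flagged pbkdf2 and the md5 value is gone.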
I want to code a custom key generator in Python. This key will be used as an input (along with the plain text) to AES algorithm for encryption (I will probably use pycrypto or m2crypto libraries for that).
But the key generator has to be custom, as it would generate the key based on the string that would be supplied by the user.
str = date + case-id + name
where:
date = current date when a case was submitted
(we work on separate security analysis cases, submitted on our ticketing tool)
name = person handling the case
case-id = the ticket id with which it was submitted.
This same key needs to be known to the decryptor (on a different system) so that it can decrypt the data.
So the key has to be fixed for a specific combination of date, name and case-id in a specific order. It should only be different if any of these 3 change in value or order, and it should not be random every time.
I've gone through some Stack Overflow posts, where it is suggested to use
random_key = os.urandom(16)
but I don't believe this will serve my purpose.
Suggestions for articles on where to start if I want to design a key generator from scratch, or pointers to existing libraries, would be highly appreciated.
You're looking for a password hashing algorithm, such as Argon2 or PBKDF2. It will allow you to deterministically stretch the 'password' built from the input values into a suitable key.
However, note that your passwords may still be very weak. I suspect that there is a strong correlation between case-id and date. Names are probably only a small list of people easily found out. Also, isn't this data sent along with the encrypted data by your system? This makes using it as a password a bad idea.
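As a sketch of the deterministic derivation, assuming PBKDF2-HMAC-SHA256 from the standard library and a 16-byte (AES-128) key. The function name, the separator, the salt value and the sample inputs are all assumptions for illustration; both systems would have to agree on them.

```python
import hashlib


def derive_case_key(date: str, case_id: str, name: str,
                    salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive the same 16-byte key whenever the same three fields are given in order."""
    # A separator that cannot occur in the fields keeps ("ab", "c") and
    # ("a", "bc") from producing the same input string.
    material = "\x1f".join([date, case_id, name]).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", material, salt, iterations, dklen=16)


# Hypothetical fixed salt shared by encryptor and decryptor.
SHARED_SALT = b"hypothetical-shared-salt"

key_a = derive_case_key("2024-01-15", "CASE-1042", "alice", SHARED_SALT)
key_b = derive_case_key("2024-01-15", "CASE-1042", "alice", SHARED_SALT)
key_swapped = derive_case_key("CASE-1042", "2024-01-15", "alice", SHARED_SALT)
```

The caveat above still applies: the key is only as hard to guess as the three fields are.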
I made a topic about the built-in python hash function: Old python hashing done left to right - why is it bad?
The previous topic was about why it was bad for encryption, because we have an application called Gruyere which is filled with security holes, and it uses the hash() to encrypt cookies.
# global cookie_secret; only use positive hash values
h_data = str(hash(cookie_secret + c_data) & 0x7FFFFFF)
c_data is a username; cookie_secret is salt (which is just '' by default)
I have implemented a more secure encryption method using md5 hashing with salt, but one exercise is to beat this old encryption and I still cannot understand how :-( I've read the string_hash code from the Python source code, but it's not documented and I can't figure it out.
EDIT: The idea is to write a program which can create a valid cookie for any valid user, so I think I need to find out cookie_secret somehow
Zack described the answer already in your last question: It's easy to find a collision.
Let's say you save hash("pwd") in the database (that you actually do something different doesn't matter). Now, if you enter "pwd" on the site, you get in. But how is this checked? Again, the hash of "pwd" is taken and compared to the value in the database. But what if there is a second string, say "hello", and hash("hello") == hash("pwd")? Then you could also use "hello" as the password. So to beat the encryption, you don't need to find "pwd"; you just need any string which has the same hash value. You can simply search for such a string by brute force (and I guess you can do some optimizations based on knowledge of the source of hash).
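The brute-force search can be sketched like this. Because the built-in hash() of strings is randomized per process, the example uses a toy stand-in instead: SHA-256 truncated to 16 bits, an assumption chosen so the search finishes quickly. weak_hash and find_collision are illustrative names, and the real Gruyere mask is larger.

```python
import hashlib
import itertools


def weak_hash(text: str) -> int:
    # Toy stand-in for a short hash: only 16 bits of output survive.
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:2], "big")


def find_collision(target: str) -> str:
    """Try candidate strings until one has the same weak hash as the target."""
    wanted = weak_hash(target)
    for n in itertools.count():
        candidate = f"guess{n}"
        if candidate != target and weak_hash(candidate) == wanted:
            return candidate


other = find_collision("pwd")
# 'other' is a different string that a check based on weak_hash
# would accept in place of "pwd".
```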