I am experimenting with the passlib.hash.sha256_crypt algorithm in an App Engine app and it seems rather simple to implement.
Is this secure enough with it's default parameters of autogenerated salt and 80,000 rounds? Should it first be padded with random chars?
Password is being posted in from a form and encrypted as above.
I can't judge what "secure enough" means.
Last I read the best options were scrypt, bcrypt, or pbkdf2. If you can't use those then I recommend sha512 with many thousands of iterations. One of the benefits of using passlib CryptContext is you can update your scheme later as needed (and as better implementations become available) while keeping easy compatibility with previously stored passwords. sha512_crypt is very easy to implement on GAE with passlib CryptContext.
I'm not sure padding with characters (on top of salting) adds anything.
Related
I am using Django 1.97. The encrypted passwords are significantly different (in terms of the format).
Some passwords are of format $$$:
pbkdf2_sha256$24000$61Rm3LxOPsCA$5kV2bzD32bpXoF6OO5YuyOlr5UHKUPlpNKwcNVn4Bt0=
While others are of format :
!9rPYViI1oqrSMfkDCZSDeJxme4juD2niKcyvKdpB
Passwords are set either using User.objects.create_user() or user.set_password(). Is this difference an expected one ?
You'll be fine. You just have some blank passwords in your database.
Going back as far as V0.95, django used the $ separators for delimiting algorithm/salt/hash. These days, django pulls out the algorithm first by looking at what is in front of the first $ and then passes the whole lot to the hasher to decode. This allows for a wider set of formats, including the one for PBKDF2 which adds an extra iterations parameter in this list (as per your first example).
However, it also recognises that some users may not be allowed to login and/or have no password. This is encoded using the second format you've seen. As you can see here:
If password is None then a concatenation of UNUSABLE_PASSWORD_PREFIX and a random string will be returned which disallows logins.
You can also see that the random string is exactly 40 characters long - just like your second example.
In short, then, this is all as expected.
There is no significant difference between User.objects.create_user() and user.set_password() since first uses second.
Basically, passwords are in string with format <algorithm>$<iterations>$<salt>$<hash> according to docs. The differences might come from PASSWORD_HASHERS settings variable. May be one password was created with one hasher and other password with another. But if you'll keep those hashers in variable mentioned above all should be fine, users will able to change it etc. You can read about it in little notice after bcrypt section.
Also docs for django.contrib.auth package might be helpful too. Link.
UPDATE:
If you find documentation of an old django versions (1.3 for example), you will see that
Previous Django versions, such as 0.90, used simple MD5 hashes without password salts. For backwards compatibility, those are still supported; they'll be converted automatically to the new style the first time check_password() works correctly for a given user.
So I think that the answer might be somewhere here. But it really depends on how legacy your project is, so you can decide if it's normal or what. Anyway you can issue check_password() to be sure. Or you can just email your user with "change password please" notification. There are many factors involved really.
I've got a passwords on a datastore that were hashed using the method SecureSocialPasswordHasher.passwordHash from the package securesocial.utils.SecureSocialPasswordHasher of SecureSocial, and I have to validate them through Python.
Therefore, the use of SecureSocial (or the whole Play Framework) is out of the question. The question is: What does it use for hashing when calling that method? From the documentation it seems it is Bcrypt, but it wasn't clear enough for me to be sure.
---------EDIT---------
I've been told on SecureSocial forums that indeed it uses Bcrypt with work factor 10 default. However it doens't reflect what I see on the datastore.
There are 2 columns there, one for salt, and another one fro the hashed password. Neither of them have the Bcrypt header (such as $2a$10$). Also, the salt size is only 11 characters long, and the hashed password is only 22 characters long (and no signs of having the salt inside the string).
Found out the default for hashing passwords on SecureSocial is indeed Bcrypt.
The default implementation for it's hash method is:
def hash(plainPassword: String): PasswordInfo = {
PasswordInfo(id, BCrypt.hashpw(plainPassword, BCrypt.gensalt(logRounds)))
}
This applies to the latest version of SecureSocial.
On my specific problem, the main issue was that I was not communicated that the code I was dealing with was using an older version of SecureSocial, and that the has method was overriden.
So an emergency project was dumped on me to merge a MySQL user database into an existing Django user database.
I've figured just about everything out except how to handle the passwords as they use different hashes. I don't know Python, the Django backend, or very much about hashing techniques.
I do have a way to verify users with their emails, I just need a way to take the passwords they give me and save them into the database in a Django-acceptable way. It will be have to be done in Perl since that's the only language I know on the server.
I found this page talking about how Django handles passwords, but I sadly don't understand most of what they're saying. Also, I don't know if it's any help, but the admin area of the Django site gives the "hint" of
"Use '[algo]$[salt]$[hexdigest]'" for the password.
That doesn't mean much to me either, but maybe it does to one of you?
There are basically two ways to handle this: convert existing passwords to a format acceptable by Django, or write your own Django password hasher.
For the first way, as you found, the password field consists of three parts, each separated by a $. (Django 1.6 passwords may have 4 parts, but let's ignore that extra part for now, since Django 1.6 also supports the more traditional 3-part passwords.) The parts are
algorithm, which describes the password hashing algorithm; it will look like md5, pbkdf2, etc.
salt, the salt for the hash algorithm
hexdigest, the hashed password
So, assuming your passwords are already salted and hashed, your script needs to take the hashed/salted passwords in your existing database, separate the salt from the hash, then store them into the database with the appropriate algorithm string prefixed. There should be Perl modules for doing password hashing using various algorithms. Django's recommended algorithm is PBKDF2. bcrypt is also good. Any hash algorithm is fine, though, as far as Django is concerned, as long as it has a built-in hasher for that algorithm (Django has hashers for the most common hashing algorithms).
If your existing passwords are not salted and hashed…well, now would be a good time to do that. ;-)
The alternative way is to just copy the passwords over to the new database as-is, and write your own password hasher to handle them in your Django app. Of course, that would require writing some Python code.
I have a function for encrypting with SHA-1 in Python, using hashlib. I take a file and encrypt the contents with this hash.
If I set a password for an encrypted text file can I use this password to decrypt and to restore the file with the original text?
Hashing functions are different than normal crypto algorithms. They are oftenly referred to as one-way ciphers, because the process data goes through is irreversible.
Differently than symmetric and assymetric encryption, hashes are used by asserting the hashed values themselves, instead of decrypting and asserting the plain-text values. To validate logins when you're using hashes, you'd hash the password the user just attempted to log in with and compare it with the hash you have on your DB. If they match, login is successful.
Cracking hashes involves guessing hashing various different strings and trying to match hashed values to the ones illegally obtained from a DB. There are lists available on the internet with millions of already hashed values to make hash cracking easier, those are known as Rainbow Tables and they can be easily countered with the use of Salts.
It's also worth noting that since hashing algorithms are able to digest GBs of data into significantly smaller strings, mathematically, two different values may have identical hashes. Even though this is very rare, it is an existing problem, and its known as Hash Collision.
If hashing was reversible, hard drives would be reduntant since we would be able to hash thousands of GBs into a small string of text and reverse them as we pleased. It would allow for data compression and storage in ways that violate physics.
Related Wikipedia Articles:
Hashing Algorithms: http://en.wikipedia.org/wiki/Hash_function
Rainbow Tables: http://en.wikipedia.org/wiki/Rainbow_table
Salts: http://en.wikipedia.org/wiki/Salt_(cryptography)
Collision: http://en.wikipedia.org/wiki/Collision_(computer_science)
Symmetric Encryption: http://en.wikipedia.org/wiki/Symmetric-key_algorithm
Assymetric Encryption: http://en.wikipedia.org/wiki/Public-key_cryptography
SHA-1 is not an encryption algorithm, it's a hashing algorithm. By definition, you can't "decrypt" anything that was hashed with the SHA-1 function, it doesn't have an inverse.
If you have an arbitrary hashed password, there's very little you can do to retrieve the original password - If you're lucky, the password could be in a database of reverse hashes, but that's as far as you can go.
And the message extraction algorithm expects the original password to perform the verification - the algorithm will hash the provided plain-text password and compare it against the stored hashed password, only if they're equal the plain-text message will be revealed.
Hash functions are one way tickets. You cannot use them for encryption.
Hash function algorithms are realised through modulo, xor and other familiar (one way) operations.
You may try to find what argument was used to generate hash but in theory you will never be 100% sure it is the correct value.
For example try with a really simple (useless in cryptography) hash function modulo 10. This function returns ten different values. If it's 7 you may guess the entry was 7 or 137 and 1234567. Same with md5, sha1 and better ones.
As you can see, in the case when you are using hash function that returns only 40 bytes with files that are much bigger (maybe even few hundred megabytes) there in theory exists infinite numbers of files for each possible hash.
I made a topic about the built-in python hash function: Old python hashing done left to right - why is it bad?
The previous topic was about why it was bad for encryption, because we have an application called Gruyere which is filled with security holes, and it uses the hash() to encrypt cookies.
# global cookie_secret; only use positive hash values
h_data = str(hash(cookie_secret + c_data) & 0x7FFFFFF)
c_data is a username; cookie_secret is salt (which is just '' by default)
I have implemented a more secure encryption method using md5 hashing with salt, but one excercise is to beat this old encryption and I still cannot understand how :-( I've read the string_hash code from python sourcecode but it's not documented and I can't figure it out.
EDIT: The idea is to write a program which can create a valid cookie any valid user, so I think I need to find out cookie_secret somehow
Zack described the answer already in your last question: It's easy to find a collision.
Let's say you save hash("pwd") in the database (that you actually do something different doesn't matter. Now, if you enter "pwd" in the site, you can enter. But how is this checked? Again, the hash of "pwd" is token, and compared to the value in the database. But what if there is a second string, say "hello", and hash("hello") == hash("pwd")? Then you could also use "hello" as password. So to beat the encryption, you don't need to find "pwd", you just need any string which has the same hash-value. You can just search for such a string brute-force (and I guess you can do some optimizations based on the knowledge of the source of hash)