I want to create unique string ids that will be used in a user management system. I want each user to have a string that allows them to access their own database on a server. I read here that you can use the secrets module to create random strings, but since I don't know anything about user management I'm not sure whether to trust the pseudo-random numbers that lie behind the secrets package.
Question: Is it safe to generate a string for each user like this
secrets.token_hex(nbytes = 50)
Out[8]: '24d72775ae86151c600b6a64cef7191e2a55271615894a0ad3b05671978add68a16ad2beb66fa87e66ccdcc442ef4e57e9a4'
and let them use that string to identify themselves to enter the database on my server? Is there a better way that is as quick?
Related
I am making a Django project that will be hosted locally in different environments.
I want users to be able to login by just entering a six-digit PIN on a touch screen or keyboard instead of having to type out a lengthy username/password.
I need to store a PIN for users in the DB. I want the PIN to be hashed or encrypted in some way so that it is not visible in the database. The PIN (and therefore its hash) must be unique but it also must be converted to the same value each time. For instance, every time 123456 is entered it needs to be converted to "jhs8d67RandomString34kds" so that no two users can save the same PIN as the DB column will be unique.
I need to know how to change a user-entered integer and hash it to save in the database.
Then I need to know how to compare it when a user enters the PIN.
I really need some examples on how to implement this and not a lesson in telling me why this is "insecure" or won't work.
Any ideas would be greatly appreciated.
Hashing something doesn't make it secure.
All hash functions have collisions; the only difference is the probability.
Integers have a hash function implemented; just use that.
Note that, for security reasons, hashing of strings is randomized in each Python process, so those hashes cannot be used for persistent data.
You can use the hashlib module:
import hashlib
pincode = "123456"
# md5 expects bytes in Python 3, so encode the string first
hashlib.md5(pincode.encode()).hexdigest()
'e10adc3949ba59abbe56e057f20f883e'
And then, to compare, you can do the same:
if hashlib.md5(pincode.encode()).hexdigest() == 'e10adc3949ba59abbe56e057f20f883e':
    # your code here
    ...
Or use hashlib.pbkdf2_hmac with salt:
hashlib.pbkdf2_hmac(hash_name, password, salt, iterations, dklen=None)
import hashlib
dk = hashlib.pbkdf2_hmac('sha256', b'password', b'salt', 100000)
dk.hex()
'0394a2ede332c9a13eb82e9b24631604c31df978b4e2f0fbd2c549944f9d79a5'
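A minimal sketch of how the PBKDF2 call above could be wrapped to store and check a PIN. It assumes a single application-wide salt (the name APP_SALT and its value are hypothetical) so that the same PIN always produces the same digest, which is what the unique DB column in the question relies on. The helper names hash_pin and pin_matches are illustrative and not part of hashlib.

```python
import hashlib
import hmac

# Hypothetical application-wide salt; in practice it would be loaded from
# configuration. One shared salt keeps the digest deterministic per PIN.
APP_SALT = b"example-app-salt"
ITERATIONS = 100_000  # same iteration count as the pbkdf2_hmac call above


def hash_pin(pin: str) -> str:
    """Return a hex digest that is always the same for the same PIN."""
    dk = hashlib.pbkdf2_hmac("sha256", pin.encode("utf-8"), APP_SALT, ITERATIONS)
    return dk.hex()


def pin_matches(entered_pin: str, stored_hash: str) -> bool:
    """Hash the entered PIN and compare it with the stored value in constant time."""
    return hmac.compare_digest(hash_pin(entered_pin), stored_hash)


stored = hash_pin("123456")            # value to save in the unique column
print(pin_matches("123456", stored))   # True
print(pin_matches("654321", stored))   # False
```

Because the salt is shared, two users choosing the same PIN collide on the unique column, which is the behaviour the question asks for.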
I need to set a key on my program, which will be an exe file. This is what I want:
The user clicks on the exe file and the program requires a key; the user pastes the key and the program never asks for it again. The user can't send this activated exe to other users, and other users can't use this key again.
Or suggest a better idea.
P.S. The exe file is a console app.
You can, for instance, use the platform module to (almost) uniquely identify a machine. The key can then be the sha256 hexdigest of this identifier viewed as a string, like this:
import hashlib
import platform
# Only an example, you can add whatever you want provided by the platform module to identify the machine
identifier = platform.platform()
key = hashlib.sha256(identifier.encode()).hexdigest()
Pros:
Can't be shared
Isn't reusable
Cons:
Doesn't respect Kerckhoffs's principle
That is, your system is only secure as long as the user does not know how to compute the identifier by themselves.
You can elaborate on this model, maybe using a server of your own. For instance, you can compute a key on your server using the identifier you computed and a secret string.
Pros:
You don't have to store a random key for each user, you just need to have access to its identifier
Cons:
If your identifier isn't accurate enough, two users may have the same key
To solve this problem, you can define a random string for each user that you append to their identifier, but it means that you have to store this random string for each user.
Note also that the last two solutions make use of an external server. Hence, you assume that you will be able to do network requests.
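The server-side variant described above could be sketched as follows. HMAC-SHA256 is used here as one reasonable way to combine the identifier with the secret string; that choice, the SERVER_SECRET value, and the function names are assumptions for illustration only.

```python
import hashlib
import hmac
import platform

# Hypothetical secret that only the server knows; it must never ship in the exe.
SERVER_SECRET = b"replace-with-a-long-random-secret"


def machine_identifier() -> str:
    """Client side: build the (almost) unique machine identifier."""
    return platform.platform()


def derive_key(identifier: str, secret: bytes = SERVER_SECRET) -> str:
    """Server side: HMAC-SHA256 of the identifier, keyed with the server secret."""
    return hmac.new(secret, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def key_is_valid(identifier: str, key: str, secret: bytes = SERVER_SECRET) -> bool:
    """Server side: recompute the key and compare it in constant time."""
    return hmac.compare_digest(derive_key(identifier, secret), key)
```

The client would send machine_identifier() to the server once, receive the key, and store it locally; a key copied to another machine fails the check because the identifier differs.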
This is stated in the Google Cloud Storage Naming Best Practices documentation.
Don't use user IDs, email addresses, project names, project numbers, or any personally identifiable information (PII) in bucket or object names because anyone can probe for the existence of a bucket or object, and use the 403 Forbidden, 404 Not Found and 409 Conflict errors to determine the bucket or object's name. Also, URLs often end up in caches, browser history, proxy logs, shortcuts, and other locations that allow the name to be read easily.
This sort of puts a strain on where I was headed with my application, and how it is structured. I really want to avoid handling/storing Cloud Storage paths via CloudSQL or DataStore.
I'm writing this in Python on Google App Engine, and a good amount of my code for GCS is based off of the username as of right now. For example, a user would always upload his/her file within the folder (username) which he/she has registered as. A lot of the path logic I currently have, utilizes the User variable for GCS.
Could someone possibly recommend a way in which I would be following their guidelines, while still being able to use a single variable to refer to a user's directory? By that I mean without naming the folder after the user's ID. I would need to be able to reference this variable without accessing SQL or Datastore at any given time.
Any help would be greatly appreciated!
Usernames and filenames can have PII. For example: JeffreyRennieHasWarts.pdf. So they all must be hidden.
One method is to encrypt the object names. The good news is that Google just announced a Key Management Service that makes this a lot easier. See:
https://cloud.google.com/kms/
Another method, as jterrace mentioned, is to salt and hash the username to create a user key. It would look something like:
# mysecretsalt is the HMAC key (bytes); the username is the message
user_key = hmac.new(mysecretsalt, username.encode(), hashlib.sha256).hexdigest()
But that still leaves the problem of file names. To hide the original file name, you'd give the objects meaningless names, and store a separate object whose contents are the original name of the file. So your object names might look like
userkey1/GUID1.contents
userkey1/GUID1.name
userkey1/GUID2.contents
userkey1/GUID2.name
userkey2/GUID3.contents
userkey2/GUID3.name
The best choice will depend on how you plan to query the data stored in cloud storage.
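The naming scheme above can be sketched roughly like this. SECRET_SALT and the function names are hypothetical, and uuid4 is assumed as the GUID source. The sketch only builds object names; uploading them to Cloud Storage is left out.

```python
import hashlib
import hmac
import uuid

# Hypothetical secret salt, kept in application config rather than in GCS.
SECRET_SALT = b"replace-with-a-secret-salt"


def user_key(username: str) -> str:
    """Derive a stable directory name that does not reveal the username."""
    return hmac.new(SECRET_SALT, username.encode("utf-8"), hashlib.sha256).hexdigest()


def new_object_names(username: str) -> tuple:
    """Return (contents_object, name_object) for a new upload by this user."""
    guid = uuid.uuid4().hex
    prefix = f"{user_key(username)}/{guid}"
    # The .contents object holds the file bytes; the .name object holds the
    # original file name, so neither object name carries PII.
    return f"{prefix}.contents", f"{prefix}.name"
```

Since user_key depends only on the username and the secret, the user's directory can be recomputed at any time without a SQL or Datastore lookup.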
After reading about how to ensure that "remember me" tokens are kept secure and reading the source code for psecio's Gatekeeper PHP library, I've come up with the following strategy for keeping things secure, and I wanted to find out if this is going to go horribly wrong. I'm basically doing the following things:
When a user logs in, generate a cryptographically-secure string using the system's random number generator. (random.SystemRandom() in Python) This is generated by picking random characters from the selection of all lower and uppercase ASCII letters and digits. (''.join(_random_gen.choice(_random_chars) for i in range(length)), as per how Django does the same. _random_gen is the secure random number generator)
The generated token is inserted into a RethinkDB database along with the userid it goes along with and an expiration time 1 minute into the future. A cookie value is then created by using the unique ID that RethinkDB generates to identify that entry and the sha256-hashed token from before. Basically: ':'.join([unique_id, sha256_crypt.encrypt(token)]). sha256_crypt is from Python's passlib library.
When a user accesses a page that would require them to be logged in, the actual cookie value is retrieved from the database using the ID that was stored. The hashed cookie is then verified against the actual cookie using sha256_crypt.verify.
If the verification passes and the time value previously stored is less than the current time, then the previous entry in the database is removed and a new ID/token pair is generated to be stored as a cookie.
Is this a good strategy, or is there an obvious flaw that I'm not seeing?
EDIT: After re-reading some Stack Overflow posts that I linked in a comment, I have changed the process above so that the database stores the hashed token, and the actual token is sent back as a cookie. (which will only happen over https, of course)
You should make sure you generate enough characters in your secure string. I would aim for 64 bits of entropy, which means you need at least 11 characters in your string to prevent any type of practical brute force.
This is as per OWASP's recommendation for Session Identifiers:
With a very large web site, an attacker might try 10,000 guesses per second with 100,000 valid session identifiers available to be guessed. Given these assumptions, the expected time for an attacker to successfully guess a valid session identifier is greater than 292 years.
Given 292 years, generating a new one every minute seems a little excessive. Maybe you could change this to refresh it once per day.
I would also add a system-wide salt (known as a pepper) to your hashed, stored value. This will prevent any precomputed rainbow tables from extracting the original session value if an attacker manages to gain access to your session table. Create a 16-byte (128-bit) cryptographically secure random value to use as your pepper; a 16-bit value would be far too short to resist guessing.
Apart from this, I don't see any inherent problems with what you've described. The usual advice applies though: Also use HSTS, TLS/SSL and Secure cookie flags.
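A rough sketch of the two suggestions above: picking a token length that reaches 64 bits of entropy, and storing only a peppered hash of the token. Keying HMAC-SHA256 with the pepper is one way to apply it and is an assumption here, as are the 16-byte pepper size and all names in the code.

```python
import hashlib
import hmac
import math
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters, as in the question
TARGET_BITS = 64

# Smallest length such that length * log2(62) >= 64 bits of entropy.
TOKEN_LENGTH = math.ceil(TARGET_BITS / math.log2(len(ALPHABET)))  # 11

# Hypothetical system-wide pepper. It would be generated once and kept outside
# the database; it is regenerated on every run here only for illustration.
PEPPER = secrets.token_bytes(16)


def new_token(length: int = TOKEN_LENGTH) -> str:
    """Pick characters with the OS-backed CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def stored_value(token: str, pepper: bytes = PEPPER) -> str:
    """Value to keep in the session table instead of the raw token."""
    return hmac.new(pepper, token.encode("ascii"), hashlib.sha256).hexdigest()


def token_is_valid(token: str, stored: str, pepper: bytes = PEPPER) -> bool:
    """Recompute the peppered hash of the cookie token and compare in constant time."""
    return hmac.compare_digest(stored_value(token, pepper), stored)
```

The raw token goes into the cookie, while stored_value(token) goes into the database, matching the EDIT in the question.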
So an emergency project was dumped on me to merge a MySQL user database into an existing Django user database.
I've figured just about everything out except how to handle the passwords as they use different hashes. I don't know Python, the Django backend, or very much about hashing techniques.
I do have a way to verify users with their emails; I just need a way to take the passwords they give me and save them into the database in a Django-acceptable way. It will have to be done in Perl since that's the only language I know on the server.
I found this page talking about how Django handles passwords, but I sadly don't understand most of what they're saying. Also, I don't know if it's any help, but the admin area of the Django site gives the "hint" of
"Use '[algo]$[salt]$[hexdigest]'" for the password.
That doesn't mean much to me either, but maybe it does to one of you?
There are basically two ways to handle this: convert existing passwords to a format acceptable by Django, or write your own Django password hasher.
For the first way, as you found, the password field consists of three parts, each separated by a $. (Django 1.6 passwords may have 4 parts, but let's ignore that extra part for now, since Django 1.6 also supports the more traditional 3-part passwords.) The parts are
algorithm, which describes the password hashing algorithm; it will look like md5, pbkdf2, etc.
salt, the salt for the hash algorithm
hexdigest, the hashed password
So, assuming your passwords are already salted and hashed, your script needs to take the hashed/salted passwords in your existing database, separate the salt from the hash, then store them into the database with the appropriate algorithm string prefixed. There should be Perl modules for doing password hashing using various algorithms. Django's recommended algorithm is PBKDF2. bcrypt is also good. Any hash algorithm is fine, though, as far as Django is concerned, as long as it has a built-in hasher for that algorithm (Django has hashers for the most common hashing algorithms).
If your existing passwords are not salted and hashed…well, now would be a good time to do that. ;-)
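To make the three-part format concrete, here is a small sketch in Python (the same steps would be done with the equivalent Perl digest modules). It assumes a salted SHA-1 scheme where the digest is sha1(salt + password); that ordering is an assumption and has to match both the legacy database and the hasher Django is configured with. Whether a given Django version still ships a hasher for this algorithm should be checked as well. All function names are illustrative.

```python
import hashlib
import hmac


def encode_password(algorithm: str, salt: str, hexdigest: str) -> str:
    """Join the three parts into the '[algo]$[salt]$[hexdigest]' form."""
    return "$".join([algorithm, salt, hexdigest])


def decode_password(encoded: str):
    """Split a three-part field back into (algorithm, salt, hexdigest)."""
    algorithm, salt, hexdigest = encoded.split("$", 2)
    return algorithm, salt, hexdigest


def salted_sha1(salt: str, password: str) -> str:
    # Assumption: the digest is sha1(salt + password), hex encoded.
    return hashlib.sha1((salt + password).encode("utf-8")).hexdigest()


def check_password(password: str, encoded: str) -> bool:
    """Verify a password against a stored three-part 'sha1' field."""
    algorithm, salt, hexdigest = decode_password(encoded)
    if algorithm != "sha1":
        raise ValueError("this sketch only handles the 'sha1' algorithm")
    return hmac.compare_digest(salted_sha1(salt, password), hexdigest)
```

For example, encode_password("sha1", salt, salted_sha1(salt, password)) produces the string that would go into the password column for a user whose password has just been verified.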
The alternative way is to just copy the passwords over to the new database as-is, and write your own password hasher to handle them in your Django app. Of course, that would require writing some Python code.