Sequential SHA 256 hashes give different outputs for the same input - python

I thought that this would be a fairly common and straightforward problem, but I searched and was not able to find it.
I am a novice Python user, mostly self-taught. I'm trying what I thought would be a fairly straightforward exercise: generating a hash value from an input phrase. Here is my code:
import hashlib
target = input("Give me a phrase: ").encode('utf-8')
hashed_target = hashlib.sha256(target)
print(hashed_target)
I execute this and get the prompt:
Give me a phrase:
I entered the phrase "Give me liberty or give me death!" and got the hash output 0x7f8ed43d6a80.
Just to test, I tried again with the same phrase, but got a different output: 0x7f1cc23bca80.
I thought that was strange, so I copied the original input and pasted it in, and got a third, different hash output: 0x7f358aabea80.
I'm sure there must be a simple explanation. I'm not getting any errors, and the code looks straightforward, but the hashes, while similar, are definitely different.
Can someone help?

You are printing the hash object itself, so what you see is its __repr__ string, which contains the object's memory address rather than the hash. You need to call the hexdigest or digest method to get the hash:
>>> import hashlib
>>> testing=hashlib.sha256(b"sha256 is much longer than 12 hex characters")
>>> testing
<sha256 HASH object @ 0x7f31c1c64670>
>>> hashed_testing=testing.hexdigest()
>>> hashed_testing
'a0798cfd68c7463937acd7c08e5c157b7af29f3bbe9af3c30c9e62c10d388e80'
>>>
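To make the difference concrete, here is a minimal sketch using the phrase from the question (the variable names first and second are illustrative only). It hashes the same input twice and compares what repr() shows with what hexdigest() returns:

```python
import hashlib

phrase = "Give me liberty or give me death!".encode("utf-8")

first = hashlib.sha256(phrase)
second = hashlib.sha256(phrase)

# repr() describes the hash object, including its memory address,
# so it generally differs between objects and between runs.
print(repr(first))
print(repr(second))

# hexdigest() returns the actual SHA-256 value as 64 hex characters;
# it is identical for identical input.
print(first.hexdigest())
print(first.hexdigest() == second.hexdigest())
```

The address in the repr depends on where the object happens to live in memory, which is why the "hashes" in the question looked similar but never matched.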

Related

Guessing XOR secret key knowing some part of it

I'm trying to guess the secret key to decrypt a message using Python 3. I know the message is going to be something like: crypto{1XXXXXX} where the XXXXXXX is the unknown part of the message.
The encrypted message is: '0e0b213f26041e480b26217f27342e175d0e070a3c5b103e2526217f27342e175d0e077e263451150104' and I have the following code:
from pwn import xor
flkey=bytes.fromhex('0e0b213f26041e480b26217f27342e175d0e070a3c5b103e2526217f27342e175d0e077e263451150104')
print(flkey)
y = xor(flkey, "crypto{1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx}")
print(y)
xor(flkey, y)
My question is, how can I find the rest of the message knowing only some part of it? I'm quite new in this topic related to XOR.
EDIT: when I print(y) I obtain:
b'crypto{1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx}'
So I guess the length of the part between the braces is 34.
The weak point of the XOR operation in cryptography is that A XOR B XOR A = B. So when you know part of the plaintext message M for the corresponding encrypted message C, you immediately obtain that part of the key as K = M XOR C.
In particular:
>>> cypher = bytes.fromhex('0e0b213f26041e480b26217f27342e175d0e070a3c5b103e2526217f27342e175d0e077e263451150104')
>>> plaintext = b'crypto{1'
>>> key = ''.join(chr(c ^ m) for c, m in zip(cypher, plaintext))
>>> key
'myXORkey'
The chances are high that this is the entire key (it actually is, which is left as an exercise). This string will repeat as many times as needed to match the plain text length.
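As a rough sketch of how the recovered key can be applied, assuming it really is the whole key and simply repeats over the message, the following uses itertools.cycle to XOR the full cypher. The names known_prefix and plaintext are illustrative, not from the question:

```python
from itertools import cycle

cypher = bytes.fromhex(
    "0e0b213f26041e480b26217f27342e175d0e070a3c5b103e2526"
    "217f27342e175d0e077e263451150104"
)
known_prefix = b"crypto{1"

# K = M XOR C for the bytes whose plaintext is known.
key = bytes(c ^ m for c, m in zip(cypher, known_prefix))

# Assumption: the key repeats, so cycle it over the whole cypher
# and XOR byte by byte to recover the message.
plaintext = bytes(c ^ k for c, k in zip(cypher, cycle(key)))

print(key)
print(plaintext)
```

If the assumption is wrong, the output after the first eight bytes will look like noise, which is itself a useful signal that the key is longer.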
Suppose now that this was not the entire key. We know, however, that the key repeats in a loop, so the part we already know, myXORkey, will be reused somewhere later. We can apply it at various offsets in the cypher and see where the output starts making sense. That tells us the key length and parts of the message. There are a few ways to proceed from here; the simplest is to use the known parts of the plaintext to guess the missing parts from context, and from there recover the rest of the key.
The following properties may help:
the key is sufficiently short
the key makes some sense
you know the language the plain text is written in
If the key is as long as the message, is truly random, and used only once, the cypher cannot be broken (See One-time pad).
In the generic case, when the plaintext and/or the key length is unknown, there is a more sophisticated method based on the Hamming distance and transposition. (The method was first discovered in the 19th century by Friedrich Kasiski to analyze the Vigenère cipher.)
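The Hamming-distance idea can be sketched as follows. This is only an illustration under stated assumptions: hamming_distance and rank_key_sizes are hypothetical helper names, the scoring (average bit difference between neighbouring blocks, divided by the block size) is one common heuristic, and on a cypher as short as the one above the ranking may well be unreliable.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def rank_key_sizes(ciphertext: bytes, max_size: int = 16) -> list:
    """Return candidate key sizes, lowest normalized distance first."""
    scores = []
    for size in range(2, max_size + 1):
        # Split into full blocks of the candidate size.
        blocks = [
            ciphertext[i:i + size]
            for i in range(0, len(ciphertext) - size + 1, size)
        ]
        if len(blocks) < 2:
            break
        pairs = list(zip(blocks, blocks[1:]))
        average = sum(hamming_distance(a, b) for a, b in pairs) / len(pairs)
        # Normalize by block size so different sizes are comparable.
        scores.append((average / size, size))
    return [size for _, size in sorted(scores)]


cypher = bytes.fromhex(
    "0e0b213f26041e480b26217f27342e175d0e070a3c5b103e2526"
    "217f27342e175d0e077e263451150104"
)
print(rank_key_sizes(cypher))
```

Blocks that were XORed with the same key bytes tend to differ in fewer bits than unrelated blocks, so the true key length (or a multiple of it) usually scores well when enough ciphertext is available.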

Random hex string get 'none' when sha256 in python

Can you please help me understand what should be a simple subject? I need to produce a series of random hexadecimal strings, for which I'm using:
import secrets
x = secrets.token_hex(32)
This gives me something like this:
d6d09acbe78c269147803b8c351214a6e5f39093ca315c47e1126360d0df5369
Which is totally fine. Now I need to pass it through a SHA256 hash for which I'm using:
h = hashlib.new('sha256')
print (h.update(x))
Getting the error:
TypeError: Unicode-objects must be encoded before hashing
I read that I need to encode the string with .encode() before hashing it, which gives a completely weird:
b'd6d09acbe78c269147803b8c351214a6e5f39093ca315c47e1126360d0df5369'
and a 'None' result as the hash.
Can you please tell me what's going on here?
Thanks a lot, gents.
You are trying to print the return value of h.update(), which is None because update() doesn't return anything. Instead use
h.update(x.encode())
print(h.hexdigest())
to print out a hash string.
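Putting the pieces together, a minimal self-contained sketch might look like this (result and digest are illustrative names, not from the question):

```python
import hashlib
import secrets

x = secrets.token_hex(32)        # 64 hex characters, as a str

h = hashlib.new("sha256")
result = h.update(x.encode())    # update() feeds data in and returns None
digest = h.hexdigest()           # the hash itself is read with hexdigest()

print(result)
print(digest)
```

Note that x.encode() hashes the 64 ASCII characters of the hex string. If the intent is to hash the 32 random bytes that the string represents, bytes.fromhex(x) would be the input to use instead; which one is wanted depends on the application.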

How to perform SHA-256 on binary values with Hashlib?

I’m using Python 2 and am attempting to perform sha256 on binary values using hashlib.
I’ve become a bit stuck as I’m quite new to it all but have cobbled together:
hashlib.sha256('0110100001100101011011000110110001101111'.decode('hex')).hexdigest()
I believe it interprets the string as hex based on substituting the hex value (‘68656c6c6f’) into the above and it returning
2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
And comparing to this answer in which ‘hello’ or ‘68656c6c6f’ is used.
I think the answer lies with the decode component, but I can’t find an example for binary, only for ‘hex’ or ‘utf-8’.
Is anyone able to suggest what needs to be changed so that the function interprets as binary values instead of hex?
Here is code that does each of the data conversions you are looking for. These steps can all be combined, but are separated here so you can see each value.
import hashlib
import binascii
binstr = '0110100001100101011011000110110001101111'
hexstr = "{0:0>4X}".format(int(binstr,2)) # '68656C6C6F'
data = binascii.a2b_hex(hexstr) # 'hello'
output = hashlib.sha256(data).hexdigest()
print output
OUTPUT:
2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
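The answer above targets Python 2. For readers on Python 3, an equivalent conversion could be sketched with int.to_bytes; this assumes the bit string's length is a multiple of 8:

```python
import hashlib

binstr = "0110100001100101011011000110110001101111"

# Interpret the text as a base-2 integer, then pack it into bytes.
# len(binstr) // 8 assumes the bit string is a whole number of bytes.
data = int(binstr, 2).to_bytes(len(binstr) // 8, byteorder="big")
output = hashlib.sha256(data).hexdigest()

print(data)
print(output)
```

Sizing the result from len(binstr) rather than from the integer value keeps any leading zero bytes, which a plain hex conversion of the integer would drop.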

Hashing (hiding) strings in Python

What I need is to hash a string. It doesn't have to be secure because it's just going to be a hidden phrase in the text file (it just doesn't have to be recognizable for a human-eye).
It should not be just a random string because when the users types the string I would like to hash it and compare it with an already hashed one (from the text file).
What would be the best for this purpose? Can it be done with the built-in classes?
First off, let me say that you can't guarantee unique results. If you wanted unique results for all the strings in the universe, you're better off storing the string itself (or a compressed version).
More on that in a second. Let's get some hashes first.
hashlib way
You can use any of the main cryptographic hashes to hash a string with a few steps:
>>> import hashlib
>>> sha = hashlib.sha1(b"I am a cat")
>>> sha.hexdigest()
'576f38148ae68c924070538b45a8ef0f73ed8710'
You have a choice between SHA1, SHA224, SHA256, SHA384, SHA512, and MD5 as far as built-ins are concerned.
What's the difference between those hash algorithms?
A hash function works by taking data of variable length and turning it into data of fixed length.
The fixed length, in the case of each of the SHA algorithms built into hashlib, is the number of bits specified in the name (with the exception of sha1 which is 160 bits). If you want better certainty that two strings won't end up in the same bucket (same hash value), pick a hash with a bigger digest (the fixed length).
In sorted order, these are the digest sizes you have to work with:
Algorithm    Digest size (bits)
md5          128
sha1         160
sha224       224
sha256       256
sha384       384
sha512       512
The bigger the digest the less likely you'll have a collision, provided your hash function is worth its salt.
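The sizes in the table can be checked from Python itself. This is a small sketch that reads digest_size (in bytes) from each hash object; the digest_bits dictionary is an illustrative name, and the try/except is there on the assumption that some builds may disable an algorithm such as md5:

```python
import hashlib

digest_bits = {}
for name in ("md5", "sha1", "sha224", "sha256", "sha384", "sha512"):
    try:
        # digest_size is reported in bytes; multiply by 8 for bits.
        digest_bits[name] = hashlib.new(name).digest_size * 8
    except ValueError:
        # Some builds (for example FIPS-restricted ones) may refuse an algorithm.
        continue

for name, bits in digest_bits.items():
    print(name, bits)
```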
Wait, what about hash()?
The built in hash() function returns integers, which could also be easy to use for the purpose you outline. There are problems though.
>>> hash('moo')
6387157653034356308
If your program is going to run on different systems, you can't be sure that hash will return the same thing. In fact, I'm running on a 64-bit box using 64-bit Python. These values are going to be wildly different than for 32-bit Python.
For Python 3.3+, as @gnibbler pointed out, hash() is randomized between runs. It will work for a single run, but almost definitely won't work across runs of your program (pulling from the text file you mentioned).
Why would hash() be built that way? Well, the built in hash is there for one specific reason. Hash tables/dictionaries/look up tables in memory. Not for cryptographic use but for cheap lookups at runtime.
Don't use hash(), use hashlib.
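For the use case in the question (hash what the user types and compare it with a hash stored in a text file), a minimal Python 3 sketch could look like this. The helper names hash_phrase and matches and the sample phrase are hypothetical, and SHA-256 is just one reasonable choice from the table above:

```python
import hashlib
import hmac


def hash_phrase(phrase: str) -> str:
    """Return the SHA-256 hex digest of a phrase (UTF-8 encoded)."""
    return hashlib.sha256(phrase.encode("utf-8")).hexdigest()


def matches(user_input: str, stored_hash: str) -> bool:
    """Hash the input and compare it with the stored hex digest."""
    return hmac.compare_digest(hash_phrase(user_input), stored_hash)


# This value is what would be written to the text file.
stored = hash_phrase("open sesame")

print(matches("open sesame", stored))   # True
print(matches("Open Sesame", stored))   # False: hashing is case-sensitive
```

Because hashlib digests are stable across runs, platforms, and Python versions, the stored value keeps working after the program restarts, which is exactly what hash() does not guarantee.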
You can simply use the base64 module to achieve your goal:
>>> import base64
>>> a = 'helloworld'
>>> encoded_str = base64.b64encode(a)
>>> encoded_str
'aGVsbG93b3JsZA=='
>>> base64.b64decode(encoded_str)
'helloworld'
>>>
Of course, you can also use the hashlib module. It's more secure, because the hashed string cannot (or only with great difficulty) be decoded later, but for your question base64 is enough: "It doesn't really have to be secure".
Note that Python's string hash is not "defined" - it can, and does, vary across releases and implementations. So storing a Python string hash will create difficulties. CPython's string hash makes no attempt to be "obscure", either.
A standard approach is to use a hash function designed for this kind of thing. Like this:
>>> import hashlib
>>> encoded = hashlib.sha1(b"abcdef") # "abcdef" is the password
>>> encoded.hexdigest()
'1f8ac10f23c5b5bc1167bda84b833e5c057a77d2'
That long string of hexadecimal digits is "the hash". SHA-1 is a "strong" hash function. You can get famous if you find two strings that hash to the same value ;-) And given the same input, it will return the same "hexdigest" on all platforms across all releases and implementations of Python.
Simply use the hash() built-in function, for example:
s = 'a string'
hash(s)
=> -8411828025894108412

How to generate a mixed-case hash in Python?

I am having a hard time figuring out a reasonable way to generate a mixed-case hash in Python.
I want to generate something like: aZeEe9E
Right now I'm using MD5, which doesn't generate case-sensitive hashes.
Do any of you know how to generate a hash value consisting of upper- and lower- case characters + numbers?
Okay, GregS's advice worked like a charm (on the first try!):
Here is a simple example:
>>> import hashlib, base64
>>> s = 'http://gooogle.com'
>>> hash = hashlib.md5(s).hexdigest()
>>> print hash
46c4f333fae34078a68393213bb9272d
>>> print base64.b64encode(hash)
NDZjNGYzMzNmYWUzNDA3OGE2ODM5MzIxM2JiOTI3MmQ=
You can base64-encode the output of the hash. This has a couple of additional characters beyond those you mentioned.
Maybe you can use base64-encoded hashes?
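A Python 3 sketch of this suggestion follows. It encodes the raw 16-byte digest rather than the hex string, which gives a shorter result, and it uses the URL-safe alphabet as an assumption about what is convenient; the names raw, token, and short are illustrative:

```python
import base64
import hashlib

s = b"http://gooogle.com"

raw = hashlib.md5(s).digest()                       # 16 raw bytes
token = base64.urlsafe_b64encode(raw).decode("ascii")
short = token.rstrip("=")                           # drop the padding

print(token)
print(short)
```

The URL-safe alphabet uses upper- and lower-case letters, digits, '-' and '_', so the result is mixed-case but may still contain those two extra characters. Truncating short to fewer characters makes collisions more likely, so how much to keep is a trade-off.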
