I'm still working on my RSA project. I can now successfully create the keys and encrypt a string with them:
def encrypt(clear_message, public_key):
    clear_list = convert_into_unicode(clear_message)
    n = public_key[0]
    e = public_key[1]
    encrypted_message = str()
    for i, value in enumerate(clear_list):
        encrypted_value = str(pow(int(value), e, n))
        encrypted_message += encrypted_value
    return encrypted_message

def convert_into_unicode(clear_message):
    str_unicode = ''
    for car in clear_message:
        str_unicode += str(ord(car))
    # pad with zeros so the length is a multiple of 5
    if len(str_unicode) % 5 != 0:
        str_unicode += (5 - len(str_unicode) % 5) * '0'
    clear_list = []
    i = 5
    while i <= len(str_unicode):
        clear_list.append(str_unicode[i-5:i])
        i += 5
    return clear_list
For example, encrypting the message 'Hello World' returns ['72101', '10810', '81113', '28711', '11141', '08100', '32330'] as clear_list then
'3863 111 1616 3015 1202 341 4096' as encrypted_message
The encrypt() function uses the other function to convert the string into a list of Unicode values grouped into blocks, because I've read that otherwise it would be easy to recover the clear message with frequency analysis alone.
Is it really that easy?
And as it probably is, I come to my main question. As you know, the Unicode values of a character are either double-digits or triple-digits. Before the encryption, the Unicode values are separated into blocks of 5 digits ('stack' -> '115 116 97 99 107' -> '11511 69799 10700')
But the problem is when I want to decrypt this, how do I know where I have to separate that string so that one number represents one character?
I mean, the former Unicode value could be either 11 or 115 (I know it couldn't really be 11, but that's only an example). So to decrypt and get the character back, I don't know how many digits I have to take.
I had thought of adding a 0 when the Unicode value is < 100, but:
Then it's easy to do the same frequency analysis as before.
Still, when I encrypt it, '087' can result in '467' and '089' can result in '046', so the problem is still there.
You're trying to solve real world problems with a toy RSA problem. The frequency analysis can be performed because no random padding of the plaintext message has been used. Random padding is required to make RSA secure.
For this kind of problem it is enough to directly use the Unicode code point (an integer value) per character as input to RSA. RSA can however only directly encrypt values in the range [0..N) where N is the modulus. If you input a larger value x, it is first reduced to x modulo N. In that case you lose information and decryption can no longer recover the original value.
As for the ciphertext, just make this the string representation of the integer values separated by spaces and split them to read them in. This will take more space, but RSA always has a certain overhead.
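A minimal sketch of that approach, assuming a toy key pair (n = 61 * 53 = 3233, e = 17, d = 2753; these numbers are illustrative and not taken from the question). Each code point is encrypted on its own and the ciphertext is a space-separated string, so decryption never has to guess where one number ends. Without random padding it is still open to frequency analysis, as noted above.

```python
# Toy RSA key, far too small for real use (hypothetical values for illustration).
N, E, D = 3233, 17, 2753  # N = 61 * 53, and E * D leaves remainder 1 modulo 60 * 52

def encrypt(clear_message, public_key):
    n, e = public_key
    encrypted_values = []
    for char in clear_message:
        code_point = ord(char)
        # Values >= n would be reduced modulo n and could not be recovered.
        if code_point >= n:
            raise ValueError("code point %d does not fit below the modulus" % code_point)
        encrypted_values.append(str(pow(code_point, e, n)))
    return ' '.join(encrypted_values)

def decrypt(encrypted_message, private_key):
    n, d = private_key
    # Splitting on spaces gives exactly one number per character.
    return ''.join(chr(pow(int(token), d, n)) for token in encrypted_message.split())

ciphertext = encrypt('Hello World', (N, E))
clear_again = decrypt(ciphertext, (N, D))
```

Identical characters produce identical numbers here (the two consecutive 'l' characters, for instance), which is exactly what makes frequency analysis possible.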
If you want to implement secure RSA then please read up on the PKCS#1 standard and beware of timing attacks etc. And, as Wyzard already indicated, please use hybrid cryptography (using symmetric encryption in addition to RSA).
Or use a standard library, now that you understand how RSA works in principle.
Your convert_into_unicode function isn't really converting anything "into" Unicode. Assuming clear_message is a Unicode string (the default string type in Python 3, or u'' in Python 2), it's (naturally) Unicode already, and you're using an awkward way of turning it into a sequence of bytes that you can encrypt. If clear_message is a byte string (the default in Python 2, or b'' in Python 3), all the characters fit in a byte already, so the whole process is unnecessary.
It's true that a Unicode string needs to be encoded as a byte sequence before you can encrypt it. The normal way to do that is with an encoding such as UTF-8 or UTF-16. You can do that by calling clear_message.encode('utf-8'). After decrypting, you can turn the decrypted byte string back into a Unicode string with decrypted_bytes.decode('utf-8').
You don't need the convert_into_unicode function at all.
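A short sketch of that round trip, assuming Python 3 and a made-up sample message. Once the text is UTF-8 bytes, every value to encrypt is a single byte in the range 0..255, so there is no ambiguity about how many digits belong to one unit.

```python
clear_message = 'stack é'  # sample text, including one non-ASCII character

# Encode to bytes; each element of a bytes object is an integer 0..255.
clear_bytes = clear_message.encode('utf-8')
values = list(clear_bytes)

# ... each value would be encrypted and later decrypted here ...

# Rebuild the byte string and decode it back into text.
decrypted_bytes = bytes(values)
restored = decrypted_bytes.decode('utf-8')
```

Note that 'é' becomes two bytes, so the number of values can exceed the number of characters; decode() puts them back together.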
Similar to this other question on decoding a hex string, I have some code in a Python 2.7 script which has worked for years. I'm now trying to convert that script to Python 3.
OK, I apologize for not posting a complete question initially. I hope this clarifies the situation.
The issue is that I'm trying to convert an older Python 2.7 script to Python 3.8. For the most part the conversion has gone ok, but I am having issues converting the following code:
# get Register Strings
RegString = ""
for i in range(length):
    if regs[start+i] != 0:
        RegString = RegString + str(format(regs[start+i], 'x').decode('hex'))
Here is some supporting data:
regs[start+0] = 20341
regs[start+1] = 29762
I think that my Python 2.7 code is converting these to HEX as "4f75" and "7442", respectively. And then to the characters "Ou" and "tB", respectively.
In Python 3 I get this error:
'str' object has no attribute 'decode'
My goal is to modify my Python 3 code so that the script will generate the same results.
str(format(regs[start+i],'x').decode('hex')) is a very verbose and roundabout way of turning the non-zero integer values in regs[start:start + length] into individual characters of a bytestring (str in Python 2 should really be seen as a sequence of bytes). It first converts an integer value into its hexadecimal representation (a string), decodes that hexadecimal string to a series of characters, then calls str() on the result (redundantly, since the value is already a string). Assuming that the values in regs are integers in the range 0-255 (or even 0-127), in Python 2 this should really have been using the chr() function.
If you want to preserve the loop use chr() (to get a str string value) or if you need a binary value, use bytes([...]). So:
RegString = ""
for codepoint in regs[start:start + length]:
    RegString += chr(codepoint)
or
RegString = b""
for codepoint in regs[start:start + length]:
    RegString += bytes([codepoint])
Since this is actually converting a sequence of integers, you can just pass the whole lot to bytes() and filter out the zeros as you go:
# only take non-zero values
RegString = bytes(b for b in regs[start:start + length] if b)
or remove the nulls afterwards:
RegString = bytes(regs[start:start + length]).replace(b"\x00", b"")
If that's still supposed to be a string and not a bytes value, you can then decode it with whatever encoding is appropriate: ASCII if the integers are in the range 0-127, or a more specific codec otherwise. In Python 2 this code produced a bytestring, so look for other hints in the code as to what encoding it might have been using.
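The sample values in the question (20341 and 29762) are larger than 255, so each register appears to hold two characters, and bytes() would reject them. A hedged sketch for that case, assuming big-endian byte order and ASCII text; registers_to_bytes is a hypothetical helper name, not something from the original script:

```python
def registers_to_bytes(regs, start, length):
    """Turn non-zero integer registers into their big-endian bytes."""
    chunks = []
    for value in regs[start:start + length]:
        if value != 0:
            # Use as many bytes as the value needs: 0x4f75 -> b'Ou', 0x41 -> b'A'.
            byte_count = (value.bit_length() + 7) // 8
            chunks.append(value.to_bytes(byte_count, 'big'))
    return b''.join(chunks)

regs = [20341, 29762, 0]  # sample data from the question, plus a zero to skip
RegString = registers_to_bytes(regs, 0, 3).decode('ascii')
print(RegString)  # OutB
```

One difference from the Python 2 code: a value with an odd number of hex digits (such as 0x141) made decode('hex') fail, whereas this sketch emits a leading \x01 byte for it.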
I am new to Python & I am trying to learn how to XOR hex encoded ciphertexts against one another & then derive the ASCII value of this.
I have tried some of the functions as outlined in previous posts on this subject - such as bytearray.fromhex, binascii.unhexlify, decode("hex") and they have all generated different errors (obviously due to my lack of understanding). Some of these errors were due to my python version (python 3).
Let me give a simple example: say I have a hex-encoded string ciphertext_1 ("4A17") and a hex-encoded string ciphertext_2. I want to XOR these two strings and derive their ASCII value. The closest that I have come to a solution is with the following code:
result=hex(int(ciphertext_1, 16) ^ int(ciphertext_2, 16))
print(result)
This prints me a result of: 0xd07
(This is a hex string, as far as I understand?)
I then try to convert this to its ASCII value. At the moment, I am trying:
binascii.unhexlify(result)
However this gives me an error: "binascii.Error: Odd-length string"
I have tried the different functions outlined above, and tried to solve this specific error (the strip function gives another error), but I have been unsuccessful. I realise my knowledge and understanding of the subject are lacking, so I am hoping someone might be able to advise me.
Full example:
#!/usr/bin/env python
import binascii
ciphertext_1="4A17"
ciphertext_2="4710"
result=hex(int(ciphertext_1, 16) ^ int(ciphertext_2, 16))
print(result)
print(binascii.unhexlify(result))
from binascii import unhexlify
ciphertext_1 = "4A17"
ciphertext_2 = "4710"
xored = (int(ciphertext_1, 16) ^ int(ciphertext_2, 16))
# We format this integer: hex, no leading 0x, uppercase
string = format(xored, 'X')
# We pad it with an initial 0 if the length of the string is odd
if len(string) % 2:
    string = '0' + string
# unhexlify returns a bytes object, we decode it to obtain a string
print(unhexlify(string).decode())
#
# Not much appears, just a CR followed by a BELL
Or, if you prefer the repr of the string:
print(repr(unhexlify(string).decode()))
# '\r\x07'
When doing byte-wise operations like XOR, it's often easier to work with bytes objects (since the individual bytes are treated as integers). From this question, then, we get:
ciphertext_1 = bytes.fromhex("4A17")
ciphertext_2 = bytes.fromhex("4710")
XORing the bytes can then be accomplished as in this question, with a comprehension. Then you can convert that to a string:
result = [c1 ^ c2 for (c1, c2) in zip(ciphertext_1, ciphertext_2)]
result = ''.join(chr(c) for c in result)
I would probably take a slightly different angle and create a bytes object instead of a list, which can be decoded into your string:
result = bytes(b1 ^ b2 for (b1, b2) in zip(ciphertext_1, ciphertext_2)).decode()
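One thing to watch with zip() is that it silently stops at the shorter input. A small sketch that wraps the same idea in a function and checks the lengths first; xor_hex is a hypothetical name chosen here:

```python
def xor_hex(hex_1, hex_2):
    """XOR two hex-encoded strings byte by byte and return the raw bytes."""
    bytes_1 = bytes.fromhex(hex_1)
    bytes_2 = bytes.fromhex(hex_2)
    # zip() would quietly drop the extra bytes of the longer input.
    if len(bytes_1) != len(bytes_2):
        raise ValueError("inputs must decode to the same number of bytes")
    return bytes(b1 ^ b2 for b1, b2 in zip(bytes_1, bytes_2))

result = xor_hex("4A17", "4710")
print(result)        # b'\r\x07'
print(result.hex())  # 0d07
```

Calling .decode() on the result only makes sense when the XORed bytes happen to be valid text; keeping them as bytes (or as hex) avoids decode errors for arbitrary values.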
I'm struggling a bit to generate an integer ID for a given string in Python.
I thought the built-in hash function was perfect, but it appears that the IDs are sometimes too long. That's a problem since I'm limited to 64 bits as the maximum length.
My code so far: hash(s) % 10000000000.
The input string(s) which I can expect will be in range of 12-512 chars long.
Requirements are:
integers only
generated from provided string
ideally up to 10-12 chars long (I'll have ~5 million items only)
low probability of collision..?
I would be glad if someone can provide any tips / solutions.
I would do something like this:
>>> import hashlib
>>> m = hashlib.md5()
>>> m.update("some string".encode("utf-8"))
>>> str(int(m.hexdigest(), 16))[0:12]
'120665287271'
The idea:
Calculate the hash of a string with MD5 (or SHA-1 or ...) in hexadecimal form (see module hashlib)
Convert the hexadecimal string into an integer, then convert that back into a base-10 string (so the result contains only digits)
Use the first 12 characters of the string.
If characters a-f are also okay, I would do m.hexdigest()[0:12].
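Since the question mentions a 64-bit limit, a variation is to take the leading bytes of the digest as an integer instead of slicing a decimal string. This is a sketch under assumptions: SHA-256 is swapped in for MD5, the value is masked to 63 bits so it also fits a signed 64-bit field, and both function names are made up for illustration.

```python
import hashlib

def string_to_id(s, digits=None):
    """Derive a stable non-negative integer ID from a string."""
    digest = hashlib.sha256(s.encode('utf-8')).digest()
    # First 8 bytes as an integer, masked to 63 bits (fits a signed 64-bit field).
    value = int.from_bytes(digest[:8], 'big') & 0x7FFFFFFFFFFFFFFF
    if digits is not None:
        value %= 10 ** digits  # optionally shorten to at most `digits` decimal digits
    return value

def expected_collisions(item_count, id_space):
    """Birthday approximation: expected number of colliding pairs."""
    return item_count * (item_count - 1) / (2 * id_space)

short_id = string_to_id("some string", digits=12)
collisions_12_digits = expected_collisions(5_000_000, 10 ** 12)
collisions_63_bits = expected_collisions(5_000_000, 2 ** 63)
```

By the birthday approximation, 5 million items in a 12-digit space give roughly 12.5 expected colliding pairs, while the full 63-bit space gives on the order of one in a million, so shortening the ID has a real cost.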
If you're not allowed to add an extra dependency, you can continue using the built-in hash function in the following way:
>>> my_string = "whatever"
>>> str(hash(my_string))[1:13]
'460440266319'
NB:
I am ignoring the first character as it may be the negative sign.
hash may return different values for the same string from one run to the next, because string hashing is randomized per process unless PYTHONHASHSEED is set to a fixed value. You may want to set it to some fixed value.
Encoding to UTF-8 was needed for mine to work:
def unique_name_from_str(string: str, last_idx: int = 12) -> str:
    """
    Generates a unique id name
    refs:
    - md5: https://stackoverflow.com/questions/22974499/generate-id-from-string-in-python
    - sha3: https://stackoverflow.com/questions/47601592/safest-way-to-generate-a-unique-hash
    (- guid/uuid: https://stackoverflow.com/questions/534839/how-to-create-a-guid-uuid-in-python?noredirect=1&lq=1)
    """
    import hashlib
    m = hashlib.md5()
    string = string.encode('utf-8')
    m.update(string)
    unique_name: str = str(int(m.hexdigest(), 16))[0:last_idx]
    return unique_name
see my ultimate-utils python library.
I am new to crypto and I am trying to interpret the below code. Namely, what does <xor> mean?
I have a secret key, secret_key. I also have a unique_id. I create the pad using the code below.
pad = hmac.new(secret_key, msg=unique_id, digestmod=hashlib.sha1).digest()
Once the pad is created, I have a price e.g. 1000. I am trying to follow this instruction which is pseudocode:
enc_price = pad <xor> price
In Python, what is the code to implement enc_price = pad <xor> price? What is the logic behind doing this?
As a note, a complete description of what I want to do is here:
https://developers.google.com/ad-exchange/rtb/response-guide/decrypt-price
Thanks
The binary (I assume that's what you need) xor is ^ in python:
>>> 6 ^ 12
10
Binary xor works like this (numbers represented in binary):
     1234
 6 = 0110
12 = 1100
10 = 1010
For every pair of bits, if their sum is 1 (bits 1 and 3 in my example), the resulting bit is 1. Otherwise, it's 0.
The pad, and the plaintext "price" are each to be interpreted as a stream of bits. For each corresponding bit in the two streams, you take the "exclusive OR" of the pair of bits - if the bits are the same, you emit 0, if the bits are different, you emit 1. This operation is interesting because it's reversible: plaintext XOR pad -> ciphertext, and ciphertext XOR pad -> plaintext.
However, in Python, you won't usually do the XORing yourself because it's tedious and overly complex for a newbie; you want to use a popular encryption library such as PyCrypto to do the work.
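For illustration only, the reversibility can be checked in a few lines of Python 3. This sketch makes several assumptions that are not in the question: the key and ID are placeholder byte strings, the price is packed as an 8-byte big-endian integer, and only the first 8 bytes of the pad are used.

```python
import hashlib
import hmac

secret_key = b'hypothetical-secret-key'  # placeholder, not a real key
unique_id = b'hypothetical-unique-id'    # placeholder
pad = hmac.new(secret_key, msg=unique_id, digestmod=hashlib.sha1).digest()

price = 1000
price_bytes = price.to_bytes(8, 'big')   # assumed layout: 8 bytes, big-endian

def xor_bytes(left, right):
    # zip() stops at the shorter input, so only pad[:8] takes part here.
    return bytes(a ^ b for a, b in zip(left, right))

enc_price = xor_bytes(pad, price_bytes)                        # plaintext XOR pad
dec_price = int.from_bytes(xor_bytes(pad, enc_price), 'big')   # ciphertext XOR pad
```

XORing with the same pad a second time cancels it out, which is why dec_price comes back as the original 1000.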
You mean "Binary bitwise operations"?
The & operator yields the bitwise AND of its arguments, which must be plain or long integers. The arguments are converted to a common type.
The ^ operator yields the bitwise XOR (exclusive OR) of its arguments, which must be plain or long integers. The arguments are converted to a common type.
The | operator yields the bitwise (inclusive) OR of its arguments, which must be plain or long integers. The arguments are converted to a common type.
[update]
Since you can't xor a string and a number, you should either:
convert the number to a string padded to the same size and xor each byte (may give you all sorts of strange "escape" problems with some chars, for example, accidentally generating invalid unicode)
use the raw value (20-byte integer?) of the digest to xor and make a hex digest of the resulting number.
Something like this (untested):
import hashlib, hmac, struct
from functools import reduce  # reduce is not a builtin in Python 3

pad = hmac.new(secret_key, msg=unique_id, digestmod=hashlib.sha1).digest()
rawpad = reduce(lambda x, y: (x << 8) + y,
                [b for b in struct.unpack('B' * len(pad), pad)])
enc_price = "%X" % (rawpad ^ price)
[update]
The OP wants to implement "DoubleClick Ad Exchange Real-Time Bidding Protocol".
That same article says some sample Python code is available:
Initial Testing
You can test your bidding application internally using requester.tar.gz. This is a test python program that sends requests to a bidding application and checks the responses. The program is available on request from your Ad Exchange representative.
I did it like this:
def strxor(s1,s2):
    size = min(len(s1),len(s2))
    res = ''
    for i in range(size):
        res = res + '%c' % (ord(s1[i]) ^ ord(s2[i]))
    return res
I feel like a complete tool for posting this; it is so basic and I can't believe I have wasted the last two days on this problem. I've tried all the solutions I can find on this (seriously, I will show you my internet history) but to no avail. Here is the problem:
I am parsing a serial string coming in from a uC. It is 52 bytes long and contains a lot of different data variables. The data is encoded in packed binary coded decimal.
Ex: .....blah.....0x01 0x5E .....blah
015E hex gives 350 decimal. This is the value I want. I am reading in the serial string just fine; I used binascii.hexlify to print the bytes to ensure it is correct. I use
data = ser.read()
and place the data in an array if a newline is not received. I have tried making the array a bytearray, a list, anything that I could find, but none work.
I want to send the required two-byte section to a defined method.
def makeValue(highbyte, lowbyte):
When I try to use unpack, join, pack, bit manipulation, or string concatenation, I keep getting the same problem.
Because 0x01 and 0x5E are not valid int numbers (start of heading and ^ in ASCII), it won't work. It won't even let me join the numbers first because it's not a valid int.
using hex(): hex argument can't be converted to hex.
Joining the strings: invalid literal for int() with base 16: '\x01^'
using int: invalid literal for int() with base 10: '\x01^'
Packing a struct: struct.error: cannot convert argument to integer
Seriously, am I missing something really basic here? All the examples I can find use the functions above perfectly, but they specify the hex numbers as '0x1234', or the numbers they are converting are actual ASCII digits. Please help.
EDIT
I got it, ch3ka set me on the right track, thanks a million!
I don't know why it wouldn't work before but I hex'ed both values
one = binascii.hexlify(line[7])
two = binascii.hexlify(line[8])
makeValue(one, two)
and then used the makeValue function ch3ka defined:
def makeValue(highbyte, lowbyte):
    print(int(highbyte, 16)*256 + int(lowbyte, 16))
Thanks again!!!
You are interpreting the values as chars. Feeding chars to int() won't work; you have to feed the values as strings, like so: int("0x5E", 16). What you are attempting is in fact int(chr(int("0x5E", 16)),16), which is int("^",16) and will of course not work.
Do you expect these results?
makeValue('0x01', '0x5E') ->   350 0x15e  0b101011110
makeValue('0xFF', '0x00') -> 65280 0xff00 0b1111111100000000
makeValue('0x01', '0xFF') ->   511 0x1ff  0b111111111
makeValue('0xFF', '0xFF') -> 65535 0xffff 0b1111111111111111
If so, you can use this:
def makeValue(highbyte, lowbyte):
    return int(highbyte, 16)*256 + int(lowbyte, 16)
or the IMO more ugly and error-prone:
def makeValue(highbyte, lowbyte):
    return int(highbyte+lowbyte[2:], 16)  # strips leading "0x" from lowbyte before concatenating
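In Python 3 the detour through hex strings is not needed, assuming the serial data arrives as a bytes object: indexing bytes already yields integers. A sketch of three equivalent ways to combine a high and a low byte; make_value and the two-byte sample line are illustrative, not the asker's actual 52-byte frame.

```python
import struct

def make_value(highbyte, lowbyte):
    """Combine two integers 0..255 into one 16-bit big-endian value."""
    return (highbyte << 8) | lowbyte

line = bytes([0x01, 0x5E])  # stand-in for two adjacent bytes of the frame

value_a = make_value(line[0], line[1])        # indexing bytes gives ints in Python 3
value_b = int.from_bytes(line[0:2], 'big')    # slice of bytes, big-endian
value_c = struct.unpack('>H', line[0:2])[0]   # '>H' = big-endian unsigned 16-bit

print(value_a, value_b, value_c)  # 350 350 350
```

struct.unpack is convenient when a frame holds many fields, since a single format string can describe all of them at once.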