Base64 implementation is not giving the desired result - python

I followed the instructions on the following site to implement base64 encoding.
Here's my code:
from bitarray import bitarray
from bitarray.util import ba2int
base64mapping = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcd\
efghijklmnopqrstuvwxyz0123456789+/"
testString1 = "49276d206b696c6c696e6720796f757220627261696e206c\
696b65206120706f69736f6e6f7573206d757368726f6f6d"
stringBits = bitarray()
# We create an empty bitarray
stringBits.frombytes(testString1.encode('utf-8'))
# We convert the string to bits
b64stringList = [] # To store the corresponding b64 chars
for sequence in range(0, len(stringBits), 6):
    # We scan the bitarray 6 bits at a time
    b64stringList += base64mapping[(ba2int(stringBits[sequence:sequence+6]))]
    # We store the corresponding b64 char into b64stringList
print(''.join(b64stringList)) # Let's see the result
The testString1 was taken from the cryptopals set 1 - challenge 1 which tells me the result should be:
SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t
However, my implementation gives another result:
NDkyNzZkMjA2YjY5NmM2YzY5NmU2NzIwNzk2Zjc1NzIyMDYyNzI2MTY5NmUyMDZjNjk2YjY1MjA2MTIwNzA2ZjY5NzM2ZjZlNmY3NTczMjA2ZDc1NzM2ODcyNmY2ZjZk
I checked against Wikipedia's example and I get the same output sans the padding at the end (TODO).
What am I doing wrong or missing here?

According to the site, the task is to convert the hex to base64.
Your encoding seems to be working correctly: the standard library's base64 module produces the same result as yours when it is run directly over the hex text.
import base64
import binascii
t = b'49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d'
base64.b64encode(t)
b'NDkyNzZkMjA2YjY5NmM2YzY5NmU2NzIwNzk2Zjc1NzIyMDYyNzI2MTY5NmUyMDZjNjk2YjY1MjA2MTIwNzA2ZjY5NzM2ZjZlNmY3NTczMjA2ZDc1NzM2ODcyNmY2ZjZk'
Using binascii, we can convert the hex value you originally have to readable text with unhexlify():
binascii.unhexlify(t)
#b"I'm killing your brain like a poisonous mushroom"
Given this, it is best to treat what you are storing not as a string but as a hex representation of the string.
So we need to convert the hex to the bytes it represents before encoding it.
When we pass the resulting bytes to base64, the output matches the expected output in the link.
base64.b64encode(binascii.unhexlify(t))
#b'SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t'
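Since the question was about implementing base64 by hand, here is a minimal sketch of the same fix without the base64 module doing the work. It decodes the hex first with bytes.fromhex and then maps each 3-byte group to four 6-bit indices, including the "=" padding the question left as a TODO. The names B64_ALPHABET and hex_to_base64 are made up for this example.

```python
import base64

B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def hex_to_base64(hex_string):
    """Decode a hex string to raw bytes, then base64-encode those bytes."""
    raw = bytes.fromhex(hex_string)
    chars = []
    for i in range(0, len(raw), 3):
        chunk = raw[i:i + 3]
        # Pack up to three bytes into a 24-bit integer, zero-filled on the right.
        n = int.from_bytes(chunk + b"\x00" * (3 - len(chunk)), "big")
        # Split the 24 bits into four 6-bit alphabet indices.
        quad = [B64_ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # A chunk of k bytes yields k + 1 significant characters; pad the rest.
        keep = len(chunk) + 1
        chars.extend(quad[:keep] + ["="] * (4 - keep))
    return "".join(chars)

hex_input = (
    "49276d206b696c6c696e6720796f757220627261696e206c"
    "696b65206120706f69736f6e6f7573206d757368726f6f6d"
)
result = hex_to_base64(hex_input)
print(result)
# Cross-check against the standard library.
print(result == base64.b64encode(bytes.fromhex(hex_input)).decode("ascii"))
```

The only change relative to the question's approach is the input: the bits come from the decoded bytes, not from the UTF-8 encoding of the hex text.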

Related

How to understand bytes output in python?

I am using struct for creating byte like objects out of arrays. Here is my code:
import numpy as np
import struct
a = 1
a = np.array(a,dtype=np.int32)
format_charecters = f'<1I'
bytes_ = struct.pack(format_charecters,*a.flatten())
bytes_
The code outputs:
b'\x01\x00\x00\x00'
This makes sense to me: I am using < (little-endian byte ordering), and according to the following table, 1 should correspond to \x01, where x denotes hexadecimal.
Now when I replace 1 with 10 I get a surprising result:
b'\n\x00\x00\x00'
I was not expecting this... I thought the output would be:
b'\x0a\x00\x00\x00'
Also for some random value a = 1324233699 I get:
b'\xe33\xeeN'
Using an online decimal-hex converter I get:
4EEE33E3
How to interpret the results of my code?
The link deadshot gave perfectly explained my question. I am adding a screenshot of the table of escape characters in python so that other people can find it even if the 'tutorialspoint' site goes down.
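For readers without the screenshot, here is a short sketch of what is going on, using only the struct module and the values from the question. The bytes are exactly what was expected; only the way Python displays them differs. Byte 0x0a is shown by its escape \n, and bytes that are printable ASCII (0x33 is "3", 0x4e is "N") are shown as characters.

```python
import struct

# 10 packs to byte 0x0a, which the bytes repr displays as the escape "\n".
packed_ten = struct.pack('<I', 10)
print(packed_ten == b'\x0a\x00\x00\x00')  # the two spellings are the same object value

# 1324233699 is 0x4EEE33E3; little-endian stores the low byte first.
packed = struct.pack('<I', 1324233699)
print(packed)        # repr mixes \x escapes with printable ASCII characters
print(packed.hex())  # plain hex digits, in storage (little-endian) order

# Reversing the byte order gives the digits an online converter shows.
print(packed[::-1].hex().upper())

# Unpacking recovers the original integer.
value = struct.unpack('<I', packed)[0]
print(value)
```

Calling .hex() on a bytes object is a convenient way to sidestep the mixed display when inspecting packed data.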

Why python has different types of bytes

I have two variables, one is b_d, the other is b_test_d.
When I type b_d in the console, it shows:
b'\\\x8f\xc2\xf5(\\\xf3?Nb\x10X9\xb4\x07#\x00\x00\x00\x00\x00\x00\xf0?'
when I type b_test_d in the console, it shows:
b'[-2.1997713216,-1.4249271187,-1.1076795391,1.5224958034,-0.1709796203,0.3663875698,0.14846441,-0.7415930061,-1.7602231949,0.126605689,0.6010934792,-0.466415358,1.5675525816,1.00836295,1.4332792992,0.6113384254,-1.8008540571,-0.9443408896,1.0943670356,-1.0114642686,1.443892627,-0.2709427287,0.2990462512,0.4650133591,0.2560791327,0.2257600462,-2.4077429827,-0.0509983213,1.0062187148,0.4315075795,-0.6116110033,0.3495131413,-0.3249903375,0.3962305931,-0.1985757285,1.165792433,-1.1171953063,-0.1732557874,-0.3791600654,-0.2860519953,0.7872658859,0.217728374,-0.4715179983,-0.4539613811,-0.396353657,1.2326862425,-1.3548659354,1.6476230786,0.6312713442,-0.735444661,-0.6853447369,-0.8480631975,0.9538606574,0.6653542368,-0.2833696021,0.7281604648,-0.2843872095,0.1461980484,-2.3511731773,-0.3118047948,-1.6938613893,-0.0359659687,-0.5162134311,-2.2026641552,-0.7294895084,0.7493073213,0.1034096968,0.6439803068,-0.2596155272,0.5851323455,1.0173285542,-0.7370464113,1.0442954406,-0.5363832595,0.0117795359,0.2225617514,0.067571974,-0.9154681906,-0.293808596,1.3717113798,0.4919516922,-0.3254944005,1.6203744532,-0.1810222279,-0.6111596457,1.344064259,-0.4596893179,-0.2356197144,0.4529942046,1.6244603294,0.1849995925,0.6223061217,-0.0340662398,0.8365900535,-0.6804201929,0.0149665385,0.4132453788,0.7971962667,-1.9391525531,0.1440486871,-0.7103617816,0.9026539637,0.6665798363,-1.5885073458,1.4084493329,-1.397040825,1.6215697667,1.7057148522,0.3802647045,-0.4239271483,1.4773614536,1.6841461329,0.1166845529,-0.3268795898,-0.9612751672,0.4062399443,0.357209662,-0.2977362702,-0.3988147401,-0.1174652196,0.3350589818,-1.8800423584,0.0124169787,1.0015110265,0.789541751,-0.2710408983,1.4987300181,-1.1726824468,-0.355322591,0.6567978423,0.8319110558,0.8258835069,-1.1567887763,1.9568551122,1.5148655075,1.0589021915,-0.4388232953,-0.7451680183,-2.1897621693,0.4502135234,-1.9583089063,0.1358789518,-1.7585860897,0.452259777,0.7406800349,-1.3578980418,1.108740204,-1.1986272667,-1.0273598206,-1.816582
2264,1.0853600894,-0.273943514,0.8589890805,1.3639094329,-0.6121993589,-0.0587067992,0.0798457584,1.0992814648,-1.0455733611,1.4780003064,0.5047157705,0.1565451605,0.9656886956,-0.5998330255,0.4846727299,0.8790524818,1.0288893846,-2.0842447397,0.4074607421,2.1523241756,-1.1268047125,-0.6016001524,-1.3302141561,1.1869516954,1.0988060125,0.7405900405,1.1813110811,0.8685330644,2.0927140519,-1.7171952009,0.9231993147,0.320874115,0.7465845079,-0.1034484959,-0.4776822499,0.436218328,-0.4083564542,0.4835567895,1.0733230373,-0.858658902,-0.4493571034,0.4506418221,1.6696649735,-0.9189799982,-1.1690356499,-1.0689397924,0.3174297583,1.0403701444,0.5440082812,-0.1128248996]'
Both of them are bytes type, but I can use numpy.frombuffer to read the b_d, but not the b_test_d. And they look very different. Why do I have these two types of bytes?
Thank you.
[A]nyone can point out how to use Json marshall to convert the byte to the same type of bytes as the first one?
This isn't the right question, but I think I know what you're asking. You say you're getting the 2nd array via JSON marshalling, but that it's also not under your control:
it was obtained by json marshal (convert a received float array to byte array, and then convert the result to base64 string, which is done by someone else)
That's fine though, you just have to do a few steps of processing to get to a state equivalent to the first set of bytes.
First, some context to what's going on. You've already seen that numpy can understand your first set of bytes.
>>> numpy.frombuffer(data)
[1.21 2.963 1. ]
Based on its output, it looks like numpy is interpreting your data as 3 doubles, with 8 bytes each (24 bytes total)...
>>> data = b'\\\x8f\xc2\xf5(\\\xf3?Nb\x10X9\xb4\x07#\x00\x00\x00\x00\x00\x00\xf0?'
>>> len(data)
24
...which the struct module can also interpret.
# Separate into 3 doubles
x, y, z = data[:8], data[8:16], data[16:]
print([struct.unpack('d', i) for i in (x, y, z)])
[(1.21,), (2.963,), (1.0,)]
There's actually (at least) 2 ways you can get a numpy array out of this.
Short way
1. Convert to string
# Original JSON data (snipped)
junk = b'[-2.1997713216,-1.4249271187,-1.1076795391,...]'
# Decode from bytes to a string (defaults to utf-8), then
# trim off the brackets (first and last characters in the string)
as_str = junk.decode()[1:-1]
2. Use numpy.fromstring
numpy.fromstring(as_str, dtype=float, sep=',')
# Produces:
array([-2.19977132, -1.42492712, -1.10767954, 1.5224958 , -0.17097962,
0.36638757, 0.14846441, -0.74159301, -1.76022319, 0.12660569,
0.60109348, -0.46641536, 1.56755258, 1.00836295, 1.4332793 ,
0.61133843, -1.80085406, -0.94434089, 1.09436704, -1.01146427,
1.44389263, -0.27094273, 0.29904625, 0.46501336, 0.25607913,
0.22576005, -2.40774298, -0.05099832, 1.00621871, 0.43150758,
... ])
Long way
Note: I found the fromstring method after writing this part up, figured I'd leave it here to at least help explain the byte differences.
1. Convert the JSON data into an array of numeric values.
# Original JSON data (snipped)
junk = b'[-2.1997713216,-1.4249271187,-1.1076795391,...]'
# Decode from bytes to a string - defaults to utf-8
junk = junk.decode()
# Trim off the brackets - First and last characters in the string
junk = junk[1:-1]
# Separate into values
junk = junk.split(',')
# Convert to numerical values
doubles = [float(val) for val in junk]
# Or, as a one-liner (starting again from the original bytes in junk)
doubles = [float(val) for val in junk.decode()[1:-1].split(',')]
# "doubles" currently holds:
[-2.1997713216,
-1.4249271187,
-1.1076795391,
1.5224958034,
...]
2. Use struct to get byte-representations for the doubles
import struct
as_bytes = [struct.pack('d', val) for val in doubles]
# "as_bytes" currently holds:
[b'\x08\x9b\xe7\xb4!\x99\x01\xc0',
b'\x0b\x00\xe0`\x80\xcc\xf6\xbf',
b'+ ..\x0e\xb9\xf1\xbf',
b'hg>\x8f$\\\xf8?',
...]
3. Join all the double values (as bytes) into a single byte-string, then submit to numpy
new_data = b''.join(as_bytes)
numpy.frombuffer(new_data)
# Produces:
array([-2.19977132, -1.42492712, -1.10767954, 1.5224958 , -0.17097962,
0.36638757, 0.14846441, -0.74159301, -1.76022319, 0.12660569,
0.60109348, -0.46641536, 1.56755258, 1.00836295, 1.4332793 ,
0.61133843, -1.80085406, -0.94434089, 1.09436704, -1.01146427,
1.44389263, -0.27094273, 0.29904625, 0.46501336, 0.25607913,
0.22576005, -2.40774298, -0.05099832, 1.00621871, 0.43150758,
... ])
A bytes object can be in any format. It is "just a bunch of bytes" without context. For display, Python shows printable ASCII bytes as their characters and uses escape sequences (\x##, or short forms such as \n) for everything else.
The first looks like IEEE 754 double precision floating point. numpy or struct can read it. The second one is in JSON format. Use the json module to read it:
import numpy as np
import json
import struct
b1 = b'\\\x8f\xc2\xf5(\\\xf3?Nb\x10X9\xb4\x07#\x00\x00\x00\x00\x00\x00\xf0?'
b2 = b'[-2.1997713216,-1.4249271187,-1.1076795391,1.5224958034]'
j = json.loads(b2)
n = np.frombuffer(b1)
s = struct.unpack('3d',b1)
print(j,n,s,sep='\n')
# To convert b2 into a b1 format
b = struct.pack('4d',*j)
print(b)
Output:
[-2.1997713216, -1.4249271187, -1.1076795391, 1.5224958034]
[1.21 2.963 1. ]
(1.21, 2.963, 1.0)
b'\x08\x9b\xe7\xb4!\x99\x01\xc0\x0b\x00\xe0`\x80\xcc\xf6\xbf+ ..\x0e\xb9\xf1\xbfhg>\x8f$\\\xf8?'
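If numpy is not at hand, the same conversion between the JSON text and the packed-double layout can be sketched with the standard library's array module. This is only an illustration with the shortened b2 from above; it assumes native byte order and 8-byte doubles, like the struct.pack('4d', ...) call.

```python
import json
import struct
from array import array

b2 = b'[-2.1997713216,-1.4249271187,-1.1076795391,1.5224958034]'

# JSON text -> list of Python floats
values = json.loads(b2)

# list of floats -> packed 8-byte doubles in native byte order
packed = array('d', values).tobytes()
print(len(packed))  # 8 bytes per value

# Same bytes as struct.pack with a matching format string
print(packed == struct.pack('%dd' % len(values), *values))

# packed doubles -> list of floats again
restored = array('d')
restored.frombytes(packed)
print(restored.tolist() == values)
```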

Converting an Integer value to base64, and then decoding it to get a plaintext

I am given the number 427021005928, which I am supposed to convert into a base64-encoded string and then decode the base64 string to get plain text.
The decimal value 427021005928, when converted to binary, gives 110001101101100011011110111010001101000, which corresponds to 'Y2xvdGg=', which is what I want. I got the conversion from (https://cryptii.com/pipes/binary-to-base64)
Finally, I decode 'Y2xvdGg=' to get the text cloth.
My problem is that I have no idea how to use Python to get from either the decimal or the binary value to 'Y2xvdGg='.
Some help would be appreciated!
NOTE: I only have this value 427021005928 at the start. I need to get the base64 and plaintext answers.
One elegant way would be to use the struct module (Interpret bytes as packed binary data), but since Python integers are not fixed size, some additional computation would be required (for example, this number is 5 bytes long).
Apparently, the online converter applied the base64 encoding to the number's memory representation, which can be obtained via int.to_bytes(length, byteorder, *, signed=False) (endianness is important, and in this case it's big).
For the backwards process, the steps are reversed. There are 2 alternatives:
1. Doing things manually (this could also be applied to the "forward" process)
2. Using int.from_bytes
>>> import base64
>>>
>>> number = 427021005928
>>>
>>> number_bytes = number.to_bytes((number.bit_length() + 7) // 8, byteorder="big") # Here's where the magic happens
>>> number_bytes, number_bytes.decode()
(b'cloth', 'cloth')
>>>
>>> encoded = base64.b64encode(number_bytes)
>>> encoded, encoded.decode() # Don't let yourself tricked by the variable and method names resemblance
(b'Y2xvdGg=', 'Y2xvdGg=')
>>>
>>> # Now, getting the number back
...
>>> decoded = base64.b64decode(encoded)
>>> decoded
b'cloth'
>>>
>>> final_number0 = sum((item * 256 ** idx for idx, item in enumerate(reversed(decoded))))
>>> final_number0
427021005928
>>> number == final_number0
True
>>>
>>> # OR using from_bytes
...
>>> final_number1 = int.from_bytes(decoded, byteorder="big")
>>> final_number1
427021005928
>>> final_number1 == number
True
For more details on bitwise operations, check [SO]: Output of crc32b in PHP is not equal to Python (@CristiFati's answer).
Try this (https://docs.python.org/3/library/stdtypes.html#int.to_bytes)
>>> import base64
>>> x=427021005928
>>> y=x.to_bytes(5,byteorder='big').decode('utf-8')
>>> base64.b64encode(y.encode()).decode()
'Y2xvdGg='
>>> y
'cloth'
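Try:
import base64
number = 427021005928
# bytes(number) would create that many zero bytes, so use to_bytes instead
encoded = base64.b64encode(number.to_bytes((number.bit_length() + 7) // 8, byteorder="big"))
decoded = base64.b64decode(encoded)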
The function below converts an unsigned 64 bit integer into base64 representation, and back again. This is particularly helpful for encoding database keys.
We first encode the integer into a byte array in little-endian order and strip the high-order zero bytes, which little endian places at the end. We then convert to base64 and drop the trailing = padding. Note the url_safe flag: it makes the output non-compliant with base64, but it works better in URLs.
import base64
import struct

def int_to_chars(number, url_safe=True):
    '''
    Convert an integer to base64. Used to turn IDs into short URL slugs.
    :param number: non-negative integer that fits in 64 bits
    :param url_safe: base64 may contain "/" and "+", which do not play well
                     with URLs. Set to True to convert "/" to "-" and "+" to
                     ".". This no longer conforms to base64, but looks better
                     in URLs.
    :return: the encoded value as bytes
    '''
    if number < 0:
        raise Exception("Cannot convert negative IDs.")
    # Encode as an unsigned long long, little endian.
    packed = struct.pack("<Q", number)
    # Remove high-order zero bytes (they come last in little endian).
    # Slicing keeps the comparison bytes-to-bytes; packed[-1] is an int in Python 3.
    while len(packed) > 1 and packed[-1:] == b'\x00':
        packed = packed[:-1]
    encoded = base64.b64encode(packed).split(b"=")[0]
    if url_safe:
        encoded = encoded.replace(b"/", b"-").replace(b"+", b".")
    return encoded

def chars_to_int(chars):
    '''Reverse of the above function. Will work regardless of whether
    url_safe was set to True or False.'''
    # Make sure the data is in binary type.
    if isinstance(chars, str):
        chars = chars.encode('utf8')
    # Do the reverse of the url_safe conversion above.
    chars = chars.replace(b"-", b"/").replace(b".", b"+")
    # First decode the base64, adding the required "=" padding (0 to 3 characters).
    b64_pad_len = -len(chars) % 4
    decoded = base64.b64decode(chars + b"=" * b64_pad_len)
    # Now pad with zero bytes, which are the high-order bytes in little endian.
    int64_pad_len = 8 - len(decoded)
    return struct.unpack("<Q", decoded + b'\x00' * int64_pad_len)[0]
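As an aside, the standard library already ships a URL-safe alphabet ("-" and "_" in place of "+" and "/") via base64.urlsafe_b64encode. The sketch below is a swapped-in alternative, not the function above: it uses that alphabet together with int.to_bytes, so it is not limited to 64 bits. The names int_to_slug and slug_to_int are made up for this example.

```python
import base64

def int_to_slug(number):
    """Encode a non-negative integer as an unpadded URL-safe base64 string."""
    if number < 0:
        raise ValueError("Cannot convert negative IDs.")
    # Use the minimal number of bytes, but at least one so that 0 encodes too.
    length = max(1, (number.bit_length() + 7) // 8)
    raw = number.to_bytes(length, "little")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def slug_to_int(slug):
    """Reverse of int_to_slug: restore the padding, decode, rebuild the integer."""
    padded = slug + "=" * (-len(slug) % 4)
    return int.from_bytes(base64.urlsafe_b64decode(padded), "little")

for sample in (0, 1, 255, 256, 427021005928, 2**64 - 1):
    slug = int_to_slug(sample)
    print(sample, slug, slug_to_int(slug) == sample)
```

Because the padding length is recomputed from the slug length on decode, dropping the "=" characters loses no information.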
You can do the following conversions in Python.
First, import base64:
>>> import base64
To convert text to base64 and back, do the following.
Encoding:
>>> base64.b64encode("cloth".encode()).decode()
'Y2xvdGg='
decoding
>>> base64.b64decode("Y2xvdGg=".encode()).decode()
'cloth'

python convert bit hex to binary

OK, I'm fairly new to Python but not to programming; I know PHP, C, bash, etc. My question is:
How do I convert data = "b'\x16'" to binary "0001 0110"?
I'm trying to read the response to a DLE command from an ESC printer:
x = 1
while x:
    time.sleep(3)
    ser.write("\x10\x04\x01".encode())
    bytesToRead = ser.inWaiting()
    data = ser.read(bytesToRead)
    while data:
        print(data)
        data = ""
All that ends up printing is b'\x16'. I assume it is hex, but a simple hex-to-bin conversion is not working because of the b?
What you get back is a bytes object. (think: raw array of bytes) You can get the number itself from the first byte via data[0]. That will give you 0x16 as an int, which you can convert however you want.
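A minimal sketch of that conversion, using the single byte from the question as stand-in data (no serial port involved). The helper name byte_to_bits is made up for this example.

```python
def byte_to_bits(value):
    """Format an integer 0-255 as two space-separated groups of four bits."""
    bits = format(value, '08b')  # zero-padded 8-digit binary string
    return bits[:4] + ' ' + bits[4:]

data = b'\x16'   # what ser.read() returned in the question
first = data[0]  # indexing a bytes object gives an int (0x16 == 22)
print(first)
print(byte_to_bits(first))

# For a longer response, convert every byte the same way.
print([byte_to_bits(b) for b in b'\x16\xff\x00'])
```

Individual status bits can then be tested directly on the integer, for example first & 0x08 for bit 3, without going through a string at all.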

How to convert from binary to bytes to hex in Python 3?

I'm trying to write a program that converts two hex strings to bytes, and from bytes to binary. Once in binary, I want to perform an XOR transposition on them. This having been accomplished, I want to convert the binary strings back to bytes, and again into hex. I already know what the answer should be, it's just a question of getting from A to B.
The code I have so far is as follows:
input1 = "1c0111001f010100061a024b53535009181c"
input2 = "686974207468652062756c6c277320657965"
target = "746865206b696420646f6e277420706c6179"
d = conversions.hexconvert(input1)
e = conversions.hexconvert(input2)
print(d)
print(e)
f = bitstring.BitArray(d)
g = bitstring.BitArray(e)
xor1 = f.bin
xor2 = g.bin
print("xor1 is", xor1)
print("xor2 is", xor2)
xor1, xor2 = xor2, xor1
print("xor1 is now:", xor1)
The function "hexconvert" is comprised of the following code:
import codecs
def hexconvert(input):
    output = codecs.decode(input, 'hex')
    return(output)
My code is currently spitting out the following:
b'\x1c\x01\x11\x00\x1f\x01\x01\x00\x06\x1a\x02KSSP\t\x18\x1c'
b"hit the bull's eye"
xor1 is : 000111000000000100010001000000000001111100000001000000010000000000000110000110100000001001001011010100110101001101010000000010010001100000011100
xor2 is : 011010000110100101110100001000000111010001101000011001010010000001100010011101010110110001101100001001110111001100100000011001010111100101100101
xor1 is now: 011010000110100101110100001000000111010001101000011001010010000001100010011101010110110001101100001001110111001100100000011001010111100101100101
All good so far. I'd like to know what I can add to the end of this code to convert xor1 to bytes then to hex so that I can compare it to the result it should be. I've been trying to figure out how to use struct, binascii, and even bitstring, but I'm getting nowhere. Any and all suggestions greatly appreciated.
It would also be great if anyone could suggest how to make the code more efficient.
Thanks very much in advance!
You don't have to convert to bits here; you can XOR bytes just fine. When you iterate over a bytes object you get the individual values as integers in the range 0-255, and you can XOR those. Vice versa, you can create a new bytes object again from a sequence of integers.
Convert from hex to bytes with binascii.unhexlify(), back again with binascii.hexlify():
from binascii import hexlify, unhexlify
bytes1, bytes2 = unhexlify(input1), unhexlify(input2)
xor_bytes = bytes([b1 ^ b2 for b1, b2 in zip(bytes1, bytes2)])
result = hexlify(xor_bytes).decode('ascii')
The decode is there to convert the bytes output of hexlify back to a string.
Demo:
>>> from binascii import hexlify, unhexlify
>>> input1 = "1c0111001f010100061a024b53535009181c"
>>> input2 = "686974207468652062756c6c277320657965"
>>> bytes1, bytes2 = unhexlify(input1), unhexlify(input2)
>>> xor_bytes = bytes([b1 ^ b2 for b1, b2 in zip(bytes1, bytes2)])
>>> xor_bytes
b"the kid don't play"
>>> hexlify(xor_bytes).decode('ascii')
'746865206b696420646f6e277420706c6179'
If all you need is to xor two hex strings:
>>> hex(int(input1, 16) ^ int(input2, 16))[2:]
'746865206b696420646f6e277420706c6179'
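One caveat with the integer approach: hex() drops leading zeros, so the result can come out shorter than the inputs whenever the XOR starts with zero digits. A small sketch that pads the result back to the input width follows; the function name xor_hex is made up for this example, and it assumes both inputs have the same length.

```python
def xor_hex(hex_a, hex_b):
    """XOR two equal-length hex strings, keeping any leading zeros."""
    if len(hex_a) != len(hex_b):
        raise ValueError("inputs must have the same length")
    value = int(hex_a, 16) ^ int(hex_b, 16)
    # Zero-pad to the original number of hex digits.
    return format(value, '0{}x'.format(len(hex_a)))

input1 = "1c0111001f010100061a024b53535009181c"
input2 = "686974207468652062756c6c277320657965"
print(xor_hex(input1, input2))

# Leading zeros survive, unlike with hex(...)[2:]
print(xor_hex("00ff", "000f"))
```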
