I have two variables, one is b_d, the other is b_test_d.
When I type b_d in the console, it shows:
b'\\\x8f\xc2\xf5(\\\xf3?Nb\x10X9\xb4\x07#\x00\x00\x00\x00\x00\x00\xf0?'
when I type b_test_d in the console, it shows:
b'[-2.1997713216,-1.4249271187,-1.1076795391,1.5224958034,-0.1709796203,0.3663875698,0.14846441,-0.7415930061,-1.7602231949,0.126605689,0.6010934792,-0.466415358,1.5675525816,1.00836295,1.4332792992,0.6113384254,-1.8008540571,-0.9443408896,1.0943670356,-1.0114642686,1.443892627,-0.2709427287,0.2990462512,0.4650133591,0.2560791327,0.2257600462,-2.4077429827,-0.0509983213,1.0062187148,0.4315075795,-0.6116110033,0.3495131413,-0.3249903375,0.3962305931,-0.1985757285,1.165792433,-1.1171953063,-0.1732557874,-0.3791600654,-0.2860519953,0.7872658859,0.217728374,-0.4715179983,-0.4539613811,-0.396353657,1.2326862425,-1.3548659354,1.6476230786,0.6312713442,-0.735444661,-0.6853447369,-0.8480631975,0.9538606574,0.6653542368,-0.2833696021,0.7281604648,-0.2843872095,0.1461980484,-2.3511731773,-0.3118047948,-1.6938613893,-0.0359659687,-0.5162134311,-2.2026641552,-0.7294895084,0.7493073213,0.1034096968,0.6439803068,-0.2596155272,0.5851323455,1.0173285542,-0.7370464113,1.0442954406,-0.5363832595,0.0117795359,0.2225617514,0.067571974,-0.9154681906,-0.293808596,1.3717113798,0.4919516922,-0.3254944005,1.6203744532,-0.1810222279,-0.6111596457,1.344064259,-0.4596893179,-0.2356197144,0.4529942046,1.6244603294,0.1849995925,0.6223061217,-0.0340662398,0.8365900535,-0.6804201929,0.0149665385,0.4132453788,0.7971962667,-1.9391525531,0.1440486871,-0.7103617816,0.9026539637,0.6665798363,-1.5885073458,1.4084493329,-1.397040825,1.6215697667,1.7057148522,0.3802647045,-0.4239271483,1.4773614536,1.6841461329,0.1166845529,-0.3268795898,-0.9612751672,0.4062399443,0.357209662,-0.2977362702,-0.3988147401,-0.1174652196,0.3350589818,-1.8800423584,0.0124169787,1.0015110265,0.789541751,-0.2710408983,1.4987300181,-1.1726824468,-0.355322591,0.6567978423,0.8319110558,0.8258835069,-1.1567887763,1.9568551122,1.5148655075,1.0589021915,-0.4388232953,-0.7451680183,-2.1897621693,0.4502135234,-1.9583089063,0.1358789518,-1.7585860897,0.452259777,0.7406800349,-1.3578980418,1.108740204,-1.1986272667,-1.0273598206,-1.8165822264,1.0853600894,-0.273943514,0.8589890805,1.3639094329,-0.6121993589,-0.0587067992,0.0798457584,1.0992814648,-1.0455733611,1.4780003064,0.5047157705,0.1565451605,0.9656886956,-0.5998330255,0.4846727299,0.8790524818,1.0288893846,-2.0842447397,0.4074607421,2.1523241756,-1.1268047125,-0.6016001524,-1.3302141561,1.1869516954,1.0988060125,0.7405900405,1.1813110811,0.8685330644,2.0927140519,-1.7171952009,0.9231993147,0.320874115,0.7465845079,-0.1034484959,-0.4776822499,0.436218328,-0.4083564542,0.4835567895,1.0733230373,-0.858658902,-0.4493571034,0.4506418221,1.6696649735,-0.9189799982,-1.1690356499,-1.0689397924,0.3174297583,1.0403701444,0.5440082812,-0.1128248996]'
Both of them are bytes type, but I can use numpy.frombuffer to read the b_d, but not the b_test_d. And they look very different. Why do I have these two types of bytes?
Thank you.
[A]nyone can point out how to use Json marshall to convert the byte to the same type of bytes as the first one?
This isn't the right question, but I think I know what you're asking. You say you're getting the 2nd array via JSON marshalling, but that it's also not under your control:
it was obtained by json marshal (convert a received float array to byte array, and then convert the result to base64 string, which is done by someone else)
That's fine though, you just have to do a few steps of processing to get to a state equivalent to the first set of bytes.
First, some context to what's going on. You've already seen that numpy can understand your first set of bytes.
>>> numpy.frombuffer(data)
[1.21 2.963 1. ]
Based on its output, it looks like numpy is interpreting your data as 3 doubles, with 8 bytes each (24 bytes total)...
>>> data = b'\\\x8f\xc2\xf5(\\\xf3?Nb\x10X9\xb4\x07#\x00\x00\x00\x00\x00\x00\xf0?'
>>> len(data)
24
...which the struct module can also interpret.
# Separate into 3 doubles
x, y, z = data[:8], data[8:16], data[16:]
print([struct.unpack('d', i) for i in (x, y, z)])
[(1.21,), (2.963,), (1.0,)
There's actually (at least) 2 ways you can get a numpy array out of this.
Short way
1. Convert to string
# Original JSON data (snipped)
junk = b'[-2.1997713216,-1.4249271187,-1.1076795391,...]'
# Decode from bytes to a string (defaults to utf-8), then
# trim off the brackets (first and last characters in the string)
as_str = junk.decode()[1:-1]
2. Use numpy.fromstring
numpy.fromstring(as_str, dtype=float, sep=',')
# Produces:
array([-2.19977132, -1.42492712, -1.10767954, 1.5224958 , -0.17097962,
0.36638757, 0.14846441, -0.74159301, -1.76022319, 0.12660569,
0.60109348, -0.46641536, 1.56755258, 1.00836295, 1.4332793 ,
0.61133843, -1.80085406, -0.94434089, 1.09436704, -1.01146427,
1.44389263, -0.27094273, 0.29904625, 0.46501336, 0.25607913,
0.22576005, -2.40774298, -0.05099832, 1.00621871, 0.43150758,
... ])
Long way
Note: I found the fromstring method after writing this part up, figured I'd leave it here to at least help explain the byte differences.
1. Convert the JSON data into an array of numeric values.
# Original JSON data (snipped)
junk = b'[-2.1997713216,-1.4249271187,-1.1076795391,...]'
# Decode from bytes to a string - defaults to utf-8
junk = junk.decode()
# Trim off the brackets - First and last characters in the string
junk = junk[1:-1]
# Separate into values
junk = junk.split(',')
# Convert to numerical values
doubles = [float(val) for val in junk]
# Or, as a one-liner
doubles = [float(val) for val in junk.decode()[1:-1].split(',')]
# "doubles" currently holds:
[-2.1997713216,
-1.4249271187,
-1.1076795391,
1.5224958034,
...]
2. Use struct to get byte-representations for the doubles
import struct
as_bytes = [struct.pack('d', val) for val in doubles]
# "as_bytes" currently holds:
[b'\x08\x9b\xe7\xb4!\x99\x01\xc0',
b'\x0b\x00\xe0`\x80\xcc\xf6\xbf',
b'+ ..\x0e\xb9\xf1\xbf',
b'hg>\x8f$\\\xf8?',
...]
3. Join all the double values (as bytes) into a single byte-string, then submit to numpy
new_data = b''.join(as_bytes)
numpy.frombuffer(new_data)
# Produces:
array([-2.19977132, -1.42492712, -1.10767954, 1.5224958 , -0.17097962,
0.36638757, 0.14846441, -0.74159301, -1.76022319, 0.12660569,
0.60109348, -0.46641536, 1.56755258, 1.00836295, 1.4332793 ,
0.61133843, -1.80085406, -0.94434089, 1.09436704, -1.01146427,
1.44389263, -0.27094273, 0.29904625, 0.46501336, 0.25607913,
0.22576005, -2.40774298, -0.05099832, 1.00621871, 0.43150758,
... ])
A bytes object can be in any format. It is "just a bunch of bytes" without context. For display Python will represent byte values <128 as their ASCII value, and use hex escape codes (\x##) for others.
The first looks like IEEE 754 double precision floating point. numpy or struct can read it. The second one is in JSON format. Use the json module to read it:
import numpy as np
import json
import struct
b1 = b'\\\x8f\xc2\xf5(\\\xf3?Nb\x10X9\xb4\x07#\x00\x00\x00\x00\x00\x00\xf0?'
b2 = b'[-2.1997713216,-1.4249271187,-1.1076795391,1.5224958034]'
j = json.loads(b2)
n = np.frombuffer(b1)
s = struct.unpack('3d',b1)
print(j,n,s,sep='\n')
# To convert b2 into a b1 format
b = struct.pack('4d',*j)
print(b)
Output:
[-2.1997713216, -1.4249271187, -1.1076795391, 1.5224958034]
[1.21 2.963 1. ]
(1.21, 2.963, 1.0)
b'\x08\x9b\xe7\xb4!\x99\x01\xc0\x0b\x00\xe0`\x80\xcc\xf6\xbf+ ..\x0e\xb9\xf1\xbfhg>\x8f$\\\xf8?'
I am given this number 427021005928, which i am supposed to change into a base64 encoded string and then decode the base64 string to get a plain text.
This decimal value 427021005928 when converted to binary gives 110001101101100011011110111010001101000 which corresponds to 'Y2xvdGg=', which is what i want. Got the conversion from (https://cryptii.com/pipes/binary-to-base64)
And then finally i decode 'Y2xvdGg=' to get the text cloth.
My problem is i do not have any idea how to use Python to get from either the decimal or binary value to get 'Y2xvdGg='
Some help would be appreciated!
NOTE: I only have this value 427021005928 at the start. I need to get the base64 and plaintext answers.
One elegant way would be using [Python 3]: struct - Interpret bytes as packed binary data, but given the fact that Python numbers are not fixed size, some additional computation would be required (for example, the number is 5 bytes long).
Apparently, the online converter, applied the base64 encoding on the number's memory representation, which can be obtained via [Python 3]: int.to_bytes(length, byteorder, *, signed=False)(endianness is important, and in this case it's big):
For the backwards process, reversed steps are required. There are 2 alternatives:
Things being done manually (this could also be applied to the "forward" process)
Using int.from_bytes
>>> import base64
>>>
>>> number = 427021005928
>>>
>>> number_bytes = number.to_bytes((number.bit_length() + 7) // 8, byteorder="big") # Here's where the magic happens
>>> number_bytes, number_bytes.decode()
(b'cloth', 'cloth')
>>>
>>> encoded = base64.b64encode(number_bytes)
>>> encoded, encoded.decode() # Don't let yourself tricked by the variable and method names resemblance
(b'Y2xvdGg=', 'Y2xvdGg=')
>>>
>>> # Now, getting the number back
...
>>> decoded = base64.b64decode(encoded)
>>> decoded
b'cloth'
>>>
>>> final_number0 = sum((item * 256 ** idx for idx, item in enumerate(reversed(decoded))))
>>> final_number0
427021005928
>>> number == final_number0
True
>>>
>>> # OR using from_bytes
...
>>> final_number1 = int.from_bytes(decoded, byteorder="big")
>>> final_number1
427021005928
>>> final_number1 == number
True
For more details on bitwise operations, check [SO]: Output of crc32b in PHP is not equal to Python (#CristiFati's answer).
Try this (https://docs.python.org/3/library/stdtypes.html#int.to_bytes)
>>> import base64
>>> x=427021005928
>>> y=x.to_bytes(5,byteorder='big').decode('utf-8')
>>> base64.b64encode(y.encode()).decode()
'Y2xvdGg='
>>> y
'cloth'
try
number = 427021005928
encode = base64.b64encode(bytes(number))
decode = base64.b64decode(encodeNumber)
The function below converts an unsigned 64 bit integer into base64 representation, and back again. This is particularly helpful for encoding database keys.
We first encode the integer into a byte array using little endian, and automatically remove any extra leading zeros. Then convert to base64, removing the unnecessary = sign. Note the flag url_safe which makes the solution non-base64 compliant, but works better with URLs.
def int_to_chars(number, url_safe = True):
'''
Convert an integer to base64. Used to turn IDs into short URL slugs.
:param number:
:param url_safe: base64 may contain "/" and "+", which do not play well
with URLS. Set to True to convert "/" to "-" and "+" to
"_". This no longer conforms to base64, but looks better
in URLS.
:return:
'''
if number < 0:
raise Exception("Cannot convert negative IDs.")
# Encode the long, long as little endian.
packed = struct.pack("<Q", number)
# Remove leading zeros
while len(packed) > 1 and packed[-1] == b'\x00':
packed = packed[:-1]
encoded = base64.b64encode(packed).split(b"=")[0]
if url_safe:
encoded = encoded.replace(b"/", b"-").replace(b"+", b".")
return encoded
def chars_to_int(chars):
'''Reverse of the above function. Will work regardless of whether
url_safe was set to True or False.'''
# Make sure the data is in binary type.
if isinstance(chars, six.string_types):
chars = chars.encode('utf8')
# Do the reverse of the url_safe conversion above.
chars = chars.replace(b"-", b"/").replace(b".", b"+")
# First decode the base64, adding the required "=" padding.
b64_pad_len = 4 - len(chars) % 4
decoded = base64.b64decode(chars + b"="*b64_pad_len)
# Now decode little endian with "0" padding, which are leading zeros.
int64_pad_len = 8 - len(decoded)
return struct.unpack("<Q", decoded + b'\x00' * int64_pad_len)[0]
You can do following conversions by using python
First of all import base64 by using following syntax
>>> import base64
For converting text to base64 do following
encoding
>>> base64.b64encode("cloth".encode()).decode()
'Y2xvdGg='
decoding
>>> base64.b64decode("Y2xvdGg=".encode()).decode()
'cloth'