Unpacking and packing back a struct consisting of single bytes - python
I am getting struct.error: bad char in struct format when packing bytes back in the struct even without making any changes to them.
I am trying to do bitwise operations on each byte in RGBTRIPLE of a 24-bit BMP image. For the sake of simplicity, I am posting the code with just one sample bytes sequence representing a pixel in a Bitmap; I don't make any bitwise operations on it, just try to pack it back.
from struct import *
from collections import namedtuple

def main():
    RGBTRIPLE = namedtuple('RGBTRIPLE', 'rgbtRed rgbtGreen rgbtBlue')
    rgbt_fmt = '=BBB'
    rgbt_size = calcsize(rgbt_fmt)
    rgbt_buffer = b'\x1c\x1e\x1f'
    rgbt = RGBTRIPLE._make(unpack(rgbt_fmt, rgbt_buffer))
    rgbtRed = rgbt.rgbtRed
    rgbtGreen = rgbt.rgbtGreen
    rgbtBlue = rgbt.rgbtBlue
    rgbt_buffer = pack('rgbt_fmt', rgbtRed, rgbtGreen, rgbtBlue)

if __name__ == "__main__":
    main()
From what I understand, the problem is that when I am unpacking bytes, I am getting ints with size > 1 byte. What is the best way to fix the size of those ints at 1 byte, so I can pack them back using the same =BBB struct format?
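Judging only from the code as posted, a likely cause is that the final call passes the quoted string 'rgbt_fmt' instead of the variable rgbt_fmt. The letter 'r' is not a struct format character, which is what "bad char in struct format" complains about. The B format unpacks each byte into a Python int in the range 0 to 255, and such ints pack back with B unchanged, so their size does not need fixing. A minimal sketch under that assumption (the channel inversion is a made-up example of a bitwise operation, not something from the question):

```python
from collections import namedtuple
from struct import pack, unpack, error

RGBTRIPLE = namedtuple('RGBTRIPLE', 'rgbtRed rgbtGreen rgbtBlue')
rgbt_fmt = '=BBB'
rgbt_buffer = b'\x1c\x1e\x1f'

rgbt = RGBTRIPLE._make(unpack(rgbt_fmt, rgbt_buffer))

# Passing the variable (not its quoted name) round-trips the bytes.
repacked = pack(rgbt_fmt, *rgbt)

# Hypothetical bitwise operation: invert each channel, then mask to 8 bits
# so that every value stays within the 0..255 range that 'B' accepts.
inverted = pack(rgbt_fmt, *(~c & 0xFF for c in rgbt))

# The quoted name reproduces the error from the question.
try:
    pack('rgbt_fmt', *rgbt)
    quoted_name_failed = False
except error:
    quoted_name_failed = True
```

The & 0xFF mask matters for operations such as ~ or <<, whose results can fall outside 0..255; packing such a value with B raises struct.error.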
Why does Python have different types of bytes?
I have two variables: one is b_d, the other is b_test_d. When I type b_d in the console, it shows:

b'\\\x8f\xc2\xf5(\\\xf3?Nb\x10X9\xb4\x07@\x00\x00\x00\x00\x00\x00\xf0?'

When I type b_test_d in the console, it shows:

b'[-2.1997713216,-1.4249271187,-1.1076795391,1.5224958034,-0.1709796203,0.3663875698,0.14846441,-0.7415930061,-1.7602231949,0.126605689,0.6010934792,-0.466415358,1.5675525816,1.00836295,1.4332792992,0.6113384254,-1.8008540571,-0.9443408896,1.0943670356,-1.0114642686,1.443892627,-0.2709427287,0.2990462512,0.4650133591,0.2560791327,0.2257600462,-2.4077429827,-0.0509983213,1.0062187148,0.4315075795,-0.6116110033,0.3495131413,-0.3249903375,0.3962305931,-0.1985757285,1.165792433,-1.1171953063,-0.1732557874,-0.3791600654,-0.2860519953,0.7872658859,0.217728374,-0.4715179983,-0.4539613811,-0.396353657,1.2326862425,-1.3548659354,1.6476230786,0.6312713442,-0.735444661,-0.6853447369,-0.8480631975,0.9538606574,0.6653542368,-0.2833696021,0.7281604648,-0.2843872095,0.1461980484,-2.3511731773,-0.3118047948,-1.6938613893,-0.0359659687,-0.5162134311,-2.2026641552,-0.7294895084,0.7493073213,0.1034096968,0.6439803068,-0.2596155272,0.5851323455,1.0173285542,-0.7370464113,1.0442954406,-0.5363832595,0.0117795359,0.2225617514,0.067571974,-0.9154681906,-0.293808596,1.3717113798,0.4919516922,-0.3254944005,1.6203744532,-0.1810222279,-0.6111596457,1.344064259,-0.4596893179,-0.2356197144,0.4529942046,1.6244603294,0.1849995925,0.6223061217,-0.0340662398,0.8365900535,-0.6804201929,0.0149665385,0.4132453788,0.7971962667,-1.9391525531,0.1440486871,-0.7103617816,0.9026539637,0.6665798363,-1.5885073458,1.4084493329,-1.397040825,1.6215697667,1.7057148522,0.3802647045,-0.4239271483,1.4773614536,1.6841461329,0.1166845529,-0.3268795898,-0.9612751672,0.4062399443,0.357209662,-0.2977362702,-0.3988147401,-0.1174652196,0.3350589818,-1.8800423584,0.0124169787,1.0015110265,0.789541751,-0.2710408983,1.4987300181,-1.1726824468,-0.355322591,0.6567978423,0.8319110558,0.8258835069,-1.1567887763,1.9568551122,1.5148655075,1.0589021915,-0.4388232953,-0.7451680183,-2.1897621693,0.4502135234,-1.9583089063,0.1358789518,-1.7585860897,0.452259777,0.7406800349,-1.3578980418,1.108740204,-1.1986272667,-1.0273598206,-1.8165822264,1.0853600894,-0.273943514,0.8589890805,1.3639094329,-0.6121993589,-0.0587067992,0.0798457584,1.0992814648,-1.0455733611,1.4780003064,0.5047157705,0.1565451605,0.9656886956,-0.5998330255,0.4846727299,0.8790524818,1.0288893846,-2.0842447397,0.4074607421,2.1523241756,-1.1268047125,-0.6016001524,-1.3302141561,1.1869516954,1.0988060125,0.7405900405,1.1813110811,0.8685330644,2.0927140519,-1.7171952009,0.9231993147,0.320874115,0.7465845079,-0.1034484959,-0.4776822499,0.436218328,-0.4083564542,0.4835567895,1.0733230373,-0.858658902,-0.4493571034,0.4506418221,1.6696649735,-0.9189799982,-1.1690356499,-1.0689397924,0.3174297583,1.0403701444,0.5440082812,-0.1128248996]'

Both of them are of type bytes, and I can use numpy.frombuffer to read b_d, but not b_test_d. And they look very different. Why do I have these two types of bytes? Thank you.
"[A]nyone can point out how to use Json marshall to convert the byte to the same type of bytes as the first one?"

This isn't quite the right question, but I think I know what you're asking. You say you're getting the second array via JSON marshalling, but that it's also not under your control:

"it was obtained by json marshal (convert a received float array to byte array, and then convert the result to base64 string, which is done by someone else)"

That's fine though; you just have to do a few steps of processing to get to a state equivalent to the first set of bytes.

First, some context on what's going on. You've already seen that numpy can understand your first set of bytes.

>>> numpy.frombuffer(data)
[1.21 2.963 1. ]

Based on its output, it looks like numpy is interpreting your data as 3 doubles of 8 bytes each (24 bytes total)...

>>> data = b'\\\x8f\xc2\xf5(\\\xf3?Nb\x10X9\xb4\x07@\x00\x00\x00\x00\x00\x00\xf0?'
>>> len(data)
24

...which the struct module can also interpret.

# Separate into 3 doubles
x, y, z = data[:8], data[8:16], data[16:]
print([struct.unpack('d', i) for i in (x, y, z)])
[(1.21,), (2.963,), (1.0,)]

There are actually (at least) 2 ways you can get a numpy array out of this.

Short way

1. Convert to string

# Original JSON data (snipped)
junk = b'[-2.1997713216,-1.4249271187,-1.1076795391,...]'
# Decode from bytes to a string (defaults to utf-8), then
# trim off the brackets (first and last characters in the string)
as_str = junk.decode()[1:-1]

2. Use numpy.fromstring

numpy.fromstring(as_str, dtype=float, sep=',')
# Produces:
array([-2.19977132, -1.42492712, -1.10767954, 1.5224958 , -0.17097962,
       0.36638757, 0.14846441, -0.74159301, -1.76022319, 0.12660569,
       0.60109348, -0.46641536, 1.56755258, 1.00836295, 1.4332793 ,
       0.61133843, -1.80085406, -0.94434089, 1.09436704, -1.01146427,
       1.44389263, -0.27094273, 0.29904625, 0.46501336, 0.25607913,
       0.22576005, -2.40774298, -0.05099832, 1.00621871, 0.43150758,
       ...
])

Long way

Note: I found the fromstring method after writing this part up, and figured I'd leave it here to at least help explain the byte differences.

1. Convert the JSON data into an array of numeric values.

# Original JSON data (snipped)
junk = b'[-2.1997713216,-1.4249271187,-1.1076795391,...]'
# Decode from bytes to a string - defaults to utf-8
junk = junk.decode()
# Trim off the brackets - first and last characters in the string
junk = junk[1:-1]
# Separate into values
junk = junk.split(',')
# Convert to numerical values
doubles = [float(val) for val in junk]
# Or, as a one-liner starting from the original bytes:
# doubles = [float(val) for val in junk.decode()[1:-1].split(',')]
# "doubles" currently holds:
[-2.1997713216, -1.4249271187, -1.1076795391, 1.5224958034, ...]

2. Use struct to get byte representations for the doubles

import struct
as_bytes = [struct.pack('d', val) for val in doubles]
# "as_bytes" currently holds:
[b'\x08\x9b\xe7\xb4!\x99\x01\xc0', b'\x0b\x00\xe0`\x80\xcc\xf6\xbf', b'+ ..\x0e\xb9\xf1\xbf', b'hg>\x8f$\\\xf8?', ...]

3. Join all the double values (as bytes) into a single byte string, then submit it to numpy

new_data = b''.join(as_bytes)
numpy.frombuffer(new_data)
# Produces:
array([-2.19977132, -1.42492712, -1.10767954, 1.5224958 , -0.17097962,
       0.36638757, 0.14846441, -0.74159301, -1.76022319, 0.12660569,
       0.60109348, -0.46641536, 1.56755258, 1.00836295, 1.4332793 ,
       0.61133843, -1.80085406, -0.94434089, 1.09436704, -1.01146427,
       1.44389263, -0.27094273, 0.29904625, 0.46501336, 0.25607913,
       0.22576005, -2.40774298, -0.05099832, 1.00621871, 0.43150758,
       ...
])
A bytes object can be in any format. It is "just a bunch of bytes" without context. For display, Python shows bytes that correspond to printable ASCII characters as those characters, and uses escape codes (mostly \x##) for the others.

The first looks like IEEE 754 double-precision floating point. numpy or struct can read it. The second one is in JSON format. Use the json module to read it:

import numpy as np
import json
import struct

b1 = b'\\\x8f\xc2\xf5(\\\xf3?Nb\x10X9\xb4\x07@\x00\x00\x00\x00\x00\x00\xf0?'
b2 = b'[-2.1997713216,-1.4249271187,-1.1076795391,1.5224958034]'

j = json.loads(b2)
n = np.frombuffer(b1)
s = struct.unpack('3d', b1)
print(j, n, s, sep='\n')

# To convert b2 into the b1 format
b = struct.pack('4d', *j)
print(b)

Output:

[-2.1997713216, -1.4249271187, -1.1076795391, 1.5224958034]
[1.21 2.963 1. ]
(1.21, 2.963, 1.0)
b'\x08\x9b\xe7\xb4!\x99\x01\xc0\x0b\x00\xe0`\x80\xcc\xf6\xbf+ ..\x0e\xb9\xf1\xbfhg>\x8f$\\\xf8?'
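The '4d' format above is tied to a list of exactly four values. As a rough sketch, the same conversion can be written for a JSON array of any length using only the standard library. The explicit '<' (little-endian) prefix is an assumption chosen to match the byte order seen in b_d, and the names values, packed and roundtrip are illustrative:

```python
import json
import struct

# Shortened stand-in for the JSON bytes from the question
b2 = b'[-2.1997713216,-1.4249271187,-1.1076795391,1.5224958034]'

# json.loads accepts bytes directly (Python 3.6+) and returns a list of floats
values = json.loads(b2)

# Build the format from the list length: '<4d' here, '<200d' for 200 values
fmt = '<%dd' % len(values)
packed = struct.pack(fmt, *values)

# Unpacking the raw doubles gives the same floats back
roundtrip = list(struct.unpack(fmt, packed))
```

Each double occupies 8 bytes, so packed is 8 * len(values) bytes long and can be passed to numpy.frombuffer in the same way as b_d.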
Why can't Python struct module pack (or unpack) multi bytes with little endian
I'm dealing with some multi-byte issues. For example, I have a variable a = b'\x00\x01\x02\x03'; it is a bytes object rather than an int. I'd like to struct.pack it into a little-endian packet, but <4s didn't work. In fact, <4s and >4s give the same result. What should I do if I'd like the result to be b'\x03\x02\x01\x00'? I know I could use struct.pack('<L', *struct.unpack('>L', a)), but is that the only correct way to deal with multi-byte objects?

Example:

import struct
import secrets

mhdr = b'\x20'
joineui = b'\x00\x01\x02\x03\x04\x05\x06\x07'
deveui = b'\x08\x09\x10\x11\x12\x13\x14\x15'
devnonce = secrets.token_bytes(2)
joinreq = struct.pack(
    '<s8s8s2s',
    mhdr, joineui, deveui, devnonce,
)
# The expected joinreq should be b'\x20\x07\x06\x05\x04\x03\x02\x01\x00\x15\x14\x13\x12\x11\x10\x09\x08...'
It seems to me you do not want 4 single chars, but 1 integer instead. So instead of '4s' you should try 'i' or 'I' (depending on whether it is signed or unsigned). The integer formats need int arguments, so the bytes objects have to be converted first, for example with int.from_bytes. Your example should look like:

import struct
import secrets

mhdr = b'\x20'
joineui = b'\x00\x01\x02\x03\x04\x05\x06\x07'
deveui = b'\x08\x09\x10\x11\x12\x13\x14\x15'
devnonce = secrets.token_bytes(2)
joinreq = struct.pack(
    '<BQQH',  # use lowercase letters if the values are signed instead of unsigned
    mhdr[0],
    int.from_bytes(joineui, 'big'),
    int.from_bytes(deveui, 'big'),
    int.from_bytes(devnonce, 'big'),
)

"Q" stands for unsigned long long (8 bytes). If you want a float instead, you can use d for double precision (8 bytes). You can see the meaning of all the letters in the documentation of struct.
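The reason <4s and >4s agree is that the s format copies the bytes verbatim; the byte-order prefix only affects numeric formats. If the fields are already bytes objects, a possible alternative is to reverse each one with slicing and concatenate. The sketch below assumes every multi-byte field should be reversed, including devnonce (the question leaves that part elided), and uses a fixed stand-in value instead of secrets.token_bytes(2) so the result is reproducible:

```python
import struct

mhdr = b'\x20'
joineui = b'\x00\x01\x02\x03\x04\x05\x06\x07'
deveui = b'\x08\x09\x10\x11\x12\x13\x14\x15'
devnonce = b'\xaa\xbb'  # fixed stand-in for secrets.token_bytes(2)

# 's' fields are copied as-is, so '<' and '>' give identical output
a = b'\x00\x01\x02\x03'
same = struct.pack('<4s', a) == struct.pack('>4s', a)

# Reverse each multi-byte field explicitly, then concatenate
joinreq = mhdr + joineui[::-1] + deveui[::-1] + devnonce[::-1]
```

This avoids the round trip through integers and works for fields of any length, not only the sizes that have a struct integer format.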
Packing an integer number to 3 bytes in Python
With background knowledge of C I want to serialize an integer number to 3 bytes. I searched a lot and found out I should use struct packing. I want something like this: number = 1195855 buffer = struct.pack("format_string", number) Now I expect buffer to be something like ['\x12' '\x3F' '\x4F']. Is it also possible to set endianness?
It is possible, using either > or < in your format string:

import struct
number = 1195855

def print_buffer(buffer):
    print(''.join(["%02x" % ord(b) for b in buffer]))  # Python 2
    #print(buffer.hex())  # Python 3

# Little Endian
buffer = struct.pack("<L", number)
print_buffer(buffer)  # 4f3f1200

# Big Endian
buffer = struct.pack(">L", number)
print_buffer(buffer)  # 00123f4f

See the struct documentation for Python 2.x or 3.x.

Note, however, that you're going to have to figure out how you want to get rid of the empty byte in the buffer, since L will give you 4 bytes and you only want 3. Something like:

# Little Endian
buffer = struct.pack("<L", number)
print_buffer(buffer[:3])  # 4f3f12

# Big Endian
buffer = struct.pack(">L", number)
print_buffer(buffer[-3:])  # 123f4f

would be one way.
Another way is to manually pack the bytes:

>>> import struct
>>> number = 1195855
>>> data = struct.pack('BBB',
...     (number >> 16) & 0xff,
...     (number >> 8) & 0xff,
...     number & 0xff,
... )
>>> data
b'\x12?O'
>>> list(data)
[18, 63, 79]

For just the 3 bytes, it's a bit redundant, since the last 3 arguments of struct.pack are already the data. But this worked well in my case because I had header and footer bytes surrounding the unsigned 24-bit integer. Whether this method or slicing is more elegant depends on your application. I found this was cleaner for my project.
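In Python 3, a third option that avoids both slicing and manual shifting is the built-in int.to_bytes method, which takes the byte count and the byte order directly. A short sketch (the variable names are illustrative):

```python
number = 1195855  # 0x123F4F

# Serialize to exactly 3 bytes in either byte order
big = number.to_bytes(3, 'big')
little = number.to_bytes(3, 'little')

# int.from_bytes is the inverse operation
restored = int.from_bytes(big, 'big')
```

to_bytes raises OverflowError if the number does not fit in the requested length, so a value above 0xFFFFFF is rejected instead of being silently truncated.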
How to properly decode .wav with Python
I am coding a basic frequency analysis of WAVE audio files, but I have trouble when it comes to converting WAVE frames to integers. Here is the relevant part of my code:

import wave

track = wave.open('/some_path/my_audio.wav', 'r')
byt_depth = track.getsampwidth()  # Byte depth of the file in BYTES
frame_rate = track.getframerate()
buf_size = 512

def byt_sum(word):
    # convert a string of n bytes into an int in [0; 256**n - 1]
    return sum((256**k) * word[k] for k in range(len(word)))

raw_buf = track.readframes(buf_size)
'''
One frame is a string of n bytes, where n = byt_depth.
For instance, with a 24-bit-encoded file, track.readframes(1) could be: b'\xff\xfe\xfe'.
raw_buf[n] returns an int in [0;255]
'''
sample_buf = [byt_sum(raw_buf[byt_depth*k:byt_depth*(k+1)]) - 2**(8*byt_depth-1)
              for k in range(buf_size)]

The problem is: when I plot sample_buf for a single sine signal, I get an alternating, mangled sine signal. I can't figure out why the signal overlaps upside-down. Any idea?

P.S.: Since I'm French, my English is quite hesitant. Feel free to edit if there are ugly mistakes.
It might be a signed-versus-unsigned mix-up in how the 16-bit samples are represented. See https://en.wikipedia.org/wiki/Pulse-code_modulation. Try adding 32767 to each sample, or decoding the samples as signed values instead. Also, you should use the Python struct module to decode the buffer.

import struct
buf_size = 512
# 'H' is for unsigned 16-bit integers; try 'h' (signed) also.
# This assumes raw_buf holds exactly buf_size mono 16-bit samples.
sample_buf = struct.unpack('H' * buf_size, raw_buf)
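For context, PCM samples in WAV files that are wider than 8 bits are stored as signed little-endian two's-complement integers. Reading them as unsigned and then subtracting half the range swaps the positive and negative halves of the waveform, which would match the symptom in the question. struct has no 3-byte format, so for 24-bit data one possible approach is int.from_bytes with signed=True. The helper decode_pcm below is a hypothetical name, and the sketch assumes a mono stream and a sample width of at least 2 bytes (8-bit WAV samples are unsigned):

```python
def decode_pcm(raw, width):
    """Decode little-endian signed PCM samples of `width` bytes each."""
    return [
        int.from_bytes(raw[i:i + width], 'little', signed=True)
        for i in range(0, len(raw) - width + 1, width)
    ]

# Three hand-made 24-bit samples: 1, -1, and the most negative value
raw_buf = b'\x01\x00\x00' + b'\xff\xff\xff' + b'\x00\x00\x80'
samples = decode_pcm(raw_buf, 3)

# The same helper handles 16-bit data
samples_16 = decode_pcm(b'\xff\x7f\x00\x80', 2)
```

With a real file, raw would come from track.readframes(...) and width from track.getsampwidth(). For stereo files the samples of the two channels are interleaved, so they would need to be separated afterwards.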
The easiest way is to use a library that does the decoding for you. There are several Python libraries available, my favorite is the soundfile module: import soundfile as sf signal, samplerate = sf.read('/some_path/my_audio.wav')
Python 2.7.6 Optimizing code for packing big endian bytes into a string
import struct

varA['Z']['value'] = 8700
varA['Y']['value'] = 8800
varA['X']['value'] = 8900
varA['W']['value'] = 8800
varA['V']['value'] = 8700
varB = ""
varC = ""
for name in 'Z Y X W V'.split(' '):
    varB = varA[name]['value']
    varC += str(struct.pack('>h', varB))
print varC[:-1] + '\n'

What I need is a string of bytes, where each number is a signed big-endian int16. This code works for what I'm trying to do, but I know there's a far more elegant solution. I wouldn't spend any time on optimizing varA, as it's only there to set up the code and won't be used in my project. The print is also only there to set up the problem; I'm actually sending the bytes over a socket. Initially I had this in an array, but when I converted the array to a bytearray, I kept getting 0x00 bytes mixed in. The same happened with struct, as you can see from my solution removing the 0x00 at the end.
Here's a simpler way to do it. It's unclear from the question whether this is exactly the result you desire.

values = [varA[name]['value'] for name in 'ZYXWV']
varC = struct.pack('>' + str(len(values)) + 'h', *values)
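To make the result concrete, here is a self-contained sketch of the same idea. The varA dictionary is a hypothetical stand-in, since the question never shows how varA is built:

```python
import struct

# Hypothetical stand-in for the question's varA structure
varA = {name: {'value': v}
        for name, v in zip('ZYXWV', (8700, 8800, 8900, 8800, 8700))}

values = [varA[name]['value'] for name in 'ZYXWV']

# One call packs all values as big-endian signed 16-bit integers ('>5h')
varC = struct.pack('>%dh' % len(values), *values)

# Unpacking with the same format recovers the original numbers
unpacked = struct.unpack('>%dh' % len(values), varC)
```

Each value takes exactly 2 bytes, so five values give 10 bytes with no padding or terminator to strip. For example, 8700 is 0x21FC and is stored as the two bytes 0x21, 0xFC.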