Serial Port data - python

I am attempting to read the data from an absolute encoder with a USB interface using pyserial on a Raspberry Pi. The datasheet for the encoder is below; the USB interface is described on pages 22-23.
I have successfully connected to the Encoder and I am able to send commands using
port = serial.Serial("/dev/serial/by-id/usb-RLS_Merilna_tehnkis_AksIM_encoder_3454353-if00")
port.write(x)
where x is any of the available commands listed for the USB interface.
For example port.write(b"1") is meant to initiate a single position request. I am able to print the output from encoder with
x =
The problem is converting the output into actual position data. port.write(b"1") outputs the following data:
I know that the first and last bytes are just the header and footer. Bytes 5 and 6 are the encoder status. Bytes 2-4 are the actual position data. Customer support has informed me that I need to take bytes 2 to 4, shift them into a 32-bit unsigned integer (into the lower 3 bytes), convert that to a floating point number, divide by 0xFFFFFF, and multiply by 360. The result is in degrees.
I'm not exactly sure how to do this. Can someone please tell me which Python functions I need to use or write in order to do this? Thank you.

You can use the built-in int.from_bytes() method:
x = b'\xea\xd0\x05\x00\x00\x00\xef'
number = 360 * float(
    int.from_bytes(x[1:4], 'big') # get integer from bytes
) / 0xffffff
print(number)
will print approximately 292.53.

This is the way to extract the bytes and shift them into an integer and scale as a float:
x = b'\xea\xd0\x05\x00\x00\x00\xef'
int_value = 0 # initialise shift register
for index in range(1,4):
    int_value *= 256 # shift up by 8 bits
    int_value += x[index] # or in the next byte
# scale the integer against the max value
float_value = 360 * float(int_value) / 0xffffff
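As a minimal sketch, the same conversion can be wrapped in a function. It assumes the 7-byte frame layout described in the question (header, three position bytes, two status bytes, footer) and the big-endian byte order used in the answers above. The name `frame_to_degrees` is hypothetical and not part of pyserial or the encoder's documentation.

```python
def frame_to_degrees(frame):
    """Convert one 7-byte encoder frame to an angle in degrees.

    Assumes bytes 2-4 (1-based) hold the position, most significant
    byte first; check the datasheet for the actual byte order.
    """
    if len(frame) != 7:
        raise ValueError("expected a 7-byte frame, got %d bytes" % len(frame))
    raw = int.from_bytes(frame[1:4], "big")  # 24-bit unsigned position
    return 360.0 * raw / 0xFFFFFF            # scale full range to 360 degrees


angle = frame_to_degrees(b"\xea\xd0\x05\x00\x00\x00\xef")
print(round(angle, 2))
```

Checking the frame length first helps catch a short read from the port before it turns into a silently wrong angle.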


Encode data to HEX and get an L at the end in Python 2.7. Why?

I ask a measurement device to give me some data. First it tells me how many bytes of data are in storage; it is always 14. Then it gives me the data, which I have to encode as hex. I am on Python 2.7 and cannot use a newer version. Lines 6 to 10 tell the device to send me the measured data.
Lines 12 to 14 do the hex encoding. In other programs this works, but when I print result (line 14) I get a hex number with 13 bytes plus one extra character, which cannot be correct because it has an L at the end. I guess it marks a long or something similar. I don't need that last character, but I suspect it also changes the data that is sliced out from line 15 onward, first as hex and then converted to int.
Is it possible that the L has an effect on the data or not?
How can I fix it?
1 ap.write(b"ML\0")
rmemb =
rmemb = int(rmemb)+1
5 rmem = rmemb #must be and is 14 Bytes
addmem = ("MR:%s\0" % rmem)
# addmem = ("MR:14\0")
10 time.sleep(1)
test =
result = hex(int(test.encode('hex'), 16))
15 ftflash = result[12:20]
ftbg = result[20:28]
ftflash = int(ftflash, 16)
20 # print(ftflash)
ftbg = int(ftbg, 16)
# print(ftbg)
Python 2 has two built-in integer types, int and long. hex returns a string representing a Python hexadecimal literal, and in Python 2, that means that longs get an L at the end, to signify that it's a long.
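A minimal sketch of one way to sidestep the suffix, assuming the reply is a 14-byte string: slice the hex digits produced by `binascii.hexlify` instead of going through `hex(int(...))`. The `L` is only a marker on the printed literal and does not alter the digits before it, but `hex()` also prepends `0x` and drops leading zeros, so fixed slice positions are fragile. The `sample` bytes and the name `to_hex_digits` are made up for illustration, and the slice offsets are the question's offsets shifted by the two characters of `0x`.

```python
import binascii


def to_hex_digits(data):
    """Return two hex digits per byte, with no '0x' prefix and no 'L' suffix."""
    return binascii.hexlify(data).decode("ascii")


# Hypothetical 14-byte reply standing in for the device data
sample = bytes(bytearray(range(0x10, 0x1E)))
digits = to_hex_digits(sample)

# result[12:20] and result[20:28] in the question include the leading '0x',
# so the same fields sit two characters earlier in the bare digit string.
ftflash = int(digits[10:18], 16)
ftbg = int(digits[18:26], 16)
print(ftflash, ftbg)
```

Because `hexlify` always emits exactly two digits per byte, the offsets stay valid even when the first byte is zero.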

Unpack IEEE 754 Floating Point Number

I am reading two 16 bit registers from a tcp client using the pymodbus module. The two registers make up a 32 bit IEEE 754 encoded floating point number. Currently I have the 32 bit binary value of the registers shown in the code below.
start_address = 0x1112
reg_count = 2
client = ModbusTcpClient(<IP_ADDRESS>)
response = client.read_input_registers(start_address,reg_count)
reg_1 = response.getRegister(0)<<(16 - (response.getRegister(0).bit_length())) #Get in 16 bit format
reg_2 = response.getRegister(1)<<(16 - (response.getRegister(1).bit_length())) #Get in 16 bit format
volts = (reg_1 << 16) | reg_2 #Get the 32 bit format
The above works fine to get the encoded value; the problem is decoding it. I was going to code something like in this video, but I came across the 'f' format in the struct module for IEEE 754 encoding. I tried to decode the 32-bit float stored in volts in the code above using the unpack method in the struct module, but ran into the following errors.
val = struct.unpack('f',volts)
>>> TypeError: a bytes-like object is required, not 'int'
OK, I tried converting it to a 32-bit binary string.
temp = bin(volts)
val = struct.unpack('f',temp)
>>> TypeError: a bytes-like object is required, not 'str'
I tried to convert it to a bytes-like object as in this post, and to format it in different ways.
val = struct.unpack('f',bytes(volts))
>>> TypeError: string argument without an encoding
temp = "{0:b}".format(volts)
val = struct.unpack('f',temp)
>>> ValueError: Unknown format code 'b' for object of type 'str'
val = struct.unpack('f',volts.encode())
>>> struct.error: unpack requires a buffer of 4 bytes
Where do I add this buffer and where in the documentation does it say I need this buffer with the unpack method? It does say in the documentation
The string must contain exactly the amount of data required by the format (len(string) must equal calcsize(fmt)).
The calcsize(fmt) function returns a value in bytes but the len(string) returns a value of the length of the string, no?
Any suggestions are welcome.
A solution for decoding is given further below. First, here is a better way to obtain the 32-bit value from the two 16-bit registers than the one in the question:
start_address = 0x1112
reg_count = 2
client = ModbusTcpClient(<IP_ADDRESS>)
response = client.read_input_registers(start_address,reg_count)
reg_1 = response.getRegister(0)
reg_2 = response.getRegister(1)
# Shift reg 1 by 16 bits
reg_1s = reg_1 << 16
# OR with the reg_2
total = reg_1s | reg_2
I found a solution to the problem using BinaryPayloadDecoder.fromRegisters() from the pymodbus module instead of the struct module. Note that this solution is specific to the Modbus smart meter I am using, as the byte order and word order of the registers may differ on other devices. It may still work on other devices, but I would advise reading the device's documentation first to be sure. I left the comments in the code below; where they refer to page 24, that is the documentation for my device.
from pymodbus.client.sync import ModbusTcpClient
from pymodbus.constants import Endian
from pymodbus.payload import BinaryPayloadDecoder
start_address = 0x1112
reg_count = 2
client = ModbusTcpClient(<IP_ADDRESS>)
response = client.read_input_registers(start_address,reg_count)
# The response will contain two registers making a 32 bit floating point number
# Use the BinaryPayloadDecoder.fromRegisters() function to decode
# The coding scheme for a 32 bit float is IEEE 754
# The MS Bytes are stored in the first address and the LS bytes are stored in the second address,
# this corresponds to a big endian byte order (Second parameter in function)
# The documentation for the Modbus registers for the smart meter on page 24 says that
# the low word is the first priority, this correspond to a little endian word order (Third parameter in function)
decoder = BinaryPayloadDecoder.fromRegisters(response.registers, Endian.Big, wordorder=Endian.Little)
final_val = (decoder.decode_32bit_float())
Credit to juanpa-arrivillaga and chepner: the problem can also be solved with the struct module by using byteorder='little'. The two functions below combine the registers for either a big-endian or a little-endian word order, depending on the device.
import struct
from pymodbus.client.sync import ModbusTcpClient
def big_endian(response):
    reg_1 = response.getRegister(0)
    reg_2 = response.getRegister(1)
    # Shift reg 1 by 16 bits
    reg_1s = reg_1 << 16
    # OR with the reg_2
    total = reg_1s | reg_2
    return total
def little_endian(response):
    reg_1 = response.getRegister(0)
    reg_2 = response.getRegister(1)
    # Shift reg 2 by 16 bits
    reg_2s = reg_2 << 16
    # OR with the reg_1
    total = reg_2s | reg_1
    return total
start_address = 0x1112
reg_count = 2
client = ModbusTcpClient(<IP_ADDRESS>)
response = client.read_input_registers(start_address,reg_count)
# Little
little = little_endian(response)
lit_byte = little.to_bytes(4,byteorder='little')
# Big
big = big_endian(response)
big_byte = big.to_bytes(4,byteorder='big')
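To finish the struct-based route, the four bytes still have to go through `struct.unpack`, which needs a bytes object whose length equals `struct.calcsize` of the format (4 for a single float). The following is a minimal sketch that skips the intermediate integer and packs the two register values directly. The function name `registers_to_float` and its `word_order` parameter are hypothetical, and the register values are plain integers rather than a live pymodbus response.

```python
import struct


def registers_to_float(first, second, word_order="big"):
    """Combine two 16-bit Modbus registers into one IEEE 754 float.

    word_order="big" means the first register holds the high word;
    word_order="little" means the first register holds the low word.
    """
    if word_order == "little":
        first, second = second, first
    raw = struct.pack(">HH", first, second)  # 4 bytes, high word first
    return struct.unpack(">f", raw)[0]       # unpack returns a 1-tuple


print(registers_to_float(0x3F80, 0x0000))            # 1.0
print(registers_to_float(0x0000, 0x41C8, "little"))  # 25.0
```

Which word order applies depends on the device, so it is worth checking a register whose value is known in advance.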

Python u-Law (MULAW) wave decompression to raw wave signal

I googled this issue for the last 2 weeks and wasn't able to find an algorithm or solution. I have a short .wav file with MULAW compression, and Python doesn't seem to have a built-in function that can decompress it. So I've taken it upon myself to build a decoder in Python.
I've found some info about MULAW in basic elements:
A-law u-Law comparison
Some c-esc codec library
So I need some guidance, since I don't know how to approach getting from signed short integer to a full wave signal. This is my initial thought from what I've gathered so far:
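From the wiki I've got the equations for u-law compression and decompression:
compression: $F(x) = \operatorname{sgn}(x)\,\dfrac{\ln(1+\mu|x|)}{\ln(1+\mu)}, \quad -1 \le x \le 1$
decompression: $F^{-1}(y) = \operatorname{sgn}(y)\,\dfrac{(1+\mu)^{|y|}-1}{\mu}, \quad -1 \le y \le 1$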
So judging by compression equation, it looks like the output is limited to a float range of -1 to +1 , and with signed short integer from –32,768 to 32,767 so it looks like I would need to convert it from short int to float in specific range.
Now, to be honest, I've heard of quantisation before, but I am not sure if I should first try and dequantize and then decompress or in the other way, or even if in this case it is the same thing... the tutorials/documentation can be a bit of tricky with terminology.
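As a minimal sketch of the continuous u-law formulas from Wikipedia mentioned above, assuming samples are first scaled into the range -1 to +1 by dividing by 32768. The names `mu_compress` and `mu_expand` are hypothetical. Note that u-law .wav files store 8-bit codes produced by G.711's segmented approximation of this curve, so these functions illustrate the companding law rather than decode file bytes directly.

```python
import math

MU = 255  # value used for 8-bit u-law


def mu_compress(x, mu=MU):
    """Map x in [-1, 1] to the companded value in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_expand(y, mu=MU):
    """Inverse of mu_compress: map y in [-1, 1] back to [-1, 1]."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


sample = -6169 / 32768.0  # int16 sample scaled into [-1, 1)
restored = mu_expand(mu_compress(sample))
print(abs(restored - sample) < 1e-9)
```

In this continuous form nothing is lost apart from floating-point rounding; the lossy step is the quantisation of the compressed value to 8 bits.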
The wave file I am working with is supposed to contain 'A' sound like for speech synthesis, I could probably verify success by comparing 2 waveforms in some audio software and custom wave analyzer but I would really like to diminish trial and error section of this process.
So what I've had in mind:
from struct import unpack
u = 0xff
data_chunk = b'\xe7\xe7' # -6169
data_to_r1 = unpack('h',data_chunk)[0]/0xffff # I suspect this is wrong,
# # but I don't know what else
u_law = ( -1 if data_to_r1<0 else 1 )*( pow( 1+u, abs(data_to_r1)) -1 )/u
So is there some algorithm, or a sequence of crucial steps I need to follow, such as first decompression, second quantisation, third something else?
Everything I find on Google is about how to read a PCM-encoded .wav file, not how to handle one when compression is involved.
So, after scouring Google, the solution was found on GitHub (go figure). I searched through many algorithms and found one that stays within the error bounds expected for lossy compression. For u-law, the error runs from 30 down to 1 for positive values and from -32 to -1 for negative values.
To be honest, I think this solution is adequate, though it does not follow the equation exactly, and it is the best solution for now. This code is transcribed to Python directly from the gcc9108 audio codec:
def uLaw_d(i8bit):
    bias = 33
    sign = pos = 0
    decoded = 0
    i8bit = ~i8bit
    if i8bit&0x80:
        i8bit &= ~(1<<7)
        sign = -1
    pos = ( (i8bit&0xf0) >> 4 ) + 5
    decoded = ((1 << pos) | ((i8bit & 0x0F) << (pos - 4)) | (1 << (pos - 5))) - bias
    return decoded if sign else ~decoded
def uLaw_e(i16bit):
    MAX = 0x1fff
    BIAS = 33
    mask = 0x1000
    sign = lsb = 0
    pos = 12
    if i16bit < 0:
        i16bit = -i16bit
        sign = 0x80
    i16bit += BIAS
    if ( i16bit>MAX ): i16bit = MAX
    for x in reversed(range(pos)):
        if i16bit&mask != mask and pos>=5:
            pos = x
            break
    lsb = ( i16bit>>(pos-4) )&0xf
    return ( ~( sign | ( pos<<4 ) | lsb ) )
With test:
print( 'normal :\t{0}\t|\t{0:2X}\t:\t{0:016b}'.format(0xff) )
print( 'encoded:\t{0}\t|\t{0:2X}\t:\t{0:016b}'.format(uLaw_e(0xff)) )
print( 'decoded:\t{0}\t|\t{0:2X}\t:\t{0:016b}'.format(uLaw_d(uLaw_e(0xff))) )
and output:
normal : 255 | FF : 0000000011111111
encoded: -179 | -B3 : -000000010110011
decoded: 263 | 107 : 0000000100000111
As you can see, 263-255 = 8, which is within bounds. When I tried to implement the SEEEMMMM method described in G.711, which the kind user Oliver Charlesworth suggested I look into, the decoded value for the maximum in the data was -8036, which is close to the maximum of the u-law spec, but I couldn't reverse-engineer the decoding function to get a binary equivalent of the function from Wikipedia.
Lastly, I must say I am disappointed that the Python library doesn't support every kind of compression algorithm. It is not just a tool that people use; it is also a resource that Python users learn from, since most of the material needed to dig deeper into the code isn't readily available or understandable.
After decoding the data and writing the wav file, I succeeded in producing a new raw linear PCM file. This works, even though I was sceptical at first.
EDIT 2: ::> you can find real solution
I find this helpful for converting to/from ulaw with numpy arrays.
import audioop
import numpy as np
def numpy_audioop_helper(x, xdtype, func, width, ydtype):
    '''helper function for using audioop buffer conversion in numpy'''
    xi = np.asanyarray(x).astype(xdtype)
    if np.any(x != xi):
        xinfo = np.iinfo(xdtype)
        raise ValueError("input must be %s [%d..%d]" % (xdtype, xinfo.min, xinfo.max))
    y = np.frombuffer(func(xi.tobytes(), width), dtype=ydtype)
    return y.reshape(xi.shape)
def audioop_ulaw_compress(x):
    return numpy_audioop_helper(x, np.int16, audioop.lin2ulaw, 2, np.uint8)
def audioop_ulaw_expand(x):
    return numpy_audioop_helper(x, np.uint8, audioop.ulaw2lin, 2, np.int16)
Python actually supports decoding u-Law out of the box:
audioop.ulaw2lin(fragment, width)
Convert sound fragments in u-LAW encoding to linearly encoded sound fragments. u-LAW encoding always uses 8 bits samples, so width
refers only to the sample width of the output fragment here.
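The `audioop` module was deprecated in Python 3.11 and removed in Python 3.13, so on recent interpreters the call above is unavailable. As a fallback, here is a minimal pure-Python sketch of the commonly used G.711 u-law expansion (bias 0x84, 3-bit segment, 4-bit mantissa). The function names are hypothetical, and the output is intended to match 16-bit `ulaw2lin` results, though only the endpoint values have been checked here.

```python
import struct

BIAS = 0x84  # 132, the bias added before encoding in G.711 u-law


def ulaw_byte_to_int16(code):
    """Expand one 8-bit u-law code to a signed 16-bit linear sample."""
    code = ~code & 0xFF                  # u-law bytes are stored inverted
    magnitude = ((code & 0x0F) << 3) + BIAS
    magnitude <<= (code & 0x70) >> 4     # shift by the 3-bit segment number
    return BIAS - magnitude if code & 0x80 else magnitude - BIAS


def ulaw_to_pcm16(data):
    """Expand a bytes object of u-law codes to little-endian 16-bit PCM."""
    samples = [ulaw_byte_to_int16(b) for b in bytearray(data)]
    return struct.pack("<%dh" % len(samples), *samples)


print(ulaw_byte_to_int16(0xFF), ulaw_byte_to_int16(0x80), ulaw_byte_to_int16(0x00))
```

The bytes returned by `ulaw_to_pcm16` can be passed to `writeframes` of a `wave` writer configured for 2-byte samples.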

Python string formatting to send through serial port

I need to properly format the string in order to send it to the arduino connected through a serial port. For example I have this python2.7.5 code:
x = int(7)
y = int(7000.523)
self.ser.write("%s%s" % (x, y))
but I want x in one byte and y in separate bytes, so that I can assign a variable to each received byte in the Arduino code, similar to this:
for (i=0; i<3; i++)
bufferArray[i] =;
d1 = bufferArray[0];
d2 = bufferArray[1];
d3 = bufferArray[2];
x = d1;
y = (d2 << 8) + d3;
In other words, I don't want that a piece of y is in the x byte.
What is the proper string format to do this?
Following the advice of @Mattias Nilsson, here is some sample code if you want to send two consecutive 16-bit unsigned integers:
import struct
x = int(7)
y = int(7000.523)
buf = struct.pack("<HH", x, y)
# read it back
for i in buf:
    print "%02x" % (ord(i))
You can see that each value is sent as 2 bytes and that the least significant byte always comes first. (Tested on an Intel x64 machine with Python 2.7.5.)
Edit: You can set the endianness explicitly by putting the < character (little-endian order) at the beginning of the format string.
Then you could just send both buffer and the string using Serial.write:
You can notice the zero character that will terminate your string. If you send the string like this, it must not contain any other zero byte.
On the Arduino side you should read and decode those two integers first, and then read characters in a loop that ends when you read a zero byte. You should definitely also check that your read buffer won't overflow.
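For the three-byte layout that the question's Arduino code expects (one byte for x, then y as high byte followed by low byte), a minimal sketch could use the format `">BH"` instead. This assumes x fits in 0..255 and y in 0..65535. The commented-out `ser.write` call stands in for the question's serial object and is not executed here.

```python
import struct

x = int(7)
y = int(7000.523)  # truncates to 7000

# ">" = big-endian, no padding; "B" = 1 unsigned byte; "H" = 2 unsigned bytes
packet = struct.pack(">BH", x, y)

d1, d2, d3 = bytearray(packet)  # the three bytes the Arduino would receive
assert d1 == x
assert (d2 << 8) + d3 == y

# ser.write(packet)  # ser is assumed to be an open serial.Serial instance
```

The explicit `>` matters: without a byte-order character, struct uses native alignment and would insert a padding byte between the two fields.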

Interpreting WAV Data

I'm trying to write a program to display PCM data. I've been very frustrated trying to find a library with the right level of abstraction, but I've found the python wave library and have been using that. However, I'm not sure how to interpret the data.
The wave.getparams function returns (2 channels, 2 bytes, 44100 Hz, 96333 frames, No compression, No compression). This all seems cheery, but then I tried printing a single frame:'\xc0\xff\xd0\xff' which is 4 bytes. I suppose it's possible that a frame is 2 samples, but the ambiguities do not end there.
96333 frames * 2 samples/frame * (1/44.1k sec/sample) = 4.3688 seconds
However, iTunes reports the time as closer to 2 seconds and calculations based on file size and bitrate are in the ballpark of 2.7 seconds. What's going on here?
Additionally, how am I to know if the bytes are signed or unsigned?
Many thanks!
Thank you for your help! I got it working and I'll post the solution here for everyone to use in case some other poor soul needs it:
import wave
import struct
def pcm_channels(wave_file):
    """Given a file-like object or file path representing a wave file,
    decompose it into its constituent PCM data streams.
    Input: A file like object or file path
    Output: A list of lists of integers representing the PCM coded data stream channels
    and the sample rate of the channels (mixed rate channels not supported)
    """
    stream = wave.open(wave_file,"rb")
    num_channels = stream.getnchannels()
    sample_rate = stream.getframerate()
    sample_width = stream.getsampwidth()
    num_frames = stream.getnframes()
    raw_data = stream.readframes( num_frames ) # Returns byte data
    stream.close()
    total_samples = num_frames * num_channels
    if sample_width == 1:
        fmt = "%iB" % total_samples # read unsigned chars
    elif sample_width == 2:
        fmt = "%ih" % total_samples # read signed 2 byte shorts
    else:
        raise ValueError("Only supports 8 and 16 bit audio formats.")
    integer_data = struct.unpack(fmt, raw_data)
    del raw_data # Keep memory tidy (who knows how big it might be)
    channels = [ [] for time in range(num_channels) ]
    # Samples are interleaved, so deal them out to the channels in turn
    for index, value in enumerate(integer_data):
        bucket = index % num_channels
        channels[bucket].append(value)
    return channels, sample_rate
"Two channels" means stereo, so it makes no sense to sum each channel's duration -- so you're off by a factor of two (2.18 seconds, not 4.37). As for signedness, as explained for example here, and I quote:
8-bit samples are stored as unsigned
bytes, ranging from 0 to 255. 16-bit
samples are stored as 2's-complement
signed integers, ranging from -32768
to 32767.
This is part of the specs of the WAV format (actually of its superset RIFF) and thus not dependent on what library you're using to deal with a WAV file.
I know that an answer has already been accepted, but I did some things with audio a while ago and you have to unpack the wave doing something like this.
pcmdata = wave.struct.unpack("%dh"%(wavedatalength),wavedata)
Also, one package that I used was called PyAudio, though I still had to use the wave package with it.
Each sample is 16 bits and there are 2 channels, so each frame takes 4 bytes.
The duration is simply the number of frames divided by the number of frames per second. From your data this is: 96333 / 44100 = 2.18 seconds.
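A minimal sketch of that calculation with the standard `wave` module. To stay self-contained it builds one second of silent 16-bit stereo audio in memory, which is an assumption made for illustration; with a real file you would pass its path to `wave.open` instead.

```python
import io
import wave

# Build one second of silent 16-bit stereo audio in memory.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(2)
    writer.setsampwidth(2)
    writer.setframerate(44100)
    writer.writeframes(b"\x00" * (44100 * 2 * 2))

buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    frame_size = reader.getnchannels() * reader.getsampwidth()  # bytes per frame
    duration = reader.getnframes() / float(reader.getframerate())

print(frame_size, duration)
```

A frame holds one sample per channel, which is why the channel count enters the frame size but not the duration.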
Building upon this answer, you can get a good performance boost by using numpy.frombuffer (numpy.fromstring in older NumPy releases) or numpy.fromfile. Also see this answer.
Here is what I did:
import numpy as np
def interpret_wav(raw_bytes, n_frames, n_channels, sample_width, interleaved = True):
    if sample_width == 1:
        dtype = np.uint8 # unsigned char
    elif sample_width == 2:
        dtype = np.int16 # signed 2-byte short
    else:
        raise ValueError("Only supports 8 and 16 bit audio formats.")
    # np.frombuffer replaces np.fromstring, whose binary mode is deprecated
    channels = np.frombuffer(raw_bytes, dtype=dtype)
    if interleaved:
        # channels are interleaved, i.e. sample N of channel M follows sample N of channel M-1 in raw data
        channels.shape = (n_frames, n_channels)
        channels = channels.T
    else:
        # channels are not interleaved. All samples from channel M occur before all samples from channel M+1
        channels.shape = (n_channels, n_frames)
    return channels
Assigning a new value to shape will throw an error if it requires data to be copied in memory. This is a good thing, since you want to use the data in place (using less time and memory overall). The ndarray.T attribute also does not copy (i.e. it returns a view) if possible, but I'm not sure how you ensure that it does not copy.
Reading directly from the file with np.fromfile will be even better, but you would have to skip the header using a custom dtype. I haven't tried this yet.

