16 bit arithmetic sum in Python - python

I am working on a program that creates IRIG 106 Chapter 10 data for a CubeSat project. It is currently implemented in Python, and I am having difficulty with the final component of the Chapter 10 header.
With my current implementation I am getting checksum values larger than will fit in an integer of the size defined by the specification (2 bytes).
The header checksum is defined in section 10.6.1.1, paragraph "J", of the IRIG 106-09 standard, as follows:
J Header Checksum. (2 Bytes) contains a value representing a 16-bit arithmetic sum of all 16-bit words in the header excluding the Header Checksum Word.
There is also a programming manual provided that has example C code that shows the following (from page A-2-17):
uint16_t I106_CALL_DECL uCalcHeaderChecksum(SuI106Ch10Header * psuHeader)
{
    int        iHdrIdx;
    uint16_t   uHdrSum;
    uint16_t * aHdr = (uint16_t *)psuHeader;

    uHdrSum = 0;
    for (iHdrIdx = 0; iHdrIdx < (HEADER_SIZE - 2) / 2; iHdrIdx++)
        uHdrSum += aHdr[iHdrIdx];

    return uHdrSum;
}
I have implemented the following in Python using the BitString library:
def calculate_checksum(byte_data: BitArray = None, header_length_bytes: int = 24, chunk_length: int = 16):
    # Set the checksum to zero:
    checksum = 0
    # Loop through the Chapter 10 header and compute the 16 bit arithmetic sum
    # (the last 2 bytes of the header hold the checksum and are excluded):
    for bit_location in range(0, (header_length_bytes - 2) * 8, chunk_length):
        # Get the range of bits to select:
        bit_range = slice(bit_location, (bit_location + chunk_length))
        # Get the uint representation of the bit data found:
        checksum += Bits(bin=byte_data.bin[bit_range]).uint
    # Write the computed checksum as binary data to the start location of the checksum in the header:
    byte_data.overwrite(Bits(uint=checksum, length=chunk_length), (header_length_bytes - 2) * 8)
Any thoughts or insights you could provide would be extremely appreciated. I know it should be a simple solution but I am just not able to see it.
--- Update 2 ---
I tried doing both roll over and truncation and they both produced the same result:
test_value = 2**16
test_value1 = test_value + 500
test_value2 = test_value1 % (2**16)  # -> 500
test_value3 = test_value1 & 0xFFFF   # -> 500
--- Update 3 ---
When I compare the execution of the Python and C checksum functions, I run into the following, using these values as input per the spec:
Sync = "EB25" (2 bytes)
ChannelID = 1 (2 bytes)
PacketLen = 1024 (4 bytes)
When I compare the outputs at each step I see the following:
C:
Header0: EB25
index = 0 16bit chunk = 60197 checksum = 60197
Header1: 0001
index = 1 16bit chunk = 1 checksum = 60198
Header2: 0400
index = 2 16bit chunk = 1024 checksum = 61222
Header3: 0000
index = 3 16bit chunk = 0 checksum = 61222
Python:
eb25
index: 0 chunk: 60197 checksum: 60197
0001
index: 1 chunk: 1 checksum: 60198
0000
index: 2 chunk: 0 checksum: 60198
0400
index: 3 chunk: 1024 checksum: 61222

So, I know this question is really old, but I had the same issue. The endianness of the packet matters. In a .ch10 file, the packet starts with bytes 0x25 0xEB because each field is little-endian. Here's how I'm doing everything right now in C-ish code.
// read until the end of the file
while (!atEndOfFile)
{
    // verify that we get the sync packet
    if ( readNextByte == 0x25 )
    {
        if ( readNextByte == 0xEB )
        {
            // store the sync packet
            byte packetHeader[24];
            packetHeader[0] = 0x25;
            packetHeader[1] = 0xEB;
            // grab the rest of the header
            for ( int i = 2; i < 24; i++ )
            {
                packetHeader[i] = readNextByte;
            }
            // grab the checksum from the packet (little-endian 16-bit word)
            uint16 actualCheckSum = (packetHeader[23] << 8) |
                                     packetHeader[22];
            // calculate the checksum over the first 22 bytes, two at a time
            uint16 calculatedCheckSum = 0;
            for ( int i = 0; i < 22; i += 2 )
            {
                calculatedCheckSum += (packetHeader[i + 1] << 8) |
                                       packetHeader[i];
            }
            // verify the calculation
            if ( calculatedCheckSum == actualCheckSum )
            {
                printLine( "We calculated the checksum!");
                printLine( "actual checksum: " + actualCheckSum +
                           " calculated checksum: " + calculatedCheckSum );
            }
        }
    }
}
I haven't done enough digging into the irig106 library, but I believe that it handles the translation when it reads in a .ch10 file.
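For reference, here is a minimal Python sketch of that same word-sum (assumptions: the header is available as a bytes object of at least 24 bytes, with the fields stored little-endian as described above):

import struct

def calc_header_checksum(header: bytes) -> int:
    # Sum the eleven little-endian 16-bit words in bytes 0..21;
    # the checksum word itself (bytes 22..23) is excluded.
    words = struct.unpack_from('<11H', header, 0)
    return sum(words) & 0xFFFF  # roll over like the uint16_t in the C example

The & 0xFFFF reproduces the uint16_t roll-over behavior of the spec's C example.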

Related

Reading an Ogg Opus header to check the CRC

I decided to experiment with file formats and I'm using Python to read said files.
Everything I have extracted from the Ogg header is correct, except the CRC check.
The documentation says you must check the entire header and page with the original CRC check value set to 0.
I'm wondering what steps I'm missing to get the expected result.
import zlib
import struct

with open("sample3.opus", "rb") as f_:
    file_data = f_.read()

cp, ssv, htf, agp, ssn, psn, pc, ps = struct.unpack_from("<4sBBQIIIB", file_data, 0)
offset = struct.calcsize("<4sBBQIIIB")
segments = struct.unpack_from(f"<{ps}B", file_data, offset)

packet_size = 0
for num in segments:
    packet_size += num

header_size = offset + len(segments) + packet_size

# Copying the entire packet then changing the crc to 0.
header_copy = bytearray()
header_copy.extend(file_data[0:header_size])
struct.pack_into("<I", header_copy, struct.calcsize("<4sBBQII"), 0)

print(pc)
print(zlib.crc32(header_copy))
This script results in:
277013243
752049619
The audio file I'm using:
https://filesamples.com/formats/opus
zlib.crc32() is not the CRC that they specify. They say the initial value and final exclusive-or is zero, whereas for zlib.crc32(), those values are both 0xffffffff. They fail to specify whether their CRC is reflected or not, so you'd need to try both to see which it is.
Update:
I checked, and it's a forward CRC. Unfortunately, you can't use zlib.crc32() to calculate it. You can compute it with this:
def crc32ogg(seq):
    crc = 0
    for b in seq:
        crc ^= b << 24
        for _ in range(8):
            crc = (crc << 1) ^ 0x104c11db7 if crc & 0x80000000 else crc << 1
    return crc

Read binary file into struct (translating instructions)

Reading binary files and structs is a new area for me.
I understand how to read in the file, and I have attempted various methods to read the raw data, but it seems I need to use struct.
I am trying to translate these instructions to Python code:
The beginning of the Binary Merge file contains an array of GWI_file_header_struct structs (defined in file INET_INT.H) for the various channels, followed by the interlaced 32bit floating point data. The 1st 4 bytes in the header is the length of the header for 1 channel in bytes (i.e. 516 = 0x0204). To read the # of channels stored in the file, read the 'channelsPerFile' field of the 1st struct (e.g. to see how many headers are there). After the header, the data is saved in an interlaced form, where points are stored in the order that they are acquired in time.
The main confusion is how I translate this to:
struct.unpack(...)
INET_INT.H struct:
typedef struct GWI_file_header_struct {  // This struct is at the beginning of GWI iNet BINARY files that contain waves.
    //
    // Macintosh:
    //
    //   file type:    'GWID'
    //   creator type: 'ioNe'  NETWORK_DATA_CREATOR
    // ----------------------------------
    // HEADER INFORMATION
    iNetINT32 headerSizeInBytes;    // contains length, in bytes, of this header (this does not include any data) { bytes 0..3, base 0 }
    // ----------------------------------
    // FILE INFORMATION
    iNetINT32 int32key;             // 32bit key that should contain 0x12345678 (this will help you make sure your byte lanes are ok).
                                    // { bytes 4..7, base 0 }
    iNetINT32 file_endian;          // endian mode of stored data on disk: 0 = bigEndian_ion, 1 = littleEndian_ion
                                    // { bytes 8..11, base 0 }
    iNetINT16 int16key;             // 16bit key that should contain 0x55b4; (this field should consume 2 bytes
                                    // in the struct -- no padding) (i.e. INET_INT16_KEY = 0x55b4)
                                    // { bytes 12..13, base 0 }
    iNetINT16 zero;                 // set to 0 (this field should consume 2 bytes in the struct -- no padding)
                                    // { bytes 14..15, base 0 }
    // # of seconds since Jan 1, Midnight, 1904 that the acquisition started (this is used to compute the
    // date of acquisition). This overflows in 2030.
    // Strip Chart: 1st digitized point in entire stream (i.e. 1st pt of 1st scan)
    // Osc Mode:    1st point in current scan, secsSince1904_Int64 units
    // { bytes 16..19, base 0 }
    iNetUINT32 acquisition_SecsSince1904_FixedUint32_OverflowIn2030;
    // ----------------------------------
    // # OF POINTS STORED
    //
    // This file contains a set of scans. Each scan is 1 to .5billion points long. For example,
    // we might have 100 scans, each 1000 points long. In this example:
    //
    //   pointsPerScanThisChannel_LSW   = 1000
    //   pointsPerScanThisChannel_MSW   = 0
    //
    //   numScansStoredBeforeLastScan   = 99
    //
    //   numPointsInLastPartialScan_LSW = 1000
    //   numPointsInLastPartialScan_MSW = 0
    //
    // Each channel can have a different number of points per scan due to the sampleRateChanMULTiplier
    iNetUINT32 pointsPerScanThisChannel_LSW;
    iNetUINT32 pointsPerScanThisChannel_MSW;
    // # points per scan = (pointsPerScanThisChannel_MSW * 2^32) + pointsPerScanThisChannel_LSW
    // { bytes 20..23, base 0 }
    // { bytes 24..27, base 0 }
    iNetUINT32 numScansStoredBeforeLastScan_LSW;
    // # of complete scans stored in file
    // { bytes 28..31, base 0 }
    // iNetUINT32 numScansStoredBeforeLastScan_MSW;
    // this is defined below, at the end of the struct
    iNetUINT32 numPointsInLastPartialScan_LSW;
    iNetUINT32 numPointsInLastPartialScan_MSW;
    // # points stored in last scan if it is partially complete = (numPointsInLastPartialScan_MSW * 2^32) + numPointsInLastPartialScan_LSW
    // { bytes 32..35, base 0 }
    // { bytes 36..39, base 0 }
    // ----------------------------------
    // TIME INFORMATION
    iNetFLT32 firstPoint_Time_Secs; // time of 1st point, units are seconds
                                    // { bytes 40..43, base 0 }
    iNetFLT32 endUser_channel_samplePeriod_Secs;
                                    // time between points for this channel,
                                    // units are seconds. Notice that channels
                                    // can have different sample rates, which
                                    // is the master_endUser_SampleRate / sampleRate_Divider,
                                    // where 'sampleRate_Divider' is an integer.
                                    // { bytes 44..47, base 0 }
    // ----------------------------------
    // TYPE OF DATA STORED
    iNetINT32 arrayDataType;        // Type of src array data. iNetDataType:
                                    //
                                    //   0  iNetDT_INT16:  16bit integer, signed
                                    //   2  iNetDT_UINT16: 16bit integer, unsigned
                                    //   3  iNetDT_INT32:  32bit integer, signed
                                    //   4  iNetDT_UINT32: 32bit integer, unsigned
                                    //   5  iNetDT_FLT32:  32bit float (IEEE flt32 format)
                                    //   6  iNetDT_Double: 'double', as determined by the compiler
                                    //      (e.g. flt64, flt80, flt96, flt128)
                                    //      see 'bytesPerDataPoint' field to see how many bytes
                                    // { bytes 48..51, base 0 }
    iNetINT32 bytesPerDataPoint;    // # of bytes for each datapoint (e.g. 4 for 32bit signed integer)
                                    // { bytes 52..55, base 0 }
    iNetStr31 verticalUnitsLabel;   // pascal string of vertical units label (e.g. "Volts")
                                    // { bytes 56..87, base 0 }
    iNetStr31 horizontalUnitsLabel; // horizontal units label, e.g. "Secs", pascal string (0th char is the # of valid chars)
                                    // { bytes 88..119, base 0 }
    iNetStr31 userName;             // user name set by user, e.g. "Pressure 1", pascal string (0th char is the # of valid chars)
                                    // { bytes 120..151, base 0 }
    iNetStr31 chanName;             // name of channel, e.g. "Ch1 Vin+", pascal string (0th char is the # of valid chars)
                                    // { bytes 152..183, base 0 }
    // ----------------------------------
    // DATA MAPPING
    //
    iNetINT32 minCode;              // if data is stored in integer format, this contains the mapping from integer
    iNetINT32 maxCode;              // to engineering units (e.g. +/-2048 A/D data is mapped to +/- 10V, minCode = -2048,
    iNetFLT32 minEU;                // maxCode = +2047, minEU = -10.000, maxEU = +9.995.
    iNetFLT32 maxEU;                //
                                    // { bytes 184..187, base 0 }
                                    // { bytes 188..191, base 0 }
                                    // { bytes 192..195, base 0 }
                                    // { bytes 196..199, base 0 }
    // ----------------------------------
    // iNet NETWORK ADDRESS (this does not need
    // to be filled in, 0L's are ok)
    iNetINT32 netNum;               // channel network # (this pertains to iNet only; use 0 otherwise)
                                    // { bytes 200..203, base 0 }
    iNetINT32 devNum;               // channel device # (this pertains to iNet only; use 0 otherwise)
                                    // { bytes 204..207, base 0 }
    iNetINT32 modNum;               // channel module # (this pertains to iNet only; use 0 otherwise)
                                    // { bytes 208..211, base 0 }
    iNetINT32 chNum;                // channel channel # (this pertains to iNet only; use 0 otherwise)
                                    // { bytes 212..215, base 0 }
    // ----------------------------------
    // END USER NOTES
    iNetStr255 notes;               // pascal string that contains notes about the data stored.
                                    // { bytes 216..471, base 0 }
    // ----------------------------------
    // MAPPING
    iNetFLT32 /* must remain flt32 */ internal1; // Mapping from internal engineering units (e.g. Volts) to external engineering
    iNetFLT32 /* must remain flt32 */ external1; // units (e.g. mmHg). This is used for 2 point linear mapping/calibration to
    iNetFLT32 /* must remain flt32 */ internal2; // a new, user defined, coordinate system. instruNet World does not read these values
    iNetFLT32 /* must remain flt32 */ external2; // from the wave files, yet instead reads them from the instrNet.prf file -- they
                                                 // are only stored for the benefit of other software that might read this file. gsw 12/1/96
                                                 // { bytes 472..475, base 0 }
                                                 // { bytes 476..479, base 0 }
                                                 // { bytes 480..483, base 0 }
                                                 // { bytes 484..487, base 0 }
    iNetFLT32 flt32key;             // flt32 key set to 1234.56 (i.e. INET_FLT32_KEY), used to test floating point code. gsw 12/1/96
                                    // { bytes 488..491, base 0 }
    iNetINT32 sampleRate_Divider;   // this channel is digitized at the master_endUser_SampleRate divided by
                                    // this 'sampleRate_Divider' (i.e. sampleRateChanMULT_integerRatio_N_int64)
                                    // (helpful with FileType Binary Merge), gsw 1/29/97. Note: This field was introduced 1/29/97 and
                                    // files saved before that time set it to 0.
                                    // { bytes 492..495, base 0 }
    iNetINT32 channelsPerFile;      // # of channels per file (i.e. interlaced after array of headers) (helpful with FileType Binary Merge), gsw 1/29/97
                                    // Note: This field was introduced 1/29/97 and files saved before that time set it to 0.
                                    // { bytes 496..499, base 0 }
    // ----------------------------------
    // EXPANSION FIELDS
#if 1   // gsw 12/23/09
    // # of complete scans stored in file, MS 32bits
    // { bytes 500..503, base 0 }
    iNetUINT32 numScansStoredBeforeLastScan_MSW;
#else
    iNetINT32 expansion8;           // expansion fields that are preset to
#endif
    iNetINT32 expansion9;           // 0 and then ignored
    iNetINT32 expansion10;          // { bytes 500..503, base 0 }
                                    // { bytes 504..507, base 0 }
                                    // { bytes 508..511, base 0 }
    // ----------------------------------
    // KEY TO TEST STRUCT PACKING
    iNetINT32 int32key_StructTest;  // 32bit key that should contain 0x12345678; (i.e. INET_INT32_KEY)
                                    // { bytes 512..515, base 0 }
    // ----------------------------------
    // ACTUAL DATA
    /* iNetFLT32 *data[1]; */       // contains array of data of type 'arrayDataType'
} GWI_file_header_struct;
Final Code and Results:
Code
import struct

# Current 3 channels: Ch11 Vin+, Ch13 Vin+ and Ch15 Vin+
# Header info extracted using provided header struct (INET_INT.H)
# After the header, the data is saved in an interlaced form,
# where points are stored in the order that they are acquired in time.
# 3 channels: A[0], B[0], C[0], A[1], B[1], C[1]...
# After header = 516 header size x 3 channels = 1,548 bytes
# Start of data at 1,548 bytes?
with open(file, "rb") as f:
    byte = f.read(12)
    header_size, int32key, file_endian = struct.unpack('<3i', byte)
    # channel name 1
    f.seek(152)
    chan = f.read(183 - 152)
    chan = struct.unpack("<31s", chan)[0].rstrip(b'\x00').lstrip(b'\t')
    # channel name 2
    f.seek(152 + header_size)
    chan2 = f.read(183 - 152)
    chan2 = struct.unpack("<31s", chan2)[0].rstrip(b'\x00').lstrip(b'\t')

print(header_size, int32key, file_endian)
print("channel 1: {}".format(chan))
print("channel 2: {}".format(chan2))
Results
516 305419896 1
channel 1: b'Ch11 Vin+'
channel 2: b'Ch13 Vin+'
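To continue past the headers into the interlaced samples, a possible follow-on sketch (assumptions: 3 channels, 32-bit little-endian floats, and the data starting right after the 3 headers at byte 1,548, as the comments above suppose):

import struct

with open(file, "rb") as f:
    f.seek(516 * 3)  # skip the 3 channel headers
    raw = f.read()

count = len(raw) // 4
points = struct.unpack(f"<{count}f", raw)
# De-interlace: A[0], B[0], C[0], A[1], B[1], C[1], ...
channels = [points[i::3] for i in range(3)]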
Ok, this is not a full answer, but I feel comments would be really unreadable here.
The first step is reading the first 12 bytes (three 4-byte integers) and unpacking them so we can check the endianness. Let's try big-endian first:
from struct import *

with open(file, "rb") as f:
    byte = f.read(12)
    header_size, int32key, file_endian = unpack('>3i', byte)
We expect int32key to be 305419896 (= 0x12345678). If we get another value, then let's switch to little-endian, i.e. change our unpack format string to <3i.
At this point we can read the rest of the header, with the same logic, and get all the info we need to read data for the first channel. I hope this can be a good start for you.
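As an illustration of that same logic, here is a sketch that pulls a few more fields out of the first header at the byte offsets documented in the struct comments (the variable names are just local labels, and '<' assumes file_endian == 1):

import struct

with open(file, "rb") as f:
    buf = f.read(516)  # headerSizeInBytes from the first read

# Offsets per the struct comments ("base 0"):
int16key = struct.unpack_from("<H", buf, 12)[0]  # expect 0x55b4
arrayDataType, bytesPerDataPoint = struct.unpack_from("<2i", buf, 48)
sampleRate_Divider, channelsPerFile = struct.unpack_from("<2i", buf, 492)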

Implementing a CRC algorithm

I'm trying to implement a CRC algorithm as defined in some video interface standards:
SMPTE296M-2001
BT.1120-9:2017
The raw data is 10 bit words that are squashed into 8 bit bytes which I have no issues extracting and working with in numpy.
The CRC has the polynomial:
CRC(X) = X^18 + X^5 + X^4 + 1
I believe this gives me the constant:
POLY = 0x40031
I've tried a few different implementations and nothing I generate matches my sample data.
This implementation was inspired by this one:
MASK = 0x3FFFF

class MYCRC:
    crc_table = []

    def __init__(self):
        if not self.crc_table:
            for i in range(1024):
                k = i
                for j in range(10):
                    if k & 1:
                        k ^= POLY
                    k >>= 1
                self.crc_table.append(k)

    def calc(self, crc, data):
        crc ^= MASK
        for d in data:
            crc = (crc >> 10) ^ self.crc_table[(crc & 0x3FF) ^ d]
        return crc ^ MASK
Then there is this implementation, which I pulled from somewhere (not sure where):
def crc_calc(crc, p):
    crc = MASK & ~crc
    for i in range(len(p)):
        crc = (crc ^ p[i])  # & BIG_MASK
        for j in range(10):
            crc = ((crc >> 1) ^ (POLY & -(crc & 1)))  # & BIG_MASK
    return MASK & ~crc
I also looked at using this library which has support for using custom polynomials, but it appears to be built to work with 8 bit data, not the 10 bit data I have.
I'm not sure how best to share test data as I only have whole frames which if exported as a numpy file is ~5MB.
I'm also unclear as to the range of data I'm supposed to feed to the CRC calculation. I think, from reading it, that it should be from the first active sample on one line up to the line count of the line after, with the checksum calculated over that range. This makes the most sense from a hardware perspective, but the standard doesn't read that clearly to me.
edit:
pastebin of 10 lines worth of test data, this includes the embedded checksum.
Within a line of data, samples 0-7 are the EAV marker, 8-11 are the line number, and 12-16 are the two checksums. The data is two interleaved streams of video data (luma channel and CbCr channel).
The standards state the checksums are run from the first active sample to the end of the line data, which I interpret to mean that it runs from sample 740 of one line to sample 11 of the next line.
As per section 5 of SMPTE292M, the data is 10 bit data which cannot go below 0x3 or above 0x3FC. As per table 4, the result of the CRC should be 18 bits, which get split and embedded into the stream as two words (with one bit filled in with the not of another bit). Note that there is one checksum for each channel of data; these two checksums are at samples 12-16 on each line.
edit 2
some longer test data that straddles the jump from blanking data to active frame data
The CRC calculation must be done reflected. (Clue in note on Table 9: "NOTE – CRC0 is the MSB of error detection codes.")
This C routine checks the CRCs in your example correctly:
// Update the CRC-18 crc with the low ten bits of word.
// Polynomial = 1000000000000110001
// Reflected (dropping x^18) = 10 0011 0000 0000 0000 = 0x23000
unsigned crc18(unsigned crc, unsigned word) {
    crc ^= word & 0x3ff;
    for (int k = 0; k < 10; k++)
        crc = crc & 1 ? (crc >> 1) ^ 0x23000 : crc >> 1;
    return crc;
}
Indeed the span of the check is from the start of the active line through the line numbers, up to just before the two CRCs in the stream. That calculation matches those two CRCs. Each CRC is calculated on alternating words from the stream. The CRCs are initialized to zero.
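For a Python version, a direct port of the C routine above (a sketch; per the answer, each of the two interleaved streams keeps its own running CRC, initialized to zero):

def crc18(crc, word):
    # Update the CRC-18 with the low ten bits of word, reflected,
    # using the 0x23000 constant from the C routine above.
    crc ^= word & 0x3FF
    for _ in range(10):
        crc = (crc >> 1) ^ 0x23000 if crc & 1 else crc >> 1
    return crc

# Hypothetical usage over a list of 10-bit samples, alternating streams:
crc_y = crc_c = 0
for i, w in enumerate(samples):
    if i % 2 == 0:
        crc_y = crc18(crc_y, w)
    else:
        crc_c = crc18(crc_c, w)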

Can zlib compressed output avoid using certain byte value?

It seems that the output of zlib.compress uses all possible byte values. Is it possible to use only 255 of the 256 byte values (for example, to avoid using \n)?
Note that I just use the Python manual as a reference, but the question is not specific to Python (i.e. it applies to any other language that has a zlib library).
No, this is not possible. Apart from the compressed data itself, there are standardized control structures which contain integers. Those integers may accidentally lead to any 8-bit character ending up in the bytestream.
Your only chance would be to encode the zlib bytestream into another format, e.g. base64.
The whole point of compression is to reduce the size as much as possible. If zlib or any compressor only used 255 of the 256 byte values, the size of the output would be increased by at least 0.07%.
That may be perfectly fine for you, so you can simply post-process the compressed output, or any data at all, to remove one particular byte value at the expense of some expansion. The simplest approach would be to replace that byte when it occurs with a two-byte escape sequence. You would also then need to replace the escape prefix with a different two-byte escape sequence. That would expand the data on average by 0.8%. That is exactly what Hans provided in another answer here.
If that cost is too high, you can do something more sophisticated, which is to decode a fixed Huffman code that encodes 255 symbols of equal probability. To decode you then encode that Huffman code. The input is a sequence of bits, not bytes, and most of the time you will need to pad the input with some zero bits to encode the last symbol. The Huffman code turns one symbol into seven bits and the other 254 symbols into eight bits. So going the other way, it will expand the input by a little less than 0.1%. For short messages it will be a little more, since often less than seven bits at the very end will be encoded into a symbol.
Implementation in C:
// Placed in the public domain by Mark Adler, 26 June 2020.

// Encode an arbitrary stream of bytes into a stream of symbols limited to 255
// values. In particular, avoid the \n (10) byte value. With -d, decode back to
// the original byte stream. Take input from stdin, and write output to stdout.

#include <stdio.h>
#include <string.h>

// Encode arbitrary bytes to a sequence of 255 symbols, which are written out
// as bytes that exclude the value '\n' (10). This encoding is actually a
// decoding of a fixed Huffman code of 255 symbols of equal probability. The
// output will be on average a little less than 0.1% larger than the input,
// plus one byte, assuming random input. This is intended to be used on
// compressed data, which will appear random. An input of all zero bits will
// have the maximum possible expansion, which is 14.3%, plus one byte.
int nolf_encode(FILE *in, FILE *out) {
    unsigned buf = 0;
    int bits = 0, ch;
    do {
        if (bits < 8) {
            ch = getc(in);
            if (ch != EOF) {
                buf |= (unsigned)ch << bits;
                bits += 8;
            }
            else if (bits == 0)
                break;
        }
        if ((buf & 0x7f) == 0) {
            buf >>= 7;
            bits -= 7;
            putc(0, out);
            continue;
        }
        int sym = buf & 0xff;
        buf >>= 8;
        bits -= 8;
        if (sym >= '\n' && sym < 128)
            sym++;
        putc(sym, out);
    } while (ch != EOF);
    return 0;
}

// Decode a sequence of symbols from a set of 255 that was encoded by
// nolf_encode(). The input is read as bytes that exclude the value '\n' (10).
// Any such values in the input are ignored and flagged in an error message.
// The sequence is decoded to the original sequence of arbitrary bytes. The
// decoding is actually an encoding of a fixed Huffman code of 255 symbols of
// equal probability.
int nolf_decode(FILE *in, FILE *out) {
    unsigned long lfs = 0;
    unsigned buf = 0;
    int bits = 0, ch;
    while ((ch = getc(in)) != EOF) {
        if (ch == '\n') {
            lfs++;
            continue;
        }
        if (ch == 0) {
            if (bits == 0) {
                bits = 7;
                continue;
            }
            bits--;
        }
        else {
            if (ch > '\n' && ch <= 128)
                ch--;
            buf |= (unsigned)ch << bits;
        }
        putc(buf, out);
        buf >>= 8;
    }
    if (lfs)
        fprintf(stderr, "nolf: %lu unexpected line feeds ignored\n", lfs);
    return lfs != 0;
}

// Encode (no arguments) or decode (-d) from stdin to stdout.
int main(int argc, char **argv) {
    if (argc == 1)
        return nolf_encode(stdin, stdout);
    else if (argc == 2 && strcmp(argv[1], "-d") == 0)
        return nolf_decode(stdin, stdout);
    fputs("nolf: unknown options (use -d to decode)\n", stderr);
    return 1;
}
As @ypnos says, this isn't possible within zlib itself. You mentioned that base64 encoding is too inefficient, but it's pretty easy to use an escape character to encode a character you want to avoid (like newlines).
This isn't the most efficient code in the world (and you might want to do something like finding the least used bytes to save a tiny bit more space), but it's readable enough and demonstrates the idea. You can losslessly encode/decode, and the encoded stream won't have any newlines.
def encode(data):
    # order matters
    return data.replace(b'a', b'aa').replace(b'\n', b'ab')

def decode(data):
    def _foo():
        pair = False
        for b in data:
            if pair:
                # yield b'a' if b==b'a' else b'\n'
                yield 97 if b == 97 else 10
                pair = False
            elif b == 97:  # b'a'
                pair = True
            else:
                yield b
    return bytes(_foo())
As some measure of confidence you can check this exhaustively on small bytestrings:
from itertools import *

all(
    bytes(p) == decode(encode(bytes(p)))
    for c in combinations_with_replacement(b'ab\nc', r=6)
    for p in permutations(c)
)
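Tying it back to the original question, a hedged end-to-end round trip (assuming data is some bytes object):

import zlib

wire = encode(zlib.compress(data))
assert b'\n' not in wire                      # the escape coding removes newlines
assert zlib.decompress(decode(wire)) == data  # lossless round trip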

Temperature conversion in Python: how to convert an int (0 - 255) from a byte object to degrees Celsius

We are currently working on an Arduino Uno project and are getting stuck at the conversion of the integer data to degrees Celsius. This code works; however, it converts the binary packed data (\xd01 etc.) to an int (0-255). Our question is: how do we convert the integer value to a reading in degrees Celsius? For example: int 2 = 2 degrees Celsius and 255 = 35 degrees Celsius.
This is our Python code with the Pyserial module
import serial
import struct

ser = serial.Serial('COM3', 19200, timeout=5)

while True:
    tempdata = ser.read(2)
    x = struct.unpack('!BB', tempdata)
    print(x)
And this is the code of the temperature conversion on our Arduino Uno, it is written in C.
#define F_CPU 16E6

// output on USB = PD1 = board pin 1
// datasheet p.190; F_OSC = 16 MHz & baud rate = 19.200
#define UBBRVAL 51

void uart_init()
{
    // set the baud rate
    UBRR0H = 0;
    UBRR0L = UBBRVAL;
    // disable U2X mode
    UCSR0A = 0;
    // enable transmitter
    UCSR0B = _BV(TXEN0);
    // set frame format : asynchronous, 8 data bits, 1 stop bit, no parity
    UCSR0C = _BV(UCSZ01) | _BV(UCSZ00);
}

void transmit(uint8_t data)
{
    // wait for an empty transmit buffer
    // UDRE is set when the transmit buffer is empty
    loop_until_bit_is_set(UCSR0A, UDRE0);
    // send the data
    UDR0 = data;
}

void init_adc()
{
    // ref=Vcc, left adjust the result (8 bit resolution),
    // select channel 0 (PC0 = input)
    ADMUX = (1<<REFS0);
    // enable the ADC & prescale = 128
    ADCSRA = (1<<ADEN)|(1<<ADPS2)|(1<<ADPS1)|(1<<ADPS0);
}

uint8_t get_adc_value()
{
    //ADMUX |= 1
    ADCSRA |= (1<<ADSC);  // start conversion
    loop_until_bit_is_clear(ADCSRA, ADSC);
    return ADC;           // 8-bit resolution, left adjusted
}

/*
    ((value / 1024 * 5) - 0.5) * 100
*/
int main(void) {
    init_adc();
    uart_init();
    //int x;
    while(1)
    {
        int x = get_adc_value();
        int temp = ((((float) x / 1024) * 5) - 0.5) * 100;
        transmit(temp);
        _delay_ms(200);
    }
}
The conversion from ADC value to temperature will most likely depend on what type of temperature sensor you are using. I recommend looking at the datasheet of your temp sensor.
If you are using a 'TMP36', you can convert using this formula:
Centigrade temperature = [(analog voltage in mV) - 500] / 10
Source: https://learn.adafruit.com/tmp36-temperature-sensor/using-a-temp-sensor
If you are using a thermocouple, you'll need to look at the table of correspondence for the type you are using (e.g. type K: https://www.omega.fr/temperature/Z/pdf/z204-206.pdf)
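For the TMP36 case, a minimal Python sketch of that formula (assumptions: the byte received is the raw 8-bit ADC reading with a 5 V reference; note that the Arduino code above already converts before transmitting, in which case the received byte is the temperature itself):

def adc8_to_celsius(adc_value: int) -> float:
    # TMP36: Centigrade temperature = [(analog voltage in mV) - 500] / 10
    millivolts = adc_value * 5000 / 256  # scale the 8-bit reading to millivolts
    return (millivolts - 500) / 10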
