I have a list that I am packing as bytes using the struct module in Python. Here is my list:
[39, 39, 126, 126, 256, 258, 260, 259, 257, 126]
I am packing my list as:
encoded = struct.pack(">{}H".format(len(list)), *list)
where I pass the number of elements in the list into the format string.
Now I need to unpack the packed struct. For that I need a format string where I again pass the number of elements. For now I am doing it like so:
struct.unpack(">{}H".format(10), encoded)
However, I can't simply pass that number to format when unpacking, because the packed struct is written to a file that I am using for image compression. How can I store the number of elements in the file and use it when unpacking?
P.S. I would like to read that 10 (used when unpacking) from the file itself, packed as bytes.
From what I understood from the comments and the question, maybe this will be helpful.
import struct

data = [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]
encoded = struct.pack(">{}H".format(len(data)), *data)
tmp = struct.pack(">H", len(data))
encoded = tmp + encoded  # prepend the element count at the start
begin = 2
try:
    size = struct.unpack(">H", encoded[0:begin])[0]
    print(size)
    print(struct.unpack(">{}H".format(size), encoded[begin:]))
except Exception as e:
    print(e)
Let me know if it helps.
Here is my approach to adding that number of elements to the file:
file.write(len(compressed_list).to_bytes(3,'big'))
I allocate 3 bytes for the length of compressed_list, convert it to bytes, and write it at the beginning of the file. After that, I write the remaining parts.
Next, when I need that number, I get it from the file like so:
sz = int.from_bytes(encoded[0:3],'big')
which means that I take the first three bytes of the byte array read from the file and convert those bytes to an int.
That solved my problem.
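Putting the write and read halves together, here is a minimal round-trip sketch (the file name compressed.bin and the variable names are illustrative only):

import struct

compressed_list = [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]

# Write: a 3-byte big-endian length prefix, then the packed unsigned shorts.
with open('compressed.bin', 'wb') as f:
    f.write(len(compressed_list).to_bytes(3, 'big'))
    f.write(struct.pack('>{}H'.format(len(compressed_list)), *compressed_list))

# Read: recover the length first, then unpack exactly that many values.
with open('compressed.bin', 'rb') as f:
    encoded = f.read()
sz = int.from_bytes(encoded[0:3], 'big')
values = list(struct.unpack('>{}H'.format(sz), encoded[3:]))
print(values)  # [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]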
I want to change the value of beta in Test.py files that sit in multiple folders, all at the same time and without manually opening each file, but I am getting an error. How do I do this?
import os

N = [8, 10, 23, 29, 36, 37, 41, 42, 45, 46, 47]
I = []
for i in N:
    os.read(rf'C:\Users\User\{i}\Test.py')
    beta = 1e-1
The error is
in <module>
os.read(rf'C:\Users\User\OneDrive - Technion\Research_Technion\Python_PNM\All_ND\var_6.0_beta_0.1\{i}\220_beta_1.0_50.0_6.0ND.py')
TypeError: read expected 2 arguments, got 1
Syntax: os.read(fd, n)
Parameters:
fd: A file descriptor representing the file to be read.
n: An integer value denoting the number of bytes to be read from the file associated with the given file descriptor fd.
Seems like you forgot the second argument n.
see - https://www.geeksforgeeks.org/python-os-read-method/#:~:text=read()%20method%20in%20Python,bytes%20left%20to%20be%20read.
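For reference, os.read works on a file descriptor returned by os.open and needs the number of bytes as its second argument. A minimal sketch (the path is illustrative only):

import os

fd = os.open(r'C:\Users\User\8\Test.py', os.O_RDONLY | getattr(os, 'O_BINARY', 0))
content = os.read(fd, 4096)  # read up to 4096 bytes from the descriptor
os.close(fd)
print(content.decode('utf-8', errors='replace'))

If the goal is simply to read and rewrite the files, the higher-level open() built-in is usually easier than os.read.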
I'm having trouble encrypting bytes from a PyObject using XOR.
For now I only managed to print the bytes as an encoded string (with PyUnicode_AsEncodedString):
Here's what I tried (taken from this SO answer)
PyObject* repr = PyObject_Repr(wf.str); // wf.str is a PyObject *
PyObject* str = PyUnicode_AsEncodedString(repr, "utf-8", "~E~");
const char *bytes = PyBytes_AS_STRING(str);
printf("REPR: %s\n", bytes);
Py_XDECREF(repr);
Py_XDECREF(str);
From here on, I don't know what to do anymore.
I also tried to access bytes only using PyBytes_AS_STRING(wf.str) and then proceed with the encryption, but it only returned one byte.
Is there a way to XOR-encrypt bytes taken from a PyObject? Something like this:
bytes = getBytesFromPyObject(wf.str)
encrypted = XOREncryption(bytes)
AssignBytesToPyObject(encrypted, wf.str)
Note: I don't know much about C, all of this is almost new to me.
Edit: I'm using C instead of Python because I need to implement a function that uses XOR encryption in a built-in module for Python3.
"I also tried to access bytes only using PyBytes_AS_STRING(wf.str)
and then proceed with the encryption, but it only returned one byte."
Are you sure about this? It looks like it is returning a pointer to the bytes (a char *).
In C, a pointer to an array is a pointer to the location of the first element in the array. If you add an offset equal to the size of the data you are accessing (in this case, 1 byte), then you should be able to access the following element.
The issue is likely that you need some method to determine the size of your byte array; then you can operate on the pointer you've already obtained and iterate through each byte.
I know that #h0r53 has already answered my question, but I want to post the code anyway in case it is useful to someone.
This was implemented in a function (PyMarshal_WriteObjectToString, used for marshal.dump and marshal.dumps) of my custom version of Python, in the marshal.c file
char *bytes = PyBytes_AS_STRING(wf.str);  /* direct pointer into the bytes object's buffer */
const char key[32] = {162, 10, 190, 161, 209, 110, 69, 181,
                      119, 63, 176, 125, 158, 134, 48, 185,
                      200, 22, 41, 43, 212, 144, 131, 169,
                      158, 182, 8, 220, 200, 232, 231, 126
};
Py_ssize_t n = PyBytes_Size(wf.str);

/* XOR every byte with the repeating 32-byte key */
for (Py_ssize_t i = 0; i < n; i++) {
    bytes[i] = bytes[i] ^ key[i % (sizeof(key) / sizeof(key[0]))];
}

/* Build a fresh bytes object from the transformed buffer, then drop the old reference to avoid a leak */
PyObject *encrypted = PyBytes_FromStringAndSize(bytes, n);
Py_XDECREF(wf.str);
wf.str = encrypted;
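For reference, the same repeating-key XOR transformation expressed in plain Python (just to illustrate the byte-level operation, not part of the C module):

KEY = bytes([162, 10, 190, 161, 209, 110, 69, 181,
             119, 63, 176, 125, 158, 134, 48, 185,
             200, 22, 41, 43, 212, 144, 131, 169,
             158, 182, 8, 220, 200, 232, 231, 126])

def xor_with_key(data):
    # XOR each byte with the key, repeating the key as needed.
    return bytes(b ^ KEY[i % len(KEY)] for i, b in enumerate(data))

# Applying it twice restores the original data, since XOR is its own inverse.
assert xor_with_key(xor_with_key(b'marshalled data')) == b'marshalled data'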
I am using pySerial to read a TTL byte stream. To read two bytes:
CheckSumByte = [ b for b in ser.read(2)]
print( CheckSumByte)
print( type(CheckSumByte))
print( str(len(CheckSumByte)))
print( CheckSumByte[0])
Output:
[202, 87]
<class 'list'>
2
IndexError: list index out of range
I cannot access any elements of CheckSumByte by index (0 or 1). What is wrong?
Here is my code:
while(ReadBufferCount < 1000):
    time.sleep(0.00002)
    InputBuffer = ser.inWaiting()
    if (InputBuffer > 0):
        FirstByte = ser.read(1)
        if ord(FirstByte) == 0xFA:
            while ser.inWaiting() < 21: pass
            IndexByte = ser.read(1)
            SpeedByte = [b for b in ser.read(2)]
            DataByte0 = [b for b in ser.read(4)]
            DataByte1 = [b for b in ser.read(4)]
            DataByte2 = [b for b in ser.read(4)]
            DataByte3 = [b for b in ser.read(4)]
            CheckSumByte = [b for b in ser.read(2)]
            print(CheckSumByte[0])  # Out of Range??
Traceback (most recent call last):
File "<ipython-input-6-5233b0a578b1>", line 1, in <module>
runfile('C:/Users/Blair/Documents/Python/Neato XV-11 Lidar/Serial9.py', wdir='C:/Users/Blair/Documents/Python/Neato XV-11 Lidar')
File "C:\Program Files (x86)\WinPython-32bit-3.4.3.3\python-3.4.3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
execfile(filename, namespace)
File "C:\Program Files (x86)\WinPython-32bit-3.4.3.3\python-3.4.3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)
File "C:/Users/Blair/Documents/Python/Neato XV-11 Lidar/Serial9.py", line 88, in <module>
print( CheckSumByte[0]) #Out of Range??
IndexError: list index out of range
Kenny: Thanks. Even simpler for the two bytes:
CheckSumByte.append(ser.read(1))
CheckSumByte.append(ser.read(1))
That works properly, but it is awkward. The items are of type bytes. How can I add items to the list using a list comprehension? I would like to avoid append because it is slow.
I notice it does not work when the items of CheckSumByte are integers. Does a Python 3 list comprehension require a special form to add the bytes as bytes (rather than converting them to integers)?
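For reference, iterating over a bytes object in Python 3 yields ints, which is why the items show up as integers; to keep each item as a one-byte bytes object you can slice instead of iterating. A minimal sketch (raw is just an illustrative name):

raw = ser.read(2)                                   # bytes object, e.g. b'\xca\x57'
as_ints = [b for b in raw]                          # [202, 87] -- ints, normal in Python 3
as_bytes = [raw[i:i+1] for i in range(len(raw))]    # [b'\xca', b'W'] -- one-byte bytes objects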
According to your most recent comment you have constructed ser as:
ser = serial.Serial(
    port=PortName, baudrate=115200, parity=serial.PARITY_NONE,
    stopbits=serial.STOPBITS_ONE, bytesize=serial.EIGHTBITS,
    timeout=0)
According to the documentation this means that ser is non-blocking (despite your assertion it is blocking!).
As it is in non-blocking mode there is absolutely no reason to expect ser.read(n) to return exactly n bytes. Instead, if you wish to read n bytes you should either:
construct ser as blocking (use timeout=None) in the constructor; or
loop while monitoring the number of bytes actually read (just as you would when reading a network socket)
The latter for example means that if you wish to read n bytes you will need to do something like:
def read_exactly(ser, n):
    bytes = b""
    while len(bytes) < n:
        bytes += ser.read(n - len(bytes))
    return bytes
In your particular case, you seem to be monitoring the input buffer to ensure there is adequate data for the following reads. But this monitoring is only occurring some of the time, not all of the time. Thus when FirstByte != 0xFA you could exhaust the read buffer unless you take one of the approaches given above.
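For example, with the helper above, reading the checksum becomes (ser being the already-opened port):

CheckSumByte = [b for b in read_exactly(ser, 2)]  # always yields exactly two values
print(CheckSumByte[0], CheckSumByte[1])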
So basically I am trying to read in the information from a wave file so that I can take the byte information and create an array of time->amplitude points.
import wave

class WaveFile:
    # `filename` is the name of the wav file to open
    def __init__(self, fileName):
        self.wf = wave.open(fileName, 'r')
        self.soundBytes = self.wf.readframes(-1)
        self.timeAmplitudeArray = self.__calcTimeAmplitudeArray()

    def __calcTimeAmplitudeArray(self):
        self.internalTimeAmpList = []  # zero out the internal representation
        byteList = self.soundBytes
        if((byteList[i+1] & 0x080) == 0):
            amp = (byteList[i] & 0x0FF) + byteList[i+1] << 8
        # more code continues.....
Error:
if((int(byteList[i+1]) & 0x080) == 0):
TypeError: unsupported operand type(s) for &: 'str' and 'int'
I have tried using int() to convert to an integer type, but to no avail. I come from a Java background where this would be done using the byte type, but that does not appear to be a language feature of Python. Any direction would be appreciated.
Your problem comes from the fact that the wave library is just giving you raw binary data (in the form of a string).
You'll probably need to check the form of the data with self.wf.getparams(). This returns (nchannels, sampwidth, framerate, nframes, comptype, compname). If you do have 1 channel, a sample width of 2, and no compression (fairly common type of wave), you can use the following (import numpy as np) to get the data:
byteList = np.fromstring(self.soundBytes,'<h')
This returns a numpy array with the data. You don't need to loop. You'll need something different in the second parameter if you have a different sample width. I've tested this with a simple .wav file and plot(byteList); show() (pylab mode in iPython) worked.
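Since the goal is time->amplitude points, you can then pair each sample with its time by dividing the sample index by the frame rate. A minimal sketch, assuming a single channel and that the frame rate comes from getparams():

framerate = self.wf.getparams()[2]                      # frames (samples) per second
times = np.arange(len(byteList)) / float(framerate)     # time of each sample in seconds
timeAmplitude = list(zip(times, byteList))              # [(t0, a0), (t1, a1), ...]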
See Reading *.wav files in Python for other methods to do this.
Numpyless version
If you need to avoid numpy, you can do:
import array

byteList = array.array('h')
byteList.fromstring(self.soundBytes)
This works like before (tested with plot(byteList); show()). 'h' means signed short. len, etc. work. This does load the whole wav file at once, but then again .wav files are usually small. Not always.
I usually use the array module for this, with the fromstring method.
My standard pattern for operating on chunks of data is this:
def bytesfromfile(f):
    while True:
        raw = array.array('B')
        raw.fromstring(f.read(8192))
        if not raw:
            break
        yield raw

with open(f_in, 'rb') as fd_in:
    for byte in bytesfromfile(fd_in):
        # do stuff
Above 'B' denotes unsigned char, i.e. 1-byte.
If the file isn't huge, then you can just slurp it:
In [8]: f = open('foreman_cif_frame_0.yuv', 'rb')
In [9]: raw = array.array('B')
In [10]: raw.fromstring(f.read())
In [11]: raw[0:10]
Out[11]: array('B', [10, 40, 201, 255, 247, 254, 254, 254, 254, 254])
In [12]: len(raw)
Out[12]: 152064
Guido can't be wrong...
If you instead prefer numpy, I tend to use:
fd_i = open('file.bin', 'rb')
fd_o = open('out.bin', 'wb')

while True:
    # Read as uint8
    chunk = np.fromfile(fd_i, dtype=np.uint8, count=8192)
    # use int for calculations since uint wraps
    chunk = chunk.astype(np.int)
    if not chunk.any():
        break
    # do some calculations
    data = ...
    # convert back to uint8 prior to writing.
    data = data.astype(np.uint8)
    data.tofile(fd_o)

fd_i.close()
fd_o.close()
or to read the whole file:
In [18]: import numpy as np
In [19]: f = open('foreman_cif_frame_0.yuv', 'rb')
In [20]: data = np.fromfile(f, dtype=np.uint8)
In [21]: data[0:10]
Out[21]: array([ 10, 40, 201, 255, 247, 254, 254, 254, 254, 254], dtype=uint8)
I have the following dict which I want to write to a file in binary:
data = {(7, 190, 0): {0: 0, 1: 101, 2: 7, 3: 0, 4: 0},
(7, 189, 0): {0: 10, 1: 132, 2: 17, 3: 20, 4: 40}}
I went ahead and used the struct module in this way:
packed = []
for ssd, add_val in data.iteritems():
    # trying to use 0xcafe as a marker to tell me where to grab the keys
    pack_ssd = struct.pack('HBHB', 0xcafe, *ssd)
    packed.append(pack_ssd)
    for add, val in data[ssd].iteritems():
        pack_add_val = struct.pack('HH', add, val)
        packed.append(pack_add_val)
The output of this is packed = ['\xfe\xca\x07\x00\xbe\x00\x00', '\x00\x00\x00\x00', '\x01\x00e\x00', '\x02\x00\x07\x00', '\x03\x00\x00\x00', '\x04\x00\x00\x00', '\xfe\xca\x07\x00\xbd\x00\x00', '\x00\x00\n\x00', '\x01\x00\x84\x00', '\x02\x00\x11\x00', '\x03\x00\x14\x00', '\x04\x00(\x00']
After which I write this as a binary file :
ifile = open('test.bin', 'wb')
for pack in packed:
    ifile.write(pack)
Here is what the binary file looks like:
'\xfe\xca\x07\x00\xbe\x00\x00\x00\x00\x00\x00\x01\x00e\x00\x02\x00\x07\x00\x03\x00\x00\x00\x04\x00\x00\x00\xfe\xca\x07\x00\xbd\x00\x00\x00\x00\n\x00\x01\x00\x84\x00\x02\x00\x11\x00\x03\x00\x14\x00\x04\x00(\x00'
It's all OK until I try to unpack the data. Now I want to read the contents of the binary file and arrange it back to how my dict looked in the first place. This is how I tried to unpack it, but I was always getting an error:
unpack = []
while True:
    chunk = ifile.read(log_size)
    if len(chunk) == log_size:
        str = struct.unpack('HBHB', chunk)
        unpack.append(str)
    chunk = ifile.read(log1_size)
    str = struct.unpack('HH', chunk)
    unpack.append(str)
Traceback (most recent call last):
File "<interactive input>", line 7, in ?
error: unpack str size does not match format
I realize the method I used to unpack will always run into problems, but I can't seem to find a good way of unpacking the contents of the binary file. Any help is much appreciated.
If you need to write something custom, I would suggest doing the following:
1) 64-bit integer: number of keys
2) 64-bit integer * 3 * number of keys: key tuple data
for i in number of keys:
3i) 64-bit integer: number of keys for dictionary i
4i) 64-bit integer * 2 * number of keys for i: key data, value data, key data, value data...
After that, just make sure you read and write with the same endianness, and that specifying an invalid length at any point (too high or too low) doesn't crash your program, and you are good.
The idea is that at any state the unpacker is either expecting a length or expecting to read data of a known type, so it is 100% unambiguous where everything starts and ends as long as you follow the format.
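A minimal sketch of that layout with the struct module (the big-endian signed 64-bit '>q' format and the helper names pack_data/unpack_data are illustrative assumptions, not a fixed spec):

import struct

def pack_data(data):
    keys = list(data)
    out = struct.pack('>q', len(keys))               # 1) number of keys
    for ssd in keys:                                 # 2) three integers per key tuple
        out += struct.pack('>3q', *ssd)
    for ssd in keys:                                 # 3i) item count, then 4i) key/value pairs
        out += struct.pack('>q', len(data[ssd]))
        for add, val in data[ssd].items():
            out += struct.pack('>2q', add, val)
    return out

def unpack_data(blob):
    offset = 0
    (nkeys,) = struct.unpack_from('>q', blob, offset)
    offset += 8
    keys = []
    for _ in range(nkeys):                           # all key tuples, in order
        keys.append(struct.unpack_from('>3q', blob, offset))
        offset += 24
    result = {}
    for ssd in keys:
        (nitems,) = struct.unpack_from('>q', blob, offset)
        offset += 8
        items = {}
        for _ in range(nitems):
            add, val = struct.unpack_from('>2q', blob, offset)
            offset += 16
            items[add] = val
        result[ssd] = items
    return result

data = {(7, 190, 0): {0: 0, 1: 101, 2: 7, 3: 0, 4: 0},
        (7, 189, 0): {0: 10, 1: 132, 2: 17, 3: 20, 4: 40}}
assert unpack_data(pack_data(data)) == data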