I want to create a struct like this:
import ctypes

class MyStruct(ctypes.Structure):
    _fields_ = [('field1', /* size of 16 bytes */),
                ('field2', /* size of 4 bytes */),
                ('field3', /* size of 8 bytes */)]
What types do I need to write here for fields of these sizes? I want field1 to have a maximum size of 16 bytes, so the required value is written there and all the remaining bytes are zeros (if necessary, up to 16 bytes). The same applies to field2 and field3.
There are two ways. The 16-byte field makes it a bit tricky:
import ctypes

class MyStruct(ctypes.Structure):
    _pack_ = 1
    _fields_ = (('field1', ctypes.c_ubyte * 16),  # C char[16] field1
                ('field2', ctypes.c_uint32),      # C uint32_t field2
                ('field3', ctypes.c_uint64))      # C uint64_t field3

    def __init__(self, a, b, c):
        self.field1[:] = a.to_bytes(16, 'little')  # [:] trick to copy bytes to c_ubyte array
        self.field2 = b
        self.field3 = c

s = MyStruct(0x100, 0x200, 0x300)
print(bytes(s).hex(' '))
with open('out.bin', 'wb') as f:
    f.write(bytes(s))

# OR

import struct

b = struct.pack('<16sLQ', (0x100).to_bytes(16, 'little'), 0x200, 0x300)
print(b.hex(' '))
with open('out.bin', 'wb') as f:
    f.write(b)
Output:
00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 03 00 00 00 00 00 00
00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 03 00 00 00 00 00 00
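To check the layout, the same 28 bytes can be read back with either approach. This is only a minimal sketch: it assumes a little-endian host (ctypes.Structure uses the native byte order, while '<' in the struct format forces little-endian), and the names raw and field1_value are illustrative.

```python
import ctypes
import struct

class MyStruct(ctypes.Structure):
    _pack_ = 1  # no padding between fields
    _fields_ = (('field1', ctypes.c_ubyte * 16),
                ('field2', ctypes.c_uint32),
                ('field3', ctypes.c_uint64))

# The same bytes that the answer writes to out.bin
raw = struct.pack('<16sLQ', (0x100).to_bytes(16, 'little'), 0x200, 0x300)

# Way 1: copy the bytes into a ctypes structure (native byte order assumed little-endian)
s = MyStruct.from_buffer_copy(raw)
field1_value = int.from_bytes(bytes(s.field1), 'little')

# Way 2: unpack with the struct module
f1, f2, f3 = struct.unpack('<16sLQ', raw)

print(ctypes.sizeof(MyStruct), hex(field1_value), hex(s.field2), hex(s.field3))
print(hex(int.from_bytes(f1, 'little')), hex(f2), hex(f3))
```

With _pack_ = 1 the structure has no padding, which is why its size equals the sum of the field sizes.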
Related
I am trying to use f.write(struct.pack()) to write n bytes to a binary file, but I'm not quite sure how to do that. Any example or sample would be helpful.
You don't really explain your exact problem, what you tried, or which error messages you encountered.
The solution should look something like:
with open("filename", "wb") as fout:
    fout.write(struct.pack(format, data, ...))
If you explain exactly what data you want to dump, I can elaborate on the solution.
If your data is just a hex string, then you do not need struct; you can just use decode.
Please refer to the SO question "hexadecimal string to byte array in python".
Example for Python 2.7:
hex_str = "414243444500ff"
bytestring = hex_str.decode("hex")
with open("filename", "wb") as fout:
    fout.write(bytestring)
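In Python 3, str no longer has a "hex" decode codec, so the snippet above only works on Python 2. A rough Python 3 equivalent, assuming the same hex string, is bytes.fromhex; the temporary directory and the file name filename.bin are just placeholders for this sketch.

```python
import os
import tempfile

hex_str = "414243444500ff"
bytestring = bytes.fromhex(hex_str)  # Python 3 replacement for hex_str.decode("hex")

# Write to a throwaway location; replace with the real output path
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "filename.bin")
    with open(path, "wb") as fout:
        fout.write(bytestring)
    size_on_disk = os.path.getsize(path)

print(bytestring, size_on_disk)
```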
The below worked for me:
reserved = "Reserved_48_Bytes"
f.write(struct.pack("48s", reserved))
Output:
hexdump -C output.bin
00000030 52 65 73 65 72 76 65 64 5f 34 38 5f 42 79 74 65 |Reserved_48_Byte|
00000040 73 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |s...............|
00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
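The snippet above passes a str to struct.pack, which works on Python 2. On Python 3 the "s" format needs a bytes object, so a hedged equivalent would encode the string first (ASCII is assumed here). The "48s" format pads shorter values with null bytes up to 48 bytes and truncates longer ones, which matches the zero padding in the hexdump.

```python
import struct

reserved = "Reserved_48_Bytes"
# "48s" always yields exactly 48 bytes: the value followed by null padding
packed = struct.pack("48s", reserved.encode("ascii"))

print(len(packed))
print(packed.rstrip(b"\x00"))
```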
I came across the following example of creating an Internet Checksum:
Take the example IP header 45 00 00 54 41 e0 40 00 40 01 00 00 0a 00 00 04 0a 00 00 05:
Adding the fields together yields the two’s complement sum 01 1b 3e.
Then, to convert it to one’s complement, the carry-over bits are added to the first 16-bits: 1b 3e + 01 = 1b 3f.
Finally, the one’s complement of the sum is taken, resulting in the checksum value e4c0.
I was wondering how the IP header is added together to get 01 1b 3e?
Split your IP header into 16-bit parts.
45 00
00 54
41 e0
40 00
40 01
00 00
0a 00
00 04
0a 00
00 05
The sum is 01 1b 3e. For how packet header checksums are calculated, see https://en.m.wikipedia.org/wiki/IPv4_header_checksum.
The IP header is added together, with carry, as 4-digit hexadecimal numbers (16-bit words),
i.e. the first 3 numbers that are added are 0x4500 + 0x0054 + 0x41e0 + ...
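The three steps from the question (sum the 16-bit words, fold the carry back in, take the one's complement) can be sketched in Python as follows. The function name internet_checksum is made up for this example, and the header bytes are the ones quoted in the question.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """Hypothetical helper: 16-bit one's complement checksum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    # Interpret the data as big-endian 16-bit words and add them up
    words = struct.unpack(">%dH" % (len(data) // 2), data)
    total = sum(words)
    # Fold any carry above 16 bits back into the low 16 bits
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    # One's complement of the folded sum
    return ~total & 0xFFFF

header = bytes.fromhex("45 00 00 54 41 e0 40 00 40 01 00 00 0a 00 00 04 0a 00 00 05")
plain_sum = sum(struct.unpack(">10H", header))

print(hex(plain_sum))                  # 0x11b3e
print(hex(internet_checksum(header)))  # 0xe4c0
```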
I need to write a bitstring, which isn't always a multiple of 8 to a binary file. I also need to successfully read the string from the file again.
There will never be a 0 at the start of the string.
Example string:
bitstring = '10110101111111001101101010011011111010011010110001010101011100010110100010001001110001110100011111010001100011011110010100110000010111101011001011010111111100000110000000001001101000010110000111'
I need to use as little storage as possible. So if the string has length 194 (above) I need the file size to be 194//8 + 1 = 25 bytes, although I'm not sure if it's possible to store a non-integer number of bytes in a binary file.
This is the first time I have used binary so excuse the bad practice.
This is my current solution to write to the file:
with open(filename,"wb+") as f:
    f.write(bytes(list(map(int, bitstring))))
    f.close()
And this to read from it:
string = "".join(list(map(str,np.fromfile(filename,"u1"))))
Using EmEditor, I can see that every digit in the string is stored as a whole byte (shown as a 2-digit value), which is undesirable. I realise that this is probably because I'm splitting the bitstring into individual digits. Here is the above bitstring shown in the binary editor:
01 00 01 01 00 01 00 01 01 01 01 01 01 01 00 00 01 01 00 01 01 00 01 00 01 00 00 01 01 00 01 01
01 01 01 00 01 00 00 01 01 00 01 00 01 01 00 00 00 01 00 01 00 01 00 01 00 01 01 01 00 00 00 01
00 01 01 00 01 00 00 00 01 00 00 00 01 00 00 01 01 01 00 00 00 01 01 01 00 01 00 00 00 01 01 01
01 01 00 01 00 00 00 01 01 00 00 00 01 01 00 01 01 01 01 00 00 01 00 01 00 00 01 01 00 00 00 00
00 01 00 01 01 01 01 00 01 00 01 01 00 00 01 00 01 01 00 01 00 01 01 01 01 01 01 01 00 00 00 00
00 01 01 00 00 00 00 00 00 00 00 00 01 00 00 01 01 00 01 00 00 00 00 01 00 01 01 00 00 00 00 01
01 01
(All numbers are 2 digits, and file size is 194 bytes, which is the amount of binary numbers in the file/string)
I have tried to use bytearray, with the same results.
Thanks a lot in advance.
It can be solved by splitting the string into 8-bit chunks, converting each chunk to an integer, and writing those integers to the file as bytes, like this:
import os
from array import array

def main():
    filename = "test.bin"
    bitstring = '10110101111111001101101010011011111010011010110001010101011100010110100010001001110001110100011111010001100011011110010100110000010111101011001011010111111100000110000000001001101000010110000111'
    print(bitstring)
    # split the string into 8-bit chunks
    splits = [bitstring[x:x + 8] for x in range(0, len(bitstring), 8)]
    print(splits)
    bin_array_in = array('B')
    bin_array_out = array('B')
    # convert bits to int and add to list
    for split in splits:
        bin_array_in.append(int(split, 2))
    print(bin_array_in)
    # dump list to file (the with block closes the file)
    with open(filename, "wb+") as f:
        bin_array_in.tofile(f)
    print("file size: {}".format(os.path.getsize(filename)))
    # get the list from file
    with open(filename, "rb+") as f:
        bin_array_out.fromfile(f, len(bin_array_in))
    print(bin_array_out)
    # convert back to bin and join to one string
    bitstring = ""
    for i in bin_array_out:
        bitstring += "{:08b}".format(i)
    print(bitstring)

if __name__ == '__main__':
    main()
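There is only one problem: the last chunk is shorter than 8 bits, so when it is read back it is padded with zeros to make a full 8-bit chunk.
I don't know whether it is easier to change the read/write logic or to make the length of your initial string a multiple of 8, so I'll let you figure out that part.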
Try this:
>>> bitstring = '10110101111111001101101010011011111010011010110001010101011100010110100010001001110001110100011111010001100011011110010100110000010111101011001011010111111100000110000000001001101000010110000111'
>>> a = int(bitstring, 2)
>>> a
17849302729865679414224788101014796653923247039249910236551L
>>> bin(a)[2:] == bitstring
True
Once you have converted your bitstring into an integer, you may write it to the file the usual way. And read it back. And convert back to the bitstring (see above) and get the original one.
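One possible way to write the integer "the usual way" on Python 3 is int.to_bytes and int.from_bytes. This is only a sketch: write_bits and read_bits are names invented here, and it relies on the question's guarantee that the bitstring never starts with 0 (leading zeros would be lost in the integer conversion).

```python
import os
import tempfile

def write_bits(path, bitstring):
    """Hypothetical helper: store a bitstring in ceil(len / 8) bytes."""
    n = int(bitstring, 2)
    size = (len(bitstring) + 7) // 8  # round up to whole bytes
    with open(path, "wb") as f:
        f.write(n.to_bytes(size, "big"))

def read_bits(path):
    """Hypothetical helper: read the bitstring back (no leading zeros)."""
    with open(path, "rb") as f:
        n = int.from_bytes(f.read(), "big")
    return bin(n)[2:]

bitstring = "11010101001101010011101"  # 23 bits
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bits.bin")
    write_bits(path, bitstring)
    size_on_disk = os.path.getsize(path)
    restored = read_bits(path)

print(size_on_disk, restored == bitstring)
```

The padding zeros end up in front of the number, so converting back to an integer drops them automatically.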
Credit to hiro protagonist, who suggested using bitarray, as it was the simplest implementation I could do (I couldn't mark the comment as an answer). Thanks for the answers, but using bitarray I could solve it in the fewest lines of code.
Here's the solution:
write:
from bitarray import bitarray
bitstring = "11010101001101010011101"
# left-pad with zeros so that the length is a multiple of 8
bitarr = bitarray([0] * (-len(bitstring) % 8) + list(map(int, bitstring)))
with open(filename, "wb+") as f:
    bitarr.tofile(f)
read:
import numpy as np
with open(filename, "rb") as f:
    string = "".join(_reformat_bin(list(map(str, np.fromfile(f, "u1")))))
# strip the leading zero padding (the original string never starts with 0)
while not int(string[0]):
    string = string[1:]
_reformat_bin() converts an array of strings representing numbers 0-255 to an array of strings representing bytes, e.g.
>>> _reformat_bin(["255", "0", "9"])
["11111111", "00000000", "00001001"]
Cheers all.
I'm decoding an AMF0 format file. The data I'm looking for is a timestamp, encoded as an array. [HH, MM, SS].
Since the data is AMF0, I can locate the start of the data by reading in the file as bytes, converting each byte to hex, and looking for the signal 08 00 00 00 03, an array of length 3.
My problem is that I don't know how to decode the 8-byte integer in each element of the array. I have the data in the same, hex-encoded format, e.g.:
08 00 00 00 03 *signals array length 3*
00 01 30 00 00 00 00 00 00 00 00 00 *signals integer*
00 01 31 00 00 00 00 00 00 00 00 00 *signals integer*
00 01 32 00 40 3C 00 00 00 00 00 00 *signals integer*
00 00 09 *signals object end*
This should be decoded as [0, 0, 28] (if minerva is to be believed).
I've been trying to use struct.unpack, but all the examples I see are for 4-byte (little endian) values.
The format specifier you are looking for is ">9xd4xd4xd3x":
>>> import struct
>>> from binascii import unhexlify
>>> struct.unpack(">9xd4xd4xd3x", unhexlify("080000000300013000000000000000000000013100000000000000000000013200403C000000000000000009"))
(0.0, 0.0, 28.0)
Broken down:
>: big endian format
1. 5x: 5 bytes begin-of-array marker + size (ignored)
2. 4x: 4 bytes begin-of-element marker (ignored)
3. d: 1 big endian IEEE-754 double
4. points 2-3 for other 2 elements
5. 3x: 3 bytes end-of-array marker (ignored)
Points 1. and 2. are merged together into 9x.
As you might have noticed, struct can only ignore extra bytes, not validate. If you need more flexibility in the input format, you could use a regex matching begin/end array markers in non-greedy mode.
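If validation is needed, one alternative to a regex is to walk the bytes with struct.unpack_from and check each marker along the way. The sketch below is an assumption-laden illustration, not a full AMF0 parser: it reads the 4 skipped bytes per element as a 2-byte key length, a 1-byte key and a 0x00 number marker, and decode_number_array is a name made up here.

```python
import struct

ARRAY_MARKER = 0x08          # start of the array, followed by a 4-byte count
NUMBER_MARKER = 0x00         # precedes each 8-byte big-endian double
OBJECT_END = b"\x00\x00\x09"

def decode_number_array(data: bytes) -> list:
    """Hypothetical helper: decode an array whose values are all numbers."""
    marker, count = struct.unpack_from(">BI", data, 0)
    if marker != ARRAY_MARKER:
        raise ValueError("not an array")
    offset = 5
    values = []
    for _ in range(count):
        (key_len,) = struct.unpack_from(">H", data, offset)
        offset += 2 + key_len  # skip the key length and the key itself
        if data[offset] != NUMBER_MARKER:
            raise ValueError("element is not a number")
        (value,) = struct.unpack_from(">d", data, offset + 1)
        values.append(value)
        offset += 9  # 1 marker byte + 8 bytes of double
    if data[offset:offset + 3] != OBJECT_END:
        raise ValueError("missing object-end marker")
    return values

data = bytes.fromhex(
    "08 00 00 00 03 "
    "00 01 30 00 00 00 00 00 00 00 00 00 "
    "00 01 31 00 00 00 00 00 00 00 00 00 "
    "00 01 32 00 40 3C 00 00 00 00 00 00 "
    "00 00 09"
)
print(decode_number_array(data))  # [0.0, 0.0, 28.0]
```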
To decode floats, use the struct module (Python 2 syntax):
>>> struct.unpack('>d','403C000000000000'.decode('hex'))[0]
28.0
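The 'hex' codec on str exists only in Python 2. On Python 3, the same decoding can be sketched with bytes.fromhex:

```python
import struct

# 8 bytes taken from the third array element in the question
raw = bytes.fromhex("403C000000000000")
value = struct.unpack(">d", raw)[0]  # big-endian IEEE-754 double
print(value)  # 28.0
```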
I am writing a program to parse a WAV file header and print the information to the screen. Before writing the program, I am doing some research:
hexdump -n 48 sound_file_8000hz.wav
00000000 52 49 46 46 bc af 01 00 57 41 56 45 66 6d 74 20 |RIFF....WAVEfmt |
00000010 10 00 00 00 01 00 01 00 >40 1f 00 00< 40 1f 00 00 |........@...@...|
00000020 01 00 08 00 64 61 74 61 98 af 01 00 81 80 81 80 |....data........|
hexdump -n 48 sound_file_44100hz.wav
00000000 52 49 46 46 c4 ea 1a 00 57 41 56 45 66 6d 74 20 |RIFF....WAVEfmt |
00000010 10 00 00 00 01 00 02 00 >44 ac 00 00< 10 b1 02 00 |........D.......|
00000020 04 00 10 00 64 61 74 61 a0 ea 1a 00 00 00 00 00 |....data........|
The part between > and < in both files is the sample rate.
How does "40 1f 00 00" translate to 8000Hz and "44 ac 00 00" to 44100Hz? Information like the number of channels and the audio format can be read directly from the dump. I found a Python script called WavHeader that parses the sample rate correctly in both files. This is the core of the script:
bufHeader = fileIn.read(38)
# Verify that the correct identifiers are present
if (bufHeader[0:4] != "RIFF") or \
   (bufHeader[12:16] != "fmt "):
    logging.debug("Input file not a standard WAV file")
    return
# endif
stHeaderFields = {'ChunkSize' : 0, 'Format' : '',
                  'Subchunk1Size' : 0, 'AudioFormat' : 0,
                  'NumChannels' : 0, 'SampleRate' : 0,
                  'ByteRate' : 0, 'BlockAlign' : 0,
                  'BitsPerSample' : 0, 'Filename': ''}
# Parse fields
stHeaderFields['ChunkSize'] = struct.unpack('<L', bufHeader[4:8])[0]
stHeaderFields['Format'] = bufHeader[8:12]
stHeaderFields['Subchunk1Size'] = struct.unpack('<L', bufHeader[16:20])[0]
stHeaderFields['AudioFormat'] = struct.unpack('<H', bufHeader[20:22])[0]
stHeaderFields['NumChannels'] = struct.unpack('<H', bufHeader[22:24])[0]
stHeaderFields['SampleRate'] = struct.unpack('<L', bufHeader[24:28])[0]
stHeaderFields['ByteRate'] = struct.unpack('<L', bufHeader[28:32])[0]
stHeaderFields['BlockAlign'] = struct.unpack('<H', bufHeader[32:34])[0]
stHeaderFields['BitsPerSample'] = struct.unpack('<H', bufHeader[34:36])[0]
I do not understand how this can extract the correct sample rates when I cannot read them from the hexdump.
I am using information about the WAV file format from this page:
https://ccrma.stanford.edu/courses/422/projects/WaveFormat/
The "40 1F 00 00" bytes equate to an integer whose hexadecimal value is 00001F40 (remember that the integers are stored in a WAVE file in the little endian format). A value of 00001F40 in hexadecimal equates to a decimal value of 8000.
Similarly, the "44 AC 00 00" bytes equate to an integer whose hexadecimal value is 0000AC44. A value of 0000AC44 in hexadecimal equates to a decimal value of 44100.
They're little-endian.
>>> 0x00001f40
8000
>>> 0x0000ac44
44100
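The same conversion can be done directly on the bytes from the dump. The sketch below is written for Python 3 and uses the first 36 bytes of the 8000 Hz file as shown in the question; the single combined struct format string is an illustrative choice, not how the WavHeader script does it.

```python
import struct

# First 36 bytes of sound_file_8000hz.wav, copied from the hexdump
header = bytes.fromhex(
    "52 49 46 46 bc af 01 00 57 41 56 45 66 6d 74 20 "
    "10 00 00 00 01 00 01 00 40 1f 00 00 40 1f 00 00 "
    "01 00 08 00"
)

# Little-endian: the least significant byte comes first
rate_8k = int.from_bytes(bytes.fromhex("40 1f 00 00"), "little")
rate_44k = int.from_bytes(bytes.fromhex("44 ac 00 00"), "little")

# All header fields in one call; "<" selects little-endian without padding
(riff, chunk_size, wave, fmt, subchunk1_size, audio_format,
 num_channels, sample_rate, byte_rate, block_align,
 bits_per_sample) = struct.unpack("<4sI4s4sIHHIIHH", header)

print(rate_8k, rate_44k)                            # 8000 44100
print(num_channels, sample_rate, bits_per_sample)   # 1 8000 8
```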