I'm trying to write hex data taken from ascii file to a newly created binary file
ascii file example:
98 af b7 93 bb 03 bf 8e ae 16 bf 2e 52 43 8b df
4f 4e 5a e4 26 3f ca f7 b1 ab 93 4f 20 bf 0a bf
82 2c dd c5 38 70 17 a0 00 fd 3b fe 3d 53 fc 3b
28 c1 ff 9e a9 28 29 c1 94 d4 54 d4 d4 ff 7b 40
my code
hexList = []
with open('hexFile.txt', 'r') as hexData:
line=hexData.readline()
while line != '':
line = line.rstrip()
lineHex = line.split(' ')
for i in lineHex:
hexList.append(int(i, 16))
line = hexData.readline()
with open('test', 'wb') as f:
for i in hexList:
f.write(hex(i))
Thought hexList holds already hex converted data and f.write(hex(i)) should write these hex data into a file, but python writes it with ascii mode
final output: 0x9f0x2c0x380x590xcd0x110x7c0x590xc90x30xea0x37 which is wrong!
where is the issue?
Use binascii.unhexlify:
>>> import binascii
>>> binascii.unhexlify('9f')
'\x9f'
>>> hex(int('9f', 16))
'0x9f'
import binascii
with open('hexFile.txt') as f, open('test', 'wb') as fout:
for line in f:
fout.write(
binascii.unhexlify(''.join(line.split()))
)
replace:
f.write(hex(i))
With:
f.write(chr(i)) # python 2
Or,
f.write(bytes((i,))) # python 3
Explanation
Observe:
>>> hex(65)
'0x41'
65 should translate into a single byte but hex returns a four character string. write will send all four characters to the file.
By contrast, in python2:
>>> chr(65)
'A'
This does what you want: chr converts the number 65 to the character single-byte string which is what belongs in a binary file.
In python3, chr(i) is replaced by bytes((i,)).
Related
I have been trying to read the contents of the genesis.block given in this file of the Node SDK in Hyperledger Fabric using Python. However, whenever I try to read the file with Python by using
data = open("twoorgs.genesis.block").read()
The value of the data variable is as follows:
>>> data
'\n'
With nodejs using fs.readFileSync() I obtain an instance of Buffer() for the same file.
var data = fs.readFileSync('./twoorgs.genesis.block');
The result is
> data
<Buffer 0a 22 1a 20 49 63 63 ac 9c 9f 3e 48 2c 2c 6b 48 2b 1f 8b 18 6f a9 db ac 45 07 29 ee c0 bf ac 34 99 9e c2 56 12 e1 84 01 0a dd 84 01 0a d9 84 01 0a 79 ... >
How can I read this file successfully using Python?
You file has a 1a in it. This is Ctrl-Z, which is an end of file on Windows.
So try binary mode like:
data = open("twoorgs.genesis.block", 'rb').read()
So I'm working with incoming audio from Watson Text to Speech. I want to play the sound immediately when data arrives to Python with a websocket from nodeJS.
This is a example of data I'm sending with the websocket:
<Buffer e3 f8 28 f9 fa f9 5d fb 6c fc a6 fd 12 ff b3 00 b8 02 93 04 42 06 5b 07 e4 07 af 08 18 0a 95 0b 01 0d a2 0e a4 10 d7 12 f4 12 84 12 39 13 b0 12 3b 13 ... >
So the data arrives as a hex bytestream and I try to convert it to something that Sounddevice can read/play. (See documentation: The types 'float32', 'int32', 'int16', 'int8' and 'uint8' can be used for all streams and functions.) But how can I convert this?
I already tried something, but when I run my code I only hear some noise, nothing recognizable.
Here you can read some parts of my code:
def onMessage(self, payload, isBinary):
a = payload.encode('hex')
queue.put(a)
After I receive the bytesstream and convert to hex, I try to send the incoming bytestream to Sounddevice:
def stream_audio():
with sd.OutputStream(channels=1, samplerate=24000, dtype='int16', callback=callback):
sd.sleep(int(20 * 1000))
def callback(outdata, frames, time, status):
global reststuff, i, string
LENGTH = frames
while len(reststuff) < LENGTH:
a = queue.get()
reststuff += a
returnstring = reststuff[:LENGTH]
reststuff = reststuff[LENGTH:]
for char in returnstring:
i += 1
string += char
if i % 2 == 0:
print string
outdata[:] = int(string, 16)
string = ""
look at your stream of data:
e3 f8 28 f9 fa f9 5d fb 6c fc a6 fd 12 ff b3 00
b8 02 93 04 42 06 5b 07 e4 07 af 08 18 0a 95 0b
01 0d a2 0e a4 10 d7 12 f4 12 84 12 39 13 b0 12
3b 13
you see here that every two bytes the second one is starting with e/f/0/1 which means near zero (in two's complement).
So that's your most significant bytes, so your stream is little-endian!
you should consider that in your conversion.
If I have more data I would have tested but this is worth some miliseconds!
This question already has answers here:
Tools to help reverse engineer binary file formats
(9 answers)
Closed 6 years ago.
I have a file here. To me it appears it is a binary file. This is raw file and I believe that it has the stock information in OHLCV (Open, High, Low, Close, Volume). Besides it may also have some text.
One of the entries that I could possibly have for OHLCV is
464.95, 468.3, 460, 465.65, 3957854
This is the code that I have tried. I dont fully understand about ASCII and Unicode.
input_file = "00063181.dat" # tata motors
with open(input_file, "rb") as fh:
buf = fh.read()
output_l = list(map(int , buf))
print (output_l)
My Doubt: How do I decode this file and make sense out of it? Is there any way for me to read this file through a program written in python and separate the text from int/float? I am using Python 3 and Win 10 64 bit.
You're looking to reverse engineer the structure of a binary file using Python. Since you've declared that the file is binary, it may prove difficult. You're going to need to examine the contents of the file and use your best intuition to try to infer the structure. The first thing you're going to want is a way to display each of the bytes of the file an a way that will help you understand the meaning.
Fortunately, someone has already written a tool to do this, hexdump. Install that package using pip.
The function you need from that package is hexdump, so let's import it the package and get help on the function.
>>> import hexdump
>>> help(hexdump.hexdump)
Help on function hexdump in module hexdump:
hexdump(data, result='print')
Transform binary data to the hex dump text format:
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[x] data argument as a binary string
[x] data argument as a file like object
Returns result depending on the `result` argument:
'print' - prints line by line
'return' - returns single string
'generator' - returns generator that produces lines
Now you can start to explore the contents of your file. Use the slice operator to do it in chunks. For example, to render the contents of the first 1KB of your file:
>>> hexdump.hexdump(buf[:1024])
00000000: C3 8E C2 8F 22 13 C2 AA 66 2A 22 47 C3 94 C3 AA ...."...f*"G....
00000010: C3 89 C3 A0 C3 B1 C3 91 6A C2 A4 C3 BF 3C C2 AA ........j....<..
00000020: C2 91 73 C3 85 46 57 47 C2 88 C3 99 C2 B6 3E 2D ..s..FWG......>-
00000030: C3 BA 69 10 C2 93 C3 94 38 C3 81 7A 6A 43 30 7C ..i.....8..zjC0|
00000040: C3 BB C2 AA 01 2D C2 97 C3 83 C3 88 64 14 C3 9C .....-......d...
00000050: C2 AB C2 AA C3 A2 74 C2 85 5D C3 97 4E 64 68 C3 ......t..]..Ndh.
...
000003C0: 42 C2 8F 06 7F 12 33 7F 79 1E 2C 2A 0F C3 92 36 B.....3.y.,*...6
000003D0: C3 A6 C2 96 C2 93 C2 8B 43 C2 9F 4C C2 95 48 24 ........C..L..H$
000003E0: C2 B3 C2 82 26 C3 88 C3 BD C3 96 12 1E 5E 18 2E ....&........^..
000003F0: 37 C3 A7 C2 87 C3 AE 00 4F 3F C2 9C C3 A8 1C C2 7.......O?......
Hexdump has a nice property of rendering the byte position, the hex code, and then (if possible) the printable form of the character on the right.
Hopefully some of your text values will be visible there and that will give some clue as to how to reverse engineer your file.
Once you've started to determine how your file is structured, you can use the various string operators to manipulate your data. For example, if you find that your file is split into sections by the null byte (b'\x00'), you can get those sections thus:
>>> sections = buf.split(b'\x00')
There are a lot of things that you're likely to have to learn as you dig deeper, like character encodings, number encodings (including little-endian for integers and floating-point encoding for floating point numbers). You'll want to find some way to externally validate your results.
Best of luck.
skip = 2
with open("text.txt") as f:
for line in f:
if line.startswith("blah"):
next(f)
lst = [next(f) for i in range(skip)]
lst = [el.replace('\n', '') for el in lst]
for i in range(len(lst)/2):
lst.insert(i,lst.pop(i) + lst.pop(i))
lst =[int(i,16) for i in lst]
print lst
i got:
lst =[int(i,16) for i in lst]
ValueError: invalid literal for int() with base 16: 'ff 55 00 90 00 92 00 ad 00 c6 00 b7 00 8d 00 98 00 87 00 8a 00 98 00 8f 00 ca 01 78 03 54 05 bf'
any idea how to get a int hex?
lst =[int(val,16) for line in lst for val in line.split()]
I think will do what you want ... although its hard to tell for sure
I have a hex file which appears as below:-
00000000 AA AA 11 FF EC FF E7 3E FA DA D8 78 39 75 89 4E
00000010 FD FD BF E5 FF DD FF AA E9 78 67 84 90 E4 87 83
00000020 9F E7 80 FD FE 73 75 78 93 47 58 93 EE 33 33 3F
I want to read 3rd and 4th byte. Swap these two bytes and save them in a variable. For e.g, i want to save 0xFF11 (after byteswap) in variable "num"
This is what i tried:
I read these two bytes one by one
data=open('xyz.bin','rb').read()
num1=data[2]
num2=data[3]
num1,num2=num2,num1
num= num1*100+num2
print(num)
Now the problem is num variable has integer value and i have no idea how to get hex into it.
I am stuck here and not able to proceed further. Any help would be welcomed.
PS: I am very new to python.
import struct
with open("xyz.bin", "rb") as f:
f.seek(2)
num, = struct.unpack("<H", f.read(2))
print "little endian:", hex(num), num # little endian: 0xff11 65297
In Python 3, you could create an integer directly from bytes:
with open('xyz.bin','rb') as file:
file.seek(2)
num = int.from_bytes(file.read(2), 'little')
print(hex(num), num) # -> 0xff11 65297
First, you would have to multiply num1 by 256, of course, not 100 (you could write decimal 256 as 0x100, though, if that helps make your intention clearer).
Second, to format an integer as a hex number, use
print("{:x}".format(num))
For example:
>>> num1 = 0xff
>>> num2 = 0xab
>>> num = num1*256 + num2
>>> print("{:x}".format(num))
ffab
You may be interested in some/all of the following operations which abstract away all of the bitwise math that you'd otherwise have to do.
import struct
line = '00000000 AA AA 11 FF EC FF E7 3E FA DA D8 78 39 75 89 4E'.split()
bytearray(int(x,16) for x in line[3:5])
Out[42]: bytearray(b'\x11\xff')
struct.unpack('H',bytearray(int(x,16) for x in line[3:5]))
Out[43]: (65297,)
hex(65297)
Out[44]: '0xff11'
packed_line = bytearray(int(x,16) for x in line[1:])
struct.unpack('{}H'.format(len(packed_line)/2),packed_line)
Out[47]: (43690, 65297, 65516, 16103, 56058, 30936, 30009, 20105)
The Best way of doing it is by using struct module.