I need to read a file bit by bit in python [duplicate] - python

I would like an output like 01100011010... or [False,False,True,True,False,...] from a file to then create an encrypted file.
I've already tried byte = file.read(1), but I don't know how to convert the result to bits.

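You can read a file in binary mode this way:
with open(r'path\to\file.ext', 'rb') as binary_file:
    data = binary_file.read()
The 'rb' option stands for "read binary". Iterating over the resulting bytes object yields one integer (0 to 255) per byte, not individual bits, so each byte still has to be formatted as eight binary digits.
For the output you asked for you can do something like this:
print(''.join(format(byte, '08b') for byte in data))
which is the generator-expression equivalent of:
for byte in data:
    print(format(byte, '08b'), end='')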
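Since iterating over data read in 'rb' mode gives you integers rather than binary strings, you can use the built-in function bin to convert them. bin() adds a '0b' prefix and drops leading zeros, so slice off the prefix and pad each byte to eight digits:
with open(r'D:\help\help.txt', 'rb') as file:
    for byte in file.read():
        print(bin(byte)[2:].zfill(8), end='')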
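As a minimal sketch of both output forms asked for in the question (a string of 0s and 1s and a list of booleans), the following uses two hypothetical helpers, file_to_bits and bits_to_bools, which are illustrative names and not part of any library. A temporary file stands in for the real input.

```python
import os
import tempfile


def file_to_bits(path):
    """Return the file's contents as a string of '0' and '1' characters."""
    with open(path, "rb") as f:
        data = f.read()
    # Each byte becomes exactly 8 binary digits, most significant bit first.
    return "".join(format(byte, "08b") for byte in data)


def bits_to_bools(bit_string):
    """Turn '0110...' into [False, True, True, False, ...]."""
    return [ch == "1" for ch in bit_string]


# Demo with a throwaway file containing the single byte b"c" (0x63).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"c")
bits = file_to_bits(tmp.name)
os.remove(tmp.name)

print(bits)                 # 01100011
print(bits_to_bools(bits))
```

Reading the whole file at once is fine for small inputs; for large files it may be preferable to read in chunks.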

Suppose we have a text file, text.txt:
# Read the content:
with open("text.txt") as file:
    content = file.read()
# Convert the string to binary using join() + bytearray() + format()
res = ''.join(format(i, '08b') for i in bytearray(content, encoding='utf-8'))
# Print the result
print("The string after binary conversion: " + res)
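Since the goal is to build another file from the bits afterwards, it may help to see the reverse direction too. This is only a sketch: bits_to_bytes is a hypothetical helper name, and it assumes the bit string length is a multiple of 8.

```python
def bits_to_bytes(bit_string):
    """Pack a string of '0'/'1' characters into a bytes object."""
    if len(bit_string) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    # Take 8 characters at a time and parse each group as a base-2 integer.
    return bytes(
        int(bit_string[i:i + 8], 2) for i in range(0, len(bit_string), 8)
    )


text = "hi"
bit_string = "".join(format(b, "08b") for b in text.encode("utf-8"))
restored = bits_to_bytes(bit_string).decode("utf-8")

print(bit_string)  # 0110100001101001
print(restored)    # hi
```

The resulting bytes object can be written to a file opened in 'wb' mode.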

Related

Python 3.10 Binary splitting script(inconsistent output)

I need to split a .bin file into chunks. However, I seem to face a problem when it comes to writing the output in the split/new binary file. The output is inconsistent, I can see the data, but there are shifts and gaps everywhere when comparing the split binary with the bigger original one.
def hash_file(filename: str, blocksize: int = 4096) -> str:
    blocksCount = 0
    with open(filename, "rb") as f:
        while True:
            #Read a new chunk from the binary file
            full_string = f.read(blocksize)
            if not full_string:
                break
            new_string = ' '.join('{:02x}'.format(b) for b in full_string)
            split_string = ''.join(chr(int(i, 16)) for i in new_string.split())
            #Append the split chunk to the new binary file
            newf = open("SplitBin.bin","a", encoding="utf-8")
            newf.write(split_string)
            newf.close()
            #Check if the desired number of mem blocks has been reached
            blocksCount = blocksCount + 1
            if blocksCount == 1:
                break
For characters with ordinals between 0 and 0x7f, their UTF-8 representation will be the same as their byte value. But for characters with ordinals between 0x80 and 0xff, UTF-8 will output two bytes neither of which will be the same as the input. That's why you're seeing inconsistencies.
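A small illustration of that effect, using four arbitrarily chosen sample bytes: the two values below 0x80 survive the round trip through chr() and UTF-8 unchanged, while 0x80 and 0xff each turn into two different bytes.

```python
raw = bytes([0x41, 0x7F, 0x80, 0xFF])

# Same transformation as in the question: bytes -> characters -> UTF-8 text.
as_text = "".join(chr(b) for b in raw)
encoded = as_text.encode("utf-8")

print(raw)      # b'A\x7f\x80\xff'
print(encoded)  # b'A\x7f\xc2\x80\xc3\xbf'
print(len(raw), len(encoded))  # 4 6
```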
The easiest way to fix it would be to open the output file in binary mode as well. Then you can eliminate all the formatting and splitting, because you can directly write the data you just read:
with open("SplitBin.bin", "ab") as newf:
    newf.write(full_string)
Note that reopening the file each time you write to it will be very slow. Better to leave it open until you're done.
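Putting both suggestions together, a sketch might look like the following. The function name copy_first_blocks and its max_blocks parameter are made up for illustration, the output is opened with "wb" (overwrite) rather than "ab" (append), and the demo uses temporary files instead of the asker's real .bin file.

```python
import os
import tempfile


def copy_first_blocks(src_path, dst_path, blocksize=4096, max_blocks=1):
    """Copy up to max_blocks blocks of blocksize bytes from src to dst."""
    blocks_written = 0
    # Both files are opened once, in binary mode, and stay open for the loop.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while blocks_written < max_blocks:
            chunk = src.read(blocksize)
            if not chunk:
                break
            dst.write(chunk)  # raw bytes, no text encoding involved
            blocks_written += 1
    return blocks_written


with tempfile.TemporaryDirectory() as tmpdir:
    src = os.path.join(tmpdir, "big.bin")
    dst = os.path.join(tmpdir, "SplitBin.bin")
    with open(src, "wb") as f:
        f.write(bytes(range(256)) * 40)  # 10240 bytes of sample data
    written = copy_first_blocks(src, dst, blocksize=4096, max_blocks=2)
    with open(src, "rb") as f:
        original = f.read()
    with open(dst, "rb") as f:
        copied = f.read()

print(written, len(copied))  # 2 8192
```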

Create text file of hexadecimal from binary

I would like to convert a binary to hexadecimal in a certain format and save it as a text file.
The end product should be something like this:
"\x7f\xe8\x89\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52"
Input is from an executable file "a".
This is my current code:
with open('a', 'rb') as f:
    byte = f.read(1)
    hexbyte = '\\x%02s' % byte
    print hexbyte
A few issues with this:
This only prints the first byte.
The result is "\x" followed by a box glyph showing the digits 00 over 7f. It looks exactly the same in the terminal.
Why is this so? And finally, how do I save all the hexadecimals to a text file to get the end product shown above?
EDIT: Able to save the file as text with
txt = open('out.txt', 'w')
print >> txt, hexbyte
txt.close()
You can't inject numbers into escape sequences like that. Escape sequences are essentially constants that are resolved when the string literal is parsed, so they can't have dynamic parts.
There's already a module for this, anyway:
from binascii import hexlify
with open('test', 'rb') as f:
    print(hexlify(f.read()).decode('utf-8'))
Just use the hexlify function on a byte string and it'll give you a hex byte string. You need the decode to convert it back into an ordinary string.
Not quite sure if decode works in Python 2, but you really should be using Python 3, anyway.
Your output looks like a representation of a bytestring in Python returned by repr():
with open('input_file', 'rb') as file:
    print repr(file.read())
Note: some bytes are shown as ASCII characters, e.g. '\x52' == 'R'. If you want all bytes to be shown as hex escapes (Python 2):
with open('input_file', 'rb') as file:
    print "\\x" + "\\x".join([c.encode('hex') for c in file.read()])
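For Python 3, where str.encode('hex') is no longer available, a rough equivalent could be written as below. The helper name to_hex_escapes is hypothetical, and the sample bytes are taken from the start of the desired output shown in the question.

```python
def to_hex_escapes(data):
    r"""Render every byte as a literal \xNN escape, e.g. b'R' -> '\x52'."""
    return "".join("\\x{:02x}".format(b) for b in data)


sample = b"\x7f\xe8\x89\x00\x00\x00\x60"
escaped = to_hex_escapes(sample)

print(escaped)  # \x7f\xe8\x89\x00\x00\x00\x60
```

To save the result, the string can be written to a file opened in text mode, for example with open('out.txt', 'w').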
Just add the content to a list and print it:
with open("default.png", 'rb') as file_png:
    a = file_png.read()
l = []
l.append(a)
print l

Using python to write hex to file

I am trying to create a bunch of binary files that contain corresponding hex values
for i in range(2**8):
    file = open("test" + str(i) + ".bin", "wb")
    file.write(hex(i))
    file.close()
Unfortunately it appears that a text representation of my counter converted to hex is being written to the files instead of the actual hex values. Can someone please correct this code? I'm sure the problem is with hex(i)
If you want the value to be written in binary, use chr() to create the character from i (this works in Python 2, where chr() returns a byte string):
for i in range(2**8):
    with open("test" + str(i) + ".bin", "wb") as f:
        f.write(chr(i))
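In Python 3, chr() returns a text string, which a file opened in "wb" mode will not accept. A sketch of the same loop using bytes([i]) instead is shown here; it writes into a temporary directory purely so the example cleans up after itself.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    for i in range(2 ** 8):
        path = os.path.join(tmpdir, "test" + str(i) + ".bin")
        with open(path, "wb") as f:
            # bytes([i]) is a one-byte bytes object holding the value i.
            f.write(bytes([i]))

    # Read one file back to confirm it holds a single raw byte.
    with open(os.path.join(tmpdir, "test255.bin"), "rb") as f:
        last = f.read()

print(last)  # b'\xff'
```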

Python3 ASCII Hexadecimal to Binary String Conversion

I'm using Python 3.2.3 on Windows, and am trying to convert binary data within a C-style ASCII file into its binary equivalent for later parsing using the struct module. For example, my input file contains "0x000A 0x000B 0x000C 0x000D", and I'd like to convert it into "\x00\x0a\x00\x0b\x00\x0c\x00\x0d".
The problem I'm running into is that the string datatypes have changed in Python 3, and the built-in functions to convert from hexadecimal to binary, such as binascii.unhexlify(), no longer accept regular unicode strings, but only byte strings. This process of converting from unicode strings to byte strings and back is confusing me, so I'm wondering if there's an easier way to achieve this. Below is what I have so far:
with open(path, "r") as f:
    l = []
    data = f.read()
    values = data.split(" ")
    for v in values:
        if (v.startswith("0x")):
            l.append(binascii.unhexlify(bytes(v[2:], "utf-8").decode("utf-8")))
    string = ''.join(l)
3>> ''.join(chr(int(x, 16)) for x in "0x000A 0x000B 0x000C 0x000D".split()).encode('utf-16be')
b'\x00\n\x00\x0b\x00\x0c\x00\r'
As agf says, opening the image with mode 'r' will give you string data.
Since the only thing you are doing here is looking at binary data, you probably want to open with 'rb' mode and make your result of type bytes, not str.
Something like:
import binascii

with open(path, "rb") as f:
    l = []
    data = f.read()
    values = data.split()  # split on any whitespace, so a trailing newline is ignored
    for v in values:
        if (v.startswith(b"0x")):
            l.append(binascii.unhexlify(v[2:]))
    result = b''.join(l)
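If the input is read as text anyway, another option is bytes.fromhex(), which accepts an ordinary str, so no round trip between str and bytes is needed. The sketch below assumes every token has an even number of hex digits after the 0x prefix; the function name c_hex_words_to_bytes is invented for this example.

```python
def c_hex_words_to_bytes(text):
    """Convert '0x000A 0x000B ...' into the corresponding raw bytes."""
    out = bytearray()
    for token in text.split():
        if token.lower().startswith("0x"):
            # bytes.fromhex takes a str of hex digit pairs.
            out += bytes.fromhex(token[2:])
    return bytes(out)


result = c_hex_words_to_bytes("0x000A 0x000B 0x000C 0x000D")
print(result)  # b'\x00\n\x00\x0b\x00\x0c\x00\r'
```

The printed form shows \n and \r because those are the usual representations of bytes 0x0a and 0x0d.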

Reading binary file (.chn) in Python

In python, how do I read a binary file (here I need to read a .chn file) and show the result in binary format?
Assuming that values are separated by a space:
with open('myfile.chn', 'rb') as f:
    data = []
    for line in f: # a file supports direct iteration
        data.extend(hex(int(x, 2)) for x in line.split())
In Python it is better to use open() rather than file(); the documentation says so explicitly:
When opening a file, it’s preferable to use open() instead of invoking
the file constructor directly.
rb mode will open the file in binary mode.
Reference:
http://docs.python.org/library/functions.html#open
Try this:
with open('myfile.chn', 'rb') as f:
    data = f.read()
data = [bin(ord(x))[2:] for x in data]
print ''.join(data)
If you want the binary values as a list instead of one joined string:
with open('myfile.chn', 'rb') as f:
    data = f.read()
data = [bin(ord(x))[2:] for x in data]
print data
data now holds the list of binary strings. You can take these and convert them to hexadecimal numbers:
with open('myfile.chn', 'rb') as f:
    data = f.read().split() # read everything at once and split it on whitespace into a list of strings
data = [hex(int(x, 2)) for x in data] # convert to a list of hex strings (via the intermediate integer value)
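One detail worth checking when trimming the '0b' prefix from bin() output: str.strip('0b') removes any run of '0' and 'b' characters from both ends of the string, so trailing zero bits can be lost. Slicing or format() avoids this. A short comparison, using 0x50 as an arbitrary sample value:

```python
value = 0x50  # bin(value) == '0b1010000'

stripped = bin(value).strip("0b")  # trailing zeros are removed as well
sliced = bin(value)[2:]            # drops only the two-character prefix
padded = format(value, "08b")      # no prefix, zero-padded to 8 digits

print(stripped)  # 101
print(sliced)    # 1010000
print(padded)    # 01010000
```

The padded form is usually the one to prefer when concatenating bytes, since every byte then contributes exactly eight digits.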
