I am trying to create a bunch of binary files that contain corresponding hex values
for i in range(2**8):
file = open("test" + str(i) + ".bin", "wb")
file.write(hex(i))
file.close()
Unfortunately it appears that a text representation of my counter converted to hex is being written to the files instead of the actual hex values. Can someone please correct this code? I'm sure the problem is with hex(i)
If you want the value to be written in binary, use chr() to create the character from i:
for i in range(2**8):
with open("test" + str(i) + ".bin", "wb") as f:
f.write(chr(i))
Related
This question already has answers here:
How to read bits from a file?
(3 answers)
Closed last year.
I would like an output like 01100011010... or [False,False,True,True,False,...] from a file to then create an encrypted file.
I've already tried byte = file.read(1) but i don't know how to then convert it to bits.
You can read a file in binary mode this way:
with open(r'path\to\file.ext', 'rb') as binary_file:
bits = binary_file.read()
The 'rb' option stands for Read Binary.
For the output you asked for you can do something like this:
[print(bit, end='') for bit in bits]
that is the list comprehension equivalent to:
for bit in bits:
print(bit, end='')
Since 'rb' gives you hex numbers instead of bin, you can use the built-in function bin to solve the problem:
with open(r'D:\\help\help.txt', 'rb') as file:
for char in file.read():
print(bin(char), end='')
Suppose we have text file text.txt
# Read the content:
with open("text.txt") as file:
content=file.read()
# using join() + bytearray() + format()
# Converting String to binary
res = ''.join(format(i, '08b') for i in bytearray(content, encoding ='utf-8'))
# printing result
print("The string after binary conversion : " + str(res))
I have a rather large text document and would like to replace all instances of hexadecimals inside with regular decimals. Or if possible convert them into text surrounded by '' e.g. 'I01A' instead of $49303141
The hexadecimals are currently marked by starting with $ but I can ctrl+F change that into 0x if that helps, and I need the program to detect the end of the number since some are short $A, while others are long like $568B1F
How could I do this with python, or is it not possible?
Thank you for the help thus far, hoping to clarify my request a bit more to hopefully get a complete solution.
I used a version of Grismar's answer and the output it gives me is
"if not (GetItemTypeId(GetSoldItem())==I0KB) then
set int1= 2+($3E8*3)"
However, I would like to add the ' around the newly created text and convert hex strings smaller then 8 to decimals instead so the output becomes
"if not (GetItemTypeId(GetSoldItem())=='I0KB') then
set int1= 2+(1000*3)"
Hoping for some more help tog et the rest of the way.
def hex2dec(s):
return int(s,16)
was my attempt to convert the shorter hexadecimals to decimal but clearly has not worked, throws syntax errors instead.
Also, I will manually deal with the few $ not used to denote a hexadecimal.
# just creating an example file
with open('D:\Deprotect\wc3\mpq editor\Work\\new 4.txt', 'w') as f:
f.write('if not (GetItemTypeId(GetSoldItem())==$49304B42) then\n')
f.write('set int1= 2+($3E8*3)\n')
def hex_match_to_string(m):
return ''.join([chr(int(m.group(1)[i:i+2], 16)) for i in range(0, len(m.group(1)), 2)])
def hex2dec(s):
return int(s,16)
# open the file for reading
with open('D:\Deprotect\wc3\mpq editor\Work\\new 4.txt', 'r') as file_in:
# open the same file again for reading and writing
with open('D:\Deprotect\wc3\mpq editor\Work\\new 4.txt', 'r+') as file_out:
# start writing at the start of the existing file, overwriting the contents
file_out.seek(0)
while True:
line = file_in.readline()
if line == '':
# end of file
break
# replace the parts of the string matching the regex
line = re.sub(r'\$((?:\w\w\w\w\w\w\w\w)+)', hex_match_to_string, line)
#line = re.sub(r'$\w+', hex2dec,line)
file_out.write(line)
# the resulting file is shorter, truncate it from the current position
file_out.truncate()
See the answer https://stackoverflow.com/a/12597709/1780027 for how to use re.sub to replace specific content of a string with the output of a function. Using this you could presumably use the "int("FFFF", 16) " code snippet you're talking about to perform the action you desire.
EG:
>>> def replace(match):
... match = match.group(1)
... return str(int(match, 16))
>>> sample = "here's a hex $49303141 and there's a nother 1034B and another $8FD0B"
>>> re.sub(r'\$([a-fA-F0-9]+)', replace, sample)
"here's a hex 1227895105 and there's a nother 41803 and another 589067"
Since you are replacing parts of the file with something that's shorter, you can write to the same file you're reading. But keep in mind that, if you were replacing those parts with something that was longer, you would need to write the result to a new file and replace the old file with the new file once you were done.
Also, from your description, it appears you are reading a text file, which makes reading the file line by line the easiest, but if your file was some sort of binary file, using re wouldn't be as convenient and you'd probably need a different solution.
Finally, your question doesn't mention whether $ might also appear elsewhere in the text file (not just in front of pairs of characters that should be read as hexadecimal numbers). This answer assumes $ only appears in front of strings of 2-character hexadecimal numbers.
Here's a solution:
import re
# just creating an example file
with open('test.txt', 'w') as f:
f.write('example line $49303141\n')
f.write('$49303141 example line, with more $49303141\n')
f.write('\n')
f.write('just some text\n')
def hex_match_to_string(m):
return ''.join([chr(int(m.group(1)[i:i+2], 16)) for i in range(0, len(m.group(1)), 2)])
# open the file for reading
with open('test.txt', 'r') as file_in:
# open the same file again for reading and writing
with open('test.txt', 'r+') as file_out:
# start writing at the start of the existing file, overwriting the contents
file_out.seek(0)
while True:
line = file_in.readline()
if line == '':
# end of file
break
# replace the parts of the string matching the regex
line = re.sub(r'\$((?:\w\w)+)', hex_match_to_string, line)
file_out.write(line)
# the resulting file is shorter, truncate it from the current position
file_out.truncate()
The regex is simple r'\$((?:\w\w)+)', which matches any string starting with an actual $ (the backslash avoids it being interpreted as 'the beginning of the string') and followed by 1 or more (+) pairs of letters and numbers (\w\w).
The function hex_match_to_string(m) expects a regex match object and loops over pairs of characters in the first matched group. Each pair is turned into its decimal value by interpreting it as a hexadecimal string (int(pair, 16)) and that decimal value is then turned into a character with that ASCII value (chr(value)). All the resulting characters are joined into a single string (''.join(list)).
A different way or writing hex_match_to_string(m):
def hex_match_to_string(m):
hex_nums = iter(m.group(1))
return ''.join([chr(int(a, 16) * 16 + int(b, 16)) for a, b in zip(hex_nums, hex_nums)])
This may perform a bit better, since it avoids manipulating strings, but it does the same thing.
I would like to convert a binary to hexadecimal in a certain format and save it as a text file.
The end product should be something like this:
"\x7f\xe8\x89\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52"
Input is from an executable file "a".
This is my current code:
with open('a', 'rb') as f:
byte = f.read(1)
hexbyte = '\\x%02s' % byte
print hexbyte
A few issues with this:
This only prints the first byte.
The result is "\x" and a box like this:
00
7f
In terminal it looks exactly like this:
Why is this so? And finally, how do I save all the hexadecimals to a text file to get the end product shown above?
EDIT: Able to save the file as text with
txt = open('out.txt', 'w')
print >> txt, hexbyte
txt.close()
You can't inject numbers into escape sequences like that. Escape sequences are essentially constants, so, they can't have dynamic parts.
There's already a module for this, anyway:
from binascii import hexlify
with open('test', 'rb') as f:
print(hexlify(f.read()).decode('utf-8'))
Just use the hexlify function on a byte string and it'll give you a hex byte string. You need the decode to convert it back into an ordinary string.
Not quite sure if decode works in Python 2, but you really should be using Python 3, anyway.
Your output looks like a representation of a bytestring in Python returned by repr():
with open('input_file', 'rb') as file:
print repr(file.read())
Note: some bytes are shown as ascii characters e.g. '\x52' == 'R'. If you want all bytes to be shown as the hex escapes:
with open('input_file', 'rb') as file:
print "\\x" + "\\x".join([c.encode('hex') for c in file.read()])
Just add the content to list and print:
with open("default.png",'rb') as file_png:
a = file_png.read()
l = []
l.append(a)
print l
I'm using Python 3.2.3 on Windows, and am trying to convert binary data within a C-style ASCII file into its binary equivalent for later parsing using the struct module. For example, my input file contains "0x000A 0x000B 0x000C 0x000D", and I'd like to convert it into "\x00\x0a\x00\x0b\x00\x0c\x00\x0d".
The problem I'm running into is that the string datatypes have changed in Python 3, and the built-in functions to convert from hexadecimal to binary, such as binascii.unhexlify(), no longer accept regular unicode strings, but only byte strings. This process of converting from unicode strings to byte strings and back is confusing me, so I'm wondering if there's an easier way to achieve this. Below is what I have so far:
with open(path, "r") as f:
l = []
data = f.read()
values = data.split(" ")
for v in values:
if (v.startswith("0x")):
l.append(binascii.unhexlify(bytes(v[2:], "utf-8").decode("utf-8")
string = ''.join(l)
3>> ''.join(chr(int(x, 16)) for x in "0x000A 0x000B 0x000C 0x000D".split()).encode('utf-16be')
b'\x00\n\x00\x0b\x00\x0c\x00\r'
As agf says, opening the image with mode 'r' will give you string data.
Since the only thing you are doing here is looking at binary data, you probably want to open with 'rb' mode and make your result of type bytes, not str.
Something like:
with open(path, "rb") as f:
l = []
data = f.read()
values = data.split(b" ")
for v in values:
if (v.startswith(b"0x")):
l.append(binascii.unhexlify(v[2:]))
result = b''.join(l)
I'm trying to write a series of files for testing that I am building from scratch. The output of the data payload builder is of type string, and I'm struggling to get the string written directly to the file.
The payload builder only uses hex values, and simply adds a byte for each iteration.
The 'write' functions I have tried all either fall over the writing of strings, or write the ASCII code for the string, rather than the string its self...
I want to end up with a series of files - with the same filename as the data payload (e.g. file ff.txt contains the byte 0xff
def doMakeData(counter):
dataPayload = "%X" %counter
if len(dataPayload)%2==1:
dataPayload = str('0') + str(dataPayload)
fileName = path+str(dataPayload)+".txt"
return dataPayload, fileName
def doFilenameMaker(counter):
counter += 1
return counter
def saveFile(dataPayload, fileName):
# with open(fileName, "w") as text_file:
# text_file.write("%s"%dataPayload) #this just writes the ASCII for the string
f = file(fileName, 'wb')
dataPayload.write(f) #this also writes the ASCII for the string
f.close()
return
if __name__ == "__main__":
path = "C:\Users\me\Desktop\output\\"
counter = 0
iterator = 100
while counter < iterator:
counter = doFilenameMaker(counter)
dataPayload, fileName = doMakeData(counter)
print type(dataPayload)
saveFile(dataPayload, fileName)
To write just a byte, use chr(n) to get a byte containing integer n.
Your code can be simplified to:
import os
path = r'C:\Users\me\Desktop\output'
for counter in xrange(100):
with open(os.path.join(path,'{:02x}.txt'.format(counter)),'wb') as f:
f.write(chr(counter))
Note use of raw string for the path. If you had a '\r' or '\n' in the string they would be treated as a carriage return or linefeed without using a raw string.
f.write is the method to write to a file. chr(counter) generates the byte. Make sure to write in binary mode 'wb' as well.
dataPayload.write(f) # this fails "AttributeError: 'str' object has no attribute 'write'
Of course it does. You don't write to strings; you write to files:
f.write(dataPayload)
That is to say, write() is a method of file objects, not a method of string objects.
You got this right in the commented-out code just above it; not sure why you switched it around here...