How can I define a very long hex literal over several lines in Python? E.g.
p = 0xB10B8F96 A080E01D DE92DE5E AE5D54EC 52C99FBC FB06A3C6
9A6A9DCA 52D23B61 6073E286 75A23D18 9838EF1E 2EE652C0
13ECB4AE A9061123 24975C3C D49B83BF ACCBDD7D 90C4BD70
98488E9C 219A7372 4EFFD6FA E5644738 FAA31A4F F55BCCC0
A151AF5F 0DC8B4BD 45BF37DF 365C1A65 E68CFDA7 6D4DA708
DF1FB2BC 2E4A4371
It would be nice if I could keep the spaces, or use another separator such as _.
Here is one attempt, which saves it as a string, and then uses ast.literal_eval to calculate the actual number:
from ast import literal_eval
hex_string_literal = (
"0xB10B8F96" "A080E01D" "DE92DE5E" "AE5D54EC" "52C99FBC" "FB06A3C6"
"9A6A9DCA" "52D23B61" "6073E286" "75A23D18" "9838EF1E" "2EE652C0"
"13ECB4AE" "A9061123" "24975C3C" "D49B83BF" "ACCBDD7D" "90C4BD70"
"98488E9C" "219A7372" "4EFFD6FA" "E5644738" "FAA31A4F" "F55BCCC0"
"A151AF5F" "0DC8B4BD" "45BF37DF" "365C1A65" "E68CFDA7" "6D4DA708"
"DF1FB2BC" "2E4A4371")
p = literal_eval(hex_string_literal)
The definition above relies on string literal concatenation: adjacent string literals are joined into a single string at compile time.
EDIT
As @nneonneo noted in the comments, you could also use int(hex_string_literal, 16) or int(hex_string_literal, 0) in the example above, so that you don't have to import anything extra.
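As a small illustration of the two int() forms mentioned above (the short value is a stand-in chosen for this sketch, not the full number from the question): with base 16 the 0x prefix is optional, while base 0 asks int() to infer the base from the prefix, so the prefix must be present.

```python
# Stand-in values for illustration only.
with_prefix = "0xB10B8F96"
without_prefix = "B10B8F96"

a = int(with_prefix, 16)     # base 16 accepts an optional 0x prefix
b = int(without_prefix, 16)  # ... and also works without it
c = int(with_prefix, 0)      # base 0 infers the base from the 0x prefix

print(a == b == c)
```

Note that int(without_prefix, 0) raises ValueError, because without a prefix base 0 treats the text as decimal.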
p = int(''.join('''
B10B8F96 A080E01D DE92DE5E AE5D54EC 52C99FBC FB06A3C6
9A6A9DCA 52D23B61 6073E286 75A23D18 9838EF1E 2EE652C0
13ECB4AE A9061123 24975C3C D49B83BF ACCBDD7D 90C4BD70
98488E9C 219A7372 4EFFD6FA E5644738 FAA31A4F F55BCCC0
A151AF5F 0DC8B4BD 45BF37DF 365C1A65 E68CFDA7 6D4DA708
DF1FB2BC 2E4A4371
'''.split()), 16)
You can use int(str, 16) to translate hex.
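If the _ separator from the question is wanted, one possible sketch, assuming Python 3.6 or later (where int() accepts single underscores between digits), combines it with string literal concatenation. Only the first six words of the question's number are used here to keep the example short, and the names P_HEX, p and q are made up for this sketch.

```python
# Adjacent string literals are joined at compile time, so the text can span
# several lines. Each line except the last ends with "_" so that the joined
# string has exactly one underscore between groups.
P_HEX = (
    "B10B8F96_A080E01D_DE92DE5E_"
    "AE5D54EC_52C99FBC_FB06A3C6"
)
p = int(P_HEX, 16)

# The same value written with spaces, which are removed before parsing.
q = int("".join("B10B8F96 A080E01D DE92DE5E AE5D54EC 52C99FBC FB06A3C6".split()), 16)

print(p == q)
```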
I'm trying to increment the video file name every time a new video arrives in my folder. I tried the + operator and the join() method, but I can't figure it out. I tried an integer without quotation marks, but join() won't accept an integer, so I tried it with quotation marks, and now it won't increment.
Here is my code
VideoNumber += "99"
folderLocation = ("C:/Users/someone/Documents", VideoNumber, ".mp4")
x = "/".join(folderLocation)
print(x)
You can format integers into a string using an f-string or the format() method on strings.
video_number += 99
video_path = f"C:/Users/someone/Documents/{video_number}.mp4"
print(video_path)
Just as an example of how to make your original code work, you could keep your number as an integer and then convert it to a string using str() (though note this has a bug because you will have an extra / between the number and .mp4).
VideoNumber += 99
folderLocation = ("C:/Users/someone/Documents", str(VideoNumber), ".mp4")
x = "/".join(folderLocation)
print(x)
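To avoid the stray / mentioned above, one option is to let pathlib join the folder and the file name. This is only a sketch: video_path is a hypothetical helper, and the starting value of video_number is assumed to be 0.

```python
from pathlib import PureWindowsPath


def video_path(folder, video_number):
    """Return '<folder>/<video_number>.mp4' (hypothetical helper for this sketch)."""
    # PureWindowsPath only manipulates the path text; it never touches the disk.
    return (PureWindowsPath(folder) / f"{video_number}.mp4").as_posix()


video_number = 0    # assumed starting value
video_number += 99  # the counter stays an integer, so += keeps working
print(video_path("C:/Users/someone/Documents", video_number))
```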
You can convert the integer to a string with str(), so your code would look like this:
folderLocation = ("C:/Users/someone/Documents", str(VideoNumber), ".mp4")
I have a string that contains the text of a bytes literal, like this:
string1 = "b'\xe6\x88\x91\xe4\xbb\xac \xe7\xb4\xa2\xe8\xa6\x81 \xe6\x8e\xa8\xe5\xb9\xbf \xe7\x9a\x84 \xe6\x98\xaf\xe4\xb8\x80 \xe5\xbe\x97 \xe6\x96\xb9\xe6\x96\xb9 \xe6\x96\xb9\xe8\xa8\x80 \xe4\xb8\xba\xe5\x9f\xba \xe7\xa1\x80 \xe6\x96\xb9\xe8\xa8\x80 \xe4\xb8\x80 \xe5\x8c\x97\xe4\xba\xac \xe5\xb7\xb2 \xe5\x9b\xa0 \xe4\xb8\xba \xe6\xa0\x87\xe5\x87\x86 \xe7\x9a\x84 \xe6\x99\xae\xe9\x80\x9a \xe8\xaf\x9d \xe4\xbb\x96 \xe4\xbb\x8e \xe5\x84\xbf\xe7\xab\xa5 \xe6\x97\xb6\xe4\xbb\xa3 \xe8\xb5\xb7 \xe5\xb0\xb1 \xe5\x96\x9c\xe6\xac\xa2 \xe4\xb8\x8b \xe5\x9b\xb4\xe6\xa3\x8b \xe5\x9c\xa8 \xe5\x8d\x81\xe4\xba\x94 \xe5\xb2\x81 \xe7\x9a\x84 \xe6\x97\xb6\xe5\x80\x99 \xe5\xb0\xb1 \xe6\x98\xaf\xe6\x9c\x89 \xe5\x90\x8d \xe5\x85\xb6 \xe5\xb0\x91 \xe4\xba\x86'"
I want to convert this string into an actual bytes object so that I can call decode() on it and get the normal text.
First, put an r before it so that the \x keeps both characters. Then ast.literal_eval() will work.
import ast
string1 = r"b'\xe6\x88\x91\xe4\xbb\xac \xe7\xb4\xa2\xe8\xa6\x81 \xe6\x8e\xa8\xe5\xb9\xbf \xe7\x9a\x84 \xe6\x98\xaf\xe4\xb8\x80 \xe5\xbe\x97 \xe6\x96\xb9\xe6\x96\xb9 \xe6\x96\xb9\xe8\xa8\x80 \xe4\xb8\xba\xe5\x9f\xba \xe7\xa1\x80 \xe6\x96\xb9\xe8\xa8\x80 \xe4\xb8\x80 \xe5\x8c\x97\xe4\xba\xac \xe5\xb7\xb2 \xe5\x9b\xa0 \xe4\xb8\xba \xe6\xa0\x87\xe5\x87\x86 \xe7\x9a\x84 \xe6\x99\xae\xe9\x80\x9a \xe8\xaf\x9d \xe4\xbb\x96 \xe4\xbb\x8e \xe5\x84\xbf\xe7\xab\xa5 \xe6\x97\xb6\xe4\xbb\xa3 \xe8\xb5\xb7 \xe5\xb0\xb1 \xe5\x96\x9c\xe6\xac\xa2 \xe4\xb8\x8b \xe5\x9b\xb4\xe6\xa3\x8b \xe5\x9c\xa8 \xe5\x8d\x81\xe4\xba\x94 \xe5\xb2\x81 \xe7\x9a\x84 \xe6\x97\xb6\xe5\x80\x99 \xe5\xb0\xb1 \xe6\x98\xaf\xe6\x9c\x89 \xe5\x90\x8d \xe5\x85\xb6 \xe5\xb0\x91 \xe4\xba\x86'"
bytes1 = ast.literal_eval(string1)
print(bytes1.decode('utf8')) # 我们 索要 ...
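The r prefix only matters when the text is typed as a literal in source code. If the string arrives at runtime (for example, because something called repr() on a bytes object), the backslashes are already literal characters and ast.literal_eval() can be applied directly. A minimal sketch, with a short sample text assumed for illustration:

```python
import ast

original = "我们 索要".encode("utf-8")  # sample bytes, assumed for this sketch
text_repr = repr(original)              # a str that looks like "b'\xe6\x88\x91...'"

recovered = ast.literal_eval(text_repr)  # parses the bytes literal back into bytes
print(recovered.decode("utf-8"))
```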
I am writing a program with a Python GUI. When the program runs, it asks the user to open a file in read mode (TASK.txt, which contains hexadecimal values).
I store the data of one line in a variable.
How can I convert that data into ASCII? I am new to Python. This is my code:
import binascii
import base64
from tkinter import *
from tkinter.filedialog import askopenfilename
def callback():
    with open(askopenfilename(),'r') as r:
        next(r)
        for x in r:
            z = str(x[1:-2])
            if len(z) % 2:
                z = '0' + 'x' + z
            print(binascii.unhexlify(z))
a = Button(text='select file', command=callback)
a.pack()
mainloop()
This is the error I am getting:
Exception in Tkinter callback
Traceback (most recent call last):
File "D:\python sw\lib\tkinter\__init__.py", line 1699, in __call__
return self.func(*args)
File "C:\Users\LENOVO\Downloads\hex2.py", line 16, in callback
print(binascii.unhexlify(z))
binascii.Error: Non-hexadecimal digit found
Just reread your question correctly, new answer:
Do not prefix the string with 0x: unhexlify does not accept it, and adding two characters does not make an odd string length even anyway.
You need an even string length, since each pair of hex digits represents one byte (one character).
unhexlify returns a bytes object, which can be decoded to a string using .decode().
As pointed out here, you don't even need to import binascii: you can convert hex to a string with bytearray.fromhex("7061756c").decode()
list(map(lambda hx: bytearray.fromhex(hx).decode(),"H7061756c H7061756c61".replace("H","").split(" ")))
Returns ['paul', 'paula']
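The same one-liner can be written as a small function that is easier to reuse per line of the file. This is a sketch: decode_hex_tokens is a hypothetical name, and it assumes every token is an H followed by an even number of hex digits that decode to ASCII.

```python
def decode_hex_tokens(line):
    """Decode whitespace-separated tokens such as 'H7061756c' into text."""
    tokens = line.replace("H", "").split()  # drop the H prefixes, split on whitespace
    return [bytes.fromhex(token).decode("ascii") for token in tokens]


print(decode_hex_tokens("H7061756c H7061756c61"))
```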
What I wrote before I thoroughly read your question
may still be of use
As PM 2Ring noted, unhexlify only works without prefixes like 0x.
Your hex strings are separated by spaces and prefixed with H, which must be removed. You already did this, but I think it can be done in a nicer way:
import binascii
data = "H247314748F8 HA010001FD" # one line in your file
z_array = data.replace("H","").split(" ")
# this returns ['247314748F8','A010001FD']
# now we can apply unhexlify to all those strings:
unhexed = map(binascii.unhexlify, z_array)
# and print it.
print(list(unhexed))
This will raise Error: Odd-length string. Make sure you really want to unhexlify your data. As stated in the docs, you need an even number of hexadecimal characters, each pair representing one byte.
If you want to convert the hexadecimal numbers to decimal integers instead, try this:
list(map(lambda hx: int(hx,16),"H247314748F8 HA010001FD".replace("H","").split(" ")))
int(string, base) will convert from one number system (hexadecimal has base 16) to decimal (with base 10).
** Off topic **
if len(z) % 2:
    z = '0' + 'x' + z
This still leaves z with an odd length, since you added an even number of characters.
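If the odd-length values really are meant to be bytes, one way to make the length even is to prepend a single '0'. Whether a leading zero is the right interpretation of the data in TASK.txt is an assumption; pad_to_even is a hypothetical helper for this sketch.

```python
def pad_to_even(hex_digits):
    """Prepend one '0' when the number of hex digits is odd."""
    return "0" + hex_digits if len(hex_digits) % 2 else hex_digits


padded = pad_to_even("A010001FD")  # 9 digits become 10
print(padded, bytes.fromhex(padded))
```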
At the moment I have a byte stream of a string that is received by my Python code and must be converted into a string. For now I managed to extract each character, convert them and append them to a string individually. The code looks something like this:
import struct
# The byte stream is received and stored in byte_stream
text = ''
i = 0
while i < len(byte_stream):
    text = text + struct.unpack('c', byte_stream[i])[0]
    i += 1
print(text)
But that surely cannot be the most efficient way... Is there a more elegant way to achieve the same result?
From Convert bytes to a Python string:
byte_stream = [112, 52, 52]
print(''.join(map(chr, byte_stream)))
# p44
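In Python 3, if the received data is already a bytes object, the whole stream can be decoded in one call. A minimal sketch, assuming the data is ASCII text (substitute the actual encoding, such as "utf-8", if it differs):

```python
byte_stream = bytes([112, 52, 52])  # stand-in for the received data

text = byte_stream.decode("ascii")  # one call instead of a per-character loop
print(text)
```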
I want to convert a binary file (such as a jpg, mp3, etc) to web-safe text and then back into binary data. I've researched a few modules and I think I'm really close but I keep getting data corruption.
After looking at the documentation for binascii I came up with this:
from binascii import *
raw_bytes = open('test.jpg','rb').read()
text = b2a_qp(raw_bytes,quotetabs=True,header=False)
bytesback = a2b_qp(text,header=False)
f = open('converted.jpg','wb')
f.write(bytesback)
f.close()
When I try to open the converted.jpg I get data corruption :-/
I also tried using b2a_base64 with 57-long blocks of binary data. I took each block, converted to a string, concatenated them all together, and then converted back in a2b_base64 and got corruption again.
Can anyone help? I'm not super knowledgeable on all the intricacies of bytes and file formats. I'm using Python on Windows if that makes a difference with the \r\n stuff
Your code looks quite complicated. Try this:
#!/usr/bin/env python
from binascii import *
raw_bytes = open('28.jpg','rb').read()
i = 0
str_one = b2a_base64(raw_bytes) # 1
str_list = b2a_base64(raw_bytes).split("\n") #2
bytesBackAll = a2b_base64(''.join(str_list)) #2
print bytesBackAll == raw_bytes #True #2
bytesBackAll = a2b_base64(str_one) #1
print bytesBackAll == raw_bytes #True #1
Lines tagged with #1 and #2 represent alternatives to each other. #1 seems most straightforward to me - just make it one string, process it and convert it back.
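The snippet above is Python 2. In Python 3, b2a_base64 returns bytes rather than str, so a round trip in the spirit of alternative #1 might look like the sketch below. The sample data is generated in memory as a stand-in for the contents of the image file.

```python
from binascii import a2b_base64, b2a_base64

raw_bytes = bytes(range(256)) * 4  # stand-in for open('28.jpg', 'rb').read()

# Encode to base64, then turn the ASCII bytes into a str that is safe to embed in text.
text = b2a_base64(raw_bytes).decode("ascii")

# Going back: encode the str to ASCII bytes and decode the base64.
bytes_back = a2b_base64(text.encode("ascii"))

print(bytes_back == raw_bytes)
```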
You should use base64 encoding instead of quoted printable. Use b2a_base64() and a2b_base64().
Quoted-printable output is much larger for binary data such as pictures. In this encoding, each byte that is not a printable ASCII character is replaced by =XX, where XX is its hexadecimal value. It is suited to text that consists mainly of alphanumeric characters, such as email subjects.
Base64 is much better for mainly binary data. It takes the first 6 bits of the first byte, then the last 2 bits of the first byte together with the first 4 bits of the second byte, and so on, mapping each 6-bit group to one output character. It can often be recognized by the = padding at the end of the encoded text (sometimes another character is used).
As an example, I took a .jpeg of 271 700 bytes. In quoted-printable it is 627 857 bytes, while in base64 it is 362 269 bytes. The size of quoted-printable output depends on the data: text consisting only of letters does not grow. The size of base64 output is about orig_size * 8 / 6.
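The base64 size can be checked with a short calculation: every group of 3 input bytes (the last group rounded up) becomes 4 output characters. The sketch below uses zero bytes as a stand-in for the image, since only the length matters; base64_length is a hypothetical helper.

```python
from binascii import b2a_base64


def base64_length(n_bytes):
    """Number of base64 characters for n_bytes of input, excluding line breaks."""
    return 4 * ((n_bytes + 2) // 3)  # 4 characters per 3-byte group, rounded up


print(base64_length(271700))
# b2a_base64 appends one newline, so its output is one byte longer.
print(len(b2a_base64(bytes(271700))))
```

For 271 700 input bytes this gives 362 268 characters, and 362 269 with the trailing newline, which agrees with the figure quoted above.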
Your documentation reference is for Python 3.0.1. There is no good reason to use Python 3.0. You should be using 3.2 or 2.7. What exactly are you using?
Suggestion: (1) change bytes to raw_bytes to avoid confusion with the bytes built-in (2) check for raw_bytes == bytes_back in your test script (3) while your test should work with quoted-printable, it is very inefficient for binary data; use base64 instead.
Update: Base64 encoding produces 4 output bytes for every 3 input bytes. Your base64 code doesn't work with 56-byte chunks because 56 is not an integral multiple of 3; each chunk is padded out to a multiple of 3. Then you join the chunks and attempt to decode, which is guaranteed not to work.
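The difference between 56-byte and 57-byte chunks can be seen by encoding one chunk of each size and checking for = padding. This is a sketch in Python 3 with zero bytes as placeholder data; chunk_has_padding is a hypothetical helper.

```python
from binascii import b2a_base64


def chunk_has_padding(size):
    """Return True if a chunk of `size` bytes encodes to base64 ending in '='."""
    encoded = b2a_base64(bytes(size)).rstrip(b"\n")  # drop the trailing newline
    return encoded.endswith(b"=")


for size in (56, 57):
    print(size, chunk_has_padding(size))
```

A 57-byte chunk is exactly 19 groups of 3 bytes, so it needs no padding and the encoded chunks can be joined safely; a 56-byte chunk leaves 2 bytes over and ends in =, which then sits in the middle of the joined text.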
Your chunking loop would be much better written as:
output_string = ''.join(
    b2a_base64(raw_bytes[i:i+57]) for i in xrange(0, len(raw_bytes), 57)
)
In any case, chunking is rather slow and pointless; just do b2a_base64(raw_bytes)
@PMC's answer copied from the question:
Here's what works:
from binascii import *
raw_bytes = open('28.jpg','rb').read()
str_list = []
i = 0
while i < len(raw_bytes):
    byteSegment = raw_bytes[i:i+57]
    str_list.append(b2a_base64(byteSegment))
    i += 57
bytesBackAll = a2b_base64(''.join(str_list))
print bytesBackAll == raw_bytes #True
Thanks for the help guys. I'm not sure why this would fail with [0:56] instead of [0:57] but I'll leave that as an exercise for the reader :P