I want to alter the values in my 2D list keyed within a dictionary. My code is altering all the values in my dictionary which is not intended.
cache_memory = {}
zero_list = ["0"] * 2 + ["00"] * 5
empty_lists = []
# append zero_lists to empty_list to make a 2D LIST
for count in range(2):
empty_lists.append(zero_list)
# store 2D lists with keys
for index in range(2):
cache_memory[index] = empty_lists
cache_memory[0][0][0] = "1"
print(cache_memory)
Output is:
{0: [['1', '0', '00', '00', '00', '00', '00'], ['1', '0', '00', '00', '00', '00', '00']], 1: [['1', '0', '00', '00', '00', '00', '00'], ['1', '0', '00', '00', '00', '00', '00']]}
I want it to be:
{0: [['1', '0', '00', '00', '00', '00', '00'], ['0', '0', '00', '00', '00', '00', '00']], 1: [['0', '0', '00', '00', '00', '00', '00'], ['0', '0', '00', '00', '00', '00', '00']]}
Is there a way to achieve this in python or should I try a different workaround?
In the following lines of your code
for count in range(2):
empty_lists.append(zero_list)
# store 2D lists with keys
for index in range(2):
cache_memory[index] = empty_lists
You are creating a shallow copy of the list which is nothing but a reference to the list. When you make changes in the copy, the same changes are reflected in the original list. That's why all the lists in your dictionary 'cache_memory' are references of the same list and modifying the element in one list modifies the list for all the references.
To get your favourable output you need to create and store the deep copies of the list which can be achieved using copy module available in python. So you need to modify your code as:
import copy #importing copy module
cache_memory = {}
zero_list = ["0"] * 2 + ["00"] * 5
empty_lists = []
# append zero_lists to empty_list to make a 2D LIST
for count in range(2):
empty_lists.append(copy.deepcopy(zero_list))
# store 2D lists with keys
for index in range(2):
cache_memory[index] = copy.deepcopy(empty_lists)
cache_memory[0][0][0] = "1"
print(cache_memory)
You can learn more about shallow copies and deep copies in this article Shallow vs deep copies in Python
When you just say cache_memory[index] = empty_lists, Python will take that as cache_memory[index] and empty_lists is one object. If you want to set cache_memory[index] to a copy of empty_lists, then you can either put .copy() after empty_lists in that line or [:]. Both of these get a copy of empty_lists instead of taking them both as one object.
If you want to explore this more, just use the following code snippet.
my_list = [1,2,3]
new_list = my_list
print('Prev my_list:',my_list)
print('Prev new_list:',new_list)
my_list.append(4)
print('New my_list:',my_list)
print('New new_list:',new_list) # As you can see new_list also gets updated
I'm extracting jpeg type bits from mp3 data actually it will be album arts.
I thought about using library called mutagen, but I'd like to try with bits for some practice purpose.
import os
import sys
import re
f = open(sys.argv[1], "rb")
#sys.argv[1] gets mp3 file name ex) test1.mp3
saver = ""
for value in f:
for i in value:
hexval = hex(ord(i))[2:]
if (ord(i) == 0):
saver += "00" #to match with hex form
else:
saver += hexval
header = "ffd8"
tail = "ffd9"
this part of code is to get mp3 as bit form, and then transform it into hex
and find jpeg trailers which starts as "ffd8" and ends with "ffd9"
frontmatch = re.search(header,saver)
endmatch = re.search(tail, saver)
startIndex = frontmatch.start()
endIndex = endmatch.end()
jpgcontents = saver[startIndex:endIndex]
scale = 16 # equals to hexadecimal
numbits = len(jpgcontents) * 4 #log2(scale)
bitcontents = bin(int(jpgcontents, scale))[2:].zfill(numbits)
and here, I get the bits between the header and tail and transform it into
binary form. Which supposed to be the jpg part of the mp3 files.
txtfile = open(sys.argv[1] + "_tr.jpg", "w")
txtfile.write(bitcontents)
and I wrote the bin to the new file with writing type as jpg.
sorry for my wrong naming as txtfile.
But these codes gave the error which is
Error interpreting JPEG image file
(Not a JPEG file: starts with 0x31 0x31)
I'm not sure whether the bits I extracted are wrong or writing to the file
step is wrong. Or there might be other problem in code.
I'm working in linux version with python 2.6. Is there anything wrong with
just writing str type of bin data as JPG?
You are creating a string of ASCII zeroes and ones, i.e. \x30 and \x31, but the JPEG file needs to be proper binary data. So where your file should have a single byte of (for example) \xd8 you instead have these eight bytes: 11011000, or \x31\x31\x30\x31\x31\x30\x30\x30.
You don't need to do all that messy conversion stuff. You can just search directly for the desired byte patterns, writing them using \x hex escape sequences. And you don't even need regex: the simple string .index or .find methods can do this easily and quickly.
with open(fname, 'rb') as f:
data = f.read()
header = "\xff\xd8"
tail = "\xff\xd9"
try:
start = data.index(header)
end = data.index(tail, start) + 2
except ValueError:
print "Can't find JPEG data!"
exit()
print 'Start: %d End: %d Size: %d' % (start, end, end - start)
with open(fname + "_tr.jpg", 'wb') as f:
f.write(data[start:end])
(Tested on Python 2.6.6)
However, extracting embedded JPEG data like this isn't foolproof, since it's possible that those header and tail byte sequences exist in the MP3 sound data.
FWIW, a simpler way to translate binary data to hex strings and back is to use hexlify and unhexlify from the binascii module.
Here are some examples of doing these transformations, both with and without the binascii functions.
from binascii import hexlify, unhexlify
#Create a string of all possible byte values
allbytes = ''.join([chr(i) for i in xrange(256)])
print 'allbytes'
print repr(allbytes)
print '\nhex list'
print [hex(ord(v))[2:].zfill(2) for v in allbytes]
hexstr = hexlify(allbytes)
print '\nhex string'
print hexstr
newbytes = ''.join([chr(int(hexstr[i:i+2], 16)) for i in xrange(0, len(hexstr), 2)])
print '\nNew bytes'
print repr(newbytes)
print '\nUsing unhexlify'
print repr(unhexlify(hexstr))
output
allbytes
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?#ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
hex list
['00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '0a', '0b', '0c', '0d', '0e', '0f', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '1a', '1b', '1c', '1d', '1e', '1f', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '2a', '2b', '2c', '2d', '2e', '2f', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '3a', '3b', '3c', '3d', '3e', '3f', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '4a', '4b', '4c', '4d', '4e', '4f', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '5a', '5b', '5c', '5d', '5e', '5f', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '6a', '6b', '6c', '6d', '6e', '6f', '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '7a', '7b', '7c', '7d', '7e', '7f', '80', '81', '82', '83', '84', '85', '86', '87', '88', '89', '8a', '8b', '8c', '8d', '8e', '8f', '90', '91', '92', '93', '94', '95', '96', '97', '98', '99', '9a', '9b', '9c', '9d', '9e', '9f', 'a0', 'a1', 'a2', 'a3', 'a4', 'a5', 'a6', 'a7', 'a8', 'a9', 'aa', 'ab', 'ac', 'ad', 'ae', 'af', 'b0', 'b1', 'b2', 'b3', 'b4', 'b5', 'b6', 'b7', 'b8', 'b9', 'ba', 'bb', 'bc', 'bd', 'be', 'bf', 'c0', 'c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7', 'c8', 'c9', 'ca', 'cb', 'cc', 'cd', 'ce', 'cf', 'd0', 'd1', 'd2', 'd3', 'd4', 'd5', 'd6', 'd7', 'd8', 'd9', 'da', 'db', 'dc', 'dd', 'de', 'df', 'e0', 'e1', 'e2', 'e3', 'e4', 'e5', 'e6', 'e7', 'e8', 'e9', 'ea', 'eb', 'ec', 'ed', 'ee', 'ef', 'f0', 'f1', 'f2', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8', 'f9', 'fa', 'fb', 'fc', 'fd', 'fe', 'ff']
hex string
000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a5b5c5d5e5f606162636465666768696a6b6c6d6e6f707172737475767778797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9fa0a1a2a3a4a5a6a7a8a9aaabacadaeafb0b1b2b3b4b5b6b7b8b9babbbcbdbebfc0c1c2c3c4c5c6c7c8c9cacbcccdcecfd0d1d2d3d4d5d6d7d8d9dadbdcdddedfe0e1e2e3e4e5e6e7e8e9eaebecedeeeff0f1f2f3f4f5f6f7f8f9fafbfcfdfeff
New bytes
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?#ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
Using unhexlify
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?#ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
Note that this code needs some modifications to run on Python 3 (apart from converting the print statements to print function calls) because plain Python 3 strings are Unicode strings, not byte strings.
You need to write out as binary
Try:
txtfile = open(sys.argv[1] + "_tr.jpg", "wb")
Oups, you are not doing what you expect. the bin generates a string containing the value in binary form. Let's look at what you have, if the content on the input file was :
saver is a string of hexadecimal characters in textual form something like "313233414243" for an initial string of "132ABC"
jpgcontents has same format and starts with "ffd8" and ends with "ffd9"
you then apply the magic formula bin(int(jpgcontents, scale))[2:].zfill(numbits) that
convert the hexa string to a long integer
convert the long integer to a binary representation string - this part would convert hexa "ff" in integer 255 and end in the string "0b11111111"
remove first characters "0b" and fill the end of buffer if needed
bitcontents is then a string starting with "11111111....". Just rename your file with a .txt extension and open it with a text editor, you will see that it is a large file containing only ASCII characters 0 and 1.
As the header is "ffd8" the file will start with 10 "1". So the error that it starts with 0x31 0x31 because 0x31 is the ascii code of "1".
What you need is convert the hexa string jpgcontents in a binary byte array.
fileimage = ''.join([ jpgcontent[i:i+2] for i in range(0, len(jpgcontent), 2]
You can then safely copy the fileimage buffer to a binary file:
file = open(sys.argv[1] + "_tr.jpg", "wb")
file.write(fileimage)
The easiest method is using the binascii module: https://docs.python.org/2/library/binascii.html.
import binascii
# code in ascii format contained in a list
code = ['00', '01', '02', '03', '04', '05', '06', '07', '08', '09']
bfile = open('bfile.bin', 'w')
for c in code:
# convert the ascii to binary and write it to the file
bfile.write(binascii.unhexlify(c))
bfile.close()
I was reversing some application and i faced this opcode:
PSHUFB XMM2, XMMWORD_ADDRESS
and i tried implementing the algorithm of this function in python with no success!
The reference of how this opcode should work is here:
http://www.felixcloutier.com/x86/PSHUFB.html
Here is a code snippet:
PSHUFB (with 128 bit operands)
for i = 0 to 15 {
if (SRC[(i * 8)+7] = 1 ) then
DEST[(i*8)+7..(i*8)+0] ← 0;
else
index[3..0] ← SRC[(i*8)+3 .. (i*8)+0]; DEST[(i*8)+7..(i*8)+0] ← DEST[(index*8+7)..(index*8+0)];
endif
}
DEST[VLMAX-1:128] ← 0
im trying to implement the 128 version of this opcode with no success.
Here are the values before and after the function
Before
WINDBG>r xmm2
xmm2= 0 3.78351e-044 6.09194e+027 6.09194e+027
After
WINDBG>r xmm2
xmm2=9.68577e-042 0 4.92279e-029 4.92279e-029
in python you can use 'struct' to change those from float numbers to Hex:
hex(struct.unpack('<I', struct.pack('<f', f))[0])
So i can sort of say those are the hex values of XMM2 before and after the PSHUFB opcode:
Before
xmm2 = 0 0x0000001b 0x6d9d7914 0x6d9d7914
After
xmm2 = 00001b00 00000000 10799d78 10799d78
And most importantly, i almost forgot.. the value of XMMWORD_ADDRESS is:
03 02 01 00 07 06 05 04 0D 0C 0B 0A 09 08 80 80
xmmword 808008090A0B0C0D0405060700010203h
Implementation in Python could be highly appreciated.
Implementation in C could work as well
or maybe some explanation of how the hell it works!
Because i couldnt understand the intel reference
This is the code algorithm i have so far
x = ['00', '00', '00', '00', '00', '00', '00', '1b', '6d', '9d', '79', '14', '6d', '9d', '79', '14']
s = ['03', '02', '01', '00', '07', '06', '05', '04', '0D', '0C', '0B', '0A', '09', '08', '80', '80']
new = []
for i in range(16):
if 0x80 == int(s[i], 16) & 0x80:
print "MSB", s[i]
new.append(0)
else:
print "NOT MSB", s[i]
new.append( x[int(s[i], 16) & 15] )
print x
print new
Where x is the xmm0, and s is the SRC.
the output i get is:
['00', '00', '00', '00', '00', '00', '00', '1b', '6d', '9d', '79',
'14', '6d', '9d', '79', '14']
['00', '00', '00', '00', '1b', '00', '00', '00', '9d', '6d', '14',
'79', '9d', '6d', '00', '00']
where i should get
['00', '00', '1b', '00', '00', '00', '00', '00', '10', '79', '9d',
'78', '10', '79', '9d', '78']
Something else i have noticed right now, in the 'output' i get the hexadecimal number 0x78
Where could it come from?
It works like 16 parallel table lookups, with special handling for indexes that have their top bit set. So for example, it could look like this: (not tested, not Python)
for (int i = 0; i < 16; i++)
new_dest[i] = (src[i] & 0x80) ? 0 : dest[src[i] & 15];
dest = new_dest;
The new_dest there is significant, because it's really 16 parallel assignments, ie read-before-write, the second lookup is not affected by what happened in to the first byte and so on. Intel's code snippet leaves that implicit (or is wrong, depending on how you look at it).
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How do you split a list into evenly sized chunks in Python?
What is the most “pythonic” way to iterate over a list in chunks?
Say I have a string
s = '1234567890ABCDEF'
How can I slice (or maybe split is the correct term?) this string into a list consisting of strings containing 2 characters each?
desired_result = ['12', '34', '56', '78', '90', 'AB', 'CD', 'EF']
Not sure if this is relevant, but I'm parsing a string of hex characters and the final result I need is a list of bytes, created from the list above (for instance, by using int(desired_result[i], 16))
3>> bytes.fromhex('1234567890ABCDEF')
b'\x124Vx\x90\xab\xcd\xef'
You could use binascii:
>>> from binascii import unhexlify
>>> unhexlify(s)
'\x124Vx\x90\xab\xcd\xef'
Then:
>>> list(_)
['\x12', '4', 'V', 'x', '\x90', '\xab', '\xcd', '\xef']
>>> s = '1234567890ABCDEF'
>>> iter_s = iter(s)
>>> [a + next(iter_s) for a in iter_s]
['12', '34', '56', '78', '90', 'AB', 'CD', 'EF']
>>>
>>> s = '1234567890ABCDEF'
>>> [char0+char1 for (char0, char1) in zip(s[::2], s[1::2])]
['12', '34', '56', '78', '90', 'AB', 'CD', 'EF']
But, as others have noted, there are more direct solutions to the more general problem of converting hexadecimal numbers to bytes.
Also note that Robert Kings's solution is more efficient, in general, as it essentially has a zero memory footprint (at the cost of a less legible code).
I need to generate a list of numbers in a specific format. The format is
mylist = [00,01,02,03,04,05,06,07,08,09,10,11,12,13,14,15]
#Numbers between 0-9 are preceded by a zero.
I know how to generate a normal list of numbers using range
>>> for i in range(0,16):
... print i
So, is there any built-in way in python to generate a list of numbers in the specified format.
Python string formatting allows you to specify a precision:
Precision (optional), given as a '.' (dot) followed by the precision.
In this case, you can use it with a value of 2 to get what you want:
>>> ["%.2d" % i for i in range(16)]
['00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12',
'13', '14', '15']
You could also use the zfill function:
>>> str(3).zfill(2)
'03'
or the string format function:
>>> "{0:02d}".format(3)
'03'
One liner in Python 3.x
[f'{i:>02}' for i in range(1, 16)]
Output:
['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14', '15']
in oracle write this query to generate 01 02 03 series
select lpad(rownum+1, decode(length(rownum),1,2, length(rownum)),0) from employees