Looking for a compression algorithm that matches this code - python

I am trying to write a tool to modify a game save. To do so, I need to decompress and recompress the save. I found this decompression algorithm for the save (it works), but I don't understand the code, so I don't know how to compress the data.
Does this look familiar to anyone as a known compression/decompression algorithm?
Thanks!
def get_bit(buffer, ref_pointer, ref_filter, length):
    result = 0
    current = buffer[ref_pointer.value]
    print(current)
    for i in range(length):
        result <<= 1
        if current & ref_filter.value:
            result |= 0x1
        ref_filter.value >>= 1
        if ref_filter.value == 0:
            ref_pointer.value += 1
            current = buffer[ref_pointer.value]
            ref_filter.value = 0x80
    return result

def decompress(buffer, decode, length):
    ref_pointer = Ref(0)
    ref_filter = Ref(0x80)
    dest = 0
    dic = [0] * 0x2010
    while ref_pointer.value < length:
        print(ref_pointer.value, ref_filter.value, dest)
        bits = get_bit(buffer, ref_pointer, ref_filter, 1)
        if ref_pointer.value >= length:
            return dest
        if bits:
            bits = get_bit(buffer, ref_pointer, ref_filter, 8)
            if ref_pointer.value >= length:
                print(dic)
                return dest
            decode[dest] = bits
            dic[dest & 0x1fff] = bits
            dest += 1
        else:
            bits = get_bit(buffer, ref_pointer, ref_filter, 13)
            if ref_pointer.value >= length:
                print(dic)
                return dest
            index = bits - 1
            bits = get_bit(buffer, ref_pointer, ref_filter, 4)
            if ref_pointer.value >= length:
                print(dic)
                return dest
            bits += 3
            for i in range(bits):
                dic[dest & 0x1fff] = dic[index + i]
                decode[dest] = dic[index + i]
                dest += 1
    print(dic)
    return dest
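
The decoder above reads one flag bit, then either an 8-bit literal or a 13-bit position followed by a 4-bit length (stored as length minus 3). That layout resembles an LZ77/LZSS-style format, so a compressor would have to emit the same fields, most significant bit first. Below is a minimal sketch of only the bit-packing half. The names BitWriter, emit_literal and emit_match are hypothetical and not from the original code; the sketch does not search for matches and is not claimed to reproduce the game's actual encoder.

```python
class BitWriter:
    """Packs bits MSB-first, mirroring how get_bit() consumes them."""

    def __init__(self):
        self.out = bytearray()
        self.current = 0      # byte currently being assembled
        self.mask = 0x80      # next bit position, same start value as ref_filter

    def write(self, value, nbits):
        # Emit the nbits lowest bits of value, most significant first.
        for shift in range(nbits - 1, -1, -1):
            if (value >> shift) & 1:
                self.current |= self.mask
            self.mask >>= 1
            if self.mask == 0:
                self.out.append(self.current)
                self.current = 0
                self.mask = 0x80

    def finish(self):
        # Flush a partially filled last byte, padded with zero bits.
        if self.mask != 0x80:
            self.out.append(self.current)
            self.current = 0
            self.mask = 0x80
        return bytes(self.out)


def emit_literal(writer, byte):
    writer.write(1, 1)            # flag 1: an 8-bit literal follows
    writer.write(byte, 8)


def emit_match(writer, index, length):
    # The decoder computes index = field - 1 and count = field + 3,
    # so the encoder stores index + 1 and length - 3.
    writer.write(0, 1)            # flag 0: a back-reference follows
    writer.write(index + 1, 13)
    writer.write(length - 3, 4)


w = BitWriter()
emit_literal(w, 0x41)
packed = w.finish()
print(packed.hex())  # a080
```

A full compressor would still need a match finder that decides when to call emit_match, plus whatever trailing padding the decoder's end-of-input checks expect; both are left open here.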

Related

How can I edit this code so it accepts a GIF file? It works for JPG and PNG but not GIF

Full code: https://paste.pythondiscord.com/kenatayuce.py
I am encountering this error when trying to use LSB Steganography on a GIF:
pix = [value for value in imdata.next()[:3] + # 3 pixels extracted at one time
TypeError: 'int' object is not subscriptable
def modPix(pix, data): # Pixels are modified from 8-bit binary
    datalist = genData(data)
    lendata = len(datalist)
    imdata = iter(pix)
    for i in range(lendata):
        pix = [value for value in imdata.__next__()[:3] + # 3 pixels extracted at one time
               imdata.__next__()[:3] +
               imdata.__next__()[:3]]
        for j in range(0, 8):
            if (datalist[i][j] == '0' and pix[j]% 2 != 0): # Pixel value = 1 for odd, 0 for even
                pix[j] -= 1
            elif (datalist[i][j] == '1' and pix[j] % 2 == 0):
                if(pix[j] != 0):
                    pix[j] -= 1
                else:
                    pix[j] += 1
        if (i == lendata - 1): # 8th pixel will state whether to stop or to carry on reading
            if (pix[-1] % 2 == 0): # 0 = Keep reading.
                if(pix[-1] != 0): # 1 = Stop. Message is over.
                    pix[-1] -= 1
                else:
                    pix[-1] += 1
        else:
            if (pix[-1] % 2 != 0):
                pix[-1] -= 1
        pix = tuple(pix)
        yield pix[0:3]
        yield pix[3:6]
        yield pix[6:9]

Comparing Raw I420 Video Files with Python

I want to write a program that tells you whether two given raw I420 (4:2:0) video files are frame-identical. So far I have this:
import os

def checkifduplicate(file, file2, width, height):
    bytesPerFrame = int(width * height * 12/8)
    n1 = int(os.stat(file).st_size / bytesPerFrame)
    n2 = int(os.stat(file2).st_size / bytesPerFrame)
    if (n1 != n2):
        return 0
    with open (file, "rb") as f1:
        with open(file2, "rb") as f2:
            frameF1 = f1.read(bytesPerFrame)
            frameF2 = f2.read(bytesPerFrame)
            counter = 1
            while frameF1 != b"" and frameF2 != b"" and counter <= n1:
                if (frameF1 != frameF2):
                    return 1
                frameF1 = f1.read(bytesPerFrame)
                frameF2 = f2.read(bytesPerFrame)
                counter += 1
    return 2

inputFile1 = "1.raw"
inputFile2 = "2.raw"
n = checkifduplicate(inputFile1,inputFile2,3840,2160)
if (n == 0):
    print ("Files contain different amounts of Frames... Ending")
elif (n == 1):
    print ("Different Frames")
elif (n == 2):
    print ("Identical Stream")
pause = input("Press a Key to End...")
If I use two raw files of different sizes, it works properly.
Also, if I use the same raw file twice, it returns 2 (identical stream).
However, if I copy one raw file, rename the copy, and compare those two files, it returns 1 (different frames).
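
One way to narrow down a result like this is to report the first frame and byte offset at which the two files differ, rather than only a status code. A minimal sketch, assuming I420 frames of width * height * 3 // 2 bytes; first_difference is a hypothetical helper, not part of the original script.

```python
def first_difference(path1, path2, width, height):
    """Return (frame_index, byte_offset) of the first mismatch, or None."""
    frame_size = width * height * 3 // 2   # I420: 12 bits per pixel
    with open(path1, "rb") as f1, open(path2, "rb") as f2:
        frame_index = 0
        while True:
            a = f1.read(frame_size)
            b = f2.read(frame_size)
            if not a and not b:
                return None                # both files ended together
            if a != b:
                # Locate the first differing byte inside this frame.
                for offset, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return frame_index, offset
                # One chunk is a prefix of the other: the lengths differ.
                return frame_index, min(len(a), len(b))
            frame_index += 1
```

Calling first_difference("1.raw", "2.raw", 3840, 2160) on the copied pair would show whether the mismatch is real and where it starts, which may help tell a damaged copy from a logic error in the comparison.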

Memory overflow in Python

I have 67,000 files that I need to read in order to extract similarities between words. When I run the code, my laptop slows down so much that I can't open any other application, and then a memory overflow error shows up (even when I run it on only around 10,000 of the files). Is there a way to clear the memory after every loop iteration, or is running the code on all the files simply impossible? Below is the code:
import math
import operator
import os
import string

import nltk

def isAscii(s):
    for c in s:
        if c not in string.printable:
            return False
    return True

windowSize = 2
relationTable = {}
probabilities = {}
wordCount = {}
totalWordCount = 0

def sim(w1, w2):
    numerator = 0
    denominator = 0
    if (w1 in relationTable) and (w2 in relationTable):
        rtw1 = {}
        rtw2 = {}
        rtw1 = relationTable[w1]
        rtw2 = relationTable[w2]
        for word in rtw1:
            rtw1_PMI = rtw1[word]['pairPMI']
            denominator += rtw1_PMI
            if(word in rtw2):
                rtw2_PMI = rtw2[word]['pairPMI']
                numerator += (rtw1_PMI + rtw2_PMI)
        for word in rtw2:
            rtw2_PMI = rtw2[word]['pairPMI']
            denominator += rtw2_PMI
        if(denominator != 0):
            return float(numerator)/denominator
        else:
            return 0
    else:
        return -1

AllNotes = {}
AllNotes = os.listdir("C:/Users/nerry-san/Desktop/EECE 502/MedicalNotes")
fileStopPunctuations = open('C:/Users/nerry-san/Desktop/EECE 502/stopPunctuations.txt')
stopPunctuations = nltk.word_tokenize(fileStopPunctuations.read())
for x in range (0, 10):
    fileToRead = open('C:/Users/nerry-san/Desktop/EECE 502/MedicalNotes/%s'%(AllNotes[x]))
    case1 = fileToRead.read()
    text = nltk.WordPunctTokenizer().tokenize(case1.lower())
    final_text = []
    for index in range(len(text)):
        word = text[index]
        if (word not in stopPunctuations):
            final_text.append(word)
    for index in range (len(final_text)):
        w1 = final_text[index]
        if(isAscii(w1)):
            for index2 in range(-windowSize, windowSize+1):
                if (index2 != 0):
                    if ( index + index2 ) in range (0, len(final_text)):
                        w2 = final_text[index + index2]
                        if(isAscii(w2)):
                            totalWordCount += 1
                            if (w1 not in wordCount):
                                wordCount[w1] = {}
                                wordCount[w1]['wCount'] = 0
                            try:
                                wordCount[w1][w2]['count'] += 1
                                wordCount[w1]['wCount'] += 1
                            except KeyError:
                                wordCount[w1][w2] = {'count':1}
                                wordCount[w1]['wCount'] += 1
for word in wordCount:
    probabilities[word]={}
    probabilities[word]['wordProb'] = float (wordCount[word]['wCount'])/ totalWordCount
for word in wordCount:
    relationTable[word] = {}
    for word2 in wordCount[word]:
        if ( word2 != 'wCount'):
            pairProb = float(wordCount[word][word2]['count'])/(wordCount[word]['wCount'])
            relationTable[word][word2] = {}
            relationTable[word][word2]['pairPMI'] = math.log(float(pairProb)/(probabilities[word]['wordProb'] * probabilities[word2]['wordProb']),2)
l = []
for word in relationTable:
    l.append(word)
for index in range (0, len(l)):
    word = l[index]
    simValues = []
    for index2 in range (0, len(l)):
        word2 = l[index2]
        if(word!= word2):
            simVal = sim(word,word2)
            if(simVal > 0):
                simValues.append([word2, simVal])
    simValues.sort(key= operator.itemgetter(1), reverse = True)

Every time you open a file, use the "with" statement. This will ensure the file is closed when the loop finishes (or rather, when the with block is exited).
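
A minimal sketch of that pattern applied to a loop over many files. The function count_words and its directory argument are hypothetical, and plain whitespace splitting stands in for the NLTK tokenizer used in the question.

```python
import os
from collections import Counter


def count_words(directory):
    """Count whitespace-separated tokens across every file in directory."""
    totals = Counter()
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # The file is closed as soon as the with block exits,
        # so open handles do not accumulate across iterations.
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line in handle:            # stream line by line
                totals.update(line.lower().split())
    return totals
```

This only addresses open file handles. The nested dictionaries in the question grow with the vocabulary regardless, so closing files may not be enough on its own to avoid the memory error.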

Bitonic sort, mpi4py

I am attempting to implement the Bitonic-Sort algorithm.
Parallel Bitonic Sort Algorithm for processor Pk (for k := 0 ... P-1)
d := log P                /* cube dimension */
sort(local data_k)        /* sequential sort */
/* Bitonic Sort follows */
for i := 1 to d do
    window-id = Most Significant (d-i) bits of Pk
    for j := (i-1) down to 0 do
        if ((window-id is even AND jth bit of Pk = 0) OR
            (window-id is odd AND jth bit of Pk = 1))
        then call CompareLow(j)
        else call CompareHigh(j)
        endif
    endfor
endfor
Source: http://www.cs.rutgers.edu/~venugopa/parallel_summer2012/mpi_bitonic.html#expl
Unfortunately the descriptions of CompareHigh and CompareLow are shaky at best.
From my understanding, CompareHigh will take the data from the calling process, and its partner process, merge the two, sorted, and store the upper half in the calling process' data. CompareLow will do the same, and take the lower half.
I've verified that my implementation is selecting the correct partners and calling the correct CompareHigh/Low method during each iteration for each process, but my output is still only partially sorted. I'm assuming that my implementation of CompareHigh/Low is incorrect.
Here is a sample of my current output:
[0] [17 24 30 37]
[1] [ 92 114 147 212]
[2] [ 12 89 92 102]
[3] [172 185 202 248]
[4] [ 30 51 111 148]
[5] [148 149 158 172]
[6] [ 17 24 59 149]
[7] [160 230 247 250]
And here are my CompareHigh, CompareLow, and merge functions:
def CompareHigh(self, j):
    partner = self.getPartner(self.rank, j)
    print "[%d] initiating HIGH with %d" % (self.rank, partner)
    new_data = np.empty(self.data.shape, dtype='i')
    self.comm.Send(self.data, dest = partner, tag=55)
    self.comm.Recv(new_data, source = partner, tag=55)
    assert(self.data.shape == new_data.shape)
    self.data = np.split(self.merge(data, new_data), 2)[1]

def CompareLow(self, j):
    partner = self.getPartner(self.rank, j)
    print "[%d] initiating LOW with %d" % (self.rank, partner)
    new_data = np.empty(self.data.shape, dtype='i')
    self.comm.Recv(new_data, source = partner, tag=55)
    self.comm.Send(self.data, dest = partner, tag=55)
    assert(self.data.shape == new_data.shape)
    self.data = np.split(self.merge(data, new_data), 2)[0]

def merge(self, a, b):
    merged = []
    i = 0
    j = 0
    while i < a.shape[0] and j < b.shape[0]:
        if a[i] < b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    while i < a.shape[0]:
        merged.append(a[i])
        i += 1
    while j < a.shape[0]:
        merged.append(b[j])
        j += 1
    return np.array(merged)

def getPartner(self, rank, j):
    # Partner process is process with j_th bit of rank flipped
    j_mask = 1 << j
    partner = rank ^ j_mask
    return partner
Finally, here is the actual algorithm loop:
# Generating map of bit_j for each process.
bit_j = [0 for i in range(d)]
for i in range(d):
    bit_j[i] = (rank >> i) & 1
bs = BitonicSorter(data)
for i in range(1, d+1):
    window_id = rank >> i
    for j in reversed(range(0, i)):
        if rank == 0: print "[%d] iteration %d, %d" %(rank, i, j)
        comm.Barrier()
        if (window_id%2 == 0 and bit_j[j] == 0) \
        or (window_id%2 == 1 and bit_j[j] == 1):
            bs.CompareLow(j)
        else:
            bs.CompareHigh(j)
    if rank == 0: print ""
    comm.Barrier()
if rank != 0:
    comm.Send(bs.data, dest = 0, tag=55)
    comm.Barrier()
else:
    dataset[0] = bs.data
    for i in range(1, size) :
        comm.Recv(dataset[i], source = i, tag=55)
    comm.Barrier()
    for i, datai in enumerate(dataset):
        print "[%d]\t%s" % (i, str(datai))
    dataset = np.array(dataset).reshape(data_size)

Well, bugger me. These were the problematic lines:
self.data = np.split(self.merge(data, new_data), 2)
They pass data to merge rather than self.data. I'm not sure what the variable data was bound to, but that was the problem right there.
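
For reference, the compare-split step described in the question (merge both sorted blocks, then keep one half) can be sketched without MPI. The helper compare_split is hypothetical and works on plain lists; in the real code the partner's block would arrive through comm.Recv.

```python
import heapq


def compare_split(mine, theirs, keep_low):
    """Merge two sorted, equal-length blocks and keep one half."""
    merged = list(heapq.merge(mine, theirs))
    half = len(mine)
    return merged[:half] if keep_low else merged[half:]


# Two blocks taken from the sample output in the question.
low = compare_split([17, 24, 30, 37], [12, 89, 92, 102], keep_low=True)
high = compare_split([12, 89, 92, 102], [17, 24, 30, 37], keep_low=False)
print(low)   # [12, 17, 24, 30]
print(high)  # [37, 89, 92, 102]
```

Both partners merge the same two blocks, so the low and high halves together hold every element exactly once.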

BitString with python

I am trying to use bitstring for Python to interpret an incoming data packet and break it up into readable sections. The packet will consist of a header (Source (8 bits), Destination (8 bits), NS (3 bits), NR (3 bits), RSV (1 bit), LST (1 bit), OPcode (8 bits), LEN (8 bits)),
the payload, which is somewhere between 0 and 128 bytes (determined by the LEN in the header), and a 16-bit CRC.
The data will arrive in a large packet over the COM port. It originates from a microcontroller that packetizes the data and sends it to the user, which is where Python comes into play.
Since I am unsure how to store it before parsing, I do not have any code for this.
I am new to Python and need a little help getting this off the ground.
Thanks,
Erik
EDIT
I currently have a section of code up and running, but it is not producing exactly what I need. Here it is:
def packet_make(ser):
    src = 10
    p = 0
    lst = 0
    payload_make = 0
    crc = '0x0031'
    ns = 0
    nr = 0
    rsv = 0
    packet_timeout = 0
    top = 256
    topm = 255
    #os.system(['clear','cls'][os.name == 'nt'])
    print("\tBatts: 1 \t| Berry: 2 \t| Bessler: 3")
    print("\tCordell: 4 \t| Dave: 5 \t| Gold: 6")
    print("\tYen: 7 \t| Erik: 8 \t| Tommy: 9")
    print("\tParsons: 10 \t| JP: 11 \t| Sucess: 12")
    dst = raw_input("Please select a destination Adderss: ")
    message = raw_input("Please type a message: ")
    #################### Start Making packet#################
    p_msg = message
    message = message.encode("hex")
    ln = (len(message)/2)
    #print (ln)
    ln_hex = (ln * 2)
    message = list(message)
    num_of_packets = ((ln/128) + 1)
    #print (num_of_packets)
    message = "".join(message)
    src = hex(src)
    dst = hex(int(dst))
    #print (message)
    print("\n########Number of packets = "+str(num_of_packets) + " ############\n\n")
    for p in range (num_of_packets):
        Ack_rx = 0
        if( (p + 1) == (num_of_packets)):
            lst = 1
        else:
            lst = 0
        header_info = 0b00000000
        if ((p % 2) > 0):
            ns = 1
        else:
            ns = 0
        header_info = (header_info | (ns << 5))
        header_info = (header_info | (nr << 2))
        header_info = (header_info | (rsv << 1))
        header_info = (header_info | (lst))
        header_info = hex(header_info)
        #print (header_info)
        op_code = '0x44'
        if (lst == 1):
            ln_packet = ((ln_hex - (p * 256)) % 256)
            if (p > 0):
                ln_packet = (ln_packet + 2)
            else:
                ln_packet = ln_packet
            ln_packet = (ln_packet / 2)
            # print (ln_packet)
            # print()
        else:
            ln_packet = 128
            # print(ln_packet)
            # print()
        #ll = (p * 128)
        #print(ll)
        #ul = ((ln - ll) % 128)
        #print(ul)
        #print (message[ll:ul])
        if ((p == 0)&(ln_hex > 256)):
            ll = (p * 255)
            # print(ll)
            payload_make = (message[ll:256])
            # print (payload_make)
        elif ((p > 0) & ((ln_hex - (p*256)) > 256)):
            ll = (p * 256)
            # print(ll)
            ll = (ll - 2)
            ul = (ll + 256)
            # print (ul)
            payload_make = (message[ll:ul])
            # print(payload_make)
        elif ((p > 0) & ((ln_hex - (p*256)) < 257)):
            ll = (p * 256)
            # print(ll)
            ll = (ll - 2)
            ul = ((ln_hex - ll) % 256)
            ul = (ll + (ul))
            ul = ul + 2
            print()
            print(ul)
            print(ln_hex)
            print(ln_packet)
            print()
            # print(ul)
            payload_make = (message[ll:ul])
            # print(payload)
        elif ((p == 0) & (ln_hex < 257)):
            ll = (p * 255)
            ul = ln_hex
            payload_make = (message[ll:ul])
        print(payload_make)
        packet_m = BitStream()
        ########################HEADER#########################
        packet_m.append('0x0')
        packet_m.append(src) #src
        packet_m.append('0x0')
        packet_m.append(dst) #dst
        if(int(header_info,16) < 16):
            packet_m.append('0x0')
        packet_m.append(header_info) # Ns, Nr, RSV, Lst
        packet_m.append(op_code) #op Code
        #if(ln_packet < 16):
        #packet_m.append('0x0')
        packet_m.append((hex(ln_packet))) #Length
        ###################END OF HEADER#######################
        packet_m.append(("0x"+payload_make)) #Payload
        #packet_m.append(BitArray(p_msg)) #Payload
        packet_m.append(crc) #CRC
        #print()
        #print(packet)
        temp_ack = '0x00'
        print(packet_m)
        print(ln_packet)
        while((Ack_rx == 0) & (packet_timeout <= 5)):
            try:
                ###### Send the packet
                #ser.write(chr(0x31))
                str_pack = list(str(packet_m)[2:])
                "".join(str_pack)
                ser.write(chr(0x02))
                #ser.write((str(packet_m)[2:]))
                for i in range (len(str_pack)):
                    t = ord(str_pack[i])
                    ser.write(chr(t))
                    print(chr(t))
                ser.write(chr(0x04))
                ser.write(chr(0x10))
                ack_packet = BitStream(ser.read())
                if((len(ack_packet) > 3)):
                    temp_ack = ACK_parse(ack_packet)
                else:
                    packet_timeout = (packet_timeout + 1)
                    print "why so serious\n\n"
                if(temp_ack == '0x41'):
                    Ack_rx = 1
                elif (temp_ack == '0x4E'):
                    Ack_rx = 0
                else:
                    Ack_rx = 0
            except serial.SerialTimeoutException: #if timeout occurs increment counter and resend last packet
                Ack_rx = 0
                packet_timeout = (packet_timeout + 1)
            except serial.SerialException:
                print "Error ... is not Active!!!", port
When source and payload are both 1, the following is printed to the terminal:
#######Number of packets = 1 #######
31
0x0a0101441310031
1
0
.... etc..
The micro on the other end of the serial link reads: 0a0101441310031
when it should read: a 1 1 44 1 31 0031
Python is sending each hex digit as a separate character rather than one character per byte. When the values were appended to the packet, each hex value seems to have been split across two 8-bit locations instead of being stored in a single one with the proper length and data type.
The section of Python code that reads from the micro works flawlessly when reading an acknowledgement packet. I have not tried it with data, but I don't think that will be an issue. The C side cannot read the ACK from the Python side, since the hex values are separated into two chars rather than transmitted as a single 8-bit value.
Any ideas? Thanks

Your exact problem is a bit vague, but I should be able to help with the bitstring portion of it.
You've probably got your payload to analyse as a str (or possibly bytes if you're using Python 3 but don't worry - it works the same way). If you haven't got that far then you're going to have to ask a more basic question. I'm going to make up some data to analyse (all this is being done with an interactive Python session):
>>> from bitstring import BitStream
>>> packet_data = '(2\x06D\x03\x124V\x03\xe8'
>>> b = BitStream(bytes=packet_data)
Now you can unpack or use reads on your BitStream to extract the things you need. For example:
>>> b.read('uint:8')
40
>>> b.read('uint:8')
50
>>> b.readlist('uint:3, uint:3')
[0, 1]
>>> b.readlist('2*bool')
[True, False]
>>> b.readlist('2*uint:8')
[68, 3]
>>> b.read('bytes:3')
'\x124V'
This is just parsing the bytes and interpreting chunks as unsigned integers, bools or bytes. Take a look at the manual for more details.
If you just want the payload, then you can just extract the length then grab it by slicing:
>>> length = b[32:40].uint
>>> b[40:40 + length*8]
BitStream('0x123456')
and if you want it back as a Python str, then use the bytes interpretation:
>>> b[40:40 + 3*8].bytes
'\x124V'
There are more advanced things you can do too, but a good way to get going in Python is often to open an interactive session and try some things out.
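
On the sending side described in the question, the hex digits go out as individual characters because the packet is serialized as text. Below is a sketch of packing the same header fields into raw bytes with the standard struct module instead. The names build_packet and parse_packet are hypothetical, the field layout follows the header description in the question, and the big-endian byte order of the CRC is an assumption.

```python
import struct


def build_packet(src, dst, ns, nr, rsv, lst, opcode, payload, crc):
    """Pack header, payload and CRC into raw bytes, one byte per header field."""
    # Flag byte layout: NS (3 bits), NR (3 bits), RSV (1 bit), LST (1 bit).
    flags = (ns << 5) | (nr << 2) | (rsv << 1) | lst
    header = struct.pack("BBBBB", src, dst, flags, opcode, len(payload))
    return header + payload + struct.pack(">H", crc)   # CRC assumed big-endian


def parse_packet(packet):
    """Split raw bytes back into the fields written by build_packet."""
    src, dst, flags, opcode, length = struct.unpack("BBBBB", packet[:5])
    payload = packet[5:5 + length]
    (crc,) = struct.unpack(">H", packet[5 + length:7 + length])
    return {
        "src": src, "dst": dst,
        "ns": flags >> 5, "nr": (flags >> 2) & 0x7,
        "rsv": (flags >> 1) & 1, "lst": flags & 1,
        "opcode": opcode, "payload": payload, "crc": crc,
    }


packet = build_packet(10, 1, 0, 0, 0, 1, 0x44, b"1", 0x0031)
print(packet.hex())  # 0a01014401310031
```

The resulting bytes object can be handed to ser.write(packet) in one call, so each field travels as a single 8-bit value rather than as two hex characters.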
