Python decoding, base64, nbt, gzip? what is it? - python

I am trying to get information from a Minecraft API. The API lets you read players' inventories, but this is what it returns: here is a link to pastebin
I tried to run base64 decoding on it in Python, but it gave me output like this (only a few lines):
b'\xad\xa9\xc0d\x85\xe4\xe0\x87`\xcess\x00\x9b]e~c\xea\xaa\xb8\x9a\xa4\xdd\x958"\x8f\x0f\x10\xb9\xea\x9f2v\xdd\xcc#N\xe8x\xb4\xdd\x18\xa9\xee>\xcfM
I read a bit about it on their forums, and a few comments said stuff about "base64, gzip, nbt".
Now, I haven't really worked with decoding before, and I am trying to understand what it all means.
Thanks

NBT (Named Binary Tag) is a Minecraft-specific format.
So what you get is an NBT file that has been compressed in the gzip format and then Base64 encoded.
After Base64 decoding, you need to decompress the gzip data to get the NBT.
There is also an NBT parser for Python.
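The first two steps described above can be sketched with the standard library alone. This is a minimal sketch under assumptions: `decode_inventory` is a hypothetical helper name, and the sample payload is built locally as a stand-in for the API response. Parsing the resulting bytes is left to an NBT library, which is not shown here.

```python
import base64
import gzip


def decode_inventory(encoded: str) -> bytes:
    """Return raw NBT bytes from a base64-encoded, gzip-compressed string."""
    compressed = base64.b64decode(encoded)  # step 1: undo Base64
    return gzip.decompress(compressed)      # step 2: undo gzip


# Placeholder bytes standing in for real NBT data (assumption, not API output).
sample_nbt = b"\x0a\x00\x00\x00"
encoded = base64.b64encode(gzip.compress(sample_nbt)).decode("ascii")

raw_nbt = decode_inventory(encoded)
print(raw_nbt)
```

The bytes returned by `decode_inventory` are what an NBT parser would take as input.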

Related

How to encode a video response in python?

I made a request and received a byte response that I know is a video; the status code was 200. I don't know how to use it. I tried encoding it as UTF-8 and saving it to a file, but the file is not playable: media players are unable to read its content. Here is the request I made:
import requests
resp = requests.get('https://bcboltsony-a.akamaihd.net/media/v1/hls/v4/aes128/5182475815001/4ded6ac4-6f8b-4da2-8194-db2391d5e331/164fe5c5-15a3-4997-b4c6-7dd4b95f9c57/92410c6d-c565-4341-8650-1d40a795ece2/5x/segment1.ts?akamai_token=exp=1589337578~acl=/media/v1/hls/v4/aes128/5182475815001/4ded6ac4-6f8b-4da2-8194-db2391d5e331/164fe5c5-15a3-4997-b4c6-7dd4b95f9c57/92410c6d-c565-4341-8650-1d40a795ece2/*~hmac=bf9745f2a9b51c04d59eb9955de20dcf1b4c8c7e434ad0bdd639f2d80fa10ecc')
open('E:/video.mp4', 'wb').write(bytes(resp.text, encoding='utf-8'))
How can I convert this response to a watchable format?
Try the wget package, which can make downloading files simpler.
Here is a simple example for your situation:
import wget
url = "https://bcboltsony-a.akamaihd.net/media/v1/hls/v4/aes128/5182475815001/4ded6ac4-6f8b-4da2-8194-db2391d5e331/164fe5c5-15a3-4997-b4c6-7dd4b95f9c57/92410c6d-c565-4341-8650-1d40a795ece2/5x/segment1.ts?akamai_token=exp=1589337578~acl=/media/v1/hls/v4/aes128/5182475815001/4ded6ac4-6f8b-4da2-8194-db2391d5e331/164fe5c5-15a3-4997-b4c6-7dd4b95f9c57/92410c6d-c565-4341-8650-1d40a795ece2/*~hmac=bf9745f2a9b51c04d59eb9955de20dcf1b4c8c7e434ad0bdd639f2d80fa10ecc"
wget.download(url, 'c:/users/Yourname/downloads/video.mp4')
If this does not work, the encoding problem may be on the server's side.
Your code is absolutely right. But note that:
If you open this URL in your browser, you will find it is a .ts file rather than an .mp4 file.
Also, if you download it directly in the browser, you still cannot play it. On my PC, the player reports that the file is damaged.
If you search online, you will find that .ts segments like this can be encrypted (the URL path indicates AES-128). You may need to decrypt it first.
Replace the last line of your code with the line below. I hope it works :).
open('E:/video.mp4', 'wb').write(resp.content)
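Why the bytes matter can be illustrated without any network access. The sketch below assumes a lossy decode similar to what happens when binary data is treated as text, and uses a few made-up bytes in place of the real video segment; `save_binary` is a hypothetical helper name.

```python
import os
import tempfile


def save_binary(path: str, payload: bytes) -> None:
    """Write bytes to disk unchanged (the role resp.content plays)."""
    with open(path, "wb") as fh:
        fh.write(payload)


# Illustrative bytes only; 0xFF and 0xFE are never valid in UTF-8.
original = bytes([0x47, 0x40, 0x00, 0x10, 0xFF, 0xFE])

# Simulate the text round trip: undecodable bytes become U+FFFD.
as_text = original.decode("utf-8", errors="replace")
round_tripped = as_text.encode("utf-8")

out_path = os.path.join(tempfile.mkdtemp(), "segment1.ts")
save_binary(out_path, original)

with open(out_path, "rb") as fh:
    on_disk = fh.read()

print(round_tripped == original)  # False: the text round trip changed the data
print(on_disk == original)        # True: raw bytes survive intact
```

Even with intact bytes, an encrypted segment would still need decrypting before a player could read it.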

Decode bytes that are in string form with unknown encoding

I just started my own little project to create a bot for a game.
I have only done a little coding before, so I am definitely no expert. If I get something mixed up or forget to mention some information, I apologize in advance!
Basically, my Python bot connects to the server (WebSocket version 13; the header says "Accept-Encoding: gzip, deflate, br"). I use the WebSocket module and that works well. The game sends messages in JSON format. However, they are full of backslashes; I think a JavaScript routine internally clears those out, splits each message into multiple ones, and removes the outermost layer. So far my solution is to just strip the backslashes, and from there on it's pretty straightforward.
The problem is that the map data is apparently encoded. The message looks like this:
{"type":"pkg","data":"[\"{\\\"type\\\":\\\"pl\\\",\\\"data\\\":[\\\"{\\\\\\\"type\\\\\\\":\\\\\\\"p\\\\\\\",\\\\\\\"id\\\\\\\":227727,\\\\\\\"tpl\\\\\\\":227727,\\\\\\\"s\\\\\\\":458
.... and then at the end of the message (it's a lot longer, I just didn't want to post 30 lines of compressed data):
{\\\"type\\\":\\\"zip\\\",\\\"data\\\":\\\"{\\\\\\\"type\\\\\\\":\\\\\\\"map\\\\\\\",\\\\\\\"xî182îyî478îtilesî\\\\\\\"1:î¢î¤î£î¦526_21î¢254
which is obviously the encoded / compressed map data. Firefox dev tools, however, shows it differently; there it looks more like this:
\\\\\\\"map\\\\\\\",\\\\\\\"xî\u0080\u0086182î\u0080\u008dyî\u0080\u0086478î\u0080\u008dtilesî\u0080\u0086\\\\\\\"1:î\u0080¢î\u0080¤î\u0080£î\u0080¦526_21î\u0080¢254:36î\u0080²î\u0080´î\u0080³î\u0080µ:î\u0080¬î\u0080¸î\u0080ºî\u0080·î\u0080·î\u0080ºî\u0080¼î\u0080¶
I experimented with different commands and modules like zlib, but honestly, I'm really lost. Is that data already decoded and now in byte form, or is it still compressed zip data? If so, how can I decode it, given that I currently handle it as a raw string? Or should I write it to a data file from the start? What does the xî at the beginning stand for: the encoding scheme?
Any help is greatly appreciated. I would really like to know what the heck is going on here :D
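The backslash layers in the excerpt look like JSON that was serialized repeatedly inside JSON strings. Under that assumption, decoding each layer with the json module may be safer than stripping backslashes by hand. The sketch below uses a hypothetical `unwrap` helper and a locally built message that mirrors the layering; it does not address the compressed "zip" map payload, whose scheme is not identified here.

```python
import json


def unwrap(value):
    """Recursively decode JSON that was serialized inside JSON strings."""
    if isinstance(value, str):
        try:
            decoded = json.loads(value)
        except ValueError:
            return value  # ordinary string, not nested JSON
        # Only treat objects and arrays as nested payloads, so that a
        # plain string such as "123" is not silently turned into a number.
        if isinstance(decoded, (dict, list)):
            return unwrap(decoded)
        return value
    if isinstance(value, list):
        return [unwrap(item) for item in value]
    if isinstance(value, dict):
        return {key: unwrap(item) for key, item in value.items()}
    return value


# Hypothetical message built to mirror the layering in the excerpt above.
inner = json.dumps({"type": "p", "id": 227727, "tpl": 227727})
middle = json.dumps({"type": "pl", "data": [inner]})
message = json.dumps({"type": "pkg", "data": json.dumps([middle])})

print(unwrap(message))
```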

python - Getting requests response in gzip encoding. How to convert the data to the actual file?

I am trying to download a file through an API. I was told that the API response is gzip encoded. I need to convert this response into the actual file, with the text data in its original format. Please help me do that. Here is my code:
import requests
response = requests.get(api_endpoint, headers={"Authorization": token_no})
data = response.content
The data variable holds the gzip-encoded content. It looks like this:
�j7���8}��#���hT�Nj���C�MKJ�#]�Sy�{
How can I convert this to the actual text data? I searched Stack Overflow but nothing helped. I also tried the gzip library in Python, but that raises OSError: Not a gzipped file (b'%P')
Thanks
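One possible explanation, offered as a guess: Requests normally decompresses a gzip Content-Encoding on its own, so response.content may already hold the decoded file. The b'%P' in the error would be consistent with data that starts with %PDF, though that cannot be confirmed from the excerpt. A minimal sketch that only decompresses when the gzip magic bytes are present (`gunzip_if_needed` is a hypothetical helper, and the sample bytes are made up):

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def gunzip_if_needed(data: bytes) -> bytes:
    """Decompress gzip data; return anything else unchanged."""
    if data.startswith(GZIP_MAGIC):
        return gzip.decompress(data)
    return data


# Made-up stand-ins for response.content in the two possible cases.
already_decoded = b"%PDF-1.4 example body"
still_compressed = gzip.compress(already_decoded)

print(gunzip_if_needed(already_decoded) == already_decoded)
print(gunzip_if_needed(still_compressed) == already_decoded)
```

Inspecting the first few bytes of the data this way also shows what kind of file was actually returned.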

Strange behaviour when converting base64 string to png in Python

Hello, I'm new to the concept of base64 images. I was trying to convert base64 "links" in an HTML page to PNG files in Python, but the generated PNG seems to be damaged and I don't know why. Here is my code (in Python 3.6):
import base64
encoded = (string2[0].split(",")[1]).encode("utf-8")
with open(r"myDirectory\example1.png", "wb") as fh:
    fh.write(base64.decodebytes(encoded))
string2[0] is the full base64 string which I copied from the HTML. i.e. something like
data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAA0gAA...K5C%0AYII=
The problem is essentially the following: a PNG file is generated, but when I open it, Windows says "the file appears to be damaged, corrupted". Strangely, however, when I open this base64 string in Google Chrome, the image is displayed.
Has anyone encountered a similar situation before?
P.S. I was thinking of providing the full base64 string, but it's very long. Does anyone know how to paste such a long string into the question, e.g. a "draggable box of code" similar to what the OP has done in this question?
Edit: The base64 string can be found here. This is my first time sharing documents in Google Drive, so let me know if you can access it.
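The "%0A" in the sample string is a percent-encoded newline, which suggests the data URI is URL-encoded. A browser undoes that encoding before decoding the base64, while the code above passes it straight to the decoder, where "0" and "A" count as valid base64 characters. A minimal sketch that percent-decodes first, assuming this is the cause (`decode_data_uri` is a hypothetical helper and the sample bytes are made up):

```python
import base64
from urllib.parse import unquote


def decode_data_uri(data_uri: str) -> bytes:
    """Return the binary payload of a base64 data URI."""
    _header, _sep, payload = data_uri.partition(",")
    payload = unquote(payload)        # turns "%0A" back into a newline
    return base64.b64decode(payload)  # newlines are ignored by default


# Made-up payload: a PNG signature followed by filler bytes.
original = b"\x89PNG\r\n\x1a\n" + bytes(range(64))
b64_text = base64.encodebytes(original).decode("ascii")  # contains newlines
data_uri = "data:image/png;base64," + b64_text.replace("\n", "%0A")

print(decode_data_uri(data_uri) == original)
```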

html DOM: good test webpages to test encoding/decoding

What I'm doing is:
via JavaScript, reading the DOM of a webpage
converting it to a JSON string
sending it to Python via AJAX
in Python, decoding the JSON string into an object
What I want is for any text that is part of the JSON to be in Unicode, to avoid any character issues. I used to use BeautifulSoup for this:
from bs4 import *
from bs4.dammit import UnicodeDammit
text_unicode = UnicodeDammit(text, [None, None], "html", True).unicode_markup
But that doesn't work with the JSON string. Running the string through UnicodeDammit causes an error when I try to decode the JSON.
The thing is, I'm not even sure whether collecting the DOM already handles this issue automatically.
For starters, I would therefore like a series of test webpages to test this: one encoded with UTF-8, another with something else, and so on, using characters that will look wrong if, for example, you assume UTF-8 but it's not. Note that I don't even bother considering the webpage's stated encoding, since it is too often wrong.
You are trying to solve a problem that does not exist.
The browser is responsible for detecting and handling the web page encoding. It'll determine the correct encoding based on the server headers, meta tags in the HTML page and plain guessing if needed. The DOM gives you Unicode data.
JSON handles Unicode data; sending JSON data to your Python process sends appropriately encoded byte data that any decent JSON library will turn back into Unicode values for you. The Python json module is such a library.
Just load the data from your JavaScript script with the json.load() or json.loads() functions as is. Your browser will already have used the correct encoding (most likely UTF-8), and the Python json module will decode any of the standard encodings used without additional configuration or handling.
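This can be checked locally with a minimal sketch. The sample text and variable names are made up for illustration, and the payloads are built in Python as stand-ins for what the browser would send:

```python
import json

# Made-up sample with non-ASCII characters.
page_text = {"title": "Grüße, 世界", "body": "naïve café"}

# Three ways the same JSON document might arrive.
utf8_payload = json.dumps(page_text, ensure_ascii=False).encode("utf-8")
utf16_payload = json.dumps(page_text, ensure_ascii=False).encode("utf-16")
escaped_payload = json.dumps(page_text)  # ASCII-only, uses \uXXXX escapes

for payload in (utf8_payload, utf16_payload, escaped_payload):
    decoded = json.loads(payload)  # accepts str, or bytes in a UTF encoding
    print(decoded == page_text)
```

In each case `json.loads` returns ordinary Python str values, so no extra Unicode conversion step is needed afterwards.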
