RedVox Python SDK | Not Reading in .rdvxz Files

I'm attempting to read in a series of files for processing contained in a single directory using RedVox:
input_directory = "/home/ben/Documents/Data/F1D1/21" # file location
rdvx_data = DataWindow(input_dir=input_directory, apply_correction=False, debug=True) # using RedVox to read in the files
print(os.listdir(input_directory)) # verifying the files actually exist...
# returns "['file1.rdvxz', 'file2.rdvxz', 'file3.rdvxz', ...etc]", they exist
# write audio portion to file
rdvx_data.to_json_file(base_dir=output_rpd_directory,
                       file_name=output_filename)
# this never runs, because rdvx_data.stations = [] (verified through debugging)
for station in rdvx_data.stations:
    # some code here
Enabling debugging through the arguments shown above does not provide any extra detail. In fact, there is no error message whatsoever. It writes the JSON file and pickle to disk, but the JSON file is full of null values and the pickle object is just an empty shell with no contents. So the files definitely exist and os.listdir() sees them, but RedVox does not.
I assume this is some very silly error or lack of understanding on my part. Any help is greatly appreciated. I have not worked with RedVox previously, nor do I have much understanding of what these files contain other than some audio data and some other data. I've simply been tasked with opening them to work on a model to analyze the data within.

SOLVED: I'm not sure why the previous code doesn't work (it was handed to me); however, I worked around the DataWindow call and went straight to the "redvox.api900.reader" module:
from glob import glob
from redvox.api900 import reader

dataset_dir = "/home/*****/Documents/Data/F1D1/21/"
rdvx_files = glob(dataset_dir + "*.rdvxz")
for file in rdvx_files:
    wrapped_packet = reader.read_rdvxz_file(file)
From here I can view all of the sensor data within:
if wrapped_packet.has_microphone_sensor():
    microphone_sensor = wrapped_packet.microphone_sensor()
    print("sample_rate_hz", microphone_sensor.sample_rate_hz())
Hope this helps anyone else who's confused.
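Building on that, here is a minimal sketch for collecting the raw microphone samples from every packet, assuming the api900 microphone sensor exposes payload_values() for the sample array:
from glob import glob

import numpy as np
from redvox.api900 import reader

dataset_dir = "/home/*****/Documents/Data/F1D1/21/"

all_samples = []
for file in sorted(glob(dataset_dir + "*.rdvxz")):
    wrapped_packet = reader.read_rdvxz_file(file)
    if wrapped_packet.has_microphone_sensor():
        mic = wrapped_packet.microphone_sensor()
        # payload_values() is assumed to return the packet's raw audio samples
        all_samples.append(np.asarray(mic.payload_values()))

# one long array of samples, ready to feed into a model
audio = np.concatenate(all_samples) if all_samples else np.array([])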

Related

Is there any feasible solution to read WOT battle results .dat files?

I'm new here, trying to solve one of my interesting questions about World of Tanks. I've heard that every battle's data is stored on the client's disk in the Wargaming.net folder, and I want to do some batch data analysis of our clan's battle performance.
These .dat files are said to be a kind of JSON file, so I tried a couple of lines of Python to read one, but it failed.
import json
f = open('ex.dat', 'r', encoding='unicode_escape')
content = f.read()
a = json.loads(content)
print(type(a))
print(a)
f.close()
The code is very simple and clearly doesn't work. Could anyone tell me what these files actually are?
Added on Feb. 9th, 2022
After trying another snippet in a Jupyter Notebook, it seems that something can be read out of the .dat files:
import struct
import numpy as np
import matplotlib.pyplot as plt
import io
with open('C:/Users/xukun/Desktop/br/ex.dat', 'rb') as f:
    fbuff = io.BufferedReader(f)
    N = len(fbuff.read())
    print('byte length: ', N)
with open('C:/Users/xukun/Desktop/br/ex.dat', 'rb') as f:
    data = struct.unpack('b' * N, f.read(1 * N))
The result is a tuple of byte values, but I have no idea how to deal with it now.
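Before unpacking the whole file into a tuple, a quicker way to see what those bytes are (a small sketch reusing the same path as above) is to look at the first few bytes directly:
with open('C:/Users/xukun/Desktop/br/ex.dat', 'rb') as f:
    head = f.read(32)

print(head)        # the raw leading bytes
print(head.hex())  # the same bytes as hex; a pickle protocol 2 stream starts with b'\x80\x02'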
Here's how you can parse some parts of it.
import pickle
import zlib
file = '4402905758116487.dat'
cache_file = open(file, 'rb') # This can be improved to not keep the file opened.
# Converting pickle items from python2 to python3 you need to use the "bytes" encoding or "latin1".
legacyBattleResultVersion, brAllDataRaw = pickle.load(cache_file, encoding='bytes', errors='ignore')
arenaUniqueID, brAccount, brVehicleRaw, brOtherDataRaw = brAllDataRaw
# The data stored inside the pickled file will be a compressed pickle again.
vehicle_data = pickle.loads(zlib.decompress(brVehicleRaw), encoding='latin1')
account_data = pickle.loads(zlib.decompress(brAccount), encoding='latin1')
brCommon, brPlayersInfo, brPlayersVehicle, brPlayersResult = pickle.loads(zlib.decompress(brOtherDataRaw), encoding='latin1')
# Lastly you can print all of these and see a lot of data inside.
The response contains a mixture of more binary files as well as some data captured from the replays.
This is not a complete solution but it's a decent start to parsing these files.
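As a usage sketch, the steps above can be wrapped into one helper (the field names follow gabzo's snippet; the file name is just the example from above):
import pickle
import zlib

def load_battle_result(path):
    # Unpickle the outer container, then decompress and unpickle the inner blobs.
    with open(path, 'rb') as cache_file:
        legacyBattleResultVersion, brAllDataRaw = pickle.load(
            cache_file, encoding='bytes', errors='ignore')
    arenaUniqueID, brAccount, brVehicleRaw, brOtherDataRaw = brAllDataRaw

    def unpickle(blob):
        return pickle.loads(zlib.decompress(blob), encoding='latin1')

    return {
        'version': legacyBattleResultVersion,
        'arenaUniqueID': arenaUniqueID,
        'account': unpickle(brAccount),
        'vehicle': unpickle(brVehicleRaw),
        'other': unpickle(brOtherDataRaw),
    }

result = load_battle_result('4402905758116487.dat')
print(result['arenaUniqueID'])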
First you can look at the replay file itself in a text editor, but it won't show the code at the beginning of the file, which has to be cleaned out. Then there is a ton of info that you have to read in and figure out, but it is the stats for each player in the game. Then comes the part that deals with the actual replay; you don't need that stuff.
You can grab the player IDs and tank IDs from the WoT developer area API if you want.
After loading the pickle files as gabzo mentioned, you will see that it is simply a list of values, and without knowing what each value refers to, it's hard to make sense of it. The identifiers for the values can be extracted from your game installation:
import zipfile
WOT_PKG_PATH = "Your/Game/Path/res/packages/scripts.pkg"
BATTLE_RESULTS_PATH = "scripts/common/battle_results/"
archive = zipfile.ZipFile(WOT_PKG_PATH, 'r')
for file in archive.namelist():
    if file.startswith(BATTLE_RESULTS_PATH):
        archive.extract(file)
You can then decompile the Python files (with uncompyle6) and go through the code to see the identifiers for the values.
One thing to note is that the list of values for the main pickle objects (like brAccount from gabzo's code) always has a checksum as the first value. You can use this to check whether you have the right order and the correct identifiers for the values. The way these checksums are generated can be seen in the decompiled python files.
I have been tackling this problem for some time (albeit in Rust): https://github.com/dacite/wot-battle-results-parser/tree/main/datfile_parser.

When I import an array from another file, do I get just its data, or does the array get "built" again the way the original file built it?

Sorry if the question is not well formulated; I will reformulate it if necessary.
I have a file with an array that I filled with data from an online JSON database, and I imported this array into another file to use its data.
# file1
import json
from urllib.request import urlopen

response = urlopen(url1)  # url1 holds the address of the online JSON database
a = []
data = json.loads(response.read())
for i in range(len(data)):
    a.append(data[i]['name'])
    i += 1  # redundant: the for loop already advances i
#file2
from file1 import a
'''do something with "a"'''
Does importing the array mean I'm filling the array each time I call it in file2?
If that is the case, what can I do to just keep the data from the array without "building" it each time I call it?
If you saved a to a file, then read a -- you will not need to rebuild a -- you can just open it. For example, here's one way to open a text file and get the text from the file:
# set a variable to be the open file
OpenFile = open(file_path, "r")
# set a variable to be everything read from the file, then you can act on that variable
file_guts = OpenFile.read()
From the Python docs on the Modules section - link - you can read:
When you run a Python module with
python fibo.py <arguments>
the code in the module will be executed, just as if you imported it
This means that importing a module has the same behavior as running it as a regular Python script, unless you use the __name__ check mentioned right after this quotation.
Also, if you think about it, you are opening something, reading from it, and then doing some operations. How can you be sure that the content you are now reading from is the same as the one you had read the first time?
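As a concrete sketch of both points (persist the data once, and guard the download so importing file1 doesn't re-run it), assuming file1 builds the list of names as in the question; the URL is a placeholder:
# file1.py
import json
from urllib.request import urlopen

URL = "https://example.com/data.json"  # placeholder for the real JSON endpoint

def fetch_names(url):
    # download the JSON once and keep only the 'name' field of each entry
    data = json.loads(urlopen(url).read())
    return [item['name'] for item in data]

if __name__ == "__main__":
    # runs only when file1.py is executed directly, not when it is imported
    with open("names.json", "w") as f:
        json.dump(fetch_names(URL), f)

# file2.py
import json

with open("names.json") as f:
    a = json.load(f)  # load the saved list; nothing is downloaded again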

What happened when I used pandas to read csv files multiple times in Kaggle's notebook?

I am participating in Kaggle's NCAA March Madness Analytics Competition. I used pandas to read the information from csv files but ran into the following problem:
seeds = pd.read_csv('/kaggle/input/march-madness-analytics-2020/2020DataFiles/2020DataFiles/2020-Womens-Data/WDataFiles_Stage1/WNCAATourneySeeds.csv')
seeds
Here the output is empty. And I tried again like this:
rank = seeds.merge(teams)
Then there came an error:
NameError: name 'seeds' is not defined.
I can't figure out what happened. I tried the same thing offline and there was no problem. Am I missing anything, and how can I fix it? Note that this was not the first time I used read_csv() to read data from a csv file in this notebook, though I can't tell whether that is related to this trouble.
You must put the CSV file in Python's current working directory.
Run this to find out where that is:
%pwd
Put the file in the destination and run this:
seeds = pd.read_csv('WNCAATourneySeeds.csv')
You can also run this:
seeds = pd.read_csv(r'C:\Users....\WNCAATourneySeeds.csv')
Where "C" is the disk where your file is saved and replace "..." by the computer path where the file is saved. Use also "\" not "/".
I finally found the problem. I didn't notice I was writing my code in a markdown cell. Stupid me!

Storing ZipFile Objects into the Django database

The problem I have is quite uncommon, I think, because I wasn't able to find an answer here or on Google.
I have several pictures stored in my database, and in order to serve them I want to zip them and store the resulting ZipFile in the database, which has Amazon S3 storage as a backend. One more thing: all these operations are done in a background task managed by Celery. Now... here is the code I wrote:
zipname = "{}.zip".format(reporting.title)
with ZipFile(zipname, 'w') as zf:
# Here is the zipfile generation. It quite doesn't matter anyway since this works fine.
reporting = Reporting.objects.get(pk=reporting_id)
reporting.pictures_archive = zf
reporting.save()
I got the error: *** AttributeError: 'ZipFile' object has no attribute '_committed'
So I tried to cast the zipfile into a Django File this way: zf = File(zf), but it returns an empty object.
Can anyone help me with that? I'm kind of stuck...
This was not as complicated as I thought. (Which could explain why no one has asked this question anywhere on the internet, I guess.)
In Python 3.3, strings are unicode and you mainly work with unicode objects. File needs bytes to work correctly, so here is the solution:
zipname = "{}.zip".format(reporting.id, reporting.title)
with ZipFile(zipname, 'w') as zf:
# Generating the ZIP !
reporting = Reporting.objects.get(pk=reporting_id)
reporting.pictures_archive.delete()
reporting.pictures_archive = File(open(zipname, "rb"))
reporting.save()
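If you'd rather not write the archive to the worker's local disk at all, a hedged alternative is to build the zip in memory and hand the bytes to the storage through a ContentFile. This is only a sketch: the reporting.pictures related manager and the picture attributes are hypothetical stand-ins for however the pictures are actually stored.
import io
from zipfile import ZipFile

from django.core.files.base import ContentFile

reporting = Reporting.objects.get(pk=reporting_id)

buffer = io.BytesIO()
with ZipFile(buffer, 'w') as zf:
    # same idea as the zip generation in the question, but written into memory
    for picture in reporting.pictures.all():          # hypothetical related manager
        zf.writestr(picture.file.name, picture.file.read())

reporting.pictures_archive.delete()
reporting.pictures_archive.save(
    "{}.zip".format(reporting.title),   # name the storage backend will use
    ContentFile(buffer.getvalue()),     # bytes of the finished in-memory archive
    save=True,                          # also saves the model instance
)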

Open URL stored in a csv file

I'm almost an absolute beginner in Python, but I've been asked to manage a difficult task. I have read many tutorials and found some very useful tips on this website, but I don't think this question has been asked before, at least not in the way I tried it in the search engine.
I have managed to write some URLs to a csv file. Now I would like to write a script that opens this file, opens the URLs, and writes their content into a dictionary. But I have failed: my script can print these addresses, but cannot process the file.
Interestingly, my script did not give the same error message each time. Here is the last one: req.timeout = timeout
AttributeError: 'list' object has no attribute 'timeout'
So I think my script has several problems:
1 - is my method for opening the URLs the right one?
2 - what is wrong in the way I build the dictionary?
Here is my attempt below. Thanks in advance to anyone who can help!
import csv
import urllib
dict = {}
test = csv.reader(open("read.csv","rb"))
for z in test:
    sock = urllib.urlopen(z)
    source = sock.read()
    dict[z] = source
    sock.close()
print dict
First thing, don't shadow built-ins. Rename your dictionary to something else as dict is used to create new dictionaries.
Secondly, the csv reader yields a list per line containing all the columns. Either reference the column explicitly with urllib.urlopen(z[0]) (the first column in the line) or open the file with a normal open() and iterate through it.
Apart from that, it works for me.
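Putting those two fixes together, a sketch in the question's Python 2 style (assuming read.csv holds one URL per line in its first column):
import csv
import urllib

url_contents = {}  # renamed so it no longer shadows the built-in dict
reader = csv.reader(open("read.csv", "rb"))
for row in reader:
    url = row[0]  # csv.reader yields a list per line; take the first column
    sock = urllib.urlopen(url)
    url_contents[url] = sock.read()
    sock.close()
print url_contents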
