Storing ZipFile Objects into the Django database - python

The problem I have seems to be quite uncommon, because I couldn't find an answer here or on Google.
I have several pictures stored in my database and, in order to serve them, I want to zip them and store the resulting ZipFile in the database, which uses Amazon S3 storage as a backend. One more thing: all these operations are done in a background task managed by Celery. Here is the code I wrote:
zipname = "{}.zip".format(reporting.title)
with ZipFile(zipname, 'w') as zf:
    # Here is the zipfile generation. It doesn't really matter, since this part works fine.
    ...
reporting = Reporting.objects.get(pk=reporting_id)
reporting.pictures_archive = zf
reporting.save()
I got the error: *** AttributeError: 'ZipFile' object has no attribute '_committed'
So I tried to wrap the zipfile in a Django File this way: zf = File(zf), but it returns an empty object.
Can anyone help me with that? I'm kind of stuck...

This turned out to be less complicated than I thought (which could explain why no one seems to have asked this question before).
In Python 3.3, strings are unicode, so you mostly work with unicode objects. File needs bytes data to work correctly, so the finished archive has to be reopened in binary mode before it is wrapped in a File. Here is the solution:
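reporting = Reporting.objects.get(pk=reporting_id)
zipname = "{}_{}.zip".format(reporting.id, reporting.title)
with ZipFile(zipname, 'w') as zf:
    # Generate the ZIP here (elided in the original post).
    ...
reporting.pictures_archive.delete()
# Reopen the finished archive in binary mode; save while the file is still open.
with open(zipname, "rb") as archive:
    reporting.pictures_archive = File(archive)
    reporting.save()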
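If writing a temporary file on the Celery worker is undesirable, the archive can also be built in memory. The sketch below uses only the standard library; build_zip_bytes and the sample file names are hypothetical, and the last step (handing the bytes to the model field) appears only as a comment, assuming Django's ContentFile from django.core.files.base.

```python
import io
from zipfile import ZipFile, ZIP_DEFLATED


def build_zip_bytes(files):
    """Return a ZIP archive as bytes.

    `files` maps archive member names to their raw bytes content
    (a hypothetical stand-in for the pictures read from storage).
    """
    buffer = io.BytesIO()
    with ZipFile(buffer, "w", ZIP_DEFLATED) as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    # The archive is only complete once the ZipFile has been closed.
    return buffer.getvalue()


archive = build_zip_bytes({"a.jpg": b"fake image a", "b.jpg": b"fake image b"})

# Assumed Django usage (not executed here):
#   from django.core.files.base import ContentFile
#   reporting.pictures_archive.save(zipname, ContentFile(archive))
```

Because the bytes are taken only after the with block closes the ZipFile, the central directory is already written and the result is a valid archive.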

Related

RedVox Python SDK | Not Reading in .rdvxz Files

I'm attempting to read in a series of files for processing contained in a single directory using RedVox:
input_directory = "/home/ben/Documents/Data/F1D1/21" # file location
rdvx_data = DataWindow(input_dir=input_directory, apply_correction=False, debug=True) # using RedVox to read in the files
print(os.listdir(input_directory)) # verifying the files actually exist...
# returns "['file1.rdvxz', 'file2.rdvxz', 'file3.rdvxz', ...etc]", they exist
# write audio portion to file
rdvx_data.to_json_file(base_dir=output_rpd_directory,
                       file_name=output_filename)
# this never runs, because rdvx_data.stations = [] (verified through debugging)
for station in rdvx_data.stations:
    # some code here
Enabling debugging through the arguments, as seen above, does not provide any extra details. In fact, there is no error message whatsoever. It writes the JSON file and pickle to disk, but the JSON file is full of null values and the pickle object is just a shell with no contents. So the files definitely exist and os.listdir() sees them, but RedVox does not.
I assume this is some very silly error or lack of understanding on my part. Any help is greatly appreciated. I have not worked with RedVox previously, nor do I have much understanding of what these files contain other than some audio data and some other data. I've simply been tasked with opening them to work on a model to analyze the data within.
SOLVED: Not sure why the previous code doesn't work (it was handed to me), however, I worked around the DataWindow call and went straight to calling the "redvox.api900.reader" object:
from glob import glob
from redvox.api900 import reader
dataset_dir = "/home/*****/Documents/Data/F1D1/21/"
rdvx_files = glob(dataset_dir+"*.rdvxz")
for file in rdvx_files:
    wrapped_packet = reader.read_rdvxz_file(file)
From here I can view all of the sensor data within:
if wrapped_packet.has_microphone_sensor():
    microphone_sensor = wrapped_packet.microphone_sensor()
    print("sample_rate_hz", microphone_sensor.sample_rate_hz())
Hope this helps anyone else who's confused.

How to decode/read Flask filesystem session files?

I have a Python 3 Flask app using Flask-Session (which adds server-side session support), configured to use the filesystem type.
Under the hood, this type uses the Werkzeug class werkzeug.contrib.cache.FileSystemCache (see the Werkzeug cache documentation).
The raw cache files look like this if opened:
J¬».].Äï;}î(å
_permanentîàå
respondentîåuuidîåUUIDîìî)Åî}î(åintîät˙ò∑flŒºçLÃ/∆6jhåis_safeîhåSafeUUIDîìîNÖîRîubåSECTIONS_VISITEDî]îåcurrent_sectionîKåSURVEY_CONTENTî}î(å0î}î(ås_idîås0îånameîåWelcomeîådescriptionîåîå questionsî]î}î(ås_idîhåq_idîhåq_constructîhåq_textîhå
q_descriptionîhåq_typeîhårequiredîhåoptions_rowîhåoptions_row_alpha_sortîhåreplace_rowîhåoptions_colîhåoptions_col_codesîhåoptions_col_alpha_sortîhåcond_continue_rules_rowîhåq_meta_notesîhuauå1î}î(hås1îhå Screeningîhå[This section determines if you fit into the target group.îh]î(}î(hh/håq1îh hh!å9Have you worked on a product in this field before?
The items stored in the session can be seen a bit above:
- current_section should be an integer, e.g., 0
- SECTIONS_VISITED should be an array of integers, e.g., [0,1,2]
- SURVEY_CONTENT format should be an object with structure like below
{
    'item1': {
        'label': string,
        'questions': [{}]
    },
    'item2': {
        'label': string,
        'questions': [{}]
    }
}
In the excerpt above you can see, for example, the text "This section determines if you fit into the target group", which is the value of one label. The entries after questions are keys found in each question object (e.g., q_text) along with their values; for example, "Have you worked on a product in this field before?" is the value of q_text.
I need to retrieve the data from the stored cache files in a readable form, without all the extra characters like å.
I tried using Werkzeug like this, where 9c3c48a94198f61aa02a744b16666317 is the name of the cache file I want to read. However, it was not found in the cache directory.
from werkzeug.contrib.cache import FileSystemCache
cache_dir="flask_session"
mode=0o600  # octal literal; the old form 0600 is a syntax error in Python 3
threshold=20000
cache = FileSystemCache(cache_dir, threshold=threshold, mode=mode)
item = "9c3c48a94198f61aa02a744b16666317"
print(cache.has(item))
data = cache.get(item)
print(data)
What ways are there to read the cache files?
I opened a GitHub issue in Flask-Session, but the project has not been actively maintained in years.
For context, writing to the database briefly stopped working in my web app, but the data I need was also being saved in the session. So right now the only way to retrieve that data is from these files.
EDIT:
Thanks to Tim's answer I solved it using the following:
import pickle
obj = []
with open(file_name,"rb") as fileOpener:
    while True:
        try:
            obj.append(pickle.load(fileOpener))
        except EOFError:
            break
print(obj)
I needed to load all pickled objects in the file, so I combined Tim's solution with the one here for loading multiple objects: https://stackoverflow.com/a/49261333/11805662
Without this, I was just seeing the first pickled item.
Also, in case anyone has the same problem, I needed to use the same python version as my Flask app (related post). If I didn't, then I would get the following error:
ValueError: unsupported pickle protocol: 4
You can decode the data with pickle, which is part of the Python standard library. Open the file in binary mode:
import pickle
with open("PATH/TO/SESSION/FILE", "rb") as f:
    data = pickle.load(f)
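A cache file may hold more than one pickled object back to back. As far as I can tell, Werkzeug's FileSystemCache writes an expiry timestamp first and the cached value second, which would explain why a single pickle.load() returns only the first item; treat that layout as an assumption. A minimal sketch that reads every object, using a hypothetical helper name and a throwaway demo file rather than a real session file:

```python
import os
import pickle
import tempfile


def load_all_pickles(path):
    """Return every object pickled back to back in the file at `path`."""
    objects = []
    with open(path, "rb") as f:  # pickle data is binary
        while True:
            try:
                objects.append(pickle.load(f))
            except EOFError:  # raised once the stream is exhausted
                break
    return objects


# Demo file mimicking the assumed layout: a timestamp, then the session dict.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_session")
with open(demo_path, "wb") as f:
    pickle.dump(1700000000, f)
    pickle.dump({"current_section": 0, "SECTIONS_VISITED": [0, 1, 2]}, f)

loaded = load_all_pickles(demo_path)
```

With that layout, the session data would be the last element of the returned list.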

Python Storing Data

I have a list in my program and a function that appends to it. Unfortunately, when you close the program, whatever you added is lost and the list goes back to its initial state. Is there any way to store the data so that the user can re-open the program and find the full list?
You can try the pickle module to store in-memory data on disk. Here is an example:
store data:
import pickle
dataset = ['hello', 'test']
outputFile = 'test.data'
# Open in binary mode; the with block closes the file automatically.
with open(outputFile, 'wb') as fw:
    pickle.dump(dataset, fw)
load data:
import pickle
inputFile = 'test.data'
with open(inputFile, 'rb') as fd:
    dataset = pickle.load(fd)
print(dataset)
Another option is to save the data in a database such as SQLite, or in a .txt file. For example:
with open("mylist.txt","w") as f: #in write mode
    f.write("{}".format(mylist))
Your list goes into the format() function. This creates a file named mylist.txt and saves your list data into it as text.
After that, when you want to access your data again, you can do:
with open("mylist.txt") as f: #in read mode, not in write mode, careful
    rd=f.readlines()
print (rd)  # rd is a list of text lines, not the original list object
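Note that reading the text file back gives strings, not the original list. If the list holds only plain values (strings, numbers, nested lists), the standard json module restores it as a real list. A minimal sketch; the file name and helper names are made up for illustration:

```python
import json
import os
import tempfile


def save_list(items, path):
    """Write the list to `path` as JSON text."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(items, f)


def load_list(path):
    """Read the list back, or return an empty list on the first run."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# A temporary directory keeps the demo from touching real files.
path = os.path.join(tempfile.mkdtemp(), "mylist.json")
mylist = load_list(path)   # nothing saved yet
mylist.append("hello")
save_list(mylist, path)
restored = load_list(path)
```

In a real program, load_list would run at startup and save_list after each change (or at exit).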
The built-in pickle module provides some basic functionality for serialization, which is a term for turning arbitrary objects into something suitable to be written to disk. Check out the docs for Python 2 or Python 3.
Pickle isn't very robust though, and for more complex data you'll likely want to look into a database module like the built-in sqlite3 or a full-fledged object-relational mapping (ORM) like SQLAlchemy.
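For the sqlite3 route, a small sketch might look like the following. The table name items and the helper functions are assumptions for the demo, and the in-memory database is used only so nothing is left on disk; passing a file path to sqlite3.connect() instead keeps the data between runs.

```python
import sqlite3


def add_item(conn, value):
    """Append one value to the stored list."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO items (value) VALUES (?)", (value,))


def get_items(conn):
    """Return the stored values in insertion order."""
    rows = conn.execute("SELECT value FROM items ORDER BY id")
    return [row[0] for row in rows]


# ":memory:" is used only so the demo leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, value TEXT)"
)
add_item(conn, "hello")
add_item(conn, "test")
items = get_items(conn)
```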
For storing large datasets, the HDF5 format is suitable. In Python it is available through the h5py package.

How to output a Numpy array to a file object for Picloud

I have a matrix-factorization process that I'm running on picloud. The output is a set of numpy arrays (ndarray).
Now, I want to save it to my bucket, but I'm not able to zero in on the right way to do it. Let's assume that the array to be saved is P.
I tried:
cloud.bucket.putf(P,'p.csv')
but that returned an error: "IOError: File object is not seekable. Cannot transmit".
I tried
numpy.ndarray.tofile(P,f, sep=",", format="%s") #outputing the array to a file object f
cloud.bucket.putf(f,'p.csv') #saving the file object f in the bucket.
I tried a couple of other things, including numpy.savetxt (as I would if I ran it locally), but I'm not able to solve this between the picloud documentation and stackexchange questions. I haven't tried pickle yet, though. I felt this was something straightforward, but I'm feeling quite silly after spending a few hours on this.
As you guessed, you want to pickle the array as follows:
import cloud
import cPickle as pickle
# to write
cloud.bucket.putf(pickle.dumps(P), 'p.csv')
# to read
obj = pickle.loads(cloud.bucket.getf('p.csv').read())
This is a general way to serialize and store any Python object in your PiCloud Bucket. I also recommend that you store your csv files under a prefix to keep them organized [1].
[1] http://docs.picloud.com/bucket.html#namespacing-with-prefix
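The "not seekable" error in the question suggests the upload wants a file-like object that supports seek(). One general way to get such an object from serialized data is an in-memory buffer. The sketch below uses only the standard library and a nested list as a stand-in for the array P; it does not call the PiCloud API, the helper names are hypothetical, and whether putf accepts such a buffer is an assumption I have not verified.

```python
import io
import pickle


def to_seekable(obj):
    """Pickle `obj` into an in-memory, seekable binary buffer."""
    buffer = io.BytesIO(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))
    buffer.seek(0)  # make sure readers start at the beginning
    return buffer


def from_file(f):
    """Rebuild the object from a binary file-like object."""
    return pickle.loads(f.read())


P = [[1.0, 2.0], [3.0, 4.0]]  # stand-in for the NumPy array
buf = to_seekable(P)
roundtrip = from_file(buf)
```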

Open URL stored in a csv file

I'm almost an absolute beginner in Python, but I have been asked to handle a difficult task. I have read many tutorials and found some very useful tips on this website, but I don't think this question has been asked before, or at least not in the way I searched for it.
I have managed to write some URLs to a csv file. Now I would like to write a script that opens this file, opens the URLs, and stores their content in a dictionary. But I have failed: my script can print these addresses, but cannot process the file.
Interestingly, my script did not give the same error message each time. Here is the latest one:
req.timeout = timeout
AttributeError: 'list' object has no attribute 'timeout'
So I think my script has several problems:
1 - Is my method for opening the URLs the right one?
2 - What is wrong in the way I build the dictionary?
Here is my attempt below. Thanks in advance to anyone who can help!
import csv
import urllib
dict = {}
test = csv.reader(open("read.csv","rb"))
for z in test:
    sock = urllib.urlopen(z)
    source = sock.read()
    dict[z] = source
    sock.close()
print dict
First, don't shadow built-ins. Rename your dictionary to something else, because dict is the built-in used to create new dictionaries.
Second, the csv reader yields a list per line, containing all of that line's columns, which is why urlopen complains that a 'list' object has no attribute 'timeout'. Either reference the column explicitly, e.g. urllib.urlopen(z[0]) for the first column in the line, or open the file with a normal open() and iterate through it.
Apart from that, it works for me.
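Putting both points together, a Python 3 version could be sketched as follows. The function names are hypothetical, and the fetch parameter is an addition of mine so that the download step can be swapped out (for example in tests); by default it uses urllib.request.urlopen.

```python
import csv
import io
from urllib.request import urlopen


def read_urls(csv_file):
    """Return the first column of every non-empty row in an open CSV file."""
    return [row[0].strip() for row in csv.reader(csv_file) if row]


def fetch_url(url, timeout=10):
    """Download one URL and return its body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetch=fetch_url):
    """Map each URL to its content; `pages` avoids shadowing the built-in dict."""
    pages = {}
    for url in urls:
        pages[url] = fetch(url)
    return pages


# Offline demo: an in-memory CSV and a fake fetcher instead of real requests.
sample = io.StringIO("http://example.com/a,first\nhttp://example.com/b,second\n")
urls = read_urls(sample)
pages = fetch_all(urls, fetch=lambda url: b"content of " + url.encode())
```

With a real file, the call would look like fetch_all(read_urls(f)) inside a with open("read.csv", newline="") as f: block.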
