Mongoengine pre_delete FileField - python

I'm new to mongoengine. I am trying to get the pre_delete hook to delete a file stored in GridFS through a FileField.
I am using Python 2.7.10, Mongo 3.4 and mongoengine 0.8.7.
Here is what I have.
import uuid
import mongoengine as me
class MyFiles(me.Document):
    meta = {"collection": "test"}
    guid = me.UUIDField(binary=False, required=True)
    my_file = me.FileField()
    @classmethod
    def pre_delete(cls, sender, document, **kwargs):
        document.my_file.delete()
if __name__ == '__main__':
    me.connect(db='main', alias='default', host='localhost')
    m = MyFiles(guid=uuid.uuid4())
    m.my_file.new_file(content_type='text/plain')
    m.my_file.write("This is")
    m.my_file.write("my file")
    m.my_file.write("Hooray!")
    m.my_file.close()
    m.save()
    print(m.my_file.read())
    m.delete()
Now I am debugging with a breakpoint on m.delete().
m.my_file.read() worked.
There is a document in collection "test" that refers to the file in GridFS.
There is a file in fs.files.
And in fs.chunks.
Now I ran m.delete().
Collection "test" is empty.
fs.files is not empty. Neither is fs.chunks. The file remains.
According to mongoengine docs for gridfs, I need to run m.my_file.delete() to delete the GridFS entry before deleting the MyFiles document. I have confirmed this works if I put m.my_file.delete() before m.delete() like so.
m.save()
print(m.my_file.read())
m.my_file.delete()
m.delete()
However I want it to run in pre_delete. This seems like the purpose of pre_delete. Any ideas what I am doing wrong?

Here is the problem. I did not register the signal. This works:
import uuid
import mongoengine as me
class MyFiles(me.Document):
    meta = {"collection": "test"}
    guid = me.UUIDField(binary=False, required=True)
    my_file = me.FileField()
    @classmethod
    def pre_delete(cls, sender, document, **kwargs):
        document.my_file.delete()
me.signals.pre_delete.connect(MyFiles.pre_delete, sender=MyFiles)
if __name__ == '__main__':
    me.connect(db='main', alias='default', host='localhost')
    m = MyFiles(guid=uuid.uuid4())
    m.my_file.new_file(content_type='text/plain')
    m.my_file.write("This is")
    m.my_file.write("my file")
    m.my_file.write("Hooray!")
    m.my_file.close()
    m.save()
    print(m.my_file.read())
    m.delete()

Related

errors in django tests

I'm trying to run the following test, but I'm not sure why it's not working. When tests are run, a test database is created, so the newly created item should get an id of 1, yet I still get an error saying that no object matches the query.
model
from django.db import models
# Create your models here.
class Post(models.Model):
    text = models.TextField()
    def __str__(self):
        return self.text[:50]
tests
from django.test import TestCase
from .models import Post
# Create your tests here.
class PostModelTest(TestCase):
    def setup(self):
        post = Post.objects.create(text="just a test")
    def test_text_content(self):
        post = Post.objects.get(id=1)
        expected_obect_name = f'{post.text}'
        self.assertEqual(expected_obect_name, 'just a test')
here's the error
First of all, you should name the method setUp, not setup; the name is case-sensitive. But that probably will not fully solve your problem, since Django doesn't reset the auto-increment id for each TestCase class. So if you have another test case with the same code, the error will happen again. To solve it, save the post id as e.g. self.post_id and use it in the test methods instead of 1:
class PostModelTest(TestCase):
    def setUp(self):
        post = Post.objects.create(text="just a test")
        self.post_id = post.pk
    def test_text_content(self):
        post = Post.objects.get(id=self.post_id)
        expected_obect_name = f'{post.text}'
        self.assertEqual(expected_obect_name, 'just a test')
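The case-sensitivity point can be checked without Django, since django.test.TestCase builds on the standard unittest.TestCase. The following is a minimal sketch using only the standard library; the class names and the run helper are made up for illustration. It shows that a method called setup is silently ignored, while setUp runs before each test.

```python
import io
import unittest


class LowercaseSetup(unittest.TestCase):
    def setup(self):  # wrong name: unittest never calls this
        self.value = 1

    def test_value(self):
        # The hook did not run, so the attribute was never set.
        self.assertFalse(hasattr(self, "value"))


class CorrectSetUp(unittest.TestCase):
    def setUp(self):  # correct name: runs before every test method
        self.value = 1

    def test_value(self):
        self.assertEqual(self.value, 1)


def run(case):
    """Run one TestCase class quietly and return its TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


lowercase_result = run(LowercaseSetup)
correct_result = run(CorrectSetUp)
print(lowercase_result.wasSuccessful(), correct_result.wasSuccessful())
```

Both classes pass, which confirms that the misspelled hook produces no warning: the only symptom is that the data it was supposed to create is missing.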

With peewee, how to connect to an existing SQLite db for reading only

I have a silly question.
This is my code:
import argparse
import os
import sys
from peewee import *
db = SqliteDatabase(None)
class Base(Model):
    class Meta:
        database = db
class Table(Base):
    a_date = DateField()
    url = CharField()
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--db-dir', action='store')
    args = parser.parse_args()
    db_path = os.path.join(args.db_dir, 'data.db')
    try:
        db.init(db_path)
        db.connect()
        query = Table.select().order_by(Table.a_date.desc()).get()
    except Exception:
        sys.exit(1)
    else:
        print(query.url)
        sys.exit(0)
if __name__ == '__main__':
    main()
This code works fine, but if the database file does not exist, db.connect() always creates it. How can I prevent this?
Another question: how can I query this field of the table without declaring a peewee Model?
Thanks
If I understand the peewee docs correctly (http://docs.peewee-orm.com/en/latest/peewee/database.html), peewee uses the API provided by Python to connect to SQLite.
That means you have to deal with this API (https://docs.python.org/2/library/sqlite3.html#sqlite3.connect), and its connect function always creates the database file if it does not exist.
However, I believe you can pass a custom Connection class to this function (the factory parameter) and define the behaviour you want in that class.
import os
from sqlite3 import Connection
from peewee import *
class CustomConnection(Connection):
    def __init__(self, dbname, *args, **kwargs):
        # Check if db already exists or not
        if not os.path.exists(dbname):
            raise ValueError('DB {} does not exist'.format(dbname))
        super(CustomConnection, self).__init__(dbname, *args, **kwargs)
db = SqliteDatabase('mydatabase', factory=CustomConnection)
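A different approach to the same two questions, sketched here with only the standard library on Python 3: sqlite3.connect accepts a URI with mode=ro when uri=True is passed, and in that mode SQLite refuses to create a missing file. The same connection can run plain SQL, so no model class is needed. The helper open_read_only, the pages table, and the sample rows are made up for illustration. Since peewee forwards extra keyword arguments to sqlite3.connect (as the factory example relies on), the same URI may also work through SqliteDatabase, but that is an assumption to verify against your peewee version.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def open_read_only(db_path):
    """Open an existing SQLite file read-only; never create it."""
    # mode=ro makes SQLite fail instead of creating a missing file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


workdir = tempfile.mkdtemp()
existing = os.path.join(workdir, "data.db")
missing = os.path.join(workdir, "missing.db")

# Build a small database to read from (hypothetical table and rows).
setup_conn = sqlite3.connect(existing)
setup_conn.execute("CREATE TABLE pages (a_date TEXT, url TEXT)")
setup_conn.execute("INSERT INTO pages VALUES ('2020-01-02', 'http://example.com/new')")
setup_conn.execute("INSERT INTO pages VALUES ('2020-01-01', 'http://example.com/old')")
setup_conn.commit()
setup_conn.close()

# Query without any ORM model: plain SQL through the sqlite3 module.
conn = open_read_only(existing)
latest_url = conn.execute(
    "SELECT url FROM pages ORDER BY a_date DESC LIMIT 1"
).fetchone()[0]
conn.close()

# Opening a file that does not exist raises instead of creating it.
try:
    open_read_only(missing)
    missing_opened = True
except sqlite3.OperationalError:
    missing_opened = False

print(latest_url, missing_opened, os.path.exists(missing))
```

Compared with the custom Connection class, this keeps the existence check inside SQLite itself and also blocks accidental writes, at the cost of requiring the URI form of the path.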

Integrate extracted PDF content with django-haystack

I have extracted PDF/DOCX content with Solr, and I've succeeded in running some search queries against it using the following Solr URL:
http://localhost:8983/solr/select?q=Lycee
I would like to run such a query with django-haystack. I found this link, which discusses the issue:
https://github.com/toastdriven/django-haystack/blob/master/docs/rich_content_extraction.rst
But there is no "FileIndex" class in django-haystack (2.0.0-beta). How can I integrate such a search into django-haystack?
The "FileIndex" referenced in the documentation is a hypothetical subclass of haystack.indexes.SearchIndex. Here is an example:
from django.template import Context, loader
from haystack import indexes
from myapp.models import MyFile
class FileIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    title = indexes.CharField(model_attr='title')
    owner = indexes.CharField(model_attr='owner__name')
    def get_model(self):
        return MyFile
    def index_queryset(self, using=None):
        return self.get_model().objects.all()
    def prepare(self, obj):
        data = super(FileIndex, self).prepare(obj)
        # This could also be a regular Python open() call, a StringIO instance
        # or the result of opening a URL. Note that due to a library limitation
        # file_obj must have a .name attribute even if you need to set one
        # manually before calling extract_file_contents:
        file_obj = obj.the_file.open()
        extracted_data = self.backend.extract_file_contents(file_obj)
        # Now we'll finally perform the template processing to render the
        # text field with *all* of our metadata visible for templating:
        t = loader.select_template(('search/indexes/myapp/myfile_text.txt', ))
        data['text'] = t.render(Context({'object': obj,
                                         'extracted': extracted_data}))
        return data
So extracted_data would hold the result of whatever process you use to extract the PDF/DOCX content. You would then update your template to include that data.

Flask-sqlalchemy `create_all()`, specifying models

I keep my commands in a file separate from my models. The tables were created only after I imported all the models from models into my starter file.
db = SQLAlchemy(app)
from models import *
try:
    argv = sys.argv[1]
    argv == '--run' and app.run()
    argv == '--create' and db.create_all()
But this seems rather implicit to me. I looked through the Flask-SQLAlchemy source code and saw:
def get_tables_for_bind(self, bind=None):
    """Returns a list of all tables relevant for a bind."""
    result = []
    for table in self.Model.metadata.tables.itervalues():
        if table.info.get('bind_key') == bind:
            result.append(table)
    return result
And as I understand it, self.Model is created by:
def make_declarative_base(self):
    """Creates the declarative base."""
    base = declarative_base(cls=Model, name='Model',
                            mapper=signalling_mapper,
                            metaclass=_BoundDeclarativeMeta)
    base.query = _QueryProperty(self)
    return base
Is there a more explicit way to specify which models create_all() should create tables for? And how does self.Model know which tables must be created (after the import)?
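On the second question, the usual explanation is that a declarative base records each model in its shared metadata at the moment the class statement executes, which is why importing the models module is enough. The sketch below imitates that pattern with a plain-Python metaclass. Registry, ModelMeta, Model, create_all, and the two model names are hypothetical stand-ins written for this illustration; none of it is Flask-SQLAlchemy or SQLAlchemy code.

```python
class Registry:
    """Hypothetical stand-in for a metadata object holding known tables."""

    def __init__(self):
        self.tables = {}


class ModelMeta(type):
    """Metaclass that records every subclass as soon as it is defined."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Skip the base class itself; register only concrete models.
        if bases:
            cls.metadata.tables[name.lower()] = cls


class Model(metaclass=ModelMeta):
    metadata = Registry()


def create_all(only=None):
    """Return the table names that would be created, in sorted order."""
    names = sorted(Model.metadata.tables)
    if only is not None:
        wanted = {model.__name__.lower() for model in only}
        names = [name for name in names if name in wanted]
    return names


before = create_all()


# Defining (or importing) the classes is what fills the registry.
class User(Model):
    pass


class Post(Model):
    pass


after = create_all()
subset = create_all(only=[User])
print(before, after, subset)
```

As for being more explicit: in SQLAlchemy itself, MetaData.create_all accepts a tables argument that restricts creation to the listed tables. Whether your Flask-SQLAlchemy version exposes that through db.create_all() is worth checking in its documentation.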

Unable to query from entities loaded onto the app engine datastore

I am a newbie to Python. I am not able to query the entities UserDetails and PhoneBook that I loaded into the App Engine datastore. I wrote the UI below based on Brett's YouTube video "Developing and Deploying applications on GAE" (the shoutout application). I tried to reverse-engineer it to query the datastore, but failed at every step.
#!/usr/bin/env python
import wsgiref.handlers
from google.appengine.ext import db
from google.appengine.ext import webapp
from google.appengine.ext.webapp import template
import models
class showPhoneBook(db.Model):
    """ property to store user_name from UI to persist for the session """
    user_name = db.StringProperty(required=True)
class MyHandler(webapp.RequestHandler):
    def get(self):
        ## Query to get the user_id using user_name retrieved from UI ##
        p = UserDetails.all().filter('user_name = ', user_name)
        result1 = p.get()
        for itr1 in result1:
            userId = itr.user_id
        ## Query to get the phone book contacts using user_id retrieved ##
        q = PhoneBook.all().filter('user_id = ', userId)
        values = {
            'phoneBookValues': q
        }
        self.request.out.write(
            template.render('phonebook.html', values))
    def post(self):
        phoneBookuser = showPhoneBook(
            user_name = self.request.get('username'))
        phoneBookuser.put()
        self.redirect('/')
def main():
    app = webapp.WSGIApplication([
        (r'.*',MyHandler)], debug=True)
    wsgiref.handlers.CGIHandler().run(app)
if __name__ == "__main__":
    main()
This is my models.py file, where I've defined my UserDetails and PhoneBook classes:
#!/usr/bin/env python
from google.appengine.ext import db
#Table structure of User Details table
class UserDetails(db.Model):
    user_id = db.IntegerProperty(required = True)
    user_name = db.StringProperty(required = True)
    mobile_number = db.PhoneNumberProperty(required = True)
#Table structure of Phone Book table
class PhoneBook(db.Model):
    contact_id = db.IntegerProperty(required=True)
    user_id = db.IntegerProperty(required=True)
    contact_name = db.StringProperty(required=True)
    contact_number = db.PhoneNumberProperty(required=True)
Here are the problems I am facing:
1) I am not able to use user_name (retrieved from the UI via phoneBookuser = showPhoneBook(user_name = self.request.get('username'))) in the get(self) method to query UserDetails for the corresponding entry.
2) The code is not able to recognize the UserDetails and PhoneBook classes when importing from the models.py file.
3) I tried to define the UserDetails and PhoneBook classes in the main.py file itself; then I get an error at result1 = p.get() saying BadValueError: Unsupported type for property : <class 'google.appengine.ext.db.PropertiedClass'>
I have been struggling for two weeks to get out of this mess, but in vain. Please help me straighten out my code (I feel what I've written is error-prone all the way through).
I recommend that you read the Python documentation for GAE.
Some comments:
To use the models defined in models.py, you either need to use the prefix models. (e.g. models.UserDetails) or import them using
from models import *
In MyHandler.get() you don't look up the username GET parameter.
p.get() returns a single entity (or None), not a list, so you cannot iterate over its result. To get a list of results, use p.fetch(1) instead.
You should also read about reference properties in GAE. I recommend defining your models as:
class UserDetails(db.Model):
    user_name = db.StringProperty(required = True)
    mobile_number = db.PhoneNumberProperty(required = True)
#Table structure of Phone Book table
class PhoneBook(db.Model):
    user = db.ReferenceProperty(UserDetails)
    contact_name = db.StringProperty(required=True)
    contact_number = db.PhoneNumberProperty(required=True)
Then your MyHandler.get() code will look like:
def get(self):
    ## Query to get the user_id using user_name retrieved from UI ##
    user_name = self.request.get('username')
    p = UserDetails.all().filter('user_name = ', user_name)
    user = p.fetch(1)[0]
    values = {
        'phoneBookValues': user.phonebook_set
    }
    self.response.out.write(template.render('phonebook.html', values))
(Needless to say, you need to handle the case where the username is not found in the database; p.fetch(1) then returns an empty list and indexing it with [0] raises IndexError.)
I don't quite understand the point of showPhoneBook model.
Your "session variable" stored in the datastore isn't going to follow your redirect; you'd have to fetch it from the datastore in your get() handler. Even then, without setting a session ID in a cookie or something similar, this doesn't implement sessions at all: it lets anyone requesting / use whatever value was sent with a POST request, whether it was sent by them or by someone else. Why use the redirect at all? Responding to a POST request should be done in the post() method, not through a redirect to a GET method.
