How to make a Python object JSON-serializable? - python

I want to serialize a Python object, save it to MySQL (through the Django ORM), and later load it and pass it to a function that expects this kind of object as a parameter.
The following two parts are my main logic:
1. Saving the param:
class Param(object):
    def __init__(self, name=None, targeting=None, start_time=None, end_time=None):
        self.name = name
        self.targeting = targeting
        self.start_time = start_time
        self.end_time = end_time
        #...
param = Param()
param.name = "name1"
param.targeting = "targeting1"
task_param = {
    "task_id": task_id,  # string
    "user_name": user_name,  # string
    "param": param,  # Param object
    "save_param": save_param_dict,  # dictionary
    "access_token": access_token,  # string
    "account_id": account_id,  # string
    "page_id": page_id,  # string
    "task_name": "sync_create_ad"  # string
}
class SyncTaskList(models.Model):
    task_id = models.CharField(max_length=128, blank=True, null=True)
    ad_name = models.CharField(max_length=128, blank=True, null=True)
    user_name = models.CharField(max_length=128, blank=True, null=True)
    task_status = models.SmallIntegerField(blank=True, null=True)
    task_fail_reason = models.CharField(max_length=255, blank=True, null=True)
    task_name = models.CharField(max_length=128, blank=True, null=True)
    start_time = models.DateTimeField()
    end_time = models.DateTimeField(blank=True, null=True)
    task_param = models.TextField(blank=True, null=True)
    class Meta:
        managed = False
        db_table = 'sync_task_list'
SyncTaskList(
    task_id=task_id,
    ad_name=param.name,
    user_name=user_name,
    task_status=0,
    task_param=task_param,
).save()
2. Using the param:
import json
def add_param(param, access_token):
    pass
task_list = SyncTaskList.objects.filter(task_status=0)
for task in task_list:
    task_param = json.loads(task.task_param)
    add_param(task_param["param"], task_param["access_token"])  # pass param object to function add_param
If I save task_param directly through the Django ORM, I get this error when loading it back:
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
This happens because the ORM stores the string representation of the dict, whose property names are enclosed in single quotes, like this:
# in mysql it saved as
task_param: " task_param: {'task_id': 'e4b8b240cefaf58fa9fa5a591221c90a',
'user_name': 'jimmy',
'param': Param(name='name1',
targeting='geo_locations',
),
'save_param': {}}"
I am confused about how to serialize a Python object and then load the original object back so that I can pass it to a function.
Any comments are very welcome. Many thanks.
Update: my solution so far
task_param = {
    # ...
    "param": vars(param),  # turn Param object to dictionary
    # ...
}
SyncTaskList(
    #...
    task_param=json.dumps(task_param),
    #...
).save()
#task_list = SyncTaskList.objects.filter(task_status=0)
#for task in task_list:
task_param = json.loads(task.task_param)
add_param(Param(**task_param["param"]), task_param["access_token"])
Update based on @AJS's answer
Dumping the dict with pickle.dumps, saving it to a binary field, and loading it back with pickle.loads also works.
Is there any better solution for this?
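A minimal sketch of the JSON approach from the update, wrapped into two helpers. It assumes Param only holds JSON-friendly attributes; the helper names (encode_param, dump_task_param, load_task_param) and the sample values are hypothetical and not part of the original code. json.dumps calls the function passed as default= for every object it cannot serialize itself.

```python
import json


class Param:
    """Stand-in for the Param class from the question."""

    def __init__(self, name=None, targeting=None, start_time=None, end_time=None):
        self.name = name
        self.targeting = targeting
        self.start_time = start_time
        self.end_time = end_time


def encode_param(obj):
    """Called by json.dumps for any object it cannot serialize itself."""
    if isinstance(obj, Param):
        return vars(obj)  # plain dict of the instance attributes
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def dump_task_param(task_param):
    """Serialize the task dict to a JSON string suitable for a TextField."""
    return json.dumps(task_param, default=encode_param)


def load_task_param(text):
    """Parse the JSON string and rebuild the Param object from its dict."""
    data = json.loads(text)
    data["param"] = Param(**data["param"])
    return data


stored = dump_task_param({
    "task_id": "e4b8b240",  # hypothetical sample values
    "param": Param(name="name1", targeting="targeting1"),
    "access_token": "token",
})
restored = load_task_param(stored)
print(restored["param"].name)
```

The stored value stays readable text in the database, at the cost of having to rebuild Param explicitly after loading.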

Try looking into msgpack
https://msgpack.org/index.html
Unlike pickle, which is Python-specific, msgpack is supported by many languages (so the language that writes to MySQL can be different from the language that reads from it).
There are also some projects out there that integrate these serializer-libraries into Django model fields:
Pickle: https://pypi.org/project/django-picklefield/
MsgPack: https://github.com/vakorol/django-msgpackfield/blob/master/msgpackfield/msgpackfield.py

You can use pickle: you serialize your Python object and save it as bytes in your MySQL database, using BinaryField as the model field type in Django. I don't think plain JSON serialization would work in your case, because your dict has a Python object as one of its values. When you fetch the data from the database, simply unpickle it. The syntax is similar to the json library; see below.
import pickle
#to pickle
data = pickle.dumps({'name':'testname'})
# to unpickle just do
pickle.loads(data)
So in your case, when you unpickle your object, you should get your data back in the same form it had before pickling.
Hope this helps.
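A small sketch of the round trip described above. The dict uses a datetime as a stand-in for any value that plain json.dumps rejects; the sample values are hypothetical. Note that the Python documentation warns against unpickling data from untrusted sources, so this is only appropriate for data your own application wrote.

```python
import json
import pickle
from datetime import datetime

task_param = {
    "task_id": "e4b8b240",                       # hypothetical sample values
    "start_time": datetime(2020, 1, 1, 12, 0),   # not JSON-serializable by default
    "save_param": {},
}

# Plain json.dumps raises TypeError for the datetime value.
try:
    json.dumps(task_param)
    json_ok = True
except TypeError:
    json_ok = False

blob = pickle.dumps(task_param)   # bytes, suitable for a BinaryField
restored = pickle.loads(blob)     # only unpickle data you wrote yourself
print(json_ok, restored == task_param)
```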

Related

Django - How to take string values on URL for PUT?

I set up my URL like this:
path('voucher/<str:voucher_id>', views.update_voucher),
My view:
def update_voucher(request, voucher_id):
    put = QueryDict(request.body)
    try:
        customer_id = put.get('customer_id')
    except:
        return HttpResponse("Missing parameters")
    updateVoucher = Voucher.objects.filter(code = voucher_id)
It's a PUT call that takes parameters from both the body and the URL (voucher_id from the URL and customer_id from the body).
I call this URL: http://127.0.0.1:5448/voucher/NewVoucher
I got this error:
ValueError: Field 'id' expected a number but got 'NewVoucher'.
Below is my model:
class Voucher(models.Model):
    code = models.CharField(unique=True, max_length=255)
    delivery_type = models.CharField(max_length=255)
    description = models.CharField(max_length=255, blank=True, null=True)
    start_at = models.DateTimeField()
    end_at = models.DateTimeField()
    discount_type = models.CharField(max_length=255)
    discount_amount = models.FloatField(blank=True, null=True)
P.S.: I am a maintainer, so I can't change the function or the way this URL takes parameters from both the URL and the body.
You are not passing voucher_id as an integer. Instead you are passing the code "NewVoucher", which is a string, as the error says:
ValueError: Field 'id' expected a number but got 'NewVoucher'.
If you look the voucher up by id, you have to pass an integer id, so the URL would look something like this:
http://127.0.0.1:5448/voucher/1
As far as I understand, you want to filter on the voucher code, i.e. "NewVoucher".
In that case your view should be changed to:
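from django.http import HttpResponse, QueryDict
from django.shortcuts import get_object_or_404
def update_voucher(request, voucher_code, *args, **kwargs):
    voucher = get_object_or_404(Voucher, code=voucher_code)
    # request.data exists only in Django REST framework; a plain Django view parses the body itself
    customer_id = QueryDict(request.body).get("customer_id")  # I'm not sure where you are using this customer_id
    if not customer_id:
        return HttpResponse("Missing parameters")  # return the response, do not raise it
    # updateVoucher = Voucher.objects.filter(code=voucher_id) is no longer needed, the voucher variable already holds the object
# urls
path('voucher/<str:voucher_code>', views.update_voucher),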
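The explicit "if not customer_id" check matters because .get() returns None for a missing key instead of raising, so a try/except around it never fires. A minimal standard-library sketch of that behaviour, using urllib.parse.parse_qs as a stand-in for Django's QueryDict (an assumption made so the example runs without Django); get_customer_id is a hypothetical helper name.

```python
from urllib.parse import parse_qs


def get_customer_id(body: bytes):
    """Return customer_id from a form-encoded body, or None if it is absent."""
    params = parse_qs(body.decode("utf-8"))
    values = params.get("customer_id")  # .get() returns None, it never raises
    return values[0] if values else None


print(get_customer_id(b"customer_id=42&note=hello"))
print(get_customer_id(b"note=hello"))
```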

Storing a function call in a Django model

Currently I am storing a series of objects as a dictionary of dictionaries, and within this I store references to functions defined outside of the dictionary. These functions are specific to the objects and cannot be generalised. In the dictionary I can refer to the function directly, e.g. 'some_property': function_name, and when I later call dictionary['some_property'](arg_1, arg_2) the function gets called. I am looking to migrate this dictionary of dictionaries to a Django model, but I cannot see how to replicate this functionality in a model.
What I currently have:
dictionaries.py
def year_camel_month(filename, **kwargs):
    month = kwargs['month'].title()
    return filename.format(str(kwargs['year']), month)
def year_month(filename, **kwargs):
    month = kwargs['month']
    return filename.format(str(kwargs['year']), month.lower())
data_source_families = {
    'dataset_1': {
        'source_url': 'https://example.org/url/subfolder',
        'slug': 'slug_that_changes_predictably_over_time{}-{}',
        'slug_treatment': year_camel_month
    },
    'dataset_2': {
        'source_url': 'https://example2.org/url/subfolder',
        'slug': 'slug_that_changes_predictably_over_time{}-{}',
        'slug_treatment': year_month
    },
}
Which then gets called when combined with a user-defined time frame later on:
get_data.py
from .dictionaries import data_source_families
slug = data_source_families[selected_dataset]['slug']
processed_slug = data_source_families[selected_dataset]['slug_treatment'](slug, some_kwargs)
url = data_source_families[selected_dataset]['source_url'] + processed_slug
And this is working fine. I am looking to develop functionality to improve consistency (and make these data available to another programme) by creating a django model that replicates this, something like this:
models.py
def year_camel_month(filename, **kwargs):
    month = kwargs['month'].title()
    return filename.format(str(kwargs['year']), month)
def year_month(filename, **kwargs):
    month = kwargs['month']
    return filename.format(str(kwargs['year']), month.lower())
class DataSourceFamilies(models.Model):
    name = models.CharField(max_length=200, unique=True)
    source_url = models.CharField(max_length=300, blank=False)
    slug = models.CharField(max_length=200, blank=False)
    --> slug_treatment = models._____(choices=list_of_functions) <--
    def __str__(self):
        return self.name
Does something like this exist? How would I go about doing this?
You cannot store Python functions in a SQL database. But you can store any text value, and you can keep a dict of 'key: func' in your model, i.e.:
class DataSourceFamilies(models.Model):
    name = models.CharField(max_length=200, unique=True)
    source_url = models.CharField(max_length=300, blank=False)
    slug = models.CharField(max_length=200, blank=False)
    SLUG_TREATEMENTS = [
        # key, label, function
        ('year_camel_month', "Year, Camel month", year_camel_month),
        ('year_month', "Year month", year_month),
    ]
    SLUG_TREATEMENTS_ACTIONS = {
        k: func
        for k, label, func in SLUG_TREATEMENTS
    }
    SLUG_TREATEMENTS_CHOICES = [
        (k, label)
        for k, label, func in SLUG_TREATEMENTS
    ]
    slug_treatment = models.CharField(
        max_length=50,  # let's have a little headroom
        choices=SLUG_TREATEMENTS_CHOICES
    )
    def get_slug_treatment_func(self):
        return self.SLUG_TREATEMENTS_ACTIONS[self.slug_treatment]
One thing you could do is to use a CharField and then eval it. However, using eval is usually a huge security risk. Any python code that enters it will be executed, and you do not want anything like that in a web application.
Another option is to have a lookup system. You could, say, have a CharField with choices that corresponds to a dictionary like so:
models.py
...
slug_treatment = models.CharField(max_length=100, choices=function_choices)
...
And then:
get_data.py
function_lookup = {
    "year_month": year_month,
    "year_camel_month": year_camel_month
}
# the treatment functions take **kwargs, so unpack the dict of keyword arguments
processed_slug = function_lookup[data_source.slug_treatment](slug, **some_kwargs)
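The lookup idea can be tried without Django. In this sketch the string key plays the role of the value stored in the CharField; process_slug, the SLUG_TREATMENTS dict name, and the sample slug are hypothetical, while the two treatment functions are copied from the question.

```python
def year_camel_month(filename, **kwargs):
    month = kwargs["month"].title()
    return filename.format(str(kwargs["year"]), month)


def year_month(filename, **kwargs):
    month = kwargs["month"]
    return filename.format(str(kwargs["year"]), month.lower())


# Maps the string stored in the database to the function it stands for.
SLUG_TREATMENTS = {
    "year_camel_month": year_camel_month,
    "year_month": year_month,
}


def process_slug(treatment_key, slug, **kwargs):
    """Look up the function by its stored key and apply it to the slug."""
    try:
        func = SLUG_TREATMENTS[treatment_key]
    except KeyError:
        raise ValueError(f"Unknown slug treatment: {treatment_key!r}") from None
    return func(slug, **kwargs)


print(process_slug("year_camel_month", "report_{}-{}", year=2020, month="march"))
print(process_slug("year_month", "report_{}-{}", year=2020, month="March"))
```

Only keys present in the dict can ever be executed, which is what makes this safer than evaluating stored text.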
Sorry, this is a bit confusing to me, but based on what I understood, you could declare all the functions in your class and use slug_treatment to select which function is called when you need it.
Let me sketch it:
YEAR_CAMEL_MONTH = 1
YEAR_MONTH = 2
SLUG_TREATMENT_CHOICES = [
    (YEAR_CAMEL_MONTH, 'year_camel_month'),
    (YEAR_MONTH, 'year_month'),
]
class DataSourceFamilies(models.Model):
    ...
    slug = models.CharField(max_length=200, blank=False)
    slug_treatment = models.IntegerField(choices=SLUG_TREATMENT_CHOICES)
    def year_camel_month(self):
        ...  # Your logic
        return formatted_slug
    def year_month(self):
        ...  # Your logic
        return formatted_slug
    def save(self, *args, **kwargs):
        if self.slug_treatment == YEAR_CAMEL_MONTH:
            self.slug = self.year_camel_month()
        elif self.slug_treatment == YEAR_MONTH:
            self.slug = self.year_month()
        super(DataSourceFamilies, self).save(*args, **kwargs)
Or you can make it a property instead of persisted data, so the slug is computed every time you access it and is dynamic rather than stored. Note: a property is accessed like a column of your model, but its value is not persisted; it is similar to a CAST in a database.
class DataSourceFamilies(models.Model):
    ...
    slug_treatment = models.IntegerField(choices=SLUG_TREATMENT_CHOICES)
    @property
    def slug(self):
        if self.slug_treatment == YEAR_CAMEL_MONTH:
            return self.year_camel_month()
        elif self.slug_treatment == YEAR_MONTH:
            return self.year_month()
https://docs.djangoproject.com/en/2.0/topics/db/models/
Note: if you are trying to load code from text and evaluate it in Python, I guess it's possible, but it is highly unsafe and I do not recommend it.

Django full text search using indexes with PostgreSQL

After solving the problem I asked about in this question, I am trying to optimize performance of the FTS using indexes.
I issued on my db the command:
CREATE INDEX my_table_idx ON my_table USING gin(to_tsvector('italian', very_important_field), to_tsvector('italian', also_important_field), to_tsvector('italian', not_so_important_field), to_tsvector('italian', not_important_field), to_tsvector('italian', tags));
Then I edited my model's Meta class as follows:
class MyEntry(models.Model):
    very_important_field = models.TextField(blank=True, null=True)
    also_important_field = models.TextField(blank=True, null=True)
    not_so_important_field = models.TextField(blank=True, null=True)
    not_important_field = models.TextField(blank=True, null=True)
    tags = models.TextField(blank=True, null=True)
    class Meta:
        managed = False
        db_table = 'my_table'
        indexes = [
            GinIndex(
                fields=['very_important_field', 'also_important_field', 'not_so_important_field', 'not_important_field', 'tags'],
                name='my_table_idx'
            )
        ]
But nothing seems to have changed. The lookup takes exactly the same amount of time as before.
This is the lookup script:
from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector
# other unrelated stuff here
vector = SearchVector("very_important_field", weight="A") + \
    SearchVector("tags", weight="A") + \
    SearchVector("also_important_field", weight="B") + \
    SearchVector("not_so_important_field", weight="C") + \
    SearchVector("not_important_field", weight="D")
query = SearchQuery(search_string, config="italian")
rank = SearchRank(vector, query, weights=[0.4, 0.6, 0.8, 1.0])  # D, C, B, A
full_text_search_qs = MyEntry.objects.annotate(rank=rank).filter(rank__gte=0.4).order_by("-rank")
What am I doing wrong?
Edit:
The above lookup is wrapped in a function that I time with a decorator. The function actually returns a list, like this:
@timeit
def search(search_string):
    # the above code here
    qs = list(full_text_search_qs)
    return qs
Might this be the problem, maybe?
You need to add a SearchVectorField to your MyEntry, update it from your actual text fields and then perform the search on this field. However, the update can only be performed after the record has been saved to the database.
Essentially:
from django.contrib.postgres.indexes import GinIndex
from django.contrib.postgres.search import SearchVector, SearchVectorField
class MyEntry(models.Model):
    # The fields that contain the raw data.
    very_important_field = models.TextField(blank=True, null=True)
    also_important_field = models.TextField(blank=True, null=True)
    not_so_important_field = models.TextField(blank=True, null=True)
    not_important_field = models.TextField(blank=True, null=True)
    tags = models.TextField(blank=True, null=True)
    # The field we are actually going to search.
    # Must be null=True because we cannot set it immediately during create()
    search_vector = SearchVectorField(editable=False, null=True)
    class Meta:
        # The search index pointing to our actual search field.
        indexes = [GinIndex(fields=["search_vector"])]
Then you can create the plain instance as usual, for example:
# Does not set MyEntry.search_vector yet.
my_entry = MyEntry.objects.create(
    very_important_field="something very important",  # Fake Italian text ;-)
    also_important_field="something different but equally important",
    not_so_important_field="this one matters less",
    not_important_field="we don't care about that one at all",
    tags="things, stuff, whatever",
)
Now that the entry exists in the database, you can update the search_vector field using all kinds of options. For example weight to specify the importance and config to use one of the default language configurations. You can also completely omit fields you don't want to search:
# Update search vector on existing database record.
my_entry.search_vector = (
    SearchVector("very_important_field", weight="A", config="italian")
    + SearchVector("also_important_field", weight="A", config="italian")
    + SearchVector("not_so_important_field", weight="C", config="italian")
    + SearchVector("tags", weight="B", config="italian")
)
my_entry.save()
Manually updating the search_vector field every time some of the text fields change can be error prone, so you might consider adding an SQL trigger to do that for you using a Django migration. For an example on how to do that see for instance a blog article on Full-text Search with Django and PostgreSQL.
To actually search in MyEntry using the index, you need to filter and rank by your search_vector field. The config for the SearchQuery should match the one used for the SearchVector above (so that the same stop words, stemming, etc. are applied).
For example:
from django.contrib.postgres.search import SearchQuery, SearchRank
from django.core.exceptions import ValidationError
from django.db.models import F, QuerySet
search_query = SearchQuery("important", search_type="websearch", config="italian")
search_rank = SearchRank(F("search_vector"), search_query)
my_entries_found = (
    MyEntry.objects.annotate(rank=search_rank)
    .filter(search_vector=search_query)  # Perform full text search on index.
    .order_by("-rank")  # Yield most relevant entries first.
)
I'm not sure, but according to the PostgreSQL documentation (https://www.postgresql.org/docs/9.5/static/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX):
Because the two-argument version of to_tsvector was used in the index
above, only a query reference that uses the 2-argument version of
to_tsvector with the same configuration name will use that index. That
is, WHERE to_tsvector('english', body) @@ 'a & b' can use the index,
but WHERE to_tsvector(body) @@ 'a & b' cannot. This ensures that an
index will be used only with the same configuration used to create the
index entries.
I don't know which configuration Django uses, but you can try removing the first argument.

Querying Binary data using Django and PostgreSQL

I am trying to print out to the console the actual content value (which is html) of the field 'htmlfile': 16543. (See below)
So far I am able to print out the whole row using .values() method
Here is what I am getting in my python shell:
>>>
>>> Htmlfiles.objects.values()[0]
{'id': 1, 'name': 'error.html', 'htmlfile': 16543}
>>>
I want to print out the content of 16543.. I have scanned through the Django QuerySet docs so many times and still cannot find the right method..
Here is my data model in models.py:
class Htmlfiles(models.Model):
    name = models.CharField(max_length=30, blank=True, null=True)
    htmlfile = models.TextField(blank=True, null=True)
    class Meta:
        managed = False
        db_table = 'htmlfiles'
Any assistance would be greatly appreciated.
You can fetch only the htmlfile value with:
Htmlfiles.objects.values('htmlfile')
For each row, this gives you a dictionary like so:
{'htmlfile': 12345}
So to print all the htmlfile values, something like this is what you need:
objects = Htmlfiles.objects.values('htmlfile')
for obj in objects:
    print(obj['htmlfile'])

How do I store a string in ArrayField? (Django and PostgreSQL)

I am unable to store a string in ArrayField. There are no exceptions thrown when I try to save something in it, but the array remains empty.
Here is some code from models.py :
# models.py
from django.db import models
import uuid
from django.contrib.auth.models import User
from django.contrib.postgres.fields import JSONField, ArrayField
# Create your models here.
class UserDetail(models.Model):
    user = models.OneToOneField(User, on_delete=models.CASCADE)
    key = models.CharField(max_length=50, default=False, primary_key=True)
    api_secret = models.CharField(max_length=50)
    user_categories = ArrayField(models.CharField(max_length = 1000), default = list)
    def __str__(self):
        return self.key
class PreParentProduct(models.Model):
    product_user = models.ForeignKey(UserDetail, default=False, on_delete=models.CASCADE)
    product_url = models.URLField(max_length = 1000)
    pre_product_title = models.CharField(max_length=600)
    pre_product_description = models.CharField(max_length=2000)
    pre_product_variants_data = JSONField(blank=True, null=True)
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    def __str__(self):
        return self.pre_product_title
I try to save it this way:
catlist = ast.literal_eval(res.text)
for jsonitem in catlist:
    key = jsonitem.get('name')
    id = jsonitem.get("id")
    dictionary = {}
    dictionary['name'] = key
    dictionary['id'] = id
    tba = json.dumps(dictionary)
    print("It works till here.")
    print(type(tba))
    usersearch[0].user_categories.append(tba)
    print(usersearch[0].user_categories)
usersearch[0].save()
print(usersearch[0].user_categories)
The output I get is:
It works till here.
<class 'str'>
[]
It works till here.
<class 'str'>
[]
[]
Is this the correct way to store a string inside ArrayField?
I cannot store JSONField inside an ArrayField, so I had to convert it to a string.
How do I fix this?
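Solution to the append problem.
You haven't shown how usersearch is defined. I suspect it's something like this:
usersearch = UserDetail.objects.all()
If so, each usersearch[0] on the unevaluated queryset runs a new query and returns a fresh model instance, so a change made to one instance is lost on the next lookup. Try this and you will see that the id is unchanged too:
usersearch[0].id = 1000
print(usersearch[0].id)
But this works, because list() evaluates the queryset once and keeps the instances:
usersearch = list(UserDetail.objects.all())
and so does keeping a single reference, then modifying and saving that one object:
u = usersearch[0]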
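The effect can be reproduced without a database. This is a toy illustration only: FreshEachTime and Row are hypothetical stand-ins that mimic how indexing an unevaluated queryset hands back a new object on every lookup; they are not Django classes.

```python
class Row:
    """Stand-in for a model instance with an array-like field."""

    def __init__(self):
        self.user_categories = []


class FreshEachTime:
    """Toy stand-in for an unevaluated queryset: every index builds a new object."""

    def __getitem__(self, index):
        return Row()


usersearch = FreshEachTime()
usersearch[0].user_categories.append("a")  # modifies a throwaway object
lost = usersearch[0].user_categories       # a different object, still empty

u = usersearch[0]                          # keep one reference instead
u.user_categories.append("a")
kept = u.user_categories
print(lost, kept)
```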
Solution to the real problem
user_categories = ArrayField(models.CharField(max_length = 1000), default = list)
This is wrong. ArrayFields shouldn't be used in this manner. You will soon find that you need to search through them, and as the PostgreSQL documentation puts it:
Arrays are not sets; searching for specific array elements can be a
sign of database misdesign. Consider using a separate table with a row
for each item that would be an array element. This will be easier to
search, and is likely to scale better for a large number of elements
ref: https://www.postgresql.org/docs/9.5/static/arrays.html
You need to normalize your data. You need to have a category model and your UserDetail should be related to it through a foreign key.
