I am pulling some data from an API and I want to store it in my Django model. The data is a list of baseball innings with the runs scored in each inning, and it comes to me like this...
"innings":[
0:0
1:3
2:0
3:0
4:1
5:2
6:0
7:0
8:4
]
I access each individual value like...
for game in games:
    first_inning = game['scoreboard']['score']['innings'][0]
    second_inning = game['scoreboard']['score']['innings'][1]
    # etc...
But if I wanted to save all the data as it is and start the innings at 1 instead of 0, which type of field would I use and how would I do that? Would it be an ArrayField?
I really appreciate your help.
There are a few ways, depending on your problem.
1. You can store your data in an ArrayField, but ArrayField is specific to the PostgreSQL database (more information here).
2. You can convert your data to JSON and store it in a JSONField (more information about JSONField is here).
My suggestion is solution number 2, because you are reading serialized data from an API.
I hope this helps.
One option is an ArrayField. It's 0-indexed like any Python list, and you cannot change that.
Another option is to model your Inning as a separate model, in case you want to perform queries like "average score on the 3rd inning" etc. You will be able to adjust inning numbers however you want them.
class Inning(models.Model):
    game = models.ForeignKey('game.Game', on_delete=models.CASCADE)
    number = models.PositiveIntegerField()
    score = models.PositiveIntegerField()

    class Meta:
        unique_together = [('game', 'number')]
You could use the JSONField. The benefit of it is that you can format it however you want to store your data according to what you are getting from the API. In your model, you can define the field like this:
class SomeModel(models.Model):
    ....
    innings_score = models.JSONField(default=dict)
Here I would advise you to use a callable default (such as dict) because of what is mentioned in the official docs:
If you give the field a default, ensure it’s an immutable object, such
as a str, or a callable object that returns a fresh mutable object
each time, such as dict or a function. Providing a mutable default
object like default={} or default=[] shares the one object between all
model instances.
Then you can save your data as a normal dictionary in the model:
SomeModel.objects.create(...., innings_score={
    0: 0,
    1: 3,
    2: 0,
    3: 0,
    4: 1,
    5: 2,
    6: 0,
    7: 0,
    8: 4,
})
Since this is a dictionary, you can start your data from 1 by naming your first key 1 instead of 0 (i.e. skip the key 0), and so on.
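As a minimal sketch of that re-keying in plain Python (the list values are the ones from the question; the variable names are only illustrative), enumerate with start=1 builds the 1-based dictionary. One thing to be aware of: JSON object keys are always strings, so after a JSON round trip, which is roughly what happens when a JSONField value is saved and loaded again, the key 1 comes back as "1".

```python
import json

# Runs per inning as the API returns them (0-indexed), copied from the question.
api_innings = [0, 3, 0, 0, 1, 2, 0, 0, 4]

# Re-key so that the first inning is 1 instead of 0.
innings_score = {number: runs for number, runs in enumerate(api_innings, start=1)}

# Simulate storing and reloading the value as JSON: integer keys become strings.
stored = json.loads(json.dumps(innings_score))

print(innings_score[2])  # runs in the second inning, integer key
print(stored["2"])       # same value after the round trip, string key
```

If you want integer keys back after loading, you can rebuild the dictionary with int(key) for each key.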
I want to create a database of dislike items, but depending on the category of the item, there are different columns I'd like to show, for example when all you're looking at is cars. In fact, I'd like the columns to be dynamic based on the category, so we can easily add an additional property to cars in the future and have that column show up too.
For example:
But when you filter on car or person, additional rows show up for filtering.
All the examples that I can find about using django models aren't giving me a very clear picture on how I might accomplish this behavior in a clean, simple web interface.
I would probably go for a model describing a "dislike criterion":
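class DislikeElement(models.Model):
    # Item is the model corresponding to your first table
    item = models.ForeignKey(Item, on_delete=models.CASCADE)
    # max_length values below are arbitrary; pick what fits your data
    field_name = models.CharField(max_length=100)  # e.g. "Model", "Year born"...
    value = models.CharField(max_length=255)  # e.g. "Mustang", "1960"...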
You would have quite a lot of flexibility in what data you can retrieve. For example, to get all the dislike elements for a given item, you would just have to do something like item.dislikeelement_set.all().
The only problem with this solution is that you would have to store numbers, strings, dates... in value under the same data type. But maybe that's not an issue for you.
Is it possible to filter a queryset by casting an hstore value to int or float?
I've run into an issue where we need to add more robust queries to an existing data model. The data model uses the HStoreField to store the majority of the building data, and we need to be able to query/filter against them, and some of the values need to be treated as numeric values.
However, since the values are treated as strings, they're compared character by character and results in incorrect queries. For example, '700' > '1000'.
So if I want to query for all items with a sqft value between 700 and 1000, I get back zero results, even though I can plainly see there are hundreds of items with values within that range. If I just query for items with sqft value >= 700, I only get results where the sqft value starts with 7, 8 or 9.
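The same behaviour can be reproduced in plain Python, which makes the cause easier to see. This is only an illustrative sketch: apart from 958, the sample values are made up, and the list stands in for the strings stored under one hstore key.

```python
from decimal import Decimal

# Values as an hstore hands them back: always strings.
sqft_values = ["958", "1000", "700", "85", "1200"]

# Lexicographic comparison: "1000" sorts before "700", so this range is empty.
between_as_text = [v for v in sqft_values if "700" <= v <= "1000"]

# ">= 700" as text keeps only strings that start with 7, 8 or 9.
gte_as_text = [v for v in sqft_values if v >= "700"]

# Converting to a numeric type first gives the intended result.
between_as_number = [
    v for v in sqft_values if Decimal("700") <= Decimal(v) <= Decimal("1000")
]

print(between_as_text)
print(gte_as_text)
print(between_as_number)
```

This is why the value has to be cast to a numeric type on the database side before the comparison is applied.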
I also tried testing this using a JsonField from django-pgjson (since we're not yet on Django 1.9), but it appears to have the same issue.
Setup
Django==1.8.9
django-pgjson==0.3.1 (for jsonfield functionality)
Postgres==9.4.7
models.py
from django.contrib.postgres.fields import HStoreField
from django.db import models

class Building(models.Model):
    address1 = models.CharField(max_length=50)
    address2 = models.CharField(max_length=20, default='', blank=True)
    city = models.CharField(max_length=50)
    state = models.CharField(max_length=2)
    zipcode = models.CharField(max_length=10)
    data = HStoreField(blank=True, null=True)
Example Data
This is an example of what some of the data on the hstore field looks like.
address1: ...
address2: ...
city: ...
state: ...
zipcode: ...
data: {
'year_built': '1995',
'building_type': 'residential',
'building_subtype': 'single-family',
'bedrooms': '2',
'bathrooms': '1',
'total_sqft': '958',
}
Example Query which returns incorrect results
queryset = Building.objects.filter(data__total_sqft__gte=700)
I've tried playing around with the annotate feature to see if I can coerce it to cast to a numeric value but I have not had any luck getting that to work. I always get an error saying the field I'm querying against does not exist. This is an example I found elsewhere which doesn't seem to work.
queryset = Building.objects.all().annotate(
    sqft=RawSQL("((data->>total_sqft)::numeric)")
).filter(sqft__gte=700)
Which results in this error:
FieldError: Cannot resolve keyword 'sqft' into field. Choices are: address1, address2, city, state, zipcode, data
One thing that complicates this setup a little further is that we're building the queries dynamically and using Q() objects to and/or them together.
So, trying to do something sort of like this, given a key, value and operator type (gte, lte, iexact):
queryset = queryset.annotate(**{key: RawSQL("((data->>%s)::numeric)", (key,))})
queries.append(Q(**{'{}__{}'.format(key, operator): value}))
queryset = queryset.filter(reduce(operator.and_, queries))
However, I'd be happy even just getting the first query working without dynamically building them out.
I've thought about the possibility of having to create a separate model for the building data with the fields explicitly defined. However, there are over 600 key-value pairs in the data hstore, so changing that into a concrete data model seems like it would be a nightmare to set up and potentially maintain.
So I had a very similar problem and ended up using the Cast function (Django >= 1.10) with KeyTextTransform.
my_query =.query.annotate(as_numeric=Cast(KeyTextTransform('my_json_fieldname', 'metadata'), output_field=DecimalField(max_digits=6, decimal_places=2))).filter(as_numeric=2)
The model for my Resource class is as follows:
class Resource(ndb.Model):
    name = ndb.StringProperty()
    availability = ndb.StructuredProperty(Availability, repeated=True)
    tags = ndb.StringProperty(repeated=True)
    owner = ndb.StringProperty()
    id = ndb.StringProperty(indexed=True, required=True)
    lastReservedTime = ndb.DateTimeProperty(auto_now_add=False)
    startString = ndb.StringProperty()
    endString = ndb.StringProperty()
I want to extract records where the owner is equal to a certain string.
I have tried the below query. It does not give an error but does not return any result either.
Resource.query(Resource.owner== 'abc#xyz.com').fetch()
As per my understanding if a column has duplicate values it shouldn't be indexed and that is why owner is not indexed. Please correct me if I am wrong.
Can someone help me figure out how to achieve a where clause kind of functionality?
Any help is appreciated! Thanks!
Just tried this. It worked first time. Either you have no Resource entities with an owner of "abc#xyz.com", or the owner property was not indexed when the entities were put (which can happen if you had indexed=False at the time the entities were put).
My test:
Resource(id='1', owner='abc#xyz.com').put()
Resource(id='2', owner='abc#xyz.com').put()
resources = Resource.query(Resource.owner == 'abc#xyz.com').fetch()
assert len(resources) == 2
Also, your comment:
As per my understanding if a column has duplicate values it shouldn't
be indexed and that is why owner is not indexed. Please correct me if
I am wrong.
You're wrong!
Firstly, there is no concept of a 'column' in a datastore model, so I will assume you mean 'Property'.
Next, to clarify what you mean by "if a column property has duplicate values":
I assume you mean 'multiple entities created from the same model with the same value for a specific property', in your case 'owner'. This has no effect on indexing, each entity will be indexed as expected.
Or maybe you mean 'a single entity with a property that allows multiple values (ie a list)', which also does not prevent indexing. In this case, the entity will be indexed multiple times, once for each item in the list.
To elaborate further, most properties (i.e. ones that accept primitive types such as string, int, float, etc.) are indexed automatically unless you pass indexed=False to the Property constructor. In fact, the only time you really need to worry about indexing is when you perform more complex queries, which involve querying against more than one property (and even then, by default, the App Engine dev server will auto-create the indexes for you in your local index.yaml file), or using inequality filters.
Please read the docs for more detail.
Hope this helps!
I have a simple to-do list with activities that can be ordered by the user. I use the model List, with a many-to-many field to the model Activities.
Now I need a way to store the user defined ordering of the activities on the list. Should I go with an extra field in my List model to store the order of my activity primary keys like this:
class List(models.Model):
    activities = models.ManyToManyField(Activity)
    order = models.CommaSeparatedIntegerField(max_length=250)
Or should I go with a solution in the Activity model, like described here:
https://djangosnippets.org/snippets/998/
What method can be considered as best practice?
You can create your own many-to-many "through" model that defines the extra field order:
https://docs.djangoproject.com/en/dev/topics/db/models/#extra-fields-on-many-to-many-relationships
something like:
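class List(models.Model):
    # The through model is referenced by name because it is defined below.
    activities = models.ManyToManyField(Activity, through='ActivityList')

class ActivityList(models.Model):
    activity = models.ForeignKey(Activity, on_delete=models.CASCADE)
    list = models.ForeignKey(List, on_delete=models.CASCADE)
    order = models.IntegerField()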
Now I need a way to store the user defined ordering of the activities on the list.
Specifying an order field allows you to give each activity an order.
Specifying a comma-separated string is 100% not the way to go. In fact, it is one of the biggest anti-patterns in relational databases; see "Is storing a delimited list in a database column really that bad?"
Using a through model lets you query for the order when presenting your todo list to the user
# 'order' lives on the through model, so it is reached via the reverse relation
for activity in your_list.activities.all().order_by('activitylist__order'):
    # display activity to user
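When the user drags activities into a new order, the order values on the through rows have to be rewritten. The following is only a plain-Python sketch of that step: ActivityListRow is a hypothetical stand-in for a row of the through table, and reorder is an illustrative helper, not part of Django.

```python
from dataclasses import dataclass


@dataclass
class ActivityListRow:
    """Hypothetical stand-in for one row of the ActivityList through table."""
    activity_id: int
    list_id: int
    order: int


def reorder(rows, new_activity_order):
    """Assign each row the position its activity has in new_activity_order."""
    position = {
        activity_id: index for index, activity_id in enumerate(new_activity_order)
    }
    for row in rows:
        row.order = position[row.activity_id]
    return sorted(rows, key=lambda row: row.order)


rows = [
    ActivityListRow(activity_id=10, list_id=1, order=0),
    ActivityListRow(activity_id=11, list_id=1, order=1),
    ActivityListRow(activity_id=12, list_id=1, order=2),
]

# The user moved activity 12 to the top of the list.
reordered = reorder(rows, [12, 10, 11])
print([row.activity_id for row in reordered])
```

In a real project the same loop would update the order field on the through model instances and save them, ideally inside one transaction.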
I am trying to create a Django model which has as one of its fields a reference to some sort of Python type, which could be either an integer, string, date, or decimal.
class MyTag(models.Model):
    name = models.CharField(max_length=50)
    object = (what goes here??)
I know that if I want a foreign key to any other model, I can use GenericForeignKeys and content_types. How can I have a model field that references any python type? The only idea I have come up with so far is to create models that are simple wrappers on objects and use GenericForeignKeys.
Is there any way to do this?
Since you want to filter, you need some kind of DB support for your field, and a JSON field won't be that valuable for you. There are several solutions of varying complexity, depending on your actual need. One suggestion is to serialize your data to a string, padded with enough zeros that integers and floats sort correctly as strings. Now you can filter all your values as strings (make sure you pad the value you are filtering by as well). Add a data_type column so you can convert back to the right Python object.
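from datetime import date
from decimal import Decimal

from django.db import models

# Maps each supported Python type to the integer code stored in data_type.
TYPES = [(int, 1), (Decimal, 2), (date, 3), (str, 4)]

class MyTag(models.Model):
    name = models.CharField(max_length=50)
    # Django choices are (stored value, label) pairs.
    data_type = models.IntegerField(choices=[(code, t.__name__) for t, code in TYPES])
    value = models.CharField(max_length=100)

    def set_the_value(self, value):
        choices = dict(TYPES)
        self.data_type = choices[type(value)]
        if type(value) is int:
            self.value = "%010d" % value
        # else... repeat for other data types

    def get_the_value(self):
        choices = dict([(y, x) for x, y in TYPES])
        # Works for int, Decimal and str; date needs its own parsing.
        return choices[self.data_type](self.value)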
(Disclaimer: this is a hack, but probably not the worst one).
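To make the padding idea concrete without a database, here is a plain-Python sketch of the encode/decode step on its own. The widths, the six decimal places, and the names encode, decode and DECODERS are all assumptions made for this illustration, and it only handles non-negative numbers.

```python
from datetime import date
from decimal import Decimal

INT_WIDTH = 10      # assumed width; must cover the largest expected integer
DECIMAL_WIDTH = 20  # assumed total width, including 6 decimal places


def encode(value):
    """Return (type_name, text); text sorts like the value within one type."""
    if isinstance(value, (int, Decimal)) and not isinstance(value, bool):
        if value < 0:
            raise ValueError("this sketch only pads non-negative numbers")
        if isinstance(value, int):
            return "int", str(value).rjust(INT_WIDTH, "0")
        # Fixed number of decimals so that all strings line up.
        return "decimal", f"{value:.6f}".rjust(DECIMAL_WIDTH, "0")
    if isinstance(value, date):
        return "date", value.isoformat()  # ISO dates already sort as text
    if isinstance(value, str):
        return "str", value
    raise TypeError(f"unsupported type: {type(value).__name__}")


DECODERS = {
    "int": int,
    "decimal": Decimal,
    "date": date.fromisoformat,
    "str": str,
}


def decode(type_name, text):
    """Turn the stored text back into a Python object of the recorded type."""
    return DECODERS[type_name](text)


print(encode(700), encode(1000))
print(decode(*encode(date(2020, 1, 31))))
```

With the padding in place, the text for 700 compares lower than the text for 1000, which is what makes string filters in the database behave like numeric ones.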
JSONField would work, but it won't handle Decimal as is.
If you don't need to filter through this field, you could either use JSONField or you could pickle your objects like this
The second approach would allow you to store nearly any Python data type, though it would be usable only from Python code.
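A minimal sketch of the pickling approach, assuming the value ends up in a plain text column: the object is pickled and then base64-encoded so the result is ASCII text. The helper names to_text and from_text are made up for this example. Keep in mind that unpickling data from an untrusted source is unsafe, so this only makes sense for data your own code wrote.

```python
import base64
import pickle
from datetime import date
from decimal import Decimal


def to_text(value):
    """Pickle a Python object and base64-encode it so it fits in a text column."""
    return base64.b64encode(pickle.dumps(value)).decode("ascii")


def from_text(text):
    """Reverse of to_text. Only use this on data you stored yourself."""
    return pickle.loads(base64.b64decode(text.encode("ascii")))


samples = [42, "hello", date(2020, 1, 31), Decimal("12.50")]
round_tripped = [from_text(to_text(value)) for value in samples]
print(round_tripped)
```

The stored text is opaque to the database, which is why this option rules out filtering on the field.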
If you need to filter through this data you should just create separate fields with one data type.