Django: create a model with fields derived from pandas dataframe - python

I have a pandas DataFrame from which I want to create a Django model whose fields are the DataFrame columns (with the right "Field" type set for each).
I googled around but could only find bulk_create() tips, which, to my understanding, is about creating model instances, not defining a model.
My knowledge level: Django tutorial done; I'm trying to practice by creating this model from a big DataFrame (many columns).
EDIT:
In practice, what I think I need is a way to write this part (in a models.py class):
first_name = models.CharField(max_length=200)
surname = models.CharField(max_length=200)
run_date = models.DateTimeField('run_date')
...
in an automatic way, by looking at a DataFrame with 30+ columns and defining a field for each of them.
I suspect I'm probably approaching this the wrong way, so I would also welcome being pointed in a better direction if needed.
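A minimal sketch of one possible approach (not an official Django feature): inspect the DataFrame's dtypes and generate the models.py field declarations as text, then paste or write the result into your app. The dtype-to-field mapping, the max_length values and the field_for_dtype / model_source helper names below are illustrative assumptions.
import pandas as pd
from pandas.api import types as ptypes

def field_for_dtype(series):
    # Map a pandas dtype to a Django field declaration (adjust to your data).
    if ptypes.is_integer_dtype(series):
        return "models.IntegerField(null=True)"
    if ptypes.is_float_dtype(series):
        return "models.FloatField(null=True)"
    if ptypes.is_bool_dtype(series):
        return "models.BooleanField(default=False)"
    if ptypes.is_datetime64_any_dtype(series):
        return "models.DateTimeField(null=True)"
    return "models.CharField(max_length=200, blank=True)"

def model_source(df, model_name="GeneratedModel"):
    # Emit a models.py-style class body, one field per DataFrame column.
    lines = [f"class {model_name}(models.Model):"]
    for column in df.columns:
        lines.append(f"    {column} = {field_for_dtype(df[column])}")
    return "\n".join(lines)

df = pd.DataFrame({"first_name": ["Ada"], "run_date": [pd.Timestamp("2021-01-01")]})
print(model_source(df))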

Related

How to keep assigned attributes to a queryset object after filtering? Alternatives?

Maybe it's a strange question, so I will explain why I'm doing this.
I have a Products model, and I have to assign each product some stock.
So I have a function on the Products model that calculates a lot of necessary things, like stock, and returns a QuerySet.
Since my DB model is a little bit "complicated", I can't use annotations in this case. So I decided to execute this database query manually and then assign each product in the QuerySet a stock attribute manually. Something like:
for product in queryset_products:
    product.stock = some_stock_calc...
The problem comes when I want to filter this queryset_products. After executing something like:
queryset_products = queryset_products.filter(...)
the stock attribute gets lost.
Any solution?
Since you can't use annotate(), if you can add a separate column to store stock in your Product table, you can run the filter queries at any time.
Maybe have a Celery task that does all the calculations for each Product and saves the result to the new column.
Otherwise, without annotate() you can't have the stock attribute in the queryset.
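A rough sketch of what that could look like, assuming a dedicated stock column and a periodic Celery task (the field name, the task name and the some_stock_calc placeholder from the question are illustrative, not a prescribed design):
from celery import shared_task
from django.db import models

class Product(models.Model):
    name = models.CharField(max_length=200)
    stock = models.IntegerField(default=0)  # denormalized, refreshed by the task below

@shared_task
def refresh_product_stock():
    # Recompute and store stock so later .filter(stock__gt=0) queries run in SQL.
    for product in Product.objects.all():
        product.stock = some_stock_calc(product)  # your existing calculation
        product.save(update_fields=["stock"])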
It can be solved differently; you can run a single loop like this:
queryset_products = list(queryset_products.filter(...))
for product in queryset_products:
    setattr(product, "stock", some_stock_calc...)
Basically, you need to fetch all the records from the database first. Because a queryset is lazy, the assigned attribute will be lost when the queryset is re-evaluated, unless the results have been cached/stored.
All operations on the queryset like .filter() are symbolic until the queryset is evaluated; only then is an SQL query compiled and executed. It is not efficient to calculate the stock on a big unfiltered queryset and then run it again filtered. You can split the filter into conditions unrelated to stock, applied to the queryset, and a stock-related condition that you evaluate in the same Python loop where you calculate the stock.
result = []
for product in queryset_products.filter(**simple_filters):
    product.stock = some_stock_calc...
    if product.stock > 0 or not required_on_stock:
        result.append(product)
A cached field of possibly active products that could be on stock is very useful for the first, simple filter.
Maybe the stock calculation is not more complicated than, e.g., a stock at midnight plus a sum of stock operations since midnight. Then the current stock can be calculated by a Subquery in an annotation and filtered together with the other conditions. It will be compiled to one SQL query, with a main query joining your related models and a relatively simple subquery for stock. (That would be another question.)
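A hedged sketch of that Subquery idea, assuming hypothetical Product / StockOperation models with stock_at_midnight, quantity and timestamp fields (none of these names come from the original question):
from django.db.models import F, OuterRef, Subquery, Sum, Value
from django.db.models.functions import Coalesce
from django.utils import timezone

midnight = timezone.now().replace(hour=0, minute=0, second=0, microsecond=0)

# Sum of today's stock operations per product, as a correlated subquery.
ops_since_midnight = (
    StockOperation.objects
    .filter(product=OuterRef("pk"), timestamp__gte=midnight)
    .values("product")
    .annotate(total=Sum("quantity"))
    .values("total")
)

# One SQL query: annotate the current stock and filter on it directly.
queryset_products = (
    Product.objects
    .annotate(stock=F("stock_at_midnight") + Coalesce(Subquery(ops_since_midnight), Value(0)))
    .filter(stock__gt=0)
)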

django-filter chained select

I'm using the django-filter lib https://django-filter.readthedocs.io/en/master/index.html. I need to make a chained select dropdown in my filters.
I know how to make it with plain Django forms, like here: https://simpleisbetterthancomplex.com/tutorial/2018/01/29/how-to-implement-dependent-or-chained-dropdown-list-with-django.html.
When the user picks a region, I need to show only the cities in that region. Does anyone have an idea or solution for how to build filters like this?
Integrate django-smart-selects with how you perform the filtering.
This package allows you to quickly filter or group “chained” models by adding a custom foreign key or many to many field to your models. This will use an AJAX query to load only the applicable chained objects.
In analogy to the original question's Region -> City, the documentation's example is Continent -> Country, which fits exactly what is needed.
Once you select a continent, if you want only the countries on that continent to be available, you can use a ChainedForeignKey on the Location model:
from django.db import models
from smart_selects.db_fields import ChainedForeignKey

class Location(models.Model):
    continent = models.ForeignKey(Continent, on_delete=models.CASCADE)
    country = ChainedForeignKey(
        Country,
        chained_field="continent",        # Location.continent
        chained_model_field="continent",  # Country.continent
        show_all=False,   # only the filtered results should be shown
        auto_choose=True,
        sort=True,
        on_delete=models.CASCADE)
Related question:
How to use django-smart-select

Is there any time advantage of using the @property decorator on django models instead of saving the field directly into the database

So for a field that can be computed, like full_name from the first and last name, we should use @property to compute full_name. But when we need to get a list of all 'n' persons with their full names, full_name will be computed 'n' times, which should take more time than just getting the field from the database (if it were already stored as a separate field!).
So is there any processing time / DB fetching time advantage or disadvantage of using @property to compute full_name?
(Note: I have considered the other advantages of @property, like reducing database size, not worrying about the first or last name changing without the full name changing, a setter function to set the first and last name, etc. I just want to know the processing / DB fetching time advantage or disadvantage compared to saving full_name in the database.)
The technique you're talking about is called denormalization. It is quite an advanced technique.
Denormalization is a strategy used on a previously-normalized database to increase performance. In computing, denormalization is the process of trying to improve the read performance of a database, at the expense of losing some write performance, by adding redundant copies of data or by grouping data.
It's the opposite of database normalization, and you should always start your application with a normalized database.
If you don't have any serious performance problems, I'd advise against doing this. If you do have problems, try other solutions to improve your app's speed first.
First Normal Form (1NF): it should only have single (atomic) valued attributes/columns.
A very basic example of a disadvantage is an UPDATE statement: you'll need to touch two columns in the table plus recalculate full_name.
Anyway, your full_name example is so simple that you should definitely do this with @property.
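For reference, a minimal sketch of the @property version (the model and field names are just an example):
from django.db import models

class Person(models.Model):
    first_name = models.CharField(max_length=200)
    last_name = models.CharField(max_length=200)

    @property
    def full_name(self):
        # Computed on access; nothing is stored, so it can never get out of sync.
        return f"{self.first_name} {self.last_name}"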
More on this topic:
Difference Between Normalization and Denormalization

Odoo 10 - Counting values of linked records

In Odoo 10 I have created my own custom application (using the new Studio feature), but I have run into an issue trying to compute data between records that belong to different views.
In this scenario I have two models (Model A and Model B), where records from Model B are connected to records from Model A via a many2one relational field. There is a field in Model B that holds a numerical value entered into it.
Ideally, what I would like to achieve is some form of Automated Action / Server Action that loops through the records in Model A, then loops through the related records in Model B, adding together the values of the previously mentioned numerical field, and sets the value of a field in Model A equal to that sum, before continuing on to the next record.
For example's sake, say the field names are:
Model A = x_a
- Model A ID Field = x_id_field
- Target field for computed value = x_compute
Model B = x_b
- many2one field = x_a_id
- numerical field = x_value_field
I have attempted to use Automated Actions to execute some basic Python code (because I thought this would be as simple as a nested loop), however all my attempts have failed because I'm not familiar with how to loop through records in Odoo or how to access other models and their records from Python.
How would I go about accomplishing this?
Ideally what I would like to achieve is have some form of Automated Action / Server Action, that loops through the records in Model A, then loops through related records in Model B adding together the values of the previously mentioned numerical value field and sets the value of a field in model A equal to the equated number, before continuing onto the next record.
Create an Automated Action with Related Document Model = Model A.
On the Actions tab create a Server Action:
model_b_records = self.env['model_b'].search([('many2one_field', '!=', False)])
# Reset the target field first, then accumulate the numerical values per related Model A record.
model_b_records.mapped('many2one_field').write({'target_field_for_computed_value': 0})
for record in model_b_records:
    record.many2one_field.target_field_for_computed_value += record.numerical_field
Save the Server Action and execute it.
The code should be self-explanatory; if you have any questions, do not hesitate to ask in the comments below.
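Equivalently, using the field and model names from the question (x_a, x_b, x_a_id, x_value_field, x_compute), an untested sketch of a Server Action that loops over Model A and sums the related Model B values could look like this:
# Server Action on model x_a: store the sum of related x_value_field values in x_compute.
for record_a in self.env['x_a'].search([]):
    related_b = self.env['x_b'].search([('x_a_id', '=', record_a.id)])
    record_a.x_compute = sum(related_b.mapped('x_value_field'))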

save a tuple to a django model

Is there a way to save a tuple to a django model?
example:
class User(models.Model):
    location = models.tupleField()
where User.location = (longitude, latitude)
Maybe you are looking for GeoDjango's PointField:
from django.contrib.gis.db import models
class User(models.Model):
    location = models.PointField(help_text="Represented as (longitude, latitude)")
Honestly, I didn't see a tupleField in the Django documentation. I think a better approach is to add two fields, longitude and latitude, or to create another model to store the location and add a ForeignKey to it in the User model.
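A quick sketch of that suggestion (the model and field names are assumptions, not from the original answer):
from django.db import models

class Location(models.Model):
    longitude = models.FloatField()
    latitude = models.FloatField()

class User(models.Model):
    location = models.ForeignKey(Location, on_delete=models.SET_NULL, null=True)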
If you feel the need to save tuples in a single field, you should really look into relationships. That's the reason they were created.
Old answer:
This answer is assuming you're really saving locations.
If you're using PostgreSQL, you can use PostGIS. MySQL and Oracle have spatial fields built in.
Django already supports geo data, e.g.:
from django.contrib.gis.db import models

class ModelName(models.Model):
    centroid = models.GeometryField(blank=True, null=True)
Then you can make all sorts of geo queries (e.g. find places that are within a specified area, sort by distance, etc.) and your DBMS will take care of them. Plus, from that single field, you can access your data like:
lat, long = [modelobject.centroid.x, modelobject.centroid.y]
# Or you can just use .x or .y directly
You can read more about them in the Geodjango documentation.
I recommend the Django-Geoposition app; as an extra, you can see the latitude and longitude in Google Maps.
I think this answer could help here as well.
--
As @p14z suggests, the best way to save my tuple was an ArrayField. For my example, I just wanted to save a polygon extent, in this format:
(3.7739613717694787, 50.31527681737183, 4.726162032377, 50.49743217278623)
from django.db import models
from django.contrib.postgres.fields import ArrayField

my_tuple_field = ArrayField(models.FloatField(), size=4, null=True)
