Storing an Object in a Database - python

I have little to no experience with databases and I'm wondering how I would go about storing certain parts of an object.
Let's say I have an object like the following, where steps can be an arbitrary length. How would I store these steps, or list of steps, in an SQL database?
class Error:
    name = ""   # name of the error
    steps = []  # steps to take to attempt to solve the error

For your example you would create a table called Errors with metadata about the error, such as an error_ID as the primary key, a name, a date created, and so on. Then you'd create another table called Steps with its own id, let's say Step_ID, and any fields related to the step. The important part is that you'd add a field on the Steps table that relates each step back to the Error it belongs to; call that field error_ID as well. Then you'd make that field a foreign key so the database enforces the constraint.
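A concrete sketch of that two-table design, using Python's built-in sqlite3 module (the table and column names follow the answer above; adapt the DDL to whatever SQL database you actually use):

import sqlite3

conn = sqlite3.connect('errors.db')
conn.execute('PRAGMA foreign_keys = ON')  # SQLite only enforces FKs with this on
conn.executescript("""
    CREATE TABLE Errors (
        error_ID     INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        date_created TEXT
    );
    CREATE TABLE Steps (
        Step_ID     INTEGER PRIMARY KEY,
        error_ID    INTEGER NOT NULL REFERENCES Errors(error_ID),
        step_number INTEGER,   -- keeps the steps in order
        description TEXT
    );
""")
conn.commit()

An arbitrary number of Steps rows can then point at one Errors row, which is exactly how the variable-length list of steps is stored.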

If you want to store your Python objects in a database (or objects from any other language), the place to start is a good ORM (Object-Relational Mapper). For example, Django has a built-in ORM. This link has a comparison of some Python object-relational mappers.
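For example, with Django's ORM the Error class above might map to models along these lines (a sketch, not the only way to lay it out):

from django.db import models

class Error(models.Model):
    name = models.CharField(max_length=200)  # name of the error

class Step(models.Model):
    # Foreign key back to Error; error.steps.all() gives the list of steps.
    error = models.ForeignKey(Error, related_name='steps',
                              on_delete=models.CASCADE)
    text = models.TextField()  # one step to take to attempt to solve the error

The ORM then translates Error/Step objects to and from the two underlying tables for you.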

Related

How to access these tables more efficiently in django

We have the following database schema to store different types of data.
DataDefinition: basic information about the new data.
*FieldDefinition: every DataDefinition has some fields. Every field has a type, title, etc.; that information is stored here. Every DataDefinition has more than one FieldDefinition associated with it. I have put '*' because we have a lot of different models, one for every kind of field supported.
DataValue, *FieldValues: we store the definition and the values in different models.
With this setup, retrieving a data object from our database takes a lot of queries:
Retrieve the DataDefinition.
Retrieve the DataValue.
Retrieve the *FieldDefinition associated to that DataDefinition.
Retrieve all the *FieldValues associated to those *FieldDefinition.
So, if n is the average number of fields of a DataDefinition, we need to make 2*n+2 queries to the database to retrieve a single value.
We cannot change this setup, but the queries are quite slow. So, to speed things up, I have thought of the following: store a joined version of the tables (a sketch of what that might look like follows this question). I do not know if this is possible, but I cannot think of any other way. Any suggestions?
Update: we are already using prefetch_related and select_related and it's still slow.
Use case right now: get an entire data object from one object value:
someValue = SomeTypeValue.objects.filter(value=value).select_related('DataValue', 'DataDefinition')[0]
# repeated for each concrete *FieldDefinition/*FieldValue model
definition = SomeFieldDefinition.objects.filter(field_definition__id=someValue.data_value.data_definition.id)
value = SomeFieldValue.objects.filter(field_definition__id=definition.id)
And with that info you can now build the entire data object.
Django: 1.11.20
Python: 2.7
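One hedged sketch of the "joined version" idea mentioned above (all model and field names here are assumptions, not part of the original setup): keep a denormalized cache table with one row per field of a data object, refreshed whenever the underlying *FieldValue rows change, so a whole object comes back in a single query.

from django.db import models

class FlattenedField(models.Model):
    # Denormalized copy of one *FieldDefinition/*FieldValue pair.
    data_definition_id = models.IntegerField(db_index=True)
    data_value_id = models.IntegerField(db_index=True)
    field_title = models.CharField(max_length=255)
    field_type = models.CharField(max_length=50)
    raw_value = models.TextField()

Fetching every field of one data object then becomes a single query, e.g. FlattenedField.objects.filter(data_value_id=some_id), at the cost of keeping the cache in sync on writes.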

Django: models setup for using one database to populate another

I'm rebuilding a personal project in Django (a family tree), and I'm working on migrating the actual data from the old awkward database to my new model/schema; both are Postgres databases. I've defined them in the DATABASES setting in settings.py as 'default' and 'source'.
I've made my models and I'd like to copy the records from the old database into their corresponding table in the new database, but I'm not quite understanding how to set up the models for it to work, since the Django code uses the models/ORM to access/update/create objects, and I only have models reflecting the new schema, not the old ones.
In a coincidental case where I have a table with the exact same schema in the old and new database, I have a management command that can grab the old records from the source using my new ImagePerson model (ported_data = ImagePerson.objects.using('source').all()), since the expected fields are the same. Then I save objects for them in the 'default' database: (obj, created_bool) = ImagePerson.objects.using('default').get_or_create(field=fieldvalue, etc), and it works just like I need it to.
However when I have a table where the old version is missing fields that my new model/table have, I can't use the model to access those records (which makes sense). Am I supposed to also make some kind of legacy version of each model for use in the migration? I saw a tutorial mention running ./manage.py inspectdb --database=source > models.py, but doing so didn't seem to add anything else to my file (and it would seem weird to save temporary/legacy models in there anyway). What's the right way to access the old-formatted records? Is the ORM right?
To give a specific example, I have a Notes table to hold a memory about a specific person or about a specific family. The old table used a 'type' field (1 was for person note, 2 was for family note), and a ref_id that would be the id for the person or family the note applies to. The new table instead has a person_id field and a family_id field.
I'd like my management command to be able to pull all the records from the source table, then, if type=1, look up the person with id equal to the ref_id field and save a new object in the new database pointing at that person. I can grab the records using the new Notes model with the old database like this: ported_notes = Note.objects.using('source').all(), but then if I try to access any field (like print(note_row.body)), I get an error that the result object is missing the person_id column:
django.db.utils.ProgrammingError: column notes.person_id does not exist
What's the right way to approach this?
Creating models for your old schema definitely doesn't seem like the right approach.
One solution would be to write a data migration, where you could use raw SQL to fetch your old data and then use the ORM to write it to your new tables/models:
from django.db import migrations, connections

def transfer_data(apps, schema_editor):
    ModelForNewDB = apps.get_model('yourappname', 'ModelForNewDB')
    # Fetch your old data
    with connections['my_old_db'].cursor() as cursor:
        cursor.execute('select * from some_table')
        data = cursor.fetchall()
    # Write it to your new models
    for datum in data:
        # do something with the data / add any
        # additional values needed.
        ModelForNewDB.objects.create(...)

class Migration(migrations.Migration):
    dependencies = [
        ('yourappname', '0001_initial'),
    ]
    operations = [
        migrations.RunPython(transfer_data),
    ]
Then simply run your migrations. One thing to note, however: if you have ForeignKeys between tables, you will need to be careful about the order in which the migrations run. This can be done by editing your dependencies. You may even have to add a migration that allows null values for some foreign keys, and then add another one afterwards to correct this.
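Applied to the Notes example from the question, the same pattern might look like the sketch below. The old table and column names (notes, type, ref_id, body) and the Person/Family model names are assumptions based on the description above, not confirmed details:

from django.db import connections

def transfer_notes(apps, schema_editor):
    Note = apps.get_model('yourappname', 'Note')
    Person = apps.get_model('yourappname', 'Person')
    Family = apps.get_model('yourappname', 'Family')

    # Raw SQL against the old database, so the person_id/family_id
    # columns that the old table lacks never come into play.
    with connections['source'].cursor() as cursor:
        cursor.execute('SELECT type, ref_id, body FROM notes')
        rows = cursor.fetchall()

    for note_type, ref_id, body in rows:
        if note_type == 1:  # old convention: 1 = person note
            Note.objects.create(person=Person.objects.get(pk=ref_id), body=body)
        else:               # old convention: 2 = family note
            Note.objects.create(family=Family.objects.get(pk=ref_id), body=body)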

Is there a way to standardize the business datatypes across all systems in organization?

Can we do a loosely coupled data access layer design in Python?
Let's say, in a scenario, I have an Oracle table with a column named ACTIVITY_ID whose datatype is Number(10). If this column is a foreign key in many tables, then, to hold the data of this column, can I create something like an ACTID class (like a Java object) that is used across the code whenever I want to manipulate/hold ACTIVITY_ID column data, so that I can maintain consistency of business object columns? Is there any such possibility in Python?
Try Django
As I understand it, apart from the low-level DB-API (and the built-in sqlite3 module), Python does not natively provide higher-level database functionality; many different libraries/frameworks can be used to add it. I recommend taking a look at Django. With Django, you create a class for each database table, and Django hides a LOT of the details, including support for multiple database engines such as MySQL and PostgreSQL. Django handles foreign key relationships very well. Every table normally has a primary key, by default an auto-incremented id field. If you add a field like activity = models.ForeignKey(Activity), then you now have a foreign key field activity_id in one table referencing the primary key field id in the Activity table. The admin page will take care of cascading deletion of records if you delete an Activity record, and in general things "just work" the way you might expect them to.
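A minimal sketch of that pattern, with hypothetical Activity and Task models standing in for your business objects:

from django.db import models

class Activity(models.Model):
    name = models.CharField(max_length=100)

class Task(models.Model):
    # Stored as an integer activity_id column referencing Activity.id,
    # but handled in code as a full Activity object.
    activity = models.ForeignKey(Activity, on_delete=models.CASCADE)

Your code then passes around Activity instances instead of bare integers, which gives the cross-codebase consistency the question asks about.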

Extract metadata from database using python?

I have looked for an answer to this question, but was able to find very little. I want to extract the names of the tables, the references between them, and the column names, so I can visualize that information graphically. I need to do this in a Django project.
Since I am a newbie to Python, I would like to know if there is some kind of API to do this type of thing.
Edit
I have created a model which consists of Node, Attribute and Link. Node has attributes, while Link has fields parent_node and child_node. What I want is to connect to a database and read the metadata, by which I mean: table names, column names and foreign key constraints. Then I could properly put this data into the model I have created.
You can use the inspectdb command; with this, Django reads your database and creates models for each table, and if your database has relations you get those in Django as well. You can see more info here.
python manage.py inspectdb
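If you want the metadata itself rather than generated models, Django's database introspection API can also be queried directly. A hedged sketch (exact return formats vary across Django versions and database backends):

from django.db import connection

with connection.cursor() as cursor:
    introspection = connection.introspection
    for table in introspection.table_names(cursor):
        print('Table:', table)
        for column in introspection.get_table_description(cursor, table):
            print('  Column:', column.name)
        # get_relations() maps a column to the (column, table) it references.
        for col, target in introspection.get_relations(cursor, table).items():
            print('  FK:', col, '->', target)

The table names, column names and foreign keys printed here are exactly the pieces needed to populate the Node/Attribute/Link model described in the edit.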

telling django to use an integer as a foreign key on demand

I'm new to Django and currently defining an existing database schema as a Django model.
I have a central repository to store values like:
(some_qualifier_id_1, some_qualifier_id_2, measure_id, value)
The value is, database-wise, an integer. It can, however, refer to categorical data, in which case I want to link it to another table which gives me additional information, like the string that should be displayed instead of the number, or ordering information.
Can I tell Django to create a link to a table using value as a foreign key sometimes?
Update:
Using the int to get the category: yes, that's what I do. However, the point of the model layer, as far as I understand it, is to tell Django what the relations between the tables are. Just using an int and looking things up myself would mean hacking it together without telling Django what I'm doing, which will probably mean I have to generate SQL manually instead of using the model layer at some point. Use case: order category values by some ordering field in the Categories table.
If the measure_id value is semantically a foreign key, so that each value must be the id of a row in the measure table, then declare it as such and everything will come together on its own (a sketch follows at the end of this answer).
If the measure_id value is not a foreign key, you can annotate your resultset on demand, using
MyModel.objects.filter(**your_filters).extra(select={
    'measure_name':
        'SELECT measure.name FROM measure WHERE mymodeltable.measure_id = measure.id',
})
Then your retrieved objects will have a 'measure_name' attribute with the joined column.
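For the first case, a hedged sketch of what declaring the relation might look like; the Category and Measurement model names are assumptions, and the ordering use case comes from the question's update:

from django.db import models

class Category(models.Model):
    label = models.CharField(max_length=100)
    ordering = models.IntegerField()

class Measurement(models.Model):
    # Reuses the existing integer column named "value" as the foreign key.
    value = models.ForeignKey(Category, db_column='value',
                              on_delete=models.PROTECT)

Ordering by the Categories table then stays inside the model layer, e.g. Measurement.objects.order_by('value__ordering').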
