Loading data from a (MySQL) database into Django without models

Loading data from a (MySQL) database into Django without models - python

This might sound like a bit of an odd question - but is it possible to load data from a (in this case MySQL) table to be used in Django without the need for a model to be present?
I realise this isn't really the Django way, but given my current scenario, I don't really know how better to solve the problem.
I'm working on a site, which for one aspect makes use of a table of data which has been bought from a third party. The columns of interest are liklely to remain stable, however the structure of the table could change with subsequent updates to the data set. The table is also massive (in terms of columns) - so I'm not keen on typing out each field in the model one-by-one. I'd also like to leave the table intact - so coming up with a model which represents the set of columns I am interested in is not really an ideal solution.
Ideally, I want to have this table in a database somewhere (possibly separate to the main site database) and access its contents directly using SQL.

You can always execute raw SQL directly against the database: see the docs.

There is one feature called inspectdb in Django. for legacy databases like MySQL , it creates models automatically by inspecting your db tables. it stored in our app files as models.py. so we don't need to type all column manually.But read the documentation carefully before creating the models because it may affect the DB data ...i hope this will be useful for you.

I guess you can use any SQL library available for Python. For example : http://www.sqlalchemy.org/
You have just then to connect to your database, perform your request and use the datas at your will. I think you can't use Django without their model system, but nothing prevents you from using another library for this in parallel.

Related

What should I use to enter data to my database on Django? Django admin or SQL code?

I am a newbie in programming, but now I connected my project with PostgreSQL. I learned the way to enter by SQL code and also found out that we can actually enter /adming (by creating the superuser and add data there). So which one is widely used in webdev?

It will depend completely on your application.
You can add rows to a table using SQL if that's the easiest way for you. Or you can add rows by creating new object instances in Python code and .save()ing them. Or you can create instances through a CreateView or through the Django admin.
Adding data with SQL has the drawback that you will lise the benefit of any validators declared on the model's fields. YOu may end up with data stored in your SQL tables which your app regards as "impossible", which may cause you minor or even major difficulties.
I have several times written management commands which all have the same general format. For each "row" in a data source (often a spreadsheet) construct one or more Django objects and save them. You can process each data "row" within a transaction (with transaction.atomic()) so if anything goes wrong, the data row is not committed. Or you can treat the entire process as a single transaction (not recommended for vast numbers of "rows", though)·

How to populate my django database with json that I scraped from a website

I have scraped data from a website using their API on a Django application. The data is JSON (a Python dictionary when I retrieve it on my end). The data has many, many fields. I want to store them in a database, so that I can create endpoints that will allow for lookup and modifications (updates). I need to use their fields to create the structure of my database. Any help on this issue or on how to tackle it would be greatly appreciated. I apologize if my question is not concise enough, please let me know if there is anything I need to specify.
I have seen many, many people saying to just populate it, such as this example How to populate a Django sqlite3 database. The issue is, there are so many fields that I cannot go and actually create the django model fields myself. From what I have read, it seems like I may be able to use serializers.ModelSerializer, although that seems to just populate a pre-existing db with already defined model.

Tricky to answer without details, but I would consider doing this in two steps - first, convert your json data to a database schema, for example using a tool like sqlify: https://sqlify.io/convert/json/to/sqlite
Then, create a database from the generated schema file, and use inspectdb to generate your django models: https://docs.djangoproject.com/en/2.2/ref/django-admin/#inspectdb
You'll probably need to tweak the generated schema and/or models, but this should go a long way towards automating the process.

I would go for a document database, like Elasticsearch or MongoDB.
Those are made for this kind of situation, look it up.

Recommendation for manipulating data with python vs. SQL for a django app

Background:
I am developing a Django app for a business application that takes client data and displays charts in a dashboard. I have large databases full of raw information such as part sales by customer, and I will use that to populate the analyses. I have been able to do this very nicely in the past using python with pandas, xlsxwriter, etc., and am now in the process of replicating what I have done in the past in this web app. I am using a PostgreSQL database to store the data, and then using Django to build the app and fusioncharts for the visualization. In order to get the information into Postgres, I am using a python script with sqlalchemy, which does a great job.
The question:
There are two ways I can manipulate the data that will be populating the charts. 1) I can use the same script that exports the data to postgres to arrange the data as I like it before it is exported. For instance, in certain cases I need to group the data by some parameter (by customer for instance), then perform calculations on the groups by columns. I could do this for each different slice I want and then export different tables for each model class to postgres.
2) I can upload the entire database to postgres and manipulate it later with django commands that produce SQL queries.
I am much more comfortable doing it up front with python because I have been doing it that way for a while. I also understand that django's queries are little more difficult to implement. However, doing it with python would mean that I will need more tables (because I will have grouped them in different ways), and I don't want to do it the way I know just because it is easier, if uploading a single database and using django/SQL queries would be more efficient in the long run.
Any thoughts or suggestions are appreciated.

Well, it's the usual tradeoff between performances and flexibility. With the first approach you get better performances (your schema is taylored for the exact queries you want to run) but lacks flexibility (if you need to add more queries the scheam might not match so well - or even not match at all - in which case you'll have to repopulate the database, possibly from raw sources, with an updated schema), with the second one you (hopefully) have a well normalized schema but one that makes queries much more complex and much more heavy on the database server.
Now the question is: do you really have to choose ? You could also have both the fully normalized data AND the denormalized (pre-processed) data alongside.
As a side note: Django ORM is indeed most of a "80/20" tool - it's designed to make the 80% simple queries super easy (much easier than say SQLAlchemy), and then it becomes a bit of a PITA indeed - but nothing forces you to use django's ORM for everything (you can always drop down to raw sql or use SQLAlchemy alongside).
Oh and yes: your problem is nothing new - you may want to read about OLAP

Django: Bypassing the database abstraction

I have been playing arround with django for a couple of days and it seems great, but I find it a pain if I want to change the structure of my database, I then am stuck with a few rather awkward options.
Is there a way to completely bypass djangos database abstraction so if I change the structure of the database I dont have to guess what model would have generated it or use a tool (south or ...) to change things?
I essentially want this: https://docs.djangoproject.com/en/dev/topics/db/sql/ (Raw SQL Queries) but instead of refering to a model, refering to an external database.
Could I just create an empty model and then only perform raw queries on it? (and set up the DB externally)
Thanks
P.S. I dont really mind if I have separate databases for the admin stuff and the app data

It's in your question already, just read the docs article from here: Executing custom SQL directly

Migration to GAE

What is the best way to migrate MySQL tables to Google Datastore and create python models for them?
I have a PHP+MySQL project that I want to migrate to Python+GAE project. So far the big obstacle is migrating the tables and creating corresponding models. Each table is about 110 columns wide. Creating a model for the table manually is a bit tedious, let alone creating a loader and importing a generated csv table representation.
Is there a more efficient way for me to do the migration?

In general, generating your models automatically shouldn't be too difficult. Suppose you have a csv file for each table, with lines consisting of (field name, data type), then something like this would do the job:
# Maps MySQL types to Datastore property classes
type_map = {
'char': 'StringProperty',
'text': 'TextProperty',
'int': 'IntegerProperty',
# ...
}
def generate_model_class(classname, definition_file):
ret = []
ret.append("class %s(db.Model):" % (classname,))
for fieldname, type in csv.reader(open(definition_file)):
ret.append(" %s = db.%s()" % (fieldname, type_map[type]))
return "\n".join(ret)
Once you've defined your schema, you can bulk load directly from the DB - no need for intermediate CSV files. See my blog post on the subject.

approcket can mysql⇌gae or gae builtin remote api from google

In your shoes, I'd write a one-shot Python script to read the existing MySQL schema (with MySQLdb), generating a models.py to match (then do some manual checks and edits on the generated code, just in case). That's assuming that a data model with "about 110" properties per entity is something you're happy with and want to preserve, of course; it might be worth to take the opportunity to break things up a bit (indeed you may have to if your current approach also relies on joins or other SQL features GAE doesn't give you), but that of course requires more manual work.
Once the data model is in place, bulk loading can happen, typically via intermediate CSV files (there are several ways you can generate those).

you don't need to
http://code.google.com/apis/sql/
:)

You could migrate them to django models first
In particular use
python manage.py inspectdb > models.py
And edit models.py until satisfied. You might have to put ForeignKeys in, adjusts the length of CharFields etc.
I've converted several legacy databases to django like this with good success.
Django models however are different to GAE models (which I'm not very familiar with) so that may not be terribly helpful I don't know!

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.