SQLAlchemy - Iterate through all mapped tables - Python

I am currently creating a web app in Flask and use SQLAlchemy (not the Flask version) to handle reading from and writing to my MySQL database.
I have about 15 different tables, each mapped to a different declarative class; however, the application is still in its beta stage, so this number will probably increase.
I would like a way to iterate through every single table and run the same command on each one. This is part of an update function: when an admin changes the name of a book, that name change should be reflected in all the other tables where the book is referred to.
Is there a way to iterate through all your SQLAlchemy tables?
Thanks!

Not exactly sure what you want to achieve here, but if you use declarative base, you can try something like this:
tables = Base.__subclasses__()
for t in tables:
    rows = Session.query(t).all()
    for r in rows:
        ...  # do something with each row
This gets all the mapped classes by listing the subclasses of Base. It then queries everything from each table in turn and loops through the selected rows.
However, I do not quite understand why you would want to do this. The way you describe it, you should have a single Book table, and all other tables should link to it whenever they want to reference a book. That is the relational model, rather than duplicating Book information in each and every table and trying to manage it manually; a sketch of that layout follows.
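For illustration, here is a minimal sketch of that relational layout in declarative style (the class, table and column names are hypothetical). Because the other tables store only the book's id, renaming a book means updating a single row:
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class Book(Base):
    __tablename__ = "books"
    id = Column(Integer, primary_key=True)
    name = Column(String(200))

class Review(Base):
    __tablename__ = "reviews"
    id = Column(Integer, primary_key=True)
    book_id = Column(Integer, ForeignKey("books.id"))  # a reference, not a copy
    book = relationship(Book)

# Renaming the book touches exactly one row; every referencing table
# sees the new name through the foreign key, so no iteration is needed:
# session.query(Book).get(1).name = "New Title"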

Related

How to cache an (almost) read-only Flask web app?

I have a Flask web app that has no registered users, but its database is updated daily (therefore the content only changes once a day).
It seems to me the best choice would be to cache the entire website once a day and serve everything from the cache.
I tried Flask-Cache, but a dynamic page is created and then cached for every different user session, which is clearly not ideal since the content is always the same no matter who is browsing the website.
Do you know how I can do better, either with Flask-Cache or using something else?
Perhaps use an in-memory SQLite database? Will look and feel like any regular db, but with memory access speeds.
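As a rough sketch of that idea (the table and query are invented for illustration), the standard-library sqlite3 module keeps the whole database in RAM for the life of the connection:
import sqlite3

# ":memory:" keeps the entire database in RAM for this connection
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (path TEXT PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO pages VALUES (?, ?)", ("/", "<html>...</html>"))

# reads are served from memory; rebuild the connection after the daily update
row = conn.execute("SELECT body FROM pages WHERE path = ?", ("/",)).fetchone()
One caveat: an in-memory SQLite database is private to its connection, so with multiple worker processes each one needs to build its own copy.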
A couple of years ago, I wrote an in-memory database which I called littletable. Tables are represented as lists of objects. Selects and queries are normally done by simple list scans, but common object properties can be indexed. Tables can be joined or pivoted.
The main difference in the littletable model is that there is no separate concept of a table vs. a results list. The result of any query or join is another table. Tables can also store namedtuples and a littletable-defined type called a DataObject. Tables can be imported/exported to CSV files to persist any updates.
There is at least one website that uses littletable to maintain its mostly-static product catalog. You might also find littletable useful for prototyping before creating actual tables in a more common database. Here's a link to the online docs.
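As a quick illustration of that model (the field names here are invented, and the method names follow littletable's docs, so check the current documentation before relying on them):
import littletable as lt

catalog = lt.Table("catalog")
catalog.create_index("sku", unique=True)

catalog.insert(lt.DataObject(sku="0001", description="Widget", price=9.99))
catalog.insert(lt.DataObject(sku="0002", description="Gadget", price=19.99))

# indexed access reads like a dict lookup
widget = catalog.by.sku["0001"]

# the result of a query is just another table
widgets = catalog.where(description="Widget")

# persist any updates to CSV
catalog.csv_export("catalog.csv")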

Bulk insert with multiprocessing using peewee

I'm working on a simple HTML scraper in Python 3.4, using peewee as the ORM (great ORM, btw!). My script takes a bunch of sites, extracts the necessary data and saves it to the database; however, every site is scraped in a detached process to improve performance, and the saved data should be unique. There can be duplicate data not only between sites but also on a particular site, so I want to store it only once.
Example:
Post and Category - a many-to-many relation. During scraping, the same category appears multiple times in different posts. The first time, I want to save that category to the database (create a new row). If the same category shows up in a different post, I want to bind that post to the already created row in the db.
My question is: do I have to use atomic updates/inserts (insert one post, save, get_or_create categories, save, insert new rows into the many-to-many table, save), or can I use a bulk insert somehow? What is the fastest solution to this problem? Maybe some temporary tables shared between processes, to be bulk inserted at the end of the work? I'm using a MySQL db.
Thanks for your answers and your time.
You can rely on the database to enforce uniqueness by adding unique=True to fields or by declaring multi-column unique indexes. Also check the docs on get/create and bulk inserts (a sketch of the pattern follows the links below):
http://docs.peewee-orm.com/en/latest/peewee/models.html#indexes-and-unique-constraints
http://docs.peewee-orm.com/en/latest/peewee/querying.html#get-or-create
http://docs.peewee-orm.com/en/latest/peewee/querying.html#bulk-inserts
http://docs.peewee-orm.com/en/latest/peewee/querying.html#upsert - upsert with on conflict
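Putting those pieces together, a minimal, hypothetical sketch of the Post/Category layout might look like this (the field names and connection settings are invented): a unique constraint on the category name plus get_or_create gives the "create once, then reuse" behaviour, and a multi-column unique index on the through table stops duplicate links:
from peewee import CharField, ForeignKeyField, Model, MySQLDatabase

db = MySQLDatabase("scraper")  # hypothetical connection settings

class BaseModel(Model):
    class Meta:
        database = db

class Post(BaseModel):
    url = CharField(unique=True)

class Category(BaseModel):
    name = CharField(unique=True)  # the database enforces uniqueness

class PostCategory(BaseModel):
    post = ForeignKeyField(Post)
    category = ForeignKeyField(Category)

    class Meta:
        # multi-column unique index: one link per (post, category) pair
        indexes = ((("post", "category"), True),)

# returns the existing row, or creates it if missing
category, created = Category.get_or_create(name="python")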
Looked for this myself for a while, but found it!
You can use the on_conflict_replace() or on_conflict_ignore() methods to define the behaviour for when a record already exists in a table that has a uniqueness constraint.
PriceData.insert_many(values).on_conflict_replace().execute()
or
PriceData.insert_many(values).on_conflict_ignore().execute()
More info under "Upsert" here

Loading data from a (MySQL) database into Django without models

This might sound like a bit of an odd question - but is it possible to load data from a (in this case MySQL) table to be used in Django without the need for a model to be present?
I realise this isn't really the Django way, but given my current scenario, I don't really know how better to solve the problem.
I'm working on a site, one aspect of which makes use of a table of data which has been bought from a third party. The columns of interest are likely to remain stable; however, the structure of the table could change with subsequent updates to the data set. The table is also massive (in terms of columns), so I'm not keen on typing out each field in the model one by one. I'd also like to leave the table intact, so coming up with a model which represents the set of columns I am interested in is not really an ideal solution.
Ideally, I want to have this table in a database somewhere (possibly separate to the main site database) and access its contents directly using SQL.
You can always execute raw SQL directly against the database: see the docs.
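For illustration (the table name is hypothetical), Django's connection object lets you run raw SQL with no model at all, and reading the column names from cursor.description keeps the code working even if the third-party table gains or reorders columns:
from django.db import connection

def fetch_catalog_rows():
    with connection.cursor() as cursor:
        cursor.execute("SELECT * FROM third_party_catalog")
        # cursor.description carries the column names returned by the query
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
If the table lives in a separate database, the same pattern works through django.db.connections["alias"] once the alias is added to the DATABASES setting.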
Django has a feature called inspectdb. For legacy databases like MySQL, it creates models automatically by inspecting your db tables; the output can be saved into your app as models.py (for example, python manage.py inspectdb > models.py), so you don't need to type every column manually. But read the documentation carefully before creating the models, because it may affect the DB data. I hope this will be useful for you.
I guess you can use any SQL library available for Python. For example: http://www.sqlalchemy.org/
You just have to connect to your database, perform your query and use the data as you wish. I don't think you can use Django's ORM without its model system, but nothing prevents you from using another library for this in parallel.

Python: Dumping Database Data with Peewee

Background
I am looking for a way to dump the results of MySQL queries made with Python & Peewee to an Excel file, including database column headers. I'd like the exported content to be laid out in a near-identical order to the columns in the database. Furthermore, I'd like a way for this to work across multiple similar databases that may have slightly differing fields. To clarify, one database may have a user table containing "User, PasswordHash, DOB, [...]", while another has "User, PasswordHash, Name, DOB, [...]".
The Problem
My primary problem is getting the column headers out in an ordered fashion. All attempts thus far have produced unordered results, and all of them have been less than elegant.
Second, my methodology thus far has resulted in code which I'd (personally) hate to maintain, which I know is a bad sign.
Work so far
At present, I have used Peewee's pwiz.py script to generate the models for each of the preexisting tables in the target databases, then went through and entered all the primary and foreign keys. The relations are set up, and some brief tests showed they're associating properly.
Code: I've managed to get the column headers out using something similar to:
for i, column in enumerate(User._meta.get_field_names()):
    ws.cell(row=0, column=i).value = column
As mentioned, this is unordered. Also, doing it this way forces me to do something along the lines of
getattr(some_object, title)
to dynamically populate the fields accordingly.
Thoughts and Possible Solutions
Manually write out the order that I want things in as an array, and use that to loop through and populate the data. The pro of this is very strict/granular control. The con is that I'd need to specify this for every database.
Create (whether manually or via a method) a hash of all possibly encountered fields, each with an associated weight, then write a method for sorting "_meta.get_field_names()" by weight. The con of this is that the columns may not end up 100% in the right order, such as Name coming before DOB in one DB while coming after it in another.
Feel free to tell me I'm doing it all wrong or to suggest completely different ways of doing this; I'm all ears. I'm very much new to Python and Peewee (ORMs in general, actually). I could switch back to Perl and do the database querying via DBI with little to no hassle; however, its libraries for Excel would cause me just as many problems, and I'd like to take this as a chance to expand my knowledge.
There is a method on the model meta you can use:
for field in User._meta.get_sorted_fields():
    print(field.name)
This will print the field names in the order they are declared on the model.
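Combining that with the getattr approach from the question, a minimal export sketch might look like this (it assumes openpyxl, whose cells are 1-indexed, and the get_sorted_fields() method shown above):
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

fields = [f.name for f in User._meta.get_sorted_fields()]

# header row, in the order the fields are declared on the model
for col, name in enumerate(fields, start=1):
    ws.cell(row=1, column=col).value = name

# one row per record, populated dynamically via getattr
for row, user in enumerate(User.select(), start=2):
    for col, name in enumerate(fields, start=1):
        ws.cell(row=row, column=col).value = getattr(user, name)

wb.save("users.xlsx")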

Work with Postgres/PostGIS View in SQLAlchemy

Two questions:
I want to generate a view in my PostGIS DB. How do I add this view to my geometry_columns table?
What do I have to do to use a view with SQLAlchemy? Is there a difference between a table and a view to SQLAlchemy, or can I use a view the same way I would use a table?
Sorry for my poor English.
If there are questions about my question, please feel free to ask so I can try to explain it in another way, maybe :)
Nico
Table objects in SQLAlchemy have two roles. They can be used to issue DDL commands to create the table in the database. But their main purpose is to describe the columns and types of tabular data that can be selected from and inserted into.
If you only want to select, then a view looks to SQLAlchemy exactly like a regular table. It's enough to describe the view as a Table with the columns that interest you (you don't even need to describe all of the columns). If you want to use the ORM you'll need to declare for SQLAlchemy that some combination of the columns can be used as the primary key (anything that's unique will do). Declaring some columns as foreign keys will also make it easier to set up any relations. If you don't issue create for that Table object, then it is just metadata for SQLAlchemy to know how to query the database.
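For illustration, a select-only view might be described like this (the view and column names are hypothetical, and no create() is ever issued for the Table):
from sqlalchemy import Column, Integer, MetaData, String, Table

metadata = MetaData()

# describes an existing database view; SQLAlchemy will treat it like a table
parcels_view = Table(
    "parcels_view", metadata,
    Column("parcel_id", Integer, primary_key=True),  # candidate key for the ORM
    Column("owner", String),
)

# plain selects work exactly as they would against a real table
query = parcels_view.select().where(parcels_view.c.owner == "Nico")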
If you also want to insert to the view, then you'll need to create PostgreSQL rules or triggers on the view that redirect the writes to the correct location. I'm not aware of a good usage recipe to redirect writes on the Python side.
