In my project (based on PostgreSQL) I need to export a few models as an SQLite dump. The export must be made on demand, e.g. on user request.
I can prepare an appropriate database manually, but I would like to avoid duplicating the schema information. I dream of a solution like 'dumpdata app-name', but with SQLite output instead of JSON/XML/YAML.
Is there such a solution?
P.S. To anyone inclined to close this: it's not a broad question. There are only two possibilities: either such a snippet, helper, etc. exists, or it does not and I have to build it myself. I can't find it on my own, so I'm asking for help.
To sum up the details (some people could not figure them out and might put my question 'on hold'):
there is a Django project with a main PostgreSQL database
I'm already processing requests from users (through an API)
one of the requests is "make a dump of some models (tables) in SQLite format"
I can prepare a temporary SQLite database and fill it with data manually
I'm looking for a powerful, universal tool (solution) that will do such an export automatically (from some of my Django models to SQLite)
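One possible shape for such an export, as a minimal sketch rather than an existing Django feature: copy the rows into an in-memory SQLite database with the standard-library sqlite3 module and serialize it with Connection.iterdump(). The helper name rows_to_sqlite_dump and its arguments are hypothetical. In a Django project the rows could come from something like Model.objects.values_list(*columns). Columns are created untyped here for brevity, and table and column names are assumed to be trusted.

```python
import sqlite3


def rows_to_sqlite_dump(table, columns, rows):
    """Return an SQLite SQL dump (as text) for one table.

    Hypothetical helper: `table` and `columns` are assumed to be trusted
    identifiers, `rows` is an iterable of tuples matching `columns`.
    """
    conn = sqlite3.connect(":memory:")
    try:
        column_sql = ", ".join('"%s"' % name for name in columns)
        conn.execute('CREATE TABLE "%s" (%s)' % (table, column_sql))
        placeholders = ", ".join("?" for _ in columns)
        conn.executemany(
            'INSERT INTO "%s" VALUES (%s)' % (table, placeholders), rows
        )
        conn.commit()
        # iterdump() yields the SQL statements that recreate the database.
        return "\n".join(conn.iterdump())
    finally:
        conn.close()


if __name__ == "__main__":
    print(rows_to_sqlite_dump("person", ["id", "name"], [(1, "john"), (2, "jemma")]))
```

The returned text can be written to a file and loaded elsewhere with executescript() or the sqlite3 command-line shell.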
Related
I need to dynamically create database tables depending on user requirements. So apart from a few predefined databases, all other databases should be created at runtime after taking the table characteristics (such as number of columns, primary key, etc.) from the user.
I read a bit of the docs and know about django.db.connection, but all the examples there are only for adding data to a database, not creating tables. (ref: https://docs.djangoproject.com/en/4.0/topics/db/sql/#executing-custom-sql-directly)
So is there any way to create tables without models in Django? This condition is a must, so if it is not possible with Django, which other framework should I look at?
Note: I am not good at writing questions, so ask if any other info is needed.
Thanks!
You can use inspectdb to automatically generate the models from the legacy database. You can read about it here.
Or you can use SQL directly, although you will then have to process the tables in Python. Check it here.
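As a rough sketch of the "use SQL directly" route: build a CREATE TABLE statement from the user-supplied table description and validate every identifier and type first, since identifiers cannot be passed as query parameters. The function name build_create_table, the type whitelist, and the identifier rule are assumptions made for illustration. In Django the resulting string could be passed to cursor.execute() on a cursor obtained from django.db.connection.cursor().

```python
import re

# Hypothetical whitelist; extend it to whatever your database supports.
ALLOWED_TYPES = {"INTEGER", "TEXT", "REAL"}
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_create_table(name, columns, primary_key=None):
    """Build a CREATE TABLE statement from user input.

    `columns` is a list of (column name, type name) pairs.
    Raises ValueError for anything that is not a plain identifier
    or a whitelisted type, so user input never reaches the SQL unchecked.
    """
    if not IDENTIFIER.fullmatch(name):
        raise ValueError("invalid table name: %r" % (name,))
    parts = []
    for column, column_type in columns:
        if not IDENTIFIER.fullmatch(column):
            raise ValueError("invalid column name: %r" % (column,))
        if column_type.upper() not in ALLOWED_TYPES:
            raise ValueError("unsupported type: %r" % (column_type,))
        suffix = " PRIMARY KEY" if column == primary_key else ""
        parts.append('"%s" %s%s' % (column, column_type.upper(), suffix))
    return 'CREATE TABLE "%s" (%s)' % (name, ", ".join(parts))


if __name__ == "__main__":
    print(build_create_table("orders", [("id", "integer"), ("label", "text")], "id"))
```

Keeping statement generation separate from execution makes the validation easy to test without a database.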
I have scraped data from a website using their API on a Django application. The data is JSON (a Python dictionary when I retrieve it on my end). The data has many, many fields. I want to store them in a database, so that I can create endpoints that will allow for lookup and modifications (updates). I need to use their fields to create the structure of my database. Any help on this issue or on how to tackle it would be greatly appreciated. I apologize if my question is not concise enough, please let me know if there is anything I need to specify.
I have seen many, many people saying to just populate it, as in this example: "How to populate a Django sqlite3 database". The issue is that there are so many fields that I cannot go and create the Django model fields myself. From what I have read, it seems I may be able to use serializers.ModelSerializer, although that seems to just populate a pre-existing database with an already defined model.
Tricky to answer without details, but I would consider doing this in two steps. First, convert your JSON data to a database schema, for example using a tool like sqlify: https://sqlify.io/convert/json/to/sqlite
Then, create a database from the generated schema file, and use inspectdb to generate your django models: https://docs.djangoproject.com/en/2.2/ref/django-admin/#inspectdb
You'll probably need to tweak the generated schema and/or models, but this should go a long way towards automating the process.
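If an external converter is not an option, the first step can also be approximated by hand. The sketch below is a swapped-in standard-library approach, not what sqlify does: it infers an SQLite column type for each key of one sample record. The function name infer_create_table is hypothetical, and it assumes the sample record is flat and representative; nested lists, dicts, and None values fall back to TEXT.

```python
def infer_create_table(table, record):
    """Guess a CREATE TABLE statement from one decoded JSON object."""
    columns = []
    for key, value in record.items():
        # bool must be checked before int: bool is a subclass of int.
        if isinstance(value, bool):
            sql_type = "INTEGER"
        elif isinstance(value, int):
            sql_type = "INTEGER"
        elif isinstance(value, float):
            sql_type = "REAL"
        else:
            # Strings, None, and nested structures are stored as text.
            sql_type = "TEXT"
        # Double any embedded quote so the key is a valid quoted identifier.
        columns.append('"%s" %s' % (key.replace('"', '""'), sql_type))
    return 'CREATE TABLE "%s" (%s)' % (table, ", ".join(columns))


if __name__ == "__main__":
    sample = {"id": 7, "name": "x", "price": 1.5, "active": True, "tags": ["a"]}
    print(infer_create_table("item", sample))
```

Running the generated statement against an empty SQLite file gives inspectdb something to read in the second step.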
I would go for a document database, like Elasticsearch or MongoDB.
Those are made for this kind of situation, look it up.
I'm working on a project that I inherited, and I want to add a table to my database that is very similar to one that already exists. Basically, we have a table to log users for our website, and I want to create a second table to specifically log users that our site fails to do a task for.
Since I didn't write the site myself, and am pretty new to both SQL and Django, I'm a little paranoid about running a migration (we have a lot of really sensitive data that I'm paranoid about wiping).
Instead of having a Django migration create the table itself, can I create the second table in MySQL and the corresponding model in Django, and then have this model "recognize" the SQL table without explicitly using a migration?
SHORT ANSWER: Yes.
MEDIUM ANSWER: Yes. But you will have to figure out how Django would have created the table, and do it by hand. That's not terribly hard.
Django may also spit out some warnings on startup about migrations being needed...but those are warnings, and if the app works, then you're OK.
LONG ANSWER: Yes. But for the sake of your sanity and sleep quality, get a completely separate development environment and test your backups. (But you knew that already.)
Note: Scroll down to the Background section for useful details. Assume the project uses Python-Django and South, in the following illustration.
What's the best way to import the following CSV:
"john","doe","savings","personal"
"john","doe","savings","business"
"john","doe","checking","personal"
"john","doe","checking","business"
"jemma","donut","checking","personal"
Into a PostgreSQL database with the related tables Person, Account, and AccountType considering:
Admin users can change the database model and CSV import-representation in real-time via a custom UI
The saved CSV-to-Database table/field mappings are used when regular users import CSV files
So far two approaches have been considered
ETL-API Approach: Providing an ETL API with a spreadsheet, my CSV-to-Database table/field mappings, and connection info for the target database. The API would then load the spreadsheet and populate the target database tables. Looking at pygrametl, I don't think what I'm aiming for is possible. In fact, I'm not sure any ETL APIs do this.
Row-level Insert Approach: Parsing the CSV-to-Database table/field mappings, parsing the spreadsheet, and generating SQL inserts in "join-order".
I implemented the second approach but am struggling with algorithm defects and code complexity. Is there a Python ETL API out there that does what I want? Or an approach that doesn't involve reinventing the wheel?
Background
The company I work at is looking to move hundreds of project-specific design spreadsheets hosted in sharepoint into databases. We're near completing a web application that meets the need by allowing an administrator to define/model a database for each project, store spreadsheets in it, and define the browse experience. At this stage of completion transitioning to a commercial tool isn't an option. Think of the web application as a django-admin alternative, though it isn't, with a DB modeling UI, CSV import/export functionality, customizable browse, and modularized code to address project-specific customizations.
The implemented CSV import interface is cumbersome and buggy, so I'm trying to get feedback and find alternative approaches.
How about separating the problem into two separate problems?
Create a Person class which represents a person in the database. This could use Django's ORM, or extend it, or you could do it yourself.
Now you have two issues:
Create a Person instance from a row in the CSV.
Save a Person instance to the database.
Now, instead of just CSV-to-Database, you have CSV-to-Person and Person-to-Database. I think this is conceptually cleaner. When the admins change the schema, that changes the Person-to-Database side. When the admins change the CSV format, they're changing the CSV-to-Person side. Now you can deal with each separately.
Does that help any?
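The split described above might look roughly like this. It is only a sketch using the standard library instead of Django's ORM; the names ImportedRow, rows_from_csv, and save_rows, as well as the two-table layout, are assumptions made for illustration. One function knows only about the CSV layout, the other only about the database layout.

```python
import csv
import io
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportedRow:
    first_name: str
    last_name: str
    account: str
    account_type: str


def rows_from_csv(text):
    """CSV side: turn each non-empty CSV line into an ImportedRow."""
    return [ImportedRow(*fields) for fields in csv.reader(io.StringIO(text)) if fields]


def save_rows(conn, rows):
    """Database side: store rows in normalized person/account tables."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS person (id INTEGER PRIMARY KEY, "
        "first_name TEXT, last_name TEXT, UNIQUE (first_name, last_name))"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS account (id INTEGER PRIMARY KEY, "
        "person_id INTEGER REFERENCES person(id), name TEXT, account_type TEXT)"
    )
    for row in rows:
        # Insert the person only once, then look up the id for the account row.
        conn.execute(
            "INSERT OR IGNORE INTO person (first_name, last_name) VALUES (?, ?)",
            (row.first_name, row.last_name),
        )
        (person_id,) = conn.execute(
            "SELECT id FROM person WHERE first_name = ? AND last_name = ?",
            (row.first_name, row.last_name),
        ).fetchone()
        conn.execute(
            "INSERT INTO account (person_id, name, account_type) VALUES (?, ?, ?)",
            (person_id, row.account, row.account_type),
        )
    conn.commit()
```

A change to the CSV layout then touches only rows_from_csv, and a schema change touches only save_rows.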
I write import subsystems almost every month at work, and because I do that kind of task so much, I wrote django-data-importer some time ago. This importer works like a Django form and has readers for CSV, XLS and XLSX files that give you lists of dicts.
With the data_importer readers you can read a file into lists of dicts, iterate over them with a for loop, and save the lines to the DB.
With the importer you can do the same, but with the bonus of validating each field of a line, logging errors and actions, and saving everything at the end.
Please take a look at https://github.com/chronossc/django-data-importer. I'm pretty sure that it will solve your problem and help you process any kind of CSV file from now on :)
To solve your problem I suggest using data-importer with Celery tasks. You upload the file and fire the import task via a simple interface. The Celery task sends the file to the importer, and you can validate lines, save them, and log errors for them. With some effort you can even present the progress of the task to the users who uploaded the sheet.
I ended up taking a few steps back to address this problem per Occam's razor using updatable SQL views. It meant a few sacrifices:
Removing the South.DB-dependent real-time schema administration API, dynamic model loading, and dynamic ORM syncing
Defining models.py and an initial South migration by hand
This allows for a simple approach to importing flat datasets (CSV/Excel) into a normalized database:
Define unmanaged models in models.py for each spreadsheet
Map those to updatable SQL views (INSERT/UPDATE INSTEAD SQL RULEs) in the initial South migration that adhere to the spreadsheet field layout
Iterate through the CSV/Excel spreadsheet rows and perform an INSERT INTO <VIEW> (<COLUMNS>) VALUES (<CSV-ROW-FIELDS>); for each row
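To illustrate the idea without a PostgreSQL server, here is a sketch that swaps in SQLite's INSTEAD OF INSERT trigger on a view as a stand-in for PostgreSQL's INSTEAD rules. It is not the migration described above. The table, view, and function names (person, account, account_sheet, build_database, load_sheet) are assumptions for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE person (
    id INTEGER PRIMARY KEY,
    first_name TEXT,
    last_name TEXT,
    UNIQUE (first_name, last_name)
);
CREATE TABLE account (
    id INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES person(id),
    name TEXT,
    account_type TEXT
);
-- The view mirrors the flat spreadsheet layout.
CREATE VIEW account_sheet AS
    SELECT p.first_name, p.last_name, a.name AS account, a.account_type
    FROM account a JOIN person p ON p.id = a.person_id;
-- Inserting into the view fans out into the normalized tables.
CREATE TRIGGER account_sheet_insert INSTEAD OF INSERT ON account_sheet
BEGIN
    INSERT OR IGNORE INTO person (first_name, last_name)
    VALUES (NEW.first_name, NEW.last_name);
    INSERT INTO account (person_id, name, account_type)
    VALUES (
        (SELECT id FROM person
         WHERE first_name = NEW.first_name AND last_name = NEW.last_name),
        NEW.account,
        NEW.account_type
    );
END;
"""


def build_database():
    """Create an in-memory database with the tables, view, and trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def load_sheet(conn, rows):
    """Insert flat spreadsheet rows through the view, one INSERT per row."""
    conn.executemany(
        "INSERT INTO account_sheet (first_name, last_name, account, account_type) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
```

The import loop stays trivial because all knowledge of the normalized layout lives in the view definition.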
Here is another approach that I found on GitHub. Basically, it detects the schema and allows overrides. Its whole goal is just to generate raw SQL to be executed by psql or whatever driver you use.
https://github.com/nmccready/csv2psql
% python setup.py install
% csv2psql --schema=public --key=student_id,class_id example/enrolled.csv > enrolled.sql
% psql -f enrolled.sql
There are also a bunch of options for doing alters (creating primary keys from many existing cols) and merging / dumps.
What is the best way to migrate MySQL tables to Google Datastore and create python models for them?
I have a PHP+MySQL project that I want to migrate to Python+GAE project. So far the big obstacle is migrating the tables and creating corresponding models. Each table is about 110 columns wide. Creating a model for the table manually is a bit tedious, let alone creating a loader and importing a generated csv table representation.
Is there a more efficient way for me to do the migration?
In general, generating your models automatically shouldn't be too difficult. Suppose you have a csv file for each table, with lines consisting of (field name, data type), then something like this would do the job:
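import csv

# Maps MySQL column types to Datastore property class names
type_map = {
    'char': 'StringProperty',
    'text': 'TextProperty',
    'int': 'IntegerProperty',
    # ...
}

def generate_model_class(classname, definition_file):
    # Returns the source code of a db.Model subclass, built from a CSV
    # file whose rows are (field name, MySQL type) pairs.
    ret = ["class %s(db.Model):" % (classname,)]
    with open(definition_file, newline='') as f:
        for fieldname, mysql_type in csv.reader(f):
            # Each property line is indented so the generated class is valid Python
            ret.append("    %s = db.%s()" % (fieldname, type_map[mysql_type]))
    return "\n".join(ret)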
Once you've defined your schema, you can bulk load directly from the DB - no need for intermediate CSV files. See my blog post on the subject.
approcket can replicate between MySQL and GAE, or you can use GAE's built-in remote API from Google.
In your shoes, I'd write a one-shot Python script to read the existing MySQL schema (with MySQLdb) and generate a matching models.py (then do some manual checks and edits on the generated code, just in case). That's assuming that a data model with "about 110" properties per entity is something you're happy with and want to preserve, of course; it might be worth taking the opportunity to break things up a bit (indeed you may have to if your current approach also relies on joins or other SQL features GAE doesn't give you), but that of course requires more manual work.
Once the data model is in place, bulk loading can happen, typically via intermediate CSV files (there are several ways you can generate those).
you don't need to
http://code.google.com/apis/sql/
:)
You could migrate them to Django models first.
In particular, use
python manage.py inspectdb > models.py
and edit models.py until satisfied. You might have to put ForeignKeys in, adjust the length of CharFields, etc.
I've converted several legacy databases to Django like this with good success.
Django models, however, are different from GAE models (which I'm not very familiar with), so I don't know how helpful that will be.