If I run this query directly in sqlite3.exe on the same database, I get 20 records.
When I run it in Python using sqlite3, it returns every single record from table a (200000+).
import sqlite3
db = sqlite3.connect("path/to/my.db")
c = db.cursor()
c.execute("""SELECT a.*, b.*, c.* FROM t_data a NATURAL LEFT JOIN t_finished b
NATURAL LEFT JOIN user_info c WHERE user_id=1;""")
for row in c:
    print row
How can this be possible?
Here is how the tables are related.
CREATE TABLE t_data ( t_id INTEGER REFERENCES t_finished (t_id),
ui_id INTEGER NOT NULL REFERENCES user_info (ui_id), ...);
CREATE TABLE t_finished ( t_id INTEGER PRIMARY KEY, ...);
CREATE TABLE user_info ( ui_id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES accounts, ...);
No other columns are shared between them.
When I try explicit JOINs, I have the same problem:
SELECT * FROM t_data a LEFT JOIN t_finished b USING(t_id) LEFT JOIN user_info c USING(ui_id) WHERE user_id=1;
This query works in sqlite3.exe, but throws an error in Python:
OperationalError: cannot join using column ui_id - column not present in both tables
If you are 100% certain that you are dealing with the same database, then the command-line tool is probably using a newer SQLite library version than your Python build.
You can verify what versions are being used with:
print sqlite3.sqlite_version
in Python and
sqlite3 -version
with the command-line tool. You can check the SQLite changelog to see if anything relevant changed between those versions, or you could just update your SQLite3 DLLs to the latest version to make sure that you are not running into a bug or new feature here.
I am using the code below from this stackoverflow answer to copy a row in a table and create a new row in the same table, but with the same data and a unique primary key. I am running the code using the mysql-connector-python module.
CREATE TEMPORARY TABLE tmptable_1 SELECT * FROM table WHERE primarykey = 1;
UPDATE tmptable_1 SET primarykey = NULL;
INSERT INTO table SELECT * FROM tmptable_1;
DROP TEMPORARY TABLE IF EXISTS tmptable_1;
This code runs on the production database, but gives the following error on my local test db.
IntegrityError: 1048 (23000): Column 'primaryKey' cannot be null
Local DB:
Version: MYSQL==5.6.47
Compiled For: osx10.15 (x86_64)
DB Client: mysql-connector-python==2.1.3
Production DB:
Version: MYSQL==5.6.40
Compiled For: Linux (x86_64)
DB Client: mysql-connector-python==2.1.3
I use MySQL Workbench to export the schema and data from the production database and import it into my local test database.
Is there some configuration setting for a MySQL DB that could be causing this? I am not sure what causes the difference in behaviour between the two databases.
Just drop that column from the INSERT side and you can insert the table into itself:
INSERT INTO table (a, b, c) SELECT a, b, c FROM table;
You must list the columns in exactly the same order on both sides, omitting the id column.
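For completeness, here is a minimal sketch of running that single-statement copy through mysql-connector-python, since that is the module the question uses. The connection parameters and the table/column names (my_table, a, b, c, primarykey) are placeholders, not the asker's real schema:

import mysql.connector

# Copy row 1 into a new row, letting AUTO_INCREMENT assign a fresh key.
conn = mysql.connector.connect(user='user', password='secret',
                               host='localhost', database='test')
cur = conn.cursor()
cur.execute(
    "INSERT INTO my_table (a, b, c) "
    "SELECT a, b, c FROM my_table WHERE primarykey = %s",
    (1,)
)
conn.commit()  # make the copied row permanent
cur.close()
conn.close()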
I have a query in a python script that creates a materialized view after some tables get created.
Script is something like this:
from sqlalchemy import create_engine, text
sql = '''CREATE MATERIALIZED VIEW schema1.view1 AS
SELECT t1.a,
t1.b,
t1.c,
t2.x AS d
FROM schema1.t1 t1
LEFT JOIN schema1.t2 t2 ON t1.f = t2.f
UNION ALL
SELECT t3.a,
t3.b,
t3.c,
t3.d
FROM schema1.t3 t3;'''
con = create_engine(db_conn)
con.execute(sql)
The query executes successfully when I run it on the database directly.
But when running the script in python, I get an error:
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.SyntaxError) syntax error at or near "CREATE MATERIALIZED VIEW schema"
I can't for the life of me figure out what it has an issue with - any ideas?
This was the weirdest thing. I had copied my query text out of another tool that I use to navigate around my pg DB into VS Code. The last part of the answer by #EOhm gave me the idea to just type the whole thing out in VS Code instead of copy/pasting.
And everything worked.
Even though the pasted text and what I typed appear identical in every way. So apparently there was some invisible formatting causing this issue.
I don't know whether SQLAlchemy supports materialized view creation, but if it does, it is presumably done with its schema metadata functions (https://docs.sqlalchemy.org/en/13/core/schema.html).
The text function is designed for database-independent DML, not DDL. Maybe it works for DDL (I don't know SQLAlchemy well enough to say), but by design the accepted syntax can differ from what you would execute directly on the database, as SQLAlchemy is meant to abstract the details of the database away from the user.
If SQLAlchemy does not offer a convenient way to do this, and you nevertheless have valid reasons to use SQLAlchemy at all, you can just execute the plain SQL statement in the dialect the database backend understands, i.e. omit SQLAlchemy's text function for the SQL statement, like:
from sqlalchemy import create_engine, text
sql = '''CREATE MATERIALIZED VIEW schema.view1 AS
SELECT t1.a,
       t1.b,
       t1.c,
       t2.x AS d
FROM schema.t1 t1
LEFT JOIN schema.t2 t2 ON t1.f = t2.f
UNION ALL
SELECT t3.a,
t3.b,
t3.c,
t3.d
FROM schema.t3 t3;'''
con = create_engine(db_conn)
raw = con.raw_connection()
raw.cursor().execute(sql)
raw.commit()  # the raw DBAPI connection does not autocommit, so commit the DDL explicitly
(But of course you then have to take care of the backend's SQL dialect yourself, as opposed to using SQLAlchemy-wrapped statements.)
I tested this on my pg server without any issues, using psycopg2 directly.
postgres=# create schema schema;
CREATE SCHEMA
postgres=# create table schema.t1 (a varchar, b varchar, c varchar, f integer);
CREATE TABLE
postgres=# create table schema.t2 (x varchar, f integer);
CREATE TABLE
postgres=# create table schema.t3 (a varchar, b varchar, c varchar, d varchar);
CREATE TABLE
postgres=# commit;
With the following script:
#!/usr/bin/python3
import psycopg2
conn = psycopg2.connect("dbname=postgres")
cur = conn.cursor()
cur.execute("""
CREATE MATERIALIZED VIEW schema.view1 AS
SELECT t1.a,
t1.b,
t1.c,
t2.x AS d
FROM schema.t1 t1
LEFT JOIN schema.t2 t2 ON t1.f = t2.f
UNION ALL
SELECT t3.a,
t3.b,
t3.c,
t3.d
FROM schema.t3 t3;
""")
conn.commit()
cur.close()
conn.close()
I tested with quite current versions of Python (3.7 and 2.7), a current version of the psycopg2 module, and current client libraries (I have the 11.5 pg client and psycopg2 2.8.3) from pgdg, installed on a quite recent Linux. Can you try executing directly through psycopg2 like I did?
Also, did you make sure your dots are plain ASCII dots, as all the other characters in the statement are in this case? (Also keep in mind there can be invisible codepoints in Unicode that can cause this sort of problem.) Maybe you can convert your string to ASCII bytes and back to a Unicode string if you are on Python; if it does not raise an error on .encode('ascii'), it should be clean.
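A tiny sketch of that check; the sql value here is just a stand-in for whatever text was pasted:

# Flag any non-ASCII codepoints hiding in a pasted SQL string.
sql = 'CREATE MATERIALIZED VIEW schema1.view1 AS ...'  # the pasted statement
try:
    sql.encode('ascii')
    print('clean: plain ASCII')
except UnicodeEncodeError:
    # Show exactly which characters are suspicious and where.
    for i, ch in enumerate(sql):
        if ord(ch) > 127:
            print(i, repr(ch), hex(ord(ch)))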
I have two tables, Table A and Table B. I have added one column to Table A, record_id. Table B has record_id and the primary ID for Table A, table_a_id. I am looking to deprecate Table B.
Relationships exist between Table B's table_a_id and Table A's id, if that helps.
Currently, my solution is:
db.execute("UPDATE table_a t
SET record_id = b.record_id
FROM table_b b
WHERE t.id = b.table_a_id")
This is my first time using this ORM -- I'd like to see if there is a way I can use my Python models and the actual functions SQLAlchemy gives me to be more 'Pythonic' rather than just dumping a Postgres statement that I know works in an execute call.
My solution ended up being as follows:
(db.query(TableA)
.filter(TableA.id == TableB.table_a_id,
TableA.record_id.is_(None))
.update({TableA.record_id: TableB.record_id}, synchronize_session=False))
This leverages PostgreSQL's ability to update based on implicit references to other tables, which I did in my .filter() call (this is analogous to the WHERE clause of a JOIN query). The solution was deceptively simple.
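One note, assuming db here is a regular SQLAlchemy Session: the update executes inside the session's transaction, so you still need a db.commit() afterwards for the change to persist.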
I am using Python and I would like to have a list of IDs stored on disk, preserving some of the functionality of a set (that is, efficiently checking whether an ID is contained). To this end, I think using the SQLite library is a wise decision (at least that is my impression after googling and stacking a bit). However, I am a beginner in the SQL world and could not find any post explaining what I am looking for.
How can I store IDs (strings) in SQLite and later check if a specific ID appears or not in the database?
import sqlite3
id1 = 'abc'
id2 = 'def'
# Initialization of the database
define_database()
# Update the database by inserting a new ID
insert_in_database(id1)
insert_in_database(id2)
# Check if the specified ID is contained in the database (returns a Boolean)
check_if_exists_in_database(id1)
PS: I am aware of the sqlite3 library.
Thanks!
Just use a table with a single column. This column must be indexed (explicitly, or by making it the primary key) for lookups over large data to be efficient:
db = sqlite3.connect('...filename...')
def define_database():
    db.execute('CREATE TABLE IF NOT EXISTS MyStuff(id PRIMARY KEY)')
(Use a WITHOUT ROWID table if your Python version is recent enough to have a modern version of the SQLite library.)
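For example, a variant of define_database() under that assumption (WITHOUT ROWID tables need SQLite 3.8.2 or later):

def define_database():
    # A WITHOUT ROWID table stores rows keyed directly by the text primary
    # key instead of a separate rowid (requires SQLite >= 3.8.2).
    db.execute('CREATE TABLE IF NOT EXISTS MyStuff(id TEXT PRIMARY KEY) WITHOUT ROWID')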
Inserting is done with standard SQL:
def insert_in_database(value):
    db.execute('INSERT INTO MyStuff(id) VALUES(?)', [value])
To check whether a value exists, just try to read its row:
def check_if_exists_in_database(value):
    for row in db.execute('SELECT 1 FROM MyStuff WHERE id = ?', [value]):
        return True
    else:
        return False
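Putting it together with the names from the question (remember that the sqlite3 module wraps the inserts in a transaction, so commit to persist them):

define_database()
insert_in_database('abc')
insert_in_database('def')
db.commit()  # persist the inserts

print(check_if_exists_in_database('abc'))  # True
print(check_if_exists_in_database('xyz'))  # False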
Until now our application has been using one SQLite database, with SQLObject as the ORM. Obviously at some point we knew we had to face the SQLite concurrency problem, and so we did.
We ended up splitting the current database into multiple databases. Meaning each table schema remained the same but we distributed different tables into multiple databases keeping tightly coupled tables together.
Now this works very well on a clean install of the new version of our application, but upgrading from a previous version of our application to this new version needs a special data migration before our application can start working. In this case the database migration simply means moving the tables from the single old database into the appropriate new databases.
To exemplify, consider this is the older structure:
single_db.db --- A single db
- A -- Table A
- B -- Table B
- C -- Table C
- D -- Table D
- E -- Table E
- F -- Table F
The new structure:
db1.db --- Database 1
- A -- Table A
- B -- Table B
- C -- Table C
- D -- Table D
db2.db --- Database 2
- E -- Table E
db3.db --- Database 3
- F -- Table F
When the upgrade happens, our application will create the new structure with the above 3 databases, with empty tables in them. The older database single_db.db, with all the tables and the actual data, will also still be there. Now, before our application can begin working, it should move the tables, or rather copy the data from each table in the older database to the corresponding table in the corresponding new database.
I will need to write the code for this database migration. I know I can query a table using the older database connection and insert the returned rows into the corresponding table using the newer database connection. One caveat I should mention is that some of these tables can contain a large number of rows: up to 2 - 2.5 million rows in 2 or 3 of the tables.
So I want to ask: are there any SQLObject tricks I can use, since I am using SQLObject on top of SQLite, and has anyone done this before?
Thanks for your help.
I realise you probably solved this by now, but for anyone googling: I had to do almost exactly the same as the OP. This was the core part of the code that I used (it's modified from something I found, but I can't find it again to credit the original author, apologies!):
def _iterdump(connection, table_name):
    """
    Returns an iterator to dump a database table in SQL text format.
    """
    cu = connection.cursor()
    yield('BEGIN TRANSACTION;')
    # sqlite_master table contains the SQL CREATE statements for the database.
    q = """
        SELECT name, type, sql
        FROM sqlite_master
        WHERE sql NOT NULL AND
              type == 'table' AND
              name == :table_name
        """
    schema_res = cu.execute(q, {'table_name': table_name})
    for table_name, type, sql in schema_res.fetchall():
        if table_name == 'sqlite_sequence':
            yield('DELETE FROM sqlite_sequence;')
        elif table_name == 'sqlite_stat1':
            yield('ANALYZE sqlite_master;')
        elif table_name.startswith('sqlite_'):
            continue
        else:
            yield('%s;' % sql)

    # Build the insert statement for each row of the current table
    res = cu.execute("PRAGMA table_info('%s')" % table_name)
    column_names = [str(table_info[1]) for table_info in res.fetchall()]
    q = "SELECT 'INSERT INTO \"%(tbl_name)s\" VALUES("
    q += ",".join(["'||quote(" + col + ")||'" for col in column_names])
    q += ")' FROM '%(tbl_name)s'"
    query_res = cu.execute(q % {'tbl_name': table_name})
    for row in query_res:
        yield("%s;" % row[0])
If you pass in the sqlite connection for the original db and the name of a table in the original db, this generator will give back commands that you can pass to execute() on the sqlite connection for the new db.
When I did this I also did a count of rows first on all the tables and incremented a counter as I executed INSERT lines so I could show progress on the migration.
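For anyone wanting a starting point, here is a minimal sketch of driving that generator to copy one table. The file and table names ('single_db.db', 'db1.db', 'A') are taken from the example structure above, and it assumes table A does not already exist in the target database (the dump replays its CREATE TABLE):

import sqlite3

old_db = sqlite3.connect('single_db.db')
# isolation_level=None lets the dump's own BEGIN TRANSACTION take effect
# instead of fighting with the module's implicit transaction handling.
new_db = sqlite3.connect('db1.db', isolation_level=None)

# Count rows up front so progress can be reported, as described above.
total = old_db.execute('SELECT COUNT(*) FROM A').fetchone()[0]

done = 0
cur = new_db.cursor()
for statement in _iterdump(old_db, 'A'):
    cur.execute(statement)
    if statement.startswith('INSERT'):
        done += 1
        if done % 10000 == 0:
            print('%d/%d rows migrated' % (done, total))
cur.execute('COMMIT;')  # _iterdump opens a transaction but never closes it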