I have created a Python module that creates and populates several SQLite tables. Now, I want to use it in a program but I don't really know how to call it properly. All the tutorials I've found are essentially "inline", i.e. they walk through using SQLite in a linear fashion rather than how to actually use it in production.
What I'm trying to do is have a method check to see if the database is already created. If so, then I can use it. If not, an exception is raised and the program will create the database. (Or use if/else statements, whichever is better).
I created a test script to see if my logic is correct but it's not working. When I create the try statement, it just creates a new database rather than checking if one already exists. The next time I run the script, I get an error that the table already exists, even if I tried catching the exception. (I haven't used try/except before but figured this is a good time to learn).
Are there any good tutorials for using SQLite operationally or any suggestions on how to code this? I've looked through the pysqlite tutorial and others I found but they don't address this.
Don't make this more complex than it needs to be. The big, independent databases have complex setup and configuration requirements. SQLite is just a file you access with SQL, so it's much simpler.
Do the following.
Add a table to your database for "Components" or "Versions" or "Configuration" or "Release" or something administrative like that.
CREATE TABLE REVISION(
    RELEASE_NUMBER CHAR(20)
);
In your application, connect to your database normally.
Execute a simple query against the revision table. Here's what can happen.
The query fails to execute: your database doesn't exist, so execute a series of CREATE statements to build it.
The query succeeds but returns no rows or the release number is lower than expected: your database exists, but is out of date. You need to migrate from that release to the current release. Hopefully, you have a sequence of DROP, CREATE and ALTER statements to do this.
The query succeeds, and the release number is the expected value. Do nothing more, your database is configured correctly.
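The three outcomes above can be sketched roughly as follows. This is only a minimal sketch: the function name `check_revision`, the `EXPECTED_RELEASE` constant and the returned strings are hypothetical names made up for illustration; only the REVISION table comes from the answer. For simplicity it treats any release number other than the expected one as out of date, rather than doing a real "lower than" comparison.

```python
import sqlite3

EXPECTED_RELEASE = "2"  # hypothetical release number of the current schema


def check_revision(conn):
    """Classify the database as 'missing', 'outdated' or 'current'."""
    try:
        row = conn.execute("SELECT RELEASE_NUMBER FROM REVISION").fetchone()
    except sqlite3.OperationalError:
        # "no such table": the schema has never been created
        return "missing"
    if row is None or row[0] != EXPECTED_RELEASE:
        # no rows, or a release number other than the expected one
        return "outdated"
    return "current"


# Usage: build the schema only when the check says it is missing.
conn = sqlite3.connect(":memory:")
if check_revision(conn) == "missing":
    conn.execute("CREATE TABLE REVISION(RELEASE_NUMBER CHAR(20))")
    conn.execute("INSERT INTO REVISION VALUES (?)", (EXPECTED_RELEASE,))
    conn.commit()
```

The "outdated" branch is where the migration statements would go.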
AFAIK an SQLITE database is just a file.
To check if the database exists, check for file existence.
When you open a SQLITE database it will automatically create one if the file that backs it up is not in place.
If you try and open a file as a sqlite3 database that is NOT a database, you will get this:
"sqlite3.DatabaseError: file is encrypted or is not a database"
So check whether the file exists, and also catch the exception in case the file is not a sqlite3 database.
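A minimal sketch of that check, assuming a hypothetical helper name `is_usable_database`. Note that `sqlite3.connect` does not read the file, so the error only appears on the first query; newer SQLite versions word the message as "file is not a database".

```python
import os
import sqlite3


def is_usable_database(path):
    """Return True if path exists and SQLite can read it as a database."""
    if not os.path.isfile(path):
        return False
    conn = sqlite3.connect(path)
    try:
        # connect() is lazy; the file is only inspected on the first query
        conn.execute("SELECT name FROM sqlite_master LIMIT 1")
        return True
    except sqlite3.DatabaseError:
        # raised when the file is not a SQLite database
        # (also covers its subclasses, e.g. OperationalError)
        return False
    finally:
        conn.close()
```

The file check comes first because connecting to a missing path would silently create an empty database there.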
SQLite automatically creates the database file the first time you try to use it. The SQL statements for creating tables can use IF NOT EXISTS so that the commands only take effect if the table has not been created yet. This way you don't need to check for the database's existence beforehand: SQLite can take care of that for you.
The main thing I would still be worried about is that executing CREATE TABLE IF NOT EXISTS for every web transaction (say) would be inefficient; you can avoid that by having the program keep an (in-memory) variable saying whether it has already run the creation script, so it runs the CREATE TABLE script once per run. This would still allow you to delete the database and start over during debugging.
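One way this could look, as a rough sketch: the names `get_connection` and `_schema_ready` and the `stocks` table are placeholders, and the flag assumes a single database file per process.

```python
import sqlite3

_schema_ready = False  # hypothetical flag; starts out False on every program run


def get_connection(path):
    """Open the database, running the CREATE script at most once per run."""
    global _schema_ready
    conn = sqlite3.connect(path)
    if not _schema_ready:
        # harmless if the table is already there
        conn.execute("CREATE TABLE IF NOT EXISTS stocks (symbol TEXT, qty REAL)")
        conn.commit()
        _schema_ready = True
    return conn
```

Deleting the database file and restarting the program recreates the table, because the flag lives only in memory.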
As @diciu pointed out, the database file will be created by sqlite3.connect.
If you want to take a special action when the file is not there, you'll have to explicitly check for existence:
import os
import sqlite3

# mydb_path is the path to the database file, defined elsewhere
if not os.path.exists(mydb_path):
    # create new DB, create table stocks
    con = sqlite3.connect(mydb_path)
    con.execute('''create table stocks
        (date text, trans text, symbol text, qty real, price real)''')
else:
    # use existing DB
    con = sqlite3.connect(mydb_path)
...
SQLite doesn't throw an exception if you create a new database with the same name; it will just connect to it. Since SQLite is a file-based database, I suggest you just check for the existence of the file.
As for your second problem, to check whether a table has already been created, just catch the exception. An exception "sqlite3.OperationalError: table TEST already exists" is thrown if the table already exists.
import sqlite3
import os

database_name = "newdb.db"
if os.path.isfile(database_name):
    print("the database already exists")
db_connection = sqlite3.connect(database_name)
db_cursor = db_connection.cursor()
try:
    db_cursor.execute('CREATE TABLE TEST (a INTEGER);')
except sqlite3.OperationalError as msg:
    # raised when the table already exists
    print(msg)
Doing SQL is, overall, horrible in every language I've picked up. SQLAlchemy has proved the easiest of the options to use, because querying and committing with it is so clean and trouble-free.
Here are some basic steps for actually using SQLAlchemy in your app; better details can be found in the documentation.
provide table definitions and create ORM-mappings
load database
ask it to create tables from the definitions (won't do so if they exist)
create session maker (optional)
create session
After creating a session, you can commit and query from the database.
See this solution at SourceForge which covers your question in a tutorial manner, with instructive source code :
y_serial.py module :: warehouse Python objects with SQLite
"Serialization + persistance :: in a few lines of code, compress and annotate Python objects into SQLite; then later retrieve them chronologically by keywords without any SQL. Most useful "standard" module for a database to store schema-less data."
http://yserial.sourceforge.net
Yes, I was nuking out the problem. All I needed to do was check for the file and catch the IOError if it didn't exist.
Thanks for all the other answers. They may come in handy in the future.
Related
I have a Python script to import data from raw csv/xlsx files. For these I use Pandas to load the files, do some light transformation, and save to an sqlite3 database. This is fast (as fast as any method). After this, I run some queries against these to make some intermediate datasets. These I run through a function (see below).
More information: I am using Anaconda/Python3 (3.9) on Windows 10 Enterprise.
UPDATE:
Just as information for anybody reading this, I ended up going back to just using standalone python (still using JupyterLab though)... I no longer have this issue. So not sure if it is a problem with something Anaconda does or just the versions of various libraries being used for that particular Anaconda distribution (latest available). My script runs more or less in the time that I would expect using Python 3.11 and the versions pulled in by pip for Pandas and sqlite (1.5.3 and 3.38.4).
Python function for running sqlite3 queries:
def runSqliteScript(destConnString, queryString):
    '''Runs an sqlite script given a connection string and a query string
    '''
    try:
        print('Trying to execute sql script: ')
        print(queryString)
        cursorTmp = destConnString.cursor()
        cursorTmp.executescript(queryString)
    except Exception as e:
        print('Error caught: {}'.format(e))
Because somebody asked, here is the function that creates the "destConnString", though it's called something else in the actual function call, but is the same type.
import sqlite3

def createSqliteDb(db_file):
    ''' Creates an sqlite database at directory/file name specified
    '''
    conSqlite = None
    try:
        conSqlite = sqlite3.connect(db_file)
        return conSqlite
    except sqlite3.Error as e:  # the bare name Error is undefined unless imported from sqlite3
        print('Error {} when trying to create {}'.format(e, db_file))
Example of one of the queries (I commented out journal mode/synchronous pragmas after it didn't seem to help at all):
-- PRAGMA journal_mode = WAL;
-- PRAGMA synchronous = NORMAL;
BEGIN;
drop table if exists tbl_1110_cop_omd_fmd;
COMMIT;
BEGIN;
create table tbl_1110_cop_omd_fmd as
select
    siteId,
    orderNumber,
    familyGroup01, familyGroup02,
    count(*) as countOfLines
from tbl_0000_ob_trx_for_frazelle
where 1 = 1
    -- and dateCreated between datetime('now', '-365 days') and datetime('now', 'localtime') -- temporarily commented due to no date in file
group by siteId,
    orderNumber,
    familyGroup01, familyGroup02
order by dateCreated asc
;
COMMIT
;
Here is a list of things that I have tried. Unfortunately, no matter what combination of things I have tried, it has ended up having one bottleneck or another. It seems there is some kind of write bottleneck from python to sqlite3, yet the pandas to_sql method doesn't seem to be affected by it. Complete list of all combinations of things that I have tried.
I tried wrapping all my queries in begin/commit statements. I put these in-line with the query, though I'd be interested in knowing if this is the correct way to do this. This seemed to have no effect.
I tried setting the journal mode to WAL and synchronous to normal, again to no effect.
I tried running the queries in an in-memory database.
Firstly, I tried creating everything from scratch in the in-memory database. The tables didn't create any faster. Saving this in-memory database seems to be a bottleneck (backup method).
Next, I tried creating views instead of tables (again, creating everything from scratch in the in-memory database). This created really quickly. Weirdly, querying these views was very fast. Saving this in-memory database seems to be a bottleneck (backup method).
I tried just writing views to the database file (not in-memory). Unfortunately, the views take as long as the make tables when running from Python/sqlite.
I don't really want to do anything strictly in-memory for the database creation, as this Python script is used for different sets of data, some of which could have too many rows for an in-memory setup. The only thing I have left to try is to take the in-memory from-scratch setup, make views instead of tables, read ALL the in-memory db tables with pandas (read_sql), then write ALL the tables to a file db with pandas (to_sql)... Hoping there is something easy to try to resolve this problem.
connOBData = sqlite3.connect('file:cachedb?mode=memory&cache=shared', uri=True)  # parameters are joined with &, and uri=True is needed for file: names
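For reference, the "build in memory, then save with the backup method" approach described above looks roughly like this. It is only a sketch of the mechanism and is not claimed to fix the slowdown; the function name `build_in_memory_then_save` and the `demo` table are hypothetical, and `Connection.backup` requires Python 3.7 or later.

```python
import sqlite3


def build_in_memory_then_save(dest_path):
    """Create tables in an in-memory database, then copy it to a file."""
    mem = sqlite3.connect(":memory:")
    mem.execute("CREATE TABLE demo (id INTEGER, label TEXT)")
    mem.executemany("INSERT INTO demo VALUES (?, ?)", [(1, "a"), (2, "b")])
    mem.commit()

    disk = sqlite3.connect(dest_path)
    mem.backup(disk)  # copies the whole in-memory database into the file
    disk.close()
    mem.close()
```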
These take approximately 1,000 times or more longer than if I run these queries directly in DB Browser (an sqlite frontend). These queries aren't that complex and run fine (in ~2-4 seconds) in DB Browser. All told, if I run all the queries in a row in DB Browser they'd run in 1-2 minutes. If I let them run through the Python script, it literally takes close to 20 hours. I'd expect the queries to finish in approximately the same time that they run in DB Browser.
Currently using cx_Oracle module in Python to connect to my Oracle database. I would like to only allow the user of the program to do read only executions, like Select, and NOT INSERT/DELETE queries.
Is there something I can do to the connection/cursor variables once I establish the connection to prevent writable queries?
I am using the Python Language.
Appreciate any help.
Thanks.
One possibility is to issue the statement "set transaction read only" as in the following code:
import cx_Oracle
conn = cx_Oracle.connect("cx_Oracle/welcome")
cursor = conn.cursor()
cursor.execute("set transaction read only")
cursor.execute("insert into c values (1, 'test')")
That will result in the following error:
ORA-01456: may not perform insert/delete/update operation inside a READ ONLY transaction
Of course you'll have to make sure that you create a Connection class that calls this statement when it is first created and after each and every commit() and rollback() call. And it can still be circumvented by calling a PL/SQL block that performs a commit or rollback.
The only other possibility that I can think of right now is to create a restricted user or role which simply doesn't have the ability to insert, update, delete, etc. and make sure the application uses that user or role. This one at least is foolproof, but a lot more effort up front!
I have this code to insert data to Database using MySQL. But when I ran that code using Python, there's no error. But when I checked the Database, the data isn't inserted. Is there anyone who can help me? I would appreciate it. :)
This is the code:
import MySQLdb
db=MySQLdb.connect(host="localhost", user="root", passwd="", db="try")
cursor=db.cursor()
insert="INSERT INTO `try`.`try` (`nomor`, `nama`) VALUES (NULL, 'bismillah')"
cursor.execute(insert)
You're not doing a COMMIT anywhere. So, if auto-commit is not on, all you've done is create a transaction that, if later committed, will insert this row.
Since you haven't done a SET AUTOCOMMIT anywhere, whether auto-commit is on depends on how you created the database. With at least some storage types (in particular, InnoDB), you can change the default at creation time, and, because you often want auto-commit disabled with those storage types, your GUI design tool, or the sample code you copied and pasted, or whatever may have done so for you. Also, the server variable that provides the default can itself be set to a different value at server startup/configuration. (See System Server Variables.)
If you want to make sure that auto-commit is on, just execute SET autocommit=1 before any other statements.
If you want to find out whether auto-commit is on, execute SHOW VARIABLES. (And if it's disabled, you may want to try SHOW GLOBAL VARIABLES LIKE 'autocommit' and SHOW SESSION VARIABLES like 'autocommit' to see which context you've disabled it in.)
If you cannot insert into MySQL, there are several things to check:
1: Check the log.
2: Check the structure of your table; perhaps nomor or another field must not be NULL.
3: Finally, when inserting through MySQLdb from a program, you need to commit before you close the connection. Here: db.commit()
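The effect of a missing commit can be reproduced without a MySQL server. The sketch below deliberately swaps in the standard-library sqlite3 module instead of MySQLdb, purely so that it runs anywhere; the `demo` table and the function name `count_after_insert` are hypothetical. The DB-API pattern (execute, then commit on the connection) is the same in both modules.

```python
import sqlite3


def count_after_insert(path, commit):
    """Insert one row, optionally commit, and return the row count seen afterwards."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS demo (nama TEXT)")
    conn.commit()
    conn.execute("INSERT INTO demo (nama) VALUES ('bismillah')")
    if commit:
        conn.commit()
    conn.close()  # closing without a commit discards the pending INSERT

    check = sqlite3.connect(path)
    n = check.execute("SELECT COUNT(*) FROM demo").fetchone()[0]
    check.close()
    return n
```

Called on a fresh file with commit=False, the row is gone when the database is reopened; with commit=True it is kept.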
I'm using Sqlalchemy in a multitenant Flask application and need to create tables on the fly when a new tenant is added. I've been using Table.create to create individual tables within a new Postgres schema (along with search_path modifications) and this works quite well.
The limitation I've found is that the Table.create method blocks if there is anything pending in the current transaction. I have to commit the transaction right before the .create call or it will block. It doesn't appear to be blocked in Sqlalchemy because you can't Ctrl-C it. You have to kill the process. So, I'm assuming it's something further down in Postgres.
I've read in other answers that CREATE TABLE is transactional and can be rolled back, so I'm presuming this should be working. I've tried starting a new transaction with the current engine and using that for the table create (vs. the current Flask one) but that hasn't helped either.
Does anybody know how to get this to work without an early commit (and risking partial dangling data)?
This is Python 2.7, Postgres 9.1 and Sqlalchemy 0.8.0b2.
(Copy from comment)
Assuming sess is the session, you can do sess.execute(CreateTable(tenantX_tableY)) instead.
EDIT: CreateTable is only one of the things being done when calling table.create(). Use table.create(sess.connection()) instead.
I am tracking changes made to the levels in a game. I currently track changes in a SQLite database. Each level is supposed to have its own database, as a single database for all the levels would cause complications when adding and deleting levels. So for each level, I want a database that has the same name as that level, so that changes made to level "foo" get written to database "foo". I don't need to edit the tables, just the actual name of the database. I guess I could just use a file renaming function in Python, but I would like to know if there is any way to set the name from the start.
Here's an example:
connection = sqlite.connect(r'\database\foo.db')  # raw string, so \d and \f are not treated as escapes
cursor = connection.cursor()
Where foo is the variable
You'll need to post some code for us to answer this completely.
You can use the ALTER TABLE command to rename the tables within your database, you can rename your sqlite db file on disk if you close it first, and you can use a variable in your python code to represent the name of the DB you're using. But you need to be more specific if you want a more specific answer.
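Using a variable for the database name could look like the sketch below. The helper name `open_level_db` and the default `database` directory are assumptions made for illustration; the level name is used as-is, so it should not contain path separators.

```python
import os
import sqlite3


def open_level_db(level_name, base_dir="database"):
    """Open (creating it if needed) the database file named after a level."""
    os.makedirs(base_dir, exist_ok=True)
    # os.path.join avoids hand-written backslashes in the path
    path = os.path.join(base_dir, level_name + ".db")
    return sqlite3.connect(path)
```

With this, `open_level_db("foo")` connects to `database/foo.db`, so each level gets its own file from the start and no renaming is needed.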