I have a bz2 file (I have never worked with such files). When I manually unzip it, I see it's a sqlite db with several tables in it, but I don't know how to connect to it all from python without having to unzip it manually (I have many dbs so it has to be automated in the script). So far, I have tried the following but get an error.
import bz2
import sqlite3
zipfile = bz2.BZ2File("file.sqlite.bz2")
connection = sqlite3.connect(zipfile.read())
query = "SELECT * FROM sqlite_master WHERE type='table';"
cursor = connection.execute(query)
cursor.fetchall()
[]
But, when I do the same query for the unzipped file I do get all the tables.
If you can use apsw instead of the standard Python library's sqlite3 module, you can load a serialized database image (such as the bytes returned by BZ2File.read()) into an in-memory connection:
#!/usr/bin/env python3
import bz2
import apsw
zipfile = bz2.BZ2File("file.sqlite.bz2")
db = apsw.Connection(":memory:")
db.deserialize("main", zipfile.read())
query = "SELECT * FROM sqlite_master WHERE type='table';"
cursor = db.cursor()
for row in cursor.execute(query):
    print(row)
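Otherwise, if your version of the standard sqlite3 module doesn't expose SQLite's serialization functions (they were added as Connection.serialize() and Connection.deserialize() in Python 3.11), you'll have to save the decompressed database to a temporary file and connect to that.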
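The temporary-file route could look roughly like the sketch below. The function names connect_bz2_sqlite and list_tables are made up for illustration, and the caller is assumed to be responsible for deleting the temporary file afterwards.

```python
import bz2
import os
import shutil
import sqlite3
import tempfile


def connect_bz2_sqlite(archive_path):
    """Decompress a bz2-compressed SQLite file to a temp file and connect to it.

    Hypothetical helper: returns (connection, temp_path) so the caller can
    close the connection and remove temp_path when finished.
    """
    # delete=False keeps the file on disk after the handle is closed,
    # so SQLite can open it by name (this also matters on Windows).
    with tempfile.NamedTemporaryFile(suffix=".sqlite", delete=False) as tmp:
        with bz2.open(archive_path, "rb") as src:
            shutil.copyfileobj(src, tmp)  # copies in chunks
        temp_path = tmp.name
    return sqlite3.connect(temp_path), temp_path


def list_tables(connection):
    """Return the table names recorded in sqlite_master."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name;"
    )
    return [name for (name,) in rows]
```

For many archives you would call connect_bz2_sqlite in a loop, query each connection, then close it and os.remove the temporary path.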
Related
Can someone point me in the right direction on how to open a .mdb file in python? I normally like including some code to start off a discussion, but I don't know where to start. I work with mysql a fair bit with python. I was wondering if there is a way to work with .mdb files in a similar way?
Below is some code I wrote for another SO question.
It requires the 3rd-party pyodbc module.
This very simple example will connect to a table and export the results to a file.
Feel free to expand upon your question with any more specific needs you might have.
import csv, pyodbc
# set up some constants
MDB = 'c:/path/to/my.mdb'
DRV = '{Microsoft Access Driver (*.mdb)}'
PWD = 'pw'
# connect to db
con = pyodbc.connect('DRIVER={};DBQ={};PWD={}'.format(DRV,MDB,PWD))
cur = con.cursor()
# run a query and get the results
SQL = 'SELECT * FROM mytable;' # your query goes here
rows = cur.execute(SQL).fetchall()
cur.close()
con.close()
# you could change the mode from 'w' to 'a' (append) for any subsequent queries
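# newline='' is what the csv module recommends on Python 3 to avoid blank lines on Windows
with open('mytable.csv', 'w', newline='') as fou:
    csv_writer = csv.writer(fou) # default field-delimiter is ","
    csv_writer.writerows(rows)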
There's the meza library by Reuben Cummings which can read Microsoft Access databases through mdbtools.
Installation
# The mdbtools package for Python deals with MongoDB, not MS Access.
# So install the package through `apt` if you're on Debian/Ubuntu
$ sudo apt install mdbtools
$ pip install meza
Usage
>>> from meza import io
>>> records = io.read('database.mdb') # only file path, no file objects
>>> print(next(records))
Table1
Table2
…
This looks similar to a previous question:
What do I need to read Microsoft Access databases using Python?
http://code.activestate.com/recipes/528868-extraction-and-manipulation-class-for-microsoft-ac/
The answers there should be useful.
For a solution that works on any platform that can run Java, consider using Jython or JayDeBeApi along with the UCanAccess JDBC driver. For details, see the related question
Read an Access database in Python on non-Windows platform (Linux or Mac)
In addition to bernie's response, I would add that it is possible to recover the schema of the database. The code below lists the tables (b[2] contains the name of the table).
con = pyodbc.connect('DRIVER={};DBQ={};PWD={}'.format(DRV,MDB,PWD))
cur = con.cursor()
tables = list(cur.tables())
print('tables')
for b in tables:
    print(b)
The code below lists all the columns from all the tables:
colDesc = list(cur.columns())
This code will convert all the tables to CSV.
Happy Coding
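# assumption: `mdb` is the pandas_access package, which provides list_tables() and read_table()
import pandas_access as mdb

for tbl in mdb.list_tables("file_name.MDB"):
    df = mdb.read_table("file_name.MDB", tbl)
    df.to_csv(tbl+'.csv')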
I am trying to open a .sqlite3 file in Python, but no information is returned. So I tried R and still get an empty list of tables. I would like to know what tables are in this file.
I used the following code for python:
import sqlite3
from sqlite3 import Error
def create_connection(db_file):
    """ create a database connection to the SQLite database
        specified by the db_file
    :param db_file: database file
    :return: Connection object or None
    """
    try:
        conn = sqlite3.connect(db_file)
        return conn
    except Error as e:
        print(e)
    return None
database = "D:\\...\assignee.sqlite3"
conn = create_connection(database)
cur = conn.cursor()
rows = cur.fetchall()
but rows are empty!
This is where I got the assignee.sqlite3 from:
https://github.com/funginstitute/downloads
I also tried RStudio, below is the code and results:
> con <- dbConnect(drv=RSQLite::SQLite(), dbname="D:/.../assignee")
> tables <- dbListTables(con)
But this is what I get
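First, make sure you provided the correct path to the SQLite db when connecting.
Use something like this (a raw string keeps the backslashes literal): conn = sqlite3.connect(r"C:\users\guest\desktop\example.db")
Also make sure you are using the same SQLite library in the unit tests and the production code.
Check the types of SQLite connection strings and determine which one your db belongs to (note that these "Data Source=..." strings are the form used by .NET's System.Data.SQLite; Python's sqlite3.connect takes the file path itself):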
Basic
Data Source=c:\mydb.db;Version=3;
Version 2 is not supported by this class library.
SQLite
In-Memory Database
An SQLite database is normally stored on disk but the database can also be
stored in memory. Read more about SQLite in-memory databases.
Data Source=:memory:;Version=3;New=True;
SQLite
Using UTF16
Data Source=c:\mydb.db;Version=3;UseUTF16Encoding=True;
SQLite
With password
Data Source=c:\mydb.db;Version=3;Password=myPassword;
So make sure you wrote the proper connection string for your SQLite db.
If you still cannot see it, check whether the disk containing /tmp is full. Otherwise, the database might be encrypted, or locked and in use by some other application. You can confirm that with one of the many tools for SQLite databases.
You may download this tool and navigate directly to where your db is; it will give you an indication of the problem.
download windows version
Download Mac Version
Download linux version
good luck
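On the wrong-path point: sqlite3.connect() will quietly create a new, empty database file when the path does not exist, which would also produce an empty table list. One way to rule that out is to open the file read-only through a URI, so a missing file raises an error instead. This is only a sketch, and connect_existing is a made-up helper name.

```python
import sqlite3
from pathlib import Path


def connect_existing(db_path):
    """Open an existing SQLite file read-only.

    Hypothetical helper: with mode=ro SQLite refuses to create the file,
    so a wrong path raises sqlite3.OperationalError instead of silently
    producing an empty database.
    """
    # as_uri() needs an absolute path and percent-encodes special characters
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

If connect_existing raises, the path is the problem; if it succeeds and sqlite_master is still empty, the file itself has no tables.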
I have been given an SQLite file to examine using Python. I have imported the sqlite3 module and attempted to connect to the database, but I'm not having any luck. I am wondering if I have to actually open the file as "r" as well as connecting to it, i.e. f = open("History.sqlite","r+")? Please see below:
import sqlite3
conn = sqlite3.connect("history.sqlite")
curs = conn.cursor()
results = curs.execute ("Select * From History.sqlite;")
I keep getting this message when I go to run results:
Operational Error: no such table: History.sqlite
An SQLite file is a single data file that can contain one or more tables of data. You appear to be trying to SELECT from the filename instead of the name of one of the tables inside the file.
To learn what tables are in your database you can use any of these techniques:
Download and use the command line tool sqlite3.
Download any one of a number of GUI tools for looking at SQLite files.
Write a SELECT statement against the special table sqlite_master to list the tables.
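The third option can be done directly from Python. A minimal sketch (list_tables is just an illustrative name, and the path you pass in is whatever your file is called):

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables stored in the SQLite file at db_path."""
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name;"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]
```

Once you know a table name, use it (not the filename) in the FROM clause of your SELECT.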
In Python, is there a more or less hacky way to open a compressed SQLite database without having to write a temporary file somewhere?
Something like:
import bz2
import sqlite3
dbfile = bz2.BZ2File("/path/to/file.bz2", "rb")  # "rb", since "wb" would truncate the archive
dbconn = sqlite3.connect(dbfile)
cursor = dbconn.cursor()
...
This of course raises:
ValueError: database parameter must be string or APSW Connection object
The underlying C-library directly uses the filename string. Thus there is no way to transparently work on it from Python.
See the code on Github
Depending on your OS, you might be able to use a RAM-disk to work on the file. If your sqlite-file is bigger than that, it might be time to switch to another DB-system, like Postgres.
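For what it's worth, newer Python versions offer one way around this: since Python 3.11 the standard sqlite3 module has Connection.deserialize(), which loads a database image from bytes into a connection. A rough sketch follows, assuming the whole decompressed database fits in memory; load_bz2_sqlite_in_memory is a made-up name.

```python
import bz2
import sqlite3


def load_bz2_sqlite_in_memory(archive_path):
    """Load a bz2-compressed SQLite file into an in-memory connection.

    Hypothetical helper: relies on sqlite3.Connection.deserialize, which
    exists only on Python 3.11+ when the underlying SQLite library
    provides the deserialize API.
    """
    if not hasattr(sqlite3.Connection, "deserialize"):
        raise RuntimeError("sqlite3.Connection.deserialize is not available")
    with bz2.open(archive_path, "rb") as src:
        data = src.read()  # the whole database is held in memory
    connection = sqlite3.connect(":memory:")
    connection.deserialize(data)
    return connection
```

Changes made through this connection stay in memory and are not written back to the .bz2 file.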