How to save (and also restore, and add elements to) a set of strings in a Sqlite3 database?
This does not work because sets are not JSON-serializable:
import sqlite3, json
db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE t(id TEXT, myset TEXT);')
s = {'a', 'b', 'c'}
db.execute("INSERT INTO t VALUES (?, ?);", ('1', json.dumps(s)))
# Error: Object of type set is not JSON serializable
So we can use a list instead, or a dict with dummy values:
s = list(s)
# or s = {'a':0, 'b':0, 'c': 0}
db.execute("INSERT INTO t VALUES (?, ?);", ('1', json.dumps(s)))
# RETRIEVE A SET FROM DB
r = db.execute("SELECT myset FROM t WHERE id = '1'").fetchone()
if r is not None:
    s = set(json.loads(r[0]))
    print(s)
Then adding a string element to a set already stored in the DB is not very elegant. One has to:
- SELECT the row,
- retrieve the value as a string,
- parse the JSON with json.loads,
- convert the list to a set,
- add the element to the set,
- convert the set back to a list (or, as an alternative to the last three steps: check whether the element is already present in the list and append it only if it is not),
- JSONify it with json.dumps,
- UPDATE the database (as sketched below).
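Put together, those steps look roughly like this; a minimal sketch reusing the db connection and table t from above (add_to_set is just an illustrative helper name):
def add_to_set(db, row_id, element):
    # SELECT the row and retrieve the stored value as a string
    row = db.execute("SELECT myset FROM t WHERE id = ?", (row_id,)).fetchone()
    # parse the JSON with json.loads and convert the list to a set
    s = set(json.loads(row[0]))
    # add the element to the set
    s.add(element)
    # convert back to a list, JSONify with json.dumps, and UPDATE
    db.execute("UPDATE t SET myset = ? WHERE id = ?", (json.dumps(list(s)), row_id))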
Is there a more pythonic way to work with sets in a Sqlite database?
You can register adapter and converter functions with sqlite3 that will perform the desired conversions automatically.
import json
import sqlite3
def adapt_set(value):
    return json.dumps(list(value))

def convert_set(value):
    return set(json.loads(value))

sqlite3.register_adapter(set, adapt_set)
sqlite3.register_converter('set_type', convert_set)
Once these functions have been registered, pass detect_types to the connection factory to tell sqlite how to use them.
Passing sqlite3.PARSE_DECLTYPES will make the connection use the declared column type to look up the adapter/converter.
db = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_DECLTYPES)
# Declare the myset column as type "set_type".
db.execute('CREATE TABLE t(id TEXT, myset set_type);')
db.execute("INSERT INTO t VALUES (?, ?);", ('1', {1, 2, 3}))
r = db.execute("""SELECT myset FROM t WHERE id = '1'""").fetchone()
print(r[0]) # <- r[0] is a set.
Passing sqlite3.PARSE_COLNAMES will cause the column name in the cursor description to be searched for the type name enclosed in square brackets.
db = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_COLNAMES)
# The type is not declared in the CREATE TABLE statement.
db.execute('CREATE TABLE t(id TEXT, myset TEXT);')
db.execute("INSERT INTO t VALUES (?, ?);", ('1', {1, 2, 3}))
# Include the type in the column label.
r = db.execute("""SELECT myset AS "myset [set_type]" FROM t WHERE id = '1'""").fetchone()
print(r[0])  # <- r[0] is a set.
I would register a "set adapter" that converts a set into a byte string for storage, by simply taking the string representation of the set and encoding it, and a "set converter" for our user-defined column type named "set_type" (pick any alternative you wish) that converts the byte string back into a set, by decoding it into a Unicode string and then applying eval to it:
import sqlite3
from decimal import Decimal
db = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_DECLTYPES|sqlite3.PARSE_COLNAMES)
def set_adapter(the_set):
    return str(the_set).encode('utf-8')

def set_converter(s):
    return eval(s.decode('utf-8'))
sqlite3.register_adapter(set, set_adapter)
sqlite3.register_converter('set_type', set_converter)
# Define the columns with type set_type:
db.execute('CREATE TABLE t(id TEXT, myset set_type);')
s = {'a', 'b', 'c', (1, 2, 3), True, False, None, b'abcde', Decimal('12.345')}
# Our adapter will store s as a byte string:
db.execute("INSERT INTO t VALUES (?, ?);", ('1', s))
cursor = db.cursor()
# Our converter will convert column type set_type from bytes to set:
cursor.execute('select myset from t')
row = cursor.fetchone()
s = row[0]
print(type(s), s)
db.close()
Prints:
<class 'set'> {False, True, 'b', None, (1, 2, 3), Decimal('12.345'), b'abcde', 'a', 'c'}
As you can see, this also handles more data types than JSON does after converting a set to a list. JSON could potentially handle as many data types, but only if you write JSON converters (which usually means converting those data types to one of the supported types, such as strings).
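For illustration only, here is a sketch of such a converter using the default hook of json.dumps; the __set__ and __decimal__ tags are my own hypothetical convention, not part of the answer above:
import json
from decimal import Decimal

def encode_extra(obj):
    # Tag types that JSON does not support so a matching decoder could rebuild them.
    if isinstance(obj, set):
        return {'__set__': list(obj)}
    if isinstance(obj, Decimal):
        return {'__decimal__': str(obj)}
    raise TypeError(f'Object of type {type(obj).__name__} is not JSON serializable')

print(json.dumps({'a', 'b', Decimal('12.345')}, default=encode_extra))
# e.g. {"__set__": ["a", "b", {"__decimal__": "12.345"}]} (element order may vary)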
Defining the adapter and converter to use pickle, as in the answer offered by Jonathan Feenstra, will result in a speed improvement but will possibly use more storage. Still, I would use the technique outlined above, i.e. adapter and converter functions with a special user-defined column type:
import pickle
def set_adapter(the_set):
    return pickle.dumps(the_set, pickle.HIGHEST_PROTOCOL)

def set_converter(s):
    return pickle.loads(s)
A simpler way to store the set in the SQLite database is to use the BLOB datatype for the myset column and serialise it to bytes using pickle.dumps:
import sqlite3
import pickle
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t(id TEXT, myset BLOB)")
s = {"a", "b", "c"}
db.execute(
    "INSERT INTO t VALUES (?, ?)",
    ("1", sqlite3.Binary(pickle.dumps(s, pickle.HIGHEST_PROTOCOL))),
)
r = db.execute("SELECT myset FROM t WHERE id = '1'").fetchone()
if r is not None:
    s = pickle.loads(r[0])
    print(s)
To add new elements to a set in the database, the serialisation and deserialisation steps are still required, but there is no further conversion to/from a list or checking for elements that are already present in the set.
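A minimal sketch of that update, reusing db, t and the pickle import from the example above (the added element "d" is only illustrative):
r = db.execute("SELECT myset FROM t WHERE id = '1'").fetchone()
s = pickle.loads(r[0])  # deserialise straight into a set
s.add("d")              # work on the set directly
db.execute(
    "UPDATE t SET myset = ? WHERE id = ?",
    (sqlite3.Binary(pickle.dumps(s, pickle.HIGHEST_PROTOCOL)), "1"),
)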
Alternatively, you could ensure the uniqueness of the ID-element combination at the database level, for example using a composite primary key:
import sqlite3
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t(id TEXT, element TEXT, PRIMARY KEY(id, element))")
s = {"a", "b", "c"}
for element in s:
    db.execute("INSERT INTO t VALUES(?, ?)", ("1", element))
r = db.execute("SELECT element FROM t WHERE id = '1'").fetchall()
s = set(row[0] for row in r)
print(s)
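With this layout, adding an element to the stored "set" later is a single statement; as a sketch, SQLite's INSERT OR IGNORE conflict clause skips the insert when the (id, element) pair already exists (the element "d" is only an example):
db.execute("INSERT OR IGNORE INTO t VALUES(?, ?)", ("1", "d"))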
You can use aiosqlitedict
Here is what it can do:
Easy conversion between sqlite table and Python dictionary and vice-versa.
Get values of a certain column in a Python list.
Order your list ascending or descending.
Insert any number of columns into your dict.
Getting Started
We start by connecting to our database, along with the reference column:
from aiosqlitedict.database import Connect
countriesDB = Connect("database.db", "user_id")
Make a dictionary
The dictionary should be inside an async function.
async def some_func():
    countries_data = await countriesDB.to_dict("my_table_name", 123, "col1_name", "col2_name", ...)
You can pass any number of columns, or you can get them all by specifying the column name as '*':
countries_data = await countriesDB.to_dict("my_table_name", 123, "*")
So you have now made some changes to your dictionary and want to export it back to SQL format?
Convert dict to sqlite table
async def some_func():
    ...
    await countriesDB.to_sql("my_table_name", 123, countries_data)
But what if you want a list of values for a specific column?
Select method
You can get a list of all the values of a certain column:
country_names = await countriesDB.select("my_table_name", "col1_name")
To limit your selection, use the limit parameter:
country_names = await countriesDB.select("my_table_name", "col1_name", limit=10)
You can also arrange your list by using the ascending parameter and/or the order_by parameter, specifying a certain column to order your list by:
country_names = await countriesDB.select("my_table_name", "col1_name", order_by="col2_name", ascending=False)
I am working with the Panoply SDK in Python (Panoply website documentation).
Panoply is a data warehouse and I am using the SDK in Python to write directly into the warehouse. I am using sqlalchemy to query my results from a MySQL database, and the SDK requires the result to be in a dictionary:
{ "column": "value" }
my code so far:
>>>from sqlalchemy import create_engine
>>>result = engine.execute("""\
SELECT
creation_date,
COUNT(*) AS total
FROM SomeTable
GROUP BY 1
""")
>>>result = result.fetchall()
>>>result
[('2020-02-05', 41606), ('2020-02-06', 31223)]
>>>cols = result.keys()
>>>cols
['creation_date', 'trips']
Panoply Example:
import panoply
conn = panoply.SDK( "APIKEY", "APISECRET" )
conn.write( "tablename", { "foo": "bar" } )
How do I get my sqlalchemy query into the format panoply needs to write into the database?
Since panoply supports writing only one row at a time, there's no need to use the fetchall method to retrieve all the rows into memory at once, which can be very memory inefficient. Instead, iterate through the cursor and use the items method to retrieve each row as a sequence of key-value tuples so that you can construct a dict with the dict constructor to feed into panoply's write method:
for row in result:
    conn.write('tablename', dict(row.items()))
I hope this will help...
>>> rows = result.fetchall()
>>> results = [dict(row) for row in rows]
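As a follow-up sketch (not part of the original answer), each of those dicts could then be passed to the SDK's write call shown in the question:
for row_dict in results:
    conn.write("tablename", row_dict)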
I would like to get a list that is a field in a JSON document, which is stored as a record in a MySQL table. I am using the following query:
cursor = self.database.cursor()
sql = """ SELECT result->>"$.my_list" FROM my_table
WHERE my_id = 5 ORDER BY date ASC """
cursor.execute(sql)
result = cursor.fetchall()
Using '->' the result is in the string format: '[a,b,c,d], [a,b,c,d]' or using '->>' the result is binary: b'[a,b,c,d]', b'[a,b,c,d]'.
How can I convert it (or get directly) a normal python list object?
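A minimal sketch of the conversion, assuming the stored value is a valid JSON array; decoding is only needed if the driver returns bytes, and my_lists is just an illustrative name:
import json

result = cursor.fetchall()
my_lists = []
for (value,) in result:
    if isinstance(value, (bytes, bytearray)):
        value = value.decode()          # b'[...]' -> '[...]'
    my_lists.append(json.loads(value))  # '[...]' -> Python list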
New to Python, trying to use psycopg2 to read Postgres.
I am reading from a database table called deployment and trying to handle the Value from a table with three fields: id, Key and Value.
import psycopg2
conn = psycopg2.connect(host="localhost",database=database, user=user, password=password)
cur = conn.cursor()
cur.execute("SELECT \"Value\" FROM deployment WHERE (\"Key\" = 'DUMPLOCATION')")
records = cur.fetchall()
print(json.dumps(records))
[["newdrive"]]
I want this to be just "newdrive" so that I can do a string comparison in the next line to check whether it's "newdrive" or not.
I tried json.loads on the json.dumps output, but it didn't work:
>>> a=json.loads(json.dumps(records))
>>> print(a)
[['newdrive']]
I also tried to print just the records without json.dumps:
>>> print(records)
[('newdrive',)]
The result of fetchall() is a sequence of tuples. You can loop over the sequence and print the first (index 0) element of each tuple:
cur.execute("SELECT \"Value\" FROM deployment WHERE (\"Key\" = 'DUMPLOCATION')")
records = cur.fetchall()
for record in records:
    print(record[0])
Or, simpler, if you are sure the query returns no more than one row, use fetchone(), which gives a single tuple representing the returned row, e.g.:
cur.execute("SELECT \"Value\" FROM deployment WHERE (\"Key\" = 'DUMPLOCATION')")
row = cur.fetchone()
if row:  # check whether the query returned a row
    print(row[0])
How do I insert a Python dictionary into a PostgreSQL table? I keep getting the following error, so my query is not formatted correctly:
Error syntax error at or near "To" LINE 1: INSERT INTO bill_summary VALUES(To designate the facility of...
import psycopg2
import json
import psycopg2.extras
import sys
with open('data.json', 'r') as f:
    data = json.load(f)

con = None

try:
    con = psycopg2.connect(database='sanctionsdb', user='dbuser')
    cur = con.cursor(cursor_factory=psycopg2.extras.DictCursor)
    cur.execute("CREATE TABLE bill_summary(title VARCHAR PRIMARY KEY, summary_text VARCHAR, action_date VARCHAR, action_desc VARCHAR)")
    for d in data:
        action_date = d['action-date']
        title = d['title']
        summary_text = d['summary-text']
        action_date = d['action-date']
        action_desc = d['action-desc']
        q = "INSERT INTO bill_summary VALUES(" +str(title)+str(summary_text)+str(action_date)+str(action_desc)+")"
        cur.execute(q)
    con.commit()
except psycopg2.DatabaseError, e:
    if con:
        con.rollback()
    print 'Error %s' % e
    sys.exit(1)
finally:
    if con:
        con.close()
You should use the dictionary as the second parameter to cursor.execute(). See the example code after this statement in the documentation:
Named arguments are supported too using %(name)s placeholders in the query and specifying the values into a mapping.
So your code may be as simple as this:
with open('data.json', 'r') as f:
    data = json.load(f)

print(data)
""" above prints something like this:
{'title': 'the first action', 'summary-text': 'some summary', 'action-date': '2018-08-08', 'action-desc': 'action description'}
use the json keys as named parameters:
"""
cur = con.cursor()
q = "INSERT INTO bill_summary VALUES(%(title)s, %(summary-text)s, %(action-date)s, %(action-desc)s)"
cur.execute(q, data)
con.commit()
Note also this warning (from the same page of the documentation):
Warning: Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
q = "INSERT INTO bill_summary VALUES(" +str(title)+str(summary_text)+str(action_date)+str(action_desc)+")"
You're writing your query in the wrong way: by concatenating the values directly, you lose the comma separators between them. They should rather be comma-separated elements, like this:
q = "INSERT INTO bill_summary VALUES({0},{1},{2},{3})".format(str(title), str(summary_text), str(action_date), str(action_desc))
Since you're not specifying the column names, I suppose they are in the same order as you have written the values in your insert query. There are basically two ways of writing an insert query in PostgreSQL. One is by specifying the column names and their corresponding values, like this:
INSERT INTO TABLE_NAME (column1, column2, column3,...columnN)
VALUES (value1, value2, value3,...valueN);
The other way is to not specify the column names in the SQL query at all, if you are adding values for all of the columns of the table. However, make sure the values are in the same order as the columns in the table. This is what you have used in your query, like this:
INSERT INTO TABLE_NAME VALUES (value1,value2,value3,...valueN);
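As a sketch (not from the original answer), the first form can be combined with query parameters, which also follows the warning quoted above; the column names are taken from the CREATE TABLE statement in the question:
q = ("INSERT INTO bill_summary (title, summary_text, action_date, action_desc) "
     "VALUES (%s, %s, %s, %s)")
cur.execute(q, (title, summary_text, action_date, action_desc))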