Web2py DAL/built-in select with JSON - Python

I am wondering if the DAL supports select with JSON, or if there is a hack to make it able to select JSON fields. I can do the following:
query = "SELECT count(id) FROM my_table WHERE my_json_colum::json->>'form_id' = '%s';" % (dummy_string)
my_count = db.executesql(query)
return my_count
However, the docs suggest this isn't reliable:
In this case, the return values are not parsed or transformed by the DAL, and the format depends on the specific database driver.
I couldn't find anything in the documentation that suggested support for this. More specifically, when I run the above code it returns just the letter H. Is there a workaround (or better yet a legitimate way to do it that I missed) to get the DAL working with JSON?

The DAL is able to save JSON data in individual fields, but it does not provide a mechanism for querying specific attributes of the JSON data, as that requires special functionality within the RDBMS itself, which is not supported by most databases.
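That said, executesql accepts a placeholders argument, so if you do drop down to raw SQL you can at least let the driver substitute the value safely instead of formatting the string yourself. A rough sketch, assuming PostgreSQL and the table/column names from the question (the DAL still won't parse the JSON for you):
# Rough sketch: raw SQL with driver-side parameter substitution.
# The DAL does not parse the result; executesql returns a list of row tuples.
query = "SELECT count(id) FROM my_table WHERE my_json_colum::json->>'form_id' = %s;"
rows = db.executesql(query, placeholders=[dummy_string])
my_count = rows[0][0]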

Related

How to pass a bytestring to PyGreSQL?

I've got a PostgreSQL table with a column of type bytea. Porting that table from SQLite, I ran into an issue - I couldn't figure out how to pass raw binary data to an SQL query. The framework I use is PyGreSQL. I want to stick to the DB-API 2.0 interface to avoid a lot of conversion.
That interface, unlike the classic PostgreSQL one (dollar-sign parameters) and SQLite (question-mark parameters), requires specifying the type, using %-style formatting like old-style Python string formatting.
The data I want to pass is a PNG file, read in binary mode using the 'rb' flag of the open() function.
The query code looks like this:
db = pgdb.connect(args)
c = db.cursor()
c.execute('INSERT INTO tbl VALUES (%b)', (b'test_bytes',))
This gives the error unsupported format character 'b' (0x62) at index 54 and the formatting never happens. What could be done to solve this issue?
Not really a solution, but a good-enough workaround: I decided to use psycopg2 instead of PyGreSQL, as it is better supported and more widely used, so it's easier to find information about any common problems.
There, the solution is to use %s for every type, which I find much more Pythonic (no need to think about types in a dynamically typed language).
So, my query would look like this:
c.execute('INSERT INTO tbl VALUES (%s)', (b'test_bytes',))
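For completeness, a minimal sketch of the full flow with psycopg2 (the connection string, file name, and table are assumptions); psycopg2 adapts Python bytes objects to bytea automatically, so %s works for binary data too:
import psycopg2

conn = psycopg2.connect("dbname=test user=test")  # hypothetical connection parameters
cur = conn.cursor()
with open('image.png', 'rb') as f:  # hypothetical PNG file, read as raw bytes
    png_bytes = f.read()
# psycopg2 adapts bytes to bytea; no type-specific placeholder is needed.
cur.execute('INSERT INTO tbl VALUES (%s)', (png_bytes,))
conn.commit()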

Peewee execute_sql with escaped characters

I have written a query which has some string replacements. I am trying to update a URL in a table, but the URL has % signs in it, which causes a tuple index out of range exception.
If I print the query and run it manually it works fine, but running it through peewee causes an issue. How can I get around this? I'm guessing it's because of the percentage signs?
query = """
update table
set url = '%s'
where id = 1
""" % 'www.example.com?colour=Black%26white'
db.execute_sql(query)
The code you are currently sharing is incredibly unsafe, probably for the same reason as is causing your bug. Please do not use it in production, or you will be hacked.
Generally: you practically never want to use normal string operations like %, +, or .format() to construct a SQL query. Rather, you should use your SQL API/ORM's specific built-in methods for providing dynamic values for a query. In your case of SQLite in peewee, that looks like this:
query = """
update table
set url = ?
where id = 1
"""
values = ('www.example.com?colour=Black%26white',)
db.execute_sql(query, values)
The database engine will automatically take care of any special characters in your data, so you don't need to worry about them. If you ever find yourself encountering issues with special characters in your data, it is a very strong warning sign that some kind of security issue exists.
This is mentioned in the Security and SQL Injection section of peewee's docs.
Why write raw SQL at all? Peewee supports updates directly:
Table.update(url=new_url).where(Table.id == some_id).execute()
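Spelled out as a full model, this might look like the following sketch (the Page model, field names, and database file are assumptions standing in for your actual schema):
from peewee import Model, CharField, SqliteDatabase

db = SqliteDatabase('app.db')  # hypothetical database file

class Page(Model):  # hypothetical model standing in for "table"
    url = CharField()

    class Meta:
        database = db

new_url = 'www.example.com?colour=Black%26white'
# Peewee parameterizes the value, so the '%' characters need no escaping.
Page.update(url=new_url).where(Page.id == 1).execute()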

Entire JSON into One SQLite Field with Python

I have what is likely an easy question. I'm trying to pull a JSON from an online source, and store it in a SQLite table. In addition to storing the data in a rich table, corresponding to the many fields in the JSON, I would like to also just dump the entire JSON into a table every time it is pulled.
The table looks like:
CREATE TABLE Raw_JSONs (ID INTEGER PRIMARY KEY ASC, T DATE DEFAULT (datetime('now','localtime')), JSON text);
I've pulled a JSON from some URL using the following python code:
from pyquery import PyQuery
from lxml import etree
import urllib
x = PyQuery(url='json')
y = x('p').text()
Now, I'd like to execute the following INSERT command:
import sqlite3
db = sqlite3.connect('a.db')
c = db.cursor()
c.execute("insert into Raw_JSONs values(NULL,DATETIME('now'),?)", y)
But I'm told that I've supplied the incorrect number of bindings (i.e. thousands, instead of just 1). I gather it's reading the y variable as all the different elements of the JSON.
Can someone help me store just the JSON, in its entirety?
Also, as I'm obviously new to this JSON game, any online resources to recommend would be amazing.
Thanks!
.execute() expects a sequence, better give it a one-element tuple:
c.execute("insert into Raw_JSONs values(NULL,DATETIME('now'),?)", (y,))
A Python string is a sequence too, one of individual characters. So the .execute() call tried to treat each separate character as a parameter for your query, and unless your string is exactly one character long it won't provide the right number of parameters.
Don't forget to commit your inserts:
db.commit()
or use the database connection as a context manager:
with db:
    # inserts executed here will automatically commit if no exceptions are raised.
You may also be interested in the built-in sqlite3 module's adapters; these can convert any Python object to an SQLite column and back. See the adapters section of the standard documentation.
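A minimal sketch of those hooks, assuming we want to map Python dicts to a JSON text column and back (the table and column names here are made up for illustration):
import json
import sqlite3

sqlite3.register_adapter(dict, json.dumps)      # dict -> TEXT on the way in
sqlite3.register_converter("json", json.loads)  # TEXT -> dict on the way out

db = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_DECLTYPES)
db.execute("create table docs (id integer primary key, doc json)")
with db:
    db.execute("insert into docs (doc) values (?)", ({'form_id': 42},))
print(db.execute("select doc from docs").fetchone()[0])  # -> {'form_id': 42}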

Python Database update error

Usually I use the Django ORM for database-related queries in Python, but now I am using Python directly.
I am trying to update a row of my MySQL database:
query ='UPDATE callerdetail SET upload="{0}" WHERE agent="{1}" AND custid="{2}"AND screenname="{3}" AND status="1"'.format(get.uploaded,get.agent,get.custid,get.screenname)
But I am getting the error:
query ='UPDATE callerdetail SET upload="{0}" WHERE agent="{1}" AND custid="{2}"AND screenname="{3}" AND status="1"'.format(get.uploaded,get.agent,get.custid,get.screenname)
AttributeError: 'C' object has no attribute 'uploaded'
Please help me understand what is wrong with my query.
get is probably mapped to a C object. Try renaming your get object to something else.
Here is a list of reserved words. I don't see get in there, but it sounds like it could be part of a C library that's being included. If you're including something with from x import *, you could be importing it without knowing.
In short - get probably isn't what you think it is.
However, before you go much further building SQL queries with string formatting, I strongly advise against it! Search for "SQL injection" and you'll see why. Python DB-API compliant libraries use "placeholders", which the library uses to insert the variables into a query for you, providing any necessary escaping/quoting.
So instead of:
query ='UPDATE callerdetail SET upload="{0}" WHERE agent="{1}" AND custid="{2}"AND screenname="{3}" AND status="1"'.format(get.uploaded,get.agent,get.custid,get.screenname)
An example using SQLite3 (using ? as a placeholder - others use %s or :1 or %(name)s - or any/all of the above - but that'll be detailed in the docs of your library):
query = "update callerdetail set upload=? where agent=? and custid=? and screename=? and status=?"
Then when it comes to execute the query, you provide the values to be substituted as a separate argument:
cursor.execute(query, (get.uploaded, get.agent, get.custid, get.screenname))
If you really wanted, you could have a convenience function, and reduce this to:
from operator import attrgetter
get_fields = attrgetter('uploaded', 'agent', 'custid', 'screenname')
cursor.execute(query, get_fields(get))
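Since the question targets MySQL, here is a rough equivalent using a MySQL DB-API driver (pymysql here is an assumption; MySQLdb works the same way), whose placeholder style happens to be %s:
import pymysql

# Hypothetical connection parameters; adjust for your setup.
conn = pymysql.connect(host='localhost', user='user', password='secret', db='mydb')
cursor = conn.cursor()
cursor.execute(
    "UPDATE callerdetail SET upload=%s "
    "WHERE agent=%s AND custid=%s AND screenname=%s AND status=1",
    (get.uploaded, get.agent, get.custid, get.screenname),  # "get" is the object from the question
)
conn.commit()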

Can I pickle a python dictionary into a sqlite3 text field?

Any gotchas I should be aware of? Can I store it in a text field, or do I need to use a blob?
(I'm not overly familiar with either pickle or sqlite, so I wanted to make sure I'm barking up the right tree with some of my high-level design ideas.)
I needed to achieve the same thing too.
It turns out it caused me quite a headache before I finally figured out, thanks to this post, how to actually make it work in a binary format.
To insert/update:
pdata = cPickle.dumps(data, cPickle.HIGHEST_PROTOCOL)
curr.execute("insert into table (data) values (:data)", {"data": sqlite3.Binary(pdata)})
You must specify the second argument to dumps to force a binary pickling.
Also note the sqlite3.Binary to make it fit in the BLOB field.
To retrieve data:
curr.execute("select data from table limit 1")
for row in curr:
    data = cPickle.loads(str(row['data']))
When retrieving a BLOB field, sqlite3 returns a 'buffer' Python type, which needs to be stringified using str before being passed to the loads method.
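The above is Python 2; a rough Python 3 sketch of the same idea (cPickle and buffer no longer exist there, and BLOBs come back as bytes):
import pickle
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("create table snapshots (data blob)")  # hypothetical table
pdata = pickle.dumps({'a': 1}, pickle.HIGHEST_PROTOCOL)
conn.execute("insert into snapshots (data) values (?)", (pdata,))
row = conn.execute("select data from snapshots").fetchone()
data = pickle.loads(row[0])  # bytes in Python 3, so no str() step is needed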
If you want to store a pickled object, you'll need to use a blob, since it is binary data. However, you can, say, base64 encode the pickled object to get a string that can be stored in a text field.
Generally, though, doing this sort of thing is indicative of bad design; since you're storing opaque data, you lose the ability to use SQL to do any useful manipulation on that data. Although without knowing what you're actually doing, I can't really make a moral call on it.
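A rough sketch of that base64 route, if a TEXT column is a hard requirement:
import base64
import pickle

blob = pickle.dumps({'a': 1}, pickle.HIGHEST_PROTOCOL)
text_for_db = base64.b64encode(blob).decode('ascii')  # safe to store in a TEXT field
restored = pickle.loads(base64.b64decode(text_for_db))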
I wrote a blog about this idea, except instead of a pickle, I used json, since I wanted it to be interoperable with perl and other programs.
http://writeonly.wordpress.com/2008/12/05/simple-object-db-using-json-and-python-sqlite/
Architecturally, this is a quick and dirty way to get persistence, transactions, and the like for arbitrary data structures. I have found this combination to be really useful when I want persistence, and don't need to do much in the sql layer with the data (or it's very complex to deal with in sql, and simple with generators).
The code itself is pretty simple:
import sqlite3
import cPickle

# register the "loader" to get the data back out
# (the connection must be opened with detect_types=sqlite3.PARSE_DECLTYPES
# for converters to be applied to declared column types).
sqlite3.register_converter("pickle", cPickle.loads)
Then, when you want to dump it into the db,
p_string = cPickle.dumps(dict(a=1, b=[1, 2, 3]))
conn.execute('''
    create table snapshot(
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        mydata pickle);
''')
conn.execute('''
    insert into snapshot values
    (null, ?)''', (p_string,))
Pickle has both text and binary output formats. If you use the text-based format you can store it in a TEXT field, but it'll have to be a BLOB if you use the (more efficient) binary format.
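For example, pickle protocol 0 produces ASCII output that can live in a TEXT column, while the higher protocols are binary and belong in a BLOB (Python 3 syntax in this sketch):
import pickle

text_pickle = pickle.dumps({'a': 1}, protocol=0)                 # ASCII-only output
binary_pickle = pickle.dumps({'a': 1}, pickle.HIGHEST_PROTOCOL)  # compact binary output
text_pickle.decode('ascii')  # works: protocol 0 stays within ASCII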
I have to agree with some of the comments here. Be careful and make sure you really want to save pickle data in a db; there's probably a better way.
In any case, I had trouble in the past trying to save binary data in the sqlite db.
Apparently you have to use sqlite3.Binary() to prep the data for sqlite.
Here's some sample code:
query = u'''insert into testtable VALUES(?)'''
b = sqlite3.Binary(binarydata)
cur.execute(query,(b,))
con.commit()
Since pickle can dump your object graph to a string, it should be possible.
Be aware, though, that TEXT fields in SQLite use the database encoding, so you might need to convert it to a simple string before you un-pickle.
If a dictionary can be pickled, it can be stored in a text/blob field as well.
Just be aware of dictionaries that can't be pickled (i.e. that contain unpicklable objects).
Yes, you can store a pickled object in a TEXT or BLOB field in an SQLite3 database, as others have explained.
Just be aware that some objects cannot be pickled. The built-in container types can (dict, set, list, tuple, etc.). But some objects, such as file handles, refer to state that is external to their own data structures, and other extension types have similar problems.
Since a dictionary can contain arbitrary nested data structures, it might not be pickle-able.
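A throwaway sketch to see that caveat in action:
import pickle

try:
    pickle.dumps({'log': open('example.txt', 'w')})  # file handles are not picklable
except TypeError as err:
    print(err)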
SpoonMeiser is correct, you need to have a strong reason to pickle into a database.
It's not difficult to write Python objects that implement persistence with SQLite. Then you can use the SQLite CLI to fiddle with the data as well, which in my experience is worth the extra bit of work, since many debug and admin functions can simply be performed from the CLI rather than by writing specific Python code.
In the early stages of a project, I did what you propose and ended up re-writing with a Python class for each business object (note: I didn't say for each table!) This way the body of the application can focus on "what" needs to be done rather than "how" it is done.
The other option, considering that your requirement is to save a dict and then spit it back out for the user's "viewing pleasure", is to use the shelve module, which will let you persist any picklable data to a file. See the shelve documentation in the Python standard library.
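A minimal sketch of the shelve approach (the filename is arbitrary):
import shelve

with shelve.open('snapshots') as store:  # creates a pickle-backed file on disk
    store['latest'] = {'a': 1, 'b': [1, 2, 3]}

with shelve.open('snapshots') as store:
    print(store['latest'])  # -> {'a': 1, 'b': [1, 2, 3]}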
Depending on what you're working on, you might want to look into the shove module. It does something similar, where it auto-stores Python objects inside a sqlite database (and all sorts of other options) and pretends to be a dictionary (just like the shelve module).
It is possible to store object data as a pickle dump, JSON, etc., but it is also possible to index them, restrict them, and run select queries that use those indices. Here is an example with tuples, which can easily be applied to any other Python class. Everything needed is explained in the Python sqlite3 documentation (somebody already posted the link). Anyway, here it is all put together in the following example:
import sqlite3
import pickle

def adapt_tuple(tuple):
    return pickle.dumps(tuple)

# cannot use pickle.dumps directly because of inadequate argument signature
sqlite3.register_adapter(tuple, adapt_tuple)
sqlite3.register_converter("tuple", pickle.loads)

def collate_tuple(string1, string2):
    return cmp(pickle.loads(string1), pickle.loads(string2))

#########################
# 1) Using declared types
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
con.create_collation("cmptuple", collate_tuple)

cur = con.cursor()
cur.execute("create table test(p tuple unique collate cmptuple) ")
cur.execute("create index tuple_collated_index on test(p collate cmptuple)")

cur.execute("select name, type from sqlite_master")  # where type = 'table'")
print(cur.fetchall())

p = (1, 2, 3)
p1 = (1, 2)

cur.execute("insert into test(p) values (?)", (p,))
cur.execute("insert into test(p) values (?)", (p1,))
cur.execute("insert into test(p) values (?)", ((10, 1),))
cur.execute("insert into test(p) values (?)", (tuple((9, 33)),))
cur.execute("insert into test(p) values (?)", (((9, 5), 33),))

try:
    cur.execute("insert into test(p) values (?)", (tuple((9, 33)),))
except Exception as e:
    print e

cur.execute("select p from test order by p")
print "\nwith declared types and default collate on column:"
for raw in cur:
    print raw

cur.execute("select p from test order by p collate cmptuple")
print "\nwith declared types collate:"
for raw in cur:
    print raw

con.create_function('pycmp', 2, cmp)

print "\nselect greater than using cmp function:"
cur.execute("select p from test where pycmp(p,?) >= 0", ((10,),))
for raw in cur:
    print raw

cur.execute("explain query plan select p from test where p > ?", ((3,),))
for raw in cur:
    print raw

print "\nselect greater than using collate:"
cur.execute("select p from test where p > ?", ((10,),))
for raw in cur:
    print raw

cur.execute("explain query plan select p from test where p > ?", ((3,),))
for raw in cur:
    print raw

cur.close()
con.close()
Many applications use sqlite3 as a backend for SQLAlchemy so, naturally, this question can be asked in the SQLAlchemy framework as well (which is how I came across this question).
To do this, define the column in which the pickled data will be stored as a PickleType column. The implementation is pretty straightforward:
from sqlalchemy import Column, Integer, PickleType, create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
import numpy as np

Base = declarative_base()

class User(Base):
    __tablename__ = 'Users'
    id = Column(Integer, primary_key=True)
    user_login_data_array = Column(PickleType)

login_information = {'User1': {'Times': np.arange(0, 20),
                               'IP': ['123.901.12.189', '123.441.49.391']}}

engine = create_engine('sqlite:///:memory:', echo=False)
Base.metadata.create_all(engine)
Session_maker = sessionmaker(bind=engine)
Session = Session_maker()

# No manual pickling is needed: the column "user_login_data_array" was defined
# as PickleType, so SQLAlchemy pickles the dict going in and unpickles it coming out.
user_object_to_add = User(user_login_data_array=login_information)
Session.add(user_object_to_add)
Session.commit()
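Reading it back is an ordinary query; the PickleType column is unpickled automatically (continuing the sketch above):
stored_user = Session.query(User).first()
print(stored_user.user_login_data_array['User1']['IP'])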
(I'm not claiming that pickle is best suited for this example, given the issues others have noted.)
See this solution at SourceForge:
y_serial.py module :: warehouse Python objects with SQLite
"Serialization + persistance :: in a few lines of code, compress and annotate Python objects into SQLite; then later retrieve them chronologically by keywords without any SQL. Most useful "standard" module for a database to store schema-less data."
http://yserial.sourceforge.net
