Entire JSON into One SQLite Field with Python

Entire JSON into One SQLite Field with Python - python

I have what is likely an easy question. I'm trying to pull a JSON from an online source, and store it in a SQLite table. In addition to storing the data in a rich table, corresponding to the many fields in the JSON, I would like to also just dump the entire JSON into a table every time it is pulled.
The table looks like:
CREATE TABLE Raw_JSONs (ID INTEGER PRIMARY KEY ASC, T DATE DEFAULT (datetime('now','localtime')), JSON text);
I've pulled a JSON from some URL using the following python code:
from pyquery import PyQuery
from lxml import etree
import urllib
x = PyQuery(url='json')
y = x('p').text()
Now, I'd like to execute the following INSERT command:
import sqlite3
db = sqlite3.connect('a.db')
c = db.cursor()
c.execute("insert into Raw_JSONs values(NULL,DATETIME('now'),?)", y)
But I'm told that I've supplied the incorrect number bindings (i.e. thousands, instead of just 1). I gather it's reading the y variable as all the different elements of the JSON.
Can someone help me store just the JSON, in it's entirety?
Also, as I'm obviously new to this JSON game, any online resources to recommend would be amazing.
Thanks!

.execute() expects a sequence, better give it a one-element tuple:
c.execute("insert into Raw_JSONs values(NULL,DATETIME('now'),?)", (y,))
A Python string is a sequence too, one of individual characters. So the .execute() call tried to treat each separate character as a parameter for your query, and unless your string is one character short that means it'll not provide the right number of parameters.
Don't forget to commit your inserts:
db.commit()
or use the database connection as a context manager:
with db:
# inserts executed here will automatically commit if no exceptions are raised.

You may also be interested to know about the built in sqlite modules adapters. These can convert any python object to an sqlite column both ways. See the standard documentation and the adapters section.

Related

Web2py DAL/built in select with JSON

I am wondering if the DAL supports select with JSON, or if there is a hack to make it able to select JSON fields. I can do the following:
SELECT count(id) FROM my_table WHERE my_json_colum::json->>'form_id' = '%s';" % (dummy_string)
my_count = db.executesql(query)
return my_count
However, the docs suggest this isn't reliabe:
In this case, the return values are not parsed or transformed by the DAL, and the format depends on the specific database driver.
I couldn't find anything in the documentation that suggested support for this. More specifically, when I run the above code it returns just the letter H. Is there a workaround (or better yet a legitimate way to do it that I missed) to get the DAL working with JSON?

The DAL is able to save JSON data in individual fields, but it does not provide a mechanism for querying specific attributes of the JSON data, as that requires special functionality within the RDBMS itself, which is not supported by most databases.

Peewee execute_sql with escaped characters

I have wrote a query which has some string replacements. I am trying to update a url in a table but the url has % signs in which causes a tuple index out of range exception.
If I print the query and run in manually it works fine but through peewee causes an issue. How can I get round this? I'm guessing this is because the percentage signs?
query = """
update table
set url = '%s'
where id = 1
""" % 'www.example.com?colour=Black%26white'
db.execute_sql(query)

The code you are currently sharing is incredibly unsafe, probably for the same reason as is causing your bug. Please do not use it in production, or you will be hacked.
Generally: you practically never want to use normal string operations like %, +, or .format() to construct a SQL query. Rather, you should to use your SQL API/ORM's specific built-in methods for providing dynamic values for a query. In your case of SQLite in peewee, that looks like this:
query = """
update table
set url = ?
where id = 1
"""
values = ('www.example.com?colour=Black%26white',)
db.execute_sql(query, values)
The database engine will automatically take care of any special characters in your data, so you don't need to worry about them. If you ever find yourself encountering issues with special characters in your data, it is a very strong warning sign that some kind of security issue exists.
This is mentioned in the Security and SQL Injection section of peewee's docs.

Wtf are you doing? Peewee supports updates.
Table.update(url=new_url).where(Table.id == some_id).execute()

How to select all data in PyMongo?

I want to select all data or select with conditional in table random but I can't find any guide in MongoDB in Python to do this.
And I can't show all data was select.
Here my code:
def mongoSelectStatement(result_queue):
client = MongoClient('mongodb://localhost:27017')
db = client.random
cursor = db.random.find({"gia_tri": "0.5748676522161966"})
# cursor = db.random.find()
inserted_documents_count = cursor.count()
for document in cursor:
result_queue.put(document)

There is a quite comprehensive documentation for mongodb. For python (Pymongo) here is the URL: https://api.mongodb.org/python/current/
Note: Consider the version you are running. Since the latest version has new features and functions.
To verify pymongo version you are using execute the following:
import pymongo
pymongo.version
Now. Regarding the select query you asked for. As far as I can tell the code you presented is fine. Here is the select structure in mongodb.
First off it is called find().
In pymongo; if you want to select specific rows( not really rows in mongodb they are called documents. I am saying rows to make it easy to understand. I am assuming you are comparing mongodb to SQL); alright so If you want to select specific document from the table (called collection in mongodb) use the following structure (I will use random as collection name; also assuming that the random table has the following attributes: age:10, type:ninja, class:black, level:1903):
db.random.find({ "age":"10" }) This will return all documents that have age 10 in them.
you could add more conditions simply by separating with commas
db.random.find({ "age":"10", "type":"ninja" }) This will select all data with age 10 and type ninja.
if you want to get all data just leave empty as:
db.random.find({})
Now the previous examples display everything (age, type, class, level and _id). If you want to display specific attributes say only the age you will have to add another argument to find called projection eg: (1 is show, 0 is do not show):
{'age':1}
Note here that this returns age as well as _id. _id is always returned by default. You have to explicitly tell it not to returning it as:
db.random.find({ "age":"10", "name":"ninja" }, {"age":1, "_id":0} )
I hope that could get you started.
Take a look at the documentation is very thorough.

MySQL LOAD DATA LOCAL INFILE example in python?

I am looking for a syntax definition, example, sample code, wiki, etc. for
executing a LOAD DATA LOCAL INFILE command from python.
I believe I can use mysqlimport as well if that is available, so any feedback (and code snippet) on which is the better route, is welcome. A Google search is not turning up much in the way of current info
The goal in either case is the same: Automate loading hundreds of files with a known naming convention & date structure, into a single MySQL table.
David

Well, using python's MySQLdb, I use this:
connection = MySQLdb.Connect(host='**', user='**', passwd='**', db='**')
cursor = connection.cursor()
query = "LOAD DATA INFILE '/path/to/my/file' INTO TABLE sometable FIELDS TERMINATED BY ';' ENCLOSED BY '\"' ESCAPED BY '\\\\'"
cursor.execute( query )
connection.commit()
replacing the host/user/passwd/db as appropriate for your needs. This is based on the MySQL docs here, The exact LOAD DATA INFILE statement would depend on your specific requirements etc (note the FIELDS TERMINATED BY, ENCLOSED BY, and ESCAPED BY statements will be specific to the type of file you are trying to read in).

You can also get the results for the import by adding the following lines after your query:
results = connection.info()

Can I pickle a python dictionary into a sqlite3 text field?

Any gotchas I should be aware of? Can I store it in a text field, or do I need to use a blob?
(I'm not overly familiar with either pickle or sqlite, so I wanted to make sure I'm barking up the right tree with some of my high-level design ideas.)

I needed to achieve the same thing too.
I turns out it caused me quite a headache before I finally figured out, thanks to this post, how to actually make it work in a binary format.
To insert/update:
pdata = cPickle.dumps(data, cPickle.HIGHEST_PROTOCOL)
curr.execute("insert into table (data) values (:data)", sqlite3.Binary(pdata))
You must specify the second argument to dumps to force a binary pickling.
Also note the sqlite3.Binary to make it fit in the BLOB field.
To retrieve data:
curr.execute("select data from table limit 1")
for row in curr:
data = cPickle.loads(str(row['data']))
When retrieving a BLOB field, sqlite3 gets a 'buffer' python type, that needs to be strinyfied using str before being passed to the loads method.

If you want to store a pickled object, you'll need to use a blob, since it is binary data. However, you can, say, base64 encode the pickled object to get a string that can be stored in a text field.
Generally, though, doing this sort of thing is indicative of bad design, since you're storing opaque data you lose the ability to use SQL to do any useful manipulation on that data. Although without knowing what you're actually doing, I can't really make a moral call on it.

I wrote a blog about this idea, except instead of a pickle, I used json, since I wanted it to be interoperable with perl and other programs.
http://writeonly.wordpress.com/2008/12/05/simple-object-db-using-json-and-python-sqlite/
Architecturally, this is a quick and dirty way to get persistence, transactions, and the like for arbitrary data structures. I have found this combination to be really useful when I want persistence, and don't need to do much in the sql layer with the data (or it's very complex to deal with in sql, and simple with generators).
The code itself is pretty simple:
# register the "loader" to get the data back out.
sqlite3.register_converter("pickle", cPickle.loads)
Then, when you want to dump it into the db,
p_string = p.dumps( dict(a=1,b=[1,2,3]))
conn.execute('''
create table snapshot(
id INTEGER PRIMARY KEY AUTOINCREMENT,
mydata pickle);
''')
conn.execute('''
insert into snapshot values
(null, ?)''', (p_string,))
''')

Pickle has both text and binary output formats. If you use the text-based format you can store it in a TEXT field, but it'll have to be a BLOB if you use the (more efficient) binary format.

I have to agree with some of the comments here. Be careful and make sure you really want to save pickle data in a db, there's probably a better way.
In any case I had trouble in the past trying to save binary data in the sqlite db.
Apparently you have to use the sqlite3.Binary() to prep the data for sqlite.
Here's some sample code:
query = u'''insert into testtable VALUES(?)'''
b = sqlite3.Binary(binarydata)
cur.execute(query,(b,))
con.commit()

Since Pickle can dump your object graph to a string it should be possible.
Be aware though that TEXT fields in SQLite uses database encoding so you might need to convert it to a simple string before you un-pickle.

If a dictionary can be pickled, it can be stored in text/blob field as well.
Just be aware of the dictionaries that can't be pickled (aka that contain unpickable objects).

Yes, you can store a pickled object in a TEXT or BLOB field in an SQLite3 database, as others have explained.
Just be aware that some object cannot be pickled. The built-in container types can (dict, set, list, tuple, etc.). But some objects, such as file handles, refer to state that is external to their own data structures, and other extension types have similar problems.
Since a dictionary can contain arbitrary nested data structures, it might not be pickle-able.

SpoonMeiser is correct, you need to have a strong reason to pickle into a database.
It's not difficult to write Python objects that implement persistence with SQLite. Then you can use the SQLite CLI to fiddle with the data as well. Which in my experience is worth the extra bit of work, since many debug and admin functions can be simply performed from the CLI rather than writing specific Python code.
In the early stages of a project, I did what you propose and ended up re-writing with a Python class for each business object (note: I didn't say for each table!) This way the body of the application can focus on "what" needs to be done rather than "how" it is done.

The other option, considering that your requirement is to save a dict and then spit it back out for the user's "viewing pleasure", is to use the shelve module which will let you persist any pickleable data to file. The python docs are here.

Depending on what you're working on, you might want to look into the shove module. It does something similar, where it auto-stores Python objects inside a sqlite database (and all sorts of other options) and pretends to be a dictionary (just like the shelve module).

It is possible to store object data as pickle dump, jason etc but it is also possible to index, them, restrict them and run select queries that use those indices. Here is example with tuples, that can be easily applied for any other python class. All that is needed is explained in python sqlite3 documentation (somebody already posted the link). Anyway here it is all put together in the following example:
import sqlite3
import pickle
def adapt_tuple(tuple):
return pickle.dumps(tuple)
sqlite3.register_adapter(tuple, adapt_tuple) #cannot use pickle.dumps directly because of inadequate argument signature
sqlite3.register_converter("tuple", pickle.loads)
def collate_tuple(string1, string2):
return cmp(pickle.loads(string1), pickle.loads(string2))
#########################
# 1) Using declared types
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
con.create_collation("cmptuple", collate_tuple)
cur = con.cursor()
cur.execute("create table test(p tuple unique collate cmptuple) ")
cur.execute("create index tuple_collated_index on test(p collate cmptuple)")
cur.execute("select name, type from sqlite_master") # where type = 'table'")
print(cur.fetchall())
p = (1,2,3)
p1 = (1,2)
cur.execute("insert into test(p) values (?)", (p,))
cur.execute("insert into test(p) values (?)", (p1,))
cur.execute("insert into test(p) values (?)", ((10, 1),))
cur.execute("insert into test(p) values (?)", (tuple((9, 33)) ,))
cur.execute("insert into test(p) values (?)", (((9, 5), 33) ,))
try:
cur.execute("insert into test(p) values (?)", (tuple((9, 33)) ,))
except Exception as e:
print e
cur.execute("select p from test order by p")
print "\nwith declared types and default collate on column:"
for raw in cur:
print raw
cur.execute("select p from test order by p collate cmptuple")
print "\nwith declared types collate:"
for raw in cur:
print raw
con.create_function('pycmp', 2, cmp)
print "\nselect grater than using cmp function:"
cur.execute("select p from test where pycmp(p,?) >= 0", ((10, ),) )
for raw in cur:
print raw
cur.execute("explain query plan select p from test where p > ?", ((3,)))
for raw in cur:
print raw
print "\nselect grater than using collate:"
cur.execute("select p from test where p > ?", ((10,),) )
for raw in cur:
print raw
cur.execute("explain query plan select p from test where p > ?", ((3,)))
for raw in cur:
print raw
cur.close()
con.close()

Many applications use sqlite3 as a backend for SQLAlchemy so, naturally, this question can be asked in the SQLAlchemy framework as well (which is how I came across this question).
To do this, one will have wanted to define the column in which the pickle data is desired to be stored to store "PickleType" data. The implementation is pretty straightforward:
from sqlalchemy import PickleType, Integer
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import create_engine
import pickle
Base= declarative_base()
class User(Base):
__tablename__= 'Users'
id= Column(Integer, primary_key= True)
user_login_data_array= Column(PickleType)
login_information= {'User1':{'Times': np.arange(0,20),
'IP': ['123.901.12.189','123.441.49.391']}}
engine= create_engine('sqlite:///memory:',echo= False)
Base.metadata.create_all(engine)
Session_maker= sessionmaker(bind=engine)
Session= Session_maker()
# The pickling here is very intuitive! Just need to have
# defined the column "user_login_data_array" to take pickletype data.
pickled_login_data_array= pickle.dumps(login_information)
user_object_to_add= User(user_login_data_array= pickled_login_data_array)
Session.add(user_object_to_add)
Session.commit()
(I'm not claiming that this example would best be suited to use pickle, as others have noted issues with.)

See this solution at SourceForge:
y_serial.py module :: warehouse Python objects with SQLite
"Serialization + persistance :: in a few lines of code, compress and annotate Python objects into SQLite; then later retrieve them chronologically by keywords without any SQL. Most useful "standard" module for a database to store schema-less data."
http://yserial.sourceforge.net

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Entire JSON into One SQLite Field with Python - python

You may also be interested to know about the built in sqlite modules adapters. These can convert any python object to an sqlite column both ways. See the standard documentation and the adapters section.

Related

Web2py DAL/built in select with JSON

Peewee execute_sql with escaped characters

How to select all data in PyMongo?

MySQL LOAD DATA LOCAL INFILE example in python?

Can I pickle a python dictionary into a sqlite3 text field?

Categories

Resources