Python: Insert QByteArray to SQLite - python

I'm trying to store image data (screenshots) in SQLite database.now = int(math.floor(time.time()))
ba = QByteArray()
buff = QBuffer(ba)
image.save(buff, format)
params = (str(ba.data()), "image/%s"%format, now, url)
s_conn = sqlite.connect("cache/screenshots_%s.db"%row['size'])
s_curs = s_conn.cursor()
s_curs.execute("UPDATE screenshots SET data=?, mime=?, created=? WHERE filename=?", params)This code gives me error "TypeError: not all arguments converted during string formatting"
Any manipulation with QByteArray (incl. converting it to Qstring) gives me this error, or ascii to utf-8 conversion error.
I've Googled this issue for about 2 days and every advice was incorrect for me.
How can I work it around?

The biggest issue is that you are trying to store binary as a string by calling str(ba.data). If you do this then it will not be a valid string and will cause endless grief for you later. Behind the scenes SQLite uses Unicode for all strings. However it does not check that a provided string is valid unicode (UTF8/16). Consequently you can insert binary garbage pretending it is a string but when trying to retrieve it will fail dismally since it won't convert to Unicode.
SQLite has a binary type (named BLOB) and that is exactly what you should be using. The way you provide a binary/blob binding is dependent on the SQLite wrapper you are using. It looks like you are using PySQLite or SQLite 3. For Python 2 use buffer and for Python 3 use bytes.
# Python 2
params=( buffer(ba.data()), ...)
# Python 3
params=( bytes(ba.data()), ...)

Related

JSON Parsing with python from Rethink database [Python]

Im trying to retrieve data from a database named RethinkDB, they output JSON when called with r.db("Databasename").table("tablename").insert([{ "id or primary key": line}]).run(), when doing so it outputs [{'id': 'ValueInRowOfid\n'}] and I want to parse that to just the value eg. "ValueInRowOfid". Ive tried with JSON in Python, but I always end up with the typeerror: list indices must be integers or slices, not str, and Ive been told that it is because the Database outputs invalid JSON format. My question is how can a JSON format be invalid (I cant see what is invalid with the output) and also what would be the best way to parse it so that the value "ValueInRowOfid" is left in a Operator eg. Value = ("ValueInRowOfid").
This part imports the modules used and connects to RethinkDB:
import json
from rethinkdb import RethinkDB
r = RethinkDB()
r.connect( "localhost", 28015).repl()
This part is getting the output/value and my trial at parsing it:
getvalue = r.db("Databasename").table("tablename").sample(1).run() # gets a single row/value from the table
print(getvalue) # If I print that, it will show as [{'id': 'ValueInRowOfid\n'}]
dumper = json.dumps(getvalue) # I cant use `json.loads(dumper)` as JSON object must be str. Which the output of the database isnt (The output is a list)
parsevalue = json.loads(dumper) # After `json.dumps(getvalue)` I can now load it, but I cant use the loaded JSON.
print(parsevalue["id"]) # When doing this it now says that the list is a str and it needs to be an integers or slices. Quite frustrating for me as it is opposing it self eg. It first wants str and now it cant use str
print(parsevalue{'id'}) # I also tried to shuffle it around as seen here, but still the same result
I know this is janky and is very hard to comprehend this level of stupidity that I might be on. As I dont know if it is the most simple problem or something that just isnt possible (Which it should or else I cant use my data in the database.)
Thank you for reading this through and not jumping straight into the comments and say that I have to read the JSON documentation, because I have and I havent found a single piece that could help me.
I tried reading the documentation and watching tutorials about JSON and JSON parsing. I also looked for others whom have had the same problems as me and couldnt find.
It looks like it's returning a dictionary ({}) inside a list ([]) of one element.
Try:
getvalue = r.db("Databasename").table("tablename").sample(1).run()
print(getvalue[0]['id'])

Decoding XML object from mssql in Python

I get back an XML object from a mssql server when I call a SP from Python (2.7). I get it in the following form:
{u'XML_F52E2B61-18A1-11d1-B105-00805F49916B': 'D\x02i\x00d\x00D\x05d\x00e\x00s\x00c\x00r\x00D\x0bd\x00a\x00t\x00a\x00t\x00y\x00p\x00e\x00_\x00i\x00d\x00D\x13e\x00n\x00u\x00m\x00e\x00r\x00a\x00t\x00i\x00o\x00n\x00_\x00t\x00y\x00p\x00e\x00_\x00i\x00d\x00D\rs\x00y\x00s\x00t\x00e\x00m\x00f\x00e\x00a\x00t\x00u\x00r\x00e\x00D\x04l\x00i\x00n\x00k\x00D\x07F\x00e\x00a\x00t\x00u\x00r\x00e\x00\x01\x00\x08F\x00e\x00a\x00t\x00u\x00r\x00e\x00S\x00A\x01\x07A\x01\x01A\x03B\x01\x00\x00\x00\x81\x01\x01\x02A\x03\x11\x1a\x00r\x00e\x00s\x00p\x00o\x00n\x00d\x00e\x00n\x00t\x00_\x00i\x00d\x00\x81\x02\x01\x03A\x03B\x01\x00\x00\x00\x81\x03\x01\x05A\x03F\x01\x81\x05\x01\x06A\x03F\x00\x81\x06\x81\x07\x01\x07A\x01\x01A\x03B\x02\x00\x00\x00\x81\x01\x01\x02A\x03\x11 \x00W\x00o\x00r\x00k\x00s\x00 \x00a\x00t\x00 \x00c\x00o\x00m\x00p\x00a\x00n\x00y\x00\x81\x02\x01\x03A\x03B\x01\x00\x00\x00\x81\x03\x01\x05A\x03F\x01\x81\x05\x01\x06A\x03F\x01\x81\x06\x81\x07\x01\x07A\x01\x01A\x03B\x03\x00\x00\x00\x81\x01\x01\x02A\x03\x11\x0c\x00G\x00e\x00n\x00d\x00e\x00r\x00\x81\x02\x01\x03A\x03B\x08\x00\x00\x00\x81\x03\x01\x04A\x03B\x01\x00\x00\x00\x81\x04\x01\x05A\x03F\x00\x81\x05\x01\x06A\x03F\x00\x81\x06\x81\x07\x81\x00\x08F\x00e\x00a\x00t\x00u\x00r\x00e\x00S\x00'}
I have two questions:
1: What encoding is this?
2: What library should I use to decode this?
Addition:
The XML as it shows in the SQL Management Studio:
The SP:
ALTER PROCEDURE [dbo].[rdb_sql2python]
AS
BEGIN
SET NOCOUNT ON
SELECT * FROM [_rdb].[dbo].[features] FOR XML RAW ('Feature'), ROOT ('FeatureS'), ELEMENTS
SET NOCOUNT OFF
END
I try something like an answer, at least to the question: What is this:
At this JSON-viewer your string as you presented it did not work. But when I removed the "u", replaced the single quotes with double quotes and removed the "D" it worked somehow:
This string
{"XML_F52E2B61-18A1-11d1-B105-00805F49916B":
"\x02i\x00d\x00D\x05d\x00e\x00s\x00c\x00r\x00D\x0bd\x00a\x00t\x00a\x00t\x00y\x00p\x00e\x00_\x00i\x00d\x00D\x13e\x00n\x00u\x00m\x00e\x00r\x00a\x00t\x00i\x00o\x00n\x00_\x00t\x00y\x00p\x00e\x00_\x00i\x00d\x00D\rs\x00y\x00s\x00t\x00e\x00m\x00f\x00e\x00a\x00t\x00u\x00r\x00e\x00D\x04l\x00i\x00n\x00k\x00D\x07F\x00e\x00a\x00t\x00u\x00r\x00e\x00\x01\x00\x08F\x00e\x00a\x00t\x00u\x00r\x00e\x00S\x00A\x01\x07A\x01\x01A\x03B\x01\x00\x00\x00\x81\x01\x01\x02A\x03\x11\x1a\x00r\x00e\x00s\x00p\x00o\x00n\x00d\x00e\x00n\x00t\x00_\x00i\x00d\x00\x81\x02\x01\x03A\x03B\x01\x00\x00\x00\x81\x03\x01\x05A\x03F\x01\x81\x05\x01\x06A\x03F\x00\x81\x06\x81\x07\x01\x07A\x01\x01A\x03B\x02\x00\x00\x00\x81\x01\x01\x02A\x03\x11
\x00W\x00o\x00r\x00k\x00s\x00 \x00a\x00t\x00
\x00c\x00o\x00m\x00p\x00a\x00n\x00y\x00\x81\x02\x01\x03A\x03B\x01\x00\x00\x00\x81\x03\x01\x05A\x03F\x01\x81\x05\x01\x06A\x03F\x01\x81\x06\x81\x07\x01\x07A\x01\x01A\x03B\x03\x00\x00\x00\x81\x01\x01\x02A\x03\x11\x0c\x00G\x00e\x00n\x00d\x00e\x00r\x00\x81\x02\x01\x03A\x03B\x08\x00\x00\x00\x81\x03\x01\x04A\x03B\x01\x00\x00\x00\x81\x04\x01\x05A\x03F\x00\x81\x05\x01\x06A\x03F\x00\x81\x06\x81\x07\x81\x00\x08F\x00e\x00a\x00t\x00u\x00r\x00e\x00S\x00"}
converts to
Name: XML_F52E2B61-18A1-11d1-B105-00805F49916B
Value: "idDdescrDdatatype_idDenumeration_type_idD systemfeatureDlinkDFeatureFeatureSAAABArespondent_idABAFAFAABA Works at companyABAFAFAABAGenderABABAFAFFeatureS"
This is - for sure - not the final solution, but it's clear, that this is BSON encoded JSON.
It might be a good idea to show (the relevant parts of) you(r) SP and the way you are calling this. Might be, that there is a completely different / better approach...

Trying to split a string in Python 3, get 'str' does not support buffer interface

So I'm trying to take data from a website and parse it into an object. The data is separated by vertical bars ("|"). However, when I split my string using .split('|'), I get
TypeError: 'str' does not support the buffer interface
I am still attempting to learn Python. I have done some digging on this error but every example I could find related to sending or receiving data and there was a lot of jargon I was unfamiliar with. One solution said to use .split(b'|'), but then this somehow turned my string into a byte and prevented me from printing it in the final line.
Below is my code. Any help you can offer would be greatly appreciated!
with urllib.request.urlopen('ftp://ftp.nasdaqtrader.com/SymbolDirectory/nasdaqtraded.txt') as response:
html = response.read()
rawStockList = html.splitlines()
stockList = []
for i in rawStockList:
stockInfo = i.split('|')
try:
stockList.append(Stock(stockInfo[1], stockInfo[2], stockInfo[5], stockInfo[10], 0))
except IndexError:
pass
print([str(item) for item in stockList])
The items inside rawStockList is actually of the type byte, as response.read() returns that since it doesn't necessary know the encoding. You need to turn that into a proper string by decoding it with an encoding. Assuming the files are encoded in utf8, you need something like:
for i in rawStockList:
stockInfo = i.decode('utf8').split('|')

Representation of python dictionaries with unicode in database queries

I have a problem that I would like to know how to efficiently tackle.
I have data that is JSON-formatted (used with dumps / loads) and contains unicode.
This is part of a protocol implemented with JSON to send messages. So messages will be sent as strings and then loaded into python dictionaries. This means that the representation, as a python dictionary, afterwards will look something like:
{u"mykey": u"myVal"}
It is no problem in itself for the system to handle such structures, but the thing happens when I'm going to make a database query to store this structure.
I'm using pyOrient towards OrientDB. The command ends up something like:
"CREATE VERTEX TestVertex SET data = {u'mykey': u'myVal'}"
Which will end up in the data field getting the following values in OrientDB:
{'_NOT_PARSED_': '_NOT_PARSED_'}
I'm assuming this problem relates to other cases as well when you wish to make a query or somehow represent a data object containing unicode.
How could I efficiently get a representation of this data, of arbitrary depth, to be able to use it in a query?
To clarify even more, this is the string the db expects:
"CREATE VERTEX TestVertex SET data = {'mykey': 'myVal'}"
If I'm simply stating the wrong problem/question and should handle it some other way, I'm very much open to suggestions. But what I want to achieve is to have an efficient way to use python2.7 to build a db-query towards orientdb (using pyorient) that specifies an arbitrary data structure. The data property being set is of the OrientDB type EMBEDDEDMAP.
Any help greatly appreciated.
EDIT1:
More explicitly stating that the first code block shows the object as a dict AFTER being dumped / loaded with json to avoid confusion.
Dargolith:
ok based on your last response it seems you are simply looking for code that will dump python expression in a way that you can control how unicode and other data types print. Here is a very simply function that provides this control. There are ways to make this function more efficient (for example, by using a string buffer rather than doing all of the recursive string concatenation happening here). Still this is a very simple function, and as it stands its execution is probably still dominated by your DB lookup.
As you can see in each of the 'if' statements, you have full control of how each data type prints.
def expr_to_str(thing):
if hasattr(thing, 'keys'):
pairs = ['%s:%s' % (expr_to_str(k),expr_to_str(v)) for k,v in thing.iteritems()]
return '{%s}' % ', '.join(pairs)
if hasattr(thing, '__setslice__'):
parts = [expr_to_str(ele) for ele in thing]
return '[%s]' % (', '.join(parts),)
if isinstance(thing, basestring):
return "'%s'" % (str(thing),)
return str(thing)
print "dumped: %s" % expr_to_str({'one': 33, 'two': [u'unicode', 'just a str', 44.44, {'hash': 'here'}]})
outputs:
dumped: {'two':['unicode', 'just a str', 44.44, {'hash':'here'}], 'one':33}
I went on to use json.dumps() as sobolevn suggested in the comment. I didn't think of that one at first since I wasn't really using json in the driver. It turned out however that json.dumps() provided exactly the formats I needed on all the data types I use. Some examples:
>>> json.dumps('test')
'"test"'
>>> json.dumps(['test1', 'test2'])
'["test1", "test2"]'
>>> json.dumps([u'test1', u'test2'])
'["test1", "test2"]'
>>> json.dumps({u'key1': u'val1', u'key2': [u'val21', 'val22', 1]})
'{"key2": ["val21", "val22", 1], "key1": "val1"}'
If you need to take more control of the format, quotes or other things regarding this conversion, see the reply by Dan Oblinger.

Compiler error when using GetStringFileInfo in InnoSetup on application created with PyInstaller

I created version info file as described here - What does a "version file" look like?
and do got EXE-file with all version information.
My issue is next, when I try to build setup file with InnoSetup, I'm getting an error:
Error on line 65 in d:\installation\Source\setup_script.iss: Missing
closing quote on parameter "Name"
line 65:
[Icons]
Name: "{group}\{#VerInfoProductName}"; Filename: "{app}\{#ExeFileName}.exe"; WorkingDir: "{app}"
Definition of VerInfoProductName below
#define VerInfoProductName GetStringFileInfo(AddBackslash(SourcePath) + "..\..\dist\app\testapp.exe", "ProductName")
Details are attached in archive.
There's something in your application version info strings that confuses the Inno Setup pre-processor. Your code works with other applications.
The pre-processor loads the ProductName in a way that resulting variable is actually longer than the value, the remaining space filled with some garbage that later confuses the compiler.
You can workaround it by using {#SetupSetting('AppName')} instead of {#VerInfoProductName}. This of course assumes that AppName is set to {#VerInfoProductName}.
Another way is to round-trip the string via an INI file:
#expr WriteIni("C:\path\xxx.ini", "xxx", "xxx", VerInfoProductName)
#define VerInfoProductName ReadIni("C:\path\xxx.ini", "xxx", "xxx")
Actually in normal Windows resource files (.rc), one has to explicitly null-terminate the version info strings (note the \0):
VALUE "ProductName", "TestProductName\0"
The resulting null (\0) character is explicitly stored in the resulting binary. So in the end there are two null characters in the resulting binary (four 0 bytes in UTF-16 encoding). This is common WinAPI format when multiple values are allowed. The null character is values separator, the double-null terminates the sequence.
Your TestApp.exe is missing that second null. I can see that in a hex dump. I'm pretty sure that this is the primary cause of your problem.

Categories

Resources