python SPARQL query RESULTS BINDINGS: IF statement on BINDINGs value?

python SPARQL query RESULTS BINDINGS: IF statement on BINDINGs value? - python

Using Python with SPARQLWrapper, JSON, urlib2 & cgi. Had trouble passing a working SPARQL query with some NULL values to python so I populated the blanks with a literal and will try to filter at the output. I have this results section example:
for result in results["results"]["bindings"]:
project = result["project"]["value"].encode('utf-8')
filename = result["filename"]["value"].encode('utf-8')
url = result["url"]["value"].encode('utf-8')
...and I print the %s. Is there a way to filter a value, i.e., IF VALUE NE "string" then PRINT? Or is there another workaround? I'm at the tail-end of a small project, I know I need a better wrapper, I just need to get these results filtered before I can move on. T very much IA...

I'm one of the developers of the SPARQLWrapper library, and the question had been already answered at the mailing list.
Regarding optionals values on the original query, the result set will come with no values for those variables. The problems is that we'd need to parse the query to populate such missing entries, and we want to avoid such parsing; therefore you'd need to check it for avoiding runtime problems with KeyError.
Usually I use a code like:
for result in results["results"]["bindings"]:
party = result["party"]["value"] if ("party" in result) else None

Related

Python Formatting SQL WHERE clause

I'm having this function that communicates via pymysql to an SQL database stored to my localhost. I know there are similar posts about formatting an SQL section especially this one but could anyone suggest a solution?
Always getting TypeError: can't concat tuple to bytes. I suppose it's sth with the WHERE clause.
def likeMovement(pID):
print("Give a rating for the movement with #id:%s" %pID)
rate=input("Give from 0-5: ")
userID=str(1)
print(rate,type(rate))
print(pID,type(pID))
print(userID,type(userID))
cursor=con.cursor()
sqlquery='''UDPATE likesartmovement SET likesartmovement.rating=%s WHERE
likesartmovement.artisticID=? AND likesartmovement.userID=?''' % (rate,),
(pID,userID)
cursor.execute(sqlquery)
TypeError: not all arguments converted during string formatting
Thanks in advance!

The problem is that you're storing (pID,userID) as part of a tuple stored in sqlquery, instead of passing them as the arguments to execute:
sqlquery='''UDPATE likesartmovement SET likesartmovement.rating=%s WHERE
likesartmovement.artisticID=? AND likesartmovement.userID=?''' % (rate,)
cursor.execute(sqlquery, (pID,userID))
It may be clearer to see why these are different if you take a simpler example:
s = 'abc'
spam(s, 2)
s = 'abc', 2
spam(s)
Obviously those two don't do the same thing.
While we're at it:
You have to spell UPDATE right.
You usually want to use query parameters for SET clauses for exactly the same reasons you want to for WHERE clauses.
You don't need to include the table name in single-table operations, and you're not allowed to include the table name in SET clauses in single-table updates.
So:
sqlquery='''UDPATE likesartmovement SET rating=? WHERE
artisticID=? AND userID=?'''
cursor.execute(sqlquery, (rating, pID, userID))

Should I use python classes for a MySQL database insert program?

I have created a database to store NGS sequencing results. It consists of 17 tables to store all of the information. The results are stored in spreadsheets which I parse data from and store in variables using python (2.7), and then use the python package mysqldb to insert data into the database. I mainly use functions to obtain the information i need in variables, then write a loop in which I call this function followed by a 'try:' statement to insert. Here is a simple example:
def sample_processer(file):
my_file = open(file, 'r+')
samples = []
for line in my_file:
...get info...
samples.append(line[0])
return(samples)
samples = sample_processor('path/to/file')
for sample in samples:
try:
sql = "samsql = "INSERT IGNORE INTO sample(sample_id, diagnosis, screening) VALUES ("
samsql = samsql + "'"+sample+"'," +sam_screen_dict.get(sample)+"')"
except e:
db.rollback()
print("Something went wrong inserting data into the sample table: %s" %(e))
*sam_screen_dict is a dictionary i made from another function.
This is a simple table that I upload into but many of them call of different dictionaries to make sure the correct results are uploaded. However I was wondering whether there would be a more robust way in which to do this using a class.
For example, my sample_id has an associated screening attribute in the sample table, so this is easy to do with one dictionary. I have more complex junction tables, such as the table in which the sample_id, experiment_id and found mutation are stored, alongside other data, would it be a good idea to create a class for this table, calling on a simple 'sample' class to inherit from? That way I would always know that the results being inserted will be for the correct sample/experiment etc.
Also, using classes could I write rules for each attribute so that if the source spreadsheet is for some reason incorrect, it will not insert into the database?
I.e: sample_id is in the format A123/16. Therefore using a class it will check that the first character is 'A' and that sample_id[-3] should always == '/'. I know I could write these into functions, but I feel like it would take up so much space and time writing so many 'if' statements, that if it is stored once in a class then this would be alot better.
Has anybody done anything similar using classes to pass through their variables to test that they are correct before it gets to the insert stage and an error is created?
I am new to python classes and understand the basics, still trying to get to grips with them so a point in the right direction would be great - as would any help on how to go about actually writing the code for a python class that would be used to make a more robust database insertion program.

17tables it means you may use about 17 classes.
Use a simple script. webpy.db
https://github.com/webpy/webpy/blob/master/web/db.py just modify few code.
Then you can use webpy api: http://webpy.org/docs/0.3/api#web.db to finish your job.
Hope it's useful for you

Substituting column names in Python sqlite3 query [duplicate]

This question already has answers here:
How do you escape strings for SQLite table/column names in Python?
(8 answers)
Closed 7 years ago.
I have a wide table in a sqlite3 database, and I wish to dynamically query certain columns in a Python script. I know that it's bad to inject parameters by string concatenation, so I tried to use parameter substitution instead.
I find that, when I use parameter substitution to supply a column name, I get unexpected results. A minimal example:
import sqlite3 as lite
db = lite.connect("mre.sqlite")
c = db.cursor()
# Insert some dummy rows
c.execute("CREATE TABLE trouble (value real)")
c.execute("INSERT INTO trouble (value) VALUES (2)")
c.execute("INSERT INTO trouble (value) VALUES (4)")
db.commit()
for row in c.execute("SELECT AVG(value) FROM trouble"):
print row # Returns 3
for row in c.execute("SELECT AVG(:name) FROM trouble", {"name" : "value"}):
print row # Returns 0
db.close()
Is there a better way to accomplish this than simply injecting a column name into a string and running it?

As Rob just indicated in his comment, there was a related SO post that contains my answer. These substitution constructions are called "placeholders," which is why I did not find the answer on SO initially. There is no placeholder pattern for column names, because dynamically specifying columns is not a code safety issue:
It comes down to what "safe" means. The conventional wisdom is that
using normal python string manipulation to put values into your
queries is not "safe". This is because there are all sorts of things
that can go wrong if you do that, and such data very often comes from
the user and is not in your control. You need a 100% reliable way of
escaping these values properly so that a user cannot inject SQL in a
data value and have the database execute it. So the library writers do
this job; you never should.
If, however, you're writing generic helper code to operate on things
in databases, then these considerations don't apply as much. You are
implicitly giving anyone who can call such code access to everything
in the database; that's the point of the helper code. So now the
safety concern is making sure that user-generated data can never be
used in such code. This is a general security issue in coding, and is
just the same problem as blindly execing a user-input string. It's a
distinct issue from inserting values into your queries, because there
you want to be able to safely handle user-input data.
So, the solution is that there is no problem in the first place: inject the values using string formatting, be happy, and move on with your life.

Why not use string formatting?
for row in c.execute("SELECT AVG({name}) FROM trouble".format(**{"name" : "value"})):
print row # => (3.0,)

Peewee execute_sql with escaped characters

I have wrote a query which has some string replacements. I am trying to update a url in a table but the url has % signs in which causes a tuple index out of range exception.
If I print the query and run in manually it works fine but through peewee causes an issue. How can I get round this? I'm guessing this is because the percentage signs?
query = """
update table
set url = '%s'
where id = 1
""" % 'www.example.com?colour=Black%26white'
db.execute_sql(query)

The code you are currently sharing is incredibly unsafe, probably for the same reason as is causing your bug. Please do not use it in production, or you will be hacked.
Generally: you practically never want to use normal string operations like %, +, or .format() to construct a SQL query. Rather, you should to use your SQL API/ORM's specific built-in methods for providing dynamic values for a query. In your case of SQLite in peewee, that looks like this:
query = """
update table
set url = ?
where id = 1
"""
values = ('www.example.com?colour=Black%26white',)
db.execute_sql(query, values)
The database engine will automatically take care of any special characters in your data, so you don't need to worry about them. If you ever find yourself encountering issues with special characters in your data, it is a very strong warning sign that some kind of security issue exists.
This is mentioned in the Security and SQL Injection section of peewee's docs.

Wtf are you doing? Peewee supports updates.
Table.update(url=new_url).where(Table.id == some_id).execute()

Searching a list of tuples in python

I'm having a database (sqlite) of members of an organisation (less then 200 people). Now I'm trying to write an wx app that will search the database and return some contact information in a wx.grid. The app will have 2 TextCtrls, one for the first name and one for the last name. What I want to do here is make it possible to only write one or a few letters in the textctrls and that will start to return result. So, if I search "John Smith" I write "Jo" in the first TextCtrl and that will return every single John (or any one else having a name starting with those letters). It will not have an "search"-button, instead it will start searching whenever I press a key.
One way to solve this would be to search the database with like " SELECT * FROM contactlistview WHERE forname LIKE 'Jo%' " But that seems like a bad idea (very database heavy to do that for every keystroke?). Instead i thought of use fetchall() on a query like this " SELECT * FROM contactlistview " and then, for every keystroke, search the list of tuples that the query have returned. And that is my problem: Searching a list is not that difficult but how can I search a list of tuples with wildcards?

selected = [t for t in all_data if t[1].startswith('Jo')]
but, measure, don't guess. I think that in some cases, the query would be faster - specially if you have too many records. Maybe you should use a query on the first char, and then start using python-side filter, since you already have the results.

I think that generally, you shouldn't be afraid of giving tasks to a database. It's quite possible that the LIKE clause will be very fast. Sqlite is implemented in fairly robust C code, and will happily deal with queries like this.
If you're worried about sending too many requests, why not send a query once a user has entered a threshold of characters, such as three?
A list comprehension is probably the best way to return the result if you want to do added filtering.

If you are searching for a string matching the start using LIKE, eg 'abc%' (rather than anywhere in the string - '%abc%'), the search should be quite fast if you have an index on the field, as the db can use the index to help find the matches.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.