SQL query builder with seed-query Parser - python

Is there an SQL query builder (in Python) which allows me to "parse" and initial SQL query, add certain operators and then get the resulting SQL text?
My use case is the following:
Start with a query like: "SELECT * from my_table"
I want to be able to do something like query_object = Query.parse("SELECT * from my_table to get a query object I can manipulate and then write something like query_object.where('column < 10').limit(10) or similar (columns and operators could also be part of the library, may also have to consider existing WHERE clauses)
And finally getting the resulting query string str(query_object) with the final modified query.
Is this something that can be achieved with any of the ORMs? I don't need all the database connection to specific DB-engines or object mappings (although having it is not a limitation).
I've seen pypika, which allows to create an SQL query from code, but it doesn't allow one to parse an existing query and continue from there.
I've also seen sqlparse which allows me to parse and SQL query into tokens. But because it does not create a tree, it is non-trivial to add additional elements to am existing statement. (it is close to what I am looking for, if only it created an actual tree)

Related

mysql query from backend python server syntax error with backticks on table name [duplicate]

Pretty new to sqlite3, so bear with me here..
I'd like to have a function to which I can pass the table name, and the values to update.
I initially started with something like this:
def add_to_table(table_name, string):
cursor.execute('INSERT INTO {table} VALUES ({var})'
.format(
table=table_name,
var=string)
)
Which works A-OK, but further reading about sqlite3 suggested that this was a terribly insecure way to go about things. However, using their ? syntax, I'm unable to pass in a name to specify the variable.
I tried adding in a ? in place of the table, but that throws a syntax error.
cursor.execute('INSERT INTO ? VALUES (?)', ('mytable','"Jello, world!"'))
>> >sqlite3.OperationalError: near "?": syntax error
Can the table in an sql statement be passed in safely and dynamically?
Its not the dynamic string substitution per-se thats the problem. Its dynamic string substitution with an user-supplied string thats the big problem because that opens you to SQL-injection attacks. If you are absolutely 100% sure that the tablename is a safe string that you control then splicing it into the SQL query will be safe.
if some_condition():
table_name = 'TABLE_A'
else:
table_name = 'TABLE_B'
cursor.execute('INSERT INTO '+ table_name + 'VALUES (?)', values)
That said, using dynamic SQL like that is certainly a code smell so you should double check to see if you can find a simpler alternative without the dynamically generated SQL strings. Additionally, if you really want dynamic SQL then something like SQLAlchemy might be useful to guarantee that the SQL you generate is well formed.
Composing SQL statements using string manipulation is odd not only because of security implications, but also because strings are "dumb" objects. Using sqlalchemy core (you don't even need the ORM part) is almost like using strings, but each fragment will be a lot smarter and allow for easier composition. Take a look at the sqlalchemy wiki to get a notion of what I'm talking about.
For example, using sqlsoup your code would look like this:
db = SQLSoup('sqlite://yourdatabase')
table = getattr(db, tablename)
table.insert(fieldname='value', otherfield=123)
db.commit()
Another advantage: code is database independent - want to move to oracle? Change the connection string and you are done.

Django Raw Query with params on Table Column (SQL Injection)

I have a kinda unusual scenario but in addition to my sql parameters, I need to let the user / API define the table column name too. My problem with the params is that the query results in:
SELECT device_id, time, 's0' ...
instead of
SELECT device_id, time, s0 ...
Is there another way to do that through raw or would I need to escape the column by myself?
queryset = Measurement.objects.raw(
'''
SELECT device_id, time, %(sensor)s FROM measurements
WHERE device_id=%(device_id)s AND time >= to_timestamp(%(start)s) AND time <= to_timestamp(%(end)s)
ORDER BY time ASC;
''', {'device_id': device_id, 'sensor': sensor, 'start': start, 'end': end})
As with any potential for SQL injection, be careful.
But essentially this is a fairly common problem with a fairly safe solution. The problem, in general, is that query parameters are "the right way" to handle query values, but they're not designed for schema elements.
To dynamically include schema elements in your query, you generally have to resort to string concatenation. Which is exactly the thing we're all told not to do with SQL queries.
But the good news here is that you don't have to use the actual user input. This is because, while possible query values are infinite, the superset of possible valid schema elements is quite finite. So you can validate the user's input against that superset.
For example, consider the following process:
User inputs a value as a column name.
Code compares that value (raw string comparison) against a list of known possible values. (This list can be hard-coded, or can be dynamically fetched from the database schema.)
If no match is found, return an error.
If a match is found, use the matched known value directly in the SQL query.
So all you're ever using are the very strings you, as the programmer, put in the code. Which is the same as writing the SQL yourself anyway.
It doesn't look like you need raw() for the example query you posted. I think the following queryset is very similar.
measurements = Measurement.objects.filter(
device_id=device_id,
to_timestamp__gte=start,
to_timestamp__lte,
).order_by('time')
for measurement in measurements:
print(getattr(measurement, sensor)
If you need to optimise and avoid loading other fields, you can use values() or only().

Being that string substitution is frowned upon with forming SQL queries, how do you assign the table name dynamically?

Pretty new to sqlite3, so bear with me here..
I'd like to have a function to which I can pass the table name, and the values to update.
I initially started with something like this:
def add_to_table(table_name, string):
cursor.execute('INSERT INTO {table} VALUES ({var})'
.format(
table=table_name,
var=string)
)
Which works A-OK, but further reading about sqlite3 suggested that this was a terribly insecure way to go about things. However, using their ? syntax, I'm unable to pass in a name to specify the variable.
I tried adding in a ? in place of the table, but that throws a syntax error.
cursor.execute('INSERT INTO ? VALUES (?)', ('mytable','"Jello, world!"'))
>> >sqlite3.OperationalError: near "?": syntax error
Can the table in an sql statement be passed in safely and dynamically?
Its not the dynamic string substitution per-se thats the problem. Its dynamic string substitution with an user-supplied string thats the big problem because that opens you to SQL-injection attacks. If you are absolutely 100% sure that the tablename is a safe string that you control then splicing it into the SQL query will be safe.
if some_condition():
table_name = 'TABLE_A'
else:
table_name = 'TABLE_B'
cursor.execute('INSERT INTO '+ table_name + 'VALUES (?)', values)
That said, using dynamic SQL like that is certainly a code smell so you should double check to see if you can find a simpler alternative without the dynamically generated SQL strings. Additionally, if you really want dynamic SQL then something like SQLAlchemy might be useful to guarantee that the SQL you generate is well formed.
Composing SQL statements using string manipulation is odd not only because of security implications, but also because strings are "dumb" objects. Using sqlalchemy core (you don't even need the ORM part) is almost like using strings, but each fragment will be a lot smarter and allow for easier composition. Take a look at the sqlalchemy wiki to get a notion of what I'm talking about.
For example, using sqlsoup your code would look like this:
db = SQLSoup('sqlite://yourdatabase')
table = getattr(db, tablename)
table.insert(fieldname='value', otherfield=123)
db.commit()
Another advantage: code is database independent - want to move to oracle? Change the connection string and you are done.

Can I get table names along with column names using .description() in Python's DB API?

I am using Python with SQLite 3. I have user entered SQL queries and need to format the results of those for a template language.
So, basically, I need to use .description of the DB API cursor (PEP 249), but I need to get both the column names and the table names, since the users often do joins.
The obvious answer, i.e. to read the table definitions, is not possible -- many of the tables have the same column names.
I also need some intelligent behaviour on the column/table names for aggregate functions like avg(field)...
The only solution I can come up with is to use an SQL parser and analyse the SELECT statement (sigh), but I haven't found any SQL parser for Python that seems really good?
I haven't found anything in the documentation or anyone else with the same problem, so I might have missed something obvious?
Edit: To be clear -- the problem is to find the result of an SQL select, where the select statement is supplied by a user in a user interface. I have no control of it. As I noted above, it doesn't help to read the table definitions.
Python's DB API only specifies column names for the cursor.description (and none of the RDBMS implementations of this API will return table names for queries...I'll show you why).
What you're asking for is very hard, and only even approachable with an SQL parser...and even then there are many situations where even the concept of which "Table" a column is from may not make much sense.
Consider these SQL statements:
Which table is today from?
SELECT DATE('now') AS today FROM TableA FULL JOIN TableB
ON TableA.col1 = TableB.col1;
Which table is myConst from?
SELECT 1 AS myConst;
Which table is myCalc from?
SELECT a+b AS myCalc FROM (select t1.col1 AS a, t2.col2 AS b
FROM table1 AS t1
LEFT OUTER JOIN table2 AS t2 on t1.col2 = t2.col2);
Which table is myCol from?
SELECT SUM(a) as myCol FROM (SELECT a FROM table1 UNION SELECT b FROM table2);
The above were very simple SQL statements for which you either have to make up a "table", or arbitrarily pick one...even if you had an SQL parser!
What SQL gives you is a set of data back as results. The elements in this set can not necessarily be attributed to specific database tables. You probably need to rethink your approach to this problem.

How to bulk insert data to mysql with python

Currently i'm using Alchemy as a ORM, and I look for a way to speed up my insert operation, I have bundle of XML files to import
for name in names:
p=Product()
p.name="xxx"
session.commit()
i use above code to insert my data paser from batch xml file to mysql,it's very slow
also i tried to
for name in names:
p=Product()
p.name="xxx"
session.commit()
but it seems didn't change anything
You could bypass the ORM for the insertion operation and use the SQL Expression generator instead.
Something like:
conn.execute(Product.insert(), [dict(name=name) for name in names])
That should create a single statement to do your inserting.
That example was taken from lower down the same page.
(I'd be interested to know what speedup you got from that)

Categories

Resources