I am building a little interface where I would like users to be able to write out their entire sql statement and then see the data that is returned. However, I don't want a user to be able to do anything funny ie delete from user_table;. Actually, the only thing I would like users to be able to do is to run select statements. I know there aren't specific users for SQLite, so I am thinking what I am going to have to do, is have a set of rules that reject certain queries. Maybe a regex string or something (regex scares me a little bit). Any ideas on how to accomplish this?
def input_is_safe(input):
input = input.lower()
if "select" not in input:
return False
#more stuff
return True
I can suggest a different approach to your problem. You can restrict the access to your database as read-only. That way even when the users try to execute delete/update queries they will not be able to damage your data.
Here is the answer for Python on how to open a read-only connection:
db = sqlite3.connect('file:/path/to/database?mode=ro', uri=True)
Python's sqlite3 execute() method will only execute a single SQL statement, so if you ensure that all statements start with the SELECT keyword, you are reasonably protected from dumb stuff like SELECT 1; DROP TABLE USERS. But you should check sqlite's SQL syntax to ensure there is no way to embed a data definition or data modification statement as a subquery.
My personal opinion is that if "regex scares you a little bit", you might as well just put your computer in a box and mail it off to <stereotypical country of hackers>. Letting untrusted users write SQL code is playing with fire, and you need to know what you're doing or you'll get fried.
Open the database as read only, to prevent any changes.
Many statements, such as PRAGMA or ATTACH, can be dangerous. Use an authorizer callback (C docs) to allow only SELECTs.
Queries can run for a long time, or generate a large amount of data. Use a progress handler to abort queries that run for too long.
Related
The problem is that I've ended up with the extra role of having to run these queries for different users everyday because no one in our factory knows how to use SQL and my IT wouldn't grant them database access anyway.
I've consulted some tech savvier friends who have suggested that i create a GUI (using python) that does the querying for me? That sounds like it would be ideal since all my end users would have to do is key into an input field someone on the interface and the query would run, versus them querying for the data directly and potentially messing up my SQL script.
I've read online as well of the possibility of doing something like mentioned before but in Excel instead. Right now I'm just really lost as to where to start. I probably would need to do alot of reading up anyway but it'd be at least nice to know what is a good direction to start. Any new suggestions would be appreciated.
This could be overkill for you, but Metabase is basically a GUI you can pretty easily set up on top of your SQL database.
It's target audience is "business analytics", so it enables people to create "questions" (aka. queries) using a GUI rather than writing SQL (although it allows raw SQL). More relevant to you, you probably just want to preconfigure some "questions" your users can log in and run in Metabase.
Without writing any code at all you could set up Metabase, write the query you want using either raw SQL or their GUI, and parameterize the query with some user_id or something the users can fill in. From there, you could give the users access to your Metabase (via the web), where they can go in and use the GUI to fill in whatever they need to and then run the query you've created for them.
Metabase also has a lot of customization, so you can make sure they only have permissions to see certain data, etc., and they don't need DB credentials. You, the Metabase admin, connect the database when you set up Metabase and give other users Metabase logins instead.
I need to move data from one database to another.
I can use python my counterpart can't.
How can select all data from a table and save it as insert statements.
Using SQLalchemy.
Is there a way to create a back up like this?
As others have suggested in comments, using the database backup program (mysqldump, pg_dump, etc) is your best bet; that will make sure that the data is transferred correctly for the underlying database.
Outputting INSERT statements will be risky; even the built-in SQLAlchemy facility for doing this comes with a big red warning, complete with a picture of a dragon, indicating that it can be dangerous.
If you nevertheless need to do this, and the data is generally trusted and doesn't contain much in the way of odd types, you can use:
Create (but do not execute) an insert expression as though you were inserting the rows back into the database.
Use the .compile() method with the relevant dialect parameter and literal_binds set to True.
Manually double-check that the output is, in fact, valid for the database; as per the warning in the SQLAlchemy FAQ, this method is not very dependable and may expose you to attacks if it's part of any production system.
I wouldn't recommend formatting up INSERT statements by hand; you're unlikely to do a better job than SQLAlchemy...
I have a couple of SQL statements stored as files which get executed by a Python script. The database is hosted in Snowflake and I use Snowflake SQLAlchemy to connect to it.
How can I test those statements? I don't want to execute them, I just want to check if they could be executable.
One very basic thing to check if it is valid standard SQL. A better answer would be something that considers snowflake-specific stuff like
copy into s3://example from table ...
The best answer would be something that also checks permissions, e.g. for SELECT statements if the table is visible / readable.
An in-memory sqlite database is one option. But if you are executing raw SQL queries against snowflake in your code, your tests may fail if the same syntax isn't valid against sqlite. Recording your HTTP requests against a test snowflake database, and then replaying them for your unit tests suits this purpose better. There are two very good libraries that do this, check them out:
vcrpy
betamax
We do run integration tests on our Snowflake databases. We maintain clones of our production databases, for example, one of our production databases is called data_lake and we maintain a clone that gets cloned on a nightly basis called data_lake_test which is used to run our integration tests against.
Like Tim Biegeleisen mentioned, a "true" unittest would mock the response but our integration tests do run real Snowflake queries on our test cloned databases. There is the possibility that a test drastically alters the test database, but we run integration tests only during our CI/CD process so it is rare if there is ever a conflict between two tests.
I very much like this idea, however I can suggest a work around, as I often have to check my syntax and need help there. What I would recommend, if you plan on using the Snowflake interface would be to make sure to use the LIMIT 10 or LIMIT 1 on the SELECT statements that you would be needing to validate.
Another tip I would recommend is talking to a Snowflake representative about a trial if you are just getting started. They will also have alot of tips for more specific queries you are seeking to validate.
And finally, based on some comments, make sure it uses SQL: ANSI and the live in the https://docs.snowflake.net/manuals/index.html for reference.
As far as the validity of the sql statement is a concern you can run explain of the statement and it should give you error if syntax is incorrect or if you do not have permission to access the object/database. That being there still some exceptions which you cannot run explain for like 'use' command which I do not think is needed for validation.
Hope this helps.
I am new to python and pyramid and I am trying to figure out a way to print out some object values that I am using in a view callable to get a better idea of how things are working. More specifically, I am wanting to see what is coming out of a sqlalchemy query.
DBSession.query(User).filter(User.name.like('%'+request.matchdict['search']+'%'))
I need to take that query and then look up what Office a user belongs to by the office_id attribute that is part of the User object. I was thinking of looping through the users that come up from that query and doing another query to look up the office information (in the offices table). I need to build a dictionary that includes some User information and some Office information then return it to the browser as json.
Is there a way that I can experiment with different attempts at this while viewing my output without having to rely on the browser. I am more of a front end developer so when I am writing javascript I just view my outputs using console.log(output).
console.log(output) is to JavaScript
as
????? is to Python (specifically pyramid view callable)
Hope the question is not dumb. Just trying to learn. Appreciate anyones help.
This is a good reason to experiment with pshell, Pyramid's interactive python interpreter. From within pshell you can tinker with things on the command-line and see what they will do before adding them to your application.
http://docs.pylonsproject.org/projects/pyramid/en/1.4-branch/narr/commandline.html#the-interactive-shell
Of course, you can always use "print" and things will show up in the console. SQLAlchemy also has the sqlalchemy.echo ini option that you can turn on to see all queries. And finally, it sounds like you just need to do a join but maybe aren't familiar with how to write complex database queries, so I'd suggest you look into that before resorting to writing separate queries. Likely a single query can return you what you need.
I'm working on a small app which will help browse the data generated by vim-logging, and I'd like to allow people to run arbitrary SQL queries against the datasets.
How can I do that safely?
For example, I'd like to let someone enter, say, SELECT file_type, count(*) FROM commands GROUP BY file_type, then send the result back to their web browser.
Do this:
cmd = "update people set name=%s where id=%s"
curs.execute(cmd, (name, id))
Note that the placeholder syntax depends on the database you are using.
Source and more info here:
http://bobby-tables.com/python.html
Allowing expressive power while preventing destruction is a difficult job. If you let them enter "SELECT .." themselves, you need to prevent them from entering "DELETE .." instead. You can require the statement to begin with "SELECT", but then you also have to be sure it doesn't contain "; DELETE" somewhere in the middle.
The safest thing to do might be to connect to the database with read-only user credentials.
In MySQL, you can create a limited user (create new user and grant limited access), which can only access certain table.
Consider using SQLAlchemy. While SQLAlchemy is arguably the greatest Object Relational Mapper ever, you certainly don't need to use any of the ORM stuff to take advantage of all of the great Python/SQL work that's been done.
As the introductory documentation suggests:
Most importantly, SQLAlchemy is not just an ORM. Its data abstraction layer allows construction and manipulation of SQL expressions in a platform agnostic way, and offers easy to use and superfast result objects, as well as table creation and schema reflection utilities. No object relational mapping whatsoever is involved until you import the orm package. Or use SQLAlchemy to write your own!
Using SQLAlchemy will give you input sanitation "for free" and let you use standard Python logic to analyze statements for safety without having to do any messy text-parsing/pattern-matching.