psycopg2.errors.SyntaxError when instantiating a postgres database [duplicate] - python

It seems PostgreSQL does not allow creating a database table named 'user', although MySQL does.
Is that because it is a keyword? Hibernate does not flag any issue, even when we set the PostgreSQLDialect.

user is a reserved word, and it's usually not a good idea to use reserved words for identifiers (tables, columns).
If you insist on doing that you have to put the table name in double quotes:
create table "user" (...);
But then you always need to use double quotes when referencing the table. Additionally the table name is then case-sensitive. "user" is a different table name than "User".
If you want to save yourself a lot of trouble use a different name. users, user_account, ...
More details on quoted identifiers can be found in the manual: http://www.postgresql.org/docs/current/static/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS
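As a rough illustration of the quoting rule above, here is a small Python sketch that builds a double-quoted identifier. The helper name quote_ident is hypothetical and is not part of psycopg2; in real psycopg2 code, psycopg2.sql.Identifier is the supported way to compose identifiers. This pure-Python version only shows the rule itself: wrap the name in double quotes and double any embedded double quote.

```python
def quote_ident(name: str) -> str:
    """Return name as a double-quoted SQL identifier (hypothetical helper)."""
    # Embedded double quotes are escaped by doubling them.
    return '"' + name.replace('"', '""') + '"'


# user is reserved in PostgreSQL, so this unquoted form is rejected by the parser.
unquoted_ddl = "CREATE TABLE user (id integer)"

# The quoted form is accepted, but the name must then be quoted everywhere.
quoted_ddl = f"CREATE TABLE {quote_ident('user')} (id integer)"

print(quoted_ddl)  # CREATE TABLE "user" (id integer)
```

Even with a helper like this, renaming the table to something like users or user_account is usually less trouble.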

It is possible to specify the table name with JPA using the following syntax:
@Table(name="\"user\"")

We had this same issue some time ago and simply changed the table name from user to app_user. Since we use Hibernate/JPA, we thought it would be easier this way.
Hope this little fix will help someone else.

You can create a table user in a schema other than public.
The example:
CREATE SCHEMA my_schema;
CREATE TABLE my_schema.user(...);

Trailing underscore
The SQL standard explicitly promises to never use a trailing underscore in any keyword or reserved word.
So, to avoid conflicts with any of the over a thousand keywords and reserved words used by various database engines, I name all my database identifiers with a trailing underscore. (Yes, really, over a thousand keywords reserved — I counted them.)
Change this:
CREATE TABLE user ( … ) ;
… to this:
CREATE TABLE user_ ( … ) ;
I do this as a habit for all database names: schemas, tables, columns, indexes, etc.
As an extra benefit, this practice makes quite clear in documentation, email, and such when referring to a programming language variable named user versus the database column user_. Anything with a trailing underscore is obviously from the database side.

Related

Refactorable database queries

Say I have category="foo" and a NoSQL query={"category": category}. Whenever I rename the variable category, I need to manually change the key inside the query if I want it to match.
In Python 3.8+ I'm able to get the variable name as a string via the variable itself.
Now I could use query={f"{category=}".split("=")[0]: category}. Now refactoring changes the query too. This applies to any database queries or statements (SQL etc.).
Would this be bad practice? Not just concerning Python but any language where this is possible.
Would this be bad practice?
Yes, the names of local variables do not need to correlate with the fields in data stores.
You should be able to retrieve a record and filter on its fields with any Python variable, no matter its name or whether it's nested in a larger data structure.
In pseudocode:
connection = datastore.connect(...)
# passing a string directly
connection.fetch({"category": "fruit"})
# passing a string variable
category_to_fetch = "vegetable"
connection.fetch({"category": category_to_fetch})
# something more exotic like a previous list of records
r = [("fish",)]
connection.fetch({"category": r[0][0]})
# or even a premade filter dictionary
filter = {"category": "meat"}
connection.fetch(filter)
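A runnable take on the pseudocode above might look like the following sketch. Everything in it is an assumption made for illustration: the in-memory RECORDS list stands in for a real datastore, and fetch and CATEGORY_FIELD are hypothetical names. It only tries to show that the field name is data that lives in one place, independent of whatever the Python variable holding the value is called.

```python
# The field name is defined once; Python variable names can change freely.
CATEGORY_FIELD = "category"

# Stand-in for a real datastore (assumption for illustration only).
RECORDS = [
    {"name": "apple", "category": "fruit"},
    {"name": "carrot", "category": "vegetable"},
    {"name": "pear", "category": "fruit"},
]


def fetch(query: dict) -> list:
    """Return the records whose fields equal every key/value pair in query."""
    return [r for r in RECORDS if all(r.get(k) == v for k, v in query.items())]


wanted = "fruit"  # renaming this variable never touches the query key
fruit_names = [r["name"] for r in fetch({CATEGORY_FIELD: wanted})]
print(fruit_names)  # ['apple', 'pear']
```

If the field is ever renamed in the datastore, only the constant changes, which gives the refactoring safety the question asks for without tying it to variable names.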

Question about Postgres with upper case TableNames - migrating from MySQL [duplicate]

I have a project that was working with MySQL.
It has hundreds of queries, and some table names and column names are in upper case, which is why
a query like
select * from TEST
would not work in PostgreSQL without quotes.
Can anyone suggest a solution that does not require changing all the queries?
The easiest thing regarding table names is: Create tables without any quoting around them as in
create table "TEST" -- BAD
create table TEST -- good
create table test -- even better
Also your queries should not contain any quotes around table names or column names.
select "TEST"."COLUMN" from "TEST" -- BAD
select TEST.COLUMN from TEST -- good
select test.column from test -- even better
The last two versions are identical to PostgreSQL since unquoted identifiers are folded to lower case automatically.
Just make sure there are no quotes anywhere (in queries or in DDL) and be done with that part.
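The folding rule can be mimicked in a few lines of Python. This is only a simplified sketch of the behaviour described above, not PostgreSQL's actual lexer, and fold_identifier is a hypothetical name: an unquoted token is lowercased, while a double-quoted token keeps its case exactly as written.

```python
def fold_identifier(token: str) -> str:
    """Mimic how PostgreSQL resolves one identifier token (simplified sketch)."""
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        # Quoted: strip the quotes, undo doubled quotes, keep the case as written.
        return token[1:-1].replace('""', '"')
    # Unquoted: folded to lower case.
    return token.lower()


print(fold_identifier("TEST"))    # test
print(fold_identifier("test"))    # test
print(fold_identifier('"TEST"'))  # TEST
```

This is why select * from TEST finds a table created as test or TEST without quotes, but not one created as "TEST".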
EDIT: When you have created tables using the quote syntax and they show up as upper case or mixed case in the psql shell, then you can rename the tables:
alter table "TEST" rename to TEST; -- or "to test" - doesn't matter
Here is a quick way to generate the commands, which you have to copy&paste into a psql shell yourself:
select
'alter table "' || table_schema || '"."' || table_name || '" rename to ' ||
lower(table_name)
from information_schema.tables
where table_type = 'BASE TABLE' and table_name != lower(table_name);
?column?
--------------------------------------------
alter table "public"."TEST" rename to test
Rationale: You have to use one standard for all your queries: either all unquoted (then the real table names must have been folded to lowercase) or all quoted (then the real table names must match literally). Mixing is not possible without MUCH pain.
Usually nobody goes to the trouble of adding quotes to hand-written queries, so I expect that settling on the unquoted standard is less work than the alternative. This means you have to rename your tables according to PostgreSQL best practices.
There isn't one. MySQL uses some syntax extensions that are not compatible with standard SQL (so does PostgreSQL), and there are some major differences in queries that would be extremely hard to convert automatically (GROUP BY, DISTINCT).
TL;DR: you have no choice but to fix the queries manually and check that they behave exactly the same way (not a given).

Trouble creating table in Python using MYSQL [duplicate]

I'm trying to execute a simple MySQL query as below:
INSERT INTO user_details (username, location, key)
VALUES ('Tim', 'Florida', 42)
But I'm getting the following error:
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'key) VALUES ('Tim', 'Florida', 42)' at line 1
How can I fix the issue?
The Problem
In MySQL, certain words like SELECT, INSERT, DELETE etc. are reserved words. Since they have a special meaning, MySQL treats it as a syntax error whenever you use them as a table name, column name, or other kind of identifier - unless you surround the identifier with backticks.
As noted in the official docs, in section 10.2 Schema Object Names (emphasis added):
Certain objects within MySQL, including database, table, index, column, alias, view, stored procedure, partition, tablespace, and other object names are known as identifiers.
...
If an identifier contains special characters or is a reserved word, you must quote it whenever you refer to it.
...
The identifier quote character is the backtick ("`"):
A complete list of keywords and reserved words can be found in section 10.3 Keywords and Reserved Words. In that page, words followed by "(R)" are reserved words. Some reserved words are listed below, including many that tend to cause this issue.
ADD
AND
BEFORE
BY
CALL
CASE
CONDITION
DELETE
DESC
DESCRIBE
FROM
GROUP
IN
INDEX
INSERT
INTERVAL
IS
KEY
LIKE
LIMIT
LONG
MATCH
NOT
OPTION
OR
ORDER
PARTITION
RANK
REFERENCES
SELECT
TABLE
TO
UPDATE
WHERE
The Solution
You have two options.
1. Don't use reserved words as identifiers
The simplest solution is simply to avoid using reserved words as identifiers. You can probably find another reasonable name for your column that is not a reserved word.
Doing this has a couple of advantages:
It eliminates the possibility that you or another developer using your database will accidentally write a syntax error due to forgetting - or not knowing - that a particular identifier is a reserved word. There are many reserved words in MySQL and most developers are unlikely to know all of them. By not using these words in the first place, you avoid leaving traps for yourself or future developers.
The means of quoting identifiers differs between SQL dialects. While MySQL uses backticks for quoting identifiers by default, ANSI-compliant SQL (and indeed MySQL in ANSI SQL mode, as noted here) uses double quotes for quoting identifiers. As such, queries that quote identifiers with backticks are less easily portable to other SQL dialects.
Purely for the sake of reducing the risk of future mistakes, this is usually a wiser course of action than backtick-quoting the identifier.
2. Use backticks
If renaming the table or column isn't possible, wrap the offending identifier in backticks (`) as described in the earlier quote from 10.2 Schema Object Names.
An example to demonstrate the usage (taken from 10.3 Keywords and Reserved Words):
mysql> CREATE TABLE interval (begin INT, end INT);
ERROR 1064 (42000): You have an error in your SQL syntax.
near 'interval (begin INT, end INT)'
mysql> CREATE TABLE `interval` (begin INT, end INT);
Query OK, 0 rows affected (0.01 sec)
Similarly, the query from the question can be fixed by wrapping the keyword key in backticks, as shown below:
INSERT INTO user_details (username, location, `key`)
VALUES ('Tim', 'Florida', 42);
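Since the question is about running this from Python, here is a minimal sketch of building the same statement with backtick-quoted column names. The helper quote_mysql_ident is hypothetical, not part of any MySQL driver. The %s placeholders assume a driver that uses the format parameter style (as PyMySQL and mysql-connector-python do), so the values would be passed separately to cursor.execute rather than pasted into the string.

```python
def quote_mysql_ident(name: str) -> str:
    """Wrap name in backticks, doubling any embedded backtick (hypothetical helper)."""
    return "`" + name.replace("`", "``") + "`"


columns = ["username", "location", "key"]
column_list = ", ".join(quote_mysql_ident(c) for c in columns)
placeholders = ", ".join(["%s"] * len(columns))

# Values stay as placeholders for the driver to bind; only identifiers are quoted.
insert_sql = f"INSERT INTO user_details ({column_list}) VALUES ({placeholders})"
print(insert_sql)
# INSERT INTO user_details (`username`, `location`, `key`) VALUES (%s, %s, %s)
```

Quoting every column is harmless in MySQL, so the reserved word key no longer needs special handling.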

Peewee execute_sql with escaped characters

I have written a query which has some string replacements. I am trying to update a url in a table, but the url has % signs in it, which causes a tuple index out of range exception.
If I print the query and run it manually it works fine, but running it through peewee causes an issue. How can I get round this? I'm guessing this is because of the percent signs?
query = """
update table
set url = '%s'
where id = 1
""" % 'www.example.com?colour=Black%26white'
db.execute_sql(query)
The code you are currently sharing is incredibly unsafe, probably for the same reason as is causing your bug. Please do not use it in production, or you will be hacked.
Generally: you practically never want to use normal string operations like %, +, or .format() to construct a SQL query. Rather, you should use your SQL API/ORM's specific built-in methods for providing dynamic values for a query. In your case of SQLite in peewee, that looks like this:
query = """
update table
set url = ?
where id = 1
"""
values = ('www.example.com?colour=Black%26white',)
db.execute_sql(query, values)
The database engine will automatically take care of any special characters in your data, so you don't need to worry about them. If you ever find yourself encountering issues with special characters in your data, it is a very strong warning sign that some kind of security issue exists.
This is mentioned in the Security and SQL Injection section of peewee's docs.
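To see the same idea without peewee, here is a self-contained sketch using Python's standard sqlite3 module. The pages table is a made-up stand-in for the table in the question, and the ? placeholder is SQLite's style; other drivers may expect %s instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table for illustration only.
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, url TEXT)")
conn.execute("INSERT INTO pages (id, url) VALUES (1, 'old')")

new_url = "www.example.com?colour=Black%26white"
# The value is bound by the driver, so the % in it is never treated as a
# format specifier and needs no escaping.
conn.execute("UPDATE pages SET url = ? WHERE id = ?", (new_url, 1))

stored_url = conn.execute("SELECT url FROM pages WHERE id = 1").fetchone()[0]
print(stored_url)  # www.example.com?colour=Black%26white
```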
Wtf are you doing? Peewee supports updates.
Table.update(url=new_url).where(Table.id == some_id).execute()

SQLite: Why can't parameters be used to set an identifier?

I'm refactoring a little side project to use SQLite instead of a python data structure so that I can learn SQLite. The data structure I've been using is a list of dicts, where each dict's keys represent a menu item's properties. Ultimately, these keys should become columns in an SQLite table.
I first thought that I could create the table programmatically by creating a single-column table, iterating over the list of dictionary keys, and executing an ALTER TABLE, ADD COLUMN command like so:
# Various import statements and initializations
conn = sqlite3.connect(database_filename)
cursor = conn.cursor()
cursor.execute("CREATE TABLE menu_items (item_id text)")
# Here's the problem:
cursor.executemany("ALTER TABLE menu_items ADD COLUMN ? ?", [(key, type(value)) for key, value in menu_data[0].iteritems()])
After some more reading, I realized parameters cannot be used for identifiers, only for literal values. The PyMOTW on sqlite3 says
Query parameters can be used with select, insert, and update statements. They can appear in any part of the query where a literal value is legal.
Kreibich says on p. 135 of Using SQLite (ISBN 9780596521189):
Note, however, that parameters can only be used to replace literal
values, such as quoted strings or numeric values. Parameters
cannot be used in place of identifiers, such as table names or
column names. The following bit of SQL is invalid:
SELECT * FROM ?; -- INCORRECT: Cannot use a parameter as an identifier
I accept that positional or named parameters cannot be used in this way. Why can't they? Is there some general principle I'm missing?
Similar SO question:
Python sqlite3 string formatting
Identifiers are syntactically significant while variable values are not.
Identifiers need to be known at SQL compilation phase so that the compiled internal bytecode representation knows about the relevant tables, columns, indices and so on. Just changing one identifier in the SQL could result in a syntax error, or at least a completely different kind of bytecode program.
Literal values can be bound at runtime. Variables behave essentially the same in a compiled SQL program regardless of the values bound in them.
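The difference is easy to observe with Python's sqlite3 module. In this small sketch (the table contents are made up for illustration), a parameter in a value position is accepted, while the same placeholder in an identifier position is rejected before anything is bound.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE menu_items (item_id TEXT)")
conn.execute("INSERT INTO menu_items VALUES ('soup')")

# A parameter in a value position is accepted and bound at run time.
rows = conn.execute(
    "SELECT item_id FROM menu_items WHERE item_id = ?", ("soup",)
).fetchall()

# A parameter in an identifier position fails when the statement is compiled.
try:
    conn.execute("SELECT * FROM ?", ("menu_items",))
    identifier_error = None
except sqlite3.OperationalError as exc:
    identifier_error = str(exc)

print(rows)
print(identifier_error)
```

The second statement fails as a syntax error, which matches the explanation above: without a concrete table name, SQLite has nothing to compile the statement against.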
I don't know why, but every database I ever used has the same limitation.
I think it would be analogous to use a variable to hold the name of another variable. Most languages do not allow that, PHP being the only exception I know of.
Regardless of the technical reasons, dynamically choosing table/column names in SQL queries is a design smell, which is why most databases do not support it.
Think about it; if you were coding a menu in Python, would you dynamically create a class for each combination of menu items? Of course not; you'd have one Menu class that contains a list of menu items. It's similar in SQL too.
Most of the time, when people ask about dynamically choosing table names, it's because they've split up their data into different tables, like collection1, collection2, ... and use the name to select which collection to query from. This isn't a very good design; it requires the service to repeat the schema for each table, including indexes, constraints, permissions, etc, and also makes altering the schema harder (Need to add a field? Now you need to do it across hundreds of tables instead of one).
The correct way of designing the database would be to have a single collection table and add a collection_id column; instead of querying collection4, you'd add a WHERE collection_id = 4 constraint to your SELECT queries. Note that the 4 is now a value, and can be replaced with a query parameter.
For your case, I would use this schema:
CREATE TABLE menu_items (
item_id TEXT,
key TEXT,
value NONE,
PRIMARY KEY(item_id, key)
);
Use executemany to insert a row for each entry in the dictionary. When you need to load the dictionary, run a SELECT filtering on item_id and recreate the dictionary one row/entry at a time.
(Of course, as with everything in software engineering, there are exceptions. Tools that operate on schemas generically, such as ORMs, will need to specify table/column names dynamically.)
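A minimal sketch of the key/value approach described above, using Python's sqlite3 module, could look like this. The sample menu_data and the load_item helper are assumptions made for illustration. The key column is double-quoted here purely as a precaution, since KEY is an SQL keyword.

```python
import sqlite3

# Made-up sample data in the list-of-dicts shape from the question.
menu_data = [
    {"item_id": "soup", "name": "Tomato soup", "price": 4.5},
    {"item_id": "tea", "name": "Green tea", "price": 2.25},
]

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE menu_items (
        item_id TEXT,
        "key" TEXT,
        value NONE,
        PRIMARY KEY (item_id, "key")
    )
    """
)

# One row per dictionary entry; property names are stored as values, not columns.
rows = [
    (item["item_id"], key, value)
    for item in menu_data
    for key, value in item.items()
    if key != "item_id"
]
conn.executemany("INSERT INTO menu_items VALUES (?, ?, ?)", rows)


def load_item(item_id: str) -> dict:
    """Rebuild one menu item dictionary from its key/value rows."""
    item = {"item_id": item_id}
    query = 'SELECT "key", value FROM menu_items WHERE item_id = ?'
    for key, value in conn.execute(query, (item_id,)):
        item[key] = value
    return item


print(load_item("soup"))
```

Because the property names are now ordinary values, every statement can use query parameters and no ALTER TABLE is needed when a menu item gains a new property.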
