PostgreSQL UUID data type - Python

I'm building a platform with a PostgreSQL database (first time), but I have a few years of experience with Oracle and MySQL databases.
My question is about the UUID data type in Postgres.
I am using a UUIDv4 to identify a record in multiple tables, so a request to /users/2df2ab0c-bf4c-4eb5-9119-c37aa6c6b172 will respond with the user that has that UUID. I also have an auto-increment ID field for indexing.
My query is just a SELECT with a WHERE clause on the UUID. But when the user enters an invalid UUID like 2df2ab0c-bf4c-4eb5-9119-c37aa6c6b17 (without the last 2), the database responds with this error: invalid input syntax for uuid.
I was wondering why it returns this, because when you select on an integer-type column with a string value it does work.
Now I need to set up a middleware/check on each route that has a UUID-type parameter, because otherwise the server would crash.
Btw I'm using Flask 0.12 (Python) and PostgreSQL 9.6

UUID as defined by RFC 4122, ISO/IEC 9834-8:2005... is a 128-bit quantity ... written as a sequence of lower-case hexadecimal digits... for a total of 32 digits representing the 128 bits. (PostgreSQL docs)
There is no conversion from 31 hex digits of text to a 128-bit UUID (sorry). You have some options:
Convert to ::text in your query (not really recommended, because you'd be converting every row, every time).
SELECT * FROM my_table WHERE my_uuid::TEXT = 'invalid uid';
Don't store it as a UUID type. If you don't want / need UUID semantics, store it as a varchar.
Check your customer input (my recommendation). Conceptually, this is no different from asking for someone's age and getting 'ABC' as the response.
Postgres allows upper/lower case and is flexible about the use of hyphens, so a pre-check is simple: strip the hyphens, lowercase, count the [0-9a-f] characters, and if the count == 32 you have a workable UUID. Otherwise, rather than telling your user "not found", you can tell them "not a UUID", which is probably more user-friendly.
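A minimal sketch of that pre-check in Python, leaning on the standard library's uuid module to do the parsing (the helper name is my own):

import uuid

def is_valid_uuid(value):
    # uuid.UUID() tolerates upper/lower case and missing hyphens,
    # and raises ValueError (or TypeError for non-strings) otherwise
    try:
        uuid.UUID(value)
        return True
    except (ValueError, TypeError):
        return False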

The database is throwing an error because you're trying to match against a UUID-type column with a query value that isn't a valid UUID. This doesn't happen with integer or string queries because leaving off the last character of those still yields a valid integer or string, just not the one you probably intended.
You can either prevent passing invalid UUIDs to the database by validating your input (which you should be doing anyway for other reasons) or somehow trap on this error. Either way, you'll need to present a human-readable error message back to the user.
Also consider whether users should be typing in URLs with UUIDs in the first place, which isn't very user-friendly; if they're just clicking links rather than typing them, as usually happens, then how did that error even happen? There's a good chance that it's an attack of some sort, and you should respond accordingly.
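If you go the validation route, note that Werkzeug (which Flask routes through) ships a built-in uuid URL converter, so you can let the router reject malformed UUIDs with a 404 before your view ever runs. A sketch, with the view name being my own invention:

from flask import Flask

app = Flask(__name__)

@app.route('/users/<uuid:user_id>')
def get_user(user_id):
    # user_id arrives as a uuid.UUID instance; a malformed segment simply
    # never matches this route, so Flask answers 404 instead of the DB error.
    return 'user %s' % user_id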

Related

Django with Oracle DB - ORA-19011: Character string buffer too small

I have the following model for an Oracle database, which is not a part of my Django project:
class ResultsData(models.Model):
    RESULT_DATA_ID = models.IntegerField(primary_key=True, db_column="RESULT_DATA_ID")
    RESULT_XML = models.TextField(blank=True, null=True, db_column="RESULT_XML")

    class Meta:
        managed = False
        db_table = '"schema_name"."results_data"'
The RESULT_XML field in the database itself is declared as an XMLField. I chose to represent it as a TextField in the Django model, since it has no character limit.
When I try to download some data with that model, I get the following error:
DatabaseError: ORA-19011: Character string buffer too small
I figure it is because of the volume of data stored in the RESULT_XML field, since when I try to pull a record with just .values("RESULT_DATA_ID"), it works fine.
Any ideas on how I can work around this problem? Googling for answers did not yield anything so far.
UPDATED ANSWER
I have found a much better way of dealing with that issue - I wrote a custom field value Transform object, which generates the Oracle SQL query I was after:
OracleTransforms.py
from django.db.models import TextField
from django.db.models.lookups import Transform
class CLOBVAL(Transform):
    '''
    Oracle-specific transform for an XMLType field, which returns string data exceeding
    the buffer size (ORA-19011: Character string buffer too small) as a character LOB type.
    '''
    function = None
    lookup_name = 'clobval'

    def as_oracle(self, compiler, connection, **extra_context):
        return super().as_sql(
            compiler, connection,
            template='(%(expressions)s).GETCLOBVAL()',
            **extra_context
        )

# Needed for CLOBVAL to work as a .values('field_name__clobval') lookup in Django ORM queries
TextField.register_lookup(CLOBVAL)
With the above, I can now just write a query as follows:
from .OracleTransforms import CLOBVAL
ResultsData.objects.filter(RESULT_DATA_ID=some_id).values('RESULT_DATA_ID', 'RESULT_XML__clobval')
or
ResultsData.objects.filter(RESULT_DATA_ID=some_id).values('RESULT_DATA_ID', XML=CLOBVAL('RESULT_XML'))
This is the best solution for me, as I do get to keep using QuerySet, instead of RawQuerySet.
The only limitation I see with this solution for now is that I need to always request the field through the CLOBVAL lookup in my .values() calls, or Oracle DB will report ORA-19011 again, but I guess this still is a good outcome.
OLD ANSWER
So, I have found a way around the problem, thanks to Christopher Jones's suggestion.
ORA-19011 is the error Oracle DB replies with when the amount of data it would send back as a string exceeds the allowed buffer; the data needs to be sent back as a character LOB object instead.
Django does not have direct support for that Oracle-specific method (at least I did not find one), so the answer to the problem was a raw Django query:
query = 'select a.RESULT_DATA_ID, a.RESULT_XML.getClobVal() as RESULT_XML FROM SCHEMA_NAME.RESULTS_DATA a WHERE a.RESULT_DATA_ID=%s'
data = ResultsData.objects.raw(query, [id])
This way, you get back a RawQuerySet, which is the less known, less liked cousin of Django's QuerySet. You can iterate through the result, and RESULT_XML will contain a LOB field, which when read converts to a string type.
Handling string-encoded XML data is awkward, so I also employed the xmltodict Python package to get it into a bit more civilized shape.
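For illustration, parsing one row's XML could look like this (row standing in for one record from the RawQuerySet above):

import xmltodict

# The LOB reads back as a plain string; xmltodict turns it into a nested dict
parsed = xmltodict.parse(row.RESULT_XML)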
Next, I should probably look for a way to modify Django's getter for the RESULT_XML field only, and have it generate a query to Oracle DB with .getClobVal() method in it, but I will touch on that in a different StackOverflow question: Django - custom getter for 1 field in model

Django Raw Query with params on Table Column (SQL Injection)

I have a kinda unusual scenario but in addition to my sql parameters, I need to let the user / API define the table column name too. My problem with the params is that the query results in:
SELECT device_id, time, 's0' ...
instead of
SELECT device_id, time, s0 ...
Is there another way to do that through raw or would I need to escape the column by myself?
queryset = Measurement.objects.raw(
    '''
    SELECT device_id, time, %(sensor)s FROM measurements
    WHERE device_id=%(device_id)s AND time >= to_timestamp(%(start)s) AND time <= to_timestamp(%(end)s)
    ORDER BY time ASC;
    ''', {'device_id': device_id, 'sensor': sensor, 'start': start, 'end': end})
As with any potential for SQL injection, be careful.
But essentially this is a fairly common problem with a fairly safe solution. The problem, in general, is that query parameters are "the right way" to handle query values, but they're not designed for schema elements.
To dynamically include schema elements in your query, you generally have to resort to string concatenation. Which is exactly the thing we're all told not to do with SQL queries.
But the good news here is that you don't have to use the actual user input. While the possible query values are infinite, the set of valid schema elements is quite finite, so you can validate the user's input against that set.
For example, consider the following process:
User inputs a value as a column name.
Code compares that value (raw string comparison) against a list of known possible values. (This list can be hard-coded, or can be dynamically fetched from the database schema.)
If no match is found, return an error.
If a match is found, use the matched known value directly in the SQL query.
So all you're ever using are the very strings you, as the programmer, put in the code. Which is the same as writing the SQL yourself anyway.
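A minimal sketch of that check, assuming a hard-coded allow-list (the column and variable names here are hypothetical):

# Known sensor columns; this set could also be built from the model's fields
ALLOWED_SENSORS = {'s0', 's1', 's2'}

def validated_sensor(requested):
    if requested not in ALLOWED_SENSORS:
        raise ValueError('unknown sensor column: %r' % requested)
    # Safe to interpolate: the returned value is one of our own literals, not raw input
    return requested

column = validated_sensor(sensor)  # now safe to splice into the raw() SQL string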
It doesn't look like you need raw() for the example query you posted. I think the following queryset is very similar.
measurements = Measurement.objects.filter(
    device_id=device_id,
    time__gte=start,   # the raw query filtered the time column, converting
    time__lte=end,     # epoch values with to_timestamp(); datetimes assumed here
).order_by('time')

for measurement in measurements:
    print(getattr(measurement, sensor))
If you need to optimise and avoid loading other fields, you can use values() or only().
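For example, something like this (with sensor validated against an allow-list as above):

measurements = (Measurement.objects
                .filter(device_id=device_id, time__gte=start, time__lte=end)
                .order_by('time')
                .values('device_id', 'time', sensor))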

Peewee execute_sql with escaped characters

I have written a query which does some string replacement. I am trying to update a URL in a table, but the URL contains % signs, which causes a "tuple index out of range" exception.
If I print the query and run it manually it works fine, but through peewee it causes an issue. How can I get around this? I'm guessing it's because of the percent signs?
query = """
update table
set url = '%s'
where id = 1
""" % 'www.example.com?colour=Black%26white'
db.execute_sql(query)
The code you are currently sharing is incredibly unsafe, probably for the same reason that is causing your bug. Please do not use it in production, or you will be hacked.
Generally: you practically never want to use normal string operations like %, +, or .format() to construct a SQL query. Rather, you should use your SQL API/ORM's built-in mechanism for providing dynamic values to a query. In your case of SQLite in peewee, that looks like this:
query = """
update table
set url = ?
where id = 1
"""
values = ('www.example.com?colour=Black%26white',)
db.execute_sql(query, values)
The database engine will automatically take care of any special characters in your data, so you don't need to worry about them. If you ever find yourself encountering issues with special characters in your data, it is a very strong warning sign that some kind of security issue exists.
This is mentioned in the Security and SQL Injection section of peewee's docs.
Wtf are you doing? Peewee supports updates.
Table.update(url=new_url).where(Table.id == some_id).execute()

Can't find django auth user with PostgreSQL

from django.contrib.auth.models import User as DjangoUser

class Ward(models.Model):
    user = models.ForeignKey(DjangoUser, related_name='wards')
    group = models.ForeignKey(Group, related_name='wards')
This is my Django model, and I use this filter:
Group.objects.filter(wards__user=_user).all()
This code worked well with sqlite3, but it doesn't work in PostgreSQL:
operator does not exist: character varying = integer
LINE 1: ...rchive_ward"."group_id" ) WHERE "archive_ward"."user_id" = 1
I think it is caused by the user_id field in the archive_ward table: I found that this field's data type is character varying(20).
What can I do to fix this?
Try dropping the user table in the database and creating it again from scratch; syncing the database again should recreate the column with the correct type. Alternatively, you can work around it with a raw query.
You cannot compare an integer with a varchar. PostgreSQL is strict and does not do any magic typecasting for you. I'm guessing SQLite does the typecasting automagically (which is a bad thing).
If you want to compare these two different beasts, you will have to cast one to the other using the casting syntax ::, e.g. user_id::integer.
The Postgres error means you're comparing an integer to a string:
operator does not exist: character varying = integer
You could change the database model so user_id is of an integer type. Or you could cast the integer to string in Python:
Group.objects.filter(wards__user=str(_user)).all()

Database field length is not enforced

I am using web2py (Python) with a sqlite3 database (a test flowers database :) ). Here is the declaration of the table:
db.define_table('flower',
    Field('code', type='string', length=4, required=True, unique=True),
    Field('name', type='string', length=100, required=True),
    Field('description', type='string', length=250, required=False),
    Field('price', type='float', required=True),
    Field('photo', 'upload'));
This translates into the following SQL in sql.log:
CREATE TABLE flower(
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    code CHAR(4),
    name CHAR(200),
    description CHAR(250),
    price CHAR(5),
    photo CHAR(512)
);
But when I insert a value for the "code" field that is longer than 4 characters, it still inserts. I tried setting it to CHAR(10) (as a simple test) with the same result.
>>> db.flower.insert(code="123456789999", name="flower2", description="test flower 2", price="5.00")
1L
The same problem applies to all fields where I set the length. I also tried validation (although I am not 100% sure I am using it correctly). This is also within the flower model flowers.py, where the table is defined, and follows the table declaration:
db.flower.code.requires = [ IS_NOT_EMPTY(), IS_LENGTH(4), IS_NOT_IN_DB(db, 'flower.code')]
Documentation on this is here, but I can't find anything that says SQLite3 or web2py enforces the declared string length. I would expect to see an error on insert.
I would appreciate some help on this. What did I miss in the documentation? I used Symfony2 with PHP and MySQL before and would expect similar behaviour here.
SQLite is not like other databases. For all (most) practical purposes columns are untyped and INSERTs will always succeed and not lose data or precision (meaning, you can INSERT a text value into a REAL field if you want).
The declared type of the column is used for a system called "type affinity", which is described here: https://www.sqlite.org/datatype3.html.
Once you get used to it, it's kind of fun -- but definitely not what you'd expect!
You have to perform length checking in your code before issuing the INSERT.
As already mentioned, SQLite does not enforce character field length declarations (see https://www.sqlite.org/faq.html#q9). Furthermore, the IS_LENGTH validator is only applied if you do the insert via a SQLFORM submission or via the .validate_and_insert method -- if you just use the .insert method, the validators stored in the requires attribute are not applied, so you will get no error.
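For what it's worth, a quick sketch of that difference (the exact error wording is approximate):

# .insert() skips the requires validators, so the over-long code sneaks through;
# .validate_and_insert() runs them and rejects the row instead
result = db.flower.validate_and_insert(code="123456789999", name="flower2",
                                       description="test flower 2", price="5.00")
print(result.id)      # None: the row was rejected
print(result.errors)  # e.g. {'code': 'enter from 0 to 4 characters'}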
