It appears that the quote() and unquote() functions in django.contrib.admin.utils do not handle underscores in primary keys correctly. Specifically, I have some string primary keys that look like cus_C2xVQnht, and when I use the Django admin interface to edit them via the small pencil icon, the popup window displays an error like: Customer with ID "cusÂxVQnht" doesn't exist. Perhaps it was deleted? It is converting the _C2 to the code point U+00C2, i.e. Â. The same happens for other valid code points as well (00C7, 00C6, 001B, etc.).
If I manually go to the customers model and find the ID, I can pull it up and edit it just fine, but it seems the URL encoding doesn't work right when the primary key has an underscore in it.
After quite a lot of digging I managed to find these two functions buried deep inside django.contrib.admin.utils:
def quote(s):
    """
    Ensure that primary key values do not confuse the admin URLs by escaping
    any '/', '_' and ':' and similarly problematic characters.
    Similar to urllib.parse.quote(), except that the quoting is slightly
    different so that it doesn't get automatically unquoted by the Web browser.
    """
    if not isinstance(s, str):
        return s
    res = list(s)
    for i in range(len(res)):
        c = res[i]
        if c in """:/_#?;@&=+$,"[]<>%\n\\""":
            res[i] = '_%02X' % ord(c)
    return ''.join(res)

def unquote(s):
    """Undo the effects of quote(). Based heavily on urllib.parse.unquote()."""
    mychr = chr
    myatoi = int
    list = s.split('_')
    res = [list[0]]
    myappend = res.append
    del list[0]
    for item in list:
        if item[1:2]:
            try:
                myappend(mychr(myatoi(item[:2], 16)) + item[2:])
            except ValueError:
                myappend('_' + item)
        else:
            myappend('_' + item)
    return "".join(res)
They appear to be called somewhere in the admin template rendering process, but I couldn't figure out where, how often, or from which locations, so I decided to do a quick monkey patch to decide whether it was worth pursuing as a solution. I changed every underscore in quote() and unquote() to a dot, except for the one in the list of problem characters in quote(). For example:
'_%02X' in quote() becomes '.%02X'
split('_') in unquote() becomes split('.')
myappend('_' + item) in unquote() becomes myappend('.' + item)
Upon doing this, the admin works correctly and it appears that the links attached to the edit icons on related fields are to the correct model instances, so I can edit them by clicking the pencil icons and don't get the error message noted above.
All that said, I can't seem to find a way to safely override these two methods. I really would rather not change the primary keys to eliminate the underscores because there are a lot of linked models in my database and it just seems like it will become a huge pain. This fix seems much easier and more reliable, and given that it worked properly on previous versions of Django I don't see how it's a bad idea to implement.
So, how can I override those methods? Or, as a related question, is there something I can do in the __str__ methods of my models to alleviate this problem? I'd do that much sooner than start writing custom classes that override Django admin internals. If there is no other solution, I would need some help in properly restructuring my database to adjust the primary keys, but I can say that these keys work perfectly on the "old" site I'm working on, which runs Django 1.11.6 and Python 2.7.9 (vs the current Django 2.1.1 and Python 3.6.5)
Please let me know if I can provide any more info. Thank you!!
This is fixed in django 2.2. See https://github.com/django/django/commit/e4df8e6dc021fa472fa77f9b835db74810184748
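For anyone who wants to see the mangling in isolation, here is a minimal sketch that re-implements the two functions from the question (condensed for illustration, not the actual Django source) and runs them on the key from the question. It suggests the key is only mangled when unquote() receives a value that never went through quote(). That reading of the cause is an assumption based on this experiment, not something taken from the linked commit.

```python
# Condensed re-implementation of the quote()/unquote() pair shown in the
# question; for illustration only, not the actual Django source.
SPECIAL_CHARS = """:/_#?;@&=+$,"[]<>%\n\\"""


def quote(s):
    """Escape each problematic character as '_XX' (two uppercase hex digits)."""
    if not isinstance(s, str):
        return s
    return ''.join('_%02X' % ord(c) if c in SPECIAL_CHARS else c for c in s)


def unquote(s):
    """Turn every '_XX' back into chr(0xXX); leave other underscores alone."""
    head, *rest = s.split('_')
    res = [head]
    for item in rest:
        if item[1:2]:
            try:
                res.append(chr(int(item[:2], 16)) + item[2:])
            except ValueError:
                res.append('_' + item)
        else:
            res.append('_' + item)
    return ''.join(res)


pk = 'cus_C2xVQnht'
quoted = quote(pk)             # the underscore itself is escaped as '_5F'
round_trip = unquote(quoted)   # quoting first, then unquoting, restores pk
mangled = unquote(pk)          # unquoting the raw key decodes '_C2' as U+00C2

print(quoted, round_trip, mangled)
```

In this sketch the quote/unquote pair round-trips the key correctly, and only the unquote-without-quote path reproduces the "cusÂxVQnht" value from the error message.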
Similar to this question I want to extract the info of a cron job trigger from APScheduler.
However, I need the "day_of_week" field and not everything. Using
for job in scheduler.get_jobs():
    for f in job.trigger.fields:
        print(f.name + " " + str(f))
I can see all the fields, e.g. week, hour, day_of_week, but
job.trigger.day_of_week is apparently not an attribute of the CronTrigger object. I'm confused about what kind of object job.trigger is and how its fields are packed. I tried to read the code on GitHub, but it is even more puzzling.
How do I extract only the one field day_of_week, and how is this trigger class structured?
Diving deeper I found that
apscheduler.triggers.cron.fields.DayOfWeekField
I can find it by indexing job.trigger.fields[4], which seems like really bad style, since it depends on the position of the field. What I get is this DayOfWeekField, from which, comically, I am not able to retrieve its value either:
a.get_value
<bound method DayOfWeekField.get_value of DayOfWeekField('day_of_week', '1,2,3,4')>
The structure of the fields is coded here, but I don't know what to do with dateval, the argument of get_value().
Eventually, after hopefully understanding the concept, I want to do
if job-day_of_week contains mon
if job-day_of_week == '*'
print ( job-day_of_week )
I am grateful for any suggestions/hints!
Looking at the code, you should be able to get the day_of_week field without hardcoding the index by using the CronTrigger class's FIELD_NAMES property, e.g.
from apscheduler.triggers.cron import CronTrigger

dow_index = CronTrigger.FIELD_NAMES.index('day_of_week')
dow = job.trigger.fields[dow_index]
Getting the value of the field is a bit more complicated, but it appears that BaseField implements __str__, which should give you the expression that created the field as a string that you can parse to find what you want:
dow_value_as_string = str(dow)
if 'mon' in dow_value_as_string:
    pass  # do something
if dow_value_as_string == "*":
    pass  # do something else
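If you would rather not depend on the index at all, a lookup by field name is another option. The following is only a sketch: StubField and StubTrigger are hypothetical stand-ins so that the snippet runs without APScheduler installed, and it relies solely on what the question shows, namely that each entry of job.trigger.fields has a .name and a useful str(). Whether the real string contains names like 'mon' or numbers like '1,2,3,4' depends on how the trigger was created.

```python
class StubField:
    """Hypothetical stand-in for an APScheduler field: a name plus str()."""

    def __init__(self, name, expression):
        self.name = name
        self.expression = expression

    def __str__(self):
        return self.expression


class StubTrigger:
    """Hypothetical stand-in exposing .fields the way job.trigger does."""

    def __init__(self, **expressions):
        self.fields = [StubField(n, e) for n, e in expressions.items()]


def field_as_string(trigger, name):
    """Return str() of the field called `name`, or None if there is none."""
    for field in trigger.fields:
        if field.name == name:
            return str(field)
    return None


trigger = StubTrigger(hour='8', day_of_week='mon,wed')
dow = field_as_string(trigger, 'day_of_week')

# Split on commas so that only whole entries are compared.
runs_on_monday = dow is not None and 'mon' in dow.split(',')
runs_every_day = dow == '*'
```

With a real job you would pass job.trigger instead of the stub.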
Please read this whole question before answering, as it's not what you think... I'm looking at creating python object wrappers that represent hardware devices on a system (trimmed example below).
class TPM(object):
    @property
    def attr1(self):
        """
        Protects value from being accidentally modified after
        constructor is called.
        """
        return self._attr1

    def __init__(self, attr1, ...):
        self._attr1 = attr1
        ...

    @classmethod
    def scan(cls):
        """Calls Popen, parses to dict, and passes **dict to constructor"""
Most of the constructor inputs come from running command line tools via subprocess.Popen and then parsing the output to fill in object attributes. I've come up with a few ways to handle these, but I'm unsatisfied with what I've put together thus far and am trying to find a better solution. Here are the common catches that I've found. (Quick note: tool versions are tightly controlled, so parsed outputs don't change unexpectedly.)
Many tools produce variant outputs, sometimes including fields and sometimes not. This means that if you assemble a dict to be wrapped in a container object, the constructor is more or less forced to take **kwargs and not really have defined fields. I don't like this because it makes static analysis via pylint, etc less than useful. I'd prefer a defined interface so that sphinx documentation is clearer and errors can be more reliably detected.
In lieu of **kwargs, I've also tried setting default args to None for many of the fields, with what ends up as pretty ugly results. One thing I dislike strongly about this option is that optional fields don't always come at the end of the command line tool output. This makes it a little mind-bending to look at the constructor and match it up to tool output.
I'd greatly prefer to avoid constructing a dictionary in the first place, but using setattr to create attributes will make pylint unable to detect the _attr1, etc... and create warnings. Any ideas here are welcome...
Basically, I am looking for the proper Pythonic way to do this. To summarize, my requirements are the following:
Command line tool output parsed into a container object.
Container object protects attributes via properties post-construction.
Varying number of inputs to constructor, with working static analysis and error detection for missing required fields during runtime.
Is there a good way of doing this (hopefully without a ton of boilerplate code) in Python? If so, what is it?
EDIT:
Per some of the clarification requests, we can take a look at the tpm_version command. Here's the output for my laptop, but for this TPM it doesn't include every possible attribute. Sometimes, the command will return extra attributes that I also want to capture. This makes parsing to known attribute names on a container object fairly difficult.
TPM 1.2 Version Info:
Chip Version: 1.2.4.40
Spec Level: 2
Errata Revision: 3
TPM Vendor ID: IFX
Vendor Specific data: 04280077 0074706d 3631ffff ff
TPM Version: 01010000
Manufacturer Info: 49465800
Example code (ignore lack of sanity checks, please. trimmed for brevity):
def __init__(self, chip_version, spec_level, errata_revision,
             tpm_vendor_id, vendor_specific_data, tpm_version,
             manufacturer_info):
    self._chip_version = chip_version
    ...

@classmethod
def scan(cls):
    # Needs: from subprocess import PIPE, Popen
    tpm_proc = Popen("/usr/sbin/tpm_version", stdout=PIPE,
                     universal_newlines=True)
    stdout, stderr = tpm_proc.communicate()
    tpm_dict = dict()
    for line in stdout.splitlines():
        if "Version Info:" in line:
            pass
        else:
            split_line = line.split(":")
            attribute_name = (
                split_line[0].strip().replace(' ', '_').lower())
            tpm_dict[attribute_name] = split_line[1].strip()
    return cls(**tpm_dict)
The problem here is that this tool (or a different one whose source I may not be able to review to find every possible field) could add extra fields; my parser would still work, but my object would not capture them. That's what I'm really trying to solve in an elegant way.
I've been working on a more solid answer to this the last few months, as I basically work on hardware support libraries and have finally come up with a satisfactory (though pretty verbose) answer.
Parse the tool outputs, whatever they look like, into object structures that match how the tool views the device. These can have very generic dict structures, but should be broken out as much as possible.
Create another container class on top of that which uses attributes to access items in the tool-container objects. This enforces an API and can return sane errors across multiple versions of the tool, and across differing tool outputs!
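As a rough illustration of those two layers, here is a minimal sketch built on the tpm_version output from the question. The names parse_tool_output, REQUIRED and extras are hypothetical, and which fields count as required is an assumption made for the example.

```python
SAMPLE_OUTPUT = """\
  TPM 1.2 Version Info:
  Chip Version:        1.2.4.40
  Spec Level:          2
  TPM Vendor ID:       IFX
  TPM Version:         01010000
"""


def parse_tool_output(text):
    """Layer 1: generic 'Key: value' parse that keeps every field it sees."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or not value.strip():
            continue  # blank lines and the 'Version Info:' header
        fields[key.strip().replace(" ", "_").lower()] = value.strip()
    return fields


class TPM:
    """Layer 2: fixed, read-only API on top of the raw parsed fields."""

    REQUIRED = ("chip_version", "tpm_vendor_id")  # assumed for this sketch

    def __init__(self, raw):
        missing = [name for name in self.REQUIRED if name not in raw]
        if missing:
            raise ValueError("tool output is missing: " + ", ".join(missing))
        self._raw = dict(raw)

    @property
    def chip_version(self):
        return self._raw["chip_version"]

    @property
    def errata_revision(self):
        """Optional field: None when the tool did not report it."""
        return self._raw.get("errata_revision")

    @property
    def extras(self):
        """Fields the tool reported that this class has no property for."""
        known = {"chip_version", "tpm_vendor_id", "errata_revision"}
        return {k: v for k, v in self._raw.items() if k not in known}


tpm = TPM(parse_tool_output(SAMPLE_OUTPUT))
```

Because the properties are declared explicitly, static analysis sees a defined interface, while unexpected fields still end up in extras instead of being dropped.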
I'm trying to use tastypie's get_list function but I can't make it work. I've looked for documentation about it but can't find any.
Anyway, I have a list of item ids and an ItemResource. I'm trying to return a list of serialized objects.
So I just want to do something like this:
item_resource = ItemResource()
item_ids = my_item_id_list
return item_resource.get_list(request, id=item_ids)
But of course it's not working.
What would be the correct syntax to do that ?
Thx !
Unless your ItemResource accepts filters (more here), you have to copy-paste all the stuff from here, lines #1306 - #1313.
The point is that get_list results get filtered only by obj_get_list (initial filters), and apply_filters (request-specific filters) so you have to skip directly to the serialization part (you can include the pagination part, if needed).
This is one of the cases where django-rest-framework appears to be better than django-tastypie: it refactors serialization out into a separate class, avoiding the code duplication.
I have a strange error using the built in webserver in Django (haven't tested against Apache as I'm in active development). I have a url pattern that works for short url parameters (e.g. Chalk%20Hill), but locks up python on this one
http://localhost:8000/chargeback/checkDuplicateProject/Bexar%20Street%20Phase%20IV%20Brigham%20Ln%20to%20Myrtle%20St
The get request just says pending, and never returns, and I have to force quit python to get the server to function again. What am I doing wrong?
EDIT:
In continuing testing, it's strange, if I just enter the url, it returns the correct json response. Then it locks python. While I'm in the website, though, it never returns, and locks python.
urls:
url(r'^chargeback/checkDuplicateProject/(?P<aProjectName>(\w+)((\s)?(-)?(\w+)?)*)/$', 'chargeback.views.isProjectDuplicate'),
views:
def isProjectDuplicate(request, aProjectName):
    # count the number of matching project names
    p = Project.objects.filter(projectName__exact = aProjectName).count()
    # if > 0, the project is a duplicate
    if p > 0:
        return HttpResponse('{"results":["Duplicate"]}', mimetype='application/json')
    else:
        return HttpResponse('{"results":["Not Duplicate"]}', mimetype='application/json')
Model:
class Project(models.Model):
    projectName = models.TextField('project name')
    department = models.ForeignKey('Department')

    def __unicode__(self):
        return self.projectName
The accepted answer is spot on about the regex, but since we're discussing optimization, I thought I should note that the code for checking whether a project exists could be modified to generate a much quicker query, especially in other contexts where you could be counting millions of rows needlessly. Call this 'best practices' advice, if you will.
p = Project.objects.filter(projectName__exact = aProjectName).count()
if p > 0:
could instead be
if Project.objects.filter(projectName__exact=aProjectName).exists():
for two reasons.
First, p is not used for anything else, so there is no need to store it in a variable: dropping it improves readability, p is an obscure variable name, and the best code is no code at all.
Secondly, exists() only asks the database whether at least one matching row exists, so it can stop at the first match, whereas count() has to count every matching row. Please see the official QuerySet API docs, a related question on Stack Overflow and the discussion about the latter on the django-developers group.
Additionally, it is customary in python (and Django, naturally) to name your fields lower_cased_separated_by_underscores. Please see more about this on the Python Style Guide (PEP 8).
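To make the count-versus-exists point above concrete without a Django project, here is a small sketch using the standard library's sqlite3 module. The two SQL statements are only roughly the shapes of query that count() and exists() produce; the table name, column name and helper functions are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project (id INTEGER PRIMARY KEY, project_name TEXT)")
conn.executemany(
    "INSERT INTO project (project_name) VALUES (?)",
    [("Chalk Hill",), ("Chalk Hill",), ("Bexar Street",)],
)


def is_duplicate_count(name):
    """count()-style check: the database counts every matching row."""
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM project WHERE project_name = ?", (name,)
    ).fetchone()
    return n > 0


def is_duplicate_exists(name):
    """exists()-style check: the database may stop at the first match."""
    row = conn.execute(
        "SELECT 1 FROM project WHERE project_name = ? LIMIT 1", (name,)
    ).fetchone()
    return row is not None
```

Both helpers give the same answer; the difference is only in how much work the database has to do when many rows match.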
Since you are going to check whether aProjectName already exists in the database, there's no need for you to make the regex so complicated.
I suggest you simplify the regex to
url(r'^chargeback/checkDuplicateProject/(?P<aProjectName>[\w+\s-]*)/$', 'chargeback.views.isProjectDuplicate'),
For a further explanation, see the question url regex keeps django busy/crashing on the django-users group.
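The original pattern nests optional pieces inside a repeated group, ((\s)?(-)?(\w+)?)*, so when the overall match fails (for example because the trailing slash is missing) the regex engine can try an enormous number of ways to split the name before giving up. A single character class has only one way to consume each character. A minimal sketch with Python's re module, assuming the pattern is matched against the decoded path without the leading slash:

```python
import re

# The simplified pattern suggested above: one character class, no nesting.
SIMPLE = re.compile(
    r'^chargeback/checkDuplicateProject/(?P<aProjectName>[\w+\s-]*)/$'
)

name = 'Bexar Street Phase IV Brigham Ln to Myrtle St'
with_slash = 'chargeback/checkDuplicateProject/' + name + '/'
without_slash = 'chargeback/checkDuplicateProject/' + name

matched = SIMPLE.match(with_slash)
captured = matched.group('aProjectName') if matched else None

# No trailing slash: the match fails, but it fails immediately because the
# character class can only back off one character at a time.
failed = SIMPLE.match(without_slash)
```

Note that the URL in the question has no trailing slash while the pattern requires one, which is exactly the failing case that triggers the heavy backtracking in the original regex.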
When I was trying to access a tuple inside a list in the Django template format, I found out I couldn't access it with a[ 0 ][ 1 ], instead I had to use a.0.1.
Suppose that a is something like
a = [
    ( 'a', 'apple' ),
    ( 'b', 'bee' ),
]
Why doesn't Django template language support a[ 0 ][ 1 ]? In normal Python programming, a.0.1 would give you a syntax error.
The Django docs on the template API explain this nicely:
Dots have a special meaning in template rendering. A dot in a variable name signifies a lookup. Specifically, when the template system encounters a dot in a variable name, it tries the following lookups, in this order:
Dictionary lookup. Example: foo["bar"]
Attribute lookup. Example: foo.bar
List-index lookup. Example: foo[bar]
The template system uses the first lookup type that works. It's short-circuit logic. Here are a few examples:
>>> from django.template import Context, Template
>>> t = Template("My name is {{ person.first_name }}.")
>>> d = {"person": {"first_name": "Joe", "last_name": "Johnson"}}
>>> t.render(Context(d))
"My name is Joe."
>>> class PersonClass: pass
>>> p = PersonClass()
>>> p.first_name = "Ron"
>>> p.last_name = "Nasty"
>>> t.render(Context({"person": p}))
"My name is Ron."
>>> t = Template("The first stooge in the list is {{ stooges.0 }}.")
>>> c = Context({"stooges": ["Larry", "Curly", "Moe"]})
>>> t.render(c)
"The first stooge in the list is Larry."
Variable._resolve_lookup in django.template.base appears to be the function responsible for this, and it hasn't changed much since the oldest revision I can find.
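The lookup order quoted above can be imitated in a few lines of plain Python. This is a simplified sketch, not Django's implementation: the resolve function is hypothetical, and it leaves out method calls and the template engine's error handling.

```python
def resolve(value, dotted_path):
    """Follow each dot-separated bit using the three lookups, in order."""
    current = value
    for bit in dotted_path.split("."):
        try:
            current = current[bit]                # 1. dictionary lookup
        except (TypeError, KeyError, IndexError, ValueError, AttributeError):
            try:
                current = getattr(current, bit)   # 2. attribute lookup
            except (TypeError, AttributeError):
                current = current[int(bit)]       # 3. list-index lookup
    return current


a = [
    ('a', 'apple'),
    ('b', 'bee'),
]
context = {"a": a}

second_of_first = resolve(context, "a.0.1")   # same path as {{ a.0.1 }}
```

Here "a" is found by dictionary lookup, while "0" and "1" fall through to the list-index step, which is why the template syntax is a.0.1 rather than a[0][1].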
You can find some information about this in the django book:
The beginning of the chapter should explain why it works this way:
In the previous chapter, you may have noticed something peculiar in how we returned the text in our example views. Namely, the HTML was hard-coded directly in our Python code, like this:
def current_datetime(request):
    now = datetime.datetime.now()
    html = "<html><body>It is now %s.</body></html>" % now
    return HttpResponse(html)
Although this technique was convenient for the purpose of explaining how views work, it’s not a good idea to hard-code HTML directly in your views. Here’s why:
Any change to the design of the page requires a change to the Python code. The design of a site tends to change far more frequently than the underlying Python code, so it would be convenient if the design could change without needing to modify the Python code.
Writing Python code and designing HTML are two different disciplines, and most professional Web development environments split these responsibilities between separate people (or even separate departments). Designers and HTML/CSS coders shouldn’t be required to edit Python code to get their job done.
It’s most efficient if programmers can work on Python code and designers can work on templates at the same time, rather than one person waiting for the other to finish editing a single file that contains both Python and HTML.
For these reasons, it’s much cleaner and more maintainable to separate the design of the page from the Python code itself. We can do this with Django’s template system, which we discuss in this chapter.
...
Dot lookups can be summarized like this: when the template system encounters a dot in a variable name, it tries the following lookups, in this order:
Dictionary lookup (e.g., foo["bar"])
Attribute lookup (e.g., foo.bar)
Method call (e.g., foo.bar())
List-index lookup (e.g., foo[2])
The system uses the first lookup type that works. It’s short-circuit logic.
I would say the reason the Django template language doesn't support that style of accessing context data is that, by the time you need it, you are generally doing too much in the template rather than in the view that renders it.
Their template engine is deliberately lighter than some others that give you more direct, Pythonic access to data. Ideally you would format the context before passing it in.
You also have the ability to create your own template filters for more custom processing of data.
Specific to your question, accessing child members using dot notation is the Django template way of trying multiple approaches to resolve the member. It tries dictionary keys, attributes, etc. in a certain order. You just use dot notation for everything.