django - regex for optional url parameters

I have a view in django that can accept a number of different filter parameters, but they are all optional. If I have 6 optional filters, do I really have to write urls for every combination of the 6 or is there a way to define what parts of the url are optional?
To give you an example with just 2 filters, I could have all of these url possibilities:
/<city>/<state>/
/<city>/<state>/radius/<miles>/
/<city>/<state>/company/<company-name>/
/<city>/<state>/radius/<miles>/company/<company-name>/
/<city>/<state>/company/<company-name>/radius/<miles>/
All of these URLs point to the same view, and the only required params are city and state. With 6 filters, this becomes unmanageable.
What's the best way to go about doing what I want to achieve?

One method would be to make the regular expression read all the given filters as a single string, and then split them up into individual values in the view.
I came up with the following URL:
(r'^(?P<city>[^/]+)/(?P<state>[^/]+)(?P<filters>(?:/[^/]+/[^/]+)*)/?$',
'views.my_view'),
Matching the required city and state is easy. The filters part is a bit more complicated. The inner part - (?:/[^/]+/[^/]+)* - matches filters given in the form /name/value. However, a capturing group that is repeated by a quantifier such as * only keeps its last match - so if the url was /radius/80/company/mycompany/ only company/mycompany would be stored. Instead, we tell it not to capture the individual values (the ?: at the start), and put it inside a capturing block which will store all filter values as a single string.
The view logic is fairly straightforward. Note that the regular expression will only match pairs of filters - so /company/mycompany/radius/ will not be matched. This means we can safely assume we have pairs of values. The view I tested this with is as follows:
def my_view(request, city, state, filters):
    # Split into a list ['name', 'value', 'name', 'value']. Note we remove the
    # first character of the string as it will be a slash.
    split = filters[1:].split('/')
    # Map into a dictionary {'name': 'value', 'name': 'value'}.
    filters = dict(zip(split[::2], split[1::2]))
    # Get the values you want - the second parameter is the default if none was
    # given in the URL. Note all entries in the dictionary are strings at this
    # point, so you will have to convert to the appropriate types if desired.
    radius = filters.get('radius', None)
    company = filters.get('company', None)
    # Then use the values as desired in your view.
    context = {
        'city': city,
        'state': state,
        'radius': radius,
        'company': company,
    }
    return render_to_response('my_view.html', context)
Two things to note about this. First, it allows unknown filter entries into your view. For example, /fakefilter/somevalue is valid. The view code above ignores these, but you probably want to report an error to the user. If so, alter the code getting the values to
radius = filters.pop('radius', None)
company = filters.pop('company', None)
Any entries remaining in the filters dictionary are unknown values about which you can complain.
Second, if the user repeats a filter, the last value will be used. For example, /radius/80/radius/50 will set the radius to 50. If you want to detect this, you will need to scan the list of values before it is converted to a dictionary:
given = set()
for name in split[::2]:
    if name in given:
        # Repeated entry, complain to user or something.
        pass
    else:
        given.add(name)
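The parsing, the unknown-filter check and the repeated-filter check described above can be combined into one helper. This is only a minimal sketch and not part of the original answer: the function name parse_filters, the ALLOWED_FILTERS set and the choice of raising ValueError are assumptions made for illustration.

```python
# Hypothetical allow-list; replace with the filters your view supports.
ALLOWED_FILTERS = {"radius", "company"}


def parse_filters(filters):
    """Turn '/name/value/name/value' into a dict, rejecting bad input."""
    if not filters:
        return {}
    # Drop the leading slash, then split into ['name', 'value', ...].
    split = filters[1:].split("/")
    result = {}
    for name, value in zip(split[::2], split[1::2]):
        if name not in ALLOWED_FILTERS:
            raise ValueError("unknown filter: %s" % name)
        if name in result:
            raise ValueError("repeated filter: %s" % name)
        result[name] = value
    return result


print(parse_filters("/radius/80/company/mycompany"))
```

In the view you would catch the ValueError and turn it into whatever error response suits your site.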

This is absolutely the use-case for GET parameters. Your urlconf should just be /city/state/, then the various filters go on the end as GET variables:
/city/state/?radius=5&company=google
Now, in your view, you accept city and state as normal parameters, but everything else is stored in the request.GET QueryDict.
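As a rough sketch of what reading those optional filters could look like: the helper below accepts any mapping with a .get() method, so it runs without Django, and in a real view you would pass request.GET to it. The name read_filters and the decision to convert radius to an int are assumptions for illustration only.

```python
def read_filters(params):
    """Collect optional filters from a dict-like object such as request.GET."""
    radius = params.get("radius")
    company = params.get("company")
    return {
        # Query string values arrive as strings, so convert where needed.
        "radius": int(radius) if radius is not None else None,
        "company": company,
    }


# A plain dict stands in for request.GET here.
print(read_filters({"radius": "5", "company": "google"}))
print(read_filters({}))
```

Missing filters simply come back as None, so no URL pattern has to know about them.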

You could also make just one url (that only checks the start of the path, that should be the same) pointing to your view and then parse request.path in your view.
On the other hand, if you have a lot of optional filter parameters in various combinations, the best solution is very often to do the filtering via GET parameters, especially if the urls used for filtering don't need to be optimized for any search engine...

Try using something like this in your urls.py:
url(r'^(?P<city>[^/]+)/(?P<state>[^/]+)/(radius/(?P<miles>[^/]+)/|company/(?P<company_name>[^/]+)/)*$', 'view')
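To see what this pattern accepts, it can be tried outside Django with the re module. This is just a quick check under the assumption that the path is matched without its leading slash; the sample paths are made up.

```python
import re

# The pattern from the answer above, compiled on its own.
PATTERN = re.compile(
    r'^(?P<city>[^/]+)/(?P<state>[^/]+)/'
    r'(radius/(?P<miles>[^/]+)/|company/(?P<company_name>[^/]+)/)*$'
)

# Hypothetical sample paths covering no filter, one filter and both filters.
for path in (
    'austin/tx/',
    'austin/tx/radius/5/',
    'austin/tx/company/google/radius/5/',
):
    match = PATTERN.match(path)
    print(path, match.groupdict())
```

Filters that are absent from the path show up as None in the named groups, so the view needs defaults for miles and company_name.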

Related

LDAP extensible match filter LDAP_MATCHING_RULE_IN_CHAIN

When I run the following I end up with a good list of results:
base = 'OU=Security Groups,OU=Groups,DC=myserver,DC=com'
criteria = 'CN=My Example'
attributes = ['member', 'groupType', 'description', 'memberOf']
result = connection.search_ext_s(base, ldap.SCOPE_SUBTREE, criteria, attributes, sizelimit=0)
However I can't seem to find anything that helps me when using an LDAP_MATCHING_RULE_IN_CHAIN.
base = 'OU=Security Groups,OU=Groups,DC=myserver,DC=com'
criteria = '1.2.840.113556.1.4.1941:=CN=MatchedRuleChainExample'
attributes = ['member', 'groupType', 'description', 'memberOf']
result = connection.search_ext_s(base, ldap.SCOPE_SUBTREE, criteria, attributes, sizelimit=0)
The above always returns blank. Can anyone help me grasp this? I feel completely lost on how to get through the subgroups in Python.

This criteria syntax 1.2.840.113556.1.4.1941:=CN=MatchedRuleChainExample is wrong.
The string representation of an LDAP extensible match filter must be comprised of the following components in order :
An opening parenthesis
The name of the attribute type, or an empty string if none was provided
The string ":dn" if the dnAttributes flag is set, or an empty string if not
If a matching rule ID is available, then a string comprised of a colon followed by that OID, or an empty string if there is no matching rule ID
The string ":="
The string representation of the assertion value
A closing parenthesis
To sum it up, it should look like :
([<attr>][:dn][:<OID>]:=<assertion>)
# In your case, fixing the attribute position :
(cn:1.2.840.113556.1.4.1941:=MatchedRuleChainExample)
But there is another issue here : LDAP_MATCHING_RULE_IN_CHAIN only works when used with Distinguished Names (DN) type attributes (like member or memberOf that are commonly used with extensible match filter), but cn is not, so it can't work.
To grab all Security Groups that are members of CN=My Example, including nested groups, use the memberOf attribute with extensible match and apply it to the group's dn.
# Fixing the attribute type and assertion value :
(memberOf:1.2.840.113556.1.4.1941:=<groupDN>)
Also, you need to filter objectClass to match only group entries (group members could also be users or machines). In Active Directory, group entries have objectClass=group, so in the end the filter criteria should look like :
(&(objectClass=group)(memberOf:1.2.840.113556.1.4.1941:=CN=My Example,OU=Security Groups,OU=Groups,DC=myserver,DC=com))
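In Python, the filter string can be assembled with a small helper rather than typed by hand. The sketch below is an illustration only: the function names are made up, the escaping follows the reserved characters listed in RFC 4515, and the object class is passed in as a parameter because it depends on your directory (the value "group" in the example is an assumption).

```python
def escape_filter_value(value):
    """Escape the characters RFC 4515 reserves inside a filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


# OID of LDAP_MATCHING_RULE_IN_CHAIN, as used in the filters above.
MATCHING_RULE_IN_CHAIN = "1.2.840.113556.1.4.1941"


def nested_member_of_filter(group_dn, object_class):
    """Build (&(objectClass=...)(memberOf:<OID>:=<group_dn>))."""
    return "(&(objectClass=%s)(memberOf:%s:=%s))" % (
        escape_filter_value(object_class),
        MATCHING_RULE_IN_CHAIN,
        escape_filter_value(group_dn),
    )


group_dn = "CN=My Example,OU=Security Groups,OU=Groups,DC=myserver,DC=com"
# Assumption: use whichever object class your directory assigns to groups.
criteria = nested_member_of_filter(group_dn, "group")
print(criteria)
```

The resulting criteria string would then be passed to search_ext_s in place of the original one.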
cf. Active Directory Group Related Searches
Note that LDAP_MATCHING_RULE_IN_CHAIN is available only on Domain Controllers with Windows Server 2003 R2 (or above).

Check if string is in certain format in Python

I have string as below.
/customer/v1/123456789/account/
The id in the url is dynamic.
What I want to check is whether a given string matches the structure below, including both the first part and the second part: /customer/v1/<customer_id>/account
What I have done so far is this. however, I want to check if endpoints is totally matching to the structure or not.
endpoint_structure = '/customer/v1/'
endpoint = '/customer/v1/123456789/account/'
if endpoint_structure in endpoint:
    return True
return False
Endpoint structure might change as well.
For example: /customer/v1/<customer_id>/documents/<document_id>/ and there will be again given endpoint and I need to check if given endpoint fits with the structure.

You can use a regular expression:
import re
return re.match(r'^/customer/v1/\d+/account/$', endpoint)
or you can examine the beginning and the end:
return endpoint.startswith('/customer/v1/') and endpoint.endswith('/account/')
... though this doesn't attempt to verify that the stuff between the beginning and the end is numeric.
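Since the question mentions that the endpoint structure might change, one option is to build the regular expression from the structure string itself. This is a hedged sketch rather than a definitive implementation: the function names are made up, and it assumes every <placeholder> stands for one or more digits.

```python
import re


def structure_to_regex(structure):
    """Compile '/customer/v1/<customer_id>/account/' into an anchored regex."""
    # Splitting on a capturing group keeps placeholder names at odd indexes.
    parts = re.split(r'<([^<>]+)>', structure)
    pattern = ''
    for index, part in enumerate(parts):
        if index % 2:
            # Placeholder: assumed to stand for one or more digits.
            pattern += r'\d+'
        else:
            # Literal text: escape it so it matches verbatim.
            pattern += re.escape(part)
    return re.compile(pattern + r'\Z')


def endpoint_matches(structure, endpoint):
    return structure_to_regex(structure).match(endpoint) is not None


print(endpoint_matches('/customer/v1/<customer_id>/account/',
                       '/customer/v1/123456789/account/'))
```

The same helper can then be used with '/customer/v1/<customer_id>/documents/<document_id>/' without writing a second pattern by hand.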

You can solve this using a regular expression:
^(/customer/v1/)(\d)+(/account/)$
If you also want to specify a minimum length for customer_id (/customer/v1/<customer_id>/account), use the following regexp:
^(/customer/v1/)(\d){5,}(/account/)$
Here the customer_id must be at least 5 digits long.

pass a multi-value as parameters in pyramid URL Dispatch (add_route)

How do I configure and use a multidict in Pyramid? I have a route like this:
config.add_route('show_choosed_categories', '/categories/[list]')
and I generate the URLs like this:
${request.route_url('show_choosed_categories', categories=[1, 2] )}
In the view I would use:
request.GET.getall('categories')
Pyramid seems to support this through webob.multidict (a multi-value dictionary object): https://docs.pylonsproject.org/projects/webob/en/stable/api/multidict.html
But how do I use it with URL Dispatch?

The route you have configured would match only on a string literal [list]. Routes cannot match on Python objects, only strings and replacement markers. From Route Pattern Syntax under URL Dispatch:
A pattern segment (an individual item between / characters in the pattern) may either be a literal string (e.g., foo) or it may be a replacement marker (e.g., {foo}), or a certain combination of both.
Nonetheless you can extract a multidict from the request object.
# Conjugation of English verbs is horrible
config.add_route('show_chosen_categories', '/categories/')
Assuming you have a list of checkboxes named the same or a select multiple input in a form, with either input being named category, then your request parameters would be generated to look like this:
category=1&category=2
Then a request to /categories/ would match the route whatever its query string, and the request parameters would be usable in your view, depending on your form method:
# form method="POST"
request.POST.getall('category')
# form method="GET"
request.GET.getall('category')
>>> ['1', '2']
See Multidict under Request and Response Objects for further information.

You are probably looking for config.add_route('foo', '/categories/*subpath') and request.route_url('foo', subpath=(1, 2, 3)). Support for this is limited but it does work if it fits your use case. Note that an empty list is valid here, so your view needs to handle that.

Django: how to annotate based on only domain name and entire url with query params?

I am using django and I am trying to come up with a query that will allow me to do the following,
I have a column in the database called url. The url column has values that are very long. Basically the domain name followed by a long list of query parameters.
Eg:
https://www.somesite.com/something-interesting-digital-cos-or-make-bleh/?utm_source=something&utm_medium=email&utm_campaign=biswanyam%20report%20-%20digital%20cos%20or%20analog%20prey&ut
http://www.anothersite.com/holly-moly/?utm_source=something&utm_medium=email&tm_campaign=biswanyam%20report%20-%20digital%20cos%20or%20analog%20prey&ut
https://www.onemoresite.com/trinkle-star/?utm_source=something&utm_medium=email&utm_campaign=biswanyam%20report%20-%20digital%20cos%20or%20analog%20prey&ut
https://www.somesite.com/nothing-interesting-bleh/?utm_source=something&utm_medium=email&utm_campaign=biswanyam%20report%20-%20digital%20cos%20or%20analog%20prey&ut
I want a django query that can basically give me an annotated count of urls with the same domain name regardless of the query parameters in the URL.
So essentially this is what I am looking for,
[
    {'url': 'https://www.somesite.com/something-interesting-digital-cos-or-make-bleh', 'count': 127},
    {'url': 'http://www.anothersite.com/holly-moly', 'count': 87},
    {'url': 'https://www.onemoresite.com/trinkle-star', 'count': 94},
    {'url': 'https://www.somesite.com/nothing-interesting-bleh', 'count': 72},
]
I tried this query,
Somemodel.objects.filter(url__iregex='http.*\/\?').values('url').annotate(hcount=Count('url'))
This doesn't work as expected. It does an entire URL match along with the query parameters instead of matching only the domain name. Can someone please tell me how I might accomplish this or at least point me in the right direction. Thanks

This might not be possible because you cannot group by partial information in a field. If you really want to achieve this, you might want to consider changing your schema. You should store the url and its parameters separately, as 2 model fields. Then you would have a method (or, if you want it to look like an attribute, a method with the @property decorator) that combines them and returns the whole url. It wouldn't be too hard to split them in a migration/script to fit the new schema.
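The splitting step for such a migration or script could look roughly like the sketch below, which uses the standard library's urllib.parse. The helper name split_url and the sample URLs are assumptions for illustration. The counting at the end is done in Python with a Counter, not in the database, purely to show that the split values group as intended.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit


def split_url(url):
    """Return (url_without_query, query_string) for one stored URL."""
    parts = urlsplit(url)
    base = urlunsplit((parts.scheme, parts.netloc, parts.path, '', ''))
    return base, parts.query


# Hypothetical sample rows standing in for values of the url column.
urls = [
    'https://www.somesite.com/nothing-interesting-bleh/?utm_source=a&utm_medium=email',
    'https://www.somesite.com/nothing-interesting-bleh/?utm_source=b',
    'http://www.anothersite.com/holly-moly/?utm_source=a',
]

counts = Counter(split_url(url)[0] for url in urls)
for base, count in counts.items():
    print(base, count)
```

Once the base url lives in its own field, the usual values('url').annotate(Count(...)) query groups on it directly.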

Django: Arbitrary number of unnamed urls.py parameters

I have a Django model with a large number of fields and 20000+ table rows. To facilitate human readable URLs and the ability to break down the large list into arbitrary sublists, I would like to have a URL that looks like this:
/browse/<name1>/<value1>/<name2>/<value2>/ .... etc ....
where 'name' maps to a model attribute and 'value' is the search criteria for that attribute. Each "name" will be treated like a category to return subsets of the model instances where the categories match.
Now, this could be handled with GET parameters, but I prefer more readable URLs for both the user's sake and the search engines. These URLs subsets will be embedded on each page that displays this model, so it seems worth the effort to make pretty URLs.
Ideally each name/value pair will be passed to the view function as a parameter named name1, name2, etc. However, I don't believe it's possible to define named patterns via a regex's matched text. Am I wrong there?
So, it seems I need to do something like this:
urlpatterns = patterns('',
    url(r'^browse/(?:([\w]+)/([\w]+)/)+$', 'app.views.view', name="model_browse"),
)
It seems this should match any sets of two name/value pairs. While it matches it successfully, it only passes the last name/value pair as parameters to the view function. My guess is that each match is overwriting the previous match. Under the guess that the containing (?:...)+ is causing it, I tried a simple repeating pattern instead:
urlpatterns = patterns('',
    url(r'^browse/([\w]+/)+$', 'app.views.view', name="model_browse"),
)
... and got the same problem, but this time *args only includes the last matched pattern.
Is this a limitation of Django's url dispatcher, and/or Python's regex support? It seems either of these methods should work. Is there a way to achieve this without hardcoding each possible model attribute in the URL as an optional (.*) pattern?

A possibility that you might consider is matching the entire string of possible values within the url pattern portion and pull out the specific pieces within your view. As an example:
urlpatterns = patterns('',
    url(r'^browse/(?P<match>.+)/$', 'app.views.view', name='model_browse'),
)

def view(request, match):
    pieces = match.split('/')
    # even indexed pieces are the names, odd are values
    ...
No promises about the regexp I used, but I think you understand what I mean.
(Edited to try and fix the regexp.)
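The step of pairing up the pieces inside the view might look something like this. It is only a sketch under stated assumptions: the helper name pieces_to_lookup and the BROWSABLE_FIELDS allow-list are hypothetical, and the field names are invented for the example.

```python
# Hypothetical allow-list of model attributes that may appear in the URL.
BROWSABLE_FIELDS = {'color', 'size', 'brand'}


def pieces_to_lookup(match):
    """Convert 'name1/value1/name2/value2' into a dict of name -> value."""
    pieces = match.strip('/').split('/')
    if len(pieces) % 2:
        raise ValueError('expected name/value pairs, got an odd number of pieces')
    lookup = {}
    # Even-indexed pieces are the names, odd-indexed pieces are the values.
    for name, value in zip(pieces[::2], pieces[1::2]):
        if name not in BROWSABLE_FIELDS:
            raise ValueError('unknown field: %s' % name)
        lookup[name] = value
    return lookup


print(pieces_to_lookup('color/red/size/large'))
```

Checking names against an allow-list matters here because the dictionary would typically be expanded into a queryset filter, and arbitrary names from the URL should not reach the ORM.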

I agree with Adam, but I think the pattern in urls.py should be:
... r'^browse/(?P<match>.+)/$' ...
The '\w' will only match 'word' characters (letters, digits and underscore), but the '.' will match any character except a newline.

I have an alternative solution, which isn't very different from the previous one but is more refined:
url(r'^my_app/(((list\/)((\w{1,})\/(\w{1,})\/(\w{1,3})\/){1,10})+)$'
I've used unnamed url parameters and a repeating regexp. To avoid the "is not a valid regular expression: multiple repeat" error, I place a word at the beginning of the list.
I'm still working on the view that receives the list, but I think I'll go through the args or kwargs. I can't say exactly yet.
My 2 cents

Same answer came to me while reading the question.
I believe model_browse view is the best way to sort the query parameters and use it as a generic router.

I think the answer of Adam is more generic than my solution, but if you like to use a fixed number of arguments in the url, you could also do something like this:
The following example shows how to get all sales of a day for a location by entering the name of the store and the year, month and day.
urls.py:
urlpatterns = patterns('',
    url(r'^baseurl/location/(?P<store>.+)/sales/(?P<year>[0-9][0-9][0-9][0-9])-(?P<month>[0-9][0-9])-(?P<day>[0-9][0-9])/$', views.DailySalesAtLocationListAPIView.as_view(), name='daily-sales-at-location'),
)
Alternatively, you could also use the id of the store by changing (?P<store>.+) to (?P<store>[0-9]+). Note that location and sales are not keywords; they just improve the readability of the url.
views.py:
class DailySalesAtLocationListAPIView(generics.ListAPIView):
    def get(self, request, store, year, month, day):
        # here you can start using the values from the url
        print(store)
        print(year)
        print(month)
        print(day)  # the captured group is named day, not date
        # now start filtering your model
Hope it helps anybody!
Best regards,
Michael
