Is it possible to pass multiple values for a single URL parameter without using your own separator?
The backend expects an input parameter urls that can hold one or more values: it can be set to a single URL or to several. How can I set the urls parameter so that it carries multiple values? I can't use my own separator because the separator could be part of a value itself.
Example:
http://example.com/?urls=[value,value2...]
The urls parameter can be set to just http://google.com, or it can be set to http://google.com http://yahoo.com .... In the backend, I want to process each URL as a separate value.
http://.../?urls=foo&urls=bar&...
...
request.GET.getlist('urls')
The following is probably the best way of doing it: don't specify a delimited list of URLs; instead, use the fact that you can specify the same parameter name multiple times, e.g.:
http://example.com/?url=http://google.co.uk&url=http://yahoo.com
The URL list can then be retrieved via request.GET.getlist('url')
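As a framework-independent illustration, the repeated-parameter approach can be sketched with Python's standard urllib.parse module. The variable names and sample values are assumptions for the example, and parse_qs only stands in for what request.GET.getlist does in Django.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Sample values (assumed); note that one of them contains ',' and '?'.
values = ["http://google.com", "http://yahoo.com/?q=a,b"]

# A list of (name, value) pairs repeats the parameter name and
# percent-encodes each value, so no custom separator is needed.
query = urlencode([("urls", v) for v in values])
full_url = "http://example.com/?" + query

# On the receiving side, parse_qs collects repeated names into a list.
received = parse_qs(urlsplit(full_url).query)["urls"]
print(received)
```

Because every value is percent-encoded on its own, characters that would clash with a hand-picked separator survive the round trip unchanged.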
I have a list of links, and I want to see if they're listed in my disavow file.
My disavow file contains both URLs (e.g. http://getpaydayloan.org/blog/blog-how-to-apply-for-online-payday-loans-san) as well as whole domains, listed as domain:getpaydayloan.org.
The new URLs file holds URLs only, e.g. http://getpaydayloan.org/blog/blog-how-to-apply-for-online-payday-loans-san
I want to see if the new URLs are already in the disavow file. I am currently generating a diff using diff = set(url_set)-set(disavow_urls), but I also need to check to see if they are in the disavow file using the domain:url.com format.
How would I do something like that?
In case it helps, here is the whole script: https://github.com/growth-austen/disavow_automator
Here is a function to check if the url contains any of the disavowed domains.
def inDisavow(url, disavowDomainList):
    # Return True if any disavowed domain appears as a substring of the URL.
    for domain in disavowDomainList:
        if domain in url:
            return True
    return False
Some alternative definitions to David's function for fun:
return any(domain in url for domain in disavowDomainList)
return any(map(url.__contains__, disavowDomainList))
(replace map with itertools.imap in Python 2 for memory efficiency)
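A plain substring test can also match a domain that only appears in a path or query string. A slightly stricter variant, sketched below, compares the URL's hostname against the domain: entries and checks full URLs separately. The function names split_disavow and is_disavowed are hypothetical, and the sketch assumes one entry per line with optional # comment lines.

```python
from urllib.parse import urlsplit


def split_disavow(lines):
    """Separate 'domain:' entries from full-URL entries (assumed file format)."""
    domains, urls = set(), set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("domain:"):
            domains.add(line[len("domain:"):].lower())
        else:
            urls.add(line)
    return domains, urls


def is_disavowed(url, domains, urls):
    """True if the exact URL is listed, or its host is a disavowed domain or subdomain."""
    if url in urls:
        return True
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in domains)


domains, urls = split_disavow([
    "domain:getpaydayloan.org",
    "http://spam.example/page",
])
new_urls = [
    "http://getpaydayloan.org/blog/blog-how-to-apply-for-online-payday-loans-san",
    "http://good.example/?ref=getpaydayloan.org",
]
not_yet_disavowed = [u for u in new_urls if not is_disavowed(u, domains, urls)]
print(not_yet_disavowed)
```

The list comprehension at the end plays the role of the set difference in the question, but it also accounts for the domain: entries.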
I can get the filler variable from the URL below just fine.
url(r'^production/(?P<filler>\w{7})/$', views.Filler.as_view()),
In my view I can retrieve the filler as expected. However, it fails when I use a URL pattern like the one below.
url(r'^production/(?P<filler>)\w{7}/(?P<day>).*/$', views.CasesByDay.as_view()),
Both variables (filler, day) are blank.
You need to put the entire pattern for each parameter inside the parentheses of its named group. You did that in your first example but not in the second.
Try:
url(r'^production/(?P<filler>\w{7})/(?P<day>.*)/$', views.CasesByDay.as_view()),
See the official documentation for more information and examples: URL dispatcher
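The difference can be reproduced outside Django with the standard re module. This is a minimal sketch: the sample path is an assumption, and the two patterns are the ones from the question and from the fix above.

```python
import re

# Pattern from the question: both named groups are empty, so they capture ''.
broken = re.compile(r'^production/(?P<filler>)\w{7}/(?P<day>).*/$')
# Corrected pattern: the sub-patterns sit inside the named groups.
fixed = re.compile(r'^production/(?P<filler>\w{7})/(?P<day>.*)/$')

path = 'production/abc1234/2014-01-15/'  # assumed sample path

broken_groups = broken.match(path).groupdict()
fixed_groups = fixed.match(path).groupdict()
print(broken_groups)
print(fixed_groups)
```

Both patterns match the path, which is why the view is still called; only the captured values differ.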
I have started a scraping project, and I have a small problem with ItemLoader.
Suppose I have some ItemLoader in a scraper:
l = ScraperProductLoader(item=ScraperProduct(), selector=node)
l.add_xpath('sku', 'id/text()')
I would like to add a URL to the item loader based on the sku I have provided:
l.add_value('url', '?????')
...However, based on the documentation, I don't see a clear way to do this.
Options I have considered:
Input processor: Add a string, and pass the sku as the context somehow
Handle separately: Create the URL without using the item loader
How can I use loaded data to add a new value in an ItemLoader?
You can use the get_output_value() method:
get_output_value(field_name)
Return the collected values parsed using the output processor, for the given field. This method doesn’t populate or modify the item at all.
l.add_value('url', 'http://domain.com/' + l.get_output_value('sku'))
I'm developing an application using Bottle. How do I get the full query string when I receive a GET request?
I don't want to read individual parameters like:
param_a = request.GET.get("a","")
as I don't want to fix the number of parameters in the URL.
How can I get the full query string of the requested URL?
You can use the attribute request.query_string to get the whole query string.
Use request.query or request.query.getall(key) if you have more than one value for a single key.
For example, request.query.a will return the param_a you wanted, request.query.b will return the parameter for b, and so on.
If you only want the raw query string, you can use @halex's answer.
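If the raw string is all you have, it can be split into an open-ended set of parameters with the standard library. The sketch below uses a hypothetical sample string in place of the value of request.query_string and does not depend on Bottle.

```python
from urllib.parse import parse_qsl

# Hypothetical raw query string, standing in for request.query_string.
raw = "a=1&b=2&a=3&flag="

# parse_qsl keeps every pair in order, including repeated and empty values.
pairs = parse_qsl(raw, keep_blank_values=True)

# Group values by key without knowing the parameter names in advance.
grouped = {}
for key, value in pairs:
    grouped.setdefault(key, []).append(value)

print(grouped)
```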
In the Pyramid framework, the functions route_path and route_url are used to generate URLs from the route configuration. So, if I have a route:
config.add_route('idea', 'ideas/{idea}')
I am able to generate the url for it using
request.route_url('idea', idea="great")
However, sometimes I want to add extra GET parameters to generate a URL like:
ideas/great?sort=asc
How to do this?
I have tried
request.route_url('idea', idea='great', sort='asc')
But that didn't work.
You can add query arguments to the URL by passing a _query dictionary:
request.route_url('idea', idea='great', _query={'sort':'asc'})
If you are using Mako templates, _query={...} won't work; instead you need to do:
${request.route_url('idea', idea='great', _query=(('sort', 'asc'),))}
The tuple of 2-tuples works as a dictionary.
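That the two shapes carry the same information can be checked outside Pyramid with the standard library's urlencode, which also accepts either a mapping or a sequence of 2-tuples. This only illustrates the data shapes and is not Pyramid's own implementation; the path string is written by hand for the example.

```python
from urllib.parse import urlencode

# The same query expressed as a dictionary and as a tuple of 2-tuples.
as_dict = urlencode({"sort": "asc"})
as_pairs = urlencode((("sort", "asc"),))

# Hand-written path standing in for the generated route URL.
url = "ideas/great" + "?" + as_pairs
print(as_dict, as_pairs, url)
```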