So I am using Python Requests (http://docs.python-requests.org/en/latest/) to work with a REST API. My question is: when using Requests, which argument actually appends data to the URL for GET requests? I wasn't sure if I could use "payload" or if "params" does it.
It's all in the docs:
http://docs.python-requests.org/en/latest/user/quickstart/#passing-parameters-in-urls
As an example, if you wanted to pass key1=value1 and key2=value2 to
httpbin.org/get, you would use the following code:
payload = {'key1': 'value1', 'key2': 'value2'}
r = requests.get("http://httpbin.org/get", params=payload)
You can also easily test this yourself whenever you're not sure which of two options is the right one.
As the example shows, http://httpbin.org (a site by the requests author) is a nice way to test any web programming you're doing.
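To make the effect of params concrete, here is a minimal standard-library sketch of the URL that the call above ends up requesting. It is only an illustration of the resulting query string, not how Requests is implemented internally; with Requests itself you can inspect r.url after the call to see the final URL.

```python
from urllib.parse import urlencode

# Same dictionary as in the answer above.
payload = {"key1": "value1", "key2": "value2"}

# Passing params=payload to requests.get() results in a URL equivalent to this one.
url = "http://httpbin.org/get?" + urlencode(payload)
print(url)  # http://httpbin.org/get?key1=value1&key2=value2
```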
So I understand the concept of sending payloads with requests, but I am struggling to understand how to know what to send.
For example, payload = {'key1': 'value1', 'key2': 'value2'} is the payload,
but how do I find out what key1 is? Is it the element ID? I have worked closely with Selenium and I am looking at Requests as a faster alternative.
Thanks for your help, Jake.
Requests is a library that lets you fetch URLs using GET, POST, PUT, and other request methods. With some of those you can add a payload: extra data that the server uses to do something.
The payload is handled by the server, but you have to supply it, which you do by defining that dictionary. Which keys to send depends entirely on what the server expects to receive. I am assuming you read the following explanation:
Source
You often want to send some sort of data in the URL’s query string. If
you were constructing the URL by hand, this data would be given as
key/value pairs in the URL after a question mark, e.g.
httpbin.org/get?key=val. Requests allows you to provide these
arguments as a dictionary of strings, using the params keyword
argument. As an example, if you wanted to pass key1=value1 and
key2=value2 to httpbin.org/get, you would use the following code:
>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.get('https://httpbin.org/get', params=payload)
The payload here is a variable which defines the parameters of a request.
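As a rough illustration of where a payload ends up, the sketch below builds (but does not send) a GET and a POST request with the standard library. The same encoded dictionary goes into the URL for GET and into the request body for POST. The httpbin.org/post endpoint is an assumption used for illustration; with Requests the two cases correspond to the params and data keyword arguments.

```python
from urllib.parse import urlencode
from urllib.request import Request

payload = {"key1": "value1", "key2": "value2"}
encoded = urlencode(payload)

# GET: the payload travels in the query string of the URL.
get_request = Request("https://httpbin.org/get?" + encoded)

# POST: the payload travels in the request body instead.
post_request = Request("https://httpbin.org/post", data=encoded.encode("ascii"))

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

Nothing is sent over the network here; the Request objects are only constructed so their method, URL, and body can be inspected.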
I can fetch most of the details from the target website, but I am unable to get some specific information. Please guide me.
Target page: https://shop.adidas.ae/en/messi-16-3-indoor-boots/BA9855.html
My code is response.xpath('//ul[@class="product-size"]//li/text()').extract()
I need to fetch this data.
Thanks!
Often e-commerce websites have data in JSON format in the page source and then have JavaScript unpack it on the user's end.
In this case you can open up the page source with JavaScript disabled and search for keywords (like a specific size).
I found that in this case the data can be extracted with regular expressions:
import re
import json
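# raw string with escaped dots; on recent Scrapy versions use response.text instead of body_as_unicode()
data = re.findall(r'window\.assets\.sizesMap = (\{.+?\});', response.body_as_unicode())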
json.loads(data[0])
Out:
{'16': {'uk': '0k', 'us': '0.5'},
'17': {'uk': '1k', 'us': '1'},
'18': {'uk': '2k', 'us': '2.5'},
...}
Edit: More accurately, you probably want a different part of the JSON, but the approach is more or less the same:
data = re.findall(r'window\.assets\.sizes = (\{(?:.|\n)+?\});', response.body_as_unicode())
json.loads(data[0].replace("'", '"')) # replace single quotes with double quotes
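The same idea can be tried without Scrapy on a small HTML string. The snippet below is a sketch: the sample_html content is made up for illustration and only mirrors the variable name mentioned above, so the real page may be laid out differently.

```python
import json
import re

# Hypothetical page source; the real page is assumed to embed a similar assignment.
sample_html = """
<script>
window.assets.sizesMap = {"16": {"uk": "0k", "us": "0.5"}, "17": {"uk": "1k", "us": "1"}};
</script>
"""

# Capture the object literal between the assignment and the terminating "};".
match = re.search(r"window\.assets\.sizesMap\s*=\s*(\{.+?\});", sample_html, re.DOTALL)
sizes = json.loads(match.group(1)) if match else {}

print(sizes["16"]["us"])  # 0.5
```

The non-greedy pattern stops at the first "};", which works as long as that sequence does not occur inside the embedded data.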
The data you want to fetch is loaded by JavaScript. This is indicated explicitly by the tag's class="js-size-value ".
If you want to get it, you will need to use a rendering service. I suggest Splash; it is simple to install and simple to use. You will need Docker to install Splash.
I have a long list of URLs whose response codes I need to check, and each link is repeated 2-3 times. I have written this script to check the response code of each URL.
connection =urllib.request.urlopen(url)
return connection.getcode()
The URLs come in XML in this format
<entry key="something">url</entry>
<entry key="somethingelse">url</entry>
and I have to associate the response code with the key attribute, so I don't want to use a set.
I definitely don't want to make more than one request for the same URL, so I was checking whether urlopen caches responses, but I didn't find a conclusive answer. If it doesn't, what other technique can be used for this purpose?
You can store the results in a dictionary (urls = {}), keyed by URL, as you make requests, and check it before making a request to a URL you have already seen:
if url not in urls:
    connection = urllib.request.urlopen(url)
    urls[url] = connection.getcode()  # cache the status code for this URL
return urls[url]
By the way, if you make requests to the same URLs repeatedly (across multiple runs of the script) and need a persistent cache, I recommend using requests with requests-cache.
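For an in-memory cache within a single run, one possible variation on the dictionary approach is to memoize the lookup with functools.lru_cache. This is a sketch, not the answerer's code: the status_code name and the 10 second timeout are assumptions, and connection failures (urllib.error.URLError) are deliberately left unhandled.

```python
import urllib.error
import urllib.request
from functools import lru_cache


@lru_cache(maxsize=None)
def status_code(url):
    """Return the HTTP status code for url, requesting it at most once per run."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.getcode()
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is still available.
        return error.code
```

Calling status_code(url) a second time with the same URL returns the remembered value without opening a new connection.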
Why don't you create a python set() of the URLs? That way each url is included only once.
How are you associating the URL with the key? A dictionary?
You can use a dictionary to map the URL to its response and any other information you need to keep track of. If the URL is already in the dictionary, then you know the response. So you have one dictionary:
url_cache = {
"url1" : ("response", [key1,key2])
}
If you need to organize things differently it shouldn't be too hard with another dictionary.
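A minimal sketch of that structure, assuming the entry elements sit inside a single root element (called entries here, which is a guess about the real file) and that the status lookup is passed in as a function so the grouping can be tried without network access:

```python
import xml.etree.ElementTree as ET


def group_keys_by_url(xml_text):
    """Map each distinct URL to the list of entry keys that point at it."""
    keys_by_url = {}
    for entry in ET.fromstring(xml_text).iter("entry"):
        url = (entry.text or "").strip()
        keys_by_url.setdefault(url, []).append(entry.get("key"))
    return keys_by_url


def check_all(xml_text, fetch_status):
    """Call fetch_status once per distinct URL and copy the result to every key."""
    codes_by_key = {}
    for url, keys in group_keys_by_url(xml_text).items():
        code = fetch_status(url)
        for key in keys:
            codes_by_key[key] = code
    return codes_by_key


# Hypothetical input; the root element name is an assumption.
sample = """<entries>
  <entry key="something">http://example.com/a</entry>
  <entry key="somethingelse">http://example.com/a</entry>
  <entry key="third">http://example.com/b</entry>
</entries>"""

print(group_keys_by_url(sample))
# {'http://example.com/a': ['something', 'somethingelse'], 'http://example.com/b': ['third']}
```

Any function that takes a URL and returns a status code can be passed as fetch_status, so each distinct URL is requested only once while every key still gets its code.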
I have got data from POST like
first_name=jon&nick_name=harry
How can I change this to a Python dictionary, like:
{
"first_name":"jon",
"nick_name":"harry"
}
>>> urlparse.parse_qs("first_name=jon&nick_name=harry")
{'nick_name': ['harry'], 'first_name': ['jon']}
If you trust that the URL arguments will always be properly formatted, this would be a minimal solution:
dict(pair.split("=") for pair in urlargs.split("&"))
In code that's going to be publicly accessible though, you'll probably want to use a library that does error checking. If you're using Django, you'll probably have access to them already in the dictionary-like object HttpRequest.POST.
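On Python 3 the same functions live in urllib.parse. The sketch below shows parse_qs, which returns a list per key, and parse_qsl, which can be turned into a flat dictionary when each key appears only once. The percent-encoded sample value is made up to show the decoding that a plain split on "=" would miss.

```python
from urllib.parse import parse_qs, parse_qsl

body = "first_name=jon&nick_name=harry"

# parse_qs keeps every value in a list, since a key may be repeated.
print(parse_qs(body))  # {'first_name': ['jon'], 'nick_name': ['harry']}

# parse_qsl returns (key, value) pairs; dict() keeps the last value per key.
flat = dict(parse_qsl(body))
print(flat)  # {'first_name': 'jon', 'nick_name': 'harry'}

# Percent-escapes and "+" are decoded as well.
decoded = dict(parse_qsl("first_name=J%C3%B6n+Doe"))
print(decoded)  # {'first_name': 'Jön Doe'}
```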
There's lots of information on retrieving GET variables in a Python script. Unfortunately, I can't figure out how to send GET variables from a Python script to an HTML page, so I'm wondering if there's a simple way to do this.
I'm using Google App Engine webapp to develop my site. Thanks for your support!
Just append the GET parameters to the URL: request.html?param1=value1&param2=value2.
Now you could just create your string with some python variables which would hold the param names and values.
Edit: better to use Python's urllib, which also escapes special characters for you:
import urllib
params = urllib.urlencode({'param1': 'value1', 'param2': 'value2', 'param3': 'value3'})
url = "example.com?%s" % params
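On Python 3, urlencode lives in urllib.parse. The sketch below uses made-up parameter values containing a space and an ampersand, plus a list value, to show the escaping that hand-built strings would miss; the example.com URL is a placeholder.

```python
from urllib.parse import urlencode

# Hypothetical values chosen to need escaping.
params = {"param1": "value 1", "param2": "a&b", "tags": ["x", "y"]}

# doseq=True expands list values into repeated key=value pairs.
query = urlencode(params, doseq=True)
url = "https://example.com/request.html?" + query

print(url)
# https://example.com/request.html?param1=value+1&param2=a%26b&tags=x&tags=y
```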