I need help splitting the parameters out of a URL when using Python's requests.get.
Assuming I have this URL:
https://blabla.io/bla1/blas/blaall/messages?data=%7B%22limit_count%22%3A100%2C%22limit_size%22%3A1000%7D
and I call requests.get like this:
_get = requests.get("https://blabla.io/bla1/blas/blaall/messages?data=%7B%22limit_count%22%3A100%2C%22limit_size%22%3A1000%7D", headers={"Authorization":"MyToken 1234abcd"})
I checked _get.url and it returns
u'https://blabla.io/bla1/blas/blaall/messages?data=%7B%22limit_count%22%3A100%2C%22limit_size%22%3A1000%7D'
Then I tried the following to split out the parameter:
url = "https://blabla.io/bla1/blas/blaall/messages"
query = {"data[]":[{"limit_count":100, "limit_size":1000}]}
headers = {"Authorization":"MyToken 1234abcd"}
_get = requests.get(url, params=query, headers=headers)
_get.url returns the following result
u'https://blabla.io/bla1/blas/blaall/messages?data%5B%5D=limit_count&data%5B%5D=limit_size'
without the values 100 and 1000.
For a URL like this --> https://blabla.io/bla1/blas/blaall/messages?data=%7B%22limit_count%22%3A100%2C%22limit_size%22%3A1000%7D, how exactly do I split out its parameters?
Thank you for your help.
So you are looking for:
data={"limit_count":100,"limit_size":1000}
as your query params.
Unfortunately, requests will not flatten this nested structure; it treats any iterable value as multiple values for the key, so your nested dictionary is treated like:
query = {'data': ['limit_count', 'limit_size']}
Which is why you don't see 100 and 1000 in the end result.
You will need to flatten it into a string. You can use json.dumps() to create the required string (double quotes vs. single quotes, compact). Then requests will do the required URL encoding, e.g.:
In []:
import json, requests
data = {'limit_count': 100, 'limit_size': 1000}
query = {'data': json.dumps(data, separators=(',', ':'))}
requests.get('http://httpbin.org', params=query).url
Out[]:
'http://httpbin.org/?data=%7B%22limit_count%22%3A100%2C%22limit_size%22%3A1000%7D'
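As an offline check of the encoding described above, here is a minimal sketch that builds the same query string with the standard library's urlencode instead of sending a request. It assumes requests encodes a plain string value the same way, which is consistent with the output shown above.

```python
import json
from urllib.parse import urlencode

data = {"limit_count": 100, "limit_size": 1000}

# Compact JSON: no spaces after ',' or ':' so the encoded form matches the target URL.
compact = json.dumps(data, separators=(",", ":"))

# urlencode percent-encodes the braces, quotes, colons and commas.
query_string = urlencode({"data": compact})
print(query_string)
```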
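from urllib.parse import urlsplit, parse_qs
import requests
url = "https://blabla.io/bla1/blas/blaall/messages?data=%7B%22limit_count%22%3A100%2C%22limit_size%22%3A1000%7D"
parts = urlsplit(url)
params = parse_qs(parts.query)  # {'data': ['{"limit_count":100,"limit_size":1000}']}
# Strip the query from the URL; passing both the full URL and params would send "data" twice
base_url = parts._replace(query="").geturl()
headers = {"Authorization":"MyToken 1234abcd"}
_get = requests.get(base_url, params=params, headers=headers)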
As far as I know, the requests library is not meant for parsing URLs; it handles the requests themselves.
If you want a URL parser, use urllib.parse instead.
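To get from the encoded URL all the way back to the original values, the query can be split with urllib.parse and the data value decoded as JSON. This is a minimal sketch using only the URL from the question; the variable names are my own.

```python
import json
from urllib.parse import urlsplit, parse_qs

url = ("https://blabla.io/bla1/blas/blaall/messages"
       "?data=%7B%22limit_count%22%3A100%2C%22limit_size%22%3A1000%7D")

parts = urlsplit(url)

# The URL without its query string.
base_url = parts._replace(query="").geturl()

# parse_qs percent-decodes the values and returns a list per key.
raw_data = parse_qs(parts.query)["data"][0]

# The value itself is a JSON document, so decode it into a dict.
data = json.loads(raw_data)
print(base_url)
print(data)
```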
I'm trying to retrieve data from https://clinicaltrials.gov/ and, although I've specified the format as JSON in the request parameter:
fmt=json
the returned value is text by default.
As a consequence, I'm not able to retrieve the response with json().
Good:
import requests
response = requests.get('https://clinicaltrials.gov/api/query/study_fields?expr=heart+attack&fields=NCTId%2CBriefTitle%2CCondition&min_rnk=1&max_rnk=&fmt=json')
response.text
Not Good:
import requests
response = requests.get('https://clinicaltrials.gov/api/query/study_fields?expr=heart+attack&fields=NCTId%2CBriefTitle%2CCondition&min_rnk=1&max_rnk=&fmt=json')
response.json()
Any idea how to turn this text into JSON?
I've tried response.text, which works, but I want to retrieve the data with json().
You can use the following code snippet:
import requests, json
response = requests.get('https://clinicaltrials.gov/api/query/study_fields?expr=heart+attack&fields=NCTId%2CBriefTitle%2CCondition&min_rnk=1&max_rnk=&fmt=json')
jsonResponse = json.loads(response.content)
You should use the json package (it is built into Python, so you don't need to install anything); its json.loads() function converts the text into a Python object (here, a dictionary).
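If it is unclear whether the body really is JSON, json.loads can be wrapped so that a non-JSON body is reported instead of raising. The sketch below is only an illustration: parse_json_text is a hypothetical helper name and the sample body is made up, not a real API response.

```python
import json

def parse_json_text(text):
    """Return the decoded JSON value, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None

# Made-up sample body, for illustration only.
sample = '{"BriefTitle": ["Example study"], "Condition": ["Example condition"]}'
parsed = parse_json_text(sample)
print(parsed["BriefTitle"][0])

# An HTML or plain-text body is reported as None instead of raising.
not_json = parse_json_text("<html>not json</html>")
print(not_json)
```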
I have been trying to figure out how to use python-requests to send a request whose URL looks like:
http://example.com/api/add.json?name='hello'&data[]='hello'&data[]='world'
Normally I can build a dictionary and do:
data = {'name': 'hello', 'data': 'world'}
response = requests.get('http://example.com/api/add.json', params=data)
That works fine for most everything that I do. However, I have hit the url structure from above, and I am not sure how to do that in python without manually building strings. I can do that, but would rather not.
Is there something in the requests library I am missing or some python feature I am unaware of?
Also what do you even call that type of parameter so I can better google it?
All you need to do is put the values in a list and use a list-like string ('data[]') as the key:
data = {'name': 'hello', 'data[]': ['hello', 'world']}
response = requests.get('http://example.com/api/add.json', params=data)
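To see how such a query string reads on the receiving side, it can be parsed back with the standard library. This is a small sketch; the query string is the percent-encoded form of the name and data[] parameters from the question.

```python
from urllib.parse import parse_qs

# Percent-encoded form of name=hello&data[]=hello&data[]=world
query = "name=hello&data%5B%5D=hello&data%5B%5D=world"

# Repeated keys are collected into one list per key.
parsed = parse_qs(query)
print(parsed)
```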
What you are doing is correct. The resulting URL is the same as what you are expecting.
>>> payload = {'name': 'hello', 'data': 'hello'}
>>> r = requests.get("http://example.com/api/params", params=payload)
You can see the resulting URL:
>>> print(r.url)
http://example.com/api/params?name=hello&data=hello
According to the URL format, encoding the query string uses the following rules:
Letters (A–Z and a–z), digits (0–9) and the characters '.', '-', '~' and '_' are left as-is.
SPACE is encoded as + or %20.
All other characters are encoded as a %HH hex representation, with any non-ASCII characters first encoded as UTF-8 (or another specified encoding).
So data[] will not appear literally; the brackets are percent-encoded automatically according to these rules.
If you try to build a URL like:
http://example.com/api/add.json?name='hello'&data[]='hello'&data[]='world'
the output will be:
>>> payload = {'name': 'hello', "data[]": 'hello','data[]':'world'}
>>> r = requests.get("http://example.com/api/params", params=payload)
>>> r.url
u'http://example.com/api/params?data%5B%5D=world&name=hello'
This is because a duplicated dictionary key keeps only its last value, and data[] is encoded as data%5B%5D.
If data%5B%5D is not a problem (that is, the server is able to parse it correctly), then you can go ahead with it.
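The duplicate-key behaviour comes from Python dictionaries, not from the URL encoding. A short sketch, using the standard library's urlencode, shows the collapse and one way around it with a list of pairs:

```python
from urllib.parse import urlencode

# A dict literal with a repeated key silently keeps only the last value.
payload = {'name': 'hello', 'data[]': 'hello', 'data[]': 'world'}
print(payload)

# A list of (key, value) pairs can carry the same key more than once.
pairs = [('name', 'hello'), ('data[]', 'hello'), ('data[]', 'world')]
query = urlencode(pairs)
print(query)
```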
One solution, if using the requests module is not compulsory, is the urllib/urllib2 combination (Python 2):
import urllib, urllib2
payload = [('name', 'hello'), ('data[]', ('hello', 'world'))]
params = urllib.urlencode(payload, doseq=True)
sampleRequest = urllib2.Request('http://example.com/api/add.json?' + params)
response = urllib2.urlopen(sampleRequest)
It's a little more verbose and uses the doseq (sequence) option to encode the URL parameters, but I used it before I knew about the requests module.
For the requests module, the answer provided by @Tomer should work.
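On Python 3 the same idea lives in urllib.parse and urllib.request. The following is a sketch of the equivalent calls; it only builds the request object and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

payload = [('name', 'hello'), ('data[]', ('hello', 'world'))]

# doseq=True expands the tuple into one data[] pair per element.
params = urlencode(payload, doseq=True)

sample_request = Request('http://example.com/api/add.json?' + params)
print(sample_request.full_url)

# To actually send it: urllib.request.urlopen(sample_request)
```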
Some API servers expect a JSON array as a value in the URL query string. The requests params argument doesn't create a JSON array as the value of a parameter.
The way I fixed a similar problem was to use urllib.parse.urlencode to encode the query string, append it to the URL, and pass the result to requests,
e.g.
import requests
from urllib.parse import urlencode
query_str = urlencode(params)  # params: your dict of query parameters
url = base_url + "?" + query_str  # base_url: the endpoint URL without a query string
response = requests.get(url, headers=headers)
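A minimal sketch of passing a JSON array as a single query value, assuming the server wants it serialised with json.dumps. The endpoint and the ids parameter name are hypothetical and only serve the illustration.

```python
import json
from urllib.parse import urlencode

base_url = "http://example.com/api/items"  # hypothetical endpoint

# Serialise the list to a compact JSON string first, then URL-encode it.
params = {"ids": json.dumps([1, 2, 3], separators=(",", ":"))}
url = base_url + "?" + urlencode(params)
print(url)
```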
The solution is simply to use the well-known function urlencode:
>>> import urllib.parse
>>> params = {'q': 'Python URL encoding', 'as_sitesearch': 'www.urlencoder.io'}
>>> urllib.parse.urlencode(params)
'q=Python+URL+encoding&as_sitesearch=www.urlencoder.io'
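By default urlencode writes spaces as +. If a server expects %20 instead, the quote_via argument can be switched to urllib.parse.quote, as in this short sketch:

```python
from urllib.parse import urlencode, quote

params = {'q': 'Python URL encoding', 'as_sitesearch': 'www.urlencoder.io'}

# Default: quote_plus, which encodes spaces as '+'.
plus_style = urlencode(params)

# quote_via=quote encodes spaces as '%20' instead.
percent_style = urlencode(params, quote_via=quote)

print(plus_style)
print(percent_style)
```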
I'm having problems getting data from an HTTP response. The format unfortunately comes back with '\n' attached to all the key/value pairs. JSON says it must be a str and not "bytes".
I have tried a number of fixes so my list of includes might look weird/redundant. Any suggestions would be appreciated.
#!/usr/bin/env python3
import urllib.request
from urllib.request import urlopen
import json
import requests
url = "http://finance.google.com/finance/info?client=ig&q=NASDAQ,AAPL"
response = urlopen(url)
content = response.read()
print(content)
data = json.loads(content)
info = data[0]
print(info)
#got this far - planning to extract "id:" "22144"
When it comes to making requests in Python, I personally like to use the requests library. I find it easier to use.
import json
import requests
r = requests.get('http://finance.google.com/finance/info?client=ig&q=NASDAQ,AAPL')
json_obj = json.loads(r.text[4:])
print(json_obj[0].get('id'))
The above solution prints: 22144
The response data had a couple of unnecessary characters at the head, which is why I am only loading the relevant (JSON) portion of the response: r.text[4:]. This is the reason why you couldn't load it as JSON initially.
A bytes object has a decode() method, which converts bytes to a string. Checking the response in the browser, there seem to be some extra characters at the beginning of the string that need to be removed (a line feed character followed by two slashes: '\n//'). To skip the first three characters of the string returned by decode(), we add [3:] after the method call.
data = json.loads(content.decode()[3:])
print(data[0]['id'])
The output is exactly what you expect:
22144
JSON says it must be a str and not "bytes".
Your content is "bytes", and you can do this as below.
data = json.loads(content.decode())
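Rather than hard-coding a slice length, the prefix can be stripped only when it is present. This is a sketch under the assumption that the body looks like the one described above (a line feed, two slashes, then JSON); load_prefixed_json is a hypothetical helper and the sample bytes are made up to mimic that shape.

```python
import json

def load_prefixed_json(content):
    """Decode bytes and parse JSON, skipping a leading '//' guard if present."""
    text = content.decode("utf-8").lstrip()
    if text.startswith("//"):
        text = text[2:]
    return json.loads(text)

# Made-up bytes shaped like the response described above.
sample = b'\n// [{"id": "22144", "t": "AAPL"}]'
data = load_prefixed_json(sample)
print(data[0]["id"])
```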
I was wondering how to use the requests library to pull the text from a field in a JSON response. I wouldn't need Beautiful Soup for that, right?
If your response is indeed in JSON format, you can simply use the response's .json() method to access the fields, for example:
import requests
url = 'http://time.jsontest.com/'
r = requests.get(url)
# use .json() for json response data
r.json()
{u'date': u'03-28-2015',
u'milliseconds_since_epoch': 1427574682933,
u'time': u'08:31:22 PM'}
# to access the field
r.json()['date']
u'03-28-2015'
This will automatically parse the json response into Python's dictionary:
type(r.json())
dict
You can read more about response.json() in the requests documentation.
Alternatively just use Python's json module:
import json
d = json.loads(r.content)
print(d['date'])
03-28-2015
type(d)
dict
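Once the body is a dict, its fields behave like any other Python values. The sketch below works offline on a string copy of the response shown above and also converts the millisecond timestamp to a datetime; the variable names are my own.

```python
import json
from datetime import datetime, timezone

# String copy of the response body shown above.
body = ('{"date": "03-28-2015", '
        '"milliseconds_since_epoch": 1427574682933, '
        '"time": "08:31:22 PM"}')
d = json.loads(body)

# Plain key access, plus .get() for keys that may be absent.
date_field = d["date"]
missing = d.get("timezone")

# The epoch value is in milliseconds, so drop the last three digits to get seconds.
stamp = datetime.fromtimestamp(d["milliseconds_since_epoch"] // 1000, tz=timezone.utc)
print(date_field, missing, stamp.isoformat())
```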
Which version of Python are you using? From 2.6 onwards you can do this:
import json
with open(file_directory) as f:  # file_directory: path to your JSON file
    data = json.load(f)
print(data)
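A self-contained version of the file approach, as a sketch: it writes a small JSON file to a temporary location first so that it can run anywhere, then reads it back with json.load. The file contents are made up.

```python
import json
import os
import tempfile

# Create a throwaway JSON file so the example does not depend on an existing path.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False, encoding="utf-8") as tmp:
    json.dump({"date": "03-28-2015"}, tmp)
    file_path = tmp.name

# json.load reads and parses directly from the open file object.
with open(file_path, encoding="utf-8") as f:
    data = json.load(f)

os.remove(file_path)
print(data["date"])
```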
I want to dynamically query Google Maps through the Google Directions API. As an example, this request calculates the route from Chicago, IL to Los Angeles, CA via two waypoints in Joplin, MO and Oklahoma City, OK:
http://maps.googleapis.com/maps/api/directions/json?origin=Chicago,IL&destination=Los+Angeles,CA&waypoints=Joplin,MO|Oklahoma+City,OK&sensor=false
It returns a result in the JSON format.
How can I do this in Python? I want to send such a request, receive the result and parse it.
I recommend using the awesome requests library:
import requests
url = 'http://maps.googleapis.com/maps/api/directions/json'
params = dict(
    origin='Chicago,IL',
    destination='Los Angeles,CA',  # requests URL-encodes values, so use plain spaces rather than '+'
    waypoints='Joplin,MO|Oklahoma City,OK',
    sensor='false'
)
resp = requests.get(url=url, params=params)
data = resp.json() # Check the JSON Response Content documentation below
JSON Response Content: https://requests.readthedocs.io/en/master/user/quickstart/#json-response-content
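One detail worth checking when values go through params: they are URL-encoded for you, so a value that already contains + is encoded a second time. The sketch below shows the difference with the standard library's urlencode, on the assumption that requests encodes parameter values in the same way.

```python
from urllib.parse import urlencode

# A literal '+' in the value is escaped to %2B, so the server would see "Los+Angeles".
pre_encoded = urlencode({"destination": "Los+Angeles,CA"})

# A plain space is what becomes '+' in the query string.
plain = urlencode({"destination": "Los Angeles,CA"})

print(pre_encoded)
print(plain)
```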
The requests Python module takes care of both retrieving JSON data and decoding it, due to its builtin JSON decoder. Here is an example taken from the module's documentation:
>>> import requests
>>> r = requests.get('https://github.com/timeline.json')
>>> r.json()
[{u'repository': {u'open_issues': 0, u'url': 'https://github.com/...
So there is no need to use a separate module for decoding JSON.
requests has a built-in .json() method:
import requests
requests.get(url).json()
import urllib.request  # Python 3; on Python 2 this was urllib.urlopen
import json
url = 'http://maps.googleapis.com/maps/api/directions/json?origin=Chicago,IL&destination=Los+Angeles,CA&waypoints=Joplin,MO|Oklahoma+City,OK&sensor=false'
result = json.load(urllib.request.urlopen(url))
Use the requests library, pretty print the results so you can better locate the keys/values you want to extract, and then use nested for loops to parse the data. In the example I extract step by step driving directions.
import json, requests, pprint
url = 'http://maps.googleapis.com/maps/api/directions/json'
params = dict(
    origin='Chicago,IL',
    destination='Los Angeles,CA',  # requests URL-encodes values, so use plain spaces rather than '+'
    waypoints='Joplin,MO|Oklahoma City,OK',
    sensor='false'
)
data = requests.get(url=url, params=params)
binary = data.content
output = json.loads(binary)
# test to see if the request was valid
#print(output['status'])
# output all of the results
#pprint.pprint(output)
# step-by-step directions
for route in output['routes']:
    for leg in route['legs']:
        for step in leg['steps']:
            print(step['html_instructions'])
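The nested loops can be tried without calling the API by running them over a small stand-in dictionary. In this sketch the routes, legs, steps and html_instructions keys are the ones used above, while the instruction strings themselves are made up.

```python
# Made-up stand-in for a decoded directions response.
output = {
    "status": "OK",
    "routes": [
        {"legs": [
            {"steps": [
                {"html_instructions": "Head <b>south</b>"},
                {"html_instructions": "Turn <b>left</b>"},
            ]},
        ]},
    ],
}

# .get(..., []) keeps the loops from failing when a level is missing.
instructions = [
    step["html_instructions"]
    for route in output.get("routes", [])
    for leg in route.get("legs", [])
    for step in leg.get("steps", [])
]

for line in instructions:
    print(line)
```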
Just import requests and use the json() method:
source = requests.get("url").json()
print(source)
Or you can use this:
import json,urllib.request
data = urllib.request.urlopen("url").read()
output = json.loads(data)
print (output)
Try this:
import requests
import json
# Google Maps API.
link = 'http://maps.googleapis.com/maps/api/directions/json?origin=Chicago,IL&destination=Los+Angeles,CA&waypoints=Joplin,MO|Oklahoma+City,OK&sensor=false'
# Request data from link as 'str'
data = requests.get(link).text
# Convert 'str' to JSON
data = json.loads(data)
# Now you can access the JSON
for i in data['routes'][0]['legs'][0]['steps']:
    latitude = i['start_location']['lat']
    longitude = i['start_location']['lng']
    print('{}, {}'.format(latitude, longitude))
Also, for pretty-printed JSON on the console, you can use json.dumps with indent (remember to import json):
print(json.dumps(response.json(), indent=2))
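A quick offline illustration of what indent does, using a small made-up dictionary in place of response.json():

```python
import json

# Made-up stand-in for response.json().
payload = {"status": "OK", "routes": []}

# indent=2 puts each key on its own line, indented by two spaces.
pretty = json.dumps(payload, indent=2)
print(pretty)
```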