Get JSON response in Python, but in original JavaScript format - python

I am putting a JSON response into a variable via requests.json() like this:
response = requests.get(some_url, params=some_params).json()
This, however, converts JSON's original " to Python's ', true to True, and null to None.
This poses a problem when trying to save the response as text and then convert it back to JSON. Sure, I can use .replace() for all the conversions mentioned above, but even once I do that, I get other odd JSON decoder errors.
Is there any way in Python to get JSON response and keep original JavaScript format?

json() is the JSON decoder method. What you are looking at is a Python object, which is why it looks like Python.
Other formats are listed on the same page, starting from Response Content
.text: text - it has no separate link/paragraph, it is right under "Response Content"
.content: binary, as bytes
.json(): decoded JSON, as Python object
.raw: streamed bytes (so you can get parts of content as it comes)
You need .text for getting text, including JSON data.

You can get the raw text of your response with requests.get(some_url, params=some_params).text.
It is the json() method that converts the body into Python objects.
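To illustrate the difference without a network call, here is a minimal sketch using only the standard json module. The raw_text value is a made-up body standing in for response.text; json.loads() plays the role of response.json(), and json.dumps() is the supported way to turn the Python object back into JSON text (instead of str() plus .replace()).

```python
import json

# Hypothetical response body, standing in for response.text.
raw_text = '{"name": "example", "active": true, "parent": null}'

# Roughly what response.json() does: parse JSON text into Python objects.
data = json.loads(raw_text)
print(data)       # Python repr: single quotes, True, None

# Serialize back to real JSON text: double quotes, true, null.
json_text = json.dumps(data)
print(json_text)
```

Saving json_text (or the original .text) to a file gives you something any JSON parser can read back.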

Related

How to separate data in a Restful API?

I am working on a program that reads the content of a RESTful API from ImportIO. The connection works, and data is returned, but it's a jumbled mess. I'm trying to clean it so that it only returns ASINs.
I have tried using split() with a delimiter, with no success.
stuff = requests.get('https://data.import.io/extractor***')
stuff.content
I get the content, but I want to extract only Asins.
While .content gives you access to the raw bytes of the response payload, you will often want to convert them into a string using a character encoding such as UTF-8. The response will do that for you when you access .text.
response.text
Because the decoding of bytes to str requires an encoding scheme, requests will try to guess the encoding based on the response’s headers if you do not specify one. You can provide an explicit encoding by setting .encoding before accessing .text:
response.encoding = 'utf-8'
response.text
If you take a look at the response, you’ll see that it is actually serialized JSON content. To get a dictionary, you could take the str you retrieved from .text and deserialize it using json.loads(). However, a simpler way to accomplish this task is to use .json():
response.json()
When the body is a JSON object, .json() returns a dictionary, so you can access values in the object by key.
You can do a lot with status codes and message bodies. But, if you need more information, like metadata about the response itself, you’ll need to look at the response’s headers.
For More Info: https://realpython.com/python-requests/
What format is the returned information in? Typically RESTful APIs return data as JSON, so you will likely have luck parsing it as a JSON object.
https://realpython.com/python-requests/#content
stuff_dictionary = stuff.json()
With that, the content is returned as a dictionary and you will have a much easier time.
EDIT:
Since I don't have the full URL to test, I can't give an exact answer. Given the content type is CSV, using a pandas DataFrame is pretty easy. With a quick StackOverflow search, I found the following answer: https://stackoverflow.com/a/43312861/11530367
So I tried the following in the terminal and got a dataframe from it
from io import StringIO
import pandas as pd
pd.read_csv(StringIO("HI\r\ntest\r\n"))
So you should be able to perform the following
from io import StringIO
import pandas as pd
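df = pd.read_csv(StringIO(stuff.content.decode('utf-8-sig')))
Note that stuff.content is bytes, so it has to be decoded before it is passed to StringIO. Decoding with utf-8-sig also drops the first three bytes you have in your response: b'\xef\xbb\xbf'. Check the answer from Mark Tolonen for how to parse this.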
After that, selecting the ASIN (your second column) from your dataframe should be easy.
asins = df.loc[:, 'ASIN']
asins_arr = asins.array
The response is the byte string of CSV content encoded in UTF-8. The first three escaped byte codes are a UTF-8-encoded BOM signature. So stuff.content.decode('utf-8-sig') should decode it. stuff.text may also work if the encoding was returned correctly in the response headers.
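As a rough sketch of that decoding step, the snippet below uses only the standard library rather than pandas. The content bytes, the Title column, and the ASIN values are invented stand-ins for stuff.content; only the BOM prefix and the ASIN column name come from the discussion above.

```python
import csv
import io

# Hypothetical payload standing in for stuff.content:
# a UTF-8 BOM followed by CSV text with an ASIN column.
content = b'\xef\xbb\xbfTitle,ASIN\r\nWidget,B000000001\r\nGadget,B000000002\r\n'

# 'utf-8-sig' decodes UTF-8 and strips the leading BOM if present.
text = content.decode('utf-8-sig')

# newline='' leaves line endings untouched so the csv module can handle them.
reader = csv.DictReader(io.StringIO(text, newline=''))
asins = [row['ASIN'] for row in reader]
print(asins)
```

The same decoded text can be handed to pd.read_csv(StringIO(text)) if you prefer a DataFrame.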

How to extract a javascript object as json from a HTML page using python or nodejs?

https://yeastmine.yeastgenome.org/yeastmine/customQuery.do
The above webpage contains something like the line below. As far as I understand, JSON does not support single-quoted strings; only double quotes are allowed. So the content of the {} is not a valid JSON object. What is the best way to extract this object from the resulting HTML page and convert it to JSON? Thanks.
var helpMap = {'NcRNAGene': ...
This one mentions JSON.stringify. But I am not sure how to first get helpMap as JS object in the first place in python or nodejs.
Convert JS object to JSON string
In the console of that website you can write JavaScript. You are right that JSON.stringify is what you want here. Pass the JavaScript object helpMap to it as a parameter, and the result is the JSON-encoded string:
jsonString = JSON.stringify(helpMap)
console.log(jsonString)
You should be able to copy that json string out of your console (in chrome there will be a "Copy" button at the end of it).
Suppose the webpage is downloaded to x.html, run the following.
grep '^ \+var helpMap' < x.html | ./main.js
main.js has the following code.
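#!/usr/bin/env node
// Read the piped-in line from stdin, evaluate it, and print the resulting object.
const fs = require('fs');
const data = fs.readFileSync(process.stdin.fd);
eval(data.toString()); // defines helpMap in this scope (non-strict mode)
console.log(helpMap);
Then use JSON.stringify() on helpMap if necessary.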
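Since the question also asks about Python, here is a possible sketch of a different technique: pull the literal out with a regular expression and parse it with ast.literal_eval. This is an assumption-heavy shortcut, not a JavaScript parser. It only works when the object literal happens to also be a valid Python literal (quoted keys, plain string values), and the html sample and its descriptions are invented because the real helpMap contents are not shown above.

```python
import ast
import json
import re

# Hypothetical page content; the real helpMap values are not known here.
html = """
<script>
    var helpMap = {'NcRNAGene': 'A hypothetical description', 'ORF': 'Another "quoted" one'};
</script>
"""

# Capture everything between "var helpMap =" and the first "};".
match = re.search(r"var helpMap\s*=\s*(\{.*?\});", html, re.DOTALL)

# Works only because this literal is also valid Python dict syntax.
help_map = ast.literal_eval(match.group(1))

# json.dumps produces valid JSON with double-quoted strings.
json_string = json.dumps(help_map)
print(json_string)
```

For literals that use unquoted keys or other JavaScript-only syntax, evaluating the script in Node as shown above is the more robust route.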

Access JSON data in Python

header = {'Content-type': 'application/json','Authorization': 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' }
url = 'https://sandbox-authservice.priaid.ch/login'
response = requests.post(url, headers = header, verify=False).json()
token = json.dumps(response)
print token['ValidThrough']
I want to print the ValidThrough attribute in my webhook, which is received as JSON data via a POST call. I know this has been asked a number of times here, but print token['ValidThrough'] isn't working for me. I receive the error "TypeError: string indices must be integers, not str".
Since the response already seems to be in JSON, there is no need to use json.dumps.
json.dumps on a dictionary returns a string, which cannot be indexed by key, hence the error.
A requests response's .json() method already parses the JSON content into a Python object.
You should use that, but your code later serializes it back to a string, hence the error (token is a string representation of the dict you are expecting, not the dict). You should just omit the json.dumps(response) line and use response['ValidThrough'].
There's another error here: even if you assume that .json() returns a string that needs to be deserialized again, you should have used json.loads(response) to load it into a dict (not dumps, which serializes it again).
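The difference between the two calls can be reproduced without the API. In this sketch the response dict, its Token key, and both values are made up to stand in for what .json() might return; only the ValidThrough key comes from the question.

```python
import json

# Hypothetical stand-in for the dict returned by requests.post(...).json().
response = {'Token': 'abc123', 'ValidThrough': 7200}

# json.dumps serializes the dict to a str, so key lookups no longer work.
token = json.dumps(response)
try:
    token['ValidThrough']
except TypeError as exc:
    print('TypeError:', exc)

# Index the parsed dict directly instead.
print(response['ValidThrough'])

# json.loads is the inverse of json.dumps, should you ever hold a JSON string.
restored = json.loads(token)
print(restored['ValidThrough'])
```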

How to get the request body bytes in Flask?

The request's content-type is application/json, but I want to get the request body bytes. Flask will auto convert the data to json. How do I get the request body?
You can get the non-form-related data by calling request.get_data(). You can get the parsed form data by accessing request.form and request.files.
However, the order in which you access these two will change what is returned from get_data. If you call it first, it will contain the full request body, including the raw form data. If you call it second, it will typically be empty, and form will be populated. If you want consistent behavior, call request.get_data(parse_form_data=True).
You can get the body parsed as JSON by using request.get_json(), but this does not happen automatically like your question suggests.
See the docs on dealing with request data for more information.
To stream the data rather than reading it all at once, access request.stream.
If you want the data as a string instead of bytes, use request.get_data(as_text=True). This will only work if the body is actually text, not binary, data.
Files in a FormData request can be accessed through request.files, where you can select the file you included in the FormData, e.g. request.files['audio'].
If you want to access the actual bytes of the file (in our case 'audio') using .stream, first make sure that the cursor points to the first byte and not to the end of the file; otherwise you will get empty bytes.
Hence, a good way to do it:
file = request.files['audio']
file.stream.seek(0)
audio = file.read()
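The cursor behaviour described above is not specific to Flask; any file-like object behaves this way. A small sketch with io.BytesIO, used here as a hypothetical stand-in for the uploaded file's stream (the byte string is made up):

```python
import io

# Hypothetical stand-in for request.files['audio'].stream.
stream = io.BytesIO(b'fake audio bytes')

first = stream.read()    # reads everything and leaves the cursor at the end
second = stream.read()   # nothing left to read, so this is empty

stream.seek(0)           # rewind to the first byte
third = stream.read()    # the full content again

print(len(first), len(second), len(third))
```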
If the data is JSON, use request.get_json() to parse it.

<class 'requests.models.Response'> to Json

I've never done any object oriented programming, only basic script writing.
I'm playing around with grequests
rs = (grequests.get('https://api.github.com/repositories?since='+str(page), auth=(login, password)) for page in pages)
blah = grequests.map(rs)
print type(blah[0])
The response is:
<class 'requests.models.Response'>
Normally I convert the response to text and then load it into json so I can parse it, but I can't do that with this response.
I understand the concept of classes but haven't used them or know really what to do with that response.
Is there a way I can convert it to json?
blah[0] in your case is an instance of the requests.models.Response class which, according to the source code and the documentation, has a json() method that deserializes the JSON response into a Python object using json.loads():
print(blah[0].json())
A Response object can be converted to JSON in two ways.
Use the method .json()
blah[0].json()
OR
Convert to text and load as json.
json.loads(blah[0].text)
