requests.get(url).json() gives JSONDecodeError - python

I am writing an API to get the data of one app from another app. I have my view set up to get the data from the URL like this:
import json
import requests
from django.http import HttpResponse
from requests.auth import HTTPBasicAuth
from rest_framework.views import APIView

user = 'hello'
pwd = 'python'

class SomeView(APIView):
    def get(self, request):
        if request.user.is_authenticated():
            r = requests.get('http://localhost:8000/foo/bar/',
                             auth=HTTPBasicAuth(user, pwd))
            return HttpResponse(r.json())
        else:
            return HttpResponse(json.dumps({'success':'false', 'message':'login required '}))
This gives me an error like:
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/home/abhishek/Documents/venv/local/lib/python2.7/site-packages/requests/models.py", line 799, in json
return json.loads(self.text, **kwargs)
File "/home/abhishek/Documents/venv/local/lib/python2.7/site-packages/simplejson/__init__.py", line 505, in loads
return _default_decoder.decode(s)
File "/home/abhishek/Documents/venv/local/lib/python2.7/site-packages/simplejson/decoder.py", line 370, in decode
obj, end = self.raw_decode(s)
File "/home/abhishek/Documents/venv/local/lib/python2.7/site-packages/simplejson/decoder.py", line 400, in raw_decode
return self.scan_once(s, idx=_w(s, idx).end())
File "/home/abhishek/Documents/venv/local/lib/python2.7/site-packages/simplejson/scanner.py", line 127, in scan_once
return _scan_once(string, idx)
File "/home/abhishek/Documents/venv/local/lib/python2.7/site-packages/simplejson/scanner.py", line 118, in _scan_once
raise JSONDecodeError(errmsg, string, idx)
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I have django == 1.4.5 and requests == 2.5.1 installed in my virtual environment. I have checked almost everything, and I am starting to conclude that the requests and Django versions have something to do with this traceback. I also have simplejson==3.6.5 installed in my virtual environment, which I think is not relevant. Please help.

You can do something like this:
import requests
from requests.auth import HTTPBasicAuth
from rest_framework.response import Response
...
        if request.user.is_authenticated():
            r = requests.get('http://localhost:8000/foo/bar/',
                             auth=HTTPBasicAuth(user, pwd))
            return Response(r.json())
        return Response({'success':'false', 'message':'login required '})

I had some issues with authentication. I fixed the authentication and the other bugs. Thanks for the help, everyone!
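
Since the cause here turned out to be authentication, the endpoint was most likely returning a non-JSON body (for example an HTML login or error page), which is what "Expecting value: line 1 column 1 (char 0)" usually points to. Below is a minimal sketch of checking a response before parsing it. The helper name `safe_json` is hypothetical, and it only assumes the object has `status_code`, `headers`, and `text` attributes, as a `requests.Response` does:

```python
import json


def safe_json(response):
    """Return (data, error); error is None when parsing succeeded.

    `response` is any object with `status_code`, `headers` and `text`
    attributes, such as a requests.Response.
    """
    if response.status_code != 200:
        return None, "unexpected status %s: %r" % (
            response.status_code, response.text[:200])
    content_type = response.headers.get("Content-Type", "")
    if "json" not in content_type.lower():
        return None, "not JSON (Content-Type: %r): %r" % (
            content_type, response.text[:200])
    try:
        return json.loads(response.text), None
    except ValueError as exc:  # JSONDecodeError is a subclass of ValueError
        return None, "invalid JSON: %s" % exc
```

In the view, `data, error = safe_json(r)` would let you log `error` (which includes the start of the body) instead of hitting a bare JSONDecodeError.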

Related

Scrapy - how to wait for json page to be fully loaded

I scrape json pages but sometimes I get this error:
ERROR: Spider error processing <GET https://reqbin.com/echo/get/json/page/2>
Traceback (most recent call last):
File "/home/user/.local/lib/python3.8/site-packages/twisted/internet/defer.py", line 857, in _runCallbacks
current.result = callback( # type: ignore[misc]
File "/home/user/path/scraping.py", line 239, in parse_images
jsonresponse = json.loads(response.text)
File "/usr/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.8/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 48662 (char 48661)
So I suspect that the JSON page does not have time to load fully, and that is why parsing its content fails. If I do it manually, that is, take the JSON content as a string and load it with the json module, it works and I don't get the json.decoder.JSONDecodeError.
What I've done so far is to set in settings.py:
DOWNLOAD_DELAY = 5
DOWNLOAD_TIMEOUT = 600
DOWNLOAD_FAIL_ON_DATALOSS = False
CONCURRENT_REQUESTS = 8
I hoped this would slow down the scraping and solve my problem, but the problem still occurs.
Any idea how to make sure the JSON page has loaded completely, so that parsing its content does not fail?
You can try increasing DOWNLOAD_TIMEOUT; it usually helps. If that's not enough, you can try reducing CONCURRENT_REQUESTS.
If that still doesn't help, try retrying the request. You can write your own retry_request function and call it with return self.retry_request(response).
Or do something like req = response.request.copy(); req.dont_filter = True and then return req.
You can also use RetryMiddleware. Read more on the documentation page https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#module-scrapy.downloadermiddlewares.retry
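
The "write your own retry" suggestion could look roughly like the sketch below. The helper name `parse_json_or_retry`, the `json_retries` meta key, and the `MAX_JSON_RETRIES` limit are all hypothetical; the only Scrapy features it relies on are `response.text`, `response.meta`, and `Request.replace(dont_filter=True)`. One thing worth noting: with DOWNLOAD_FAIL_ON_DATALOSS = False, Scrapy hands truncated responses to the callback instead of failing them, which would fit an "Unterminated string" error.

```python
import json

MAX_JSON_RETRIES = 3  # hypothetical limit, not a Scrapy setting


def parse_json_or_retry(response):
    """Return (data, retry_request).

    On success: (parsed JSON, None).
    On a decode error: (None, a copy of the request to re-schedule),
    or (None, None) once MAX_JSON_RETRIES has been reached.
    """
    try:
        return json.loads(response.text), None
    except json.JSONDecodeError:
        retries = response.meta.get("json_retries", 0)
        if retries >= MAX_JSON_RETRIES:
            return None, None
        # dont_filter=True keeps the duplicate filter from dropping the retry.
        retry_request = response.request.replace(dont_filter=True)
        retry_request.meta["json_retries"] = retries + 1
        return None, retry_request
```

In a callback such as parse_images, you would call `data, retry = parse_json_or_retry(response)`, yield `retry` when it is not None, and otherwise continue with `data`.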

Get file type json from bucket (Google Cloud Storage) to upload it and save it in VM instance

In order to read a new file (JSON content) from a bucket and send it to a VM instance using Cloud Functions, I tried the following code and got the error below.
import requests
import json
import ndjson
from google.cloud import storage
def hello_gcs(data, context):
    """Background Cloud Function to be triggered by Cloud Storage.
    Args:
        data (dict): The Cloud Functions event payload.
        context (google.cloud.functions.Context): Metadata of triggering event.
    Returns:
        None; the file is sent as a request to
    """
    print('Bucket: {}'.format(data['bucket']))
    print('File: {}'.format(data['name']))
    client = storage.Client()
    bucket = client.get_bucket(format(data['bucket']))
    blob = bucket.get_blob(format(data['name']))
    contents = blob.download_as_string()
    headers = {
        'Content-type': 'application/json',
    }
    data = ndjson.loads(contents)
    print(data)
    response = requests.post('10.0.0.2', headers=headers, data=data)
    return "Request has been sent"
Error:
Traceback (most recent call last):
File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 383, in run_background_function
_function_handler.invoke_user_function(event_object)
File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 217, in invoke_user_function
return call_user_function(request_or_event)
File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 214, in call_user_function
event_context.Context(**request_or_event.context))
File "/user_code/main.py", line 30, in hello_gcs
data = ndjson.loads(contents)
File "/env/local/lib/python3.7/site-packages/ndjson/api.py", line 14, in loads
return json.loads(*args, **kwargs)
File "/opt/python3.7/lib/python3.7/json/__init__.py", line 361, in loads
return cls(**kw).decode(s)
File "/env/local/lib/python3.7/site-packages/ndjson/codecs.py", line 9, in decode
return super(Decoder, self).decode(text, *args, **kwargs)
File "/opt/python3.7/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/opt/python3.7/lib/python3.7/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 3 (char 2)
The error seems pretty clear to me:
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 3 (char 2)
Your file probably does not contain valid JSON (or, in this case, valid NDJSON).
Also, you post to a bare IP address ('10.0.0.2' in the code above), which can never be a valid URL because it has no scheme such as http://.
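
To find out which line of the file is malformed, one option is to parse it line by line with the standard json module and report the offending line number. This is only a sketch: `parse_ndjson` is a hypothetical helper, not part of the ndjson package.

```python
import json


def parse_ndjson(text):
    """Parse newline-delimited JSON, one document per non-empty line."""
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # Re-raise with the line number and the start of the bad line.
            raise ValueError(
                "line %d is not valid JSON (%s): %r"
                % (line_number, exc.msg, line[:80])
            ) from exc
    return records
```

Because blob.download_as_string() returns bytes, you would call it as parse_ndjson(contents.decode('utf-8')). A line such as { 'name': 'x' } (single quotes) triggers exactly the "Expecting property name enclosed in double quotes" message shown in the traceback.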

Can not parse response from sg.media-imdb in python

I'm trying to parse response from https://sg.media-imdb.com/suggests/a/a.json in Python 3.6.8.
Here is my code:
import requests
url = 'https://sg.media-imdb.com/suggests/a/a.json'
data = requests.get(url).json()
I get this error:
$ /usr/bin/python3 /home/livw/Python/test_scrapy/phase_1.py
Traceback (most recent call last):
File "/home/livw/Python/test_scrapy/phase_1.py", line 33, in <module>
data = requests.get(url).json()
File "/home/livw/.local/lib/python3.6/site-packages/requests/models.py", line 889, in json
self.content.decode(encoding), **kwargs
File "/usr/lib/python3/dist-packages/simplejson/__init__.py", line 518, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 370, in decode
obj, end = self.raw_decode(s)
File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 400, in raw_decode
return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
It seems the response is not in JSON format, although I can parse the response with JSON Formatter & Validator.
How can I fix it and store the response in a JSON object?
This probably happens because the response is not plain JSON: it has a prefix.
You can see that the response starts with imdb$a( and ends with ).
The JSON parser doesn't know how to handle that and fails. You can remove those parts and parse just the JSON itself.
You can do this:
import json
import requests
url = 'https://sg.media-imdb.com/suggests/a/a.json'
data = requests.get(url).text
json.loads(data[data.index('{'):-1])
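
A response wrapped like callback(...) is commonly called JSONP. As a slightly more general sketch than slicing at the first '{', the wrapper can be stripped with a regular expression. The helper name `loads_jsonp` is hypothetical, and it assumes the callback name consists only of letters, digits, `_`, `$`, and `.`:

```python
import json
import re

# Matches callback( ... ) with an optional trailing semicolon.
_JSONP = re.compile(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def loads_jsonp(text):
    """Parse JSONP text; plain JSON is parsed unchanged."""
    match = _JSONP.match(text)
    payload = match.group(1) if match else text
    return json.loads(payload)
```

With the code above, this would be used as data = loads_jsonp(requests.get(url).text). Unlike the slice, it also works when the payload is a JSON array rather than an object.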

Quandl is not working, some error is popping out

I am trying to import Quandl data using the following instruction:
quandl.get('WIKI/GOOGL')
The following error pops up:
Traceback (most recent call last):
File "/home/shravilp/anaconda3/lib/python3.6/site-packages/quandl/connection.py", line 55, in parse
return response.json()
File "/home/shravilp/anaconda3/lib/python3.6/site-packages/requests/models.py", line 850, in json
return complexjson.loads(self.text, **kwargs)
File "/home/shravilp/anaconda3/lib/python3.6/json/__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "/home/shravilp/anaconda3/lib/python3.6/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/home/shravilp/anaconda3/lib/python3.6/json/decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "ml1.py", line 4, in <module>
df = quandl.get('WIKI/GOOGL')
File "/home/shravilp/anaconda3/lib/python3.6/site-packages/quandl/get.py", line 48, in get
data = Dataset(dataset_args['code']).data(params=kwargs, handle_column_not_found=True)
File "/home/shravilp/anaconda3/lib/python3.6/site-packages/quandl/model/dataset.py", line 47, in data
return Data.all(**updated_options)
File "/home/shravilp/anaconda3/lib/python3.6/site-packages/quandl/operations/list.py", line 14, in all
r = Connection.request('get', path, **options)
File "/home/shravilp/anaconda3/lib/python3.6/site-packages/quandl/connection.py", line 36, in request
return cls.execute_request(http_verb, abs_url, **options)
File "/home/shravilp/anaconda3/lib/python3.6/site-packages/quandl/connection.py", line 44, in execute_request
cls.handle_api_error(response)
File "/home/shravilp/anaconda3/lib/python3.6/site-packages/quandl/connection.py", line 61, in handle_api_error
error_body = cls.parse(resp)
File "/home/shravilp/anaconda3/lib/python3.6/site-packages/quandl/connection.py", line 57, in parse
raise QuandlError(http_status=response.status_code, http_body=response.text)
quandl.errors.quandl_error.QuandlError: (Status 403) Something went wrong. Please try again. If you continue to have problems, please contact us.
Following is my code:
import pandas as pd
import quandl
df = quandl.get("FRED/GDP")
print(df.head())
A 403 error, as stated in the Quandl API docs, means that you need an API key to access that data. Here are the steps to fix it:
1. Create a free/premium account on Quandl.
2. Generate an API key from the account settings.
3. Include your API key in the script as quandl.ApiConfig.api_key = "YOURAPIKEY".
Create an account, get an API key, and add this:
quandl.ApiConfig.api_key = "#######"
df = quandl.get("FRED/GDP")
print(df.head())
Generate a new key.

what is the server for jira-python connection?

I want to create issues in JIRA using Python, so I am following the "Welcome to jira-python's documentation" page.
But the first question puzzles me: what is the server if we are using our own JIRA? The documentation uses https://jira.atlassian.com. If my JIRA URL looks like https://bugs.company.com/secure/Dashboard.jspa, what is the server for me?
Now, I am using
from jira import JIRA
jira = JIRA(options={'server': 'https://bugs.company.com'})
projects = jira.projects()
keys = [project.key for project in projects]
I get the error:
Traceback (most recent call last):
File "MethodTest.py", line 9, in <module>
projects = jira.projects()
File "/Library/Python/2.7/site-packages/jira/client.py", line 838, in projects
r_json = self._get_json('project')
File "/Library/Python/2.7/site-packages/jira/client.py", line 1423, in _get_json
r_json = json.loads(r.text)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 326, in loads
return _default_decoder.decode(s)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 384, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
The problem might come from the fact that you are using a secure (HTTPS) connection to your JIRA instance. You need to set up a proper certificate for the connection or simply disable certificate verification.
See the jira.client.JIRA options and set verify to False like this:
jira = JIRA(options={'server': 'https://bugs.company.com',
                     'verify': False})
Are you setting the proper username and password?
Finally, you might want to check with your IT department for the proper URL.
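
On the original question of what to pass as the server: a reasonable first guess is the part of the browser URL that comes before /secure/, which also covers instances hosted under a context path. This is an assumption about how the JIRA URLs are laid out, not something taken from the jira-python documentation, and `jira_server_from_url` is a hypothetical helper:

```python
def jira_server_from_url(page_url):
    """Guess the JIRA base URL from a browser URL.

    Assumption: the base URL is everything before the '/secure/' segment.
    """
    marker = "/secure/"
    index = page_url.find(marker)
    if index == -1:
        raise ValueError("no '/secure/' segment in %r" % page_url)
    return page_url[:index]
```

For the URL in the question this gives https://bugs.company.com, which is what the question already uses, so the server value itself is probably not the cause of the error.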
