I'm trying to execute Overpass queries from a Python script. I'm practicing at overpass-turbo.eu and found the following query to work as intended:
[out:json][timeout:600];
{{geocodeArea:Niedersachsen}}->.searchArea;
(
node[place=city](area.searchArea);
node[place=town](area.searchArea);
);
out;
However, when I submit the exact same query from a Python script, I get an error:
import requests
overpass_query = """
[out:json][timeout:600];
{{geocodeArea:Niedersachsen}}->.searchArea;
(
node[place=city](area.searchArea);
node[place=town](area.searchArea);
);
out;
"""
overpass_url = "http://overpass-api.de/api/interpreter"
response = requests.get(overpass_url, params={'data': overpass_query})
data = response.json()
/home/enno/events/docker/etl/venv/bin/python /home/enno/events/docker/etl/test2.py
Traceback (most recent call last):
File "/home/enno/events/docker/etl/test2.py", line 16, in <module>
data = response.json()
File "/home/enno/events/docker/etl/venv/lib/python3.6/site-packages/requests/models.py", line 897, in json
return complexjson.loads(self.text, **kwargs)
File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Process finished with exit code 1
Why is this? It seems to have to do with the curly braces, but I can't figure out how to solve this.
Many thanks,
Enno
The curly braces (i.e. {{geocodeArea:Niedersachsen}}) are a special feature of overpass turbo and are not part of the Overpass API. See extended overpass turbo queries for a list of these shortcuts.
{{geocodeArea:name}} will tell overpass turbo to perform a geocoding request using Nominatim. It will then use the first result to construct an area(id) query. You have to perform the same step (using Nominatim or any other geocoder) in your program.
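A minimal sketch of that geocoding step, assuming Nominatim's public search endpoint and the usual Overpass convention that an area id is the OSM relation id plus 3600000000 (or the way id plus 2400000000). The helper names and the User-Agent string are placeholders, not part of either API. The sketch uses only the standard library; requests.get(url, params=...) would work the same way.

```python
import json
import urllib.parse
import urllib.request

NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"
OVERPASS_URL = "https://overpass-api.de/api/interpreter"
# Placeholder: Nominatim's usage policy asks for an identifying User-Agent.
USER_AGENT = "overpass-example/0.1 (you@example.com)"

# Overpass derives area ids from OSM ids by adding a fixed offset per type.
AREA_OFFSETS = {"relation": 3600000000, "way": 2400000000}


def area_id(osm_type, osm_id):
    """Convert an OSM relation or way id into an Overpass area id."""
    return AREA_OFFSETS[osm_type] + int(osm_id)


def build_query(area):
    """Build the query from the question with the turbo shortcut replaced."""
    return f"""[out:json][timeout:600];
area({area})->.searchArea;
(
  node[place=city](area.searchArea);
  node[place=town](area.searchArea);
);
out;"""


def fetch_json(url, params):
    """GET a URL with query parameters and decode the JSON body."""
    full_url = url + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(full_url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=660) as resp:
        return json.load(resp)


def geocode_area(name):
    """Use the first Nominatim hit, as overpass turbo is described to do."""
    results = fetch_json(NOMINATIM_URL, {"q": name, "format": "json", "limit": 1})
    if not results:
        raise LookupError(f"no geocoding result for {name!r}")
    first = results[0]
    # Raises KeyError if the first hit is a node, which has no area.
    return area_id(first["osm_type"], first["osm_id"])


def main():
    """Run the whole lookup against the live servers."""
    query = build_query(geocode_area("Niedersachsen"))
    data = fetch_json(OVERPASS_URL, {"data": query})
    print(len(data["elements"]), "places found")
```

Calling main() performs both requests. The first geocoding hit is not guaranteed to be the administrative boundary, so it may be worth checking the result before building the query.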
I scrape json pages but sometimes I get this error:
ERROR: Spider error processing <GET https://reqbin.com/echo/get/json/page/2>
Traceback (most recent call last):
File "/home/user/.local/lib/python3.8/site-packages/twisted/internet/defer.py", line 857, in _runCallbacks
current.result = callback( # type: ignore[misc]
File "/home/user/path/scraping.py", line 239, in parse_images
jsonresponse = json.loads(response.text)
File "/usr/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.8/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 48662 (char 48661)
So I suspect that the json page does not have the time to be fully loaded and that's why parsing of its json content fails. And if I do it manually, I mean taking the json content as a string and loading it with the json module, it works and I don't get the json.decoder.JSONDecodeError error.
What I've done so far is to set in settings.py:
DOWNLOAD_DELAY = 5
DOWNLOAD_TIMEOUT = 600
DOWNLOAD_FAIL_ON_DATALOSS = False
CONCURRENT_REQUESTS = 8
hoping that it would slow down the scraping and solve my problem but the problem still occurs.
Any idea how to make sure that the JSON page has loaded completely, so that parsing its content does not fail?
You can try increasing DOWNLOAD_TIMEOUT, which usually helps. If that's not enough, try reducing CONCURRENT_REQUESTS.
If that still doesn't help, retry the request. You can write your own retry_request method and call it with return self.retry_request(response).
Or do something like req = response.request.copy(); req.dont_filter = True and then return req.
You can also use RetryMiddleware. Read more on the documentation page https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#module-scrapy.downloadermiddlewares.retry
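The copy-and-resubmit idea could be sketched roughly as follows. The helper only relies on response.text, response.request.copy(), request.meta and request.dont_filter, which Scrapy responses and requests provide; the function name, the json_retries meta key and the retry cap are made up for this illustration.

```python
import json

MAX_JSON_RETRIES = 3  # hypothetical cap for this sketch, not a Scrapy setting


def parse_json_or_retry(response):
    """Return (data, None) on success, or (None, request_to_retry) on bad JSON.

    Once the cap is reached, (None, None) is returned so the caller can give up.
    """
    try:
        return json.loads(response.text), None
    except json.JSONDecodeError:
        retries = response.request.meta.get("json_retries", 0)
        if retries >= MAX_JSON_RETRIES:
            return None, None
        req = response.request.copy()
        req.dont_filter = True  # let the duplicate filter accept the same URL again
        req.meta["json_retries"] = retries + 1
        return None, req


# Inside a spider callback this might be used as:
#
#     def parse_images(self, response):
#         data, retry = parse_json_or_retry(response)
#         if retry is not None:
#             yield retry
#             return
#         if data is None:
#             return  # gave up after MAX_JSON_RETRIES attempts
#         ...  # work with data
```

Counting retries in meta keeps a permanently truncated page from being requested forever.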
When I call the json() method on request response I get an error.
Any suggestions to what could be wrong here?
My code:
import requests
import bs4
url = 'https://www.reddit.com/r/AskReddit/comments/l4styp/serious_what_is_the_the_scariest_thing_that_you/'
rsp = requests.get(url)
sc = rsp.json()
print(sc)
Output:
File "c:\VS_Code1\scrape.py", line 6, in <module>
sc = rsp.json()
File "C:\Users\User\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\requests\models.py", line 900, in json
return complexjson.loads(self.text, **kwargs)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.496.0_x64__qbz5n2kfra8p0\lib\json\__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.496.0_x64__qbz5n2kfra8p0\lib\json\decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.496.0_x64__qbz5n2kfra8p0\lib\json\decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 5 (char 5)
Your rsp is a Response object (it prints as <Response [200]>), and its body is not JSON. If you want to read the content of the response, you can simply do:
rsp.text
What you get from the URL you posted here is HTML, and not JSON.
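One way to catch this before calling .json() is to look at the Content-Type header first. The helper below is a small sketch with a made-up name; it assumes the server labels its responses correctly, which not every server does.

```python
def is_json_content_type(content_type):
    """Return True if a Content-Type header value denotes a JSON media type."""
    # Drop parameters such as "; charset=UTF-8" and compare case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/json" or media_type.endswith("+json")


# With requests this might be used as:
#
#     rsp = requests.get(url)
#     if is_json_content_type(rsp.headers.get("Content-Type", "")):
#         sc = rsp.json()
#     else:
#         sc = rsp.text  # HTML or something else; do not try to decode it
```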
This does not work because the page you are fetching returns HTML source code, not JSON.
To fetch the content of the webpage, replace sc = rsp.json() with sc = rsp.text
If you need this data as JSON, you can look into Reddit's API: https://www.reddit.com/dev/api
I figured out that to get JSON from the URL I entered, I have to append .json to the end of the URL. As far as I know, this is specific to Reddit and a few other sites that support it.
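Appending .json can be done with a small helper like the sketch below. The function name is made up, and the behaviour of the resulting URL is whatever Reddit chooses to serve; the helper only rewrites the path and keeps any query string.

```python
from urllib.parse import urlsplit, urlunsplit


def to_json_url(url):
    """Append ".json" to the path of a URL, keeping query and fragment intact."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    return urlunsplit(parts._replace(path=path))


# Possible usage with requests (a descriptive User-Agent is an assumption here;
# the default one may be rate limited):
#
#     rsp = requests.get(to_json_url(url), headers={"User-Agent": "my-script/0.1"})
#     sc = rsp.json()
```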
I'm trying to work with JSON data that is pulled from USGS Earthquake API. If you follow that link, you can see the raw JSON data.
The JSON looks great; however, the returned request is wrapped in an eqfeed_callback(); that is breaking the JSON deserializer in Python.
A quick look at the code I have so far:
import requests
import json
from pprint import pprint
URL = "http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_week.geojsonp"
response = requests.get(URL)
raw_json = str(response.content)
json = json.loads(raw_json)
print(json)
I get the errors:
Traceback (most recent call last):
File "run.py", line 11, in <module>
json = json.loads(raw_json)
File "C:\Program Files\Anaconda3\lib\json\__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "C:\Program Files\Anaconda3\lib\json\decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Program Files\Anaconda3\lib\json\decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I'm positive the issue is that the JSON is wrapped in that function call and the JSON decoder doesn't like it. How would I go about removing the function wrapper to leave just the clean JSON inside?
You're using the wrong URL.
JSON wrapped in a function call is JSONP, which is used to get around the same-origin policy when calling an API from web browsers.
The URL to get normal JSON is
URL = "http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_week.geojson"
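If only a JSONP endpoint were available, the wrapper could also be stripped by hand. The following is a rough sketch with a hypothetical helper name; it assumes the payload is a single call of the form callback(...) with an optional trailing semicolon.

```python
import json
import re

# callback name, opening parenthesis, payload, closing parenthesis, optional ";"
_JSONP_RE = re.compile(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def strip_jsonp(text):
    """Parse the JSON argument of a JSONP response such as eqfeed_callback({...});"""
    match = _JSONP_RE.match(text)
    if match is None:
        raise ValueError("text does not look like a JSONP payload")
    return json.loads(match.group(1))


# Possible usage with requests; response.text is already a str, whereas
# str(response.content) would produce the "b'...'" repr of the bytes.
#
#     data = strip_jsonp(requests.get(URL).text)
```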
I inherited a program from a former colleague and now have to maintain it.
This Python script queries our Jira instance with a given JQL (via the API).
The result is a list of all issues that match the search criteria.
But now it's not working, and I get a JSON error message both on the server (Ubuntu) and on my local Windows PC.
Note: it has not been run for about a year, but back then it worked.
Here is what the script looks like :
import json
import subprocess
jiraSerachUrl = "https://ourJiraInstance.net/rest/api/2/search?jql=key%20=%20%22TEST-123%22"
jiraResponse = subprocess.Popen(["curl","-l","-s","-u", "jiraUser"+":"+"jiraUserPassword", "-X", "GET", jiraSerachUrl ],stdout=subprocess.PIPE,shell=True).communicate()[0]
## shell=True only added for Windows Instance
print(type(jiraResponse))
##print = <class 'bytes'>
print(jiraResponse)
## print = b''
jiraJsonResponse = json.loads(jiraResponse.decode('utf-8'))
print(jiraJsonResponse)
The JQL/Jira search address returns the following (shortened; all fields of the task are returned):
{"expand":"names,schema","startAt":0,"maxResults":50,"total":1,"issues":
[{"expand":"operations,versionedRepresentations,editmeta,changelog,transitions,renderedFields",
"id":"145936","self":"https://ourJiraInstance.net/rest/api/2/issue/145936","key":"TEST-123","fields":{"parent": ...
The Error on the Windows PC is the following
Traceback (most recent call last):
File "C:\Users\User\Desktop\test.py", line 10, in <module>
jiraJsonResponse = json.loads(jiraResponse.decode('utf-8'))
File "C:\Users\User\AppData\Local\Programs\Python\Python35-32\lib\json\__init__.py", line 319, in loads
return _default_decoder.decode(s)
File "C:\Users\User\AppData\Local\Programs\Python\Python35-32\lib\json\decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Users\User\AppData\Local\Programs\Python\Python35-32\lib\json\decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
This is the error on the Ubuntu Server ( running the same script )
Traceback (most recent call last):
File "searchJira.py", line 33, in <module>
jiraJsonResponse = json.loads(jiraResponse)
File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
So far I have tried switching from json to simplejson, but with the same result.
Changing the encoding used for decoding (e.g. unicode) had no effect.
After some experimenting I finally got it working. By replacing curl with requests, I got the result I wanted. My request now looks like this:
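import json
import requests
from requests.auth import HTTPBasicAuth
# verify=False disables TLS certificate checking; only use it if you trust the network
r = requests.get(jiraSerachUrl, auth=HTTPBasicAuth(user, password), verify=False)
jiraJsonResponse = json.loads(r.text)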
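The original script printed b'' before failing, so the decoder was handed an empty body. A small guard like the sketch below (the helper name is made up) turns that case into a more telling error than "Expecting value"; it works for bytes from a subprocess pipe as well as for r.text.

```python
import json


def parse_json_body(raw, source="response"):
    """Decode a JSON body, failing with a clear message if it is empty."""
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    if not text.strip():
        raise ValueError(
            f"{source} body is empty; check the URL, the credentials and the exit status"
        )
    return json.loads(text)
```

With the curl version of the script, parse_json_body(jiraResponse, source="curl") would have pointed at the empty output straight away.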
python 3.4 and Coinbase V2 API
I am working on some BTC data analysis and trying to make continuous requests to the Coinbase API. When running my script, it will always eventually crash on one of the calls to
r = client.get_spot_price()
r = client.get_buy_price()
r = client.get_sell_price()
The unusual thing is that the script will always crash at different times. Sometimes it will successfully collect data for an hour or so and then crash, other times it will crash after 5 - 10 minutes.
ERROR:
r = client.get_spot_price()
File "/home/g/.local/lib/python3.4/site-packages/coinbase/wallet/client.py", line 191, in get_spot_price
response = self._get('v2', 'prices', 'spot', data=params)
File "/home/g/.local/lib/python3.4/site-packages/coinbase/wallet/client.py", line 129, in _get
return self._request('get', *args, **kwargs)
File "/home/g/.local/lib/python3.4/site-packages/coinbase/wallet/client.py", line 116, in _request
return self._handle_response(response)
File "/home/g/.local/lib/python3.4/site-packages/coinbase/wallet/client.py", line 125, in _handle_response
raise build_api_error(response)
File "/home/g/.local/lib/python3.4/site-packages/coinbase/wallet/error.py", line 49, in build_api_error
blob = blob or response.json()
File "/home/g/.local/lib/python3.4/site-packages/requests/models.py", line 812, in json
return complexjson.loads(self.text, **kwargs)
File "/usr/lib/python3.4/json/__init__.py", line 318, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.4/json/decoder.py", line 343, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.4/json/decoder.py", line 361, in raw_decode
raise ValueError(errmsg("Expecting value", s, err.value)) from None
ValueError: Expecting value: line 1 column 1 (char 0)
It seems to be crashing due to some json decoding?
Does anyone have any idea why this will only throw errors at certain times?
I have tried something like the following to avoid crashing due to this error:
#snap is tuple of data containing data from buy, sell , spot price
if not any(snap):
print('\n\n-----ENTRY ERROR---- Snap returned None \n\n')
success = False
return
but it isn't doing the trick
What are some good ways to handle this error in your opinion?
Thanks, any help is much appreciated!
This could be related to this issue: https://github.com/coinbase/coinbase-python/issues/15. It looks like an error inside the library: the traceback shows it failing in raise build_api_error(response), which supports that.
The problem may also be related to network connectivity. If your network (or the server) fails, the client can either fail to retrieve the JSON or receive an empty body. The library should report this more clearly, though.
It then tries to decode that empty body with the JSON decoder, which causes the error.
A temporary workaround is to wrap your calls in a try statement and retry if one fails.
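Such a try-and-retry wrapper might look like the sketch below. The function name, the number of attempts and the linearly growing delay are arbitrary choices for illustration; ValueError is caught because that is what the traceback in the question ends with.

```python
import time


def call_with_retries(func, attempts=3, delay=1.0, exceptions=(ValueError,)):
    """Call func() and retry on the given exceptions, waiting longer each time."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions as err:
            last_error = err
            if attempt < attempts:
                time.sleep(delay * attempt)  # simple linear back-off
    # Every attempt failed: surface the last error to the caller.
    raise last_error


# Possible usage with the client from the question:
#
#     r = call_with_retries(client.get_spot_price)
```

Re-raising after the last attempt keeps a persistent outage visible instead of silently returning None.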
You have to supply it with a currency to get a price.
Here is an example:
price = client.get_spot_price(currency_pair='XRP-USD')