How to avoid a 429 error with requests.get() in Python?

I'm trying to get some data from the PUBG API using requests.get().
While the code was executing, response.status_code returned 429.
After I got a 429, I couldn't get a 200 anymore.
How can I fix this?
Here is part of my code.
for num in range(len(platform)):
    url = "https://api.pubg.com/shards/" + platform[num] + "/players/" + playerID[num] + "/seasons/" + seasonID + "/ranked"
    req = requests.get(url, headers=header)
    print(req.status_code)
[output]
200
429

As per https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429,
you are sending too many requests at once or in a short span of time. I recommend using time.sleep(10):
import time
for num in range(len(platform)):
    ....
    ....
    time.sleep(10)
I used 10 seconds, but you have to test it and figure out how much of a time gap is required. I would also recommend using https://pypi.org/project/retry/ once you figure out the right amount of sleep time.
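A fleshed-out version of that loop might look like this (a sketch assuming the same platform, playerID, seasonID and header variables from the question):

import time
import requests

for num in range(len(platform)):
    url = "https://api.pubg.com/shards/" + platform[num] + "/players/" + playerID[num] + "/seasons/" + seasonID + "/ranked"
    req = requests.get(url, headers=header)
    print(req.status_code)
    # pause between calls so we stay under the API's rate limit
    time.sleep(10)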

As mentioned by sam, HTTP error 429 means you are making too many requests in a certain amount of time.
According to the official PUBG API documentation, the API tells you these rate limits by sending an extra header, X-RateLimit-Limit, with every response. Each response also carries an X-RateLimit-Remaining header, which tells you how many requests you have left until the next rate reset, and an X-RateLimit-Reset header, which tells you when that reset happens.
Since you seem to use the requests library, you can read these headers via req.headers after making your request in Python.
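A minimal sketch of acting on those headers (assuming X-RateLimit-Reset is a Unix timestamp, which is how many APIs report it; check the PUBG documentation for the exact format):

import time
import requests

req = requests.get(url, headers=header)
remaining = int(req.headers.get("X-RateLimit-Remaining", 1))
reset_at = int(req.headers.get("X-RateLimit-Reset", 0))

if req.status_code == 429 or remaining == 0:
    # wait until the reported reset time, plus a small safety margin
    time.sleep(max(reset_at - time.time(), 0) + 1)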

Related

How do I know when the status_code will be 200?

I request data from an API with Python, in two steps:
1. I request the URLs for the data; these are obtained with requests.post().
2. With those URLs, I can finally request the data I need.
The problem is that the URLs' resources are not ready the instant the URLs are obtained. If you request the data right away, the outcome is usually a failure with status_code != 200 and a message saying the data is not ready.
So I sleep for a random number of seconds, then request the data again, repeating until I get it. However, the code eventually crashes with a 'Max retries exceeded with url' error.
How can I know the exact time when a URL's resources are ready, so that I don't have to try again and again?
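If the API gives no explicit "ready" signal, one workable pattern is to poll with a capped number of attempts and a growing delay instead of random sleeps; a minimal sketch (the 200-means-ready check and the fetch_when_ready helper are illustrative, not part of the original code):

import time
import requests

def fetch_when_ready(url, max_attempts=10, delay=2):
    # poll the URL until it returns 200, backing off between attempts
    with requests.Session() as session:
        for attempt in range(max_attempts):
            response = session.get(url)
            if response.status_code == 200:
                return response
            time.sleep(delay)
            delay *= 2  # exponential backoff so the server isn't hammered
    raise RuntimeError("data not ready after %d attempts" % max_attempts)

Reusing one Session keeps the TCP connection open between attempts, which may also reduce the connection errors behind 'Max retries exceeded with url'.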

Is there a way to get the timestamp of an HTTP/S GET request?

I have taken an interest in web crawling using requests and BeautifulSoup.
raw = requests.get(url)
print(raw.text)
When I run the code I notice that the result is delayed by as much as 7 minutes. Is there a way to get a timestamp from the requests library after I perform a GET request?
I think those requests are not realtime.
Do you want the timestamp for when you made the request or for when the server received the request?
If you want the time you made the request, I would use:
import time
request_time = time.time()
I'm not sure about getting the time the server received it, unless the server tells you that in a JSON response or something.
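For example, a sketch that records the time just before and just after the call; the difference is how long the request itself took, and any remaining delay comes from how fresh the data is on the server side:

import time
import requests

request_time = time.time()     # just before the request is sent
raw = requests.get(url)
response_time = time.time()    # just after the response arrived
print("request took %.2f seconds" % (response_time - request_time))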

How does the CoinGecko API request limit work? Getting a too-many-requests error after the second request

I am trying to get some data from CoinGecko. In the first request I fetch a list of the 100 biggest currencies. In the following requests I want to fetch detailed info about each of those currencies. Unfortunately, I already receive the too-many-requests error on the second request. I am able to make another successful request after a few minutes, but at this pace it would take hours to get info about all the big currencies.
I am using rapidapi.com, which is recommended by the CoinGecko website.
I used a copy of the code from the documentation, e.g.:
import requests

url = "https://coingecko.p.rapidapi.com/coins/%7Bid%7D"
querystring = {"localization": "true", "tickers": "true", "market_data": "true", "community_data": "true", "developer_data": "true", "sparkline": "false"}
headers = {
    'x-rapidapi-host': "coingecko.p.rapidapi.com",
    'x-rapidapi-key': "mykey"
}
response = requests.request("GET", url, headers=headers, params=querystring)
print(response.text)
Am I doing anything wrong? Thanks a lot in advance.
The CoinGecko API is completely free on RapidAPI and has a rate limit of 50 calls/minute.
It should be working fine now. Give it another go. I have tested this API on RapidAPI, and I'm getting the expected response.
It turns out I was just misled by the rapidapi.com article; it is better to go directly via the CoinGecko API. You can find more info at https://www.coingecko.com/en/api/documentation
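If you still hit the limit, a simple throttle that spaces the detail calls out to stay under 50 per minute could look like this (a sketch against the public CoinGecko API; coin_ids and the exact endpoint/parameters are assumptions to be checked against the documentation above):

import time
import requests

MIN_INTERVAL = 60 / 50  # 50 calls per minute -> at least 1.2 s between calls

for coin_id in coin_ids:  # coin_ids: list of ids from the first request
    response = requests.get(
        "https://api.coingecko.com/api/v3/coins/" + coin_id,
        params={"localization": "false", "tickers": "false"},
    )
    print(coin_id, response.status_code)
    time.sleep(MIN_INTERVAL)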

Waiting for completion of a GET request

I have to get several pages of a JSON API with about 130,000 entries.
The request is fairly simple with:
response = requests.request("GET", url, headers=headers, params=querystring)
where the querystring contains an access token and the headers are fairly simple.
I created a while loop where basically every request URL is of the form
https://urlprovider.com/endpointname?pageSize=10000&rowStart=0
and rowStart increments by pageSize until there are no further pages.
The problem I encounter is the following response after about 5-8 successful requests:
{'errorCode': 'ERROR_XXX', 'code': 503, 'message': 'Maximum limit for unprocessed API requests have been reached. Please try again later.', 'success': False}
From the error message I gather that I am initiating the next request before the last one has finished. Does anyone know how I can make sure the GET request has finished before the next one starts (short of something crude like a sleep()), or whether the error could lie elsewhere?
I found the answer to my question.
requests is synchronous, meaning that it will ALWAYS wait until the call has finished before continuing.
The response from the API provider is therefore misleading, as each request has already been processed before the next one is sent.
The root cause is difficult to assess, but it may have to do with a limit imposed by the API provider.
What has worked:
A crude sleep(10), which makes the program wait 10 seconds before processing the next request
Better solution: Create a Session. According to the documentation:
The Session object [...] will use urllib3’s connection pooling. So if you’re making several requests to the same host, the underlying TCP connection will be reused, which can result in a significant performance increase (see HTTP persistent connection).
Not only does this resolve the problem, it also improves performance compared to my initial code.
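A sketch of the paging loop rewritten around a Session (pageSize, the endpoint name, headers and querystring are the placeholders from the question, not real values):

import requests

page_size = 10000
row_start = 0

with requests.Session() as session:
    session.headers.update(headers)  # same headers as before
    while True:
        url = "https://urlprovider.com/endpointname?pageSize=%d&rowStart=%d" % (page_size, row_start)
        data = session.get(url, params=querystring).json()
        if not data.get("success", True):
            break  # e.g. the 503 "unprocessed requests" error, or no further pages
        # ... process the page here ...
        row_start += page_size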

Python "requests" module truncating responses

When I use the Python requests module, calling requests.get(url), I have found that the response from the URL is being truncated.
import requests

url = 'https://gtfsrt.api.translink.com.au/Feed/SEQ'
response = requests.get(url)
print(response.text)
The response I get from the URL is being truncated. Is there a way to get requests to retrieve the full set of data and not truncate it?
Note: the given URL is a public transport feed which puts out a huge quantity of data during the peak of the day.
I ran into the same issue. The problem is not your Python code; it is likely PyCharm or whatever console utility you are using. The console has a buffer limit, and you may have to increase it to see your full response.
Refer to this article for more help:
Increase output buffer when running or debugging in PyCharm
Add "gzip":true to your request options.
That fixed the issue for me.
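To check whether the response itself is truncated or only the console output, you can compare sizes and write the body to a file instead of printing it (a minimal sketch; the feed appears to be binary GTFS-realtime data, so it is written in binary mode):

import requests

url = 'https://gtfsrt.api.translink.com.au/Feed/SEQ'
response = requests.get(url)

print(len(response.content))             # total size of the downloaded body in bytes
with open('feed_dump.bin', 'wb') as f:   # dump it to a file instead of the console
    f.write(response.content)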
