How to repeatedly execute one yield step in a generator? - python

I am writing tests that involve HTTP requests. There are some requests I have to make in a certain order, but I want to check the response from each step. So I thought a generator would be appropriate to enforce the sequence:
# Main code
def sequence_of_requests(arg1, arg2):
    yield request_a(arg1)
    yield request_b(arg1, arg2)
    yield request_c(arg1, arg2)
Then in my test code I can write:
# Test code
generator_responses = sequence_of_requests(arg1, arg2)
r = next(generator_responses)
assert r.status_code == 200
r = next(generator_responses)
assert r.status_code == 204
r = next(generator_responses)
assert r.status_code == 404
The problem is that request_c() does not always return the expected status code on the first try, so in the test code I have been wrapping that function with a decorator that retries it until success or timeout.
I'm wondering if there's some way I can do this wrapping on the generator, so I still get the enforced sequence of events. So I'm basically wondering if there's some way to repeatedly call one yield step of a generator.
Note: I don't want to put the code to wait for a response directly in the sequence_of_requests function, because it's not test code. The sequence_of_requests function is just used to ensure those steps are completed in the right order.

Instead of yielding the response values themselves, you can yield the functions, which can then be called repeatedly. The parameters to the functions can be applied with functools.partial.
from functools import partial

def sequence_of_requests(arg1, arg2):
    yield partial(request_a, arg1)
    yield partial(request_b, arg1, arg2)
    yield partial(request_c, arg1, arg2)
Now the steps can be tested:
generator_functions = sequence_of_requests(arg1, arg2)
r = next(generator_functions)()
assert r.status_code == 200
r = next(generator_functions)()
assert r.status_code == 204
# Repeatedly make the request until the expected status code is returned.
func_slow_request = next(generator_functions)
wait_for_status(404)(func_slow_request)
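The wait_for_status helper used above is the retry decorator the question mentions; it isn't shown here, so the following is only a minimal sketch of what such a helper might look like (the timeout and polling interval are assumptions):

import time

def wait_for_status(expected_status, timeout=30, interval=1):
    """Return a caller that invokes `func` repeatedly until it responds with
    `expected_status` or the timeout (in seconds) expires."""
    def caller(func):
        deadline = time.monotonic() + timeout
        while True:
            response = func()
            if response.status_code == expected_status:
                return response
            if time.monotonic() > deadline:
                raise TimeoutError('last status was {}'.format(response.status_code))
            time.sleep(interval)
    return caller

With that sketch, wait_for_status(404)(func_slow_request) keeps calling the partial returned by the generator until it returns a 404 response or the timeout expires.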

You weren't very specific about the timeout, the number of retries, or the condition for success... But for simplicity, let's assume you want to retry up to 5 times, with 1 s between retries, and that any status code other than 500 (or 5xx) counts as success.
Then you could use something like:
import time

def sequence_of_requests(arg1, arg2):
    yield request_a(arg1)
    yield request_b(arg1, arg2)

    for retries in range(5):
        result = request_c(arg1, arg2)
        if result.status_code < 500:
            break
        time.sleep(1)

    yield result
If you reach the maximum number of retries, the generator simply yields the last response received. If you get a valid response, you break out of the loop and yield the good response.
There's a small inefficiency in that if you exhaust the retries, you still sleep for 1 s before yielding the value. You can fix that by managing the retries variable explicitly and checking the limit inside the loop. But you might have other ideas on how to manage giving up (e.g. a timeout rather than a fixed number of retries), so adapt that part of the code to whatever logic makes sense to you; a timeout-based sketch follows below.
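For example, a sketch of the timeout-based variant (the 10-second budget is an arbitrary assumption) that also avoids the trailing sleep on the final attempt could look like this:

import time

def sequence_of_requests(arg1, arg2, timeout=10):
    yield request_a(arg1)
    yield request_b(arg1, arg2)

    deadline = time.monotonic() + timeout
    while True:
        result = request_c(arg1, arg2)
        # stop on the first non-5xx response, or once the time budget is spent
        if result.status_code < 500 or time.monotonic() > deadline:
            break
        time.sleep(1)

    yield result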

Related

Who/How to get control of the program after an exception has occurred

I have always wondered who takes control of the program after an exception has been thrown. I was looking for a clear answer but did not find one. I have the following functions; each one executes an API call which involves a network request, so I need to handle any possible errors with a try/except and possibly an else block (JSON responses must be parsed/decoded as well):
# This function runs first; if it fails, none of the other functions will run. Should return a JSON.
def get_summary():
    pass

# Gets executed after get_summary. Should return a string.
def get_block_hash():
    pass

# Gets executed after get_block_hash. Should return a JSON.
def get_block():
    pass

# Gets executed after get_block. Should return a JSON.
def get_raw_transaction():
    pass
I wish to implement a kind of retry functionality on each function, so if it fails due to a timeout error, connection error, JSON decode error etc., it will keep retrying without compromising the flow of the program:
def get_summary():
    try:
        response = requests.get(API_URL_SUMMARY)
    except requests.exceptions.RequestException as error:
        logging.warning("...")
        #
    else:
        # Once the response has been received, should the JSON be
        # decoded here, wrapped in a try/except/else,
        # or outside of this block?
        return response.text

def get_block_hash():
    try:
        response = requests.get(API_URL + "...")
    except requests.exceptions.RequestException as error:
        logging.warning("...")
        #
    else:
        return response.text

def get_block():
    try:
        response = requests.get(API_URL + "...")
    except requests.exceptions.RequestException as error:
        logging.warning("...")
        #
    else:
        #
        #
        #
        return response.text

def get_raw_transaction():
    try:
        response = requests.get(API_URL + "...")
    except requests.exceptions.RequestException as error:
        logging.warning("...")
        #
    else:
        #
        #
        #
        return response.text
if __name__ == "__main__":
    # summary = get_summary()
    # block_hash = get_block_hash()
    # block = get_block()
    # raw_transaction = get_raw_transaction()
    # ...
I want to keep the outermost part of the code (the block after if __name__ == "__main__":) clean; I don't want to fill it with confusing try/except blocks, logging, etc.
I tried having each function call itself when an exception was thrown, but then I read about the stack limit and thought it was a bad idea; there should be a better way to handle this.
requests already retries by itself N times when I call the get method, where N is a constant in the source code; it is 100. But when the number of retries reaches 0 it will throw an error I need to catch.
Where should I decode the JSON response? Inside each function, wrapped by another try/except/else block, or in the main block? How can I recover from an exception and keep trying in the function where it failed?
Any advice will be appreciated.
You could keep those in an infinite loop (to avoid recursion) and once you get the expected response just return:
def get_summary():
    while True:
        try:
            response = requests.get(API_URL_SUMMARY)
        except requests.exceptions.RequestException as error:
            logging.warning("...")
            #
        else:
            # As winklerrr points out, try to return the transformed data as soon
            # as possible, so you should be decoding the JSON response here.
            try:
                json_response = json.loads(response.text)
            except ValueError as error:  # ValueError will catch any error when decoding the response
                logging.warning(error)
            else:
                return json_response
This function keeps executing until it receives the expected result (i.e. reaches return json_response); otherwise it tries again and again.
You can do the following:
def my_function(iteration_number=1):
    try:
        response = requests.get(API_URL_SUMMARY)
    except requests.exceptions.RequestException:
        if iteration_number < iteration_threshold:
            return my_function(iteration_number + 1)
        else:
            raise
    except Exception:  # for all other exceptions, raise
        raise
    return json.loads(response.text)

my_function()
Where should I decode the JSON response?
Inside each function, wrapped by another try/except/else block, or in the main block?
As a rule of thumb: try to transform data as soon as possible into the format you want it to be. It makes the rest of your code easier if you don't have to extract everything from a response object again all the time. So just return the data you need, in the easiest format you need it to be.
In your scenario: you call the API in every function with the same call to requests.get(). Normally all the responses from an API have the same format, so you could write an extra function which makes that call to the API for you and loads the response directly into a proper JSON object.
Tip: For working with JSON make use of the standard library with import json
Example:
import json
import requests

def call_api(api_sub_path):
    response = requests.get(API_BASE_URL + api_sub_path)
    json_response = json.loads(response.text)

    # you could verify your result here already, e.g.
    if json_response["result_status"] == "successful":
        return json_response["result"]

    # or maybe throw an exception here, depends on your use case
    return json_response["some_other_value"]
How can I recover from an exception and keep trying in the function where it failed?
You could use a while loop for that:
def main(retries=100):  # default value if none is given
    result = functions_that_could_fail(retries)
    if result:
        logging.info("Finished successfully")
        functions_that_depend_on_result_from_before(result)
    else:
        logging.info("Finished without result")

def functions_that_could_fail(retry):
    while retry:  # is True as long as retry is bigger than 0
        try:
            # call all functions here so you just have to write one try-except block
            summary = get_summary()
            block_hash = get_block_hash()
            block = get_block()
            raw_transaction = get_raw_transaction()
        except Exception:
            retry -= 1
            if retry:
                logging.warning("Failed, but trying again...")
        else:
            # else gets executed only when no exception was raised in the try block
            logging.info("Success")
            return summary, block_hash, block, raw_transaction
    logging.error("Failed - won't try again.")
    return None

def functions_that_depend_on_result_from_before(result):
    # [use result here ...]
    pass
So with the code from above you (and maybe also some other people who use your code) could start your program with:
if __name__ == "__main__":
    main()
    # or when you want to change the number of retries
    main(retries=50)

Testing and mocking a threaded function within a view

I'm trying to unit test a function that I run threaded within a view. Whenever I try to mock it, it always goes to the original function, not the mocked function.
The code I'm testing, from the view module:
def restart_process(request):
    batch_name = request.POST.get("batch_name", "")
    if batch_name:
        try:
            batch = models.Batch.objects.get(num=batch_name)
        except models.Batch.DoesNotExist:
            logger.warning("Trying to restart a batch that does not exist: " + batch_name)
            return HttpResponse(404)
        else:
            logger.info(batch_name + " restarted")
            try:
                t = threading.Thread(target=restart_from_last_completed_state, args=(batch,))
                t.daemon = True
                t.start()
            except RuntimeError:
                return HttpResponse(500, "Threading error")
            return HttpResponse(200)
    else:
        return HttpResponse(400)
The test function:
class ThreadTestCases(TransactionTestCase):
    def test_restart_process(self):
        client = Client()
        mock_restart_from_last_completed_state = mock.Mock()
        with mock.patch("processapp.views.restart_from_last_completed_state",
                        mock_restart_from_last_completed_state):
            response = client.post('/batch/restart/', {"batch_name": "BATCH555"})
            self.assertEqual(response.status_code, 200)
            mock_restart_from_last_completed_state.assert_called_once()
The URL:
url(r'^batch/restart/$', views.restart_from_last_completed_state, name="restart_batch"),
I always get this error:
ValueError: The view processapp.processing.process_runner.restart_from_last_completed_state didn't return an HttpResponse object. It returned None instead.
I put a print statement in the original function (restart_from_last_completed_state) and it always runs, so the mocking does not take place.
The error seems to treat the function as a view, although it is not.
I'm not sure where the error is, the threading, testing, something else?
The URL pattern was wrong. It was supposed to be views.restart_process, not views.restart_from_last_completed_state.
A copy/paste error, as so often...
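With that fixed, the URL pattern points at the view rather than the threaded helper:

url(r'^batch/restart/$', views.restart_process, name="restart_batch"),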

Python API Rate Limiting - How to Limit API Calls Globally

I'm trying to restrict the API calls in my code. I already found a nice python library ratelimiter==1.0.2.post0
https://pypi.python.org/pypi/ratelimiter
However, this library can only limit the rate in a local scope, i.e. within a function or loop:
# Decorator
@RateLimiter(max_calls=10, period=1)
def do_something():
    pass

# Context Manager
rate_limiter = RateLimiter(max_calls=10, period=1)
for i in range(100):
    with rate_limiter:
        do_something()
Because I have several functions that make API calls in different places, I want to limit the API calls globally.
For example, suppose I want to limit the APIs call to one time per second. And, suppose I have functions x and y in which two API calls are made.
@rate(...)
def x():
    ...

@rate(...)
def y():
    ...
By decorating the functions with the limiter, I'm able to limit the rate against the two functions.
However, if I execute the above two functions sequentially, the count of API calls is lost in the global scope because the two limiters are unaware of each other. So y will be called right after the execution of x without waiting another second, and this will violate the one-call-per-second restriction.
Is there any way or library that I can use to limit the rate globally in python?
I had the same problem, I had a bunch of different functions that calls the same API and I wanted to make rate limiting work globally. What I ended up doing was to create an empty function with rate limiting enabled.
PS: I use a different rate limiting library found here: https://pypi.org/project/ratelimit/
from ratelimit import limits, sleep_and_retry

# 30 calls per minute
CALLS = 30
RATE_LIMIT = 60

@sleep_and_retry
@limits(calls=CALLS, period=RATE_LIMIT)
def check_limit():
    ''' Empty function just to check for calls to API '''
    return
Then I just call that function at the beginning of every function that calls the API:
def get_something_from_api(http_session, url):
    check_limit()
    response = http_session.get(url)
    return response
If the limit is reached, the program will sleep until the (in my case) 60 seconds have passed, and then resume normally.
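So, as an illustration (API_BASE_URL and the endpoint paths below are made-up placeholders), any number of wrapper functions share the same global budget simply by calling check_limit() first:

API_BASE_URL = 'https://api.example.com'  # placeholder

def get_user_from_api(http_session, user_id):
    check_limit()  # counts against the shared 30-calls-per-minute budget
    return http_session.get('{}/users/{}'.format(API_BASE_URL, user_id))

def get_orders_from_api(http_session, user_id):
    check_limit()  # same shared budget as above
    return http_session.get('{}/users/{}/orders'.format(API_BASE_URL, user_id))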
In the end, I implemented my own Throttler class. By proxying every API request through the request method, we can keep track of all API requests. Because the API call is passed in as a function parameter to request, it can also cache the result in order to reduce API calls.
import datetime
import logging
import time

class TooManyRequestsError(Exception):
    def __str__(self):
        return "More than 30 requests have been made in the last five seconds."

class Throttler(object):
    cache = {}

    def __init__(self, max_rate, window, throttle_stop=False, cache_age=1800):
        # Dict of max number of requests of the API rate limit for each source
        self.max_rate = max_rate
        # Dict of duration of the API rate limit for each source
        self.window = window
        # Whether to throw an error (when True) if the limit is reached, or wait until another request
        self.throttle_stop = throttle_stop
        # The time, in seconds, for which to cache a response
        self.cache_age = cache_age
        # Initialization
        self.next_reset_at = dict()
        self.num_requests = dict()
        now = datetime.datetime.now()
        for source in self.max_rate:
            self.next_reset_at[source] = now + datetime.timedelta(seconds=self.window.get(source))
            self.num_requests[source] = 0

    def request(self, source, method, do_cache=False):
        now = datetime.datetime.now()

        # if a cached response exists, no need to make the API call
        # (method.func_name is the Python 2 spelling; on Python 3 use method.__name__)
        key = source + method.func_name
        if do_cache and key in self.cache:
            timestamp, data = self.cache.get(key)
            logging.info('{} exists in cache @ {}'.format(key, timestamp))
            if (now - timestamp).seconds < self.cache_age:
                logging.info('retrieved cache for {}'.format(key))
                return data

        # <--- MAKE API CALLS ---> #

        # reset the count if the period has passed
        if now > self.next_reset_at.get(source):
            self.num_requests[source] = 0
            self.next_reset_at[source] = now + datetime.timedelta(seconds=self.window.get(source))

        # throttle request
        def halt(wait_time):
            if self.throttle_stop:
                raise TooManyRequestsError()
            else:
                # Wait the required time, plus a bit of extra padding time.
                time.sleep(wait_time + 0.1)

        # if the max rate is exceeded, wait
        if self.num_requests.get(source) >= self.max_rate.get(source):
            logging.info('back off: {} until {}'.format(source, self.next_reset_at.get(source)))
            halt((self.next_reset_at.get(source) - now).seconds)

        self.num_requests[source] += 1
        response = method()  # potential exception raise

        # cache the response
        if do_cache:
            self.cache[key] = (now, response)
            logging.info('cached instance for {}, {}'.format(source, method))

        return response
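A usage sketch, assuming a single source with made-up limits (the 'github' name, the endpoint, and the 30-requests-per-5-seconds window are purely illustrative):

import requests

def fetch_rate_limit():
    # any zero-argument callable that performs the API request works here
    return requests.get('https://api.github.com/rate_limit').json()

throttler = Throttler(max_rate={'github': 30}, window={'github': 5})
data = throttler.request('github', fetch_rate_limit, do_cache=True)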
Many API providers constrain developers from making too many API calls.
The Python ratelimit package introduces a function decorator that prevents a function from being called more often than the API provider allows.
from ratelimit import limits
import requests

TIME_PERIOD = 900  # time period in seconds

@limits(calls=15, period=TIME_PERIOD)
def call_api(url):
    response = requests.get(url)
    if response.status_code != 200:
        raise Exception('API response: {}'.format(response.status_code))
    return response
Note: this function will not be able to make more than 15 API calls within a 15-minute time period.
Adding to Sunil's answer, you need to add the @sleep_and_retry decorator, otherwise your code will break when it reaches the rate limit:
@sleep_and_retry
@limits(calls=0.05, period=1)
def api_call(url, api_key):
    r = requests.get(
        url,
        headers={'X-Riot-Token': api_key}
    )
    if r.status_code != 200:
        raise Exception('API Response: {}'.format(r.status_code))
    return r
There are lots of fancy libraries that will provide nice decorators, and special safety features, but the below should work with django.core.cache or any other cache with a get and set method:
def hit_rate_limit(key, max_hits, max_hits_interval):
    '''Implement a basic rate throttler. Prevent more than max_hits occurring
    within max_hits_interval time period (seconds).'''
    # Use the django cache, but can be any object with get/set
    from django.core.cache import cache
    hit_count = cache.get(key) or 0
    logging.info("Rate Limit: %s --> %s", key, hit_count)
    if hit_count > max_hits:
        return True
    cache.set(key, hit_count + 1, max_hits_interval)
    return False
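A hypothetical call site (the cache key and limits are made up) might guard an API call like this:

import logging
import requests

def call_partner_api(url):
    # allow at most 100 calls per 60-second window, shared by every caller using this cache key
    if hit_rate_limit('partner-api', 100, 60):
        logging.warning("Rate limit exceeded; skipping call to %s", url)
        return None
    return requests.get(url)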
Using the Python standard library:
import threading
from time import time, sleep

b = threading.Barrier(2)

def belay(s=1):
    """Block the main thread for `s` seconds."""
    while True:
        b.wait()
        sleep(s)

def request_something():
    b.wait()
    print(f'something at {time()}')

def request_other():
    b.wait()
    print(f'or other at {time()}')

if __name__ == '__main__':
    thread = threading.Thread(target=belay)
    thread.daemon = True
    thread.start()

    # request a lot of things
    i = 0
    while (i := i+1) < 5:
        request_something()
        request_other()
There are about s seconds between each timestamp printed. Because the main thread waits rather than sleeps, the time it spends responding to requests is unrelated to the (minimum) time between requests.

Using 'while True' while doing http request in Python 2.7

Is there a more Pythonic (2.7) way to check the server for a good status_code (200) that doesn't include using while True? My code snippet is as follows - and it's called many times:
import time
import json
from datetime import datetime
import requests

while True:
    response = requests.get('http://example.com')
    if response.status_code != 200:
        print 'sleeping:', str(datetime.now()), response.status_code
        print 'sleeping:', str(datetime.now()), response.headers
        time.sleep(5.0)
    else:
        break

if "x-mashery-error-code" in response.headers:
    return None
return response.json()
Edit: I included the 'if' block that handles the header errors.
You can use Event Hooks:
requests.get('http://example.com', hooks=dict(response=check_status))

def check_status(response):
    if response.status_code != 200:
        print 'not yet'
I would prefer this solution:
response = requests.get('http://example.com')
while response.status_code != 200:
    print 'sleeping:', str(datetime.now()), response.status_code
    print 'sleeping:', str(datetime.now()), response.headers
    time.sleep(5.0)
    response = requests.get('http://example.com')
Because:
>>> import this
...
Explicit is better than implicit.
Simple is better than complex.
...
Flat is better than nested.
...
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
...
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
...
Because I read it and understand it right away. With event hooks this is not the case. Do they open a thread to retrieve bytes in parallel? When are they called? Do I need to retrieve the data myself?
I'm using aspect-oriented programming by applying decorators for things like doing retries. If my function for getting the value I want looks like this:
def getValue():
    return requests.get('http://example.com')
Then I'm decorating this function to apply the retry mechanism without interfering with the original (naive) code:
def retryUntilCondition(condition):
    def decorate(function):
        def f(*args, **kwargs):
            while True:
                result = function(*args, **kwargs)
                if condition(result):
                    return result
                time.sleep(5.0)
        return f
    return decorate

def responseIs200(response):
    return response.status_code == 200
The above is the preparation (part of a utility library); below follows the usage:
@retryUntilCondition(responseIs200)
def getValue():
    return requests.get('http://example.com')
This way the while loop is completely hidden from the application code and does not complicate reading it. The aspect of retrying is added by prepending a simple decorator which can even be reused in other situations.
If you later decide that you only want to retry a specific number of times, have different delays, etc., all of this can be implemented in the retry decorator alone, as sketched below.
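As a sketch of that idea (max_attempts and delay are illustrative defaults, not part of the original utility), the decorator could be extended like this:

import time

def retryUntilCondition(condition, max_attempts=10, delay=5.0):
    def decorate(function):
        def f(*args, **kwargs):
            result = None
            for _ in range(max_attempts):
                result = function(*args, **kwargs)
                if condition(result):
                    break
                time.sleep(delay)
            # after max_attempts the last result is returned regardless
            return result
        return f
    return decorate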

Responding to httpRequest after using threading.Timer to delay response

I'm trying to patch a testing framework for JavaScript, built in Python, called mootools-test-runner (I'm a front-end developer by day, so my Python skills are pretty weak... really weak).
The use case is that we want to be able to make a JSON request to the server and have it delay x amount of time before it returns. Originally it was written to use a sleep method, but that prevented multiple simultaneous requests. So, after poking around for about a day, I arrived at the code below. The problem I'm seeing (although there could well be many problems with my code) is:
The view test_runner.views.echo_json didn't return an HttpResponse object.
If anyone could offer any advice or point me in the right direction I would be super grateful -- thanks!
def echo_json(req, wasDelayed=False):
    if req.REQUEST.get('delay') and wasDelayed == False:
        sleeper(req, echo_json)
    else:
        response = {}
        callback = req.REQUEST.get('callback', False)
        noresponse_keys = ['callback', 'delay']
        for key, value in req.REQUEST.items():
            if key not in noresponse_keys:
                response.update({key: value})
            response = simplejson.dumps(response)
            if callback:
                response = '%s(%s);' % (callback, response)
            return HttpResponse(response, mimetype='application/javascript')

def sleeper(req, callback):
    delay = float(req.REQUEST.get('delay'))
    t = threading.Timer(delay, functools.partial(callback, req, True))
    t.start()
Are you sure you want the return statement inside the for key, value loop? You're only allowing a single iteration, and returning.
Also, check the flow of the function. There are cases in which it will return None. The easiest way to find them is to print out your request object and examine it in the cases where the function doesn't return an HttpResponse object.
See that your function will return None if:
req.REQUEST contains the key 'delay' and wasDelayed is False
req.REQUEST.items() is empty
I can't be sure, but I think the 2 problems are the else: and the return there. Shouldn't the code below the else: be executing whether the response is delayed or not? And shouldn't the return statement be outside the for loop?
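Putting both suggestions together, a possible restructuring of the non-delayed branch (keeping the original names; note the delayed branch still needs to return some HttpResponse, which is the remaining cause of the error the question reports) might look like:

def echo_json(req, wasDelayed=False):
    if req.REQUEST.get('delay') and not wasDelayed:
        sleeper(req, echo_json)
        # NOTE: the view must still return an HttpResponse on this branch,
        # otherwise Django reports "didn't return an HttpResponse object".
    else:
        response = {}
        callback = req.REQUEST.get('callback', False)
        noresponse_keys = ['callback', 'delay']
        for key, value in req.REQUEST.items():
            if key not in noresponse_keys:
                response.update({key: value})
        # build and return the response once, after the loop has finished
        response = simplejson.dumps(response)
        if callback:
            response = '%s(%s);' % (callback, response)
        return HttpResponse(response, mimetype='application/javascript')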
