I need to compare a few CDN services, so I wrote a short Python script that repeatedly sends GET requests to resources deployed on these CDNs and records the round-trip time. I run the script on several PCs in different cities.
This is how I did it:
import time
import requests

t0 = time.clock()
r = requests.get(test_cdn_url)
t1 = time.clock()
roundtrip = t1 - t0  # in seconds
For most requests, the round-trip time is under 1 second (200-500 ms), but occasionally it reports a request that takes several seconds: 3-5 seconds, and once 9 seconds.
Is this just the way it is, or am I using the wrong tool to measure? In other words, does the requests library do something (caching or some heavyweight operations) that makes the metric totally wrong?
The Response object provides an elapsed attribute:
The amount of time elapsed between sending the request and the arrival
of the response (as a timedelta)
Your code would then look like this (note that elapsed measures the time from sending the request until the response headers are parsed, so it is unaffected by how long you take to consume the body):
r = requests.get(test_cdn_url)
roundtrip = r.elapsed.total_seconds()
If you're worried that requests is doing anything heavyweight (or caching), you could always use urllib (this is the Python 2 urllib; in Python 3, use urllib.request.urlopen):

import time
import urllib

t0 = time.time()
nf = urllib.urlopen(url)  # send the request
page = nf.read()          # read the full response body
t1 = time.time()
nf.close()
roundtrip = t1 - t0  # in seconds
Alternatively, if you include a Cache-Control: no-cache header along with your request, then that should ensure that no caching happens along the way - and your original code should time the request effectively.
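For example, a minimal sketch of that, using time.time() (wall-clock time) with the test_cdn_url from the question:

import time
import requests

# Ask caches along the way not to serve a stored copy.
headers = {'Cache-Control': 'no-cache'}

t0 = time.time()
r = requests.get(test_cdn_url, headers=headers)
t1 = time.time()
roundtrip = t1 - t0  # in seconds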
So, I'm trying to write a queue in Python for load testing and I'm stuck. What I have:
A REST API with authentication;
A POST route for sending requests;
The requests package for sending requests;
An array of user emails that also contains their user_ids.
Once I pass authentication I get a user_id and an access_token. I can save them in a dictionary or a list, but then I need to use them for sending requests: the user_id goes into the route, and the access_token is needed for the checksum calculation. I don't see a clean way to handle this. I thought about two for loops, but then I need a way to save the access_token and user_id for all of my users in a dictionary and reuse them, and I'm not sure about that approach either.
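Something like this, as a sketch (api_authorize and send_activity are hypothetical helpers standing in for my real calls):

# First pass: authorize every user once and save the credentials, keyed by email.
credentials = {}
for email in users:
    user_id, access_token = api_authorize(email)  # hypothetical helper
    credentials[email] = (user_id, access_token)

# Second pass: reuse the saved credentials for the actual load test,
# so authorization time is not mixed into the measured request time.
for email, (user_id, access_token) in credentials.items():
    send_activity(user_id, access_token)  # hypothetical helper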
I tried to do it with multithreading. For example:
q = queue.Queue()

def run_with_queue_api_authorize():
    with concurrent.futures.ThreadPoolExecutor(max_workers=100) as pool:
        auth = [pool.submit(ApiAuthorization.get_token_generated, email) for email in ApiVariables.users]
        for r in concurrent.futures.as_completed(auth):
            q.put(r)
        print(q.qsize())
    return q

def run_with_queue_send_activity():
    start_time = int(datetime.datetime.now().timestamp())
    run_with_queue_api_authorize()
    end_time = int(datetime.datetime.now().timestamp())
    print(f"Execution time for auth: {end_time - start_time}, start time is {start_time}, end_time is {end_time}")
    time.sleep(10)
    start_time = int(datetime.datetime.now().timestamp())
    with concurrent.futures.ThreadPoolExecutor(max_workers=100) as pool:
        while not q.empty():
            task = q.get()
            if task.done():
                new_try = pool.submit(MultiQueue.multi_post_user_activity, task)
            if q.empty():
                break
    end_time = int(datetime.datetime.now().timestamp())
    print(f"Execution time for post user activity: {end_time - start_time}, start time is {start_time}, end_time is {end_time}")
When I use this code I can send my test data, but only for one user out of the 3000 threads; it fails for the other 2999. I don't understand why it doesn't work.
I also tried to create the queue with the threading package, but it finishes immediately without any output. I still think the solution with ThreadPoolExecutor is the more reliable and feasible one.
I can take a user_id from a separate array, but I would still need the matching access_token for that user, so that doesn't look like a workable approach.
Why do I need all this? Because I need to measure the number of requests per second, and when I do authorization in the threads, I'm also sending data at the same time asynchronously, so the total time for my tests would include the authorization time.
How can I resolve this problem? I've watched some videos about multithreading and read a lot on the subject, but I can't apply it to my case.
I'd be grateful for any advice.
I have a dataframe where each row is a record, and I need to send each record in the body of a POST request. Right now I am looping through the dataframe to accomplish this. I am constrained by the fact that each record must be posted individually. Is there a faster way to accomplish this?
Iterating over the data frame is not the issue here. The issue is that you have to wait for the server to respond to each of your requests. A network request takes eons compared to the CPU time needed to iterate over the data frame. In other words, your program is I/O bound, not CPU bound.
One way to speed it up is to use coroutines. Let's say you have to make 1000 requests. Instead of firing one request, waiting for the response, then firing the next request, and so on, you fire all 1000 requests at once and tell Python to wait until all 1000 responses have been received.
Since you didn't provide any code, here's a small program to illustrate the point:
import aiohttp
import asyncio
import numpy as np
import time
from typing import List

async def send_single_request(session: aiohttp.ClientSession, url: str):
    async with session.get(url) as response:
        return await response.json()

async def send_all_requests(urls: List[str]):
    async with aiohttp.ClientSession() as session:
        # Make 1 coroutine for each request
        coroutines = [send_single_request(session, url) for url in urls]
        # Wait until all coroutines have finished
        return await asyncio.gather(*coroutines)

# We will make 10 requests to httpbin.org. Each request will take at least d
# seconds. If you were to fire them sequentially, they would have taken at least
# delays.sum() seconds to complete.
np.random.seed(42)
delays = np.random.randint(0, 5, 10)
urls = [f"https://httpbin.org/delay/{d}" for d in delays]

# Instead, we will fire all 10 requests at once, then wait until all 10 have
# finished.
t1 = time.time()
result = asyncio.run(send_all_requests(urls))
t2 = time.time()

print(f"Expected time: {delays.sum()} seconds")
print(f"Actual time: {t2 - t1:.2f} seconds")
Output:
Expected time: 28 seconds
Actual time: 4.57 seconds
You have to read up a bit on coroutines and how they work, but for the most part they are not too complicated for your use case. This comes with a couple of caveats:
All your requests must be independent of each other.
The rate limit on the server must be sufficient to handle your workload. For example, if it restricts you to 2 requests per minute, there is no way around that other than upgrading to a different service tier. (If you only need to cap how many requests are in flight at once, see the sketch after this list.)
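For the concurrency-cap case, here is a minimal sketch that throttles the coroutines from the program above with an asyncio.Semaphore (max_concurrency is an arbitrary illustrative value):

import asyncio
import aiohttp
from typing import List

async def send_single_request(session: aiohttp.ClientSession, url: str, semaphore: asyncio.Semaphore):
    # The semaphore allows at most max_concurrency requests in flight at once.
    async with semaphore:
        async with session.get(url) as response:
            return await response.json()

async def send_all_requests(urls: List[str], max_concurrency: int = 10):
    semaphore = asyncio.Semaphore(max_concurrency)
    async with aiohttp.ClientSession() as session:
        coroutines = [send_single_request(session, url, semaphore) for url in urls]
        return await asyncio.gather(*coroutines)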
How can I measure the full time grpc-python takes to handle a request?
So far the best I can do is:
def Run(self, request, context):
    start = time.time()
    # service code...
    end = time.time()
    return myservice_stub.Response()
But this doesn't measure how much time gRPC takes to serialize the request and the response, to transfer them over the network, and so on. I'm looking for a way to "hook" into these steps.
You can measure on the client side:
start = time.time()
response = stub.Run(request)
total_end_to_end = time.time() - start
Then you can get the total overhead (serialization, transfer, and so on) by subtracting the computation time of the Run method.
To automate the process, you can add (at least for the sake of the test) the computation time as a field to the myservice_stub.Response.
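A minimal sketch of that idea; the compute_seconds field is hypothetical and would need to be added to the Response message in your .proto:

import time

# Server side: measure and report how long the service logic itself took.
def Run(self, request, context):
    start = time.time()
    # service code...
    compute_seconds = time.time() - start
    return myservice_stub.Response(compute_seconds=compute_seconds)

# Client side: everything beyond the reported compute time is gRPC overhead
# (serialization, network transfer, scheduling, and so on).
start = time.time()
response = stub.Run(request)
total_end_to_end = time.time() - start
overhead = total_end_to_end - response.compute_seconds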
I am using Google's Python API client library on App Engine to run a number of queries in BigQuery to generate live analytics. The calls take roughly two seconds each and, with five queries, this is too long, so I looked into ways to speed things up and thought running the queries asynchronously would be a solid improvement. The thinking was that I could insert the five queries at once, Google would do some magic to run them all at the same time, and I would then use jobs.getQueryResults(jobId) to get the results for each job. I decided to test the theory with a proof of concept by timing the execution of two asynchronous queries and comparing it to running the same queries synchronously. The results:
synchronous: 3.07 seconds (1.34s and 1.29s for each query)
asynchronous: 2.39 seconds (0.52s and 0.44s for each insert, plus another 1.09s for getQueryResults())
Which is only a difference of 0.68 seconds. So while asynchronous queries are faster, they aren't achieving the goal of Google's parallel magic to cut down on total execution time. So, first question: is that expectation of parallel magic correct? Even if it's not, of particular interest to me is Google's claim that
An asynchronous query returns a response immediately, generally before
the query completes.
Roughly half a second to insert the query doesn't meet my definition of 'immediately'! I imagine Jordan or someone else on the Big Query team will be the only ones that can answer this, but I welcome any answers!
EDIT NOTES:
Per Mikhail Berlyant's suggestion, I gathered creationTime, startTime and endTime from the jobs response and found:
creationTime to startTime: 462ms, 387ms (timing for queries 1 and 2)
startTime to endTime: 744ms, 1005ms
Though I'm not sure if that adds anything to the story as it's the timing between issuing insert() and the call completing that I'm wondering about.
From BQ's Jobs documentation, the answer to my first question about parallel magic is yes:
You can run multiple jobs concurrently in BigQuery
CODE:
For what it's worth, I tested this both locally and on production App Engine. Local was slower by a factor of about 2-3, but replicated the results. In my research I also found out about partitioned tables, which I wish I had known about before (and which may well end up being my solution), but this question stands on its own. Here is my code; I am omitting the actual SQL because it is irrelevant in this case:
def test_sync(self, request):
    t0 = time.time()
    request = bigquery.jobs()
    data = { 'query': (sql) }
    response = request.query(projectId=project_id, body=data).execute()
    t1 = time.time()
    data = { 'query': (sql) }
    response = request.query(projectId=project_id, body=data).execute()
    t2 = time.time()
    print("0-1: " + str(t1 - t0))
    print("1-2: " + str(t2 - t1))
    print("elapsed: " + str(t2 - t0))
def test_async(self, request):
    job_ids = {}
    t0 = time.time()
    job_id = async_query(sql)
    job_ids['a'] = job_id
    print("job_id: " + job_id)
    t1 = time.time()
    job_id = async_query(sql)
    job_ids['b'] = job_id
    print("job_id: " + job_id)
    t2 = time.time()
    for key, value in job_ids.iteritems():
        response = bigquery.jobs().getQueryResults(
            jobId=value,
            projectId=project_id).execute()
    t3 = time.time()
    print("0-1: " + str(t1 - t0))
    print("1-2: " + str(t2 - t1))
    print("2-3: " + str(t3 - t2))
    print("elapsed: " + str(t3 - t0))
def async_query(sql):
    job_data = {
        'jobReference': {
            'projectId': project_id
        },
        'configuration': {
            'query': {
                'query': sql,
                'priority': 'INTERACTIVE'
            }
        }
    }
    response = bigquery.jobs().insert(
        projectId=project_id,
        body=job_data).execute()
    job_id = response['jobReference']['jobId']
    return job_id
The answer to whether running queries in parallel will speed up the results is, of course, "it depends".
When you use the asynchronous job API, there is about half a second of built-in latency added to every query. This is because the API is not designed for short-running queries; if your queries run in under a second or two, you don't need asynchronous processing.
The half-second latency will likely go down in the future, but there are a number of fixed costs that aren't going to get any better. For example, you're sending two HTTP requests to Google instead of one. How long these take depends on where you are sending the requests from and the characteristics of the network you're using. If you're in the US, this could add only a few milliseconds of round-trip time, but if you're in Brazil, it might be 100 ms.
Moreover, when you do jobs.query(), the BigQuery API server that receives the request is the same one that starts the query, so it can return the results as soon as the query is done. But when you use the asynchronous API, your getQueryResults() request is going to go to a different server. That server has to either poll for the job state or find the server that is running the request in order to get the status. This takes time.
So if you're running a bunch of queries in parallel, each taking 1-2 seconds, but you're adding half a second to each one, plus another half second for the initial request, you're not likely to see much of a speedup. If, on the other hand, your queries take 5 or 10 seconds each, the fixed overhead would be a smaller percentage of the total time.
My guess is that if you ran a larger number of queries in parallel, you'd see more of a speedup. The other option is to use the synchronous version of the API but send multiple requests in parallel from multiple threads on the client, as sketched below.
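A rough sketch of that client-side threading approach, reusing the synchronous jobs().query() call and the bigquery/project_id names from the question (queries is a hypothetical list of SQL strings, and this assumes an environment where concurrent.futures is available):

from concurrent.futures import ThreadPoolExecutor

def run_query(sql):
    # The same synchronous call as in test_sync above.
    data = {'query': sql}
    return bigquery.jobs().query(projectId=project_id, body=data).execute()

# Fire all the queries in parallel from separate client threads.
with ThreadPoolExecutor(max_workers=5) as pool:
    responses = list(pool.map(run_query, queries))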
There is one more caveat, and that is query size. Unless you purchase extra capacity, BigQuery will, by default, give you 2000 "slots" across all of your queries. A slot is a unit of work that can be done in parallel. You can use those 2000 slots to run one giant query, or 20 smaller queries that each use 100 slots at once. If you run parallel queries that saturate your 2000 slots, you'll experience a slowdown.
That said, 2000 slots is a lot. As a very rough estimate, 2000 slots can process hundreds of gigabytes per second. So unless you're pushing that kind of volume through BigQuery, adding parallel queries is unlikely to slow you down.
I am interested in measuring the time elapsed during a (synchronous) HTTP request and/or a (synchronous) request to a database on a remote server. After reading this page, my understanding is that time.clock() is an accurate measure of the processor time. But I don't know if "processor time" is relevant in my case, since the CPU would be idling while waiting for the response. In other words:
s0 = time.time()
# send a HTTP request
s1 = time.time()
t0 = time.clock()
# send a HTTP request
t1 = time.clock()
Which one actually measures what I want?
For measuring HTTP response time, I think time.time() is enough.
As others suggested, use timeit if you want to do benchmarking.
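For instance, a small sketch with timeit (the URL is just a placeholder):

import timeit

# Time 10 GET requests and report the average wall-clock time per request.
total = timeit.timeit(
    stmt="requests.get('https://example.com/')",
    setup="import requests",
    number=10,
)
print("average per request:", total / 10, "seconds")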
I personally haven't used time.clock() before, but after reading the example:
#!/usr/bin/python

import time

def procedure():
    time.sleep(2.5)

# measure process time
t0 = time.clock()
procedure()
print time.clock() - t0, "seconds process time"

# measure wall time
t0 = time.time()
procedure()
print time.time() - t0, "seconds wall time"
Based on that, I don't think time.clock() is appropriate for measuring HTTP response time: on Unix it measures processor time, which doesn't advance while the process sits idle waiting for the network.
One approach is to use New Relic for Python. You just install the agent and enable it in your application. After that, you will be able to see response-time charts in your New Relic account. It has a free plan.
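As a minimal sketch, assuming the newrelic package is installed (pip install newrelic) and a newrelic.ini config has been generated with your license key, you can enable the agent manually for a WSGI app:

import newrelic.agent

# Must run before the rest of the application is imported.
newrelic.agent.initialize('newrelic.ini')

@newrelic.agent.wsgi_application()
def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello, world!']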