Python script for load testing a web page

I want to do a load test for a web page. I want to do it in Python with multiple threads.
The first POST request would log the user in (set cookies).
Then I need to know how many users making the same POST request simultaneously the server can take.
So I'm thinking about spawning threads in which requests would be made in a loop.
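Roughly what I have in mind (a minimal sketch; the URLs, form fields and counts are placeholders, and the requests library is assumed):

import threading
import requests

BASE = "http://example.com"          # placeholder URL of the site under test
NUM_USERS = 100                      # number of simulated users (threads)
REQUESTS_PER_USER = 50               # how many POSTs each user sends

def simulate_user():
    session = requests.Session()     # keeps the login cookies for this user
    # first POST logs the user in and sets the cookies
    session.post(BASE + "/login", data={"user": "test", "password": "secret"})
    # then hit the endpoint under test in a loop
    for _ in range(REQUESTS_PER_USER):
        session.post(BASE + "/endpoint", data={"key": "value"})

threads = [threading.Thread(target=simulate_user) for _ in range(NUM_USERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()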
I have a couple of questions:
1. Is it possible to run 1000-1500 requests at the same time, CPU-wise? I mean, wouldn't it slow down the system so much that the test is no longer reliable?
2. What about bandwidth limitations? How good does the channel have to be for this test to be reliable?
The server hosting the test site is an Amazon EC2 instance; the script would be run from another server (also Amazon).
Thanks!

CPython does not take advantage of multiple cores when running multiple threads. That means, basically, that you will only have one core doing the testing job.
There are dedicated tools that do what you want. Let me suggest two:
FunkLoad is a functional and load web tester, written in Python, whose main use cases are:
Functional testing of web projects, and thus regression testing as well.
Performance testing: by loading the web application and monitoring your servers it helps you to pinpoint bottlenecks, giving a detailed report of performance measurement.
Load testing tool to expose bugs that do not surface in cursory testing, like volume testing or longevity testing.
Stress testing tool to overwhelm the web application resources and test the application recoverability.
Writing web agents by scripting any web repetitive task, like checking if a site is alive.
Tsung is an open-source multi-protocol distributed load testing tool.
The purpose of Tsung is to simulate users in order to test the scalability and performance of IP-based client/server applications. You can use it to do load and stress testing of your servers. Many protocols have been implemented and tested, and it can be easily extended. WebDAV, LDAP and MySQL support have been added recently (experimental).
It can be distributed on several client machines and is able to simulate hundreds of thousands of virtual users concurrently (or even millions if you have enough hardware...).
If you decide to write your own tool, you will probably want to use Python's multiprocessing module, as it lets you use multiple cores. You should also take a look at Twisted, as it lets you handle many sockets with a limited number of threads. That would be much better than spawning a new thread for each socket.
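For instance, a rough sketch of that do-it-yourself approach with a process pool (the URL, payload and request counts are placeholders, and the requests library is assumed):

import multiprocessing
import requests

URL = "http://example.com/endpoint"     # placeholder: endpoint under test
REQUESTS_PER_WORKER = 500

def worker(_):
    # each worker is a separate process, so each gets its own interpreter and core
    session = requests.Session()
    return sum(
        session.post(URL, data={"key": "value"}).status_code == 200
        for _ in range(REQUESTS_PER_WORKER)
    )

if __name__ == "__main__":
    num_workers = multiprocessing.cpu_count()
    with multiprocessing.Pool(num_workers) as pool:
        successes = pool.map(worker, range(num_workers))
    print("successful requests:", sum(successes))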
You work with Amazon EC2, so I would recommend using Tsung. You can rent a dozen multicore servers for a few hours and run some really heavy load tests with Tsung. It scales very well in this kind of configuration.
As for the bandwidth, it's usually not a problem, but it depends on the application. You will have to monitor all your resources closely while performing a load test.

Too many variables. 1000 at the same time... no. In the same second... possibly. Bandwidth may well be the bottleneck. This is something best solved by experimentation.

Related

How to calculate max requests per second of a Django app?

I am about to deploy a Django app, and it struck me that I couldn't find a way to anticipate how many requests per second my application can handle.
Is there a way of calculating how many requests per second a Django application can handle, without resorting to things like doing a test deployment and using an external tool such as Locust?
I know there are several factors involved (such as the number of database queries, etc.), but perhaps there is a convenient way of calculating, or at least estimating, how many visitors a single Django app instance can handle.
EDIT: Removed the mention to Gunicorn, since it only adds confusion to what I truly wanted to know.
Is there a way of calculating how many requests per second a Django application can handle, without resorting to things like doing a test deployment and using an external tool such as Locust?
No and yes. As mackarone pointed out, I don't think there's any way you can avoid measuring it. Consider the case where you did a local benchmark on your local dev server talking to a local DB instance, in order to generate a baseline for estimation. The issue with this is that the hardware and the network (distance between services) make a huge difference, so any numbers you generated locally would be relatively worthless for capacity planning.
In my experience, local testing is great for relative changes. Consider the case where you wanted to see the performance impact of SQL query planning. Establishing a local baseline, making the change, then observing the effect locally is useful for gauging the relative speedup.
How to generate these numbers?
I would recommend deploying the app to the hardware and network you plan on testing on. This deployment should use your production configuration and component topology (i.e. if you're going to run gunicorn, make sure gunicorn is running instead of NGINX, or if you're going to have a proxy in front of gunicorn, make sure that is set up). I would run a single instance of your application using your production config.
Once this is running, I would launch a load test against the single instance using any of the popular load testing tools:
Apache Benchmark
Siege
Vegeta
K6
etc
You can launch these load tests from a single machine and ramp up traffic until response times are no longer acceptable, in order to get a feel for the number of concurrent connections and the throughput your application can accommodate.
Now you have some idea of what a single instance of your service is able to handle. Until your DB (or other shared resources) are saturated, these numbers can be used to project how many instances of your service are necessary to handle a given amount of traffic.
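For example, a minimal Locust script (Locust was mentioned in the question; the paths and wait times here are placeholders) looks roughly like this:

from locust import HttpUser, task, between

class WebsiteUser(HttpUser):
    # each simulated user waits 1-3 seconds between requests
    wait_time = between(1, 3)

    @task
    def index(self):
        self.client.get("/")            # placeholder path on the app under test

    @task
    def detail(self):
        self.client.get("/some-page/")  # placeholder path

You would run it with something like locust -f locustfile.py --host http://your-staging-host and increase the number of simulated users until response times degrade.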
According to the Gunicorn documentation
How Many Workers?
DO NOT scale the number of workers to the number of clients you expect to have. Gunicorn should only need 4-12 worker processes to handle hundreds or thousands of requests per second.
Gunicorn relies on the operating system to provide all of the load balancing when handling requests. Generally we recommend (2 x $num_cores) + 1 as the number of workers to start off with. While not overly scientific, the formula is based on the assumption that for a given core, one worker will be reading or writing from the socket while the other worker is processing a request.
Obviously, your particular hardware and application are going to affect the optimal number of workers. Our recommendation is to start with the above guess and tune using TTIN and TTOU signals while the application is under load.
Always remember, there is such a thing as too many workers. After a point your worker processes will start thrashing system resources decreasing the throughput of the entire system.
The best thing is to tune it using some load testing tool such as Locust, as you mentioned.
Emphasis mine
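As a concrete illustration, the (2 x $num_cores) + 1 rule is often applied in a Gunicorn config file like this (a sketch; the filename and bind address are arbitrary):

# gunicorn.conf.py -- applies the (2 x cores) + 1 rule of thumb quoted above
import multiprocessing

bind = "0.0.0.0:8000"                            # arbitrary bind address
workers = multiprocessing.cpu_count() * 2 + 1    # e.g. 9 workers on a 4-core box

Started with something like gunicorn -c gunicorn.conf.py myproject.wsgi, then tuned under load via the TTIN/TTOU signals as the documentation suggests.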
You have to install loadtest first; it is an npm package.
I was learning Redis at the time when I found this; you can use it, and it worked for me.
For more, check this tutorial: https://realpython.com/caching-in-django-with-redis/#start-by-measuring-performance
npm install -g loadtest
loadtest -n 100 -k http://localhost:8000/myUrl/

How to manage many high-speed client-server autonomous connections and remote program execution

Here is an interesting problem. I am building an application in AI that is meant to be fundamentally agnostic to computational architectures (hopefully including mobile). We have built a Python application with a machine learning and deep learning data structure that connects to disparate systems and combines their data with the new systems, which usually run a Windows app (sometimes Mac).
We have been working with network engineers and pen testers to determine the best approach to maintaining multiple client-server connections with asynchronicity under a heavy load, while maintaining a positive security posture.
Here are the upsides:
1. Data from the producers is very fast.
2. Programmatically we have increased the ability to load-balance.
The downsides:
1. The controllers are not very fast. Here we are using several languages and tools (Node.js, Go, CherryPy, RabbitMQ) to speed up the connections and multi/hyper-thread (Raspberry Pi). Our goal is to use minimal hardware (Pi Zero, Raspberry Pi, Arduino) to install and use our tech in an "instance" (see Tony Stark in Avengers: connect the chips to the rail and create a command center).
2. Point machines are not always quick, AND networking can suck. Using some fabric (with some hopes and dreams) we are nearly at the point of being able to deploy a system in seconds. Currently, data is slow on the back end, and the DBs are not helping that case. Some of our clients would like to test, and we are not interested in making a massive distro. Given the amount of data, we only hope to achieve a "lightweight" client application that will leverage (dissimilar) technologies.
As we connect to "permanent" disparate systems we will pass our computational loads to them and use them for cold storage.
- We are having issues with application persistence / surviving reboots.
- With the multi-client control, commands are getting dropped (some units get them two or more times).
- We are running many scripts and programs that get, write, read, and put. We have them spread between multiple languages with all different uses and requirements.
- After the application is loaded, updates are not always making it to each unit.
Thank you to all who reply.

Fastest, simplest way to handle long-running upstream requests for Django

I'm using Django with uWSGI. We have 8 processes running, and I have no real indication that our code is particularly thread-safe, as it was never designed with threads in mind.
Recently, we added the ability to get live rates from vendors of a service through their various APIs and display them all at once for the user. The problem is these requests use old web service technologies, and due to their response times, the time needed before all rates from vendors are acquired (or the request gives up) can be up to 10 seconds.
This presents a problem. We have a pretty decent amount of traffic on our site, and customers need to look at these rates pretty often. With only 8 processes, it's quite easy to see how the server can get tied up waiting on these upstream requests, especially when other optimizations need to be made to make the site faster in general anyway (we're working on that).
We made a separate library (which should be mostly thread-safe, and if not, should be easy enough to convert) for requesting the rates, and we can separate out its configuration. So I was thinking of making a separate service with its own threads, perhaps in Twisted, and having the browser contact that service for JSON instead of having it run in the main Django server.
Is this solution a good one? Can you think of a better or simpler way to do it? Should I use something other than Twisted, and if so, why?
If you want to use your code in-process with Django, you can simply call out to your Twisted code by using Crochet, which can automatically manage the creation, running, and shutdown of the reactor within whatever WSGI implementation you choose (presuming that it behaves like a regular Python process, at least).
Obviously it might be less complex to just run within the Twisted WSGI container :-).
It might also be worth looking at treq for issuing your service client requests; your new "thread-safe" library will still have the disadvantage of tying up an entire thread for each blocking client, which is a non-trivial amount of memory and additional concurrency overhead, whereas with Twisted you will only need to worry about a couple of objects.
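A rough sketch of that combination, assuming hypothetical vendor URLs and a 10-second overall timeout:

import crochet
crochet.setup()                      # starts the Twisted reactor in a background thread

import treq
from twisted.internet import defer

VENDOR_URLS = [                      # hypothetical vendor endpoints
    "https://vendor-a.example.com/rates",
    "https://vendor-b.example.com/rates",
]

@crochet.wait_for(timeout=10.0)      # block the calling (Django) thread for at most 10 seconds
def fetch_all_rates():
    # issue all vendor requests concurrently on the reactor thread
    deferreds = [treq.get(url).addCallback(treq.json_content) for url in VENDOR_URLS]
    return defer.gatherResults(deferreds, consumeErrors=True)

# In a Django view: rates = fetch_all_rates(), then serialize to JSON for the browser.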

AWS and Python threading scalability

I have a service running on a local server, written using the Python threading library. Think of it as a kind of web crawler. It uses 50 threads. I want to deploy it on the Amazon Web Services cloud and scale it up so that it uses more threads.
Simply put, I have two queues: Qinput with URLs and Qoutput with page content. The threads pick URLs from Qinput, fetch the content of the web page, and put it into Qoutput.
Question: is it enough that I simply increase the number of threads to, say, 500, 5,000 or 50,000, and AWS + Python will handle it? Should I expect the service to run seamlessly, or are there some "standard" design pitfalls that I should be aware of when porting a multithreaded service to AWS?
I am aware of the Global Interpreter Lock, although it should not be an issue here, as the main task of the threads is to call outside the interpreter while crawling/scraping pages.
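The core of the service looks roughly like this (a simplified sketch; urllib stands in for the real fetching code):

import threading
import queue
import urllib.request   # stand-in for the real fetching code

NUM_THREADS = 50

Qinput = queue.Queue()    # URLs to crawl
Qoutput = queue.Queue()   # fetched page content

def worker():
    while True:
        url = Qinput.get()
        try:
            # the GIL is released during socket I/O, so threads overlap well here
            Qoutput.put((url, urllib.request.urlopen(url).read()))
        finally:
            Qinput.task_done()

for _ in range(NUM_THREADS):
    threading.Thread(target=worker, daemon=True).start()

for url in ["http://example.com/"]:   # placeholder URL list
    Qinput.put(url)
Qinput.join()                         # wait until every URL has been processed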
Any single instance has its limits. You will probably be able to spawn quite a lot of threads in your instance, especially if you choose one of the larger ones. But you will get diminishing returns on the additional threads, until adding more no longer gets you more performance.
However, if you want your system to scale beyond the limitations of a single instance, it is best to be able to run it on multiple instances. Then your decision is only operational, not technical. Since you are running in the AWS environment, which gives you access to almost endless operational resources, you should look into it.
You can also check out SQS, which is basically a distributed queue system. It will allow you to synchronize the work of as many instances as you need.
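For illustration, a sketch of the two-queue setup on top of SQS with boto3 (the queue names, region and the fetch stub are placeholders):

import urllib.request
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")                 # region is arbitrary
qinput_url = sqs.get_queue_url(QueueName="Qinput")["QueueUrl"]     # placeholder queue names
qoutput_url = sqs.get_queue_url(QueueName="Qoutput")["QueueUrl"]

def fetch(url):
    # placeholder for the existing crawler code
    return urllib.request.urlopen(url).read().decode("utf-8", "replace")

def process_batch():
    # long-poll for up to 10 URLs at a time
    resp = sqs.receive_message(QueueUrl=qinput_url,
                               MaxNumberOfMessages=10,
                               WaitTimeSeconds=20)
    for msg in resp.get("Messages", []):
        content = fetch(msg["Body"])
        sqs.send_message(QueueUrl=qoutput_url, MessageBody=content)
        # delete only after successful processing so another instance can retry on failure
        sqs.delete_message(QueueUrl=qinput_url, ReceiptHandle=msg["ReceiptHandle"])

Each EC2 instance can run the same loop, and SQS takes care of distributing the URLs between them.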

A good multithreaded python webserver?

I am looking for a Python webserver which is multithreaded instead of multi-process (as in the case of mod_python for Apache). I want it to be multithreaded because I want to have an in-memory object cache that will be used by various HTTP threads. My webserver does a lot of expensive stuff and computes some large arrays which need to be cached in memory for future use to avoid recomputing them. This is not possible in a multi-process web server environment. Storing this information in memcached is also not a good idea, as the arrays are large and storing them in memcached would require deserializing the data coming back from memcached, on top of the additional overhead of IPC.
I implemented a simple webserver using BaseHTTPServer; it gives good performance but it gets stuck after a few hours. I need a more mature webserver. Is it possible to configure Apache to use mod_python under a thread model so that I can do some object caching?
CherryPy. Features, as listed from the website:
A fast, HTTP/1.1-compliant, WSGI thread-pooled webserver. Typically, CherryPy itself takes only 1-2ms per page!
Support for any other WSGI-enabled webserver or adapter, including Apache, IIS, lighttpd, mod_python, FastCGI, SCGI, and mod_wsgi
Easy to run multiple HTTP servers (e.g. on multiple ports) at once
A powerful configuration system for developers and deployers alike
A flexible plugin system
Built-in tools for caching, encoding, sessions, authorization, static content, and many more
A native mod_python adapter
A complete test suite
Swappable and customizable...everything.
Built-in profiling, coverage, and testing support.
Consider reconsidering your design. Maintaining that much state in your webserver is probably a bad idea. Multi-process is a much better way to go for stability.
Is there another way to share state between separate processes? What about a service? Database? Index?
It seems unlikely that maintaining a huge array of data in memory and relying on a single multi-threaded process to serve all your requests is the best design or architecture for your app.
Twisted can serve as such a web server. While not multithreaded itself, there is a (not yet released) multithreaded WSGI container present in the current trunk. You can check out the SVN repository and then run:
twistd web --wsgi=your.wsgi.application
It's hard to give a definitive answer without knowing what kind of site you are working on and what kind of load you are expecting. Sub-second performance may be a serious requirement or it may not. If you really need to save that last millisecond then you absolutely need to keep your arrays in memory. However, as others have suggested, it is more than likely that you don't and could get by with something else.
Your usage pattern of the data in the array may affect what choices you make. You probably don't need access to the entire set of data from the array all at once, so you could break your data up into smaller chunks and put those chunks in the cache instead of one big lump. Depending on how often your array data needs to be updated, you might choose between memcached, a local DB (Berkeley DB, SQLite, a small MySQL installation, etc.) or a remote DB. I'd say memcached for fairly frequent updates, a local DB for something on the order of hourly, and a remote DB for daily.
One thing to consider is also what happens after a cache miss. If 50 clients all of a sudden get a cache miss and all of them decide at the same time to start regenerating those expensive arrays, your box(es) will quickly be reduced to 8086s. So you have to take into consideration how you will handle that. Many articles out there cover how to recover from cache misses. Hope this is helpful.
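To illustrate the chunking idea, a sketch using the python-memcached client (the chunk keys, the one-hour TTL and the compute stub are placeholders):

import memcache

mc = memcache.Client(["127.0.0.1:11211"])    # assumes a local memcached instance

def expensive_compute(dataset, chunk_id):
    # placeholder for the real array computation
    return list(range(chunk_id * 1000, (chunk_id + 1) * 1000))

def get_chunk(dataset, chunk_id):
    """Return one chunk of the big array, regenerating it only on a cache miss."""
    key = "%s:chunk:%d" % (dataset, chunk_id)
    chunk = mc.get(key)
    if chunk is None:                        # miss: rebuild just this piece
        chunk = expensive_compute(dataset, chunk_id)
        mc.set(key, chunk, time=3600)        # keep it for an hour
    return chunk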
Not multithreaded, but Twisted might serve your needs.
You could instead use a distributed cache that is accessible from each process, memcached being the example that springs to mind.
web.py has made me happy in the past. Consider checking it out.
But it does sound like an architectural redesign might be the proper, though more expensive, solution.
Perhaps you have a problem with your implementation in Python using BaseHTTPServer. There's no reason for it to "get stuck", and implementing a simple threaded server using BaseHTTPServer and threading shouldn't be difficult.
Also, see http://pymotw.com/2/BaseHTTPServer/index.html#module-BaseHTTPServer about implementing a simple multi-threaded server with HTTPServer and ThreadingMixIn
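A minimal sketch of that approach (Python 3 module names; in Python 2, as in the link above, the same classes come from BaseHTTPServer and SocketServer):

from http.server import HTTPServer, BaseHTTPRequestHandler
from socketserver import ThreadingMixIn

CACHE = {}   # shared in-process cache, visible to every handler thread

class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    """Handle each request in its own thread."""

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # compute once per path, then serve from the in-memory cache
        body = CACHE.setdefault(self.path, b"computed once, then cached")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    ThreadedHTTPServer(("127.0.0.1", 8080), Handler).serve_forever()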
I use CherryPy both personally and professionally, and I'm extremely happy with it. I even do the kinds of thing you're describing, such as having global object caches, running other threads in the background, etc. And it integrates well with Apache; simply run CherryPy as a standalone server bound to localhost, then use Apache's mod_proxy and mod_rewrite to have Apache transparently forward your requests to CherryPy.
The CherryPy website is http://cherrypy.org/
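A minimal sketch of that setup (the cached computation, port and proxy details are placeholders):

import cherrypy

EXPENSIVE_CACHE = {}   # module-level cache shared by all of CherryPy's worker threads

class Root(object):
    @cherrypy.expose
    def data(self, key="default"):
        if key not in EXPENSIVE_CACHE:
            # placeholder for the expensive array computation
            EXPENSIVE_CACHE[key] = ",".join(str(i * i) for i in range(1000))
        return EXPENSIVE_CACHE[key]

if __name__ == "__main__":
    # bind to localhost only; Apache's mod_proxy forwards requests here
    cherrypy.config.update({"server.socket_host": "127.0.0.1",
                            "server.socket_port": 8080})
    cherrypy.quickstart(Root())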
I actually had the same issue recently. Namely: we wrote a simple server using BaseHTTPServer and found that the fact that it's not multi-threaded was a big drawback.
My solution was to port the server to Pylons (http://pylonshq.com/). The port was fairly easy, and one benefit was that it's very easy to create a GUI using Pylons, so I was able to throw a status page on top of what's basically a daemon process.
I would summarize Pylons this way:
it's similar to Ruby on Rails in that it aims to make it very easy to deploy web apps
its default templating language, Mako, is very nice to work with
it uses a system of routing URLs that's very convenient
for us performance is not an issue, so I can't guarantee that Pylons would perform adequately for your needs
you can use it with Apache & lighttpd, though I've not tried this
We also run an app with Twisted and are happy with it. Twisted has good performance, but I find Twisted's single-threaded/defer-to-thread programming model fairly complicated. It has lots of advantages, but would not be my choice for a simple app.
Good luck.
Just to point out something different from the usual suspects...
Some years ago, while I was using Zope 2.x, I read about Medusa, as it was the web server used for the platform. They advertised it as working well under heavy load, and it can provide the functionality you asked for.
