Why are my DynamoDB requests via boto's get_item so slow, and why are they so frequently very slow? The AWS console reports that my get latency has hit a high of 12.5ms, yet none of my requests come anywhere near that low.
Python 2.7.5
AWS region us-west-1
boto 2.31.1
dynamodb table size ~180k records
Code:
from boto.dynamodb2.fields import HashKey
from boto.dynamodb2.table import Table
from boto.dynamodb2.types import STRING
import boto.dynamodb2
import time
REGION = "us-west-1"
AWS_KEY = "xxxxx"
AWS_SECRET = "xxxxx"
start = time.time()
peeps = ("cefbdadf518f44da8a68e35b2321bb1f", "7e3a691df6134a4f83d381a5507cbb18")
connection = boto.dynamodb2.connect_to_region(REGION, aws_access_key_id=AWS_KEY, aws_secret_access_key=AWS_SECRET)
users = Table("users-test", schema=[HashKey("id", data_type=STRING)], connection=connection)
for peep in peeps:
    user = users.get_item(consistent=True, id=peep)
    print time.time() - start
Results:
(botot)➜ ~ python test2.py
0.056941986084
0.0681240558624
(botot)➜ ~ python test2.py
1.05709600449
1.06937909126
(botot)➜ ~ python test2.py
0.048614025116
0.0575139522552
(botot)➜ ~ python test2.py
0.0553398132324
0.064425945282
(botot)➜ ~ python test2.py
3.05251288414
3.06584000587
(botot)➜ ~ python test2.py
0.0579640865326
0.0699849128723
(botot)➜ ~ python test2.py
0.0530469417572
0.0628390312195
(botot)➜ ~ python test2.py
1.05059504509
1.05963993073
(botot)➜ ~ python test2.py
1.05139684677
1.0603158474
update 2014-07-11 08:03 PST
The actual use case is looking up a user for each web request. As @gamaat said, the cost with DynamoDB is on the first lookup, because that's when the HTTPS connection is made. So it seems that if I can store the DynamoDB connection between requests and reuse it, things would go faster. I used werkzeug.contrib.cache.FileSystemCache to store the connection, but it never seems to actually store the connection for retrieval. Other values get stored fine, just not this connection object. Any ideas? And if this is not a good way to store the connection between requests, then what is?
update 2014-07-11 15:30 PST
Since I'm using supervisor and uwsgi to manage my Flask app, the real problem seems to be how to share the connection object between requests in my Flask app.
The solution that appears to yield better response times (before, the average response time was ~500ms; after, it is ~50ms) was to do two things:
1) put the Boto DynamoDB connection object in default_settings.py so that it gets loaded once into app.config["DYNDB_CONN"] per application load; and
2) configure uwsgi with a cheaper value of num_processes - 1 and a cheaper-initial value of num_processes - 1. This tells uwsgi to always have num_processes - 1 uwsgi processes running, with the option of starting up one more process if load requires it.
I did this to minimize the number of uwsgi processes that would restart and therefore create a new Boto DynamoDB connection object (incurring HTTP connection setup costs).
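For reference, rough sketches of what (1) and (2) look like; the module/file names and process counts below are illustrative, not my exact setup:
# default_settings.py -- illustrative module name
import boto.dynamodb2

# Created once per application load, then reused by every request handled
# by that uwsgi process.
DYNDB_CONN = boto.dynamodb2.connect_to_region(
    "us-west-1",
    aws_access_key_id="xxxxx",
    aws_secret_access_key="xxxxx",
)

# app/__init__.py -- wherever the Flask app object is created
from flask import Flask

app = Flask(__name__)
app.config.from_object("default_settings")

# A view then builds its Table against the already-open connection:
# users = Table("users-test", schema=[HashKey("id", data_type=STRING)],
#               connection=app.config["DYNDB_CONN"])

# uwsgi ini additions for (2), assuming 8 worker processes:
# processes = 8
# cheaper = 7
# cheaper-initial = 7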
I am having trouble getting a Python CGI script to execute properly on an apache2 webserver that is running on a virtual Ubuntu 18 server. The hosting provider is DreamCompute, if that matters. I have gotten CGI scripts working under var/www/html/apps/cgi-bin/. Both helloworld.py and helloworld.pl execute fine both in the SSH terminal and in browser (Microsoft Edge).
The script giving me trouble is a Python script that accesses a MySQL database, reads data from it, then uses that data to fill some lists and generate some simple output at random (a magical spell with random effects for a tabletop RPG).
The spell generator script also executes fine over SSH, but when I try to view it in a browser it throws a 500 internal server error. The Apache error logs tell me that the problem is "End of script output before headers". I looked around, but couldn't find anyone with a similar problem and configuration.
EDIT: The full entry in the error log is:
[Tue Apr 20 00:20:25.324101 2021] [cgid:error] [pid 17275:tid 140105176721152] [client 2607:fea8:1d41:b800:7dca:305:b11a:447f:62505] End of script output before headers: spell_generator.py, referer: http://my.website/apps/cgi-bin/
After adding and removing parts of the Python script to see how it behaves in a browser, I believe I have isolated the problem: the part of the script that connects to the MySQL database. That database is also hosted on the same Ubuntu virtual machine, and it definitely has the right kind of data in it (just strings and IDs; nothing fancy).
Here's the Python code. I've removed some documentation comments but it should be pretty straightforward:
#!/usr/bin/python3
print("Content-type: text/html\n\n");
# The above two lines are boilerplate to ensure no print statements cause
# problems later in the program.
# Import statements
import cgi
import cgitb
cgitb.enable();
from random import choice
import mysql.connector
import sys
from enum import Enum
# Initialize enums for system comparison
class Ruleset(Enum):
    GENERIC = "GENERIC"
    GENERIC_DB = "generic_attributes"
    DND_5 = "DND5e"
    DND_5_DB = "dnd5e_attributes"
    PATHFINDER_1 = "PF1e"
    PATHFINDER_1_DB = "pathfinder1e_attributes"
# Determine what system to use, generic by default
spellSystem = Ruleset.GENERIC.value;
if (len(sys.argv)) > 1:
    if (sys.argv[1] == Ruleset.DND_5.value):
        spellSystem = Ruleset.DND_5.value;
    if (sys.argv[1] == Ruleset.PATHFINDER_1.value):
        spellSystem = Ruleset.PATHFINDER_1.value;
# === PROBLEM HERE ===
# Initialize SQL cursor, defaulting to generic
if spellSystem == Ruleset.DND_5.value:
    spellDB = mysql.connector.connect(
        host = "localhost",
        user = "RemoteUser",
        password = "password",
        database = Ruleset.DND_5_DB.value
    )
elif spellSystem == Ruleset.PATHFINDER_1.value:
    spellDB = mysql.connector.connect(
        host = "localhost",
        user = "RemoteUser",
        password = "password",
        database = Ruleset.PATHFINDER_1_DB.value
    )
else:
    spellDB = mysql.connector.connect(
        host = "localhost",
        user = "RemoteUser",
        password = "password",
        database = Ruleset.GENERIC_DB.value
    )
spellCursor = spellDB.cursor();
spellCursor.execute("SELECT ElementName FROM Element");
listHolder = spellCursor.fetchall();
# === CODE BELOW DOES NOT CAUSE PROBLEMS ===
#
# [logic that uses the SQL data]
#
# Output HTML page
print("<html> <head> <title>TEST - Magic Spell Generator</title> <link rel='stylesheet' href='../../css/style.css'> </head>");
print("<body> body contents");
print("</body>");
print("</html>");
The RemoteUser is a user account on the SQL server, which is used by "external" (non-root) programs to access the databases. password in the actual deployed script is a cryptographically secure password.
I'm not a Python expert, but the code runs with no problems when I execute it from the SSH terminal, so I don't think that bad code is to blame (although I could certainly be wrong). Here's a list of things I've tried already:
Checking for too many active SQL connections (there appears to be only one, the root user).
Making sure the script file has the correct privileges (chmod 755, same as the rest).
Making sure the necessary Python modules are installed (they are, and the Python-MySQL connector is the up to date one that works with the version of Python I'm using).
Restarting apache2.
I've spent most of today trying to find an answer. Any help or potential leads are welcome.
Turns out the problem wasn't actually the script trying to access the SQL database, which I figured out when I ran it properly in the SSH terminal (./spell_generator.py instead of python3 spell_generator.py). It crashed due to segmentation fault (core dumped), which implied that it was a code problem rather than a configuration problem.
After a bit of searching, I found Database connectivity through MySQL connector Python with CGI is not working, which pointed me to the real culprit: I was importing modules in the wrong order. I changed the import statements so that import mysql.connector is now the first one, and the script runs perfectly.
It shouldn't make a difference, but it does. I'm just happy.
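For reference, the top of the working script now looks roughly like this; only the import order changed, everything after the imports is as posted above:
#!/usr/bin/python3
# mysql.connector imported first -- moving it ahead of cgi/cgitb is what
# stopped the segmentation fault in my case.
import mysql.connector

print("Content-type: text/html\n\n");

import cgi
import cgitb
cgitb.enable();
from random import choice
import sys
from enum import Enum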
I am new to the Locust load testing framework and in the process of migrating my existing Azure cloud-based performance testing C# scripts to Locust's Python-based scripts. Our team has almost completed the migration, but during our load tests we get the errors below, after which the machine fails to create new requests, either due to high CPU utilization or because of the many exceptions raised in Locust. We are running Locust in web-based mode; details are below. These scripts work fine at smaller loads of 50 to 100 users.
"Error 1 -('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))"
"Error 2 : Connection pool is full, discarding connection"
"** **Error 3 :urllib3.exceptions.NewConnectionError: : Failed to establish a new connection: [Errno 110] Connection timed out****"
Yes, we are using urllib3 in our utility classes, but the first two errors do seem to come from Locust.
Our load testing configuration is: 3500 users at a hatch rate of 5 users per second, running natively (no Docker container) on an 8-core, 16 GB Ubuntu Linux virtual machine on Azure, with ulimit set to 50,000.
Please help us with your thoughts
Sample test is as below
import os
import sys
sys.path.append(os.environ.get('WORKDIR', os.getcwd()))
from locust import HttpLocust, TaskSet, task
from locust.wait_time import between
class ContactUsBehavior(TaskSet):
    # AppUtil comes from the project's own utility module (not shown here)
    wait_time = AppUtil.get_wait_time_function(2)
    @task(1)
    def post_load_test_contact(self):
        data = { "ContactName" : "Mane"
               , "Email" : "someone@someone.com"
               , "EmailVerifaction" : "someone@someone.com"
               , "TelephoneContact" : ""
               , "PhoneNumber" : ""
               , "ContactReason" : "Other"
               , "OtherComment" : "TEST Comments 2019-12-30"
               , "Agree" : "true"
               }
        self.client.post("app/contactform", self.client, 'Contact us submission', post_data = data)
class UnauthenticatedUser(HttpLocust):
    task_set = ContactUsBehavior
    # host is override-able
    host = 'https://app.devurl.com/'
Locust's default HTTP client uses python-requests, which internally uses urllib3.
If you are running large-scale tests, you should consider another HTTP client. The connection pool of urllib3 (PoolManager) will reuse connections and limit how many connections are allowed per host at any given time, to avoid accumulating too many unused sockets.
So one option is to tweak the pool: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#customizing-pool-behavior
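If you stay on the default requests-based client, one way to tweak the pool is to mount a larger HTTPAdapter on the session. A sketch, adding an on_start to the ContactUsBehavior TaskSet from your sample, assuming the stock HttpSession (a requests.Session subclass) and illustrative pool sizes:
from requests.adapters import HTTPAdapter

class ContactUsBehavior(TaskSet):
    def on_start(self):
        # Raise the per-host connection pool so concurrent greenlets don't
        # exhaust it and log "Connection pool is full, discarding connection".
        adapter = HTTPAdapter(pool_connections=10, pool_maxsize=100)
        self.client.mount("https://", adapter)
        self.client.mount("http://", adapter)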
Or you can try another high-performance HTTP client, e.g. geventhttpclient.
Locust also provides a built-in client which is faster than the default python-requests one:
https://docs.locust.io/en/stable/increase-performance.html
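With the HttpLocust-era API you appear to be on, switching the sample test to that client is roughly the following (newer Locust releases renamed it to FastHttpUser, so the import path depends on your version):
from locust.contrib.fasthttp import FastHttpLocust

class UnauthenticatedUser(FastHttpLocust):
    task_set = ContactUsBehavior
    host = 'https://app.devurl.com/'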
You should also consider running Locust in distributed mode across several nodes if a single client machine still can't handle the load.
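With the 0.x command line this looks roughly like the following on the master and on each worker node (newer releases renamed --slave to --worker):
locust -f locustfile.py --master
locust -f locustfile.py --slave --master-host=<master-ip>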
I am using a single-node Cassandra and I intend to run some queries in order to check the response time. For some queries, after 10s of execution I get the following error:
OperationTimedOut: errors = {}, last_host = 127.0.0.1
So I ran the following command:
sudo gedit /usr/bin/cqlsh.py
And changed cqlsh.py file:
# cqlsh should run correctly when run out of a Cassandra source tree,
# out of an unpacked Cassandra tarball, and after a proper package install.
cqlshlibdir = os.path.join(CASSANDRA_PATH, 'pylib')
if os.path.isdir(cqlshlibdir):
    sys.path.insert(0, cqlshlibdir)
from cqlshlib import cql3handling, cqlhandling, pylexotron, sslhandling
from cqlshlib.displaying import (ANSI_RESET, BLUE, COLUMN_NAME_COLORS, CYAN,
                                 RED, FormattedValue, colorme)
from cqlshlib.formatting import (DEFAULT_DATE_FORMAT, DEFAULT_NANOTIME_FORMAT,
                                 DEFAULT_TIMESTAMP_FORMAT, DateTimeFormat,
                                 format_by_type, format_value_utype,
                                 formatter_for)
from cqlshlib.tracing import print_trace, print_trace_session
from cqlshlib.util import get_file_encoding_bomsize, trim_if_present
DEFAULT_HOST = '127.0.0.1'
DEFAULT_PORT = 9042
DEFAULT_CQLVER = '3.3.1'
DEFAULT_PROTOCOL_VERSION = 4
DEFAULT_CONNECT_TIMEOUT_SECONDS = 240
DEFAULT_FLOAT_PRECISION = 5
DEFAULT_MAX_TRACE_WAIT = 300
However, when I try to run the query again, cqlsh returns the same error after 10s:
OperationTimedOut: errors = {}, last_host = 127.0.0.1
What do I have to do so that the query does not time out?
The latest versions of Cassandra allow you to specify the cqlsh request timeout when you invoke it, instead of having to edit your cqlshrc file.
cqlsh --request-timeout <your-timeout>
Are you executing these queries in cqlsh?
If so, you are hitting the client request timeout (not the connect timeout, nor the server-side read request timeout).
You can change the default timeout by setting one in ~/.cassandra/cqlshrc:
[connection]
client_timeout = 20
# Can also be set to None to disable:
# client_timeout = None
See https://issues.apache.org/jira/browse/CASSANDRA-7516 for more detail.
I see from another comment you are already aware of paging. This will be the best approach because it does not require you to marshal the entire result set in memory at the data and app tiers.
You'll see a handful of responses telling you how to raise the various timeouts, but the real answer is that you almost never want to raise those timeouts, because if you have a real data set, you will kill your server (or drop requests/mutations) with lots of long-running queries. You are better off using paging and more short-running queries than huge, long running queries.
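If the app tier uses the DataStax Python driver, paging is just a matter of setting a fetch_size. A rough sketch, where the keyspace, table, and row handler are placeholders:
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('my_keyspace')            # placeholder keyspace

# Pull rows 500 at a time instead of marshalling the whole result set at once.
query = SimpleStatement("SELECT * FROM my_table", fetch_size=500)
for row in session.execute(query):                  # pages are fetched transparently
    handle(row)                                     # placeholder per-row handler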
You have to change the read_request_timeout_in_ms parameter in the cassandra.yaml file, and then restart Cassandra.
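For example (the value is illustrative, and raising it server-wide affects every client, so see the caveats in the other answer):
# cassandra.yaml
read_request_timeout_in_ms: 20000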
I have a question regarding the performance of my Flask app after I incorporated uwsgi and nginx.
My app.view file looks like this:
import app.lib.test_case as test_case
from app import app
import time
@app.route('/<int:test_number>')
def test_case_match(test_number):
    rubbish = test_case.test(test_number)
    return "rubbish World!"
My app.lib.test_case file looks like this:
import time
def test_case(test_number):
    time.sleep(30)
    return None
And my config.ini for my uwsgi looks like this:
[uwsgi]
socket = 127.0.0.1:8080
chdir = /home/ubuntu/test
module = app:app
master = true
processes = 2
daemonize = /tmp/uwsgi_daemonize.log
pidfile = /tmp/process_pid.pid
Now if I run this test case purely through the Flask framework, without switching on uwsgi + nginx, the ab benchmark reports a response in 31 seconds, which is expected owing to the sleep function. What I don't get is that when I run the app through uwsgi + nginx, the response time I get is 38 seconds, which is an overhead of around 25%. Can anyone enlighten me?
time.sleep() is not time-safe.
From the documentation of time.sleep(secs):
[…] Also, the suspension time may be longer than requested by an arbitrary amount because of the scheduling of other activity in the system.
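If you want to see how far the actual suspension drifts from the requested one on your box, a quick standalone check (independent of Flask/uwsgi) is:
import time

requested = 30.0
start = time.time()
time.sleep(requested)
elapsed = time.time() - start
# The overshoot (elapsed - requested) grows with scheduling pressure from
# concurrent workers and the benchmark itself.
print("requested %.1fs, actually slept %.3fs" % (requested, elapsed))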
I am working on a web service with Twisted that is responsible for calling up several packages I had previously used on the command line. The routines these packages handle were being prototyped on their own but now are ready to be integrated into our webservice.
In short, I have several different modules that all create a mysql connection property internally in their original command line forms. Take this for example:
class searcher:
    def __init__(self,lat,lon,radius):
        self.conn = getConnection()[1]
        self.con=self.conn.cursor();
        self.mgo = getConnection(True)
        self.lat = lat
        self.lon = lon
        self.radius = radius
        self.profsinrange()
        self.cache = memcache.Client(["173.220.194.84:11211"])
The getConnection function is just a helper that returns a mongo or mysql cursor respectively. Again, this is all prototypical :)
The problem I am experiencing is that, when this is run as a continuously running server using Twisted's WSGI resource, the SQL connection created in __init__ times out, and subsequent requests don't seem to regenerate it. Example code for the small server app:
from twisted.web import server
from twisted.web.wsgi import WSGIResource
from twisted.python.threadpool import ThreadPool
from twisted.internet import reactor
from twisted.application import service, strports
import cgi
import gnengine
import nn
wsgiThreadPool = ThreadPool()
wsgiThreadPool.start()
# ensuring that it will be stopped when the reactor shuts down
reactor.addSystemEventTrigger('after', 'shutdown', wsgiThreadPool.stop)
def application(environ, start_response):
    start_response('200 OK', [('Content-type','text/plain')])
    params = cgi.parse_qs(environ['QUERY_STRING'])
    try:
        lat = float(params['lat'][0])
        lon = float(params['lon'][0])
        radius = int(params['radius'][0])
        query_terms = params['query']
        s = gnengine.searcher(lat,lon,radius)
        query_terms = ' '.join( query_terms )
        json = s.query(query_terms)
        return [json]
    except Exception, e:
        return [str(e),str(params)]
    return ['error']
wsgiAppAsResource = WSGIResource(reactor, wsgiThreadPool, application)
# Hooks for twistd
application = service.Application('Twisted.web.wsgi Hello World Example')
server = strports.service('tcp:8080', server.Site(wsgiAppAsResource))
server.setServiceParent(application)
The first few requests work fine, but after MySQL's wait_timeout expires, the dreaded error 2006 "MySQL server has gone away" surfaces. It had been my understanding that every request to the Twisted WSGI resource would run the application function, thereby regenerating the searcher object and re-leasing the connection. If this isn't the case, how can I make the requests be processed that way? Is this kind of Twisted deployment not transactional in this sense? Thanks!
EDIT: Per request, here is the prototype helper function calling up the connection:
def getConnection(mong = False):
    if mong == False:
        connection = mysql.connect(host = db_host,
                                   user = db_user,
                                   passwd = db_pass,
                                   db = db,
                                   cursorclass=mysql.cursors.DictCursor)
        cur = connection.cursor();
        return (cur,connection)
    else:
        return pymongo.Connection('173.220.194.84',27017).gonation_test
I was developing a piece of software with Twisted where I had to utilize a constant MySQL database connection. I ran into this problem, and despite digging through the Twisted documentation extensively and posting a few questions, I was unable to find a proper solution. There is a boolean parameter you can pass when you are instantiating the adbapi.ConnectionPool class; however it never seemed to work and I kept getting the error regardless. What I am guessing the reconnect boolean represents is the destruction of the connection object when a SQL disconnect does occur.
adbapi.ConnectionPool("MySQLdb", cp_reconnect=True, host="", user="", passwd="", db="")
I have not tested this, but I will re-post some results when I do; if anyone else has, please share.
When I was developing the script I was using Twisted 8.2.0 (I haven't touched Twisted in a while), and back then the framework had no such explicit keep-alive method, so I developed a ping/keep-alive extension employing the event-driven paradigm Twisted builds upon, in conjunction with the MySQLdb module's ping() method (see code comment).
As I was typing this response, however, I did look around the current Twisted documentation and was still unable to find an explicit keep-alive method or parameter. My guess is that this is because Twisted itself does not have database connectivity libraries/classes. It uses the methods available to Python and provides an indirect layer of interfacing with those modules, with some exposure for direct calls to the database library being used. This is accomplished by using the adbapi.runWithConnection method.
Here is the module I wrote under Twisted 8.2.0 and Python 2.6; you can set the intervals between pings. What the script does is ping the database every 20 minutes, and if that fails, it attempts to reconnect every 60 seconds. I must warn that the script does NOT handle a sudden/dropped connection; you can handle that through addErrback whenever you run a query through Twisted, at least that's how I did it. I have noticed that whenever the database connection drops, you only find out when you execute a query and the event raises an errback, and at that point you deal with it. Basically, if I don't run a query for 10 minutes and my database disconnects me, my application will not notice in real time; it will only realize the connection has been dropped when it runs the query that follows, so the database could have disconnected us 1 minute after the first query, or 5, 9, etc.
I guess this sort of goes back to the original idea I stated: Twisted utilizes Python's own libraries or third-party libraries for database connectivity, and because of that, some things are handled a bit differently.
from twisted.enterprise import adbapi
from twisted.internet import reactor, defer, task
class sqlClass:
    def __init__(self, db_pointer):
        self.dbpool=db_pointer
        self.dbping = task.LoopingCall(self.dbping)
        self.dbping.start(1200) #20 minutes = 1200 seconds; i found out that if MySQL socket is idled for 20 minutes or longer, MySQL itself disconnects the session for security reasons; i do believe you can change that in the configuration of the database server itself but it may not be recommended.
        self.reconnect=False
        print "database ping initiated"
    def dbping(self):
        def ping(conn):
            conn.ping() #what happens here is that twisted allows us to access methods from the MySQLdb module that python possesses; i chose to use the native command instead of sending null commands to the database.
        pingdb=self.dbpool.runWithConnection(ping)
        pingdb.addCallback(self.dbactive)
        pingdb.addErrback(self.dbout)
        print "pinging database"
    def dbactive(self, data):
        if data==None and self.reconnect==True:
            self.dbping.stop()
            self.reconnect=False
            self.dbping.start(1200) #20 minutes = 1200 seconds
            print "Reconnected to database!"
        elif data==None:
            print "database is active"
    def dbout(self, deferr):
        #print deferr
        if self.reconnect==False:
            self.dbreconnect()
        elif self.reconnect==True:
            print "Unable to reconnect to database"
        print "unable to ping MySQL database!"
    def dbreconnect(self, *data):
        self.dbping.stop()
        self.reconnect=True
        #self.dbping = task.LoopingCall(self.dbping)
        self.dbping.start(60) #60
if __name__ == "__main__":
    db = sqlClass(adbapi.ConnectionPool("MySQLdb", cp_reconnect=True, host="", user="", passwd="", db=""))
    reactor.callLater(2, db.dbping)
    reactor.run()
let me know how it works out for you :)
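As a follow-up on the addErrback point above, here is a minimal sketch of attaching an errback to a query so a dropped connection surfaces where you can deal with it (the query and both handlers are placeholders):
def on_rows(rows):
    print "got %d rows" % len(rows)

def on_db_error(failure):
    print "query failed:", failure.getErrorMessage()
    # this is the place to trigger db.dbreconnect() or retry the query

d = db.dbpool.runQuery("SELECT 1")   # placeholder query
d.addCallback(on_rows)
d.addErrback(on_db_error)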