I have a CherryPy Webapp that I originally wrote using file based sessions. From time to time I store potentially large objects in the session, such as the results of running a report - I offer the option to download report results in a variety of formats, and I don't want to re-run the query when the user selects a download due to the potential of getting different data. While using file based sessions, this worked fine.
Now I am looking at the potential of bringing a second server online, and as such I need to be able to share session data between the servers, for which the memcached session storage type appears to be the most appropriate. I briefly looked at using a PostgreSQL storage type, but that option was VERY poorly documented and, from what I could find, may well be broken. So I implemented the memcached option.
Now, however, I am running into a problem where, when I try to save certain objects to the session, I get an "AssertionError: Session data for id xxx not set". I'm assuming that this is due to the object size exceeding some arbitrary limit set in the CherryPy session backend or memcached, but I don't really know since the exception doesn't tell me WHY it wasn't set. I have increased the object size limit in memcached to the maximum of 128MB to see if that helped, but it didn't - and that's probably not a safe option anyway.
So what's my solution here? Is there some way I can use the memcached session storage to store arbitrarily large objects? Do I need to "roll my own" DB-based solution (or the like) for these objects? Is the problem potentially NOT size related? Or is there another option I am missing?
I use MySQL for handling my CherryPy sessions. As long as the object is serializable (can be pickled) you can store it as a BLOB (binary large object) in MySQL. Here's the code you would want to use for MySQL session storage...
https://bitbucket-assetroot.s3.amazonaws.com/Lawouach/cherrypy/20111008/936/mysqlsession.py?Signature=gDmkOlAduvIZS4WHM2OVgh1WVuU%3D&Expires=1424822438&AWSAccessKeyId=0EMWEFSGA12Z1HF1TZ82
"""
MySQLdb session module for CherryPy by Ken Kinder <http://kenkinder.com/>
Version 0.3, Released June 24, 2000.
Copyright (c) 2008-2009, Ken Kinder
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of the Ken Kinder nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
"""
import MySQLdb
import cPickle as pickle
import cherrypy
import logging
import threading
__version__ = '0.2'
logger = logging.getLogger('Session')
class MySQLSession(cherrypy.lib.sessions.Session):
    ##
    ## These can be over-ridden by config file
    table_name = 'web_session'
    connect_arguments = {}

    SCHEMA = """create table if not exists %s (
        id varchar(40),
        data text,
        expiration_time timestamp
    ) ENGINE=InnoDB;"""

    ## Shared, thread-local connection handle
    _local = threading.local()

    def __init__(self, id=None, **kwargs):
        logger.debug('Initializing MySQLSession with %r' % kwargs)
        for k, v in kwargs.items():
            setattr(MySQLSession, k, v)

        self.db = self.get_db()
        self.cursor = self.db.cursor()
        super(MySQLSession, self).__init__(id, **kwargs)

    @classmethod
    def get_db(cls):
        ##
        ## Use thread-local connections. The local object must be shared
        ## (here, a class attribute), not created per call, or a new
        ## connection would be opened on every request.
        if hasattr(cls._local, 'db'):
            return cls._local.db
        else:
            logger.debug("Connecting to %r" % cls.connect_arguments)
            db = MySQLdb.connect(**cls.connect_arguments)
            cursor = db.cursor()
            cursor.execute(cls.SCHEMA % cls.table_name)
            db.commit()
            cls._local.db = db
            return db

    def _load(self):
        logger.debug('_load %r' % self)
        # Select session data from table
        self.cursor.execute('select data, expiration_time from %s '
                            'where id = %%s' % MySQLSession.table_name, (self.id,))
        row = self.cursor.fetchone()
        if row:
            (pickled_data, expiration_time) = row
            data = pickle.loads(pickled_data)
            return data, expiration_time
        else:
            return None

    def _save(self, expiration_time):
        logger.debug('_save %r' % self)
        pickled_data = pickle.dumps(self._data)
        self.cursor.execute('select count(*) from %s where id = %%s and expiration_time > now()' % MySQLSession.table_name, (self.id,))
        (count,) = self.cursor.fetchone()
        if count:
            self.cursor.execute('update %s set data = %%s, '
                                'expiration_time = %%s where id = %%s' % MySQLSession.table_name,
                                (pickled_data, expiration_time, self.id))
        else:
            self.cursor.execute('insert into %s (data, expiration_time, id) values (%%s, %%s, %%s)' % MySQLSession.table_name,
                                (pickled_data, expiration_time, self.id))
        self.db.commit()

    def acquire_lock(self):
        logger.debug('acquire_lock %r' % self)
        self.locked = True
        self.cursor.execute('select id from %s where id = %%s for update' % MySQLSession.table_name,
                            (self.id,))
        # No commit here: the row stays locked until release_lock() commits
        # and ends the transaction.

    def release_lock(self):
        logger.debug('release_lock %r' % self)
        self.locked = False
        self.db.commit()

    def clean_up(self):
        logger.debug('clean_up %r' % self)
        self.cursor.execute('delete from %s where expiration_time < now()' % MySQLSession.table_name)
        self.db.commit()

    def _delete(self):
        logger.debug('_delete %r' % self)
        self.cursor.execute('delete from %s where id=%%s' % MySQLSession.table_name, (self.id,))
        self.db.commit()

    def _exists(self):
        # Select session data from table
        self.cursor.execute('select count(*) from %s '
                            'where id = %%s and expiration_time > now()' % MySQLSession.table_name, (self.id,))
        (count,) = self.cursor.fetchone()
        logger.debug('_exists %r (%r)' % (self, bool(count)))
        return bool(count)

    def __del__(self):
        logger.debug('__del__ %r' % self)
        self.db.commit()
        self.db.close()
        self.db = None

    def __repr__(self):
        return '<MySQLSession %r>' % (self.id,)

cherrypy.lib.sessions.MysqlSession = MySQLSession
Then your webapp.py would look something like this...
from mysqlsession import MySQLSession
import cherrypy
import logging

logging.basicConfig(level=logging.DEBUG)

sessionInfo = {
    'tools.sessions.on': True,
    'tools.sessions.storage_type': "Mysql",
    'tools.sessions.connect_arguments': {'db': 'sessions'},
    'tools.sessions.table_name': 'session'
}
cherrypy.config.update(sessionInfo)

class HelloWorld:
    def index(self):
        v = cherrypy.session.get('v', 1)
        cherrypy.session['v'] = v + 1
        return "Hello world! %s" % v
    index.exposed = True

cherrypy.quickstart(HelloWorld())
If you need to put some object in there, do something like this...

import pickle

# Protocol 0 produces an ASCII pickle, which fits the text column used above.
pickledThing = pickle.dumps(YourObject.GetItems(), protocol=0)
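To round that out, a minimal sketch (assuming the session configuration above) of stashing the pickled report in the session and reading it back in a later request:

# Store the pickled report in the session; the MySQL backend persists it
# as part of the session row.
cherrypy.session['report'] = pickledThing

# Later, e.g. in the handler that serves the download:
report_items = pickle.loads(cherrypy.session['report'])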
Hope this helps!
Sounds like you want to store a reference to the object in memcache and then pull the object back when you need it, rather than relying on the session machinery to handle the loading/saving.
From what you have explained I can conclude that conceptually it isn't a good idea to mix user sessions and a cache. Sessions are mostly designed to hold the state of a user's identity, so they have security measures, locking (to avoid concurrent changes), and other aspects that a cache doesn't need. Session storage is also usually volatile. Thus if you mean to use sessions as a cache, you should understand how sessions really work and what the consequences are.
What I suggest you do is establish normal caching of the domain model that produces the report data, and keep the session for identity.
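As an illustration only, here is a rough sketch of that split, using python-memcached and a hypothetical run_report helper: the session keeps just a small key, while the report rows live in the cache (which could equally be a shared database table, given memcached's item-size limits):

import uuid
import memcache
import cherrypy

cache = memcache.Client(['127.0.0.1:11211'])

def get_report(**query_args):
    # The session holds only a small cache key; the rows live in the cache.
    report_id = cherrypy.session.get('report_id')
    rows = cache.get('report:%s' % report_id) if report_id else None
    if rows is None:
        rows = run_report(**query_args)             # hypothetical report query
        report_id = uuid.uuid4().hex
        cache.set('report:%s' % report_id, rows, time=3600)
        cherrypy.session['report_id'] = report_id
    return rows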
CherryPy details
The default CherryPy session implementation locks the session data. In the OLAP case your user likely won't be able to perform concurrent requests (opening another tab, for instance) until the report is complete. There is, however, an option of manual locking management.
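For instance, a minimal sketch of manual lock management using CherryPy's explicit locking mode (run_report_for and the 'user' session key are hypothetical), so a long-running handler only holds the session lock while it actually touches the session:

import cherrypy

cherrypy.config.update({
    'tools.sessions.on': True,
    'tools.sessions.locking': 'explicit',
})

class Report:
    def run(self):
        # Touch the session only under an explicit, short-lived lock.
        cherrypy.session.acquire_lock()
        try:
            user = cherrypy.session['user']
        finally:
            cherrypy.session.release_lock()
        # The long-running query proceeds without holding the session lock,
        # so the user's other requests are not blocked.
        return run_report_for(user)
    run.exposed = True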
PostgreSQL session storage is broken and may be removed in a future release.
Memcached session storage doesn't implement distributed locking, so make sure you use a consistent rule to route each user to the same server.
Related
I am trying to write my own custom transaction processor. I am writing it for a simple Account class:
class Account:
    def __init__(self, name, ac_number, balance):
        self.name = name
        self.ac_number = ac_number
        self.balance = balance
My TP works fine for a single account. Now I want to improve it to handle multiple accounts. To get a different state entry for each account number I changed the _get_account_address function. I am following @danintel's Cookiejar and XO_python projects, and I followed the XO code to derive the address:
AC_NAMESPACE = hashlib.sha512('account'.encode("utf-8")).hexdigest()[0:6]

def _make_account_address(name):
    return AC_NAMESPACE + \
        hashlib.sha512(name.encode('utf-8')).hexdigest()[:64]
_get_account_address works fine, but with _make_account_address the CLI shows the error:
Tried to set unauthorized address
My state code is:

import logging
import hashlib

from sawtooth_sdk.processor.exceptions import InternalError

LOGGER = logging.getLogger(__name__)

FAMILY_NAME = "account"
# TF prefix is the first 6 characters of SHA-512("account")
AC_NAMESPACE = hashlib.sha512('account'.encode("utf-8")).hexdigest()[0:6]

def _make_account_address(name):
    return AC_NAMESPACE + \
        hashlib.sha512(name.encode('utf-8')).hexdigest()[:64]

def _hash(data):
    '''Compute the SHA-512 hash and return the result as hex characters.'''
    return hashlib.sha512(data).hexdigest()

def _get_account_address(from_key):
    '''
    Return the address of an account object in the account TF.

    The address is the first 6 hex characters of the hash SHA-512(TF name),
    plus the first 64 hex characters of the hash SHA-512(public key).
    '''
    return _hash(FAMILY_NAME.encode('utf-8'))[0:6] + \
        _hash(from_key.encode('utf-8'))[0:64]

class Account:
    def __init__(self, name, ac_number, balance):
        self.name = name
        self.ac_number = ac_number
        self.balance = balance

class AccountState:
    def __init__(self, context):
        self._context = context

    def make_account(self, account_obj, from_key):
        '''Create (add) an account entry in state.'''
        account_address = _make_account_address(account_obj.name) # not working
        #account_address = _get_account_address(from_key) # working fine
        LOGGER.info('Got the key %s and the account address %s.',
                    from_key, account_address)

        state_str = ",".join([str(account_obj.name), str(account_obj.ac_number), str(account_obj.balance)])
        state_data = state_str.encode('utf-8')
        addresses = self._context.set_state({account_address: state_data})

        if len(addresses) < 1:
            raise InternalError("State Error")
This has probably been answered already, but I have too little reputation to add a comment.
The error you see, "Tried to set unauthorized address", occurs because the client did not include these addresses in the TransactionHeader's "outputs" addresses field.
It is possible for the client to give a prefix instead of a complete address in the "outputs" addresses field, but use this feature cautiously because it will impact parallel transaction scheduling.
Please refer to https://sawtooth.hyperledger.org/docs/core/nightly/master/architecture/transactions_and_batches.html#dependencies-and-input-output-addresses for a detailed understanding of the different fields when composing a TransactionHeader.
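For illustration, a rough client-side sketch (assuming the protobuf classes from the Sawtooth Python SDK, plus pre-existing signer_public_key and payload variables) of listing the account address in both inputs and outputs:

import hashlib
from sawtooth_sdk.protobuf.transaction_pb2 import TransactionHeader

account_address = _make_account_address('Alice')   # same derivation the TP uses

header = TransactionHeader(
    family_name='account',
    family_version='1.0',
    # Declare every address the TP may read (inputs) or write (outputs):
    inputs=[account_address],
    outputs=[account_address],
    signer_public_key=signer_public_key,    # assumed: hex public key of the signer
    batcher_public_key=signer_public_key,
    dependencies=[],
    payload_sha512=hashlib.sha512(payload).hexdigest(),  # assumed: payload bytes
    nonce='',
).SerializeToString()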
It means the transaction processor tried to set (put) a value at an address not in the list of outputs. This occurs when a client submits a transaction with an inaccurate list of inputs/outputs.
Make sure the Sawtooth address is the correct length--an address is 70 hex characters, which represent a 35-byte address (including the 6-hex-character, or 3-byte, transaction family prefix).
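A throwaway sanity check along those lines (just an illustration, not part of the SDK):

import re

def is_valid_state_address(addr):
    # 70 lowercase hex characters: 6-char family prefix + 64-char remainder
    return re.fullmatch(r'[0-9a-f]{70}', addr) is not None

assert is_valid_state_address(_make_account_address('Alice'))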
Also, you can set the outputs list to empty--that will allow all addresses to be written, at the expense of safety and efficiency. It is better to set the inputs and outputs to the state addresses you are actually changing--that allows transactions to be run in parallel (if you run sawtooth-validator --scheduler parallel -vv) and is safer and more secure, as the transaction processor cannot write to state addresses outside the list.
I had this issue as well. I realized that I had different prefixes in my addresses. Make sure they match!
I am developing a Python application based on Flask that connects to a PostgreSQL database and exposes an API using flasgger (Swagger UI).
I have already defined a basic API (handling entries by ID, etc.) as well as a query API to match different parameters (name == 'John Doe', for example).
I would like to expand this query API to support more complex queries such as lower than, higher than, between, contains, etc.
I searched the internet but couldn't find a proper way to do it. Any suggestions?
I found this article, which was useful but does not say anything about the implementation of the queries: https://hackernoon.com/restful-api-designing-guidelines-the-best-practices-60e1d954e7c9
Here is briefly what it looks like so far (some extracted code):
GET_query.yml:
Return an account information
---
tags:
  - accounts
parameters:
  - name: name
    in: query
    type: string
    example: John Doe
  - name: number
    in: query
    type: string
    example: X
  - name: opened
    in: query
    type: boolean
    example: False
  - name: highlighted
    in: query
    type: boolean
    example: False
  - name: date_opened
    in: query
    type: Date
    example: 2018-01-01
Blueprint definition:
ACCOUNTS_BLUEPRINT = Blueprint('accounts', __name__)

Api(ACCOUNTS_BLUEPRINT).add_resource(
    AccountQueryResource,
    '/accounts/<query>',
    endpoint='accountq'
)
Api(ACCOUNTS_BLUEPRINT).add_resource(
    AccountResource,
    '/accounts/<int:id>',
    endpoint='account'
)
Api(ACCOUNTS_BLUEPRINT).add_resource(
    AccountListResource,
    '/accounts',
    endpoint='accounts'
)
Resource:
from flasgger import swag_from
from urllib import parse
from flask_restful import Resource
from flask_restful.reqparse import Argument
from flask import request as req

...

class AccountQueryResource(Resource):
    """ Verbs relative to the accounts """

    @staticmethod
    @swag_from('../swagger/accounts/GET_query.yml')
    def get(query):
        """ Handle complex queries """
        logger.debug('Recv %s:%s from %s', req.url, req.data, req.remote_addr)
        query = dict(parse.parse_qsl(parse.urlsplit(req.url).query))
        logger.debug('Get query: {}'.format(query))
        try:
            account = AccountRepository.filter(**query)
        except Exception as e:
            logger.error(e)
            return {'error': '{}'.format(e)}, 409
        if account:
            result = AccountSchema(many=True).dump(account)
            logger.debug('Get query returns: {}({})'.format(account, result))
            return {'account': result}, 200
        logger.debug('Get query returns: {}'.format(account))
        return {'message': 'No account corresponds to {}'.format(query)}, 404
And finally the repository:
class AccountRepository:
    """ The repository for the account model """

    @staticmethod
    def get(id):
        """ Query an account by ID """
        account = AccountModel.query.filter_by(id=id).first()
        logger.debug('Get ID %d: got:%s', id, account)
        return account

    @staticmethod
    def filter(**kwargs):
        """ Query an account """
        account = AccountModel.query.filter_by(**kwargs).all()
        logger.debug('Filter %s: found:%s', kwargs, account)
        return account

...
I don't know your exact problem, but I had a similar one, and I fixed it with:

query = []
if location:
    query.append(obj.location == location)

Then I run the accumulated list of conditions through:

obj.query.filter(*query).all()

where, in the above examples, obj is the name of a model you have created.

How does this help? It lets you fill in the variables dynamically, and each condition stands on its own; you can use ==, !=, <=, etc.

Note that you should use filter and not filter_by; then you can use as many operators as you like.

You can read link1 and link2 for documentation on how to query SQLAlchemy.
Edit:

name = request.args.get("name")
address = request.args.get("address")
age = request.args.get("age")

query = []
if name:
    query.append(Myobject.name == name)
if address:
    query.append(Myobject.address == address)
if age:
    query.append(Myobject.age >= age)  # look how we select people with age over the provided number!

query_result = Myobject.query.filter(*query).all()
The ifs help when the user provides no value for a field: those conditions are simply not included in the main query. Because request.args.get returns None for a missing value, the respective condition won't be added to the query list.
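Extending that idea to the original question's lower-than / higher-than / between / contains cases, here is a hedged sketch of a generic mapping from a query-string convention like field__op=value to SQLAlchemy operators. The __gte/__lte/__contains suffix convention is an assumption borrowed from Django-style filtering, not something Flask or flasgger provides; a between is simply a gte plus an lte on the same field:

from flask import request

# Assumed model: AccountModel with SQLAlchemy columns, as in the repository above.
OPS = {
    'eq':       lambda col, v: col == v,
    'ne':       lambda col, v: col != v,
    'lt':       lambda col, v: col < v,
    'lte':      lambda col, v: col <= v,
    'gt':       lambda col, v: col > v,
    'gte':      lambda col, v: col >= v,
    'contains': lambda col, v: col.contains(v),
}

def build_filters(model, args):
    """Turn e.g. ?balance__gte=100&name__contains=Doe into filter clauses."""
    clauses = []
    for key, value in args.items():
        field, _, op = key.partition('__')
        column = getattr(model, field, None)
        if column is not None:
            clauses.append(OPS.get(op or 'eq', OPS['eq'])(column, value))
    return clauses

# Usage inside a resource method:
# accounts = AccountModel.query.filter(*build_filters(AccountModel, request.args)).all()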
Currently my code is
client = boto3.client('sdb')
query = 'SELECT * FROM `%s` WHERE "%s" = "%s"' % (domain, key, value)
response = client.select(SelectExpression = query)
The variables key and value are input by the user; what is the best way to escape them in the code above?
Edit: my concern is how to escape these fields to prevent SQL injection, as we used to do with SQL databases, but now in SimpleDB.
Subselects and destructive operations can't be performed using SimpleDB.
Amazon provides quoting rules: http://docs.aws.amazon.com/AmazonSimpleDB/latest/DeveloperGuide/QuotingRulesSelect.html
You can apply these rules in Python using this function:

def quote(string):
    return string.replace("'", "''").replace('"', '""').replace('`', '``')

client = boto3.client('sdb')
query = 'SELECT * FROM `%s` WHERE "%s" = "%s"' % (quote(domain), quote(key), quote(value))
response = client.select(SelectExpression=query)
If you mean that the side effect of SQL injection could be deletion or destruction of data: SimpleDB's Select only supports querying data. If you want to protect against exposing data you don't want returned, check the AWS docs here.
Note: since the guide covers it well, I thought the link was enough.
I recently acquired an ACS linear actuator (Tolomatic stepper) that I am attempting to send data to from a Python application. The device itself communicates using the EtherNet/IP protocol.
I have installed the library cpppo via pip. When I issue a command in an attempt to read the status of the device, I get None back. Examining the communication with Wireshark, I see that it appears to proceed correctly, but I notice a response from the device indicating:
Service not supported.
Example of the code I am using to test reading an "Input Assembly":
from cpppo.server.enip import client

HOST = "192.168.1.100"
TAGS = ["#4/100/3"]

with client.connector(host=HOST) as conn:
    for index, descr, op, reply, status, value in conn.synchronous(
            operations=client.parse_operations(TAGS)):
        print(": %20s: %s" % (descr, value))
I am expecting to get an "Input Assembly" read, but it does not appear to be working that way. I imagine I am missing something, as this is the first time I have attempted EtherNet/IP communication. I am not sure how to proceed or what I am missing about EtherNet/IP that would make this work correctly.
clutton -- I'm the author of the cpppo module.
Sorry for the delayed response. We only recently implemented the ability to communicate with simple (non-routing) CIP devices. The ControlLogix/CompactLogix controllers implement an expanded set of EtherNet/IP CIP capability, something that most simple CIP devices do not. Furthermore, simple devices typically do not implement the *Logix "Read Tag" request; you have to struggle by with the basic "Get Attribute Single/All" requests -- which just return raw, 8-bit data. It is up to you to turn that back into a CIP REAL, INT, DINT, etc.
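For example, a minimal sketch of that last step, converting the raw little-endian bytes that a Get Attribute Single reply yields into typed CIP values with the standard struct module (the byte values here are made up for illustration):

import struct

raw_real = bytearray([0x00, 0x00, 0x80, 0x3f])   # CIP REAL: 4 bytes, little-endian float
raw_dint = bytearray([0xe8, 0x03, 0x00, 0x00])   # CIP DINT: 4 bytes, little-endian signed int

(real_value,) = struct.unpack('<f', bytes(raw_real))   # -> 1.0
(dint_value,) = struct.unpack('<i', bytes(raw_dint))   # -> 1000

print(real_value, dint_value)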
In order to communicate with your linear actuator, you will need to disable these enhanced encapsulations, and use "Get Attribute Single" requests. This is done by specifying an empty route_path=[] and send_path='', when you parse your operations, and to use cpppo.server.enip.getattr's attribute_operations (instead of cpppo.server.enip.client's parse_operations):
from cpppo.server.enip import client
from cpppo.server.enip.getattr import attribute_operations

HOST = "192.168.1.100"
TAGS = ["#4/100/3"]

with client.connector(host=HOST) as conn:
    for index, descr, op, reply, status, value in conn.synchronous(
            operations=attribute_operations(
                TAGS, route_path=[], send_path='')):
        print(": %20s: %s" % (descr, value))
That should do the trick!
We are in the process of rolling out a major update to the cpppo module, so clone the https://github.com/pjkundert/cpppo.git Git repo and check out the feature-list-identity branch to get early access to much better APIs for accessing raw data from these simple devices, for testing. You'll be able to use cpppo to convert the raw data into CIP REALs, instead of having to do it yourself...
...
With cpppo >= 3.9.0, you can now use the much more powerful cpppo.server.enip.get_attribute 'proxy' and 'proxy_simple' interfaces to routing CIP devices (e.g. ControlLogix, CompactLogix) and non-routing "simple" CIP devices (e.g. MicroLogix, PowerFlex, etc.):
$ python
>>> from cpppo.server.enip.get_attribute import proxy_simple
>>> product_name, = proxy_simple( '10.0.1.2' ).read( [('#1/1/7','SSTRING')] )
>>> product_name
[u'1756-L61/C LOGIX5561']
If you want regular updates, use cpppo.server.enip.poll:
import logging
import sys
import time
import threading

from cpppo.server.enip import poll
from cpppo.server.enip.get_attribute import proxy_simple as device

params = [('#1/1/1','INT'),('#1/1/7','SSTRING')]
# If you have an A-B PowerFlex, try:
# from cpppo.server.enip.ab import powerflex_750_series as device
# params = [ "Motor Velocity", "Output Current" ]

hostname = '10.0.1.2'
values = {} # { <parameter>: <value>, ... }
poller = threading.Thread(
    target=poll.poll, args=(device,), kwargs={
        'address': (hostname, 44818),
        'cycle': 1.0,
        'timeout': 0.5,
        'process': lambda par, val: values.update({par: val}),
        'params': params,
    })
poller.daemon = True
poller.start()

# Monitor the values dict (updated in another Thread)
while True:
    while values:
        logging.warning("%16s == %r", *values.popitem())
    time.sleep(.1)
And, Voila! You now have regularly updating parameter names and values in your 'values' dict. See the examples in cpppo/server/enip/poll_example*.py for further details, such as how to report failures, control exponential back-off of connection retries, etc.
Version 3.9.5 has recently been released, which has support for writing to CIP Tags and Attributes, using the cpppo.server.enip.get_attribute proxy and proxy_simple APIs. See cpppo/server/enip/poll_example_many_with_write.py
Hope this is obvious, but accessing HOST = "192.168.1.100" will only be possible from a system located on the 192.168.1.* subnet.
I'm trying to query all tasks from a specific iteration using the Python toolkit for the Rally REST API. The iteration will be chosen at run-time.
However, I have been unable to set up the right query. I feel like I'm missing something small but important here.
This is the code:
query_criteria = 'Iteration.Name = "2014 november"'
response = rally.get('Task', fetch=True, query=query_criteria)

if response.errors:
    sys.stdout.write("\n".join(response.errors))
    sys.exit(1)

for Task in response:
    if getattr(Task, "Iteration"):
        print "%s %s" % (Task.Name, Task.Iteration.Name)
It receives 0 rows in response.
If I remove , query=query_criteria and fetch all tasks, I can see that there are tasks where the Task.Iteration.Name value is 2014 November.
The query does not give an error, so I assume that the values of related objects (Task -> Iteration) can be included in the query. Yet I receive 0 rows in response.
Could the reason be that some tasks do not seem to be attached to an iteration?
One solution would be to fetch all tasks and then filter them afterwards, but that seems dirty.
If you query directly in the WS API in the browser, do you get results?
https://rally1.rallydev.com/slm/webservice/v2.0/task?workspace=https://rally1.rallydev.com/slm/webservice/v2.0/workspace/12352608129&query=(Iteration.Name%20%3D%20%22my%20iteration%22)&pagesize=200
I verified that this code works with pyral 1.1.0, Python 2.7 and requests 2.3.0 - it returns all tasks of work products (e.g. user stories and defects) assigned to an iteration. I tested three queries: by state, by iteration reference, and by iteration name (the first two are commented out in the code).
#!/usr/bin/env python
#################################################################################################
#
# showitems -- show artifacts in a workspace/project conforming to some common criterion
#
#################################################################################################

import sys, os

from pyral import Rally, rallyWorkset, RallyRESTAPIError

#################################################################################################

errout = sys.stderr.write

#################################################################################################

def main(args):
    options = [opt for opt in args if opt.startswith('--')]
    args = [arg for arg in args if arg not in options]
    server, username, password, apikey, workspace, project = rallyWorkset(options)
    if apikey:
        rally = Rally(server, apikey=apikey, workspace=workspace, project=project)
    else:
        rally = Rally(server, user=username, password=password, workspace=workspace, project=project)
    rally.enableLogging("rally.history.showitems")

    fields = "FormattedID,State,Name"
    #criterion = 'State != Closed'
    #criterion = 'iteration = /iteration/20502967321'
    criterion = 'iteration.Name = "iteration 5"'

    response = rally.get('Task', fetch=fields, query=criterion, order="FormattedID",
                         pagesize=200, limit=400)

    for task in response:
        print "%s %s %s" % (task.FormattedID, task.Name, task.State)
    print "-----------------------------------------------------------------"
    print response.resultCount, "qualifying tasks"

#################################################################################################
#################################################################################################

if __name__ == '__main__':
    main(sys.argv[1:])
    sys.exit(0)