Django and CFR 21 Part 11

I have to transform my Django application so that it is compliant with "21 CFR Part 11", that is, make electronic records have the same validity as signed paper records. Is there any project or application I should look at?
Some issues:
1) audit trail: every change in selected models must be traced (who, when, what)
2) detect unauthorized record editing: if a record has been changed/added/deleted outside the normal procedure, the application should detect it
3) for particular operations, the user has to enter the password again
4) passwords must be changed periodically and must satisfy certain criteria
etc...
I've found no ready solution on the net...

I work in an environment requiring CFR 21 Part 11 and similar. I have not yet gotten our apps fully compliant, but I have gone through a good deal of trial and error, so hopefully I can get you started in a few places.
1) I would also suggest Django reversion; however, you will need a little more than what it offers to achieve a variable-level (per-field) audit trail that records the action that was taken (in addition to by whom and when). For this I used one of the reversion signals to turn the comment field into a dict that can be evaluated and then queried for any variable in the row and the action that was taken on it. This is below:
https://github.com/etianen/django-reversion
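# Written against the django-reversion API available at the time of this answer;
# newer releases may expose these signals and helpers differently.
import reversion
from django.dispatch import receiver

@receiver(reversion.pre_revision_commit)
def it_worked(sender, **kwargs):
    currentVersion = kwargs.pop('versions')[0].field_dict
    fieldList = list(currentVersion.keys())
    fieldList.remove('id')
    commentDict = {}
    try:
        # Most recent stored version of the same object, if there is one
        pastVersion = reversion.get_for_object(kwargs.pop('instances')[0])[0].field_dict
    except IndexError:
        # No earlier version exists: every field was just created
        for field in fieldList:
            commentDict[field] = "Created"
    except TypeError:
        for field in fieldList:
            commentDict[field] = "Deleted"
    else:
        # Compare field by field against the previous version
        for field in fieldList:
            try:
                pastTest = pastVersion[field]
            except KeyError:
                commentDict[field] = "Created"
            else:
                if currentVersion[field] != pastTest:
                    commentDict[field] = "Changed"
                else:
                    commentDict[field] = "Unchanged"
    comment = commentDict
    revision = kwargs.pop('revision')
    revision.comment = comment
    revision.save()
    kwargs['revision'] = revision
    sender.save_revision  # attribute is only referenced here, never called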
2/3) You are going to need to use an object-level permission system for this. I have implemented django-guardian. Pretty much the only limit on the complexity you can implement is how many things you can keep straight yourself. The base set of permissions you will need to implement are view, edit, delete, and some sort of data controller/manager role; however, you will probably want to go more complicated. I would highly suggest using class-based-views and mixins for permission checking, but function based can work as well. This can also be used to prompt for password for certain actions because you can control what happens to a field in any way you like.
https://github.com/lukaszb/django-guardian
4) Expiring passwords can be implemented with just the Django auth system if you want, or with any user account management app. You just need to add an extra field recording the datetime from which the expiry countdown starts. Then on login, check how much time has passed since that datetime, and if the user has gone beyond the window, require them to create a new password by directing them through the built-in password change view or whichever mechanism is appropriate for your app.
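The expiry check from point 4 can be sketched in plain Python. This is only a minimal sketch, not Django code: `password_changed_at` stands for the extra datetime field described above, and `MAX_PASSWORD_AGE` and `password_expired` are hypothetical names chosen for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical policy value; use whatever window your compliance people require.
MAX_PASSWORD_AGE = timedelta(days=90)


def password_expired(password_changed_at, now=None, max_age=MAX_PASSWORD_AGE):
    """Return True when the password is older than the allowed window."""
    if now is None:
        now = datetime.now()
    return now - password_changed_at > max_age
```

On login, a view could call `password_expired(...)` with the stored datetime and, when it returns True, redirect the user to the password change view.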
I will tell you the most difficult part of implementing CFR 21 Part 11 will be getting the appropriate people to tell you exactly what your project should do to meet requirements, and getting inspected for compliance can be time consuming and costly.
Hope this helps you get started.

Django Reversion might give you a start on an audit trail, although you probably don't need all of its facilities.
For 2, 3 and 4 on your list, those are things you'll most likely end up coding yourself.

Related

How do I start using the RAWG api using rawgpy?

I'd like to make a database of all the games I play with their developers/publishers/platforms/etc... And I am sure that the RAWG api is the way to do that.
I'm experienced with python but I've never used an API before, here is the code I used from the quickstart guide:
import rawgpy
rawg = rawgpy.RAWG("User-Agent, this should identify your app")
results = rawg.search("Warframe") # defaults to returning the top 5 results
game = results[0]
game.populate() # get additional info for the game
print(game.name)
print(game.description)
for store in game.stores:
    print(store.url)
rawg.login("someemail@example.com", "somepassword")
me = rawg.current_user()
print(me.name) # print my name, equivalent to print(self.username)
me.populate() # gets additional info for the user
for game in me.playing:
    print(game.name) # prints all the games i'm currently playing
However I don't know what to use as my user agent in the second line. Any help would be much appreciated.
Here is the link to the quickstart guide

It's a bit tricky to tell from their documentation, but typically this means you just need to pass a user agent that tells the API that you're an app calling it. Lots of APIs block default user agents to prevent spam and abuse, hence the requirement.
So you could literally put:
rawg = rawgpy.RAWG("My first app")
However it's better practice to put something unique and descriptive that describes your app. For your use case this could be "game-database-app-01".
There probably aren't any syntax requirements on what you can put in, but I wouldn't be surprised if they only accept alphanumeric entries.
It's probably a good idea to always call the same API with the same app name to avoid triggering any errors on their end.
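To see what a user agent is at the HTTP level, here is a small standard-library sketch. It does not use rawgpy; the URL and the app name are placeholders, and no request is actually sent.

```python
import urllib.request

# Placeholder values: the URL is hypothetical, the name is whatever identifies your app.
APP_USER_AGENT = "game-database-app-01"

request = urllib.request.Request(
    "https://api.example.com/games",
    headers={"User-Agent": APP_USER_AGENT},
)

# urllib stores header names capitalized, so the key becomes "User-agent".
print(request.get_header("User-agent"))
```

The string passed to `rawgpy.RAWG(...)` plays the same role: it is simply sent along so the server can tell which app is calling.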
I hope this answers your question.

Checking username availability - Handling of AJAX requests (Google App Engine)

I want to add 'check username available' functionality to my signup page using AJAX. I have a few doubts about the way I should implement it.
1) With which event should I register my AJAX requests? We can send the requests when the user leaves the 'username' input field (blur event) or as they type (keyup event). Which provides the better user experience?
2) On the server side, a simple way of dealing with requests would be to query my main 'Accounts' database. But this could lead to a lot of requests hitting my database (even more if we POST on the keyup event). Should I maintain a separate model for registered usernames only and use that to get better results?
3) Is it possible to use Memcache in this case? Initializing the cache with every username as a key, updating it as we register users, and using a random key to check whether the cache is actually initialized, or else passing the queries directly to the db.

Answers -
1) Do the check on blur. If you do it on key up, you will be hammering your server with unnecessary queries, annoying the user who is not yet done typing, and likely lag the typing anyway.
2) If your Account entity is very large, you may want to create a separate AccountName entity, and create a matching such entity whenever you create a real Account (but this is probably an unnecessary optimization). When you create the Account (or AccountName), be sure to assign id=name when you create it. Then you can do an AccountName.get_by_id(name) to quickly see if the AccountName has already been assigned, and it will automatically pull it from memcache if it has been recently dealt with.
3) By default, GAE NDB will automatically populate memcache for you when you put or get entities. If you follow my advice in step 2, things will be very fast and you won't have to mess around with pre-populating memcache.
If you are concerned about 2 people simultaneously requesting the same user name, put your create method in a transaction:
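# Method on the Account model (an ndb.Model subclass).
@classmethod
@ndb.transactional()
def create_account(cls, name, other_params):
    # The transaction keeps two simultaneous requests from both claiming the name.
    acct = Account.get_by_id(name)
    if not acct:
        # other_params: dict holding the remaining Account properties
        acct = Account(id=name, **other_params)
        acct.put()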

I would recommend the blur event of the username field, combined with some sort of inline error/warning display.
I would also suggest maintaining a memcache of registered usernames, to reduce DB hits and improve user experience - although probably not populate this with a warm-up, but instead only when requests are made. This is sometimes called a "Repository" pattern.
BUT, you can only populate the cache with USED usernames - you should not store the "available" usernames here (or if you do, use a much lower timeout).
You should always check directly against the DB/Datastore when actually performing the registration. And ideally in some sort of transactional method so that you don't have race conditions with multiple people registering.
BUT, all of this work is dependent on several things, including how busy your app is and what data storage tech you are using!
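The caching advice above can be sketched as follows. This is an illustration under stated assumptions, not App Engine code: plain Python sets stand in for both memcache and the datastore, `UsernameChecker` is a hypothetical name, and a real `register` would need to run in a transaction.

```python
class UsernameChecker:
    """Cache-backed availability check; only names known to be taken are cached."""

    def __init__(self, datastore_names):
        # Stand-in for the real datastore: a set of registered usernames.
        self._datastore = set(datastore_names)
        # Stand-in for memcache: usernames already confirmed as taken.
        self._taken_cache = set()
        self.datastore_hits = 0

    def is_taken(self, name):
        if name in self._taken_cache:
            return True  # answered from the cache, no datastore query
        self.datastore_hits += 1
        taken = name in self._datastore
        if taken:
            # Cache only used names; an available name may be claimed at any time.
            self._taken_cache.add(name)
        return taken

    def register(self, name):
        # Always check the datastore itself when actually registering.
        if name in self._datastore:
            return False
        self._datastore.add(name)
        self._taken_cache.add(name)
        return True
```

Repeated checks for a taken name are served from the cache, while checks for a free name keep going to the datastore, so a stale "available" answer is never cached.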

Best strategy for error handling in an interface to a database and web display

I decided to ask this question after going back and forth 100s of times trying to place error handling routines to optimize data integrity while also taking into account speed and efficiency (and wasting 100s of hours in the process). So here's the setup.
Database  ->  python classes  ->  python code      ->  javascript
MongoDB   |   that represent  |   that serves      |   web interface
              the data            pages (pyramid)
I want data to be robust; that is the number one requirement. So right now I validate data on the javascript side of the page, but also validate in the python classes, which more or less represent data structures. While most server routines run through the python classes, sometimes that feels inefficient given that the data has to pass through different levels of error checking.
EDIT: I guess I should clarify. I am not looking to unify validation of client and server side code. Sorry for the bad write-up. I'm looking more to figure out where the server side validation should be done. Should it be in the direct interface to the database, or in the web server code where the data is received?
For instance, if I have an object with a barcode, should I validate the barcode in the code that receives the data through AJAX, or should I just call the object's class and validate there?
Again, are there any guidelines on how to do error checking in general? I want to be somewhat professional and learn, but hopefully without having to go through a whole book.
I am not a software engineer, but I hope those of you who are familiar with complex projects, can tell me where I can find few guidelines on how to model/error check in a situation like this.
I'm not necessarily looking for an answer, but more like pointing me to a short set of guidelines when creating projects with different layers like this. Hopefully not extremely long..
I don't even know what tags to use in the post. HELP!!

Validating on the client and validating on the server serve different purposes entirely. Validating on the server is to make sure your model invariants hold and has to be done to maintain data integrity. Validating on the client is so the user gets a friendly error message telling him that his input would've violated data integrity, instead of having a traceback blow up in his face.
So there's a subtle difference in that when validating on the server you only really care whether or not the data is valid. On the client you also care, on a finer-grained level, why the input could be invalid. (Another thing that has to be handled at the client is an input format error, i.e. entering characters where a number is expected.)
It is possible to meet in the middle a little. If your model validity constraints are specified declaratively, you can use that metadata to generate some of the client validations, but they're not really sufficient. A good example would be user registration. Commonly you want two password fields, and you want the input in both to match, but the model will only contain one attribute for the password. You might also want to check the password complexity, but it's not necessarily a domain model invariant. (That is, your application will function correctly even if users have weak passwords, and the password complexity policy can change over time without the data integrity breaking.)
Another problem specific to client-side validation is that you often need to express a dependency between the validation checks. E.g. you have a required field that's a number that must be lower than 100. You need to validate that a) the field has a value; b) the field value is a valid integer; and c) the field value is lower than 100. If any of these checks fails, you want to skip the later checks in the sequence so that no unnecessary error messages are shown and the user is told what his specific mistake was. The model doesn't need to care about that distinction. (Aside: this is where some frameworks fail miserably. JSF or Spring MVC, or possibly both, first attempt to do data-type conversion from the input strings to the form object properties, and if that fails, they cannot perform any further validations.)
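The a), b), c) sequence can be sketched as a chain that stops at the first failing check. It is written in Python to match the rest of the thread, although the same ordering applies to client-side JavaScript; `validate_quantity` and the messages are made up for illustration.

```python
def validate_quantity(raw):
    """Return the first applicable error message, or None if the input is valid."""
    text = raw.strip()
    # a) the field has a value
    if not text:
        return "This field is required."
    # b) the value is a valid integer
    try:
        value = int(text)
    except ValueError:
        return "Please enter a whole number."
    # c) the value is lower than 100
    if value >= 100:
        return "The value must be lower than 100."
    return None
```

Because each check returns immediately, the user only ever sees the one message that describes the actual mistake.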
In conclusion, the above implies that if you care about both data integrity and usability, you necessarily have to validate data at least twice, since the validations serve different purposes even if there's some overlap. Client-side validation will have more, and finer-grained, checks than the model-layer validation. I wouldn't really try to unify them except where your chosen framework makes it easy. (I don't know about Pyramid; Django keeps these concerns separate in that Forms are a different layer from your Models, both can be validated, and they're joined by ModelForms, which let you add validations on top of the ones performed by the model.)

Not sure I fully understand your question, but error handling on pymongo can be found here -
http://api.mongodb.org/python/current/api/pymongo/errors.html
Not sure if you're using a particular ORM - the docs have links to what's available, and these individually have their own best usages:
http://api.mongodb.org/python/current/tools.html
Do you have a particular ORM that you're using, or implementing your own through pymongo?

Error while creating PDF using ReportLab in Python

Data:
['<p>Work! please work.img:0\xc3\x82\xc2\xa0Will you?img:1</p>img:2img:3\xc3\x82\xc2\xa0ascasdacasdadasdaca HAHAHAHAHA! BAND!\n', '\n', "<p>Random test.</p><p><br />If you want to start a flame war, mention lines of code per day or hour in a developer\xc3\xa2€™s public forum. At least that is what I found when I started investigating how many lines of code are written per day per programmer. Lines of code, or loc for short, are supposedly a terrible metric for measuring programmer productivity and empirically I agree with this. There are too many variables involved starting with the definition of a line of code and going all the way up to the complexity of the requirements. There are single lines that take a long time to get right and there many lines which are mindless boilerplate code. All the same this measurement does have information encoded in it; the hard part is extracting that information and drawing the correct conclusions. Unfortunately I don\xc3\xa2€™t have access to enough data about software projects to provide a statistically sound analysis but I got a very interesting result from measuring two very different projects that I would like to share.</p><p>The first project is a traditional client server data mining tool for a vertical market mostly built in VB.NET and WinForms. This project started in 2003 and has been through several releases and an upgrade from .NET 1.1 to .NET 2.0. It has server components but most of the half a million lines of code lives in the client side. The team has always had around four developers although not always the same people. The average lines of code for this project came in at around ninety lines of code per day per developer. I wasn\xc3\xa2€™t able to measure the SQL in the stored procedures so this number is slightly inflated.</p><p><em>The second project is much smaller adding up to ten thousand lines of C# plus seven thousand lines of XAML c</em>reated by a team of four that also worked on the first project. 
This project lasted three months and it is a WPF point of sale application thus very different in scope from the first project. <strong>It was built around a number of web services in SOA fashion and does not have a database per se. Its average came up around seventy lines of code per developer per day.</strong></p><p>I am very surprised with the closeness of these numbers, especially given the difference in size and scope of the products. The commonality between them are the .NET framework and the team and one of them may be the key. Of these two, I am leaning to the .NET framework being the unifier because although the developers worked on both projects, three of elements on the team of the second project have spent less than a year on the first project and did not belong to the core team that wrote the vast majority of that first product. Or maybe there is something more general at work here?</p><p>The first step in using the WP_Filesystem is requesting credentials from the user. The normal way this is accomplished is at the time when you're saving the results of a form input, or you have otherwise determined that you need to write to a file.</p><p>The credentials form can be displayed onto an admin page by using the following code:</p><pre>$url = wp_nonce_url('themes.php?page=example','example-theme-options');\n</pre>", "if (false === ($creds = request_filesystem_credentials($url, '', false, false, null) ) ) {\n", '\treturn; // stop processing here\n', '}\n', '<p>The request_filesystem_credentials() call takes five arguments.</p><ul><li>The URL to which the form should be submitted (a nonced URL to a theme page was used in the example above)</li><li>A method override (normally you should leave this as the empty string: "")</li><li>An error flag (normally false unless an error is detected, see below)</li><li>A context directory (false, or a specific directory path that you want to test for access)</li><li>Form fields (an array of form field names from your 
previous form that you wish to "pass-through" the resulting credentials form, or null if there are none)</li></ul><p>The request_filesystem_credentials call will test to see if it is capable of writing to the local filesystem directly without credentials first. If this is the case, then it will return true and not do anything. Your code can then proceed to use the WP_Filesystem class.</p><p>The request_filesystem_credentials call also takes into account hardcoded information, such as hostname or username or password, which has been inserted into the wp-config.php file using defines. If these are pre-defined in that file, then this call will return that information instead of displaying a form, bypassing the form for the user.</p><p>If it does need credentials from the user, then it will output the FTP information form and return false. In this case, you should stop processing further, in order to allow the user to input credentials. Any form fields names you specified will be included in the resulting form as hidden inputs, and will be returned when the user resubmits the form, this time with FTP credentials.</p><p>Note: Do not use the reserved names of hostname, username, password, public_key, or private_key for your own inputs. These are used by the credentials form itself. Alternatively, if you do use them, the request_filesystem_credentials function will assume that they are the incoming FTP credentials.</p><p>When the credentials form is submitted, it will look in the incoming POST data for these fields, and if found, it will return them in an array suitable for passing to WP_Filesystem, which is the next step.</p><p><a id="Initializing_WP_Filesystem_Base" name="Initializing_WP_Filesystem_Base"></a>']
I use ReportLab to convert it to pdf but it fails.
This is my ReportLab code:
for page in self.pagelist:
    self.image_parser(page)
    print page.content
    for i in range(0,len(page.content)):
        bogustext = page.content[i]
        while (len(re.findall(r'img:?',bogustext)) > 0):
            for m in re.finditer( r'img:?', bogustext ):
                image_tag = bogustext[m.start():m.end()+1]
                print (image_tag.split(':')[1])
                im = Image(page.images[int(image_tag.split(':')[1])],width=2*inch, height=2*inch)
                Story.append(Paragraph(bogustext[0:m.start()], style))
                bogustext = bogustext.replace(bogustext[0:m.start()],'')
                Story.append(im)
                bogustext = bogustext.replace(image_tag,'')
                break
        p = Paragraph(bogustext,style)
        Story.append(p)
        Story.append(Spacer(1,0.2*inch))
page is an instance of a class whose content attribute (page.content) contains the data shown above.
self.image_parser(page) is a function that removes all the image URLs from page.content (the data).
Error:
xml parser error (invalid attribute name id) in paragraph beginning
'<p>The request_filesystem_cred'
I don't get this error if I produce a PDF for every element of the list but I do get one if I try to make a complete PDF out of it. Where am I going wrong?

Best practice designing a permission system

I'm currently developing a little Python website using Pyramid.
But I don't know how to design the permission system.
The System should be very flexible: I have to establish connections between many different tables.
Instead of writing one permission table for every variant, I thought I'd just create one table, which I call PermissionCollection:
PermissionCollection:
permissionCollectionId - PrimaryKey
onType = ENUM("USER","TEACHER","GROUP","COURSE"...)
onId = Integer
and the Permission table:
permissionId - PrimaryKey
key
value
permissionCollectionId - ForeignKey
I'll define standard PermissionCollections for every possible relationship, hard-coded in the sources, and if a user, course, teacher... has special rights, I'll create a new PermissionCollection and add the permission to it.
I'm very new to web programming, and don't know if this approach is useful. Or if something like this even exists. I think the Pyramid ACL isn't the right tool for this task, is it?

Not sure if you have read about it already, but Pyramid does come with a really nice permission system: authorization with ACLs.
How you handle it really depends on you...
You could have an ACL table
(object_id, allow/deny, who?(group, userid), permission, order)
object_id is a unique id of a record in your database
allow/deny is what this ACE is supposed to do: allow or deny access
who? is either a group, a username or whatever you want; for example, system.Everyone is everyone
permission is the permission parameter in view_config
order is important: the order of the ACEs does matter
For example
__acl__ = [
    (Deny, Everyone, 'view'),
    (Allow, 'group:admin', 'view')
]
This sample will always deny view, even for admins. As soon as Pyramid finds an ACE that tells it whether you can or cannot see a record, it stops searching.
__acl__ = [
    (Allow, 'group:admin', 'view'),
    (Deny, Everyone, 'view')
]
This will allow view for every admin but not for anyone else. That is why you have to remember the order of your ACEs.
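The first-match behaviour can be illustrated with a small stand-alone sketch. It is a simplification written for this explanation, not Pyramid's own code: the constants are plain-string stand-ins for the ones Pyramid exports, and `permits` is a hypothetical helper.

```python
# Stand-ins for Pyramid's constants; plain strings keep the sketch self-contained.
ALLOW, DENY, EVERYONE = "Allow", "Deny", "system.Everyone"


def permits(acl, principals, permission):
    """Return the decision of the first matching ACE, denying by default."""
    for action, principal, ace_permission in acl:
        if principal in principals and ace_permission == permission:
            return action == ALLOW
    return False


admin = [EVERYONE, "group:admin"]
guest = [EVERYONE]
deny_first = [(DENY, EVERYONE, "view"), (ALLOW, "group:admin", "view")]
allow_first = [(ALLOW, "group:admin", "view"), (DENY, EVERYONE, "view")]

print(permits(deny_first, admin, "view"))   # the Deny ACE matches first
print(permits(allow_first, admin, "view"))  # the Allow ACE matches first
print(permits(allow_first, guest, "view"))  # only the Deny ACE matches
```

The same two ACEs give opposite results for an admin depending only on their order, which is the point made above.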
This is where the fun part starts. You have ACLs mapped to records in your data. When you load, for example, a page, you will have to load its ACL and set it on your object.
myobject.__acl__ = load_acls(myobject)
If you have a data tree, you can even leave the ACL unset on some objects.
For example you have a site that looks like that
root
\--pages with acl
   +---- page1 without acl
   \---- page2 with acl
When you access page1, Pyramid will look for an ACL on it. If it can't find one, it will check the parent; if the parent has an ACL, it will check the permission against that, and if it doesn't, it will check the parent's parent, and so on until it reaches root. If it can't find the permission anywhere, I'm not so sure what happens. I guess it will give you either a forbidden error or a predicate error saying that it can't find the proper view.
That said, in order to make that work you have to create location-aware objects that know their parents.
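A location-aware object is essentially one that carries `__name__` and `__parent__` attributes. The following is a minimal sketch of the parent walk described above, using a hypothetical `Resource` class and `nearest_acl` helper; it only finds the closest ACL and is a simplification of what an authorization policy does.

```python
class Resource:
    """Minimal location-aware object: it knows its name and its parent."""

    def __init__(self, name, parent=None, acl=None):
        self.__name__ = name
        self.__parent__ = parent
        if acl is not None:
            self.__acl__ = acl


def nearest_acl(resource):
    """Walk up the __parent__ chain and return the first __acl__ found, or None."""
    while resource is not None:
        acl = getattr(resource, "__acl__", None)
        if acl is not None:
            return acl
        resource = getattr(resource, "__parent__", None)
    return None


# The tree from the example: page1 has no ACL of its own, page2 does.
root = Resource("root")
pages = Resource("pages", parent=root, acl=[("Allow", "group:admin", "view")])
page1 = Resource("page1", parent=pages)
page2 = Resource("page2", parent=pages, acl=[("Deny", "system.Everyone", "view")])
```

Here `nearest_acl(page1)` falls back to the ACL set on `pages`, while `nearest_acl(page2)` returns page2's own ACL.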
But why would you want to do all that?
You can have an ACL for any object and have really fine-grained control over who can or cannot view each object in your database. You can also put the ACL directly on your class, without a database.
As long as your ACL is in the __acl__ attribute, Pyramid will be able to do something with it. It's not really important how you got it.
Check this out
http://pyramid.readthedocs.org/en/1.3-branch/tutorials/wiki/authorization.html
