I have observed that both the bash command and what is probably a corresponding method from the Python PyKCS11 library seem to always find objects in the same order. My code relies on this being true, but I have not read it anywhere; I have only observed it.
In the terminal:
$ pkcs11-tool --list-objects
Using slot 0 with a present token (0x0)
Public Key Object; RSA 2048 bits
  label:  bob_key
  ID:     afe438bbe0e0c2784c5385b8fbaa9146c75d704a
  Usage:  encrypt, verify, wrap
Public Key Object; RSA 2048 bits
  label:  alice_key
  ID:     b03a4f6c375e8a8a53bd7a35947511e25cbdc34b
  Usage:  encrypt, verify, wrap
With Python:
objects = session.findObjects([(CKA_CLASS, CKO_PUBLIC_KEY)])
for i, object in enumerate(objects):
    d = object.to_dict()
    print(d['CKA_LABEL'])
output:
bob_key
alice_key
objects is of type list and each element in objects is of type <class 'PyKCS11.CK_OBJECT_HANDLE'>
Will session.findObjects([(CKA_CLASS, CKO_PRIVATE_KEY)]), when run from a logged-in session, also always return a list in exactly the same order as the expression above? In this case with two keys, I would never want to see Alice come before Bob.
(Wanted to write a comment, but it got quite long...)
PKCS#11 does not guarantee any specific order of returned object handles, so it is up to the particular implementation.
Even though your implementation might seem to consistently return objects in the same order, there are several situations in which this could unexpectedly change:
key renewal (keys do not last forever; you will need to generate new keys in the future)
middleware upgrade (newer implementations might return objects in a different order)
HSM firmware upgrade (major upgrades might change the way objects are stored and change object enumeration order)
HSM recovery from backup (object order can change after HSM restore)
host OS data recovery (some implementations store HSM objects encrypted in external folders, and the object search order might follow the directory listing order, which could change without warning)
HSM change (are you sure you will be using the same device for the whole lifetime of your application?)
Relying on undefined behaviour is bad practice in general; in security code especially, you should be very cautious.
It is definitely worth the time to stay on the safe side.
I would recommend performing a separate search for each required object (using some strong identifier, e.g. the label) -- this way you can perform additional checks (e.g. enforce the expected object type, ensure that the object is unique, etc.). A minimal sketch of this approach follows.
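The sketch below assumes PyKCS11 and an already-opened session from the question; the helper name, the label values, and the RuntimeError check are illustrative, not part of any API:

from PyKCS11 import CKA_CLASS, CKA_LABEL, CKO_PRIVATE_KEY, CKO_PUBLIC_KEY

def find_unique_key(session, object_class, label):
    # Search by class *and* label, and insist on exactly one match.
    matches = session.findObjects([(CKA_CLASS, object_class), (CKA_LABEL, label)])
    if len(matches) != 1:
        raise RuntimeError("expected exactly one object labelled %r, found %d"
                           % (label, len(matches)))
    return matches[0]

# Each key is looked up by its own label, so enumeration order never matters.
bob_pub = find_unique_key(session, CKO_PUBLIC_KEY, "bob_key")
alice_pub = find_unique_key(session, CKO_PUBLIC_KEY, "alice_key")
bob_priv = find_unique_key(session, CKO_PRIVATE_KEY, "bob_key")  # needs a logged-in session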
A similar example is Cryptoki object handle re-use. PKCS#11 states that an object handle is bound to a particular session (i.e. if you obtained an object handle in session A, you should not use it in session B -- even if both sessions are running in the same application).
There are implementations that preserve the object handle for the same object across sessions. There are even implementations that preserve the same object handle across different applications (i.e. if you get object handle 123 in application A, you will get object handle 123 in application B for the same object).
This behaviour is even described in the respective developer manual. But if you ask the vendor if you can rely on it you are told that there are some corner cases for some setups and that you must perform additional checks to be 100% sure that it will work as expected...
Good luck with your project!
Related
I want to implicitly extend the int, float, str, list, dict, set, and module classes with custom built substitutions (extensions).
When I say 'implicitly', what I mean is that when I declare 'a = 1', an object of the type Custom_Int (as an example) is produced, as opposed to a normal integer object.
Now, I understand and respect the reasons not to do this. Firstly, messing with built-ins is like messing with the laws of physics; no good can come from it. That said, I do understand the gravity of what I'm trying to do and what can happen if I do it wrong.
Second, I understand that modifying a base class will affect not just the current runtime but all running Python processes. I feel that by overriding the __new__ method of these base classes, such that it returns Custom_Object_Whatever if and ONLY IF certain environmental factors are true, other runtimes will remain largely unaffected.
So, getting back to the issue at hand- how can I override the __new__ method of these various types?
Python's forbiddenfruit package seems promising. I haven't had a chance to really investigate it yet, though, and if someone who understands it could summarize what it does, that would save me a lot of time.
Beyond that, I've observed something strange.
Every answer to monkeypatching that doesn't eventually circle back to forbiddenfruit or how forbiddenfruit works has to do with modifying what I will refer to as the 'absolute_dictionary' of the class. Because everything in Python is essentially a mapping (or dictionary) of functions/values to names, if you change the name __new__ within the right mapping, you change the nature of the object.
Problem is, every near-success I've had works such that if I call str('a').__new__(*args) it works fine (in some cases), but the assignment varOne = 'a' does not seem to actually call str.__new__().
My guess: this has something to do with either Python's parsing of a program prior to launch, or else the caching of the various classes during/post launch. Or maybe I'm totally off the mark. Either Python pre-reads and applies some regex to its modules prior to launch, or else the machine code, when it attempts to implicitly create an object, reaches for something other than the class located in moduleObject.builtins[__classname__].
Any ideas?
If you want to do this, your best option is probably to modify the CPython source code and build your own custom Python build with your extensions baked into the actual built-in types. The result will integrate a lot better with all the low-level mechanisms you don't yet understand, and you'll learn a lot in the process.
Right now, you're getting blocked by a lot of factors. Here are the ones that have come to my mind.
The first is that most ways of creating built-in objects don't go through a __new__ method at all. They go through C-level calls like PyLong_FromLong or PyList_New. These calls are hardwired to use the actual built-in types, allocating memory sized for the real built-in types, fetching the type object by the address of its statically-allocated C struct, and stuff like that. It's basically impossible to change any of this without building your own custom Python.
The second factor is that messing with __new__ isn't even enough to correctly affect things that theoretically should go through __new__, like int("5"). Python has reasons for stopping you from setting attributes on built-in classes, and two of those reasons are slots and the type attribute cache.
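As a quick, hedged illustration (the exact error message varies between CPython versions), type.__setattr__ refuses the assignment outright:

# type.__setattr__ rejects attribute assignment on built-in types, so you never
# even get far enough to worry about the slots or the attribute cache.
try:
    int.__new__ = lambda cls, *args, **kwargs: 42
except TypeError as exc:
    print("rejected:", exc)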
Slots are a public part of the C API that you'll probably learn about if you try to modify the CPython source code. They're function pointers in the C structs that make up type objects at C level, and most of them correspond to Python-level magic methods. For example, the __new__ method has a corresponding tp_new slot. Most C code accesses slots instead of methods, and there's code to ensure the slots and methods are in sync, but if you bypass Python's protections, that breaks and everything goes to heck.
The type attribute cache isn't a public part of anything even at C level. It's a cache that saves the results of type object attribute lookups, to make Python go faster. Its memory safety relies on all type object attribute modification going through type.__setattr__ (and all built-in type object attribute modification getting rejected by type.__setattr__), but if you bypass the protection, memory safety goes out the window and arbitrarily weird results can occur.
The third factor is that there's a bunch of caching for immutable objects. The small int cache, the interned string dict, constants getting saved in bytecode objects, compile-time constant folding... there's a lot. Objects aren't going to be created when you expect. (There's also stuff like, say, zip saving the last output tuple and reusing it if it sees you didn't keep a reference, for even more ways object creation will mess with your assumptions.)
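A tiny demonstration of this caching in action (small-int caching is a CPython implementation detail, not a language guarantee):

# CPython caches small integers (roughly -5 through 256), so freshly "created"
# small ints are really the same preexisting objects; a patched __new__ would
# never be consulted for them.
a = int("5")   # parsed from a string at runtime
b = 2 + 3      # computed arithmetically
print(id(a) == id(b) == id(5))  # True on CPython: all three refer to the cached 5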
There's more. Stuff like, what argument would int.__new__ even take if you tried to use int.__new__ to evaluate the expression 5? Stuff like all the low-level code that knows exactly how to work with the types it expects and will get very confused if it gets a MyCustomTuple with a completely different memory layout from a real tuple. Screwing with built-ins has a lot of issues.
Incidentally, one of the things you expected to be a problem is mostly not a problem. Screwing with one Python process's built-ins won't affect other Python processes' built-ins... unless those other processes are created by forking the first process, such as with multiprocessing in fork mode.
I'm new to SNMP, and finding it difficult to understand some of the mechanisms in PySNMP. I need to implement a table with read-create permissions to monitor and control a bridge on my network. I think it would be helpful if I had more clarity on one of the pieces of example code to understand what's happening in the framework when a manager attempts to create a new row.
I've been examining the sample code for implementing a conceptual table and executing the example snmpset/walk commands:
$ snmpset -v2c -c public 127.0.0.1 1.3.6.6.1.5.2.97.98.99 s "my value"
$ snmpset -v2c -c public 127.0.0.1 1.3.6.6.1.5.4.97.98.99 i 4
$ snmpwalk -v2c -c public 127.0.0.1 1.3.6
As far as I can tell, the set commands work because the MIB promises that exampleTableColumn2 describes OctetString scalars. How is this data created/stored by the agent? Is a generic scalar object created with the suffix ".97.98.99," or is this information somehow associated with the instance of exampleTableColumn2? If I were to subsequently run an snmpget or snmpset command on the object we just created, what would I be interacting with in the eyes of the framework?
In a real-world implementation, the agent would really be querying the device to create a new entry in some internal table, and you would need custom scalar objects with modified readGet/writeCommit methods, but the sample code hasn't established scalar classes to implement get/set methods. By understanding how columns with read-create permissions should be handled in PySNMP, I think I can implement a more robust agent application. Any help/clarity is sincerely appreciated.
How is this data created/stored by the agent? Is a generic scalar object created with the suffix ".97.98.99," or is this information somehow associated with the instance of exampleTableColumn2?
This is a generic scalar value of type OctetString associated with a leaf node in a tree of objects (MIB tree) of type MibTableColumn. In the MIB tree you will find a handful of node types each exhibiting distinct behavior (see docstrings), but otherwise they are very similar. Each node is identified by an OID.
If I were to subsequently run an snmpget or snmpset command on the object we just created, what would I be interacting with in the eyes of the framework?
The MIB tree object responsible for the OID you are querying will receive read* (for SNMP GET) or read*Next (for SNMP GETNEXT/GETBULK) events to which it should respond with a value.
In a real-world implementation, the agent would really be querying the device to create a new entry in some internal table, and you would need custom scalar objects with modified readGet/writeCommit methods
There are a couple of approaches to this problem; the way I've been pursuing it so far is to override some of these read*/read*Next/write* methods to read or write the value from/to its ultimate source (your internal table).
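As a rough sketch of that override approach (modelled on the pysnmp 4.x MIB-implementation examples; the class name, the MY_INTERNAL_TABLE dict, and the row-key handling are assumptions for illustration, and a real agent would still need to export the instance into its MIB tree):

from pysnmp.smi import builder

mibBuilder = builder.MibBuilder()
(MibScalarInstance,) = mibBuilder.importSymbols('SNMPv2-SMI', 'MibScalarInstance')

# Hypothetical backing store standing in for the device's internal table.
MY_INTERNAL_TABLE = {(97, 98, 99): 'my value'}

class BackedScalarInstance(MibScalarInstance):
    def getValue(self, name, idx):
        # Called when an SNMP GET/GETNEXT reaches this instance OID; fetch the
        # current value from the real source instead of a stored constant.
        row_index = name[-3:]  # assumption: the last three sub-identifiers form the row index
        return self.getSyntax().clone(MY_INTERNAL_TABLE[row_index])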
To simplify and keep your code in sync with the MIB you are implementing, the pysmi library can turn a MIB into Python code with stubs via Jinja2 templates. From these stubs you can access your internal table whenever an SNMP request triggers a read or write event. You can put your custom code into these stubs and/or into the Jinja2 templates from which these stubs are generated.
As an alternative to implementing your own SNMP agent, you might consider this general-purpose tool, which is driven by the same technology.
I'm working on a password manager application for Linux and I'm using Python for it.
For security reasons I want to call the mlock system call in order to avoid the password variable being swapped out to the hard drive.
I noticed that Python itself doesn't wrap this function.
So is there any way I can avoid swapping?
Thanks
For CPython, there is no good answer for this that doesn't involve writing a Python C extension, since mlock works on pages, not objects. The internals of the str object differ from version to version (in Py3.3 and higher, a str may actually have several copies of the data in memory in different encodings, some inlined after the object structure, some dynamically allocated separately and linked by pointer), and even if you used ctypes to retrieve the necessary addresses and mlock-ed them all through ctypes mlock calls, you'll have a hell of a time determining when to mlock and when to munlock. Since mlock works on pages, you'd have to carefully track how many strings are currently in any given page (because if you just mlock and munlock blindly, and there are more than one things to lock in a page, the first munlock would unlock all of them; mlock/munlock is a boolean flag, it doesn't count the number of locks and unlocks).
Even if you manage that, you still would have a race between password acquisition and mlock during which the data could be written to swap, and those cached alternate encodings are computed lazily, so mlocking the non-NULL pointers at any given time doesn't necessarily mean those pointers might not be populated later.
You could partially avoid these problems through careful use of the mmap module and memoryviews (mmap gives you pages of memory, memoryview references said memory without copying it, so ctypes could be used to mlock the page), but you'd have to build it all from scratch (can't use the getpass module because it would store as a str for a moment).
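For what it's worth, here is a minimal sketch of that mmap-plus-ctypes direction (Linux with glibc assumed; error handling is minimal, the one-page buffer and zeroing strategy are illustrative, and none of this protects copies made before the bytes reach the locked page):

import ctypes
import mmap

libc = ctypes.CDLL("libc.so.6", use_errno=True)

PAGE_SIZE = mmap.PAGESIZE
buf = mmap.mmap(-1, PAGE_SIZE)  # anonymous private mapping, exactly one page

# Address of the mapped page, needed for the raw mlock(2)/munlock(2) calls.
addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
if libc.mlock(ctypes.c_void_p(addr), ctypes.c_size_t(PAGE_SIZE)) != 0:
    raise OSError(ctypes.get_errno(), "mlock failed")

# Keep the secret only inside the locked buffer (never as a str).
view = memoryview(buf)
secret = b"hunter2"
view[:len(secret)] = secret

# ... use view[:len(secret)] here ...

# Overwrite before releasing, then unlock and unmap.
view[:PAGE_SIZE] = b"\x00" * PAGE_SIZE
view.release()
libc.munlock(ctypes.c_void_p(addr), ctypes.c_size_t(PAGE_SIZE))
buf.close()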
In short, Python doesn't care about swapping or memory protection in the way you want; it trusts the swap file to be configured to your desired security (e.g. disabled or encrypted), neither providing additional protection nor providing the information you'd need to add it in.
Say I store a password in plain text in a variable called passWd, as a string.
How does Python release this variable once I discard it (for instance, with del passWd or passWd = 'new random data')?
Is the string stored as a byte array, meaning it can be overwritten in the memory location where it originally existed, or is it fixed in a memory area which can't be modified, so that when assigning a new value a new memory area is created and the old area is discarded but not overwritten with nulls?
I'm questioning how Python implements the safety of memory areas and would like to know more about it, mainly because I'm curious :)
From what I've gathered so far, using del (or __del__) does not cause the interpreter to release the memory areas of that variable automatically, which can cause issues, and I'm also not sure that del is that thorough about deleting the values. But that's just from what I've gathered, not something in black and white :)
The main reason I'm asking is that I intend to write a hand-over application that gets a string, does some I/O, and passes it along to another subsystem (a bootloader for a Raspberry Pi, for instance), with the interface written in Python (however odd that must sound to some people's ears). I'm not worried that the data is compromised during the I/O calculations, but rather that a memory dump might occur in between the two subsystem handovers, or that the system is frozen (say, a hibernation) 20 minutes after boot and the variable is somehow still in memory despite me doing a del passWd as fast as I could :)
(PS: I asked on Super User and they referred me here. Sorry for the poor grammar!)
Unless you use custom-coded input methods to get the password, it will be in many more places than just your immutable string. So don't worry too much.
The OS should take care that any data from your process is cleared before the memory is allocated to another process. This may of course fail if the page is copied to disk (swapped out or hibernated).
Secure password entry is not easy. Maybe you can find a special library or module that handles this.
I finally went with two solutions.
LD_PRELOAD to replace the functionality of Python's string handling at a lower level.
The other option, which is a bit easier, was to develop my own C library that has more functionality than what Python offers through its standard string handling.
Mainly, the C code has a shread() function that writes over the memory area where the string "was" stored, plus some other error checks. (A rough sketch of how such a library can be driven from Python follows.)
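For illustration only, a hedged sketch of calling such a library via ctypes; libshred.so, the shread(ptr, len) signature, and the bytearray handling are all hypothetical, modelled on the description above rather than on the actual code:

import ctypes

# Hypothetical library and signature, matching the description above.
libshred = ctypes.CDLL("./libshred.so")
libshred.shread.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
libshred.shread.restype = ctypes.c_int

def shred(secret):
    # Overwrite a mutable buffer in place; this only works because bytearray
    # (unlike str) exposes writable memory.
    buf = (ctypes.c_char * len(secret)).from_buffer(secret)
    if libshred.shread(ctypes.addressof(buf), len(secret)) != 0:
        raise RuntimeError("shread() reported an error")

password = bytearray(b"hunter2")  # keep secrets in mutable buffers, never str
# ... use password ...
shred(password)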
However, @Ber gave me a good enough answer to start developing my own solution, since (as he pointed out) there is no secure method in Python: Python stores strings in way too many places and relies on the OS (which, on its own, isn't a bad thing, except when you don't trust the OS you are installing your relatively secure application on).
I'm a C programmer and I'm getting quite good with Python. But I still have some problems getting my mind around the OO awesomeness of Python.
Here is my current design problem:
The end "product" is a JSON data structure created in Python (and passed to Javascript code) containing different types of data like:
{ type: url, {urlpayloaddict} }
{ type: text, {textpayloaddict} }
...
My Javascript knows how to parse and display each type of JSON response.
I'm happy with this design. My question comes from handling this data in the Python code.
I obtain my data from a variety of sources: MySQL, a table lookup, an API call to a web service...
Basically, should I make a super class responseElement and specialise it for each type of response, then pass around a list of these objects in the Python code OR should I simply pass around a list of dictionaries that contain the response data in key value pairs. The answer seems to result in significantly different implementations.
I'm a bit unsure if I'm getting too object-happy?
In my mind, it basically goes like this: you should try to keep things the same where they are the same, and separate them where they're different.
If you're performing the exact same operations on and with the data, and it can all be represented in a common format, then there's no reason to have separate objects for it - translate it into a common format ASAP and Don't Repeat Yourself when it comes to implementing things that don't distinguish.
If each type/source of data requires specialized operations specific to it, and there isn't much in the way of overlap between such at the layer your Python code is dealing with, then keep things in separate objects so that you maintain a tight association between the specialized code and the specific data on which it is able to operate.
Do the different response sources represent fundamentally different categories or classes of objects? They don't appear to, the way you've described it.
Thus, various encode/decode functions and passing around only one type seems the best solution for you.
That type can be a dict or your own class, if you have special methods to use on the data (but those methods would then not care what input and output encodings were), or you could put the encode/decode pairs into the class. (Decode would be a classmethod, returning a new instance.)
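For instance, a tiny sketch of the encode/decode pairs living on a single class (the class and field names are invented for illustration):

class Response:
    def __init__(self, kind, payload):
        self.kind = kind          # e.g. 'url' or 'text'
        self.payload = payload    # the type-specific dict

    def encode(self):
        # Produce the directly JSON'able structure.
        return {"type": self.kind, "payload": self.payload}

    @classmethod
    def decode(cls, data):
        # decode is a classmethod returning a new instance, as suggested above.
        return cls(data["type"], data["payload"])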
Your receiver objects (which can perfectly well be instances of different classes, perhaps generated by a Factory pattern depending on the source of incoming data) should all have a common method that returns the appropriate dict (or other directly-JSON'able structure, such as a list that will turn into a JSON array).
Differently from what one answer states, this approach clearly doesn't require higher level code to know what exact kind of receiver it's dealing with (polymorphism will handle that for you in any OO language!) -- nor does the higher level code need to know "names of keys" (as, again, that other answer peculiarly states), as it can perfectly well treat the "JSON'able data" as a pretty opaque data token (as long as it's suitable to be the argument for a json.dumps later call!).
Building up and passing around a container of "plain old data" objects (produced and added to the container in various ways) for eventual serialization (or other such uniform treatment, but you can see JSON translation as a specific form of serialization) is a common OO pattern. No need to carry around anything richer or heavier than such POD data, after all, and in Python using dicts as the PODs is often a perfectly natural implementation choice.
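A short sketch of that shape (the class names, the jsonable method, and the payload fields are invented for illustration):

import json

class UrlResponse:
    def __init__(self, href, title):
        self.href, self.title = href, title

    def jsonable(self):
        # Common method: every receiver returns a directly JSON'able structure.
        return {"type": "url", "payload": {"href": self.href, "title": self.title}}

class TextResponse:
    def __init__(self, body):
        self.body = body

    def jsonable(self):
        return {"type": "text", "payload": {"body": self.body}}

# Higher-level code treats each result as an opaque JSON'able token...
responses = [UrlResponse("http://example.com", "Example"), TextResponse("hello")]

# ...and only cares that json.dumps accepts the whole container at the end.
print(json.dumps([r.jsonable() for r in responses]))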
I've had success with the OOP approach. Consider a base class with a "ToJson" method and have each subclass implement it appropriately. Then your higher level code doesn't need to know any detail about how the data was obtained...it just knows it has to call "ToJson" on every object in the list you mentioned.
A dictionary would work too, but it requires your calling code to know names of keys, etc and won't scale as well.
OOP I say!
Personally, I opt for the latter (passing around a list of data) wherever and whenever possible. I think OO is often misused/abused for certain things. I specifically avoid things like wrapping data in an object just for the sake of wrapping it in an object. So this, {'type':'url', 'data':{some_other_dict}} is better to me than:
class DataObject:
    def __init__(self):
        self.type = 'url'
        self.data = {some_other_dict}
But, if you need to add specific functionality to this data, like the ability for it to sort its data.keys() and return them as a set, then creating an object makes more sense.