Get the parent message of a protobuf message (in Python)

Is there any officially supported way to get the parent message for a given ProtoBuf message in Python? The way the Python protobuf interface is designed, we are guaranteed that each message will have at most one parent. It would be nice to be able to navigate from a message to its parent without building an external index.
Clearly, this information is present, and I can use the following code to get a weak pointer to the parent of any given message:
>>> my_parent = my_message._listener._parent_message_weakref
However, this uses internal attributes -- I would much rather use officially supported methods if possible.
If there is no officially supported way to do this, then I'll need to decide whether to build an external child→parent index (which could hurt performance), or to use this "hackish" method (appropriately wrapped).

After looking into this further (reading the source code), it's clear that there's no officially supported way to do this in Python.
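If you do go the wrapped-hack route, a minimal sketch might look like this. `_listener` and `_parent_message_weakref` are protobuf internals (exactly the attributes shown above), not public API, so the access is guarded with getattr and kept in one place:

```python
def get_parent(message):
    """Return the parent of a protobuf message, or None for a top-level one.

    Relies on protobuf internals (_listener._parent_message_weakref),
    which are not part of the public API and may change between releases.
    """
    listener = getattr(message, "_listener", None)
    return getattr(listener, "_parent_message_weakref", None)
```

Keeping the attribute names inside a single function means that if a protobuf upgrade breaks them, there is only one place to fix.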

Will PKCS11 always find objects in the same order?

I have observed that both the bash command and what is probably the corresponding method from the Python PyKCS11 library seem to always find objects in the same order. My code relies on this being true, but I have not read this anywhere; I have only observed it.
In the terminal:
$ pkcs11-tool --list-objects
Using slot 0 with a present token (0x0)
Public Key Object; RSA 2048 bits
  label:  bob_key
  ID:     afe438bbe0e0c2784c5385b8fbaa9146c75d704a
  Usage:  encrypt, verify, wrap
Public Key Object; RSA 2048 bits
  label:  alice_key
  ID:     b03a4f6c375e8a8a53bd7a35947511e25cbdc34b
  Usage:  encrypt, verify, wrap
With Python:
objects = session.findObjects([(CKA_CLASS, CKO_PUBLIC_KEY)])
for i, object in enumerate(objects):
    d = object.to_dict()
    print(d['CKA_LABEL'])
output:
bob_key
alice_key
objects is of type list and each element in objects is of type <class 'PyKCS11.CK_OBJECT_HANDLE'>
Will session.findObjects([(CKA_CLASS, CKO_PRIVATE_KEY)]), when run from a logged-in session, also always return a list in exactly the same order as the expression above? In this case with two keys, I would never want to see Alice come before Bob.
(I wanted to write a comment, but it got quite long...)
PKCS#11 does not guarantee any specific order of returned object handles so it is up to the particular implementation.
Even though your implementation might seem to be consistently giving the same order of objects, there are situations in which this could unexpectedly change:
key renewal (keys do not last forever; you will need to generate new keys in the future)
middleware upgrade (newer implementations might return objects in a different order)
HSM firmware upgrade (major upgrades might change the way objects are stored and change object enumeration order)
HSM recovery from backup (object order can change after HSM restore)
host OS data recovery (some implementations store HSM objects encrypted in external folders, and the object search order might match the directory listing order, which could change without warning)
HSM change (are you sure that you will be using the same device for the whole lifetime of your application?)
Relying on undefined behaviour is bad practice in general; in security code especially, you should be very cautious.
It is definitely worth the time to stay on the safe side.
I would recommend performing a separate search for each required object (using some strong identifier, e.g. the label) -- this way you can perform additional checks (e.g. enforce the expected object type, ensure that the object is unique, etc.).
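A sketch of that per-object lookup: the helper name and the idea of passing a complete attribute template are mine, while findObjects and the (CKA_CLASS, CKA_LABEL) template form are standard PyKCS11 usage:

```python
def find_unique_object(session, template):
    """Return the single object handle matching the attribute template.

    Raises LookupError instead of silently picking a handle, so a
    missing or non-unique label is caught immediately.
    """
    matches = session.findObjects(template)
    if not matches:
        raise LookupError("no object matches template %r" % (template,))
    if len(matches) > 1:
        raise LookupError("template %r matches %d objects, expected exactly 1"
                          % (template, len(matches)))
    return matches[0]

# Usage with a real token (requires a logged-in session):
# from PyKCS11 import CKA_CLASS, CKA_LABEL, CKO_PUBLIC_KEY
# bob = find_unique_object(
#     session, [(CKA_CLASS, CKO_PUBLIC_KEY), (CKA_LABEL, "bob_key")])
```

This way the code never depends on enumeration order: each key is fetched by its own identifier, and any surprise (renamed, duplicated, or missing label) fails loudly.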
A similar example is Cryptoki object handle re-use. PKCS#11 states that an object handle is bound to a particular session (i.e. if you obtained an object handle in session A, you should not use it in session B -- even if both sessions are running in the same application).
There are implementations that preserve the object handle for the same object across sessions. There are even implementations that preserve the same object handle across different applications (i.e. if you get object handle 123 in application A, you will get object handle 123 in application B for the same object).
This behaviour is even described in the respective developer manual. But if you ask the vendor whether you can rely on it, you are told that there are some corner cases for some setups and that you must perform additional checks to be 100% sure that it will work as expected...
Good luck with your project!

How to generate python class files from protobuf

I am trying to transfer large amounts of structured data from Java to Python. That includes many objects that are related to each other in some form or another. When I receive them in my Python code, it's quite ugly to work with the types that are provided by protobuf. My Vim IDE crashed when trying to use autocomplete on the types, PyCharm doesn't complete anything, and generally it just seems absurd that they don't provide some clean class definition for the different types.
Is there a way to get IDE support while working with protobuf messages in python? I'm looking at 20+ methods handling complex messages and without IDE support I might as well code with notepad.
I understand that protobuf is using metaclasses (although I don't know why they do that). Maybe there is a way to generate python class files from that data or maybe there is something similar to typescript typing files.
Did I maybe misuse protobuf? I believed I would describe my domain model in a way that may be used across languages. In Java I am happy with the generated classes and I can use them easily. Should I maybe have used something like swagger.io instead?
If you are using a recent Python (3.7+) then https://github.com/danielgtaylor/python-betterproto (disclaimer: I'm the author) will generate very clean Python dataclasses as output which will give you proper typing and IDE completion support.
For example, this input:
syntax = "proto3";

package hello;

// Greeting represents a message you can tell a user.
message Greeting {
  string message = 1;
}
Would generate the following output:
# Generated by the protocol buffer compiler. DO NOT EDIT!
# sources: hello.proto
# plugin: python-betterproto
from dataclasses import dataclass

import betterproto


@dataclass
class Greeting(betterproto.Message):
    """Greeting represents a message you can tell a user."""

    message: str = betterproto.string_field(1)
In general the output of this plugin mimics the *.proto input and is very easy to read if you happen to jump to definition on a message or field. It's been a huge improvement for me personally over the official Google compiler plugin, and supports async gRPC out of the box as well.
As of now, nothing like that is available. You might want to follow this issue: https://github.com/google/protobuf/issues/2638 to be up to date.
mypy-protobuf generates the type hint files. But as discussed here, this works only with protobuf 3.0 and Python 2.7 onwards.

Get types in Python expression using MyPy as library

I have some Python source code, and want to find out the type of a variable. For example, given the string
"""
greeting = "Hello"
"""
I want to have get_type('greeting') == str. Or a more complex example:
"""
def test(input: str):
    output = len(input)
    return output
"""
In pseudocode, I want to be able to do something like:
>>> m = parse_module()
>>> m.functions['test'].locals['output'].get_type()
int
It seems this should be possible with type annotations and MyPy in Python 3, but I can't figure out how. IDEs like VS Code have become very good at guessing the types in Python code, which is why I'm guessing there must be an exposed way to do this.
There seems to be a module, typed-ast, which is also used by MyPy, that gets me part of the way there. However, it does no type inference or propagation; it just gives me the explicit annotations, as far as I understand. MyPy has an API, but it only lets you run the checker, and returns the same error messages as the command-line tool. I am looking for a way to "reach into" MyPy and get some of the inferred information out -- or some alternative solution I haven't thought of.
Mypy currently has an extremely primitive, bare-bones API, which you can find "documented" within the source code here: https://github.com/python/mypy/blob/master/mypy/api.py. To use it, you essentially need to write your string to a temporary file which you later clean up.
You can perhaps combine this with the reveal_type(...) special directive (and perhaps even the hidden --shadow-file option) to typecheck your string.
The other alternative is to reverse engineer and re-implement pieces of mypy's main.py, essentially hijacking their internal API. I don't really think this will be hard, just somewhat ugly and fragile.
(Note that mypy can theoretically support typechecking arbitrary strings, and the core devs aren't opposed to extending the API for mypy in principle -- it's just that mypy is still under active development which means implementing an API has been very low priority for a while now. And since mypy is still actively being worked on/extended, the devs are somewhat reluctant to commit to implementing a more complex API that they'll subsequently have to support. You can find more context and details regarding the current state of the API in mypy's issue tracker.)

How to specify an alias for a Python class/attribute/constant/method in Sphinx?

I'm currently writing my Python module's documentation, using Sphinx.
While documenting some functions, I found myself writing things like:
"""
Some documentation.
:param foo: My param.
:raises my_module.some.wicked.but.necessary.hierarchy.MyException: Something bad happened.
"""
This works fine, and Sphinx even links my_module.some.wicked.but.necessary.hierarchy.MyException to the documentation of my exception class.
However, I can see two issues here:
Having to type the complete module path to the exception is tedious. Not a big deal, but I can understand how this avoids ambiguity when actually parsing the documentation, so I might live with that.
The generated documentation also lists the complete name (including the module path).
This second point makes the output rather hard to read and not nice at all. It clutters the documentation and doesn't bring much since one can click the link to get a complete definition of the exception class anyway.
I tried to write it as a relative path (using ..hierarchy.MyException, for instance), but Sphinx would not find the class and the link would be broken.
Is there a way to define a default alias/caption that will be used instead of the complete path whenever I reference my exception class? I'd obviously like the link to remain as it is: I just want a nicer (shorter) caption.
If not, is there an option somewhere in Sphinx that tells it to avoid displaying the complete module path for some objects? An option of some sort?
Try:
:raises :py:class:`~.MyException`: something bad happened
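For instance, in docstring form (the function name do_something is hypothetical; the module path is the one from the question). The leading ~ on a Sphinx cross-reference keeps the full link target but renders only the final component, MyException:

```python
def do_something(foo):
    """Some documentation.

    :param foo: My param.
    :raises :py:class:`~my_module.some.wicked.but.necessary.hierarchy.MyException`:
        Something bad happened.
    """
```

The link still points at the fully qualified class; only the visible caption is shortened.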

Swig / Python memory leak detected

I have a very complicated class for which I'm attempting to make Python wrappers in SWIG. When I create an instance of the item in Python, however, I'm unable to initialize certain data members without receiving the message:
>>> myVar = myModule.myDataType()
swig/python detected a memory leak of type 'MyDataType *', no destructor found.
Does anyone know what I need to do to address this? Is there a flag I could be using to generate destructors?
SWIG always generates destructor wrappers (unless the %nodefaultdtor directive is used). However, in cases where it doesn't know anything about a type, it will generate an opaque pointer wrapper, which will cause leaks (and the above message).
Please check that myDataType is a type that is known by SWIG. Re-run SWIG with debug messages turned on and check for any messages similar to
Nothing is known about Foo base type - Bar. Ignored
Receiving a message as above means that SWIG doesn't know your type hierarchy to the full extent and thus operates on limited information - which could cause it to not generate a dtor.
The error message is pretty clear to me: you need to define a destructor for this type.
