Using google.protobuf.Any in python file

I have the following .proto file:
syntax = "proto3";
import "google/protobuf/any.proto";
message Request {
    google.protobuf.Any request_parameters = 1;
}
How can I create a Request object and populate its fields? I tried this:
import ma_pb2
from google.protobuf.any_pb2 import Any
parameters = {"a": 1, "b": 2}
Request = ma_pb2.Request()
some_any = Any()
some_any.CopyFrom(parameters)
Request.request_parameters = some_any
But I have an error:
TypeError: Parameter to CopyFrom() must be instance of same class: expected google.protobuf.Any got dict.
UPDATE
Following @Kevin's suggestion, I added a new message to the .proto file:
message Small {
    string a = 1;
}
Now code looks like this:
Request = ma_pb2.Request()
small = ma_pb2.Small()
small.a = "1"
some_any = Any()
some_any.Pack(small)
Request.request_parameters = small
But at the last assignment I have an error:
Request.request_parameters = small
AttributeError: Assignment not allowed to field "request_parameters" in protocol message object.
What did I do wrong?

Any is not a magic box for storing arbitrary keys and values. The purpose of Any is to denote "any" message type, in cases where you might not know which message you want to use until runtime. But at runtime, you still need to have some specific message in mind. You can then use the .Pack() and .Unpack() methods to convert that message into an Any, and at that point you would do something like Request.request_parameters.CopyFrom(some_any).
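As a rough sketch of the Pack()/Unpack() flow described above: ma_pb2 only exists in the asker's project, so this uses StringValue, a well-known wrapper type bundled with the protobuf package, as a stand-in for a concrete message such as Small. The commented lines show how the same Any would be attached to Request, assuming the schema from the question.

```python
from google.protobuf.any_pb2 import Any
from google.protobuf.wrappers_pb2 import StringValue

# Stand-in for a generated message such as ma_pb2.Small (assumption:
# any concrete message type behaves the same way here).
small = StringValue(value="1")

some_any = Any()
some_any.Pack(small)  # stores the type URL plus the serialized message

# With the question's schema, a message field is filled by copying or
# packing in place, not by assignment:
#     request = ma_pb2.Request()
#     request.request_parameters.CopyFrom(some_any)
# or, without the intermediate Any:
#     request.request_parameters.Pack(small)

# The receiver checks the packed type, then unpacks into a concrete message.
unpacked = StringValue()
if some_any.Is(StringValue.DESCRIPTOR):
    some_any.Unpack(unpacked)
```

Assigning to request_parameters fails because composite (message-typed) fields cannot be assigned in the Python protobuf API; CopyFrom() or an in-place Pack() mutates the existing sub-message instead.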
So, if you want to store this specific dictionary:
{"a": 1, "b": 2}
...you'll need a .proto file which describes some message type that has integer fields named a and b. Personally, I'd see that as overkill; just throw your a and b fields directly into the Request message, unless you have a good reason for separating them out. If you "forget" one of these keys, you can always add it later, so don't worry too much about completeness.
If you really want a "magic box for storing arbitrary keys and values" rather than what I described above, you could use a Map instead of Any. This has the advantage of not requiring you to declare all of your keys upfront, in cases where the set of keys might include arbitrary strings (for example, HTTP headers). It has the disadvantage of being harder to lint or type-check (especially in statically-typed languages), because you can misspell a string more easily than an attribute. As shown in the linked resource, Maps are basically syntactic sugar for a repeated field like the following (that is, the on-wire representation is exactly the same as what you'd get from doing this, so it's backwards compatible to clients which don't support Maps):
message MapFieldEntry {
    key_type key = 1;
    value_type value = 2;
}
repeated MapFieldEntry map_field = N;

Related

Passing a function to an Object at Instantiation in C#

I am currently working on a project where I am trying to make a GUI for programming an I2C device. I have a bunch of register addresses and the values in those registers, both as hex strings, like so:
[Address, Value][0x00, 0x01][0x01, 0xFF][0x02, 0xA0]... //not code, just abstract representation
Each register value represents something different. I have to convert each register value from a hex string to its associated human-understandable representation. My plan for this is to create a dictionary whose keys are the register addresses and whose values are a register object. The register object would contain information about the register like a description and a conversion function which takes in the hex string and outputs a converted value. The conversion function is different for every register since they represent different things. I want to create a generic register class and pass in each register's unique conversion function at instantiation. Passing this function is where I'm not sure if I'm making things more complicated than they need to be.
My code currently looks like this:
using System;
class Program
{
    private class my_register
    {
        public string Description { get; }
        // Using a delegate to be able to store a function specified outside of the class
        public delegate string convert_del(string reg_val);
        public convert_del conversion_func;
        // The constructor wraps the function passed at instantiation in the delegate
        public my_register(string desc, Func<string, string> convert_func_input)
        {
            this.Description = desc;
            this.conversion_func = new convert_del(convert_func_input);
        }
    }
    // Now I can create the object and pass in the function
    static void Main()
    {
        // The lambda is just a simple placeholder to show that I can pass a function
        my_register first_reg = new my_register("Temp",
            reg_val => (Convert.ToInt32(reg_val, 16) + 10).ToString());
        Console.WriteLine(first_reg.conversion_func("0x0A")); // output is 20
    }
}
This code works at least for the minimal testing I ran. This took me a long time to figure out, perhaps because I just didn't understand delegates well, but I am wondering if this is a convoluted way of going about this in C#. I come from a python background, though I'm not particularly skilled in that either. The way I would do this in python is as follows:
class my_register(object):
    def __init__(self, desc, conversion):
        self.Description = desc
        self.conversion = conversion
def temp_convert(reg_val):
    # base 0 lets int() infer the base from the "0x" prefix
    return str(int(reg_val, 0) + 10)
first_reg = my_register("Temp", temp_convert)
first_reg.conversion("0x0A")  # returns '20'
The Pythonic way just seems way simpler, so I'm wondering if there is a better, more canonical way of achieving this function passing in C#, or if there is a way to avoid passing the function altogether?
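For reference, the register-table plan described above (a dictionary keyed by address whose values carry a description and a conversion function) might look like this in Python. The addresses, descriptions and the second conversion are made up for illustration; only temp_convert mirrors the placeholder from the question.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Register:
    description: str
    convert: Callable[[str], str]  # hex string in, readable value out


def temp_convert(reg_val: str) -> str:
    # Placeholder conversion from the question: parse hex, add 10.
    return str(int(reg_val, 16) + 10)


# Hypothetical register map; the entries are illustrative only.
REGISTERS: Dict[str, Register] = {
    "0x00": Register("Temp", temp_convert),
    "0x01": Register("Raw byte", lambda reg_val: str(int(reg_val, 16))),
}


def decode(address: str, reg_val: str) -> str:
    """Look up the register and apply its own conversion function."""
    register = REGISTERS[address]
    return f"{register.description}: {register.convert(reg_val)}"
```

With this layout, decode("0x00", "0x0A") looks up the register and applies its conversion, so adding a register is a one-line change to the table.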

What's the most Pythonic way to pull from structured data with an inconsistent maximum "depth"?

I've got a JSON file holding different dialog lines for a Discord bot to use, sorted by which bot command triggers the line. It looks something like this:
{
    "!remind": {
        "responses": {
            "welcome": "Reminder set for {time}.",
            "reminder": "At {time} you asked me to remind you {thing}."
        },
        "errors": {
            "R0": "Invalid reminder time.",
            "R1": "Reminder time is in the past."
        },
        "help": "To set a reminder say `!reminder [time] [thing]`"
    },
    "<!timezone, !spoilimage, !crosspost, etc.>": {
        <same structure>
    }
}
I have a function that's meant to access the values stored in the JSON file, do any necessary formatting using kwargs, and return a string. My original approach was
def dialog(command, category, name, **fmt):
    json_data = <json stuff>
    return json_data[command][category][name].format(**fmt)
# Sample call:
pastcommand = <magic>
reply = dialog("!remind", "responses", "reminder", time=pastcommand.time, thing=pastcommand.message)
# Although in practice I've made wrapper methods to avoid having to specify all of these args each time
But this will only work for "responses" and "errors", not "help," since in "help" the message to send is a level "shallower".
Two other things to note:
It's unlikely there will ever need to be anything in "help" other than the single value.
Currently there are no name conflicts between keys in different subcategories, and it's very easy to keep it that way. However, "responses"/"errors"/"help" is consistent across all categories, and some key names are repeated across categories (although I could change that if necessary).
So, in terms of fixing this, I could always just restructure the JSON file, something like
"help": {"main": "To set a reminder say `!reminder [time] [thing]`"}
but I don't like the idea of turning a string into a dict containing just a single string, just to satisfy the constraints of a function that pulls it.
Beyond that, I've run through a number of options, namely: explicitly checking the category and making it a special case (if category == "help"); trying both options with a try/except block, and using pandas.json_normalize (which I'm pretty sure would work? I haven't actually worked with it. Either way, any time a seemingly simple problem brings me to a third-party library, it makes me suspect I'm doing something wrong.).
What I've settled on, so far, is this:
def dialog(*json_keys, **fmt):
    json_data = <json stuff>
    current_level = json_data
    for key in json_keys:
        # Let's pretend I did error-handling here.
        current_level = current_level[key]
    return current_level.format(**fmt)
It's a lot more elegant and more flexible than any of the other things I considered, but I'm self-taught and pretty inexperienced, and I'm wondering if I'm overlooking some better approach.
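A runnable sketch of the variable-depth lookup above, with the elided error handling filled in one possible way. The inline JSON is a trimmed stand-in for the real file, and raising LookupError is an assumption about how failures should surface, not something the question specifies.

```python
import json
from functools import reduce

# Inline stand-in for the JSON file from the question (trimmed).
DIALOG_JSON = """
{
  "!remind": {
    "responses": {
      "reminder": "At {time} you asked me to remind you {thing}."
    },
    "errors": {"R0": "Invalid reminder time."},
    "help": "To set a reminder say `!reminder [time] [thing]`"
  }
}
"""
DIALOG = json.loads(DIALOG_JSON)


def dialog(*json_keys, **fmt):
    """Walk DIALOG along json_keys and format the string found there."""
    try:
        node = reduce(lambda level, key: level[key], json_keys, DIALOG)
    except (KeyError, TypeError) as exc:
        # KeyError: unknown key. TypeError: tried to index into a string,
        # i.e. the path goes deeper than the data does.
        raise LookupError(f"no dialog entry at {json_keys!r}") from exc
    if not isinstance(node, str):
        # The path stopped at a category rather than at a dialog line.
        raise LookupError(f"{json_keys!r} names a category, not a line")
    return node.format(**fmt)
```

Both depths then go through the same function: dialog("!remind", "help") and dialog("!remind", "responses", "reminder", time=..., thing=...).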

Safe and generic serialization in Python

I want to (de)serialize simple objects in Python to a human-readable (e.g. JSON) format. The data may come from an untrusted source. I really like how the Rust library, serde, works:
#[derive(Serialize, Deserialize, Debug)]
struct Point {
    x: i32,
    y: i32,
}
fn main() {
    let point = Point { x: 1, y: 2 };
    // Convert the Point to a JSON string.
    let serialized = serde_json::to_string(&point).unwrap();
    // Prints serialized = {"x":1,"y":2}
    println!("serialized = {}", serialized);
    // Convert the JSON string back to a Point.
    let deserialized: Point = serde_json::from_str(&serialized).unwrap();
    // Prints deserialized = Point { x: 1, y: 2 }
    println!("deserialized = {:?}", deserialized);
}
I'd like to achieve something like this in Python. Since Python is not statically typed, I'd expect the syntax to be something like:
deserialized = library.loads(data_str, ClassName)
where ClassName is the expected class.
jsonpickle is bad, bad, bad. It performs absolutely no sanitization, and using it on untrusted input can lead to arbitrary code execution.
There are serialization libraries such as lima, marshmallow and kim, but all of them require manually defining serialization schemas. That, in fact, leads to code duplication, which is bad.
Is there anything I could use for simple, generic yet secure serialization in Python?
EDIT: other requirements, which were implicit before
Handle nested serialization (serde can do it: https://gist.github.com/63bcd00691b4bedee781c49435d0d729)
Handle built-in types, i.e. be able to serialize and deserialize everything that the built-in json module can, without special treatment of built-in types.

Since Python doesn't require type annotations, any such library would need to either
use its own classes
take advantage of type annotations.
The latter would be the perfect solution but I have not found any library doing that.
I did find a module, though, that only requires defining one class per model: https://github.com/dimagi/jsonobject
Usage example:
import jsonobject
class Node(jsonobject.JsonObject):
    id = jsonobject.IntegerProperty(required=True)
    name = jsonobject.StringProperty(required=True)
class Transaction(jsonobject.JsonObject):
    provider = jsonobject.ObjectProperty(Node)
    requestor = jsonobject.ObjectProperty(Node)
req = Node(id=42, name="REQ")
prov = Node(id=24, name="PROV")
tx = Transaction(provider=prov, requestor=req)
js = tx.to_json()
tx2 = Transaction(js)
print(tx)
print(tx2)
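For comparison, the type-annotation route mentioned above can be sketched with only the standard library, using dataclasses as the model classes. This is a minimal illustration, not a library: the loads/dumps names follow the syntax the question wished for, and it only handles fields annotated with plain classes (int, str, float, bool) or other dataclasses. Type matching is deliberately exact, so a JSON integer is rejected for a float field.

```python
import dataclasses
import json
from typing import get_type_hints


def from_dict(cls, data):
    """Build dataclass `cls` from `data`, rejecting unexpected shapes."""
    if not isinstance(data, dict):
        raise TypeError(f"expected a JSON object for {cls.__name__}")
    hints = get_type_hints(cls)
    names = {f.name for f in dataclasses.fields(cls)}
    if set(data) != names:
        raise ValueError(f"{cls.__name__} expects exactly the keys {sorted(names)}")
    kwargs = {}
    for name in names:
        expected, value = hints[name], data[name]
        if dataclasses.is_dataclass(expected):
            value = from_dict(expected, value)  # nested object
        elif type(value) is not expected:
            # Exact match only, so True is not accepted where int is declared.
            raise TypeError(f"{cls.__name__}.{name} must be {expected!r}")
        kwargs[name] = value
    return cls(**kwargs)


def loads(data_str, cls):
    """Parse JSON text and validate it against the dataclass `cls`."""
    return from_dict(cls, json.loads(data_str))


def dumps(obj):
    """Serialize a (possibly nested) dataclass instance to JSON text."""
    return json.dumps(dataclasses.asdict(obj))


@dataclasses.dataclass
class Point:
    x: int
    y: int


@dataclasses.dataclass
class Segment:
    start: Point
    end: Point
```

Because only json.loads touches the untrusted text and objects are built solely from the declared fields, no code from the input is executed; loads(dumps(Segment(Point(0, 0), Point(3, 4))), Segment) round-trips the nested structure.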

For Python, I would start by just checking the size of the input. The only security risk in running json.load() is a denial-of-service attack from someone sending an enormous file.
Once the JSON is parsed, consider running a schema validator such as PyKwalify.
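The size check could be as small as the following sketch. The 64 KiB limit and the safe_loads name are arbitrary choices for illustration; schema validation would follow as a separate step.

```python
import json

MAX_JSON_BYTES = 64 * 1024  # hypothetical limit; tune for your application


def safe_loads(raw: bytes):
    """Parse JSON only if the payload is within the size limit."""
    if len(raw) > MAX_JSON_BYTES:
        raise ValueError(f"payload of {len(raw)} bytes exceeds {MAX_JSON_BYTES}")
    return json.loads(raw)
```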

Require a `oneof` in protobuf?

I want to make a protobuf Event message that can contain several different event types. Here's an example:
message Event {
    required int32 event_id = 1;
    oneof EventType {
        FooEvent foo_event = 2;
        BarEvent bar_event = 3;
        BazEvent baz_event = 4;
    }
}
This works fine, but one thing that bugs me is that EventType is optional: I can encode an object with only an event_id and protobuf won't complain.
>>> e = test_pb2.Event()
>>> e.IsInitialized()
False
>>> e.event_id = 1234
>>> e.IsInitialized()
True
Is there any way to require the EventType to be set? I'm using Python, if that matters.

According to the Protocol Buffers documentation, the required field rule is not recommended and has already been removed in proto3:
Required Is Forever: You should be very careful about marking fields as required. If at some point you wish to stop writing or sending a required field, it will be problematic to change the field to an optional field – old readers will consider messages without this field to be incomplete and may reject or drop them unintentionally. You should consider writing application-specific custom validation routines for your buffers instead. Some engineers at Google have come to the conclusion that using required does more harm than good; they prefer to use only optional and repeated. However, this view is not universal.
As the documentation says, you should consider using application-specific validation instead of marking fields as required.
There is no way to mark a oneof as "required" (even in proto2) because at the time oneof was introduced, it was already widely accepted that fields probably should never be "required", and so the designers did not bother implementing a way to make a oneof required.
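In Python, such an application-specific check can lean on WhichOneof(), which generated messages provide and which returns None when no member of the oneof is set. The require_oneof helper below is a hypothetical name; it is written against any object with that method, so it does not need the asker's test_pb2 module to run.

```python
def require_oneof(message, oneof_name):
    """Return the name of the field set in `oneof_name`, or raise ValueError."""
    # WhichOneof() returns the name of the populated member of the oneof,
    # or None when nothing in the group has been set.
    which = message.WhichOneof(oneof_name)
    if which is None:
        raise ValueError(f"{type(message).__name__}.{oneof_name} must be set")
    return which
```

With the message from the question, calling require_oneof(e, "EventType") before serializing would reject an Event that only has event_id set.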

Using options syntax, there are ways to specify validation rules and auto-generate the code for the validation routines.
You can use https://github.com/envoyproxy/protoc-gen-validate like this:
import "validate/validate.proto";
message Event {
    required int32 event_id = 1;
    oneof EventType {
        option (validate.required) = true;
        FooEvent foo_event = 2;
        BarEvent bar_event = 3;
        BazEvent baz_event = 4;
    }
}
"You should consider writing application-specific custom validation routines for your buffers instead." And here we are auto-generating such custom validation routines.
But wait, does this go against the spirit of the protobuf spec? Why is required bad and validate good? My own answer is that the protobuf spec cares very much about "proxies", i.e. software that serializes and deserializes messages but has almost no business logic of its own. Such software can simply skip the validation (it's optional), but it cannot skip required (a missing required field makes the message unparseable).
On the business-logic side, none of this has been a big problem in my experience.

Python TypeError: 'TagList' object is not iterable

This happens all the time. A function returns an object that I can't read. Here:
discoverer = GstPbutils.Discoverer()
discoverer.connect('discovered', on_discovered)
vinfo = discoverer.discover_uri(self.loaded_file)
print(vinfo.get_tags())
Returns this:
<Gst.TagList object at 0x7f00a360c0a8 (GstTagList at 0x7f00880024a0)>
But when I try to do this:
tags = vinfo.get_tags()
for tag in tags:
    print(tag)
I get this:
TypeError: 'TagList' object is not iterable
But when I read the docs for this data structure, I understand it to be a... list? Beyond telling me how to get the tags, can somebody show me how to read those docs? Also, am I missing some introspection methods and tools that I could use to discover what the objects I encounter are and how they work?

This is all hypothetical, as I have never used Python with GStreamer.
The documentation does call it a list, but that may only describe its internal structure. Remember that the Python bindings are just bindings: everything works much as it does in C (unless the bindings implement something nicer). So ask what you would do in C to iterate over the tags. To find that out, you have to look through the docs and check all the available functions.
You also have to think about how the object you are using might be implemented, given what it represents. This is a list of tags in which each tag has a different type (one is a string, another is an int, and so on), so you cannot simply iterate over it.
So I think you have two options, depending on what you want to do with the tags.
1. Serialize to a string and work with that:
I am not sure, but in C there is to_string, which may do the same thing as to_string in Python, so try that if you are only interested in the tag names (or whatever it returns).
2. Use the built-in foreach with a callback:
tags = vinfo.get_tags()
tags.foreach(my_callback, self)
And in your callback:
def my_callback(tag_list, tag, user_data):
    print(tag)
    # Do whatever you want with tag_list here.
    # No cast is needed in Python: user_data is the object that was
    # passed to foreach() (self in the call above).
    user_data.your_method(tag)  # your_method is a placeholder name
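A third option, equally hedged: the C API has gst_tag_list_n_tags() and gst_tag_list_nth_tag_name(), and the sketch below assumes the Python bindings expose them as n_tags() and nth_tag_name() methods on the tag list. The tag_names helper is a hypothetical name and is written against any object offering those two methods.

```python
def tag_names(tag_list):
    """Return the tag names stored in a Gst.TagList-like object."""
    # n_tags() and nth_tag_name() mirror gst_tag_list_n_tags() and
    # gst_tag_list_nth_tag_name() from the C API (assumed to be exposed
    # under these names by the bindings).
    return [tag_list.nth_tag_name(i) for i in range(tag_list.n_tags())]
```

Usage would then be: for name in tag_names(vinfo.get_tags()): print(name). As for introspection, dir(tags) and help(tags) in an interactive session list the methods the bindings actually expose on the object.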
