Is there a Protostuff-equivalent Protobuf library for Python? - python

In Java, libraries like protostuff allow you to generate buffers from a Java POJO approximately like so:
Schema<Foo> schema = RuntimeSchema.getSchema(Foo.class);
...
protostuff = ProtostuffIOUtil.toByteArray(foo, schema, buffer);
I've been trying to find a similar solution for Python, but except for attempting to programmatically build Descriptors and FieldDescriptors (which come with their own challenges and problems since 4.x.x), I couldn't find anything. Is this simply impossible in Python, just isn't implemented anywhere, or am I missing something obvious here?

The alternative Python protobuf libraries of pure-protobuf and python-betterproto both have their own syntax that can be used directly from Python, without a .proto file.
However, neither works for completely plain Python objects, as you still need to specify the field types and tag numbers (example from pure-protobuf):
@message
@dataclass
class SearchRequest:
    query: str = field(1, default='')
    page_number: int32 = field(2, default=int32(0))
    result_per_page: int32 = field(3, default=int32(0))
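To see why the tag numbers and types cannot be inferred from a plain object, here is a minimal sketch that uses only the standard library. It is not the API of pure-protobuf or python-betterproto: the "tag" metadata key and the encode helper are made up for illustration, and only strings and non-negative ints are handled.

```python
from dataclasses import dataclass, field, fields


def encode_varint(value: int) -> bytes:
    """Encode a non-negative int as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative ints")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


@dataclass
class SearchRequest:
    # The tag number lives in the field metadata (hypothetical convention).
    query: str = field(default="", metadata={"tag": 1})
    page_number: int = field(default=0, metadata={"tag": 2})
    result_per_page: int = field(default=0, metadata={"tag": 3})


def encode(message) -> bytes:
    """Serialize str and int fields using the protobuf wire format."""
    out = bytearray()
    for f in fields(message):
        tag = f.metadata["tag"]
        value = getattr(message, f.name)
        if isinstance(value, str):
            data = value.encode("utf-8")
            # Wire type 2: length-delimited.
            out += encode_varint((tag << 3) | 2) + encode_varint(len(data)) + data
        elif isinstance(value, int):
            # Wire type 0: varint.
            out += encode_varint((tag << 3) | 0) + encode_varint(value)
        else:
            raise TypeError(f"unsupported field type for {f.name}")
    return bytes(out)


payload = encode(SearchRequest(query="hi", page_number=3, result_per_page=10))
print(payload)  # b'\n\x02hi\x10\x03\x18\n'
```

Every field on the wire is prefixed with a key built from its tag number and wire type, and neither can be derived from an attribute name alone, which is why these libraries ask you to declare them.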

Related

Why are extension attributes inaccessible in python protobuf objects?

I am attempting to read and analyze GTFS-realtime data from the NYC subway in Python. So far, I have successfully used both gtfs-realtime.proto and nyct-subway.proto to generate the proper Python classes and parsed the protobuf data into Python objects.
My problem comes when trying to access certain fields in these objects. For example, the header (feed.header) looks like this:
gtfs_realtime_version: "1.0"
incrementality: FULL_DATASET
timestamp: 1533111586
[nyct_feed_header] {
  nyct_subway_version: "1.0"
  trip_replacement_period {
    route_id: "A"
    replacement_period {
      end: 1533113386
...
I can access the first three attributes using dot access, but not nyct_feed_header. I suspect this is because it is part of the nyct-subway.proto extension, while the other three are part of the original.
I have found this attribute accessible in feed.header.ListFields(), but since that returns a list of (name, attribute) pairs, it is at best awkward to access.
Why aren't attributes from extensions accessible by dot access like the rest of them? Is there a better or more elegant way to access them than by using ListFields?
Extensions are accessed via the Extensions property on an object (see docs). E.g. with GTFS and the NYCT extensions:
import gtfs_realtime_pb2 as gtfs
import nyct_subway_pb2 as nyct
feed = gtfs.FeedMessage()
feed.ParseFromString(...)
feed.entity[0].trip_update.trip.Extensions[nyct.nyct_trip_descriptor].direction

Escape reserved keyword in Python [Protocol Buffer autogenerated class]

So I'm sort of in a fix.
I'm using Google Protocol Buffers, and it just so happens that one of the fields in the schema is named "from".
I'm using Python, so every time I try to access it, I get a syntax error.
[e.g. SomeClass.from -> syntax error]
Is there any way to access the field without using its identifier?
Maybe a way to escape reserved keywords in Python ? (One of the answers already says no, but...)
Or maybe some protobuf specific solution?
Thanks
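After you pull your data, you can always read the from field into a variable called from_ (the Pythonic way of avoiding clashes with reserved keywords) by using getattr(var, "from"), i.e.
msg = SomeClass()  # a protocol buffer message
from_ = getattr(msg, "from")
And then you can just use from_ as you would otherwise. Keep it as a separate variable rather than an attribute on the message, because protobuf messages reject assignment to names that are not fields.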
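A minimal sketch of reading and writing such a field with getattr and setattr. types.SimpleNamespace is only a stand-in for the generated message (an assumption so the snippet runs without protobuf installed); with a real message you would pass the message object, and setattr should behave the same way for scalar fields.

```python
from types import SimpleNamespace

# Stand-in for a generated protobuf message that has a field named "from".
msg = SimpleNamespace(**{"from": "alice", "to": "bob"})

# msg.from is a SyntaxError, so go through getattr/setattr instead.
from_ = getattr(msg, "from")
setattr(msg, "from", "carol")

print(from_)                  # alice
print(getattr(msg, "from"))   # carol
```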

How to turn Perl blessed objects into YAML that Python can read

We have a REST web service written in Perl Dancer. It returns perl data structures in YAML format and also takes in parameters in YAML format - it is supposed to work with some other teams who query it using Python.
Here's the problem -- if I'm passing back just a regular old perl hash by Dancer's serialization everything works completely fine. JSON, YAML, XML... they all do the job.
HOWEVER, sometimes we need to pass Perl objects back that the Python can later pass back in as a parameter to help with unnecessary loading, etc. I played around and found that YAML is the only one that works with Perl's blessed objects in Dancer.
The problem is that Python's YAML can't parse through the YAMLs of the Perl objects (whereas it can handle regular old perl hash YAMLs without an issue).
The perl objects start out like this in YAML:
First one:
--- &1 !!perl/hash:Sequencing_API
Second:
--- !!perl/hash:SDB::DBIO
It errors out like this.
yaml.constructor.ConstructorError: could not determine a constructor for the tag 'tag:yaml.org,2002:perl/hash:SDB::DBIO'
The regular files seem to get passed through like this:
---
fields:
library:
It seems like the extra stuff after --- is causing the issue. What can I do to address this? Or am I trying to do too much by passing around Perl objects?
The short answer is:
!! is YAML shorthand for tag:yaml.org,2002: ... as such, !!perl/hash is really tag:yaml.org,2002:perl/hash
Now you need to tell Python's YAML library how to deal with this type, so you add a constructor for it as follows:
import yaml

def construct_perl_object(loader, tag_suffix, node):
    # tag_suffix is the part of the tag after the registered prefix, e.g. "SDB::DBIO"
    return loader.construct_mapping(node, deep=True)  # load the blessed hash as a plain dict

# Registering the prefix covers every !!perl/hash:<Package> tag, not just SDB::DBIO
yaml.add_multi_constructor(u"tag:yaml.org,2002:perl/hash:", construct_perl_object, Loader=yaml.SafeLoader)
data = yaml.safe_load(yaml_string)
Or maybe just parse the tag out, or return None ... it's hard to test with just that one line, but that may be what you are looking for.

How can I change a Python object into XML?

I am looking to convert a Python object into XML data. I've tried lxml, but eventually had to write custom code for saving my object as xml which isn't perfect.
I'm looking for something more like pyxser. Unfortunately pyxser xml code looks different from what I need.
For instance I have my own class Person
class Person:
    name = ""
    age = 0
    ids = []
and I want to convert it into XML code looking like
<Person>
    <name>Mike</name>
    <age> 25 </age>
    <ids>
        <id>1234</id>
        <id>333333</id>
        <id>999494</id>
    </ids>
</Person>
I didn't find any method in lxml.objectify that takes object and returns xml code.
Best is rather subjective and I'm not sure it's possible to say what's best without knowing more about your requirements. However Gnosis has previously been recommended for serializing Python objects to XML so you might want to start with that.
From the Gnosis homepage:
Gnosis Utils contains several Python modules for XML processing, plus other generally useful tools:
xml.pickle (serializes objects to/from XML; API compatible with the standard pickle module)
xml.objectify (turns arbitrary XML documents into Python objects)
xml.validity (enforces XML validity constraints via DTD or Schema)
xml.indexer (full text indexing/searching)
many more...
Another option is lxml.objectify.
Mike,
you can either implement rendering the object into XML yourself:
class Person:
    ...
    def toXml(self):
        print('<Person>')
        print('\t<name>...</name>')
        ...
        print('</Person>')
or you can transform Gnosis or pyxser output using XSLT.
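As a rough sketch of the hand-written route, the standard library's xml.etree.ElementTree can build the tree instead of printing tags by hand. The person_to_xml helper and the Person constructor below are illustrative assumptions, not part of lxml, Gnosis or pyxser.

```python
import xml.etree.ElementTree as ET


class Person:
    def __init__(self, name, age, ids):
        self.name = name
        self.age = age
        self.ids = ids


def person_to_xml(person):
    """Build the <Person> element tree and return it as a string."""
    root = ET.Element("Person")
    ET.SubElement(root, "name").text = person.name
    ET.SubElement(root, "age").text = str(person.age)
    ids = ET.SubElement(root, "ids")
    for value in person.ids:
        ET.SubElement(ids, "id").text = str(value)
    # encoding="unicode" returns str instead of bytes
    return ET.tostring(root, encoding="unicode")


xml_text = person_to_xml(Person("Mike", 25, [1234, 333333, 999494]))
print(xml_text)
```

Building elements rather than concatenating strings also takes care of escaping characters such as < and & in the values.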

Python string templater

I'm using this REST web service, which returns various templated strings as urls, for example:
"http://api.app.com/{foo}"
In Ruby, I can then use
url = Addressable::Template.new("http://api.app.com/{foo}").expand('foo' => 'bar')
to get
"http://api.app.com/bar"
Is there any way to do this in Python? I know about %() templates, but obviously they're not working here.
In Python 2.6 or later you can do this if you need exactly that syntax:
from string import Formatter
f = Formatter()
f.format("http://api.app.com/{foo}", foo="bar")
If you need to use an earlier python version then you can either copy the 2.6 formatter class or hand roll a parser/regex to do it.
Don't use a quick hack.
What is used there (and implemented by Addressable) are URI Templates. There seem to be several libs for this in python, for example: uri-templates. described_routes_py also has a parser for them.
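For orientation, here is a minimal sketch of the simplest form of URI Template expansion (plain {name} expressions, Level 1 in RFC 6570), using only the standard library. The expand function is a made-up helper, not the API of any of the libraries mentioned, and it ignores the operators ({+var}, {?var}, ...) that a real library supports.

```python
import re
from urllib.parse import quote

# Matches simple {name} expressions only.
_EXPRESSION = re.compile(r"\{([A-Za-z0-9_]+)\}")


def expand(template, **variables):
    """Expand {name} expressions, percent-encoding the substituted values."""
    def replace(match):
        # Undefined variables expand to the empty string.
        value = variables.get(match.group(1), "")
        # safe="" encodes everything except unreserved characters.
        return quote(str(value), safe="")
    return _EXPRESSION.sub(replace, template)


print(expand("http://api.app.com/{foo}", foo="bar"))    # http://api.app.com/bar
print(expand("http://api.app.com/{foo}", foo="a b/c"))  # http://api.app.com/a%20b%2Fc
```

The percent-encoding step is the main difference from plain string formatting: a value containing a slash or a space cannot break the URL structure.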
I cannot give you a perfect solution but you could try using string.Template.
You either pre-process your incoming URL and then use string.Template directly, like
In [5]: import re, string
In [6]: url="http://api.app.com/{foo}"
In [7]: up=string.Template(re.sub("{", "${", url))
In [8]: up.substitute({"foo":"bar"})
Out[8]: 'http://api.app.com/bar'
taking advantage of the default "${...}" syntax for replacement identifiers. Or you subclass string.Template to control the identifier pattern, like
class MyTemplate(string.Template):
    delimiter = ...
    pattern = ...
but I haven't figured that out.
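One way such a subclass could look, as a sketch rather than a tested recipe: string.Template expects the pattern to define the four named groups escaped, named, braced and invalid, so the version below keeps all four and lets only the braced form match. The class name BraceTemplate is made up for this example.

```python
import string


class BraceTemplate(string.Template):
    """Template that substitutes {name} instead of $name or ${name}."""
    delimiter = "{"
    pattern = r"""
    \{(?:
      (?P<escaped>\{)                 |  # two opening braces give one literal brace
      (?P<named>(?!))                 |  # unused: this group never matches
      (?P<braced>[_a-z][_a-z0-9]*)\}  |  # an identifier wrapped in braces
      (?P<invalid>)                      # any other opening brace is an error
    )
    """


url = BraceTemplate("http://api.app.com/{foo}").substitute(foo="bar")
print(url)  # http://api.app.com/bar
```

As with the default syntax, substitute raises KeyError for a missing name and ValueError for a malformed placeholder.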
