I have a Python script that receives an application/x-protobuf file from YouTube.
This is part of the file. The dump is mostly unprintable binary; the readable strings in it look like this:
*youtubei#playerResponse
yt_ad 1
e~11200142,901816,936105,9407053,9407664,9407715,9408142,9410705,9412913,9415294,9416137,9417116,9417192,9417455,9418117,94182142
InnerTubeBuildLabel youtube_20150727_RC2
InnerTubeChangelist 99168778
innertube.build.changelist 99168778
innertube.build.label youtube_20150727_RC2
innertube.build.timestamp 1437996969
innertube.build.variants.checksum 47cbe83e1d9f5a44654ab7896473362e
innertube.client_name 3
innertube.client_version 10.28
(the remaining bytes are unprintable binary)
I want to generate Python classes from this file.
I used protoc to decode the file
cat binary_file | protoc --decode_raw > decoded_file
Then I used this command to generate the classes
protoc -I=/root --python_out=$DST_DIR /root/decoded_file
However, this command always fails with the error "Expected top-level statement (e.g. "message").".
The input to protoc is a .proto source file declaring the overall structure of the protocol. It looks like you're trying to use an actual encoded message as input. This won't work -- these aren't the same thing.
There is no automated way to reverse-engineer a .proto file from a message instance, since an encoded message does not contain things like type names or field names and contains only limited information about actual field types. You can use the output of --decode_raw to make guesses about the original .proto file, but this is a reverse-engineering task that requires human analysis, not something that can be done by a program.
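If you do go down the hand-analysis route, the workflow is roughly: study the --decode_raw output, hand-write a guessed .proto (with message and field names you invent yourself), compile it with protoc --python_out, and then parse the binary file with the generated class. A minimal sketch of that last step; guessed.proto, guessed_pb2 and PlayerResponse are hypothetical names you would choose yourself, not real YouTube types:

# Sketch only: assumes you have already hand-written guessed.proto from the
# --decode_raw output and run:  protoc -I=. --python_out=. guessed.proto
import guessed_pb2  # hypothetical module generated by protoc

msg = guessed_pb2.PlayerResponse()  # hypothetical message you defined yourself
with open("binary_file", "rb") as f:
    msg.ParseFromString(f.read())

print(msg)  # prints the parsed fields using your guessed names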
Related
I want to export my crash from Xcode to a file. I've created the file crash.log on my desktop, but I get an error when I try to export the crash.
crashlog -c ~/Desktop/crash.log
Error:
error: python exception: substring not found
Does your crashlog start with the following banner by any chance?
-------------------------------------
Translated Report (Full Report Below)
-------------------------------------
If so, then you're dealing with the new JSON crashlog format. Nobody wants to read the raw JSON, so Xcode.app and Console.app are showing you a textual, user-readable version of the crash, even though the actual file contains just JSON. The textual representation of the crashlog you're seeing is not exactly the same as the old textual crashlog format, so LLDB doesn't know how to parse it. That means you cannot copy/paste it into a file and symbolicate it in LLDB.
As I mentioned before, even though Xcode and Console are showing you a textual representation, the real file is just plain JSON. If you don't believe me, try opening the file with TextEdit.
LLDB is fully capable of parsing the new JSON format. So if you pass the crashlog file to LLDB, it will be able to parse it. Alternatively, if you must, you can look for the "Full Report" banner and copy all the JSON below it into a file and pass that into LLDB.
-----------
Full Report
-----------
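For example, a minimal way to load the report with the crashlog script that ships with LLDB (lldb.macosx.crashlog), using the path from the question as a placeholder:

(lldb) command script import lldb.macosx.crashlog
(lldb) crashlog ~/Desktop/crash.log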
I have a json2xml.exe which I execute like this:
> json2xml.exe -i '/path/to/file.json' -o '/path/to/file.xml';
(The names used are for explanation purposes.)
The example program takes a JSON file as input and outputs an XML file.
What I need is to capture this output, instead of letting the program write it to a file, and give it as input to another .py program.
Is this possible with pipes? (I have no access to the source code of the json2xml.exe program.)
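A rough sketch of the Python side using subprocess, assuming json2xml.exe writes the XML to stdout when no -o file is given (if it can only write to a file, write to a temporary file and read that back instead); the path is the placeholder from the example above:

import subprocess

# Run the converter and capture whatever it writes to stdout.
# Assumption: with no -o argument, json2xml.exe emits the XML on stdout.
result = subprocess.run(
    ["json2xml.exe", "-i", "/path/to/file.json"],
    capture_output=True,
    text=True,
    check=True,  # raise CalledProcessError if the converter fails
)

xml_text = result.stdout  # hand this string to the rest of the .py program
print(xml_text[:200])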
Even though every Python gRPC quickstart references using grpc_tools.protoc to generate Python classes that implement a proto file, the closest thing to documentation that I can find simply says:
Given protobuf include directories $INCLUDE, an output directory $OUTPUT, and proto files $PROTO_FILES, invoke as:
$ python -m grpc.tools.protoc -I$INCLUDE --python_out=$OUTPUT --grpc_python_out=$OUTPUT $PROTO_FILES
Which is not super helpful. I notice there are many limitations; for example, using an $OUTPUT of .. just fails silently.
Where can I find documentation on this tool?
I assume you are asking about the Python plugin. Did you try -h?
$ python -m grpc.tools.protoc -h
Usage: /usr/local/google/home/lidiz/.local/lib/python2.7/site-packages/grpc_tools/protoc.py [OPTION] PROTO_FILES
Parse PROTO_FILES and generate output based on the options given:
-IPATH, --proto_path=PATH Specify the directory in which to search for
imports. May be specified multiple times;
directories will be searched in order. If not
given, the current working directory is used.
--version Show version info and exit.
-h, --help Show this text and exit.
--encode=MESSAGE_TYPE Read a text-format message of the given type
from standard input and write it in binary
to standard output. The message type must
be defined in PROTO_FILES or their imports.
--decode=MESSAGE_TYPE Read a binary message of the given type from
standard input and write it in text format
to standard output. The message type must
be defined in PROTO_FILES or their imports.
--decode_raw Read an arbitrary protocol message from
standard input and write the raw tag/value
pairs in text format to standard output. No
PROTO_FILES should be given when using this
flag.
--descriptor_set_in=FILES Specifies a delimited list of FILES
each containing a FileDescriptorSet (a
protocol buffer defined in descriptor.proto).
The FileDescriptor for each of the PROTO_FILES
provided will be loaded from these
FileDescriptorSets. If a FileDescriptor
appears multiple times, the first occurrence
will be used.
-oFILE, Writes a FileDescriptorSet (a protocol buffer,
--descriptor_set_out=FILE defined in descriptor.proto) containing all of
the input files to FILE.
--include_imports When using --descriptor_set_out, also include
all dependencies of the input files in the
set, so that the set is self-contained.
--include_source_info When using --descriptor_set_out, do not strip
SourceCodeInfo from the FileDescriptorProto.
This results in vastly larger descriptors that
include information about the original
location of each decl in the source file as
well as surrounding comments.
--dependency_out=FILE Write a dependency output file in the format
expected by make. This writes the transitive
set of input file paths to FILE
--error_format=FORMAT Set the format in which to print errors.
FORMAT may be 'gcc' (the default) or 'msvs'
(Microsoft Visual Studio format).
--print_free_field_numbers Print the free field numbers of the messages
defined in the given proto files. Groups share
the same field number space with the parent
message. Extension ranges are counted as
occupied fields numbers.
--plugin=EXECUTABLE Specifies a plugin executable to use.
Normally, protoc searches the PATH for
plugins, but you may specify additional
executables not in the path using this flag.
Additionally, EXECUTABLE may be of the form
NAME=PATH, in which case the given plugin name
is mapped to the given executable even if
the executable's own name differs.
--grpc_python_out=OUT_DIR Generate Python source file.
--python_out=OUT_DIR Generate Python source file.
#<filename> Read options and filenames from file. If a
relative file path is specified, the file
will be searched in the working directory.
The --proto_path option will not affect how
this argument file is searched. Content of
the file will be expanded in the position of
#<filename> as in the argument list. Note
that shell expansion is not applied to the
content of the file (i.e., you cannot use
quotes, wildcards, escapes, commands, etc.).
Each line corresponds to a single argument,
even if it contains spaces.
For those seeking a simple and fast example to follow, this is what worked for me:
python3 -m grpc_tools.protoc --proto_path=/home/estathop/tf --python_out=/home/estathop/tf --grpc_python_out=/home/estathop/tweetf0rm ASD.proto
I used Python 3 and gave --proto_path the absolute path of the folder where the .proto file lives. I also set --python_out to an absolute path (the same folder), which is where the generated ASD_pb2.py ends up. Likewise, --grpc_python_out wants an absolute path for where to save ASD_pb2_grpc.py. Finally, ASD.proto is the .proto file I want to compile, which can be found in the current working directory of the command-line window.
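If you prefer to drive this from Python instead of the shell, grpc_tools.protoc also exposes a main() function that takes the same arguments as the command line; a sketch using the paths from the command above:

from grpc_tools import protoc

# Same invocation as the command line above, expressed as an argument list.
# The first element is a dummy program name, as in sys.argv.
protoc.main([
    "grpc_tools.protoc",
    "--proto_path=/home/estathop/tf",
    "--python_out=/home/estathop/tf",
    "--grpc_python_out=/home/estathop/tweetf0rm",
    "ASD.proto",
])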
I created, in Python, an executable whose input is the URL to a file and whose output is the file, e.g.,
file:///C:/example/folder/test.txt --> url2file --> the file
Actually, the URL is stored in a file (url.txt) and I run it from a DOS command line using a pipe:
type url.txt | url2file
That works great.
I want to create, in Python, an executable whose input is a file and whose output is the URL to the file, e.g.,
a file --> file2url --> URL
Again, I am using DOS and connecting executables via pipes:
type url.txt | url2file | file2url
Question: file2url is receiving a file. How do I get the file's URL (or path)?
In general, you probably can't.
If the URL is not stored in the file, it seems very difficult to get the URL. Imagine someone reads a text to you: without further information, you have no way of knowing which book it comes from.
However, there are certain use cases where you can do it.
Pipe the URL together with the file
If you need the URL and you are able to do this, keep the URL together with the file: make url2file pipe the URL first and then the file.
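A sketch of that idea; emit() would be called at the end of url2file and receive() at the start of file2url (both function names are made up here):

import sys

def emit(url, data, out=sys.stdout.buffer):
    # In url2file: write the URL as a header line, then the raw file bytes.
    out.write(url.encode("utf-8") + b"\n")
    out.write(data)

def receive(inp=sys.stdin.buffer):
    # In file2url: peel the URL header off again and return (url, file bytes).
    url = inp.readline().decode("utf-8").strip()
    data = inp.read()
    return url, data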
Restructure your pipeline
Maybe you don't need to find the URL for the file at all if you restructure your pipeline.
Index your files
If only certain files could potentially be piped into file2url, you could precalculate a hash for each of those files and store it in your program together with the URL. In Python you would do this using a dict where the key is the file's hash (as a string) and the value is the URL. You could use pickle to write the dict object to a file and load it at the start of your program.
Then you could simply look up the URL in this dict.
You might want to research how databases or the search features in file explorers handle indexing, or look for alternative solutions.
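A sketch of the indexing idea; known_files is a hypothetical list of (url, local_path) pairs, seeded here with the example path from the question:

import hashlib
import pickle
import sys

def sha256_of_file(path):
    # Hash the file in chunks so large files don't have to fit in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

# Build the index once and save it. known_files is a hypothetical list of
# (url, local_path) pairs covering every file that could end up in the pipe.
known_files = [
    ("file:///C:/example/folder/test.txt", "C:/example/folder/test.txt"),
]
index = {sha256_of_file(path): url for url, path in known_files}
with open("index.pickle", "wb") as f:
    pickle.dump(index, f)

# Later, inside file2url: hash whatever arrived on stdin and look it up.
with open("index.pickle", "rb") as f:
    index = pickle.load(f)
piped_bytes = sys.stdin.buffer.read()
url = index.get(hashlib.sha256(piped_bytes).hexdigest())  # None if unknown
print(url)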
Searching for the file
You could take one significant line of the file and use something like grep or head on Linux to search all files on your computer for that line. Note that grep and head are programs, not Python functions. For DOS, you might need to google the equivalent programs.
FYI: grep searches for one line of text inside a file.
head prints the first few lines of a file. I suggest comparing only the first few lines of each file to avoid searching through huge files.
Searching all files on the computer might take very long.
You could restrict the search to files with the same size as your piped input.
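A rough sketch of that search in Python (so it also works on Windows/DOS, where grep and head are not available); the search root is a placeholder:

import os
import sys

piped = sys.stdin.buffer.read()
first_line = piped.splitlines()[0] if piped else b""

candidates = []
for root, _dirs, files in os.walk("C:\\"):  # placeholder search root
    for name in files:
        path = os.path.join(root, name)
        try:
            if os.path.getsize(path) != len(piped):
                continue  # cheap filter: size must match
            with open(path, "rb") as f:
                if f.readline().rstrip(b"\r\n") == first_line:
                    candidates.append(path)  # same size and same first line
        except OSError:
            pass  # unreadable file, skip it

print(candidates)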
Use url.txt
If file2url knows the location of the file url.txt, then you could go through all the files listed in url.txt until you find one identical to the file that was piped into your program. You could combine this with the hashing/indexing solution.
file2url receives the data via standard input (just like keyboard input).
The data is transferred by the kernel and doesn't necessarily have any file-system representation. So if there's no file, there's no URL or path for you to get.
Let's try the obvious way:
$ cat test.py | python test.py
where test.py is:
import sys
print(''.join(sys.stdin.readlines()))
print(sys.stdin.name)
The last line of the output is:
<stdin>
So the filename is "<stdin>", because for Python there is no filename, only input.
Another way is system-dependent: try to find the command line that was used, for example. But there is no guarantee that this will work.
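For example, one system-dependent attempt using the third-party psutil package: look at the other processes started by the same shell and inspect their command lines. The file name may or may not show up there; this is a sketch, not a reliable solution:

import os
import psutil  # third-party: pip install psutil

# The shell that set up the pipeline is our parent; the other stages of the
# pipeline are usually its other children. Their command lines *may* contain
# the file name, but nothing guarantees it.
shell = psutil.Process(os.getpid()).parent()
print(shell.cmdline())
for proc in shell.children():
    try:
        print(proc.pid, proc.cmdline())
    except psutil.NoSuchProcess:
        pass  # that process may already have exited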
I have a simple Python script which produces some data for a neutron star model. I use it to automate file names so I don't later forget the inputs. The script successfully saves the file as
some_parameters.txt
but when I then list the files in terminal I see
msome_parameters.txt
The file name without the "m" is still valid, but trying to access the file with the "m" returns
$ ls m*
No such file or directory
So I think the "m" has some special meaning of which numerous google searches do not yields answers. While I can carry on without worrying, I would like to know the cause. Here is how I create the file in python
# chi,epsI etc are all floats. Make a string for the file name
file_name = "chi_%s_epsI_%s_epsA_%s_omega0_%s_eta_%s.txt" % (chi,epsI,epsA,omega0,eta)
# a.out is the compiled c file which outputs data
os.system("./a.out > %s" % (file_name) )
Any advice would be much appreciated; usually I can find the answer already posted on Stack Overflow, but this time I'm really confused.
You have a file with some special characters in the name, which is confusing the terminal output. What happens if you do ls -l, or (if possible) use a graphical file manager? Basically, find a different way of listing the files so you can see what's going on. Another possibility would be to do ls > some_other_filename and then look at that file with a hex editor.
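Another way to see exactly what ended up in the file name is from Python itself; run this in the directory that contains the file:

import os

# repr() makes hidden control characters or escape bytes visible,
# e.g. '\x1b' instead of a character the terminal silently interprets.
for name in os.listdir("."):
    print(repr(name))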