What are the command line arguments passed to grpc_tools.protoc - python

Even though every Python gRPC quickstart references using grpc_tools.protoc to generate Python classes that implement a proto file, the closest thing to documentation that I can find simply says:
Given protobuf include directories $INCLUDE, an output directory $OUTPUT, and proto files $PROTO_FILES, invoke as:
$ python -m grpc.tools.protoc -I$INCLUDE --python_out=$OUTPUT --grpc_python_out=$OUTPUT $PROTO_FILES
Which is not super helpful. I notice there are many limitations. For example, using an $OUTPUT of .. just fails silently.
Where can I find documentation on this tool?

It sounds like you are asking about the Python plugin. Did you try -h?
$ python -m grpc.tools.protoc -h
Usage: /usr/local/google/home/lidiz/.local/lib/python2.7/site-packages/grpc_tools/protoc.py [OPTION] PROTO_FILES
Parse PROTO_FILES and generate output based on the options given:
-IPATH, --proto_path=PATH Specify the directory in which to search for
imports. May be specified multiple times;
directories will be searched in order. If not
given, the current working directory is used.
--version Show version info and exit.
-h, --help Show this text and exit.
--encode=MESSAGE_TYPE Read a text-format message of the given type
from standard input and write it in binary
to standard output. The message type must
be defined in PROTO_FILES or their imports.
--decode=MESSAGE_TYPE Read a binary message of the given type from
standard input and write it in text format
to standard output. The message type must
be defined in PROTO_FILES or their imports.
--decode_raw Read an arbitrary protocol message from
standard input and write the raw tag/value
pairs in text format to standard output. No
PROTO_FILES should be given when using this
flag.
--descriptor_set_in=FILES Specifies a delimited list of FILES
each containing a FileDescriptorSet (a
protocol buffer defined in descriptor.proto).
The FileDescriptor for each of the PROTO_FILES
provided will be loaded from these
FileDescriptorSets. If a FileDescriptor
appears multiple times, the first occurrence
will be used.
-oFILE, Writes a FileDescriptorSet (a protocol buffer,
--descriptor_set_out=FILE defined in descriptor.proto) containing all of
the input files to FILE.
--include_imports When using --descriptor_set_out, also include
all dependencies of the input files in the
set, so that the set is self-contained.
--include_source_info When using --descriptor_set_out, do not strip
SourceCodeInfo from the FileDescriptorProto.
This results in vastly larger descriptors that
include information about the original
location of each decl in the source file as
well as surrounding comments.
--dependency_out=FILE Write a dependency output file in the format
expected by make. This writes the transitive
set of input file paths to FILE
--error_format=FORMAT Set the format in which to print errors.
FORMAT may be 'gcc' (the default) or 'msvs'
(Microsoft Visual Studio format).
--print_free_field_numbers Print the free field numbers of the messages
defined in the given proto files. Groups share
the same field number space with the parent
message. Extension ranges are counted as
occupied fields numbers.
--plugin=EXECUTABLE Specifies a plugin executable to use.
Normally, protoc searches the PATH for
plugins, but you may specify additional
executables not in the path using this flag.
Additionally, EXECUTABLE may be of the form
NAME=PATH, in which case the given plugin name
is mapped to the given executable even if
the executable's own name differs.
--grpc_python_out=OUT_DIR Generate Python source file.
--python_out=OUT_DIR Generate Python source file.
#<filename> Read options and filenames from file. If a
relative file path is specified, the file
will be searched in the working directory.
The --proto_path option will not affect how
this argument file is searched. Content of
the file will be expanded in the position of
#<filename> as in the argument list. Note
that shell expansion is not applied to the
content of the file (i.e., you cannot use
quotes, wildcards, escapes, commands, etc.).
Each line corresponds to a single argument,
even if it contains spaces.

For those seeking a simple & fast example to follow, this is what worked for me:
python3 -m grpc_tools.protoc --proto_path=/home/estathop/tf --python_out=/home/estathop/tf --grpc_python_out=/home/estathop/tweetf0rm ASD.proto
I used Python 3 and set --proto_path to the absolute path of the folder containing the .proto file. --python_out is the absolute path of the folder where the output of this one-line execution, ASD_pb2.py, will be saved (here the same folder). --grpc_python_out likewise wants an absolute path for where to save ASD_pb2_grpc.py. Finally, ASD.proto is the .proto file I want to compile, which can be found in the current active directory of the command line window.

Related

How do you use Python Ghostscript's high-level interface to convert a .pdf file into multiple .png files?

I am trying to convert a .pdf file into several .png files using Ghostscript in Python. The other answers on here were pretty old, hence this new thread.
The following code was given as an example on pypi.org of the 'high level' interface, and I am trying to model my code after the example code below.
import sys
import locale
import ghostscript
args = [
"ps2pdf", # actual value doesn't matter
"-dNOPAUSE", "-dBATCH", "-dSAFER",
"-sDEVICE=pdfwrite",
"-sOutputFile=" + sys.argv[1],
"-c", ".setpdfwrite",
"-f", sys.argv[2]
]
# arguments have to be bytes, encode them
encoding = locale.getpreferredencoding()
args = [a.encode(encoding) for a in args]
ghostscript.Ghostscript(*args)
Can someone explain what this code is doing? And can it be used somehow to convert a .pdf into .png files?
I am new to this and am truly confused. Thanks so much!
That's calling Ghostscript, obviously. From the arguments it's not spawning a process, it's linked (either dynamically or statically) to the Ghostscript library.
The args are Ghostscript arguments. These are documented in the Ghostscript documentation, you can find it online here. Because it mimics the command line interface, where the first argument is the calling program, the first argument here is meaningless and can be anything you want (as the comment says).
The next three arguments turn on SAFER (which prevents some potentially dangerous operations and is, now, the default anyway), sets NOPAUSE so the entire input is processed without pausing between pages, and BATCH so that on completion Ghostscript exits instead of returning to the interactive prompt.
Then it selects a device. In Ghostscript (due to the PostScript language) devices are what actually output stuff. In this case the device selected is the pdfwrite device, which outputs PDF.
Then there's the OutputFile, you can probably guess that this is the name (and path) of the file where the output is to be written.
The next three arguments, -c .setpdfwrite -f, are frankly archaic and pointless. They were once recommended when using the pdfwrite device (and only the pdfwrite device) but they have no useful effect these days.
The very last argument is, of course, the input file.
Certainly you can use Ghostscript to render PDF files to PNG. You want to use one of the PNG devices; there are several, depending on what colour depth you want to support. Unless you have some unusual requirement, just use png16m. If your input file contains more than one page, you'll want to set OutputFile to use %d so that it writes one file per page.
More details on all of this can, of course, be found in the documentation.
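Adapting the argument list from the question for PNG output might look like this. A sketch only: the resolution, output pattern, and input path are placeholders, not values from the question:

```python
# Same shape as the args list in the question, but targeting the png16m
# device.  Ghostscript expands %d in the output file name to the page
# number, giving one PNG per page of the input PDF.
args = [
    "gs",                         # actual value doesn't matter
    "-dNOPAUSE", "-dBATCH", "-dSAFER",
    "-sDEVICE=png16m",            # 24-bit colour PNG output
    "-r150",                      # rendering resolution in DPI (placeholder)
    "-sOutputFile=page-%d.png",   # placeholder output pattern
    "-f", "input.pdf",            # placeholder input path
]
print(args)
# As in the question, encode each argument and pass the list to
# ghostscript.Ghostscript(*args).
```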

Unicode issues with tarfile.extractall() (Python 2.7)

I'm using Python 2.7.6 on Windows and I'm using the tarfile module to extract a gzip-compressed tar file. The mode option of tarfile.open() is set to "r:gz". After the open call, if I were to print the contents of the archive via tarfile.list(), I see the following directory in the list:
./静态分析 Part 1.v1/
However, after I call tarfile.extractall(), I don't see the above directory in the extracted list of files, instead I see this:
é™æ€åˆ†æž Part 1.v1/
If I were to extract the archive via 7zip, I see a directory with the same name as the first item above. So, clearly, the extractall() method is screwing up, but I don't know how to fix this.
I learned that tar doesn't retain the encoding information as part of the archive and treats filenames as raw byte sequences. So, the output I saw from tarfile.extractall() was simply the raw byte sequence that comprised the file's name prior to compression. In order to get the extractall() method to recreate the original filenames, I discovered that you have to manually convert the members of the TarFile object to the appropriate encoding before calling extractall(). In my case, the following did the trick:
modeltar = tarfile.open(zippath, mode="r:gz")
updatedMembers = []
for m in modeltar.getmembers():
    m.name = unicode(m.name, 'utf-8')  # decode the raw bytes as UTF-8
    updatedMembers.append(m)
modeltar.extractall(members=updatedMembers, path=dbpath)
The above code is based on this superuser answer: https://superuser.com/a/190786/354642
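For what it's worth, Python 3's tarfile takes an encoding argument directly, which usually makes the manual member rewrite above unnecessary. A minimal in-memory round trip (the file name here is just an example):

```python
import io
import tarfile

# Build a small in-memory .tar.gz containing a file with a non-ASCII name,
# then read it back, to show that Python 3's tarfile round-trips UTF-8
# names when given an explicit encoding.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz", encoding="utf-8") as tar:
    data = b"hello"
    info = tarfile.TarInfo(name="静态分析 Part 1.v1/test.txt")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:gz", encoding="utf-8") as tar:
    names = tar.getnames()
print(names[0])
```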

generate python classes from application/x-protobuf file

I have a Python script that receives an application/x-protobuf file from YouTube.
This is part of the file:
*youtubei#playerResponse
\862\94
yt_ad1\83
e~11200142,901816,936105,9407053,9407664,9407715,9408142,9410705,9412913,9415294,9416137,9417116,9417192,9417455,9418117,94182142\88\83
e~11200142,901816,936105,9407053,9407664,9407715,9408142,9410705,9412913,9415294,9416137,9417116,9417192,9417455,9418117,94182142\D6+
InnerTubeBuildLabelyoutube_20150727_RC2
InnerTubeChangelist99168778\83
e~11200142,901816,936105,9407053,9407664,9407715,9408142,9410705,9412913,9415294,9416137,9417116,9417192,9417455,9418117,94182142\88&
innertube.build.changelist99168778-
innertube.build.labelyoutube_20150727_RC2'
innertube.build.timestamp
1437996969E
!innertube.build.variants.checksum 47cbe83e1d9f5a44654ab7896473362e
innertube.client_name3!
innertube.client_version10.28z\BC\9A\EF\CA\DC\A2\80\C8 \E0](\A8F0\88\A48\98u#\B0\EAM\CD\CCL?U\CD\CCL?]\00\00#?`2h\E42p\C0>x\80\88\C4\90\98\A0\00\AD\00\00\00\00\B0\00\B8\00\C0\C8Ќ\D0\00\D8\E0\E8\C7\F0d\F8\00\80\90\00\98\00\C0\00\C8\A0\A8\00\B5\00\00\A0B\B8\00\D8\00\E0\00\E8\00\F0\F0\F8\80\00\88\00\90\00\B2ϭ\D3
\00\00 \C1\A2\DFޜ\00Z\AD
I want to generate python classes from this file.
I used protoc to decode the file
cat binary_file | protoc --decode_raw > decoded_file
Then I used this command to generate the classes
protoc -I=/root --python_out=$DST_DIR /root/decoded_file
However, this command always returns an "Expected top-level statement (e.g. "message")." error.
The input to protoc is a .proto source file declaring the overall structure of the protocol. It looks like you're trying to use as input an actual message. This won't work -- these aren't the same thing.
There is no automated way to reverse-engineer a .proto file from a message instance, since an encoded message does not contain things like type names or field names and contains only limited information about actual field types. You can use the output of --decode_raw to make guesses about the original .proto file, but this is a reverse-engineering task that requires human analysis, not something that can be done by a program.

Python: How to get the URL to a file when the file is received from a pipe?

I created, in Python, an executable whose input is the URL to a file and whose output is the file, e.g.,
file:///C:/example/folder/test.txt --> url2file --> the file
Actually, the URL is stored in a file (url.txt) and I run it from a DOS command line using a pipe:
type url.txt | url2file
That works great.
I want to create, in Python, an executable whose input is a file and whose output is the URL to the file, e.g.,
a file --> file2url --> URL
Again, I am using DOS and connecting executables via pipes:
type url.txt | url2file | file2url
Question: file2url is receiving a file. How do I get the file's URL (or path)?
In general, you probably can't.
If the URL is not stored in the file, it seems very difficult to get it. Imagine someone reads a text to you. Without further information you have no way to know what book it comes from.
However there are certain usecases where you can do it.
Pipe the url together with the file.
If you need the url and you can do that, try to keep the url together with the file. Make url2file pipe your url first and then the file.
Restructure your pipeline
Maybe you don't need to find the url for the file, if you restructure your pipeline.
Index your files
If only certain files could potentially be piped into file2url, you could precalculate a hash for each file and store it in your program together with the URL. In Python you would do this using a dict where the key is the hash of the file's contents and the value is the URL. You could use pickle to write the dict object to a file and load it at the start of your program.
Then you could simply lookup the url from this dict.
You might want to research how databases or search functions in explorers handle indexing or alternative solutions.
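A minimal sketch of that hash-based index; file_key, the index contents, and the URL here are all hypothetical examples, not real files:

```python
import hashlib

def file_key(data):
    # Hash the raw bytes so identical content always maps to the same key.
    return hashlib.sha256(data).hexdigest()

# Precomputed index mapping content hashes to URLs.  In practice you would
# build this offline from the real files and persist it with pickle; the
# single entry below is a made-up example.
index = {
    file_key(b"hello world\n"): "file:///C:/example/folder/test.txt",
}

# Inside file2url: hash whatever arrives on stdin and look it up.
data = b"hello world\n"  # stand-in for sys.stdin.buffer.read()
url = index.get(file_key(data), "unknown")
print(url)
```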
Searching for the file
You could use one significant line of the file and use something like grep or head on linux to search all files of your computer for this line. Note that grep and head are programs, not python functions. For DOS, you might need to google the equivalent programs.
FYI: grep searches for one line of text inside a file.
head puts out the first few lines of a file. I suggest comparing only the first few lines of files to avoid searching through huge files.
Searching all files on the computer might take very long.
You could only search files with the same size as your piped input.
Use url.txt
If file2url knows the location of the file url.txt, then you could look up all files in url.txt until you find a file identical to the file that was piped into your program. You could combine this with the hashing/indexing solution.
'file2url' receives the data via standard input (like the keyboard).
The data is transferred by the kernel and it doesn't necessarily have to have any file-system representation. So if there's no file there's no URL or path to that for you to get.
Let's try to do it the obvious way. test.py:
import sys
print ''.join(sys.stdin.readlines())
print sys.stdin.name
Running $ cat test.py | python test.py prints the script's own source followed by:
<stdin>
So the filename is "<stdin>", because for Python there is no filename, only input.
Another approach is system-dependent: for example, you could try to find the command line that was used, but there is no guarantee that this will work.

Linking back to a source code file in Sphinx

I am documenting a Python module in Sphinx. I have a source code file full of examples of the use of my module. I'd like to reference this file. It is too long to inline as continuous code. Is there a way to create a link to the full source file, formatted in a code-friendly way (i.e: literal or with line numbers)?
If I get the question right, you want a link from your documentation to the original source file. You can do this by adding the sphinx.ext.viewcode extension to your conf file (under extensions entry). This will create a "source" link for every header of a class, method, function, etc. Clicking the link will open the original file highlighting the clicked item. More explanation here
literalinclude
.. literalinclude:: filename
From the Sphinx (v1.5.1) documentation:
Longer displays of verbatim text may be included by storing the example text in an external file containing only plain text. The file may be included using the literalinclude directive.
For example, to include the Python source file example.py, use:
.. literalinclude:: example.py
The file name is usually relative to the current file’s path. However, if it is absolute (starting with /), it is relative to the top source directory.
Tabs in the input are expanded if you give a tab-width option with the desired tab width.
Like code-block, the directive supports the linenos flag option to switch on line numbers, the lineno-start option to select the first line number, the emphasize-lines option to emphasize particular lines, and a language option to select a language different from the current file’s standard language. Example with options:
.. literalinclude:: example.rb
:language: ruby
:emphasize-lines: 12,15-18
:linenos:
Include files are assumed to be encoded in the source_encoding. If the file has a different encoding, you can specify it with the encoding option:
.. literalinclude:: example.py
:encoding: latin-1
The directive also supports including only parts of the file. If it is a Python module, you can select a class, function or method to include using the pyobject option:
.. literalinclude:: example.py
:pyobject: Timer.start
This would only include the code lines belonging to the start() method in the Timer class within the file.
Alternately, you can specify exactly which lines to include by giving a lines option:
.. literalinclude:: example.py
:lines: 1,3,5-10,20-
This includes the lines 1, 3, 5 to 10 and lines 20 to the last line.
Another way to control which part of the file is included is to use the start-after and end-before options (or only one of them). If start-after is given as a string option, only lines that follow the first line containing that string are included. If end-before is given as a string option, only lines that precede the first lines containing that string are included.
When specifying particular parts of a file to display, it can be useful to display exactly which lines are being presented. This can be done using the lineno-match option.
You can prepend and/or append a line to the included code, using the prepend and append option, respectively. This is useful e.g. for highlighting PHP code that doesn’t include the markers.
If you want to show the diff of the code, you can specify the old file by giving a diff option:
.. literalinclude:: example.py
:diff: example.py.orig
This shows the diff between example.py and example.py.orig with unified diff format.
Python 3 does this. For example, the argparse docs link to the source code (near the top of the page, where it says "Source code"). You can see how they do it by looking at the source for the docs (linked from the first link, down at the bottom of the left-hand column).
I assume they're using standard Sphinx, but I am having a hard time finding :source: in their docs...
Update: the :source: role is defined here.
