What does "i" represent in Python .pyi extension?

What does "i" represent in Python .pyi extension? - python

In Python, what does "i" represent in .pyi extension?
In PEP-484, it mentions .pyi is "a stub file" but no mnemonic help on the extension. So does the "i" mean "Include"? "Implementation"? "Interface"?

I think the i in .pyi stands for "Interface"
Definition for Interface in Java:
An interface in the Java programming language is an abstract type that
is used to specify a behaviour that classes must implement
From Python typeshed github repository:
Each Python module is represented by a .pyi "stub". This is a normal
Python file (i.e., it can be interpreted by Python 3), except all the
methods are empty.
In 'Mypy' repository, they explicitly mention "stub" files as public interfaces:
A stubs file only contains a description of the public interface of
the module without any implementations.
Because "Interfaces" do not exist in Python (see this SO question between Abstract class and Interface) I think the designers intended to dedicate a special extension for it.
pyi implements "stub" file (definition from Martin Fowler)
Stubs: provide canned answers to calls made during the test, usually
not responding at all to anything outside what's programmed in for the
test.
But people are more familiar with Interfaces than "stub" files, therefore it was easier to choose .pyi rather than .pys to avoid unnecessary confusion.

Apparently PyCharm creates .pyi file for its own purposes:
The *.pyi files are used by PyCharm and other development tools to provide
more information, such as PEP 484 type hints, than it is able to glean from
introspection of extension types and methods. They are not intended to be
imported, executed or used for any other purpose other than providing info
to the tools. If you don't use use a tool that makes use of .pyi files then
you can safely ignore this file.
See: https://www.python.org/dev/peps/pep-0484/
https://www.jetbrains.com/help/pycharm/2016.1/type-hinting-in-pycharm.html
This comment was found in: python27/Lib/site-packages/wx/core.pyi

The i in .pyi stands for ‘interface’.
The .pyi extension was first mentioned in this GitHub issue thread where JukkaL says:
I'd probably prefer an extension with just a single dot. It also needs to be something that is not in use (it should not be used by cython, etc.). .pys seems to be used in Windows (or was). Maybe .pyi, where i stands for an interface definition?

Another way to explain the contents of a module that Wing can't figure out is with a pyi Python Interface file. This file is merely a Python skeleton with the proper structure, call signature, and return values to correspond to the functions, attributes, classes, and methods specified in a module.

Related

Support for POSIX openat functions in python

There is a patch to add support for the POSIX openat functions (and other *at functions like fstatat) to the python standard library that is marked as closed with resolution fixed, but the os, posix and platform modules do not currently include any of these methods.
These methods are the standard way of solving problems like this in C and other languages efficiently and without race conditions.
Are these included in the standard library currently somewhere? And if not, are there plans to include this in the future.

Yes, this is supported by passing the dir_fd argument to various functions in the standard os module. See for example os.open():
Open the file path and set various flags [...]
This function can support paths relative to directory descriptors with the dir_fd parameter.
If you want to use high-level file objects such as those returned by the builtin open() function, that function's documentation provides example code showing how to do this using the opener parameter to that function. Note that open() and os.open() are entirely different functions and should not be confused. Alternatively, you could open the file with os.open() and then pass the file descriptor number to os.fdopen() or to open().
It should also be noted that this currently only works on Unix; the portable and future-proof way to check for dir_fd support is to write code such as the following:
if os.open in os.supports_dir_fd:
# Use dir_fd.
else:
# Don't.
On the other hand, I'm not entirely sure Windows even allows opening a directory in the first place. You certainly can't do it with _open()/_wopen(), which are documented to fail if "the given path is a directory." To be safe, I recommend only trying to open the directory after you check for dir_fd support.

Parsing a .proto file without creating the descriptor

I understand that the normal way of using protobuf is to create the .proto and then compile it into the relevant class - Java, Python, etc. I have a requirement which might need to parse the .proto file in Python code. Has anyone tried creating own parser for the .proto file? Will it be recommended to always compile the class instead of directly parsing the .proto?

It probably won't help you directly, but yes, I've written my own parser (live demo, parser source). This code is C# hence why it probably won't help, but it clearly is possible. I started that branch 9 days ago, and now it is basically feature-complete including parser, generator, and an interactive web-site with syntax-error highlighting - so it isn't necessarily a huge amount of work.
However! You may find it easier just to shell execute "protoc" (available on maven). If you use the -oFILE / --descriptor_set_out=FILE switch (same thing, alternative syntax), then it parses the input .proto file and writes a file that is a serialized FileDescriptorSet from descriptor.proto. This means you can use your regular tools to generate code in your chosen language for descriptor.proto, then deserialize the file as a FileDescriptorSet instance. Once you've done that: you can just walk the object model to see the files, messages, enums, fields, etc. IIRC some protobuf implementations support working entirely from a descriptor (which is what protoc emits), without the codegen step.

What is the capitalization standard for class names in the Python Standard Library?

The norm for Python standard library classes seems to be that class names are lowercase - this appears to hold true for built-ins such as str and int as well as for most classes that are part of standard library modules that must be imported such as datetime.date or datetime.datetime.
But, certain standard library classes such as enum.Enum and decimal.Decimal are capitalized. At first glance, it might seem that classes are capitalized when their name is equal to the module name, but that does not hold true in all cases (such as datetime.datetime).
What's the rationale/logic behind the capitalization conventions for class names in the Python Standard Library?

The Key Resources section of the Developers Guide lists PEP 8 as the style guide.
From PEP 8 Naming Conventions, emphasis mine.
The naming conventions of Python's library are a bit of a mess, so
we'll never get this completely consistent -- nevertheless, here are
the currently recommended naming standards. New modules and packages
(including third party frameworks) should be written to these
standards, but where an existing library has a different style,
internal consistency is preferred.
Also from PEP 8
A style guide is about consistency. Consistency with this style guide
is important. Consistency within a project is more important.
Consistency within one module or function is the most important.
...
Some other good reasons to ignore a particular guideline:
To be consistent with surrounding code that also breaks it (maybe for historic reasons) -- although this is also an opportunity to clean
up someone else's mess (in true XP style).
Because the code in question predates the introduction of the guideline and there is no other reason to be modifying that code.
You probably will never know why Standard Library naming conventions conflict with PEP 8 but it is probably a good idea to follow it for new stuff or even in your own projects.

Pep 8 is considered to be the standard style guide by many Python devs. This recommends to name classes using CamelCase/CapWords.
The naming convention for functions may be used instead in cases where the interface is documented and used primarily as a callable.
Note that there is a separate convention for builtin names: most builtin names are single words (or two words run together), with the CapWords convention used only for exception names and builtin constants.
Check this link for PEP8 naming conventions and standards.
datetime is a part of standard library,
Python’s standard library is very extensive, offering a wide range of facilities as indicated by the long table of contents listed below. The library contains built-in modules (written in C) that provide access to system functionality such as file I/O that would otherwise be inaccessible to Python programmers, as well as modules written in Python that provide standardized solutions for many problems that occur in everyday programming.
In some cases, like sklearn, nltk, django, the package names are all lowercase. This link will take you there.
Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.
When an extension module written in C or C++ has an accompanying Python module that provides a higher level (e.g. more object oriented) interface, the C/C++ module has a leading underscore (e.g. _socket ).
I hope this covers all the questions.

Is there a python-equivalent of the unix "file" utility?

I want to have different behavior in a python script, depending on the type of file. I cannot use the filename extension as it may not be present or misleading. I could call the file utility and parse the output, but I would rather use a python builtin for portability.
So is there anything in python that uses heuristics to deduce the type of the file from its contents?

python-magic
pymagic
Probably others as well. "magic" is the magic keyword to search for. ;-)

Python naming conventions for modules

I have a module whose purpose is to define a class called "nib". (and a few related classes too.) How should I call the module itself? "nib"? "nibmodule"? Anything else?

Just nib. Name the class Nib, with a capital N. For more on naming conventions and other style advice, see PEP 8, the Python style guide.

I would call it nib.py. And I would also name the class Nib.
In a larger python project I'm working on, we have lots of modules defining basically one important class. Classes are named beginning with a capital letter. The modules are named like the class in lowercase. This leads to imports like the following:
from nib import Nib
from foo import Foo
from spam.eggs import Eggs, FriedEggs
It's a bit like emulating the Java way. One class per file. But with the added flexibility, that you can allways add another class to a single file if it makes sense.

I know my solution is not very popular from the pythonic point of view, but I prefer to use the Java approach of one module->one class, with the module named as the class.
I do understand the reason behind the python style, but I am not too fond of having a very large file containing a lot of classes. I find it difficult to browse, despite folding.
Another reason is version control: having a large file means that your commits tend to concentrate on that file. This can potentially lead to a higher quantity of conflicts to be resolved. You also loose the additional log information that your commit modifies specific files (therefore involving specific classes). Instead you see a modification to the module file, with only the commit comment to understand what modification has been done.
Summing up, if you prefer the python philosophy, go for the suggestions of the other posts. If you instead prefer the java-like philosophy, create a Nib.py containing class Nib.

nib is fine. If in doubt, refer to the Python style guide.
From PEP 8:
Package and Module Names
Modules should have short, all-lowercase names. Underscores can be used
in the module name if it improves readability. Python packages should
also have short, all-lowercase names, although the use of underscores is
discouraged.
Since module names are mapped to file names, and some file systems are
case insensitive and truncate long names, it is important that module
names be chosen to be fairly short -- this won't be a problem on Unix,
but it may be a problem when the code is transported to older Mac or
Windows versions, or DOS.
When an extension module written in C or C++ has an accompanying Python
module that provides a higher level (e.g. more object oriented)
interface, the C/C++ module has a leading underscore (e.g. _socket).

From PEP-8: Package and Module Names:
Modules should have short, all-lowercase names. Underscores can be
used in the module name if it improves readability.
Python packages should also have short, all-lowercase names, although the use of
underscores is discouraged.
When an extension module written in C or C++ has an accompanying
Python module that provides a higher level (e.g. more object oriented)
interface, the C/C++ module has a leading underscore (e.g. _socket).

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.