Python: Private Types in Type Hints?

I like type hints, especially for my method parameters. In my current script one function should retrieve a parameter of type argparse._SubParsersAction. As one can see from the underscore, this is a private type by convention. I guess this is why PyCharm complains with the error message Cannot find reference '_SubParsersAction' in 'argparse.pyi' when trying to import it (although it's there).
The script runs but it feels wrong. The error message seems reasonable to me as private types are meant to be... well, private. My first question is therefore why the public method ArgumentParser.add_subparsers() returns an object of a private type in the first place.
I've looked for a public super class or interface and _SubParsersAction does indeed extend from argparse.Action, but that doesn't help me as Action does not define _SubParsersAction's add_parser() method (which I need).
So my next questions are: Can I use type hints with the argparse API? Or is it only partially possible because the API was designed long before type hints were introduced? Or does my idea of typing not fit to Python's type system?
Here is my affected code snippet. It creates an argument parser with sub arguments as described in the documentation (https://docs.python.org/dev/library/argparse.html#sub-commands).
main.py
from argparse import ArgumentParser
import sub_command_foo
import sub_command_bar
import sub_command_baz
def main():
    parser = ArgumentParser()
    sub_parsers = parser.add_subparsers()
    sub_command_foo.add_sub_parser(sub_parsers)
    sub_command_bar.add_sub_parser(sub_parsers)
    sub_command_baz.add_sub_parser(sub_parsers)
    args = parser.parse_args()
    args.func(args)

if __name__ == '__main__':
    main()
sub_command_foo.py
from argparse import _SubParsersAction, Namespace
def add_sub_parser(sub_parsers: _SubParsersAction):
    arg_parser = sub_parsers.add_parser('foo')
    # Add arguments...
    arg_parser.set_defaults(func=run)

def run(args: Namespace):
    print('foo...')
The problem lies in sub_command_foo.py. PyCharm shows the error message Cannot find reference '_SubParsersAction' in 'argparse.pyi' on the first line from argparse import _SubParsersAction, Namespace.

I don't use PyCharm and haven't done much with type hints, so I can't help you there. But I know argparse well. The bulk of this module was written before 2010, and it has been modified since then at a snail's pace. It's well organized in the OOP sense, but the documentation is more of a glorified tutorial than a formal reference. That is, it focuses on the functions and methods users will need, and doesn't try to formally document all the classes and their methods.
I do a lot of my testing in an interactive ipython session where I can look at the objects returned by commands.
In Python the distinction between public and private classes and methods is not as formal as in other languages. The '_' prefix does mark 'private' things: they aren't usually documented, but they are still accessible. Normally a '*' import will import all objects whose names don't start with '_', but argparse has a more explicit __all__ list.
argparse.ArgumentParser does create an object instance, but that class inherits from two 'private' classes. The add_argument method creates an Action object and puts it on the parser._actions list. It also returns it to the user (though usually that reference is ignored). The action is actually an instance of an Action subclass (all of which are 'private'). add_subparsers is just a specialized version of add_argument.
The add_parser method creates an ArgumentParser object.
It feels to me that type hinting that requires a separate import of 'private' classes is counterproductive. You shouldn't be explicitly referencing those classes, even if your code produces them. Some people (companies) fear they could be changed without notice and thus break their code. Knowing how slowly argparse changes, I wouldn't worry too much about that.
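One hedged middle ground, if you want the hint without a runtime import of the private name, is to quote the annotation and import the class only under typing.TYPE_CHECKING, so static checkers see the type but the running program never touches it. A minimal sketch (the lambda handler is just a stand-in for a real run function):

```python
import argparse
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static checkers (mypy, PyCharm); never imported at runtime.
    from argparse import _SubParsersAction

def add_sub_parser(sub_parsers: "_SubParsersAction") -> None:
    parser = sub_parsers.add_parser('foo')
    parser.set_defaults(func=lambda args: print('foo...'))

main_parser = argparse.ArgumentParser()
sub_parsers = main_parser.add_subparsers()
add_sub_parser(sub_parsers)
args = main_parser.parse_args(['foo'])
args.func(args)  # runs the handler registered for the 'foo' sub-command
```

The string annotation means nothing is evaluated at runtime, so the script works even on interpreters where the private name moves or disappears.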

Related

What is Python's "Namespace" object?

I know what namespaces are. But when running
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('bar')
parser.parse_args(['XXX']) # outputs: Namespace(bar='XXX')
What kind of object is Namespace(bar='XXX')? I find this totally confusing.
Reading the argparse docs, it says "Most ArgumentParser actions add some value as an attribute of the object returned by parse_args()". Shouldn't this object then appear when running globals()? Or how can I introspect it?
Samwise's answer is very good, but let me answer the other part of the question.
Or how can I introspect it?
Being able to introspect objects is a valuable skill in any language, so let's approach this as though Namespace is a completely unknown type.
>>> obj = parser.parse_args(['XXX']) # outputs: Namespace(bar='XXX')
Your first instinct is good. See if there's a Namespace in the global scope, which there isn't.
>>> Namespace
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'Namespace' is not defined
So let's see the actual type of the thing. The Namespace(bar='XXX') display comes from a __str__ or __repr__ method somewhere, so let's ask for the type itself.
>>> type(obj)
<class 'argparse.Namespace'>
and its module
>>> type(obj).__module__
'argparse'
Now it's a pretty safe bet that we can do from argparse import Namespace and get the type. Beyond that, we can do
>>> help(argparse.Namespace)
in the interactive interpreter to get detailed documentation on the Namespace class, all with no Internet connection necessary.
It's simply a container for the data that parse_args generates.
https://docs.python.org/3/library/argparse.html#argparse.Namespace
This class is deliberately simple, just an object subclass with a readable string representation.
Just do parser.parse_args(...).bar to get the value of your bar argument. That's all there is to that object. Per the doc, you can also convert it to a dict via vars().
The symbol Namespace doesn't appear when running globals() because you didn't import it individually. (You can access it as argparse.Namespace if you want to.) It's not necessary to touch it at all, though, because you don't need to instantiate a Namespace yourself. I've used argparse many times and until seeing this question never paid attention to the name of the object type that it returns -- it's totally unimportant to the practical applications of argparse.
Namespace is basically just a bare-bones class, on whose instances you can define attributes, with a few niceties:
A nice __repr__
Only keyword arguments can be used to instantiate it, preventing "anonymous" attributes.
A convenient way to check whether an attribute exists ('foo' in Namespace(bar=3) evaluates to False)
Equality with other Namespace instances based on having identical attributes and attribute values (e.g. Namespace(foo=3, bar=5) == Namespace(bar=5, foo=3))
Instances of Namespace are returned by parse_args:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('bar')
args = parser.parse_args(['XXX'])
assert args.bar == 'XXX'
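The niceties listed above can be checked directly in the interpreter; a short sketch using only the standard library:

```python
from argparse import Namespace

ns = Namespace(foo=3, bar=5)

# Membership tests check attribute existence:
print('foo' in ns)   # True
print('baz' in ns)   # False

# Equality is attribute-based, regardless of keyword order:
print(ns == Namespace(bar=5, foo=3))   # True

# vars() exposes the attributes as a plain dict:
print(vars(ns))   # {'foo': 3, 'bar': 5}
```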

Refactoring a Python class to take a dynamic parameter without changing calling pattern dramatically

I have a Python class in a base_params.py module within an existing codebase, which looks like this:
import datetime
class BaseParams:
    TIMESTAMP = datetime.datetime.now()
    PATH1 = f'/foo1/bar/{TIMESTAMP}/baz'
    PATH2 = f'/foo2/bar/{TIMESTAMP}/baz'
Callers utilize it this way:
from base_params import BaseParams as params
print(params.PATH1)
Now, I want to replace the TIMESTAMP value with one that is dynamically specified at runtime (through e.g. CLI arguments).
Is there a way to do this in Python without requiring my callers to refactor their code in a dramatic way? This is currently confounding me because the contents of the class BaseParams get executed at 'compile' time, so there is no opportunity there to pass in a dynamic value as it's currently structured. And in some of my existing code, this object is being treated as "fully ready" at 'compile' time, for example, its values are used as function argument defaults:
def some_function(value1, value2=params.PATH1):
    ...
I am wondering if there is some way to work with Python modules and/or abuse Python's __special_methods__ to get this existing code pattern working more or less as-is, without a deeper refactoring of some kind.
My current expectation is "this is not really possible" because of that last example, where the default value is being specified in the function signature. But I thought I should check with the Python wizards to see if there may be a suitably Pythonic way around this.
Yes, you just need to make sure that the command line argument is parsed before the class is defined and before any function that uses the class's attribute as a default argument is defined (but that should already be the case).
(Using sys.argv for the sake of simplicity; it is better to use an actual argument parser such as argparse.)
import datetime
import sys
class BaseParams:
    try:
        TIMESTAMP = sys.argv[1]
    except IndexError:
        TIMESTAMP = datetime.datetime.now()

    PATH1 = f'/foo1/bar/{TIMESTAMP}/baz'
    PATH2 = f'/foo2/bar/{TIMESTAMP}/baz'

print(BaseParams.TIMESTAMP)
$ python main.py dummy-argument-from-cli
outputs
dummy-argument-from-cli
while
$ python main.py
outputs
2021-06-26 02:32:12.882601
You can still totally replace the value of a class attribute after the class has been defined:
BaseParams.TIMESTAMP = <whatever>
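One caveat worth showing: attributes derived from TIMESTAMP at class definition time do not update when you replace it later. A small sketch (the replacement value is made up):

```python
import datetime

class BaseParams:
    TIMESTAMP = datetime.datetime.now()
    PATH1 = f'/foo1/bar/{TIMESTAMP}/baz'

# Replacing the attribute after the class is defined is straightforward...
BaseParams.TIMESTAMP = 'runtime-value'
print(BaseParams.TIMESTAMP)   # runtime-value

# ...but PATH1 was already computed from the old TIMESTAMP when the class
# body ran, so it does not update automatically:
print('runtime-value' in BaseParams.PATH1)   # False
```

So plain attribute replacement only works if callers read TIMESTAMP itself, not values computed from it.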
There are definitely some more "magic" things you can do though, such as a class factory of some kind. Since Python 3.7 you can also take advantage of module __getattr__ to create a kind of factory for the BaseParams class (PEP 562)
In base_params.py you might rename BaseParams to _BaseParams or BaseParamsBase or something like that :)
Then at the module level define:
def __getattr__(attr):
    if attr == 'BaseParams':
        params = ...  # whatever code you need to determine class attributes for BaseParams
        return type('BaseParams', (_BaseParams,), params)
    raise AttributeError(attr)

Incompatible Type: __main__.TestClass; expected example.TestClass; Passing 'self' to function in another module

Background
I have two modules. Module 1 (example) defines TestClass. Module 2 (example2) defines a function change_var which takes a TestClass argument. TestClass has a method change which calls change_var from example2 and passes self as the argument.
example2 uses TYPE_CHECKING from typing to ensure the cyclic import does not happen at run-time, while still allowing mypy to check types.
At the call to change_var from within change, MYPY gives the error Argument 1 to "change_var" has incompatible type "__main__.TestClass"; expected "example.TestClass".
Python version: 3.7.3, mypy version: 0.701
Example Code
example.py
from example2 import change_var
class TestClass:
    def __init__(self) -> None:
        self.test_var = 1

    def change(self) -> None:
        change_var(self)
example2.py
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from example import TestClass

def change_var(obj: "TestClass") -> None:
    obj.test_var = 2
This code is a minimal example of the actual problem I am experiencing in a larger python project.
What I Expect this to do
These types should match as they are (as far as I can tell) the same.
My intuition as to why this doesn't work is that TestClass, at the point of the call to change_var, isn't fully defined. For the same reason I can't refer to TestClass as a type within TestClass itself, I can't pass a TestClass object to a function that expects a TestClass object from within the class itself. To mypy, this is not a full class yet, so it uses some kind of placeholder type. This is only an intuition, though.
Questions
What exactly is the problem here?
What is the best work around to achieve this general code structure (two modules, class in one, function that takes class in other, method calls to function) while still making MYPY happy?
I am also open to refactoring this example entirely but I'd like to try to stick to this general structure.
This is one of many cases of breakage from treating a module as a script (whether via -m, -c (or --command for mypy), or simply python …/module.py). It works only for trivial applications that do not care about the identity of the types or functions they create. (They must also avoid side effects on import and mutable global state, but those are good ideas anyway.)
The solution, beyond “don’t do that”, is to use __main__.py in a package. Even that is not without issues, since certain naïve recursive importers will import it as if it were a true module.
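A sketch of the suggested package layout, with all names made up for illustration:

```shell
# mypkg/
#   __init__.py
#   __main__.py    # runs:  from mypkg.example import TestClass; TestClass().change()
#   example.py     # defines TestClass
#   example2.py    # defines change_var
#
# Invoke it as a package, so the class is always mypkg.example.TestClass
# rather than __main__.TestClass:
#   python -m mypkg
#   mypy -p mypkg
```

Because every module is now imported under the mypkg prefix, mypy and the runtime agree on the identity of TestClass, and the incompatible-type error goes away.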

How to change the string to class object in another file

I already use this function to convert a string to a class object. But now I have defined a new module. How can I implement the same functionality there?
def str2class(str):
    return getattr(sys.modules[__name__], str)
I would like to give an example, but it is hard to come up with one. Anyway, the main problem is probably a file path problem.
If you really need an example, the GitHub code is here.
The Chain.py file needs to perform an automatic action mechanism, and it currently fails.
New approach:
Now I have put all the files in one folder and it works, but when I split them into modules it fails. So if the problem is in a module file, how can I convert the string to the corresponding class object?
Thanks for your help.
You can do this by accessing the namespace of the module directly:
import module
f = module.__dict__["func_name"]
# f is now a function and can be called:
f()
One of the greatest things about Python is that the internals are accessible to you, and that they fit the language paradigm. A name (of a variable, class, function, whatever) in a namespace is actually just a key in a dictionary that maps to that name's value.
If you're interested in what other language internals you can play with, try running dir() on things. You'd be surprised by the number of hidden methods available on most of the objects.
You probably should write this function like this:
def str2class(s):
    return globals()[s]
It's clearer and works even if __name__ is set to '__main__'.
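A small sketch tying the two approaches together; Chain is a made-up stand-in for whatever class the name should resolve to:

```python
import sys

class Chain:
    """Stand-in for a class that must be resolved from its name."""
    pass

def str2class(s):
    # Resolve the name in this module's global namespace.
    return globals()[s]

def str2class_in(module_name, s):
    # Resolve the name inside any imported module via sys.modules.
    return getattr(sys.modules[module_name], s)

print(str2class('Chain') is Chain)                 # True
print(str2class_in(__name__, 'Chain') is Chain)    # True
```

The second form is the one to use when the class lives in another module: pass that module's name (or the module object itself to getattr) instead of __name__.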

Python namespaces: How to make unique objects accessible in other modules?

I am writing a moderate-sized (a few KLOC) PyQt app. I started out writing it in nice modules for ease of comprehension but I am foundering on the rules of Python namespaces. At several points it is important to instantiate just one object of a class as a resource for other code.
For example: an object that represents Aspell attached as a subprocess, offering a check(word) method. Another example: the app features a single QTextEdit and other code needs to call on methods of this singular object, e.g. "if theEditWidget.document().isEmpty()..."
No matter where I instantiate such an object, it can only be referenced from code in that module and no other. So e.g. the code of the edit widget can't call on the Aspell gateway object unless the Aspell object is created in the same module. Fine except it is also needed from other modules.
In this question the bunch class is offered, but it seems to me a bunch has exactly the same problem: it's a unique object that can only be used in the module where it's created. Or am I completely missing the boat here?
OK, as suggested elsewhere, this seems like a simple answer to my problem. I just tested the following:
junk_main.py:
import junk_A
singularResource = junk_A.thing()
import junk_B
junk_B.handle = singularResource
print(junk_B.look())
junk_A.py:
class thing():
    def __init__(self):
        self.member = 99
junk_B.py:
def look():
    return handle.member
When I run junk_main it prints 99. So the main code can inject names into modules just by assignment. I am trying to think of reasons this is a bad idea.
You can access objects in a module with the . operator, just as with attributes of any other object. So, for example:
# Module a.py
a = 3
>>> import a
>>> print(a.a)
3
This is a trivial example, but you might want to do something like:
# Module EditWidget.py
theEditWidget = EditWidget()
...
# Another module
import EditWidget
if EditWidget.theEditWidget.document().isEmpty():
Or...
from EditWidget import *
if theEditWidget.document().isEmpty():
If you do go the from EditWidget import * route, you can even define a list named __all__ in your modules with the names (as strings) of all the objects you want your module to export to *. So if you wanted only theEditWidget to be exported, you could do:
# Module EditWidget.py
__all__ = ["theEditWidget"]
theEditWidget = EditWidget()
...
It turns out the answer is simpler than I thought. As I noted in the question, the main module can add names to an imported module. And any code can add members to an object. So the simple way to create an inter-module communication area is to create a very basic object in the main, say IMC (for inter-module communicator) and assign to it as members, anything that should be available to other modules:
IMC.special = A.thingy()
IMC.important_global_constant = 0x0001
etc. After importing any module, just assign IMC to it:
import B
B.IMC = IMC
Now, this is probably not the greatest idea from a software design standpoint. If you just limit IMC to holding named constants, it acts like a C header file. If it's just there to give access to singular resources, it's like an extern declaration. But because of Python's liberal rules, code in any module can modify or add members to IMC. Used in an undisciplined way, "who changed that?" could become a debugging issue. If there are multiple processes, race conditions are a danger.
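A minimal sketch of that communicator, assuming a plain namespace object is an acceptable container (types.SimpleNamespace saves writing the empty class by hand; the attribute values are made up):

```python
import types

# The inter-module communicator: a bare object to hang shared names on.
IMC = types.SimpleNamespace()
IMC.important_global_constant = 0x0001
IMC.special = object()   # stands in for A.thingy()

# After importing a module, inject the communicator into it:
#   import B
#   B.IMC = IMC
# Code in B can then refer to IMC.special, IMC.important_global_constant, etc.
print(IMC.important_global_constant)   # 1
```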
At several points it is important to instantiate just one object of a class as a resource for other code.
Instead of trying to create some sort of singleton factory, can you not create the single-use object somewhere between the main point of entry for the program and instantiating the object that needs it? The single-use object can just be passed as a parameter to the other object. Logically, then, you won't create the single-use object more than once.
For example:
def main(...):
    aspell_instance = ...
    myapp = MyAppClass(aspell_instance)
or...
class SomeWidget(...):
    def __init__(self, edit_widget):
        self.edit_widget = edit_widget

    def onSomeEvent(self, ...):
        if self.edit_widget.document().isEmpty():
            ....
I don't know if that's clear enough, or if it's applicable to your situation. But to be honest, the only time I've found I can't do this is in a CherryPy-based webserver, where the points of entry were pretty much everywhere.
