I've started using Mypy on my code, and I've got the following snippet here:
class BPDataStore():
    def __init__(self, path: pathlib.Path):
        self.path: pathlib.Path = path
        if self.path.exists():
            with self.path.open("r") as F:
                self._bpdata: list[dict] = json.loads(F.read())
        else:
            self._bpdata: list[dict] = {}
Running mypy . from the project root gets me the following error (there are others, but they're in other files and I want to focus on this one right now):
bp_tracker\back\data_store.py:13: error: Attribute "_bpdata" already defined on line 11
Why doesn't mypy allow "redeclaration" like this? It works in production just fine, and doesn't seem very unreasonable to me. And assuming that there's no way around this, what's the correct way to write this code so that it passes mypy's check?
For now I've just removed the annotation from the 2nd _bpdata attribute to get it to pass.
Currently using:
Python 3.10.4
Mypy 0.982
The Python interpreter does not care about annotations either way (so long as they are syntactically correct), which is why this "works" as you said.
But defining a variable more than once is generally not type safe. In this case it technically doesn't matter, because the definitions are identical, but I think mypy simply disallows re-definition altogether.
If you want to be explicit, you can either define it on the class in advance or inside that method beforehand. In both cases you just don't assign a value to it:
class BPDataStore:
    _bpdata: list[dict]  # here
    ...
    def __init__(self, path: pathlib.Path):
        self._bpdata: list[dict]  # or here
        ...
But the problem seems to be that your type annotation doesn't match what you assign it. {} is an empty dict and not a list.
Assuming that was a typo and you are certain that you would always get a list from that json.loads (i.e. the top-level is a JSON array), you could just assign an empty list first and then potentially overwrite it with what you load from the file.
Also, I would suggest including the type arguments for generic types like dict. Here is how I would do it:
import json
from pathlib import Path
from typing import Any

class BPDataStore:
    def __init__(self, path: Path) -> None:
        self.path = path
        self._bpdata: list[dict[str, Any]] = []
        try:
            with self.path.open("r") as f:
                self._bpdata = json.loads(f.read())
        except Exception as e:
            pass  # handle exception `e`...
Notice also that I don't explicitly annotate self.path because it would be inferred by any type checker based on that first assignment via the typed argument path. But at that point it is just a matter of preference.
EDIT: Thanks to @SUTerliakov for pointing out that you should indeed wrap your file opening in a try-block, instead of checking for existence. I edited my code example accordingly. If you are only worried the file may not exist, you should catch FileNotFoundError.
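For instance, a narrower version that only handles the missing-file case (class and attribute names taken from the snippet above) might look like this:

```python
import json
from pathlib import Path
from typing import Any

class BPDataStore:
    def __init__(self, path: Path) -> None:
        self.path = path
        self._bpdata: list[dict[str, Any]] = []
        try:
            with self.path.open("r") as f:
                self._bpdata = json.loads(f.read())
        except FileNotFoundError:
            pass  # keep the empty default list
```

Any other failure (permissions, malformed JSON) still propagates, which is usually what you want.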
Let's say I have a python module with the following function:
def is_plontaria(plon: str) -> bool:
    if plon is None:
        raise RuntimeError("None found")
    return plon.find("plontaria") != -1
For that function, I have the unit test that follows:
def test_is_plontaria_null(self):
    with self.assertRaises(RuntimeError) as cmgr:
        is_plontaria(None)
    self.assertEqual(str(cmgr.exception), "None found")
Given the type hints in the function, the input parameter should always be a defined string. But type hints are... hints. Nothing prevents the user from passing whatever it wants, and None in particular is a quite common option when previous operations fail to return the expected results and those results are not checked.
So I decided to test for None in the unit tests and to check the input is not None in the function.
The issue is: the type checker (pylance) warns me that I should not use None in that call:
Argument of type "None" cannot be assigned to parameter "plon" of type "str" in function "is_plontaria"
Type "None" cannot be assigned to type "str"
Well, I already know that, and that is the purpose of that test.
What is the best way to get rid of that error? Telling pylance to ignore this kind of error in every test/file? Or assuming that the argument passed will always be of the proper type and removing that test and the None check in the function?
This is a good question. I think that silencing that type error in your test is not the right way to go.
Don't patronize the user
While I would not go so far as to say that this is universally the right way to do it, in this case I would definitely recommend getting rid of your None check from is_plontaria.
Think about what you accomplish with this check. Say a user calls is_plontaria(None) even though you annotated it with str. Without the check he causes an AttributeError: 'NoneType' object has no attribute 'find' with a traceback to the line return plon.find("plontaria") != -1. The user thinks to himself "oops, that function expects a str". With your check he causes a RuntimeError ideally telling him that plon is supposed to be a str.
What purpose did the check serve? I would argue none whatsoever. Either way, an error is raised because your function was misused.
What if the user passes a float accidentally? Or a bool? Or literally anything other than a str? Do you want to hold the user's hand for every parameter of every function you write?
And I don't buy the "None is a special case"-argument. Sure, it is a common type to be "lying around" in code, but that is still on the user, as you pointed out yourself.
If you are using properly type annotated code (as you should) and the user is too, such a situation should never happen. Say the user has another function foo that he wants to use like this:
def foo() -> str | None:
    ...

s = foo()
b = is_plontaria(s)
That last line should cause any static type checker worth its salt to raise an error, saying that is_plontaria only accepts str, but a union of str and None was provided. Even most IDEs mark that line as problematic.
The user should see that before he even runs his code. Then he is forced to rethink and either change foo or introduce his own type check before calling your function:
s = foo()
if isinstance(s, str):
    b = is_plontaria(s)
else:
    ...  # do something else
Qualifier
To be fair, there are situations where error messages are very obscure and don't properly tell the caller what went wrong. In those cases it may be useful to introduce your own. But aside from those, I would always argue in the spirit of Python that the user should be considered mature enough to do his own homework. And if he doesn't, that is on him, not you. (So long as you did your homework.)
There may be other situations, where raising your own type-errors makes sense, but I would consider those to be the exception.
If you must, use Mock
As a little bonus, in case you absolutely do want to keep that check in place and need to cover that if-branch in your test, you can simply pass a Mock as an argument, provided your if-statement is adjusted to check for anything other than str:
from unittest import TestCase
from unittest.mock import Mock

def is_plontaria(plon: str) -> bool:
    if not isinstance(plon, str):
        raise RuntimeError("None found")
    return plon.find("plontaria") != -1

class Test(TestCase):
    def test_is_plontaria(self) -> None:
        not_a_string = Mock()
        with self.assertRaises(RuntimeError):
            is_plontaria(not_a_string)
        ...
Most type checkers consider Mock to be a special case and don't complain about its type, assuming you are running tests. mypy for example is perfectly happy with such code.
This comes in handy in other situations as well. For example, when the function being tested expects an instance of some custom class of yours as its argument. You obviously want to isolate the function from that class, so you can just pass a mock to it that way. The type checker won't mind.
Hope this helps.
You can disable type checking on a specific line with a comment.
def test_is_plontaria_null(self):
    with self.assertRaises(RuntimeError) as cmgr:
        is_plontaria(None)  # type: ignore
    self.assertEqual(str(cmgr.exception), "None found")
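If you run mypy as well, you can narrow the suppression to the specific error code so other problems on that line still get flagged. A self-contained sketch of the test above, with the function inlined for illustration:

```python
import unittest

def is_plontaria(plon: str) -> bool:
    if plon is None:
        raise RuntimeError("None found")
    return plon.find("plontaria") != -1

class TestIsPlontaria(unittest.TestCase):
    def test_is_plontaria_null(self) -> None:
        with self.assertRaises(RuntimeError) as cmgr:
            # suppress only the argument-type error, not everything
            is_plontaria(None)  # type: ignore[arg-type]
        self.assertEqual(str(cmgr.exception), "None found")
```

Pyright/pylance also honors plain `# type: ignore` comments by default.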
import os
BUCKET = os.getenv("BUCKET")
IN_CSV = os.getenv("IN_CSV")
OUT_CSV = os.getenv("OUT_CSV")
Now, you see the problem, right? I don't want to type each variable name twice. Is there a way to avoid it? Maybe some function get_and_init_env.
After get_and_init_env("BUCKET") is executed, there should be a variable named BUCKET with the value of os.getenv("BUCKET") in locals().
May not be exactly what you need, but to save time typing in things built on IPython, I once made a class that took in a dict of strings (such as one easily made from os.environ). In its __init__ it called setattr to give the instance attributes reflecting the dict contents. From there I just had to write .blah on that instance instead of ['blah'], and more importantly, in IPython I could type .b<tab> to bring up the matching items. It probably went something like:
...
class DotDict:
    def __init__(self, dictish):
        # a dict has a lot of useful capabilities that can be routed to it...
        self._original = dict(dictish)
        for x, y in self._original.items():
            setattr(self, CleanStr(x), y)
...

# make useful dicts part of the module
env = DotDict(os.environ)
...

from MyMod import env as env0
env0.BUCKET  # just use it...
Since most environ vars should be pretty clean, you can probably just use x instead of CleanStr(x) but should really have a way to make any x object into a valid name, be it str or repr or hash related and prefixed by some favorite character sequence.
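The CleanStr helper above is left undefined; one possible sketch (name and rules assumed, and assuming non-empty input) that turns an arbitrary string into a valid attribute name:

```python
import re

def clean_str(name: str) -> str:
    """Turn an arbitrary non-empty string into a valid Python identifier."""
    # replace every non-identifier character with an underscore
    cleaned = re.sub(r"\W", "_", name)
    # prefix names that start with a digit so they remain valid identifiers
    if cleaned[0].isalpha() or cleaned[0] == "_":
        return cleaned
    return "v_" + cleaned
```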
I want to clean up each parameter before passing it to the class methods. Right now I have something like this:
from cerberus import Validator

class MyValidator(Validator):  # Validator - external lib (has its own _validate methods)
    def _validate_type_name(self, value):
        # validation code here
        # goal - to clean up value before passing to each method (mine + from lib), e.g. value.strip()
        ...

schema = {"name": {"type": "name"},  # name - custom type from _validate_type_name
          "planet_type": {"type": "string"}}  # string - external lib type
my_dict = {"name": " Mars ",
           "planet_type": " terrestrial "}

v = MyValidator(schema)
print(v.validate(my_dict))  # True/False
# NOTE: I would like to do cleanup on method type level (not pass to schema)
I would like to clean up data before passing to the MyValidator methods (e.g., simple strip) but I don't want to make it as a separate step (just in case someone forgets to execute it before calling validation). I'd like to integrate cleanup with validation methods (external ones + mine).
I was considering either decorator on class or metaclass, but maybe there's a better approach. I don't have much experience here, asking for your advice.
If your goal is to make sure that the caller does the cleaning (i.e. you want them to "clean" their own copy of the value rather than having you return a modified version to them, which necessitates that it happen outside your function), then a decorator can't do much more than enforcement -- i.e. you can wrap all the functions such that a runtime exception is raised if an invalid value comes through.
The way that I'd tackle this instead of a decorator would be with types (which requires that you include mypy in your testing process, but you should be doing that anyway IMO). Something like:
from typing import NewType
CleanString = NewType('CleanString', str)
def clean(value: str) -> CleanString:
    """Does cleanup on a raw string to make it a 'clean' string."""
    value = value.strip()
    # whatever else
    return CleanString(value)
class MyValidator(Validator):
    def validate_name(self, value: CleanString) -> bool:
        # this will now flag a mypy error if someone passes a plain str to it,
        # saying a 'str' was provided where a 'CleanString' was required!
        ...
Static typing has the advantage of raising an error before the code is even executed, and regardless of the actual runtime value.
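A self-contained sketch of how the NewType pattern plays out at a call site (validate_name simplified to a free function for illustration):

```python
from typing import NewType

CleanString = NewType("CleanString", str)

def clean(value: str) -> CleanString:
    """Cleanup step; returns the value tagged as a CleanString."""
    return CleanString(value.strip())

def validate_name(value: CleanString) -> bool:
    return len(value) > 0

raw = "  Mars  "
# validate_name(raw)           # mypy error: expected "CleanString", got "str"
ok = validate_name(clean(raw))  # type-checks: the value went through clean()
```

At runtime CleanString is just str, so there is zero overhead; the guarantee exists only at type-check time.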
I'm trying to use the ast module in Python to parse input code, but am struggling with a lot of the syntax of how to do so. For instance, I have the following code as a testing environment:
import ast

class NodeVisitor(ast.NodeVisitor):
    def visit_Call(self, node):
        for each in node.args:
            print(ast.literal_eval(each))
        self.generic_visit(node)

line = "circuit = QubitCircuit(3, True)"
tree = ast.parse(line)

print("VISITOR")
visitor = NodeVisitor()
visitor.visit(tree)
Output:
VISITOR
3
True
In this instance, and please correct me if I'm wrong, the visit_Call will be used if it's a function call? So I can get each argument, however there's no guarantee it will work like this as there are different arguments available to be provided. I understand that node.args is providing my arguments, but I'm not sure how to do things with them?
I guess what I'm asking is how do I check what the arguments are and do different things with them? I'd like to check, perhaps, that the first argument is an Int, and if so, run processInt(parameter) as an example.
The value each in your loop in the method will be assigned to the AST node for each of the arguments in each function call you visit. There are lots of different types of AST nodes, so by checking which kind you have, you may be able to learn things about the argument being passed in.
Note however that the AST is about syntax, not values. So if the function call was foo(bar), it's just going to tell you that the argument is a variable named bar, not what the value of that variable is (which it does not know). If the function call was foo(bar(baz)), it's going to show you that the argument is another function call. If you only need to handle calls with literals as their arguments, then you're probably going to be OK; you'll just look for instances of ast.Num and similar.
If you want to check if the first argument is a number and process it if it is, you can do something like:
def visit_Call(self, node):
    first_arg = node.args[0]
    if isinstance(first_arg, ast.Num):
        processInt(first_arg.n)
    else:
        pass  # Do you want to do something on a bad argument? Raise an exception maybe?
I'm trying to understand why, how, and if to unit test methods that seem to return nothing. I've read in a couple other threads that:
The point of a unit test is to test something that the function does. If its not returning a value, then what is it actually doing?
unittest for none type in python?
In my example, I am using the XMLSigner and XMLVerifier from the SignXML library.
def verify_xml(signed_xml: str, cert_file: str) -> None:
    with open(cert_file, 'rb') as file:
        cert = file.read()
    with open(signed_xml, 'rb') as input_file:
        input_data = input_file.read()
    XMLVerifier().verify(input_data, x509_cert=cert)
I started looking up documentation I found for SignXML. I read that verify():
class signxml.XMLVerifier
    Create a new XML Signature Verifier object, which can be used to hold configuration information and verify multiple pieces of data.

    verify(data, require_x509=True, x509_cert=None, cert_subject_name=None, ca_pem_file=None, ca_path=None, hmac_key=None, validate_schema=True, parser=None, uri_resolver=None, id_attribute=None, expect_references=1)
        Verify the XML signature supplied in the data and return the XML node signed by the signature, or raise an exception if the signature is not valid. By default, this requires the signature to be generated using a valid X.509 certificate.
This is my first time working with this and I'm confused even more now. So this apparently does return something.
What I've attempted
For another method which ends up calling verify_xml I've used @patch and just checked that the method I patched was called and with the correct arguments. This also seems like it's not the way to do it, but I didn't know how else to test it.
It feels weird doing something similar with the verify_xml method and just checking that it has been called once.
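The patch approach described above can be made concrete with a self-contained sketch (send_report is a hypothetical caller, and verify_xml is stubbed here so the example runs without real certificates):

```python
import unittest
from unittest import mock

def verify_xml(signed_xml: str, cert_file: str) -> None:
    # stand-in for the real function, which hits the filesystem
    raise NotImplementedError

def send_report(signed_xml: str, cert_file: str) -> str:
    """Hypothetical caller: verifies the signature before sending."""
    verify_xml(signed_xml, cert_file)
    return "sent"

class TestSendReport(unittest.TestCase):
    @mock.patch(f"{__name__}.verify_xml")
    def test_send_report_verifies_first(self, mock_verify: mock.MagicMock) -> None:
        result = send_report("report.xml", "cert.pem")
        # assert the collaborator was called correctly, not its internals
        mock_verify.assert_called_once_with("report.xml", "cert.pem")
        self.assertEqual(result, "sent")
```

The point of the assertion is the *contract* between the caller and verify_xml, not verify_xml's internals; those are exercised separately, as in the exception-based tests below.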
I've also tried self.assertIsNone... and that passes but that seems weird to me and not like it's a way one does this.
Could someone help me understand why, how, and if to unit test methods that seem to return nothing?
Thanks
The way to test verify_xml() is to test the exception triggered by XMLVerifier().verify() when the input parameters are not valid.
There are a few types of exceptions you can test.
import unittest

from signxml import (XMLSigner, XMLVerifier, InvalidInput, InvalidSignature, InvalidCertificate, InvalidDigest)

class TestVerifyXML(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        cls.signed_xml = from_magic()
        cls.cert_file = from_magic2()
        cls.cert_file_bad = from_magic_bad()

    def test_verify_xml(self):
        # no exception with a correct xml
        verify_xml(self.signed_xml, self.cert_file)
        with self.assertRaises(InvalidSignature):
            verify_xml(self.signed_xml, self.cert_file_bad)