I'm having trouble implementing a private inner enum class called "LineCode" inside a class named Parser.
LineCode: a private Enum class that defines 6 general types of code lines. Each enum member is instantiated with a regex pattern, which is compiled in the constructor (__init__); the compiled matcher is then held as an attribute of the member.
Parser: parses a programming language (which language is irrelevant). Parser uses LineCode to identify the lines and proceed accordingly.
Problem: I can't access the enum members of __LineCode from a static method.
I wish to have a static method inside __LineCode, matchLineCode(line), that receives a string from the Parser and then iterates over the enum members with the following logic:
If Match is found: Return the enum
If no more enums left: Return None
It doesn't seem trivial; I can't access the enum members to do this.
Attempts: I tried iterating over the enums using:
__LineCode.__members__.values()
Parser.__lineCode.__members__.values()
Both failed since it can't find __lineCode.
Ideally: the LineCode class must be private and not visible to any other class importing the Parser. Parser must use the static method that LineCode provides to return the enum. I am willing to accept any solution that solves this issue, or one that mimics this behavior.
I omitted some of the irrelevant Parser methods to improve readability.
Code:
import re
from enum import Enum

class Parser:

    class __LineCode(Enum):
        STATEMENT = r"^\s*(.*);\s*$"
        CODE_BLOCK = r"^\s*(.*)\s*\{\s*$"
        CODE_BLOCK_END = r"^\s*(.*)\s*\}\s*$"
        COMMENT_LINE = r"^\s*//\s*(.*)$"
        COMMENT_BLOCK = r"^\s*(?:/\*\*)\s*(.*)\s*$"
        COMMENT_BLOCK_END = r"^\s*(.*)\s*(?:\*/)\s*$"
        BLANK_LINE = r"^\s*$"

        def __init__(self, pattern):
            self.__matcher = re.compile(pattern)

        @property
        def matches(self, line):
            return self.__matcher.match(line)

        @property
        def lastMatch(self):
            try:
                return self.__matcher.groups(1)
            except:
                return None

        @staticmethod
        def matchLineCode(line):
            for lineType in **???**:
                if lineType.matches(line):
                    return lineType
            return None

    def __init__(self, source=None):
        self.__hasNext = False
        self.__instream = None
        if source:
            self.__instream = open(source)

    def advance(self):
        self.__hasNext = False
        while not self.__hasNext:
            line = self.__instream.readline()
            if line == "":  # If EOF
                self.__closeFile()
                return
            lineCode = self.__LineCode.matchLineCode(line)
            if lineCode is self.__LineCode.STATEMENT:
                pass
            elif lineCode is self.__LineCode.CODE_BLOCK:
                pass
            elif lineCode is self.__LineCode.CODE_BLOCK_END:
                pass
            elif lineCode is self.__LineCode.COMMENT_LINE:
                pass
            elif lineCode is self.__LineCode.COMMENT_BLOCK:
                pass
            elif lineCode is self.__LineCode.COMMENT_BLOCK_END:
                pass
            elif lineCode is self.__LineCode.BLANK_LINE:
                pass
            else:
                pass  # TODO Invalid file.
I already implemented this in Java, and I want to reconstruct the same thing in Python:
private enum LineCode {
    STATEMENT("^(.*)" + Syntax.EOL + "\\s*$"),                  // statement line
    CODE_BLOCK("^(.*)" + Syntax.CODE_BLOCK + "\\s*$"),          // code block open line
    CODE_BLOCK_END("^\\s*" + Syntax.CODE_BLOCK_END + "\\s*$"),  // code block close line
    COMMENT_LINE("^\\s*" + Syntax.COMMENT + "(.*+)$"),          // comment line
    BLANK_LINE("\\s*+$");                                       // blank line

    private final static int CONTENT_GROUP = 1;
    private Pattern pattern;
    private Matcher matcher;

    private LineCode(String regex) {
        pattern = Pattern.compile(regex);
    }

    boolean matches(String line) {
        matcher = pattern.matcher(line);
        return matcher.matches();
    }

    String lastMatch() {
        try {
            return matcher.group(CONTENT_GROUP);
        } catch (IndexOutOfBoundsException e) {
            return matcher.group();
        }
    }

    static LineCode matchLineCode(String line) throws UnparsableLineException {
        for (LineCode lineType : LineCode.values())
            if (lineType.matches(line)) return lineType;
        throw new UnparsableLineException(line);
    }
}
Thanks.
You could change the staticmethod to a classmethod; that way the first argument passed to matchLineCode would be the __LineCode class itself, and you would be able to iterate over it.
Edit
I've decided to add a more detailed explanation as to why matchLineCode with the @staticmethod decorator was unable to see the __LineCode class. First, I recommend reading some questions posted on SO that discuss the difference between static and class methods. The main difference is that a classmethod is aware of the class where the method is defined, while a staticmethod is not. This does not mean that you are unable to see the __LineCode class from the staticmethod; it just means that you will have to do some more work to do so.
The way you organized your code, the class __LineCode is a class attribute of the class Parser. In Python, methods are always public; there are no private or protected class members as in Java. However, the double underscore at the beginning of a class attribute's name (or an instance attribute's name) means that the name will be mangled with the class name. This means that any function defined outside of the class Parser could access the __LineCode class as

    Parser._Parser__LineCode
This means that, using the @staticmethod decorator, you could iterate over __LineCode by doing

    @staticmethod
    def matchLineCode(line):
        for lineType in Parser._Parser__LineCode:
            if lineType.matches(line):
                return lineType
        return None
However, it is much more readable and, in my opinion, more understandable to use the @classmethod decorator to make the function aware of the __LineCode class.
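A minimal runnable sketch of the classmethod approach, trimmed to two members for brevity (the identify() wrapper is a hypothetical method added here for demonstration, not part of the original Parser):

```python
import re
from enum import Enum

class Parser:

    class __LineCode(Enum):
        STATEMENT = r"^\s*(.*);\s*$"
        BLANK_LINE = r"^\s*$"

        def __init__(self, pattern):
            # Each member's value is its regex; compile it once per member.
            self.__matcher = re.compile(pattern)

        def matches(self, line):
            return self.__matcher.match(line)

        @classmethod
        def matchLineCode(cls, line):
            # cls is the enum class itself, so iterating it yields the members.
            for lineType in cls:
                if lineType.matches(line):
                    return lineType
            return None

    def identify(self, line):
        # The enum stays private; outside code only sees the Parser API.
        return self.__LineCode.matchLineCode(line)
```

Inside Parser the name `self.__LineCode` is mangled to `self._Parser__LineCode`, which resolves to the class attribute, so the enum never needs to be visible to importers.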
Related
I am currently doing some work where I want to load a config from YAML into a class so that class attributes get set. Depending on the work I am doing, I have a different config class.
In the class I pre-specify the expected arguments as follows, then load data in from the file using BaseConfig.load_config(). This checks whether the class has the corresponding attribute before using setattr() to assign the value.
class AutoencoderConfig(BaseConfig):
    def __init__(self, config_path: Path) -> None:
        super().__init__(config_path)
        self.NTRAIN: Optional[int] = None
        self.NVALIDATION: Optional[int] = None
        # ... more below
        self.load_config()
When I am referring to attributes of the AutoencoderConfig in other scripts, I might say something like: config.NTRAIN. While this works just fine, Mypy does not like this and gives the following error:
path/to/file.py:linenumber: error: Unsupported operand types for / ("None" and "int")
The error arises because at the time of the check, the values from the yaml file have not been loaded yet, and the types are specified with self.NTRAIN: Optional[int] = None.
Is there any way I can avoid this without having to place # type: ignore at the end of every line where I refer to the config class?
Are there any best-practices for using a config class like this?
EDIT:
Solution:
The solution provided by @SUTerliakov below works perfectly fine. In the end I decided to adopt a @dataclass approach:
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict

@dataclass
class BaseConfig:
    _config: Dict[str, Any] = field(default_factory=dict, repr=False)

    def load_config(self, config_path: Path) -> None:
        ...

@dataclass
class AutoencoderConfig(BaseConfig):
    NTRAIN: int = field(init=False)
    NVALIDATION: int = field(init=False)
    # ... more below
This allows me to specify each variable as a field(init=False) which seems to be a nice, explicit way to solve the problem.
Is it always int after load_config is called? (can it remain None?)
If it is always int, then do not mark them as Optional and don't assign None. Just declare them in class body without initialization:
class AutoencoderConfig(BaseConfig):
    NTRAIN: int
    NVALIDATION: int

    def __init__(self, config_path: Path):
        super().__init__(config_path)
        # Whatever else
        self.load_config()
Otherwise, you have to explicitly check for None before acting on the value. type: ignore is a bad solution for this case, as it will just silence a real problem: Optional says that the variable can be None, and you cannot divide None by an integer, so it can cause runtime issues. In this case the proper solution would be something like this:
if self.NTRAIN is not None:
    foo = self.NTRAIN / 2
else:
    raise ValueError('NTRAIN required!')
    # Or
    # foo = some_default_foo
    # if you can handle such a case
If you know that after some point it can't be None, then the cleanest option is to assert it like this:

    assert self.NTRAIN is not None
    foo = self.NTRAIN / 2
Both methods will make mypy happy, because NTRAIN is really not None in conditional branch or after assert.
EDIT: sorry, I probably misread your question. Does load_config need hasattr(AutoencoderConfig, 'NTRAIN') to be True to work? If so, the easiest solution will be
class AutoencoderConfig(BaseConfig):
    NTRAIN: int = None  # type: ignore[assignment]
    NVALIDATION: int = None  # type: ignore[assignment]

    def __init__(self, config_path: Path):
        super().__init__(config_path)
        # Whatever else
        self.load_config()
You won't need any other type-ignores this way.
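To tie the pieces together, here is a self-contained sketch of the hasattr/setattr loading scheme the question describes. _read_config and its hard-coded dict are stand-ins for the real YAML parsing (e.g. yaml.safe_load), so the example runs without external files:

```python
from pathlib import Path
from typing import Any, Dict

class BaseConfig:
    def __init__(self, config_path: Path) -> None:
        self._config_path = config_path

    def _read_config(self) -> Dict[str, Any]:
        # Stand-in for parsing the YAML file at self._config_path;
        # hard-coded here so the sketch is self-contained.
        return {'NTRAIN': 1000, 'NVALIDATION': 200, 'UNKNOWN_KEY': 1}

    def load_config(self) -> None:
        for key, value in self._read_config().items():
            # Only assign keys that the subclass declared up front;
            # unexpected keys in the file are silently skipped.
            if hasattr(self, key):
                setattr(self, key, value)

class AutoencoderConfig(BaseConfig):
    NTRAIN: int = None        # type: ignore[assignment]
    NVALIDATION: int = None   # type: ignore[assignment]

    def __init__(self, config_path: Path) -> None:
        super().__init__(config_path)
        self.load_config()
```

Because the attributes exist at class level, hasattr() succeeds before the first assignment, which is exactly what load_config relies on.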
Using the following, I am able to successfully create a parser and add my arguments to self._parser through the __init__() method.
import argparse

class Parser:
    _parser_params = {
        'description': 'Generate a version number from the version configuration file.',
        'allow_abbrev': False
    }
    _parser = argparse.ArgumentParser(**_parser_params)
Now I wish to split the arguments into groups so I have updated my module, adding some classes to represent the argument groups (in reality there are several subclasses of the ArgumentGroup class), and updating the Parser class.
import argparse

class ArgumentGroup:
    _title = None
    _description = None

    def __init__(self, parser) -> None:
        parser.add_argument_group(*self._get_args())

    def _get_args(self) -> list:
        return [self._title, self._description]

class ArgumentGroup_BranchType(ArgumentGroup):
    _title = 'branch type arguments'

class Parser:
    _parser_params = {
        'description': 'Generate a version number from the version configuration file.',
        'allow_abbrev': False
    }
    _parser = argparse.ArgumentParser(**_parser_params)
    _argument_groups = [cls(_parser) for cls in ArgumentGroup.__subclasses__()]
However, I'm now seeing an error.
Traceback (most recent call last):
  ...
  File "version2/args.py", line 62, in <listcomp>
    _argument_groups = [cls(_parser) for cls in ArgumentGroup.__subclasses__()]
NameError: name '_parser' is not defined
What I don't understand is why _parser_params does exist when it is referred to by another class attribute, but _parser seemingly does not exist in the same scenario. How can I refactor my code to add the argument groups as required?
This comes from the confluence of two quirks of Python:
class statements do not create a new local scope
List comprehensions do create a new local scope.
As a result, the name _parser is in a local scope whose closest enclosing scope is the global scope, so it cannot refer to the about-to-be class attribute.
A simple workaround would be to replace the list comprehension with a regular for loop.
_argument_groups = []
for cls in ArgumentGroup.__subclasses__():
    _argument_groups.append(cls(_parser))
(A better solution would probably be to stop using class attributes where instance attributes make more sense.)
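A tiny repro of the two quirks together, with the for-loop workaround (ClassScopeDemo and ForLoopDemo are illustrative names, not from the question):

```python
class ClassScopeDemo:
    values = [1, 2, 3]
    # The leftmost iterable IS evaluated in the class scope, so this works:
    doubled = [v * 2 for v in values]
    # But any other name inside the comprehension skips the class scope,
    # so this would raise NameError, just like _parser in the question:
    #   pairs = [(v, w) for v in values for w in values]

class ForLoopDemo:
    values = [1, 2, 3]
    doubled = []
    for v in values:  # a plain for loop runs directly in the class scope
        doubled.append(v * 2)
    del v  # avoid leaking the loop variable as a class attribute
```

The comprehension only fails for names used past the first iterable, which is why it can look like class attributes "sometimes" exist.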
import json
from enum import IntEnum
from json import JSONEncoder

import yaml

class MSG_TYPE(IntEnum):
    REQUEST = 0
    GRANT = 1
    RELEASE = 2
    FAIL = 3
    INQUIRE = 4
    YIELD = 5

    def __json__(self):
        return str(self)

class MessageEncoder(JSONEncoder):
    def default(self, obj):
        return obj.__json__()

class Message(object):
    def __init__(self, msg_type, src, dest, data):
        self.msg_type = msg_type
        self.src = src
        self.dest = dest
        self.data = data

    def __json__(self):
        return dict(
            msg_type=self.msg_type,
            src=self.src,
            dest=self.dest,
            data=self.data,
        )

    def ToJSON(self):
        return json.dumps(self, cls=MessageEncoder)

msg = Message(msg_type=MSG_TYPE.FAIL, src=0, dest=1, data="hello world")
encoded_msg = msg.ToJSON()
decoded_msg = yaml.load(encoded_msg)
print type(decoded_msg['msg_type'])
When calling print type(decoded_msg['msg_type']), I get the result <type 'str'> instead of the original MSG_TYPE type. I feel like I should also write a custom JSON decoder, but I'm kind of confused about how to do that. Any ideas? Thanks.
When calling print type(decoded_msg['msg_type']), I get the result <type 'str'> instead of the original MSG_TYPE type.
Well, yeah, that's because you told MSG_TYPE to encode itself like this:
def __json__(self):
    return str(self)
So, that's obviously going to decode back to a string. If you don't want that, come up with some unique way to encode the values, instead of just encoding their string representations.
The most common way to do this is to encode all of your custom types (including your enum types) using some specialized form of object—just like you've done for Message. For example, you might put a py-type field in the object which encodes the type of your object, and then the meanings of the other fields all depend on the type. Ideally you'll want to abstract out the commonalities instead of hardcoding the same thing 100 times, of course.
I feel like I should also write a custom json decoder but kind of confused how to do that.
Well, have you read the documentation? Where exactly are you confused? You're not going to get a complete tutorial by tacking on a followup to a StackOverflow question…
Assuming you've got a special object structure for all your types, you can use an object_hook to decode the values back to the originals. For example, as a quick hack:
class MessageEncoder(JSONEncoder):
    def default(self, obj):
        return {'py-type': type(obj).__name__, 'value': obj.__json__()}

class MessageDecoder(JSONDecoder):
    def __init__(self, hook=None, *args, **kwargs):
        if hook is None:
            hook = self.hook
        super().__init__(object_hook=hook, *args, **kwargs)

    def hook(self, obj):
        if isinstance(obj, dict):
            pytype = obj.get('py-type')
            if pytype:
                t = globals()[pytype]
                return t.__unjson__(**obj['value'])
        return obj
And now, in your Message class:
@classmethod
def __unjson__(cls, msg_type, src, dest, data):
    return cls(msg_type, src, dest, data)
And you need a MSG_TYPE.__json__ that returns a dict, maybe just {'name': str(self)}, then an __unjson__ that does something like getattr(cls, name).
A real-life solution should probably either have the classes register themselves instead of looking them up by name, or should handle looking them up by qualified name instead of just going to globals(). And you may want to let things encode to something other than object—or, if not, to just cram py-type into the object instead of wrapping it in another one. And there may be other ways to make the JSON more compact and/or readable. And a little bit of error handling would be nice. And so on.
You may want to look at the implementation of jsonpickle—not because you want to do the exact same thing it does, but to see how it hooks up all the pieces.
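One possible shape for the "register themselves" idea, sketched under the assumption that every participating class provides __json__/__unjson__ as above (jsonable, Point, and the registry are illustrative names, not part of the original code):

```python
import json

_registry = {}

def jsonable(cls):
    # Class decorator: record the type so the decoder can find it by name,
    # instead of reaching into globals().
    _registry[cls.__name__] = cls
    return cls

class RegistryEncoder(json.JSONEncoder):
    def default(self, obj):
        if type(obj).__name__ in _registry:
            return {'py-type': type(obj).__name__, 'value': obj.__json__()}
        return super().default(obj)

def registry_hook(obj):
    if isinstance(obj, dict) and 'py-type' in obj:
        return _registry[obj['py-type']].__unjson__(**obj['value'])
    return obj

@jsonable
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __json__(self):
        return {'x': self.x, 'y': self.y}

    @classmethod
    def __unjson__(cls, x, y):
        return cls(x, y)

restored = json.loads(json.dumps(Point(1, 2), cls=RegistryEncoder),
                      object_hook=registry_hook)
```

Registration via a decorator keeps the lookup explicit and avoids accidentally instantiating arbitrary names found in the module namespace.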
Overriding the default method of the encoder won't matter in this case because your object never gets passed to the method. It's treated as an int.
If you run the encoder on its own:

    msg_type = MSG_TYPE.RELEASE
    MessageEncoder().encode(msg_type)

You'll get:

    'MSG_TYPE.RELEASE'
If you can, use an Enum and you shouldn't have any issues. I also asked a similar question:
How do I serialize IntEnum from enum34 to json in python?
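The question above targets Python 2 with enum34; on current Python 3 the int-subclass behavior is easy to see directly (IntKind, PlainKind, HookEncoder, and the py-type payload are illustrative names, not from the question):

```python
import json
from enum import Enum, IntEnum

class IntKind(IntEnum):
    FAIL = 3

class PlainKind(Enum):
    FAIL = 3

class HookEncoder(json.JSONEncoder):
    def default(self, obj):
        # Only reached for objects json cannot already serialize.
        if isinstance(obj, Enum):
            return {'py-type': type(obj).__name__, 'name': obj.name}
        return super().default(obj)

# IntEnum is an int subclass, so it is emitted as a bare number and
# default() is never called:
print(json.dumps(IntKind.FAIL, cls=HookEncoder))    # 3
# A plain Enum is not natively serializable, so default() runs:
print(json.dumps(PlainKind.FAIL, cls=HookEncoder))  # {"py-type": "PlainKind", "name": "FAIL"}
```

Switching from IntEnum to Enum is therefore the simplest way to make the custom default() fire for enum values.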
I have the base class:
class BaseGameHandler(BaseRequestHandler):
    name = 'Base'

    def get(self):
        self.render(self.name + ".html")
Now I need to define a few subclasses of this, but the thing is, they have to have a decorator. Equivalent code would be:
@route('asteroid')
class AsteroidGameHandler(BaseGameHandler):
    name = 'asteroid'

@route('blah')
class BlahGameHandler(BaseGameHandler):
    name = 'blah'
and maybe a few more.
A little background here: this is a Tornado web app, and the @route decorator allows you to map /blah to BlahGameHandler. This code maps /blah to BlahGameHandler and /asteroid to AsteroidGameHandler.
So I thought I should use metaprogramming in Python and define all these classes on the fly. I tried the following, which doesn't work (and by "doesn't work" I mean the final web app throws 404 on both /asteroid and /blah):
game_names = ['asteroid', 'blah']
games = list([game, type('%sGameHandler' % (game.title()), (BaseGameHandler,), {'name': game})] for game in game_names)
for i in xrange(len(games)):
    games[i][1] = route(games[i][0])(games[i][1])
What am I missing? Aren't these two pieces of code equivalent when run?
The library that you use only looks for global class objects in your module.
Set each class as a global; the globals() function gives you access to your module namespace as a dictionary:
for i in xrange(len(games)):
    globals()[games[i][1].__name__] = route(games[i][0])(games[i][1])
The include() code does not look for your views in lists.
To be specific, include() uses the following loop to detect handlers:
for member in dir(module):
    member = getattr(module, member)
    if isinstance(member, type) and issubclass(member, web.RequestHandler) and hasattr(member, 'routes'):
        # ...
    elif isinstance(member, type) and issubclass(member, web.RequestHandler) and hasattr(member, 'route_path'):
        # ...
    elif isinstance(member, type) and issubclass(member, web.RequestHandler) and hasattr(member, 'rest_route_path'):
        # ...
dir(module) only considers top-level objects.
I have to get static information from one 'module' to another. I'm trying to write a logger that records information about the code location we're logging from.
For example, in some file:
LogObject.Log('Describe error', STATIC_INFORMATION)
The static information is the class name, file name, and function name. I get it from these:

    __file__
    self.__class__.__name__
    sys._getframe().f_code.co_name
But I don't want to write these variables out at every logging call. Can I create some function and call it instead? For example:
LogObject.Log('Describe error', someFunction())
How can I use it to get the static information?
I don't think "static" is the word you're looking for. If I understand you correctly, you want to write a function that will return the filename, class name and method name of the caller.
Basically, you should use sys._getframe(1) to access the previous frame, and work from there.
Example:
def codeinfo():
    import sys
    f = sys._getframe(1)
    filename = f.f_code.co_filename
    classname = ''
    if 'self' in f.f_locals:
        classname = f.f_locals['self'].__class__.__name__
    funcname = f.f_code.co_name
    return "filename: %s\nclass: %s\nfunc: %s" % (filename, classname, funcname)
Then from a method somewhere you can write
logger.info("Some message \n %s" % codeinfo())
First, please use lower-case names for objects and methods; only use CapitalizedNames for class definitions.
More importantly, you want a clever introspective function in every class, it appears.
class Loggable(object):
    def identification(self):
        # _getframe(1) reports the caller's function name rather than
        # identification() itself.
        return self.__class__.__module__, self.__class__.__name__, sys._getframe(1).f_code.co_name

class ARealClass(Loggable):
    def someFunction(self):
        logger.info("Some Message from %r", self.identification())
If all of your classes are subclasses of Loggable, you'll inherit this identification function in all classes.