How do you keep code consistent with multiple developers? - python

I know that Python is a dynamically typed language, and that I am likely trying to recreate Java behavior here. However, I have a team of people working on this code base, and my goal with the code is to ensure that they are doing things in a consistent manner. Let me give an example:
class Company:
    def __init__(self, j):
        self.locations = []
When they instantiate a Company object, an empty list that holds locations is created. Now, with Python anything can be added to the list. However, I would like for this list to only contain Location objects:
class Location:
    def __init__(self, j):
        self.address = None
        self.city = None
        self.state = None
        self.zip = None
I'm doing this with classes so that the code is self-documenting. In other words, "a location has only these attributes". My goal is that they do this:
c = Company()
l = Location()
l.city = "New York"
c.locations.append(l)
Unfortunately, nothing is stopping them from simply doing c.locations.append("foo"), and nothing indicates to them that c.locations should be a list of Location objects.
What is the Pythonic way to enforce consistency when working with a team of developers?

An OOP solution is to make sure the users of your class' API do not have to interact directly with your instance attributes.
Methods
One approach is to implement methods which encapsulate the logic of adding a location.
Example
class Company:
    def __init__(self, j):
        self.locations = []

    def add_location(self, location):
        if isinstance(location, Location):
            self.locations.append(location)
        else:
            raise TypeError("argument 'location' should be a Location object")
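A quick usage sketch (passing None for the question's unused j parameter):

c = Company(None)
loc = Location(None)
loc.city = "New York"
c.add_location(loc)    # accepted
c.add_location("foo")  # raises TypeError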
Properties
Another OOP concept you can use is a property. Properties are a simple way to define getters and setters for your instance attributes.
Example
Suppose we want to enforce a certain format for a Location.zip attribute
class Location:
    def __init__(self):
        self._zip = None

    @property
    def zip(self):
        return self._zip

    @zip.setter
    def zip(self, value):
        if some_condition_on_value:
            self._zip = value
        else:
            raise ValueError('Incorrect format')

    @zip.deleter
    def zip(self):
        self._zip = None
Notice that the attribute Location()._zip is still accessible and writable. While the underscore denotes what should be a private attribute, nothing is really private in Python.
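To illustrate both points, here is a minimal sketch that pretends some_condition_on_value is a five-digit check:

loc = Location()
loc.zip = "10001"       # accepted by the setter
try:
    loc.zip = "not-a-zip"
except ValueError:
    print("rejected by the zip setter")
loc._zip = "not-a-zip"  # bypasses the setter entirely: the underscore is only a convention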
Final word
Due to Python's high introspection capabilities, nothing will ever be totally safe. You will have to sit down with your team and discuss the tools and practices you want to adopt.
Nothing is really private in python. No class or class instance can
keep you away from all what's inside (this makes introspection
possible and powerful). Python trusts you. It says "hey, if you want
to go poking around in dark places, I'm gonna trust that you've got a
good reason and you're not making trouble."
After all, we're all consenting adults here.
--- Karl Fast

You could also define a new class ListOfLocations that performs the safety checks. Something like this:
class ListOfLocations(list):
    def append(self, l):
        if not isinstance(l, Location):
            raise TypeError("Location required here")
        super().append(l)
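Note that this guards append only; list has other mutators (extend, insert, +=, slice assignment) that would need the same treatment. A short usage sketch, reusing the question's Location:

locations = ListOfLocations()
locations.append(Location(None))  # accepted
locations.append("foo")           # raises TypeError
locations.extend(["foo"])         # caveat: extend() is inherited unchecked from list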

Related

Attribute name change for inherited classes. Possible/Bad practice?

Q1. If I have a very general class, with an attribute whose name could be better represented in more specific inherited classes, how can I access the same methods from the parent class if the attribute has changed its name? For example (not my real scenario, but it shows what I mean).
class Entity(object):
    def __init__(self):
        self.members = {}

    ...  # Methods that use self.members

class School(Entity):
    def __init__(self):
        super(School, self).__init__()

class Company(Entity):
    def __init__(self):
        super(Company, self).__init__()
for class School and for class Company, I would like to be able to use attributes that are more specific, such as self.students and self.employees, but that still work with the methods that were defined for self.members in the class Entity.
Q2. Would this be bad practice? What would be the best way to approach this? In my real case, the word I used for self.members is too general.
Renaming an attribute in a subclass is bad practice in general.
The reason is that inheritance is about substitutability. What it means for a School to be an Entity is that you can use a School in any code that was written to expect an Entity and it will work properly.
For example, typical code using an Entity might do something like this:
for member in entity.members:
If you have something that claims to be an Entity (and even passes isinstance(myschool, Entity)), but it either doesn't have members, or has an empty members, because its actual members are stored in some other attribute, then that code is broken.
More generally, if you change the interface (the set of public methods and attributes) between a base class and derived class, the derived class isn't a subtype, which means it usually shouldn't be using inheritance in the first place.1
If you make students into an alias for members, so the same attribute can be accessed under either name, then you do have a subtype: a School has its students as members, and therefore it can be sensibly used with code that expects an Entity:
myschool.students.append(Person('cap'))
# ...
for member in myschool.members:
    # now cap is going to show up here
And this works just as well with methods defined in Entity:
def slap_everyone(self):
    for member in self.members:
        # this will include cap
        member.slap()

myschool.slap_everyone()
And you can do this by using @property.
class School(Entity):
    # ...
    @property
    def students(self):
        return self.members

    @students.setter
    def students(self, val):
        self.members = val

    @students.deleter
    def students(self):
        del self.members
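A short sketch of the aliasing in action, assuming Entity initializes self.members as a list:

school = School()
school.students.append('cap')
print(school.members)    # ['cap'] -- students and members are the same list
school.students = ['marvel']
print(school.members)    # ['marvel'] -- the setter rebinds members as well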
So, this isn't flat-out invalid or anything.
But it is potentially misleading.
Will it be obvious to readers of your code that adding cap to myschool.students is going to add him to myschool.members? If so, it's probably OK. If not, or if you're not sure, then you probably shouldn't do this.
Another thing to consider is that a School might have multiple kinds of members: students, teachers, administrators, dropouts who hang around their old campus because they don't know where else to find drug dealers, … If that's part of your design, then what you really want is for members to be a property, and probably a read-only property at that,2 and each subclass can define what counts as "members" in a way that makes sense for that subclass.
class Entity(object):
    @property
    def members(self):
        return []

    def rollcall(self):
        return ', '.join(self.members)

class School(Entity):
    def __init__(self):
        super(School, self).__init__()
        self.students, self.teachers = [], []

    @property
    def members(self):
        return self.students + self.teachers
school = School()
school.teachers.append('cap')
school.students.extend(['marvel', 'america, planet'])
print(school.rollcall())
This will print out:
cap, marvel, america, planet
That school is working as a School, and as an Entity, and everything is good.
1. I say usually because (regardless of what OO dogma says) there are other reasons for subclassing besides subtyping. But it's still the main reason. And in this case, there doesn't appear to be any other reason for subclassing—you're not trying to share storage details, or provide overriding hooks, or anything like that.
2. In fact, you might even want to drag in the abc module and make it an abstract property… but I won't show that here.

Creating a python class with ONLY read-only instance attributes

When developing code for test automation, I often transform responses from the SUT from XML / JSON / whatever to a Python object model to make working with it afterwards easier.
Since the client should not alter the information stored in the object model, it would make sense to have all instance attributes read-only.
For simple cases, this can be achieved by using a namedtuple from the collections module. But in most cases, a simple namedtuple won't do.
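For reference, the simple case might look like this (a minimal sketch with made-up fields):

from collections import namedtuple

Response = namedtuple('Response', ['status', 'payload'])
r = Response(status=200, payload={'id': 1})
print(r.status)  # read access works
r.status = 404   # raises AttributeError: can't set attribute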
I know that the probably most pythonic way would be to use properties:
class MyReadOnlyClass(object):
    def __init__(self, a):
        self.__a = a

    @property
    def a(self):
        return self.__a
This is OK if I'm dealing only with a few attributes, but it gets lengthy pretty soon.
So I was wondering if there would be any other acceptable approach? What I came up with was this:
MODE_RO = "ro"
MODE_RW = "rw"

class ReadOnlyBaseClass(object):
    __mode = MODE_RW

    def __init__(self):
        self.__mode = MODE_RO

    def __setattr__(self, key, value):
        if self.__mode != MODE_RW:
            raise AttributeError("May not set attribute")
        else:
            self.__dict__[key] = value
I could then subclass it and use it like this:
class MyObjectModel(ReadOnlyBaseClass):
    def __init__(self, a):
        self.a = a
        super(MyObjectModel, self).__init__()
After the super call, adding or modifying instance attributes is not possible (... that easily, at least).
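A short demonstration of the intended behaviour, assuming the classes above:

m = MyObjectModel(42)
print(m.a)  # 42
m.a = 99    # raises AttributeError: May not set attribute
m.b = 1     # adding new attributes is blocked the same way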
A possible caveat I came to think about is that if someone were to modify the __mode attribute and set it to MODE_RO, no new instances could be created. But that seems acceptable since it's clearly marked as "private" (in the Python way).
I would be interested if you see any more problems with this solution, or have completely different and better approaches.
Or would you discourage this altogether (with an explanation, please)?

Call Python Method on Class Attribute Change

I'm writing an API parsing Twitter bot and am very new to OOP. I have some existing Python code that relies on global variables and figured I could take this opportunity to learn.
I have the following Team class that gets updated when the API is parsed, and I'd like to be able to call a totally unrelated (external) method when a class attribute changes.
class Team(object):
    def __init__(self, team_name, tri_code, goals, shots, goalie_pulled):
        self.team_name = team_name
        self.tri_code = tri_code
        self.goals = goals
        self.shots = shots
        self.goalie_pulled = goalie_pulled
When goalie_pulled is changed for an existing instance of Team I'd like the following method to be called (pseudo code):
def goalie_pulled_tweet(team):
    tweet = "{} has pulled their goalie with {} remaining!".format(team.team_name, game.period_remain)
    send_tweet(tweet)
Two things -
How do I call goalie_pulled_tweet from within my Team class once I detect that goalie_pulled attribute has changed?
Can I access an instance of my Game object from anywhere or does it need to be passed to that variable as well?
You should take a look at the property class. Basically, it lets you encapsulate behaviour and private members without the consumer even noticing it.
In your example, you may have a goalie_pulled property:
class Team(object):
    def __init__(self, team_name, tri_code, goals, shots, goalie_pulled):
        # Notice the indentation here. This is very important.
        self.team_name = team_name
        self.tri_code = tri_code
        self.goals = goals
        self.shots = shots
        # Prefix your field with an underscore, this is the standard Python way of marking private members
        self._goalie_pulled = goalie_pulled

    @property
    def goalie_pulled(self):
        return self._goalie_pulled

    @goalie_pulled.setter
    def goalie_pulled(self, new_value):
        self._goalie_pulled = new_value
        goalie_pulled_tweet(self)  # self is the current Team instance
From the consumer's point of view:
team = create_team_instance()
# goalie_pulled_tweet is called
team.goalie_pulled = 'some_value'
I'd recommend using properties whenever you can, as they are a nice form of abstraction.
From a design standpoint, it would make more sense to have a pull_goalie method.
Classes are a tool to create more meaningful abstractions. Pulling a goalie is an action. If you think of Team as representing a real-life team, it makes more sense to say "The team pulled their goalie!" rather than "The team set their pulled-goalie attribute to X player!"
class Team(object):
    ...
    def pull_goalie(self, player):
        self.pulled_goalie = player
        tweet = '<your format string>'.format(
            self.pulled_goalie,
            # Yes, your Team *could* store a reference to the current game.
            # It's hard to tell if that makes sense in your program without more context.
            self.game.period_remaining
        )
I was going to recommend a property, but I think that would solve the immediate problem without considering broader design clarity.
NOTE: There is no such thing as a "private" attribute in Python. There is a convention that attributes beginning with a single underscore (self._pulled_goalie) are treated as private, but that's just so that people using your code know they can't depend on that value always doing what they think it will. (i.e., it's not part of the public contract of your code, and you can change it without warning.)
EDIT: To create a register_team method on a Game object, you might do something like this:
class Game(object):
    def __init__(<stuff>):
        ...
        self.teams = {}
        ...

    def register_team(self, team):
        if len(self.teams) > 1:
            raise ValueError(
                "Too many teams! Cannot register {} for {}"
                .format(team, self)
            )
        self.teams[team] = team
        team.game = self

    def unregister_team(self, team):
        try:
            del self.teams[team]
        except KeyError:
            pass
        team.game = None
Note that by using a dictionary, register_team and unregister_team can be called multiple times without ill effect.

Wrapping a python class around JSON data, which is better?

Preamble: I'm writing a python API against a service that delivers JSON.
The files are stored in JSON format on disk to cache the values.
The API should sport classful access to the JSON data, so IDEs and users can have a clue what (read-only) attributes there are in the object before runtime while also providing some convenience functions.
Question: I have two possible implementations, I'd like to know which is nicer or 'pythonic'. While I like both, I am open for suggestions, if you come up with a better solution.
First Solution: defining and inheriting JsonDataWrapper. While nice, it is pretty verbose and repetitive.
class JsonDataWrapper:
    def __init__(self, json_data):
        self._data = json_data

    def get(self, name):
        return self._data[name]

class Course(JsonDataWrapper):
    def __init__(self, data):
        super().__init__(data)
        self._users = {}  # class omitted
        self._groups = {}  # class omitted
        self._assignments = {}

    @property
    def id(self): return self.get('id')

    @property
    def name(self): return self.get('full_name')

    @property
    def short_name(self): return self.get('short_name')

    @property
    def users(self): return self._users

    @users.setter
    def users(self, data):
        users = [User(u) for u in data]
        for user in users:
            self.users[user.id] = user
            # self.groups = user  # this does not make much sense without the rest of the code (It works, but that decision will be revised :D)
Second solution: using lambda for shorter syntax. While working and short, it does not quite look right (see edit1 below.)
def json(name): return property(lambda self: self.get(name))
class Group(JsonDataWrapper):
    def __init__(self, data):
        super().__init__(data)
        self.group_members = []  # elements are of type(User). edit1, was self.members = []

    id = json('id')
    description = json('description')
    name = json('name')
    description_format = json('description_format')
(Naming this function 'json' is not a problem, since I don't import json there.)
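A short usage sketch of this second solution, with a made-up payload:

g = Group({'id': 1, 'name': 'devs', 'description': '', 'description_format': 1})
print(g.name)   # 'devs', served by the generated property
g.name = 'ops'  # raises AttributeError: a property without a setter is read-only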
I have a possible third solution in mind, that I cant quite wrap my head around: overriding the property builtin, so I can define a decorator that wraps the returned field name for lookup:
@json  # just like a property fget
def short_name(self): return 'short_name'
That could be a little shorter, dunno if that makes code better.
Disqualified solutions (IMHO):
JSON{De,En}coder: kills all flexibility, provides no means of read-only attributes
__{get,set}attr__: makes it impossible to determine attributes before runtime. While it would shorten self.get('id') to self['id'], it would also further complicate matters where an attribute is not in the underlying JSON data.
Thank you for reading!
Edit 1: 2016-07-20T08:26Z
To further clarify (@SuperSaiyan) why I don't quite like the second solution:
I feel the lambda function is completely disconnected from the rest of the class's semantics (which is also the reason why it is shorter :D). I think I can help myself like it more by properly documenting the decision in the code. The first solution is easy to understand for everybody who understands the meaning of @property, without any additional explanation.
On the second comment of @SuperSaiyan: you ask why I put Group.members as an attribute in there? The list stores type(User) entities, which might not be what you think it is; I changed the example.
@jwodder: I will use Code Review next time, did not know that was a thing.
(Also: I really think the Group.members threw some of you off, I edited the code to make it a little more obvious: Group members are Users that will be added to the list.
The complete code is on github, while undocumented it may be interesting for somebody. Keep in mind: this is all WIP :D)
(note: this got an update, I'm now using dataclasses with run-time type enforcement. see bottom :3)
So, it's been a year and I'm going to answer my own question. I don't quite like answering it myself, but: this will mark the thread as resolved which in itself might help others.
On the other hand, I want to document and give reason to why I chose my solution over proposed answers. Not, to prove me right, but to highlight the different tradeoffs.
I just realized, that this got quite long, so:
tl;dr
collections.abc contains powerful abstractions and you should use them if you have access to it (cpython >= 3.3).
@property is nice to use, makes it easy to add documentation, and provides read-only access.
Nested classes look weird but replicate the structure of deeply nested JSON just fine.
Proposed solutions
python meta-classes
So first off: I love the concept.
I've considered many applications for where they prove useful, especially when:
writing a pluggable API where meta-classes enforce correct usage of derived classes and their implementation specifics
having a fully automated registry of classes that derive from a meta-class.
On the other hand, Python's meta-class logic felt obscure to wrap my head around (it took me at least three days to figure it out). While simple in principle, the devil is in the details.
So, I decided against it, simply because I might abandon the project in the not so far future and others should be able to pick up where I left off easily.
namedtuple
collections.namedtuple is very efficient and concise enough to boil my solution down to several lines instead of the current 800+ lines. My IDE will also be able to introspect possible members of the generated class.
Cons: the brevity of namedtuple leaves much less room for the awfully necessary documentation of the API's returned values. So with less insane APIs you will possibly get away with just that.
It also feels weird to nest class objects into the namedtuple, but that's just personal preference.
What I went with
So in the end, I chose to stick to my first original solution with a few minor details added, if you find the details interesting, you can look at the source on github.
collections.abc
When I started the project, my python knowledge was next to none, so I went with what I knew about python ("everything is a dict") and wrote code like that. For example: classes that work like a dict, but have a file structure underneath (that was before pathlib).
While looking through python's code I noticed how they implement and enforce container "traits" through abstract base classes which sounds far more complicated than it really is in python.
the very basics
The following is indeed very basic, but we'll build up from there.
from collections.abc import Mapping, Sequence, Sized

class JsonWrapper(Sized):
    def __len__(self):
        return len(self._data)

    def __init__(self, json):
        self._data = json

    @property
    def raw(self): return self._data
The most basic class I could come up with, this will just enable you to call len on the container. You also can get read-only access through raw if you really want to bother with the underlying dictionary.
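A two-line sketch of what this already buys us:

w = JsonWrapper({'id': 1, 'name': 'intro'})
print(len(w))  # 2, via __len__
print(w.raw)   # {'id': 1, 'name': 'intro'}, the underlying dict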
So why am I inheriting from Sized instead of just starting from scratch and defining __len__ just like that?
Not overriding __len__ means the class cannot be instantiated: Python raises a TypeError as soon as you try, so you fail early instead of getting screwed deep at runtime.
While Sized does not provide any mixin methods, the next two abstractions do provide them. I'll explain there.
With that down, we've only got two more basic cases to cover: JSON lists and dicts.
Lists
So, with the API I had to worry about, we were not always sure what we got; so I wanted a way of checking whether I got a list when initializing the wrapper class, mostly to abort early instead of hitting "object has no member" during more complicated processes.
Deriving from Sequence will enforce overriding __getitem__ and __len__ (which is already implemented in JsonWrapper).
class JsonListWrapper(JsonWrapper, Sequence):
    def __init__(self, json_list):
        if type(json_list) is not list:
            raise TypeError('received type {}, expected list'.format(type(json_list)))
        super().__init__(json_list)

    def __getitem__(self, index):
        return self._data[index]

    def __iter__(self):
        raise NotImplementedError('__iter__')

    def get(self, index):
        try:
            return self._data[index]
        except Exception as e:
            print(index)
            raise e
So you might have noted that I chose not to implement __iter__.
I wanted an iterator that yielded typed objects, so my IDE is able to autocomplete. To illustrate:
class CourseListResponse(JsonListWrapper):
    def __iter__(self):
        for course in self._data:
            yield self.Course(course)

    class Course(JsonDictWrapper):
        pass  # for now
Once you implement the abstract methods of Sequence, the mixin methods __contains__, __reversed__, index and count are gifted to you, so you don't have to worry about possible side-effects.
Dictionaries
To complete the basic types to wrangle JSON, here's the class derived from Mapping:
class JsonDictWrapper(JsonWrapper, Mapping):
    def __init__(self, json_dict):
        super().__init__(json_dict)
        if type(self._data) is not dict:
            raise TypeError('received type {}, expected dict'.format(type(json_dict)))

    def __iter__(self):
        return iter(self._data)

    def __getitem__(self, key):
        return self._data[key]

    __marker = object()

    def get(self, key, default=__marker):
        try:
            return self._data[key]
        except KeyError:
            if default is self.__marker:
                raise
            else:
                return default
Mapping only enforces __iter__, __getitem__ and __len__.
To avoid confusion: There is also MutableMapping which will enforce the writing methods. But that's neither needed nor wanted here.
With the abstract methods out of the way, python provides the mixins __contains__, keys, items, values, get, __eq__, and __ne__ based on them.
I'm not sure why I chose to override the get mixin; I might update the post when it comes back to me.
__marker serves as a sentinel to detect whether the default keyword was set at all. If somebody decided to call get(key, default=None), you wouldn't be able to detect that otherwise.
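To make the gifted behaviour concrete, a short sketch using the class above:

w = JsonDictWrapper({'id': 7, 'name': 'intro'})
print('id' in w)             # True, the __contains__ mixin
print(sorted(w.keys()))      # ['id', 'name'], the keys() mixin
print(w.get('missing', 42))  # 42, explicit default
w.get('missing')             # raises KeyError, thanks to the __marker sentinel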
So to pick up the previous example:
class CourseListResponse(JsonListWrapper):
    # [...]
    class Course(JsonDictWrapper):
        # Jn is just a class that contains the keys for the JSON, so I only mistype once.
        @property
        def id(self): return self[Jn.id]

        @property
        def short_name(self): return self[Jn.short_name]

        @property
        def full_name(self): return self[Jn.full_name]

        @property
        def enrolled_user_count(self): return self[Jn.enrolled_user_count]
        # [...] you get the idea
The properties provide read-only access to members and can be documented like a function definition.
Although verbose, for basic accessors you can easily define a template in your editor, so it's less tedious to write.
Properties also allow you to abstract away magic numbers and optional JSON return values, providing defaults instead of guarding for KeyError everywhere:
@property
def isdir(self): return 1 == self[Jn.is_dir]

@property
def time_created(self): return self.get(Jn.time_created, 0)

@property
def file_size(self): return self.get(Jn.file_size, -1)

@property
def author(self): return self.get(Jn.author, "")

@property
def license(self): return self.get(Jn.license, "")
class nesting
It seems a little weird to nest classes in others.
I chose to do that, because the API uses the same name for various objects with different attributes, depending on which remote function you called.
Another benefit: new people can easily understand the structure of the returned JSON.
The end of the file contains various aliases to the nested classes for easier access from outside the module.
adding logic
Now that we have encapsulated most of the returned values, I wanted to have more logic associated with the data, to add some convenience.
It also seemed necessary to merge some of the data into a more comprehensive tree that contained all of the data gathered through several API calls:
get all "assignments". each assignment contains many submissions, so:
for(assignment in assignments) get all "submissions"
merge submissions into respective assignment.
now get grades for the submissions, and so on...
I chose to implement them separately, so I just inherited from the "dumb" accessors (full source):
So in this class
class Assignment(MoodleAssignment):
    def __init__(self, data, course=None):
        super().__init__(data)
        self.course = course
        self._submissions = {}  # accessed via submission.id
        self._grades = {}  # accessed via user_id
these properties do the merging
@property
def submissions(self): return self._submissions

@submissions.setter
def submissions(self, data):
    if data is None:
        self._submissions = {}
        return
    for submission in data:
        sub = Submission(submission, assignment=self)
        if sub.has_content:
            self.submissions[sub.id] = sub

@property
def grades(self):
    return self._grades

@grades.setter
def grades(self, data):
    if data is None:
        self._grades = {}
        return
    grades = [Grade(g) for g in data]
    for g in grades:
        self.grades[g.user_id] = g
and these implement some logic that can be abstracted from the data.
@property
def is_due(self):
    now = datetime.now()
    return now > self.due_date

@property
def due_date(self): return datetime.fromtimestamp(super().due_date)
While the setters obscure the wrangling, they are nice to write and use: so it's just a trade-off.
Caveat: The logic implementation is not quite what I want it to be; there's much interdependence where there should not be. It's grown from me not knowing enough of Python to get the abstractions right while still getting things done, so I could do the actual work with the tedium out of my way.
Now that I know, what could have been done: I look at some of that spaghetti, and well … you know the feeling.
Conclusion
Encapsulating the JSON into classes proved quite useful to me and the project's structure, and I'm quite happy with it.
The rest of the project is fine and works, although some parts are just awful :D
Thank you all for the feedback, I'll be around for questions and remarks.
update: 2019-05-02
As @RickTeachey points out in the comments, Python's dataclasses (DCs) can be used here as well.
And I forgot to put an update here, since I already did that some time ago and extended it with Python's typing functionality :D
Reason for that: I was growing tired of manually checking whether the documentation of the API I was abstracting from was correct, or whether I had got my implementation wrong.
With dataclasses.fields I'm able to check if the response conforms to my schema; and now I'm able to find changes in the external API much faster, since the assumptions are checked at run-time on instantiation.
DCs provide a __post_init__(self) hook for post-processing once __init__ has completed successfully. Python's type hints are only there to provide hints for static checkers, so I built a little system that enforces the types on dataclasses in the post-init phase.
Here is the BaseDC, from which all other DCs inherit (abbreviated):
import dataclasses as dc
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class BaseDC:
    def _typecheck(self):
        for field in dc.fields(self):
            expected = field.type
            f = getattr(self, field.name)
            actual = type(f)
            if expected is list or expected is dict:
                log.warning(f'untyped list or dict in {self.__class__.__qualname__}: {field.name}')
            if expected is actual:
                continue
            if is_generic(expected):
                return self._typecheck_generic(expected, actual)
            # Subscripted generics cannot be used with class and instance checks
            if issubclass(actual, expected):
                continue
            print(f'mismatch {field.name}: should be: {expected}, but is {actual}')
            print(f'offending value: {f}')

    def __post_init__(self):
        for field in dc.fields(self):
            castfunc = field.metadata.get('castfunc', False)
            if castfunc:
                attr = getattr(self, field.name)
                new = castfunc(attr)
                setattr(self, field.name, new)
        if DEBUG:
            self._typecheck()
Fields have an additional metadata attribute that is allowed to store arbitrary information; I'm using it to store functions that convert the response value, but more on that later.
A basic response wrapper looks like this:
@dataclass
class DCcore_enrol_get_users_courses(BaseDC):
    id: int  # id of course
    shortname: str  # short name of course
    fullname: str  # long name of course
    enrolledusercount: int  # Number of enrolled users in this course
    idnumber: str  # id number of course
    visible: int  # 1 means visible, 0 means hidden course
    summary: Optional[str] = None  # summary
    summaryformat: Optional[int] = None  # summary format (1 = HTML, 0 = MOODLE, 2 = PLAIN or 4 = MARKDOWN)
    format: Optional[str] = None  # course format: weeks, topics, social, site
    showgrades: Optional[int] = None  # true if grades are shown, otherwise false
    lang: Optional[str] = None  # forced course language
    enablecompletion: Optional[int] = None  # true if completion is enabled, otherwise false
    category: Optional[int] = None  # course category id
    progress: Optional[float] = None  # Progress percentage
    startdate: Optional[int] = None  # Timestamp when the course starts
    enddate: Optional[int] = None  # Timestamp when the course ends

    def __str__(self): return f'{self.fullname[0:39]:40} id:{self.id:5d} short: {self.shortname}'

core_enrol_get_users_courses = destructuring_list_cast(DCcore_enrol_get_users_courses)
Responses that are just lists were giving me trouble in the beginning, since I could not enforce type checking on them with a plain List[DCcore_enrol_get_users_courses].
This is where the destructuring_list_cast solves that problem for me, which is a little more involved. We're entering higher order function territory:
import typing
from typing import List

T = typing.TypeVar('T')

def destructuring_list_cast(cls: typing.Callable[[dict], T]) -> typing.Callable[[list], List[T]]:
    def cast(data: list) -> List[T]:
        if data is None:
            return []
        if not isinstance(data, list):
            raise SystemExit(f'listcast expects a list, you sent: {type(data)}')
        try:
            return [cls(**entry) for entry in data]
        except TypeError as err:
            # here is more code that explains errors
            raise SystemExit(f'listcast for class {cls} failed:\n{err}')
    return cast
This expects a Callable that accepts a dict and returns a class instance of type T, which is something what you'd expect from a constructor or a factory.
It returns a Callable that will accept a list, here it's cast.
return [cls(**entry) for entry in data] does all the work here, by constructing a list of dataclasses, when you call core_enrol_get_users_courses(response.json()).
(Throwing SystemExit is not nice, but that's handled in the upper layers, so it works for me; I want that to fail hard and fast.)
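A short usage sketch (assuming the abbreviated pieces such as DEBUG and log are filled in):

raw = [{'id': 1, 'shortname': 'c1', 'fullname': 'Course One',
        'enrolledusercount': 10, 'idnumber': '', 'visible': 1}]
courses = core_enrol_get_users_courses(raw)
print(courses[0].shortname)  # 'c1', a fully typed dataclass instance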
Its other use case is to define nested fields, for when the responses are deeply nested: remember the field.metadata.get('castfunc', False) in the BaseDC? That's where these two shortcuts come in:
# destructured_cast_field
def dcf(cls):
    return dc.field(metadata={'castfunc': destructuring_list_cast(cls)})

def optional_dcf(cls):
    return dc.field(metadata={'castfunc': destructuring_list_cast(cls)}, default_factory=list)
These are used in nested cases like this (see bottom):
@dataclass
class core_files_get_files(BaseDC):
    @dataclass
    class parent(BaseDC):
        contextid: int
        # abbrev ...

    @dataclass
    class file(BaseDC):
        contextid: int
        component: str
        timecreated: Optional[int] = None  # Time created
        # abbrev ...

    parents: List[parent] = dcf(parent)
    files: Optional[List[file]] = optional_dcf(file)
Have you considered using a meta-class?
class JsonDataWrapper(object):
    def __init__(self, json_data):
        self._data = json_data

    def get(self, name):
        return self._data[name]

class JsonDataWrapperMeta(type):
    def __init__(self, name, base, dict):
        for mbr in self.members:
            # bind mbr as a default argument; a plain closure would make
            # every property look up the *last* name in the list
            prop = property(lambda self, mbr=mbr: self.get(mbr))
            setattr(self, mbr, prop)

# You can use the metaclass inside a class block
class Group(JsonDataWrapper, metaclass=JsonDataWrapperMeta):
    members = ['id', 'description', 'name', 'description_format']

# Or more programmatically
def jsonDataFactory(name, members):
    d = {"members": members}
    return JsonDataWrapperMeta(name, (JsonDataWrapper,), d)

Course = jsonDataFactory("Course", ["id", "name", "short_name"])
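A short usage sketch of both variants, with made-up payloads:

g = Group({'id': 1, 'description': 'd', 'name': 'devs', 'description_format': 1})
print(g.name)        # 'devs'
c = Course({'id': 3, 'name': 'Intro', 'short_name': 'intro'})
print(c.short_name)  # 'intro'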
When developing an API like this, in which all the members are read-only (meaning you do not want them overwritten, though they may still be mutable data structures), I have often found collections.namedtuple hard to beat unless I have a very good reason to do otherwise. It is fast, and needs a bare minimum of code.
from collections import namedtuple as nt
Group = nt('Group', 'id name shortname users')
g = Group(**json)
Simple.
If there is more data in your json than will be used in the object, just filter it out:
g = Group(**{k:v for k,v in json.items() if k in Group._fields})
If you want defaults for missing data, you can do that, too:
Group.__new__.__defaults__ = (0, 'DefaultName', 'DefN', None)
# now this works:
g = Group()
# and now this will still work even if some keys are missing;
g = Group(**{k:v for k,v in json.items() if k in Group._fields})
One gotcha using the above technique of setting defaults: don't set the default value for one of the members to any mutable object, such as a list, because it will be the same mutable shared object across all instances:
# don't do this:
Group.__new__.__defaults__ = (0, 'DefaultName', 'DefN', [])
g1 = Group()
g2 = Group()
g1.users.append(user1)
g2.users # output: [user1] <-- whoops!
Instead, wrap it all up in a nice factory that instantiates a new list (or dict or whatever user-defined data structure you need) for the members that need them:
# jsonfactory.py
from collections import namedtuple as nt

new_list = object()

def JsonClassFactory(name, *args, defaults=None):
    '''Produces a new namedtuple class. Any members
    intended to default to a blank list should be set to
    the new_list object.
    '''
    cls = nt(name, *args)
    if defaults is not None:
        cls.__new__.__defaults__ = tuple(([] if d is new_list else d) for d in defaults)
    return cls
Now given some json object that defines the fields you want present:
from jsonfactory import JsonClassFactory, new_list
MyJsonClass = JsonClassFactory('MyJsonClass', *json_definition,
                               defaults=(0, 'DefaultName', 'DefN', new_list))
And then as before:
obj = MyJsonClass(**json)
OR, if there is extra data:
obj = MyJsonClass(**{k:v for k,v in json.items() if k in MyJsonClass._fields})
If you want the default container to be something other than a list, this is simple enough- just replace the new_list sentinel with whatever sentinel you wish. If needed you could have multiple sentinels at the same time.
And if you still need extra functionality, you can always extend your MyJsonClass:
class ExtJsonClass(MyJsonClass):
    __slots__ = ()  # optional- needed if you want the low memory benefits of namedtuple

    def __new__(cls, *args, **kwargs):
        self = super().__new__(cls, *args, **{k: v for k, v in kwargs.items()
                                              if k in cls._fields})
        return self

    def add_user(self, user):
        self.users.append(user)
The __new__ method above takes care of the missing data problem for good. So now you can always just do this:
obj = ExtJsonClass(**json)
Simple.
I myself am a newbie in Python, so excuse me if I sound naive. One possible solution is using __dict__, as discussed in the article below:
https://www.safaribooksonline.com/library/view/python-cookbook-3rd/9781449357337/ch06s02.html
Of course this solution will create issues if there are objects inside a class which belong to other classes and need to be serialized or de-serialized. I would love to hear the opinions of the experts here on this solution and its limitations.
Any feedback on jsonpickle?
Update:
I just saw your objection about the serialization and how you don't like it as everything is runtime. Understood. Thanks a lot.
Below is the code I wrote to get around that. A bit of a stretch, but it works well and I do not have to add get/set every time!
import json

class JSONObject:
    exp_props = {"id": "", "title": "Default"}

    def __init__(self, d):
        self.__dict__ = d
        for key in [x for x in JSONObject.exp_props if x not in self.__dict__]:
            setattr(self, key, JSONObject.exp_props[key])

    @staticmethod
    def fromJSON(s):
        return json.loads(s, object_hook=JSONObject)

    def toJSON(self):
        return json.dumps(self.__dict__, indent=4)
s = '{"name": "ACME", "shares": 50, "price": 490.1}'
anObj = JSONObject.fromJSON(s)
print("Name - {}".format(anObj.name))
print("Shares - {}".format(anObj.shares))
print("Price - {}".format(anObj.price))
print("Title - {}".format(anObj.title))
sAfter = anObj.toJSON()
print("Type of dumps is {}".format(type(sAfter)))
print(sAfter)
Results below
Name - ACME
Shares - 50
Price - 490.1
Title - Default
Type of dumps is <type 'str'>
{
"price": 490.1,
"title": "Default",
"name": "ACME",
"shares": 50,
"id": ""
}

How can I copy a python class property?

In an attempt to discover the boundaries of Python as a language I'm exploring whether it is possible to go further with information hiding than the convention of using a leading underscore to denote 'private' implementation details.
I have managed to achieve some additional level of privacy of fields and methods using code such as this to copy 'public' stuff from a locally defined class:
from __future__ import print_function

class Dog(object):
    def __init__(self):
        class Guts(object):
            def __init__(self):
                self._dog_sound = "woof"
                self._repeat = 1

            def _make_sound(self):
                for _ in range(self._repeat):
                    print(self._dog_sound)

            def get_repeat(self):
                return self._repeat

            def set_repeat(self, value):
                self._repeat = value

            @property
            def repeat(self):
                return self._repeat

            @repeat.setter
            def repeat(self, value):
                self._repeat = value

            def speak(self):
                self._make_sound()

        guts = Guts()

        # Make public methods
        self.speak = guts.speak
        self.set_repeat = guts.set_repeat
        self.get_repeat = guts.get_repeat
dog = Dog()
print("Speak once:")
dog.speak()
print("Speak twice:")
dog.set_repeat(2)
dog.speak()
However, I'm struggling to find a way to do the same for the property setter and getter.
I want to be able to write code like this:
print("Speak thrice:")
dog.repeat = 3
dog.speak()
and for it to actually print 'woof' three times.
I've tried all of the following in Dog.__init__, none of which blow up, but neither do they seem to have any effect:
Dog.repeat = guts.repeat
self.repeat = guts.repeat
Dog.repeat = Guts.repeat
self.repeat = Guts.repeat
self.repeat = property(Guts.repeat.getter, Guts.repeat.setter)
self.repeat = property(Guts.repeat.fget, Guts.repeat.fset)
Descriptors only work when defined on the class, not the instance. See this previous question and this one and the documentation that abarnert already pointed you to. The key statement is:
For objects, the machinery is in object.__getattribute__() which transforms b.x into type(b).__dict__['x'].__get__(b, type(b)).
Note the reference to type(b). There is only one property object for the whole class, and the instance information is passed in at access time.
That means you can't have a property on an individual Dog instance that deals only with that particular dog's guts. You have to define a property on the Dog class, and have it access the guts of the individual dog via self. Except you can't do that with your setup, because you're not storing a reference to the dog's guts on self, because you're trying to hide the guts.
The bottom line is that you can't effectively proxy attribute access to an underlying "guts" object without storing a reference to that object on the "outward-facing" object. And if you store such a reference, people can use it to modify the guts in an "unauthorized" way.
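For contrast, the working (but non-hiding) shape described above looks roughly like this, with a simplified Guts hoisted out of Dog.__init__:

class Guts(object):
    def __init__(self):
        self._repeat = 1

class Dog(object):
    def __init__(self):
        self._guts = Guts()  # the reference has to live on the instance

    @property
    def repeat(self):
        return self._guts._repeat

    @repeat.setter
    def repeat(self, value):
        self._guts._repeat = value

dog = Dog()
dog.repeat = 3  # works, because the property is defined on the class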
Also, sad to say, even your existing example doesn't really protect the guts. I can do this:
>>> d = Dog()
>>> d.speak.__self__._repeat = 3
>>> d.speak()
woof
woof
woof
Even though you try to hide the guts by exposing only the public methods of the guts, those public methods themselves contain a reference to the actual Guts object, allowing anyone to sneak in and modify the guts directly, bypassing your information-hiding scheme. This is why it's futile to try to enforce information-hiding in Python: core parts of the language like methods already expose a lot of stuff, and you just can't plug all those leaks.
You could set up a system with descriptors and a metaclass, where the Dog metaclass creates descriptors for all the public attributes of the class and constructs a SecretDog class containing all the private methods, then have each Dog instance keep a shadow SecretDog instance, tracked by the descriptors, that houses your 'private' implementation. However, this would be going an awfully long way to "secure" the private implementation in a language that by its nature can't really have anything private. You'll also have a hell of a time getting inheritance to work reliably.
Ultimately, if you want to hide a private implementation in Python, you should probably be writing it as a C extension (or not trying to in the first place). If your goal is a deeper understanding of the language, looking at writing a C extension isn't a bad place to start.
You will need something like this
>>> class hi:
...     def __setattr__(self, attr, value):
...         getattr(self, attr)(value)
...     def haha(self, val):
...         print(val)
...
>>> a = hi()
>>> a.haha = 10
10
just be careful with this
