After much struggle with awful defaults in Sphinx, I finally found a way to display inherited methods in subclass documentation. Unfortunately, this option is global...
autodoc_default_options = {
    ...
    'inherited-members': True,
}
Is there any way to annotate any given class to prevent inherited methods and fields from showing in that documentation?
If there is no way to base it on inheritance, is there any way to simply list all the methods I don't want to be documented for a given class in its docstring?
I'm OK... well, I'd cry a little, but I'd live, if I had to list the methods that need to be documented rather than blacklisting the ones I don't want.
I know I can put :meta private: on a method definition to circumvent its inclusion in documentation (sort of, not really, but let's pretend it works), but in the case of inherited methods there's nowhere I can attach the docstring to.
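For reference, that annotation lives in the member's own docstring; a minimal example:
class Base:
    def helper(self):
        """Do some internal work.

        :meta private:
        """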
Note that any "solution" that involves writing the .. automodule:: sections by hand is not a solution -- those must be generated.
Well, I figured out how to solve half of the problem. Brace yourself for a lot of duct tape...
So, here's an example of how you can hide inherited members in everything that directly or indirectly extends the Exception class:
import sys

def is_error(obj):
    return issubclass(obj, Exception)

conditionally_ignored = {
    '__reduce__': is_error,
    '__init__': is_error,
    'with_traceback': is_error,
    'args': is_error,
}

def skip_member_handler(app, objtype, membername, member, skip, options):
    ignore_checker = conditionally_ignored.get(membername)
    if ignore_checker:
        # walk up the stack to find autodoc's filter_members frame, then
        # pull out the class object currently being documented
        frame = sys._getframe()
        while frame.f_code.co_name != 'filter_members':
            frame = frame.f_back
        suspect = frame.f_locals['self'].object
        return not ignore_checker(suspect)
    return skip

def setup(app):
    app.connect('autodoc-skip-member', skip_member_handler)
Put the above in your conf.py. This assumes that you are using autodoc to generate the documentation.
In the end, I didn't go for controlling this behavior from the docstring of the class being documented. This is just too much work, and there are too many bugs in autodoc, Sphinx builders, the generated output and so on. Just not worth it.
But, the general idea would be to similarly add an event handler for when the source is read, extract information from docstrings, replace docstrings by patched docstrings w/o the necessary information, keep the extracted information somewhere until the skip handler is called, and then implement skipping. But, like I said, there are simply too many things broken along this line.
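For the record, a rough sketch of that idea (untested, and the :hide-members: docstring field is made up for illustration): hook autodoc-process-docstring, stash what the docstring asked to hide, strip the field, and consult the stash from a skip handler like the one above:
_hidden = {}  # full object name -> member names its docstring asked to hide

def docstring_handler(app, what, name, obj, options, lines):
    # look for a hypothetical ":hide-members: foo, bar" field, remember it,
    # and remove it from the rendered docstring
    for i, line in enumerate(lines):
        if line.strip().startswith(':hide-members:'):
            members = line.split(':hide-members:', 1)[1]
            _hidden[name] = {m.strip() for m in members.split(',')}
            del lines[i]
            break

def setup(app):
    app.connect('autodoc-process-docstring', docstring_handler)
    # the autodoc-skip-member handler would then check _hidden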
Another approach would be to emit XML, patch it with XSLT and feed it back to Docutils, but at the time of writing the XML generator is broken (it adds namespaced attributes to the XML while never declaring the namespaces...). An old classic.
Yet another approach would have been handling an event when, say, the document is generated and references are resolved. Unfortunately, at that point you will be missing information such as the types of the things you are working with, the original docstrings, etc. You'll essentially be patching HTML, but with a very convoluted structure.
In Python, I am using a dataclass named MyDataClass to store data returned by an HTTP response. Let's say the response content is JSON like this, and I need only the first two fields:
{
    "name": "Test1",
    "duration": 4321,
    "dont_care": "some_data",
    "dont_need": "some_more_data"
}
and now I have two options:
Option 1
resp: dict = ...  # the response's content, parsed from JSON
my_data_class = MyDataClass(name=resp['name'], duration=resp['duration'])
where I take advantage of the dataclass's automatically generated __init__ method
or
Option 2
resp: dict = ...  # the response's content, parsed from JSON
my_data_class = MyDataClass(resp)
and leave the processing to the dataclass init method, like this:
def __init__(self, resp: dict) -> None:
    self.name: str = resp['name']
    self.duration: int = resp['duration']
I prefer the 2nd option, but I would like to know if there is a right way to do this.
Thanks.
You only need the first two fields for now... until you actually end up needing more. IMO it'll be way easier to let the dataclass's __init__() method take care of that. Otherwise you would have to change BOTH the function call (MyDataClass(name=.....)) AND the dataclass __init__. With the 2nd option you have only one place where you need to intervene.
Unless dont_care/dont_need is huge and you're taking a performance hit because of it... premature optimization is the root of all evil. So keep it simple and flexible as long as you can!
Let's say in the future you want to extract more data from the response and store it in the dataclass. With option 1 you would need to add arguments to the __init__ method as well as update every place where you initialize the dataclass. Option 2 is therefore preferable, since it reduces code redundancy and keeps the data extraction logic in one place.
You should absolutely try to avoid overwriting a dataclass' __init__ function. There is quite a bit of magic that you'll just lose by overwriting it. Among other things, you won't be able to have a proper __post_init__ function call, unless you rewrite it yourself. Which is not trivial.
The reason why dataclass works this way is because it is supposed to be a very simple one-to-one mapping of your business data into a programmatic structure. As a consequence, every kind of additional logic that you add which has nothing to do with that core idea takes away from the usefulness of dataclass.
So I'd suggest to stick to option 1.
If writing out the wanted attributes by hand becomes too much of a nuisance, you can consider writing a classmethod that filters unwanted attributes for you, and allows you to just splat the dictionary like this:
dataclass_instance = MyDataClass.from_request(**resp)
Here is a post that explains how to do just that, where the accompanying question also touches on some of your issues.
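For illustration, a minimal sketch of such a classmethod (the names are placeholders), using dataclasses.fields to drop the keys the class doesn't declare:
from dataclasses import dataclass, fields

@dataclass
class MyDataClass:
    name: str
    duration: int

    @classmethod
    def from_request(cls, **kwargs):
        # keep only the keys that correspond to declared fields
        allowed = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in kwargs.items() if k in allowed})

my_data_class = MyDataClass.from_request(**resp)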
So I know this could be considered quite a broad question, for which I am sorry, but I'm having problems understanding the whole importing and __init__ and self. things and all that... I've tried reading through the Python documentation and a few other tutorials, but this is my first language, and I'm a little (a lot) confused.
So far through my first semester at university I have learnt the very basics of Python, functions, numeric types, sequence types, basic logic stuff. But it's moving slower than I would like, so I took it upon myself to try learn a bit more and create a basic text based, strategy, resource management sorta game inspired by Ogame.
First problem I ran into was how to define each building, for example each mine, which produces resources. I did some research and found classes were useful, so I have something like this for each building:
class metal_mine:
    level = 1
    base_production = 15
    cost_metal = 40
    cost_crystal = 10
    power_use = 10

    def calc_production():
        metal_mine.production = ...  # a formula goes here

    def calc_cost_metal():
        ...  # etc., same for crystal

    def calc_power_use():
        metal_mine.power_use = ...  # blah blah

    def upgrade():
        # various things, then:
        solar_plant.calc_available_power()
It's kinda long, I left a lot out. Anyway, so the kinda important bit is that last bit, you see when I upgrade the mine, to determine if it has enough power to run, I calculate the power output of the solar plant which is in its own class (solar_plant.calc_output()), which contains many similar things to the metal mine class. If I throw everything in the same module, this all works fantastically, however with many buildings and research levels and the likes, it gets very long and I get lost in it.
So I tried to split it into different modules: one for mines, one for storage buildings, one for research levels, etc. This makes everything very tidy, however I still need a way to call the functions in classes which are now part of a different module. My initial solution was to put, for example, from power import *, which for the most part made the solar_plant class available in the metal_mine class. I say for the most part because, depending on the order in which I try to do things, sometimes it seems this doesn't work. The solar_plant class itself calls on variables from the metal_mine class. Now I know this is getting very spaghetti-ish... but I don't know of any better conventions to follow yet.
Anyway, sometimes when I call the solar_plant class, and it in turn tries to call the metal_mine class, it says that metal_mine is not defined, which leads me to think somehow the modules or classes need to be initialized? There seems to be a bit of looping between things in the code. And depending on the order in which I try and 'play the game', sometimes I am unintentionally doing this, sometimes I'm not. I haven't yet been taught the conventions and details of importing and reloading and all that..so I have no idea if I am taking the right approach or anything.
Provided everything I just said made sense, could I get some input on how I would properly go about making the contents of these various modules freely available and modifiable to others? Am I perhaps trying to split things into different modules which you wouldn't normally do, and I should just deal with the large module? Or am I importing things wrong? Or...?
And on a side note, in most tutorials and places I look for help on this, I see classes or functions full of self.something and the __init__ function... can I get an explanation of this? Or a link to a simple first-time-programmer's tutorial?
==================UPDATE=================
Ok so too broad, like I thought it might be. Based on the help I got, I think I can narrow it down.
I sorted out what I think need to be the class variables, those which don't change - name, base_cost_metal, and base_cost_crystal, all the rest would depend on the players currently selected building of that type (supposing they could have multiple settlements).
To take a snippet of what I have now:
class metal_mine:
    name = 'Metal Mine'
    base_cost_metal = 60
    base_cost_crystal = 15

    def __init__(self):
        self.level = 0
        self.production = 30
        self.cost_metal = 60
        self.cost_crystal = 15
        self.power_use = 0
        self.efficiency = 1

    def calc_production(self):
        self.production = int(30 + (self.efficiency * int(30 * self.level * 1.1 * self.level)))

    def calc_cost_metal(self):
        self.cost_metal = int(metal_mine.base_cost_metal * 1.5 ** self.level)
So to my understanding, this is now a more correctly defined class? I define the instance variables with their starting values, which are then changed as the user plays.
In my main function where I begin the game, I would create an instance of each mine, say, player_metal_mine = metal_mine(), and then I call all the functions and variables with the likes of
>>> player_metal_mine.level
0
>>> player_metal_mine.upgrade()
>>> player_metal_mine.level
1
So if this is correctly defined, do I now just import each of my modules with these new templates for each building? And once they are imported and an instance created, are all the new instances and their variables contained within the scope (right terminology?) of the main module, meaning no need for new importing or reloading?
Provided the answer to that is yes, and I do just need to import, what method should I use? I understand there is plain import mines, for example, but that means I would have to use mines.player_metal_mine.upgrade() to use it, which is a tiny bit more typing than using the likes of from mines import *, or more particularly from mines import metal_mine, though that last option means I need to individually import every building from every module. So like I said, provided yes, I am just importing it, what method is best?
==================UPDATE 2================= (You can probably skip the wall of text and just read this)
So I went through everything, corrected all my classes, everything seems to be importing correctly using from module import *, but I am having issues with the scope of my variables representing the resource levels.
If everything was in 1 module, right at the top I would declare each variable and give it the beginning value, e.g. metal = 1000. Then, in any method of my classes which alters this, such as upgrading a building, costing resources, or in any function which alters this, like the one which periodically adds all the production to the current resource levels, I put global metal, for example, at the top. Then, within the function, I can call and alter the value of metal no problem.
However now that I am importing these classes and functions from various modules all into 1 module, functions cant find these variables. What I thought would happen was that in the process of importing I would basically be saying, take everything in this module, and pretend its now in this one, and work with it. But apparently that's not what is happening.
In my main module, I import all my modules using from mines import * for example and define the value of, say, metal to be 1000. Now I create an instance of a metal mine, metal_mine_one = metal_mine(), and I can call its methods and variables, e.g.
>>> metal_mine_one.production
30
But when I try to call a method like metal_mine_one.upgrade(), which contains global metal and then metal -= self.cost_metal, it gives me an error saying metal is not defined. Like I said, if this is all in one module, this problem doesn't happen, but if I try to import things, it does.
So how can I import these modules in a way which doesn't cause this problem, and makes variables in the global scope of my main module available to all functions and methods within all imported modules?
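For illustration, one common fix (a sketch, assuming a hypothetical resources module) is to keep shared state as attributes of a dedicated module and mutate them through the module object, since every importer shares that one module:
# resources.py (hypothetical shared-state module)
metal = 1000

# mines.py
import resources

class metal_mine:
    def __init__(self):
        self.cost_metal = 60

    def upgrade(self):
        # every file that does `import resources` sees the same module
        # object, so this change is visible everywhere
        resources.metal -= self.cost_metal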
First, a little background on object-oriented programming, i.e. classes. You should think of a class like a blueprint: it shows how to make something. When you write a class, you describe to the program how to make an object. A simple class in Python might look like this:
class foo:
    def __init__(self, bars_starting_value):
        self.bar = bars_starting_value

    def print_bar(self):
        print(self.bar)
This tells Python how to make a foo object. The __init__ function is called a constructor; it is called when you make a new foo. The self is a way of referencing the foo that is running the function. In this case every foo has its own bar, which can be accessed from within a foo by using self.bar. Note that you have to put self as the first argument of the function definition; this makes those functions belong to a single foo rather than to all of them.
One might use this class like this:
my_foo = foo(15)
my_other_foo = foo(100)
my_foo.print_bar()
my_foo.bar = 20
print(my_foo.bar)
my_other_foo.print_bar()
This would output
15
20
100
As far as imports go, they take the things that are defined in one file and make them available in another. This is useful because, if you put a class definition in its own file, you can import it into your main program file and make objects from there, as shown below.
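For example (the file names are made up):
# buildings.py
class metal_mine:
    level = 1

# main.py, one option: bind the class name directly in this module
from buildings import metal_mine
mine = metal_mine()

# main.py, another option: keep the module namespace explicit
import buildings
mine = buildings.metal_mine()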
As far as making variables available to others, you could pass the power that has been generated from all the generators to the mine's function to determine if it has enough power.
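A sketch of that idea (the formulas and numbers are made up):
class solar_plant:
    def __init__(self):
        self.level = 1

    def output(self):
        return 20 * self.level

class metal_mine:
    def __init__(self):
        self.power_use = 10

    def can_run(self, available_power):
        # the caller computes the available power and passes it in, so the
        # mine never has to reach into another module's globals
        return available_power >= self.power_use

plant = solar_plant()
mine = metal_mine()
print(mine.can_run(plant.output()))  # True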
Hope this helps.
A lot of things to cover here... __init__ is a built-in method that is automatically called when an instance of a class is created. In the code you provided you've created a class; now you need to create an instance of that class. A simpler example:
class Test:
    def __init__(self):
        print("this is called when you create an instance of this class")

    def a_method(self):
        return True

>>> class_instance = Test()
this is called when you create an instance of this class
>>> class_instance.a_method()
True
The first argument in a class method is always the instance itself. By convention we just call that argument 'self'. Your methods did not accept any arguments; make sure they accept self (or have the @staticmethod decorator above them). Also, make sure you refer to attributes (in your case, methods) via self.a_method or class_instance.a_method.
I have an application which relies heavily on a Context instance that serves as the access point to the context in which a given calculation is performed.
If I want to provide access to the Context instance, I can:
rely on global
pass the Context as a parameter to all the functions that require it
I would rather not use global variables, and passing the Context instance to all the functions is cumbersome and verbose.
How would you "hide, but make accessible" the calculation Context?
For example, imagine that Context simply computes the state (position and velocity) of planets according to different data.
class Context(object):
    def state(self, planet, epoch):
        """base class --- suppose `state` is meant
        to return a tuple of vectors."""
        raise NotImplementedError("provide an implementation!")

class DE405Context(Context):
    """Concrete context using the DE405 planetary ephemeris"""
    def state(self, planet, epoch):
        """suppose that a de405 reader exists and can provide
        the required (position, velocity) tuple."""
        return de405reader(planet, epoch)

def angular_momentum(planet, epoch, context):
    """suppose we care about the angular momentum of the planet,
    and that `cross` exists"""
    r, v = context.state(planet, epoch)
    return cross(r, v)

# a second alternative: a "Calculator" class that contains the context
class Calculator(object):
    def __init__(self, context):
        self._ctx = context

    def angular_momentum(self, planet, epoch):
        r, v = self._ctx.state(planet, epoch)
        return cross(r, v)

# use as follows:
my_context = DE405Context()
now = now()  # assume this function returns an epoch

# first case:
print(angular_momentum("Saturn", now, my_context))

# second case:
calculator = Calculator(my_context)
print(calculator.angular_momentum("Saturn", now))
Of course, I could add all the operations directly into "Context", but it does not feel right.
In real life, the Context not only computes positions of planets! It computes many more things, and it serves as the access point to a lot of data.
So, to make my question more succinct: how do you deal with objects which need to be accessed by many classes?
I am currently exploring Python's context managers, but without much luck. I also thought about dynamically adding a "context" attribute to the functions directly (functions are objects, so they can hold a reference to an arbitrary object), i.e.:
def angular_momentum(planet, epoch):
    r, v = angular_momentum.ctx.state(planet, epoch)
    return cross(r, v)

# somewhere before calling anything...
from some_module import angular_momentum  # hypothetical module holding the function
angular_momentum.ctx = my_context
edit
Something that would be great, is to create a "calculation context" with a with statement, for example:
with my_context:
    h = angular_momentum("Earth", now)
Of course, I can already do that if I simply write:
with my_context as ctx:
    h = angular_momentum("Earth", now, ctx)  # first implementation above
Maybe a variation of this with the Strategy pattern?
You generally don't want to "hide" anything in Python. You may want to signal human readers that they should treat it as "private", but this really just means "you should be able to understand my API even if you ignore this object", not "you can't access this".
The idiomatic way to do that in Python is to prefix it with an underscore—and, if your module might ever be used with from foo import *, add an explicit __all__ global that lists all the public exports. Again, neither of these will actually prevent anyone from seeing your variable, or even accessing it from outside after import foo.
See PEP 8 on Global Variable Names for more details.
Some style guides suggest special prefixes, all-caps-names, or other special distinguishing marks for globals, but PEP 8 specifically says that the conventions are the same, except for the __all__ and/or leading underscore.
Meanwhile, the behavior you want is clearly that of a global variable—a single object that everyone implicitly shares and references. Trying to disguise it as anything other than what it is will do you no good, except possibly for passing a lint check or a code review that you shouldn't have passed. All of the problems with global variables come from being a single object that everyone implicitly shares and references, not from being directly in the globals() dictionary or anything like that, so any decent fake global is just as bad as a real global. If that truly is the behavior you want, make it a global variable.
Putting it together:
# do not include _context here
__all__ = ['Context', 'DE405Context', 'Calculator', …
_context = Context()
Also, of course, you may want to call it something like _global_context or even _private_global_context, instead of just _context.
But keep in mind that globals are still members of a module, not of the entire universe, so even a public context will still be scoped as foo.context when client code does an import foo. And this may be exactly what you want. If you want a way for client scripts to import your module and then control its behavior, maybe foo.context = foo.Context(…) is exactly the right way. Of course this won't work in multithreaded (or gevent/coroutine/etc.) code, and it's inappropriate in various other cases, but if that's not an issue, in some cases, this is fine.
Since you brought up multithreading in your comments: In the simple style of multithreading where you have long-running jobs, the global style actually works perfectly fine, with a trivial change—replace the global Context with a global threading.local instance that contains a Context. Even in the style where you have small jobs handled by a thread pool, it's not much more complicated. You attach a context to each job, and then when a worker pulls a job off the queue, it sets the thread-local context to that job's context.
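A minimal sketch of that substitution, assuming the private-global style above and the Context class from the question:
import threading

_default_context = Context()
_local = threading.local()

def _get_context():
    # each thread sees its own `context` attribute; fall back to the default
    return getattr(_local, 'context', _default_context)

def _set_context(ctx):
    _local.context = ctx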
However, I'm not sure multithreading is going to be a good fit for your app anyway. Multithreading is great in Python when your tasks occasionally have to block for IO and you want to be able to do that without stopping other tasks—but, thanks to the GIL, it's nearly useless for parallelizing CPU work, and it sounds like that's what you're looking for. Multiprocessing (whether via the multiprocessing module or otherwise) may be more of what you're after. And with separate processes, keeping separate contexts is even simpler. (Or, you can write thread-based code and switch it to multiprocessing, leaving the threading.local variables as-is and only changing the way you spawn new tasks, and everything still works just fine.)
It may make sense to provide a "context" in the context manager sense, as an external version of the standard library's decimal module did, so someone can write:
with foo.Context(…):
    # do stuff under custom context
# back to default context
However, nobody could really think of a good use case for that (especially since, at least in the naive implementation, it doesn't actually solve the threading/etc. problem), so it wasn't added to the standard library, and you may not need it either.
If you want to do this, it's pretty trivial. If you're using a private global, just add this to your Context class:
def __enter__(self):
    global _context
    self._stashedcontext = _context
    _context = self

def __exit__(self, *args):
    global _context
    _context = self._stashedcontext
And it should be obvious how to adjust this to public, thread-local, etc. alternatives.
Another alternative is to make everything a member of the Context object. The top-level module functions then just delegate to the global context, which has a reasonable default value. This is exactly how the standard library random module works—you can create a random.Random() and call randrange on it, or you can just call random.randrange(), which calls the same thing on a global default random.Random() object.
If creating a Context is too heavy to do at import time, especially if it might not get used (because nobody might ever call the global functions), you can use the singleton pattern to create it on first access. But that's rarely necessary. And when it's not, the code is trivial. For example, the source to random, starting at line 881, does this:
_inst = Random()
seed = _inst.seed
random = _inst.random
uniform = _inst.uniform
…
And that's all there is to it.
And finally, as you suggested, you could make everything a member of a different Calculator object which owns a Context object. This is the traditional OOP solution; overusing it tends to make Python feel like Java, but using it when it's appropriate is not a bad thing.
You might consider using a proxy object, here's a library that helps in creating object proxies:
http://pypi.python.org/pypi/ProxyTypes
Flask uses object proxies for its "current_app", "request" and other variables; all it takes to reference them is:
from flask import request
You could create a proxy object that is a reference to your real context, and use thread locals to manage the instances (if that would work for you).
Let's say I have the following simple class:
import cherrypy
import os

class test:
    test_member = 0

    def __init__(self):
        return

    def index(self):
        self.test_member = self.test_member + 1
        return str(self.test_member)
    index.exposed = True

conf = os.path.join(os.path.dirname(__file__), 'config.ini')

if __name__ == '__main__':
    # CherryPy always starts with app.root when trying to map request URIs
    # to objects, so we need to mount a request handler root. A request
    # to '/' will be mapped to HelloWorld().index().
    cherrypy.config.update({'server.socket_host': '0.0.0.0'})
    cherrypy.quickstart(test(), config=conf)
else:
    # This branch is for the test suite; you can ignore it.
    cherrypy.config.update({'server.socket_host': '0.0.0.0'})
    cherrypy.tree.mount(test(), config=conf)
So when I open my index page the first time I get back 1, the next time 2, then 3, 4, and so on. My questions are:
Are there any big dangers with this, particularly with threads and multiple people accessing the page at the same time?
Do I have to lock the member variable in some way each time it's written to in order to prevent issues?
Does anything change if I'm using a non-basic data type as a member (such as my own complicated class) rather than something as simple as an integer?
I don't totally understand how threading with CherryPy works, I suppose my concern in this simple example would be that on one thread the test_member could be equal to one thing, and when accessed from another thread it'd be something totally different. I apologize in advance if I'm missing something that's well documented, but some googling didn't really turn up what I was looking for. I understand for such a simple example there are a number of relatively easy paths that could solve potential problems here (keep the state of the variable in a database, or something along those lines), but that won't work in my actual use case.
There's a danger there of lost updates. Just setting the value shouldn't need a lock, since replacing an instance variable is atomic with respect to the GIL (assuming it doesn't call any special methods, etc.). But incrementing, or using more complex variables, will need different schemes to be made thread-safe.
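A minimal sketch of one such scheme, guarding the read-modify-write with a lock (adapting the class from the question):
import threading

class test:
    def __init__(self):
        self.test_member = 0
        self._lock = threading.Lock()

    def index(self):
        # serialize the increment so concurrent requests cannot lose updates
        with self._lock:
            self.test_member += 1
            return str(self.test_member)
    index.exposed = True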
Shared access in CherryPy is generally no different than any other Python program. Rather than a long rehash of all those options here, it's best to direct you to http://effbot.org/zone/thread-synchronization.htm As it mentions, replacing an instance variable is probably atomic with respect to the GIL and thereby thread-safe, but incrementing is not.
CherryPy only adds some helpers in the opposite direction, for when you don't want to share: the cherrypy.request and cherrypy.response objects are newly created (and properly destroyed) for each request/response, so feel free to stick data in cherrypy.request.foo if you want to keep it around for the duration of the request only.