Is docstring being declared every time the function is called? - python

The official Python documentation specifies that a docstring is a string literal that occurs as the first statement of a function, and that it can be accessed through the __doc__ attribute.
If I have a function that will be called many, many times, does that mean the docstring is created anew every time the function is called?
If so, would it be more efficient to arrange for the docstring to be stored in __doc__ without being re-created on every call?

Every time you start a Python program, your files are imported into memory only once: they are parsed so that the properties of every object are determined, and all objects are placed in their own memory locations and linked together to make a whole running system (remember the object nature of Python).
A second behavior appears when you don't restrict the Python interpreter: if your files are imported as modules then, in addition to the steps above, the compiled objects are written into more durable .pyc files under a __pycache__ folder at the same level as the file.
In this process, new objects get a __doc__ attribute when certain keywords are parsed, mainly class and def. The __doc__ of each is then either left empty, filled with a default through inheritance, or, if there is a docstring, set to that string.
You can see this behavior on different objects, created with or without a docstring, simply by using dir(objectname). To answer your question, you can use this command throughout your program.
However, this is true only for statically written code. If you create objects on the fly, especially within loops, then your objects will be actively created and destroyed, so there is almost no optimization applied to them and their docstrings will be created again and again.
consider these two:
def staticMethod():
    pass

for i in range(5):
    def activeMethod():
        pass
    print(staticMethod, "s")
    print(activeMethod, "a")
While staticMethod is served from the same memory location, the memory address of activeMethod changes; you will see it alternate between a few values, because Python can still optimize a simple example like this one.
So keep yourself aware of this distinct behavior, especially in loops.
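The short answer, for statically defined functions: the docstring is bound to __doc__ once, when the def statement executes, and calling the function never re-creates it. A minimal check (names are illustrative):

def greet():
    """Return a friendly greeting."""
    return "hello"

# The docstring is attached to the function object at definition time.
print(greet.__doc__)          # Return a friendly greeting.

# Many calls later, __doc__ is still the very same string object:
doc_id = id(greet.__doc__)
for _ in range(1000):
    greet()
assert id(greet.__doc__) == doc_id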

Related

How can I save a dynamically generated module and reimport it from a file?

I have an application that dynamically generates a lot of Python modules with class factories, to eliminate redundant boilerplate that makes the code hard to debug across similar implementations. This works well, except that the dynamic generation of the classes across the modules (hundreds of them) takes more time to load than simply importing from a file. So I would like to find a way to save the modules to files after generation (unless reset), then load from those files to cut down on bootstrap time for the platform.
Does anyone know how I can save/export auto-generated Python modules to a file for re-import later? I already know that pickling or exporting as a JSON object won't work, because they make use of thread locks and other dynamic state, and a class must already be defined before its instances can be pickled. I need to save the actual class definitions, not instances. The classes are defined with the type() function.
If you have ideas or knowledge of how to do this, I would really appreciate your input.
You’re basically asking how to write a compiler whose input is a module object and whose output is a .pyc file. (One plausible strategy is of course to generate a .py and then byte-compile that in the usual fashion; the following could even be adapted to do so.) It’s fairly easy to do this for simple cases: the .pyc format is very simple (but note the comments there), and the marshal module does all of the heavy lifting for it. One point of warning that might be obvious: if you’ve already evaluated, say, os.getcwd() when you generate the code, that’s not at all the same as evaluating it when loading it in a new process.
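To see how little machinery this part needs, here is a minimal sketch of the marshal round-trip (the source string and names are illustrative; a real .pyc adds a small header in front of these bytes):

import marshal

# Compile some generated source into a code object, as the interpreter
# does when importing a .py file.
code = compile("def hello():\n    return 'hi'\n", "<generated>", "exec")

# marshal round-trips code objects through bytes.
blob = marshal.dumps(code)
restored = marshal.loads(blob)

# Executing the restored code object re-creates the module's definitions.
namespace = {}
exec(restored, namespace)
print(namespace["hello"]())  # hi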
The “only” other task is constructing the code objects for the module and each class: this requires concatenating a large number of boring values from the dis module, and will fail if any object encountered is non-trivial. These might be global/static variables/constants or default argument values: if you can alter your generator to produce modules directly, you can probably wrap all of these (along with anything else you want to defer) in function calls by compiling something like
my_global = (lambda: open(os.devnull, 'w'))()
so that you actually emit the function and then a call to it. If you can’t so alter it, you’ll have to have rules to recognize values that need to be constructed in this fashion so that you can replace them with such calls.
Another detail that may be important is closures: if your generator uses local functions/classes, you’ll need to create the cell objects, perhaps via “fake” closures of your own:
def cell(x):
    return (lambda: x).__closure__[0]
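For example, here is a sketch of splicing such a manufactured cell into a function (types.FunctionType accepts a closure tuple matching the code object's free variables; all names are illustrative):

import types

def cell(x):  # as above
    return (lambda: x).__closure__[0]

def make_template():
    captured = None          # makes 'captured' a free variable of template
    def template():
        return captured
    return template

template = make_template()
rebuilt = types.FunctionType(
    template.__code__,       # code object with one free variable
    template.__globals__,
    "rebuilt",
    None,                    # no argument defaults
    (cell(42),),             # cells, in co_freevars order
)
print(rebuilt())             # 42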

(Python) Monkeypatch __new__ for objects of type int, float, str, list, dict, set, and module in python

I want to implicitly extend the int, float, str, list, dict, set, and module classes with custom-built substitutions (extensions).
When I say 'implicitly', what I mean is that when I declare 'a = 1', an object of the type Custom_Int (as an example) is produced, as opposed to a normal integer object.
Now, I understand and respect the reasons not to do this. First, messing with built-ins is like messing with the laws of physics; no good can come from it. That said, I do understand the gravity of what I'm trying to do and what can happen if I do it wrong.
Second, I understand that modifying a base class could affect not just the current runtime but other running Python processes. I feel that by overriding the __new__ method of these base classes, such that it returns Custom_Object_Whatever if and ONLY IF certain environmental factors are true, other runtimes will remain largely unaffected.
So, getting back to the issue at hand- how can I override the __new__ method of these various types?
Python's forbiddenfruit package seems promising. I haven't had a chance to really investigate it yet, though, and if someone who understands it could summarize what it does, that would save me a lot of time.
Beyond that, I've observed something strange.
Every answer to monkeypatching that doesn't eventually circle back to forbiddenfruit or how forbiddenfruit works has to do with modifying what I will refer to as the 'absolute_dictionary' of the class. Because everything in Python is essentially a mapping (or dictionary) of functions/values to names, if you change the name __new__ within the right mapping, you change the nature of the object.
The problem is: in every near-success I've had, calling str('a').__new__(*args) works fine (in some cases), but an assignment like varOne = 'a' does not seem to actually call str.__new__().
My guess is that this has something to do with either Python's parsing of a program prior to launch, or the caching of the various classes during or after launch. Or maybe I'm totally off the mark: either Python pre-reads and transforms its modules prior to launch, or the interpreter, when it attempts to implicitly create an object, reaches for something other than the class located in moduleObject.builtins[__classname__].
Any ideas?
If you want to do this, your best option is probably to modify the CPython source code and build your own custom Python build with your extensions baked into the actual built-in types. The result will integrate a lot better with all the low-level mechanisms you don't yet understand, and you'll learn a lot in the process.
Right now, you're getting blocked by a lot of factors. Here are the ones that have come to my mind.
The first is that most ways of creating built-in objects don't go through a __new__ method at all. They go through C-level calls like PyLong_FromLong or PyList_New. These calls are hardwired to use the actual built-in types, allocating memory sized for the real built-in types, fetching the type object by the address of its statically-allocated C struct, and stuff like that. It's basically impossible to change any of this without building your own custom Python.
The second factor is that messing with __new__ isn't even enough to correctly affect things that theoretically should go through __new__, like int("5"). Python has reasons for stopping you from setting attributes on built-in classes, and two of those reasons are slots and the type attribute cache.
Slots are a public part of the C API that you'll probably learn about if you try to modify the CPython source code. They're function pointers in the C structs that make up type objects at C level, and most of them correspond to Python-level magic methods. For example, the __new__ method has a corresponding tp_new slot. Most C code accesses slots instead of methods, and there's code to ensure the slots and methods are in sync, but if you bypass Python's protections, that breaks and everything goes to heck.
The type attribute cache isn't a public part of anything even at C level. It's a cache that saves the results of type object attribute lookups, to make Python go faster. Its memory safety relies on all type object attribute modification going through type.__setattr__ (and all built-in type object attribute modification getting rejected by type.__setattr__), but if you bypass the protection, memory safety goes out the window and arbitrarily weird results can occur.
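You can see those protections immediately if you try the naive approach. A quick CPython check (the exact wording of the error varies between versions):

try:
    int.__new__ = staticmethod(lambda cls, *args: 0)
except TypeError as exc:
    # CPython rejects attribute assignment on built-in/extension types.
    print(exc)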
The third factor is that there's a bunch of caching for immutable objects. The small int cache, the interned string dict, constants getting saved in bytecode objects, compile-time constant folding... there's a lot. Objects aren't going to be created when you expect. (There's also stuff like, say, zip saving the last output tuple and reusing it if it sees you didn't keep a reference, for even more ways object creation will mess with your assumptions.)
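A couple of quick demonstrations of that caching, as seen in CPython (these are implementation details, not language guarantees):

a = 5
b = 5
print(a is b)    # True: small integers are cached by the interpreter

s = "hello"
t = "hello"
print(s is t)    # True: identical string constants are folded/interned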
There's more. Stuff like, what argument would int.__new__ even take if you tried to use int.__new__ to evaluate the expression 5? Stuff like all the low-level code that knows exactly how to work with the types it expects and will get very confused if it gets a MyCustomTuple with a completely different memory layout from a real tuple. Screwing with built-ins has a lot of issues.
Incidentally, one of the things you expected to be a problem is mostly not a problem. Screwing with one Python process's built-ins won't affect other Python processes' built-ins... unless those other processes are created by forking the first process, such as with multiprocessing in fork mode.

How can I call a function defined in a module with the same name as an attribute in the same module?

If an attribute of a module is bound to a function defined in the same module under the same name as the attribute, how can I invoke the function directly from outside the module?
For example, the builtins.__import__ attribute is bound to the builtins.__import__ function by default.
If I rebind the builtins.__import__ attribute to a function different from the builtins.__import__ function,
how shall I invoke the new function via the builtins.__import__ attribute? Treat the builtins.__import__ attribute as if it were a function, and call it like builtins.__import__(argument)?
And how can I invoke the original builtins.__import__ function, if it is hidden by the rebound builtins.__import__ attribute?
Thanks.
First of all, technically speaking, functions don't have names as far as access is concerned: there is no such thing as 'function a', there is only a property/attribute/variable a pointing to a function in memory. Thus, unless you know the original function's location in memory (and that is, greatly simplified, what variables represent: wrapped pointers), it will be lost to you. How can you find something if you don't know where to look (short of manually going bit by bit through memory and trying to identify the structure you're looking for), and when you don't know its structure (because if you knew, you could just recreate the function instead of searching for it)?
In fact, Python releases the memory of any object whose reference count drops to zero; otherwise a lot of Python would cease to work (or would create memory sinkholes within the first seconds of operation). So if the builtins.__import__ function has no other references than that attribute (and to my knowledge it doesn't in standard CPython), it will be gone forever (for the duration of the current process) once you overwrite the attribute to point to some other function.
Granted, given the importance of the function in question, I'd bet there are still references to it a level or two deeper, so it probably doesn't get garbage collected, but finding a different 'route' to the function would be a time-consuming task, well out of scope for an SO answer.
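That said, if the goal is simply to keep the original callable reachable, the usual trick is to save a reference before rebinding. A minimal sketch:

import builtins

original_import = builtins.__import__   # keep the original reachable

def noisy_import(name, *args, **kwargs):
    print("importing", name)
    return original_import(name, *args, **kwargs)  # delegate to the original

builtins.__import__ = noisy_import
import json                             # prints: importing json
builtins.__import__ = original_import   # restore when done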
If you were hoping to use this fact to build a 'sandboxed Python' - don't. Many have tried before and failed because you should never underestimate the time some people are willing to devote to find a workaround. Just run your Python instances in disposable VMs and live a carefree life (until somebody really, really dedicated finds a way to break out of that as well).

Why use python classes over modules with functions?

I'm teaching myself Python (3.x) and I'm trying to understand the use case for classes. I'm starting to understand what they actually do, but I'm struggling to understand why you would use a class as opposed to creating a module with functions.
For example, how does:
class cls1:
    def func1(self, arguments...):
        # do some stuff

obj1 = cls1()
obj2 = cls1()
obj1.func1(arg1, arg2...)
obj2.func1(arg1, arg2...)
Differ from:
# module.py contents
def func1(arguments...):
    # do some stuff

import module
x = module.func1(arg1, arg2...)
y = module.func1(arg1, arg2...)
This is probably very simple but I just can't get my head around it.
So far, I've had quite a bit of success writing python programs, but they have all been pretty procedural, and only importing basic module functions. Classes are my next biggest hurdle.
You use a class when you need multiple instances of it and you want those instances not to interfere with each other.
A module behaves like a singleton class, so you can have only one instance of it.
EDIT: for example if you have a module called example.py:
x = 0

def incr():
    global x
    x = x + 1

def getX():
    return x
if you try to import this module twice:
>>> import example as ex1
>>> import example as ex2
>>> ex1.incr()
>>> ex1.getX()
1
>>> ex2.getX()
1
This is because the module is imported only once, so ex1 and ex2 point to the same object.
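By contrast, a class gives you as many independent states as you like. A minimal counterpart to the module above (a sketch, not from the original answer):

class Example:
    def __init__(self):
        self.x = 0

    def incr(self):
        self.x += 1

    def getX(self):
        return self.x

e1 = Example()
e2 = Example()
e1.incr()
print(e1.getX())  # 1
print(e2.getX())  # 0, because each instance keeps its own x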
As long as you're only using pure functions (functions that work only on their arguments, always return the same result for the same set of arguments, don't depend on any global/shared state, and don't change anything, neither their arguments nor any global/shared state; in other words, functions without side effects), classes are indeed of rather limited use. But that's functional programming, and while Python can technically be used in a functional style, it's possibly not the best choice here.
As soon as you have to share state between functions, and especially if some of those functions are supposed to change that shared state, you do have a use for OO concepts. There are mainly two ways to share state between functions: passing the state from function to function, or using globals.
The second solution, global state, is known to be troublesome: first because it makes the program flow (and hence debugging) harder to understand, and also because it prevents your code from being reentrant, which is a definitive no-no for quite a lot of now-common use cases (multithreaded execution, most server-side web application code, etc.). It makes your code practically unusable, or nearly so, for anything except short, simple one-shot scripts.
The first solution, passing state around, most often implies using semi-informal complex datastructures (dicts with a given set of keys, often holding other dicts, lists, lists of dicts, sets, etc.), correctly initialising them, and passing them from function to function, along with a set of functions that work on a given datastructure. In other words, you are actually defining new complex datatypes (a data structure plus a set of operations on that data structure), using only the lowest-level tools the language provides.
Classes are a way to define such a datatype at a higher level, grouping together the data and the operations. They also offer a lot more, especially polymorphism, which makes for more generic, extensible code and easier unit testing.
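To make that concrete, compare a hand-rolled "informal datatype" with its class equivalent (a sketch with made-up names, not from the original answer):

# Informal datatype: the structure exists only by convention, and every
# function must know which keys the dict is expected to carry.
def make_account(owner):
    return {"owner": owner, "balance": 0}

def deposit(account, amount):
    account["balance"] += amount

# Class version: the data and the operations that maintain it live together.
class Account:
    def __init__(self, owner):
        self.owner = owner
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

acct = Account("alice")
acct.deposit(10)
print(acct.balance)  # 10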
Say you have a file or a database with products, where each product has a product id, price, availability, discount, published-on-web status, and more. And say you have a second file with thousands of products that contains new prices, availabilities and discounts. You want to update the values and keep track of how many products will be changed, among other stats. You can do it with procedural or functional programming, but you will find yourself inventing tricks to make it work, and you will most likely get lost among many different lists and sets.
On the other hand, with object-oriented programming you can create a class Product with instance variables for the product id, the old price, the old availability, the old discount and the old published status, plus some instance variables for the new values (new price, new availability, new discount, new published status). Then all you have to do is read the first file/database and create a new instance of the class Product for every product. Then you can read the second file and find the new values for your product objects. In the end, every product of the first file/database will be an object, labeled and carrying both the old values and the new values. It is easier this way to track the changes, compute statistics and update your database.
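A sketch of the kind of Product class described above (attribute and method names are illustrative):

class Product:
    """One product, carrying both its old values and any new ones."""

    def __init__(self, product_id, price, availability, discount, published):
        self.product_id = product_id
        self.old_price = price
        self.old_availability = availability
        self.old_discount = discount
        self.old_published = published
        self.new_price = None
        self.new_availability = None
        self.new_discount = None
        self.new_published = None

    def apply_update(self, price, availability, discount, published):
        self.new_price = price
        self.new_availability = availability
        self.new_discount = discount
        self.new_published = published

    def changed(self):
        # True when an update has been applied and any value differs.
        if self.new_price is None:
            return False
        return (self.new_price != self.old_price
                or self.new_availability != self.old_availability
                or self.new_discount != self.old_discount
                or self.new_published != self.old_published)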
One more example: if you use tkinter, you can create a class for a top-level window, and every time you want to show an information window or an about window (with a custom background color and dimensions) you can simply create a new instance of this class.
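For instance, a minimal sketch of such a window class (colors and sizes are arbitrary choices):

import tkinter as tk

class InfoWindow(tk.Toplevel):
    """A top-level window with a custom background color and size."""

    def __init__(self, master, text, bg="lightyellow", size="300x120"):
        super().__init__(master, bg=bg)
        self.geometry(size)
        tk.Label(self, text=text, bg=bg).pack(expand=True)

root = tk.Tk()
InfoWindow(root, "About this program...")  # each call opens an independent window
root.mainloop()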
For simple things classes are not needed. But for more complex things classes sometimes can make the solution easier.
I think the best answer is that it depends on what your intended object is supposed to be/do. But in general, there are some differences between a class and an imported module that give each of them different features in the current module. The most important is that a class defines a type of object, which means instances have a lot of options for acting like objects that modules don't have: special attributes like __getattr__, __setattr__, __iter__, etc., the ability to create many instances, and even control over the way those instances are created. For modules, on the other hand, the documentation describes the use case perfectly:
If you quit from the Python interpreter and enter it again, the
definitions you have made (functions and variables) are lost.
Therefore, if you want to write a somewhat longer program, you are
better off using a text editor to prepare the input for the
interpreter and running it with that file as input instead. This is
known as creating a script. As your program gets longer, you may want
to split it into several files for easier maintenance. You may also
want to use a handy function that you’ve written in several programs
without copying its definition into each program.
To support this, Python has a way to put definitions in a file and use
them in a script or in an interactive instance of the interpreter.
Such a file is called a module; definitions from a module can be
imported into other modules or into the main module (the collection of
variables that you have access to in a script executed at the top
level and in calculator mode).

Python game programming: is my IO object a legitimate candidate for being a global variable?

I'm programming a game in Python, where all IO activities are done by an IO object (in the hope that it will be easy to swap that object out for another which implements a different user interface). Nearly all the other objects in the game need to access the IO system at some point (e.g. printing a message, updating the position of the player, showing a special effect caused by an in-game action), so my question is this:
Does it make sense for a reference to the IO object to be available globally?
The alternative is passing a reference to the IO object into the __init__() of every object that needs to use it. I understand that this is good from a testing point of view, but is it worth the resulting "function signature pollution"?
Thanks.
Yes, this is a legitimate use of a global variable. If you'd rather not, passing around a context object that is equivalent to this global is another option, as you mentioned.
Since I assume you're using multiple files (modules), why not do something like:
import game_io as io  # your own module; naming the file io.py would collide with the standard library's io module
io.print('hello, world')
io.clear()
This is a common way programs that have more complex I/O needs than simple printing do things like logging.
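A minimal sketch of what such a module might look like (all names here are hypothetical):

# game_io.py: a module-level IO facade. Don't name it io.py, since that
# would collide with the standard library's io module.
_backend = None                  # the concrete UI implementation

def set_backend(backend):
    """Install the concrete UI implementation once, at startup."""
    global _backend
    _backend = backend

def print(text):                 # shadows the builtin only inside this module
    _backend.show_message(text)

def clear():
    _backend.clear()

Swapping user interfaces then amounts to installing a different backend at startup, and tests can install a fake backend through set_backend().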
Yes, I think so.
Another possibility would be to create a module, loggerModule, that has functions like print() and write(), but this would be only marginally better.
Nope.
Variables like this are too specific to be passed around in the global namespace. Hide them inside functions/classes instead, which can do magic things to them at run time (or call other ones entirely).
Consider what happens if the IO can periodically change state or if it needs to block for a while (like many sockets do).
Consider what happens if the same block of code is included multiple times. Does the variable instance get duplicated as well?
Consider what happens if you want to have a version 2 of the same variable. What if you want to change its interface? Do you have to modify all the code that references it?
Does it really make sense to infect all the code that uses the variable with knowledge of all the ways it can go bad?
