Python: where is the code for os.mkdir?

I've been looking through the code of the os module (just to be clear, I'm looking at the file /usr/lib/python2.7/os.py), trying to find the code for the mkdir function. From what I can tell, it comes from the 'posix' module, and it's a built-in function, same as range or max:
>>> import posix
>>> posix.mkdir
<built-in function mkdir>
>>> max
<built-in function max>
I'm guessing the code for these is written in C somewhere, and the Python interpreter knows where to find it. Could someone explain, or point me to some resources that explain, how and where these built-in functions are written and how they are integrated with the interpreter?
Thanks!

On POSIX platforms (and on Windows and OS/2) the os module imports from a C module, defined in posixmodule.c.
This module defines a posix_mkdir() function that wraps the mkdir() C call on POSIX platforms, and CreateDirectoryW() on Windows.
The module registers this function, together with others, in the PyMethodDef posix_methods structure. When the module is imported, Python calls its PyMODINIT_FUNC() function, which creates an appropriate module object from the posix_methods structure and adds a series of constants (such as the open() flag constants) to the module.
See the Extending Python with C or C++ tutorial on how C extensions work.
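You can confirm the relationship from the interactive prompt; on a POSIX system:
>>> import os, posix
>>> os.mkdir is posix.mkdir
True
>>> import sys
>>> 'posix' in sys.builtin_module_names
True
The first check works because os.py does from posix import * on POSIX platforms; the second shows that posix is compiled into the interpreter itself rather than shipped as a separate file.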

Related

Where is binascii really stored?

whilst digging into encoding/decoding, I found the following part in binascii.py:
def a2b_base64(*args, **kwargs): # real signature unknown
    """ Decode a line of base64 data. """
    pass
From my naive understanding, this is implemented as C somewhere else. Is this in the python.exe itself or am I missing something?
There is no binascii.py file in the Python standard library. The binascii module in Python is entirely written in C; it's implemented in the Modules/binascii.c source file.
When Python is installed on a system, the module is available as a shared library object, i.e. as a .so or .dll file in a lib/pythonx.x/lib-dynload directory somewhere.
What you found instead is a stub file, meant to aid an IDE in introspection and autocompletion tasks. Such a file is needed because extension modules written in C usually are not introspectable: you can't always use the normal introspection techniques to figure out what arguments a function written in a compiled language will accept.
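You can ask the imported module where it actually lives. The exact filename varies by platform and Python version, but on a typical Linux build it looks something like:
>>> import binascii
>>> binascii.__file__
'/usr/lib/python3.5/lib-dynload/binascii.cpython-35m-x86_64-linux-gnu.so'
(If the module has no __file__ attribute at all, it was compiled directly into the interpreter instead.)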
Note that such stub files are slowly becoming obsolete as more and more code in the standard library is converted to the new Argument Clinic system, which enables introspection support. The binascii module was updated to the Argument Clinic syntax in Python 3.4, so you can ask the module directly:
>>> import inspect, binascii
>>> inspect.signature(binascii.a2b_base64)
<Signature (data, /)>
The function accepts a single positional-only argument named data (see Python: What does the slash mean in the output of help(range)? for an explanation of what the / in the signature, i.e. positional-only, means).

Where/how does the name `posix` get resolved by an import statement?

What happens behind the scenes (in CPython 3.6.0) when code uses import posix? This module doesn't have a __file__ attribute. When starting the interpreter in verbose mode, I see this line:
import 'posix' # <class '_frozen_importlib.BuiltinImporter'>
It's already present in sys.modules in a newly opened interpreter, and importing it just binds a name to the existing module.
I'm trying to look at implementation detail of os.lstat on my platform to determine if and when it uses os.stat.
Here, have more detail than you're likely to need.
posix is a built-in module. When you hear "built-in module", you might think of ordinary standard library modules, or you might think of modules written in C, but posix is more built-in than most.
The posix module is written in C, in Modules/posixmodule.c. However, while most C modules, even standard library C modules, are compiled to .so or .pyd files and placed on the import path like regular Python modules, posix actually gets compiled right into the Python executable itself.
One of the internal details of CPython's import system is the PyImport_Inittab array:
extern struct _inittab _PyImport_Inittab[];
struct _inittab *PyImport_Inittab = _PyImport_Inittab;
This is an array of struct _inittab entries, each consisting of a module name and the C initialization function for the module with that name. Modules listed here are built-in.
This array is initially set to _PyImport_Inittab, which comes from Modules/config.c (or PC/config.c depending on your OS, but that's not the case here). Unfortunately, Modules/config.c is generated from Modules/config.c.in during the Python build process, so I can't show you a source code link, but here's part of what it looks like when I generate the file:
struct _inittab _PyImport_Inittab[] = {
    {"_thread", PyInit__thread},
    {"posix", PyInit_posix},
    // ...
As you can see, there's an entry for the posix module, along with the module initialization function, PyInit_posix.
As part of the import system, when trying to load a module, Python goes through sys.meta_path, a list of module finders. One of these finders is responsible for performing the sys.path search you're likely more familiar with, but one of the others is _frozen_importlib.BuiltinImporter, responsible for finding built-in modules like posix. When Python tries that finder, it runs the finder's find_spec method:
@classmethod
def find_spec(cls, fullname, path=None, target=None):
    if path is not None:
        return None
    if _imp.is_builtin(fullname):
        return spec_from_loader(fullname, cls, origin='built-in')
    else:
        return None
which uses _imp.is_builtin to search PyImport_Inittab for the "posix" name. The search finds the name, so find_spec returns a module spec representing the fact that the loader for built-in modules should handle creating this module. (The loader is the second argument to spec_from_loader. It's cls here, because BuiltinImporter is both the finder and loader.)
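You can see the result of this machinery from the interactive prompt; on a POSIX build of CPython:
>>> import sys
>>> 'posix' in sys.builtin_module_names
True
>>> from importlib.util import find_spec
>>> find_spec('posix')
ModuleSpec(name='posix', loader=<class '_frozen_importlib.BuiltinImporter'>, origin='built-in')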
Python then runs the loader's create_module method to generate the module object:
@classmethod
def create_module(self, spec):
    """Create a built-in module"""
    if spec.name not in sys.builtin_module_names:
        raise ImportError('{!r} is not a built-in module'.format(spec.name),
                          name=spec.name)
    return _call_with_frames_removed(_imp.create_builtin, spec)
which delegates to _imp.create_builtin, which searches PyImport_Inittab for the module name and runs the corresponding initialization function.
(_call_with_frames_removed(x, y) just calls x(y), but part of the import system treats it as a magic indicator to strip importlib frames from stack traces, which is why you never see those frames in the stack trace when your imports go wrong.)
If you want to see more of the code path involved, you can look through Lib/importlib/_bootstrap.py, where most of the import implementation lives, Python/import.c, where most of the C part of the implementation lives, and Python/ceval.c, which is where the bytecode interpreter loop lives, and thus is where execution of an import statement starts, before it reaches the more core parts of the import machinery.
Relevant documentation includes the section of the language reference on the import system, as well as PEPs 451 and 302. There isn't much documentation on built-in modules, although I did find a bit of documentation targeted toward people embedding Python in other programs, since they might want to modify PyImport_Inittab, and there is the sys.builtin_module_names list.

Constant integer Attributes in Python Module written in C

I implemented a python extension module in C according to https://docs.python.org/3.3/extending/extending.html
Now I want to have integer constants in that module, so I did:
module = PyModule_Create(&myModuleDef);
...
PyModule_AddIntConstant(module, "VAR1", 1);
PyModule_AddIntConstant(module, "VAR2", 2);
...
return module;
This works, but I can modify the "constants" from Python:
import myModule
myModule.VAR1 = 10
I tried to overload __setattr__, but this function is not called upon assignment.
Is there a solution?
You can't make module-level "constants" read-only in Python the way you can in C(++). The Python way is to expect everyone to behave like responsible adults: if something is in all caps with underscores (as PEP 8 dictates), you shouldn't change it.
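That said, if you ship a thin Python wrapper around your C module, CPython 3.5+ lets you replace the wrapper module's class with a ModuleType subclass whose __setattr__ rejects rebinding. A minimal sketch (the VAR1/VAR2 names mirror the question; everything else is illustrative):
import sys
from types import ModuleType

class _ConstModule(ModuleType):
    # Names that must not be rebound after initialization.
    _constants = frozenset({'VAR1', 'VAR2'})

    def __setattr__(self, name, value):
        if name in self._constants:
            raise AttributeError('cannot rebind constant %r' % name)
        super().__setattr__(name, value)

# Last line of the wrapper module: swap in the guarded class.
sys.modules[__name__].__class__ = _ConstModule
This only intercepts assignment through the module object (myModule.VAR1 = 10 now raises AttributeError); a determined user can still reach the module's __dict__ directly.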

Importing #defines, constants and typedefs from a DLL using ctypes

I have a DLL from a board I bought to do some stuff, and it defines some functions, constants and types. I have successfully imported it to Python using ctypes. However, from this import I do not have access to the defined constants. For instance, if I need to call a function:
myDLL = ctypes.cdll.LoadLibrary("path/to/dll/parrot.dll")
spam = myDLL.eggs(THIS_CONSTANT)  # THIS_CONSTANT is defined in the DLL
then I cannot do it.
Is there a way to have access to these constants?
#define constants are certainly not accessible from the DLL: their definitions are expanded by the preprocessor before the compiler even starts to work, so there is no way the DLL could remember the names under which they were defined.
You need to translate the header file into equivalent Python ctypes code. That can be done manually, or perhaps using a tool to automate some or all of the conversion.
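A minimal sketch of that manual translation, assuming a hypothetical parrot.h that defines THIS_CONSTANT as 42 and declares int eggs(int):
import ctypes

# Transcribed by hand from the vendor header (these values are assumptions):
THIS_CONSTANT = 42

parrot = ctypes.cdll.LoadLibrary("path/to/dll/parrot.dll")
# Declaring the prototype lets ctypes convert arguments and results correctly.
parrot.eggs.argtypes = [ctypes.c_int]
parrot.eggs.restype = ctypes.c_int

spam = parrot.eggs(THIS_CONSTANT)
Tools such as ctypesgen can generate this kind of wrapper from the header automatically.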

How to re import an updated package while in Python Interpreter? [duplicate]

This question already has answers here:
How do I unload (reload) a Python module?
I often test my module in the Python interpreter, and when I see an error, I quickly update the .py file. But how do I make the change show up in the interpreter? So far, I have been exiting and re-entering the interpreter, because re-importing the file doesn't work for me.
Update for Python 3 (quoted from the linked answer, since the last edit/comment here suggested a deprecated method):
In Python 3, reload was moved to the imp module. In 3.4, imp was deprecated in favor of importlib, and reload was added to the latter. When targeting 3 or later, either reference the appropriate module when calling reload or import it.
Takeaway:
Python 3 >= 3.4: importlib.reload(packagename)
Python 3 < 3.4: imp.reload(packagename)
Python 2: continue below
Use the reload builtin function:
https://docs.python.org/2/library/functions.html#reload
When reload(module) is executed:
Python modules’ code is recompiled and the module-level code reexecuted, defining a new set of objects which are bound to names in the module’s dictionary. The init function of extension modules is not called a second time.
As with all other objects in Python the old objects are only reclaimed after their reference counts drop to zero.
The names in the module namespace are updated to point to any new or changed objects.
Other references to the old objects (such as names external to the module) are not rebound to refer to the new objects and must be updated in each namespace where they occur if that is desired.
Example:
# Make a simple function that prints "version 1"
shell1$ echo 'def x(): print "version 1"' > mymodule.py
# Run the module
shell2$ python
>>> import mymodule
>>> mymodule.x()
version 1
# Change mymodule to print "version 2" (without exiting the python REPL)
shell2$ echo 'def x(): print "version 2"' > mymodule.py
# Back in that same python session
>>> reload(mymodule)
<module 'mymodule' from 'mymodule.pyc'>
>>> mymodule.x()
version 2
All the answers above that recommend reload() or imp.reload() describe deprecated approaches.
reload() is no longer a built-in function in Python 3, and imp.reload() is marked deprecated (see help(imp)).
It's better to use importlib.reload() instead.
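For example:
>>> import importlib
>>> import mymodule
>>> importlib.reload(mymodule)
<module 'mymodule' from 'mymodule.py'>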
So far, I have been exiting and re-entering the interpreter because re-importing the file doesn't work for me.
Yes, just saying import again gives you the existing copy of the module from sys.modules.
You can say reload(module) to update sys.modules and get a new copy of that single module, but if any other modules have a reference to the original module or any object from the original module, they will keep their old references and Very Confusing Things will happen.
So if you've got a module a, which depends on module b, and b changes, you have to reload b followed by reload a. If you've got two modules which depend on each other, which is extremely common when those modules are part of the same package, you can't reload them both: if you reload p.a it'll get a reference to the old p.b, and vice versa. The only way to do it is to unload them both at once by deleting their items from sys.modules, before importing them again. This is icky and has some practical pitfalls to do with sys.modules entries being None as a failed-relative-import marker.
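A sketch of that unload-both-at-once approach, using the hypothetical package p with interdependent submodules p.a and p.b from above:
import sys

# Drop the cached entries so the next import starts from scratch.
for name in ('p.a', 'p.b', 'p'):
    sys.modules.pop(name, None)

import p.a
import p.b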
And if you've got a module which passes references to its objects to system modules — for example it registers a codec, or adds a warnings handler — you're stuck; you can't reload the system module without confusing the rest of the Python environment.
In summary: for all but the simplest case of one self-contained module being loaded by one standalone script, reload() is very tricky to get right; if, as you imply, you are using a ‘package’, you will probably be better off continuing to cycle the interpreter.
In Python 3, the behaviour changes.
>>> import my_stuff
... do something with my_stuff, then later:
>>> import imp
>>> imp.reload(my_stuff)
and you get a brand new, reloaded my_stuff.
No matter how many times you import a module, you'll get the same copy of the module from sys.modules, the one loaded by the first import mymodule.
I am answering this late, as each of the previous answers has a piece of the answer, so I am attempting to sum it all up in one place.
Using the built-in/standard functions:
For Python 2.x, use the built-in reload(mymodule) function.
For Python 3.x below 3.4, use imp.reload(mymodule).
For Python 3.4 and later, imp has been deprecated in favor of importlib; use importlib.reload(mymodule).
A few caveats:
It is generally not very useful to reload built-in or dynamically loaded modules. Reloading sys, __main__, builtins and other key modules is not recommended.
In many cases extension modules are not designed to be initialized more than once, and may fail in arbitrary ways when reloaded.
If a module imports objects from another module using from ... import ..., calling reload() for the other module does not redefine the objects imported from it (illustrated below). One way around this is to re-execute the from statement; another is to use import and qualified names (module.name) instead.
If a module instantiates instances of a class, reloading the module that defines the class does not affect the method definitions of the instances: they continue to use the old class definition. The same is true for derived classes.
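To illustrate the from ... import ... caveat, reusing the mymodule example from earlier answers (Python 3 session):
>>> from mymodule import x
>>> x()
version 1
# Edit mymodule.py so that x prints "version 2", then:
>>> import importlib, mymodule
>>> importlib.reload(mymodule)
<module 'mymodule' from 'mymodule.py'>
>>> x()
version 1
>>> from mymodule import x
>>> x()
version 2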
External packages:
reimport - Reimport currently supports Python 2.4 through 2.7.
xreload - This works by executing the module in a scratch namespace, and then patching classes, methods and functions in place. This avoids the need to patch instances. New objects are copied into the target namespace.
livecoding - Code reloading allows a running application to change its behaviour in response to changes in the Python scripts it uses. When the library detects a Python script has been modified, it reloads that script and replaces the objects it had previously made available for use with newly reloaded versions. As a tool, it allows a programmer to avoid interruption to their workflow and a corresponding loss of focus. It enables them to remain in a state of flow. Where previously they might have needed to restart the application in order to put changed code into effect, those changes can be applied immediately.
Short answer:
try using reimport: a full featured reload for Python.
Longer answer:
It looks like this question was asked/answered prior to the release of reimport, which bills itself as a "full featured reload for Python":
This module intends to be a full featured replacement for Python's reload function. It is targeted towards making a reload that works for Python plugins and extensions used by longer running applications.
Reimport currently supports Python 2.4 through 2.6.
By its very nature, this is not a completely solvable problem. The goal of this module is to make the most common sorts of updates work well. It also allows individual modules and package to assist in the process. A more detailed description of what happens is on the overview page.
Note: Although reimport explicitly supports Python 2.4 through 2.6, I've been trying it on 2.7 and it seems to work just fine.
Basically, reload as in allyourcode's answer, but it won't change the underlying code of already-instantiated objects or referenced functions. Extending from his answer:
# Make a simple function that prints "version 1"
shell1$ echo 'def x(): print "version 1"' > mymodule.py
# Run the module
shell2$ python
>>> import mymodule
>>> mymodule.x()
version 1
>>> x = mymodule.x
>>> x()
version 1
>>> x is mymodule.x
True
# Change mymodule to print "version 2" (without exiting the python REPL)
shell2$ echo 'def x(): print "version 2"' > mymodule.py
# Back in that same python session
>>> reload(mymodule)
<module 'mymodule' from 'mymodule.pyc'>
>>> mymodule.x()
version 2
>>> x()
version 1
>>> x is mymodule.x
False
Not sure if this does everything expected, but you can do it just like that:
>>> del mymodule
>>> import mymodule
import sys
del sys.modules['module_name']
See here for a good explanation of how your dependent modules won't be reloaded and the effects that can have:
http://pyunit.sourceforge.net/notes/reloading.html
The way PyUnit solved it was to track dependent modules by overriding __import__, then delete each of them from sys.modules and re-import. They probably could have just reload'ed them, though.
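A rough sketch of that dependency-tracking idea in Python 3 (names are illustrative, not PyUnit's actual code):
import builtins
import sys

_tracked = set()
_real_import = builtins.__import__

def _tracking_import(name, *args, **kwargs):
    # Record every module imported while the hook is installed.
    _tracked.add(name)
    return _real_import(name, *args, **kwargs)

builtins.__import__ = _tracking_import
import mymodule            # _tracked now holds mymodule and its imports
builtins.__import__ = _real_import

# Later: purge everything that was recorded, then import again.
for name in _tracked:
    sys.modules.pop(name, None)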
dragonfly's answer worked for me (Python 3.4.3).
import sys
del sys.modules['module_name']
Here is a lower-level solution:
exec(open("MyClass.py").read(), globals())
