How can I overload a built-in module in Python?

I am trying to bind hosts to specified IPs in my Python program. I only want this to take effect inside the Python program, so I am not going to modify the /etc/hosts file.
I tried adding a bit of code to the create_connection function in socket.py for the host-ip translation, like this:
host, port = address  # the original code in socket.py
# My change here:
if host == "www.google.com":
    host = target_ip
for res in getaddrinfo(host, port, 0, SOCK_STREAM):  # the original code in socket.py
I found this works fine.
Now I want the host-ip translation to apply only within this Python program.
So my question is: how can I make my Python program import this socket.py instead of the built-in one when it runs import socket?
To make it clear, here is an example. Suppose 'test' is my work directory:
test
|--- main.py
|--- socket.py
In this case:
How can I make main.py use test/socket.py with import socket?
How can I make other modules use test/socket.py when they use import socket?
I think changing the module search-path order may help. But I found that even when the current path ('') is already first in sys.path, import socket still imports the built-in socket module.

You can monkey-patch sys.modules, placing your own module instead of the standard socket, before importing any other module which might be using it.
# myscript.py
from myproject import mysocket
import sys
sys.modules['socket'] = mysocket
# ... the rest of your code
import requests
...
For that, mysocket should expose everything which the standard socket does.
# mysocket.py
import socket as _std_socket
from socket import *  # expose everything

def create_connection(address, *args, **kwargs):
    if address == ...:
        address = ...
    return _std_socket.create_connection(address, *args, **kwargs)
This might be an over-simplification of what mysocket.py should look like. You'd likely need to add some definitions before this can be used in production, but you get the idea.
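For instance, a slightly fleshed-out sketch of the override; the host-to-IP mapping below is a made-up example, not something from the question:
# mysocket.py -- concrete sketch; the redirect mapping is a made-up example
import socket as _std_socket
from socket import *  # re-export the standard socket names

_REDIRECTS = {"www.google.com": "203.0.113.10"}  # hypothetical host -> IP table

def create_connection(address, *args, **kwargs):
    host, port = address
    host = _REDIRECTS.get(host, host)  # swap in the mapped IP, if any
    return _std_socket.create_connection((host, port), *args, **kwargs)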
Another approach would be to monkey-patch the socket module itself, i.e. overwrite names inside the original module.
# myscript.py
import socket

def create_connection2(address, *args, **kwargs):
    ...  # redirect logic goes here

socket.create_connection = create_connection2
# ... the rest of your code
import requests
...
I prefer the former approach, because it is cleaner: you don't need to reach inside the module; you just hide it and override parts of it from the outside.

You can use relative imports to locally use a socket.py module. However, to do this your project must be structured as a package.
from . import socket
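For instance, a minimal sketch, assuming test/ is turned into a package by adding an empty __init__.py and main.py is run as a module:
# test/main.py -- run from the parent directory with: python -m test.main
from . import socket  # resolves to test/socket.py, not the stdlib module

# assumes your modified test/socket.py still defines create_connection
conn = socket.create_connection(("www.google.com", 80))
Note this only changes what the importing module itself sees; other modules that plainly do import socket still get the standard library version, which is what the sys.modules approach in the other answer addresses.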

Related

splitting python code into separate files

I am trying to split common Python code into separate files.
For example, I have svr.py with the following code:
import socket

PORT = 6060
SERVER = socket.gethostbyname(socket.gethostname())
ADDRESS = (SERVER, PORT)

__server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
__server.bind(ADDRESS)

def startServer():
    pass

startServer()
So I am thinking of splitting it into two Python files, since the common section (bcode.py) will be used in both svr.py and client.py.
File bcode.py has the following code:
import socket
PORT = 6060
SERVER = socket.gethostbyname(socket.gethostname())
ADDRESS = (SERVER, PORT)
File svr.py has the following code:
import socket
import bcode

__server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
__server.bind(ADDRESS)

def startServer():
    pass

startServer()
From my understanding, when I do import bcode, the Python interpreter executes bcode.py, so the constants PORT, SERVER and ADDRESS should be in memory. But when I run svr.py, I get the following error message:
Traceback (most recent call last):
  File "C:\temp\PythonProject\svr.py", line 9, in <module>
    __server.bind(ADDRESS)
NameError: name 'ADDRESS' is not defined

Process finished with exit code 1
It seems the ADDRESS constant is not available in svr.py even after bcode is imported. Also, I need to add import socket in svr.py; I thought that since I already import socket in bcode.py, the import would be carried into svr.py when it imports bcode.
I would appreciate it if you could help me with the best way to split common code in Python.
First off, I think it is a very good idea to split your code into modules. It will help you keep your code clean and tidy!
Then, when you import a module in Python, you decide what namespace its content is included under. In your example, you've used the standard way of importing, namely import bcode. With this approach, all content of bcode lives under the bcode namespace and must be referenced as such:
import bcode
print(bcode.ADDRESS)
This is also the approach that I recommend, as it keeps your namespaces clean and tidy when your files grow in number and in terms of code lines. This way, there is never any doubt of which ADDRESS is being used.
However, there are other ways to import modules, e.g. by explicitly importing the variable of choice with from bcode import ADDRESS. But then, imagine doing this:
ADDRESS = "127.0.0.1"
from bcode import ADDRESS
print(ADDRESS) # whatever was in bcode ..
This may be fine for now, but someone else reading your code may overlook the fact that you rebound the variable, and lose track of which name is which and where each value originally came from.
Yet another approach lets you import all content of a module into the local namespace by using *. This may be acceptable for small scripts; however, you'll probably make things really cumbersome for your future self (or colleagues), as you'll definitely lose control over your names (that is, variables, functions, classes, and so on) in the long run:
ADDRESS = "127.0.0.1"
from bcode import *
print(ADDRESS) # whatever was in bcode ..
print(PORT) # whatever was in bcode ..
I strongly recommend that you stick to the first approach (as you already have), and remember to reference the variables appropriately.
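Applied to your files, a minimal sketch of svr.py under this first approach could look like:
# svr.py -- a sketch keeping your names as-is
import socket  # each module imports what it uses itself
import bcode   # PORT, SERVER and ADDRESS now live under the bcode namespace

__server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
__server.bind(bcode.ADDRESS)  # reference the constant through its module

def startServer():
    pass

startServer()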
As a final note, you should also be aware of the possibility of renaming namespaces/modules on import. I don't really recommend this either, but it may come in handy, e.g. for shortening long module names. Some heavily used third-party modules have commonly used abbreviations, e.g. the numpy module is often referenced as just np:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0,5,100)
y = x**2
plt.plot(x,y)

Object not working properly when called from child module

Hello generous SO'ers,
This is a somewhat complicated question, but hopefully relevant to the more general use of global objects from a child-module.
I am using some commercial software that provides a Python library for interfacing with their application through TCP. (I don't think I can post the code for their library.)
I am having an issue with calling an object from a child module that I think is more generally related to global variables or some such. Basically, the object's state is as expected when the child module is in the same directory as all the other modules (including the module that creates the object).
But when I move the offending child module into a subfolder, it can still access the object but the state appears to have been altered, and the object's connection to the commercial app doesn't work anymore.
Following some advice from this question on global vars, I have organized my module's files as so:
scriptfile.py
pyFIMM/
__init__.py # imports all the other files
__globals.py # creates the connection object used in most other modules
__pyfimm.py # main module functions, such as pyFIMM.connect()
__Waveguide.py # there are many of these files with various classes and functions
(...other files...)
PhotonDesignLib/
__init__.py # imports all files in this folder
pdPythonLib.py # the commercial library
proprietary/
__init__.py # imports all files in this folder
University.py # <-- The offending child-module with issues
pyFIMM/__init__.py imports the sub-files like so:
from __globals import * # import global vars & create FimmWave connection object `fimm`
from __pyfimm import * # import the main module
from __Waveguide import *
(...import the other files...)
from proprietary import * # imports the subfolder containing `University.py`
The __init__.py's in the subfolders "PhotonDesignLib" & "proprietary" both cause all files in the subfolders to imported, so, for example, scriptfile.py would access my proprietary files as so: import pyFIMM.proprietary.University. This is accomplished via this hint, coded as follows in proprietary/__init__.py:
import os, glob
__all__ = [ os.path.basename(f)[:-3] for f in glob.glob(os.path.dirname(__file__)+"/*.py")]
(Numerous coders from a few different institutions will have their own proprietary code, so we can share the base code but keep our proprietary files/functions to ourselves this way, without having to change any base code/import statements. I now realize that, for the more static PhotonDesignLib folder, this is overkill.)
The file __globals.py creates the object I need to use to communicate with their commercial app, with this code (this is all the code in this file):
import PhotonDesignLib.pdPythonLib as pd # the commercial lib/object
global fimm
fimm = pd.pdApp()  # <-- this is the offending global object
All of my sub-modules contain a from __globals import * statement, and are able to access the object fimm without specifically declaring it as a global var, without any issue.
So I run scriptfile.py, which has an import statement like from pyFIMM import *.
Most importantly, scriptfile.py initiates the TCP connection to the application via fimm.connect() right at the top, before issuing any commands that require the communication, and all the other modules call fimm.Exec(<commands for app>) in various routines. This has been working swimmingly well - the fimm object has so far been accessible to all modules, and keeps its connection state without issue.
The issue I am running into is that the file proprietary/University.py can only successfully use the fimm object when it's placed in the pyFIMM root-level directory (i.e. the same folder as __globals.py etc.). But when University.py is imported from within the proprietary sub-folder, it gives me an "application not initialized" error when I use the fimm object, as if the object had been overwritten or re-initialized or something. The object still exists; it just isn't maintaining its connection state when called by this sub-module. (I've checked that it's not re-initialized in another module.)
If, after the script fails in proprietary/University.py, I use the console to send a command, e.g. pyFimm.fimm.Exec(<command to app>), it communicates just fine!
I set proprietary/University.py to print a dir(fimm) as a test right at the beginning, which works fine and looks like the fimm object exists as expected, but a subsequent call in the same file to fimm.Exec() indicates that the object's state is not correct, returning the "application not initialized" error.
This almost looks like there are two fimm objects - one that the main python console (and pyFIMM modules) see, which works great, and another that proprietary/University.py sees which doesn't know that we called fimm.connect() already. Again, if I put University.py in the main module folder "pyFIMM" it works fine - the fimm.Exec() calls operate as expected!
FYI proprietary/University.py imports the __globals.py file as so:
import sys, os, inspect
ScriptDir = inspect.currentframe().f_code.co_filename # get path to this module file
(ParentDir , tail) = os.path.split(ScriptDir) # split off top-level directory from path
(ParentDir , tail) = os.path.split(ParentDir) # split off top-level directory from path
sys.path.append(ParentDir) # add ParentDir to the python search path
from __globals import * # import global vars & FimmWave connection object
global fimm # This line makes no difference, was just trying it.
(FYI, Somewhere on SO it was stated that inspect was better than __file__, hence the code above.)
Why do you think having the sub-module in a sub-folder causes the object to lose its state?
I suspect the issue is either the way I instruct University.py to import __globals.py or the "import all files in this folder" method I used in proprietary/__init__.py. But I have little idea how to fix it!
Thank you for looking at this question, and thanks in advance for your insightful comments.

Make an object behave like a module - Python 3

I am wondering if there is a way to have a Python variable behave like a Python module.
The problem I currently have is that we have Python bindings for our API. The bindings are automatically generated through SWIG, and to use them someone only needs to:
import module_name as short_name
short_name.functions()
Right now we are studying having the API to use Apache Thrift. To use it someone needs to:
client, transport = thrift_connect()
client.functions()
...
transport.close()
Problem is that we have loads of scripts and we were wondering if there is a way to have the thrift client object to behave like a module so that we don't need to modify all scripts. One idea we had was to do something like this:
client, transport = thrift_connect()
global short_name
short_name = client
__builtins__.short_name = client
This 'sort of' works. It creates a global variable short_name that acts like a module, but it also creates other problems. If other files import the same module, those imports have to be commented out. Also, having a global variable is not a bright idea for maintenance purposes.
So, is there a way to make the thrift client behave like a module? That way people could continue to use the 'old' syntax, but under the hood the module import would trigger a connection and return the object as the module.
EDIT 1:
It is fine for every import to open a connection. Maybe we could use some kind of singleton so that a specific interpreter can only open one connection even if it calls multiple imports on different files.
I thought about binding the transport.close() to a object termination. Could be the module itself, if that is possible.
EDIT 2:
This seems to do what I want:
client, transport = thrift_connect()
attributes = dict((name, getattr(client, name)) for name in dir(client) if not (name.startswith('__') or name.startswith('_')))
globals().update(attributes)
Importing a module shouldn't cause a network connection.
If you have mandatory setup/teardown steps then you could define a context manager:
from contextlib import contextmanager

@contextmanager
def thrift_client():
    client, transport = thrift_connect()
    try:
        yield client
    finally:
        transport.close()
Usage:
with thrift_client() as client:
    ...  # use client here
In general, the auto-generated module with a C-like API should be private, e.g. named _thrift_client, and the proper Pythonic API that is used outside should be hand-written on top of it in another module.
To answer the question from the title: you can make an object behave like a module; e.g., see sh.SelfWrapper and quickdraw.Module.
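As a rough, untested sketch of that technique (thrift_connect here stands for the helper from the question, and the import path for it is hypothetical):
# short_name.py -- a sketch of a module that behaves like the client object
import sys

from yourpackage.connection import thrift_connect  # hypothetical: the question's helper

class _ClientModule(object):
    def __init__(self):
        self._client = None
        self._transport = None

    def __getattr__(self, name):
        # connect lazily, the first time an attribute is looked up
        if self._client is None:
            self._client, self._transport = thrift_connect()
        return getattr(self._client, name)

# replace this module in sys.modules with the proxy instance
sys.modules[__name__] = _ClientModule()
Old scripts can then keep writing import short_name; short_name.functions() unchanged, and the connection is opened once, lazily, on first attribute access.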

Relative import creates different class object?

In a Django project, I have a directory structure that looks something like this:
project/
├╴module_name/
│ ├╴dbrouters.py
│ ...
...
In dbrouters.py, I define a class that starts out like this:
class CustomDBRouter(object):
    current_connection = 'default'
    ...
The idea is to have a database router that sets the connection to use at the start of each request and then uses that connection for all subsequent database queries similarly to what is described in the Django docs for automatic database routing.
Everything works great, except that when I want to import CustomDBRouter in a script, I have to use the absolute path, or else something weird happens.
Let's say in one part of the application, CustomDBRouter.current_connection is changed:
import project.module_name.dbrouters.CustomDBRouter
...
CustomDBRouter.current_connection = 'alternate'
In another part of the application (assume that it is executed after the above code), I use a relative import instead:
import .dbrouters.CustomDBRouter
...
print CustomDBRouter.current_connection # Outputs 'default', not 'alternate'!
I'm confused as to why this is happening. Is Python creating a new class object for CustomDBRouter because I'm using a different import path?
Bonus points: Is there a better way to implement a 'global' class property?
It depends on how the script is being executed. When you're using relative imports, you have to make sure the module the import appears in has a __name__ other than __main__. If its __name__ is __main__, import .dbrouters.CustomDBRouter effectively becomes import __main__.dbrouters.CustomDBRouter.
I found this here.
It turns out, the problem was being caused by a few lines in another file:
PROJECT_ROOT = '/path/to/project'
sys.path.insert(0, '%s' % PROJECT_ROOT)
sys.path.insert(1, '%s/module_name' % PROJECT_ROOT)
The files that were referencing .dbrouters were imported using the "shortcut" path (e.g., import views instead of import module_name.views).
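In other words, that sys.path setup lets the same file be imported under two different module names, and Python then executes it twice, producing two distinct class objects. A minimal sketch of the effect (paths are placeholders):
import sys
sys.path.insert(0, '/path/to/project')
sys.path.insert(1, '/path/to/project/module_name')

import module_name.dbrouters as a  # loaded once as 'module_name.dbrouters'
import dbrouters as b              # same file, loaded again as 'dbrouters'

print a is b                                # False: two distinct module objects
print a.CustomDBRouter is b.CustomDBRouter  # False: two distinct class objects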

Python: intercept a class loading action

Summary: when a certain python module is imported, I want to be able to intercept this action, and instead of loading the required class, I want to load another class of my choice.
Reason: I am working on some legacy code. I need to write some unit test code before I start some enhancement/refactoring. The code imports a certain module which will fail in a unit test setting, however. (Because of database server dependency)
Pseudo code:
from LegacyDataLoader import load_me_data
...
def do_something():
    data = load_me_data()
So, ideally, when Python executes the import line above in a unit test, an alternative class, say MockDataLoader, is loaded instead.
I am still using 2.4.3. I suppose there is an import hook I can manipulate?
Edit
Thanks a lot for the answers so far. They are all very helpful.
One particular type of suggestion is about manipulating PYTHONPATH. It does not work in my case, so I will elaborate on my particular situation here.
The original codebase is organised in this way
./dir1/myapp/database/LegacyDataLoader.py
./dir1/myapp/database/Other.py
./dir1/myapp/database/__init__.py
./dir1/myapp/__init__.py
My goal is to enhance the Other class in the Other module. But since it is legacy code, I do not feel comfortable working on it without strapping a test suite around it first.
Now I introduce this unit test code
./unit_test/test.py
The content is simply:
from myapp.database.Other import Other

def test1():
    o = Other()
    o.do_something()

if __name__ == "__main__":
    test1()
When the CI server runs the above test, the test fails. This is because class Other uses LegacyDataLoader, and LegacyDataLoader cannot establish a database connection to the db server from the CI box.
Now let's add a fake class as suggested:
./unit_test_fake/myapp/database/LegacyDataLoader.py
./unit_test_fake/myapp/database/__init__.py
./unit_test_fake/myapp/__init__.py
Modify the PYTHONPATH to
export PYTHONPATH=unit_test_fake:dir1:unit_test
Now the test fails for another reason
File "unit_test/test.py", line 1, in <module>
from myapp.database.Other import Other
ImportError: No module named Other
It has something to do with the way Python resolves classes/attributes in a module.
You can intercept import and from ... import statements by defining your own __import__ function and assigning it to __builtin__.__import__. Make sure to save the previous value, since your override will no doubt want to delegate to it; you'll also need to import __builtin__ to get the builtin-objects module.
For example (Py2.4 specific, since that's what you're asking about), save in aim.py the following:
import __builtin__

realimp = __builtin__.__import__

def my_import(name, globals={}, locals={}, fromlist=[]):
    print 'importing', name, fromlist
    return realimp(name, globals, locals, fromlist)

__builtin__.__import__ = my_import

from os import path
and now:
$ python2.4 aim.py
importing os ('path',)
So this lets you intercept any specific import request you want, and alter the imported module[s] as you wish before you return them -- see the specs here. This is the kind of "hook" you're looking for, right?
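Applied to your case, the same hook can swap in the mock for just the one module; a sketch, assuming MockDataLoader defines a compatible load_me_data:
import __builtin__

realimp = __builtin__.__import__

def my_import(name, globals={}, locals={}, fromlist=[]):
    if name == 'LegacyDataLoader':
        import MockDataLoader  # must define a compatible load_me_data
        return MockDataLoader
    return realimp(name, globals, locals, fromlist)

__builtin__.__import__ = my_import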
There are cleaner ways to do this, but I'll assume that you can't modify the file containing from LegacyDataLoader import load_me_data.
The simplest thing to do is probably to create a new directory called testing_shims, and create a LegacyDataLoader.py file in it. In that file, define whatever fake load_me_data you like. When running the unit tests, put testing_shims into your PYTHONPATH environment variable as the first directory. Alternately, you can modify your test runner to insert testing_shims as the first value in sys.path.
This way, your file will be found when importing LegacyDataLoader, and your code will be loaded instead of the real code.
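The shim itself can be minimal; the return value here is entirely made up:
# testing_shims/LegacyDataLoader.py
def load_me_data():
    # canned stand-in for the real database call
    return []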
The import statement just grabs stuff from sys.modules if a matching name is found there, so the simplest thing is to make sure you insert your own module into sys.modules under the target name before anything else tries to import the real thing.
# in test code
import sys
import MockDataLoader
sys.modules['LegacyDataLoader'] = MockDataLoader
import module_under_test
There are a handful of variations on the theme, but that basic approach should work fine to do what you describe in the question. A slightly simpler approach would be this, using just a mock function to replace the one in question:
# in test code
import module_under_test

def mock_load_me_data():
    # do mock stuff here
    pass

module_under_test.load_me_data = mock_load_me_data
That simply replaces the appropriate name right in the module itself, so when you invoke the code under test, presumably do_something() in your question, it calls your mock routine.
Well, if the import fails by raising an exception, you could put it in a try...except block:
try:
    from LegacyDataLoader import load_me_data
except:  # put the error that occurs here, so as not to mask actual problems
    from MockDataLoader import load_me_data
Is that what you're looking for? If it fails but doesn't raise an exception, you could have it run the unit test when a special command-line flag, like --unittest, is present:
import sys

if "--unittest" in sys.argv:
    from MockDataLoader import load_me_data
else:
    from LegacyDataLoader import load_me_data
