I am instantiating the object below every time I call csv in my function. I was just wondering if there's any way I could instantiate the object just once?
I tried to split the return csv from def csv() to another function but failed.
Code instantiating the object
def csv():
    proj = Project.Project(db_name='test', json_file="/home/qingyong/workspace/Project/src/json_files/sys_setup.json")#, _id='poc_1'
    csv = CSVDatasource(proj, "/home/qingyong/workspace/Project/src/json_files/data_setup.json")
    return csv
Test function
def test_df(csv, df):
    ..............
Is your csv function actually a pytest.fixture? If so, you can change its scope to session so it will only be called once per py.test session.
@pytest.fixture(scope="session")
def csv():
    # rest of code
Of course, the returned data should be immutable so tests can't affect each other.
You can use a global variable to cache the object:
_csv = None
def csv():
    global _csv
    if _csv is None:
        proj = Project.Project(db_name='test', json_file="/home/qingyong/workspace/Project/src/json_files/sys_setup.json")#, _id='poc_1'
        _csv = CSVDatasource(proj, "/home/qingyong/workspace/Project/src/json_files/data_setup.json")
    return _csv
Another option is to change the caller to cache the result of csv() in a manner similar to the snippet above.
Note that your "code to call the function" doesn't call the function, it only declares another function that apparently receives the csv function's return value. You didn't show the call that actually calls the function.
You can use a decorator for this if CSVDatasource doesn't have side effects like reading the input line by line.
See Efficient way of having a function only execute once in a loop
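As a minimal sketch of the decorator approach, the standard library's functools.lru_cache can memoize a zero-argument function so its body runs only once. ExpensiveDatasource below is a hypothetical stand-in for CSVDatasource, used only so the example runs on its own.

```python
import functools


class ExpensiveDatasource:
    """Hypothetical stand-in for CSVDatasource; counts how often it is built."""
    instances = 0

    def __init__(self):
        ExpensiveDatasource.instances += 1


@functools.lru_cache(maxsize=None)
def csv():
    # The body runs on the first call only; later calls return the cached object.
    return ExpensiveDatasource()


first = csv()
second = csv()
```

Both calls return the same object, so any test that mutates it can affect the others.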
You can store the object as an attribute on the function itself, returning that object if it exists and creating a new one if it doesn't.
def csv():
    if not hasattr(csv, 'obj'):
        proj = Project.Project(db_name='test', json_file="/home/qingyong/workspace/Project/src/json_files/sys_setup.json")#, _id='poc_1'
        csv.obj = CSVDatasource(proj, "/home/qingyong/workspace/Project/src/json_files/data_setup.json")
    return csv.obj
I am trying to create a dynamic method executor, where I have a list that will always contain two elements. The first element is the name of the file, the second element is the name of the method to execute.
How can I achieve this?
My code below unfortunately doesn't work, but it will give you a good indication of what I am trying to achieve.
from logic.intents import CenterCapacity
def method_executor(event):
    call_reference = ['CenterCapacity', 'get_capacity']
    # process method call
    return call_reference[0].call_reference[1]
Thanks!
You can use __import__ to look up the module by name and then use getattr to find the method. For example, if the following code is in a file called exec.py, then
def dummy(): print("dummy")
def lookup(mod, func):
    module = __import__(mod)
    return getattr(module, func)
if __name__ == "__main__":
    lookup("exec","dummy")()
will output
dummy
Addendum
Alternatively importlib.import_module can be used, which although a bit more verbose, may be easier to use.
The most important difference between these two functions is that import_module() returns the specified package or module (e.g. pkg.mod), while __import__() returns the top-level package or module (e.g. pkg).
def lookup(mod, func):
    import importlib
    module = importlib.import_module(mod)
    return getattr(module, func)
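The difference between the two functions shows up with a dotted name. The sketch below uses the standard library's json.decoder purely as an illustrative target; it is not part of the original question.

```python
import importlib


def lookup(mod, func):
    # import_module returns the module named by the full dotted path.
    module = importlib.import_module(mod)
    return getattr(module, func)


top_level = __import__("json.decoder")               # returns the top-level package (json)
submodule = importlib.import_module("json.decoder")  # returns json.decoder itself
decoder_class = lookup("json.decoder", "JSONDecoder")
```

With __import__ you would still have to walk from json down to json.decoder yourself, which is why import_module is usually the more convenient choice for dotted names.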
Starting from:
from logic.intents import CenterCapacity
def method_executor(event):
    call_reference = ['CenterCapacity', 'get_capacity']
    # process method call
    return call_reference[0].call_reference[1]
Option 1
We have several options. The first one uses a class reference and getattr. For this we have to remove the quotes around the class name and instantiate the class before looking up the method (you do not have to instantiate the class when the method is a staticmethod).
def method_executor(event):
    call_reference = [CenterCapacity, 'get_capacity'] # We now store a class reference
    # process method call
    return getattr(call_reference[0](), call_reference[1])
Option 2
A second option is based on this answer. It revolves around using getattr twice. We first get the current module using sys.modules[__name__] and then get the class from it using getattr.
import sys
def method_executor(event):
    call_reference = ['CenterCapacity', 'get_capacity']
    class_ref = getattr(sys.modules[__name__], call_reference[0])
    return getattr(class_ref, call_reference[1])
Option 3
A third option could be based on a full import path and use __import__('module.class'), take a look at this SO post.
(Note: This answer assumes that the necessary imports have already happened, and you just need a mechanism to invoke the functions of the imported modules. If you also want the import to be done by program code, I will have to add that part, using the importlib library.)
You can do this:
globals()[call_reference[0]].__dict__[call_reference[1]]()
Explanation:
globals() returns a mapping between global variable names and their referenced objects. The imported module's name counts as one of these global variables of the current module.
Indexing this mapping object with call_reference[0] returns the module object containing the function to be called.
The module object's __dict__ maps each attribute-name of the module to the object referenced by that attribute. Functions defined in the module also count as attributes of the module.
Thus, indexing __dict__ with the function name call_reference[1] returns the function object.
I have a function foo that calls another function get_info_from_tags.
Here's the get_info_from_tags implementation:
def get_info_from_tags(*args):
    instance_id = get_instance_id()
    proc = subprocess.Popen(["aws", "ec2", "describe-tags", "--filters", f"Name=resource-id,Values={instance_id}"],
                            stdout=subprocess.PIPE, shell=True)
    (out, err) = proc.communicate()
    em_json = json.loads(out.decode('utf8'))
    tags = em_json['Tags'] # tags list
    results = []
    for arg in args:
        for tag in tags:
            if tag['Key'] == arg:
                results.append(tag['Value'])
    return results
There is a set of 10 possible args that can be passed to get_info_from_tags, and I need to return the correct array (I don't want to make a call to aws services, that's the point of my mock, I will manually set the values in a dictionary).
How can I mock get_info_from_tags so that when I call
get_info_from_tags('key1', 'key2' ...)
inside the foo function, I get the results I want?
I've already tried some functions of pytest but, as it seems, I didn't quite understand.
A possible solution would be to create another function:
def mocked_get_info_from_tags(*args):
    values = []
    for arg in args:
        values.append(my_dictionary[arg])
    return values
But I don't know how to implement this override within a test environment.
Thank you.
unittest.mock.patch is your friend.
You didn't specify the module names, so I put some <placeholders> there.
from unittest.mock import patch
from <module_with_foo> import foo
from <module_with_mock> import mocked_get_info_from_tags
with patch('<module_with_foo>.get_info_from_tags', mocked_get_info_from_tags):
    foo()
This will replace get_info_from_tags with your mocked version of this function. The replacement is done on a module level, so everything in module <module_with_foo> that calls get_info_from_tags will now call your mock.
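A runnable sketch of this replacement follows. The module name tags_app and its source are hypothetical stand-ins for <module_with_foo>, built in memory so the example needs no extra files; the dictionary values are made up for illustration.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical module standing in for <module_with_foo> (an assumption, not the asker's code).
SOURCE = '''
def get_info_from_tags(*args):
    raise RuntimeError("would call AWS")

def foo():
    return get_info_from_tags("key1", "key2")
'''
app = types.ModuleType("tags_app")
sys.modules["tags_app"] = app
exec(SOURCE, app.__dict__)

my_dictionary = {"key1": "value1", "key2": "value2"}


def mocked_get_info_from_tags(*args):
    # Look each requested key up in the hand-written dictionary instead of AWS.
    return [my_dictionary[arg] for arg in args]


# side_effect routes every call on the mock through the dictionary-backed function.
with patch("tags_app.get_info_from_tags", side_effect=mocked_get_info_from_tags) as fake:
    result = app.foo()

# The mock also records how foo called it.
fake.assert_called_once_with("key1", "key2")
```

Passing the function as side_effect rather than as the replacement object keeps a Mock in place, so the test can check the call arguments as well as the return value.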
Note about the path given to patch:
patch replaces values of module attributes. So, if you have a module moo with a function foo, which calls bar from module moo2:
# moo module
from moo2 import bar
def foo():
    bar()
...from the point of view of patch, moo.foo calls moo.bar, not moo2.bar. That's why you have to patch the module where the function is used, not where it is defined.
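The effect can be checked with a small sketch. The moo and moo2 modules are created in memory here through a hypothetical make_module helper so the example is self-contained, and bar returns a string only to make the result observable.

```python
import sys
import types
from unittest.mock import patch


def make_module(name, source):
    # Hypothetical helper: registers an in-memory module so patch() can find it by name.
    module = types.ModuleType(name)
    sys.modules[name] = module
    exec(source, module.__dict__)
    return module


moo2 = make_module("moo2", "def bar():\n    return 'real'\n")
moo = make_module("moo", "from moo2 import bar\n\ndef foo():\n    return bar()\n")

# Patching where bar is defined leaves moo's own reference untouched.
with patch("moo2.bar", lambda: "fake"):
    patched_at_definition = moo.foo()

# Patching where bar is used replaces the name that foo actually looks up.
with patch("moo.bar", lambda: "fake"):
    patched_at_use = moo.foo()
```

Only the second patch changes what foo returns, because `from moo2 import bar` copied the reference into moo's namespace at import time.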
You could call one function from the other and replace it once you have the db set up. That way you can still call foo.get_info_from_db('key1', 'key2' ...) everywhere in your code, and when you add the proper database connection, all you have to change is the main get_info_from_db implementation and remove the mock.
import db_connector
def mocked_get_info_from_db(*args):
    values = []
    for arg in args:
        values.append(my_dictionary[arg])
    return values
def get_info_from_db(*args):
    # remove this once your database is setup
    return mocked_get_info_from_db(*args)
    # values = []
    # for arg in args:
    #     values.append(db_connector.get(arg))
    # return values
I'm writing a wrapper or pipeline to create a tfrecords dataset to which I would like to supply a function to apply to the dataset.
I would like to make it possible for the user to inject a function defined in another python file which is called in my script to transform the data.
Why? The only thing the user has to do is write the function which brings his data into the right format, then the existing code does the rest.
I'm aware of the fact that I could have the user write the function in the same file and call it, or to have an import statement etc.
So as a minimal example, I would like to have file y.py
def main(argv):
# Parse args etc, let's assume it is there.
dataset = tf.data.TFRecordDataset(args.filename)
dataset = dataset.map(args.function)
# Continue with doing stuff that is independent from actual content
So what I'd like to be able to do is something like this
python y.py --func x.py my_func
And use the function defined in x.py my_func in dataset.map(...)
Is there a way to do this in python and if yes, which is the best way to do it?
Pass the name of the file as an argument to your script (and function name)
Read the file into a string, possibly extracting the given function
Use Python's exec() to execute the code
An example:
file = "def fun(*args): \n return args"
func = "fun(1,2,3)"
def execute(func, file):
    program = file + "\nresult = " + func
    local = {}
    exec(program, local)
    return local['result']
r = execute(func, file)
print(r)
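Similar to here; however, since we are not calling exec in the global scope, we must pass our own namespace dictionary (local above) and read result back out of it.
Note: the use of exec is somewhat dangerous because it runs arbitrary code, so you should be sure that the function is safe. If you are the only one supplying it, then it's fine!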
Hope this helps.
Ok so I have composed the answer myself now using the information from comments and this answer.
import importlib, inspect, sys, os
# path is the given path to the file, function_name is the name of the function and args are the function arguments
# Create package and module name from path
package = os.path.dirname(path).replace(os.path.sep,'.')
module_name = os.path.basename(path).split('.')[0]
# Import module and get members
module = importlib.import_module(module_name, package)
members = inspect.getmembers(module)
# Find matching function
function = [t[1] for t in members if t[0] == function_name][0]
function(args)
This exactly solves the question, since I get a callable function object which I can call, pass around, use it as a normal function.
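As an alternative sketch, a function can be loaded straight from a file path with importlib.util, which avoids turning the path into a package name. The load_function helper and the temporary x.py file are hypothetical and exist only to make the example runnable.

```python
import importlib.util
import os
import tempfile


def load_function(path, function_name):
    # Build a module spec from the file location, execute it, then fetch the function.
    module_name = os.path.splitext(os.path.basename(path))[0]
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, function_name)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical user file x.py defining my_func.
    path = os.path.join(tmp, "x.py")
    with open(path, "w") as handle:
        handle.write("def my_func(value):\n    return value * 2\n")
    my_func = load_function(path, "my_func")
    doubled = my_func(21)
```

The returned object is an ordinary callable, so it could be passed to something like dataset.map in the same way as a locally defined function.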
I want to mock the read and write functions of the Serial object from pyserial, which is used inside my class, to check for call arguments and edit the return values, but it doesn't seem to work.
Currently I have a file 'serialdevice.py', like this:
import serial
class SerialDevice:
def __init__(self):
self._serial = serial.Serial(port='someport')
def readwrite(self, msg):
self._serial.write(msg)
return self._serial.read(1024)
Then I have a file 'test_serialdevice.py', like this:
import mock
from serialdevice import SerialDevice
@mock.patch('serialdevice.serial.Serial.read')
@mock.patch('serialdevice.serial.Serial.write')
@mock.patch('serialdevice.serial.Serial')
def test_write(mock_serial, mock_write, mock_read):
    mock_read.return_value='hello'
    sd = SerialDevice()
    resp = sd.readwrite('test')
    mock_write.assert_called_once_with('test')
    assert resp == 'hello'
But both asserts fail. Somehow mock_write is not called with the argument 'test', and readwrite returns a mock object instead of the 'hello' string. I've also tried using @patch('serial.Serial.write') etc. with the same results. Using the return objects of mock_serial, e.g. mock_read = mock_serial.read(), does not seem to work either.
The constructor, i.e. mock_serial, does seem to be called with the expected arguments however.
How can I achieve what I want in this case?
The python version is 2.7.9
Apparently you have to use the return value of the mock_serial object as the instance returned by the constructor, so something like
mock_serial.return_value.read.return_value = 'hello'
This works, but I still wonder if there is a better way to do this
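A fuller sketch of that approach is below. To keep it runnable without pyserial or hardware, serial is replaced by a hypothetical SimpleNamespace stand-in; in the real test you would patch 'serialdevice.serial.Serial' instead.

```python
import types
from unittest import mock

# Stand-in for the pyserial module (an assumption so the sketch runs anywhere).
serial = types.SimpleNamespace(Serial=object)


class SerialDevice:
    def __init__(self):
        self._serial = serial.Serial(port='someport')

    def readwrite(self, msg):
        self._serial.write(msg)
        return self._serial.read(1024)


def run_readwrite_test():
    with mock.patch.object(serial, "Serial") as mock_serial:
        # The constructor's return_value is the instance SerialDevice stores.
        instance = mock_serial.return_value
        instance.read.return_value = 'hello'
        sd = SerialDevice()
        resp = sd.readwrite('test')
    mock_serial.assert_called_once_with(port='someport')
    instance.write.assert_called_once_with('test')
    instance.read.assert_called_once_with(1024)
    return resp


response = run_readwrite_test()
```

Patching the class once and configuring mock_serial.return_value is enough; separate patches for read and write target the class attributes, not the instance the code under test actually uses.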
How can I get a variable that contains the currently executing function in Python? I don't want the function's name. I know I can use inspect.stack to get the current function name. I want the actual callable object. Can this be done without using inspect.stack to retrieve the function's name and then evaling the name to get the callable object?
Edit: I have a reason to do this, but it's not even a remotely good one. I'm using plac to parse command-line arguments. You use it by doing plac.call(main), which generates an ArgumentParser object from the function signature of "main". Inside "main", if there is a problem with the arguments, I want to exit with an error message that includes the help text from the ArgumentParser object, which means that I need to directly access this object by calling plac.parser_from(main).print_help(). It would be nice to be able to say instead: plac.parser_from(get_current_function()).print_help(), so that I am not relying on the function being named "main". Right now, my implementation of "get_current_function" would be:
import inspect
def get_current_function():
return eval(inspect.stack()[1][3])
But this implementation relies on the function having a name, which I suppose is not too onerous. I'm never going to do plac.call(lambda ...).
In the long run, it might be more useful to ask the author of plac to implement a print_help method to print the help text of the function that was most-recently called using plac, or something similar.
The stack frame tells us what code object we're in. If we can find a function object that refers to that code object in its __code__ attribute, we have found the function.
Fortunately, we can ask the garbage collector which objects hold a reference to our code object, and sift through those, rather than having to traverse every active object in the Python world. There are typically only a handful of references to a code object.
Now, functions can share code objects, and do in the case where you return a function from a function, i.e. a closure. When there's more than one function using a given code object, we can't tell which function it is, so we return None.
import inspect, gc
def giveupthefunc():
    frame = inspect.currentframe(1)
    code = frame.f_code
    globs = frame.f_globals
    functype = type(lambda: 0)
    funcs = []
    for func in gc.get_referrers(code):
        if type(func) is functype:
            if getattr(func, "__code__", None) is code:
                if getattr(func, "__globals__", None) is globs:
                    funcs.append(func)
                    if len(funcs) > 1:
                        return None
    return funcs[0] if funcs else None
Some test cases:
def foo():
    return giveupthefunc()
zed = lambda: giveupthefunc()
bar, foo = foo, None
print bar()
print zed()
I'm not sure about the performance characteristics of this, but I think it should be fine for your use case.
I recently spent a lot of time trying to do something like this and ended up walking away from it. There are a lot of corner cases.
If you just want the lowest level of the call stack, you can just reference the name that is used in the def statement. This will be bound to the function that you want through lexical closure.
For example:
def recursive(*args, **kwargs):
    me = recursive
me will now refer to the function in question regardless of the scope that the function is called from so long as it is not redefined in the scope where the definition occurs. Is there some reason why this won't work?
To get a function that is executing higher up the call stack, I couldn't think of anything that can be reliably done.
This is what you asked for, as close as I can come. Tested in python versions 2.4, 2.6, 3.0.
#!/usr/bin/python
def getfunc():
    from inspect import currentframe, getframeinfo
    caller = currentframe().f_back
    func_name = getframeinfo(caller)[2]
    caller = caller.f_back
    from pprint import pprint
    func = caller.f_locals.get(
        func_name, caller.f_globals.get(
            func_name
        )
    )
    return func
def main():
    def inner1():
        def inner2():
            print("Current function is %s" % getfunc())
        print("Current function is %s" % getfunc())
        inner2()
    print("Current function is %s" % getfunc())
    inner1()
#entry point: parse arguments and call main()
if __name__ == "__main__":
    main()
Output:
Current function is <function main at 0x2aec09fe2ed8>
Current function is <function inner1 at 0x2aec09fe2f50>
Current function is <function inner2 at 0x2aec0a0635f0>
Here's another possibility: a decorator that implicitly passes a reference to the called function as the first argument (similar to self in bound instance methods). You have to decorate each function that you want to receive such a reference, but "explicit is better than implicit" as they say.
Of course, it has all the disadvantages of decorators: another function call slightly degrades performance, and the signature of the wrapped function is no longer visible.
import functools
def gottahavethatfunc(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(func, *args, **kwargs)
    return wrapper
The test case illustrates that the decorated function still gets the reference to itself even if you change the name to which the function is bound. This is because you're only changing the binding of the wrapper function. It also illustrates its use with a lambda.
@gottahavethatfunc
def quux(me):
    return me
zoom = gottahavethatfunc(lambda me: me)
baz, quux = quux, None
print baz()
print zoom()
When using this decorator with an instance or class method, the method should accept the function reference as the first argument and the traditional self as the second.
class Demo(object):
    @gottahavethatfunc
    def method(me, self):
        return me
print Demo().method()
The decorator relies on a closure to hold the reference to the wrapped function in the wrapper. Creating the closure directly might actually be cleaner, and won't have the overhead of the extra function call:
def my_func():
    def my_func():
        return my_func
    return my_func
my_func = my_func()
Within the inner function, the name my_func always refers to that function; its value does not rely on a global name that may be changed. Then we just "lift" that function to the global namespace, replacing the reference to the outer function. Works in a class too:
class K(object):
    def my_method():
        def my_method(self):
            return my_method
        return my_method
    my_method = my_method()
At the beginning of each function I just define a "keyword" which is simply a reference to the actual name of the function. I do this for any function, whether it needs it or not:
def test():
    this=test
    if not hasattr(this,'cnt'):
        this.cnt=0
    else:
        this.cnt+=1
    print this.cnt
The call stack does not keep a reference to the function itself, although the running frame has a reference to the code object associated with a given function.
(Functions are objects with code, and some information about their environment, such as closures, name, globals dictionary, doc string, default parameters and so on).
Therefore, if you are running a regular function, you are better off using its own name in the globals dictionary to call itself, as has been pointed out.
If you are running some dynamic or lambda code, in which you can't use the function name, the only solution is to build another function object which reuses the currently running code object and call that new function instead.
You will lose a couple of things, like default arguments, and it may be hard to get it working with closures (although it can be done).
I have written a blog post on doing exactly that - calling anonymous functions from within themselves - I hope the code in there can help you:
http://metapython.blogspot.com/2010/11/recursive-lambda-functions.html
On a side note: avoid the use of inspect.stack. It is too slow, as it rebuilds a lot of information each time it is called. Prefer inspect.currentframe to deal with code frames instead.
This may sound complicated, but the code itself is very short, so I am pasting it below. The post above contains more information on how this works.
from inspect import currentframe
from types import FunctionType
lambda_cache = {}
def myself (*args, **kw):
    caller_frame = currentframe(1)
    code = caller_frame.f_code
    if not code in lambda_cache:
        lambda_cache[code] = FunctionType(code, caller_frame.f_globals)
    return lambda_cache[code](*args, **kw)
if __name__ == "__main__":
    print "Factorial of 5", (lambda n: n * myself(n - 1) if n > 1 else 1)(5)
If you really need the original function itself, the "myself" function above could be made to search some scopes (like the calling function's global dictionary) for a function object whose code object matches the one retrieved from the frame, instead of creating a new function.
sys._getframe(0).f_code returns exactly what you need: the codeobject being executed. Having a code object, you can retrieve a name with codeobject.co_name
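A short sketch of this, with whoami as a hypothetical function name chosen for illustration. Note that this yields the code object and its name, not the function object itself.

```python
import sys


def whoami():
    # Frame 0 is the frame of the function currently executing.
    code = sys._getframe(0).f_code
    # The code object is the same one stored on the function's __code__ attribute.
    return code.co_name, code is whoami.__code__


name, same_code = whoami()
```

sys._getframe is a CPython implementation detail, so other interpreters are not guaranteed to provide it.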
OK after reading the question and comments again, I think this is a decent test case:
def foo(n):
    """ print numbers from 0 to n """
    if n: foo(n-1)
    print n
g = foo # assign name 'g' to function object
foo = None # clobber name 'foo' which refers to function object
g(10) # dies with TypeError because function object tries to call NoneType
I tried solving it by using a decorator to temporarily clobber the global namespace and reassigning the function object to the original name of the function:
def selfbind(f):
    """ Ensures that f's original function name is always defined as f when f is executed """
    oname = f.__name__
    def g(*args, **kwargs):
        # Clobber global namespace
        had_key = None
        if globals().has_key(oname):
            had_key = True
            key = globals()[oname]
        globals()[oname] = g
        # Run function in modified environment
        result = f(*args, **kwargs)
        # Restore global namespace
        if had_key:
            globals()[oname] = key
        else:
            del globals()[oname]
        return result
    return g
@selfbind
def foo(n):
    if n: foo(n-1)
    print n
g = foo # assign name 'g' to function object
foo = 2 # calling 'foo' now fails since foo is an int
g(10) # print from 0..10, even though foo is now an int
print foo # prints 2 (the new value of foo)
I'm sure I haven't thought through all the use cases. The biggest problem I see is the function object intentionally changing what its own name points to (an operation which would be overwritten by the decorator), but that should be ok as long as the recursive function doesn't redefine its own name in the middle of recursing.
Still not sure I'd ever need to do this, but thinking about it was interesting.
Here is a variation (Python 3.5.1) of the get_referrers() answer, which tries to distinguish between closures that are using the same code object:
import functools
import gc
import inspect
def get_func():
    frame = inspect.currentframe().f_back
    code = frame.f_code
    return [
        referer
        for referer in gc.get_referrers(code)
        if getattr(referer, "__code__", None) is code and
        set(inspect.getclosurevars(referer).nonlocals.items()) <=
        set(frame.f_locals.items())][0]
def f1(x):
    def f2(y):
        print(get_func())
        return x + y
    return f2
f_var1 = f1(1)
f_var1(3)
# <function f1.<locals>.f2 at 0x0000017235CB2C80>
# 4
f_var2 = f1(2)
f_var2(3)
# <function f1.<locals>.f2 at 0x0000017235CB2BF8>
# 5
def f3():
    print(get_func())
f3()
# <function f3 at 0x0000017235CB2B70>
def wrapper(func):
    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapped
@wrapper
def f4():
    print(get_func())
f4()
# <function f4 at 0x0000017235CB2A60>
f5 = lambda: get_func()
print(f5())
# <function <lambda> at 0x0000017235CB2950>
Correction of my previous answer: the subdict check already works with "<=" called directly on dict_items, and the additional set() calls cause problems if any of the dict values are themselves dicts (which are unhashable):
import gc
import inspect
def get_func():
    frame = inspect.currentframe().f_back
    code = frame.f_code
    return [
        referer
        for referer in gc.get_referrers(code)
        if getattr(referer, "__code__", None) is code and
        inspect.getclosurevars(referer).nonlocals.items() <=
        frame.f_locals.items()][0]