I have a function that preprocesses images in batches to feed to Caffe as input. It looks like the code below and returns two variables.
def processImageCrop(im_info, transformer, flowtransformer):
    .....
    return processed_image, processed_flowimage
class ImageProcessorCrop(object):
    def __init__(self, transformer, flowtransformer):
        self.transformer = transformer
        self.flowtransformer = flowtransformer
        #self.flow = flow
    def __call__(self, im_info):
        return processImageCrop(im_info, self.transformer, self.flowtransformer) #, self.flow)
I call this function with pool.map, passing the im_info parameters, and want to assign the two returned variables as shown below, but I get the exception "too many values to unpack". Both variables should have length 192. How can I assign the returned values? Thanks. I don't want to iterate over each element; I want to take the two returned values and assign them to two variables.
result['data'], result['flowdata'] = pool.map(image_processor, im_info)
Your pool.map call is going to return a list with the results of calling your callable class once per value in im_info. If im_info has more than two values, your assignment that unpacks the list into two variables will not work.
If you actually want to be unpacking the two-tuples within the list, you probably want to use zip to transpose the data:
result['data'], result['flowdata'] = zip(*pool.map(image_processor, im_info))
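To see what the zip(*...) transpose does, here is a minimal, self-contained sketch with dummy two-tuples standing in for the real (processed_image, processed_flowimage) pairs:
pairs = [(1, 'a'), (2, 'b'), (3, 'c')]  # stand-in for the pool.map output
data, flowdata = zip(*pairs)
print(data)      # (1, 2, 3)
print(flowdata)  # ('a', 'b', 'c')
Note that zip returns tuples; wrap them in list(...) if you need mutable sequences of length 192.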
I'm trying to dynamically add function calls to fill in array columns. I will be accessing the array millions of times so it needs to be quick.
I'm thinking of adding the call to a function into a dictionary using a string variable:
numpy_array[row,column] = dict[key[index containing function call]]
The full scope of the code I'm working with is too large to post, so here is an equivalent simplistic example I've tried.
def hello(input):
    return input

dict1 = {}
# another function returns the name and ID values
name = 'hello'
ID = 0
dict1["hi"] = globals()[name](ID)
print(dict1)
but it literally calls the function when using
globals()[name](ID)
instead of storing the call hello(0) in the dictionary.
I'm a bit out of my depth here.
What is the proper way to implement this?
Is there a more efficient way to do this than reading into a dictionary on every call of
numpy_array[row,column] = dict[key[index containing function call]]
as I will be accessing and updating it millions of times.
I don't know if the dictionary is called every time the array is written to or if the location of the column is already saved into cache.
Would appreciate the help.
edit
Ultimately what I'm trying to do is initialize some arrays, dictionaries, and values with a function
def initialize(*args):
    create arrays and dictionaries
    assign values to global and local variables, arrays, dictionaries
Each time the initialize() function is used, it creates a new set of variables (names, values, etc.) that direct to a different function with a different set of variables.
I have a numpy array in which I want to store information from the function and the associated values created by the initialize() function.
So, in other words, from the above example hello(0): the name of the function, its value, and some other things as set up within initialize().
What I'm trying to do is add the function with these settings to the numpy array as a new column before I run the main program.
So, as another example: if I was setting up hello() (and hello() was a complex function), then when I used initialize() it might give me a value of 1 for hello(1).
Then if I use initialize again it might give me a value of 2 for hello(2).
If I used it one more time it might give the value 0 for the function goodbye(0).
So in this scenario, let's say I have an array
array[row,0] = stuff()
array[row,1] = things()
array[row,2] = more_stuff()
array[row,3] = more_things()
Now I want it to look like
array[row,0] = stuff()
array[row,1] = things()
array[row,2] = more_stuff()
array[row,3] = more_things()
array[row,4] = hello(1)
array[row,5] = hello(2)
array[row,6] = goodbye(0)
As a third example:
def function1():
    do something

def function2():
    do something

def function3():
    do something

numpy_array(size)

initialize():
    do some stuff
    then add function1(23) to the next column in numpy_array

initialize():
    do some stuff
    then add function2(5) to the next column in numpy_array

initialize():
    do some stuff
    then add function3(50) to the next column in numpy_array
So as you can see. I need to permanently append new columns to the array and feed the new columns with the function/value as directed by the initialize() function without manual intervention.
So fundamentally I need to figure out how to assign syntax to an array column based upon a string value without activating the syntax on assignment.
edit #2
I guess my explanations weren't clear enough.
Here is another way to look at it.
I'm trying to dynamically assign functions to an additional column in a numpy array based upon the output of a function.
The functions added to the array column will be used to fill the array millions of times with data.
The functions added to the array can be various different function with various different input values and the amount of functions added can vary.
I've tried assigning the functions to a dictionary using exec(), eval(), and globals() but when using these during assignment it just instantly activates the functions instead of assigning them.
numpy_array = np.array((1,5))

def some_function():
    do some stuff
    return ('other_function(15)')

# somehow add 'other_function(15)' to the array column
numpy_array[1,6] = other_function(15)
The functions returned by some_function() may or may not exist each time the program is run so the functions added to the array are also dynamic.
I'm not sure this is what the OP is after, but here is a way to make an indirection of functions by name:
def make_fun_dict():
    magic = 17
    def foo(x):
        return x + magic
    def bar(x):
        return 2 * x + 1
    def hello(x):
        return x**2
    return {k: f for k, f in locals().items() if hasattr(f, '__name__')}
mydict = make_fun_dict()
>>> mydict
{'foo': <function __main__.make_fun_dict.<locals>.foo(x)>,
'bar': <function __main__.make_fun_dict.<locals>.bar(x)>,
'hello': <function __main__.make_fun_dict.<locals>.hello(x)>}
>>> mydict['foo'](0)
17
Example usage:
x = np.arange(5, dtype=int)
names = ['foo', 'bar', 'hello', 'foo', 'hello']
>>> np.array([mydict[name](v) for name, v in zip(names, x)])
array([17, 3, 4, 20, 16])
I have a list of patterns:
patterns_trees = [response.css("#Header").xpath("//a/img/@src"),
                  response.css("#HEADER").xpath("//a/img/@src"),
                  response.xpath("//header//a/img/@src"),
                  response.xpath("//a[@href='" + response.url + '/' + "']/img/@src"),
                  response.xpath("//a[@href='/']/img/@src")
                  ]
After I traverse it and find the right pattern I have to send the pattern as an argument to a callback function
for pattern_tree in patterns_trees:
    ...
    pattern_response = scrapy.Request(..., ..., meta={"pattern_tree": pattern_tree.extract_first()})
By doing this I get the extracted value of the selector, not the pattern itself.
THINGS I TRIED:
I tried isolating the patterns in a separate class, but I still have the problem that I cannot store them as patterns, only as values.
I tried to save them as strings, and maybe I can make that work, but what is the most efficient way of storing a list of functions?
UPDATE: Possible solution but too hardcoded and it's too problematic when I want to add more patterns:
def patter_0(response):
    return response.css("#Header").xpath("//a/img/@src")

def patter_1(response):
    return response.css("#HEADER").xpath("//a/img/@src")

.....

class patternTrees:
    patterns = [patter_0, ..., patter_n]

    def length_patterns(self):
        return len(self.patterns)
If you're willing to consider reformatting your list of operations, then this is a somewhat neat solution. I've changed the list of operations to a list of tuples. Each tuple contains (a ref to) the appropriate function, and another tuple consisting of arguments.
It's fairly easy to add new operations to the list: just specify what function to use, and the appropriate arguments.
If you want to use the result from one operation as an argument in the next: You will have to return the value from execute() and process it in the for loop.
I've replaced the calls to response with prints() so that you can test it easily.
def response_css_ARG_xpath_ARG(args):
    return "response.css(\"%s\").xpath(\"%s\")" % (args[0], args[1])
    #return response.css(args[0]).xpath(args[1])

def response_xpath_ARG(arg):
    return "response.xpath(\"%s\")" % (arg)
    #return response.xpath(arg)
def execute(function, args):
    response = function(args)
    # do whatever with response
    return response
response_url = "https://whatever.com"

patterns_trees = [(response_css_ARG_xpath_ARG, ("#Header", "//a/img/@src")),
                  (response_css_ARG_xpath_ARG, ("#HEADER", "//a/img/@src")),
                  (response_xpath_ARG, ("//header//a/img/@src")),
                  (response_xpath_ARG, ("//a[@href='" + response_url + "/" + "']/img/@src")),
                  (response_xpath_ARG, ("//a[@href='/']/img/@src"))]
for pattern_tree in patterns_trees:
    print(execute(pattern_tree[0], pattern_tree[1]))
Note that execute() can be omitted, depending on whether you need to process the result or not. Without it, you may just call the function directly from the loop:
for pattern_tree in patterns_trees:
    print(pattern_tree[0](pattern_tree[1]))
Not sure I understand what you're trying to do, but could you make your list a list of lambda functions like so:
patterns_trees = [
    lambda response: response.css("#Header").xpath("//a/img/@src"),
    ...
]
And then, in your loop:
for pattern_tree in patterns_trees:
    intermediate_response = scrapy.Request(...)  # without meta kwarg
    pattern_response = pattern_tree(intermediate_response)
Or does leaving the meta away have an impact on the response object?
I'm new to Python and have a problem-specific question about how to access non-returned variables defined within a function.
I am using an analysis package in Python that requires a user-defined function as input. It expects a function defined in a particular way that returns a single output. However, for diagnostic purposes, I would like to obtain multiple outputs from this function after it is called (e.g. to produce diagnostic plots at a particular stage of the analysis). However, I can't modify the function to return multiple outputs (unless I modify every instance of it being called by the analysis code--not practical); this would result in an error.
For instance, in the following example, the user-defined function returns f1+f2, but for diagnostic purposes, say I would like to know what f1 and f2 are individually:
def my2dfunction(x,y,theta):
    '''x and y are 1-d arrays of len(x)==len(y)
    theta is an array of 5 model parameters '''
    f1=theta[0]+theta[1]*x+theta[2]*x**2
    f2=theta[3]*y+theta[4]*y**2
    return f1+f2
Researching on this site I've come up with 2 possible solutions:
Create a global variable containing the values for f1 and f2 from the latest function call, which I can access at any time:
live_diagnostic_info={'f1':0, 'f2':0}

def my2dfunction(x,y,theta):
    global live_diagnostic_info
    f1=theta[0]+theta[1]*x+theta[2]*x**2
    f2=theta[3]*y+theta[4]*y**2
    live_diagnostic_info={'f1':f1, 'f2':f2}
    return f1+f2
Define a second function identical to the first but with multiple return values, to call only in the instances where I need diagnostic information.
def my2dfunction_extra(x,y,theta):
    f1=theta[0]+theta[1]*x+theta[2]*x**2
    f2=theta[3]*y+theta[4]*y**2
    return f1+f2,f1,f2
I think both would work for my purposes, but I'm wondering if there is another way to pass non-returned variables from a function in Python. (e.g. I do a lot of coding in IDL, where extra information can be passed through keywords, without modifying the return statement, and wonder what the Python equivalent would be if it exists).
You can write a decorator, so you don't need to copy-paste anything for your second solution:
def return_first(f):
    return lambda *args, **kwargs: f(*args, **kwargs)[0]
return_first takes a function and returns a wrapped version of it that keeps only the first returned value.
def my2dfunction_extra(x,y,theta):
    f1=theta[0]+theta[1]*x+theta[2]*x**2
    f2=theta[3]*y+theta[4]*y**2
    return f1+f2,f1,f2

my2dfunction = return_first(my2dfunction_extra)
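A quick usage sketch with made-up inputs: the analysis package keeps calling my2dfunction and sees only f1+f2, while you can call my2dfunction_extra yourself when you want the diagnostics:
import numpy as np
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 2.0, 3.0])
theta = [1.0, 2.0, 3.0, 4.0, 5.0]
total = my2dfunction(x, y, theta)                      # what the package sees: f1 + f2
total_again, f1, f2 = my2dfunction_extra(x, y, theta)  # full diagnostic output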
Add an optional argument to the function
def my2dfunction(x,y,theta, local_debug=None):
    '''x and y are 1-d arrays of len(x)==len(y)
    theta is an array of 5 model parameters '''
    f1=theta[0]+theta[1]*x+theta[2]*x**2
    f2=theta[3]*y+theta[4]*y**2
    if local_debug is not None:
        local_debug["f1"] = f1
        local_debug["f2"] = f2
    return f1+f2
then call it with a dictionary
local_data = {}
my2dfunction(x,y,theta, local_data)
on return you have the information in the dict. Old client code will not be affected either in the return or in the supplied input values.
If instead you want the values saved no matter who calls the routine, you can do the following. Create the routine as follows:
def my2dfunction(x,y,theta, local_debug={}):
    '''x and y are 1-d arrays of len(x)==len(y)
    theta is an array of 5 model parameters '''
    f1=theta[0]+theta[1]*x+theta[2]*x**2
    f2=theta[3]*y+theta[4]*y**2
    local_debug["f1"] = f1
    local_debug["f2"] = f2
    return f1+f2
then you can access the local_debug dictionary from outside:
Python 3: my2dfunction.__defaults__[0]
Python 2: my2dfunction.func_defaults[0]
Keep in mind that having mutables (e.g. lists, dictionaries, etc.) as defaults is a Python faux pas, but in this case it has a motivation.
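A short sketch of reading the saved values after a call, with made-up inputs and the Python 3 attribute name:
import numpy as np
x = np.array([0.0, 1.0])
y = np.array([1.0, 2.0])
theta = [1.0, 2.0, 3.0, 4.0, 5.0]
my2dfunction(x, y, theta)
saved = my2dfunction.__defaults__[0]  # the shared local_debug dict
print(saved["f1"], saved["f2"])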
I have created a class MyClass that contains a lot of simulation data. The class groups simulation results for different simulations that have a similar structure. The results can be retrieved with a MyClass.get(foo) method. It returns a dictionary with simulationID/array pairs, array being the value of foo for each simulation.
Now I want to implement a method in my class to apply any function to all the arrays for foo. It should return a dictionary with simulationID/function(foo) pairs.
For a function that does not need additional arguments, I found the following solution very satisfying (comments always welcome :-) ):
def apply(self, function, variable):
    result={}
    for k,v in self.get(variable).items():
        result[k] = function(v)
    return result
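For illustration, assuming my_results is a MyClass instance and get('QHeat') returns a dictionary of numpy arrays keyed by simulation ID, it could be used like this (my_results is just an example name):
import numpy as np
heat_means = my_results.apply(np.mean, 'QHeat')  # {simulationID: mean of that simulation's QHeat array}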
However, for a function requiring additional arguments I don't see how to do it in an elegant way. A typical operation would be the integration of foo with bar as x-values, like np.trapz(foo, x=bar), where both foo and bar can be retrieved with MyClass.get(...).
I was thinking in this direction:
def apply(self, function_call):
    """
    function_call should be a string with the complete expression to evaluate
    eg: MyClass.apply('np.trapz(QHeat, time)')
    """
    result={}
    for SID in self.simulations:
        result[SID] = eval(function_call, locals=...)
    return result
The problem is that I don't know how to pass the locals mapping object. Or maybe I'm looking in the wrong direction. Thanks in advance for your help.
Roel
You have two ways. The first is to use functools.partial:
import functools

foo = self.get('foo')
bar = self.get('bar')
callable = functools.partial(func, foo, x=bar)
self.apply(callable, variable)
The second approach is to use the same technique used by partial: you can define a function that accepts an arbitrary argument list:
def apply(self, function, variable, *args, **kwds):
    result={}
    for k,v in self.get(variable).items():
        result[k] = function(v, *args, **kwds)
    return result
Note that in both cases the function signature remains unchanged. I don't know which one I'd choose, maybe the first, but it depends on the context you are working in.
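As a usage sketch of the second version with the np.trapz example from the question, assuming a uniform time step so that only the dx keyword is needed (my_results is a hypothetical MyClass instance):
import numpy as np
heat_integrals = my_results.apply(np.trapz, 'QHeat', dx=0.1)  # integrate each simulation's QHeat
Note that the extra arguments are the same for every simulation; if x has to come from self.get('time') per simulation, you would need a per-SID lookup instead.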
I tried to recreate (the relevant part of) the class structure the way I am guessing it is set up on your side (it's always handy if you can provide a simplified code example for people to play/test).
What I think you are trying to do is translate variable names to variables that are obtained from within the class and then use those variables in a function that was passed in as well. In addition to that since each variable is actually a dictionary of values with a key (SID), you want the result to be a dictionary of results with the function applied to each of the arguments.
class test:
    def get(self, name):
        if name == "valA":
            return {"1":"valA1", "2":"valA2", "3":"valA3"}
        elif name == "valB":
            return {"1":"valB1", "2":"valB2", "3":"valB3"}

    def apply(self, function, **kwargs):
        arg_dict = {fun_arg: self.get(sim_args) for fun_arg, sim_args in kwargs.items()}
        result = {}
        for SID in arg_dict[kwargs.keys()[0]]:
            fun_kwargs = {fun_arg: sim_dict[SID] for fun_arg, sim_dict in arg_dict.items()}
            result[SID] = function(**fun_kwargs)
        return result

def joinstrings(string_a, string_b):
    return string_a+string_b

my_test = test()
result = my_test.apply(joinstrings, string_a="valA", string_b="valB")
print result
So the apply method gets an argument dictionary, gets the class specific data for each of the arguments and creates a new argument dictionary with those (arg_dict).
The SID keys are obtained from this arg_dict and for each of those, a function result is calculated and added to the result dictionary.
The result is:
{'1': 'valA1valB1', '3': 'valA3valB3', '2': 'valA2valB2'}
The code can be altered in many ways, but I thought this would be the most readable. It is of course possible to join the dictionaries instead of using the SID's from the first element etc.
I have a function that has several outputs, all of which are "native", i.e. integers and strings. For example, let's say I have a function that analyzes a string, and finds both the number of words and the average length of a word.
In C/C++ I would pass the 2 extra parameters to the function by reference (&). In Python I'm not sure what the right solution is, because integers and strings are not passed by reference but by value (at least this is what I understand from trial-and-error), so the following code won't work:
def analyze(string, number_of_words, average_length):
    ... do some analysis ...
    number_of_words = ...
    average_length = ...
If I do the above, the values outside the scope of the function don't change. What I currently do is use a dictionary like so:
def analyze(string, result):
    ... do some analysis ...
    result['number_of_words'] = ...
    result['average_length'] = ...
And I use the function like this:
s = "hello goodbye"
result = {}
analyze(s, result)
However, that does not feel right. What's the correct Pythonian way to achieve this? Please note I'm referring only to cases where the function returns 2-3 results, not tens of results. Also, I'm a complete newbie to Python, so I know I may be missing something trivial here...
Thanks
Python has a return statement, which allows you to do the following:
def func(input):
    # do calculation on input
    return result

s = "hello goodbye"
res = func(s)  # res now a result dictionary
but you don't need a result dictionary at all; you can return a few values like so:
def func(input):
    # do work
    return length, something_else  # one might be an integer, another a string, etc.

s = "hello goodbye"
length, something = func(s)
If you return the variables in your function like this:
def analyze(s, num_words, avg_length):
    # do something
    return s, num_words, avg_length
Then you can call it like this to update the parameters that were passed:
s, num_words, avg_length = analyze(s, num_words, avg_length)
But, for your example function, this would be better:
def analyze(s):
    # do something
    return num_words, avg_length
In Python you don't modify parameters in the C/C++ way (passing them by reference or through a pointer and doing modifications in situ). There are some reasons for this, such as that string objects are immutable in Python. The right thing to do is to return the modified parameters in a tuple (as SilentGhost suggested) and rebind the variables to the new values.
If you need to use method arguments in both directions, you can encapsulate the arguments in a class, pass an object of that class to the method, and let the method use its attributes.
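A minimal sketch of that idea for the word-count example, using a hypothetical result class (the names are illustrative, not from any library):
class AnalysisResult:
    def __init__(self):
        self.number_of_words = 0
        self.average_length = 0.0

def analyze(s, result):
    words = s.split()
    result.number_of_words = len(words)
    result.average_length = sum(len(w) for w in words) / len(words) if words else 0.0

res = AnalysisResult()
analyze("hello goodbye", res)
print(res.number_of_words, res.average_length)  # 2 6.0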