python how to memoize a method - python

Say I have a method to create a dictionary from the given parameters:
def newDict(a,b,c,d): # in reality this method is a bit more complex, I've just shortened for the sake of simplicity
    return { "x": a,
             "y": b,
             "z": c,
             "t": d }
And I have another method that calls newDict method each time it is executed. Therefore, at the end, when I look at my cProfiler I see something like this:
17874 calls (17868 primitive) 0.076 CPU seconds
and of course, my newDict method is called 1785 times. Now, my question is whether I can memorize the newDict method so that I reduce the call times? (Just to make sure, the variables change almost in every call, though I'm not sure if it has an effect on memorizing the function)
Sub Question: I believe that 17k calls are too much, and the code is not efficient. But by looking at the stats can you also please state whether this is a normal result or I have too many calls and the code is slow?

You mean memoize not memorize.
If the values are almost always different, memoizing won't help, it will slow things down.
Without seeing your full code and knowing what it's supposed to do, how can we know whether 17k calls is a lot or a little?

If by memorizing you mean memoizing, use functools.lru_cache.
It's a function decorator
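A minimal sketch of the decorator in use, assuming Python 3. The function slow_square and the counter call_count are hypothetical names standing in for an expensive, pure computation. Note that lru_cache needs hashable arguments and only pays off when the same arguments recur.

```python
from functools import lru_cache

call_count = 0  # counts how often the function body actually runs


@lru_cache(maxsize=None)
def slow_square(n):
    """Hypothetical stand-in for an expensive, pure computation."""
    global call_count
    call_count += 1
    return n * n


slow_square(4)  # first call: the body runs and the result is cached
slow_square(4)  # second call: answered from the cache, body is skipped
info = slow_square.cache_info()  # records one miss and one hit
```

If almost every call uses different arguments, as in the question, nearly every call is a miss and the cache only adds overhead.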

The purpose of memoizing is to save a result of an operation that was expensive to perform so that it can be provided a second, third, etc., time without having to repeat the operation and repeatedly incur the expense.
Memoizing is normally applied to a function that (a) performs an expensive operation, (b) always produces the same result given the same arguments, and (c) has no side effects on the program state.
Memoizing is typically implemented within such a function by 'saving' the result along with the values of the arguments that produced that result. This is a special form of the general concept of a cache. Each time the function is called, the function checks its memo cache to see if it has already determined the result that is appropriate for the current values of the arguments. If the cache contains the result, it can be returned without the need to recompute it.
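The mechanism described above can be sketched by hand as a small decorator. This is only an illustration, assuming positional, hashable arguments; the names memoize and fib are hypothetical, and in practice functools.lru_cache does the same job.

```python
import functools


def memoize(func):
    """Cache results of func, keyed by its positional arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        # Compute only when this argument tuple has not been seen before.
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    wrapper.cache = cache  # exposed only so the cache can be inspected
    return wrapper


@memoize
def fib(n):
    # Pure function: same input always gives the same output, no side effects.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result = fib(30)
```

Without the cache, fib(30) makes over a million recursive calls; with it, each distinct argument is computed once.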
Your function appears to be intended to create a new dict each time it is called. There does not appear to be a sensible way to memoize this function: you always want a new dict returned to the caller so that its use of the dict it receives does not interfere with some other call to the function.
The only way I can visualize using memoizing would be if (1) the computation of one or more of the values placed into the result are expensive (in which case I would probably define a function that computes the value and memoize that function) or (2) the newDict function is intended to return the same collection of values given a particular set of argument values. In the latter case I would not use a dict but would instead use a non-modifiable object (e.g., a class like a dict but with protections against modifying its contents).
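A short sketch of the hazard and of one possible read-only alternative, assuming Python 3. The function names are hypothetical, and types.MappingProxyType is just one way to get a non-modifiable mapping.

```python
from functools import lru_cache
from types import MappingProxyType


@lru_cache(maxsize=None)
def cached_dict(a, b):
    # Hazard: every caller with the same arguments gets the SAME dict object.
    return {"x": a, "y": b}


first = cached_dict(1, 2)
first["x"] = 99             # mutates the object stored in the cache
second = cached_dict(1, 2)  # the "fresh" result already carries the change


@lru_cache(maxsize=None)
def cached_readonly(a, b):
    # A read-only view keeps callers from modifying the shared result.
    return MappingProxyType({"x": a, "y": b})


view = cached_readonly(1, 2)
try:
    view["x"] = 99
    modified = True
except TypeError:
    # mappingproxy objects do not support item assignment
    modified = False
```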
Regarding your subquestion, the questions you need to ask are (1) is the number of times newDict is being called appropriate and (2) can the execution time of each execution of newDict be reduced. These are two separate and independent questions that need to be individually addressed as appropriate.
BTW your function definition has a typo in it -- the return should not have a 'd' between the return keyword and the open brace.

Related

detect if a python function was activated with an assignment operator (and should return a value)

I have a function that can return values, but in order to do so it collects outputs from AWS (and it can be very large jobs, so it can take a long time and not always necessary).
Is there a way I can know, at run time, if the user has called my function with an assignment operator (i.e: x = foo()) or not (i.e foo()).
I know I can just add a flag, but the idea of checking the call during runtime seems elegant.
Thanks!

Why isn't there any special method for __max__ in python?

As the title asks. Python has a lot of special methods, __add__, __len__, __contains__ etc. Why is there no __max__ method that is called when doing max? Example code:
class A:
    def __max__(self):
        return 5
a = A()
max(a)
It seems like range() and other constructs could benefit from this. Am I missing some other effective way to do max?
Addendum 1:
As a trivial example, max(range(1000000000)) takes a long time to run.
I have no authoritative answer but I can offer my thoughts on the subject.
There are several built-in functions that have no corresponding special method. For example:
max
min
sum
all
any
One thing they have in common is that they are reduce-like: They iterate over an iterable and "reduce" it to one value. The point here is that these are more of a building block.
For example you often wrap the iterable in a generator (or another comprehension, or transformation like map or filter) before applying them:
sum(abs(val) for val in iterable) # sum of absolutes
any(val > 10 for val in iterable) # is one value over 10
max(person.age for person in iterable) # the oldest person
That means most of the time it wouldn't even call the __max__ of the iterable but try to access it on the generator (which isn't implemented and cannot be implemented).
So there is simply not much of a benefit if these were implemented. And in the few cases when it makes sense to implement them it would be more obvious if you create a custom method (or property) because it highlights that it's a "shortcut" or that it's different from the "normal result".
For example these functions (min, etc.) have O(n) run-time, so if you can do better (for example if you have a sorted list you could access the max in O(1)) it might make sense to document that explicitly.
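As an illustration of such an explicit shortcut, here is a minimal sketch of a container that keeps its items sorted and exposes the maximum in O(1). The class SortedNumbers and its largest property are hypothetical names, not part of any library.

```python
import bisect


class SortedNumbers:
    """Hypothetical container that keeps its items in sorted order."""

    def __init__(self, items=()):
        self._items = sorted(items)

    def add(self, value):
        # insort keeps the underlying list sorted on every insertion.
        bisect.insort(self._items, value)

    def __iter__(self):
        return iter(self._items)

    @property
    def largest(self):
        """O(1): the last element of a sorted list is the maximum."""
        if not self._items:
            raise ValueError("container is empty")
        return self._items[-1]


numbers = SortedNumbers([5, 1, 9])
numbers.add(7)
fast_max = numbers.largest  # constant time, documented as a shortcut
slow_max = max(numbers)     # still works, but walks every element
```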
Some operations are not basic operations. Take max as an example: it is actually an operation based on comparison. In other words, when you ask for a max value, you are asking for the largest value according to the comparison operators.
So in this case, why should we implement a dedicated max method rather than override the behavior of comparison?
Thinking in another direction, what does max really mean? For example, when we execute max(list), what are we doing?
I think we are actually checking the list's elements, and the max operation is not related to the list itself at all.
The list is just a container, which is incidental to the max operation. Whether it is a list, a set, or something else doesn't matter. What is really useful is the elements inside the container.
So if we define a __max__ action for list, we are actually performing a totally different operation. We are asking the container to give us advice about the max value.
I think in this case, as it is a totally different operation, it should be a method of the container instead of overriding the built-in function's behavior.
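To make the comparison point concrete, here is a small sketch in which the elements define their own ordering and the built-in max and min pick it up without any container support. The Version class is a hypothetical example; functools.total_ordering fills in the remaining comparison methods from __eq__ and __lt__.

```python
from functools import total_ordering


@total_ordering
class Version:
    """Hypothetical value type ordered by (major, minor)."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        return self._key() == other._key()

    def __lt__(self, other):
        return self._key() < other._key()


versions = [Version(1, 4), Version(2, 0), Version(1, 9)]
newest = max(versions)  # uses the elements' comparison, not the list
oldest = min(versions)
```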

difference between printing and returning recursive function in python

When writing a recursive function in Python, what is the difference between using "print" and "return"? I understand the difference between the two when using them for iterative functions, but don't see any rhyme or reason to why it may be more important to use one over the other in a recursive function.
What a strange question.
The two are completely different, and their correct use in a recursive function is just as important as in an iterative one. You might even say more important: after all, in an iterative function, you return the result once only; but in a recursive function, you must return something at every step, otherwise the calling step has nothing to work on.
To illustrate: if you are doing mergesort, for example, the recursive function at each stage must return the sorted sublist. If it simply prints it, without returning it, then the caller will not get the sublist to sort, so cannot then merge the two sorted sublists into a single sorted list for passing further up the stack.
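A minimal merge sort sketch showing why each level has to return its sorted sublist. This is one straightforward way to write it, not a tuned implementation.

```python
def merge_sort(items):
    """Return a new sorted list; every recursive step returns its result."""
    if len(items) <= 1:
        return list(items)

    mid = len(items) // 2
    # The caller can only merge because the recursive calls RETURN lists.
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One side is exhausted; append whatever remains of the other.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


sorted_items = merge_sort([5, 2, 9, 1, 5, 6])
```

If the recursive calls printed their sublists instead, left and right would both be None and the merge loop could not run.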
I might add that from a functional programming perspective, print is a side effect, whereas return delivers the function's value.
Consider programming as an extent of mathematics. Your function takes a set of inputs, performs an action on them and returns the computation. Print in this case is not a computation. It causes an interaction with the system's IO to provide output to the user.
As for return and print in a recursive function, return is the only required operation. Recursion requires inputs, an optional computation and a test. The test decides whether the function is called again with the modified inputs or whether the modified inputs are the final solution to the overall equation. Nowhere in this process is print required, and according to functional purists it really has no place in a recursive function (unless its purpose IS to print).
The difference between print and return in a recursive function is similar to the difference in an iterative function. Print sends output directly to the user, while return hands the result back to the caller. You have to return at every step; otherwise the calling step receives None instead of a result and you will get an error.
For example:
def factorial(n):
    if n == 1:
        return 1
    else:
        return n * factorial(n-1)
If you used print instead, each recursive call would return None and the multiplication would fail with a TypeError.

explicitly passing functions in python

Out of curiosity, is it more desirable to explicitly pass functions to other functions, or to let the function call functions from within? Is this a case of "explicit is better than implicit"?
For example (the following is only to illustrate what I mean):
import functools
import operator
def foo(x,y):
    return 1 if x > y else 0
partialfun = functools.partial(foo, 1)
def bar(xs,ys):
    return partialfun(sum(map(operator.mul,xs,ys)))
>>> bar([1,2,3], [4,5,6])
--or--
def foo(x,y):
    return 1 if x > y else 0
partialfun = functools.partial(foo, 1)
def bar(fn,xs,ys):
    return fn(sum(map(operator.mul,xs,ys)))
>>> bar(partialfun, [1,2,3], [4,5,6])
There's not really any difference between functions and anything else in this situation. You pass something as an argument if it's a parameter that might vary over different invocations of the function. If the function you are calling (bar in your example) is always calling the same other function, there's no reason to pass that as an argument. If you need to parameterize it so that you can use many different functions (i.e., bar might need to call many functions besides partialfun, and needs to know which one to call), then you need to pass it as an argument.
Generally, yes, but as always, it depends. What you are illustrating here is known as dependency injection. Generally, it is a good idea, as it allows separation of variability from the logic of a given function. This means, for example, that it will be extremely easy for you to test such code.
import operator
# To test the process performed in bar(), we can "inject" a function
# which simply returns its argument
def dummy(x):
    return x
def bar(fn,xs,ys):
    return fn(sum(map(operator.mul,xs,ys)))
>>> assert bar(dummy, [1,2,3], [4,5,6]) == 32
It depends very much on the context.
Basically, if the function is an argument to bar, then it's the responsibility of the caller to know how to implement that function. bar doesn't have to care. But consequently, bar's documentation has to describe what kind of function it needs.
Often this is very appropriate. The obvious example is the map builtin function. map implements the logic of applying a function to each item in a list, and giving back a list of results. map itself neither knows nor cares about what the items are, or what the function is doing to them. map's documentation has to describe that it needs a function of one argument, and each caller of map has to know how to implement or find a suitable function. But this arrangement is great; it allows you to pass a list of your custom objects, and a function which operates specifically on those objects, and map can go away and do its generic thing.
But often this arrangement is inappropriate. A function gives a name to a high level operation and hides the internal implementation details, so you can think of the operation as a unit. Allowing part of its operation to be passed in from outside as a function parameter exposes that it works in a way that uses that function's interface.
A more concrete (though somewhat contrived) example may help. Let's say I've implemented data types representing Person and Job, and I'm writing a function name_and_title for formatting someone's full name and job title into a string, for client code to insert into email signatures or on letterhead or whatever. It's obviously going to take a Person and a Job. It could potentially take a function parameter to let the caller decide how to format the person's name: something like lambda firstname, lastname: lastname + ', ' + firstname. But to do this is to expose that I'm representing people's names with a separate first name and last name. If I want to change to supporting a middle name, then either name_and_title won't be able to include the middle name, or I have to change the type of the function it accepts. When I realise that some people have 4 or more names and decide to change to storing a list of names, then I definitely have to change the type of function name_and_title accepts.
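A rough sketch of that design, to make the coupling visible. Everything here is hypothetical: the fields of Person and Job, the format_name parameter, and the output format are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Person:
    firstname: str  # assumed representation: exactly two name parts
    lastname: str


@dataclass
class Job:
    title: str


def name_and_title(person, job, format_name=None):
    """Format a name and job title; format_name is an optional callback."""
    if format_name is None:
        name = person.firstname + " " + person.lastname
    else:
        # The callback's two-argument signature leaks how names are stored.
        name = format_name(person.firstname, person.lastname)
    return name + ", " + job.title


ada = Person("Ada", "Lovelace")
analyst = Job("Analyst")
default_text = name_and_title(ada, analyst)
custom_text = name_and_title(
    ada, analyst, lambda firstname, lastname: lastname + ", " + firstname
)
```

Adding a middle name to Person would force a change to every format_name callback that callers have written, which is the cost being described.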
So for your bar example, we can't say which is better, because it's an abstract example with no meaning. It depends on whether the call to partialfun is an implementation detail of whatever bar is supposed to be doing, or whether the call to partialfun is something that the caller knows about (and might want to do something else). If it's "part of" bar, then it shouldn't be a parameter. If it's "part of" the caller, then it should be a parameter.
It's worth noting that bar could have a huge number of function parameters. You call sum, map, and operator.mul, which could all be parameterised to make bar more flexible:
def bar(fn, xs, ys, g, h, i):
    return fn(g(h(i, xs, ys)))
And the way in which g is called on the output of h could be abstracted too:
def bar(fn, xs, ys, g, h, i, j):
    return fn(j(g, h(i, xs, ys)))
And we can keep going on and on, until bar doesn't do anything at all, and everything is controlled by the functions passed in, and the caller might as well have just directly done what they want done rather than writing 100 functions to do it and passing those to bar to execute the functions.
So there really isn't a definite answer one way or the other that applies all the time. It depends on the particular code you're writing.

Python function long parameter list

I'm looking for the best way to give a list of arguments to my function:
def myFunc(*args):
    retVal=[]
    for arg in args:
        retVal.append(arg+1)
    return "test",retVal
The problem is that this becomes very annoying when you have a long list of parameters to pass to your function, because you have to write the whole list twice. With 10 or more parameters with complete names, it becomes really (really) heavy.
test,alpha,beta,gamma,delta,epsilon,zeta,eta,theta,iota=myFunc(alpha,beta,gamma,delta,epsilon,zeta,eta,theta,iota)
So I thought about something like this:
w=alpha,beta,gamma,delta,epsilon,zeta,eta,theta,iota
test,w=myFunc(w)
But then I still have to do:
alpha,beta,gamma,delta,epsilon,zeta,eta,theta,iota=w
Is there any shorter way to give a list of parameters to a function and get it back?
Or to give the function a pointer so that it can modify the parameters directly?
This is what I'm looking for :
w=alpha,beta,gamma,delta,epsilon,zeta,eta,theta,iota
test,w=myFunc(w)
# And directly get my parameters modified to be able to print them :
print alpha,[...],iota
Two options:
1.) Try reducing the number of arguments by splitting the logic into multiple functions.
2.) If 1.) is not possible, you can use a dictionary as a single argument, encapsulating all your arguments. This would be a flexible (the signature of the function stays the same, even if you take away or add parameters) and mostly readable solution (meaningful keys in the dictionary).
Simply make the function return a dict. Then you can call it using myFunc(**yourdict) to use the dict items as arguments and if you return yourdict you get back the same dict (with probably modified values) - or you just modify the original dict and don't return one at all.
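A minimal sketch of both variants, assuming Python 3. The functions increment_all and increment_in_place are hypothetical stand-ins for myFunc; note that the function has to accept keyword arguments (here via **params) for the ** call syntax to work.

```python
def increment_all(**params):
    """Return a new dict with every value incremented by one."""
    return {name: value + 1 for name, value in params.items()}


def increment_in_place(params):
    """Modify the caller's dict directly; nothing needs to be returned."""
    for name in params:
        params[name] += 1


values = {"alpha": 1, "beta": 2, "gamma": 3}

# Variant 1: unpack the dict into keyword arguments, get a dict back.
updated = increment_all(**values)

# Variant 2: pass the dict itself and let the function mutate it.
increment_in_place(values)
```

Either way the names are written once, as dictionary keys, and values such as updated["alpha"] can be read back by name afterwards.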
