I need to sort a dictionary file before I format it. I used list.sort and it put it in ASCII order (capital letters before lowercase). So I found this code online to do the sort. It works, but I don't fully understand how the lambda works with 2 variables and with cmp(). I am confused as to what cmp is comparing and which two variables lambda is using. Please explain how lambda works with cmp inside of the sort function.
f = open("swedish.txt", 'r')
f2 = open("swed.txt", 'w')
doc = f.read().split('\n')
doc.sort(lambda x, y: cmp(x.lower(), y.lower()))
for line in doc:
    f2.write(line + '\n')
f.close()
f2.close()
To sort a list [b, a, c], one has to compare a with b, a with c, and so on. This lambda does nothing else: it compares two elements of the list with each other.
For your case, the key-argument is more suitable:
doc.sort(key=str.lower)
@Danial has the better answer for lower-case comparisons. But since you asked about the lambda: in Python 2.x, you can pass in a comparison function that takes 2 variables and returns -1, 0 or 1 depending on whether var 1 is less than, equal to or greater than var 2. This feature has been removed in Python 3, so consider it deprecated.
A lambda is just a function that hasn't been assigned to a variable and it is commonly used in cases like this where a call requires a function but you don't want to litter your code with small function definitions. lambda x,y: says you are defining a function with variables x and y, and cmp(x.lower(), y.lower()) is the implementation. In this case, you lower case the strings and cmp returns -1, 0 or 1 like sort requires.
Since you just want to lower case things, the cmp solution is much slower than the key solution. With key=, you only lower case each key once. With the cmp function, you lower case two keys for every single comparison.
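For reference, here is a minimal Python 3 sketch of both approaches. Since list.sort no longer accepts a comparison function and cmp() no longer exists in Python 3, the comparison version goes through functools.cmp_to_key. The compare_ignoring_case helper and the sample words are made up for illustration.

```python
from functools import cmp_to_key

words = ["banana", "Apple", "cherry", "apple"]

def compare_ignoring_case(x, y):
    # Stand-in for Python 2's cmp(x.lower(), y.lower()):
    # negative if x sorts first, zero if equal, positive if y sorts first.
    a, b = x.lower(), y.lower()
    return (a > b) - (a < b)

# The sort calls the comparison function with two elements at a time.
by_cmp = sorted(words, key=cmp_to_key(compare_ignoring_case))

# With key=, str.lower is called once per element and the results are compared.
by_key = sorted(words, key=str.lower)

print(by_cmp)
print(by_key)
```

Both calls give the same order here. The key= form is the one to prefer when each element can be mapped to a sortable value on its own.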
I want to sort this list a = [31415926535897932384626433832795, 1, 3, 10, 3, 5]. To reduce time complexity I want to first check whether the 2 elements are of the same length or not. If they are not the same length I will swap them according to their length; otherwise I will check which number is bigger and swap them. I want to implement this using the .sort() function, which has a parameter called key. I can use a.sort(key=len) but it will only work for test cases with inputs of different lengths. Please help me with this.
When you are using sort() in Python, you can provide a function or anonymous (lambda) function to it as a basis for sorting.
In this case, you can use lambda x, where x is each element of a.
Returning a tuple from the key function makes the sort compare the tuple's items in order, so earlier items take priority. What you need is this:
a.sort(key=lambda x: (len(str(x)), x))
In the above code, a is sorted first by len(str(x)), then by the value of x.
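A short sketch of the tuple key in action. The digits list of strings is a made-up second example, included because that is where a length-first key changes the result; for the non-negative integers in a, a plain a.sort() would give the same order.

```python
a = [31415926535897932384626433832795, 1, 3, 10, 3, 5]
a.sort(key=lambda x: (len(str(x)), x))
print(a)  # [1, 3, 3, 5, 10, 31415926535897932384626433832795]

# Hypothetical example: numbers stored as strings.
# A plain sort would compare them character by character ("10" < "2").
digits = ["10", "9", "100", "2"]
digits.sort(key=lambda s: (len(s), s))
print(digits)  # ['2', '9', '10', '100']
```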
This question already has answers here:
What do lambda function closures capture?
Creating functions (or lambdas) in a loop (or comprehension)
I have a vector with some parameters and I would like to create a dictionary of anonymous (lambda) functions in Python3.
The goal of this is to create a callable function that gives values equal to the sum of these functions with the same argument.
I am struggling even to create a dictionary with the original lambda function objects and get them to behave consistently. I use the following code:
import numpy as np
a = np.linspace(0,2.0,10)
func = {}
for i,val in enumerate(a):
    print(i)
    func['f{}'.format(i)] = lambda x: x**val
print(func['f0'](0.5))
print(func['f1'](0.5))
print(func['f2'](0.5))
print(func['f3'](0.5))
The output of the final print statements gives the same value, whereas I would like it to give the values corresponding to x**val with the value of val coming from the originally constructed array a.
I guess what's happening is that the lambda functions always reference the "current" value of val, which, after the loop is executed is always the last value in the array? This makes sense because the output is:
0
1
2
3
4
5
6
7
8
9
0.25
0.25
0.25
0.25
The output makes sense because it is the result of 0.5**2.0 and the exponent is the last value that val takes on in the loop.
I don't really understand this because I would have thought val would go out of scope after the loop is run, but I'm assuming this is part of the "magic" of lambda functions in that they will keep variables that they need to compute the function in scope for longer.
I guess what I need to do is to insert the "literal" value of val at that point into the lambda function, but I've never done that and don't know how.
I would like to know how to properly insert the literal value of val into the lambda functions constructed at each iteration of the loop. I would also like to know if there is a better way to accomplish what I need to.
EDIT: it has been suggested that this question is a duplicate. I think it is a duplicate of the list comprehension post because the best answer is virtually identical and lambda functions are used.
I think it is not a duplicate of the lexical closures post, although I think it is important that this post was mentioned. That post gives a deeper understanding of the underlying causes for this behavior but the original question specifically states "mindful avoidance of lambda functions," which makes it a bit different. I'm not sure what the purpose of that mindful avoidance is, but the post did teach related lessons on scoping.
The problem with this approach is that the val used inside your lambda function is the live variable from the enclosing scope. When each lambda is called, the value used for val in the formula is the current value of val, so all your results are the same.
The solution is to "freeze" the value of val when creating each lambda function. The way that makes it easiest to understand what is going on is to have an outer lambda function that takes val as an input and returns your desired (inner) lambda, but with val frozen in a different scope. Note that the outer function is called and immediately discarded; its return value is the original function you had:
for i,val in enumerate(a):
    print(i)
    func[f'f{i}'] = (lambda val: (lambda x: x**val))(val)
shorter version
Now, due to the way Python stores default arguments to functions, it is possible to store the "current val value" as a default argument in the lambda and avoid the need for an outer function. But that spoils the lambda signature, and the reason that value is there is harder to understand:
for i,val in enumerate(a):
    print(i)
    func[f'f{i}'] = lambda x, val=val: x**val
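To see the difference side by side, here is a small self-contained sketch. It uses a plain list of three exponents instead of the np.linspace array so it runs without NumPy. The names exponents, late, bound and total are made up for illustration, and total is only one possible way to get the "sum of these functions" the question asks for.

```python
exponents = [0.0, 1.0, 2.0]

# Late binding: each lambda looks up val when it is called,
# and by then the loop has left val at its last value (2.0).
late = {}
for i, val in enumerate(exponents):
    late[f"f{i}"] = lambda x: x ** val

# Default-argument capture: val=val is evaluated when the lambda is created,
# so each function keeps its own exponent.
bound = {}
for i, val in enumerate(exponents):
    bound[f"f{i}"] = lambda x, val=val: x ** val

late_results = [late[name](0.5) for name in ("f0", "f1", "f2")]
bound_results = [bound[name](0.5) for name in ("f0", "f1", "f2")]
print(late_results)   # [0.25, 0.25, 0.25]
print(bound_results)  # [1.0, 0.5, 0.25]

def total(x):
    # Sum of all the stored functions evaluated at the same argument.
    return sum(f(x) for f in bound.values())

print(total(0.5))  # 1.75
```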
I have a tuple of functions that I want to pre-load with some data. Currently the way I am doing this is below. Essentially, I make a list of the new functions, and add the lambda functions to it one at a time, then reconvert to a tuple. However, when I use these functions in a different part of the code, every one of them acts as if it were the last one in the list.
def newfuncs(data, funcs):
    newfuncs = []
    for f in funcs:
        newf = lambda x: f(x, data)
        newfuncs.append(newf)
    return tuple(newfuncs)
Here is a simple example of the problem
funcs = (lambda x, y: x + y, lambda a, b: a - b)
funcs = newfuncs(10, funcs)
print(funcs[0](5))
print(funcs[1](5))
I would expect the number 15 to be printed, then -5. However, this code prints the number -5 twice. If anyone can help me understand why this is happening, it would be greatly appreciated. Thanks!
As mentioned, the issue is with the variable f, which is the same variable assigned to all lambda functions, so at the end of the loop, every lambda sees the same f.
The solution here is to either use functools.partial, or create a scoped default argument for the lambda:
def newfuncs(data, funcs):
    newfuncs = []
    for f in funcs:
        newf = lambda x, f=f: f(x, data) # note the f=f bit here
        newfuncs.append(newf)
    return tuple(newfuncs)
Calling these lambdas as before now gives:
15
-5
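The functools.partial option mentioned above could look something like the sketch below. Because partial binds leading positional arguments and data is the second argument of each function, this version goes through a small helper, call_with_data, which is a hypothetical name introduced here and not part of the original code.

```python
from functools import partial

def call_with_data(f, data, x):
    # Hypothetical helper: puts the arguments in the order f expects.
    return f(x, data)

def newfuncs(data, funcs):
    # partial() stores f and data at creation time, so each resulting
    # callable keeps its own f instead of sharing the loop variable.
    return tuple(partial(call_with_data, f, data) for f in funcs)

funcs = (lambda x, y: x + y, lambda a, b: a - b)
funcs = newfuncs(10, funcs)
results = [funcs[0](5), funcs[1](5)]
print(results)  # [15, -5]
```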
If you're using Python 3.x, make sure to take a look at this comment by ShadowRanger as a possible safety feature to the scoped default arg approach.
This is a well-known Python "issue," or should I say "this is just the way Python is."
You created the tuple:
( x => f(x, data), x => f(x, data) )
But what is f? f is not evaluated until you finally call the functions!
First, f was (x, y)=>x+y. Then in your for-loop, f was reassigned to (x, y)=>x-y.
When you finally get around to calling your functions, then, and only then, will the value of f be looked up. What is the value of f at this point? The value is (x, y)=>x-y for all of your functions. All of your functions do subtraction. This is because f is reassigned. There is ONLY ONE f. And the value of that one and only one f is set to the subtraction function, before any of your lambdas ever get called.
ADDENDUM
In case anyone is interested, different languages approach this problem in different ways. JavaScript is rather interesting (some would say confusing) here because it does things the Python way, which the OP found unexpected, as well as a different way, which the OP would have expected. In JavaScript:
> let funcs = []
> for (let f of [(x,y)=>x+y, (x,y)=>x-y]) {
  funcs.push(x=>f(x,10));
}
> funcs[0](5)
15
> funcs[1](5)
-5
However, if you change let to var above, it behaves like Python and you get -5 for both! This is because with let, you get a different f for each iteration of the for-loop; with var, the whole function shares the same f, which keeps getting reassigned. This is how Python works.
cᴏʟᴅsᴘᴇᴇᴅ has shown you that the way to do what you expect is to make sure you get a different f for each iteration, which Python allows you to do in a pretty neat way: defaulting the second argument of the lambda to an f local to that iteration. It's pretty cool, so their answer should be accepted if it helps you.
I have just finished my code for a school project, but have used one line of code from stack overflow that is slightly more advanced than the knowledge that's in my "skill comfort zone".
I do understand what it does, why it exists, etc.. But still cannot convert it into "human syntax" for my individual report. Could anyone give me a bit of their time and explain, as precisely as possible, the underlying mechanism in this line of code? Thanks in advance!
sites.sort(key = lambda x: x[0])
The role it has within my program is sorting a dictionary by the integers in its first column, from smallest to biggest.
Need to demonstrate that I fully understand this line of code, which I frankly do not.
Thanks!
A lambda function is basically like any other function, with some restrictions. Its body must be a single expression; it cannot contain statements. That expression is evaluated when the lambda function is called and the result of that expression is returned.
sites.sort(key = lambda x: x[0])
is the same as
def f(x):
    return x[0]
sites.sort(key = f)
In this case, the lambda function is passed to the sort method as a key function. These are used in sorting to sort things based on something other than their value. Each element e of sites is passed to the key function, and then they are sorted based on f(e) or e[0] rather than the value of e.
When calling sort on an object, Python passes each of the elements contained inside that object to the function you specify as the key parameter.
A lambda function can be dissected in the following parts:
lambda <args>: <what to do with args>
In your lambda function you have a single arg x as the <args>, and your <what to do with args> is to get the zero element of it.
That element x[0] is going to be used in the comparisons sort performs.
In your example, sites must contain elements that themselves contain elements, for example, nested lists: l = [[1, 2, 3], [0, 1, 2, 3]].
Using this example, Python is first going to pass [1, 2, 3] to the lambda as the value of x and the lambda function is going to return x[0] == 1, then it will pass [0, 1, 2, 3] to it and get back x[0] == 0. These values are used during sorting to get the ordering as you require.
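The walk-through above can be reproduced with a few lines. This sketch adds a third, made-up element to the list and also shows operator.itemgetter(0), a standard-library equivalent of lambda x: x[0].

```python
from operator import itemgetter

l = [[1, 2, 3], [0, 1, 2, 3], [2, 0]]

# The values sort() works with: one call of the key function per element.
keys = [(lambda x: x[0])(element) for element in l]
print(keys)  # [1, 0, 2]

# The elements are ordered by those keys, not by the elements themselves.
by_lambda = sorted(l, key=lambda x: x[0])
print(by_lambda)  # [[0, 1, 2, 3], [1, 2, 3], [2, 0]]

# itemgetter(0) builds the same key function without a lambda.
by_itemgetter = sorted(l, key=itemgetter(0))
```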
I am trying to create a function that generates random int values and, once a value has appeared twice, returns the number of int values generated.
I have to use a dictionary.
This is my code so far:
def repeat(a,b):
    dict={}
    d=b-a+2
    for c in range(1,d):
        dict['c']=random.randint(a,b)
        for f in dict:
            if dict['f']==dict['c']:
                return c
First problem: It doesn't work.
>>> repeat(1,5)
Traceback (most recent call last):
  File "<pyshell#144>", line 1, in <module>
    repeat(1,5)
  File "<pyshell#143>", line 7, in repeat
    if dict['f']==dict['c']:
KeyError: 'f'
Second problem: if dict['f']==dict['c']:
Should be true in first step because both values are the same.
I can't find a smart way to compare all values without comparing a key with itself.
Sorry for my bad english, it's kinda rusty and thank you for your time.
Enclosing your variable names in quotes makes them strings - Python is looking for the key of the letter f, not the key with the integer in the f variable.
Simply use the variable normally and it should work as you expected:
import random

def repeat(a, b):
    stored = {}
    d = b - a + 2
    for c in range(1, d):
        stored[c] = random.randint(a, b)
        for f in stored:
            if stored[f] == stored[c]:
                return c
Note also that you are shadowing the built-in function dict() by naming your variable dict - it is preferable to use another name because of this.
This isn't really an answer to your question. @Lattyware told you the problem. But I can't put code in a comment so I'm posting this as an answer.
Your code is using weird variable names, which makes the code harder to understand. I suggest you use variable names that help the reader to understand the program.
I've changed your variable names and added comments. I also put in a "doc string" but I don't really understand this function so I didn't actually write a documentation message.
import random

def repeat(a,b): # short names are okay for a,b as they are just two numbers
    """
    This is the "doc string". You should put in here a short summary of what the function
    does. I won't write one because I don't understand what you are trying to do.
    """
    # do not use built-in names from Python as variable names! So don't use "dict"
    # I often use "d" as a short name for a dictionary if there is only one dictionary.
    # However, I like @Lattyware's name "stored" so I will use it as well.
    stored={}
    # You only used "d" once, and it's the upper bound of a range; might as well just
    # put the upper bound inside the call to range(). If the calculation was really long
    # and difficult I might still use the variable, but this calculation is simple.
    # I guess you can use "c" in the loop, but usually I use "n" for number if the loop
    # is making a series of numbers to use. If it is making a series of indexes I use "i".
    for n in range(1,b-a+2):
        stored[n]=random.randint(a,b)
        # Any for loop that loops over a dictionary is looping over the keys.
        for key in stored:
            # I don't understand what you are trying to do. This loop will always terminate
            # the first time through (when n == 1). The line above this for loop assigns
            # a value to stored[n], and n will be equal to 1 on the first loop; then this
            # test will trivially succeed.
            if stored[key] == stored[n]:
                return n
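If the goal is what the question describes (draw random integers until one value shows up a second time, then return how many were drawn), one possible way to avoid comparing an entry with itself is to use the drawn values as dictionary keys and check membership before storing. This is only a sketch under that reading of the question; the name count_until_repeat and the choice to store the draw number as the value are assumptions.

```python
import random

def count_until_repeat(a, b):
    """Draw random ints in [a, b] until one repeats; return how many were drawn."""
    seen = {}   # maps each drawn value to the draw number on which it first appeared
    count = 0
    while True:
        value = random.randint(a, b)
        count += 1
        if value in seen:       # checked before storing, so no self-comparison
            return count
        seen[value] = count

print(count_until_repeat(1, 5))
```

With b - a + 1 possible values, a repeat must occur by draw number b - a + 2 at the latest, which matches the upper bound used in the original range().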