Sort by length of lists [duplicate] - python

I want to sort a list of strings based on the string length. I tried to use sort as follows, but it doesn't seem to give me the correct result.
xs = ['dddd','a','bb','ccc']
print xs
xs.sort(lambda x,y: len(x) < len(y))
print xs
['dddd', 'a', 'bb', 'ccc']
['dddd', 'a', 'bb', 'ccc']
What might be wrong?

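When you pass a comparison function to sort (which only Python 2 allows), it needs to return an integer, not a boolean: negative if x should come first, zero if the two are equal, and positive if y should come first. Your lambda returns True or False, which count as 1 and 0, so the sort never sees a "less than" result. Your code should instead read as follows:
xs.sort(lambda x,y: cmp(len(x), len(y)))
Note that cmp is a Python 2 builtin function such that cmp(x, y) returns -1 if x is less than y, 0 if x is equal to y, and 1 if x is greater than y. Python 3 removed both cmp and the comparison-function argument of sort, so there you use the key parameter instead.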
Of course, you can instead use the key parameter:
xs.sort(key=lambda s: len(s))
This tells the sort method to order based on whatever the key function returns.
EDIT: Thanks to balpha and Ruslan below for pointing out that you can just pass len directly as the key parameter to the function, thus eliminating the need for a lambda:
xs.sort(key=len)
And as Ruslan points out below, you can also use the built-in sorted function instead of the list.sort method; sorted creates a new list rather than sorting the existing one in place:
print(sorted(xs, key=len))
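If you are on Python 3 and still want to keep a comparison-style function like the one in the question, a minimal sketch is to wrap it with functools.cmp_to_key. The function name compare_by_length is made up for this illustration:

```python
from functools import cmp_to_key

def compare_by_length(x, y):
    # Hypothetical helper: negative if x is shorter, zero if the lengths
    # are equal, positive if x is longer.
    return len(x) - len(y)

xs = ['dddd', 'a', 'bb', 'ccc']
# cmp_to_key turns the two-argument comparison into a key function.
xs.sort(key=cmp_to_key(compare_by_length))
print(xs)  # ['a', 'bb', 'ccc', 'dddd']
```

For a plain length sort, key=len is shorter and does the same job; the wrapper is mainly useful when porting old comparison functions.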

The same as in Eli's answer - just using a shorter form, because you can skip a lambda part here.
Creating new list:
>>> xs = ['dddd','a','bb','ccc']
>>> sorted(xs, key=len)
['a', 'bb', 'ccc', 'dddd']
In-place sorting:
>>> xs.sort(key=len)
>>> xs
['a', 'bb', 'ccc', 'dddd']

The easiest way to do this is (with xs being your own list, not the built-in list type):
xs.sort(key=lambda x: len(x))

I would like to add how the key function works while sorting.
Decorate-Sort-Undecorate design pattern:
Python’s support for a key function when sorting follows what is known as the decorate-sort-undecorate design pattern.
It proceeds in 3 steps:
1. Each element of the list is temporarily replaced with a “decorated” version that includes the result of the key function applied to the element.
2. The list is sorted based upon the natural order of the keys.
3. The decorated elements are replaced by the original elements.
From the docs: the key parameter specifies a function to be called on each list element prior to making comparisons.
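The three decorate-sort-undecorate steps can be written out by hand. This is only a sketch of the idea, not how the interpreter actually does it; dsu_sort is a made-up name, and the position index is an extra I added so that ties keep their original order and the elements themselves are never compared:

```python
def dsu_sort(items, key):
    # Decorate: pair each element with its key and its original position.
    decorated = [(key(item), index, item) for index, item in enumerate(items)]
    # Sort: tuples compare by key first, then by position on ties.
    decorated.sort()
    # Undecorate: throw away the key and position, keep the element.
    return [item for _, _, item in decorated]

print(dsu_sort(['dddd', 'a', 'bb', 'ccc'], len))  # ['a', 'bb', 'ccc', 'dddd']
```

In everyday code you would just call sorted(xs, key=len); the sketch only shows what the key parameter saves you from writing.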

Write a function lensort to sort a list of strings based on length. This version sorts the list in place with a simple exchange sort:
def lensort(a):
    n = len(a)
    for i in range(n):
        for j in range(i+1,n):
            if len(a[i]) > len(a[j]):
                # swap so the shorter string moves to the front
                a[i], a[j] = a[j], a[i]
    return a
print(lensort(["hello","bye","good"]))

I can do it using the two methods below. First, using a function:
def lensort(x):
    list1 = []
    for i in x:
        list1.append([len(i),i])
    return sorted(list1)
lista = ['a', 'bb', 'ccc', 'dddd']
a=lensort(lista)
print([l[1] for l in a])
Or as a one-liner using a lambda, as already answered above:
lista = ['a', 'bb', 'ccc', 'dddd']
lista.sort(key = lambda x:len(x))
print(lista)
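One thing to be aware of: the two methods do not treat equal-length strings the same way. Sorting [length, string] pairs breaks ties alphabetically, while sorting with a key keeps tied strings in their original order. A small sketch with a made-up input list:

```python
words = ['bb', 'aa', 'c']

# key=len is stable: 'bb' stays ahead of 'aa' because it came first.
by_key = sorted(words, key=len)

# [length, string] pairs compare the string when the lengths tie,
# so 'aa' moves ahead of 'bb'.
pairs = sorted([len(w), w] for w in words)
by_pairs = [w for _, w in pairs]

print(by_key)    # ['c', 'bb', 'aa']
print(by_pairs)  # ['c', 'aa', 'bb']
```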

def lensort(list_1):
    list_2 = []
    list_3 = []
    for i in list_1:
        list_2.append([i,len(i)])
    list_2.sort(key = lambda x : x[1])
    for i in list_2:
        list_3.append(i[0])
    return list_3
This works for me!

Related

Alternatives to using in-place list methods within a list comprehension?

I understand that in-place list methods return None instead of the mutated list. As far as I can see, this makes it impossible to use these methods as part of the internal logic of a list comprehension.
What is the most pythonic way to create a list comprehension whose members result from mutating other lists? In other words: what is the best alternative to this (non-functioning) line:
new_list = [old_list.insert(0, "X") for old_list in list_of_old_lists]
Which results in a list of Nones because list.insert() returns None.
Is it simply not possible to do this in an elegant single-line of code without a lot of slicing and concatenating?
The example above is trivial for the sake of illustrating my question but in reality I'd like to do this in more complex situations in lieu of multiple nested 'for' loops.
Here's a simplified sample of what I'm trying to do:
word = 'abcdefg'
variations_list = []
characters_to_insert = ['X', 'Y', 'Z']
for character in characters_to_insert:
    for position in range(len(word) + 1):
        w = list(word)
        w.insert(position, character)
        this_variation = ''.join(w)
        variations_list.append(this_variation)
for v in variations_list:
    print(v)
This works fine using nested 'for' loops (my real application is much more complex/verbose than this sample).
But I cannot do the same thing using list comprehension because the 'insert' method returns None:
variations_list_comprehension = [list(word).insert(position, character) for position in range(len(word) +1) for character in ['X', 'Y', 'Z']]
for v in variations_list_comprehension:
    print(v)
Results in a list of None values because the in-place mutations return "None".
If you don't care about the results of mutating the other list, then you don't need to use an interim list:
variations_list = [word[:i] + char + word[i:]
                   for char in characters_to_insert
                   for i in range(len(word) + 1)]
['Xabcdefg', 'aXbcdefg', 'abXcdefg', 'abcXdefg', 'abcdXefg', 'abcdeXfg', 'abcdefXg', 'abcdefgX',
'Yabcdefg', 'aYbcdefg', 'abYcdefg', 'abcYdefg', 'abcdYefg', 'abcdeYfg', 'abcdefYg', 'abcdefgY',
'Zabcdefg', 'aZbcdefg', 'abZcdefg', 'abcZdefg', 'abcdZefg', 'abcdeZfg', 'abcdefZg', 'abcdefgZ']
I would still say this is at best a borderline comprehension: it's much easier to follow as a for loop.
Not everything should or needs to be solved by a comprehension. for-loops aren't bad - sometimes they are better than comprehensions, especially because one tends to avoid doing too much in one line automatically.
But if you really want a list-comprehension solution I would use a helper function that wraps the in-place operation:
def insert(it, index, value):
    lst = list(it)
    lst.insert(index, value)
    return lst
[''.join(insert(word, position, character)) for position in range(len(word) +1) for character in ['X', 'Y', 'Z']]
However to be really equivalent to your loopy solution you need to swap the loops in the comprehension:
[''.join(insert(word, position, character)) for character in ['X', 'Y', 'Z'] for position in range(len(word) +1)]
The advantage here is that wrapping the in-place method can be applied in a lot of cases, it doesn't just work in this case (you can wrap any in-place function that way). It's verbose, but it's very readable and re-useable!
Personally I would use a generator function with loops, because you can use it to create a list but you could also create the items on demand (without needing the list):
def insert_characters_everywhere(word, chars):
    # iterate over the chars argument, not a global list
    for character in chars:
        for position in range(len(word) + 1):
            lst = list(word)
            lst.insert(position, character)
            yield ''.join(lst)
list(insert_characters_everywhere('abcdefg', ['X', 'Y', 'Z']))
I think it is important to understand what you are actually trying to achieve here.
new_list = [old_list.insert(0, "X") for old_list in list_of_old_lists]
Case 1: In your code above, are you trying to create a new new_list containing old(!!!) lists updated to include 'X' character as their first element? If that is the case, then list_of_old_lists will be a list of "new lists".
For example,
list_of_old_lists = [['A'], ['B']]
new_list = []
for old_list in list_of_old_lists:
    old_list.insert(0, 'X')
    new_list.append(old_list)
print(list_of_old_lists)
print(new_list)
print(list_of_old_lists == new_list)
will print:
[['X', 'A'], ['X', 'B']]
[['X', 'A'], ['X', 'B']]
True
That is, new_list is a shallow copy of list_of_old_lists containing "updated" lists. If this is what you want, then you can do something like this using list comprehension:
[old_list.insert(0, "X") for old_list in list_of_old_lists]
new_list = list_of_old_lists[:]
instead of the for-loop in my example above.
Case 2: Or, are you trying to create a new list containing updated lists while having list_of_old_lists hold the original lists? In this case, you can use list comprehension in the following way:
new_list = [['X'] + old_list for old_list in list_of_old_lists]
Then:
In [14]: list_of_old_lists = [['A'], ['B']]
...: new_list = [['X'] + old_list for old_list in list_of_old_lists]
...: print(list_of_old_lists)
...: print(new_list)
...: print(new_list == list_of_old_lists)
...:
[['A'], ['B']]
[['X', 'A'], ['X', 'B']]
False

Returning semi-unique values from a list

Not sure how else to word this, but say I have a list containing the following sequence:
[a,a,a,b,b,b,a,a,a]
and I would like to return:
[a,b,a]
How would one do this in principle?
You can use itertools.groupby. It puts consecutive equal elements into the same group and returns an iterator of (key, group) pairs, where the key is the unique element you are looking for:
from itertools import groupby
lst = ['a','a','a','b','b','b','a','a','a']
[k for k, _ in groupby(lst)]
# ['a', 'b', 'a']
Psidom's way is a lot better, but I may as well write this so you can see how it'd be possible just using basic loops and statements. It's always good to figure out what steps you'd need to take for any problem, as it usually makes coding the simple things a bit easier :)
original = ['a','a','a','b','b','b','a','a','a']
new = [original[0]]
for letter in original[1:]:
    if letter != new[-1]:
        new.append(letter)
Basically it will append a letter if the previous letter is something different.
Using list comprehension:
original = ['a','a','a','b','b','b','a','a','a']
packed = [original[i] for i in range(len(original)) if i == 0 or original[i] != original[i-1]]
print(packed) # > ['a', 'b', 'a']
Similarly (thanks to pylang) you can use enumerate instead of range:
[ x for i,x in enumerate(original) if i == 0 or x != original[i-1] ]
more_itertools has an implementation of the unique_justseen recipe from itertools:
import more_itertools as mit
list(mit.unique_justseen(["a","a","a","b","b","b","a","a","a"]))
# ['a', 'b', 'a']
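If you would rather not install more_itertools, the same effect can be had from a few lines built on groupby. This is a rough sketch in the spirit of that recipe, not the library's actual source; the name collapse_runs and the optional key argument are my own choices:

```python
from itertools import groupby

def collapse_runs(iterable, key=None):
    # Yield the first element of every run of consecutive equal items.
    for _, group in groupby(iterable, key):
        yield next(group)

print(list(collapse_runs(['a', 'a', 'a', 'b', 'b', 'b', 'a', 'a', 'a'])))
# ['a', 'b', 'a']
```

Because it is a generator, it also works on iterables that are too large to hold in a list.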

How to call a function for every for iteration of a zip of two lists?

I have two lists:
a = [1, 2, 3]
b = ['a', 'b', 'c']
I want to call a function for each pair in the zip of those two lists, but without simply writing it inside a for loop. Is there a pythonic way? I tried with a map:
map(func(i,v) for i,v in zip(a,b))
but it does not work.
The pythonic way is the for loop:
for i, v in zip(a, b):
    func(i, v)
Clear, concise, readable. What's not to like?
A list comprehension is usually as fast as map or faster. If you collect the results in a list (as in the example), a comprehension also tends to be faster than a for loop that appends one item at a time:
a = [1, 2, 3]
b = ['a', 'b', 'c']
c = []
def foo(x, y):
    global c
    result = x * y
    c.append(result)
    return result
>>> c
[]
>>> [foo(x, y) for x, y in zip(a, b)]
['a', 'bb', 'ccc']
>>> c
['a', 'bb', 'ccc']
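If the function func doesn't return anything (that is, it returns None), you could use:
any(func(i, v) for i,v in zip(a, b))
This calls func for every pair, because None is falsy and any() therefore never stops early, and it returns False without accumulating the results.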
This would not be considered "Pythonic" by many since any() is being used for its side-effects, and therefore isn't very explicit.
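As for why the map attempt in the question fails: map expects the function itself as its first argument, followed by the iterables, not a generator of already-made calls. A minimal sketch, where func is a stand-in I made up since the question does not show the real one:

```python
from itertools import starmap

def func(i, v):
    # Hypothetical stand-in for the question's function.
    return v * i

a = [1, 2, 3]
b = ['a', 'b', 'c']

# map accepts several iterables and passes one item from each per call.
results_map = list(map(func, a, b))

# starmap unpacks each tuple produced by zip into the arguments.
results_starmap = list(starmap(func, zip(a, b)))

print(results_map)      # ['a', 'bb', 'ccc']
print(results_starmap)  # ['a', 'bb', 'ccc']
```

In Python 3 both map and starmap are lazy, so nothing is called until the result is consumed; that is why list() is wrapped around them here.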

Python - compare nested lists and append matches to new list?

I wish to compare two nested lists of unequal length. I am interested only in a match between the first element of each sub list. Should a match exist, I wish to add the match to another list for subsequent transformation into a tab delimited file. Here is an example of what I am working with:
x = [['1', 'a', 'b'], ['2', 'c', 'd']]
y = [['1', 'z', 'x'], ['4', 'z', 'x']]
match = []
def find_match():
for i in x:
for j in y:
if i[0] == j[0]:
match.append(j)
return match
This returns:
[['1', 'x'], ['1', 'y'], ['1', 'x'], ['1', 'y'], ['1', 'z', 'x']]
Would it be good practise to reprocess the list to remove duplicates or can this be done in a simpler fashion?
Also, is it better to use tuples and/or tuples of tuples for the purposes of comparison?
Any help is greatly appreciated.
Regards,
Seafoid.
Use sets to obtain collections with no duplicates.
You'll have to use tuples instead of lists as the items because set items must be hashable.
The code you posted doesn't seem to generate the output you posted. I do not have any idea how you are supposed to generate that output from that input. For example, the output has 'y' and the input does not.
I think the design of your function could be much improved. Currently you define x, y, and match as the module level and read and mutate them explicitly. This is not how you want to design functions—as a general rule, a function shouldn't mutate something at the global level. It should be explicitly passed everything it needs and return a result, not implicitly receive information and change something outside itself.
I would change
x = some list
y = some list
match = []
def find_match():
    for i in x:
        for j in y:
            if i[0] == j[0]:
                match.append(j)
    return match # This is the only line I changed. I think you meant
                 # your return to be over here?
find_match()
to
x = some list
y = some list
def find_match(x, y):
    match = []
    for i in x:
        for j in y:
            if i[0] == j[0]:
                match.append(j)
    return match
match = find_match(x, y)
To take that last change to the next level, I usually replace the pattern
def f(...):
    return_value = []
    for...
        return_value.append(foo)
    return return_value
with the similar generator
def f(...):
    for...
        yield foo
which would make the above function
def find_match(x, y):
    for i in x:
        for j in y:
            if i[0] == j[0]:
                yield j
Another way to express this generator's effect is with the generator expression (j for i in x for j in y if i[0] == j[0]).
I don't know if I interpret your question correctly, but given your example it seems that you might be using a wrong index:
change
if i[1] == j[1]:
into
if i[0] == j[0]:
You can do this a lot more simply by using sets. Note that this gives you the matching first elements, not the sublists themselves:
set_x = set([i[0] for i in x])
set_y = set([i[0] for i in y])
matches = list(set_x & set_y)
if i[1] == j[1]
checks whether the second elements of the arrays are identical. You want if i[0] == j[0].
Otherwise, I find your code quite readable and wouldn't necessarily change it.
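A simpler expression should work here too, provided the matching sublists sit at the same position in x and y (zip pairs them up index by index):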
list_of_lists = filter(lambda l: l[0][0] == l[1][0], zip(x, y))
map(lambda l: l[1], list_of_lists)
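To combine the set idea with the original goal of collecting the matching sublists, one possible sketch is to build a set of the first elements of x and then filter y against it. The name find_matches is made up for this example, and it assumes the first elements are hashable (strings here):

```python
def find_matches(x, y):
    # First elements of x, stored in a set for fast membership tests.
    keys_in_x = {row[0] for row in x}
    # Keep each sublist of y whose first element also appears in x.
    return [row for row in y if row[0] in keys_in_x]

x = [['1', 'a', 'b'], ['2', 'c', 'd']]
y = [['1', 'z', 'x'], ['4', 'z', 'x']]
print(find_matches(x, y))  # [['1', 'z', 'x']]
```

Each sublist of y appears at most once in the result, even if several sublists of x share its first element, which sidesteps the duplicate problem without a second pass.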
