I'm not sure what the correct term is for this kind of multiplication, but I need to multiply each element of list A by every element of list B and collect the results in a new list, so that the total length of the new list is len(A)*len(B).
As an example
A = [1,3,5], B=[4,6,8]
I need to multiply the two together to get
C = [4,6,8,12,18,24,20,30,40]
I have researched this and found that itertools.product does exactly what I need; however, my code uses it for a specific number of lists and I need to generalise to any number of lists, as requested by the user.
I don't have access to the full code right now but the code asks the user for some lists (can be any number of lists) and the lists can have any number of elements in the lists (but all lists contain the same number of elements). These lists are then stored in one big list.
For example (user input)
A = [2,5,8], B= [4,7,3]
The big list will be
C = [[2,5,8],[4,7,3]]
In this case there are two lists in the big list but in general it can be any number of lists.
Once the code has this I have
print([a*b for a,b in itertools.product(C[0],C[1])])
>> [8,14,6,20,35,15,32,56,24]
The output of this is exactly what I want, however in this case the code is written for exactly two lists and I need it generalised to n lists.
I've been thinking about creating a loop to somehow loop over it n times, but so far I have not been successful. Since C could be of any length, the loop needs a way to know when it has reached the end of the list. I don't need it to compute the product with n lists at the same time, as in:
print([a0*a1*...*a(n-1) for a0,a1,...,a(n-1) in itertools.product(C[0],C[1],C[2],...C[n-1])])
The loop could multiply two lists at a time then use the result from that multiplication against the next list in C and so on until C[n-1].
I would appreciate any advice to see if I'm at least heading in the right direction.
p.s. I am using numpy and the lists are arrays.
You can pass a variable number of arguments to itertools.product with *. * is the unpacking operator: it unpacks the list and passes its values to the function as if they had been passed separately.
import itertools
import math
A = [[1, 2], [3, 4], [5, 6]]
result = list(map(math.prod, itertools.product(*A)))
print(result)
Result:
[15, 18, 20, 24, 30, 36, 40, 48]
You can find many explanations on the internet about * operator. In short, if you call a function like f(*lst), it will be roughly equivalent to f(lst[0], lst[1], ..., lst[len(lst) - 1]). So, it will save you from the need to know the length of the list.
Edit: I just realized that math.prod is a 3.8+ feature. If you're running an older version of Python, you can replace it with its numpy equivalent, np.prod.
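For Python versions before 3.8, a standard-library fallback is also possible. The following is a minimal sketch under that assumption: all_products is a hypothetical helper name (not from the answer above), and functools.reduce with operator.mul stands in for math.prod.

```python
import itertools
from functools import reduce
from operator import mul


def all_products(lists):
    """Multiply one element from each list, for every combination."""
    # reduce(mul, combo, 1) multiplies the members of one combination together.
    return [reduce(mul, combo, 1) for combo in itertools.product(*lists)]


C = [[2, 5, 8], [4, 7, 3]]  # the "big list" from the question
print(all_products(C))  # [8, 14, 6, 20, 35, 15, 32, 56, 24]
```

Because the lists are unpacked with *, the same call works for any number of sublists in C.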
You could use a reduce function that is intended exactly for these types of operations, which is based on recursion and accumulation. I am providing you an example with a primitive function so you can better understand its functionality:
lists = [
[4, 6, 8],
[1, 3, 5]
]
def reduce(function, iterable, initializer=None):
    it = iter(iterable)
    if initializer is None:
        value = next(it)
    else:
        value = initializer
    for element in it:
        value = function(value, element)
    return value
def cmp(a, b):
    for x in a:
        for y in b:
            yield x*y
summed = list(reduce(cmp, lists))
# OUTPUT
[4, 12, 20, 6, 18, 30, 8, 24, 40]
In case you need it sorted just make use of the sort() function.
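The hand-written reduce above mirrors functools.reduce from the standard library, so the same idea can be sketched without redefining it. Here pairwise_products is a hypothetical name chosen for illustration, and it returns a list rather than a generator.

```python
from functools import reduce


def pairwise_products(a, b):
    """Multiply every element of a by every element of b."""
    return [x * y for x in a for y in b]


lists = [
    [4, 6, 8],
    [1, 3, 5],
]

# reduce folds the lists together two at a time, left to right.
result = reduce(pairwise_products, lists)
print(result)          # [4, 12, 20, 6, 18, 30, 8, 24, 40]
print(sorted(result))  # [4, 6, 8, 12, 18, 20, 24, 30, 40]
```

This is the "multiply two lists at a time, then use the result against the next list" approach described in the question.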
For practice, I tried to implement a function that receives two lists as parameters and returns their difference, that is, the elements the lists do not have in common.
I coded the following functions:
list1 = [4,2,5,3,9,11]
list2 = [7,9,2,3,5,1]
def difference(list1,list2):
    return (list(set(list1) - set(list2)))
difference(list1,list2)
AND
def difference_extra_credit(list1,list2):
    return [value for value in list1 if value not in list2]
difference_extra_credit(list1,list2)
Both versions seem to work, but I'm facing the problem that the lists apparently need to have the same length for the functions to work. If the lengths differ, for instance after adding the integer 100 to list 1, it is not shown as a difference between the lists when you print the result.
I didn't manage to find a way to modify the code so that the length of the lists doesn't matter. Does someone have an idea?
Thanks!
If you want symmetric difference, use the ^ operator instead of -
def difference(list1, list2):
    return list(set(list1) ^ set(list2))
Here are the four set operators that combine two sets into one set.
| union : elements in one or both of the sets
& intersection : only elements common to both sets
- difference : elements in the left hand set that are not in the right hand set
^ symmetric difference : elements in either set but not in both.
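As a quick illustration of the four operators listed above, here is a small sketch with two made-up sets (the values are assumptions chosen only for the example). The results are passed through sorted() so the printed order is predictable.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b                 # in one or both sets
intersection = a & b          # only in both sets
difference = a - b            # in a but not in b
symmetric_difference = a ^ b  # in exactly one of the sets

print(sorted(union))                 # [1, 2, 3, 4, 5]
print(sorted(intersection))          # [3, 4]
print(sorted(difference))            # [1, 2]
print(sorted(symmetric_difference))  # [1, 2, 5]
```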
I think this is a more readable way of writing the function
def symmetric_difference(a, b):
    return {*a} ^ {*b}
(* unpacking in set literals requires python 3.5 or later)
Returning a set instead of a list makes it a bit more clear what the function does. The input arguments can be any iterable types, and since set is an unordered data type, returning a set makes it obvious that any ordering in the input data was not preserved.
>>> symmetric_difference(range(3, 8), [1,2,3,4])
{1, 2, 5, 6, 7}
>>> symmetric_difference('hello', 'world')
{'d', 'e', 'h', 'r', 'w'}
Neither of your versions is symmetrical: if you exchange list1 and list2, the result won't be the same.
If you add a number to list2 (not to list1 as your question states), it's not seen as a difference, even though it is one.
You want to perform a symmetric difference, so that no matter how the data is distributed between the two lists (swapped or not), the result remains the same:
def difference(list1,list2):
    return list(set(list1).symmetric_difference(list2))
with your data:
[1, 4, 7, 11]
Trying out your code, it seemed to work fine with me regardless of the length of the lists - when I added 100 to list1, it showed up for both difference functions.
However, there appear to be a few issues with your code that could be causing the problems. Firstly, you accept arguments list1 and list2 for both functions, but these variables are the same name as your list variables. This seems not to cause an issue, but it means that the global variables are no longer accessible, and it is generally a better practice to avoid confusion by using different names for global variables and variables within functions.
Additionally, your function does not take the symmetric difference - it only loops over the variables in the first list, so unique variables in the second list will not be counted. To fix this easily, you could add a line combining your lists into a sum list, then looping over that entire list and checking if each value is in only one of the lists - this would use ^ to do an xor comparison of whether or not the variable is in the two lists, so that it returns true if it is in only one of the lists. This could be done like so:
def difference_extra_credit(l1,l2):
    combined = l1 + l2  # avoid naming a variable "list", which shadows the built-in
    return [value for value in combined if (value in l1) ^ (value in l2)]
Testing this function on my own has resulted in the list [4, 11, 7, 1], and [4, 11, 100, 7, 1] if 100 is added to list1 or list2.
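If the order of first appearance matters and the inputs may contain repeated values, the same idea can be sketched with sets used only for the membership tests. This is an illustrative variant, not code from the question: ordered_symmetric_difference is a hypothetical name, and it assumes the elements are hashable.

```python
def ordered_symmetric_difference(first, second):
    """Values found in exactly one input, in order of first appearance."""
    in_first, in_second = set(first), set(second)
    seen = set()
    result = []
    for value in list(first) + list(second):
        only_in_one = (value in in_first) != (value in in_second)
        if only_in_one and value not in seen:
            seen.add(value)  # report each value once, even if it repeats
            result.append(value)
    return result


list1 = [4, 2, 5, 3, 9, 11, 100]  # one element longer than list2
list2 = [7, 9, 2, 3, 5, 1]
print(ordered_symmetric_difference(list1, list2))  # [4, 11, 100, 7, 1]
print(ordered_symmetric_difference(list2, list1))  # [7, 1, 4, 11, 100]
```

Swapping the arguments changes only the order of the output, not its contents, and the differing lengths play no role.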
I've had a look through the forums and can't find anything to do with multiplying all elements in an array recursively.
I've created the following code that almost does what I want. The goal is to use no loops and only recursion.
Here's the code:
def multAll(k,A):
    multAllAux(k,A)
    return A[:]
def multAllAux(k,A):
    B = [0]
    if A == []:
        return 0
    else:
        B[0] = (A[0] * k)
        B.append(multAllAux(k,A[1:]))
        return B
print(multAllAux(10, [5,12,31,7,25] ))
The current output is:
[50, [120, [310, [70, [250, 0]]]]]
However, it should be:
[50,120,310,70,250]
I know I am close, but I am at a complete loss at this point. The restrictions of no loops and solely recursion has left me boggled!
Your multAllAux function returns a list. If you append a list to another list, you get the nested list structure that you are seeing right now.
If you use the extend method instead, it will work as expected.
>>> a = [1, 2, 3]
>>> a.extend([4, 5])
>>> a
[1, 2, 3, 4, 5]
extend takes the elements from a second list and adds them to the first list, instead of adding the second list itself which is what append does!
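To make the contrast concrete, here is a small side-by-side sketch; the variable names nested and flat are made up for the example.

```python
nested = [1, 2, 3]
nested.append([4, 5])  # adds the list itself as a single element

flat = [1, 2, 3]
flat.extend([4, 5])    # adds each element of the second list

print(nested)  # [1, 2, 3, [4, 5]]
print(flat)    # [1, 2, 3, 4, 5]
```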
Your function also returns a zero at the end of the list, which you don't need. You can try this:
def mult(k, A: list) -> list:
    return [k * A[0]] + mult(k, A[1:]) if A else []
The problem is here:
B.append(multAllAux(k,A[1:]))
What .append(..) does is it takes the argument, considers it as a single element and adds that element to the end of the list. What you want is to concatenate to the list (ie the item being added should be seen as a list of elements rather than one single element).
You can say something like: B = B + multAllAux(..) or just use +=
B += multAllAux(...)
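Putting the += fix together with an empty-list base case gives a version like the following. It is a sketch of one possible correction, with mult_all as a hypothetical name; it still uses only recursion and no loops.

```python
def mult_all(k, values):
    """Recursively multiply every element of values by k."""
    if not values:
        return []  # base case: an empty list, not 0
    result = [values[0] * k]
    result += mult_all(k, values[1:])  # concatenate instead of nesting
    return result


print(mult_all(10, [5, 12, 31, 7, 25]))  # [50, 120, 310, 70, 250]
```

Note that values[1:] copies the rest of the list on every call, and very long lists will hit Python's recursion limit, so this form is mainly useful as a recursion exercise.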
BTW, if you want to multiply every element of a list by a single number, there is a very similar construct: map(..). Note that this behaves slightly differently depending on whether you're using Py2 or Py3: in Py3 map returns an iterator, so wrap it in list() to print the values.
print(list(map(lambda x: x * 10, [5,12,31,7,25])))
I have the following list of lists in Python, which is a partial list of the values that I have:
[1,33]
[2,10,42]
[5,1,33,44]
[10,42,98]
[44,12,100,124]
Is there a way of grouping them so they collect the values that are common in each list?
For example, if I look at the first list [1,33], I can see that the value exists in the third list: [5,1,33,44]
So, those are grouped together as
[5,1,33,44]
If I carry on looking, I can see that 44 is in the final list, and so that will be grouped along with this list.
[44,12,100,124] is added onto [5,1,33,44]
to give:
[1,5,12,33,44,100,124]
The second list [2,10,42] has common values with [10,42,98] and are therefore joined together to give:
[2,10,42,98]
So the final lists are:
[1,5,12,33,44,100,124]
[2,10,42,98]
I am guessing there is a specific name for this type of grouping. Is there a library available that can deal with it automatically? Or would I have to write a manual way of searching?
I hope the edit makes it clearer as to what I am trying to achieve.
Thanks.
Here's a solution that does not require anything from the standard library or 3rd party packages. Note that this will modify a. To avoid that, just make a copy of a and work with that. The result is a list of lists containing your resulting sorted lists.
a = [
[1,33],
[2,10,42],
[5,1,33,44],
[10,42,98],
[44,12,100,124]
]
res = []
while a:
    el = a.pop(0)
    res.append(el)
    for sublist in a[:]:  # iterate over a copy, since a is modified inside the loop
        if set(el).intersection(set(sublist)):
            res[-1].extend(sublist)
            a.remove(sublist)
res = [sorted(set(i)) for i in res]
print(res)
# [[1, 5, 12, 33, 44, 100, 124], [2, 10, 42, 98]]
How this works:
Form an empty result list res. Groupings from a will be "transferred" here.
.pop() off the first element of a. This modifies a in place and defines el as that element.
Then loop through each sublist in a, comparing your popped el to those sublists and "building up" common sets. This is where your problem is a tiny bit tricky in that you need to gradually increment your intersected set rather than finding the intersection of multiple sublists all at once.
Repeat this process until a is empty.
Alternatively, if you just want to group together the even- and odd-numbered sublists (still a bit unclear from your question), you can use itertools:
from itertools import chain
grp1 = sorted(set(chain.from_iterable(a[::2])))
grp2 = sorted(set(chain.from_iterable(a[1::2])))
print(grp1)
print(grp2)
# [1, 5, 12, 33, 44, 100, 124]
# [2, 10, 42, 98]
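This kind of grouping is commonly described as finding connected components: lists that share a value, directly or through a chain of other lists, end up in one group. Below is a minimal sketch of that idea using sets. merge_overlapping is a hypothetical helper name, and the approach differs from the loop above in that it does not modify its input and also merges groups that only become connected by a later list.

```python
def merge_overlapping(groups):
    """Merge lists that share at least one value, directly or transitively."""
    merged = []  # disjoint sets built so far
    for group in groups:
        current = set(group)
        untouched = []
        for existing in merged:
            if existing & current:
                current |= existing   # absorb every set that overlaps
            else:
                untouched.append(existing)
        untouched.append(current)
        merged = untouched
    return sorted(sorted(s) for s in merged)


a = [
    [1, 33],
    [2, 10, 42],
    [5, 1, 33, 44],
    [10, 42, 98],
    [44, 12, 100, 124],
]
print(merge_overlapping(a))
# [[1, 5, 12, 33, 44, 100, 124], [2, 10, 42, 98]]
```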
list.sort() sorts the list and replaces the original list, whereas sorted(list) returns a sorted copy of the list, without changing the original list.
When is one preferred over the other?
Which is more efficient? By how much?
Can a list be reverted to the unsorted state after list.sort() has been performed?
sorted() returns a new sorted list, leaving the original list unaffected. list.sort() sorts the list in place, rearranging its elements, and returns None (like all in-place operations).
sorted() works on any iterable, not just lists: strings, tuples, dictionaries (you'll get the keys), generators, and so on. It always returns a list containing all elements, sorted.
Use list.sort() when you want to mutate the list, sorted() when you want a new sorted object back. Use sorted() when you want to sort something that is an iterable, not a list yet.
For lists, list.sort() is faster than sorted() because it doesn't have to create a copy. For any other iterable, you have no choice.
No, you cannot retrieve the original positions. Once you called list.sort() the original order is gone.
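The points above can be seen in a few lines. The sample values are made up for illustration.

```python
from_string = sorted("hello")
from_tuple = sorted((3, 1, 2))
from_dict = sorted({"b": 1, "a": 2})  # iterating a dict yields its keys
from_generator = sorted(x * x for x in [3, -1, 2])

print(from_string)     # ['e', 'h', 'l', 'l', 'o']
print(from_tuple)      # [1, 2, 3]
print(from_dict)       # ['a', 'b']
print(from_generator)  # [1, 4, 9]

numbers = [3, 1, 2]
returned = numbers.sort()  # sorts in place and returns None
print(returned)  # None
print(numbers)   # [1, 2, 3]
```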
What is the difference between sorted(list) vs list.sort()?
list.sort mutates the list in-place & returns None
sorted takes any iterable & returns a new list, sorted.
sorted is equivalent to this Python implementation, but the CPython builtin function should run measurably faster as it is written in C:
def sorted(iterable, key=None):
    new_list = list(iterable) # make a new list
    new_list.sort(key=key) # sort it
    return new_list # return it
when to use which?
Use list.sort when you do not wish to retain the original sort order (thus you will be able to reuse the list in-place in memory) and when you are the sole owner of the list (if the list is shared by other code and you mutate it, you could introduce bugs where that list is used).
Use sorted when you want to retain the original sort order or when you wish to create a new list that only your local code owns.
Can a list's original positions be retrieved after list.sort()?
No - unless you made a copy yourself, that information is lost because the sort is done in-place.
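If the original order may be needed later, it has to be saved before sorting. The sketch below shows two ways to do that, with made-up sample data: keeping a copy, or recording the permutation the sort applied and inverting it afterwards.

```python
# Option 1: copy first, then sort the original in place.
data = [3, 1, 2]
backup = data.copy()
data.sort()
print(data)    # [1, 2, 3]
print(backup)  # [3, 1, 2]

# Option 2: record which original index ends up at each sorted position.
values = [30, 10, 20]
order = sorted(range(len(values)), key=values.__getitem__)
sorted_values = [values[i] for i in order]

# Invert the permutation to put every element back where it came from.
restored = [None] * len(values)
for position, original_index in enumerate(order):
    restored[original_index] = sorted_values[position]

print(order)          # [1, 2, 0]
print(sorted_values)  # [10, 20, 30]
print(restored)       # [30, 10, 20]
```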
"And which is faster? And how much faster?"
To illustrate the penalty of creating a new list, use the timeit module. Here's our setup:
import timeit
setup = """
import random
lists = [list(range(10000)) for _ in range(1000)] # list of lists
for l in lists:
    random.shuffle(l) # shuffle each list
shuffled_iter = iter(lists) # wrap as iterator so next() yields one at a time
"""
And here are our results for a list of 10000 randomly arranged integers. As we can see, we've disproven an older myth about the expense of list creation:
Python 2.7
>>> timeit.repeat("next(shuffled_iter).sort()", setup=setup, number = 1000)
[3.75168503401801, 3.7473005310166627, 3.753129180986434]
>>> timeit.repeat("sorted(next(shuffled_iter))", setup=setup, number = 1000)
[3.702025591977872, 3.709248117986135, 3.71071034099441]
Python 3
>>> timeit.repeat("next(shuffled_iter).sort()", setup=setup, number = 1000)
[2.797430992126465, 2.796825885772705, 2.7744789123535156]
>>> timeit.repeat("sorted(next(shuffled_iter))", setup=setup, number = 1000)
[2.675589084625244, 2.8019039630889893, 2.849375009536743]
After some feedback, I decided another test would be desirable with different characteristics. Here I provide the same randomly ordered list of 100,000 in length for each iteration 1,000 times.
import timeit
setup = """
import random
random.seed(0)
lst = list(range(100000))
random.shuffle(lst)
"""
I interpret this larger sort's difference as coming from the copying mentioned by Martijn, but it does not dominate to the point stated in the older, more popular answer here. Here the increase in time is only about 10%:
>>> timeit.repeat("lst[:].sort()", setup=setup, number = 10000)
[572.919036605, 573.1384446719999, 568.5923951]
>>> timeit.repeat("sorted(lst[:])", setup=setup, number = 10000)
[647.0584738299999, 653.4040515829997, 657.9457361929999]
I also ran the above on a much smaller sort, and saw that the new sorted copy version still takes about 2% longer running time on a sort of 1000 length.
Poke ran his own code as well, here's the code:
setup = '''
import random
random.seed(12122353453462456)
lst = list(range({length}))
random.shuffle(lst)
lists = [lst[:] for _ in range({repeats})]
it = iter(lists)
'''
from timeit import timeit
t1 = 'l = next(it); l.sort()'
t2 = 'l = next(it); sorted(l)'
length = 10 ** 7
repeats = 10 ** 2
print(length, repeats)
for t in t1, t2:
    print(t)
    print(timeit(t, setup=setup.format(length=length, repeats=repeats), number=repeats))
He found a similar result for a sort of length 10,000,000 (run 100 times), but only about a 5% increase in time. Here's the output:
10000000 100
l = next(it); l.sort()
610.5015971539542
l = next(it); sorted(l)
646.7786222379655
Conclusion:
Copying a large list with sorted will likely dominate the differences between the two approaches, but the sorting itself dominates the operation, and organizing your code around these differences would be premature optimization. I would use sorted when I need a new sorted list of the data, and list.sort when I need to sort a list in place, and let that determine my usage.
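For anyone who wants to repeat the comparison on their own machine, here is a compact, self-contained sketch for Python 3. The function name time_sorts and the default sizes are assumptions chosen so that it runs quickly; the absolute numbers will differ from the figures quoted above.

```python
import random
import timeit


def time_sorts(length=10_000, repeats=20, seed=0):
    """Return (in_place_seconds, new_list_seconds) for one shuffled list."""
    rng = random.Random(seed)
    base = list(range(length))
    rng.shuffle(base)
    # Each statement makes exactly one copy, so every run starts from
    # the same shuffled input and base itself is never modified.
    in_place = timeit.timeit(lambda: base[:].sort(), number=repeats)
    new_list = timeit.timeit(lambda: sorted(base), number=repeats)
    return in_place, new_list


in_place, new_list = time_sorts()
print(f"list.sort() on a copy: {in_place:.4f} s")
print(f"sorted():              {new_list:.4f} s")
```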
The main difference is that sorted(some_list) returns a new list:
a = [3, 2, 1]
print(sorted(a))  # new list
print(a)          # is not modified
and some_list.sort() sorts the list in place:
a = [3, 2, 1]
print(a.sort())   # in place
print(a)          # it's modified
Note that since a.sort() doesn't return anything, print(a.sort()) will print None.
Can a list's original positions be retrieved after list.sort()?
No, because it modifies the original list.
Here are a few simple examples to see the difference in action:
See the list of numbers here:
nums = [1, 9, -3, 4, 8, 5, 7, 14]
When calling sorted on this list, sorted will make a copy of the list. (Meaning your original list will remain unchanged.)
Let's see.
sorted(nums)
returns
[-3, 1, 4, 5, 7, 8, 9, 14]
Looking at the nums again
nums
We see the original list, unaltered and NOT sorted. sorted did not change the original list:
[1, 9, -3, 4, 8, 5, 7, 14]
Taking the same nums list and applying the sort function on it, will change the actual list.
Let's see.
Starting with our nums list, to make sure the content is still the same:
nums
[1, 9, -3, 4, 8, 5, 7, 14]
nums.sort()
Now looking at nums, we see that our original list has changed and is now sorted.
nums
[-3, 1, 4, 5, 7, 8, 9, 14]
Note: The simplest difference between sort() and sorted() is that sort() doesn't return any value, while sorted() returns a new list.
The sort() method sorts the elements of a given list in place, in ascending or descending order, without returning any value.
The syntax of sort() method is:
list.sort(key=..., reverse=...)
Alternatively, you can use Python's built-in function sorted() for the same purpose. sorted() returns a new sorted list:
sorted_list = sorted(my_list, key=..., reverse=...)
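The key and reverse parameters behave the same way in both forms. A short sketch with made-up words:

```python
words = ["pear", "Fig", "banana"]

by_length = sorted(words, key=len)
by_name_desc = sorted(words, key=str.lower, reverse=True)

print(by_length)     # ['Fig', 'pear', 'banana']
print(by_name_desc)  # ['pear', 'Fig', 'banana']
print(words)         # ['pear', 'Fig', 'banana'] (unchanged by sorted)

words.sort(key=len, reverse=True)  # in place, returns None
print(words)         # ['banana', 'pear', 'Fig']
```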
The .sort() method stores the sorted result directly in the list variable itself, so the answer to your third question is no.
If you use sorted(list) instead, the original order remains available, because the result is not stored in the list variable. Both .sort() and sorted() also accept optional arguments.
You have to store the value of sorted(list) in a variable explicitly.
For short lists there is no noticeable speed difference, but for long lists .sort() is the faster choice, at the cost of an irreversible change to the list.
With list.sort() you are altering the list variable but with sorted(list) you are not altering the variable.
Using sort:
lst = [4, 5, 20, 1, 3, 2]
lst.sort()
print(lst)
print(type(lst))
print(type(lst.sort()))
Should return this:
[1, 2, 3, 4, 5, 20]
<class 'list'>
<class 'NoneType'>
But using sorted():
lst = [4, 5, 20, 1, 3, 2]
print(sorted(lst))
print(lst)
print(type(sorted(lst)))
Should return this:
[1, 2, 3, 4, 5, 20]
[4, 5, 20, 1, 3, 2]
<class 'list'>
I have a list as [[4,5,6],[2,3,1]]. Now I want to sort the list based on list[1] i.e. output should be [[6,4,5],[1,2,3]]. So basically I am sorting 2,3,1 and maintaining the order of list[0].
While searching I got a function which sorts based on first element of every list but not for this. Also I do not want to recreate list as [[4,2],[5,3],[6,1]] and then use the function.
Since [4, 5, 6] and [2, 3, 1] serve two different purposes, I will make a function taking two arguments: the list to be reordered, and the list whose sorting will decide the order. I'll only return the reordered list.
This answer has timings of three different solutions for creating a permutation list for a sort. Using the fastest option gives this solution:
def pyargsort(seq):
    return sorted(range(len(seq)), key=seq.__getitem__)
def using_pyargsort(a, b):
    "Reorder the list a the same way as list b would be reordered by a normal sort"
    return [a[i] for i in pyargsort(b)]
print(using_pyargsort([4, 5, 6], [2, 3, 1]))  # [6, 4, 5]
The pyargsort method is inspired by the numpy argsort method, which does the same thing much faster. Numpy also has advanced indexing operations whereby an array can be used as an index, making possible very quick reordering of an array.
So if your need for speed is great, one would assume that this numpy solution would be faster:
import numpy as np
def using_numpy(a, b):
    "Reorder the list a the same way as list b would be reordered by a normal sort"
    return np.array(a)[np.argsort(b)].tolist()
print(using_numpy([4, 5, 6], [2, 3, 1]))  # [6, 4, 5]
However, for short lists (length < 1000), this solution is in fact slower than the first. This is because we're first converting the a and b lists to array and then converting the result back to list before returning. If we instead assume you're using numpy arrays throughout your application so that we do not need to convert back and forth, we get this solution:
def all_numpy(a, b):
    "Reorder array a the same way as array b would be reordered by a normal sort"
    return a[np.argsort(b)]
print(all_numpy(np.array([4, 5, 6]), np.array([2, 3, 1])))  # [6 4 5]
The all_numpy function executes up to 10 times faster than the using_pyargsort function.
The following logarithmic graph compares these three solutions with the two alternative solutions from the other answers. The arguments are two randomly shuffled ranges of equal length, and the functions all receive identically ordered lists. I'm timing only the time the function takes to execute. For illustrative purposes I've added an extra graph line for each numpy solution where the 60 ms overhead for loading numpy is added to the time.
As we can see, the all-numpy solution beats the others by an order of magnitude. Converting from python list and back slows the using_numpy solution down considerably in comparison, but it still beats pure python for large lists.
For a list length of about 1,000,000, using_pyargsort takes 2.0 seconds, using_numpy + overhead is only 1.3 seconds, while all_numpy + overhead is 0.3 seconds.
The sorting you describe is not very easy to accomplish. The only way that I can think of to do it is to use zip to create the list you say you don't want to create:
lst = [[4,5,6],[2,3,1]]
# key = operator.itemgetter(1) works too, and may be slightly faster ...
transpose_sort = sorted(zip(*lst),key = lambda x: x[1])
lst = list(zip(*transpose_sort))  # list() is needed on Python 3, where zip returns an iterator
Is there a reason for this constraint?
(Also note that you could do this all in one line if you really want to:)
lst = list(zip(*sorted(zip(*lst), key=lambda x: x[1])))
This also results in a list of tuples. If you really want a list of lists, you can map the result:
lst = list(map(list, lst))
Or a list comprehension would work as well:
lst = [list(x) for x in lst]
If the second list doesn't contain duplicates, you could just do this:
l = [[4,5,6],[2,3,1]] #the list
l1 = l[1][:] #a copy of the to-be-sorted sublist
l[1].sort() #sort the sublist
l[0] = [l[0][l1.index(x)] for x in l[1]] #order the first sublist accordingly
(As this copies the sublist l[1], it might be a bad idea if your input list is huge.)
How about this one:
a = [[4,5,6],[2,3,1]]
[a[0][i] for i in sorted(range(len(a[1])), key=lambda x: a[1][x])]
It uses the same principle as numpy, without having to use numpy and without the zip stuff.
Neither using numpy nor the zipping around seems to be the cheapest way for giant structures. Unfortunately the .sort() method is built into the list type and uses hard-wired access to the elements in the list (overriding __getitem__() or similar does not have any effect here).
So you can implement your own sort() which sorts two or more lists according to the values in one; this is basically what numpy does.
Or you can create a list of values to sort, sort that, and recreate the sorted original list out of it.
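As a rough sketch of the first option, sorting two or more lists according to the values in one, the index-based approach from the earlier answers can be wrapped in a small helper. sort_together is a hypothetical name, and the sketch assumes all lists have the same length.

```python
def sort_together(key_list, *others):
    """Sort key_list and reorder every other list with the same permutation."""
    # Indices of key_list in the order a normal sort would visit them.
    order = sorted(range(len(key_list)), key=key_list.__getitem__)
    return [[seq[i] for i in order] for seq in (key_list, *others)]


lst = [[4, 5, 6], [2, 3, 1]]
keys_sorted, values_reordered = sort_together(lst[1], lst[0])
print([values_reordered, keys_sorted])  # [[6, 4, 5], [1, 2, 3]]
```

The input lists are left untouched; new lists are built from the computed index order.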