I'm new to Python. I have these lists:
a = [[0,1,2,3],[4,5,6,7,8,9], ...]
b = [[0,6,9],[1,5], ...]
a and b can have more components, depending on the data. I want to know whether there is any intersection between these lists. If there is, I want a result like this:
c = [[6,9], ...]
The set type, built into Python, supports intersection natively. However, note that set can only hold one of each element (like a mathematical set). If you want to hold more than one of each element, try collections.Counter.
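To illustrate the point about repeated elements, here is a minimal sketch using collections.Counter. The sample values are made up for illustration. The & operator on two Counters keeps each common element with the smaller of its two counts.

```python
from collections import Counter

# Hypothetical sample data with repeated elements
x = Counter([1, 1, 2, 3, 3, 3])
y = Counter([1, 1, 1, 3, 4])

# & keeps each common element with the smaller of its two counts
common = x & y

# elements() expands the counts back into repeated items
common_items = sorted(common.elements())
print(common_items)
```

Unlike a plain set, the result remembers that 1 appears twice in both inputs.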
You can make sets using {} notation (like dictionaries, but without values):
>>> a = {1, 2, 3, 4, 5}
>>> b = {2, 4, 6, 8, 10}
and you can intersect them using the & operator:
>>> print a & b
set([2, 4])
Given that intersection is an operation between two sets, and you have given two lists of lists, it's very unclear what you're looking for. Do you want the intersection of a[1] and b[0]? Do you want the intersection of every possible combination?
I'm guessing you want the intersection of every combination of two sets between your two lists, which would be:
from itertools import product
[set(x).intersection(set(y)) for x, y in product(a, b)]
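Applied to the sample lists from the question, the pairwise approach might look like the sketch below. Dropping empty intersections and sorting each result are assumptions added here for readable output; they are not part of the original one-liner.

```python
from itertools import product

a = [[0, 1, 2, 3], [4, 5, 6, 7, 8, 9]]
b = [[0, 6, 9], [1, 5]]

# Intersect every sublist of a with every sublist of b
pairwise = [set(x) & set(y) for x, y in product(a, b)]

# Keep only the non-empty intersections, as sorted lists
non_empty = [sorted(s) for s in pairwise if s]
print(non_empty)
```

The [6, 9] entry matches the result asked for in the question; the other entries come from the remaining sublist pairs.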
First of all, in your example code this is not a tuple, it's a list (the original question asked about lists, but references tuples in the example code).
To get the intersection of two tuples or lists, use code like this:
set((1,2,3,4,5)).intersection(set((1,2,3,7,8)))
In one line:
common_set = set([e for r in a for e in r])&set([e for r in b for e in r])
Or easier:
common_set = set(sum(a,[])) & set(sum(b,[]))
common_set will be a set. You can easily convert it to a list if you need one:
common_list = list(common_set)
Another way to do it, assuming you want the intersection of the flattened lists:
>>> from itertools import chain
>>> a = [[0,1,2,3],[4,5,6,7,8,9]]
>>> b = [[0,6,9],[1,5]]
>>> list(set(chain(*a)).intersection(set(chain(*b))))
[0, 9, 5, 6, 1]
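A variant of the same flattening idea, sketched with itertools.chain.from_iterable instead of chain(*...). The result is sorted here because sets carry no order guarantee, so the order of a bare list(set(...)) should not be relied on.

```python
from itertools import chain

a = [[0, 1, 2, 3], [4, 5, 6, 7, 8, 9]]
b = [[0, 6, 9], [1, 5]]

# Flatten each list of lists into one set of values
flat_a = set(chain.from_iterable(a))
flat_b = set(chain.from_iterable(b))

# Sort for a deterministic result
common = sorted(flat_a & flat_b)
print(common)
```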
I have a list of dictionaries like the following:
a = [{1000976: 975},
{1000977: 976},
{1000978: 977},
{1000979: 978},
{1000980: 979},
{1000981: 980},
{1000982: 981},
{1000983: 982},
{1000984: 983},
{1000985: 984}]
I could be thinking about this wrong, but I'm comparing this list of dicts to another list of dicts and am attempting to remove elements (dictionaries) in one list that are in the other. In order to do this with set operations, I want to transform both into sets and perform set subtraction. However, I'm getting the following error when attempting the conversion:
set_a = set(a)
TypeError: unhashable type: 'dict'
Am I thinking about this incorrectly?
Try this:
>>> a = [{1000976: 975},
... {1000977: 976},
... {1000978: 977},
... {1000979: 978},
... {1000980: 979},
... {1000981: 980},
... {1000982: 981},
... {1000983: 982},
... {1000984: 983},
... {1000985: 984}]
>>> a.extend(a) # just to add some duplicates
>>> len(a)
20
>>> dict_set = set(frozenset(d.items()) for d in a)
>>> b = [dict(s) for s in dict_set]
>>> b
[{1000982: 981}, {1000983: 982}, {1000981: 980}, {1000985: 984}, {1000978: 977}, {1000980: 979}, {1000977: 976}, {1000976: 975}, {1000984: 983}, {1000979: 978}]
>>> len(b)
10
If you want to do set subtraction between two lists of dicts, just use the same conversion to sets as above on both lists, do the subtraction, then convert back.
Note: At the very least all values in your dict should also be hashable (as well as keys but that goes without saying). If not, you need a similar transformation on the values into a hashable, immutable type of some kind.
Note: This also does not preserve the original order; if that's important to you, you need to adapt this to an order-preserving algorithm. The key, though, is converting the dicts to some immutable type.
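Putting the two notes together, a minimal sketch of an order-preserving subtraction between two lists of dicts could look like this. The helper name freeze and the shortened sample lists are assumptions for illustration, and all dict values are assumed to be hashable.

```python
def freeze(d):
    """Turn a dict with hashable values into a hashable frozenset of items."""
    return frozenset(d.items())

# Hypothetical sample data in the shape used in the question
a = [{1000976: 975}, {1000977: 976}, {1000978: 977}]
b = [{1000977: 976}]

# Only b is converted to a set; a is walked in its original order
b_frozen = {freeze(d) for d in b}

# Keep the dicts of a that are not in b
c = [d for d in a if freeze(d) not in b_frozen]
print(c)
```

Because a is iterated as a list, no round trip back from frozensets to dicts is needed for the result.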
You could turn the dictionaries into tuples, since each one holds only a single key-value pair, like so:
a_set = set(t for d in a for t in d.items())
And then use set operations to compare two sets from that point. To convert back into a list of dictionaries, you can use:
a_list = [{key: value} for key, value in a_set]
For filtering there's a one-liner. (b is the filter list of dicts). This is by far and away the fastest approach, unless you are using the same filter against multiple sets.
c = [d for d in a if d not in b]
Or, using the built-in filter, another one-liner (slower):
c = list(filter(lambda i: i not in b, a))
If you are really asking how to convert a list of dicts into a set-operable variable, then you can do this with yet another one-liner:
a_set = set(map(lambda i: frozenset(i.items()), a))
again, if we have 'b' as a list of dicts as our filter
b_set = set(map(lambda i: frozenset(i.items()), b))
... and we can now use set operations on them:
c_set = a_set - b_set
The 'frozenset' method of converting a dict to a set is about 25% faster than using a list comprehension; but it's much slower to convert everything to sets and then perform the set operations than it is just to use a simple list comprehension filter such as the one at the top of my answer. Obviously, if one is going to do many filters, it may be cost effective to convert the objects to immutables; but in that case, it may be better to change the underlying data structure of the objects, and convert the entire structure to a class.
If you don't want to use frozenset and your dicts are arbitrary, rather than single-entry dicts, you can convert the dicts to tuples of (key, value) pairs:
a_set = set(map(lambda j: tuple(map(lambda i: tuple((i, j[i])), j)), a))
You suggest in the question that you don't want ANY nested loop, and so far all the answers (including mine) have a 'for' (or a lambda).
When we want to use a set method for filtering two dictionaries, it's not too shabby to do exactly that as follows:
c = a.items() - b.items()
of course if we want c to be a dict, we need to wrap it again:
c = dict(a.items() - b.items())
Likewise, for lists of immutable types, we can do the same (by coercing our lists into sets):
x = [3, 4, 5, 6, 7]
y = [3, 2, 1, 7]
z = set(x) - set(y)
or (tuples are immutable)
x = [(3, 1), (4, 1), (5, 1), (6, 2), (7, 5)]
y = [(4, 1), (4, 2), (5, 1)]
z = set(x) - set(y)
but (mutable) lists fail (as do your dicts):
x = [[3, 1], [4, 1], [5, 1], [6, 2], [7, 5]]
y = [[4, 1], [4, 2], [5, 1]]
z = set(x) - set(y)
TypeError: unhashable type: 'list'
This is because lists and dicts are mutable: their contents can change after they have been stored, so Python does not define a hash for them and a set cannot track their uniqueness. One can handle it by creating a class, but then that is not using a list of dicts anymore, and your 'for' is just being buried in a class method.
So you will need a nested loop somewhere, even if it is hidden by a lambda or a function.
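One way around the failing list-of-lists case, sketched here as an assumption rather than as part of the original answer, is to convert each inner list to a tuple first. The loop is still there, hidden inside map.

```python
x = [[3, 1], [4, 1], [5, 1], [6, 2], [7, 5]]
y = [[4, 1], [4, 2], [5, 1]]

# map(tuple, ...) still loops over every inner list, just implicitly
z = set(map(tuple, x)) - set(map(tuple, y))

# Convert back to a sorted list of lists
result = sorted(list(t) for t in z)
print(result)
```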
Is there a more Pythonic way to tell a list which parts of it have to stay and which parts have to be removed?
li = [1,2,3,4,5,6,7]
Wanted list:
[1,2,3,6,7]
I can do that this way:
wl = li[:-4]+li[-2:]
I'm looking for something like li[:-4,-2:] (in one statement/command)
Of course I can remove elements, but slicing can be used in many more situations, for example:
Wanted list:
[3,4,5,6,7]
I can do del li[0:2]
But it's more common to do:
li[2:]
Unlike regular Python lists, NumPy arrays can be indexed by other sequence-like objects (other than tuples), e.g. by regular Python lists or by another NumPy array.
import numpy as np
li = np.arange(1, 8)
# array([1, 2, 3, 4, 5, 6, 7])
wl = li[[0,1,2,5,6]]
# array([1, 2, 3, 6, 7])
Of course, this now leaves you with the problem of creating the index sequence (the regular Python list [0,1,2,5,6] in this example), which puts you back to square one. (Unless you need to access several NumPy arrays at the same indices, in which case you create this index list once and then re-use it.)
You should probably only consider this if you have additional reasons to use NumPy in general or specifically NumPy arrays.
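For plain lists without NumPy, a comparable sketch uses operator.itemgetter from the standard library. This is an alternative added for illustration, not something the answer above proposes, and the index sequence still has to be written out by hand.

```python
from operator import itemgetter

li = [1, 2, 3, 4, 5, 6, 7]

# itemgetter with several indices returns a tuple of the picked items
pick = itemgetter(0, 1, 2, 5, 6)
wl = list(pick(li))
print(wl)
```

The pick object can be reused on any other sequence that is long enough, which mirrors the reuse of an index list across several arrays.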
If you want the output list to follow a certain logic, you can use the filter function.
filter(lambda x: x > 2, li)
or maybe
filter(lambda x: x < 4 or x > 5, li)
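A small sketch of the two filters above under Python 3, where filter returns a lazy iterator and needs list() to show its values. The equivalent list comprehensions are included for comparison.

```python
li = [1, 2, 3, 4, 5, 6, 7]

# In Python 3, filter() is lazy, so wrap it in list() to get the values
tail = list(filter(lambda x: x > 2, li))
ends = list(filter(lambda x: x < 4 or x > 5, li))

# The same selections written as list comprehensions
tail_lc = [x for x in li if x > 2]
ends_lc = [x for x in li if x < 4 or x > 5]
print(tail, ends)
```

Note that these filters select by value, not by position, so they only match the slicing results because the values in li happen to track their positions.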
I'm trying to find the list of common values for three different lists:
a = [1,2,3,4]
b = [2,3,4,5]
c = [3,4,5,6]
Naturally, I tried to use the and operator, but that way I just get the value of the last list in the expression:
>> a and b and c
out: [3,4,5,6]
Is there a short way to find the list of common values:
[3,4]
Br
Use sets:
>>> a = [1, 2, 3, 4]
>>> b = [2, 3, 4, 5]
>>> c = [3, 4, 5, 6]
>>> set(a) & set(b) & set(c)
{3, 4}
Or as Jon suggested:
>>> set(a).intersection(b, c)
{3, 4}
Using sets has the benefit that you don’t need to repeatedly iterate the original lists. Each list is iterated once to create the sets, and then the sets are intersected.
The naive way to solve this, using a filtered list comprehension as Geotob did, will iterate lists b and c for each element of a, so for longer lists this will be a lot less efficient.
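The set approach extends to any number of lists. The helper below is a hypothetical sketch (the name common_values is made up here); it relies on set.intersection accepting several iterables at once.

```python
def common_values(*lists):
    """Return the set of values present in every given list."""
    if not lists:
        return set()
    first, *rest = lists
    # intersection() accepts any number of iterables
    return set(first).intersection(*rest)

a = [1, 2, 3, 4]
b = [2, 3, 4, 5]
c = [3, 4, 5, 6]
result = sorted(common_values(a, b, c))
print(result)
```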
out = [x for x in a if x in b and x in c]
is a quick and simple solution. This constructs a list out with entries from a, if those entries are in b and c.
For larger lists, you want to look at the answer provided by @poke.
For those still stumbling upon this question, with numpy one can use:
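If the order of a matters and the lists are large, the two approaches can be combined. This is a sketch under that assumption: intersect b and c as sets once, then filter a against the result. The list a is deliberately unsorted here to show that its order is kept.

```python
a = [4, 1, 3, 2]   # deliberately unsorted to show that order is kept
b = [2, 3, 4, 5]
c = [3, 4, 5, 6]

# Build the lookup set once instead of scanning b and c per element
others = set(b) & set(c)

# Membership tests against a set take constant time on average
out = [x for x in a if x in others]
print(out)
```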
np.intersect1d(array1, array2)
This works with lists as well as numpy arrays.
It could be extended to more arrays with the help of functools.reduce, or it can simply be repeated for several arrays.
from functools import reduce
reduce(np.intersect1d, (array1, array2, array3))
or
new_array = np.intersect1d(array1, array2)
np.intersect1d(new_array, array3)
Possible Duplicate:
Python, compute list difference
I have two lists
For example:
A = [1,3,5,7]
B = [1,2,3,4,5,6,7,8]
Now, A is always a subset of B
I want to generate a third list C:
which has elements which are present in B but absent in A
like
C = [2,4..]
Thanks
List comprehensions are one way to do this:
[x for x in B if x not in A]
If you use Python, I recommend gaining familiarity with list comprehensions. They're a very powerful tool.
(Several people have suggested using set. While this is a very good idea if you only care about whether or not an element is in the set, note that it will not preserve the order of the elements; a list comprehension will.)
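The ordering difference is easier to see when B is not already sorted. The values below are made up for illustration.

```python
A = [1, 3]
B = [8, 1, 6, 3, 2]  # hypothetical unsorted input

# The list comprehension keeps B's original order
by_comprehension = [x for x in B if x not in A]

# The set difference loses the order; sorting imposes a new one
by_set = sorted(set(B) - set(A))
print(by_comprehension, by_set)
```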
>>> set(B) - set(A)
set([8, 2, 4, 6])
or
>>> sorted(set(B) - set(A))
[2, 4, 6, 8]
An easy way to do this is
C = [x for x in B if x not in A]
This will become slow for big lists, so it would be better to use a set for A:
A = set(A)
C = [x for x in B if x not in A]
If you have multiple operations like this, using sets all the time might be the best option. If A and B are sets, you can simply do
C = B - A
C = sorted(list(set(B) - set(A)))
That should do it.
I need to plot a lot of data samples, each stored in a list of integers. I want to concatenate many of these lists into one big list and iterate over it with enumerate(big_list) to get a fixed-offset x coordinate.
My current code is:
biglist = []
for n in xrange(number_of_lists):
    biglist.extend(recordings[n][chosen_channel])
for x,y in enumerate(biglist):
    print x,y
Notes: number_of_lists and chosen_channel are integer parameters defined elsewhere, and print x,y is just an example (in the real code there are other statements that plot the points).
My question is:
is there a better way, for example, list comprehensions or other operation, to achieve the same result (merged list) without the loop and the pre-declared empty list?
Thanks
import itertools
for x,y in enumerate(itertools.chain(*(recordings[n][chosen_channel] for n in xrange(number_of_lists)))):
    print x,y
You can think of itertools.chain() as managing an iterator over the individual lists. It remembers which list you are in and where in that list you are. This saves you all the memory you would otherwise need to create the big list.
>>> import itertools
>>> l1 = [2,3,4,5]
>>> l2=[9,8,7]
>>> itertools.chain(l1,l2)
<itertools.chain object at 0x100429f90>
>>> list(itertools.chain(l1,l2))
[2, 3, 4, 5, 9, 8, 7]
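For Python 3, the same idea can be sketched with itertools.chain.from_iterable, which takes the generator directly and avoids unpacking it with *. The recordings data below is a made-up stand-in for the structure described in the question (recordings[n][chosen_channel] is a list of integers).

```python
from itertools import chain

# Hypothetical stand-in for the question's data: recordings[n][channel]
recordings = [
    [[10, 11], [20, 21]],
    [[12, 13], [22, 23]],
    [[14], [24]],
]
chosen_channel = 0

# Lazily walk the chosen channel of every recording, one after another
samples = chain.from_iterable(rec[chosen_channel] for rec in recordings)

# enumerate supplies the fixed-offset x coordinate
points = list(enumerate(samples))
print(points)
```

Here list() is only used to show the result; in a plotting loop the enumerate object can be consumed directly without building the whole list.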