Turn a List of Dictionaries Into a Set of Dictionaries

I have a list of dictionaries like the following:
a = [{1000976: 975},
{1000977: 976},
{1000978: 977},
{1000979: 978},
{1000980: 979},
{1000981: 980},
{1000982: 981},
{1000983: 982},
{1000984: 983},
{1000985: 984}]
I could be thinking about this wrong, but I'm comparing this list of dicts to another list of dicts and want to remove the elements (dictionaries) of one list that also appear in the other. To do that, I want to convert both lists into sets and use set subtraction. However, I get the following error when I attempt the conversion:
set_a = set(a)
TypeError: unhashable type: 'dict'
Am I thinking about this incorrectly?

Try this:
>>> a = [{1000976: 975},
... {1000977: 976},
... {1000978: 977},
... {1000979: 978},
... {1000980: 979},
... {1000981: 980},
... {1000982: 981},
... {1000983: 982},
... {1000984: 983},
... {1000985: 984}]
>>> a.extend(a) # just to add some duplicates
>>> len(a)
20
>>> dict_set = set(frozenset(d.items()) for d in a)
>>> b = [dict(s) for s in dict_set]
>>> b
[{1000982: 981}, {1000983: 982}, {1000981: 980}, {1000985: 984}, {1000978: 977}, {1000980: 979}, {1000977: 976}, {1000976: 975}, {1000984: 983}, {1000979: 978}]
>>> len(b)
10
If you want to do set subtraction between two lists of dicts, apply the same conversion to sets as above to both lists, do the subtraction, then convert back.
Note: All values in your dicts must also be hashable (as must the keys, but that goes without saying). If they are not, you need a similar transformation of the values into a hashable, immutable type of some kind.
Note: This also does not preserve the original order; if that's important to you, you need to adapt this to an order-preserving algorithm. The key, though, is converting the dicts to some immutable type.
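One way to do the subtraction while keeping the original order is sketched below. It assumes every key and value is hashable; freeze and subtract_dicts are hypothetical helper names chosen for this example, and the sample data is shortened from the question.

```python
def freeze(d):
    """Return a hashable snapshot of a dict (assumes hashable values)."""
    return frozenset(d.items())


def subtract_dicts(a, b):
    """Return the dicts of a that are not in b, de-duplicated, in a's order."""
    b_frozen = {freeze(d) for d in b}
    seen = set()
    result = []
    for d in a:
        key = freeze(d)
        # Skip dicts that appear in b or that were already emitted.
        if key not in b_frozen and key not in seen:
            seen.add(key)
            result.append(d)
    return result


a = [{1000976: 975}, {1000977: 976}, {1000978: 977}, {1000977: 976}]
b = [{1000977: 976}]
print(subtract_dicts(a, b))  # [{1000976: 975}, {1000978: 977}]
```

Because b is frozen into a set once, each membership test is a hash lookup rather than a scan of the whole list.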

You could turn the dictionaries into tuples, since each one holds a single key/value pair:
a_set = set(t for d in a for t in d.items())
And then use set operations to compare two sets from that point. To convert back into a list of dictionaries, you can use:
a_list = [{key: value} for key, value in a_set]

For filtering there's a one-liner (b is the list of dicts to filter out). This is by far the fastest approach, unless you are using the same filter against multiple lists.
c = [d for d in a if d not in b]
Or using the built-in filter, another one-liner (slower):
c = list(filter(lambda i: i not in b, a))
If you are really asking how to convert a list of dicts into a set-operable variable, then you can do this with yet another one-liner:
a_set = set(map(lambda i: frozenset(i.items()), a))
again, if we have 'b' as a list of dicts as our filter
b_set = set(map(lambda i: frozenset(i.items()), b))
... and we can now use set operations on them:
c_set = a_set - b_set
The 'frozenset' method of converting a dict to a set is about 25% faster than using a list comprehension; but it's much slower to convert everything to sets and then perform the set operations than it is just to use a simple list comprehension filter such as the one at the top of my answer. Obviously, if one is going to do many filters, it may be cost effective to convert the objects to immutables; but in that case, it may be better to change the underlying data structure of the objects, and convert the entire structure to a class.
If you don't want to use frozenset and your dicts are arbitrary rather than single-entry, you can convert each dict into a tuple of (key, value) tuples:
a_set = set(tuple(d.items()) for d in a)
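One caveat with the tuple conversion, shown here as a small sketch: since Python 3.7 dicts iterate in insertion order, so two equal dicts built in a different order produce different tuples, while their frozensets still match. The dicts d1 and d2 are made up for illustration.

```python
d1 = {"x": 1, "y": 2}
d2 = {"y": 2, "x": 1}  # same content, different insertion order

tuples_match = tuple(d1.items()) == tuple(d2.items())
frozensets_match = frozenset(d1.items()) == frozenset(d2.items())
# Sorting the items first makes the tuple form order-independent,
# but it requires the keys to be mutually comparable.
sorted_tuples_match = tuple(sorted(d1.items())) == tuple(sorted(d2.items()))

print(d1 == d2)             # True
print(tuples_match)         # False
print(frozensets_match)     # True
print(sorted_tuples_match)  # True
```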
You suggest in the question that you don't want ANY nested loop, and so far all the answers (including mine) have a 'for' (or a lambda).
When we want to use a set method to filter one dictionary against another (here a and b are single dicts, not lists), it's not too shabby to do exactly that as follows:
c = a.items() - b.items()
Of course, if we want c to be a dict, we need to wrap it again:
c = dict(a.items() - b.items())
Likewise, for lists of immutable types, we can do the same (by coercing our lists into sets):
x = [3, 4, 5, 6, 7]
y = [3, 2, 1, 7]
z = set(x) - set(y)
or (tuples are immutable)
x = [(3, 1), (4, 1), (5, 1), (6, 2), (7, 5)]
y = [(4, 1), (4, 2), (5, 1)]
z = set(x) - set(y)
but (mutable) lists fail (as do your dicts):
x = [[3, 1], [4, 1], [5, 1], [6, 2], [7, 5]]
y = [[4, 1], [4, 2], [5, 1]]
z = set(x) - set(y)
>>>> TypeError: unhashable type: 'list'
This is because lists and dicts are mutable, so Python does not define a hash for them: a hash that changed whenever the contents changed would break set and dict lookups. One can handle it by creating a class, but then that is not using a list of dicts anymore, and your 'for' is just being buried in a class method.
So you will need a loop somewhere, even if it is hidden by a lambda or a function.
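For reference, the class-based route mentioned above might look like the following sketch. HashableDict is a hypothetical name, not a standard-library type, and it is only safe if the instances are never mutated after they have been put into a set.

```python
class HashableDict(dict):
    """Hypothetical dict subclass that hashes by its (key, value) pairs.

    Assumes all values are hashable and that instances are treated as
    read-only once they are stored in a set or used as dict keys.
    """

    def __hash__(self):
        return hash(frozenset(self.items()))


a = [HashableDict({1000976: 975}), HashableDict({1000977: 976})]
b = [HashableDict({1000977: 976})]

c = set(a) - set(b)
print(list(c))  # [{1000976: 975}]
```

The loop has not gone away; it now lives inside frozenset(self.items()).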

Related

How do you convert a Dictionary to a List?

For example, if the Dictionary is {0:0, 1:0, 2:0} making a list: [0, 0, 0].
If this isn't possible, how do you take the minimum of a dictionary, meaning the dictionary: {0:3, 1:2, 2:1} returning 1?

Converting a dictionary to a list is pretty simple; you have three flavors for that: .keys(), .values() and .items()
>>> test = {1:30,2:20,3:10}
>>> test.keys() # you get the same result with list(test)
[1, 2, 3]
>>> test.values()
[30, 20, 10]
>>> test.items()
[(1, 30), (2, 20), (3, 10)]
>>>
(In Python 3 you need to call list() on those, because they return views rather than lists.)
Finding the maximum or minimum is also easy with the min and max functions:
>>> min(test.keys()) # is the same as min(test)
1
>>> min(test.values())
10
>>> min(test.items())
(1, 30)
>>> max(test.keys()) # is the same as max(test)
3
>>> max(test.values())
30
>>> max(test.items())
(3, 10)
>>>
(In Python 2, to be efficient, use the .iter* versions of those instead.)
The most interesting case is finding the key of the min/max value, and min/max have that covered too:
>>> max(test.items(),key=lambda x: x[-1])
(1, 30)
>>> min(test.items(),key=lambda x: x[-1])
(3, 10)
>>>
Here you need a key function: a function that takes one of the items you give to the main function and returns the value (possibly transformed) by which you want the items compared.
lambda is a way to define anonymous functions, which saves you from having to do this:
>>> def last(x):
...     return x[-1]
...
>>> min(test.items(),key=last)
(3, 10)
>>>

You can simply take the minimum with:
min(dic.values())
And convert it to a list with:
list(dic.values())
Note that before Python 3.7 a dictionary was unordered, so the order of the elements in the resulting list was undefined; from Python 3.7 on, the list follows insertion order.
In Python 2.7 you do not need to call list(..); dic.values() on its own already returns a list:
dic.values()
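If what you need is the key that holds the minimum rather than the minimum value itself, passing the dict's own get method as the key function is one compact option. A minimal sketch, with a made-up dictionary named test:

```python
test = {1: 30, 2: 20, 3: 10}

# Iterating a dict yields its keys; key=test.get compares them by value.
key_of_min = min(test, key=test.get)
key_of_max = max(test, key=test.get)

print(key_of_min, key_of_max)  # 3 1
print(min(test.values()))      # 10
```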

>>> a = {0:0, 1:2, 2:4}
>>> a.keys()
[0, 1, 2]
>>> a.values()
[0, 2, 4]

Here is my one-liner solution for a flattened list of keys and values:
d = {'foo': 'bar', 'zoo': 'bee'}
list(sum(d.items(), tuple()))
And the result:
['foo', 'bar', 'zoo', 'bee']
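An alternative sketch of the same flattening uses itertools.chain.from_iterable, which avoids creating a new intermediate tuple for every pair the way repeated addition with sum does. The variable name flat is just for illustration.

```python
from itertools import chain

d = {'foo': 'bar', 'zoo': 'bee'}

# Each item is a (key, value) pair; chain.from_iterable concatenates them.
flat = list(chain.from_iterable(d.items()))
print(flat)  # ['foo', 'bar', 'zoo', 'bee']
```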

A dictionary is defined as the following:
dict{[Any]:[Any]} = {[Key]:[Value]}
The problem with your question is that you haven't clarified what the keys are.
1: Assuming the keys are just numbers and in ascending order without gaps, dict.values() will suffice, as other authors have already pointed out.
2: Assuming the keys are consecutive integers starting at 0, but the dictionary does not store them in ascending order:
i = 0
values = []  # avoid naming this "list", which would shadow the built-in
while i <= max(mydict.keys()):  # <= so that the largest key is included
    values.append(mydict[i])
    i += 1
3: Assuming the keys are just numbers but not consecutive (there are gaps):
There is still a way, but you have to get the keys first and loop up to the maximum of the keys, using a try-except block to skip the missing ones.
4: If none of these is the case, maybe a dict is not what you are looking for and a 2D or 3D array would suffice? This also applies if one of the solutions does work. A dict seems to be a bad choice for what you are doing.
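For cases 2 and 3, a simpler alternative to counting up to the maximum key is to sort the keys and look each one up. This is a swapped-in technique rather than the try-except loop described above, and the contents of mydict are made up for the example.

```python
mydict = {4: "d", 0: "a", 9: "z", 2: "b"}  # numeric keys with gaps

# sorted(mydict) yields the keys in ascending order; gaps are never visited.
values_by_key = [mydict[k] for k in sorted(mydict)]
print(values_by_key)  # ['a', 'b', 'd', 'z']
```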

Python: how to find common values in three lists

I'm trying to find the list of values common to three different lists:
a = [1,2,3,4]
b = [2,3,4,5]
c = [3,4,5,6]
Naturally, I tried the and operator, but that way I just get the value of the last list in the expression:
>> a and b and c
out: [3,4,5,6]
Is there a short way to find the list of common values:
[3,4]
Br

Use sets:
>>> a = [1, 2, 3, 4]
>>> b = [2, 3, 4, 5]
>>> c = [3, 4, 5, 6]
>>> set(a) & set(b) & set(c)
{3, 4}
Or as Jon suggested:
>>> set(a).intersection(b, c)
{3, 4}
Using sets has the benefit that you don’t need to repeatedly iterate the original lists. Each list is iterated once to create the sets, and then the sets are intersected.
The naive way to solve this, using a filtered list comprehension as Geotob did, will iterate lists b and c for each element of a, so for longer lists this will be a lot less efficient.
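If the result should stay a list in the order of a, the two ideas can be combined: build the set once, then filter a against it. A minimal sketch using the lists from the question (in_b_and_c is a name chosen here):

```python
a = [1, 2, 3, 4]
b = [2, 3, 4, 5]
c = [3, 4, 5, 6]

# b and c are each iterated once to build the lookup set.
in_b_and_c = set(b) & set(c)

# Membership tests against a set are hash lookups, not list scans.
common = [x for x in a if x in in_b_and_c]
print(common)  # [3, 4]
```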

out = [x for x in a if x in b and x in c]
is a quick and simple solution. It builds a list, out, from the entries of a that are also in both b and c.
For larger lists, you want to look at the answer provided by @poke.

For those still stumbling upon this question, with numpy one can use:
np.intersect1d(array1, array2)
This works with lists as well as numpy arrays.
It could be extended to more arrays with the help of functools.reduce, or it can simply be repeated for several arrays.
from functools import reduce
reduce(np.intersect1d, (array1, array2, array3))
or
new_array = np.intersect1d(array1, array2)
np.intersect1d(new_array, array3)
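The same "any number of inputs" idea also works without numpy by using built-in sets. This sketch assumes lists is a non-empty list of lists with hashable elements; the name lists is chosen here for illustration.

```python
lists = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6]]

# Convert every list to a set, then intersect them all in one call.
common = set.intersection(*map(set, lists))
print(sorted(common))  # [3, 4]
```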

Sort a list then give the indexes of the elements in their original order

I have an array of n numbers, say [1,4,6,2,3]. The sorted array is [1,2,3,4,6], and the indexes of these numbers in the old array are 0, 3, 4, 1, and 2. What is the best way, given an array of n numbers, to find this array of indexes?
My idea is to run order statistics for each element. However, since I have to rewrite this function many times (in contest), I'm wondering if there's a short way to do this.

>>> a = [1,4,6,2,3]
>>> [b[0] for b in sorted(enumerate(a),key=lambda i:i[1])]
[0, 3, 4, 1, 2]
Explanation:
enumerate(a) returns an enumeration over tuples consisting of the indexes and values in the original list: [(0, 1), (1, 4), (2, 6), (3, 2), (4, 3)]
Then sorted with a key of lambda i:i[1] sorts based on the original values (item 1 of each tuple).
Finally, the list comprehension [b[0] for b in ...] returns the original indexes (item 0 of each tuple).
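Since the question mentions rewriting this many times, the one-liner can be wrapped in a small function. A minimal sketch; argsort is a name borrowed from numpy for familiarity, but this is plain Python:

```python
def argsort(values):
    """Return the indexes that would put values in ascending order."""
    return [i for i, _ in sorted(enumerate(values), key=lambda pair: pair[1])]


a = [1, 4, 6, 2, 3]
order = argsort(a)
print(order)                  # [0, 3, 4, 1, 2]
print([a[i] for i in order])  # [1, 2, 3, 4, 6]
```

Reading a through the returned indexes reproduces the sorted list, which is a quick way to check the result.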

Using numpy arrays instead of lists may be beneficial if you are doing a lot of statistics on the data. If you choose to do so, this would work:
import numpy as np
a = np.array( [1,4,6,2,3] )
b = np.argsort( a )
argsort() can operate on lists as well, but I believe that in this case it simply copies the data into an array first.

Here is another way:
>>> sorted(range(len(a)), key=lambda ix: a[ix])
[0, 3, 4, 1, 2]
This approach sorts not the original list but its indices (created with range, or xrange in Python 2), using the values of the original list as the sort keys.

This should do the trick:
from operator import itemgetter
# zip() returns an iterator in Python 3, so wrap it in list() before indexing
indices = list(zip(*sorted(enumerate(my_list), key=itemgetter(1))))[0]

The long way, without a list comprehension, for beginners like me:
a = [1,4,6,2,3]
b = enumerate(a)
c = sorted(b, key = lambda i:i[1])
d = []
for e in c:
    d.append(e[0])
print(d)

Intersection in Python List

I'm a newbie at Python. I have these lists:
a = [[0,1,2,3],[4,5,6,7,8,9], ...]
b = [[0,6,9],[1,5], ...]
a and b can have more components, depending on the data. I want to know whether there is any intersection between these lists. If there is, I want a result like this:
c = [[6,9], ...]

The set type, built into Python, supports intersection natively. However, note that set can only hold one of each element (like a mathematical set). If you want to hold more than one of each element, try collections.Counter.
You can make sets using {} notation (like dictionaries, but without values):
>>> a = {1, 2, 3, 4, 5}
>>> b = {2, 4, 6, 8, 10}
and you can intersect them using the & operator (output shown for Python 3; Python 2 prints set([2, 4])):
>>> print(a & b)
{2, 4}
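For the collections.Counter case mentioned above, where repeated elements matter, the & operator keeps each element with the smaller of its two counts. A minimal sketch with made-up data:

```python
from collections import Counter

a = Counter([1, 1, 2, 3])
b = Counter([1, 1, 1, 3, 4])

# Multiset intersection: each element keeps the minimum of its two counts.
common = a & b
print(sorted(common.elements()))  # [1, 1, 3]
```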

Given that intersection is an operation between two sets, and you have given two lists of lists, it's very unclear what you're looking for. Do you want the intersection of a[1] and b[0]? Do you want the intersection of every possible combination?
I'm guessing you want the intersection of every combination of two sets between your two lists, which would be:
from itertools import product
[set(x).intersection(set(y)) for x, y in product(a, b)]
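To get closer to the list-of-lists shape shown in the question, the pairwise intersections can be sorted into lists and the empty ones dropped. This is only a sketch of one reading of the question; it uses the sample data with the trailing "..." left out.

```python
from itertools import product

a = [[0, 1, 2, 3], [4, 5, 6, 7, 8, 9]]
b = [[0, 6, 9], [1, 5]]

# Intersect every sublist of a with every sublist of b.
pairwise = [sorted(set(x) & set(y)) for x, y in product(a, b)]

# Keep only the pairs that actually share something.
c = [common for common in pairwise if common]
print(c)  # [[0], [1], [6, 9], [5]]
```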

First of all, in your example code this is not a tuple, it's a list (the original question asked about lists, but references tuples in the example code).
To get the intersection of two tuples or lists, use code like this:
set((1,2,3,4,5)).intersection(set((1,2,3,7,8)))

In one line:
common_set = set([e for r in a for e in r])&set([e for r in b for e in r])
Or easier:
common_set = set(sum(a,[])) & set(sum(b,[]))
common_set will be a set. You can easily convert the set to a list if you need to:
common_list = list(common_set)

Another way to do it, assuming you want the intersection of the flattened lists:
>>> from itertools import chain
>>> a = [[0,1,2,3],[4,5,6,7,8,9]]
>>> b = [[0,6,9],[1,5]]
>>> list(set(chain(*a)).intersection(set(chain(*b))))
[0, 9, 5, 6, 1]

Accessing grouped items in arrays

I'm new to Python and have a list of numbers. e.g.
5,10,32,35,64,76,23,53....
and I've grouped them into fours (5,10,32,35, 64,76,23,53 etc..) using the code from this post.
def group_iter(iterator, n=2, strict=False):
    """ Transforms a sequence of values into a sequence of n-tuples.
    e.g. [1, 2, 3, 4, ...] => [(1, 2), (3, 4), ...] (when n == 2)
    If strict, then it will raise ValueError if there is a group of fewer
    than n items at the end of the sequence. """
    accumulator = []
    for item in iterator:
        accumulator.append(item)
        if len(accumulator) == n: # tested as fast as separate counter
            yield tuple(accumulator)
            accumulator = [] # tested faster than accumulator[:] = []
            # and tested as fast as re-using one list object
    if strict and len(accumulator) != 0:
        raise ValueError("Leftover values")
How can I access the individual groups so that I can perform functions on them? For example, I'd like to get the average of the first values of every group (e.g. 5 and 64 in my example numbers).

Let's say you have the following tuple of tuples:
a=((5,10,32,35), (64,76,23,53))
To access the first element of each tuple, use a for-loop:
for i in a:
    print(i[0])
To calculate the average of the first values:
elements=[i[0] for i in a]
avg=sum(elements)/float(len(elements))

Ok, this is yielding a tuple of four numbers each time it's iterated. So, convert the whole thing to a list:
L = list(group_iter(your_list, n=4))
Then you'll have a list of tuples:
>>> L
[(5, 10, 32, 35), (64, 76, 23, 53), ...]
You can get the first item in each tuple this way:
firsts = [tup[0] for tup in L]
(There are other ways, of course.)
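Putting the pieces together for the average asked about in the question, here is a minimal sketch. It uses plain slicing instead of group_iter so that it runs on its own, and it only uses the eight numbers the question actually lists.

```python
from statistics import mean

numbers = [5, 10, 32, 35, 64, 76, 23, 53]
group_size = 4

# Slice the flat list into consecutive groups of four.
groups = [tuple(numbers[i:i + group_size])
          for i in range(0, len(numbers), group_size)]

# Take the first value of every group and average them.
firsts = [group[0] for group in groups]
print(groups)        # [(5, 10, 32, 35), (64, 76, 23, 53)]
print(mean(firsts))  # 34.5
```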

You've created a tuple of tuples, or a list of tuples, or a list of lists, or a tuple of lists, or whatever...
You can access any element of any nested list directly:
toplist[x][y] # yields the yth element of the xth nested list
You can also access the nested structures by iterating over the top structure:
for sublist in lists:  # avoid naming the loop variable "list"
    print(sublist[y])

Might be overkill for your application but you should check out my library, pandas. Stuff like this is pretty simple with the GroupBy functionality:
http://pandas.sourceforge.net/groupby.html
To do the 4-at-a-time thing you would need to compute a bucketing array:
import numpy as np
bucket_size = 4
n = len(your_list)
buckets = np.arange(n) // bucket_size
Then it's as simple as:
data.groupby(buckets).mean()
