List comprehension with early conditional check - python

For the given list l
l = [{'k': [1, 2]}, {'k': [2, 8]}, {'k': [6, 32]}, {}, {'s': 0}]
where I would like to have a single list of all values
r = [1, 2, 2, 8, 6, 32]
and the code
r = []
for item in l:
    if 'k' in item:
        for i in item['k']:
            r += [i]
is there an elegant list comprehension solution for this kind of list?
Obviously,
[i for i in item['k'] if 'k' in item for item in l]
fails, because item['k'] is accessed before the condition is checked. Any ideas?

Use get to provide an empty list to iterate over if k doesn't exist.
r = [i for d in l for i in d.get('k', [])]
Or, check for k before you try to access its value.
r = [i for d in l if 'k' in d for i in d['k']]

You almost have the right solution with your list comprehension. The only problem is the order of the clauses inside it: they must appear in the same order as the equivalent nested for/if statements. Please try the following.
l = [{'k': [1, 2]}, {'k': [2, 8]}, {'k': [6, 32]}, {}, {'s': 0}]
answer = [i for item in l if 'k' in item for i in item['k'] ]
print(answer)
Is this what you wanted?
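One way to see why this ordering works: the clauses of a comprehension are read left to right, in the same order as the nested statements they replace. A minimal sketch comparing the two forms (the names records, flatten_loop and flatten_comprehension are made up for this illustration):

```python
records = [{'k': [1, 2]}, {'k': [2, 8]}, {'k': [6, 32]}, {}, {'s': 0}]

def flatten_loop(items):
    """Explicit nested loops: outer for, then if, then inner for."""
    out = []
    for item in items:
        if 'k' in item:
            for i in item['k']:
                out.append(i)
    return out

def flatten_comprehension(items):
    """Same clauses, same order, written as one comprehension."""
    return [i for item in items if 'k' in item for i in item['k']]

print(flatten_loop(records))           # [1, 2, 2, 8, 6, 32]
print(flatten_comprehension(records))  # [1, 2, 2, 8, 6, 32]
```

Because the if clause sits before the inner for, item['k'] is only evaluated for dictionaries that actually contain the key.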

Related

How to extract only certain values from a dictionary (python)

Let's say that I have a list l=[1, 2, 3, 4] and a dictionary d={2: 'a', 4: 'b'}.
I'd like to extract the values of d only if the key is also in my list, and put the result in a new list.
This is what I've tried so far:
new_l = []
for i in l:
    for key in d.keys():
        if key in l:
            new_l.append(d[key])
print(new_l)
Thank you in advance for your help.
This checks each key in the dictionary and keeps its value if the key is also in the list.
Simplistic answer:
>>> l
[1, 2, 3, 4]
>>> d
{2: 'a', 4: 'b'}
>>> [value for (key,value) in d.items() if key in l]
['a', 'b']
You don't need to cycle through each key in that second for loop. With Python, you can just use a list comprehension:
L = [1, 2, 3, 4]
d = {2: 'a', 4: 'b'}
res = [d[i] for i in L if i in d] # ['a', 'b']
An alternative functional solution is possible if you know your dictionary values are non-Falsy (e.g. not 0, None). filter is a lazy iterator, so you'll need to exhaust via list in a subsequent step:
res = filter(None, map(d.get, L))
print(list(res)) # ['a', 'b']
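The caveat about falsy values is easy to overlook, so here is a small sketch of what goes wrong. The data is hypothetical: the value 0 is stored on purpose to show that filter(None, ...) drops it, while the membership test keeps it.

```python
keys = [1, 2, 3, 4]
d = {2: 'a', 4: 0}  # hypothetical data: 0 is a legitimate but falsy value

# Membership test: keeps every value whose key is present
by_membership = [d[k] for k in keys if k in d]

# filter(None, ...): drops missing keys (None) but also any falsy value
by_filter = list(filter(None, map(d.get, keys)))

print(by_membership)  # ['a', 0]
print(by_filter)      # ['a']
```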
You can skip iterating over l and loop over the dictionary's keys instead.
Ex:
l = [1, 2, 3, 4]
d = {2: "a", 4: "b"}
new_l = []
for key in d.keys():
    if key in l:
        new_l.append(d[key])
print(new_l)
Iterate over the dictionary's keys and keep the value whenever the key is present in the list.
L = [1, 2, 3, 4]
d = {2: "a", 4: "b"}
new_l = []
for k in d.keys():
    if k in L:
        new_l.append(d[k])
print(new_l)

Python remove duplicates in dictionary of lists

My dictionary looks something like this:
dictionary = {'apple': [3, 5], 'banana': [3, 3, 6], 'strawberry': [1, 2, 4, 5, 5]}
How can I remove all duplicates (so create a set) for each value/list?
I would like the new dictionary to look like this:
{'apple': [3, 5], 'banana': [3, 6], 'strawberry': [1, 2, 4, 5]}
Use a dict comprehension and sets to remove duplicates:
d = {'apple': [3, 5], 'banana': [3, 3, 6], 'strawberry': [1, 2, 4, 5, 5]}
print({k: list(set(j)) for k, j in d.items()})
results in
{'apple': [3, 5], 'banana': [3, 6], 'strawberry': [1, 2, 4, 5]}
If you want to preserve the list order:
d = {'apple': [3, 5, 5, 8, 4, 5], 'banana': [3, 3, 6, 1, 1, 3], 'strawberry': [5, 1, 1, 2, 4, 5, 5]}
print({k: sorted(set(j), key=j.index) for k, j in d.items()})
results in:
{'apple': [3, 5, 8, 4], 'banana': [3, 6, 1], 'strawberry': [5, 1, 2, 4]}
for lst in dictionary.values():
    lst[:] = list(set(lst))
Going through set might change the order, though. If that must not happen, OrderedDict is an option:
import collections
for lst in dictionary.values():
    lst[:] = list(collections.OrderedDict.fromkeys(lst))
Or if the lists should be sorted, you can do that instead:
for lst in dictionary.values():
    lst[:] = sorted(set(lst))
Or if the lists are already sorted, you could keep the first element and every element that's not a duplicate of the element before it.
for lst in dictionary.values():
    lst[:] = lst[:1] + [b for a, b in zip(lst, lst[1:]) if a != b]
dictionary = {"apple": [3, 5], "banana": [3, 3, 6], "strawberry": [1, 2, 4, 5, 5]}
for key, item in dictionary.items():
    dictionary[key] = set(item)
print(dictionary)
output:
{'apple': {3, 5}, 'banana': {3, 6}, 'strawberry': {1, 2, 4, 5}}
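For the order-preserving case, a plain dict can stand in for OrderedDict on current Python versions. The following is a sketch, assuming Python 3.7 or later (where dict keeps insertion order) and hashable list elements; dedupe_lists and fruit are names made up for this example.

```python
def dedupe_lists(mapping):
    """Return a new dict whose lists have duplicates removed, first occurrence kept."""
    # dict.fromkeys keeps keys in insertion order (guaranteed since Python 3.7)
    return {key: list(dict.fromkeys(values)) for key, values in mapping.items()}

fruit = {
    'apple': [3, 5, 5, 8, 4, 5],
    'banana': [3, 3, 6, 1, 1, 3],
    'strawberry': [5, 1, 1, 2, 4, 5, 5],
}
print(dedupe_lists(fruit))
# {'apple': [3, 5, 8, 4], 'banana': [3, 6, 1], 'strawberry': [5, 1, 2, 4]}
```

Unlike the in-place lst[:] = ... variants, this builds a new dictionary and leaves the original lists untouched.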

iterate over only two keys of python dictionary

What is the pythonic way to iterate over a dictionary with a setup like this:
dict = {'a': [1, 2, 3], 'b': [3, 4, 5], 'c': 6}
if I only wanted to iterate a for loop over all the values in a and b and skip c. There's obviously a million ways to solve this but I'd prefer to avoid something like:
for each in dict['a']:
    # do something
    pass
for each in dict['b']:
    # do something
    pass
or something destructive like:
del dict['c']
for k, v in dict.items():
    pass
The more generic way is using filter-like approaches by putting an if in the end of a generator expression.
If you want to iterate over every iterable value, filter with hasattr:
for key in (k for k in dict if hasattr(dict[k], '__iter__')):
    for item in dict[key]:
        print(item)
If you want to exclude some keys, use a "not in" filter:
invalid = set(['c', 'd'])
for key in (k for k in dict if k not in invalid):
    ....
If you want to select only specific keys, use an "in" filter:
valid = set(['a', 'b'])
for key in (k for k in dict if k in valid):
    ....
Similar to SSDMS's solution you can also just do:
mydict = {'a': [1, 2, 3], 'b': [3, 4, 5], 'c': 6}
for each in mydict['a'] + mydict['b']:
    ....
You can use chain from the itertools module to do this:
In [29]: from itertools import chain
In [30]: mydict = {'a': [1, 2, 3], 'b': [3, 4, 5], 'c': 6}
In [31]: for item in chain(mydict['a'], mydict['b']):
    ...:     print(item)
    ...:
1
2
3
3
4
5
To iterate over only the values of the wanted keys that are instances of list, use chain.from_iterable:
wanted_key = ['a', 'b']
for item in chain.from_iterable(mydict[key] for key in wanted_key if isinstance(mydict[key], list)):
    pass  # do something with the item
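The chain.from_iterable approach can be wrapped in a small helper. This is only a sketch: iter_selected is a hypothetical name, and using mapping.get so that missing keys are skipped silently is an assumption about the desired behavior, not something the answers above specify.

```python
from itertools import chain

def iter_selected(mapping, wanted):
    """Lazily yield the items of every list stored under one of the wanted keys."""
    return chain.from_iterable(
        mapping[key]
        for key in wanted
        if isinstance(mapping.get(key), list)  # skips missing keys and non-list values
    )

mydict = {'a': [1, 2, 3], 'b': [3, 4, 5], 'c': 6}
print(list(iter_selected(mydict, ['a', 'b', 'c', 'missing'])))
# [1, 2, 3, 3, 4, 5]
```

Nothing is copied or deleted: the dictionary stays intact and the lists are walked one after another.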

How to make dictionary from list of lists

This is what I am doing:
Where data is a list of lists in the form of [[int1, int2, int3], [int1, int2, int3]].
I want a dictionary that looks like: {int1: [int2, int3], int1: [int2, int3]}.
I am checking the size of data before the dictionary comprehension and it is 1417.
I then check the length of the dictionary and it is something like 11. I don't know what is happening to the data list, since not all of the elements are being copied into containsBacon.
def makeComprehension(data):
    containsBacon = dict([(movies[2], movies[0:2]) for movies in data])
Here's a way to do it:
>>> l = [[1,2,3], [10,20,30]]
>>> d = {m[0]:m[1:] for m in l}
>>> d
{1: [2, 3], 10: [20, 30]}
Note that not all the elements will necessarily be in the resulting dictionary: if two lists start with the same element, they produce the same key, and only the last such list is kept.
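This is also the likely reason a list of 1417 entries can shrink to around 11 dictionary entries: the dictionary can never hold more entries than there are distinct keys. A short sketch with made-up data (rows, as_dict and distinct_keys are illustrative names):

```python
rows = [[1, 2, 3], [10, 20, 30], [1, 5, 6]]  # two rows share the key 1

as_dict = {row[0]: row[1:] for row in rows}
distinct_keys = {row[0] for row in rows}

print(len(rows), len(as_dict), len(distinct_keys))  # 3 2 2
print(as_dict[1])  # [5, 6]; the later row replaced [2, 3]
```

Comparing len(data) with the number of distinct keys is a quick way to check whether key collisions explain the missing elements.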
If you want to have all your original elements in your resulting dictionary, you can do the following:
>>> l = [[1,2,3], [10,20,30], [1,5,6]]
>>> {m[0]:[x for n in l if n[0]==m[0] for x in n[1:]] for m in l}
{1: [2, 3, 5, 6], 10: [20, 30]}
Similar to @DevShark's answer, but with destructuring assignment:
>>> L = [[1,2,3], [10,20,30], [1,5,6]]
>>> {k:v for k,*v in L}
{1: [5, 6], 10: [20, 30]}
If you want to concatenate the values for a given key, do not use a dict comprehension:
>>> d = {}
>>> for k,*v in L: d.setdefault(k, []).extend(v)
...
>>> d
{1: [2, 3, 5, 6], 10: [20, 30]}
The setdefault method creates the d[k] entry and sets it to an empty list if it does not exist.
This solution is O(n) vs O(n^2) in @DevShark's answer.
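Here's another version using functools.reduce. Note that {**d, ...} copies the accumulated dict on every step, so this is only O(n) when the number of distinct keys stays small: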
>>> import functools
>>> functools.reduce(lambda d,m:{**d, m[0]:d.get(m[0], []) + m[1:]}, L, {})
{1: [2, 3, 5, 6], 10: [20, 30]}
d[m[0]] is updated to its previous value + m[1:]
If you want a dict comprehension, you may use itertools.groupby for a O(n lg n) solution:
>>> import itertools
>>> L.sort() # O(n lg n) part
>>> L
[[1, 2, 3], [1, 5, 6], [10, 20, 30]]
>>> {k:[v for m in ms for v in m[1:]] for k, ms in itertools.groupby(L, lambda m:m[0])}
{1: [2, 3, 5, 6], 10: [20, 30]}

List of dicts to/from dict of lists

I want to change back and forth between a dictionary of (equal-length) lists:
DL = {'a': [0, 1], 'b': [2, 3]}
and a list of dictionaries:
LD = [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
For those of you that enjoy clever/hacky one-liners.
Here is DL to LD:
v = [dict(zip(DL,t)) for t in zip(*DL.values())]
print(v)
and LD to DL:
v = {k: [dic[k] for dic in LD] for k in LD[0]}
print(v)
LD to DL is a little hackier since you are assuming that the keys are the same in each dict. Also, please note that I do not condone the use of such code in any kind of real system.
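The two one-liners can be checked against each other with a round trip. In this sketch they are wrapped in functions; the names dl_to_ld and ld_to_dl and the empty-list guard are additions for illustration, and the same assumptions apply as above (equal-length lists, identical keys in every dict).

```python
def dl_to_ld(dl):
    """Dict of equal-length lists -> list of dicts."""
    return [dict(zip(dl, row)) for row in zip(*dl.values())]

def ld_to_dl(ld):
    """List of dicts sharing the same keys -> dict of lists."""
    if not ld:
        return {}  # guard added here: ld[0] would fail on an empty list
    return {key: [d[key] for d in ld] for key in ld[0]}

DL = {'a': [0, 1], 'b': [2, 3]}
LD = [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]

print(dl_to_ld(DL) == LD)            # True
print(ld_to_dl(LD) == DL)            # True
print(ld_to_dl(dl_to_ld(DL)) == DL)  # True
```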
If you're allowed to use outside packages, Pandas works great for this:
import pandas as pd
pd.DataFrame(DL).to_dict(orient="records")
Which outputs:
[{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
You can also use orient="list" to get back the original structure
{'a': [0, 1], 'b': [2, 3]}
Perhaps consider using numpy:
import numpy as np
arr = np.array([(0, 2), (1, 3)], dtype=[('a', int), ('b', int)])
print(arr)
# [(0, 2) (1, 3)]
Here we access columns indexed by names, e.g. 'a', or 'b' (sort of like DL):
print(arr['a'])
# [0 1]
Here we access rows by integer index (sort of like LD):
print(arr[0])
# (0, 2)
Each value in the row can be accessed by column name (sort of like LD):
print(arr[0]['b'])
# 2
Going from the list of dictionaries to the dict of lists is straightforward. You can use this form:
DL = {'a': [0, 1], 'b': [2, 3], 'c': [4, 5]}
LD = [{'a': 0, 'b': 2, 'c': 4}, {'a': 1, 'b': 3, 'c': 5}]
nd = {}
for d in LD:
    for k, v in d.items():
        try:
            nd[k].append(v)
        except KeyError:
            nd[k] = [v]
print(nd)
# {'a': [0, 1], 'b': [2, 3], 'c': [4, 5]}
Or use defaultdict:
import collections as cl
nd = cl.defaultdict(list)
for d in LD:
    for key, val in d.items():
        nd[key].append(val)
print(dict(nd))
# {'a': [0, 1], 'b': [2, 3], 'c': [4, 5]}
Going the other way is problematic. You need some information about the order in which the dictionary's keys were inserted into the list. Recall that before Python 3.7 the order of keys in a dict was not guaranteed to match the original insertion order.
For giggles, assume the insertion order is based on sorted keys. You can then do it this way:
nl = []
nl_index = []
for k in sorted(DL.keys()):
    nl.append({k: []})
    nl_index.append(k)
for key, l in DL.items():
    for item in l:
        nl[nl_index.index(key)][key].append(item)
print(nl)
# [{'a': [0, 1]}, {'b': [2, 3]}, {'c': [4, 5]}]
If your question was based on curiosity, there is your answer. If you have a real-world problem, let me suggest you rethink your data structures. Neither of these seems to be a very scalable solution.
Here are the one-line solutions (spread out over multiple lines for readability) that I came up with:
If dl is your original dict of lists:
dl = {"a": [0, 1], "b": [2, 3]}
Then here's how to convert it to a list of dicts:
ld = [{key: value[index] for key, value in dl.items()}
      for index in range(max(map(len, dl.values())))]
Which, if you assume that all your lists are the same length, you can simplify and gain a performance increase by going to:
ld = [{key: value[index] for key, value in dl.items()}
      for index in range(len(next(iter(dl.values()))))]
Here's how to convert that back into a dict of lists:
import functools
dl2 = {key: [item[key] for item in ld]
       for key in list(functools.reduce(
           lambda x, y: x.union(y),
           (set(dicts.keys()) for dicts in ld)
       ))
      }
If you're using Python 2 instead of Python 3, you can just use reduce instead of functools.reduce there.
You can simplify this if you assume that all the dicts in your list will have the same keys:
dl2 = {key:[item[key] for item in ld] for key in ld[0].keys() }
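When the dicts do not all share the same keys, item[key] raises a KeyError for any dict that lacks one of the collected keys. One possible way around that, sketched here with a hypothetical helper ld_to_dl_with_gaps and a placeholder value chosen by the caller, is to look values up with dict.get:

```python
def ld_to_dl_with_gaps(ld, missing=None):
    """List of dicts with possibly different keys -> dict of equal-length lists."""
    # Collect keys in first-seen order; a dict is used as an ordered set
    keys = {}
    for d in ld:
        keys.update(dict.fromkeys(d))
    # d.get fills in the placeholder wherever a dict lacks the key
    return {key: [d.get(key, missing) for d in ld] for key in keys}

print(ld_to_dl_with_gaps([{'a': 0, 'b': 2}, {'a': 1}, {'c': 5}]))
# {'a': [0, 1, None], 'b': [2, None, None], 'c': [None, None, 5]}
```

Every resulting list has the same length as the input list, so positions still line up across keys.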
cytoolz.dicttoolz.merge_with
from cytoolz.dicttoolz import merge_with
merge_with(list, *LD)
{'a': [0, 1], 'b': [2, 3]}
Non-cython version
from toolz.dicttoolz import merge_with
merge_with(list, *LD)
{'a': [0, 1], 'b': [2, 3]}
The pandas module gives you an easy-to-understand solution. As a complement to @chiang's answer, the solutions for both DL-to-LD and LD-to-DL are as follows:
import pandas as pd
DL = {'a': [0, 1], 'b': [2, 3]}
out1 = pd.DataFrame(DL).to_dict('records')
Output:
[{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
In the other direction:
LD = [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
out2 = pd.DataFrame(LD).to_dict('list')
Output:
{'a': [0, 1], 'b': [2, 3]}
Cleanest way I can think of on a summer Friday. As a bonus, it supports lists of different lengths (but in this case, DLtoLD(LDtoDL(l)) is no longer the identity).
From list to dict
Actually less clean than @dwerk's defaultdict version.
def LDtoDL(l):
    result = {}
    for d in l:
        for k, v in d.items():
            result[k] = result.get(k, []) + [v]  # inefficient
    return result
From dict to list
def DLtoLD(d):
    if not d:
        return []
    # reserve as many *distinct* dicts as the longest sequence
    result = [{} for i in range(max(map(len, d.values())))]
    # fill each dict, one key at a time
    for k, seq in d.items():
        for oneDict, oneValue in zip(result, seq):
            oneDict[k] = oneValue
    return result
I needed a method that works for lists of different lengths (so this is a generalization of the original question). Since I did not find any code here that works the way I expected, here's my code, which works for me. Note that it builds every combination of the values (a Cartesian product) rather than pairing them up by position:
import itertools
from typing import Dict, List, TypeVar
S, T = TypeVar("S"), TypeVar("T")
def dict_of_lists_to_list_of_dicts(dict_of_lists: Dict[S, List[T]]) -> List[Dict[S, T]]:
    keys = list(dict_of_lists.keys())
    list_of_values = [dict_of_lists[key] for key in keys]
    product = list(itertools.product(*list_of_values))
    return [dict(zip(keys, product_elem)) for product_elem in product]
Examples:
>>> dict_of_lists_to_list_of_dicts({1: [3], 2: [4, 5]})
[{1: 3, 2: 4}, {1: 3, 2: 5}]
>>> dict_of_lists_to_list_of_dicts({1: [3, 4], 2: [5]})
[{1: 3, 2: 5}, {1: 4, 2: 5}]
>>> dict_of_lists_to_list_of_dicts({1: [3, 4], 2: [5, 6]})
[{1: 3, 2: 5}, {1: 3, 2: 6}, {1: 4, 2: 5}, {1: 4, 2: 6}]
>>> dict_of_lists_to_list_of_dicts({1: [3, 4], 2: [5, 6], 7: [8, 9, 10]})
[{1: 3, 2: 5, 7: 8},
{1: 3, 2: 5, 7: 9},
{1: 3, 2: 5, 7: 10},
{1: 3, 2: 6, 7: 8},
{1: 3, 2: 6, 7: 9},
{1: 3, 2: 6, 7: 10},
{1: 4, 2: 5, 7: 8},
{1: 4, 2: 5, 7: 9},
{1: 4, 2: 5, 7: 10},
{1: 4, 2: 6, 7: 8},
{1: 4, 2: 6, 7: 9},
{1: 4, 2: 6, 7: 10}]
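These examples differ from the positional conversion asked about in the original question: itertools.product yields every combination of values, while zip pairs the i-th value of each list. A small side-by-side sketch (the variable names are made up for illustration):

```python
import itertools

dl = {1: [3, 4], 2: [5, 6]}

# Positional pairing: the i-th values of all lists form the i-th dict
paired = [dict(zip(dl, values)) for values in zip(*dl.values())]

# Cartesian product: one dict per combination of values
combined = [dict(zip(dl, values)) for values in itertools.product(*dl.values())]

print(paired)    # [{1: 3, 2: 5}, {1: 4, 2: 6}]
print(combined)  # [{1: 3, 2: 5}, {1: 3, 2: 6}, {1: 4, 2: 5}, {1: 4, 2: 6}]
```

The number of dicts from the product grows as the product of the list lengths, which is worth keeping in mind for larger inputs.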
Here is my small script:
a = {'a': [0, 1], 'b': [2, 3]}
elem = {}
result = []
for i in range(len(a['a'])):  # (1)
    for key, value in a.items():
        elem[key] = value[i]
    result.append(elem)
    elem = {}
print(result)
I'm not sure this is the most elegant way.
(1) This assumes that all the lists have the same length.
Here is a solution without any libraries used:
def dl_to_ld(initial):
    finalList = []
    neededLen = 0
    for key in initial:
        if len(initial[key]) > neededLen:
            neededLen = len(initial[key])
    for i in range(neededLen):
        finalList.append({})
    for i in range(len(finalList)):
        for key in initial:
            try:
                finalList[i][key] = initial[key][i]
            except IndexError:
                # this list is shorter than the longest one; skip the key
                pass
    return finalList
You can call it as follows:
dl = {'a':[0,1],'b':[2,3]}
print(dl_to_ld(dl))
#[{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
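If you don't mind a generator, you can use something like the following (written for Python 2; in Python 3, i.next() is spelled next(i), and a StopIteration escaping the generator expression is turned into a RuntimeError, so the exhaustion check would need an explicit try/except):
def f(dl):
    l = list((k, v.__iter__()) for k, v in dl.items())
    while True:
        d = dict((k, i.next()) for k, i in l)
        if not d:
            break
        yield d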
It's not as "clean" as it could be for Technical Reasons: My original implementation did yield dict(...), but this ends up being the empty dictionary because (in Python 2.5) a for b in c does not distinguish between a StopIteration exception when iterating over c and a StopIteration exception when evaluating a.
On the other hand, I can't work out what you're actually trying to do; it might be more sensible to design a data structure that meets your requirements instead of trying to shoehorn it in to the existing data structures. (For example, a list of dicts is a poor way to represent the result of a database query.)
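For readers on Python 3, the generator idea can be sketched as follows. This is an adaptation, not the author's code: iter_rows is a hypothetical name, and exhausted lists are simply left out of later rows, which is one possible reading of the intended behavior.

```python
def iter_rows(dl):
    """Lazily yield one dict per position from a dict of lists."""
    iterators = {key: iter(values) for key, values in dl.items()}
    while True:
        row = {}
        for key, it in iterators.items():
            try:
                row[key] = next(it)
            except StopIteration:
                continue  # this list is exhausted; leave its key out
        if not row:
            return  # every list is exhausted
        yield row

print(list(iter_rows({'a': [0, 1], 'b': [2, 3]})))
# [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
print(list(iter_rows({'a': [0, 1, 2], 'b': [2]})))
# [{'a': 0, 'b': 2}, {'a': 1}, {'a': 2}]
```

Catching StopIteration explicitly around next() avoids relying on how a generator expression handles the exception.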
List of dicts ⟶ dict of lists
from collections import defaultdict
from typing import TypeVar
K = TypeVar("K")
V = TypeVar("V")
def ld_to_dl(ld: list[dict[K, V]]) -> dict[K, list[V]]:
    dl = defaultdict(list)
    for d in ld:
        for k, v in d.items():
            dl[k].append(v)
    return dl
defaultdict creates an empty list if one does not exist upon key access.
Dict of lists ⟶ list of dicts
Collecting into "jagged" dictionaries
from typing import TypeVar
K = TypeVar("K")
V = TypeVar("V")
def dl_to_ld(dl: dict[K, list[V]]) -> list[dict[K, V]]:
    ld = []
    for k, vs in dl.items():
        ld += [{} for _ in range(len(vs) - len(ld))]
        for i, v in enumerate(vs):
            ld[i][k] = v
    return ld
This generates a list of dictionaries ld that may be missing items if the lengths of the lists in dl are unequal. It loops over all key-values in dl, and creates empty dictionaries if ld does not have enough.
Collecting into "complete" dictionaries only
(Usually intended only for equal-length lists.)
from typing import TypeVar
K = TypeVar("K")
V = TypeVar("V")
def dl_to_ld(dl: dict[K, list[V]]) -> list[dict[K, V]]:
    ld = [dict(zip(dl.keys(), v)) for v in zip(*dl.values())]
    return ld
This generates a list of dictionaries ld that have the length of the smallest list in dl.
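A third option sits between the "jagged" and "complete" variants: keep every position but pad the shorter lists with a placeholder. The sketch below uses itertools.zip_longest; the function name dl_to_ld_padded and the default fill value of None are assumptions made for this example.

```python
from itertools import zip_longest

def dl_to_ld_padded(dl, fill=None):
    """Dict of lists -> list of dicts, padding shorter lists with a fill value."""
    return [
        dict(zip(dl, row))
        for row in zip_longest(*dl.values(), fillvalue=fill)
    ]

print(dl_to_ld_padded({'a': [0, 1, 2], 'b': [2, 3]}))
# [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}, {'a': 2, 'b': None}]
```

Here every dict has the full set of keys and the result is as long as the longest list, at the cost of not being able to tell a padded None from a genuine one unless a distinct fill value is chosen.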
DL = {'a': [0, 1, 2, 3], 'b': [2, 3, 4, 5]}
LD = [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
Empty_list = []
Empty_dict = {}
# find the length of the longest list among the dictionary's values
len_list = 0
for i in DL.values():
    if len_list < len(i):
        len_list = len(i)
for k in range(len_list):
    for i, j in DL.items():
        Empty_dict[i] = j[k]
    Empty_list.append(Empty_dict)
    Empty_dict = {}
LD = Empty_list
