I have this nested python dictionary
dictionary = {'a':'1', 'b':{'c':'2', 'd':{'z':'5', 'e':{'f':'13', 'g':'14'}}}}
The expected output is:
output = ['a:1', 'b:c:2', 'b:d:z:5', 'b:d:e:f:13', 'b:d:e:g:13']
I would like to do this both with and without a recursive function.
In cases like this, I always like to try and solve the easy part first.
def flatten_dict(dictionary):
    output = []
    for key, item in dictionary.items():
        if isinstance(item, dict):
            output.append(f'{key}:???')  # Hm, here is the difficult part
        else:
            output.append(f'{key}:{item}')
    return output
Calling flatten_dict(dictionary) now returns ['a:1', 'b:???'], which is obviously not good enough. For one thing, the list has three items too few.
First, I'd like to switch to using generator functions. This is more complicated for now, but will pay off later.
def flatten_dict(dictionary):
    return list(flatten_dict_impl(dictionary))

def flatten_dict_impl(dictionary):
    for key, item in dictionary.items():
        if isinstance(item, dict):
            yield f'{key}:???'
        else:
            yield f'{key}:{item}'
No change in the output yet. Time to go recursive.
You want the output to be a flat list, so that means we have to yield multiple things in the case item is a dictionary. Only, what things? Let's try plugging in a recursive call to flatten_dict_impl on this subdictionary, that seems the most straightforward way to go.
# flatten_dict is unchanged
def flatten_dict_impl(dictionary):
    for key, item in dictionary.items():
        if isinstance(item, dict):
            for value in flatten_dict_impl(item):
                yield f'{key}:{value}'
        else:
            yield f'{key}:{item}'
The output is now ['a:1', 'b:c:2', 'b:d:z:5', 'b:d:e:f:13', 'b:d:e:g:14'], which is the output you wanted, except the final 14, but I think that's a typo on your part.
Now the non-recursive route. For that we need to manage some state ourselves, because we need to know how deep we are.
def flatten_dict_nonrecursive(dictionary):
    return list(flatten_dict_nonrecursive_impl(dictionary))

def flatten_dict_nonrecursive_impl(dictionary):
    dictionaries = [iter(dictionary.items())]
    keys = []
    while dictionaries:
        try:
            key, value = next(dictionaries[-1])
        except StopIteration:
            dictionaries.pop()
            if keys:
                keys.pop()
        else:
            if isinstance(value, dict):
                keys.append(key)
                dictionaries.append(iter(value.items()))
            else:
                yield ':'.join(keys + [key, value])
Now this gives the right output but is a lot less easy to understand, and a lot longer. It took a lot longer for me to get right too. There may be shorter and more obvious ways to do it that I missed, but in general recursive problems are easier to solve with recursive functions.
Such an approach can still be useful: if your dictionaries are nested hundreds or thousands of levels deep, then trying to do it recursively will likely overflow the stack.
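If the iterator bookkeeping above feels heavy, here is a minimal sketch of another non-recursive variant that keeps a stack of (prefix, value) pairs instead. The name flatten_dict_stack is my own, and like the code above it assumes that all keys and leaf values are strings.

```python
def flatten_dict_stack(dictionary):
    output = []
    # Each stack entry is (tuple of keys seen so far, value still to process).
    stack = [((), dictionary)]
    while stack:
        prefix, current = stack.pop()
        if isinstance(current, dict):
            # Push children in reverse so they are popped in insertion order.
            for key, value in reversed(list(current.items())):
                stack.append((prefix + (key,), value))
        else:
            # A leaf: join the key path and the value (assumed to be a string).
            output.append(':'.join(prefix + (current,)))
    return output


dictionary = {'a': '1', 'b': {'c': '2', 'd': {'z': '5', 'e': {'f': '13', 'g': '14'}}}}
print(flatten_dict_stack(dictionary))
# ['a:1', 'b:c:2', 'b:d:z:5', 'b:d:e:f:13', 'b:d:e:g:14']
```

The stack grows on the heap rather than on the call stack, so the nesting depth is not bounded by the recursion limit.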
I hope this helps. Let me know if I need to go into more detail or something.
You can use a NestedDict. First install ndicts
pip install ndicts
Then:
from ndicts.ndicts import NestedDict
dictionary = {'a': '1', 'b': {'c' :'2', 'd': {'z': '5', 'e': {'f': '13', 'g': '14'}}}}
nd = NestedDict(dictionary)
output = [":".join((*key, value)) for key, value in nd.items()]
I am dealing with a dictionary that is formatted as such:
dic = {'Start': [['Story' , '.']],
'Wonderful': [('thing1',), ["thing1", "and", "thing2"]],
'Amazing': [["The", "thing", "action", "the", "thing"]],
'Fantastic': [['loved'], ['ate'], ['messaged']],
'Example': [['bus'], ['car'], ['truck'], ['pickup']]}
If you notice, in the 'Wonderful' key there is a tuple within a list. I am looking for a way to convert all tuples within the inner lists of each key into lists.
I have tried the following:
for value in dic.values():
    for inner in value:
        inner = list(inner)
but that does not work and I don't see why. I also tried an if type(inner) = tuple statement to convert it only if it's a tuple, but that is not working either... Any help would be very greatly appreciated.
edit: I am not allowed to import, and only have really learned a basic level of python. A solution that I could understand with that in mind is preferred.
You need to invest some time learning how assignment in Python works.
inner = list(inner) constructs a list (right hand side), then binds the name inner to that new list and then... you do nothing with it.
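A small sketch of the difference, using a made-up list named values (not your dic): rebinding the loop variable leaves the container alone, while assigning through an index replaces the slot.

```python
values = [('a', 'b'), ['c']]

# Rebinding the loop variable: `inner` now names a new list,
# but the slot in `values` still holds the original tuple.
for inner in values:
    inner = list(inner)
unchanged = list(values)  # shallow copy, taken to show nothing changed

# Assigning through the container: the slot itself is replaced.
for i, inner in enumerate(values):
    values[i] = list(inner)

print(unchanged)  # [('a', 'b'), ['c']]
print(values)     # [['a', 'b'], ['c']]
```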
Fixing your code:
for k, vs in dic.items():
    dic[k] = [list(x) if isinstance(x, tuple) else x for x in vs]
You need to update the element by its index
for curr in dic.values():
    for i, v in enumerate(curr):
        if isinstance(v, tuple):
            curr[i] = list(v)
print(dic)
Your title, data and code suggest that you only have tuples and lists there and are willing to run list() on all of them, so here's a short way to convert them all to lists and assign them back into the outer lists (which is what you were missing):
for value in dic.values():
    value[:] = map(list, value)
And a fun way:
for value in dic.values():
    for i, [*value[i]] in enumerate(value):
        pass
I want to delete an element of a list whose contents are dictionaries.
I've created a function called "del_product(product):" and tried many ways to delete the element, but it always ends in a KeyError.
def del_product(product):
    found=False
    for lis in productos:
        for dic in lis.values():
            if product in lis.values():
                flag=True
                del(lis[product])
                return lis
                print(f"{product} deleted successfully") if found==True else print("{product} not found")
I expect the function to delete the element properly. I'd really appreciate it if someone could help me.
The right way to look up a dictionary value is dict[key], but you're calling dict[value] instead. That is why you got a KeyError.
You should provide data samples and expected output for getting a faster response.
This is my example based on your code:
productos = [{"key1":0, "key2":1}, {"key1":0, "key2":"pro"}]
def del_pro(val):
    for dic in productos:
        for k in dic.keys():
            if dic[k] == val:
                print(dic)
                del(dic[k])
                return dic
>>>del_pro("pro")
{'key1': 0, 'key2': 'pro'}
{'key1': 0}
>>>productos
[{'key1': 0, 'key2': 1}, {'key1': 0}]
It looks like you want to make found = True before you delete your item, but instead you're referencing a different variable name called flag. There may be other issues, but the biggest one I see here is setting flag to True instead of found.
Additionally, you want to print something when done, but your print statements come after the return statement. Calling return exits your function, which means the print statements can never be reached.
So, in conclusion, change flag to found, and then move your if/else print statement to before you return lis or else those prints will not be able to execute.
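Putting those fixes together, a minimal sketch might look like the following. The sample productos data is made up, and I'm assuming product is a value to look for (so the matching key is what gets deleted), since the question doesn't show the real data.

```python
# Hypothetical sample data; the real `productos` was not shown in the question.
productos = [{"name": "apple", "price": 1}, {"name": "pear", "price": 2}]


def del_product(product):
    found = False
    for dic in productos:
        # Iterate over a snapshot of the keys so deleting inside the loop is safe.
        for key in list(dic):
            if dic[key] == product:
                del dic[key]  # delete by key, not by value
                found = True
    # Report before returning, so the message is always reached.
    print(f"{product} deleted successfully" if found else f"{product} not found")
    return found


del_product("apple")  # prints: apple deleted successfully
del_product("kiwi")   # prints: kiwi not found
```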
I'm a C coder developing something in python. I know how to do the following in C (and hence in C-like logic applied to python), but I'm wondering what the 'Python' way of doing it is.
I have a dictionary d, and I'd like to operate on a subset of the items, only those whose key (string) contains a specific substring.
i.e. the C logic would be:
for key in d:
    if filter_string in key:
        # do something
    else:
        # do nothing, continue
I'm imagining the python version would be something like
filtered_dict = crazy_python_syntax(d, substring)
for key,value in filtered_dict.iteritems():
    # do something
I've found a lot of posts on here regarding filtering dictionaries, but couldn't find one which involved exactly this.
My dictionary is not nested and I'm using Python 2.7.
How about a dict comprehension:
filtered_dict = {k:v for k,v in d.iteritems() if filter_string in k}
Once you see it, it should be self-explanatory, as it reads like English pretty well.
This syntax requires Python 2.7 or greater.
In Python 3, there is only dict.items(), not iteritems() so you would use:
filtered_dict = {k:v for (k,v) in d.items() if filter_string in k}
Go for whatever is most readable and easily maintainable. Just because you can write it out in a single line doesn't mean that you should. Your existing solution is close to what I would use, other than that I would use iteritems to skip the value lookup, and I hate nested ifs if I can avoid them:
for key, val in d.iteritems():
    if filter_string not in key:
        continue
    # do something
However if you realllly want something to let you iterate through a filtered dict then I would not do the two step process of building the filtered dict and then iterating through it, but instead use a generator, because what is more pythonic (and awesome) than a generator?
First we create our generator, and good design dictates that we make it abstract enough to be reusable:
# The implementation of my generator may look vaguely familiar, no?
def filter_dict(d, filter_string):
    for key, val in d.iteritems():
        if filter_string not in key:
            continue
        yield key, val
And then we can use the generator to solve your problem nice and cleanly with simple, understandable code:
for key, val in filter_dict(d, some_string):
    # do something
In short: generators are awesome.
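The generator above is written for Python 2. Here is a minimal sketch of the same idea for Python 3, where iteritems() no longer exists and items() is used instead. The sample dictionary is made up for illustration.

```python
def filter_dict(d, filter_string):
    """Yield (key, value) pairs whose key contains filter_string."""
    for key, val in d.items():
        if filter_string in key:
            yield key, val


d = {'foo_one': 1, 'bar': 2, 'foo_two': 3}
matches = list(filter_dict(d, 'foo'))
print(matches)  # [('foo_one', 1), ('foo_two', 3)]
```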
You can use the built-in filter function to filter dictionaries, lists, etc. based on specific conditions.
filtered_dict = dict(filter(lambda item: filter_str in item[0], d.items()))
The advantage is that you can use it for different data structures.
input = {"A":"a", "B":"b", "C":"c"}
output = {k:v for (k,v) in input.items() if key_satisfies_condition(k)}
Jonathon gave you an approach using dict comprehensions in his answer. Here is an approach that deals with your do something part.
If you want to do something with the values of the dictionary, you don't need a dictionary comprehension at all:
I'm using iteritems() since you tagged your question with python-2.7
results = map(some_function, [(k,v) for k,v in a_dict.iteritems() if 'foo' in k])
Now the result will be in a list with some_function applied to each key/value pair of the dictionary, that has foo in its key.
If you just want to deal with the values and ignore the keys, just change the list comprehension:
results = map(some_function, [v for k,v in a_dict.iteritems() if 'foo' in k])
some_function can be any callable, so a lambda would work as well:
results = map(lambda x: x*2, [v for k,v in a_dict.iteritems() if 'foo' in k])
The inner list is actually not required, as you can pass a generator expression to map as well:
>>> map(lambda a: a[0]*a[1], ((k,v) for k,v in {2:2, 3:2}.iteritems() if k == 2))
[4]
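One caveat if you run these on Python 3: there map returns a lazy iterator instead of a list, and iteritems() is replaced by items(). A minimal sketch with a made-up a_dict, wrapping the result in list() to get the same kind of output:

```python
a_dict = {'foo_a': 1, 'bar': 2, 'foo_b': 3}

# map() is lazy on Python 3; nothing is computed until it is consumed.
lazy = map(lambda x: x * 2, (v for k, v in a_dict.items() if 'foo' in k))
results = list(lazy)
print(results)  # [2, 6]
```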
You can use the built-in function 'filter()':
data = {'aaa':12, 'bbb':23, 'ccc':8, 'ddd':34}
# filter by key
print(dict(filter(lambda e:e[0]=='bbb', data.items() ) ) )
# filter by value
print(dict(filter(lambda e:e[1]>18, data.items() ) ) )
OUTPUT:
{'bbb': 23}
{'bbb': 23, 'ddd': 34}
I'm using Python 2.7 with plistlib to import a .plist in a nested dict/array form, then look for a particular key and delete it wherever I see it.
When it comes to the actual files we're working with in the office, I already know where to find the values -- but I wrote my script with the idea that I didn't, in the hopes that I wouldn't have to make changes in the future if the file structure changes or we need to do likewise to other similar files.
Unfortunately I seem to be trying to modify a dict while iterating over it, but I'm not certain how that's actually happening, since I'm using iteritems() and enumerate() to get generators and work with those instead of the object I'm actually working with.
def scrub(someobject, badvalue='_default'): ##_default isn't the real variable
    """Walks the structure of a plistlib-created dict and finds all the badvalues and viciously eliminates them.
    Can optionally be passed a different key to search for."""
    count = 0
    try:
        iterator = someobject.iteritems()
    except AttributeError:
        iterator = enumerate(someobject)
    for key, value in iterator:
        try:
            scrub(value)
        except:
            pass
        if key == badvalue:
            del someobject[key]
            count += 1
    return "Removed {count} instances of {badvalue} from {file}.".format(count=count, badvalue=badvalue, file=file)
Unfortunately, when I run this on my test .plist file, I get the following error:
Traceback (most recent call last):
  File "formscrub.py", line 45, in <module>
    scrub(loadedplist)
  File "formscrub.py", line 19, in scrub
    for key, value in iterator:
RuntimeError: dictionary changed size during iteration
So the problem might be the recursive call to itself, but even then shouldn't it just be removing from the original object? I'm not sure how to avoid recursion (or if that's the right strategy) but since it's a .plist, I do need to be able to identify when things are dicts or lists and iterate over them in search of either (a) more dicts to search, or (b) the actual key-value pair in the imported .plist that I need to delete.
Ultimately, this is a partial non-issue, in that the files I'll be working with on a regular basis have a known structure. However, I was really hoping to create something that doesn't care about the nesting or order of the object it's working with, as long as it's a Python dict with arrays in it.
Adding or removing items to/from a sequence while iterating over this sequence is tricky at best, and just illegal (as you just discovered) with dicts. The right way to remove entries from a dict while iterating over it is to iterate on a snapshot of the keys. In Python 2.x, dict.keys() provides such a snapshot. So for dicts the solution is:
for key in mydict.keys():
    if key == bad_value:
        del mydict[key]
As mentioned by cpizza in a comment, for Python 3 you'll need to explicitly create the snapshot using list():
for key in list(mydict.keys()):
    if key == bad_value:
        del mydict[key]
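As a side sketch: when the test is plain equality against a single key, as in the loops above, dict.pop with a default does the same job without iterating at all. The sample mydict here is invented.

```python
mydict = {'keep': 1, '_bad': 2}
bad_value = '_bad'

# pop() removes the key if present; the default avoids a KeyError if it is not.
mydict.pop(bad_value, None)
mydict.pop(bad_value, None)  # second call is a harmless no-op
print(mydict)  # {'keep': 1}
```

The snapshot loop is still what you need once the condition is anything other than equality with one key.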
For lists, trying to iterate on a snapshot of the indexes (i.e. for i in range(len(thelist)):) would eventually raise an IndexError once anything is removed (obviously, since at least the last index will no longer exist), and even if not, you might skip one or more items (since the removal of an item puts the sequence of indexes out of sync with the list itself). enumerate is safe against IndexError (since the iteration will stop by itself when there's no more 'next' item in the list), but you'll still skip items:
>>> mylist = list("aabbccddeeffgghhii")
>>> for x, v in enumerate(mylist):
...     if v in "bdfh":
...         del mylist[x]
>>> print mylist
['a', 'a', 'b', 'c', 'c', 'd', 'e', 'e', 'f', 'g', 'g', 'h', 'i', 'i']
Not quite a success, as you can see.
The known solution here is to iterate on reversed indexes, ie:
>>> mylist = list("aabbccddeeffgghhii")
>>> for x in reversed(range(len(mylist))):
...     if mylist[x] in "bdfh":
...         del mylist[x]
>>> print mylist
['a', 'a', 'c', 'c', 'e', 'e', 'g', 'g', 'i', 'i']
This works with reversed enumeration too, but we don't really care.
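If you don't need to delete by index, another common approach (not the one used in the rest of this answer) is to build the filtered list and assign it back with a slice, which keeps the same list object. A minimal sketch:

```python
mylist = list("aabbccddeeffgghhii")
alias = mylist  # a second reference, to show the object is updated in place

# Slice assignment replaces the contents without creating a new list object.
mylist[:] = [v for v in mylist if v not in "bdfh"]

print(mylist)           # ['a', 'a', 'c', 'c', 'e', 'e', 'g', 'g', 'i', 'i']
print(alias is mylist)  # True
```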
So to summarize: you need two different code paths for dicts and lists, and you also need to take care of "not container" values (values which are neither lists nor dicts), something you do not take care of in your current code.
def scrub(obj, bad_key="_this_is_bad"):
    if isinstance(obj, dict):
        # the call to `list` is useless for py2 but makes
        # the code py2/py3 compatible
        for key in list(obj.keys()):
            if key == bad_key:
                del obj[key]
            else:
                scrub(obj[key], bad_key)
    elif isinstance(obj, list):
        for i in reversed(range(len(obj))):
            if obj[i] == bad_key:
                del obj[i]
            else:
                scrub(obj[i], bad_key)
    else:
        # neither a dict nor a list, do nothing
        pass
As a side note: never write a bare except clause. Never ever. This should be illegal syntax, really.
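To illustrate the alternative, here is a minimal sketch that names the one exception it expects, so anything else still surfaces. The helper iter_children is hypothetical and not part of the code above.

```python
def iter_children(obj):
    """Return (key, value) pairs for dicts and (index, value) pairs for lists."""
    try:
        return list(obj.items())
    except AttributeError:
        # Only the "has no .items()" case is handled here. Any other error,
        # such as a TypeError for a non-iterable, still propagates.
        return list(enumerate(obj))


print(iter_children({'a': 1}))    # [('a', 1)]
print(iter_children(['x', 'y']))  # [(0, 'x'), (1, 'y')]
```

With a bare except, calling this on an integer would silently do nothing; here it raises a TypeError, which tells you the input was not a container.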
Here is a generalized version of bruno desthuilliers' answer, with a callable to test the keys.
def clean_dict(obj, func):
    """
    This method scrolls the entire 'obj' to delete every key for which the 'callable' returns
    True
    :param obj: a dictionary or a list of dictionaries to clean
    :param func: a callable that takes a key in argument and return True for each key to delete
    """
    if isinstance(obj, dict):
        # the call to `list` is useless for py2 but makes
        # the code py2/py3 compatible
        for key in list(obj.keys()):
            if func(key):
                del obj[key]
            else:
                clean_dict(obj[key], func)
    elif isinstance(obj, list):
        for i in reversed(range(len(obj))):
            if func(obj[i]):
                del obj[i]
            else:
                clean_dict(obj[i], func)
    else:
        # neither a dict nor a list, do nothing
        pass
And an example with a regex callable :
import re
func = lambda key: re.match(r"^<div>", key)
clean_dict(obj, func)
def walk(d, badvalue, answer=None, sofar=None):
    if sofar is None:
        sofar = []
    if answer is None:
        answer = []
    for k,v in d.iteritems():
        if k == badvalue:
            answer.append(sofar + [k])
        if isinstance(v, dict):
            walk(v, badvalue, answer, sofar+[k])
    return answer

def delKeys(d, badvalue):
    for path in walk(d, badvalue):
        dd = d
        while len(path) > 1:
            dd = dd[path[0]]
            path.pop(0)
        dd.pop(path[0])
Output
In [30]: d = {1:{2:3}, 2:{3:4}, 5:{6:{2:3}, 7:{1:2, 2:3}}, 3:4}
In [31]: delKeys(d, 2)
In [32]: d
Out[32]: {1: {}, 3: 4, 5: {6: {}, 7: {1: 2}}}
I have a giant dict with a lot of nested dicts, like a giant tree, and the depth is unknown.
I need a function, something like find_value(), that takes a dict and a value (as a string) and returns a list of lists, each of which is a "path" (the sequential chain of keys from the first key down to the key, or key's value, where the match was found). If nothing is found, it returns an empty list.
I wrote this code:
def find_value(dict, sought_value, current_path, result):
    for key,value in dict.items():
        current_path.pop()
        current_path.append(key)
        if sought_value in key:
            result.append(current_path)
        if type(value) == type(''):
            if sought_value in value:
                result.append(current_path+[value])
        else:
            current_path.append(key)
            result = find_value(value, sought_value, current_path, result)
    current_path.pop()
    return result
I call this function to test:
result = find_value(self.dump, sought_value, ['START_KEY_FOR_DELETE'], [])
if not len(result):
    print "forgive me, mylord, i'm afraid we didn't find him.."
elif len(result) == 1:
    print "bless gods, for all that we have one match, mylord!"
For some inexplicable reason, my implementation of this function fails some of my tests. I started to debug and found out that even if current_path prints the correct things (it always does, I checked!), the result is inexplicably corrupted. Maybe it is because of recursion magic?
Can anyone help me with this problem? Maybe there is a simple solution for my tasks?
When you write result.append(current_path), you're not copying current_path, which continues to mutate. Change it to result.append(current_path[:]).
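A minimal sketch of the effect, stripped of the recursion: the first append stores a reference to the same list that keeps changing, the second stores a copy.

```python
current_path = ['a']
result = []

result.append(current_path)     # stores a reference to the live list
result.append(current_path[:])  # stores a copy taken at this moment

current_path.append('b')        # later mutation, as happens during the search

print(result)  # [['a', 'b'], ['a']]
```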
I doubt you can do much to optimize a recursive search like that. Assuming there are many lookups on the same dictionary, and the dictionary doesn't change once loaded, then you can index it to get O(1) lookups...
def build_index(src, dest, path=[]):
    for k, v in src.iteritems():
        fk = path+[k]
        if isinstance(v, dict):
            build_index(v, dest, fk)
        else:
            try:
                dest[v].append(fk)
            except KeyError:
                dest[v] = [fk]
>>> data = {'foo': {'sub1': 'blah'}, 'bar': {'sub2': 'whatever'}, 'baz': 'blah'}
>>> index = {}
>>> build_index(data, index)
>>> index
{'blah': [['baz'], ['foo', 'sub1']], 'whatever': [['bar', 'sub2']]}
>>> index['blah']
[['baz'], ['foo', 'sub1']]
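The code above is Python 2 (iteritems). Here is a minimal Python 3 sketch of the same index, assuming every leaf value is hashable. Using dict.setdefault instead of try/except and a tuple default for path are my own substitutions, not part of the original.

```python
def build_index(src, dest, path=()):
    """Map each leaf value in src to the list of key paths that lead to it."""
    for k, v in src.items():
        fk = path + (k,)
        if isinstance(v, dict):
            build_index(v, dest, fk)
        else:
            # setdefault creates the list on first sight of a value.
            dest.setdefault(v, []).append(list(fk))


data = {'foo': {'sub1': 'blah'}, 'bar': {'sub2': 'whatever'}, 'baz': 'blah'}
index = {}
build_index(data, index)
print(index['blah'])            # [['foo', 'sub1'], ['baz']]
print(index.get('missing', []))  # []
```

On Python 3.7+ dicts keep insertion order, so the paths come out in the order the keys appear in data, which differs from the Python 2 session above.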