I've two defaultdicts I eventually want to merge, but first I need to make their keys match. According to some threads I've seen here, I can use pop() to replace keys in a dictionary. But that only updates the existing dictionary, whereas I want to create a new dictionary with the new keys. So something like:
existing_dict_one -> new_dict_one
This is what I have so far:
def split_tabs(x):
    """
    Function to split tab-separated strings, used to break up the keys that are separated by tabs.
    """
    return x.split('\t')

def create_dict(old_dict):
    """
    Function to create a new defaultdict from an existing defaultdict, just with
    different keys.
    """
    new_dict = old_dict.copy()  # Create a copy of old_dict to house the new keys, but with the same values.
    for key, value in new_dict.iteritems():
        umi = split_tabs(key)[0]  # Change key to be UMI, which is the 0th index of the tab-delimited key.
        # new_key = key.replace(key, umi)
        new_dict[umi] = new_dict.pop(key)
    return new_dict
However, I'm getting the following error
RuntimeError: dictionary changed size during iteration
and I don't know how to fix it. Does anyone know how to correct it? I'd like to use the variable "umi" as the new key.
I'd like to post the variable "key" and dictionary "old_dict" I'm using for testing this code, but it's messy and takes up a lot of space. So here's a pastebin link that contains them instead.
Note that "umi" comes from variable "key" which is separated by tabs. So I split "key" and get the first object as "umi".
Just use a dict comprehension for this:
new_dict = {split_tabs(key)[0]: value for key, value in old_dict.iteritems()}
Trying to modify a dictionary while iterating over it is not a good idea in general.
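A minimal sketch of that approach, assuming Python 3 (where .items() replaces .iteritems()). The rekey_by_umi name and the sample keys are hypothetical, not the data from the question. Note that if two old keys share the same first field, the value seen last silently replaces the earlier one.

```python
from collections import defaultdict


def split_tabs(x):
    """Split a tab-separated string into its fields."""
    return x.split('\t')


def rekey_by_umi(old_dict):
    """Return a new dict keyed by the first tab-separated field of each old key.

    old_dict is left untouched. If two old keys share a first field,
    the value seen last wins.
    """
    return {split_tabs(key)[0]: value for key, value in old_dict.items()}


# Hypothetical sample data, not the data from the question.
old = defaultdict(list)
old['AAAC\tchr1\t100'].append('read1')
old['GGTT\tchr2\t200'].append('read2')

new = rekey_by_umi(old)
print(sorted(new))   # ['AAAC', 'GGTT']
print(len(old))      # 2, the original dict is unchanged
```

The result is a plain dict; if the default factory is still needed, it can be wrapped as defaultdict(list, new).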
If you use .items() instead of .iteritems(), you won't have that problem, because in Python 2 that returns a list that is disconnected from the dictionary. In Python 3 it would be list(new_dict.items()).
Also, if there's any possibility that the dictionary values are mutable and you need the copy to be independent of the original, use copy.deepcopy(old_dict) instead of old_dict.copy(), which only makes a shallow copy.
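To illustrate the difference, here is a small sketch with made-up data: a shallow copy shares the value objects with the original, while a deep copy does not.

```python
import copy

# Hypothetical data: one key whose value is a mutable list.
original = {'a': [1, 2]}

shallow = original.copy()        # new dict, but the list is shared
deep = copy.deepcopy(original)   # new dict and a new list

original['a'].append(3)

print(shallow['a'])  # [1, 2, 3]
print(deep['a'])     # [1, 2]
```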
I want to access an element of a dictionary using a string.
For example, I have a dictionary like this:
data = {"masks": {"id": "valore"}}
I have a string campo="masks,id" that I want to split with campo.split(','). This gives ['masks', 'id'], which I want to use to access the element data["masks"]["id"].
This dictionary is just an example; my real dictionaries are more complex. The point is that I want to access the element data["masks"]["id"] with the input string "masks,id", the element data["masks"] with the string "masks", the element data["masks"]["id"]["X"] with the input string "masks,id,X", and so on.
How can I do this?
I wouldn't recommend the following method, since a Python dict is not meant to be accessed the way you want. But since Python lets you rebind a name to an object of a different type at your own risk, here is a snippet that gets the job done.
It iterates over the keys and, at each iteration, fetches the child dictionary if the key is present. The .get() method returns an empty dict if the key is not found.
data = {"masks": {"id": "valore"}}
text = "masks, id"
nested_keys = text.split(", ")
nested_dict = data
for key in nested_keys:
    # Descend one level; fall back to an empty dict if the key is missing.
    nested_dict = nested_dict.get(key, {})
    if isinstance(nested_dict, str):
        print(nested_dict)
The point is that you are coming up with requirements that do not match the capabilities of Python's built-in dictionaries.
If you want to have nested maps that do this kind of automated "splitting" of a single key string like "masks, id, X" then ... you will have to implement that yourself.
In other words: the answer is - the built-in dictionary can't do that for you.
So, the "real" thing to do here: step back and carefully look into your requirements to understand exactly what you want to do; and why you want to do that. And going from there look for the best design to support that.
From an implementation side, I think what you "need" would roughly look like:
check if the provided "key" matches "key1,key2,key3"
if so, split that key into its sub-keys
then check if the "outer" dictionary has a value for key1
then check, if the value for key1 is a dictionary
then check if that "inner" dictionary has a value for key2
...
and so on.
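The steps above could be sketched roughly as follows. This is only one possible shape: the get_nested name, the sep parameter, and the choice to raise KeyError for any failed step are assumptions, not part of the original answer.

```python
def get_nested(data, path, sep=','):
    """Follow a separator-joined key path through nested dicts.

    Raises KeyError if a key is missing or if the path tries to go
    through a value that is not a dict.
    """
    current = data
    for key in path.split(sep):
        key = key.strip()  # tolerate "masks, id" as well as "masks,id"
        if not isinstance(current, dict):
            raise KeyError(key)
        current = current[key]
    return current


data = {"masks": {"id": "valore"}}
print(get_nested(data, "masks,id"))   # valore
print(get_nested(data, "masks"))      # {'id': 'valore'}
```

A path such as "masks,id,X" raises KeyError here, because "valore" is a string and cannot be descended into.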
So here's the problem: I'm importing a dictionary with anywhere from 6000 to 12000 keys, then using a nested for loop to group them into a list inside another dictionary. I'm using the following code to check if the key is in the dictionary:
for key in range(sizeOfOriginalKeys):
    if key in key_data:
As you might imagine, this is taking forever since the sorting algorithm is fairly complex. I would like to only iterate through the keys in 'key_data', without doing 1000 to 11999 checks if there is that key in the dictionary. Is there a way to make a list of current keys? Then iterate through them? Or at least something more efficient than what I'm currently doing?
Current Code after Kevin's suggestion:
for key in key_data:
    currentKey = key_data[key].name
    if key_data[currentKey].prefList[currentPref] == currentGroup:
        key_data[currentKey].currentScore = getDotProduct()
        group_data[currentGroup].keyList.append(key_data[currentKey])
        group_data[currentGroup].sortKeys()
        del key_data[currentKey]
The key names are integers.
At the end of the sorting algorithm I delete the key, if it's been sorted into a group.
Now I get an error: dictionary changed size during iteration.
Thoughts?
You're trying too hard:
for key in key_data:
You can try:
for key, value in key_data.items():
    print(key)
    print(value)
This way you can access the value without calling key_data[key].
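As for the "dictionary changed size during iteration" error mentioned in the question, one common way around it is to loop over a snapshot of the keys rather than over the dictionary itself. A minimal sketch with simplified, hypothetical data (plain strings instead of the question's objects):

```python
# Hypothetical stand-in for key_data: integer keys mapped to a group name.
key_data = {1: 'red', 2: 'blue', 3: 'red', 4: 'green'}
grouped = []

# list(key_data) takes a snapshot of the keys, so deleting entries
# inside the loop does not disturb the iteration.
for key in list(key_data):
    if key_data[key] == 'red':
        grouped.append(key)
        del key_data[key]

print(grouped)    # [1, 3]
print(key_data)   # {2: 'blue', 4: 'green'}
```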
So I have some dicts which looks something like this
clue={"number":set([i,j]), "var2":x[i]<0.03 or x[j]<0.03, "var3":x[i]>0.97 or x[j]>0.97, "var4":v4}
etc.
clue={"number":set([k,l]), "var2":x[i]<0.03 or x[j]<0.03, "var3":x[i]>0.97 or x[j]>0.97, "var4":v4}
etc.
I created a list of these dictionaries list(clue) because I needed to sort them and join some of the values together, so for example:
clue={"number":set([i,j,k,l]), "var2":True, "var3":False, "var4":v4}
etc.
clue={"number":{}, "var2":True, "var3":False, "var4":v4}
etc.
Now because I have a list(clue) the following operation becomes difficult: I want to delete all dict entries from my list(clue) which have the set {} for number. i.e. The dicts with "number:{}" are worthless (regardless of the other key, values) and are just cluttering up everything else and I want rid of them.
I would like to make it clear that I want to get rid of the whole dict entry with number:{} rather than just the key from that specific dict.
Using Python 2.7
Thanks
I want to delete all dict entries from my list(clue) which have the
set {} for number. i.e. The dicts with "number:{}" are worthless
(regardless of the other key, values) and are just cluttering up
everything else and I want rid of them.
filtered_clue = filter(lambda x: len(x['number']), clue)
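In Python 3, filter() returns an iterator rather than a list, so a list comprehension may be a more portable way to write the same thing. A small sketch with made-up sample data shaped like the dicts in the question:

```python
# Hypothetical sample data shaped like the dicts in the question.
clue = [
    {"number": {1, 2, 3, 4}, "var2": True, "var3": False},
    {"number": set(), "var2": True, "var3": False},
    {"number": {}, "var2": False, "var3": True},
]

# An empty set and an empty dict are both falsy, so either is dropped.
filtered_clue = [d for d in clue if d["number"]]

print(len(filtered_clue))                   # 1
print(sorted(filtered_clue[0]["number"]))   # [1, 2, 3, 4]
```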
I know it is easy to implement.
I want a dictionary-like class that takes a list of dictionaries in its constructor.
If you read from this dict by key, the class should check the list of dictionaries and return the first value found. If none contains the key, a KeyError should be raised, as with a normal dict.
This dictionary container only needs to be read-only for my use case.
You seem to be describing collections.ChainMap, which will be in the next version of Python (3.3, expected to go final later this year). For current/earlier versions of Python, you can copy the implementation from the collections source code.
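A short sketch of how collections.ChainMap behaves on Python 3.3 or later, using made-up dictionaries. Lookups search the mappings in order and return the first match:

```python
from collections import ChainMap

# Hypothetical input: a list of dictionaries, searched left to right.
dicts = [{'a': 1}, {'a': 10, 'b': 2}, {'c': 3}]
chain = ChainMap(*dicts)

print(chain['a'])   # 1
print(chain['b'])   # 2
print(chain['c'])   # 3
```

Note that ChainMap is not read-only: writes and deletions go to the first mapping in the chain.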
Not really an answer to the question: what if you just define a function that merges all the dictionaries into one? Why make a new class for it?
How to merge: How to merge two Python dictionaries in a single expression?
Varargs: Can a variable number of arguments be passed to a function?
You can easily implement this with the following logic.
Iterate over all the dictionaries in the list.
For each dictionary, check whether it has the required key by using the "key in dictionary" test.
If the key is found, return its value from the function.
If you have iterated over all the dictionaries and the key is not found, raise a KeyError exception.
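The logic above might be sketched like this in Python 3. The FirstMatchDict name is hypothetical, and subclassing collections.abc.Mapping is one possible design choice: it supplies get(), "in", keys() and so on, while leaving the object read-only because no __setitem__ is defined.

```python
from collections.abc import Mapping


class FirstMatchDict(Mapping):
    """Read-only view over a list of dicts; the first dict with the key wins."""

    def __init__(self, dicts):
        self._dicts = list(dicts)

    def __getitem__(self, key):
        for d in self._dicts:
            if key in d:
                return d[key]
        raise KeyError(key)

    def __iter__(self):
        # Yield each distinct key once, in first-seen order.
        seen = set()
        for d in self._dicts:
            for key in d:
                if key not in seen:
                    seen.add(key)
                    yield key

    def __len__(self):
        return sum(1 for _ in self)


lookup = FirstMatchDict([{'a': 1}, {'a': 10, 'b': 2}])
print(lookup['a'])      # 1
print(lookup['b'])      # 2
print('z' in lookup)    # False
```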
I'm trying to construct a dictionary that contains a series of sets:
{Field1:{Value1, Value2, Value3}, Field2:{Value4}}
The trouble is, I then wish to delete any fields from the dictionary that only have one value in the set. I have been writing code like this:
for field in FieldSet:
    if len(FieldSet[field]) == 1:
        del(FieldSet[field])
But I receive the error "RuntimeError: dictionary changed size during iteration". (Not surprising, since that's what I'm doing.) It's not the be-all and end-all if I have to knock together some sort of workaround, but is it possible to do this?
Iterate over the return value from .keys() instead. Since you get a list of keys back in Python 2, it won't be affected by changing the dictionary after you've called it. (In Python 3, .keys() returns a view of the dictionary, so wrap it in list() first.)
An alternative to changing FieldSet in place, sometimes preferable depending on the number of alterations performed, is to build a new one and bind it to the existing name:
FieldSet = dict((k, v) for k, v in FieldSet.iteritems()
                if len(v) != 1)
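In Python 3 the same idea can be written as a dict comprehension over .items(). A small sketch with hypothetical field names and values:

```python
# Hypothetical field names and values.
field_set = {
    'Field1': {'Value1', 'Value2', 'Value3'},
    'Field2': {'Value4'},
}

# Build a new dict without the single-value fields and rebind the name.
field_set = {k: v for k, v in field_set.items() if len(v) != 1}

print(sorted(field_set))   # ['Field1']
```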
There is the pop method. It removes the entry for a given key and returns its value. With respect to your example this looks like:
# list() takes a snapshot of the keys, so popping is safe in Python 3 as well.
for field in list(FieldSet.keys()):
    if len(FieldSet[field]) == 1:
        FieldSet.pop(field)
This is in Python 3.2, but I'm not sure if it's a new feature:
http://docs.python.org/dev/library/stdtypes.html#dict.pop
Just tried it and it works as advertised.