Python: Dynamically update a dictionary with varying variable "depth"

I have a dictionary with various variable types - from simple strings to other nested dictionaries several levels deep. I need to create a pointer to a specific key:value pair so it can be used in a function that would update the dictionary and could be called like so:
dict_update(my_dictionary, value, level1key, *level2key....)
The data comes from a web request like this:
data = {
    'edited-fields': ['level1key-level2key-level3key', 'level1key-level2key-listindex', 'level1key'],
    'level1key-level2key-level3key': 'value1',
    'level1key-level2key-listindex': 'value2',
    'level1key': 'value3'
}
I can get to the original value to read it like this:
for field in data["edited-fields"]:
    args = field.split("-")
    value = my_dictionary
    for arg in args:
        if arg.isdigit():
            arg = int(arg)
        value = value[arg]
    print(value)
But I have no idea how to edit it using the same logic. I can't search and replace by the value itself, as there can be duplicates, and having several if statements for each possible arg count doesn't feel very Pythonic.
EXAMPLE:
data = {
    'edited-fields': ['mail-signatures-work', 'mail-signatures-personal', 'mail-outofoffice', 'todo-pending-0'],
    'mail-signatures-work': "I'm Batman",
    'mail-signatures-personal': 'Bruce, Wayne corp.',
    'mail-outofoffice': 'false',
    'todo-pending-0': 'Call Martha'
}
I'd like to process that request like this:
for field in data['edited-fields']:
    update_batman_db(field, data[field])
def update_batman_db(key_to_parse, value):
    # how-to?
    # key_to_parse ('mail-signatures-work') -> batman_db pointer ["mail"]["signatures"]["work"]
    # etc:
    batman_db["mail"]["signatures"]["work"] = value
    batman_db["mail"]["outofoffice"] = value  # one less level
    batman_db["todo"]["pending"][0] = value  # list index

The hard part here is knowing whether an index must be used as a string for a mapping or as an integer for a list.
I would first try to process it as an integer index on a list, and fall back to a string key of a mapping if that fails:
def update_batman_db(key, value):
    keys = key.split('-')  # parse the received key
    ix = batman_db  # initialize a "pointer" to the topmost item
    for key in keys[:-1]:  # process up to the last key item
        try:  # descending in the structure
            i = int(key)
            ix = ix[i]
        except (ValueError, LookupError):  # not an integer, or no such integer index
            ix = ix[key]
    try:  # assign the value with the last key item
        i = int(keys[-1])
        ix[i] = value
    except (ValueError, LookupError):
        ix[keys[-1]] = value
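As an alternative sketch, the type of the container at each level can decide how a key part is interpreted, instead of relying on exceptions. The helper name set_by_path and the sample batman_db contents below are assumptions made for illustration; they are not part of the original question.

```python
def set_by_path(root, path, value, sep="-"):
    """Set root[k1][k2]...[kn] = value, where path is 'k1-k2-...-kn'.

    Hypothetical helper: a key part is converted to int only when the
    container at that level is a list.
    """
    def as_key(container, part):
        return int(part) if isinstance(container, list) else part

    *parents, last = path.split(sep)
    node = root
    for part in parents:  # walk down to the parent container
        node = node[as_key(node, part)]
    node[as_key(node, last)] = value


# Sample data (assumed shape, based on the example request above)
batman_db = {
    "mail": {"signatures": {"work": "", "personal": ""}, "outofoffice": "true"},
    "todo": {"pending": ["Buy milk"]},
}
set_by_path(batman_db, "mail-signatures-work", "I'm Batman")
set_by_path(batman_db, "mail-outofoffice", "false")
set_by_path(batman_db, "todo-pending-0", "Call Martha")
```

With this approach a dictionary key such as "0" stays a string, and a list index is always an integer, so no fallback is needed.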

Related

Python - process only new elements of a dictionary

I have a dictionary with unique keys that I update by adding new keys, depending on data from a webpage.
I want to process only the new keys, which may appear after a long time. Here is a piece of code to illustrate:
a = UniqueDict()
while 1:
    webpage = update()  # returns a list
    for i in webpage:
        title = getTitle(i)
        a[title] = new_value  # only new titles are added, because it's a unique dictionary
    if len(a) > 50:
        a.clear()  # just to clear the dictionary if it gets too big
    # Condition before entering this loop to process only the newly entered titles
    for element in a.keys():
        process(element)
Is there a way to find only the new keys added to the dictionary? Most of the time it will contain the same keys and values, and I don't want those to be processed again.
Thank you.
You could also keep the processed keys in a set.
Then you can check for new keys by using set(d.keys()) - set_already_processed.
And add processed keys using set_already_processed.add(key)
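A minimal sketch of that set-based approach follows. The helper name process_new_keys, the handle callback, and the sample titles are assumptions for illustration only.

```python
processed = set()  # keys that have already been handled


def process_new_keys(d, handle):
    """Call handle(key) for each key of d not seen before (hypothetical helper)."""
    new_keys = set(d) - processed
    for key in new_keys:
        handle(key)
        processed.add(key)
    return new_keys


seen = []
titles = {"a": 1, "b": 2}
process_new_keys(titles, seen.append)   # handles "a" and "b"
titles["c"] = 3
fresh = process_new_keys(titles, seen.append)  # handles only "c"
```

Because the set lives outside the dictionary, keys stay marked as processed even after the dictionary itself is cleared.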
You may want to use an OrderedDict:
Ordered dictionaries are just like regular dictionaries but they remember the order that items were inserted. When iterating over an ordered dictionary, the items are returned in the order their keys were first added.
Make your own dict that tracks additions:
class NewKeysDict(dict):
    """A dict, but tracks keys that are added through __setitem__
    only. reset() resets tracking to begin tracking anew. self.new_keys
    is a set holding your keys.
    """
    def __init__(self, *args, **kw):
        super(NewKeysDict, self).__init__(*args, **kw)
        self.new_keys = set()
    def reset(self):
        self.new_keys = set()
    def __setitem__(self, key, value):
        super(NewKeysDict, self).__setitem__(key, value)
        self.new_keys.add(key)
d = NewKeysDict((i, str(i)) for i in range(10))
d.reset()
print(d.new_keys)
for i in range(5, 10):
    d[i] = '{} new'.format(i)
for k in d.new_keys:
    print(d[k])
(because most of the time, it will be the same keys and values so I don't want them to be processed)
You are overcomplicating this.
Dictionary keys are immutable and unique.
Each key is followed by its value, separated by a colon.
d = {"title": title}
text = "textdude"
d["keytext"] = text
This adds the value "textdude" under the new key "keytext".
To check whether a key is present, use "in":
"keytext" in d
It returns True.

Can anyone explain to me what a two-level dictionary is in Python

I am struggling to find any documentation anywhere on what this actually is. I understand an ordinary dictionary: it consists of key and value pairs, so if you look up a key its corresponding value is returned. For example:
myDict = {'dog': 'fido', 'cat': 'tiddles', 'fish': 'bubbles', 'rabbit': 'thumper'}
And then you can look up values and test for keys like this:
myDict['fish']
returns
'bubbles'
or
'cat' in myDict
returns
True
How would a two-level dictionary compare to this?
It appears nested dictionaries were what I was looking for.
One more question: say I have a nested dictionary which links words to text files, where the first integer is the number of the text file and the second is the number of occurrences:
myDict = {'through': {1: 18, 2: 27, 3: 2, 4: 15, 5: 63}, 'one': {1: 27, 2: 15, 3: 24, 4: 9, 5: 32}, 'clock': {1: 2, 2: 5, 3: 9, 4: 6, 5: 15}}
How would I use the file numbers to work out the total number of text files that were present? That is, is there a way of extracting the number of key/value pairs in the inner dictionary?
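One possible way to count them, as a sketch: len() gives the number of key/value pairs of a single inner dictionary, and a set can collect the distinct file numbers across all words. The variable names files_for_through, all_files, and total_files are made up for this example.

```python
myDict = {
    'through': {1: 18, 2: 27, 3: 2, 4: 15, 5: 63},
    'one': {1: 27, 2: 15, 3: 24, 4: 9, 5: 32},
    'clock': {1: 2, 2: 5, 3: 9, 4: 6, 5: 15},
}

# Number of key/value pairs in one inner dictionary
files_for_through = len(myDict['through'])

# Distinct file numbers across all words (a word may be missing from some files)
all_files = set()
for counts in myDict.values():
    all_files.update(counts)  # iterating a dict yields its keys
total_files = len(all_files)
```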
I guess a two-level dictionary could be a dictionary of dictionaries, i.e.
d = {'a': {"cool": 1, "dict": 2}}
You could use it like
d['a']['cool']
>> 1
so you can do
'cool' in d['a']
>> True
It is just a nested dictionary, meaning it contains other dictionaries, for example
d = {'mike':{'age':10, 'gender':'male'}, 'jen':{'age':12, 'gender':'female'}}
I can access inner values such as
>>> d['mike']['age']
10
Common examples of deeply nested dictionaries in Python come from reading and writing JSON.
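As a small illustration of the JSON case, using only the standard json module (the JSON text below simply mirrors the example dictionary above):

```python
import json

text = '{"mike": {"age": 10, "gender": "male"}, "jen": {"age": 12, "gender": "female"}}'

d = json.loads(text)           # parsing yields nested dictionaries
age = d["mike"]["age"]         # read an inner value
d["jen"]["age"] = 13           # update an inner value

round_trip = json.loads(json.dumps(d))  # serialize and parse again
```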
I believe you mean a two-way dictionary, here's a recipe (from Two way/reverse map):
class TwoWayDict(dict):
    def __setitem__(self, key, value):
        # Remove any previous connections with these values
        if key in self:
            del self[key]
        if value in self:
            del self[value]
        dict.__setitem__(self, key, value)
        dict.__setitem__(self, value, key)
    def __delitem__(self, key):
        dict.__delitem__(self, self[key])
        dict.__delitem__(self, key)
    def __len__(self):
        """Returns the number of connections"""
        return dict.__len__(self) // 2
Usage:
myDict = {'dog': 'fido', 'cat': 'tiddles', 'fish': 'bubbles', 'rabbit': 'thumper'}
twowaydict = TwoWayDict()  # can't instantiate with the old dict, need to use setitem
for key in myDict:
    twowaydict[key] = myDict[key]
'tiddles' in twowaydict
returns True
If you mean a dict of dicts, that's a fairly common construct, where the values of the containing dict are also dicts.
dofd = {'key1': {'subkey1': 'value1,1', 'subkey2': 'value1,2'},
        'key2': {'subkey1': 'value2,1', 'subkey2': 'value2,2'}
        }
and you'd access the internal values like this:
dofd['key1']['subkey2']
which should return 'value1,2'.

python: badly behaving dict inside a function- erroneous TypeError

I have dicts that I need to clean, e.g.
dict = {
    'sensor1': [list of numbers from sensor 1 pertaining to measurements on different days],
    'sensor2': [list of numbers from sensor 2 pertaining to measurements on different days],
    etc. }
Some days have bad values, and I would like to generate a new dict in which all the sensor values from those bad days are removed, by applying an upper limit to the values of one of the keys:
def clean_high(dict_name, key_string, limit):
    '''clean all the keys to eliminate the bad values from the arrays'''
    new_dict = dict_name
    for key in new_dict:
        new_dict[key] = new_dict[key][new_dict[key_string] < limit]
    return new_dict
If I run all the lines separately in IPython, it works. The bad days are eliminated, and the good ones are kept. These are both type numpy.ndarray: new_dict[key] and new_dict[key][new_dict[key_string]<limit]
But, when I run clean_high(), I get the error:
TypeError: only integer arrays with one element can be converted to an index
What?
Inside of clean_high(), the type for new_dict[key] is a string, not an array.
Why would the type change? Is there a better way to modify my dictionary?
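Do not modify a dictionary while iterating over it. According to the Python documentation: "Iterating views while adding or deleting entries in the dictionary may raise a RuntimeError or fail to iterate over all entries". Note also that new_dict = dict_name does not copy anything: both names refer to the same dictionary, so the loop overwrites the very array, dict_name[key_string], that the mask is built from. Instead, create a new dictionary and fill it while iterating over the old one.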
def clean_high(dict_name, key_string, limit):
    '''clean all the keys to eliminate the bad values from the arrays'''
    new_dict = {}
    for key in dict_name:
        new_dict[key] = dict_name[key][dict_name[key_string] < limit]
    return new_dict
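The same idea can be sketched without NumPy, assuming the values are plain Python lists of equal length. Here the mask is computed once from the untouched reference list, so it cannot change while the new dictionary is being built. The sample readings are invented for the example.

```python
def clean_high(data, key_string, limit):
    """Return a new dict keeping only the positions where data[key_string] < limit."""
    # Build the mask once, from the original (unmodified) reference list
    keep = [v < limit for v in data[key_string]]
    return {
        key: [v for v, ok in zip(values, keep) if ok]
        for key, values in data.items()
    }


# Hypothetical readings: the second day of sensor1 is a bad value
readings = {"sensor1": [1, 99, 3], "sensor2": [10, 20, 30]}
cleaned = clean_high(readings, "sensor1", 50)
```

The input dictionary is left untouched, so the function can be called again with a different key or limit.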

Why isn't all the data being stored?

I have a dictionary carrying key:value pairs; however, it only saves the last iteration and discards the previous entries. Where is it being reset? This is the output showing the iteration counter (CTR) and the length of the dictionary:
Return the complete Term and DocID Ref.
LENGTH:6960
CTR:88699
My code:
class IndexData:
    def getTermDocIDCollection(self):
        ...............
        for term in terms:
            #TermDocIDCollection[term] = sourceFile['newid']
            TermDocIDCollection[term] = []
            TermDocIDCollection[term].append(sourceFile['newid'])
        return TermDocIDCollection
The piece of code you've commented out does the following:
1. Sets a value to the key (removing whatever was there before, if it existed)
2. Sets a new value to the key (an empty list)
3. Appends the value set in step 1 to the new empty list
Sadly, it would do the same each iteration, so you'd end up with [last value] assigned to the key. The new code (with update) does something similar. In the old days you'd do this:
if term in TermDocIDCollection:
    TermDocIDCollection[term].append(sourceFile['newid'])
else:
    TermDocIDCollection[term] = [sourceFile['newid']]
or a variation of the theme using try-except. After collections was added you can do this instead:
from collections import defaultdict
# ... code...
TermDocIDCollection = defaultdict(list)
and you'd update it like this:
TermDocIDCollection[term].append(sourceFile['newid'])
There is no need to check whether term exists in the dictionary. If it doesn't, the defaultdict type will first call the constructor you passed (list) to create the initial value for the key.
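A self-contained sketch of the defaultdict approach follows. The (term, doc_id) pairs are hypothetical stand-ins for the terms and sourceFile['newid'] values in the original code.

```python
from collections import defaultdict

# Hypothetical (term, doc_id) pairs standing in for the parsed source files
pairs = [("apple", 1), ("pear", 1), ("apple", 2), ("apple", 3)]

term_doc_ids = defaultdict(list)
for term, doc_id in pairs:
    # A missing key is created with an empty list before append runs
    term_doc_ids[term].append(doc_id)
```

Every document ID is kept per term, because the list is created only once for each key rather than being reset on every iteration.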

converting variable like string content to real variable in python

I have a string variable whose content looks like a series of variable assignments, as shown below.
str1="type=gene; loc=scaffold_12875; ID=FBgn0207418; name=Dvir\GJ20278;MD5=4c62b751ec045ac93306ce7c08d254f9; length=2088; release=r1.2; species=Dvir;"
I need to make variables out of the string, such that the variable names and values look like this:
type="gene"
loc="scaffold_12875"
ID="FBgn0207418"
name="Dvir\GJ20278"
MD5="4c62b751ec045ac93306ce7c08d254f9"
length=2088
release="r1.2"
species="Dvir"
Thanks for the help in advance.
Don't do this. You could, but don't.
Instead make a dictionary whose keys are the names:
result_dict = {}
for item in str1.split(';'):
    item = item.strip()
    if not item:  # skip the empty piece after the trailing ';'
        continue
    key, value = item.split('=', 1)
    result_dict[key] = value
Or you could do this:
class Namespace(object):
    pass
for item in str1.split(';'):
    item = item.strip()
    if item:  # skip the empty piece after the trailing ';'
        key, value = item.split('=', 1)
        setattr(Namespace, key, value)
You can then access your variables like so
Namespace.length
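Since the desired output shows length as a number rather than a string, the dictionary approach could be extended as in the sketch below. The function name parse_fields is an assumption, and converting only all-digit values to int is one possible rule, not something the question specifies.

```python
# Raw string so the backslash in the name field is kept literally
str1 = r"type=gene; loc=scaffold_12875; ID=FBgn0207418; name=Dvir\GJ20278;MD5=4c62b751ec045ac93306ce7c08d254f9; length=2088; release=r1.2; species=Dvir;"


def parse_fields(text):
    """Parse 'key=value;' pairs into a dict (hypothetical helper)."""
    result = {}
    for item in text.split(';'):
        item = item.strip()
        if not item:  # skip the empty piece after the trailing ';'
            continue
        key, value = item.split('=', 1)
        # Assumed rule: values made only of decimal digits become integers
        result[key] = int(value) if value.isdecimal() else value
    return result


fields = parse_fields(str1)
```

Values are then read with ordinary lookups such as fields["length"] or fields["species"].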
