Index value confusion in python - python

Hey all, I am new to Python programming and I have noticed some code which is really confusing me.
import collections
s = 'mississippi'
d = collections.defaultdict(int)
for k in s:
    d[k] += 1
d.items()
The thing I need to know is the use of d[k] here. I know k is a character in the string s, but I don't understand what d[k] returns. In defaultdict(int), a new value is created if the dictionary has no value for the key.
Please help me; any help would be appreciated. Thanks.

Dictionaries in Python are "mapping" types. (This applies to both regular dict dictionaries and the more specialized variations like defaultdict.) A mapping takes a key and "maps" it to a value. The syntax d[k] is used to look up the key k in the dictionary d. Depending on where it appears in your code, it can have slightly different semantics (either returning the existing value for the key or setting a new one).
In your example, you're using d[k] += 1, which increments the value under key k in the dictionary. Since integers are immutable, it actually breaks out into d[k] = d[k] + 1. The right side d[k] does a look up of the value in the dictionary. Then it adds one and, using the d[k] on the left side, assigns the result into the dictionary as a new value.
defaultdict changes things a bit in that keys that don't yet exist in the dictionary are treated as if they did exist. The argument to its constructor is a "factory" object which will be called to create the new values when an unknown key is requested.
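To make the behaviour of d[k] concrete, here is a small sketch (written for Python 3) that counts the letters of the same string with a defaultdict and then tries the same thing with a plain dict. The names counts and plain are only for illustration.

```python
from collections import defaultdict

s = 'mississippi'

# defaultdict(int) calls int() (which returns 0) for any missing key,
# so counts[k] += 1 works even the first time a letter is seen.
counts = defaultdict(int)
for k in s:
    counts[k] += 1

print(sorted(counts.items()))

# Merely reading a missing key also creates it with the default value.
print(counts['z'])

# A plain dict has no factory, so the same increment raises KeyError.
plain = {}
try:
    plain['m'] += 1
except KeyError as exc:
    print('KeyError for key', exc)
```

Note that the lookup counts['z'] inserts a new entry as a side effect; if that is not wanted, counts.get('z', 0) reads without inserting.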

Here you go
d[key]
Return the item of d with key key. Raises a KeyError if key is not in the map.
Straight from the Python docs, under mapping types.
Go to https://docs.python.org/ and bookmark it. It will become your best friend.

Related

can this code be written in two lines?

I have a feeling that, using the setdefault method or a lambda, this code can be written in two lines:
variables = ['a','b','c','d']
for value in indefinite_dict.values():
    str1 = immediate_instantiation.get(value)
    if str1 == None:
        immediate_instantiation.update({value:variables[0]})
        del variables[0]
It loops through the values of indefinite_dict. If a value is not already a key in the immediate_instantiation dict, it adds that value as a key whose value is the first member of the variables list, then deletes that first member from the list.
If you’re okay with values in variables being deleted even if a corresponding key already exists in immediate_instantiation when that key has the same value, you’re right that you can do it with only setdefault:
for value in indefinite_dict.values():
    if immediate_instantiation.setdefault(value, variables[0]) is variables[0]:
        del variables[0]
To get it down to two lines without any other context takes a different (and kind of unpleasant) approach, though:
updates = (v for v in indefinite_dict.values() if v not in immediate_instantiation)
immediate_instantiation.update({v: variables.pop(0) for v in updates})
And indefinite_dict had better be an OrderedDict – otherwise you’re removing variables in a potentially random order! (Don’t rely on Python 3.6’s dict representation for this.)
If you don’t need variables after this and variables is guaranteed to be at least as long as updates, a non-mutating solution is much cleaner, note:
updates = (v for v in indefinite_dict.values() if v not in immediate_instantiation)
immediate_instantiation.update(zip(updates, variables))
One line solution:
immediate_instantiation.update({value: variables.pop(0) for value in indefinite_dict.values() if value not in immediate_instantiation})
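One caveat worth checking before picking a version: if indefinite_dict contains the same value more than once, the comprehension-based versions test membership in immediate_instantiation before any update has happened, so they can pop more than one variable for the same value. The sketch below (Python 3, with made-up sample data) compares a loop like the one in the question with the one-liner. The helper names loop_version and one_liner are hypothetical, and both work on copies so the inputs can be reused.

```python
def loop_version(indefinite_dict, immediate_instantiation, variables):
    """Mirrors the loop in the question (works on copies)."""
    result, pool = dict(immediate_instantiation), list(variables)
    for value in indefinite_dict.values():
        if value not in result:
            result[value] = pool.pop(0)
    return result, pool


def one_liner(indefinite_dict, immediate_instantiation, variables):
    """Mirrors the one-line dict comprehension (works on copies)."""
    result, pool = dict(immediate_instantiation), list(variables)
    result.update({value: pool.pop(0)
                   for value in indefinite_dict.values()
                   if value not in result})
    return result, pool


# Made-up sample data: the value 'p' occurs twice.
indefinite_dict = {'x': 'p', 'y': 'q', 'z': 'p'}
immediate_instantiation = {'q': 'already set'}
variables = ['a', 'b', 'c', 'd']

loop_result = loop_version(indefinite_dict, immediate_instantiation, variables)
comp_result = one_liner(indefinite_dict, immediate_instantiation, variables)
print(loop_result)
print(comp_result)
```

With unique values the two versions agree. With repeats, the loop consumes one variable per distinct value, while the comprehension consumes one per occurrence and keeps the last one popped.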

What's the fastest way to identify the 'name' of a dictionary that contains a specific key-value pair?

I'd like to identify the dictionary within the following list that contains the key-value pair 'Keya':'123a', which is ID1 in this case.
lst = {ID1:{'Keya':'123a','Keyb':456,'Keyc':789},ID2:{'Keya':'132a','Keyb':654,'Keyc':987},ID3:{'Keya':'5433a','Keyb':222,'Keyc':333},ID4:{'Keya':'444a','Keyb':777,'Keyc':666}}
It's safe to assume all dictionaries have the same keys, but have different values.
I currently have the following to identify which dictionary has the value '123a' for the key 'Keya', but is there a shorter and faster way?
DictionaryNames = map(lambda Dict: str(Dict),lst)
Dictionaries = [i[1] for i in lst.items()]
Dictionaries = map(lambda Dict: str(Dict),Dictionaries)
Dict = filter(lambda item:'123a' in item,Dictionaries)
val = DictionaryNames[Dictionaries.index(Dict[0])]
return val
If you actually had a list of dictionaries, this would be:
next(d for d in list_o_dicts if d[key]==value)
Since you actually have a dictionary of dictionaries, and you want the key associated with the dictionary, it's:
next(k for k, d in dict_o_dicts.items() if d[key]==value)
This returns the first matching value. If you're absolutely sure there is exactly one, or if you don't care which you get if there are more than one, and if you're happy with a StopIteration exception if you were wrong and there isn't one, that's exactly what you want.
If you need all matching values, just do the same with a list comprehension:
[k for k, d in dict_o_dicts.items() if d[key]==value]
That list can of course have 0, 1, or 17 values.
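If the StopIteration case is a concern, next also accepts a default as its second argument. Here is a minimal Python 3 sketch using a cut-down version of the data from the question; the ID keys are written as strings here, which is an assumption, since the question leaves them unquoted.

```python
# Cut-down sample data; string IDs are an assumption for this sketch.
dict_o_dicts = {
    'ID1': {'Keya': '123a', 'Keyb': 456, 'Keyc': 789},
    'ID2': {'Keya': '132a', 'Keyb': 654, 'Keyc': 987},
}
key = 'Keya'

# First match, or None instead of StopIteration when nothing matches.
first = next((k for k, d in dict_o_dicts.items() if d[key] == '123a'), None)
missing = next((k for k, d in dict_o_dicts.items() if d[key] == 'nope'), None)

# All matches, as a list.
all_matches = [k for k, d in dict_o_dicts.items() if d[key] == '123a']

print(first, missing, all_matches)
```

The generator expression needs its own parentheses once a second argument is passed to next.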
You can just do [name for name, d in lst.iteritems() if d['Keya']=='123a'] to get a list of all the dictionaries in lst that have that value for that key. If you know there is only one, you can get it with [name for name, d in lst.iteritems() if d['Keya']=='123a'][0]. (As Andy mentions in a comment, your name lst is misleading, since lst is actually a dictionary of dictionaries, not a list.)
Since you want the fastest, you should short-cut your search as soon as you find the data you are after. Iterating through the whole list is not necessary, nor is producing any temporary dictionary:
for key,data in lst.iteritems():
    if data['Keya']=='123a':
        return key  # or break if not in a function
A different way to do this is to use the appropriate data structure: Keep a "reverse map" of key-value pairs to names. If your dictionary of dictionaries is static after being built, you can build the reverse dictionary like this:
revdict = {(key, value): name
           for name, subdict in dictodicts.items()
           for key, value in subdict.items()}
If not, you just need to add revdict[key, value] = name for each d[name][key] = value statement and build them up in parallel.
Either way, to find the name of the dict that maps key to value, it's just:
revdict[key, value]
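As a quick worked example of the reverse map, here is a Python 3 sketch with a cut-down version of the question's data (string IDs are again an assumption). Keep in mind that if two sub-dictionaries share the same key-value pair, the later one silently overwrites the earlier one in revdict.

```python
# Cut-down sample data; string IDs are an assumption for this sketch.
dictodicts = {
    'ID1': {'Keya': '123a', 'Keyb': 456, 'Keyc': 789},
    'ID2': {'Keya': '132a', 'Keyb': 654, 'Keyc': 987},
}

# Build the reverse map once: (key, value) -> name of the sub-dictionary.
revdict = {(key, value): name
           for name, subdict in dictodicts.items()
           for key, value in subdict.items()}

# Each lookup is now a single hash lookup instead of a scan.
print(revdict['Keya', '123a'])

# Use .get() when the pair might not exist.
print(revdict.get(('Keya', 'nope')))
```

Building the map costs one pass over all the data, so it pays off when there are many lookups against data that rarely changes.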
For (a whole lot) more information (than you actually want), and some sample code for wrapping things up in different ways… I dug up an unfinished blog post, considered editing it, and decided to not bother and just clicked Publish instead, so: Reverse dictionary lookup and more, on beyond z.

Python: compare keys of 2 dictionaries and update the value of one

I have 2 dictionaries with the same keys:
d={123:'bla', 456:'blabla'}
e={123:'bla', 456:''}
Dictionary e has some empty values and if this is the case and the IDs are the same I'd like to replace the empty value with the value from d.
I am looking for something like this:
if d.keys()==e.keys():
    e.values==d.values()
print e
However, I couldn't find anything in the Python documentation that compares single keys and updates the values.
Can someone help or point me at something?
Thanks :)
You can do a straight update if you don't mind overwriting differing values with
e.update(d)
or if you want to make sure you only overwrite the ones containing empty values then you will need to iterate over your dictionary to find them, and update selectively
# Python 2.x
for key, value in e.iteritems():
    if value == '':
        e[key] = d.get(key)
# Python 3.x
for key, value in e.items():
    if value == '':
        e[key] = d.get(key)
You can also use a dict comprehension:
f = {k:(e[k] or d[k]) for k in e.keys()}
Here the or operator evaluates to its second operand when the first is empty (falsy). Of course, you have to make sure both dictionaries use the same keys.
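Here is a small Python 3 sketch of both approaches using the dictionaries from the question. The extra dictionaries e2 and d2 are made up to show one difference: the or variant replaces every falsy value (0, None, empty containers), not just empty strings.

```python
d = {123: 'bla', 456: 'blabla'}
e = {123: 'bla', 456: ''}

# Selective update: only touch entries of e that are empty strings,
# and only when d actually has a value for that key.
for key, value in e.items():
    if value == '' and key in d:
        e[key] = d[key]

print(e)

# The "or" variant treats every falsy value as empty, not just ''.
e2 = {1: 0, 2: ''}
d2 = {1: 5, 2: 'x'}
f = {k: (e2[k] or d2[k]) for k in e2}
print(f)
```

Assigning to existing keys while iterating over e.items() is fine because the size of the dictionary does not change. The key in d check avoids writing None into e when d lacks the key, which is what d.get(key) would do.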

How do I trust the order of a Python dictionary?

I'm trying to make a dictionary in Python that I can sort through but it seems to change order when I add new things. Is there a way around this?
A standard dictionary does not impose an ordering; it's simply a lookup.
You want an ordered dictionary, such as collections.OrderedDict.
Python dicts are built as hash tables -- great performance, but ordering is essentially arbitrary and unpredictable. If your need for predictably-ordered walks is occasional, and based on keys or values, the sorted built-in is very handy:
# print all entries in sorted key order
for k in sorted(d): print k, d[k]
# print all entries in reverse-sorted value order
for k in sorted(d, key=d.get, reverse=True): print k, d[k]
# given all keys are strings, print in case-insensitive sorted order
for k in sorted(d, key=str.lower): print k, d[k]
and so forth. If your needs are different (e.g., keep track of the respective times at which keys are inserted, or their values altered, and so forth), the "ordered dictionaries" suggested in other answers will serve you better (but never with the awesome raw performance of a true dict!-).
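For reference, here is roughly how the same ideas look in Python 3, as a sketch with made-up data: the three sorted calls mirror the loops above, and collections.OrderedDict remembers insertion order.

```python
from collections import OrderedDict

d = {'banana': 3, 'apple': 1, 'Cherry': 2}

# Keys in plain sorted order (uppercase letters sort before lowercase).
by_key = sorted(d)

# Keys ordered by their values, largest first.
by_value_desc = sorted(d, key=d.get, reverse=True)

# Keys in case-insensitive order.
case_insensitive = sorted(d, key=str.lower)

for k in by_key:
    print(k, d[k])

# An OrderedDict iterates in the order the keys were inserted.
od = OrderedDict()
od['z'] = 1
od['a'] = 2
od['m'] = 3
print(list(od))
```

sorted builds a new list each time it is called, so it suits occasional ordered walks; for an order that must be maintained continuously, the OrderedDict is the better fit.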

Reversible dictionary for python

I'd like to store some data in Python in a similar form to a dictionary: {1:'a', 2:'b'}. Every value will be unique, not just among other values, but among keys too.
Is there a simple data structure that I can use to get the corresponding object no matter if I ask using the 'key' or the 'value'? For example:
>>> a = {1:'a', 2:'b'}
>>> a[1]
'a'
>>> a['b']
2
>>> a[3]
KeyError
The 'keys' are standard python ints, and the values are short (<256char) strings.
My current solution is creating a reversed dictionary and searching it if I can't find a result in the original dictionary:
pointsreversed = dict((v, k) for k, v in points.iteritems())
def lookup(key):
    return points.get(key) or pointsreversed.get(key)
This uses twice as much space, which isn't great (my dictionaries can be up to a few hundred megs) and is 50% slower on average.
EDIT: as mentioned in a few answers, two dicts don't double memory usage, as it's only the dictionary, not the items within, that is duplicated.
Is there a solution that improves on this?
If your keys and values are non-overlapping, one obvious approach is to simply store them in the same dict, i.e.:
class BidirectionalDict(dict):
    def __setitem__(self, key, val):
        dict.__setitem__(self, key, val)
        dict.__setitem__(self, val, key)
    def __delitem__(self, key):
        dict.__delitem__(self, self[key])
        dict.__delitem__(self, key)
d = BidirectionalDict()
d['foo'] = 4
print d[4] # Prints 'foo'
(You'll also probably want to implement things like the __init__, update and iter* methods to act like a real dict, depending on how much functionality you need).
This should only involve one lookup, though may not save you much in memory (you still have twice the number of dict entries after all). Note however that neither this nor your original will use up twice as much space: the dict only takes up space for the references (effectively pointers), plus an overallocation overhead. The space taken up by your data itself will not be repeated twice since the same objects are pointed to.
Related posts:
Python mapping inverse
Python 1:1 mappings
Of course, if all values and keys are unique, couldn't you just use a single dictionary, and insert both key:value and value:key initially?
In The Art of Computer Programming, Volume 3, Knuth has a section on lookups of secondary keys. For purposes of your question, the value could be considered the secondary key.
The first suggestion is to do what you have done: make an efficient index of the keys by value.
The second suggestion is to set up a large btree that is a composite index of the clustered data, where the branch nodes contain values and the leaves contain the key data and pointers to the larger record (if there is one).
If the data is geometric (as yours appears to be) there are things called post-office trees. They can answer questions like: what is the nearest object to point x? A few examples are here: http://simsearch.yury.name/russir/01nncourse-hand.pdf Other simple options for this kind of query are the quadtree and the k-d tree. http://en.wikipedia.org/wiki/Quadtree
A final option is combinatorial hashing, where you combine the key and value into a special kind of hash that lets you do efficient lookups on the hash, even when you don't have both values. I couldn't find a good combinatorial hash explanation online, but it is in TAoCP, Volume 3 Second Edition on page 573.
Granted, for some of these you may have to write your own code. But if memory or performance is really key, you might want to take the time.
It shouldn't use "twice the space". Dictionaries just store references to data, not the data itself. So, if you have a million strings taking up a billion bytes, then each dictionary takes maybe an extra 10-20 million bytes--a tiny fraction of the overall storage. Using two dictionaries is the right thing to do.
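A minimal Python 3 sketch of the two-dictionary approach, wrapped in a small class so the forward and reverse maps cannot drift apart. The class name TwoWayMap and its methods are hypothetical, not from any library.

```python
class TwoWayMap:
    """Hypothetical helper: keeps a forward and a reverse dict in sync."""

    def __init__(self):
        self._forward = {}   # key -> value
        self._reverse = {}   # value -> key

    def add(self, key, value):
        # Drop stale pairings first so the mapping stays one-to-one.
        if key in self._forward:
            del self._reverse[self._forward[key]]
        if value in self._reverse:
            del self._forward[self._reverse[value]]
        self._forward[key] = value
        self._reverse[value] = key

    def lookup(self, item):
        # Try the forward map, then the reverse map.
        if item in self._forward:
            return self._forward[item]
        return self._reverse[item]   # raises KeyError if unknown


points = TwoWayMap()
points.add(1, 'a')
points.add(2, 'b')
print(points.lookup(1))
print(points.lookup('b'))
```

Both dictionaries hold references to the same int and str objects, so the extra cost is only the second hash table. Using explicit membership tests rather than get(...) or get(...) also avoids surprises with falsy keys or values such as 0 or ''.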
Insert reversed pair of (key, value) into same dict:
a = {1:'a', 2:'b'}
a.update(dict((v, k) for k, v in a.iteritems()))
Then you will be able to do both, as you required:
print a[1]
print a['a']
Here's another solution using a user defined class.
And the code...
# search a dictionary for key or value
# using named functions or a class
# tested with Python25 by Ene Uran 01/19/2008
def find_key(dic, val):
    """return the key of dictionary dic given the value"""
    return [k for k, v in dic.iteritems() if v == val][0]
def find_value(dic, key):
    """return the value of dictionary dic given the key"""
    return dic[key]
class Lookup(dict):
    """
    a dictionary which can lookup value by key, or keys by value
    """
    def __init__(self, items=[]):
        """items can be a list of pair_lists or a dictionary"""
        dict.__init__(self, items)
    def get_key(self, value):
        """find the key(s) as a list given a value"""
        return [item[0] for item in self.items() if item[1] == value]
    def get_value(self, key):
        """find the value given a key"""
        return self[key]
I've been doing it this way for many years now. I personally like the simplicity of it more than the other solutions out there.
d = {1: 'a', 2: 'b'}
dict(zip(d.values(), d.keys()))
