Python Dictionary or Alternative - python

For instance,
if dict['sample']:
    pass  # append values to dict['sample']
else:
    pass  # assign a new key to the Python dictionary
If the key 'sample' does not exist in the dictionary, Python raises a KeyError. Does anyone know a better way to check for this?
What I want is this: I will have a list of data, say a,a,b,c,g,g,g,g,g.
I want the dictionary to append the two a values to dict['a'], the five g values to dict['g'], and likewise for the rest (dict['b'] and so on). A for loop will iterate over the data a,a,b,c,g,g,g,g,g.
I hope I've made my question clear. Any ideas? Preferably something built in that lets a Python dictionary check for an existing key.
EDIT
Credit goes to @Paul McGuire. I've figured out the exact solution I wanted, based on @Paul McGuire's answer, as shown below:
from collections import defaultdict
class Test:
    def __init__(self, a, b):
        self.a = a
        self.b = b
data = []
data.append(Test(a=4,b=6))
data.append(Test(a=1,b=2))
data.append(Test(a=1,b=3))
data.append(Test(a=2,b=2))
data.append(Test(a=3,b=2))
data.append(Test(a=4,b=5))
data.append(Test(a=4,b=2))
data.append(Test(a=1,b=2))
data.append(Test(a=5,b=9))
data.append(Test(a=4,b=7))
dd = defaultdict(list)
for c in data:
    dd[c.a].append(c.b)
print(dd)

The old approaches of "if key in dict" or "dict.get" or "dict.setdefault" should all be set aside in favor of the now standard defaultdict:
from collections import defaultdict
data = "aabcggggg"
dd = defaultdict(list)
for c in data:
    dd[c].append(c)
print(dd)
defaultdict takes care of checking for key existence for you. All your code has to do is 1) define a factory function or class for the defaultdict to use to initialize a new key entry (in this case list, as in defaultdict(list)), and 2) define what to do with each key (dd[c].append(c)). Clean, simple, no clutter.
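The factory does not have to be list. As a small sketch (written for Python 3, with an illustrative variable name counts that is not from the answer), defaultdict(int) counts occurrences, because int() returns 0 for a key that has not been seen yet:

```python
from collections import defaultdict

data = "aabcggggg"

# int() returns 0, so every new key starts at zero before being incremented.
counts = defaultdict(int)
for c in data:
    counts[c] += 1

print(dict(counts))  # {'a': 2, 'b': 1, 'c': 1, 'g': 5}
```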
In your particular example, you could actually use itertools.groupby, since the groups of letters are all contiguous for each grouping value. groupby returns an iterator of tuples, each tuple containing the current key value and an iterator of the matching values. The following code works by converting each tuple's list-of-values-iterator to an actual list (in list(v)), and passing the sequence of key-valuelist tuples to the dict constructor.
from itertools import groupby
print(dict((k, list(v)) for k, v in groupby(data)))
prints:
{'a': ['a', 'a'], 'c': ['c'], 'b': ['b'], 'g': ['g', 'g', 'g', 'g', 'g']}
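The contiguity requirement matters in practice. The following sketch (Python 3, with a made-up input string) shows what happens when equal items are not adjacent, and how sorting first restores one group per key:

```python
from itertools import groupby

data = "gabgag"  # hypothetical input where equal letters are not adjacent

# Without sorting, each run becomes its own group, and later runs overwrite
# earlier ones when the pairs are fed to dict().
unsorted_groups = dict((k, list(v)) for k, v in groupby(data))

# Sorting first makes equal items contiguous, so each key gets one full group.
sorted_groups = dict((k, list(v)) for k, v in groupby(sorted(data)))

print(unsorted_groups)  # {'g': ['g'], 'a': ['a'], 'b': ['b']}
print(sorted_groups)    # {'a': ['a', 'a'], 'b': ['b'], 'g': ['g', 'g', 'g']}
```

If the input order cannot be changed or sorting is too costly, the defaultdict loop does not have this restriction.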

my_dict = {}
my_dict.setdefault('sample', []).append(value)
The second parameter of the setdefault method is the initial value to store if the given key does not exist.
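Applied to the data from the question, the setdefault approach might look like this (a sketch in Python 3; the loop and the sample list are filled in for illustration):

```python
data = ['a', 'a', 'b', 'c', 'g', 'g', 'g', 'g', 'g']

my_dict = {}
for value in data:
    # Returns the existing list for this key, or stores and returns a new one.
    my_dict.setdefault(value, []).append(value)

print(my_dict)
```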

If I understand correctly, your values are lists.
if 'sample' in mydict:
    pass  # whatever
else:
    mydict['sample'] = []
What you want to do is the following:
A = ['a','a','b','b','b','c','c','c']
myDict = {}
for i in A:
    if i not in myDict:
        myDict[i] = []
    myDict[i].append(i)
print(myDict)

Every dict key should contain a list. Am I right?
d = dict()
try:
    d['sample'].append(new_data)
except KeyError:
    d['sample'] = [new_data]
I believe this would work. By the way, you shouldn't use the name dict for a dictionary. dict is already the name of the built-in dictionary type.
Edit1:
I'm not really sure I understand what you are trying to do, nor do I know if my solution is the best of those proposed. But is this what you are trying to do? It seems a little bit odd. Or do you want to count how many times each letter occurs?
# Create a list named l.
l = ['a', 'a', 'b', 'c', 'g', 'g', 'g', 'g','g']
# Create dictionary named d.
d = dict()
for i in l:
    try:
        d[i].append(i)
    except KeyError:
        d[i] = [i]

If you are using some very old python:
if not myDict.has_key(key):
    myDict[key] = [val]
else:
    myDict[key].append(val)
has_key has since been deprecated in favor of key in dict (and was removed entirely in Python 3).
So nowadays, it would be:
if key not in myDict:
    myDict[key] = [val]
else:
    myDict[key].append(val)

Credit should go to @Niclas Nilsson: even though his posted solution didn't really work for what I wanted, it did help me figure out the solution I wanted in its simplest form.
Still, I appreciate everyone's help here for the extra knowledge and alternative ways of solving it. Thanks a lot.
The following achieved what I wanted in the simplest way, without importing any additional library:
r = {}
try:
    if r['new_data']:
        r['new_data'] = 'appending'
except KeyError:
    r['new_data'] = 'new value'
print(r['new_data'])

Related

Elegant way to set values in a nested json in python

I am setting up some values in a nested JSON. In the JSON, it is not necessary that the keys would always be present.
My sample code looks like below.
if 'key' not in data:
    data['key'] = {}
if 'nested_key' not in data['key']:
    data['key']['nested_key'] = some_value
Is there any other elegant way to achieve this? Simply assigning the value without if's like - data['key']['nested_key'] = some_value can sometimes throw KeyError.
I looked at multiple similar questions about "getting nested JSON" on StackOverflow, but none fulfilled my requirement, so I have added a new question. If this is a duplicate, I'll remove it once pointed towards the right question.
Thanks
Please note that for a top-level insertion you need not check for the key; you can assign to it directly. For nested values, defaultdict can be used. It is particularly helpful in the case of values like lists.
from collections import defaultdict
data = defaultdict(dict)
data['key']['nested_key'] = some_value
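defaultdict ensures that you never get a KeyError for a missing top-level key. If the key doesn't exist, it creates, stores, and returns an empty object of the type with which you initialized it. The nested values here are plain dicts, so reading a missing key inside one of them still raises KeyError.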
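A short sketch of how far this protection reaches (Python 3; the flag inner_lookup_failed and the keys 'missing' and 'other' are illustrative only). It also shows that merely reading a missing key stores a new entry:

```python
from collections import defaultdict

data = defaultdict(dict)
data['key']['nested_key'] = 'some_value'  # 'key' is created on demand

# The inner value is a plain dict, so a missing inner key still raises.
try:
    data['key']['missing']
    inner_lookup_failed = False
except KeyError:
    inner_lookup_failed = True

# Merely reading a missing outer key inserts it as a side effect.
_ = data['other']

print(inner_lookup_failed)  # True
print(sorted(data))         # ['key', 'other']
```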
List based example:
from collections import defaultdict
data = defaultdict(list)
data['key'].append(1)
which otherwise would have to be done like below:
data = {}
if 'key' not in data:
    data['key'] = [1]
else:
    data['key'].append(1)
Example based on existing dict:
from collections import defaultdict
data = {'key1': 'sample'}
data_new = defaultdict(dict,data)
data_new['key']['something'] = 'nothing'
print(data_new)
Output:
defaultdict(<type 'dict'>, {'key1': 'sample', 'key': {'something': 'nothing'}})
You can write in one statement:
data.setdefault('key', {})['nested_key'] = some_value
but I am not sure it looks more elegant.
PS: if you prefer to use defaultdict as proposed by Jay, you can initialize the new dict with the original one returned by json.loads(), then pass it to json.dumps():
data2 = defaultdict(dict, data)
data2['key'] = value
json.dumps(data2)  # returns the expected dict as a JSON string
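A self-contained version of that round trip might look as follows (a sketch in Python 3; the JSON text in raw and the key names are made up for illustration):

```python
import json
from collections import defaultdict

raw = '{"key1": "sample"}'  # hypothetical JSON input

data = defaultdict(dict, json.loads(raw))
data['key']['nested_key'] = 'some_value'

# defaultdict is a dict subclass, so json.dumps serializes it like a dict.
encoded = json.dumps(data, sort_keys=True)
print(encoded)  # {"key": {"nested_key": "some_value"}, "key1": "sample"}
```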

Python use %s in dict to call value

How can we use dict similar to the following?
dict[%s] % variable
For those who are interested in what I am trying to do exactly, I have three dicts:
dict_1 = {'a':'123', 'b':'234', 'c':'345'}
dict_2 = {'d':'456', 'e':'567', 'f':'678'}
dict_3 = {'a':'e', 'b':'d', 'c':'f'}
And I have a function where I need to input something like:
function(dict_1['a'], dict_2['e']) #according to dict_3 that 'a' is paired with 'e'.
Edited:
I was trying to write a for loop over all the dicts that passes paired dict_1 and dict_2 values into the function according to dict_3. I actually don't need the %s thing after looking at your answers. That's what happens when you try to code without coffee in the morning lol.
And in the end, the following did what I wanted, thanks all!:
for i in dict_1:
    results = function(dict_1[i], dict_2[dict_3[i]])
If what I gather is correct, you were almost already on the money. You can just write it like this:
dict["%s" % variable]
One issue with this, however, is that if you had something like:
d = {3 : 'hello'}
my_key = 3
d['%s' % my_key]
That would fail with a KeyError
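The mismatch can be checked directly. This small sketch (Python 3; the name formatted is illustrative) shows that the formatted key is a string and therefore a different key from the integer:

```python
d = {3: 'hello'}
my_key = 3

formatted = '%s' % my_key   # the string '3', not the integer 3
print(formatted in d)       # False
print(my_key in d)          # True
print(d[my_key])            # hello
```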
Regardless though, it is kind of a roundabout way to use it. You can just write:
dict[otherdict['a']]
function(dict_1['a'], dict_2[dict_3['a']])
should work.
So I'd do something like
k = 'a'
function(dict_1[k],dict_2[dict_3[k]])
Using the % operator here doesn't make much sense to me because you're not trying to produce a string with some static data and a variably inserted value.
Because the [] operator binds more tightly than the % operator, the subscript is evaluated first, so the formatting has to happen inside the brackets. You should write dict["%s" % variable], or more simply dict[variable] if variable is already a string.
If your variables are strings, you then use them as keys directly
dict[variable]
For the function you want to do, you can call
function(dict1[variable], dict2[dict3[variable]])
If you are trying to call this function for every key and value in dict3, you may want to just iterate over dict3.
for key, val in dict3.items():
    function(dict1[key], dict2[val])
For a more pythonic way of collecting these results into a list
func_results = [function(dict1[key], dict2[val]) for key, val in dict3.items()]
One final thing. If you just have dict3 there for the purpose of this function call, a list of tuples is all you need:
arg_list = [('a', 'e'), ('b', 'd'), ('c', 'f')]
func_results = [function(dict1[arg1], dict2[arg2]) for arg1, arg2 in arg_list]
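To see the pairing end to end, here is a runnable sketch in Python 3 using the three dicts from the question. The body of function is a hypothetical stand-in, since the question does not show what the real function does:

```python
dict_1 = {'a': '123', 'b': '234', 'c': '345'}
dict_2 = {'d': '456', 'e': '567', 'f': '678'}
dict_3 = {'a': 'e', 'b': 'd', 'c': 'f'}


def function(x, y):
    # Hypothetical stand-in: joins the two looked-up values.
    return x + '-' + y


# dict_3 maps each dict_1 key to the dict_2 key it is paired with.
results = {key: function(dict_1[key], dict_2[val]) for key, val in dict_3.items()}
print(results)  # {'a': '123-567', 'b': '234-456', 'c': '345-678'}
```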
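Note that if the keys are integers instead of strings, a formatted string will not match them, so you would have to convert the result back to an integer (which is equivalent to using dict[some_integer] directly):
dict[int("%d" % some_integer)]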

A "pythonic" strategy to check whether a key already exists in a dictionary

I often deal with heterogeneous datasets and I acquire them as dictionaries in my python routines. I usually face the problem that the key of the next entry I am going to add to the dictionary already exists.
I was wondering if there exists a more "pythonic" way to do the following task: check whether the key exists and create/update the corresponding pair key-item of my dictionary
myDict = dict()
for line in myDatasetFile:
    if int(line[-1]) in myDict.keys():
        myDict[int(line[-1])].append([line[2],float(line[3])])
    else:
        myDict[int(line[-1])] = [[line[2],float(line[3])]]
Use a defaultdict.
from collections import defaultdict
d = defaultdict(list)
# Every time you try to access the value of a key that isn't in the dict yet,
# d will call list with no arguments (producing an empty list),
# store the result as the new value, and give you that.
for line in myDatasetFile:
    d[int(line[-1])].append([line[2],float(line[3])])
Also, never use thing in d.keys(). In Python 2, that will create a list of keys and iterate through it one item at a time to find the key instead of using a hash-based lookup. In Python 3, it's not quite as horrible, but it's still redundant and still slower than the right way, which is thing in d.
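As a runnable illustration (Python 3), the sketch below assumes each line has already been split into a list of fields. The rows in my_dataset are invented, since the question does not show the file format:

```python
from collections import defaultdict

# Hypothetical rows, already split into fields; the last field is the key.
my_dataset = [
    ['x', 'y', 'alpha', '1.5', '7'],
    ['x', 'y', 'beta', '2.0', '7'],
    ['x', 'y', 'gamma', '0.5', '9'],
]

d = defaultdict(list)
for line in my_dataset:
    d[int(line[-1])].append([line[2], float(line[3])])

print(dict(d))
# Membership is tested on the dict itself, not on d.keys();
# unlike d[8], the test "8 in d" does not create the key.
print(7 in d, 8 in d)  # True False
```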
That is what dict.setdefault is for.
setdefault(key[, default])
If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.
example :
>>> d={}
>>> d.setdefault('a',[]).append([1,2])
>>> d
{'a': [[1, 2]]}
Python follows the idea that it's easier to ask for forgiveness than permission,
so the true Pythonic way would be:
try:
    myDict[int(line[-1])].append([line[2],float(line[3])])
except KeyError:
    myDict[int(line[-1])] = [[line[2],float(line[3])]]
for reference:
https://docs.python.org/2/glossary.html#term-eafp
https://stackoverflow.com/questions/6092992/why-is-it-easier-to-ask-forgiveness-than-permission-in-python-but-not-in-java
Try to catch the Exception when you get a KeyError
myDict = dict()
for line in myDatasetFile:
    try:
        myDict[int(line[-1])].append([line[2],float(line[3])])
    except KeyError:
        myDict[int(line[-1])] = [[line[2],float(line[3])]]
Or use:
myDict = dict()
for line in myDatasetFile:
    myDict.setdefault(int(line[-1]),[]).append([line[2],float(line[3])])

python: getting sub-dicts in dicts dynamically?

Say I want to write a function which will return an arbitrary value from a dict, like: mydict['foo']['bar']['baz'], or return an empty string if it doesn't exist. However, I don't know if mydict['foo'] will necessarily exist, let alone mydict['foo']['bar']['baz'].
I'd like to do something like:
def safe_nested(dict, element):
    try:
        return dict[element]
    except KeyError:
        return ''
But I don't know how to approach writing code that will accept the lookup path in the function. I started going down the route of accepting a period-separated string (like foo.bar.baz) so this function could recursively try to get the next sub-dict, but this didn't feel very Pythonic. I'm wondering if there's a way to pass in both the dict (mydict) and the sub-structure I'm interested in (['foo']['bar']['baz']), and have the function try to access this or return an empty string if it encounters a KeyError.
Am I going about this in the right way?
You should use the standard defaultdict: https://docs.python.org/2/library/collections.html#collections.defaultdict
For how to nest them, see: defaultdict of defaultdict, nested or Multiple levels of 'collection.defaultdict' in Python
I think this does what you want:
from collections import defaultdict
mydict = defaultdict(lambda: defaultdict(lambda: defaultdict(str)))
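A brief sketch of how this nested defaultdict behaves (Python 3; the keys are illustrative). Note that it is fixed at three levels, and that a failed lookup leaves the intermediate keys behind:

```python
from collections import defaultdict

mydict = defaultdict(lambda: defaultdict(lambda: defaultdict(str)))
mydict['foo']['bar']['baz'] = 'value'

print(repr(mydict['foo']['bar']['baz']))   # 'value'
print(repr(mydict['nope']['bar']['baz']))  # ''

# The failed lookup above created the intermediate keys as a side effect.
print('nope' in mydict)  # True
```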
You might also want to check out addict.
>>> from addict import Dict
>>> addicted = Dict()
>>> addicted.a = 2
>>> addicted.b.c.d.e
{}
>>> addicted
{'a': 2, 'b': {'c': {'d': {'e': {}}}}}
It returns an empty Dict, not an empty string, but apart from that it looks like it does what you ask for in the question.
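For comparison, the function described in the question can also be sketched with a plain dict and a variable-length key path. This is only one possible approach, not something proposed in the answers above; the signature safe_nested(mapping, *keys) is an assumption:

```python
def safe_nested(mapping, *keys):
    """Follow keys into nested dicts; return '' if any step is missing."""
    current = mapping
    for key in keys:
        try:
            current = current[key]
        except (KeyError, TypeError):
            # KeyError: key absent; TypeError: current is not subscriptable
            # with this key (for example, an int or a string).
            return ''
    return current


mydict = {'foo': {'bar': {'baz': 42}}}
print(safe_nested(mydict, 'foo', 'bar', 'baz'))         # 42
print(repr(safe_nested(mydict, 'foo', 'nope', 'baz')))  # ''
```

Unlike the defaultdict approach, a failed lookup here does not modify the dictionary.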

Most efficient way to add new keys or append to old keys in a dictionary during iteration in Python?

Here's a common situation when compiling data in dictionaries from different sources:
Say you have a dictionary that stores lists of things, such as things I like:
likes = {
    'colors': ['blue', 'red', 'purple'],
    'foods': ['apples', 'oranges']
}
and a second dictionary with some related values in it:
favorites = {
    'colors': 'yellow',
    'desserts': 'ice cream'
}
You then want to iterate over the "favorites" object and either append the items in that object to the list with the appropriate key in the "likes" dictionary or add a new key to it with the value being a list containing the value in "favorites".
There are several ways to do this:
for key in favorites:
    if key in likes:
        likes[key].append(favorites[key])
    else:
        likes[key] = [favorites[key]]
or
for key in favorites:
    try:
        likes[key].append(favorites[key])
    except KeyError:
        likes[key] = [favorites[key]]
And many more as well...
I generally use the first syntax because it feels more pythonic, but if there are other, better ways, I'd love to know what they are. Thanks!
Use collections.defaultdict, where the default value is a new list instance.
>>> import collections
>>> mydict = collections.defaultdict(list)
In this way calling .append(...) will always succeed, because in case of a non-existing key append will be called on a fresh empty list.
You can instantiate the defaultdict with a previously generated dict, in case you get the dict likes from another source, like so:
>>> mydict = collections.defaultdict(list, likes)
Note that using list as the default_factory attribute of a defaultdict is also discussed as an example in the documentation.
Use collections.defaultdict:
import collections
likes = collections.defaultdict(list)
for key, value in favorites.items():
    likes[key].append(value)
defaultdict takes a factory as its first argument, used to create values for unknown keys on demand. list is such a function; it creates empty lists.
And iterating over .items() will save you from using the key to get the value.
Besides defaultdict, the regular dict offers one possibility (that might look a bit strange): dict.setdefault(k[, d]):
for key, val in favorites.items():
    likes.setdefault(key, []).append(val)
Thank you for the +20 in rep -- I went from 1989 to 2009 in 30 seconds. Let's remember it is 20 years since the Wall fell in Europe..
>>> from collections import defaultdict
>>> d = defaultdict(list, likes)
>>> d
defaultdict(<class 'list'>, {'colors': ['blue', 'red', 'purple'], 'foods': ['apples', 'oranges']})
>>> for i, j in favorites.items():
...     d[i].append(j)
...
>>> d
defaultdict(<class 'list'>, {'desserts': ['ice cream'], 'colors': ['blue', 'red', 'purple', 'yellow'], 'foods': ['apples', 'oranges']})
All of the answers are defaultdict, but I'm not sure that's the best way to go about it. Giving out defaultdict to code that expects a dict can be bad. (See: How do I make a defaultdict safe for unexpecting clients? ) I'm personally torn on the matter. (I actually found this question looking for an answer to "which is better, dict.get() or defaultdict") Someone in the other thread said that you don't want a defaultdict if you don't want this behavior all the time, and that might be true. Maybe using defaultdict for the convenience is the wrong way to go about it. I think there are two needs being conflated here:
"I want a dict whose default values are empty lists." to which defaultdict(list) is the correct solution.
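and
"I want to append to the list at this key if it exists and create a list if it does not exist." to which my_dict.get('foo', []) with append() is the answer, as long as the list is then assigned back to my_dict['foo'] (get on its own does not store the default; my_dict.setdefault('foo', []) does).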
What do you guys think?
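One detail worth checking when weighing these options: get() with a default does not store that default in the dict. A minimal sketch in Python 3 (key names are illustrative):

```python
my_dict = {}

# get() returns a fresh list that is never stored in the dict.
my_dict.get('foo', []).append(1)
print(my_dict)  # {}

# Assigning the result back stores it.
items = my_dict.get('foo', [])
items.append(1)
my_dict['foo'] = items
print(my_dict)  # {'foo': [1]}

# setdefault() stores the default and returns it in one step.
my_dict.setdefault('bar', []).append(2)
print(my_dict)  # {'foo': [1], 'bar': [2]}
```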
