Having created a dictionary whose keys come from one dataframe column, I want to set every value to its own instance of a class (the class serves as a container storing key statistics for each row of the original pandas dataframe).
Hence, I tried this:
class Bond:
    def __init__(self):
        self.totalsize = 0
        self.count = 0

if __name__ == '__main__':
    isin_dict = list_of_isins.set_index('isin').T.to_dict()
    isin_dict = dict.fromkeys(isin_dict, Bond())
The problem is that all values in isin_dict point to the same object, i.e. all rows share one and the same Bond instance.
How could I create a dictionary with each key holding a separate class instance as value?
The reason for this is already explained here:
dict.fromkeys() uses the same value for every key.
The solution is to use a dictionary comprehension, or defaultdict from the collections module.
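A dict comprehension evaluates Bond() once per key, so every key gets its own instance. A minimal sketch, reusing list_of_isins from the question:

# each iteration calls Bond() afresh, so no two keys share an instance
isin_dict = {isin: Bond() for isin in list_of_isins.set_index('isin').T.to_dict()}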
Sample Code to use defaultdict
from collections import defaultdict

class Bond:
    def __init__(self):
        pass

# I have just used your variable to build the keys; defaultdict
# then creates one Bond per key on first access
d = defaultdict(Bond)
for key in list(list_of_isins.set_index('isin').T):
    d[key]  # first access creates a fresh Bond() for this key
print(d)
The first argument to defaultdict must be callable (here the class Bond itself serves as the factory); otherwise you may get a TypeError.
Alternately you may pass a lambda expression, which is also callable.
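For example, these two are equivalent here:

d = defaultdict(Bond)            # the class itself is the factory
d = defaultdict(lambda: Bond())  # a lambda works just as well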
I would like to get the names of __init__ parameters and modify them when the code runs. My class looks like this:
class Sample:
    def __init__(self, indicators: dict):
        self.names = []
        self.returns = 0.0
        for k, v in indicators.items():
            setattr(self, k, v)
            self.names.append(k)
The input of this class is a random choice of items from a list; I then assign those random items to a dictionary with integer values.
indicatorsList =["SMA", "WMA", "EMA", "STOCHASTIC", "MACD", "HIGHEST_HIGH",
"HIGHEST_LOW", "HIGHEST_CLOSE", "LOWEST_HIGH", "LOWEST_LOW",
"LOWEST_CLOSE", "ATR", "LINGRES", "RSI", "WRSI", "ROC",
"DAY", "MONTH"]
# initializing the value of n
n = random.randint(2, int(math.ceil(len(indicatorsList) / 2)))
randomIndList = n * [None]
for i in range(n):
    choice = random.choice(indicatorsList)
    randomIndList[i] = choice
...
...
sample = Sample(randDict)
Problem is, I don't know the names of these parameters in __init__, and I need to modify them later, for example like this:
sample.sma = random.randint(0, maxVal)
But I don't know if the object will have sma, or ema, or any other attribute, because of the way they're assigned randomly.
First of all, this code:
sample.sma = random.randint(0, maxVal)
will work, even if sample doesn't have an sma attribute. It will create one. Try it yourself and see.
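A minimal standalone illustration:

class Sample:
    pass

sample = Sample()
sample.sma = 42    # no error: the attribute is created on assignment
print(sample.sma)  # 42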
But as you specified in your comment that you only want to modify attributes that already exist, that won't help in this case.
What you could do, with your existing class definition, is to loop over the names attribute you've already defined.
for name in sample.names:
    setattr(sample, name, random.randint(0, maxVal))
However, you've basically reinvented a dictionary here, so why not redefine your class to directly use a dictionary?
class Sample:
    def __init__(self, indicators: dict):
        self.indicators = indicators
Now you no longer need dynamic setattr or getattr lookups. They're simply keys and values:
for key in sample.indicators:
    sample.indicators[key] = random.randint(0, maxVal)
(This also means you don't need the separate names attribute.)
For variables 'a' and 'b' in dictionary 'dict1', is it possible to later reach variable 'a' through its key in 'dict1' and assign a value to it?
a=""
b=""
dict1= {0:a,1:b}
dict1[0] = "Hai" #assign a value to the variable using the key
print(a) #later call the variable
No. When you write {key: value}, the dictionary stores the value itself, not a reference to the variable, so rebinding one will not affect the other.
The variable is not updated automatically; what you could do is:
def update_dic(a, b):
    dict1 = {0: a, 1: b}
    return dict1

def update_vars(dict1):
    return dict1[0], dict1[1]
Every time you call the first function, the dictionary is rebuilt from the current values; the second function gives you a and b back.
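For example:

a, b = "Hai", ""
dict1 = update_dic(a, b)   # rebuild the dictionary from the current values
dict1[1] = "Bye"           # change a value through its key
a, b = update_vars(dict1)  # pull the values back into the variables
print(a, b)                # Hai Bye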
You could do something similar using a class to store your variables and the indexing dictionary:
class Variables():
    def __init__(self):
        self.varIndex = dict()

    def __getitem__(self, index):
        return self.__dict__[self.varIndex[index]]

    def __setitem__(self, index, value):
        self.__dict__[self.varIndex[index]] = value
variables = Variables()
variables.a = 3
variables.b = 4
variables.varIndex = {0:"a",1:"b"}
variables[0] = 8
print(variables.a) # 8
We can do this by using two dictionaries holding the same variables.
This allows us to access the variable 'a' using the key 0, alter its value through 'dict2', and later read the value back by the name 'a'.
Remember, however, that the variable names need to be written as strings, i.e. in quotes; it does not work with bare variable names.
dict1= {0:'a',1:'b'}
dict2={'a':'x','b':'y'}
dict2[dict1[0]]="Hai" #assign a value to the variable using the key
print(dict2['a']) #later call the variable
Is there a way to add duplicate keys to JSON with Python?
From my understanding, you can't have duplicate keys in Python dictionaries. Usually, the way I create JSON is to build a dictionary and then call json.dumps. However, I need duplicate keys within the JSON for testing purposes, but I can't produce them because a Python dictionary can't hold duplicate keys. I am trying to do this in Python 3.
You could always construct such a string value by hand.
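For example, since JSON is just text, a literal string is enough for testing (note that json.loads would keep only the last duplicate):

payload = '{"a": 1, "a": 2}'
print(payload)  # {"a": 1, "a": 2}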
On the other hand, one can make the CPython json module encode duplicate keys. This is very tricky in Python 2 because the json module does not respect duck-typing at all.
The straightforward solution would be to inherit from collections.Mapping - but you can't, since "MyMapping is not JSON serializable."
Next one tries to subclass a dict - but if json.dumps notices that the type is dict, it skips calling __len__ and looks at the underlying dict directly - and if that is empty, {} is output directly. So if we fake the methods, the underlying dictionary must not be empty.
The next source of joy is that __iter__ is actually called, which iterates the keys; and for each key, __getitem__ is called, so we need to remember the corresponding value to return for the given key... thus we arrive at a very ugly solution for Python 2:
class FakeDict(dict):
    def __init__(self, items):
        # need to have something in the dictionary so that json.dumps
        # does not short-circuit to outputting {}
        self['something'] = 'something'
        self._items = items

    def __getitem__(self, key):
        # return the value remembered by the generator below
        return self.last_val

    def __iter__(self):
        def generator():
            for key, value in self._items:
                self.last_val = value
                yield key
        return generator()
In CPython 3.3+ it is slightly easier... no, collections.abc.Mapping does not work, yes, you need to subclass a dict, yes, you need to fake that your dictionary has content... but the internal JSON encoder calls items instead of __iter__ and __getitem__!
Thus on Python 3:
import json

class FakeDict(dict):
    def __init__(self, items):
        self['something'] = 'something'
        self._items = items

    def items(self):
        return self._items

print(json.dumps(FakeDict([('a', 1), ('a', 2)])))
prints out
{"a": 1, "a": 2}
Thanks a lot Antti Haapala, I figured out you can even use this to convert an array of tuples into a FakeDict:
def function():
    array_of_tuples = []
    array_of_tuples.append(("key", "value1"))
    array_of_tuples.append(("key", "value2"))
    return FakeDict(array_of_tuples)

print(json.dumps(function()))
Output:
{"key": "value1", "key": "value2"}
And if you change the FakeDict class as follows, empty dictionaries will be serialized correctly:
class FakeDict(dict):
    def __init__(self, items):
        if items != []:
            self['something'] = 'something'
        self._items = items

    def items(self):
        return self._items

def test():
    array_of_tuples = []
    return FakeDict(array_of_tuples)

print(json.dumps(test()))
Output:
"{}"
Actually, it's very easy:
$> python -c "import json; print(json.dumps({1: 'a', '1': 'b'}))"
{"1": "a", "1": "b"}
This works because the int key 1 and the str key '1' are distinct dictionary keys, yet both are serialized to the JSON key "1".
Given n-number of models which contain m-number of key:value pairs, can namedtuples be used to consolidate that information inside of one object? By consolidate, I mean refactor, so that I can pass around this object and access particular bits of information from it.
How it is organized currently:
Model_1_Dict = {'key1':('value1','value2','value3'),'key2':('value1','value2','value3')}
Model_2_Dict = {'key1':('value1','value2','value3'),'key2':('value1','value2','value3')}
Each model dictionary will have 3 value pairs per key. The key represents an independent variable name (from a regression model), the values represent the beta coefficient, calculated value(x), and an associated function... semantically like this:
>>> Model_1_Dict["Variable Name"]
("Beta Coefficient", "Calculated Value", "myClass.myFunction")
Model_1_Dict["Variable Name"][1] gets updated later on in the code. I could either pass = None at initialization, and then update the value on calculation. Or append the value to the values list object sometime later on (I think this is a non-issue).
I want to know if there is a better way to handle the model information using other structures, such as namedtuples?
Yes, you could use namedtuples -- up to the point where you said you have to update the information inside the dictionaries.
Neither tuples nor namedtuples are modifiable, though namedtuple's _replace offers a workaround if you don't mind rebinding, as sketched below.
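A minimal sketch: _replace builds a new tuple with one field changed, so the dictionary entry has to be rebound rather than mutated in place.

from collections import namedtuple

Row = namedtuple('Row', ['beta', 'calculated', 'function'])
Model_1_Dict = {'key1': Row(beta=0.5, calculated=None, function='myClass.myFunction')}
# _replace returns a *new* namedtuple; the dictionary entry must be rebound
Model_1_Dict['key1'] = Model_1_Dict['key1']._replace(calculated=1.23)
print(Model_1_Dict['key1'].calculated)  # 1.23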
So, if you want to act on your data referring to it by name rather than by index number, you should either use a nested dictionary structure - or create a simple class to hold your data.
The advantage of creating a custom class just for that is that you can restrict the names assigned to it by having a __slots__ attribute on the class, and you can customize the __repr__ of the object so that it looks nice on inspection -
Something along these lines:
class Data(object):
    __slots__ = ("beta", "calculated", "function")

    def __init__(self, beta=None, calculated=None, function=None):
        self.beta = beta
        self.calculated = calculated
        self.function = function

    def __repr__(self):
        return "(%s, %s, %s)" % (self.beta, self.calculated, self.function)
Which works like this:
>>> Model_1_Dict = {'key1':Data('value1','value2','value3') }
>>>
>>> Model_1_Dict
{'key1': (value1, value2, value3)}
>>> Model_1_Dict["key1"].beta = "NewValue"
>>> Model_1_Dict
{'key1': (NewValue, value2, value3)}
>>>
I want to create a Python dictionary that returns the key itself for keys that are missing from the dictionary.
Usage example:
dic = smart_dict()
dic['a'] = 'one a'
print(dic['a'])
# >>> one a
print(dic['b'])
# >>> b
dicts have a __missing__ hook for this:
class smart_dict(dict):
    def __missing__(self, key):
        return key
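which gives exactly the behaviour asked for:

dic = smart_dict()
dic['a'] = 'one a'
print(dic['a'])  # one a
print(dic['b'])  # b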
Could simplify it as (since self is never used):
class smart_dict(dict):
    @staticmethod
    def __missing__(key):
        return key
Why don't you just use
dic.get('b', 'b')
Sure, you can subclass dict as others point out, but I find it handy to remind myself every once in a while that get can have a default value!
If you want to have a go at the defaultdict, try this:
dic = defaultdict()
dic.__missing__ = lambda key: key
dic['b'] # should set dic['b'] to 'b' and return 'b'
except... well: AttributeError: 'collections.defaultdict' object attribute '__missing__' is read-only, so you will have to subclass:
from collections import defaultdict

class KeyDict(defaultdict):
    def __missing__(self, key):
        return key

d = KeyDict()
print(d['b'])    # prints 'b'
print(d.keys())  # prints dict_keys([]) -- the key is not inserted
Congratulations. You too have discovered the uselessness of the
standard collections.defaultdict type. If that execrable midden heap of code smell
offends your delicate sensibilities as much as it did mine, this is your lucky
StackOverflow day.
Thanks to the forbidden wonder of the 3-parameter
variant of the type()
builtin, crafting a non-useless default dictionary type is both fun and profitable.
What's Wrong with dict.__missing__()?
Absolutely nothing, assuming you like excess boilerplate and the shocking silliness of collections.defaultdict – which should behave as expected but really doesn't. To be fair, Jochen
Ritzel's accepted
solution of subclassing dict and
implementing the optional __missing__()
method is a fantastic
workaround for small-scale use cases only requiring a single default dictionary.
But boilerplate of this sort scales poorly. If you find yourself instantiating
multiple default dictionaries, each with their own slightly different logic for
generating missing key-value pairs, an industrial-strength alternative
automating boilerplate is warranted.
Or at least nice. Because why not fix what's broken?
Introducing DefaultDict
In less than ten lines of pure Python (excluding docstrings, comments, and
whitespace), we now define a DefaultDict type initialized with a user-defined
callable generating default values for missing keys. Whereas the callable passed
to the standard collections.defaultdict type uselessly accepts no
parameters, the callable passed to our DefaultDict type usefully accepts the
following two parameters:
The current instance of this dictionary.
The current missing key to generate a default value for.
Given this type, solving sorin's
question reduces to a single line of Python:
>>> dic = DefaultDict(lambda self, missing_key: missing_key)
>>> dic['a'] = 'one a'
>>> print(dic['a'])
one a
>>> print(dic['b'])
b
Sanity. At last.
Code or It Didn't Happen
def DefaultDict(keygen):
    '''
    Sane **default dictionary** (i.e., dictionary implicitly mapping a missing
    key to the value returned by a caller-defined callable passed both this
    dictionary and that key).

    The standard :class:`collections.defaultdict` class is sadly insane,
    requiring the caller-defined callable accept *no* arguments. This
    non-standard alternative requires this callable accept two arguments:

    #. The current instance of this dictionary.
    #. The current missing key to generate a default value for.

    Parameters
    ----------
    keygen : CallableTypes
        Callable (e.g., function, lambda, method) called to generate the default
        value for a "missing" (i.e., undefined) key on the first attempt to
        access that key, passed first this dictionary and then this key and
        returning this value. This callable should have a signature resembling:
        ``def keygen(self: DefaultDict, missing_key: object) -> object``.
        Equivalently, this callable should have the exact same signature as that
        of the optional :meth:`dict.__missing__` method.

    Returns
    ----------
    MappingType
        Empty default dictionary creating missing keys via this callable.
    '''

    # Global variable modified below.
    global _DEFAULT_DICT_ID

    # Unique classname suffixed by this identifier.
    default_dict_class_name = 'DefaultDict' + str(_DEFAULT_DICT_ID)

    # Increment this identifier to preserve uniqueness.
    _DEFAULT_DICT_ID += 1

    # Dynamically generated default dictionary class specific to this callable.
    default_dict_class = type(
        default_dict_class_name, (dict,), {'__missing__': keygen,})

    # Instantiate and return the first and only instance of this class.
    return default_dict_class()


_DEFAULT_DICT_ID = 0
'''
Unique arbitrary identifier with which to uniquify the classname of the next
:func:`DefaultDict`-derived type.
'''
The key ...get it, key? to this arcane wizardry is the call to
the 3-parameter variant
of the type() builtin:
type(default_dict_class_name, (dict,), {'__missing__': keygen,})
This single line dynamically generates a new dict subclass aliasing the
optional __missing__ method to the caller-defined callable. Note the distinct
lack of boilerplate, reducing DefaultDict usage to a single line of Python.
Automation for the egregious win.
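If the 3-parameter type() is unfamiliar, here is a minimal standalone sketch of what that call does:

# type(name, bases, namespace) builds a class object on the fly --
# this is equivalent to writing "class Point(object): x = 0" by hand
Point = type('Point', (object,), {'x': 0})
p = Point()
print(type(p).__name__)  # Point
print(p.x)               # 0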
The first respondent mentioned defaultdict,
but you can define __missing__ for any subclass of dict:
>>> class Dict(dict):
...     def __missing__(self, key):
...         return key
>>> d = Dict(a=1, b=2)
>>> d['a']
1
>>> d['z']
'z'
Also, I like the second respondent's approach:
>>> d = dict(a=1, b=2)
>>> d.get('z', 'z')
'z'
I agree this should be easy to do, and also easy to set up with different defaults or functions that transform a missing value somehow.
Inspired by Cecil Curry's answer, I asked myself: why not have the default-generator (either a constant or a callable) as a member of the class, instead of generating different classes all the time? Let me demonstrate:
# default behaviour: return missing keys unchanged
dic = FlexDict()
dic['a'] = 'one a'
print(dic['a'])
# 'one a'
print(dic['b'])
# 'b'
# regardless of default: easy initialisation with existing dictionary
existing_dic = {'a' : 'one a'}
dic = FlexDict(existing_dic)
print(dic['a'])
# 'one a'
print(dic['b'])
# 'b'
# using constant as default for missing values
dic = FlexDict(existing_dic, default = 10)
print(dic['a'])
# 'one a'
print(dic['b'])
# 10
# use callable as default for missing values
dic = FlexDict(existing_dic, default = lambda missing_key: missing_key * 2)
print(dic['a'])
# 'one a'
print(dic['b'])
# 'bb'
print(dic[2])
# 4
How does it work? Not so difficult:
class FlexDict(dict):
    '''Subclass of dictionary which returns a default for missing keys.
    This default can either be a constant, or a callable accepting the missing key.
    If "default" is not given (or None), each missing key will be returned unchanged.'''

    def __init__(self, content=None, default=None):
        if content is None:
            super().__init__()
        else:
            super().__init__(content)
        if default is None:
            default = lambda missing_key: missing_key
        self.default = default  # sets self._default

    @property
    def default(self):
        return self._default

    @default.setter
    def default(self, val):
        if callable(val):
            self._default = val
        else:  # constant value
            self._default = lambda missing_key: val

    def __missing__(self, x):
        return self.default(x)
Of course, one can debate whether one wants to allow changing the default-function after initialisation, but that just means removing @default.setter and absorbing its logic into __init__.
Enabling introspection into the current (constant) default value could be added with two extra lines.
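A sketch of one way to do that, assuming the FlexDict class above (the _raw_default and raw_default names are my own invention):

class IntrospectableFlexDict(FlexDict):
    '''FlexDict that also remembers the raw default passed in.'''

    @FlexDict.default.setter
    def default(self, val):
        self._raw_default = val           # extra line 1: remember the raw value
        FlexDict.default.fset(self, val)  # reuse the parent setter's logic

    @property
    def raw_default(self):                # extra line 2: expose it read-only
        return self._raw_default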
Subclass dict's __getitem__ method. For example, How to properly subclass dict and override __getitem__ & __setitem__