Consider this string: "{'a': A, 'b': B, 'c': 10}". I want to update this "string" by adding a new key d with, let's say, the value 20, so the result would be "{'a': A, 'b': B, 'c': 10, 'd': 20}".
Normally, you could just evaluate the string (with eval or literal_eval) into a dict, update it the way you want, and convert it back to a string. But in this case there are placeholders (A and B), which would not be recognized when evaluating.
What would be the best way to update it, so that the old values are kept the same but the "dict-string" is updated properly?
For a more robust solution that properly parses the dict, you can subclass lib2to3.refactor.RefactoringTool and refactor the code with a fixer: a subclass of lib2to3.fixer_base.BaseFix whose pattern looks for a dictsetmaker node, and whose transform method extends the node's children with the leaf nodes (tokens) that make up the new key-value pair:
from lib2to3 import fixer_base, refactor, pytree
from lib2to3.pgen2 import token

class AddKeyValue(fixer_base.BaseFix):
    PATTERN = "dictsetmaker"

    def transform(self, node, results):
        node.children.extend((
            pytree.Leaf(token.COMMA, ','),
            pytree.Leaf(token.STRING, "'d'", prefix=' '),
            pytree.Leaf(token.COLON, ':'),
            pytree.Leaf(token.NUMBER, 20, prefix=' ')
        ))
        return node

class Refactor(refactor.RefactoringTool):
    def __init__(self, fixers):
        self._fixers = [cls(None, None) for cls in fixers]
        super().__init__(None)

    def get_fixers(self):
        return self._fixers, []

s = "{'a': A, 'b': B, 'c': 10}"
print(Refactor([AddKeyValue]).refactor_string(s + '\n', ''))
This outputs:
{'a': A, 'b': B, 'c': 10, 'd': 20}
lib2to3 is round-trip stable, so all whitespace is preserved after the transformation; a new node must be given a prefix if whitespace should be inserted before it. Note that lib2to3 was deprecated in Python 3.9 and removed in Python 3.13, so this approach requires an older interpreter.
You can find the definition of the Python grammar in Grammar.txt of the lib2to3 module.
Demo: https://repl.it/#blhsing/RudeLimegreenConcentrate
This is by no means the best solution, but here is one approach:
dict_str = "{'a': A, 'b': B, 'c': 10}"

def update_dict(dict_str, **keyvals):
    """Create an updated dict_str.

    Parameters:
        dict_str (str): current dict_str
        **keyvals: variable number of key-values
    Returns:
        str: updated string
    """
    # build "'key': value" for each pair and join with ", "
    new_entries = ", ".join(f"'{key}': {val}" for key, val in keyvals.items())
    # split at the last '}' only (str.replace would also hit braces of nested dicts)
    head, _, tail = dict_str.rpartition("}")
    return f"{head}, {new_entries}}}{tail}"
Usage, with the printed output shown after each call:
updated = update_dict(dict_str, d=20, e=30)
print(updated)
{'a': A, 'b': B, 'c': 10, 'd': 20, 'e': 30}
some_dict = {
    'g': 2,
    'h': 3
}
updated = update_dict(dict_str, **some_dict)
print(updated)
{'a': A, 'b': B, 'c': 10, 'g': 2, 'h': 3}
I think that you can:
Option 1 - Adding
Insert the new string ", key: value" at the end of the string, before the "}".
Option 2 - RegEx for adding/updating
1 - Use find() to search for the key. If it exists, use a regex substitution (the function is re.sub; there is no re.replace):
re.sub(regex_search, regex_replace, contents)
So using something like:
string = re.sub(r"'key': [^,}]+", "'key': value", string)
2 - If find() fails (returns -1), fall back to the insertion from option 1.
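The two options above can be combined into one helper. The following is only a sketch: the name set_key is made up for illustration, and it assumes a flat dict-string whose values contain no commas or braces.

```python
import re

def set_key(dict_str, key, value):
    """Update `key` in a dict-like string, or append it if it is absent.

    Hypothetical helper; assumes a flat dict whose values contain
    no commas or closing braces.
    """
    # Match "'key': <old value>" up to the next comma or closing brace
    pattern = re.compile(rf"('{re.escape(key)}'\s*:\s*)[^,}}]+")
    if pattern.search(dict_str):
        # A function replacement avoids backslash handling in `value`
        return pattern.sub(lambda m: m.group(1) + value, dict_str, count=1)
    # Key not present: insert the new pair before the final closing brace
    body = dict_str.rstrip()[:-1].rstrip()
    sep = ", " if body != "{" else ""
    return f"{body}{sep}'{key}': {value}}}"

s = "{'a': A, 'b': B, 'c': 10}"
print(set_key(s, "d", "20"))
print(set_key(s, "b", "C"))
```

Values are passed as strings so that placeholders such as A or B can be written without being evaluated.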
If it's just about adding at the end of the string...
this_string = "{'a': A, 'b': B, 'c': 10}"
this_add = "'d': 20"
this_string = f"{this_string[:-1]}, {this_add}{this_string[-1]}"
print(this_string)
will output
{'a': A, 'b': B, 'c': 10, 'd': 20}
If you need to insert the new string somewhere in the middle, you can do something similar: use str.find to locate the index and slice at that index instead.
This basically rewrites the entire string, but strings are immutable, so there is no way around it.
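Inserting in the middle with str.find could look roughly like this. The helper name insert_after_key is hypothetical, and the sketch assumes flat entries whose values contain no commas.

```python
def insert_after_key(dict_str, after_key, new_entry):
    """Insert `new_entry` right after the entry for `after_key` (hypothetical helper)."""
    # Locate the start of the existing entry
    start = dict_str.find(f"'{after_key}':")
    if start == -1:
        raise KeyError(after_key)
    # The entry ends at the next comma, or at the closing brace if it is the last one
    end = dict_str.find(",", start)
    if end == -1:
        end = dict_str.rfind("}")
    # Rebuild the string around the insertion point
    return f"{dict_str[:end]}, {new_entry}{dict_str[end:]}"

this_string = "{'a': A, 'b': B, 'c': 10}"
print(insert_after_key(this_string, "a", "'d': 20"))
```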
Imagine 2 different yaml files (or for demonstration purposes, 2 dictionaries)
a = {'A':'yes',
'B': 2,
'C': [-1,0,2],
'D': {
'E': True
}}
b = {'A':'yes',
'G': 2,
'C': [-1,0,1],
'F': {
'E': False
}}
Obviously they look very similar, but they have different keys for what intentionally appears to be similar values.
If we do a comparison of the two:
from deepdiff import DeepDiff
print(DeepDiff(a, b, ignore_order=True, significant_digits=10, verbose_level=2).pretty())
we get this expected result:
Item root['G'] (2) added to dictionary.
Item root['F'] ({'E': False}) added to dictionary.
Item root['B'] (2) removed from dictionary.
Item root['D'] ({'E': True}) removed from dictionary.
Value of root['C'][2] changed from 2 to 1.
This is because DeepDiff doesn't know that the keys represent the same "things".
It is possible to rename the keys:
b['D'] = b.pop('F')
b['B'] = b.pop('G')
and now the same DeepDiff call results in
Value of root['C'][2] changed from 2 to 1.
Value of root['D']['E'] changed from True to False.
So, is there an efficient way to create a "translator" from b to a that automatically interprets those differences, without manually overwriting each key or creating a new dictionary for comparison?
We could create a "mapping dictionary" and iterate through it:
translator_b2a = {'D': 'F',
                  'B': 'G'}
for key in translator_b2a:
    value = translator_b2a[key]
    b[key] = b.pop(value)
and get the same result. I just wonder if there is a more efficient or already existing method. This approach will obviously break down when the yaml files/dictionaries get more complex, such as when the nesting levels of the keys differ, i.e.
b = {'A':'yes',
'G': 2,
'C': [-1,0,1],
'F': {
'E': {'H':False}
}}
Depending on your needs there are actually several options.
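Option 1a: concentrate on the value comparison, not the structure (uses pandas' private nested_to_record helper; assumes both dicts flatten to the same number of values in the same order)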
from pandas.io.json._normalize import nested_to_record
aflat = nested_to_record(a, sep='')
bflat = nested_to_record(b, sep='')
bflat = dict(zip(list(aflat.keys()), bflat.values()))
print(DeepDiff(aflat, bflat, ignore_order=True, significant_digits=10, verbose_level=2).pretty())
###Output:
###Value of root['C'][2] changed from 2 to 1.
###Value of root['DE'] changed from True to False.
Option 1b: the same value-only comparison, using the public pd.json_normalize
import pandas as pd
def flatten(d):
    df = pd.json_normalize(d, sep='')
    return df.to_dict(orient='records')[0]
aflat = flatten(a)
bflat = flatten(b)
bflat = dict(zip(list(aflat.keys()), bflat.values()))
print(DeepDiff(aflat, bflat, ignore_order=True, significant_digits=10, verbose_level=2).pretty())
###Output:
###Value of root['C'][2] changed from 2 to 1.
###Value of root['DE'] changed from True to False.
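Option 2: keep the structure of the first dict
import pandas as pd
def updateDict(init, values):
    # Rebuild a dict shaped like `init`, filling its leaves with `values` in order.
    values = iter(values)  # a shared iterator keeps its position across recursive calls
    items = {}
    for k, v in init.items():
        if isinstance(v, dict):
            items[k] = updateDict(v, values)
        else:
            items[k] = next(values)
    return items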
dfb = pd.json_normalize(b, sep='')
b = updateDict(a, dfb.values[0])
print(DeepDiff(a, b, ignore_order=True, significant_digits=10, verbose_level=2).pretty())
###Output:
###Value of root['C'][2] changed from 2 to 1.
###Value of root['D']['E'] changed from True to False.
Well... this works for now, until I get more creative with finding ways to break it. I'm sure someone will make it better. It handles the second example, where the nested keys might not have a 1:1 relationship, and it also handles an error in the translator: if the requested key is not in the "b" dictionary, the dictionary is left as it was. However, if the requested key is not in the "a" dictionary, a "difference" will still show up in the DeepDiff output.
The first inner loop doesn't really need enumerate. The tupley portion converts the tuple returned by popitem() back into dict form.
import copy

def translatorB2A():
    # each entry maps a key path in "b" to the equivalent key path in "a"
    return [{'b': ['F', 'E', 'H'], 'a': ['D', 'E']},
            {'b': ['G'], 'a': ['B']}]

def convertB2A(b):
    translator = translatorB2A()
    bNew = copy.deepcopy(b)
    for translation in translator:
        skip = False
        value = bNew
        lastKey = ""
        # walk down the "b" path, popping each level as we go
        for i, key in enumerate(translation['b']):
            if key in value.keys():
                lastKey = key
                value = value.pop(key)
            else:
                skip = True
                if lastKey:
                    bNew[lastKey] = value
                break
        newObj = {}
        if not skip:
            # rebuild the value under the "a" path, innermost key first
            for j, key in enumerate(reversed(translation['a'])):
                if j == 0:
                    newObj[key] = value
                else:
                    tupley = newObj.popitem()
                    newObj[key] = {tupley[0]: tupley[1]}
        bNew = bNew | newObj  # dict union operator, Python 3.9+
    return bNew
Now you can run it on some dictionaries that represent YAMLs
a = {'A':'yes',
'B': 2,
'C': [-1,0,2],
'D': {
'E': True
}}
b = {'A':'yes',
'G': 2,
'C': [-1,0,1],
'F': {
'E': {'H':False}
}}
bNew = convertB2A(b)
print(bNew)
print(DeepDiff(a, bNew, ignore_order=True, significant_digits=10, verbose_level=2).pretty())
should get a result like:
{'A': 'yes', 'C': [-1, 0, 1], 'D': {'E': False}, 'B': 2}
Value of root['C'][2] changed from 2 to 1.
Value of root['D']['E'] changed from True to False.
It would be nice if there were a way to insert the changed keys in the same order as in the "translated to" dictionary, i.e. to have b returned as
{'A': 'yes', 'B': 2, 'C': [-1, 0, 1], 'D': {'E': False}}
so that at a quick glance it has the same format as "a":
{'A': 'yes', 'B': 2, 'C': [-1, 0, 2], 'D': {'E': True}}
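One possible way to get that ordering is to rebuild the translated dictionary by walking the keys of "a". This is a sketch only: order_like is a made-up name, it reorders the top level only, and it relies on dicts preserving insertion order (Python 3.7+).

```python
def order_like(reference, target):
    """Return a copy of `target` with its top-level keys ordered as in `reference`.

    Hypothetical helper: keys missing from `reference` keep their
    relative order and are placed at the end.
    """
    # First the keys that both dicts share, in the reference order
    ordered = {k: target[k] for k in reference if k in target}
    # Then whatever is left over from the target
    ordered.update((k, v) for k, v in target.items() if k not in ordered)
    return ordered

a = {'A': 'yes', 'B': 2, 'C': [-1, 0, 2], 'D': {'E': True}}
bNew = {'A': 'yes', 'C': [-1, 0, 1], 'D': {'E': False}, 'B': 2}
print(order_like(a, bNew))
```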
I want to write code in Python that assigns a number to every alphabetical character, like so: a=0, b=1, c=2, ..., y=24, z=25. I would rather not set up a condition for every single letter, and I don't want my code to look over-engineered. I'd like to know the shortest (meaning the fewest lines of code), fastest and easiest ways to do this.
(What I have in mind is to create a dictionary for this purpose, but I wonder if there's a neater and better way.)
Any suggestions and tips are appreciated in advance.
You definitely want a dictionary for this, not to declare each as a variable. A simple way is to use a dictionary comprehension with string.ascii_lowercase as:
from string import ascii_lowercase
{v:k for k,v in enumerate(ascii_lowercase)}
# {'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4, 'f': 5...
Here's my two cents; a for loop will do the work:
d = {}  # empty dictionary
alpha = 'abcdefghijklmnopqrstuvwxyz'
for i in range(26):
    d[alpha[i]] = i  # key is the letter, value is its index in the alpha string
print(d)  # quick check that the dictionary has been created properly
One-liner with dict and enumerate:
# given
foo = 'abcxyz'
dict(enumerate(foo))
# returns: {0: 'a', 1: 'b', 2: 'c', 3: 'x', 4: 'y', 5: 'z'}
If you need the characters as the dictionary keys, what comes to mind is either a dict comprehension...
{letter: num for (num, letter) in enumerate(foo)}
# returns {'a': 0, 'b': 1, 'c': 2, 'x': 3, 'y': 4, 'z': 5}
... or map with a lambda...
dict(map(lambda x: (x[1], x[0]), enumerate(foo)))
# returns {'a': 0, 'b': 1, 'c': 2, 'x': 3, 'y': 4, 'z': 5}
I feel the dict comprehension is much more readable than map+lambda+enumerate.
There are already numbers associated with characters. You can use these code points with ord().
A short (in terms of lines) solution would be:
num_of = lambda s: ord(s) - 97  # 97 == ord("a")
A normal function would be easier to read:
def num_of(s):
    return ord(s) - 97
Usage:
num_of("a") # 0
num_of("z") # 25
If it must be a dictionary, you can create it without imports like this:
{chr(n): n - 97 for n in range(ord("a"), ord("z") + 1)}
I have two or more dictionaries that I would like to merge into one, keeping multiple values of the same key as a list. I am not able to share the original code, so please help me with the following example.
Input:
a= {'a':1, 'b': 2}
b= {'aa':4, 'b': 6}
c= {'aa':3, 'c': 8}
Output:
c= {'a':1,'aa':[3,4],'b': [2,6], 'c': 8}
I suggest you read up on defaultdict: it lets you provide a factory function that initializes missing keys, i.e. if a key is looked up but not found, it creates a value by calling the factory with no arguments (here, list()). See this example, it might make things clearer:
from collections import defaultdict
a = {'a': 1, 'b': 2}
b = {'aa': 4, 'b': 6}
c = {'aa': 3, 'c': 8}
stuff = [a, b, c]
# our factory method is the list-constructor `list`,
# so whenever we look up a value that doesn't exist, a list is created;
# we can always be sure that we have list-values
store = defaultdict(list)
for s in stuff:
    for k, v in s.items():
        # since we know that our value is always a list, we can safely append
        store[k].append(v)
print(store)
This has the "downside" of creating one-element lists for single occurrences of values, but maybe you are able to work around that.
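If the one-element lists are a problem, one way to unwrap them is a final dict comprehension. This is a minimal sketch; the function name merge is only illustrative, and it returns a plain dict instead of a defaultdict.

```python
from collections import defaultdict

def merge(*dicts):
    """Merge dicts, collecting values that share a key into a list (illustrative helper)."""
    store = defaultdict(list)
    for d in dicts:
        for k, v in d.items():
            store[k].append(v)
    # Unwrap lists that hold exactly one value; keep longer lists as they are
    return {k: vs[0] if len(vs) == 1 else vs for k, vs in store.items()}

a = {'a': 1, 'b': 2}
b = {'aa': 4, 'b': 6}
c = {'aa': 3, 'c': 8}
print(merge(a, b, c))
```

Note that a value that was already a one-element list in a source dict cannot be told apart from a wrapped single value after merging.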
The code below should resolve your issue:
from collections import defaultdict
a = {'a': 1, 'b': 2}
b = {'aa': 4, 'b': 6}
c = {'aa': 3, 'c': 8}
dd = defaultdict(list)
for d in (a, b, c):
    for key, value in d.items():
        dd[key].append(value)
print(dd)
Use defaultdict to automatically create a dictionary entry with an empty list. To process all source dictionaries in a single loop, use itertools.chain. The main loop just appends the value from the current item to the list under the current key.
As you wrote, keys that end up with only one value should hold that value directly, not a one-element list. To handle this, build a work dictionary (using a dictionary comprehension) limited to the items whose list contains exactly one element, mapping each such key to that single element, and then use this dictionary to update d.
So the whole script can be surprisingly short, as below:
from collections import defaultdict
from itertools import chain
a = {'a':1, 'b': 2}
b = {'aa':4, 'b': 6}
c = {'aa':3, 'c': 8}
d = defaultdict(list)
for k, v in chain(a.items(), b.items(), c.items()):
    d[k].append(v)
d.update({k: v[0] for k, v in d.items() if len(v) == 1})
As you can see, the actual processing code is contained in only 4 (last) lines.
If you print d, the result is:
defaultdict(list, {'a': 1, 'b': [2, 6], 'aa': [4, 3], 'c': 8})
I am trying to update the values in a dictionary of dictionaries of uncertain depth. This dictionary of dictionaries is loaded via h5py. As a MWE one could try:
C = {'A':{'B':12, 'C':13, 'D':{'E':20}}, 'F':14, 'G':{'H':15}}
So when I would like to set the Value of 'E' to 42 I would use
C['A']['D']['E'] = 42
This part is easy, so now let's suppose I am trying to set a value in this dict that is given by a user in the form:
([keys], value) -> e.g. (['A','D','E'], 42)
How could I set this value? Do I have to write a setter function for each possible depth?
I tried using the update function of dict like this:
def dod(keys, value):
    """ Generate a dict of dicts from a list of keys and one value """
    if len(keys) == 1:
        return {keys[0]: value}
    else:
        return {keys[0]: dod(keys[1:], value)}
C.update(dod(['A', 'D', 'E'], 42))
But using this deletes all the elements from depth two onwards, resulting in:
C = {'A': {'D': {'E': 42}}, 'G': {'H': 15}, 'F': 14}
Did I make a mistake or is there even a simpler way of setting a value with unknown depth?
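The elements disappear because dict.update is shallow: it replaces the whole value stored under 'A' with the new nested dict. A recursive merge avoids that. The sketch below uses a hypothetical helper named deep_update and assumes C is made of plain dicts (objects loaded via h5py may behave differently).

```python
def deep_update(target, updates):
    """Recursively merge `updates` into `target` in place (hypothetical helper)."""
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are dicts: descend instead of overwriting
            deep_update(target[key], value)
        else:
            target[key] = value
    return target

C = {'A': {'B': 12, 'C': 13, 'D': {'E': 20}}, 'F': 14, 'G': {'H': 15}}
deep_update(C, {'A': {'D': {'E': 42}}})
print(C)
```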
This answer simply implements the methodology found here for your particular example:
from functools import reduce
import operator
def getFromDict(dataDict, mapList):
    return reduce(operator.getitem, mapList, dataDict)
def setInDict(dataDict, mapList, value):
    getFromDict(dataDict, mapList[:-1])[mapList[-1]] = value
C = {'A':{'B':12, 'C':13, 'D':{'E':20}}, 'F':14, 'G':{'H':15}}
setInDict(C, ['A','D','E'], 42)
print(C)
Yields:
{'A': {'B': 12, 'C': 13, 'D': {'E': 42}}, 'F': 14, 'G': {'H': 15}}
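The reduce-based setter raises KeyError when an intermediate key does not exist yet. If missing levels should be created on the fly, a loop with dict.setdefault is one option. This is a sketch with a made-up name, set_in_dict, and it assumes every intermediate value is a plain dict.

```python
def set_in_dict(data, keys, value):
    """Set data[k0][k1]...[kn] = value, creating missing intermediate dicts (illustrative)."""
    node = data
    for key in keys[:-1]:
        # setdefault returns the existing child, or inserts and returns a new empty dict
        node = node.setdefault(key, {})
    node[keys[-1]] = value

C = {'A': {'B': 12, 'C': 13, 'D': {'E': 20}}, 'F': 14, 'G': {'H': 15}}
set_in_dict(C, ['A', 'D', 'E'], 42)
set_in_dict(C, ['X', 'Y'], 1)
print(C)
```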
Short Version:
In Python is there a way to (cleanly/elegantly) say "Give me these 5 (or however many) properties of an object, and nothing else, as a dictionary"?
Longer Version:
Using the JavaScript Underscore library, I can reduce a bunch of objects/dictionaries (in JS they're the same thing) to a bunch of subsets of their properties like so:
var subsets = _(someObjects).map(function(someObject) {
    return _(someObject).pick(['a', 'd']);
});
If I want to do the same thing with a Python object (not a dictionary) however it seems like the best I can do is use a list comprehension and manually set each property:
subsets = [{"a": x.a, "d": x.d} for x in someObjects]
That doesn't look so bad when there are only two properties, and they're both one letter, but it gets uglier fast if I start having more/longer properties (plus I feel wrong whenever I write a multi-line list comprehension). I could turn the whole thing into a function that uses a for loop, but before I do that, is there any cool built-in Python utility that I can use to do this as cleanly as (or even more cleanly than) the JS version?
This can be done simply by combining a list comprehension with a dictionary comprehension.
subsets = [{attr: getattr(x, attr) for attr in ["a", "d"]}
for x in someObjects]
Naturally, you could factor that comprehension out into a function if you wanted to:
def pick(x, *attrs):
    return {attr: getattr(x, attr) for attr in attrs}
subsets = [pick(x, "a", "d") for x in someObjects]
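The standard library's operator.attrgetter can do the attribute lookup as well. The following is only a sketch: pick_attrs is a made-up name, and SimpleNamespace stands in for whatever objects you actually have.

```python
from operator import attrgetter
from types import SimpleNamespace

def pick_attrs(obj, *attrs):
    """Return the named attributes of `obj` as a dict (illustrative helper)."""
    values = attrgetter(*attrs)(obj)
    # attrgetter returns a bare value, not a tuple, when given a single name
    if len(attrs) == 1:
        values = (values,)
    return dict(zip(attrs, values))

# Stand-in objects for demonstration
some_objects = [SimpleNamespace(a=1, b=2, d=4), SimpleNamespace(a=10, b=20, d=40)]
subsets = [pick_attrs(x, "a", "d") for x in some_objects]
print(subsets)
```

attrgetter also accepts dotted names such as "owner.name", which getattr alone does not.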
>>> A = ['a', 'c']
>>> O = [{'a': 1, 'b': 2, 'c': 3}, {'a': 11, 'b': 22, 'c': 33, 'd': 44}]
>>> [{a: o[a] for a in A} for o in O]
[{'a': 1, 'c': 3}, {'a': 11, 'c': 33}]
>>> list(map(lambda o: {a: o[a] for a in A}, O))
[{'a': 1, 'c': 3}, {'a': 11, 'c': 33}]