Reformatting a dict where the values have a dict-like relationship - python

I have a defaultdict that looks like this:
d = { 'ID_001': ['A', 'A_part1', 'A_part2'],
      'ID_002': ['A', 'A_part3'],
      'ID_003': ['B', 'B_part1', 'B_part2', 'A', 'A_part4'],
      'ID_004': ['C', 'C_part1', 'A', 'A_part5', 'B', 'B_part3']
    }
Before I go any further, I should say that A_part1 isn't the actual string -- the real strings are just runs of alphanumeric characters; I've written them this way to show that A_part1 is text associated with A, if you see what I mean.
Standing back and looking at it, what I really have is a dict where the values have their own key/value relationship, but that relationship exists only in the order they appear in the list.
I am attempting to end up with something like this:
['ID_001 A A_part1 A_part2',
 'ID_002 A A_part3',
 'ID_003 B B_part1 B_part2',
 'ID_003 A A_part4',
 'ID_004 C C_part1',
 'ID_004 A A_part5',
 'ID_004 B B_part3']
I have made a variety of attempts; I keep wanting to run through the dict's values, noting the character in the first position (e.g., the A) and collecting values until I find a B or a C, then stop collecting and append what I have to a list declared elsewhere. Ad nauseam.
I'm running into all sorts of problems, not least of which is bloated code. I can't find a clean way to iterate through the values, and I invariably seem to run into index errors.
If anyone has any ideas/philosophy/comments I'd be grateful.

What about something like:
d = { 'ID_001': ['A', 'A_part1', 'A_part2'],
      'ID_002': ['A', 'A_part3'],
      'ID_003': ['B', 'B_part1', 'B_part2', 'A', 'A_part4'],
      'ID_004': ['C', 'C_part1', 'A', 'A_part5', 'B', 'B_part3']
    }

def is_key(s):
    return s in ['A', 'B', 'C']

out = {}
for k, v in d.items():
    key = None
    for e in v:
        if is_key(e):
            key = e
        else:
            out_key = (k, key)
            out[out_key] = out.get(out_key, []) + [e]
which generates:
{('ID_001', 'A'): ['A_part1', 'A_part2'],
 ('ID_002', 'A'): ['A_part3'],
 ('ID_003', 'A'): ['A_part4'],
 ('ID_003', 'B'): ['B_part1', 'B_part2'],
 ('ID_004', 'A'): ['A_part5'],
 ('ID_004', 'B'): ['B_part3'],
 ('ID_004', 'C'): ['C_part1']}
It's important that you update the is_key function to match your actual input.
Also, the variable names are far from optimal, but I'm not really sure what you're doing -- you should be able to (and should) give them more appropriate names.

This may not be in the order you want, but hopefully it spares you further headaches.
d = { 'ID_001': ['A', 'A_part1', 'A_part2'],
      'ID_002': ['A', 'A_part3'],
      'ID_003': ['B', 'B_part1', 'B_part2', 'A', 'A_part4'],
      'ID_004': ['C', 'C_part1', 'A', 'A_part5', 'B', 'B_part3']
    }
rst = []
for o in d:
    t_d = {}
    for t_o in d[o]:
        if not t_o[0] in t_d:
            t_d[t_o[0]] = [t_o]
        else:
            t_d[t_o[0]].append(t_o)
    for t_o in t_d:
        rst.append(' '.join([o, t_d[t_o][0], ', '.join(t_d[t_o][1:])]))
print(rst)
https://ideone.com/FeBDLA
['ID_004 C C_part1', 'ID_004 A A_part5', 'ID_004 B B_part3', 'ID_003 A A_part4', 'ID_003 B B_part1, B_part2', 'ID_002 A A_part3', 'ID_001 A A_part1, A_part2']

Whenever you're trying to do something involving contiguous groups, you should think of itertools.groupby. You weren't very specific about what condition separates the groups, but if we take "the character in the first position" at face value:
from itertools import groupby

new_list = []
for key, sublist in sorted(d.items()):
    for _, group in groupby(sublist, key=lambda x: x[0]):
        new_list.append(' '.join([key] + list(group)))
produces
>>> for elem in new_list:
... print(elem)
...
ID_001 A A_part1 A_part2
ID_002 A A_part3
ID_003 B B_part1 B_part2
ID_003 A A_part4
ID_004 C C_part1
ID_004 A A_part5
ID_004 B B_part3

Related

Python dict easiest and cleanest way to get value of key2 if key1 is not present

I have a Python dict like this:
d = {
    "k1": "v1",
    "k2": "v2"
}
I want to pick up value of k1 from dict, which I can do like this,
d.get("k1")
But the problem is, sometimes k1 will be absent in the dict. In that case, I want to pick k2 from the dict. I do it like this now
val = d.get("k1", None)
if not val:
    val = d.get("k2", None)
I can do like this as well,
if "k1" in d:
val = d['k1']
else:
val = d.get("k2", None)
These solutions look okay and work as expected; I was wondering if there is a one-liner solution to this problem.
The second argument to d.get() specifies what to return if the key isn't found. Simply pass another d.get() instead of None.
d.get('k1', d.get('k2', None))
# 'v1'
A d.get inside a d.get -- try a double dict.get:
val = d.get("k1", d.get("k2", None))
And now:
print(val)
Would give k1's value if there is a k1 and k2's value if there isn't a key named k1, but if there also isn't a k2, it gives None.
This code does a d.get, but where the second argument was None, we pass another d.get for k2; that inner d.get finally has None as its second argument, so the whole expression only gives None if neither key is in d.
Edit:
If there are more keys (e.g. k3, k4, ...), just add more d.gets:
val = d.get("k1", d.get("k2", d.get("k3", d.get("k4"))))
And just add a print(val) to output it.
Using a generator with next:
You could also use a generator with next to get the first value, like this:
val = next((d[i] for i in ['k1', 'k2', 'k3', 'k4'...] if i in d), None)
And now:
print(val)
Would also give the right result.
Remember to add a None so that if there aren't any values from any of those keys it won't give a StopIteration.
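To illustrate the StopIteration point, here is a small demonstration of my own (not from the answer above):

```python
d = {"k2": "v2"}

# With a default, a complete miss returns None instead of raising:
val = next((d[k] for k in ["k9", "k8"] if k in d), None)
print(val)  # None

# Without a default, the exhausted generator raises StopIteration:
try:
    next(d[k] for k in ["k9", "k8"] if k in d)
except StopIteration:
    print("no matching key")
```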
If there's no built-in function which does what you want, then you can write your own; you only have to write it once, and then everywhere else that you actually use it, it's a one-liner.
def get_alt(d, *keys, default=None):
    return next((d[k] for k in keys if k in d), default)
Examples:
>>> my_dict = {'a': 1, 'b': 2, 'c': 3}
>>> get_alt(my_dict, 'a', 'b', 'c')
1
>>> get_alt(my_dict, 'd', 'b', 'c')
2
>>> get_alt(my_dict, 'd', 'e', 'c')
3
>>> get_alt(my_dict, 'd', 'e', 'f') is None
True
>>> get_alt(my_dict, 'd', 'e', 'f', default=4)
4
val = d.get("k1") or d.get("k2")
This is equivalent to your "I do it like this now" solution. None is the get's default, no need to specify it. And unlike the nested gets posted by others, this doesn't execute both gets if the first one succeeds.
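One caveat worth spelling out (my own note, not part of the answer): `or` falls through on any falsy value, not just a missing key, which may or may not be what you want:

```python
d = {"k1": "", "k2": "v2"}

# k1 is present but its value "" is falsy, so `or` falls through to k2:
val_or = d.get("k1") or d.get("k2")
print(val_or)  # v2

# A presence check keeps the empty string:
val_strict = d["k1"] if "k1" in d else d.get("k2")
print(repr(val_strict))  # ''
```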

How to call each value of a dictionary by its key?

I want to separate each (key, value) pair from a dictionary and want to call each value by its key name.
I have two lists,
1. ListA = [1, 2, 3, 4, 5]
2. ListB = ['A', 'B', 'C', 'D', 'E']
Now I have created a dictionary like this,
Dict = {1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E'}
So now, if I want to see the value for a key, I have to type Dict[key].
What I expect instead: if I ask for the value of a key, I should have to type just key, not Dict[key], and it should give me the answer.
Typing 1 should give A.
Typing 2 should give B.
You can do something like this by setting each key as an attribute of an object, but honestly it's not a very good idea if you can simply access the dict directly.
Note that it won't work for the dictionary you have; my input dictionary is {"key1": "Apple"}.
class MyDict:
    def __init__(self, input_dict):
        for key, value in input_dict.items():
            setattr(self, key, value)  # set each key as an attribute

obj = MyDict({"key1": "Apple"})  # new instance
print(obj.key1)  # this will print Apple
But it still won't work for something like obj.1 (attribute names can't start with a digit), so it's better not to go down this road.
If you want to loop over dictionary values, you can use Dict.values():
for v in Dict.values():
    print(v)
# > 'A'
# > 'B'
# ...
Inside a while True loop, keep popping items from your lists until you run into an IndexError, like this:
ListA = [1, 2, 3]
ListB = ['A', 'B', 'C']
DictC = {}
while True:
    try:
        DictC[ListA.pop(0)] = ListB.pop(0)
    except IndexError:
        break
print(DictC)

Finding common value pair fields from numerous json in Python

I have a list of JSON files. I intend to find all the key/value pairs common to all of these JSONs and copy them to a different JSON. The common pairs should also be removed from all of the original JSONs.
Let's say I have a.json, b.json, c.json ... z.json.
Now, say the common key/value pair in all of them is
"Town" : "New York"
Then this common element should be moved to a new JSON file called common.json and removed from all the original JSON files.
An example JSON file would look like:
{
    "RepetitionTime": 2,
    "EchoTime": 0,
    "MagneticFieldStrength": 3,
    "SequenceVariant": "SK",
    "MRAcquisitionType": "2D",
    "FlipAngle": 90,
    "ScanOptions": "FS",
    "SliceTiming": [[0.0025000000000000022], [0.5], [-0.030000000000000027], [0.46625], [-0.06374999999999997], [0.43375000000999997], [-0.09624999999999995], [0.40000000001], [-0.12999999999], [0.36750000001], [-0.16249999998999998], [0.333750000005], [-0.19624999999500004], [0.301250000005], [-0.228749999995], [0.26749999999999996], [-0.26249999999500007], [0.235], [-0.29500000000000004], [0.20124999999999998], [-0.32875], [0.16875000001], [-0.36124999999999996], [0.13500000001], [-0.39499999999], [0.10250000000999998], [-0.42749999999], [0.06875000000499998], [-0.46124999999500005], [0.036250000005000005]],
    "SequenceName": "epfid2d1_64",
    "ManufacturerModelName": "TrioTim",
    "TaskName": "dis",
    "ScanningSequence": "EP",
    "Manufacturer": "SIEMENS"
}
The way I'm thinking about it is too complex: take each line of the first JSON file and check it against all the other JSONs. There should be something easier and more efficient. Any pointers?
To compare all files at once, you can use sets and intersect all the key/value pairs with &:
>>> import json
>>> json_dict1 = json.loads('{"a":1, "b":2}')
>>> json_dict2 = json.loads('{"a":1, "b":4, "c":5}')
>>> json_dict3 = json.loads('{"a":1, "b":2, "c":5}')
>>> a = set(json_dict1.items())
>>> b = set(json_dict2.items())
>>> c = set(json_dict3.items())
>>> a & b & c
{('a', 1)}
Note that you can also do other operations with sets; here's an example from the docs:
>>> a = set('abracadabra')
>>> b = set('alacazam')
>>> a # unique letters in a
{'a', 'r', 'b', 'c', 'd'}
>>> a - b # letters in a but not in b
{'r', 'd', 'b'}
>>> a | b # letters in either a or b
{'a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'}
>>> a & b # letters in both a and b
{'a', 'c'}
>>> a ^ b # letters in a or b but not both
{'r', 'd', 'b', 'm', 'z', 'l'}
EDIT
I eventually asked my own SO question based on (almost) the same problem. Here is an overview of the best response:
>>> list_dict = [json_dict1, json_dict2, json_dict3]
>>> {k: v
...  for k, v in list_dict[0].items()
...  if all(k in d and d[k] == v
...         for d in list_dict[1:])}
{'a': 1}
Since you didn't provide your exact JSON sample, I assume it is just regular JSON like '{"key":"value"}'.
Convert the JSON string to a dictionary:
import json
json_dict = json.loads('{"a":1, "b":2}')  # converts a JSON string to a dictionary
Now assume we have two converted dictionaries:
>>> dict1 = {"a":1, "b":2}
>>> dict2 = {"a":1, "b":3}
Compare the two dictionaries and find the common key/value pairs (and similarly the differing pairs); I am using Python 3:
>>> {k:v for k, v in dict1.items() for k1,v1 in dict2.items() if k ==k1 and v==v1}
{'a': 1}
This post shows the idea of how to solve your issue; it might have edge cases for your specific JSON lines, so modify it to fit your needs. Hope it helps.

python dictionary with list as values

I have a list of string like
vals = ['a', 'b', 'c', 'd',........]
Using vals, I would make a request to a server with each vals[i]. The server would return a number for each of the vals: it could be 1 for 'a', 2 for 'b', 1 again for 'c', 2 again for 'd', and so on.
Now I want to create a dictionary that should look like
{ 1: ['a','c'], 2:['b','d'], 3: ['e'], ..... }
What is the quickest possible way to achieve this? Could I use map() somehow?
I mean, I could do this by storing the results of the requests in a separate list and then mapping them one by one -- but I am trying to avoid that.
The following should work, using dict.setdefault():
results = {}
for val in vals:
    i = some_request_to_server(val)
    results.setdefault(i, []).append(val)
results.setdefault(i, []).append(val) is equivalent in behavior to the following code:
if i in results:
    results[i].append(val)
else:
    results[i] = [val]
Alternatively, you can use defaultdict from collections like so:
from collections import defaultdict

results = defaultdict(list)
for val in vals:
    i = some_request_to_server(val)
    results[i].append(val)

list to dictionary conversion with multiple values per key?

I have a Python list which holds pairs of key/value:
l = [[1, 'A'], [1, 'B'], [2, 'C']]
I want to convert the list into a dictionary, where multiple values per key would be aggregated into a tuple:
{1: ('A', 'B'), 2: ('C',)}
The iterative solution is trivial:
l = [[1, 'A'], [1, 'B'], [2, 'C']]
d = {}
for pair in l:
    if pair[0] in d:
        d[pair[0]] = d[pair[0]] + tuple(pair[1])
    else:
        d[pair[0]] = tuple(pair[1])
print(d)
{1: ('A', 'B'), 2: ('C',)}
Is there a more elegant, Pythonic solution for this task?
from collections import defaultdict

d1 = defaultdict(list)
for k, v in l:
    d1[k].append(v)
d = dict((k, tuple(v)) for k, v in d1.items())
d now contains {1: ('A', 'B'), 2: ('C',)}
d1 is a temporary defaultdict with lists as values, which will be converted to tuples in the last line. This way you are appending to lists and not recreating tuples in the main loop.
Using lists instead of tuples as dict values:
l = [[1, 'A'], [1, 'B'], [2, 'C']]
d = {}
for key, val in l:
    d.setdefault(key, []).append(val)
print(d)
Using a plain dictionary is often preferable over a defaultdict, in particular if you build it just once and then continue to read from it later in your code:
First, the plain dictionary is faster to build and access.
Second, and more importantly, the later read operations will error out if you try to access a key that doesn't exist, instead of silently creating that key. A plain dictionary lets you explicitly state when you want to create a key-value pair, while the defaultdict always implicitly creates them, on any kind of access.
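A quick demonstration of that second point (my example, not the answer's): merely reading a missing key from a defaultdict silently inserts it, while a plain dict raises:

```python
from collections import defaultdict

dd = defaultdict(list)
dd[1].append('A')

_ = dd[99]       # a read on a missing key...
print(99 in dd)  # True -- the key was silently created

d = {1: ['A']}
try:
    d[99]        # a plain dict surfaces the mistake instead
except KeyError:
    print("KeyError")
```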
This method is relatively efficient and quite compact:
reduce(lambda x, (k,v): x[k].append(v) or x, l, defaultdict(list))
In Python3 this becomes (making exports explicit):
dict(functools.reduce(lambda x, d: x[d[0]].append(d[1]) or x, l, collections.defaultdict(list)))
Note that reduce has moved to functools and that lambdas no longer accept tuples. This version still works in 2.6 and 2.7.
Are the keys already sorted in the input list? If so, you have a functional solution (note that itertools.groupby only groups consecutive equal keys):
import itertools

lst = [(1, 'A'), (1, 'B'), (2, 'C')]
dct = dict((key, tuple(v for (k, v) in pairs))
           for (key, pairs) in itertools.groupby(lst, lambda pair: pair[0]))
print(dct)
# {1: ('A', 'B'), 2: ('C',)}
I had a list of values created as follows:
performance_data = driver.execute_script('return window.performance.getEntries()')
Then I had to store the data (name and duration) in a dictionary with multiple values:
dictionary = {}
for _ in range(3):  # repeat the measurement three times
    driver.get(self.base_url)
    performance_data = driver.execute_script('return window.performance.getEntries()')
    for result in performance_data:
        key = result['name']
        val = result['duration']
        dictionary.setdefault(key, []).append(val)
print(dictionary)
My data was in a pandas DataFrame:
myDict = dict()
for id in set(data['id'].values):
    temp = data[data['id'] == id]
    myDict[id] = temp['IP_addr'].to_list()
myDict
This gave me a dict mapping each ID key to one or more IP_addrs. The first IP_addr is guaranteed; the code should work even if temp['IP_addr'].to_list() == [].
{'fooboo_NaN': ['1.1.1.1', '8.8.8.8']}
My two cents to toss into this amazing discussion.
I've tried to work out a one-line solution using only the standard library. Excuse the two extra imports. Perhaps the code below solves the issue with satisfying quality (for Python 3):
from functools import reduce
from collections import defaultdict

a = [1, 1, 2, 3, 1]
b = ['A', 'B', 'C', 'D', 'E']
c = zip(a, b)
print({**reduce(lambda d, e: d[e[0]].append(e[1]) or d, c, defaultdict(list))})
