Unique combinations of values in a dictionary - Python

For the following example dictionary, is there a built-in method to get all unique combinations?
a = {
    "a": ["a_1", "a_2"],
    "b": ["b_1", "b_2"]
}
Output:
[
    ["a_1", "b_1"],
    ["a_1", "b_2"],
    ["a_2", "b_1"],
    ["a_2", "b_2"]
]

I did this with itertools.product():
import itertools
a = {
    "a": ["a_1", "a_2"],
    "b": ["b_1", "b_2"]
}
# Unpack the value lists so product() picks one item from each key
print(list(itertools.product(*a.values())))
Output:
[('a_1', 'b_1'), ('a_1', 'b_2'), ('a_2', 'b_1'), ('a_2', 'b_2')]
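If it also matters which key each value came from, the keys can be zipped back onto every tuple that itertools.product() yields. This is only a minimal sketch: the helper name combinations_as_dicts is hypothetical (it is not part of itertools), and it assumes every value in the dictionary is a list.

```python
import itertools

def combinations_as_dicts(mapping):
    """Yield one dict per combination, keeping the original keys (hypothetical helper)."""
    keys = list(mapping)
    for combo in itertools.product(*mapping.values()):
        # combo has one item per key, in the same order as keys
        yield dict(zip(keys, combo))

a = {"a": ["a_1", "a_2"], "b": ["b_1", "b_2"]}
result = list(combinations_as_dicts(a))
print(result)
```

Because dictionaries preserve insertion order (Python 3.7+), the keys and the tuple positions line up without any sorting.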

Related

How to count how often a particular key appears in a dict in Python

So I have a dict like this:
{
    "channel_list": [
        {
            "channel_index": 0,
            "channel_sth": "A",
        },
        {
            "channel_index": 1,
            "channel_sth": "B",
        }
    ]
}
and I would like to count how often the key "channel_index" appears in that dict.
How can I do it?

You could use the sum() function with a generator expression:
my_dict = {
    "channel_list": [
        {
            "channel_index": 0,
            "channel_sth": "A",
        },
        {
            "channel_index": 1,
            "channel_sth": "B",
        }
    ]
}
def count_keys(my_dict, key):
    # Each `key in channel` test is True (1) or False (0), so sum() counts the matches
    count = sum(key in channel for channel in my_dict["channel_list"])
    return count
print(count_keys(my_dict, "channel_index"))
Output:
2

The simple answer is to create a counter variable and write a for loop that increments it by 1 every time the "channel_index" key is found in one of the dictionaries in the list, like this:
# example_dict is the dictionary from the question
channel_index_count = 0
for channel in example_dict['channel_list']:
    if 'channel_index' in channel:  # True when the key exists, even if its value is 0
        channel_index_count += 1
print(channel_index_count)
There are definitely more concise ways of doing this, but this is the easiest to follow.

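import ast
# Read the dictionary literal from stdin; ast.literal_eval is safer than eval
d1 = ast.literal_eval(input())
def dict_keys_counts(d1):
    # Collect every key from every dict in the list
    list1 = []
    for i in range(len(d1["channel_list"])):
        for j in d1["channel_list"][i]:
            list1.append(j)
    # Print each distinct key once, together with its count
    list2 = []
    for k in list1:
        if k not in list2:
            list2.append(k)
            print(k, list1.count(k))
dict_keys_counts(d1)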
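The same per-key tally can be sketched with collections.Counter instead of two helper lists. This is one possible variant, not the answerer's method; the dictionary is copied from the question and the name key_counts is chosen here for illustration.

```python
from collections import Counter
from itertools import chain

d1 = {
    "channel_list": [
        {"channel_index": 0, "channel_sth": "A"},
        {"channel_index": 1, "channel_sth": "B"},
    ]
}

# Iterating a dict yields its keys, so chaining the dicts yields every key once per dict
key_counts = Counter(chain.from_iterable(d1["channel_list"]))

for key, count in key_counts.items():
    print(key, count)
```

Counter returns 0 for a key that never occurs, so key_counts["missing"] needs no special handling.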

Count number of objects in list of dictionary where a key's value is more than 1

Given a list of dictionaries:
data = {
    "data": [
        {
            "categoryOptionCombo": {
                "id": "A"
            },
            "dataElement": {
                "id": "123"
            }
        },
        {
            "categoryOptionCombo": {
                "id": "B"
            },
            "dataElement": {
                "id": "123"
            }
        },
        {
            "categoryOptionCombo": {
                "id": "C"
            },
            "dataElement": {
                "id": "456"
            }
        }
    ]
}
I would like to display the dataElement where the count of distinct categoryOptionCombo is larger than 1.
e.g. the result of the function would be an iterable of IDs:
[123]
because the dataElement with id 123 has two different categoryOptionCombos.
tracker = {}
for d in data['data']:
    data_element = d['dataElement']['id']
    coc = d['categoryOptionCombo']['id']
    if data_element not in tracker:
        tracker[data_element] = set()
    tracker[data_element].add(coc)
too_many = [key for key, value in tracker.items() if len(value) > 1]
How can I iterate the list of dictionaries preferably with a comprehension? This solution above is not pythonic.

One approach:
import collections
counts = collections.defaultdict(set)
for d in data["data"]:
    counts[d["dataElement"]["id"]].add(d["categoryOptionCombo"]["id"])
res = [k for k, v in counts.items() if len(v) > 1]
print(res)
Output:
['123']
This approach creates a dictionary mapping each dataElement id to the set of its distinct categoryOptionCombo ids:
defaultdict(<class 'set'>, {'123': {'B', 'A'}, '456': {'C'}})

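Almost a one-liner:
import collections
counts = collections.Counter(d['dataElement']['id'] for d in data['data'])
print(counts)
Output:
Counter({'123': 2, '456': 1})
Note that this counts records per dataElement, so it equals the number of distinct categoryOptionCombos only when no (dataElement, categoryOptionCombo) pair is repeated in the data.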
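The Counter above counts records, so a repeated record would inflate the number. A minimal sketch of one way around that is to deduplicate the (dataElement, categoryOptionCombo) pairs with a set comprehension first. The fourth record below is a hypothetical duplicate added only to show the effect, and the names pairs and too_many are chosen for illustration.

```python
from collections import Counter

data = {
    "data": [
        {"categoryOptionCombo": {"id": "A"}, "dataElement": {"id": "123"}},
        {"categoryOptionCombo": {"id": "B"}, "dataElement": {"id": "123"}},
        {"categoryOptionCombo": {"id": "C"}, "dataElement": {"id": "456"}},
        # Hypothetical repeated record, not in the original question
        {"categoryOptionCombo": {"id": "C"}, "dataElement": {"id": "456"}},
    ]
}

# A set keeps each (dataElement, categoryOptionCombo) pair only once
pairs = {
    (d["dataElement"]["id"], d["categoryOptionCombo"]["id"])
    for d in data["data"]
}
# Count how many distinct combos each dataElement has
counts = Counter(element_id for element_id, _ in pairs)
too_many = [element_id for element_id, n in counts.items() if n > 1]
print(too_many)  # ['123']
```

Without the deduplication step, '456' would also be reported, because it appears in two records even though both carry the same combo.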

No need for sets: you can just remember each data element's first coc, or mark it as having 'multiple'.
tracker = {}
for d in data['data']:
    data_element = d['dataElement']['id']
    coc = d['categoryOptionCombo']['id']
    # setdefault() stores coc on first sight and otherwise returns the stored value
    if tracker.setdefault(data_element, coc) != coc:
        tracker[data_element] = 'multiple'
too_many = [key for key, value in tracker.items() if value == 'multiple']
(If the string 'multiple' can be a coc id, then use multiple = object() and compare with is.)
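The sentinel variant mentioned in the parenthetical might look like the sketch below. The name MULTIPLE and the hypothetical third record, whose coc id is literally the string 'multiple', are assumptions made here to show why an identity check is safer than a string comparison.

```python
MULTIPLE = object()  # unique sentinel; no real id can be this object

data = {
    "data": [
        {"categoryOptionCombo": {"id": "A"}, "dataElement": {"id": "123"}},
        {"categoryOptionCombo": {"id": "B"}, "dataElement": {"id": "123"}},
        # Hypothetical record whose coc id happens to be the string 'multiple'
        {"categoryOptionCombo": {"id": "multiple"}, "dataElement": {"id": "456"}},
    ]
}

tracker = {}
for d in data["data"]:
    element_id = d["dataElement"]["id"]
    coc = d["categoryOptionCombo"]["id"]
    # Store the first coc seen; afterwards get back whatever is stored
    seen = tracker.setdefault(element_id, coc)
    if seen is not MULTIPLE and seen != coc:
        tracker[element_id] = MULTIPLE

too_many = [key for key, value in tracker.items() if value is MULTIPLE]
print(too_many)  # ['123']
```

With the string marker, '456' would have been reported by mistake; with the sentinel it is not, because its stored value is the string and not the MULTIPLE object.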

Querying a large MongoDB collection using pymongo

I want to query my MongoDB collection, which has more than 5k records. Each record has key-value pairs like
{
    "A": "unique-value1",
    "B": "service1",
    "C": 1.2321,
    ...
},
...
Here A always has a unique value, B has a value like service1, service2, ... service8, and C is some float value.
What I want is to get records like these, with only these key-value pairs:
{
    "A": "unique-value1",
    "B": "service1",
    "C": 1.2321
}
{
    "A": "unique-value2",
    "B": "service2",
    "C": 0.2321
}
{
    "A": "unique-value3",
    "B": "service1",
    "C": 3.2321
}
I am not sure how to do this. Earlier I used MapReduce, but at that time I only needed to generate records with the A and C key-value pairs. Now that I also need B, I do not know what I should do.
This is what I was doing:
map_reduce = Code("""
    function () {
        emit(this.A, parseFloat(this.C));
    }
""")
result = my_collection.map_reduce(map_reduce, reduce, out='temp_collection')
for doc in result.find({}):
    out = dict()
    out[doc['_id']] = doc['_id']
    out['cost'] = doc['value']
    out_handle.update_one(
        {'A': doc['_id']},
        {'$set': out},
        upsert=True
    )

Unless I've misunderstood what you need, it looks like you are making this harder than it needs to be. Just project the keys you want using the second parameter of the find method.
for record in db.testcollection.find({}, {'A': 1, 'B': 1, 'C': 1}):
    db.existingornewcollection.replace_one({'_id': record['_id']}, record, upsert=True)
Full example:
from pymongo import MongoClient
from bson.json_util import dumps
db = MongoClient()['testdatabase']
db.testcollection.insert_one({
    "A": "unique-value1",
    "B": "service1",
    "C": 1.2321,
    "D": "D",
    "E": "E",
    "F": "F",
})
# The projection keeps only A, B and C (plus _id, which is included by default)
for record in db.testcollection.find({}, {'A': 1, 'B': 1, 'C': 1}):
    db.existingornewcollection.replace_one({'_id': record['_id']}, record, upsert=True)
print(dumps(db.existingornewcollection.find_one({}, {'_id': 0}), indent=4))
gives:
{
    "A": "unique-value1",
    "B": "service1",
    "C": 1.2321
}

How to interpret a string to define a dictionary call?

I am attempting to pass a string into a function; the string will be interpreted to determine the dictionary call required to update a dictionary.
Here is an example of what I have so far, hard-coded:
import json
from collections import defaultdict
def default_dict():
    return defaultdict(default_dict)
def build_dict():
    d["a"]["b"]["c"]["d"]["e"]["f"].update({})
    d["a"]["b"]["c1"]["d1"].update({})
    return json.dumps(d)
d = default_dict()
print(build_dict())
But to be useful to me, I want to pass strings in to the build_dict() function. Let's call each one 's':
for s in ["a/b/c/d/e/f", "a/b/c1/d1"]:
    print(build_dict(s))
Which should print the following (exactly as it does in the example I hard-coded):
{
    "a": {
        "b": {
            "c": {
                "d": {
                    "e": {
                        "f": {}
                    }
                }
            },
            "c1": {
                "d1": {}
            }
        }
    }
}
I have to make sure that multiple branches are supported in the way they are (as far as I have tested) in my hard-coded example.
What I am currently attempting:
Midway through constructing this question I found out about dpath, "A python library for accessing and searching dictionaries via /slashed/paths ala xpath". It looks like exactly what I need, so if I successfully work it out, I will post an answer to this question.

I worked out a solution to my own question.
import json
import dpath.util
def build_dict(viewsDict, viewsList):
    for views in viewsList:
        viewsDict = new_keys(viewsDict, views)
    return viewsDict
def new_keys(viewsDict, views):
    # Create the nested path in viewsDict, with an empty dict as the leaf
    dpath.util.new(viewsDict, views, {})
    return viewsDict
viewsDict = {}
viewsList = [
    "a/b/c/d/e/f",
    "a/b/c1/d1"
]
print(json.dumps(build_dict(viewsDict, viewsList), indent=4, sort_keys=True))

This builds a dict from a sequence of paths and passes your test case.
It builds the dictionary from the top down, adding new keys if they are missing and descending into the existing dictionary when they are present.
def build_dict(string_seq):
    d = {}
    for s in string_seq:
        active_d = d
        parts = s.split("/")
        for p in parts:
            if p not in active_d:
                active_d[p] = {}
            active_d = active_d[p]
    return d
expected = {
    "a": {
        "b": {
            "c": {
                "d": {
                    "e": {
                        "f": {}
                    }
                }
            },
            "c1": {
                "d1": {}
            }
        }
    }
}
string_seq = ["a/b/c/d/e/f", "a/b/c1/d1"]
result = build_dict(string_seq)
assert result == expected
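The membership test and the assignment in the top-down loop above can be folded into a single dict.setdefault() call. The following is a sketch of that variant under the same assumptions (paths separated by "/"); the sep parameter is an addition made here for illustration.

```python
import json

def build_dict(paths, sep="/"):
    """Build a nested dict from slash-separated paths (sep is an illustrative extra)."""
    root = {}
    for path in paths:
        node = root
        for part in path.split(sep):
            # Return the existing child dict, or create and return an empty one
            node = node.setdefault(part, {})
    return root

result = build_dict(["a/b/c/d/e/f", "a/b/c1/d1"])
print(json.dumps(result, indent=4, sort_keys=True))
```

Because setdefault() never replaces an existing entry, branches that share a prefix such as "a/b" are merged rather than overwritten.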

Dynamic approach to iterate nested dict and list of dict in Python

I am looking for a dynamic approach to solve my issue. I have a very complex structure, but for simplicity,
I have a dictionary structure like this:
dict1 = {
    "outer_key1": {
        "total": 5  # 1. I want the value of "total"
    },
    "outer_key2": [{
        "type": "ABC",  # 2. I want to count whole structures where type="ABC"
        "comments": {
            "nested_comment": [
                {
                    "key": "value",
                    "id": 1
                },
                {
                    "key": "value",
                    "id": 2
                }
            ]  # 3. Count the dicts inside this list.
        }
    }]
}
I want to iterate this dictionary and solve #1, #2 and #3.
My attempt to solve #1 and #3:
def getTotal(dict1):
    # for solving #1
    for key, val in dict1.items():
        if isinstance(val, dict):
            for k1 in val:
                if k1 == 'total':
                    total = val[k1]
                    print(total)  # gives output 5
        # for solving #3
        if isinstance(val, list):
            print(len(val[0]['comments']['nested_comment']))  # gives output 2
            # How can I get this dynamically?
Output:
total=5
2
Que 1: What is a pythonic way to get the total number of dictionaries under the "nested_comment" list?
Que 2: How can I get the total count of type where type="ABC"? (Note: type is a nested key under "outer_key2".)

Que 1: What is a pythonic way to get the total number of dictionaries under the "nested_comment" list?
Use Counter from the standard library.
from collections import Counter
my_list = [{'hello': 'world'}, {'foo': 'bar'}, 1, 2, 'hello']
# Dicts are unhashable, so count by type instead of putting the dicts themselves into the Counter
dict_count = Counter(type(x) for x in my_list)[dict]  # 2
Que 2: How can I get the total count of type where type="ABC"? (Note: type is a nested key under "outer_key2".)
It's not clear what you're asking for here. If by "total count" you are referring to the total number of comments in all dicts where "type" equals "ABC":
abcs = [x for x in dict1['outer_key2'] if x['type'] == 'ABC']
comment_count = sum([len(x['comments']['nested_comment']) for x in abcs])
But I've gotta say, that is some weird data you're dealing with.
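For the "dynamic" part of the question, one possible sketch is a recursive generator that walks nested dicts and lists and yields every value stored under a given key, wherever it sits. The function name find_key is hypothetical, and the sketch assumes the structure contains only dicts, lists and scalar values, as in the example.

```python
def find_key(obj, key):
    """Yield every value stored under `key` anywhere in nested dicts and lists."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key:
                yield v
            # Keep descending: the value itself may contain the key again
            yield from find_key(v, key)
    elif isinstance(obj, list):
        for item in obj:
            yield from find_key(item, key)

dict1 = {
    "outer_key1": {"total": 5},
    "outer_key2": [{
        "type": "ABC",
        "comments": {
            "nested_comment": [
                {"key": "value", "id": 1},
                {"key": "value", "id": 2},
            ]
        },
    }],
}

totals = list(find_key(dict1, "total"))                                 # 1
abc_count = sum(1 for t in find_key(dict1, "type") if t == "ABC")       # 2
comment_count = sum(len(c) for c in find_key(dict1, "nested_comment"))  # 3
print(totals, abc_count, comment_count)  # [5] 1 2
```

None of the three lookups hard-codes "outer_key1", "outer_key2" or the index 0, so they keep working if the keys move to a different depth.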

You got answers for #1 and #3; check this too:
dict1 = {
    "outer_key1": {
        "total": 5  # 1. I want the value of "total"
    },
    "outer_key2": [{
        "type": "ABC",  # 2. I want to count whole structures where type="ABC"
        "comments": {
            "nested_comment": [
                {
                    "key": "value",
                    "id": 1
                },
                {
                    "key": "value",
                    "id": 2
                }
            ]  # 3. Count the dicts inside this list.
        }
    }]
}
print("total: ", dict1['outer_key1']['total'])
print("No of nested comments: ", len(dict1['outer_key2'][0]['comments']['nested_comment']))
Assuming that the data structure for outer_key2 looks like the one below, this is how you get the total number of dictionaries with type='ABC':
dict2 = {
    "outer_key1": {
        "total": 5
    },
    "outer_key2": [
        {
            "type": "ABC",
            "comments": {'...'}
        },
        {
            "type": "ABC",
            "comments": {'...'}
        },
        {
            "type": "ABC",
            "comments": {'...'}
        }
    ]
}
i = 0  # number of matching dictionaries
k = 0  # index into the outer_key2 list
while k < len(dict2['outer_key2']):
    if dict2['outer_key2'][k]['type'] == 'ABC':
        i += 1
    k += 1
print("\r\nNo of dictionaries with type = 'ABC' : ", i)
