list of dicts- get the number of duplications [closed] - python

I have a list of dicts (all with the same format) like this:
L = [
{'id': 1, 'name': 'john', 'age': 34},
{'id': 1, 'name': 'john', 'age': 34},
{'id': 2, 'name': 'hanna', 'age': 30},
{'id': 2, 'name': 'hanna', 'age': 30},
{'id': 3, 'name': 'stack', 'age': 40}
]
I want to remove the duplicates and record how many times each dict occurred, like this:
[
{'id': 1, 'name': 'john', 'age': 34, 'duplication': 2},
{'id': 2, 'name': 'hanna', 'age': 30, 'duplication': 2},
{'id': 3, 'name': 'stack', 'age': 40, 'duplication': 1}
]
I already managed to remove the duplicates by using a set, but I can't get the number of duplicates.
My code:
no_duplication = [dict(s) for s in set(frozenset(d.items()) for d in L)]
no_duplication = [
{'id': 1, 'name': 'john', 'age': 34},
{'id': 2, 'name': 'hanna', 'age': 30},
{'id': 3, 'name': 'stack', 'age': 40}
]

Here is a solution you can try, using collections.Counter:
from collections import Counter
print([
    {**dict(k), "duplication": v}  # use the key name requested in the question
    for k, v in Counter(frozenset(i.items()) for i in L).items()
])
[{'age': 34, 'duplication': 2, 'id': 1, 'name': 'john'},
{'age': 30, 'duplication': 2, 'id': 2, 'name': 'hanna'},
{'age': 40, 'duplication': 1, 'id': 3, 'name': 'stack'}]

ar = [
{'id': 1, 'name': 'john', 'age': 34},
{'id': 1, 'name': 'john', 'age': 34},
{'id': 2, 'name': 'hanna', 'age': 30},
{'id': 2, 'name': 'hanna', 'age': 30},
{'id': 3, 'name': 'stack', 'age': 40}
]
br = []
cnt = []
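for i in ar:
    if i not in br:
        # first time this dict is seen: keep it and start its count at 1
        br.append(i)
        cnt.append(1)
    else:
        # already seen: bump the count stored at the same position
        cnt[br.index(i)] += 1
# note: br holds the same dict objects as ar, so this also modifies ar
for i in range(len(br)):
    br[i]['duplication'] = cnt[i]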
The desired output is contained in br as:
[
{'id': 1, 'name': 'john', 'age': 34, 'duplication': 2},
{'id': 2, 'name': 'hanna', 'age': 30, 'duplication': 2},
{'id': 3, 'name': 'stack', 'age': 40, 'duplication': 1}
]
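The two answers above can be combined into a single pass that keeps first-seen order and leaves the input dicts untouched. This is only a minimal sketch, assuming every value in the dicts is hashable; the helper name count_duplicates and its count_key parameter are made up for illustration:

```python
def count_duplicates(records, count_key="duplication"):
    """Return one copy of each distinct dict, plus how often it occurred."""
    seen = {}  # hashable snapshot of a dict -> [first record seen, count]
    for record in records:
        # frozenset of items ignores key order; all values must be hashable
        snapshot = frozenset(record.items())
        if snapshot in seen:
            seen[snapshot][1] += 1
        else:
            seen[snapshot] = [record, 1]
    # build new dicts so the input records are not modified
    return [{**record, count_key: n} for record, n in seen.values()]


L = [
    {'id': 1, 'name': 'john', 'age': 34},
    {'id': 1, 'name': 'john', 'age': 34},
    {'id': 2, 'name': 'hanna', 'age': 30},
    {'id': 2, 'name': 'hanna', 'age': 30},
    {'id': 3, 'name': 'stack', 'age': 40},
]
result = count_duplicates(L)
```

The dictionary lookup avoids the repeated `in` and `index` scans of the list-based version, which matters only once the list gets large.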

Related

Increment a key value in a list of dictionaries

I would like to add an id key to a list of dictionaries, where each id represents the enumerated nested dictionary.
Current list of dictionaries:
current_list_d = [{'id': 0, 'name': 'Paco', 'age': 18},  # all ids are 0
                  {'id': 0, 'name': 'John', 'age': 20},
                  {'id': 0, 'name': 'Claire', 'age': 22}]
Desired output:
output_list_d = [{'id': 1, 'name': 'Paco', 'age': 18},  # ids are counted/enumerated
                 {'id': 2, 'name': 'John', 'age': 20},
                 {'id': 3, 'name': 'Claire', 'age': 22}]
My code:
for d in current_list_d:
    d["id"] += 1
You could use a simple for loop with enumerate and update in-place the id keys in the dictionaries:
for new_id, d in enumerate(current_list_d, start=1):
    d['id'] = new_id
current_list_d
[{'id': 1, 'name': 'Paco', 'age': 18},
{'id': 2, 'name': 'John', 'age': 20},
{'id': 3, 'name': 'Claire', 'age': 22}]
You can also use a counter variable:
id_val = 1
for d in current_list_d:  # named d so the built-in dict is not shadowed
    d["id"] = id_val
    id_val += 1
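Both loops above update the dictionaries in place. If the original list has to stay unchanged, a comprehension can build renumbered copies instead. A minimal sketch; the function name renumber is hypothetical:

```python
def renumber(records, start=1):
    """Return new dicts whose 'id' is start, start + 1, ... in list order."""
    return [
        {**record, "id": new_id}  # copy the record, then override its id
        for new_id, record in enumerate(records, start=start)
    ]


current_list_d = [{'id': 0, 'name': 'Paco', 'age': 18},
                  {'id': 0, 'name': 'John', 'age': 20},
                  {'id': 0, 'name': 'Claire', 'age': 22}]
output_list_d = renumber(current_list_d)
```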

Sort a list of dict with a key from another list of dict

In the following example, I would like to sort the animals by the alphabetical order of their category, which is stored in another list of dictionaries.
category = [{'uid': 0, 'name': 'mammals'},
{'uid': 1, 'name': 'birds'},
{'uid': 2, 'name': 'fish'},
{'uid': 3, 'name': 'reptiles'},
{'uid': 4, 'name': 'invertebrates'},
{'uid': 5, 'name': 'amphibians'}]
animals = [{'name': 'horse', 'category': 0},
{'name': 'whale', 'category': 2},
{'name': 'mollusk', 'category': 4},
{'name': 'tuna ', 'category': 2},
{'name': 'worms', 'category': 4},
{'name': 'frog', 'category': 5},
{'name': 'dog', 'category': 0},
{'name': 'salamander', 'category': 5},
{'name': 'horse', 'category': 0},
{'name': 'octopus', 'category': 4},
{'name': 'alligator', 'category': 3},
{'name': 'monkey', 'category': 0},
{'name': 'kangaroos', 'category': 0},
{'name': 'salmon', 'category': 2}]
sorted_animals = sorted(animals, key=lambda k: k['category'])
How could I achieve this?
Thanks.
You are now sorting on the category id. All you need to do is map that id to a lookup for a given category name.
Create a dictionary for the categories first so you can directly map the numeric id to the associated name from the category list, then use that mapping when sorting:
catuid_to_name = {c['uid']: c['name'] for c in category}
sorted_animals = sorted(animals, key=lambda k: catuid_to_name[k['category']])
Demo:
>>> from pprint import pprint
>>> category = [{'uid': 0, 'name': 'mammals'},
... {'uid': 1, 'name': 'birds'},
... {'uid': 2, 'name': 'fish'},
... {'uid': 3, 'name': 'reptiles'},
... {'uid': 4, 'name': 'invertebrates'},
... {'uid': 5, 'name': 'amphibians'}]
>>> animals = [{'name': 'horse', 'category': 0},
... {'name': 'whale', 'category': 2},
... {'name': 'mollusk', 'category': 4},
... {'name': 'tuna ', 'category': 2},
... {'name': 'worms', 'category': 4},
... {'name': 'frog', 'category': 5},
... {'name': 'dog', 'category': 0},
... {'name': 'salamander', 'category': 5},
... {'name': 'horse', 'category': 0},
... {'name': 'octopus', 'category': 4},
... {'name': 'alligator', 'category': 3},
... {'name': 'monkey', 'category': 0},
... {'name': 'kangaroos', 'category': 0},
... {'name': 'salmon', 'category': 2}]
>>> catuid_to_name = {c['uid']: c['name'] for c in category}
>>> pprint(catuid_to_name)
{0: 'mammals',
1: 'birds',
2: 'fish',
3: 'reptiles',
4: 'invertebrates',
5: 'amphibians'}
>>> sorted_animals = sorted(animals, key=lambda k: catuid_to_name[k['category']])
>>> pprint(sorted_animals)
[{'category': 5, 'name': 'frog'},
{'category': 5, 'name': 'salamander'},
{'category': 2, 'name': 'whale'},
{'category': 2, 'name': 'tuna '},
{'category': 2, 'name': 'salmon'},
{'category': 4, 'name': 'mollusk'},
{'category': 4, 'name': 'worms'},
{'category': 4, 'name': 'octopus'},
{'category': 0, 'name': 'horse'},
{'category': 0, 'name': 'dog'},
{'category': 0, 'name': 'horse'},
{'category': 0, 'name': 'monkey'},
{'category': 0, 'name': 'kangaroos'},
{'category': 3, 'name': 'alligator'}]
Note that within each category, the dictionaries have been left in relative input order. You could return a tuple of values from the sorting key to further apply a sorting order within each category, e.g.:
sorted_animals = sorted(
animals,
key=lambda k: (catuid_to_name[k['category']], k['name'])
)
would sort by animal name within each category, producing:
>>> pprint(sorted(animals, key=lambda k: (catuid_to_name[k['category']], k['name'])))
[{'category': 5, 'name': 'frog'},
{'category': 5, 'name': 'salamander'},
{'category': 2, 'name': 'salmon'},
{'category': 2, 'name': 'tuna '},
{'category': 2, 'name': 'whale'},
{'category': 4, 'name': 'mollusk'},
{'category': 4, 'name': 'octopus'},
{'category': 4, 'name': 'worms'},
{'category': 0, 'name': 'dog'},
{'category': 0, 'name': 'horse'},
{'category': 0, 'name': 'horse'},
{'category': 0, 'name': 'kangaroos'},
{'category': 0, 'name': 'monkey'},
{'category': 3, 'name': 'alligator'}]
In my opinion your category structure is more complicated than it needs to be. As long as the uid is nothing but the list index, you could simply use a list of names:
category = [c['name'] for c in category]
# ['mammals', 'birds', 'fish', 'reptiles', 'invertebrates', 'amphibians']
sorted_animals = sorted(animals, key=lambda k: category[k['category']])
#[{'name': 'frog', 'category': 5}, {'name': 'salamander', 'category': 5}, {'name': 'whale', 'category': 2}, {'name': 'tuna ', 'category': 2}, {'name': 'salmon', 'category': 2}, {'name': 'mollusk', 'category': 4}, {'name': 'worms', 'category': 4}, {'name': 'octopus', 'category': 4}, {'name': 'horse', 'category': 0}, {'name': 'dog', 'category': 0}, {'name': 'horse', 'category': 0}, {'name': 'monkey', 'category': 0}, {'name': 'kangaroos', 'category': 0}, {'name': 'alligator', 'category': 3}]
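Both lookups above raise an error (KeyError or IndexError) if an animal refers to a category id that does not exist. If that can happen in your data, one option is to sort such animals last. This is only a sketch on made-up sample data; the helper name category_sort_key and the 'ghost' entry are hypothetical:

```python
category = [{'uid': 0, 'name': 'mammals'},
            {'uid': 1, 'name': 'birds'}]
animals = [{'name': 'horse', 'category': 0},
           {'name': 'ghost', 'category': 99},  # no such category
           {'name': 'owl', 'category': 1}]

catuid_to_name = {c['uid']: c['name'] for c in category}


def category_sort_key(animal):
    """Sort by category name; animals with an unknown category go last."""
    name = catuid_to_name.get(animal['category'])
    # False sorts before True, so known categories come first
    return (name is None, name or '')


sorted_animals = sorted(animals, key=category_sort_key)
```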

Select list element where a field has the min value

Suppose I have a named list as follows:
myListOfPeople = [{'ID': 0, 'Name': 'Mary', 'Age': 25}, {'ID': 1, 'Name': 'John', 'Age': 28}]
I want to select the element (not only the field) where a specific field meets certain criteria, e.g., the element with the minimum 'Age'. Something like:
youngerPerson = [person for person in myListOfPeople if person = ***person with minimum age***]
And get as the answer:
>>youngerPerson: {'ID': 0, 'Name': 'Mary', 'Age': 25}
How can I do that?
You can use the key parameter of min:
>>> myListOfPeople = [{'ID': 0, 'Name': 'Mary', 'Age': 25}, {'ID': 1, 'Name': 'John', 'Age': 28}]
>>>
>>> min(myListOfPeople, key=lambda x: x["Age"])
{'ID': 0, 'Name': 'Mary', 'Age': 25}
>>>
You can use itemgetter:
from operator import itemgetter
myListOfPeople = [{'ID': 0, 'Name': 'Mary', 'Age': 25}, {'ID': 1, 'Name': 'John', 'Age': 28}]
sorted(myListOfPeople, key=itemgetter('Age'))[0]
# {'ID': 0, 'Name': 'Mary', 'Age': 25}
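The two answers can also be combined: itemgetter works as the key for min as well, which avoids sorting the whole list just to take its first element. If several people share the minimum age, min returns the first one it encounters, so a second pass is needed to collect them all. A small sketch; the third entry ('Ana') is added here only to show the tie:

```python
from operator import itemgetter

people = [
    {'ID': 0, 'Name': 'Mary', 'Age': 25},
    {'ID': 1, 'Name': 'John', 'Age': 28},
    {'ID': 2, 'Name': 'Ana', 'Age': 25},  # hypothetical extra entry (tie)
]

# single pass over the list; the first element with the lowest age wins
youngest = min(people, key=itemgetter('Age'))

# collect every element that shares the lowest age
lowest_age = youngest['Age']
all_youngest = [p for p in people if p['Age'] == lowest_age]
```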

Sorting list of dictionaries---what is the default behaviour (without key parameter)?

I'm trying to sort a list of dicts using sorted:
>>> help(sorted)
Help on built-in function sorted in module __builtin__:
sorted(...)
sorted(iterable, cmp=None, key=None, reverse=False) --> new sorted list
I just passed the list to sorted, and it sorts by id.
>>> l = [{'id': 4, 'quantity': 40}, {'id': 1, 'quantity': 10}, {'id': 2, 'quantity': 20}, {'id': 3, 'quantity': 30}, {'id': 6, 'quantity': 60}, {'id': 7, 'quantity': -30}, {'id': -1, 'quantity': -10}]
>>> sorted(l) # sorts by id
[{'id': -1, 'quantity': -10}, {'id': 1, 'quantity': 10}, {'id': 2, 'quantity': 20}, {'id': 3, 'quantity': 30}, {'id': 4, 'quantity': 40}, {'id': 6, 'quantity': 60}, {'id': 7, 'quantity': -30}]
>>> l.sort()
>>> l # sorts by id
[{'id': -1, 'quantity': -10}, {'id': 1, 'quantity': 10}, {'id': 2, 'quantity': 20}, {'id': 3, 'quantity': 30}, {'id': 4, 'quantity': 40}, {'id': 6, 'quantity': 60}, {'id': 7, 'quantity': -30}]
Many examples of sorted say that it requires a key to sort a list of dicts, but I didn't give any key. Why didn't it sort by quantity? How did it choose to sort by id?
I tried another example with name & age,
>>> a
[{'age': 1, 'name': 'john'}, {'age': 3, 'name': 'shyam'}, {'age': 30,'name': 'ram'}, {'age': 15, 'name': 'rita'}, {'age': 5, 'name': 'sita'}]
>>> sorted(a) # sorts by age
[{'age': 1, 'name': 'john'}, {'age': 3, 'name': 'shyam'}, {'age': 5, 'name':'sita'}, {'age': 15, 'name': 'rita'}, {'age': 30, 'name': 'ram'}]
>>> a.sort() # sorts by age
>>> a
[{'age': 1, 'name': 'john'}, {'age': 3, 'name': 'shyam'}, {'age': 5, 'name':'sita'}, {'age': 15, 'name': 'rita'}, {'age': 30, 'name': 'ram'}]
Here it sorts by age but not by name. What am I missing about the default behavior of these methods?
From some old Python docs:
Mappings (dictionaries) compare equal if and only if their sorted (key, value) lists compare equal. Outcomes other than equality are resolved consistently, but are not otherwise defined.
Earlier versions of Python used lexicographic comparison of the sorted (key, value) lists, but this was very expensive for the common case of comparing for equality. An even earlier version of Python compared dictionaries by identity only, but this caused surprises because people expected to be able to test a dictionary for emptiness by comparing it to {}.
Ignore the default behaviour and just provide a key.
By default it compares on the first difference it finds. If you are sorting dictionaries this is quite dangerous: the result is consistent, yet the ordering is undefined.
Pass a function to the key= parameter that takes a value from the list (in this case a dictionary) and returns the value to sort on.
>>> a
[{'age': 1, 'name': 'john'}, {'age': 3, 'name': 'shyam'}, {'age': 30,'name': 'ram'}, {'age': 15, 'name': 'rita'}, {'age': 5, 'name': 'sita'}]
>>> sorted(a, key=lambda d : d['name']) # sorts by name
[{'age': 1, 'name': 'john'}, {'age': 30, 'name': 'ram'}, {'age': 15, 'name': 'rita'}, {'age': 3, 'name': 'shyam'}, {'age': 5, 'name': 'sita'}]
See https://wiki.python.org/moin/HowTo/Sorting
The key parameter is quite powerful as it can cope with all sorts of data to be sorted, although maybe not very intuitive.
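The default ordering discussed above is specific to Python 2 (the help text in the question shows the __builtin__ module and a cmp parameter). In Python 3, dicts do not support ordering comparisons at all, so sorting a list of dicts without a key raises TypeError. A short sketch that shows this and uses operator.itemgetter as an alternative to the lambda:

```python
from operator import itemgetter

a = [{'age': 1, 'name': 'john'}, {'age': 3, 'name': 'shyam'},
     {'age': 30, 'name': 'ram'}, {'age': 15, 'name': 'rita'},
     {'age': 5, 'name': 'sita'}]

try:
    sorted(a)  # Python 3: '<' is not supported between two dicts
    default_sort_works = True
except TypeError:
    default_sort_works = False

by_name = sorted(a, key=itemgetter('name'))                # ascending by name
by_age_desc = sorted(a, key=itemgetter('age'), reverse=True)  # oldest first
```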

Generate all combinations from a nested python dictionary and segregate them

My sample dict is:
sample_dict = {
'company': {
'employee': {
'name': [
{'explore': ["noname"],
'valid': ["john","tom"],
'boundary': ["aaaaaaaaaa"],
'negative': ["$"]}],
'age': [
{'explore': [200],
'valid': [20,30],
'boundary': [1,99],
'negative': [-1,100]}],
'others':{
'grade':[
{'explore': ["star"],
'valid': ["A","B"],
'boundary': ["C"],
'negative': ["AB"]}]}
}
}}
It's a follow-on question to: Split python dictionary to result in all combinations of values
I would like to get a segregated list of combinations like below
Valid combinations: [generated only from the valid lists of data]
COMPLETE OUTPUT for VALID CATEGORY :
{'company': {'employee': {'age': 20}, 'name': 'john', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 20}, 'name': 'john', 'others': {'grade': 'B'}}}
{'company': {'employee': {'age': 20}, 'name': 'tom', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 20}, 'name': 'tom', 'others': {'grade': 'B'}}}
{'company': {'employee': {'age': 30}, 'name': 'john', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 30}, 'name': 'john', 'others': {'grade': 'B'}}}
{'company': {'employee': {'age': 30}, 'name': 'tom', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 30}, 'name': 'tom', 'others': {'grade': 'B'}}}
Negative combinations: [Here it's a bit tricky, because negative values should be combined with the "valid" pool as well, with at least one value being negative]
Complete output expected for NEGATIVE category:
=> [Basically, excluding combinations where all values are valid, ensuring at least one value in the combination is from the negative group]
{'company': {'employee': {'age': 20}, 'name': 'john', 'others': {'grade': 'AB'}}}
{'company': {'employee': {'age': -1}, 'name': 'tom', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 100}, 'name': 'john', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 30}, 'name': '$', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 30}, 'name': '$', 'others': {'grade': 'AB'}}}
{'company': {'employee': {'age': -1}, 'name': '$', 'others': {'grade': 'AB'}}}
{'company': {'employee': {'age': 100}, 'name': '$', 'others': {'grade': 'AB'}}}
In the above output, in the first line, grade is tested with the negative value AB while all remaining values are kept valid. So it's not necessary to generate the same combination with age as 30, as the intent is to test only the negative set. We can supply the remaining parameters with any valid data.
Boundary combinations are similar to valid: combinations of values from the boundary pool only.
Explore: similar to negative. Mix with the valid pool, always with at least one explore value in every combination.
Sample dict - revised version
sample_dict2 = {
'company': {
'employee_list': [
{'employee': {'age': [{'boundary': [1,99],
'explore': [200],
'negative': [-1,100],
'valid': [20, 30]}],
'name': [{'boundary': ['aaaaaaaaaa'],
'explore': ['noname'],
'negative': ['$'],
'valid': ['john','tom']}],
'others': {
'grade': [
{'boundary': ['C'],
'explore': ['star'],
'negative': ['AB'],
'valid': ['A','B']},
{'boundary': ['C'],
'explore': ['star'],
'negative': ['AB'],
'valid': ['A','B']}]}}},
{'employee': {'age': [{'boundary': [1, 99],
'explore': [200],
'negative': [],
'valid': [20, 30]}],
'name': [{'boundary': [],
'explore': [],
'negative': ['$'],
'valid': ['john', 'tom']}],
'others': {
'grade': [
{'boundary': ['C'],
'explore': ['star'],
'negative': [],
'valid': ['A', 'B']},
{'boundary': [],
'explore': ['star'],
'negative': ['AB'],
'valid': ['A', 'B']}]}}}
]
}
}
sample_dict2 contains a list of dicts. Here the whole "employee" hierarchy is a list element, and the leaf node "grade" is also a list.
Also, except for "valid" and "boundary", the other data sets can be empty ([]) and we need to handle them as well.
VALID COMBINATIONS will be like
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','B']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','B']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','B']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','B']}}]}}
plus the combinations with age=30 and name=tom at employee index 0
import itertools
def generate_combinations(thing, positive="valid", negative=None):
    """ Generate all possible combinations, walking and mimicking structure of "thing" """
    if isinstance(thing, dict):  # if dictionary, distinguish between two types of dictionary
        if positive in thing:
            return thing[positive] if negative is None else [thing[positive][0]] + thing[negative]
        else:
            results = []
            for key, value in thing.items():  # generate all possible key: value combinations
                subresults = []
                for result in generate_combinations(value, positive, negative):
                    subresults.append((key, result))
                results.append(subresults)
            return [dict(result) for result in itertools.product(*results)]
    elif isinstance(thing, list) or isinstance(thing, tuple):  # added tuple just to be safe
        results = []
        for element in thing:  # generate recursive result sets for each element of list
            for result in generate_combinations(element, positive, negative):
                results.append(result)
        return results
    else:  # not a type we know how to handle
        raise TypeError("Unexpected type")
def generate_invalid_combinations(thing):
    """ Generate all possible combinations and weed out the valid ones """
    valid = generate_combinations(thing)
    return [result for result in generate_combinations(thing, negative='negative') if result not in valid]
def generate_boundary_combinations(thing):
    """ Generate all possible boundary combinations """
    return generate_combinations(thing, positive="boundary")
def generate_explore_combinations(thing):
    """ Generate all possible explore combinations and weed out the valid ones """
    valid = generate_combinations(thing)
    return [result for result in generate_combinations(thing, negative='explore') if result not in valid]
Calling generate_combinations(sample_dict) returns:
[
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'B'}}}},
{'company': {'employee': {'age': 20, 'name': 'tom', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': 'tom', 'others': {'grade': 'B'}}}},
{'company': {'employee': {'age': 30, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 30, 'name': 'john', 'others': {'grade': 'B'}}}},
{'company': {'employee': {'age': 30, 'name': 'tom', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 30, 'name': 'tom', 'others': {'grade': 'B'}}}}
]
Calling generate_invalid_combinations(sample_dict) returns:
[
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': 20, 'name': '$', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': '$', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': -1, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': -1, 'name': 'john', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': -1, 'name': '$', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': -1, 'name': '$', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': 100, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 100, 'name': 'john', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': 100, 'name': '$', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 100, 'name': '$', 'others': {'grade': 'AB'}}}}
]
Calling generate_boundary_combinations(sample_dict) returns:
[
{'company': {'employee': {'age': 1, 'name': 'aaaaaaaaaa', 'others': {'grade': 'C'}}}},
{'company': {'employee': {'age': 99, 'name': 'aaaaaaaaaa', 'others': {'grade': 'C'}}}}
]
Calling generate_explore_combinations(sample_dict) returns:
[
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'star'}}}},
{'company': {'employee': {'age': 20, 'name': 'noname', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': 'noname', 'others': {'grade': 'star'}}}},
{'company': {'employee': {'age': 200, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 200, 'name': 'john', 'others': {'grade': 'star'}}}},
{'company': {'employee': {'age': 200, 'name': 'noname', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 200, 'name': 'noname', 'others': {'grade': 'star'}}}}
]
REVISED SOLUTION (To match revised problem)
import itertools
import random
def generate_combinations(thing, positive="valid", negative=None):
    """ Generate all possible combinations, walking and mimicking structure of "thing" """
    if isinstance(thing, dict):  # if dictionary, distinguish between two types of dictionary
        if positive in thing:
            if negative is None:
                return thing[positive]  # here it's OK if it's empty
            elif thing[positive]:  # here it's not OK if it's empty
                return [random.choice(thing[positive])] + thing[negative]
            else:
                return []
        else:
            results = []
            for key, value in thing.items():  # generate all possible key: value combinations
                results.append([(key, result) for result in generate_combinations(value, positive, negative)])
            return [dict(result) for result in itertools.product(*results)]
    elif isinstance(thing, (list, tuple)):  # added tuple just to be safe (thanks Padraic!)
        # generate recursive result sets for each element of list
        results = [generate_combinations(element, positive, negative) for element in thing]
        return [list(result) for result in itertools.product(*results)]
    else:  # not a type we know how to handle
        raise TypeError("Unexpected type")
def generate_boundary_combinations(thing):
    """ Generate all possible boundary combinations """
    valid = generate_combinations(thing)
    return [result for result in generate_combinations(thing, negative='boundary') if result not in valid]
generate_invalid_combinations() and generate_explore_combinations() are the same as before. Subtle differences:
Instead of grabbing the first item out of the valid array in a negative evaluation, it now grabs a random item from the valid array.
Values for items like 'age': [30] come back as lists as that's how they were specified:
'age': [{'boundary': [1, 99],
'explore': [200],
'negative': [-1, 100],
'valid': [20, 30]}],
If you instead want 'age': 30 like the earlier output examples, then modify the definition accordingly:
'age': {'boundary': [1, 99],
'explore': [200],
'negative': [-1, 100],
'valid': [20, 30]},
The boundary property is now treated like one of the 'negative' values.
Just for reference, I don't plan to generate all the outputs this time: calling generate_combinations(sample_dict2) returns results like:
[
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'B']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'B']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['B', 'A']}, 'age': [20]}}]}},
...
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['A', 'B']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'A']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'A']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}]}}
]
This is an open-ended hornet's nest of a question.
Look at the whitepapers for Agitar's tools to see if this is what you are thinking about.
Look at Knuth's work on combinatorial algorithms. It's a tough read.
Consider just writing a recursive descent generator that uses 'yield'.
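To make the last suggestion concrete, here is a rough sketch of a recursive generator for the valid pool only. It assumes the structure of the first sample_dict (each leaf is a list holding one dict of pools) and flattens list elements the way the first answer does; the name iter_combinations and the small spec below are made up for illustration, not taken from the answers above:

```python
import itertools


def iter_combinations(node, pool="valid"):
    """Yield, one at a time, every combination built from one pool of values."""
    if isinstance(node, dict):
        if pool in node:  # leaf spec such as {'valid': [...], 'negative': [...]}
            yield from node[pool]
            return
        keys = list(node)
        # product() reads each child generator up front, then pairs values lazily
        children = [iter_combinations(node[key], pool) for key in keys]
        for values in itertools.product(*children):
            yield dict(zip(keys, values))
    elif isinstance(node, (list, tuple)):
        # each list element contributes its own alternatives
        for element in node:
            yield from iter_combinations(element, pool)
    else:  # not a type we know how to handle
        raise TypeError("Unexpected type")


spec = {'employee': {'name': [{'valid': ['john', 'tom'], 'negative': ['$']}],
                     'age': [{'valid': [20], 'negative': [-1]}]}}
combos = list(iter_combinations(spec))
```

Because the function yields results instead of building the full list, a caller can stop early or filter combinations without holding all of them in memory at once.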
