How to count amount of combinations in a python list

I need some help counting combinations in a Python list.
I need to count the possible combinations of three letters across all of the elements and then find the most repeated one, e.g. ABC, CDA, CCA, etc.
I have created a for loop to look at each element of the list, then another loop to check each combo of three letters and add it to a new list. I am not sure how to count the number of times a combination is repeated. To find the mode, I think I might use the max() function.
This is part of the code I have, but it does not work as I expect, because it just adds each item of the list to its own independent list.
lst = ["ABCDABCD", "ABDCABD", "ACCACABB", "BACDABC"]
for combo in lst:
    for i in range(0, 3):
        combolst = []
        combolst.append(lst[i].split())
        print(combolst)
I am new to coding so that's why I'm here. Thanks!

(Assuming my math memory isn't garbage)
So okay, we are interested in combinations. Your code simply splits each string and puts it in a new list (as you said). Instead, we would use the combination formula: n!/(r!(n-r)!).
Where:
n is the number of elements, in this case the length of the string in question
r is how many objects we wish to choose
Thus you would get:
import math
for combo in lst:
    n = math.factorial(len(combo))
    r = math.factorial(3)
    nMinR = math.factorial(len(combo) - 3)
    result = n // (r * nMinR)  # integer division keeps the count an int
    print(result)
This is for combinations. If we want permutations (where order does matter):
for combo in lst:
    n = math.factorial(len(combo))
    nMinR = math.factorial(len(combo) - 3)
    result = n // nMinR  # integer division keeps the count an int
    print(result)
I hope I understood your question correctly. Here is some reading about combinations vs permutations (https://medium.com/i-math/combinations-permutations-fa7ac680f0ac). Keep in mind, the above code only prints how many combinations or permutations are possible; it won't actually construct the possible values.
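If the goal is instead to tally the three-letter runs that actually occur in the strings and pick the most frequent one, a sketch along these lines might help. It assumes that "combination" in the question means three consecutive letters, which the question does not state outright; the function name count_triples is made up for illustration.

```python
from collections import Counter

lst = ["ABCDABCD", "ABDCABD", "ACCACABB", "BACDABC"]

def count_triples(strings, size=3):
    """Tally every run of `size` consecutive letters across all strings."""
    counts = Counter()
    for s in strings:
        # slide a window of length `size` over the string
        for start in range(len(s) - size + 1):
            counts[s[start:start + size]] += 1
    return counts

triple_counts = count_triples(lst)
highest = max(triple_counts.values())
# keep every triple that reaches the highest count, in case of ties
most_common_triples = sorted(t for t, c in triple_counts.items() if c == highest)
print(most_common_triples, highest)
```

Collecting all triples with the highest count, rather than taking a single max(), keeps ties visible.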

Related

the code is giving me a number from the list instead of the mode

In one of my assignments I need to find the mode of a list called "dataset" without using any module or function that would find the mode by itself.
I tried to make it output the mode, or the list of modes, depending on the list of numbers. I used two for loops so that each number of the list is compared against every number of the list, including itself, to count how many times it occurs. For example, if my list was 123415 it would say there are 2 ones, and it does this for all the numbers of the list. The number with the highest count would be the mode. The bottom section of the code, with the if, elif and else, checks whether the number has the highest count by comparing its count with the previous biggest count.
I've tried changing the order of the code but I'm still confused about why it behaves this way.
pop_number = []
pop_amount = 0
amount = 0
for i in range(len(dataset)):
    for x in dataset:
        if dataset[i] == x:
            amount += 1
    if amount>pop_amount:
        pop_amount = amount
        pop_number = []
        pop_number.append(x)
        amount = 0
    elif amount==pop_amount:
        pop_amount = amount
        if x not in pop_number:
            pop_number.append(x)
        pop_amount = amount
        amount = 0
    else:
        continue
print(pop_number)
I expected the output to be the mode of the list or the list of modes, but it came up with the last number from the list.

As this is apparently homework, I will present a sketch, not working code.
Observe that a dict in Python can hold key-value mappings.
Let the numbers in the input list be the keys, and the values the number of times they occur. Going over the list, use each item as the key for the dict, and add one to the value (starting at 0 -- defaultdict(int) is good for this). If the result is bigger than any previous maximum, remember this key.
Since you want to allow for more than one mode value, the variable which remembers the maximum key should be a list; but since you have a new maximum, replace the old list with a list containing just this key. If another value also reaches the maximum, add it to the list. (That's the append method.)
(See how this maps onto an if for "bigger than the maximum so far", an else-if for "equal to the maximum so far", and otherwise nothing needs to be done.)
When you have looped over all items in the input list, the list of remembered keys is your result.
Go back and think about what variables you need already before the loop. The maximum so far should be defined but guaranteed to be smaller than any value you will see -- it makes sense to start this at 0 because as soon as you see one key, it will have a bigger count than zero. And the keys you want to remember can start out as an empty list.
Now think about how you would test this. What happens if the input list is empty? What happens if the input list contains just the same number over and over? What happens if every item on the input list is unique? Can you think of other corner cases?
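For readers who want to check their own attempt against something concrete, the sketch above could be written roughly as follows. This is only one possible reading of the description; the names find_modes, best_count and modes are made up here.

```python
from collections import defaultdict

def find_modes(dataset):
    """Return every value that occurs most often, in one pass over the list."""
    counts = defaultdict(int)  # value -> number of times seen so far
    best_count = 0             # smaller than any count we will actually see
    modes = []                 # values whose count equals best_count
    for item in dataset:
        counts[item] += 1
        if counts[item] > best_count:
            # new maximum: forget the old candidates
            best_count = counts[item]
            modes = [item]
        elif counts[item] == best_count:
            # ties with the current maximum
            modes.append(item)
    return modes

print(find_modes([1, 2, 3, 4, 1, 5]))
```

The corner cases mentioned above can be tried directly: an empty list gives an empty result, and a list of unique items returns every item.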

Without using any module or function that specifically finds the mode, you can do this with much less code. Your code will work with a little more effort, and I highly suggest you try to solve the problem with your own logic. Meanwhile, let me show you how to use Python's built-in data structures (lists, tuples, dictionaries and sets) to do it in 7-8 lines. There is also unpacking at the end (*). I suggest you look these up when you get time.
lst = [1,1,1,1,2,2,2,3,3,3,3,3,3,4,2,2,2,5,5,6]
# finds the unique elements
unique_elems = set(lst)
# creates a dictionary with the unique elems as keys and initializes the values to 0
count = dict.fromkeys(unique_elems,0)
# gets the frequency of each element in the lst
for elem in unique_elems:
    count[elem] = lst.count(elem)
# finds max frequency
max_freq = max(count.values())
# stores list of mode(s)
modes = [i for i in count if count[i] == max_freq]
# prints mode(s), I have used unpacking here so that in case there is one mode,
# you don't have to print ugly [x]
print(*modes)
Or if you want to go for the shortest (I really shouldn't be making such bold claims on Stack Overflow), then I guess this will be it (even though writing short code for the sake of it is discouraged):
lst = [1,1,1,1,2,2,2,3,3,3,3,3,3,4,2,2,2,5,5,6]
freq_dist = [(i, lst.count(i)) for i in set(lst)]
[print(i,end=' ') for i,j in freq_dist if j==max(freq_dist, key=lambda x:x[1])[1]]
And if you just want to go bonkers and say goodbye to loops (Goes without saying, this is ugly, really ugly):
lst = [1,1,1,1,2,2,2,3,3,3,3,3,3,4,2,2,2,5,5,6]
unique_elems = set(lst)
freq_dist = list(map(lambda x:(x, lst.count(x)), unique_elems))
print(*list(map(lambda x:x[0] if x[1] == max(freq_dist,key = lambda y: y[1])[1] else '', freq_dist)))

How do I remove the element with fewest sets in a list and only keep the one with the most?

I have a list where each element contains an unknown number of sets. (The sets in the list vary depending on choices the user makes in the program.) Now I want to remove all the elements with the fewest sets and only keep the one or ones that contain the most sets.
My list can look like this:
[{'Chocolate'}, {'Chocolate'}, {'JellyBean', 'Chips'}]
In this case, I would have wanted to keep just the last element because it contains two sets and the rest only one set. But sometimes there are several elements with the highest number of sets and then I want to keep them all.
I have tried to do something like:
if min(len(list)) != max(len(list)):
    list.remove(min(len(list)))
but Python just says "'int' object is not iterable" and I can understand why but not how to think instead.
Would be very thankful if someone helped me!

You'll need to iterate over the list once to determine the max and then again to find the elements you want to keep. There are fancier, more efficient solutions, but they are harder to understand.
example_list = [{'Chocolate'}, {'Chocolate'}, {'JellyBean', 'Chips'}]
# first pass: find the size of the largest set
max_length = 0
for item in example_list:
    if len(item) > max_length:
        max_length = len(item)
# second pass: keep only the sets of that size
new_list = []
for item in example_list:
    if len(item) == max_length:
        new_list.append(item)
You might be able to use remove, but removing items from a list while you are iterating over it makes the loop silently skip elements, so building a new list is safer.

Why not create a new list containing only the sets with the maximum number of elements? Try the code below:
list_of_sets= [{'Chocolate'}, {'Chocolate', 'Candy'}, {'JellyBean', 'Chips'}]
max_len=max(len(s) for s in list_of_sets)
final_list = [s for s in list_of_sets if len(s)==max_len]
final_list
Result:
[{'Candy', 'Chocolate'}, {'Chips', 'JellyBean'}]

Shuffling with constraints on pairs

I have n lists, each of length m. Assume n*m is even. I want to get a randomly shuffled list of all elements, under the constraint that the elements in locations i, i+1 where i=0,2,...,n*m-2 never come from the same list. Edit: other than this constraint, I do not want to bias the distribution of random lists. That is, the solution should be equivalent to a completely random shuffle that is repeated until the constraint holds.
example:
list1: a1,a2
list2: b1,b2
list3: c1,c2
allowed: b1,c1,c2,a2,a1,b2
disallowed: b1,c1,c2,b2,a1,a2
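The reference behaviour described in the question (shuffle, then reshuffle until the constraint holds) can be sketched directly. It may be slow for large inputs, but it is unbiased by construction and is useful as a baseline to compare faster methods against. The function name constrained_shuffle and the feasibility check are additions for illustration, not part of the question.

```python
import random

def constrained_shuffle(lists, rng=random):
    """Shuffle all items; retry until no pair (0,1), (2,3), ... shares a source list."""
    # tag each item with the index of the list it came from
    tagged = [(idx, item) for idx, items in enumerate(lists) for item in items]
    total = len(tagged)
    if total % 2 != 0:
        raise ValueError("total number of items must be even")
    # if one list holds more than half of the items, no valid pairing exists
    if any(2 * len(items) > total for items in lists):
        raise ValueError("no valid arrangement exists")
    while True:
        rng.shuffle(tagged)
        # check positions (0,1), (2,3), ... only, as in the question
        if all(tagged[i][0] != tagged[i + 1][0] for i in range(0, total, 2)):
            return [item for _, item in tagged]

example = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"]]
result = constrained_shuffle(example)
print(result)
```

Because every rejected shuffle is thrown away entirely, each allowed ordering is equally likely.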

A possible solution is to think of your number set as n chunks of items, each chunk having a length of m. If you randomly select for each chunk exactly one item from each list, then you will never hit dead ends. Just make sure that the first item in each chunk (except the first chunk) comes from a different list than the last element of the previous chunk.
You can also iteratively randomize numbers, always making sure you pick from a different list than the previous number, but then you can hit some dead ends.
Finally, another possible solution is to randomize a number on each position sequentially, but only from those which "can be put there", that is, if you put a number, none of the constraints will be violated, that is, you will have at least a possible solution.

A variation of b above that avoids dead ends: at each step you choose twice. First, randomly choose an item. Second, randomly choose where to place it. At the kth step there are k optional places to put the item (the new item can be injected between two existing items). Naturally, you only choose from allowed places.
Money!

arrange your lists into a list of lists
save each item in the list as a tuple with the list index in the list of lists
loop n*m times
on even turns - flatten into one list and just rand pop - yield the item and the item group
on odd turns - temporarily remove the last item group and pop as before - in the end add the removed group back
important - how to avoid deadlocks?
a deadlock can occur if all the remaining items are from one group only.
to avoid that, check in each iteration the lengths of all the lists
and check if the longest list is longer than the sum of all the others.
if true - pull from that list
that way you are never left with only one list full
here's a gist with an attempt to solve this in python
https://gist.github.com/YontiLevin/bd32815a0ec62b920bed214921a96c9d

A very quick and simple method I am trying is:
random shuffle
loop over the pairs in the list:
    if pair is bad:
        loop over the pairs in the list:
            if both elements of the new pair are different than the bad pair:
                swap the second elements
                break
Will this always find a solution? Will the solutions have the same distribution as naive shuffling until finding a legitimate solution?

How to order a list based on what a function returns when called with its items

Basically, I need to order a 2D array. Genes is an array of 8 lists, all containing 8 items, all of which are floats. This is for an evolution simulator of sorts, hence 'genes'. My current solution is this:
scores = []
[scores.append(score(x)) for x in genes]
unsorted = genes
genes = [unsorted[0]]
for y in range(7):
    for x in range(len(genes)):
        if score(unsorted[y+1]) >= score(genes[x]):
            genes.insert(x, unsorted[y+1])
            break
I have a list of all the scores, I save a copy of 'genes' called 'unsorted', and set genes as the first item it once contained. The nested loop underneath should run through unsorted, taking each item through the 'x' loop, and inserting it into 'genes' once it finds the first item of score equal or smaller than its own. I thought this would work, but for some reason, it returns lists of random sizes, like 3, 2 and 5 or even 16. If you have a more efficient or pythonic way to do this, or just one that works, please help!

That is what sorted is for.
genes = sorted(genes, key=score)
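A small sketch of how this behaves, using a stand-in score function (the real one is not shown in the question, so the sum used here is purely an assumption). Note that sorted puts the lowest score first by default; the insertion loop in the question appears to aim for highest first, which would need reverse=True.

```python
def score(gene):
    """Hypothetical fitness function: here, simply the sum of the gene's floats."""
    return sum(gene)

genes = [[0.1, 0.9], [0.5, 0.7], [0.2, 0.2]]

ascending = sorted(genes, key=score)                 # lowest score first
descending = sorted(genes, key=score, reverse=True)  # highest score first
print(descending)
```

sorted calls score once per item and returns a new list of the same length, so the original list is left untouched.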

Manual product with unknown number of arguments

The following examples give the same result:
A.
product = []
for a in "abcd":
    for b in "xy":
        product.append((a,b))
B.
from itertools import product
list(product("abcd","xy"))
How can I calculate the cartesian product like in example A when I don't know the number of arguments n?
REASON I'm asking this:
Consider this piece of code:
allocations = list(product(*strategies.values()))
for alloc in allocations:
    PWC[alloc] = [a for (a,b) in zip(help,alloc) if coalitions[a] >= sum(b)]
The values of the strategies dictionary are list of tuples, help is an auxiliary variable (a list with the same length of every alloc) and coalitions is another dictionary that assigns to the tuples in help some numeric value.
Since strategies values are sorted, I know that the if statement won't be true anymore after a certain alloc. Since allocations is a pretty big list, I would avoid tons of comparisons and tons of sums if I could use the example algorithm A.

You can do:
items = ["abcd","xy"]
from itertools import product
list(product(*items))
The list items can contain an arbitrary number of strings, and the call to product will give you the Cartesian product of those strings.
Note that you don't have to turn it into a list - you can iterate over it and stop when you no longer wish to continue:
for item in product(*items):
    print(item)
    if condition:
        break

If you just want to abort the allocations after you hit a certain condition, and you want to avoid generating all the elements from the cartesian product for those, then simply don’t make a list of all combinations in the first place.
itertools.product is lazy, which means that it will only generate a single value of the cartesian product at a time. So you never need to generate all elements, and you never need to run the comparisons for the elements you skip. Just don’t call list() on the result, as that would iterate the whole sequence and store all possible combinations in memory:
allocations = product(*strategies.values())
for alloc in allocations:
    PWC[alloc] = [a for (a,b) in zip(help,alloc) if coalitions[a] >= sum(b)]
    # check whether you can stop looking at more values from the cartesian product
    if someCondition(alloc):
        break
It’s just important to note how itertools.product generates the values, what pattern it follows. It’s basically equivalent to the following:
for a in firstIterable:
    for b in secondIterable:
        for c in thirdIterable:
            …
                for n in nthIterable:
                    yield (a, b, c, …, n)
So the rightmost iterable advances fastest and the leftmost slowest. Make sure that you order the iterables in a way that lets you correctly specify a break condition.
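If you do want the nested-loop form of example A for an unknown number of arguments, one way to sketch it is a recursive generator. This is an illustration rather than how itertools implements it, and the name manual_product is made up. It assumes the inputs can be iterated more than once (strings, lists, tuples), not one-shot iterators.

```python
from itertools import product

def manual_product(*iterables):
    """Yield tuples like nested for loops would, for any number of iterables."""
    if not iterables:
        # no loops left: the only "combination" is the empty tuple
        yield ()
        return
    first, rest = iterables[0], iterables[1:]
    for item in first:                        # outer loop
        for tail in manual_product(*rest):    # all inner loops, recursively
            yield (item,) + tail

items = ["abcd", "xy"]
manual = list(manual_product(*items))
print(manual[:3])
print(manual == list(product(*items)))
```

Because each level is an ordinary for loop, a break or an extra condition can be added at any depth to skip a whole branch, which is the kind of pruning the question is after.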
