Enumerating all possible scenarios - python

I am trying to find all of the possible orderings of a set of actions. Suppose I have 2 vehicles (A and B), and I use each one by sending it out and then having it return. Send and return are two distinct actions, and I want to enumerate all of the possible sequences of sending and returning these vehicles. Thus the set is ['A', 'A', 'B', 'B']. I use this code to enumerate:
from itertools import permutations
a = permutations(['A', 'A', 'B', 'B'])
# Collect the permutations
seq = []
for i in list(a):
    seq.append(i)
seq = list(set(seq))  # remove duplicates
The result is as follows:
('A', 'B', 'B', 'A')
('A', 'B', 'A', 'B')
('A', 'A', 'B', 'B')
('B', 'A', 'B', 'A')
('B', 'B', 'A', 'A')
('B', 'A', 'A', 'B')
Suppose I assume the two vehicles are identical. Then it doesn't matter which one comes first (i.e. ABBA is the same as BAAB). Here is the result I expect:
('A', 'B', 'B', 'A')
('A', 'B', 'A', 'B')
('A', 'A', 'B', 'B')
I can do this easily by removing the last three elements. However, I run into a problem when I try to do the same thing for three vehicles (a = permutations(['A', 'A', 'B', 'B', 'C', 'C'])). How can I ensure that the result already accounts for the three vehicles being identical?

One way would be to generate all the permutations, then keep only those where the first mention of each vehicle is in alphabetical order.
In recent versions of Python, dict retains first-insertion order, so we can use it to determine the first mentions; something like:
from itertools import permutations
seq = set()
for i in permutations(['A', 'A', 'B', 'B']):
    first_mentions = {car: None for car in i}.keys()
    if list(first_mentions) == sorted(first_mentions):
        seq.add(i)
(This works in practice since CPython 3.6, and is officially guaranteed since Python 3.7.)
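As a rough sketch of how the same filter extends to the three-vehicle case from the question, the check can be wrapped in a helper. The name canonical_sequences is made up for this illustration, and dict.fromkeys is used here as an equivalent way to read off the first mentions.

```python
from itertools import permutations

def canonical_sequences(vehicles):
    """Keep only orderings whose first mentions appear in alphabetical order.

    Hypothetical helper for illustration; relies on dicts preserving
    insertion order (guaranteed from Python 3.7).
    """
    labels = sorted(set(vehicles))
    result = set()
    for perm in permutations(vehicles):
        # dict.fromkeys keeps each label once, in order of first appearance
        first_mentions = list(dict.fromkeys(perm))
        if first_mentions == labels:
            result.add(perm)
    return sorted(result)

two = canonical_sequences(['A', 'A', 'B', 'B'])
three = canonical_sequences(['A', 'A', 'B', 'B', 'C', 'C'])
print(two)         # the three sequences expected in the question
print(len(three))  # 15
```

For three vehicles there are 90 distinct orderings of the letters, and each group of 6 relabelings collapses to one representative, which is where the 15 comes from.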

from itertools import permutations
a = permutations(['A', 'A', 'B', 'B'])
seq = []
for i in list(a):
    if i[0] == 'A':
        seq.append(i)
seq = list(set(seq))
print(seq)
Try this; I think it should do for the two-vehicle case (it only checks which vehicle comes first).

Related

Is there a more elegant way of getting permutations with replacement in python?

I currently want all permutations of a set of elements with replacement.
Example:
elements = ['a', 'b']
permutations with replacement =
[('a', 'a', 'a'),
('a', 'a', 'b'),
('a', 'b', 'a'),
('a', 'b', 'b'),
('b', 'a', 'a'),
('b', 'a', 'b'),
('b', 'b', 'a'),
('b', 'b', 'b')]
The only way I have been able to do this so far is with itertools.product as follows:
import itertools as it
sample_space = ['a', 'b']
outcomes = it.product(sample_space, sample_space, sample_space)
list(outcomes)
I am just wondering if there is a better way to do this, as it is obvious that this can get unwieldy and error-prone as the sample space and required length get larger.
I was expecting to find something along the lines of itertools.permutations(['a', 'b'], length=3, replace=True), maybe?
I tried itertools.permutations, but the only arguments are iterable and r, which is the length required.
The output for the above example using it.permutations(sample_space, 3) would be an empty list [].
If you're sampling with replacement, what you get is by definition not a permutation (which just means "rearrangement") of the set. So I wouldn't look to the permutations function for this. product is the right thing; e.g. itertools.product(['a','b'], repeat=3).
Note that if you sample from a two-element set with replacement N times, then you have effectively created an N-digit binary number, with all the possibilities that go with it. You've just swapped in 'a' and 'b' for 0 and 1.
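A minimal sketch of the repeat= form mentioned above, wrapped in a small helper so the length is a parameter. The function name sequences_with_replacement is only illustrative.

```python
from itertools import product

def sequences_with_replacement(elements, length):
    """Return every length-long sequence drawn from elements, with repeats allowed."""
    return list(product(elements, repeat=length))

outcomes = sequences_with_replacement(['a', 'b'], 3)
print(len(outcomes))  # 8, i.e. 2 ** 3
print(outcomes[5])    # ('b', 'a', 'b'), matching binary 101 with 'a' as 0 and 'b' as 1
```

Because product yields its results in lexicographic order, the index of each outcome is the binary number it spells out, which is the correspondence described in the answer.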

Storing The Output of a Permutation as a List of Lists

When I run the following code I get rows of tuples:
import itertools
perm = itertools.permutations(['A','B','C','D','E','F'],4)
for val in perm:
    print(val)
How do I make the code give me the output as a single list of lists instead of rows of tuples?
When I run the code I get something like this
('F', 'E', 'B', 'C')
('F', 'E', 'B', 'D')
('F', 'E', 'C', 'A')
('F', 'E', 'C', 'B')
etc.
What I want is something like this
[['F', 'E', 'B', 'C'],
['F', 'E', 'B', 'D'],
['F', 'E', 'C', 'A'],...,]
Cast val to a list and append it to another list.
import itertools
perm = itertools.permutations(['A','B','C','D','E','F'],4)
result = []
for val in perm:
    result.append(list(val))
print(result)
The question is, do you want to generate all permutations and store them?
As you have it now, the generator will give you one permutation each time, which is memory efficient.
You can generate all of them into a list of lists, but just think if you really want that, since the number of permutations could be very large.
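To make that trade-off concrete, here is a small sketch showing both options: building the full list of lists with a comprehension, and taking only the first few permutations lazily with itertools.islice. The variable names are chosen for this example only.

```python
from itertools import islice, permutations

letters = ['A', 'B', 'C', 'D', 'E', 'F']

# Materialise everything: 6 * 5 * 4 * 3 = 360 lists held in memory at once
all_perms = [list(p) for p in permutations(letters, 4)]
print(len(all_perms))  # 360

# Or stay lazy and convert only as many as are actually needed
first_three = [list(p) for p in islice(permutations(letters, 4), 3)]
print(first_three)
```

For six letters the full list is small, but the count grows factorially, so the lazy form is usually the safer default for larger inputs.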

Why reading/unpacking data from itertools.permutation changed its attribute/content?

I am trying to use permutations from itertools, and I noticed that if I try to unpack/read the data from the permutations object, its contents seem to change:
from itertools import permutations
a = permutations('abc')
print(('a', 'b', 'c') in a)
for x in a:
    print(x)
print(('a', 'b', 'c') in a)
for x in a:
    print(x)
Output:
True
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')
False
Why does this happen? I checked the official documentation and cannot find any clue.
My environment is PyCharm with Python 3.7.4.
As others already said, the problem is that a is not a list but an iterator (it behaves like a generator); that is, a sequence that gets used up as you iterate over it, so you can only iterate over it once.
If you look carefully, you'll see that your first print loop only printed five of the six permutations; the first permutation disappeared when you checked it against ('a', 'b', 'c') in your first print statement. The for-loop then prints out what's left, and the rest of your code is trying to drink from an empty cup.
To get the behavior you expect, make a into a list like this:
a = list(permutations('abc'))
And when you get a chance, read up on generators, iterators, and "comprehensions"; they're everywhere in Python (often hidden in plain sight), and they're great.
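The consumption described above can be observed directly. This is a small sketch with names chosen for illustration; it relies on the fact that a membership test on an iterator pulls items until it finds a match.

```python
from itertools import permutations

perms = permutations('abc')

# ('a', 'b', 'c') is the first permutation, so the test consumes exactly one item
found_first = ('a', 'b', 'c') in perms

remaining = list(perms)  # the five permutations that are left
exhausted = list(perms)  # the iterator is now used up

print(found_first)     # True
print(len(remaining))  # 5
print(exhausted)       # []
```

A second membership test at this point would scan an empty iterator and return False, which is the result seen in the question.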
Get rid of the iterator by converting the output to a list. The second comparison fails because the iterator has already been used up: itertools.permutations returns an iterator, which moves on to the next value each time one is consumed, including during a comparison.
CODE:
from itertools import permutations
a = list(permutations('abc'))
print(('a', 'b', 'c') in a)
for x in a:
    print(x)
print(('a', 'b', 'c') in a)
for x in a:
    print(x)
OUTPUT:
True
('a', 'b', 'c')
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')
True
('a', 'b', 'c')
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')

Remove some duplicates from list in python

UPDATE: I believe I found the solution. I've put it at the end.
Let’s say we have this list:
a = ['a', 'a', 'b', 'b', 'a', 'a', 'c', 'c']
I want to create another list to remove the duplicates from list a, but at the same time, keep the ratio approximately intact AND maintain order.
The output should be:
b = ['a', 'b', 'a', 'c']
EDIT: To explain better, the ratio doesn't need to be exactly intact. All that's required is the output of ONE single letter for all letters in the data. However, two letters might be the same but represent two different things. The counts are important to identify this, as I say later. Letters representing ONE unique variable appear in counts between 3000-3400, so when I divide the total count by 3500 and round it, I know how many times it should appear in the end, but the problem is I don't know what order they should be in.
To illustrate this I'll include one more input and desired output:
Input: ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'a', 'a', 'd', 'd', 'a', 'a']
Desired Output: ['a', 'a', 'b', 'c', 'a', 'd', 'a']
Note that 'c' has been repeated three times. The ratio need not be preserved exactly; all I need to represent is how many times that variable is represented, and because it's repeated only 3 times in this example, that isn't considered enough for it to count as two.
The only difference is that here I'm assuming all letters repeating exactly twice are unique, although in the data-set, again, uniqueness is dependent on the appearance of 3000-3400 times.
Note(1): This doesn't necessarily need to be considered but there's a possibility that not all letters will be grouped together nicely, for example, considering 4 letters for uniqueness to make it short: ['a','a','b','a','a','b','b','b','b'] should still be represented as ['a','b']. This is a minor problem in this case, however.
EDIT:
Example of what I've tried and successfully done:
full_list = ['a', 'a', 'b', 'b', 'a', 'a', 'c', 'c']
# full_list is a list containing around 10k items, just using this as example
rep = 2  # number of estimated repetitions for unique item,
         # in the real list this was set to 3500
quant = {'a': 0, "b" : 0, "c" : 0, "d" : 0, "e" : 0, "f" : 0, "g": 0}
for x in set(full_list):
    quant[x] = round(full_list.count(x)/rep)
final = []
for x in range(len(full_list)):
    if full_list[x] in final:
        lastindex = len(full_list) - 1 - full_list[::-1].index(full_list[x])
        if lastindex == x and final.count(full_list[x]) < quant[full_list[x]]:
            final.append(full_list[x])
    else:
        final.append(full_list[x])
print(final)
My problem with the above code is two-fold:
If there are more than 2 repetitions of the same data, it will not count them correctly. For example: ['a', 'a', 'b', 'b', 'a', 'a', 'c', 'c', 'a', 'a'] should become ['a','b','a','c','a'] but instead it becomes ['a','b','c','a'].
It takes a very long time to finish, as I'm sure it's a very inefficient way to do this.
Final remark: The code I've tried was more of a little hack to achieve the desired output on the most common input, however it doesn't do exactly what I intended it to. It's also important to note that the input changes over time. Repetitions of single letters aren't always the same, although I believe they're always grouped together, so I was thinking of making a flag that is True when it hits a letter and becomes false as soon as it changes to a different one, but this also has the problem of not being able to account for the fact that two letters that are the same might be put right next to each other. The count for each letter as an individual is always between 3000-3400, so I know that if the count is above that, there are more than 1.
UPDATE: Solution
Following hiro protagonist's suggestion with minor modifications, the following code seems to work:
full = ['a', 'a', 'b', 'b', 'a', 'a', 'c', 'c', 'a', 'a']
from itertools import groupby
letters_pre = [key for key, _group in groupby(full)]
letters_post = []
for x in range(len(letters_pre)):
    if x > 0 and letters_pre[x] != letters_pre[x-1]:
        letters_post.append(letters_pre[x])
    if x == 0:
        letters_post.append(letters_pre[x])
print(letters_post)
The only problem is that it doesn't consider that sometimes letters can appear in between unique ones, as described in "Note(1)", but that's only a very minor issue. The bigger issue is that it doesn't consider when two separate occurrences of the same letter are consecutive, for example (two for uniqueness as example): ['a','a','a','a','b','b'] gets turned into ['a','b'] when the desired output should be ['a','a','b'].
This is where itertools.groupby may come in handy:
from itertools import groupby
a = ["a", "a", "b", "b", "a", "a", "c", "c"]
res = [key for key, _group in groupby(a)]
print(res) # ['a', 'b', 'a', 'c']
This is a version where you could 'scale' down the unique keys (but are guaranteed to have at least one in the result):
from itertools import groupby, repeat, chain
a = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'c', 'c', 'a', 'a',
     'd', 'd', 'a', 'a']
scale = 0.4
key_count = tuple((key, sum(1 for _item in group)) for key, group in groupby(a))
# (('a', 4), ('b', 2), ('c', 5), ('a', 2), ('d', 2), ('a', 2))
res = tuple(
    chain.from_iterable(
        (repeat(key, round(scale * count) or 1)) for key, count in key_count
    )
)
# ('a', 'a', 'b', 'c', 'c', 'a', 'd', 'a')
There may be smarter ways to determine the scale (probably based on the length of the input list a and the average group length).
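One possible way to tie the scale to the question's own parameter is to divide each run length by the expected repetition count (rep in the question, i.e. scale = 1 / rep). This is only a sketch under that assumption; collapse_runs is a name invented here, and run lengths that fall exactly halfway between two multiples of rep depend on how round() breaks ties.

```python
from itertools import groupby

def collapse_runs(items, rep):
    """Emit each run's key once per roughly `rep` repetitions, and at least once."""
    result = []
    for key, group in groupby(items):
        run_length = sum(1 for _ in group)
        copies = max(1, round(run_length / rep))
        result.extend([key] * copies)
    return result

print(collapse_runs(['a', 'a', 'a', 'a', 'b', 'b'], 2))
# ['a', 'a', 'b']
print(collapse_runs(['a', 'a', 'b', 'b', 'a', 'a', 'c', 'c', 'a', 'a'], 2))
# ['a', 'b', 'a', 'c', 'a']
```

This handles two adjacent occurrences of the same letter, but it does not address the interleaved case described in "Note(1)".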
Might be a strange one, but:
b = []
for i in a:
    if next(iter(b[::-1]), None) != i:
        b.append(i)
print(b)
Output:
['a', 'b', 'a', 'c']

How to unpack a list?

When extracting data from a list this way
line[0:3], line[3][:2], line[3][2:]
I receive a list and two strings after it, as should be expected:
(['a', 'b', 'c'], 'd', 'e')
I need to manipulate the list so the end result is
('a', 'b', 'c', 'd', 'e')
How? Thank you.
P.S. Yes, I know that I can write down the first element as line[0], line[1], line[2], but I think that's a pretty awkward solution.
from itertools import chain
print(tuple(chain(['a', 'b', 'c'], 'd', 'e')))
Output:
('a', 'b', 'c', 'd', 'e')
Try this.
line = ['a', 'b', 'c', 'de']
tuple(line[0:3] + [line[3][:1]] + [line[3][1:]])
('a', 'b', 'c', 'd', 'e')
NOTE:
I think there is some funny business in your slicing logic.
If [2:] returns any characters, [:2] must return 2 characters.
Please provide your input line.
Obvious answer: Instead of your first line, do:
line[0:3] + [line[3][:2], line[3][2:]]
That works assuming that line[0:3] is a list. Otherwise, you may need to make some minor adjustments.
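In Python 3.5 and later, the same result can also be written with star-unpacking inside the tuple. The sketch below assumes the sample input ['a', 'b', 'c', 'de'] used in another answer, and splits the last item at index 1 so both halves are non-empty.

```python
line = ['a', 'b', 'c', 'de']  # assumed sample input, not from the question

# *line[0:3] spreads the slice's items into the surrounding tuple
result = (*line[0:3], line[3][:1], line[3][1:])
print(result)  # ('a', 'b', 'c', 'd', 'e')
```

This avoids building intermediate lists and works for any iterable in the starred position, not only lists.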
This function flattens one level of nesting:
def merge(seq):
    merged = []
    for s in seq:
        for x in s:
            merged.append(x)
    return merged
source: http://www.testingreflections.com/node/view/4930
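def is_iterable(i):
    # Treat strings as single items: in Python 3 they have __iter__,
    # and iterating a one-character string would recurse forever
    return hasattr(i, '__iter__') and not isinstance(i, (str, bytes))
def iterative_flatten(List):
    for item in List:
        if is_iterable(item):
            for sub_item in iterative_flatten(item):
                yield sub_item
        else:
            yield item
def flatten_iterable(to_flatten):
    return tuple(iterative_flatten(to_flatten))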
This should work for any level of nesting.
