I have a coding problem that is very hard to explain, so I came up with this fruit bag swapping problem that is very easy to understand:
A smart fruiterer sells 3 kinds of fruit: apples (denoted as 'A'), bananas (denoted as 'B') and coconuts (denoted as 'C'). The weird thing about him is that he always prepares 10 bags (bag ids 0 to 9): 3 small bags that hold 2 fruits each, 3 medium bags that hold 4 fruits each, 3 large bags that hold 8 fruits each, and one extra large bag that holds 16 fruits. He only puts one type of fruit into each bag. This can be abstracted to the following Python code:
import random
from string import ascii_uppercase
#length = 30
#max_size = 10
#bag_sizes = [random.randint(1, max_size) for _ in range(length)]
n_fruit = 3
bag_sizes = [2, 2, 2, 4, 4, 4, 8, 8, 8, 16]
random.seed(55)
fruiterer_bags = [[random.choice(ascii_uppercase[:n_fruit])] * bag_sizes[i]
                  for i in range(len(bag_sizes))]
print(fruiterer_bags)
Output:
[['A', 'A'], ['A', 'A'], ['A', 'A'],
['C', 'C', 'C', 'C'], ['B', 'B', 'B', 'B'], ['A', 'A', 'A', 'A'],
['C', 'C', 'C', 'C', 'C', 'C', 'C', 'C'],
['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
['B', 'B', 'B', 'B', 'B', 'B', 'B', 'B'],
['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A']]
Note that the fruiterer fills his bags exactly like this every day.
The supplier comes to the fruit shop every day with the same 10 bags: 3 small, 3 medium, 3 large and 1 extra large. His bags are also assigned ids 0 to 9. He also only puts one type of fruit in each bag, but the fruit he puts in each bag is not necessarily the same as the fruiterer's. Every day he comes with the following bags:
random.seed(100)
supplier_bags = [[random.choice(ascii_uppercase[:n_fruit])] * bag_sizes[i]
                 for i in range(len(bag_sizes))]
print(supplier_bags)
Output:
[['A', 'A'], ['B', 'B'], ['B', 'B'],
['A', 'A', 'A', 'A'], ['C', 'C', 'C', 'C'], ['B', 'B', 'B', 'B'],
['C', 'C', 'C', 'C', 'C', 'C', 'C', 'C'],
['B', 'B', 'B', 'B', 'B', 'B', 'B', 'B'],
['B', 'B', 'B', 'B', 'B', 'B', 'B', 'B'],
['C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C']]
The smart thing about this fruiterer is that every time the supplier comes, he wants to swap some fruit bags because they are fresher. The supplier says "OK, fine. You can swap however you like, you can do as many swaps as you want, but:
I don't swap the bags with the same type of fruits in them;
I must have the same amount of each fruit after all the swaps;
I only swap whole bags, not individual fruit;
I only swap bags with same ids;
Each bag can only be swapped once;
PLEASE BE FAST!"
Since supplier_bags has to have the same amount of each fruit after the swaps, fruiterer_bags will of course also have the same amount of each fruit after the swaps. In this way, neither the fruiterer nor the supplier changes the amount of each fruit on paper, but the smart fruiterer always gets fresher fruit. Every day the fruiterer and the supplier come with the same bags of fruit, but hopefully they can find very different ways of swapping bags (randomness).
My question is: is there any algorithm that finds one possible solution very fast? The expected output is the indices of the bags being swapped. For example, one solution could be
swap_ids = [1, 2, 3, 4]
and the supplier_bags after the swaps becomes:
new_supplier_bags = supplier_bags.copy()
for i in swap_ids:
    new_supplier_bags[i] = fruiterer_bags[i]
print(new_supplier_bags)
Output:
[['A', 'A'], ['A', 'A'], ['A', 'A'],
['C', 'C', 'C', 'C'], ['B', 'B', 'B', 'B'], ['B', 'B', 'B', 'B'],
['C', 'C', 'C', 'C', 'C', 'C', 'C', 'C'],
['B', 'B', 'B', 'B', 'B', 'B', 'B', 'B'],
['B', 'B', 'B', 'B', 'B', 'B', 'B', 'B'],
['C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C']]
which still has the same amount of each fruit as before swaps. This can be checked by
assert sorted([f for bag in new_supplier_bags for f in bag]) == sorted(
[f for bag in supplier_bags for f in bag])
Although I don't care about fruiterer_bags at all, it must also have the same amount of each fruit after the swaps.
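The supplier's rules can also be checked mechanically for any proposed list of ids. The following is only a minimal sketch: `is_valid_swap` is a hypothetical helper name (not from the question), and the two bag lists are hard-coded from the outputs printed above rather than regenerated from the seeds.

```python
from collections import Counter

def is_valid_swap(fruiterer_bags, supplier_bags, swap_ids):
    """Check a proposed list of bag ids against the supplier's rules."""
    ids = set(swap_ids)
    # Each bag can only be swapped once.
    if len(ids) != len(swap_ids):
        return False
    # Bags holding the same type of fruit are never swapped.
    if any(fruiterer_bags[i] == supplier_bags[i] for i in ids):
        return False
    # Whole bags with the same id change sides; compare fruit counts before and after.
    before = Counter(f for bag in supplier_bags for f in bag)
    after_bags = [fruiterer_bags[i] if i in ids else bag
                  for i, bag in enumerate(supplier_bags)]
    after = Counter(f for bag in after_bags for f in bag)
    return before == after

bag_sizes = [2, 2, 2, 4, 4, 4, 8, 8, 8, 16]
# One letter per bag, copied from the printed outputs in the question.
fruiterer_bags = [[f] * n for f, n in zip("AAACBACABA", bag_sizes)]
supplier_bags = [[f] * n for f, n in zip("ABBACBCBBC", bag_sizes)]

print(is_valid_swap(fruiterer_bags, supplier_bags, [1, 2, 3, 4]))  # True
print(is_valid_swap(fruiterer_bags, supplier_bags, [1, 2]))        # False
```

Note that this sketch treats an empty list of ids as trivially valid; whether that should count as a solution is a separate decision.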
I don't want all possible solutions. I just want one fast solution with randomness, because in reality I am dealing with lists of more than 200 "bag"s with various sizes and many types of "fruit"s, which is infeasible for an enumeration. I want an algorithm to work for any given n_fruit, bag_sizes, fruiterer_bags and supplier_bags. I guess a greedy algorithm will do.
Also, is there any way to quickly check whether there is a solution at all? If not, maybe we can say there is no solution after the greedy algorithm suggests e.g. 1000 wrong solutions (if fast enough).
Any suggestions?
Here is how:
import random
from string import ascii_uppercase
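def swaps(bags1, bags2):
    # Sorted multiset of fruit in bags1; any valid swap must preserve it.
    fruits = sorted([i for j in bags1 for i in j])
    # Pair up the bags that share an id: zipped[i] == [bags1[i], bags2[i]].
    zipped = list(map(list, zip(bags1, bags2)))
    # Only bags holding different fruit are candidates for a swap.
    idxs = [i for i, (b1, b2) in enumerate(zipped) if b1 != b2]
    while True:  # note: loops forever if no valid swap exists
        random.shuffle(idxs)
        for n, i in enumerate(idxs):
            zipped[i].reverse()  # swap bag i between the two sides
            if fruits == sorted(fruit for b1, _ in zipped for fruit in b1):
                return idxs[:n + 1]
        # No prefix of this shuffle worked: undo all swaps and try again.
        zipped = list(map(list, zip(bags1, bags2)))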
n_fruit = 3
bag_sizes = [2, 2, 2, 4, 4, 4, 8, 8, 8, 16]
random.seed(55)
fruiterer_bags = [[random.choice(ascii_uppercase[:n_fruit])] * i for i in bag_sizes]
random.seed(100)
supplier_bags = [[random.choice(ascii_uppercase[:n_fruit])] * i for i in bag_sizes]
random.seed()
print(swaps(fruiterer_bags, supplier_bags))
Running the code generates a random solution. Here are the results for 10 runs:
[1, 4, 2, 3]
[4, 5, 3]
[5, 4, 3]
[4, 3, 5]
[1, 2, 4, 3]
[3, 2, 4, 1]
[2, 3, 4, 1]
[3, 5, 4]
[4, 3, 5]
[5, 4, 3]
What I did in the code:
Define a list by zipping the two nested lists of strings, and define a list of all the indices at which the two nested lists differ.
Shuffle the list of indices and loop through them, reversing the element of the zipped list at the index of each iteration.
At any point in the loop, if the sorted strings from the first element of each pair in the zipped list equal the sorted strings of the original first nested list, we have a solution and can return it.
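The question also asks about giving up after, e.g., 1000 failed attempts. The following is a hedged variant of the same shuffle-and-test idea, not the code above: `random_swap`, `max_tries` and `rng` are names assumed for illustration. Instead of re-sorting every fruit after each swap, it keeps a running `Counter` of the net change, and it returns `None` when no shuffle succeeds within the budget (which does not prove that no solution exists).

```python
import random
from collections import Counter

def random_swap(bags1, bags2, max_tries=1000, rng=random):
    """Return a random list of bag ids whose swap preserves fruit counts, or None."""
    # Only bags holding different fruit may be swapped.
    idxs = [i for i, (b1, b2) in enumerate(zip(bags1, bags2)) if b1 != b2]
    for _ in range(max_tries):
        rng.shuffle(idxs)
        delta = Counter()  # net change in bags1's fruit counts so far
        for n, i in enumerate(idxs):
            delta.update(bags2[i])    # fruit gained by side 1
            delta.subtract(bags1[i])  # fruit given away by side 1
            if not any(delta.values()):  # every count is back to zero
                return idxs[:n + 1]
    return None  # budget exhausted; a solution may or may not exist

bag_sizes = [2, 2, 2, 4, 4, 4, 8, 8, 8, 16]
# One letter per bag, copied from the printed outputs in the question.
fruiterer_bags = [[f] * n for f, n in zip("AAACBACABA", bag_sizes)]
supplier_bags = [[f] * n for f, n in zip("ABBACBCBBC", bag_sizes)]

print(random_swap(fruiterer_bags, supplier_bags))
print(random_swap([['A']], [['B']]))  # None: a single mismatched bag can never balance
```

Each attempt costs time proportional to the total number of fruit in the candidate bags, so a budget of 1000 tries on about 200 bags should still be cheap.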
I am trying to iterate over two lists, A and B, where B is equal to A with the current item A[i] removed.
For example, with listA = ['A', 'B', 'C', 'D']: for the first item, 'A', I want list B to be ['B', 'C', 'D']; for the second item, 'B', I want list B to be ['A', 'C', 'D'].
What I have tried until now.
listA = ['A', 'B', 'C', 'D']
for term in listA:
    listA.remove(term)
    for item in listA:
        print(listA)
If all you want is to print the sublists, it will be like:
for i in range(len(listA)):
    print(listA[:i] + listA[i+1:])
Or,
for i in listA:
    print(list(set(listA) - {i}))  # note: a set does not preserve the original order
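One caveat with the set-based variant is that it drops duplicates and does not guarantee the original order. If both matter, a small generator that removes by position is one option. This is only a sketch, and `others` is a hypothetical helper name:

```python
def others(items):
    """Yield (item, rest) pairs, where rest is items without that one position."""
    for i, item in enumerate(items):
        yield item, items[:i] + items[i + 1:]

listA = ['A', 'B', 'C', 'D']
for item, rest in others(listA):
    print(item, rest)
# A ['B', 'C', 'D']
# B ['A', 'C', 'D']
# C ['A', 'B', 'D']
# D ['A', 'B', 'C']
```

Because it slices by index, a list with repeated values such as ['A', 'A', 'B'] still yields ['A', 'B'] for the first item.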
Try this,
>>> la = ['A', 'B', 'C', 'D']
>>> for i in la:
...     _temp = la.copy()
...     _temp.remove(i)
...     print(_temp)
Output:
['B', 'C', 'D']
['A', 'C', 'D']
['A', 'B', 'D']
['A', 'B', 'C']
*If you want to assign the printed output to new variables, use a dictionary where the key is the name of the list and the value is the printed output.
Is this what you want?
listA = ['A', 'B', 'C', 'D']
Bs = [listA[:idx] + listA[idx + 1:] for idx in range(len(listA))]
for B in Bs:
    print(B)
Taking the above solutions a step further, you can store a reference to each of the resulting list in the corresponding variable using a dictionary comprehension:
keys_map = {x: [item for item in listA if item != x] for x in listA}
print(keys_map)
Output
{
'A': ['B', 'C', 'D'],
'B': ['A', 'C', 'D'],
'C': ['A', 'B', 'D'],
'D': ['A', 'B', 'C']
}
and access the desired key like so
keys_map.get('A')
# returns
['B', 'C', 'D']
If I have a list of lists, and I want to remove all the items after 'd', and I want to do that based on the index location of 'd' in both lists, how would I do that if the index location of 'd' is different in each list.
Is there a better way than indexing?
ab_list = ['a', 'b', 'c' ,'d','e', 'f'], ['a', 'd', 'e', 'f', 'g']
loc = []
for i in ab_list:
    loc.append(i.index('d'))
print(loc)
# output is [3, 1]
for i in ab_list:
    for l in loc:
        ab_list_keep = i[0:l]
        print(ab_list_keep)
## output is
#['a', 'b', 'c']
#['a']
#['a', 'd', 'e']
#['a']
The first two lines of the output is what I'd want, but making a list out of the index locations of 'd' doesn't seem to be right.
Python's built-in itertools.takewhile function is designed for cases like this one:
import itertools
ab_list = ['a', 'b', 'c' ,'d','e', 'f'],['a', 'd', 'e', 'f', 'g']
print([list(itertools.takewhile(lambda i: i != "d", sublist)) for sublist in ab_list])
output:
[['a', 'b', 'c'], ['a']]
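If you would rather stay with indexing, the key is to compute the index per sublist instead of collecting all indices first. A minimal sketch, where `keep_before` is a hypothetical helper name and returning the whole list when 'd' is missing is an assumption about the desired behaviour:

```python
def keep_before(seq, marker):
    """Return the items of seq that precede the first marker (all of seq if absent)."""
    try:
        return seq[:seq.index(marker)]
    except ValueError:  # list.index raises ValueError when marker is missing
        return list(seq)

ab_list = ['a', 'b', 'c', 'd', 'e', 'f'], ['a', 'd', 'e', 'f', 'g']
print([keep_before(sub, 'd') for sub in ab_list])
# [['a', 'b', 'c'], ['a']]
```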
How can I determine the longest sequence of the same letter using python?
For example I use the following code to print a shuffled list with 3 conditions A,B and C
from random import shuffle
condition = ["A"]*20
condition_B = ["B"]*20
condition_C = ["C"]*20
condition.extend(condition_B)
condition.extend(condition_C)
shuffle(condition)
print(condition)
Now I want to make sure that the same condition does not occur more than three times in a row.
E.g., allowed: [A, B, C, A, B, B, C, C, C, A, B….]
Not allowed: [A, A, B, B, B, B, C, A, B...] (because of four B’s in a row)
How can I solve this problem?
Thank you in advance.
Maybe you should build the list sequentially, rather than shuffling:
import random
condition = ["A"] * 20 + ["B"] * 20 + ["C"] * 20   # the original list
result = []
while condition:                                   # until every item is placed
    # a value is blocked if the last 3 items in result are all that value
    blocked = result[-1] if len(result) >= 3 and result[-1] == result[-2] == result[-3] else None
    choices = [i for i, c in enumerate(condition) if c != blocked]
    if not choices:                                # dead end: only blocked values remain
        condition += result                        # put everything back
        result = []                                # and start over
        continue
    idx = random.choice(choices)                   # choose a suitable one
    result.append(condition[idx])                  # add to result
    del condition[idx]                             # remove from list
print(result)
Warning! not tested - just conceptual...
# Validate with this function: it returns False if more than three consecutive characters are the same, else True.
def isValidShuffle(test_condition):
    # len - 3 start positions, so that the last window of four is checked too
    for i in range(len(test_condition) - 3):
        if len(set(test_condition[i:i + 4])) == 1:
            # set size is 1 when all four consecutive chars are the same
            return False
    return True
Simplest way to create shuffled sequence of A,B,C for which isValidShuffle will return True.
from random import shuffle
# condition list contains 20 A's 20 B's 20 C's
seq = ['A','B','C']
condition = []
for seq_i in range(20):
    shuffle(seq)
    condition += seq
print(condition) # at most two consecutive characters will be same
print(isValidShuffle(condition))
-----------------------------------------------------------------------------
Output
['A', 'B', 'C', 'B', 'C', 'A', 'C', 'B', 'A', 'C', 'B', 'A', 'C', 'A', 'B', 'C', 'B', 'A', 'B', 'A', 'C', 'B', 'C', 'A', 'B', 'C', 'A', 'C', 'A', 'B', 'B', 'C', 'A', 'B', 'A', 'C', 'A', 'B', 'C', 'C', 'A', 'B', 'A', 'B', 'C', 'B', 'A', 'C', 'C', 'A', 'B', 'B', 'C', 'A', 'B', 'A', 'C', 'A', 'B', 'C']
...............................................................................................................................................................
This does not impose your restriction while creating the shuffled sequence; instead it keeps reshuffling until it finds a sequence that meets your consecutive-character restriction.
validshuffle = False
condition = ['A']*20 + ['B']*20 + ['C']*20
while not validshuffle:
    shuffle(condition)
    if isValidShuffle(condition):
        validshuffle = True
print(condition)
-------------------------------------------------------------------------------
Output
try
try
['A', 'C', 'A', 'B', 'B', 'C', 'B', 'C', 'A', 'C', 'A', 'C', 'B', 'B', 'B', 'C', 'A', 'A', 'B', 'C', 'A', 'A', 'B', 'B', 'C', 'B', 'B', 'C', 'B', 'C', 'C', 'B', 'A', 'B', 'B', 'A', 'C', 'A', 'A', 'C', 'A', 'C', 'B', 'C', 'A', 'A', 'C', 'A', 'C', 'A', 'C', 'B', 'B', 'B', 'A', 'B', 'C', 'A', 'C', 'A']
If you just want to know how long the longest run is, you could do this.
It iterates over the sequence, records the length of each run of the same character, takes the maximum run length for each character, and then takes the maximum over all characters.
This is not exactly the problem you mention, but it could be useful.
from random import shuffle
sequence = ['A']*20 + ['B']*20 + ['C']*20
sequences = {'A': [], 'B':[], 'C':[]}
shuffle(sequence)
current = sequence[0]
acc = 0
for elem in sequence:
    if elem == current:
        acc += 1
    else:
        sequences[current].append(acc)
        current = elem
        acc = 1
else:
    # runs after the loop finishes: record the final run
    sequences[current].append(acc)
for key, seqs in sequences.items():
    sequences[key] = max(seqs)
print(max(sequences.items(), key=lambda i: i[1]))
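The same run-length bookkeeping can be sketched more compactly with `itertools.groupby`, which groups equal adjacent items. The helper names `longest_run` and `has_valid_runs` are assumptions for illustration, as is the `max_run` parameter:

```python
from itertools import groupby

def longest_run(seq):
    """Return (value, length) of the longest run of equal adjacent items."""
    runs = ((key, sum(1 for _ in group)) for key, group in groupby(seq))
    return max(runs, key=lambda run: run[1])  # raises ValueError on an empty seq

def has_valid_runs(seq, max_run=3):
    """True if no value occurs more than max_run times in a row."""
    return all(sum(1 for _ in group) <= max_run for _, group in groupby(seq))

print(longest_run(list("AABBBBCAB")))       # ('B', 4)
print(has_valid_runs(list("ABCABBCCCAB")))  # True
print(has_valid_runs(list("AABBBBCAB")))    # False
```

`has_valid_runs` can serve as the acceptance test in a reshuffle-until-valid loop, since it checks every run including the one at the very end of the list.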
I want to check the columns of a dataframe inside a for loop by using a list, then perform some operations that change the contents of that list for the next iteration. Is it possible to dynamically size the if statement described here?
Example:
df =
a|b|c|d|e
1|2|3|4|5
6|7|8|9|0
check_list = ['a']
for i in range(10):
    if check_list in df.columns:
        do x
        // variable check_list is now equal to ['a','b']
So in the first iteration the list only contains 'a', in the second iteration it contains 'a' and 'b', and in further iterations it will be changed further. I hope this adequately explains my question.
Working code that might help answer your question:
import pandas as pd
def add_column(l, all_columns):
    """Example of a function for adding new columns to the check list."""
    if len(l) < len(all_columns):
        return l + [all_columns[len(l)]]
    else:
        return l
all_columns = 'abcde'
df = pd.DataFrame([[1, 2, 3, 4, 5], [6, 7, 8, 9, 0]], columns=list(all_columns))
print(df)
check_list = ['a']
for i in range(10):
    # issuperset() seems to be the best way to check that the list includes only column names from the dataframe
    if set(df.columns).issuperset(set(check_list)):
        check_list = add_column(check_list, all_columns)
        print("checking list:", check_list)
        # do other stuff
Output:
   a  b  c  d  e
0  1  2  3  4  5
1  6  7  8  9  0
checking list: ['a', 'b']
checking list: ['a', 'b', 'c']
checking list: ['a', 'b', 'c', 'd']
checking list: ['a', 'b', 'c', 'd', 'e']
checking list: ['a', 'b', 'c', 'd', 'e']
checking list: ['a', 'b', 'c', 'd', 'e']
checking list: ['a', 'b', 'c', 'd', 'e']
checking list: ['a', 'b', 'c', 'd', 'e']
checking list: ['a', 'b', 'c', 'd', 'e']
checking list: ['a', 'b', 'c', 'd', 'e']
I hope this helps.