I have collected a large Pokemon data set and I am setting out with the goal to identify the 'Top 10 Teams' based on a ratio I constructed - Pokemon BST (base stat total) : average weakness. For those who care, I calculate average weakness as the sum of a Pokemon's weakness to each type ( 0.25 to flying + 1 to water + 2 to steel + 4 to fire, etc.) and then divide it by 18 (the total number of types available in game).
To provide a quick example - a team of the following three Pokemon: Kingler, Mimikyu, Magnezone will yield a team ratio of 1604.1365384615383.
Because the data will be used for competitive play, I removed all non-fully evolved Pokemon as well as legendary/mythical Pokemon. Here is my process so far:
1. Create a collection of all possible combinations of fully evolved Pokemon teams
2. Use a for loop to iterate over each combination
3. The first 10 combinations will automatically be added to the list
4. Starting with the 11th combination, I will add the current team iteration to the list, sort the list in descending order, and then remove the team with the lowest ratio. This ensures only the top 10 will remain after each iteration.
Obviously, this process will take an impossibly long time to run. I'm wondering if there is a more efficient way to run this. Finally, please see my code below:
import itertools
import pandas as pd
df = pd.read_csv("Downloads/pokemon.csv") # read in csv of fully-evolved Pokemon data
# list(df) # list of df column names - useful to see what data has been collected
df = df[df["is_legendary"] == 0] # remove legendary pokemon - many legendaries are allowed in competitive play
df = df[['abilities', # trim df to contain only the columns we care about
'against_bug',
'against_dark',
'against_dragon',
'against_electric',
'against_fairy',
'against_fight',
'against_fire',
'against_flying',
'against_ghost',
'against_grass',
'against_ground',
'against_ice',
'against_normal',
'against_poison',
'against_psychic',
'against_rock',
'against_steel',
'against_water',
'attack',
'defense',
'hp',
'name',
'sp_attack',
'sp_defense',
'speed',
'type1',
'type2']]
df["bst"] = df["hp"] + df["attack"] + df["defense"] + df["sp_attack"] + df["sp_defense"] + df["speed"] # calculate BSTs
df['average_weakness'] = (df['against_bug'] # calculates a Pokemon's 'average weakness' to other types
+ df['against_dark']
+ df['against_dragon']
+ df['against_electric']
+ df['against_fairy']
+ df['against_fight']
+ df['against_fire']
+ df['against_flying']
+ df['against_ghost']
+ df['against_grass']
+ df['against_ground']
+ df['against_ice']
+ df['against_normal']
+ df['against_poison']
+ df['against_psychic']
+ df['against_rock']
+ df['against_steel']
+ df['against_water']) / 18
df['bst-weakness-ratio'] = df['bst'] / df['average_weakness'] # ratio of BST:avg weakness - the higher the better
names = df["name"] # pull out list of all names for creating combinations
combinations = itertools.combinations(names, 6) # create all possible combinations of 6 pokemon teams
top_10_teams = [] # list for storing top 10 teams
for x in combinations:
    ratio = sum(df.loc[df['name'].isin(x)]['bst-weakness-ratio']) # pull out sum of team's ratio
    if(len(top_10_teams) != 10):
        top_10_teams.append((x, ratio)) # first 10 teams will automatically populate list
    else:
        top_10_teams.append((x, ratio)) # add team to list
        top_10_teams.sort(key=lambda x:x[1], reverse=True) # sort list by descending ratios
        del top_10_teams[-1] # drop team with the lowest ratio - only top 10 remain in list
top_10_teams
In your example every Pokemon has a bst-weakness-ratio, and the team value does not take into account that the members counterbalance each other's weaknesses; it simply sums the ratios of the 6 members? If so, shouldn't the best team be the one with the 6 best individual Pokemon? I don't see why you need the combinations in your case.
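To illustrate that point, here is a minimal sketch using only the standard library. The names and ratios are made up for illustration (they are not taken from your CSV), and top_teams is a hypothetical helper. It keeps the k best teams with heapq.nlargest instead of re-sorting a list on every iteration, and it shows that when the team score is a plain sum, the best team is just the best individuals.

```python
import heapq
from itertools import combinations

# Made-up per-Pokemon ratios, for illustration only
ratios = {"A": 610.0, "B": 575.5, "C": 540.2, "D": 498.7, "E": 455.1, "F": 402.9}

def top_teams(ratios, team_size, k):
    """Return the k highest-scoring teams as (score, team) pairs."""
    scored = ((sum(ratios[name] for name in team), team)
              for team in combinations(ratios, team_size))
    # nlargest keeps only k candidates in memory while consuming the generator
    return heapq.nlargest(k, scored)

best_score, best_team = top_teams(ratios, team_size=3, k=2)[0]
top_individuals = heapq.nlargest(3, ratios, key=ratios.get)
print(best_team, top_individuals)
```

With a summed score the two results always contain the same members, so the combinatorial search only becomes necessary once the team score depends on how the members interact.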
Nevertheless, I guess you could remove a lot of the Pokemon from your list before going into the combinatorics.
If you have a boolean array (n_pokemons, n_types) indicating the weaknesses of each Pokemon with True, you could check if there is a Pokemon with the same weaknesses but a better bst value.
# Loop over all pokemon and check if there are other pokemon
# ... with the exact same weaknesses but better stats
# -name -weaknesses -bst
# pokemon A [0, 0, 1, 1, 0, ...], bst=34.85 -> delete A
# pokemon B [0, 0, 1, 1, 0, ...], bst=43.58
# ... with a subset of the weaknesses and better stats
# pokemon A [0, 0, 1, 1, 0, ...], bst=34.85 -> delete A
# pokemon B [0, 0, 1, 0, 0, ...], bst=43.58
I wrote a little snippet using numpy. The values for bst and the weaknesses are
chosen randomly. With my settings
n_pokemons = 1000
n_types = 18
n_min_weaknesses = 1 # number of minimal and maximal weaknesses for each Pokemon
n_max_weaknesses = 4
Only about 30-40 pokemons remain in the list. I am not sure how plausible this is for 'real' pokemons but with such a number a combinatorial search is way more feasible.
import numpy as np
# Generate pokemons
name_arr = np.array(['pikabra_{}'.format(i) for i in range(n_pokemons)])
# Random stats
bst_arr = np.random.random(n_pokemons) * 100
# Random weaknesses
weakness_array = np.zeros((n_pokemons, n_types), dtype=bool) # bool array indicating the weak types of each pokemon
for i in range(n_pokemons):
    rnd_weaknesses = np.random.choice(np.arange(n_types), np.random.randint(n_min_weaknesses, n_max_weaknesses+1))
    weakness_array[i, rnd_weaknesses] = True
# Remove unnecessary pokemons
i = 0
while i < n_pokemons:
    j = i + 1
    while j < n_pokemons:
        del_idx = None
        combined_weaknesses = np.logical_or(weakness_array[i], weakness_array[j])
        if np.all(weakness_array[i] == weakness_array[j]):
            # identical weaknesses: delete the one with the lower bst
            if bst_arr[i] < bst_arr[j]:
                del_idx = i
            else:
                del_idx = j
        elif np.all(combined_weaknesses == weakness_array[i]) and bst_arr[i] < bst_arr[j]:
            # j's weaknesses are a subset of i's and j has the better bst
            del_idx = i
        elif np.all(combined_weaknesses == weakness_array[j]) and bst_arr[j] < bst_arr[i]:
            # i's weaknesses are a subset of j's and i has the better bst
            del_idx = j
        if del_idx is not None:
            name_arr = np.delete(name_arr, del_idx, axis=0)
            bst_arr = np.delete(bst_arr, del_idx, axis=0)
            weakness_array = np.delete(weakness_array, del_idx, axis=0)
            n_pokemons -= 1
            if del_idx == i:
                i -= 1
                break
            else:
                j -= 1
        j += 1
    i += 1
print(n_pokemons)
I am working in Python with a dataset that looks like the following
Original Dataset:
Where
Card Number - Unique client identifier
Store Number - Unique store identifier
Count - Count of times a unique store has been visited by a unique client
Sum_Check Subtotal Accrued - Sum a client has spent at a unique store
Max_Date - Last time the unique client visited the unique store
I am trying to turn this into a dataframe that contains the Card Number and Store Number with the following logic applied in this order:
the most visits
if the amount of visits is tied at 2+, I want the Store Number with the highest spend
If the amount of visits is tied at 1 between multiple locations I want the most recently visited location.
So the final output should look as follows:
Currently my code looks like this
#sorting the values so that the most visited locations are at the bottom of the group followed by the highest spend.
#This allows for in the event of a tie for the algo to go to the check subtotal sum field and take the largest value
df = df.sort_values(['Card Number', 'Count', 'Sum_Check Subtotal Accrued', 'Max_Date']).drop_duplicates('Card Number', keep='last')
#dropping fields we no longer need now that our dataset is summarized
df=df.drop(['Count', 'Sum_Check Subtotal Accrued', 'Max_Date'], axis = 1)
Which was working until the 3rd logic point was added which requires me to pull the most recent visit if tied at 1. I have tried adding the "Max_Date" field to the above code. However, the "Sum_Check Subtotal Accrued" field doesn't allow this to work for the clients tied at 1.
I am guessing some sort of If statement can solve this but am conceptually stuck on how to approach in this way
Any help is greatly appreciated.
Ok I think I got it:
import pandas as pd
CN = [1, 1, 2, 2, 3, 4, 4, 5, 5, 5]
SN = [111, 222, 111, 222, 444, 22, 55, 22, 222, 888]
Count = [2, 1, 1, 1, 1, 1, 1, 1, 1, 1]
SCSA = [40, 100, 50, 20, 30, 20, 50, 2, 200, 100]
Date = ["1/2/2021", "2/2/2021", "3/2/2021", "3/1/2021", "5/1/2021", "7/11/2022", "6/1/2018", "7/11/2022", "3/4/2020" ,"1/2/2019"]
df = pd.DataFrame({"Card":CN, "Store":SN, "Count":Count, "SCSA":SCSA, "Date":Date})
df["Date"] = pd.to_datetime(df["Date"]) # compare real dates rather than strings
cards = df.Card.unique()
storeList = []
# Loop through each card uniquely, checking for their max values
for x in cards:
    Card = df[df.Card == x]
    countMax = Card.Count.max()
    dateMax = Card.Date.max()
    mostVisited = Card[Card.Count == countMax] # stores tied for the most visits
    # If there is only one store with the max visits, add it to the list
    if len(mostVisited) < 2:
        storeList.append(mostVisited.Store.values[0])
    # If the number of visits is >= 2 and there is more than 1 store with this number of visits...
    elif countMax >= 2:
        scsaMax = mostVisited.SCSA.max() # Find the highest spending of the stores that were visited the most
        storeList.append(mostVisited.Store[mostVisited.SCSA == scsaMax].values[0]) # add the store with the most spending of the stores that were visited the most
    # Otherwise, just add the most recently visited store to the list
    else:
        storeList.append(Card.Store[Card.Date == dateMax].values[0])
pd.DataFrame({"Card Number":cards, "Store Number":storeList})
Output:
Card Number Store Number
1 111
2 111
3 444
4 22
5 22
I changed some of the visit counts and SCSA values to make sure it was still printing out what I expected it to, seems to be right now.
Try this:
(df.sort_values(['Count','Max_Date','Sum_Check Subtotal Accrued'],ascending = [0,0,0])
.groupby('Card Number')[['Card Number','Store Number']]
.head(1)
.sort_values('Card Number'))
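For reference, the three rules from the question can also be written as a single comparison key, independent of pandas. This is only a sketch: it reuses the sample rows from the first answer, assumes the dates are month/day/year, and preference_key is a hypothetical helper name.

```python
from datetime import datetime

# (card, store, count, spend, last visit), sample rows from the first answer
rows = [
    (1, 111, 2, 40, "1/2/2021"), (1, 222, 1, 100, "2/2/2021"),
    (2, 111, 1, 50, "3/2/2021"), (2, 222, 1, 20, "3/1/2021"),
    (3, 444, 1, 30, "5/1/2021"),
    (4, 22, 1, 20, "7/11/2022"), (4, 55, 1, 50, "6/1/2018"),
    (5, 22, 1, 2, "7/11/2022"), (5, 222, 1, 200, "3/4/2020"),
    (5, 888, 1, 100, "1/2/2019"),
]

def preference_key(row):
    """Most visits first; ties at 2+ use spend, ties at 1 use recency."""
    _card, _store, count, spend, last_visit = row
    if count >= 2:
        tie_break = spend
    else:
        # Assumption: dates are month/day/year
        tie_break = datetime.strptime(last_visit, "%m/%d/%Y").toordinal()
    return (count, tie_break)

best_row = {}
for row in rows:
    card = row[0]
    if card not in best_row or preference_key(row) > preference_key(best_row[card]):
        best_row[card] = row

result = {card: row[1] for card, row in best_row.items()}
print(result)
```

Rows with different visit counts are decided by the count alone, so the second element of the key is only ever compared between rows of the same kind.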
A recruiter wants to form a team with different skills and he wants to pick the minimum number of persons which can cover all the required skills.
N represents number of persons and K is the number of distinct skills that need to be included. list spec_skill = [[1,3],[0,1,2],[0,2,4]] provides information about skills of each person. e.g. person 0 has skills 1 and 3, person 1 has skills 0, 1 and 2 and so on.
The code should output the size of the smallest team the recruiter could find (the minimum number of persons) and the IDs of the people to recruit onto the team.
I implemented a brute-force solution as below, but since some inputs contain thousands of entries, it seems the problem needs to be solved with a heuristic approach. An approximate answer is acceptable in this case.
Any suggestions on how to solve it with heuristic methods would be appreciated.
import itertools
N,K = 3,5
spec_skill = [[1,3],[0,1,2],[0,2,4]]
A = list(range(K))
set_a = set(A)
solved = False
for L in range(0, len(spec_skill)+1):
    for subset in itertools.combinations(spec_skill, L):
        s = set(item for sublist in subset for item in sublist)
        if set_a.issubset(s):
            print(str(len(subset)) + '\n' + ' '.join([str(spec_skill.index(item)) for item in subset]))
            solved = True
            break
    if solved: break
Here is my way of doing this. There might be potential optimization possibilities in the code, but the base idea should be understandable.
import random
import time
def man_power(lst, K, iterations=None, period=0):
    """
    Specify a fixed number of iterations
    or a period in seconds to limit the total computation time.
    """
    # mapping each sublist into a (sublist, original_index) tuple
    lst2 = [(lst[i], i) for i in range(len(lst))]
    mini_sample = [0]*(len(lst)+1)
    if period<0 or (period == 0 and iterations is None):
        raise AttributeError("You must specify iterations or a positive period")
    def shuffle_and_pick(lst, iterations):
        mini = [0]*len(lst)
        for _ in range(iterations):
            random.shuffle(lst2)
            skillset = set()
            chosen_ones = []
            idx = 0
            fullset = True
            # Breaks from the loop when all skillsets are found
            while len(skillset) < K:
                # No need to go further, we didn't find a better combination
                if len(chosen_ones) >= len(mini):
                    fullset = False
                    break
                before = len(skillset)
                skillset.update(lst2[idx][0])
                after = len(skillset)
                if after > before:
                    # We append with the original index of the sublist
                    chosen_ones.append(lst2[idx][1])
                idx += 1
            if fullset:
                mini = chosen_ones.copy()
        return mini
    # Estimates how many iterations we can do in the specified period
    if iterations is None:
        t0 = time.perf_counter()
        mini_sample = shuffle_and_pick(lst, 1)
        iterations = int(period / (time.perf_counter() - t0)) - 1
    mini_result = shuffle_and_pick(lst, iterations)
    if len(mini_sample)<len(mini_result):
        return mini_sample, len(mini_sample)
    else:
        return mini_result, len(mini_result)
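As a deterministic alternative to random shuffling, the classic greedy heuristic for set cover is also worth a look: repeatedly pick the person who adds the most skills that are still missing. The sketch below is an assumption about how that could be written for this input format (greedy_team is a hypothetical name), not a guaranteed-optimal solver.

```python
def greedy_team(spec_skill, K):
    """Greedy set-cover heuristic.

    Returns a list of person IDs covering skills 0..K-1,
    or None if the skills cannot all be covered.
    """
    remaining = set(range(K))
    skills = [set(s) for s in spec_skill]
    team = []
    while remaining:
        if not skills:
            return None
        # person who contributes the most still-missing skills
        best = max(range(len(skills)), key=lambda i: len(skills[i] & remaining))
        gained = skills[best] & remaining
        if not gained:
            return None  # nobody can supply the missing skills
        team.append(best)
        remaining -= gained
    return team

team = greedy_team([[1, 3], [0, 1, 2], [0, 2, 4]], 5)
print(len(team))
print(' '.join(map(str, team)))
```

On the example from the question the greedy choice starts with person 1 and ends up with all three people, although persons 0 and 2 alone would suffice, which shows that the result is only approximate.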
I have the following code block which figures out the number of overlapping sessions. Given different intervals, the task is to print the maximum number of overlap among these intervals at any time and also to find the overlapped interval.
def overlap(v):
    # variable to store the maximum
    # count
    ans = 0
    count = 0
    data = []
    # storing the x and y
    # coordinates in data vector
    for i in range(len(v)):
        # pushing the x coordinate
        data.append([v[i][0], 'x'])
        # pushing the y coordinate
        data.append([v[i][1], 'y'])
    # sorting of ranges
    data = sorted(data)
    # Traverse the data vector to
    # count number of overlaps
    for i in range(len(data)):
        # if x occur it means a new range
        # is added so we increase count
        if (data[i][1] == 'x'):
            count += 1
        # if y occur it means a range
        # is ended so we decrease count
        if (data[i][1] == 'y'):
            count -= 1
        # updating the value of ans
        # after every traversal
        ans = max(ans, count)
    # printing the maximum value
    print(ans)
# Driver code
v = [[ 1, 2 ], [ 2, 4 ], [ 3, 6 ],[3,8]]
overlap(v)
This returns 3.
But what would be the best way to also return the maximum overlapping interval by modifying my existing approach? In this case which should be [3,4].
You could use the counter object (from collections) to create a list of intersecting sub intervals and count the number of original intervals that intersect with them. Each interval in your list would be intersected with all the sub-intervals found so far in order to accumulate the counts:
v = [[ 1, 2 ], [ 2, 4 ], [ 3, 6 ],[3,8]]
from collections import Counter
overCounts = Counter()
for vStart,vEnd in v:
    overlaps = [(max(s,vStart),min(e,vEnd)) for s,e in overCounts
                if s<=vEnd and e>=vStart]
    overCounts += Counter(overlaps + [(vStart,vEnd)])
interval,count = overCounts.most_common(1)[0]
print(interval,count) # (3,4) 3
The overlaps list detects intersections with the sub-intervals found so far. s<=vEnd and e>=vStart will return True when interval (s,e) intersects with interval (vStart,vEnd). For those intervals that do intersect we want the start and end of the intersection (sub-interval). The intersection will start at the largest beginning and end at the smallest end. So we take the max() of the start positions with the min() of the end positions to form the sub-interval: (max(s,vStart),min(e,vEnd))
vStart            vEnd
[--------------------]
       [--------------------------]
       s                          e
       [-------------]
--max->               <----min-----
[EDIT]
To be honest, I like your original approach better than mine. It will respond in O(NLogN) time whereas mine could go up to O(N^2) depending on the data.
In order to capture the sub-interval corresponding to the result in your original approach, you would need to add a variable to keep track of the last starting position encountered and move detection of a higher count inside the 'y' condition.
For example:
lastStart = maxStart = maxEnd = None
# ...
    if (data[i][1] == 'x'):
        lastStart = data[i][0] # last start of sub-interval
        count += 1
    if (data[i][1] == 'y'):
        if count > ans: # detect a greater overlap
            maxStart = lastStart # start of corresponding sub-interval
            maxEnd = data[i][0]
            ans = count
        count -= 1
    # ans = max(ans, count) <-- removed
# ...
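Put together, the modified sweep could look like the sketch below. The function name max_overlap and the use of 0/1 markers instead of 'x'/'y' are my own choices for illustration; the logic is the one described above.

```python
def max_overlap(intervals):
    """Return (max overlap count, [start, end] of the first sub-interval reaching it)."""
    events = []
    for start, end in intervals:
        events.append((start, 0))  # 0 = start, sorts before an end at the same point
        events.append((end, 1))    # 1 = end
    events.sort()
    count = best = 0
    last_start = None
    best_interval = None
    for point, kind in events:
        if kind == 0:
            last_start = point     # last start of sub-interval
            count += 1
        else:
            if count > best:       # detect a greater overlap before closing
                best = count
                best_interval = [last_start, point]
            count -= 1
    return best, best_interval

print(max_overlap([[1, 2], [2, 4], [3, 6], [3, 8]]))  # (3, [3, 4])
```

Because starts sort before ends at the same coordinate, intervals that only touch at an endpoint are counted as overlapping, the same as in the original code.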
You could also implement it using accumulate:
v = [[ 1, 2 ], [ 2, 4 ], [ 3, 6 ],[3,8]]
from itertools import accumulate
edges = sorted((p,e) for i in v for p,e in zip(i,(-1,1)))
counts = accumulate(-e for _,e in edges)
starts = accumulate((p*(e<0) for p,e in edges),max)
count,start,end = max((c+1,s,p) for c,s,(p,e) in zip(counts,starts,edges) if e>0)
print(count,[start,end]) # 3 [3, 4]
Algorithm Objective:
link to the pictures i took while giving the amazon interview:
[https://boards.wetransfer.com/board/shl7w5z1e62os7nwv20190618224258/latest][pictures]
Eight houses, represented as cells, are arranged in a straight line. Each day every cell competes with its adjacent cells (neighbors). An integer value of 1 represents an active cell and a value of 0 represents an inactive cell. If the neighbors on both sides of a cell are either both active or both inactive, the cell becomes inactive on the next day; otherwise the cell becomes active. The two cells on each end have a single adjacent cell, so assume that the unoccupied space on the opposite side is an inactive cell. Even after updating the cell state, consider its previous state when updating the state of other cells. The state information of all cells should be updated simultaneously.
Create an algorithm to output the state of the cells after the given number of days.
Input:
The input to the function/method consists of two arguments:
states, a list of integers representing the current state of cells,
days, an integer representing the number of days.
Output:
Return a list of integers representing the state of the cells after the given number of days
Note:
The elements of the list states contains 0s and 1s only
TestCase 1:
Input: [1,0,0,0,0,1,0,0] , 1
Expected Return Value: 0 1 0 0 1 0 1 0
TestCase 2:
Input: [1,1,1,0,1,1,1,1] , 2
Expected Return Value: 0 0 0 0 0 1 1 0
What I Tried:
def cellCompete(states, days):
    # WRITE YOUR CODE HERE
    il = 0;
    tl = len(states);
    intialvalue = states
    results = []
    states = []
    for i in range(days):
        #first range
        if(intialvalue[il] != intialvalue[il+1]):
            print('value of index 0 is : ',reverse(intialvalue[il]))
            results.append(reverse(intialvalue[il]))
        else:
            print('value of index 0 is :', intialvalue[il])
            results.append(intialvalue[il])
        print("-------------------")
        #range middle
        while il < tl-2:
            if(intialvalue[il] != intialvalue[il+1] or intialvalue[il+1] != intialvalue[il+2]):
                print('value of index',il+1,'is : ',reverse(intialvalue[il+1]))
                results.append(reverse(intialvalue[il+1]))
            else:
                print('value of index', il+1,'is :', intialvalue[il+1])
                results.append(intialvalue[il+1])
            print("-------------------")
            il += 1
        #range last
        if(intialvalue[tl-2] != intialvalue[tl-1]):
            print('value of index',tl-1,'is : ',reverse(intialvalue[tl-1]))
            results.append(reverse(intialvalue[tl-1]))
        else:
            print('value of index',tl-1,'is :', intialvalue[tl-1])
            results.append(intialvalue[tl-1])
        print("-------------------")
        print('Input: ',intialvalue)
        print('Results: ',results)
        initialvalue = results
def reverse(val):
    if(val == 0):
        return 1
    elif(val == 1):
        return 0
print("-------------------------Test case 1--------------------")
cellCompete([1,0,0,0,0,1,0,0],1)
print("-------------------------Test case 2--------------------")
cellCompete([1,1,1,0,1,1,1,1],2)
I am relatively new to Python and I could not get this algorithm to work for the second test case.
Here is a much shorter routine that solves your problem.
def cellCompete(states, days):
    n = len(states)
    for day in range(days):
        houses = [0] + states + [0]
        states = [houses[i-1] ^ houses[i+1] for i in range(1, n+1)]
    return states
print(cellCompete([1,0,0,0,0,1,0,0] , 1))
print(cellCompete([1,1,1,0,1,1,1,1] , 2))
The printout from that is what you want (though with list brackets included):
[0, 1, 0, 0, 1, 0, 1, 0]
[0, 0, 0, 0, 0, 1, 1, 0]
This routine adds sentinel zeros to each end of the list of house states. It then uses a list comprehension to find the houses' new states. All this is repeated the proper number of times before the house states are returned.
The calculation of a new house state is houses[i-1] ^ houses[i+1]. That character ^ is bitwise exclusive-or. The value is 1 if the two values are different and 0 if the two values are the same. That is just what is needed in your problem.
Recursive version:
def cell_compete(states, days):
    s = [0] + states + [0]
    states = [i ^ j for i, j in zip(s[:-2], s[2:])] # Thanks #RoyDaulton
    return cell_compete(states, days - 1) if days > 1 else states
A non-recursive version that also avoids extending the list by adding edge [0] elements would be:
def cell_compete(states, days):
    for _ in range(days):
        states = [states[1]] + [i ^ j for i, j in zip(states[:-2], states[2:])] + [states[-2]]
    return states
Another possibility:
def cellCompete(states,days):
    newstates = []
    added_states = [0] + states + [0]
    for counter,value in enumerate(states):
        newstates.append(int((added_states[counter] != added_states[counter+2])))
    if days > 1:
        return cellCompete(newstates,days-1)
    else:
        return newstates
print(cellCompete([1,1,1,0,1,1,1,1],2))
Similar to Rory's using XOR but without the need for the internal comprehension. Bit shift the number by 2 and clip the extra bit from the left by taking the modulus:
def process(state, r):
    n = int(''.join(map(str,state)), 2)
    for i in range(r):
        n = ((n ^ n << 2) >> 1) % 256
    return list(map(int,format(n, "08b")))
process([1,1,1,0,1,1,1,1], 2)
# [0, 0, 0, 0, 0, 1, 1, 0]
process([1,0,0,0,0,1,0,0] , 1)
# [0, 1, 0, 0, 1, 0, 1, 0]
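The 256 and "08b" above hard-code eight cells. If the row could have a different length, the same shift-and-XOR idea should still work with a mask derived from the input. The following is a sketch under that assumption (process_any_width is a hypothetical name, and the state is assumed to be non-empty):

```python
def process_any_width(state, days):
    """Shift-and-XOR update for a row of any (non-zero) number of cells."""
    width = len(state)
    mask = (1 << width) - 1            # keeps only the lowest `width` bits
    n = int(''.join(map(str, state)), 2)
    for _ in range(days):
        # After the shifts, bit k holds (left neighbor XOR right neighbor) of cell k
        n = ((n ^ (n << 2)) >> 1) & mask
    return [int(bit) for bit in format(n, '0{}b'.format(width))]

print(process_any_width([1, 1, 1, 0, 1, 1, 1, 1], 2))
# [0, 0, 0, 0, 0, 1, 1, 0]
print(process_any_width([1, 0, 1, 1, 0], 1))
# [0, 0, 1, 1, 1]
```

Masking with & replaces the modulus, and the explicit parentheses around n << 2 only spell out the precedence that Python applies anyway.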
While everyone is trying to make the simplest version possible, here's a more complex one. It's pretty similar to the previous answers, except that instead of keeping the state in the function, this solution is made of two parts. One is the utility function that we want to be able to call; the other is a generator that keeps track of the states.
The main difference here is that the generator takes a comparator and an initial state that will be mutated. The generator can also be passed around as a parameter, which separates the logic of how many states you want to generate from the mutation itself and gives you a way to keep mutating from a given state indefinitely.
def mutator(state, comparator):
    while True:
        states = [0] + state + [0]
        state = [
            comparator(states[cellid-1], states[cellid+1])
            for cellid in range(1, len(states)-1)
        ]
        yield state
def cellCompete(states, days):
    generator = mutator(states, lambda x, y: x ^ y)
    for idx, states in enumerate(generator):
        if idx+2 > days:
            break
    return states
print(cellCompete([1,0,0,0,0,1,0,0] , 1))
print(cellCompete([1,1,1,0,1,1,1,1] , 2))
Also, I added a comparator that allows us to apply an arbitrary operation to the two neighbors. This lets the code be extended beyond the initial spec. It's obviously a superfluous implementation, but as mentioned, it's supposed to be an interview answer, and as much as I like to see a straight-to-the-point answer, if someone can come up with a flexible answer in the same timeframe, then why not.
The input is an integer that specifies the amount to be ordered.
There are predefined package sizes that have to be used to create that order.
e.g.
Packs
3 for $5
5 for $9
9 for $16
for an input order 13 the output should be:
2x5 + 1x3
So far I've the following approach:
remaining_order = 13
package_numbers = [9,5,3]
required_packages = []
while remaining_order > 0:
    found = False
    for pack_num in package_numbers:
        if pack_num <= remaining_order:
            required_packages.append(pack_num)
            remaining_order -= pack_num
            found = True
            break
    if not found:
        break
But this will lead to the wrong result:
1x9 + 1x3
remaining: 1
So, you need to fill the order with the packages such that the total price is maximal? This is known as Knapsack problem. In that Wikipedia article you'll find several solutions written in Python.
To be more precise, you need a solution for the unbounded knapsack problem, in contrast to popular 0/1 knapsack problem (where each item can be packed only once). Here is working code from Rosetta:
from itertools import product
NAME, SIZE, VALUE = range(3)
items = (
    # NAME, SIZE, VALUE
    ('A', 3, 5),
    ('B', 5, 9),
    ('C', 9, 16))
capacity = 13
def knapsack_unbounded_enumeration(items, C):
    # find max of any one item
    max1 = [int(C / item[SIZE]) for item in items]
    itemsizes = [item[SIZE] for item in items]
    itemvalues = [item[VALUE] for item in items]
    # def totvalue(itemscount, itemsizes=itemsizes, itemvalues=itemvalues, C=C):
    def totvalue(itemscount):
        # nonlocal itemsizes, itemvalues, C
        totsize = sum(n * size for n, size in zip(itemscount, itemsizes))
        totval = sum(n * val for n, val in zip(itemscount, itemvalues))
        return (totval, -totsize) if totsize <= C else (-1, 0)
    # Try all combinations of bounty items from 0 up to max1
    bagged = max(product(*[range(n + 1) for n in max1]), key=totvalue)
    numbagged = sum(bagged)
    value, size = totvalue(bagged)
    size = -size
    # convert to 'count x size' strings in item order
    bagged = ['%dx%d' % (n, items[i][SIZE]) for i, n in enumerate(bagged) if n]
    return value, size, numbagged, bagged
if __name__ == '__main__':
    value, size, numbagged, bagged = knapsack_unbounded_enumeration(items, capacity)
    print(value)
    print(bagged)
Output is:
23
['1x3', '2x5']
Keep in mind that this is an NP-hard problem, so the run time will blow up as you enter larger values :)
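For moderate order sizes, the usual dynamic-programming table over the capacity avoids enumerating every combination; its run time grows with the number of pack sizes times the capacity. The sketch below is not the Rosetta code, just an illustration under my own naming (unbounded_knapsack and the packs dict are hypothetical), using the pack sizes and prices from the question.

```python
def unbounded_knapsack(packs, capacity):
    """packs maps size -> value. Returns (best value, {size: count})."""
    best = [0] * (capacity + 1)
    # choice[c] is the pack added last at capacity c, or None if a unit stays unused
    choice = [None] * (capacity + 1)
    for c in range(1, capacity + 1):
        best[c] = best[c - 1]
        for size, value in packs.items():
            if size <= c and best[c - size] + value > best[c]:
                best[c] = best[c - size] + value
                choice[c] = size
    # walk back through the table to recover the packs that were used
    counts = {}
    c = capacity
    while c > 0:
        if choice[c] is None:
            c -= 1
        else:
            counts[choice[c]] = counts.get(choice[c], 0) + 1
            c -= choice[c]
    return best[capacity], counts

value, counts = unbounded_knapsack({3: 5, 5: 9, 9: 16}, 13)
print(value)   # 23
print(counts)  # {3: 1, 5: 2}
```

Like the enumeration above, this maximizes the total value and allows unused capacity; it does not force the order to be filled exactly.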
You can use itertools.product:
import itertools
remaining_order = 13
package_numbers = [9,5,3]
required_packages = []
a=min([x for i in range(1,remaining_order//min(package_numbers)+1) for x in itertools.product(package_numbers,repeat=i)],key=lambda x: abs(sum(x)-remaining_order))
remaining_order-=sum(a)
print(a)
print(remaining_order)
Output:
(5, 5, 3)
0
This simply does the below steps:
Get value closest to 13, in the list with all the product values.
Then simply make it modify the number of remaining_order.
If you want it output with 'x':
import itertools
from collections import Counter
remaining_order = 13
package_numbers = [9,5,3]
required_packages = []
a=min([x for i in range(1,remaining_order//min(package_numbers)+1) for x in itertools.product(package_numbers,repeat=i)],key=lambda x: abs(sum(x)-remaining_order))
remaining_order-=sum(a)
print(' + '.join(['{0}x{1}'.format(v,k) for k,v in Counter(a).items()]))
print(remaining_order)
Output:
2x5 + 1x3
0
For your problem, I tried two implementations depending on what you want. In both solutions I assumed the remaining order absolutely has to end at 0; otherwise the algorithm returns -1. If you need partial fills, tell me and I can adapt the algorithm.
As the algorithm is implemented via dynamic programming, it handles fairly large inputs, at least orders above 130!
In the first solution, I assumed we fill with the biggest package each time.
In the second solution, I try to minimize the price, but the remaining order should always be 0.
remaining_order = 13
package_numbers = sorted([9,5,3], reverse=True) # To make sure the biggest package is the first element
prices = {9: 16, 5: 9, 3: 5}
required_packages = []
# First solution, using the biggest package each time, and making the total order remaining at 0 each time
ans = [[] for _ in range(remaining_order + 1)]
ans[0] = [0, 0, 0]
for i in range(1, remaining_order + 1):
    for index, package_number in enumerate(package_numbers):
        if i-package_number > -1:
            tmp = ans[i-package_number]
            if tmp != -1:
                ans[i] = [tmp[x] if x != index else tmp[x] + 1 for x in range(len(tmp))]
                break
    else: # Using for else instead of a boolean value `found`
        ans[i] = -1 # -1 is the not found combinations
print(ans[13]) # [0, 2, 1]
print(ans[9]) # [1, 0, 0]
# Second solution, minimizing the price with order at 0
def price(x):
    return 16*x[0]+9*x[1]+5*x[2]
ans = [[] for _ in range(remaining_order + 1)]
ans[0] = ([0, 0, 0],0) # combination + price
for i in range(1, remaining_order + 1):
    # The not found packages will be (-1, float('inf'))
    minimal_price = float('inf')
    minimal_combinations = -1
    for index, package_number in enumerate(package_numbers):
        if i-package_number > -1:
            tmp = ans[i-package_number]
            if tmp != (-1, float('inf')):
                tmp_price = price(tmp[0]) + prices[package_number]
                if tmp_price < minimal_price:
                    minimal_price = tmp_price
                    minimal_combinations = [tmp[0][x] if x != index else tmp[0][x] + 1 for x in range(len(tmp[0]))]
    ans[i] = (minimal_combinations, minimal_price)
print(ans[13]) # ([0, 2, 1], 23)
print(ans[9]) # ([0, 0, 3], 15) Because the price of three packages is lower than the price of a package of 9
In case you need a solution for a small number of possible package_numbers but a possibly very big remaining_order, in which case all the other solutions would fail, you can use this to reduce remaining_order:
import numpy as np
remaining_order = 13
package_numbers = [9,5,3]
required_packages = []
prod = int(np.prod(package_numbers)) # product of all package sizes
sub_max = sum((prod // i - 1) * i for i in package_numbers)
while remaining_order > sub_max:
    remaining_order -= prod
    required_packages.extend([max(package_numbers)] * (prod // max(package_numbers)))
Because if any package size i accounts for more than (P/i - 1)*i units of the order, where P is the product of all package_numbers, then it is used at least P/i times, and those P/i copies sum to exactly P. In case the package max(package_numbers) isn't the one with the smallest price per unit, take the one with the smallest price per unit instead.
Example:
remaining_order = 100
package_numbers = [5,3]
Any part of remaining_order bigger than 5*2 plus 3*4 = 22 can be sorted out by adding 5 three times to the solution and taking remaining_order - 5*3.
So the remaining order that actually needs to be calculated is 10, which can then be solved as 2 times 5. The rest is filled with 6 times 15, which is 18 times 5.
In case the number of possible package_numbers is bigger than just a handful, I recommend building a lookup table (with the code from one of the other answers) for all numbers up to sub_max, which will make this immensely fast for any input.
Since no objective function is stated, I assume your goal is to maximize the package value within the pack's capacity.
Explanation: the time complexity is fixed. The optimal solution may not be the one that uses the highest-valued item as many times as possible, so you have to search all possible combinations. However, you can reuse the partial solutions you have already searched to save work. For example, [5,5,3] is derived by adding 3 to a previous [5,5] attempt, so the intermediate result can be "cached". You may use either an array or a set to store possible solutions. The code below performs about the same as the Rosetta code, but I think it's clearer.
To further optimize, use a priority set for opts.
costs = [3,5,9]
value = [5,9,16]
volume = 130
# solutions
opts = set()
opts.add(tuple([0]))
# calc total value
cost_val = dict(zip(costs, value))
def total_value(opt):
    return sum([cost_val.get(cost,0) for cost in opt])
def possible_solutions():
    solutions = set()
    for opt in opts:
        for cost in costs:
            if cost + sum(opt) > volume:
                continue
            cnt = (volume - sum(opt)) // cost
            for _ in range(1, cnt + 1):
                sol = tuple(list(opt) + [cost] * _)
                solutions.add(sol)
    return solutions
def optimize_max_return(opts):
    if not opts:
        return tuple([])
    cur = list(opts)[0]
    for sol in opts:
        if total_value(sol) > total_value(cur):
            cur = sol
    return cur
while sum(optimize_max_return(opts)) <= volume - min(costs):
    opts = opts.union(possible_solutions())
print(optimize_max_return(opts))
If your requirement is "just fill the pack" it'll be even simpler using the volume for each item instead.