How to calculate shipping fee from tuples in Python?

I am looking to return the delivery charge in pence (p) for each order. There is a flat £50 (5000p) fee plus £20 (2000p) for each full 100 lb of weight (any remaining weight under 100 lb is ignored).
An ordered item has 4 components: name, quantity, unit price (in pence), unit weight (in pounds)
ORDER_SAMPLE_1 = {("lamp", 2, 2399, 2), ("chair", 4, 3199, 10), ("table", 1, 5599, 85)}
ORDER_SAMPLE_2 = {("sofa", 1, 18399, 140), ("bookshelf", 2, 4799, 40)}
def delivery_charges(order):
E.g., delivery_charges({("desk", 1, 11999, 160)}) is 7000 (pence)
E.g., delivery_charges({("desk", 2, 11999, 160)}) is 11000 (pence)
E.g., delivery_charges({("lamp", 1, 2399, 2)}) is 5000 (pence)
E.g., delivery_charges({("lamp", 50, 2399, 2)}) is 7000 (pence)
Is a for loop or elif the best way to approach this?

I am assuming that the 5000p is a base fee for the whole order, not for each item in it. With that, here is one way you can do this:
Iterate over all the items in the order. For each item, compute its total weight (quantity * unit weight), then compute the extra charge by floor-dividing that weight by 100 and multiplying the result by 2000. Add this to a running total that starts at 5000, and return the total once the loop finishes.
def delivery_charges(order):
    total_charges = 5000
    for item in order:
        total_item_weight = item[1] * item[3]
        extra_charges = (total_item_weight // 100) * 2000
        total_charges += extra_charges
    return total_charges
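
As a quick check, this reproduces the examples from the question (assuming, as above, that the base fee applies once per order):

assert delivery_charges({("desk", 1, 11999, 160)}) == 7000
assert delivery_charges({("desk", 2, 11999, 160)}) == 11000
assert delivery_charges({("lamp", 1, 2399, 2)}) == 5000
assert delivery_charges({("lamp", 50, 2399, 2)}) == 7000
assert delivery_charges(ORDER_SAMPLE_1) == 5000  # 4 lb + 40 lb + 85 lb: no item reaches a full 100 lb
assert delivery_charges(ORDER_SAMPLE_2) == 7000  # sofa totals 140 lb -> one extra 2000p charge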

Python - Pull most visited store, if tied pull store where most was spent. If tied at 1 then pull most recently visited store

I am working in Python with a dataset that looks like the following
Original Dataset:
Where
Card Number - Unique client identifier
Store Number - Unique store identifier
Count - Count of times a unique store has been visited by a unique client
Sum_Check Subtotal Accrued - Sum a client has spent at a unique store
Max_Date - Last time the unique client visited the unique store
I am trying to turn this into a dataframe that contains the Card Number and Store Number with the following logic applied in this order:
the most visits
if the amount of visits is tied at 2+, I want the Store Number with the highest spend
If the amount of visits is tied at 1 between multiple locations I want the most recently visited location.
So the final output should look as follows:
Currently my code looks like this
# Sorting the values so that the most visited locations are at the bottom of each group, followed by the highest spend.
# In the event of a tie this lets the logic fall through to the check subtotal sum field and take the largest value.
df = df.sort_values(['Card Number', 'Count', 'Sum_Check Subtotal Accrued', 'Max_Date']).drop_duplicates('Card Number', keep='last')
# Dropping fields we no longer need now that our dataset is summarized
df = df.drop(['Count', 'Sum_Check Subtotal Accrued', 'Max_Date'], axis=1)
This was working until the 3rd logic point was added, which requires me to pull the most recent visit when the visit count is tied at 1. I have tried adding the "Max_Date" field to the sort above; however, because the "Sum_Check Subtotal Accrued" field takes precedence in that sort, it doesn't work for the clients tied at 1.
I am guessing some sort of if statement can solve this, but I am conceptually stuck on how to approach it. Any help is greatly appreciated.
Ok I think I got it:
import pandas as pd

CN = [1, 1, 2, 2, 3, 4, 4, 5, 5, 5]
SN = [111, 222, 111, 222, 444, 22, 55, 22, 222, 888]
Count = [2, 1, 1, 1, 1, 1, 1, 1, 1, 1]
SCSA = [40, 100, 50, 20, 30, 20, 50, 2, 200, 100]
Date = ["1/2/2021", "2/2/2021", "3/2/2021", "3/1/2021", "5/1/2021", "7/11/2022", "6/1/2018", "7/11/2022", "3/4/2020", "1/2/2019"]

df = pd.DataFrame({"Card": CN, "Store": SN, "Count": Count, "SCSA": SCSA, "Date": Date})

cards = df.Card.unique()
storeList = []

# Loop through each card uniquely, checking for their max values
for x in cards:
    Card = df[df.Card == x]
    countMax = Card.Count.max()
    dateMax = Card.Date.max()
    # If there is only one store with the max visits, add it to the list
    if len(Card[Card.Count == countMax]) < 2:
        storeList.append(Card.Store[Card.Count == countMax].values[0])
    # If the number of visits is >= 2 and there is more than one store with this number of visits...
    elif (countMax >= 2) and (len(Card[Card.Count == countMax]) > 1):
        scsaMax = Card[Card.Count == countMax].SCSA.max()  # Find the highest spending of the stores that were visited the most
        storeList.append(Card.Store[Card.SCSA == scsaMax].values[0])  # add the store with the most spending of the stores that were visited the most
    # Otherwise, just add the most recently visited store to the list
    else:
        storeList.append(Card.Store[Card.Date == dateMax].values[0])

pd.DataFrame({"Card Number": cards, "Store Number": storeList})
Output:
Card Number Store Number
1 111
2 111
3 444
4 22
5 22
I changed some of the visit counts and SCSA values to make sure it was still printing out what I expected, and it seems to be right now.
Try this:
(df.sort_values(['Count', 'Max_Date', 'Sum_Check Subtotal Accrued'], ascending=[0, 0, 0])
   .groupby('Card Number')[['Card Number', 'Store Number']]
   .head(1)
   .sort_values('Card Number'))
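
As a sanity check, a minimal hypothetical frame with the same column names (values invented for illustration) shows what the chain returns; note that Max_Date needs to be an actual datetime for the sort to be chronological:

import pandas as pd

# Hypothetical sample data, just to exercise the chain above
df = pd.DataFrame({
    'Card Number': [1, 1, 2, 2],
    'Store Number': [111, 222, 111, 222],
    'Count': [2, 1, 1, 1],
    'Sum_Check Subtotal Accrued': [40, 100, 50, 20],
    'Max_Date': pd.to_datetime(['2021-01-02', '2021-02-02', '2021-03-02', '2021-03-01']),
})

result = (df.sort_values(['Count', 'Max_Date', 'Sum_Check Subtotal Accrued'], ascending=[0, 0, 0])
            .groupby('Card Number')[['Card Number', 'Store Number']]
            .head(1)
            .sort_values('Card Number'))
print(result)
# Card 1 -> store 111 (most visits), Card 2 -> store 111 (tied at 1 visit, most recent date)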

Efficiently iterating over 3.311031748 E+12 combinations in Python

I have collected a large Pokemon data set and I am setting out with the goal to identify the 'Top 10 Teams' based on a ratio I constructed - Pokemon BST (base stat total) : average weakness. For those who care, I calculate average weakness as the sum of a Pokemon's weakness to each type (0.25 to flying + 1 to water + 2 to steel + 4 to fire, etc.) and then divide it by 18 (the total number of types available in game).
To provide a quick example - a team of the following three Pokemon: Kingler, Mimikyu, Magnezone will yield a team ratio of 1604.1365384615383.
Because the data will be used for competitive play, I removed all non-fully evolved Pokemon as well as legendary/mythical Pokemon. Here is my process so far:
Create a collection of all possible combinations of fully evolved Pokemon teams
Use a for loop to iterate over each combination
The first 10 combinations will automatically be added to the list
Starting with the 11th combination, I will add the current team iteration to the list, sort the list in descending order, and then remove the team with the lowest ratio. This ensures only the top 10 will remain after each iteration.
Obviously, this process will take an impossibly long time to run. I'm wondering if there is a more efficient way to run this. Finally, please see my code below:
import itertools
import pandas as pd
df = pd.read_csv("Downloads/pokemon.csv") # read in csv of fully-evolved Pokemon data
# list(df) # list of df column names - useful to see what data has been collected
df = df[df["is_legendary"] == 0] # remove legendary pokemon - many legendaries are allowed in competitive play
df = df[['abilities', # trim df to contain only the columns we care about
'against_bug',
'against_dark',
'against_dragon',
'against_electric',
'against_fairy',
'against_fight',
'against_fire',
'against_flying',
'against_ghost',
'against_grass',
'against_ground',
'against_ice',
'against_normal',
'against_poison',
'against_psychic',
'against_rock',
'against_steel',
'against_water',
'attack',
'defense',
'hp',
'name',
'sp_attack',
'sp_defense',
'speed',
'type1',
'type2']]
df["bst"] = df["hp"] + df["attack"] + df["defense"] + df["sp_attack"] + df["sp_defense"] + df["speed"] # calculate BSTs
df['average_weakness'] = (df['against_bug'] # calculates a Pokemon's 'average weakness' to other types
+ df['against_dark']
+ df['against_dragon']
+ df['against_electric']
+ df['against_fairy']
+ df['against_fight']
+ df['against_fire']
+ df['against_flying']
+ df['against_ghost']
+ df['against_grass']
+ df['against_ground']
+ df['against_ice']
+ df['against_normal']
+ df['against_poison']
+ df['against_psychic']
+ df['against_rock']
+ df['against_steel']
+ df['against_water']) / 18
df['bst-weakness-ratio'] = df['bst'] / df['average_weakness'] # ratio of BST:avg weakness - the higher the better
names = df["name"] # pull out list of all names for creating combinations
combinations = itertools.combinations(names, 6) # create all possible combinations of 6 pokemon teams
top_10_teams = [] # list for storing top 10 teams
for x in combinations:
    ratio = sum(df.loc[df['name'].isin(x)]['bst-weakness-ratio'])  # pull out sum of team's ratio
    if len(top_10_teams) != 10:
        top_10_teams.append((x, ratio))  # first 10 teams will automatically populate list
    else:
        top_10_teams.append((x, ratio))  # add team to list
        top_10_teams.sort(key=lambda t: t[1], reverse=True)  # sort list by descending ratios
        del top_10_teams[-1]  # drop team with the lowest ratio - only top 10 remain in list
top_10_teams
In your example every Pokemon has a bst-weakness ratio, and for the team value you do not take into account that the members counterbalance each other's weaknesses; you simply sum up the ratios of the 6 members? If so, shouldn't the best team be the one with the 6 best individual Pokemon? I don't get why you need the combinations in your case.
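If that reading is right, a single line on the question's df (keeping the column names from the post) would already pick the team without any combinatorics:

best_team = df.nlargest(6, 'bst-weakness-ratio')[['name', 'bst-weakness-ratio']]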
Nevertheless, I guess you could remove a lot of the Pokemon from your list before going into the combinatorics.
If you have a boolean array of shape (n_pokemons, n_types) indicating the weaknesses of each Pokemon with True, you could check for each Pokemon whether there is another one with the same weaknesses (or a subset of them) but a better bst value.
# Loop over all pokemon and check if there are other pokemon
# ... with the exact same weaknesses but better stats
# -name -weaknesses -bst
# pokemon A [0, 0, 1, 1, 0, ...], bst=34.85 -> delete A
# pokemon B [0, 0, 1, 1, 0, ...], bst=43.58
# ... with a subset of the weaknesses and better stats
# pokemon A [0, 0, 1, 1, 0, ...], bst=34.85 -> delete A
# pokemon B [0, 0, 1, 0, 0, ...], bst=43.58
I wrote a little snippet using numpy. The values for bst and the weaknesses are chosen randomly. With my settings

n_pokemons = 1000
n_types = 18
n_min_weaknesses = 1  # minimum number of weaknesses per Pokemon
n_max_weaknesses = 4  # maximum number of weaknesses per Pokemon

only about 30-40 pokemons remain in the list. I am not sure how plausible this is for 'real' pokemons, but with such a number a combinatorial search is way more feasible.
import numpy as np

# Generate pokemons
name_arr = np.array(['pikabra_{}'.format(i) for i in range(n_pokemons)])
# Random stats
bst_arr = np.random.random(n_pokemons) * 100
# Random weaknesses
weakness_array = np.zeros((n_pokemons, n_types), dtype=bool)  # bool array indicating the weak types of each pokemon
for i in range(n_pokemons):
    rnd_weaknesses = np.random.choice(np.arange(n_types), np.random.randint(n_min_weaknesses, n_max_weaknesses + 1))
    weakness_array[i, rnd_weaknesses] = True

# Remove unnecessary pokemons
i = 0
while i < n_pokemons:
    j = i + 1
    while j < n_pokemons:
        del_idx = None
        combined_weaknesses = np.logical_or(weakness_array[i], weakness_array[j])
        if np.all(weakness_array[i] == weakness_array[j]):
            if bst_arr[j] < bst_arr[i]:
                del_idx = i
            else:
                del_idx = j
        elif np.all(combined_weaknesses == weakness_array[i]) and bst_arr[j] < bst_arr[i]:
            del_idx = i
        elif np.all(combined_weaknesses == weakness_array[j]) and bst_arr[i] < bst_arr[j]:
            del_idx = j
        if del_idx is not None:
            name_arr = np.delete(name_arr, del_idx, axis=0)
            bst_arr = np.delete(bst_arr, del_idx, axis=0)
            weakness_array = np.delete(weakness_array, del_idx, axis=0)
            n_pokemons -= 1
            if del_idx == i:
                i -= 1
                break
            else:
                j -= 1
        j += 1
    i += 1
print(n_pokemons)
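
Once the list is pruned down to a few dozen candidates, the top-10 search itself can be kept cheap with a heap instead of re-sorting on every iteration. A minimal sketch, assuming a hypothetical ratio_by_name dict mapping each remaining name to its bst-weakness-ratio (e.g. built from the question's df), and keeping the question's plain sum-of-ratios team value:

import heapq
import itertools

# Assumed input: ratio_by_name = dict(zip(df['name'], df['bst-weakness-ratio'])) for the pruned candidates
top_10 = []  # min-heap of (team_ratio, team); the worst of the current best 10 sits at top_10[0]
for team in itertools.combinations(ratio_by_name, 6):
    team_ratio = sum(ratio_by_name[name] for name in team)
    if len(top_10) < 10:
        heapq.heappush(top_10, (team_ratio, team))
    elif team_ratio > top_10[0][0]:
        heapq.heappushpop(top_10, (team_ratio, team))  # replace the current worst of the best 10

best_teams = sorted(top_10, reverse=True)  # highest team ratio first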

How do I print the position of each item in the list?

I would like the positions I selected to be printed out. I made this (with help from here) for working out level rewards. People tell me what level they are at, I enter those levels one by one, and then I add points to their accounts using the command (>add-money...). When writing out the reward I'm giving, I want to be able to easily write which levels the rewards are from (i.e. their positions in the list).
How can I make it so that I can print each position in the list I used?
My list:
rewards = [0, 150, 225, 330, 500, 1000, 1500, 2250, 3400, 5000, 10000, 13000, 17000, 22000, 29000, 60000]
# index:    0  1    2    3    4    5     6     7     8     9     10     11     12     13     14     15

def rewardz():
    # Running totals.
    lists = []
    total = 0
    user = input('User -> ')
    while True:
        # Get reward level from the user. If not a valid reward level, stop.
        level = input('-> ')
        try:
            level_num = int(level)
        except ValueError:
            break
        if level_num not in range(len(rewards)):
            break
        # Add the reward to the lists and the total.
        reward = rewards[level_num]
        lists.append(reward)
        total += reward
    # Final output.
    print(lists)
    print(total, total * 1000)
    print()
    print(" + ".join(str(i) for i in lists))
    print('>add-money bank', user, total * 1000)
    print("\n-----########------\n\n")
    rewardz()  # start over for the next user

rewardz()
What (or similar to what) I want the result to be:
[2, 4, 7, 1, 4, etc]
Since you are asking the user to input the level, you already have it saved in level_num. If you want to return it you could do lists.append((reward, level_num)), or for a possibly cleaner solution use a dictionary:

lists = {"rewards": [],
         "levels": []}

And then to append to it you can do:

lists["rewards"].append(reward)
lists["levels"].append(level_num)

Now lists["levels"] contains the list you want as output, and lists["rewards"] contains the corresponding reward values.
Alternatively you can create a separate list and append to that as well:

levels = []
levels.append(level_num)  # in your while loop
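
Put together with the loop from the question, a minimal sketch of the relevant part (using the rewards list defined above) could look like this:

levels = []  # positions (levels) entered by the user
lists = []   # corresponding reward values
total = 0
while True:
    level = input('-> ')
    try:
        level_num = int(level)
    except ValueError:
        break
    if level_num not in range(len(rewards)):
        break
    reward = rewards[level_num]
    lists.append(reward)
    levels.append(level_num)  # remember which position each reward came from
    total += reward
print(levels)  # e.g. [2, 4, 7, 1, 4]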

Triggering a Binary Variable based on the values of another Variable

Background:
This is a fairly simple script that is looking to achieve the following:
For a list of four Items, each has a Demand
For each of those items, there are four Vendors who have differing prices and quantities for each of those four items and fixed shipping costs
Shipping is only added once per checkout, regardless of the number of items ordered from the Vendor (although shipping will not be charged if nothing is ordered from that Vendor)
I've gotten as far as returning the minimal cost and the breakdown of what to order from where, without shipping.
I'm currently stuck on how to work in the SUM(VendorVar[x]{0:1} * ShippingData[x]) portion, as I essentially need a way to switch the binary variable to ON/1 whenever the quantity of items I'm ordering from a vendor is > 0.
from pulp import *
items = ["Item1", "Item2", "Item3", "Item4"]
vendors = ["Vendor1", "Vendor2", "Vendor3", "Vendor4"]
# List containing lists for each Vendor and their Item costs for Item1, Item2, Item3, Item4 respectively:
costData = [[1.00,5.00,10.00,0.15],
[1.50,2.50,5.00,0.25],
[0.50,1.00,15.00,0.50],
[1.75,10.00,2.00,0.10]]
# List containing lists for each Vendor and their Supply for Item1, Item2, Item3, Item4 respectively:
supplyData = [[0,2,4,1],
[4,0,1,4],
[1,1,1,1],
[8,8,8,8]]
# Created nested dictionaries per Item per Vendor for Costs: {Item1: {Vendor1:Cost, Vendor2:Cost...}}
vendoritemcosts = makeDict([items,vendors],costData)
# Created nested dictionaries per Item per Vendor for Supply: {Item1: {Vendor1:Supply, Vendor2:Supply...}}
vendoritemsupply = makeDict([items,vendors],supplyData)
# Shipping costs per Vendor:
shippingData = {"Vendor1":0.99,
"Vendor2":1.99,
"Vendor3":0.00,
"Vendor4":2.99}
# Number of items desired:
demand = {"Item1":4,
"Item2":4,
"Item3":4,
"Item4":8}
# Number of items to purchase for each Vendor/Item combination:
vendoritemvar = LpVariable.dicts("item",(items,vendors),0,None,LpInteger)
# Binary flag that (hopefully) will determine if a Vendor is included in the final optimized formula or not:
vendorvar = LpVariable.dicts("vendor",vendors,0,1,LpBinary)
prob = LpProblem("cart",LpMinimize)
# Objective function: take the sum of the quantity ordered of each unique Vendor+Item combination multiplied by its price.
# For every vendor, multiply their {0:1} binary variable by their shipping cost; it should be 1 if any items are ordered from them in the first portion above.
prob += lpSum([vendoritemvar[a][b] * vendoritemcosts[a][b] for a in vendoritemvar for b in vendoritemvar[a]]) \
        + lpSum(vendorvar[c] * shippingData[c] for c in vendorvar)
for a in vendoritemvar:
    # Sum total of each item must equal demand
    prob += lpSum(vendoritemvar[a]) == demand[a]
    # Currently testing minimum checkout values, which will be a future addition that isn't a fixed value:
    prob += lpSum(vendoritemvar[a][b] * vendoritemcosts[a][b] for b in vendoritemvar[a]) >= 2.00
    for b in vendoritemvar[a]:
        # Non-negativity constraint
        prob += vendoritemvar[a][b] >= 0
        # Can't exceed available supply
        prob += vendoritemvar[a][b] <= vendoritemsupply[a][b]
prob.solve()
print("Status: %s" % LpStatus[prob.status])
for v in prob.variables():
print("%s = %s" % (v.name,v.varValue))
print("Total cart = %s" % value(prob.objective))
I think you only need to add the implication
vendorvar[v] = 0 => vendoritemvar[i,v] = 0
This can be modeled with a big-M constraint:
vendoritemvar[i,v] ≤ M * vendorvar[v]
Good values for M can be derived from the supplyData/vendoritemsupply tables:
vendoritemvar[i,v] ≤ vendoritemsupply[i,v] * vendorvar[v]
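
In PuLP terms, reusing the dictionaries already defined in the question, that linking constraint could be added as an extra loop before prob.solve() (a sketch, not tested against the full model):

for a in vendoritemvar:
    for b in vendoritemvar[a]:
        # If nothing is ordered from vendor b, vendorvar[b] can stay 0 and no shipping is charged;
        # ordering any quantity of any item from b forces vendorvar[b] to 1, since supply * 0 would cap the quantity at 0.
        prob += vendoritemvar[a][b] <= vendoritemsupply[a][b] * vendorvar[b]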

Calculating moving average for values in a dictionary with keys in a specific range

So far this is my solution. I wonder if there is some more elegant/efficient way?
import datetime as dt
example = {dt.datetime(2008, 1, 1) : 5, dt.datetime(2008, 1, 2) : 6, dt.datetime(2008, 1, 3) : 7, dt.datetime(2008, 1, 4) : 9, dt.datetime(2008, 1, 5) : 12,
dt.datetime(2008, 1, 6) : 15, dt.datetime(2008, 1, 7) : 20, dt.datetime(2008, 1, 8) : 22, dt.datetime(2008, 1, 9) : 25, dt.datetime(2008, 1, 10) : 35}
def calculateMovingAverage(prices, period):
    # calculates the moving average between each datapoint and two days before (usually 3 datapoints included)
    average_dict = {}
    for price in prices:
        pricepoints = [prices[x] for x in prices.keys() if price - dt.timedelta(period) <= x <= price]
        average = reduce(lambda x, y: x + y, pricepoints) / len(pricepoints)
        average_dict[price] = average
    return average_dict

print calculateMovingAverage(example, 2)
I am not sure if I should use a list comprehension here. There is probably some function for this somewhere, but I didn't find it.
If you're looking for other interesting ways to solve the problem, here is an answer using itertools:
import datetime as dt
from itertools import tee, islice, izip

def dayiter(start, end):
    one = dt.timedelta(days=1)
    day = start
    while day <= end:
        yield day
        day += one

def moving_average(mapping, window, dft=0):
    n = float(window)
    t1, t2 = tee(dayiter(min(mapping), max(mapping)))
    s = sum(mapping.get(day, dft) for day in islice(t2, window))
    yield s / n
    for olddate, newdate in izip(t1, t2):
        oldvalue = mapping.get(olddate, dft)
        newvalue = mapping.get(newdate, dft)
        s += newvalue - oldvalue
        yield s / n

example = {dt.datetime(2008, 1, 1): 5, dt.datetime(2008, 1, 2): 6, dt.datetime(2008, 1, 3): 7,
           dt.datetime(2008, 1, 4): 9, dt.datetime(2008, 1, 5): 12, dt.datetime(2008, 1, 6): 15,
           dt.datetime(2008, 1, 7): 20, dt.datetime(2008, 1, 8): 22, dt.datetime(2008, 1, 9): 25,
           dt.datetime(2008, 1, 10): 35}

for ma in moving_average(example, window=3):
    print ma
The ideas involved are:
Use a simple generator to make a date iterator that loops over consecutive days from the lowest to the highest.
Use itertools.tee to construct a pair of iterators over the oldest data and the newest data (the front of the data window and the back).
Keep a running sum in a variable s. On each iteration, update s by subtracting the oldest value and adding the newest value.
This solution is space efficient (it keeps no more than window values in memory) and it is time efficient, one addition and one subtraction for each day regardless of the size of the window.
Handle missing days by defaulting to zero. There are other strategies that could be used for missing days (like using the current moving average as a default or adjusting n up and down to reflect the number of actual data points in the window).
The problem with using a list comprehension in this case is that it's inefficient to search through the entire set of prices on every iteration of your loop. The list comprehension in your code checks every element of prices.keys() on every iteration of the for price in prices: loop.
What you really want to do is take advantage of the fact that the dates are sequential and process them in order. That way, when you eliminate a date from consideration on the current iteration of the loop, you can eliminate it from consideration on all subsequent iterations.
Here's an example:
def calculateMovingAverage(prices, period):
    dates = list(prices.keys())
    dates.sort()
    total = 0.0
    count = 0
    average_dict = {}
    for i, d in enumerate(dates):
        # search through prior dates and eliminate any that are too old
        old = [e for e in dates[i-count:i] if (d - e).days > period]
        total -= sum(prices[o] for o in old)
        count -= len(old)
        # add in the current date
        total += prices[d]
        count += 1
        average_dict[d] = total / count
    return average_dict
Instead of checking every element of prices.keys() on every iteration of the loop, this code searches back from the current date through the list of dates that are included in total. When it finds a date that's too old, it removes it from total and since we're processing the dates in order, it never needs to look at that date again.
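
For completeness, calling it on the example dictionary from the question (a window of 2 days, as in the original print statement) would look like this; the exact averages depend on how many prior dates fall inside each window:

averages = calculateMovingAverage(example, 2)
for day in sorted(averages):
    print day.date(), averages[day]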
