iterate on non zero columns to get cost

I have data with weekly sales quantity, amount and cost, and I want to find the unit cost price for each product row by dividing the weekly cost by the weekly quantity sold. However, the latest week may contain zero values. In that case I want to skip it and use the previous week instead, stepping back until a week with non-zero values is found, and compute the item cost from that week (wkx_cost/wkx_qty). Note that the product price may have changed over the weeks, so I need the cost from the latest week, falling back to an earlier week only when the latest one is not available.
df2 = pd.DataFrame([
    {'product': 'iphone11',
     'wk1_qty': 2, 'wk1_amount': 100, 'wk1_cost': 60,
     'wk2_qty': 3, 'wk2_amount': 150, 'wk2_cost': 90,
     'wk3_qty': 0, 'wk3_amount': 0, 'wk3_cost': 0,
     'wk4_qty': 5, 'wk4_amount': 300, 'wk4_cost': 60,
     'wk5_qty': 0, 'wk5_amount': 0, 'wk5_cost': 0},
    {'product': 'acer laptop',
     'wk1_qty': 3, 'wk1_amount': 300, 'wk1_cost': 210,
     'wk2_qty': 3, 'wk2_amount': 300, 'wk2_cost': 210,
     'wk3_qty': 0, 'wk3_amount': 0, 'wk3_cost': 0,
     'wk4_qty': 5, 'wk4_amount': 550, 'wk4_cost': 375,
     'wk5_qty': 5, 'wk5_amount': 500, 'wk5_cost': 375}])
What the result should look like:
df2 = pd.DataFrame([
    {'product': 'iphone11',
     'wk1_qty': 2, 'wk1_amount': 100, 'wk1_cost': 60,
     'wk2_qty': 3, 'wk2_amount': 150, 'wk2_cost': 90,
     'wk3_qty': 0, 'wk3_amount': 0, 'wk3_cost': 0,
     'wk4_qty': 5, 'wk4_amount': 300, 'wk4_cost': 160,
     'wk5_qty': 0, 'wk5_amount': 0, 'wk5_cost': 0,
     'product_price': 32},
    {'product': 'acer laptop',
     'wk1_qty': 3, 'wk1_amount': 300, 'wk1_cost': 210,
     'wk2_qty': 3, 'wk2_amount': 300, 'wk2_cost': 210,
     'wk3_qty': 0, 'wk3_amount': 0, 'wk3_cost': 0,
     'wk4_qty': 5, 'wk4_amount': 550, 'wk4_cost': 375,
     'wk5_qty': 5, 'wk5_amount': 500, 'wk5_cost': 375,
     'product_price': 75}])

The problem arises when there is a division by zero. To deal with that, you could try this:
try:
    Product_price = wk_cost/wk_qty
except ZeroDivisionError:
    Product_price = 0
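One way to get the fallback behaviour the question describes is to walk the weeks from the latest to the earliest and stop at the first one whose quantity and cost are both non-zero. The sketch below is only an illustration under stated assumptions: the helper name latest_unit_cost and its n_weeks parameter are made up for this example, only the qty and cost columns are included, and iphone11's wk4_cost is taken as 160 as in the expected result (the input listing shows 60, which would not give 32).

```python
import pandas as pd

# Sample data reduced to the qty and cost columns; iphone11 wk4_cost is 160
# as in the expected result (an assumption, the input listing shows 60).
df2 = pd.DataFrame([
    {'product': 'iphone11',
     'wk1_qty': 2, 'wk1_cost': 60, 'wk2_qty': 3, 'wk2_cost': 90,
     'wk3_qty': 0, 'wk3_cost': 0, 'wk4_qty': 5, 'wk4_cost': 160,
     'wk5_qty': 0, 'wk5_cost': 0},
    {'product': 'acer laptop',
     'wk1_qty': 3, 'wk1_cost': 210, 'wk2_qty': 3, 'wk2_cost': 210,
     'wk3_qty': 0, 'wk3_cost': 0, 'wk4_qty': 5, 'wk4_cost': 375,
     'wk5_qty': 5, 'wk5_cost': 375},
])


def latest_unit_cost(row, n_weeks=5):
    """Return cost/qty for the latest week whose qty and cost are both non-zero."""
    for wk in range(n_weeks, 0, -1):  # wk5, wk4, ..., wk1
        qty = row[f'wk{wk}_qty']
        cost = row[f'wk{wk}_cost']
        if qty != 0 and cost != 0:
            return cost / qty
    return float('nan')  # no week with any sales


df2['product_price'] = df2.apply(latest_unit_cost, axis=1)
print(df2[['product', 'product_price']])
```

Because the zero check happens before the division, no exception handling is needed here; a row with no sales in any week simply gets NaN.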

Related

Calculate average asset price when using netting instead of hedging

I'm trying to come up with a formula to calculate the average entry/position price so that I can update my stop loss and take profit.
For example, we opened a BTC buy position with an amount of 1 when the price was 20000.
Later, when the price dropped to 19000, we made another buy with the same amount of 1, "averaging" the position to the middle, and ended up with a position at 19500 with an amount of 2.
Where I'm struggling is what to do if we want to increase the order size at each price.
Say 1 at 20000, 1.5 at 19500, 2 at 19000 and so on.
Or make new buys of the same amount but with a shorter distance between them:
an initial buy at 20000, then 19000, then 19150.
Or combine these two variants.
I mainly use Python and Pandas. Maybe the latter has some built-in function that I'm not aware of. I checked the official Pandas docs, but found only the regular mean function.

Thanks to Yuri's suggestion to look into VWAP, I came up with the following code, which is more advanced and lets you use different contract/volume sizes and increase or decrease the "distance" between orders.
As an example, I used a BTC price of 20000, increased the step distance with a 1.1 multiplier, and increased the volume as well. It is expressed in Binance futures terms, where the minimum purchase is 1 contract for $10.
The idea is to find the sweet spot for order distance, volume, stop loss and take profit while averaging down.
# initial entry price
initial_price = 20000
# bottom price
bottom_price = 0
# enter on every 5% price drop
step = int(initial_price * 0.05)
# 1.1 to increase distance between orders, 0.9 to decrease
step_multiplier = 1.1
# initial volume size in contracts
initial_volume = 1
# volume multiplier, can't be less than 1; fractional volumes are truncated to whole contracts
volume_multiplier = 1.1
# defining empty lists
prices = []
volumes = []
# checking if we are going to use the simple approach with 1 contract volume and no step or volume multiplier
if step_multiplier == 1 and volume_multiplier == 1:
    prices = list(range(initial_price, bottom_price, -step))
    # one order of the initial volume per price, so the average below has weights
    volumes = [initial_volume] * len(prices)
else:
    # defining current price and volume vars
    curr_price = initial_price
    curr_volume = initial_volume
    # checking if the current price is still bigger than the defined bottom price
    while curr_price > bottom_price:
        # adding current price to the list
        prices.append(curr_price)
        # calculating next order price
        curr_price = curr_price - step * step_multiplier
        # adding current volume to the list (for every price, so both lists stay the same length)
        volumes.append(int(curr_volume))
        # calculating next order volume
        curr_volume = curr_volume * volume_multiplier
print("Prices:")
for price in prices:
    print(price)
print("Volumes:")
for volume in volumes:
    print(volume)
print("Prices array length", len(prices))
print("Volumes array length", len(volumes))
a = [item1 * item2 for item1, item2 in zip(prices, volumes)]
b = volumes
print("Average position price when price will reach", prices[-1], "is", sum(a) / sum(b))
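The last three lines of the script above are a volume-weighted average: the sum of price times volume, divided by the total volume. A minimal sketch of that step on its own, applied to the examples from the question, might look like this (the function name average_entry_price is made up for this illustration):

```python
def average_entry_price(prices, volumes):
    """Volume-weighted average price of a series of buys."""
    if len(prices) != len(volumes):
        raise ValueError("prices and volumes must have the same length")
    total_volume = sum(volumes)
    if total_volume == 0:
        raise ValueError("total volume must be non-zero")
    return sum(p * v for p, v in zip(prices, volumes)) / total_volume


# Two equal buys average to the midpoint.
print(average_entry_price([20000, 19000], [1, 1]))  # 19500.0
# Growing order size pulls the average towards the later, lower prices.
print(average_entry_price([20000, 19500, 19000], [1, 1.5, 2]))
# Equal sizes with uneven distances between orders.
print(average_entry_price([20000, 19000, 19150], [1, 1, 1]))
```

The same function covers both variants from the question, and their combination, because only the price and volume of each buy enter the formula.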

Combine csv values and output to csv

I'm trying to read a csv file and combine the duplicate values then output the values into a csv again.
Iterate through each line in the text file. The first line contains headers, so should be skipped.
Separate the three values found in each line. Each line contains the product name, quantity sold, and unit price (the price of a single product), separated by a tab character.
Keep a running total for the quantity sold of each product; for example, the total quantity sold for ‘product b’ is 12.
Keep a record of the unit price of each product.
Write the result to sales-report.csv; the summary should include the name of each product, the sales volume (total quantity sold), and the sales revenue (total quantity sold multiplied by the product price).
What I intend.
Input Data:
product name,quantity,unit price
product c,2,22.5
product a,1,10
product b,5,19.7
product a,3,10
product f,1,45.9
product d,4,34.5
product e,1,9.99
product c,3,22.5
product d,2,34.5
product e,4,9.99
product f,5,45.9
product b,7,19.7
Output Data:
product name,sales volume,sales revenue
product c,5,112.5
product a,4,40
product b,12,236.4
product f,6,275.4
product d,6,207
product e,5,49.95
This is what I have so far. I've looked around, and it isn't entirely clear how I'm supposed to use a list comprehension to combine the values.
The answers I found were mostly more complicated than they probably need to be for something relatively simple...
record = []
with open("items.csv", "r") as f:
    next(f)
    for values in f:
        split = values.rstrip().split(',')
        record.append(split)
print(record)

You can use pandas for this:
import pandas as pd
df = pd.read_csv('path/to/file')
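Then calculate the sales revenue, group by product name and sum:
df = df.assign(sales_revenue=lambda x: x['quantity'] * x['unit price']).groupby('product name').sum().reset_index()
  product name  quantity  unit price  sales_revenue
0    product a         4       20.00          40.00
1    product b        12       39.40         236.40
2    product c         5       45.00         112.50
3    product d         6       69.00         207.00
4    product e         5       19.98          49.95
5    product f         6       91.80         275.40
Note that the unit price column is summed as well, so it is not meaningful in this output.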
You can save the result to a csv file
df.to_csv('new_file_name.csv', index=False)

pandas is the way to go with this problem. If you don't already use it, it applies operations across entire tables so you don't have to iterate yourself. Notice that entire columns can be multiplied in a single step. groupby groups the dataframe by product, and then it's easy to sum.
import pandas as pd
df = pd.read_csv("f.csv")
df["sales revenue"] = df["quantity"] * df["unit price"]
del df["unit price"]
outdf = df.groupby("product name").sum()
# rename returns a new dataframe, so the result has to be assigned
outdf = outdf.rename(columns={"quantity": "sales volume"})
outdf.to_csv("f-out.csv")
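If pandas is not an option, the steps listed in the question (skip the header, keep a running quantity total and the unit price per product, then write the summary) can also be sketched with the standard csv module and two dicts. This is only an illustration: the sample data is inlined through io.StringIO so the snippet runs on its own, the function name summarize is made up, and the input is assumed to be comma-separated as in the sample shown.

```python
import csv
import io

# Sample input from the question, inlined so the sketch runs on its own.
SAMPLE = """\
product name,quantity,unit price
product c,2,22.5
product a,1,10
product b,5,19.7
product a,3,10
product f,1,45.9
product d,4,34.5
product e,1,9.99
product c,3,22.5
product d,2,34.5
product e,4,9.99
product f,5,45.9
product b,7,19.7
"""


def summarize(lines):
    """Return (name, total quantity, revenue) tuples in first-seen order."""
    totals = {}  # product name -> running total of quantity sold
    prices = {}  # product name -> unit price
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    for name, quantity, unit_price in reader:
        totals[name] = totals.get(name, 0) + int(quantity)
        prices[name] = float(unit_price)
    return [(name, qty, round(qty * prices[name], 2))
            for name, qty in totals.items()]


rows = summarize(io.StringIO(SAMPLE))

# Write the report to an in-memory buffer; for a real file, use
# open("sales-report.csv", "w", newline="") instead of io.StringIO().
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["product name", "sales volume", "sales revenue"])
writer.writerows(rows)
print(out.getvalue())
```

Since dicts keep insertion order, the products come out in the order they first appear in the input, which matches the order of the intended output in the question.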

linear programming problem using python scipy minimize

I am trying to optimise, using python scipy.optimize.minimize, the calorie intake of a person using available food items and sticking to a budget.
The problem statement is: There are n food items, each available in different quantities. Their price changes every day. Each has a different nutrition value, which decreases every day. I need to buy food over a month so that the total nutrition is closest to my target and I spend exactly my monthly budget.
#df_available has 1 row each for each item's available quantity at the beginning of the month
#df_bought has initial guesses for purchase of each item for each day of the month. This is based on prorated allotment of my total budget to each item on each day.
#df_price has price of each item on each day of the month.
#df_nutrition['nutrition'] has initial nutrition value per unit. it decreases by 1 unit each year, so 1/365 each day.
#strt is start date of the month.
#tot_nutrition is monthly total nutrition target for the month
#tot_budget is my monthly budget
def obj_func():
    return (df_bought.sum()*(df_nutrition['nutrition'] - strt/365).sum())
#constraint 1 makes sure that I buy exactly at my budget
def constraint1():
    return ((df_bought * df_price).sum().sum()- tot_budget)
cons1 = {'type':'eq', 'fun':constraint1}
#constraint 2 makes sure that I dont buy more than available quantity of any item
def constraint2():
    return df_available - df_bought
cons2 = {'type':'ineq', 'fun':constraint2}
cons = ([cons1, cons2])
#bounds ensure that I dont buy negative quantity of any item
bnds = (0, None)
res = minimize(obj_func, df_bought_nominal_m, bounds= bnds, constraints=cons)
print(res)
For output, I would like the df_bought values to be adjusted to minimise obj_func.
The code gives the following error:
TypeError: constraint1() takes no arguments (1 given)
Apologies if I am making any rookie mistakes. I am new to Python and couldn't find appropriate help online for this problem.
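The TypeError arises because minimize calls the objective and every constraint function with the current candidate solution as an argument, so each of them has to accept at least one parameter (a 1-D NumPy array) and compute its result from that argument rather than from a global dataframe. In addition, bounds needs one (min, max) pair per variable. The sketch below shows that calling convention on a much smaller problem with three items and a single purchase per item. All names and numbers in it (prices, nutrition, available, budget, target) are hypothetical, and the objective, the squared distance of total nutrition from the target, is only one possible reading of "closest to my target".

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data for three food items (not the asker's data).
prices = np.array([2.0, 3.0, 5.0])        # price per unit
nutrition = np.array([10.0, 20.0, 45.0])  # nutrition per unit
available = np.array([10.0, 10.0, 10.0])  # units available
budget = 40.0
target = 320.0


def objective(x):
    # x is the candidate purchase vector supplied by minimize
    return (nutrition @ x - target) ** 2


def budget_constraint(x):
    # 'eq' constraint: must equal zero, i.e. spend exactly the budget
    return prices @ x - budget


def availability_constraint(x):
    # 'ineq' constraint: every entry must be >= 0, i.e. x <= available
    return available - x


cons = [{'type': 'eq', 'fun': budget_constraint},
        {'type': 'ineq', 'fun': availability_constraint}]
bounds = [(0, None)] * len(prices)  # one (min, max) pair per variable

# Initial guess: spread the budget so that it is spent exactly.
x0 = np.full(len(prices), budget / prices.sum())

res = minimize(objective, x0, method='SLSQP', bounds=bounds, constraints=cons)
print(res.x, res.fun)
```

For the original problem, the items-by-days table would have to be flattened into a single 1-D vector before being passed as the initial guess, and reshaped inside the functions.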

Counting entries in a CSV?

I'm just learning Python, and have been having a little bit of trouble with the list functionality of the language. I have a .csv file named purchases.csv and I need to do four things with it:
output the total number of "purchase orders" aka count the total number of entries in the csv
output the average amount of the purchases, showing three decimals.
output the total number of purchases made over 1,800
output the average amount of purchases made that are over 1,800 showing three decimals.
The output needs to look something like:
Total Number of Purchases: xxxx
Amount of Average Purchase: xxxx
Number of Purchase Orders over $1,800: xxxx
Amount of Average Purchases over $1,800: xxxx
So far I've written
import csv
with open('purchases.csv') as csvfile:
    readCSV = csv.reader(csvfile,delimiter=',')
    total_purchases=[]
    for row in readCSV:
        total=row[0]
        total_purchases.append(total)
print(total_purchases)
my_sum=0
for x in home_runs:
    my_sum=my_sum+int(x)
print("The total number of purchases was: ", my_sum)
To find the total number of purchases, but I've hit a wall and can't seem to figure out the rest! I'd love any help and guidance with this...I just can't figure it out!

You need a series of separate, similar for loops, but with if statements so that the sum is only counted conditionally.
Assuming row[0] is your price column:
sumAbove1800 = 0
countAbove1800 = 0
totalSum = 0
totalPurchases = 0
for row in readCSV:
    price = float(row[0])
    totalPurchases = totalPurchases + 1
    totalSum = totalSum + price
    if price > 1800:
        sumAbove1800 = sumAbove1800 + price
        countAbove1800 = countAbove1800 + 1
Now to print them out, with 3 decimal places for the averages:
print("Total Average Price: {:.3f}".format(totalSum / totalPurchases))
print("Total Transactions: {}".format(totalPurchases))
print("Total Average Price above 1800: {:.3f}".format(sumAbove1800 / countAbove1800))
print("Total Transactions above 1800: {}".format(countAbove1800))

Your question is a bit too vague, but here goes anyway.
Unless you are constrained by requirements as this appears to be homework / an assignment, you should give Pandas a try. It's a Python library that helps tremendously with data wrangling and data analysis.
output the total number of "purchase orders" aka count the total number of entries in the csv
This is dead easy with Pandas:
import pandas as pd
df = pd.read_csv('purchases.csv')
num = df.shape[0]
The first two lines are self-explanatory: you build a pandas.DataFrame with read_csv() and store it in df. For the last line, just know that a DataFrame has an attribute named shape with the format (number of rows, number of columns), so shape[0] returns the number of rows.
output the average amount of the purchases, showing three decimals.
mean = df['purchase_amount'].mean()
Access column 'purchase_amount' using brackets.
output the total number of purchases made over 1,800
num_over_1800 = df[df['purchase_amount'] > 1800].shape[0]
Slight twist here, just know that this is one way to set a condition in Pandas.
output the average amount of purchases made that are over 1,800 showing three decimals.
mean_over_1800 = df[df['purchase_amount'] > 1800]['purchase_amount'].mean()
This should be self-explanatory from the rest above.
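Putting the four requested outputs together, including the three-decimal formatting, could look like the sketch below. It uses the csv module as in the question, and it rests on assumptions: the sample amounts are invented, the file is assumed to have one amount in the first column and no header row, and the function name purchase_stats is made up for the example.

```python
import csv
import io

# Hypothetical sample data: one purchase amount per line, no header row.
SAMPLE = "1200.50\n2500.00\n1800.00\n1950.25\n999.99\n"


def purchase_stats(lines, threshold=1800):
    """Return (count, mean, count over threshold, mean over threshold)."""
    amounts = [float(row[0]) for row in csv.reader(lines) if row]
    over = [a for a in amounts if a > threshold]
    # Guard against empty input so the averages never divide by zero.
    mean = sum(amounts) / len(amounts) if amounts else 0.0
    mean_over = sum(over) / len(over) if over else 0.0
    return len(amounts), mean, len(over), mean_over


# For a real file, pass open('purchases.csv', newline='') instead.
count, mean, count_over, mean_over = purchase_stats(io.StringIO(SAMPLE))
print("Total Number of Purchases: {}".format(count))
print("Amount of Average Purchase: {:.3f}".format(mean))
print("Number of Purchase Orders over $1,800: {}".format(count_over))
print("Amount of Average Purchases over $1,800: {:.3f}".format(mean_over))
```

Note that the comparison is strict, so a purchase of exactly 1,800 is not counted as "over 1,800".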

How to efficiently iterate through a dictionary?

I'm new to Python and programming.
My textbook says I have to do the following problem set:
Create a second purchase summary that which accumulates total investment by ticker symbol. In the above sample data, there are two blocks of CAT. These can easily be combined by creating a dict where the key is the ticker and the value is the list of blocks purchased. The program makes one pass through the data to create the dict. A pass through the dict can then create a report showing each ticker symbol and all blocks of stock.
I cannot think of a way, apart from hard-coding, to add the two entries of the 'CAT' stock.
## Stock Reports
stockDict = {"GM":"General Motors", "CAT":"Caterpillar", "EK":"Eastman Kodak",
             "FB":"Facebook"}
# symbol,prices,dates,shares
purchases = [("GM",100,"10-sep-2001",48), ("CAT",100,"01-apr-1999",24),
             ("FB",200,"01-jul-2013",56), ("CAT", 200,"02-may-1999",53)]
# purchase history:
print "Company", "\t\tPrice", "\tDate\n"
for stock in purchases:
    price = stock[1] * stock[3]
    name = stockDict[stock[0]]
    print name, "\t\t", price, "\t", stock[2]
print "\n"
# THIS IS THE PROBLEM SET I NEED HELP WITH:
# accumulate total investment by ticker symbol
byTicker = {}
# create dict
for stock in purchases:
    ticker = stock[0]
    block = [stock]
    if ticker in byTicker:
        byTicker[ticker] += block
    else:
        byTicker[ticker] = block
for i in byTicker.values():
    shares = i[0][3]
    price = i[0][1]
    investment = shares * price
    print investment
Right now, the output is:
4800
11200
2400
It's not good because it does not calculate the two CAT stocks. Right now it only calculates one. The code should be flexible enough that I could add more CAT stocks.

Your problem is in the last part of your code (the penultimate part, which builds a list of all stocks for each ticker, is fine):
for i in byTicker.values():
    shares = i[0][3]
    price = i[0][1]
    investment = shares * price
    print investment
Here you only use the zeroth stock for each ticker. Instead, try:
for name, purchases in byTicker.items():
    investment = sum(shares * price for _, price, _, shares in purchases)
    print name, investment
This will add up all of the stocks for each ticker, and for your example gives me:
CAT 13000
FB 11200
GM 4800

The problem with your code is that you are not iterating over the purchases, but just getting the first element from each ticker value. That is, byTicker looks something like:
byTicker = {
    "GM": [("GM",100,"10-sep-2001",48)],
    "CAT": [("CAT",100,"01-apr-1999",24), ("CAT", 200,"02-may-1999",53)],
    "FB": [("FB",200,"01-jul-2013",56)]
}
so when you iterate over the values, you get three lists. But when you process each of these lists, you only access its first element:
price = i[0][1]
For the value corresponding to "CAT", i[0] is ("CAT",100,"01-apr-1999",24). You should look into i[1] as well! Consider iterating over the different purchases:
for company, purchases in byTicker.items():
    investment = 0
    for purchase in purchases:
        investment += purchase[1] * purchase[3]
    print(company, investment)

Maybe something like this:
## Stock Reports
stockDict = {"GM":"General Motors", "CAT":"Caterpillar", "EK":"Eastman Kodak",
             "FB":"Facebook"}
# symbol,prices,dates,shares
purchases = [("GM",100,"10-sep-2001",48), ("CAT",100,"01-apr-1999",24),
             ("FB",200,"01-jul-2013",56), ("CAT", 200,"02-may-1999",53)]
# purchase history:
print "Company", "\t\tPrice", "\tDate\n"
for stock in purchases:
    price = stock[1] * stock[3]
    name = stockDict[stock[0]]
    print name, "\t\t", price, "\t", stock[2]
print "\n"
# THIS IS THE PROBLEM SET I NEED HELP WITH:
# accumulate total investment by ticker symbol
byTicker = {}
# create dict
for stock in purchases:
    ticker = stock[0]
    price = stock[1] * stock[3]
    if ticker in byTicker:
        byTicker[ticker] += price
    else:
        byTicker[ticker] = price
for ticker, price in byTicker.iteritems():
    print ticker, price
The output I get is:
Company Price Date
General Motors 4800 10-sep-2001
Caterpillar 2400 01-apr-1999
Facebook 11200 01-jul-2013
Caterpillar 10600 02-may-1999
GM 4800
FB 11200
CAT 13000
which appears to be correct.
Testing whether or not a ticker is in the byTicker dict tells you whether or not there's already been a purchase recorded for that stock. If there is, you just add to it, if not, you start fresh. This is basically what you were doing, except for some reason you were collecting all of the purchase records for a given stock in that dict, when all you really cared about was the price of the purchase.
You could build the dict the same way you were originally, and then iterate over the items stored under each key, and add them up. Something like this:
totals = []
for ticker in byTicker:
    total = 0
    for purchase in byTicker[ticker]:
        total += purchase[1] * purchase[3]
    totals.append((ticker, total))
for ticker, total in totals:
    print ticker, total
And just for kicks, you could compress it all into one line with a generator expression:
print "\n".join("%s: %d" % (ticker, sum(purchase[1]*purchase[3] for purchase in byTicker[ticker])) for ticker in byTicker)
Either of these last two are completely unnecessary to do though, since you're already iterating through every purchase, you may as well just accumulate the total price for each stock as you go, as I showed in the first example.
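For readers on Python 3, the dict-of-blocks approach from the textbook exercise can be sketched with collections.defaultdict, which removes the "is the ticker already in the dict" check. This is an illustrative variant rather than any of the answers' code; the names by_ticker and totals are chosen for this example.

```python
from collections import defaultdict

# symbol, price, date, shares (same layout as in the question)
purchases = [("GM", 100, "10-sep-2001", 48), ("CAT", 100, "01-apr-1999", 24),
             ("FB", 200, "01-jul-2013", 56), ("CAT", 200, "02-may-1999", 53)]

# One pass through the data: ticker -> list of purchase blocks.
by_ticker = defaultdict(list)
for purchase in purchases:
    by_ticker[purchase[0]].append(purchase)

# One pass through the dict: sum price * shares over every block of a ticker.
totals = {ticker: sum(price * shares for _, price, _, shares in blocks)
          for ticker, blocks in by_ticker.items()}

for ticker, total in totals.items():
    print(ticker, total)
```

Adding more CAT purchases to the list needs no code changes, since every block under a ticker is included in the sum.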
