Simulating the number of contracts to maintain constant value - python

I have a set of, say, 10,000 contracts (loan account numbers). The contracts run for specific durations, say 24, 48 or 84 months. My pool consists of a mix of these contracts with different durations. Assume that at the start of this month I have 100 contracts amounting to 10,000 USD. After 2 months a few accounts/contracts are closed prematurely (early pay-off) and a few are extended. I need to simulate the data to maintain a constant value (an amount of 10,000 USD). That means I need to know how many new contracts I need to add, say, 2 months from now so that the value of my portfolio remains at 10,000 USD. Can someone help me with a technique to simulate this? Preferably in R, Python or SAS.

Add a payoff date element to each contract object. Then, do:
from datetime import date
from dateutil.relativedelta import relativedelta

target = 10000  # desired number of contracts in the pool
horizon = date.today() + relativedelta(months=2)
need = sum(1 for c in contracts if c.paid_off or c.payoff_date < horizon)  # contracts gone by then
print(target - len(contracts) + need)  # new contracts needed to restore the pool

Related

Is there a way in pandas (python) to calculate the days of inventory grouped by material number

With the given data frame, the calculated measure would be the DOI (i.e. how many days into the future the inventory will last, based on the demand). Note: the DOI figures need to be programmatically calculated and grouped by material.
Calculation of DOI: let us take the first row belonging to material A1. The dates are on a weekly basis.
Inventory = 1000
Days into the future that the inventory would last: 300 + 400 + part of 500. This means the DOI is 7 + 7 + (1000-300-400)/500 = 14.6 [i.e. 26.01.2023 - 19.01.2023; 09.02.2023 - 02.02.2023].
An important point to note is that the demand figure of the row concerned is NOT taken into account while calculating the DOI.
I have tried to calculate the cumulative demand without taking the first row into account for each material (here A1 and B1).
inv_network['cum_demand'] = 0
for i in range(inv_network.shape[0] - 1):
    # running cumulative demand, leaving the first row's demand out
    if inv_network.loc[i + 1, 'Period'] > inv_network.loc[0, 'Period']:
        inv_network.loc[i + 1, 'cum_demand'] = (
            inv_network.loc[i, 'cum_demand'] + inv_network.loc[i + 1, 'Primary Demand']
        )
print(inv_network)
However, this piece of code takes a lot of time as the number of records increases.
As the next step, when I try to calculate the DOI, I run into issues getting the right value.
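For the cumulative-demand step, a vectorized groupby is usually far faster than a row-by-row loop. Below is a minimal sketch that reproduces the loop's logic per material; it assumes the material identifier sits in a column named 'Material' (the question doesn't name that column, so adjust it) and that rows are already ordered by 'Period' within each material:

import pandas as pd

# cumulative demand per material, excluding each material's first row
# (matches the loop above: cum_demand starts at 0 on the first row)
inv_network = inv_network.sort_values(['Material', 'Period'])
inv_network['cum_demand'] = (
    inv_network.groupby('Material')['Primary Demand']
               .transform(lambda s: s.cumsum() - s.iloc[0])
)
print(inv_network)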

Iterating pandas dataframe and changing values

I'm looking to predict the number of customers in a restaurant at a certain time. My data preprocessing is almost finished - I have acquired the arrival time of each customer, stored in the acchour column. weekday is the day of the week, 0 being Monday and 6 Sunday. Now, I'd like to calculate the number of customers in the restaurant at each of those times. I figured I have to loop through the dataframe in reverse and keep adding the arriving customers to the customer count at a given time, while simultaneously keeping track of when previous customers leave. As there is no data on this, we will simply assume every guest stays for an hour.
My sketch looks like this:
exp = []  # keep track of the expiring customers
for row in reversed(df['customers']):  # start from the earliest time
    if row != 1:  # skip the 1st row
        res = len(exp) + 1  # amount of customers
        for i in range(len(exp) - 1, -1, -1):  # loop exp sensibly while deleting
            if df['acchour'] > exp[i] + 1:  # if the current time is more than an hour after the customer's arrival
                res -= 1
                del exp[i]
        exp.append(df['acchour'])
        row = res
However, I can see that df['acchour'] is not a sensible expression there, and I was wondering how to reference a different column on the same row properly. Altogether, if someone can come up with a more convenient way to solve the problem, I'd hugely appreciate it!
So you can get the total customers visiting at a specific time like so:
df.groupby(['weekday','time', 'hour'], as_index=False).sum()
Then maybe you can calculate the difference between each time window you want?
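Building on that, here is one rough way to get the number of customers present at each arrival time, avoiding indexing mistakes like df['acchour'] inside the loop. It is only a sketch under the question's assumptions: acchour holds a numeric hour value, every guest stays exactly one hour, and the count includes the customer who has just arrived.

import pandas as pd

def presence_counts(times):
    # arrival hours for one weekday; a guest arriving at time x is still
    # present for any arrival in (x, x + 1] because every stay lasts one hour
    t = times.to_numpy()
    return [((t > x - 1) & (t <= x)).sum() for x in t]

df = df.sort_values(['weekday', 'acchour'])
df['customers'] = df.groupby('weekday')['acchour'].transform(presence_counts)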

Obtaining a Total of Only Part of a Column

The database I'm using shows the total number of debtors for every town for every quarter.
Since there are 43 towns listed, there are 43 'total debtors' values per quarter (30-Sep-17, etc.).
My goal is to find the total number of debtors for every quarter (so, theoretically, summing each group of 43 'total debtors'), but I'm not quite sure how.
I've tried using the sum() function, but I'm not sure how to make it add the totals up quarter by quarter.
Here's what the database looks like and my attempt (I printed the first 50 rows just to provide an idea of what it looks like)
https://i.imgur.com/h1y43j8.png
Sorry in advance if the explanation was a bit unclear.
You should use groupby. It's a pandas function that does exactly what you're trying to do: it groups the df by whatever column you pick.
total_debtors_pq = df.groupby('Quarter end date')['Total number of debtors'].sum()
You can then extract the total for each quarter from total_debtors_pq.
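For instance, assuming the quarter labels look like the 30-Sep-17 value mentioned in the question, a single quarter's figure could then be read back with:

print(total_debtors_pq['30-Sep-17'])  # summed total debtors across all 43 towns for that quarter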

data structure: pandas dataframe or relational database for my model?

I want to build a model which is supposed to calculate the value of a parameter for the receiving process to optimize a warehouse’s capacity.
The parameter decides, in the receiving process, where a SKU is going to be stored - as a pallet in the high-bay racking (more expensive) or as a carton in the automatic carton shelf (less expensive).
The parameter is set based on this data:
Concerning the sum of all SKUs:
the capacity of the high-bay racking
the capacity of the carton shelf.
The capacity of the racking and shelf depend on the current inventory level of all SKUs and the volume leaving the storage (because the SKUs are sold).
Concerning the single values for each SKU and each day (20,000 SKUs and 365 days):
number of products of this specific SKU received per day
number of products of this specific SKU sold per day
predicted number of products of this specific SKU to be sold in the x upcoming days
volume already stored in the automatic carton shelf of this specific SKU
Now, I wonder which data structure I should use to import and work with the data in Python, as it comprises four values for each of 20,000 SKUs on each of 365 days.
I thought I should use a pandas DataFrame because it is very powerful for building models and visualization. But since the tabular form is essentially 2D, as I understand it, I would not be able to model the data for 20,000 SKUs across all 365 days, because that is more of a 3D problem.
Therefore, I wonder whether I have to use a relational database, where each of the above-mentioned data sets (received volume per SKU, sold volume per SKU, predicted number to be sold per SKU, volume in the carton shelf per SKU) would make up a table.
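For comparison, a single long-format pandas DataFrame with a MultiIndex over SKU and day can also hold all four measures. The sketch below only illustrates the shape, with made-up SKU names, small dimensions and random numbers standing in for the real data:

import numpy as np
import pandas as pd

# illustrative shape only: 3 SKUs x 4 days instead of 20,000 x 365
skus = ['SKU_A', 'SKU_B', 'SKU_C']
days = pd.date_range('2023-01-01', periods=4, freq='D')
idx = pd.MultiIndex.from_product([skus, days], names=['sku', 'day'])

data = pd.DataFrame({
    'received':        np.random.randint(0, 100, len(idx)),
    'sold':            np.random.randint(0, 100, len(idx)),
    'predicted_sales': np.random.randint(0, 100, len(idx)),
    'in_carton_shelf': np.random.randint(0, 100, len(idx)),
}, index=idx)

print(data.loc[('SKU_A', days[0])])      # all four values for one SKU on one day
print(data.groupby(level='day').sum())   # totals over all SKUs per day (capacity check)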
I found the following set of questions in the answer to another question here, which I feel are important for answering mine. Here are my answers:
1) Size of data, # of rows, columns, types of columns; are you appending rows, or just columns?
number of rows: 20,000 (one per SKU)
number of columns: with separate tables for each data set, 365 columns (= days); with one table, 365 * 4 (365 days * received volume per SKU, sold volume per SKU, predicted number to be sold per SKU, volume in carton shelf per SKU)
types of columns: floats, booleans
As I understand it, I am not appending data; I use the data to calculate values for each SKU and then work from the bottom (the detailed data per SKU) up to the top (the sum over all SKUs = capacity, inventory level).
2) What will typical operations look like. E.g. do a query on columns to select a bunch of rows and specific columns, then do an operation (in-memory), create new columns, save these.
sum, subtraction, multiplication, division, bigger than, smaller than, equal to ...
3) Giving a toy example could enable us to offer more specific recommendations.
Example:
SKU 123456:
has 200 liters of inventory in carton shelf
1000 liters are received today
300 liters will be sold today
predicted sales for x days is 250 liters (should be in carton shelf)
the parameter is set to 600 liters (if the volume received is higher, it goes into the pallet racking, otherwise into the carton shelf)
so you need to store the following volume:
200 liters in inventory + 1000 liters received = 1200 liters of inventory
1200 liters - 300 liters sold = 900 liters of inventory
250 liters needed in the carton shelf = 650 liters left
as 650 > 600, 250 liters are stored in the carton shelf, the other 650 in the high-bay racking
Overall sum:
inventory of the high-bay racking after receiving this SKU: +650 liters
inventory of the carton shelf: +50 liters
If the capacity of the high-bay racking is already full and +650 liters is not possible, the parameter has to be recalculated so that the day's total still fits.
-> the calculation is repeated for each of the next 364 days …
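A rough translation of this worked example into Python, just to pin down the arithmetic; the function and variable names are made up, and the threshold check follows the example's "650 > 600" step:

def allocate(in_carton_shelf, received, sold, predicted_sales, parameter):
    inventory = in_carton_shelf + received      # 200 + 1000 = 1200 liters
    inventory -= sold                           # 1200 - 300 = 900 liters
    rest = inventory - predicted_sales          # 900 - 250 = 650 liters
    if rest > parameter:
        # predicted sales stay in the carton shelf, the rest goes to the high-bay racking
        return predicted_sales, rest
    # the example does not spell out this branch: everything stays in the carton shelf
    return inventory, 0

print(allocate(200, 1000, 300, 250, 600))       # -> (250, 650)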
4) After that processing, then what do you do? Is step 2 ad hoc, or repeatable?
repeatable, as it needs to be done for every day
5) Input flat files: how many, rough total size in Gb. How are these organized e.g. by records? Does each one contains different fields, or do they have some records per file with all of the fields in each file?
I guess they need to be organized by SKUs and days
6) Do you ever select subsets of rows (records) based on criteria (e.g. select the rows with field A > 5)? and then do something, or do you just select fields A, B, C with all of the records (and then do something)?
yes -> it always checks whether the capacity is met; whether there is a need to put some volume into the carton shelf, …
7) Do you 'work on' all of your columns (in groups), or are there a good proportion that you may only use for reports (e.g. you want to keep the data around, but don't need to pull in that column explicitly until final results time)?
I guess, mostly, there are calculations made on the data, so it is not just keeping the data around…
Thank you so much upfront!

Trying to find the lowest multiple of ten to pay off a credit card balance with python

I am new to Python and I am currently stuck on this learning problem.
I am trying to make a program which will output the lowest multiple of 10 required to pay off a credit card balance. Each payment is made once a month and has to be the same every month in order to satisfy the requirements of the problem, and monthly interest must also be taken into account.
def debt(payment):
    balance = 3329
    annualInterestRate = 0.2
    month = 1
    finalbalance = balance
    while month <= 12:
        # Monthly interest rate
        rate = annualInterestRate / 12.0
        # Monthly unpaid balance
        finalbalance = round(finalbalance - payment, 2)
        # Updated balance each month
        finalbalance = round(finalbalance + (rate * finalbalance), 2)
        # Moves month forward
        month = month + 1
    # Shows final figures
    print('Lowest Payment: ' + str(payment))

debt(10)
The above works fine except that I am lacking a mechanism to supply ever greater multiples of ten until the final balance becomes less than zero.
I posted a similar question here with different code, which I deleted as I felt it could not go anywhere and I had subsequently rewritten my code anyway.
To do that, you need to restructure your function. Instead of payment, use balance as the parameter: your function should output the payment, not take it in. Then, since you're paying monthly, the final answer (whatever it is) will be greater than balance / 12, because that is how you would pay off the principal alone, without interest.
Now off we go to find the worst case possible: the entire balance unpaid, plus interest. That would be (annual rate * balance) + balance. Divide that by 12, and you get the maximum amount you should have to pay per month.
There, now that you have your min and max, you have start and end points for a loop. Just add 1 to the payment on each iteration until you reach the minimum amount that also covers the interest.
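Putting that search idea together with the original multiples-of-ten requirement, a minimal sketch might look like this; it reuses the figures and the month-by-month update from the question:

def final_balance(balance, annual_rate, payment, months=12):
    # same monthly update as the question: subtract the payment, then add interest
    for _ in range(months):
        balance = round(balance - payment, 2)
        balance = round(balance + balance * (annual_rate / 12.0), 2)
    return balance

def lowest_payment(balance, annual_rate, step=10):
    payment = 0
    # keep stepping the payment up in multiples of ten until the year-end balance is cleared
    while final_balance(balance, annual_rate, payment) > 0:
        payment += step
    return payment

print('Lowest Payment:', lowest_payment(3329, 0.2))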
