get financial data using Python

I have written some Python code with Selenium that navigates to a webpage containing financial data in several tables.
I want to extract the data and put it into Excel.
The tables appear to be plain HTML tables; a sample row is below:
<tr>
<td class="bc2T bc2gt">Last update</td>
<td class="bc2V bc2D">03/15/2018</td><td class="bc2V bc2D">03/14/2019</td><td class="bc2V bc2D">03/12/2020</td><td class="bc2V bc2D" style="background-color:#DEFEFE;">05/22/2020</td><td class="bc2V bc2D" style="background-color:#DEFEFE;">05/20/2020</td><td class="bc2V bc2D" style="background-color:#DEFEFE;">05/18/2020</td>
</tr>
</table>
The table has the following class name:
<table class='BordCollapseYear2' style="margin-right:20px; font-size:12px; width:100%;" cellspacing=0>
Is there a way I can extract this data? Ideally I want this to be dynamic so that it can extract information for different companies.
I've never used it before, but I've seen the BeautifulSoup library mentioned a few times.
https://www.marketscreener.com/MICROSOFT-CORPORATION-4835/financials/
Taking Microsoft as an example, I'd want to extract the income statement data, the balance sheet, etc.

This script scrapes all tables found on the page and pretty-prints them:
import requests
from bs4 import BeautifulSoup
url = 'https://www.marketscreener.com/MICROSOFT-CORPORATION-4835/financials/'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
all_data = {}
# for every table found on page...
for table in soup.select('table.BordCollapseYear2'):
    table_name = table.find_previous('b').text
    all_data[table_name] = []
    # ..scrape every row
    for tr in table.select('tr'):
        row = [td.get_text(strip=True, separator=' ') for td in tr.select('td')]
        if len(row) == 7:
            all_data[table_name].append(row)
# pretty print all data:
for k, v in all_data.items():
    print('Table name: {}'.format(k))
    print('-' * 160)
    for row in v:
        print(('{:<25}'*7).format(*row))
    print()
Prints:
Table name: Valuation
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Fiscal Period: June 2017 2018 2019 2020 2021 2022
Capitalization 1 532 175 757 640 1 026 511 1 391 637 - -
Entreprise Value (EV) 1 485 388 700 112 964 870 1 315 823 1 299 246 1 276 659
P/E ratio 25,4x 46,3x 26,5x 32,3x 29,7x 25,8x
Yield 2,26% 1,70% 1,37% 1,10% 1,18% 1,31%
Capitalization / Revenue 5,51x 6,87x 8,16x 9,81x 8,89x 7,95x
EV / Revenue 5,02x 6,34x 7,67x 9,28x 8,30x 7,30x
EV / EBITDA 12,7x 15,4x 17,7x 20,2x 18,3x 15,9x
Cours sur Actif net 7,46x 9,15x 10,0x 12,1x 10,1x 8,49x
Nbr of stocks (in thousands)7 720 510 7 683 198 7 662 818 7 583 440 - -
Reference price (USD) 68,9 98,6 134 184 184 184
Last update 07/20/2017 07/19/2018 07/18/2019 05/08/2020 04/30/2020 04/30/2020
Table name: Annual Income Statement Data
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Fiscal Period: June 2017 2018 2019 2020 2021 2022
Net sales 1 96 657 110 360 125 843 141 818 156 534 174 945
EBITDA 1 38 117 45 319 54 641 65 074 70 966 80 445
Operating profit (EBIT) 129 339 35 058 42 959 52 544 57 045 65 289
Operating Margin 30,4% 31,8% 34,1% 37,1% 36,4% 37,3%
Pre-Tax Profit (EBT) 1 23 149 36 474 43 688 52 521 57 042 65 225
Net income 1 21 204 16 571 39 240 43 693 47 223 53 905
Net margin 21,9% 15,0% 31,2% 30,8% 30,2% 30,8%
EPS 2 2,71 2,13 5,06 5,68 6,18 7,11
Dividend per Share 2 1,56 1,68 1,84 2,02 2,16 2,41
Last update 07/20/2017 07/19/2018 07/18/2019 05/22/2020 05/22/2020 05/22/2020
Table name: Balance Sheet Analysis
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Fiscal Period: June 2017 2018 2019 2020 2021 2022
Net Debt 1 - - - - - -
Net Cash position 1 46 787 57 528 61 641 75 814 92 392 114 978
Leverage (Debt / EBITDA) -1,23x -1,27x -1,13x -1,17x -1,30x -1,43x
Free Cash Flow 1 31 378 32 252 38 260 41 953 46 887 53 155
ROE (Net Profit / Equities)29,4% 19,4% 42,4% 36,6% 34,5% 36,1%
Shareholders' equity 1 72 195 85 215 92 524 119 417 136 690 149 484
ROA (Net Profit / Asset) 9,76% 6,51% 14,4% 18,5% 14,6% 14,7%
Assets 1 217 276 254 580 272 703 235 800 323 445 366 702
Book Value Per Share 2 9,24 10,8 13,4 15,2 18,2 21,6
Cash Flow per Share 2 5,04 5,63 6,73 7,03 8,02 9,79
Capex 1 8 129 11 632 13 925 15 698 17 922 19 507
Capex / Sales 8,41% 10,5% 11,1% 11,1% 11,4% 11,2%
Last update 07/20/2017 07/19/2018 07/18/2019 05/22/2020 05/22/2020 05/04/2020
EDIT (to save all_data as csv file):
import csv
with open('data.csv', 'w', newline='') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    for k, v in all_data.items():
        spamwriter.writerow([k])
        for row in v:
            spamwriter.writerow(row)
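Since the goal is to get the figures into Excel, it may help to convert the scraped strings into numbers before saving them. The printed output above uses spaces as thousands separators, commas as decimal marks, and suffixes such as x and %. A minimal sketch, assuming those conventions hold for every value cell (the helper name parse_cell is hypothetical and not part of the script above):

```python
def parse_cell(text):
    """Convert a scraped cell such as '1 026 511', '25,4x' or '2,26%' to a float.

    Returns None for the placeholder '-' or an empty cell, and returns the
    original string unchanged if it does not look like a number (e.g. a date).
    """
    # Normalise non-breaking spaces, which scraped pages often contain.
    cleaned = text.replace('\xa0', ' ').strip()
    if cleaned in ('', '-'):
        return None
    # Drop a trailing unit marker, remove thousands separators,
    # and switch the decimal comma to a dot.
    number = cleaned.rstrip('x%').replace(' ', '').replace(',', '.')
    try:
        return float(number)
    except ValueError:
        return text


print(parse_cell('1 026 511'))
print(parse_cell('-1,23x'))
print(parse_cell('07/20/2017'))
```

Each row in all_data could be passed through such a helper (skipping the first cell, which holds the row label) before it is written to the CSV file.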

Related

text file rows into CSV column python

I have a question.
I have a text file containing data like this:
A 34 45 7789 3475768 443 67 8999 3343 656 8876 802 383358 873 36789 2374859 485994 86960 32838459 3484549 24549 58423
T 3445 574649 68078 59348604 45959 64585304 56568 595 49686 656564 55446 665 677 778 433 545 333 65665 3535
and so on.
I want to make a CSV file from this text file, with A and T as the column headings and the numbers listed below them, like this:
A T
34 3445
45 574649
7789 68078
3475768 59348604
443 45959

EDIT (A lot simpler solution inspired by Michael Butscher's comment):
import pandas as pd
df = pd.read_csv("filename.txt", delimiter=" ")
df.T.to_csv("filename.csv", header=False)
Here is the code:
import pandas as pd
# Read file
with open("filename.txt", "r") as f:
    data = f.read()
# Split data by lines and remove empty lines
columns = data.split("\n")
columns = [x.split() for x in columns if x!=""]
# Row sizes are different in your example so find max number of rows
column_lengths = [len(x) for x in columns]
max_col_length = max(column_lengths)
data = {}
for i in columns:
    # Add None to end for columns that have less values
    if len(i)<max_col_length:
        i += [None]*(max_col_length-len(i))
    data[i[0]] = i[1:]
# Create dataframe
df = pd.DataFrame(data)
# Create csv
df.to_csv("filename.csv", index=False)
Output should look like this:
A T
0 34 3445
1 45 574649
2 7789 68078
3 3475768 59348604
4 443 45959
5 67 64585304
6 8999 56568
7 3343 595
8 656 49686
9 8876 656564
10 802 55446
11 383358 665
12 873 677
13 36789 778
14 2374859 433
15 485994 545
16 86960 333
17 32838459 65665
18 3484549 3535
19 24549 None
20 58423 None
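The same transposition can also be done with the standard library alone. A minimal sketch, assuming whitespace-separated values with the column name first on each line (rows_to_columns is a hypothetical helper name, and the sample lines are shortened from the question):

```python
import csv
import io
from itertools import zip_longest


def rows_to_columns(lines):
    """Turn lines like 'A 34 45' into column-oriented rows, header first."""
    rows = [line.split() for line in lines if line.strip()]
    # zip_longest transposes the ragged rows and pads the short ones with ''.
    return [list(col) for col in zip_longest(*rows, fillvalue='')]


sample = ["A 34 45 7789", "T 3445 574649"]
out = io.StringIO()
csv.writer(out, lineterminator='\n').writerows(rows_to_columns(sample))
print(out.getvalue())
```

To write a real file, the StringIO buffer could be replaced with open('filename.csv', 'w', newline='').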

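Here is my code:
import pandas as pd
data = pd.read_csv("text (3).txt", header=None)
Our_Data = pd.DataFrame(data)
# The file is read as a single column, so this loop runs once
for rows in Our_Data:
    # Split each line on spaces and transpose so each line becomes a column
    New_Data = pd.DataFrame(Our_Data[rows].str.split(' ').tolist()).T
    # Use the first row (A, T) as the column headings
    New_Data.columns = New_Data.iloc[0]
    New_Data = New_Data[1:]
    New_Data.to_csv("filename.csv", index=False)
The output: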
A T
1 34 3445
2 45 574649
3 7789 68078
4 3475768 59348604
5 443 45959
6 67 64585304
7 8999 56568
8 3343 595
9 656 49686
10 8876 656564
11 802 55446
12 383358 665
13 873 677
14 36789 778
15 2374859 433
16 485994 545
17 86960 333
18 32838459 65665
19 3484549 3535
20 24549 None
21 58423 None

Create new Pandas.DataFrame with .groupby(...).agg(sum) then recover unsummed columns

I'm starting with a dataframe of baseball seasons, a section of which looks similar to this:
Name Season AB H SB playerid
13047 A.J. Pierzynski 2013 503 137 1 746
6891 A.J. Pierzynski 2006 509 150 1 746
1374 Rod Carew 1977 616 239 23 1001942
1422 Stan Musial 1948 611 230 7 1009405
1507 Todd Helton 2000 580 216 5 432
1508 Nomar Garciaparra 2000 529 197 5 190
1509 Ichiro Suzuki 2004 704 262 36 1101
From these seasons, I want to create a dataframe of career stats; that is, one row for each player which is a sum of their AB, H, etc. This dataframe should still include the names of the players. The playerid in the above is a unique key for each player and should either be an index or an unchanged value in a column after creating the career stats dataframe.
My hypothetical starting point is df_careers = df_seasons.groupby('playerid').agg(sum), but this leaves out all the non-numeric data. With numeric_only = False I get a mess in the Name column, such as 'Ichiro SuzukiIchiro SuzukiIchiro Suzuki' from string concatenation, which would require a lot of cleaning. This is something I'd like to be able to do with other data sets, and the actual data I have has more like 25 columns. So, if possible, I'd rather understand a routine for getting the Name data back (or preserving it from the outset) than write a specific function and use groupby('playerid').agg(func) or a similar process.
I'm guessing there's a fairly simple way to do this, but I only started learning Pandas a week ago, so there are gaps in my knowledge.

You can write your own condition for how the non-summed columns should be handled. Here, object (string) columns keep their first value and every other column is summed:
col = df.columns.tolist()
col.remove('playerid')
df.groupby('playerid').agg({i : lambda x: x.iloc[0] if x.dtypes=='object' else x.sum() for i in df.columns})
Result (Season and playerid are summed too, because they are numeric columns):
Name Season AB H SB playerid
playerid
190 Nomar_Garciaparra 2000 529 197 5 190
432 Todd_Helton 2000 580 216 5 432
746 A.J._Pierzynski 4019 1012 287 2 1492
1101 Ichiro_Suzuki 2004 704 262 36 1101
1001942 Rod_Carew 1977 616 239 23 1001942
1009405 Stan_Musial 1948 611 230 7 1009405

If there is a one-to-one relationship between 'playerid' and 'Name', as appears to be the case, you can just include 'Name' in the groupby columns:
stat_cols = ['AB', 'H', 'SB']
groupby_cols = ['playerid', 'Name']
results = df.groupby(groupby_cols)[stat_cols].sum()
Results:
AB H SB
playerid Name
190 Nomar Garciaparra 529 197 5
432 Todd Helton 580 216 5
746 A.J. Pierzynski 1012 287 2
1101 Ichiro Suzuki 704 262 36
1001942 Rod Carew 616 239 23
1009405 Stan Musial 611 230 7
If you'd prefer to group only by 'playerid' and add the 'Name' data back in afterwards, you can instead create a 'playerid' to 'Name' mapping as a dictionary, and look it up using map:
results = df.groupby('playerid')[stat_cols].sum()
name_map = pd.Series(df.Name.to_numpy(), df.playerid).to_dict()
results['Name'] = results.index.map(name_map)
Results:
AB H SB Name
playerid
190 529 197 5 Nomar Garciaparra
432 580 216 5 Todd Helton
746 1012 287 2 A.J. Pierzynski
1101 704 262 36 Ichiro Suzuki
1001942 616 239 23 Rod Carew
1009405 611 230 7 Stan Musial

groupby(...).agg() can accept a dictionary that maps column names to functions. So, one solution is to pass a dictionary to agg, specifying which function to apply to each column.
Using the sample data above, one might use
mapping = { 'AB': sum,'H': sum, 'SB': sum, 'Season': max, 'Name': max }
df_1 = df.groupby('playerid').agg(mapping)
The choice to use 'max' for those that shouldn't be summed is arbitrary. You could define a lambda function to apply to a column if you want to handle it in a certain way. DataFrameGroupBy.agg can work with any function that will work with DataFrame.apply.
To expand this to larger data sets, you might use a dictionary comprehension. This would work well:
dictionary = { x : sum for x in df.columns}
dont_sum = {'Name': max, 'Season': max}
dictionary.update(dont_sum)
df_1 = df.groupby('playerid').agg(dictionary)
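As an alternative to max, pandas also accepts the string 'first', which keeps the value from the first row of each group. A small sketch under the assumption that the columns to total are known by name (sum_cols is a hypothetical list, and the sample frame is a shortened copy of the question's data):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['A.J. Pierzynski', 'A.J. Pierzynski', 'Rod Carew'],
    'Season': [2013, 2006, 1977],
    'AB': [503, 509, 616],
    'H': [137, 150, 239],
    'SB': [1, 1, 23],
    'playerid': [746, 746, 1001942],
})

# Assumption: these are the only columns that should be totalled.
sum_cols = ['AB', 'H', 'SB']
# Sum the stat columns and keep the first value of everything else;
# the grouping key itself is left out of the mapping.
agg_map = {
    col: ('sum' if col in sum_cols else 'first')
    for col in df.columns if col != 'playerid'
}
careers = df.groupby('playerid').agg(agg_map)
print(careers)
```

With this mapping, Season holds whichever season appears first for each player, so it is probably only useful as a placeholder; it could equally be dropped from the mapping.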

Nested loop to replace rows in dataframe

I'm trying to write a for loop that takes each row in a dataframe and compares it to the rows in a second dataframe.
If the row in the second dataframe:
isn't in the first dataframe already
has a higher value in the total points column
has a lower cost than the available budget (row_budget)
then I want to remove the row from the first dataframe and add the row from the second dataframe in its place.
Example data:
df
code team_name total_points now_cost
78 93284 BHA 38 50
395 173514 WAT 42 50
342 20452 SOU 66 50
92 17761 BUR 97 50
427 18073 WHU 99 50
69 61933 BHA 115 50
130 116594 CHE 116 50
pos_pool
code team_name total_points now_cost
438 90585 WOL 120 50
281 67089 NEW 131 50
419 37096 WHU 143 50
200 97032 LIV 208 65
209 110979 LIV 231 115
My expected output for the first three loops should be:
df
code team_name total_points now_cost
92 17761 BUR 97 50
427 18073 WHU 99 50
69 61933 BHA 115 50
130 116594 CHE 116 50
438 90585 WOL 120 50
281 67089 NEW 131 50
419 37096 WHU 143 50
Here is the nested for loop that I've tried:
for index, row in df.iterrows():
    budget = squad['budget']
    team_limits = squad['team_limits']
    pos_pool = players_1920.loc[players_1920['position'] == row['position']].sort_values('total_points', ascending=False)
    row_budget = row.now_cost + 1000 - budget
    for index2, row2 in pos_pool.iterrows():
        if (row2 not in df) and (row2.total_points > row.total_points) and (row2.now_cost <= row_budget):
            team_limits[row.team_name] += 1
            team_limits[row2.team_name] -=1
            budget += row.now_cost - row2.now_cost
            df = df.append(row2)
            df = df.drop(row)
        else:
            pass
return df
At the moment the outer loop iterates through the first dataframe, but the inner loop over the second dataframe doesn't seem to do anything.
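Two details in the loop above are worth checking: the expression row2 not in df tests against the column labels of df, not its rows, and df.drop expects an index label rather than a row. One possible way to express the three conditions is sketched below. It is only an illustration under stated assumptions: the code column is taken to identify a player uniquely, budget is taken to mean spare money (0 here), the team_limits bookkeeping is left out, and upgrade_rows is a hypothetical function name.

```python
import pandas as pd


def upgrade_rows(df, pos_pool, budget=0):
    """Swap rows of df for better, affordable rows from pos_pool."""
    df = df.copy()
    pool = pos_pool.sort_values('total_points', ascending=False)
    # Iterate over a sorted snapshot (weakest row first) so that df itself
    # can be reassigned inside the loop.
    for idx, row in df.sort_values('total_points').iterrows():
        row_budget = row['now_cost'] + budget
        candidates = pool[
            ~pool['code'].isin(df['code'])                    # not already in df
            & (pool['total_points'] > row['total_points'])    # more points
            & (pool['now_cost'] <= row_budget)                # affordable
        ]
        if candidates.empty:
            continue
        best = candidates.iloc[[0]]  # one-row DataFrame, highest points
        budget += row['now_cost'] - best['now_cost'].iloc[0]
        df = pd.concat([df.drop(idx), best])
    return df, budget


df = pd.DataFrame(
    {'code': [93284, 173514, 20452, 17761, 18073, 61933, 116594],
     'team_name': ['BHA', 'WAT', 'SOU', 'BUR', 'WHU', 'BHA', 'CHE'],
     'total_points': [38, 42, 66, 97, 99, 115, 116],
     'now_cost': [50] * 7},
    index=[78, 395, 342, 92, 427, 69, 130])
pos_pool = pd.DataFrame(
    {'code': [90585, 67089, 37096, 97032, 110979],
     'team_name': ['WOL', 'NEW', 'WHU', 'LIV', 'LIV'],
     'total_points': [120, 131, 143, 208, 231],
     'now_cost': [50, 50, 50, 65, 115]},
    index=[438, 281, 419, 200, 209])

new_df, remaining = upgrade_rows(df, pos_pool)
print(new_df.sort_values('total_points'))
```

With the example data and no spare budget, the three weakest rows are replaced by the three pool rows that cost 50, which gives the same set of rows as the expected output in the question.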

How to display a sequence of numbers in column-major order?

Program description:
Find all the prime numbers between 1 and 4,027 and print them in a table which
"reads down", using as few rows as possible, and using as few sheets of paper
as possible. (This is because I have to print them out on paper to turn it in.) All numbers should be right-justified in their column. The height
of the columns should all be the same, except for perhaps the last column,
which might have a few blank entries towards its bottom row.
The plan for my first function is to find all prime numbers between the range above and put them in a list. Then I want my second function to display the list in a table that reads up to down.
2 23 59
3 29 61
5 31 67
7 37 71
11 41 73
13 43 79
17 47 83
19 53 89
etc...
This is all I've been able to come up with myself:
def findPrimes(n):
    """ Adds calculated prime numbers to a list. """
    prime_list = list()
    for number in range(1, n + 1):
        prime = True
        for i in range(2, number):
            if(number % i == 0):
                prime = False
        if prime:
            prime_list.append(number)
    return prime_list
def displayPrimes():
    pass
print(findPrimes(4027))
I'm not sure how to make a row/column display in Python. I remember using Java in my previous class and we had to use a for loop inside a for loop I believe. Do I have to do something similar to that?

Although I frequently don't answer questions where the original poster hasn't even made an attempt to solve the problem themselves, I decided to make an exception for yours, mostly because I found it an interesting (and surprisingly challenging) problem that required solving a number of somewhat tricky sub-problems.
I also optimized your prime-finding function slightly (renamed find_primes() below) by taking advantage of some relatively well-known computational shortcuts for calculating primes.
For testing and demo purposes, I made the tables only 15 rows high to force more than one page to be generated as shown in the output at the end.
from itertools import zip_longest
import locale
import math
locale.setlocale(locale.LC_ALL, '')  # enable locale-specific formatting
def zip_discard(*iterables, _NULL=object()):
    """ Like zip_longest() but doesn't fill out all rows to equal length.
        https://stackoverflow.com/questions/38054593/zip-longest-without-fillvalue
    """
    return [[entry for entry in iterable if entry is not _NULL]
            for iterable in zip_longest(*iterables, fillvalue=_NULL)]
def grouper(n, seq):
    """ Group elements in sequence into groups of "n" items. """
    for i in range(0, len(seq), n):
        yield seq[i:i+n]
def tabularize(width, height, numbers):
    """ Print list of numbers in column-major tabular form given the dimensions
        of the table in characters (rows and columns). Will create multiple
        tables if required to display all numbers.
    """
    # Determine number of chars needed to hold longest formatted numeric value
    gap = 2  # including space between numbers
    col_width = len('{:n}'.format(max(numbers))) + gap
    # Determine number of columns that will fit within the table's width.
    num_cols = width // col_width
    chunk_size = num_cols * height  # maximum numbers in each table
    for i, chunk in enumerate(grouper(chunk_size, numbers), start=1):
        print('---- Page {} ----'.format(i))
        num_rows = int(math.ceil(len(chunk) / num_cols))  # rounded up
        table = zip_discard(*grouper(num_rows, chunk))
        for row in table:
            print(''.join(('{:{width}n}'.format(num, width=col_width)
                           for num in row)))
def find_primes(n):
    """ Create list of prime numbers from 1 to n. """
    prime_list = []
    # Note: the range starts at 1, so 1 ends up in the list (as in the
    # output below) even though it is not a prime.
    for number in range(1, n+1):
        # Only divisors up to the square root need to be tested.
        for i in range(2, int(math.sqrt(number)) + 1):
            if not number % i:  # Evenly divisible?
                break  # Not prime.
        else:
            prime_list.append(number)
    return prime_list
primes = find_primes(4027)
tabularize(80, 15, primes)
Output:
---- Page 1 ----
1 47 113 197 281 379 463 571 659 761 863
2 53 127 199 283 383 467 577 661 769 877
3 59 131 211 293 389 479 587 673 773 881
5 61 137 223 307 397 487 593 677 787 883
7 67 139 227 311 401 491 599 683 797 887
11 71 149 229 313 409 499 601 691 809 907
13 73 151 233 317 419 503 607 701 811 911
17 79 157 239 331 421 509 613 709 821 919
19 83 163 241 337 431 521 617 719 823 929
23 89 167 251 347 433 523 619 727 827 937
29 97 173 257 349 439 541 631 733 829 941
31 101 179 263 353 443 547 641 739 839 947
37 103 181 269 359 449 557 643 743 853 953
41 107 191 271 367 457 563 647 751 857 967
43 109 193 277 373 461 569 653 757 859 971
---- Page 2 ----
977 1,069 1,187 1,291 1,427 1,511 1,613 1,733 1,867 1,987 2,087
983 1,087 1,193 1,297 1,429 1,523 1,619 1,741 1,871 1,993 2,089
991 1,091 1,201 1,301 1,433 1,531 1,621 1,747 1,873 1,997 2,099
997 1,093 1,213 1,303 1,439 1,543 1,627 1,753 1,877 1,999 2,111
1,009 1,097 1,217 1,307 1,447 1,549 1,637 1,759 1,879 2,003 2,113
1,013 1,103 1,223 1,319 1,451 1,553 1,657 1,777 1,889 2,011 2,129
1,019 1,109 1,229 1,321 1,453 1,559 1,663 1,783 1,901 2,017 2,131
1,021 1,117 1,231 1,327 1,459 1,567 1,667 1,787 1,907 2,027 2,137
1,031 1,123 1,237 1,361 1,471 1,571 1,669 1,789 1,913 2,029 2,141
1,033 1,129 1,249 1,367 1,481 1,579 1,693 1,801 1,931 2,039 2,143
1,039 1,151 1,259 1,373 1,483 1,583 1,697 1,811 1,933 2,053 2,153
1,049 1,153 1,277 1,381 1,487 1,597 1,699 1,823 1,949 2,063 2,161
1,051 1,163 1,279 1,399 1,489 1,601 1,709 1,831 1,951 2,069 2,179
1,061 1,171 1,283 1,409 1,493 1,607 1,721 1,847 1,973 2,081 2,203
1,063 1,181 1,289 1,423 1,499 1,609 1,723 1,861 1,979 2,083 2,207
---- Page 3 ----
2,213 2,333 2,423 2,557 2,687 2,789 2,903 3,037 3,181 3,307 3,413
2,221 2,339 2,437 2,579 2,689 2,791 2,909 3,041 3,187 3,313 3,433
2,237 2,341 2,441 2,591 2,693 2,797 2,917 3,049 3,191 3,319 3,449
2,239 2,347 2,447 2,593 2,699 2,801 2,927 3,061 3,203 3,323 3,457
2,243 2,351 2,459 2,609 2,707 2,803 2,939 3,067 3,209 3,329 3,461
2,251 2,357 2,467 2,617 2,711 2,819 2,953 3,079 3,217 3,331 3,463
2,267 2,371 2,473 2,621 2,713 2,833 2,957 3,083 3,221 3,343 3,467
2,269 2,377 2,477 2,633 2,719 2,837 2,963 3,089 3,229 3,347 3,469
2,273 2,381 2,503 2,647 2,729 2,843 2,969 3,109 3,251 3,359 3,491
2,281 2,383 2,521 2,657 2,731 2,851 2,971 3,119 3,253 3,361 3,499
2,287 2,389 2,531 2,659 2,741 2,857 2,999 3,121 3,257 3,371 3,511
2,293 2,393 2,539 2,663 2,749 2,861 3,001 3,137 3,259 3,373 3,517
2,297 2,399 2,543 2,671 2,753 2,879 3,011 3,163 3,271 3,389 3,527
2,309 2,411 2,549 2,677 2,767 2,887 3,019 3,167 3,299 3,391 3,529
2,311 2,417 2,551 2,683 2,777 2,897 3,023 3,169 3,301 3,407 3,533
---- Page 4 ----
3,539 3,581 3,623 3,673 3,719 3,769 3,823 3,877 3,919 3,967 4,019
3,541 3,583 3,631 3,677 3,727 3,779 3,833 3,881 3,923 3,989 4,021
3,547 3,593 3,637 3,691 3,733 3,793 3,847 3,889 3,929 4,001 4,027
3,557 3,607 3,643 3,697 3,739 3,797 3,851 3,907 3,931 4,003
3,559 3,613 3,659 3,701 3,761 3,803 3,853 3,911 3,943 4,007
3,571 3,617 3,671 3,709 3,767 3,821 3,863 3,917 3,947 4,013
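The core of the "reads down" layout is index arithmetic: with num_rows rows per table, row r holds the items at positions r, r + num_rows, r + 2*num_rows and so on. A stripped-down sketch without pagination or locale formatting (column_major_rows is a hypothetical helper, not part of the answer above):

```python
import math


def column_major_rows(items, num_cols):
    """Return rows so that reading down each column gives items in order."""
    num_rows = math.ceil(len(items) / num_cols)
    # Row r holds items r, r + num_rows, r + 2*num_rows, ...
    return [items[r::num_rows] for r in range(num_rows)]


for row in column_major_rows([2, 3, 5, 7, 11, 13, 17], 3):
    # Right-justify every number in a 5-character column.
    print(''.join('{:>5}'.format(n) for n in row))
```

Only the last column can end up shorter than the others, which matches the requirement in the problem description.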

program to calculate days of the week

This may be tricky to explain.
I have to "translate" an old BASIC program into Python.
The program is called weekdays:
10 PRINT TAB(32);"WEEKDAY"
20 PRINT TAB(15);"CREATIVE COMPUTING MORRISTOWN, NEW JERSEY"
30 PRINT:PRINT:PRINT
100 PRINT "WEEKDAY IS A COMPUTER DEMONSTRATION THAT"
110 PRINT"GIVES FACTS ABOUT A DATE OF INTEREST TO YOU."
120 PRINT
130 PRINT "ENTER TODAY'S DATE IN THE FORM: 3,24,1979 ";
140 INPUT M1,D1,Y1
150 REM THIS PROGRAM DETERMINES THE DAY OF THE WEEK
160 REM FOR A DATE AFTER 1582
170 DEF FNA(A)=INT(A/4)
180 DIM T(12)
190 DEF FNB(A)=INT(A/7)
200 REM SPACE OUTPUT AND READ IN INITIAL VALUES FOR MONTHS.
210 FOR I= 1 TO 12
220 READ T(I)
230 NEXT I
240 PRINT"ENTER DAY OF BIRTH (OR OTHER DAY OF INTEREST)";
250 INPUT M,D,Y
260 PRINT
270 LET I1 = INT((Y-1500)/100)
280 REM TEST FOR DATE BEFORE CURRENT CALENDAR.
290 IF Y-1582 <0 THEN 1300
300 LET A = I1*5+(I1+3)/4
310 LET I2=INT(A-FNB(A)*7)
320 LET Y2=INT(Y/100)
330 LET Y3 =INT(Y-Y2*100)
340 LET A =Y3/4+Y3+D+T(M)+I2
350 LET B=INT(A-FNB(A)*7)+1
360 IF M > 2 THEN 470
370 IF Y3 = 0 THEN 440
380 LET T1=INT(Y-FNA(Y)*4)
390 IF T1 <> 0 THEN 470
400 IF B<>0 THEN 420
410 LET B=6
420 LET B = B-1
430 GOTO 470
440 LET A = I1-1
450 LET T1=INT(A-FNA(A)*4)
460 IF T1 = 0 THEN 400
470 IF B <>0 THEN 490
480 LET B = 7
490 IF (Y1*12+M1)*31+D1<(Y*12+M)*31+D THEN 550
500 IF (Y1*12+M1)*31+D1=(Y*12+M)*31+D THEN 530
510 PRINT M;"/";D;"/";Y;" WAS A ";
520 GOTO 570
530 PRINT M;"/";D;"/";Y;" IS A ";
540 GOTO 570
550 PRINT M;"/";D;"/";Y;" WILL BE A ";
560 REM PRINT THE DAY OF THE WEEK THE DATE FALLS ON.
570 IF B <>1 THEN 590
580 PRINT "SUNDAY."
590 IF B<>2 THEN 610
600 PRINT "MONDAY."
610 IF B<>3 THEN 630
620 PRINT "TUESDAY."
630 IF B<>4 THEN 650
640 PRINT "WEDNESDAY."
650 IF B<>5 THEN 670
660 PRINT "THURSDAY."
670 IF B<>6 THEN 690
680 GOTO 1250
690 IF B<>7 THEN 710
700 PRINT "SATURDAY."
710 IF (Y1*12+M1)*31+D1=(Y*12+M)*31+D THEN 1120
720 LET I5=Y1-Y
730 PRINT
740 LET I6=M1-M
750 LET I7=D1-D
760 IF I7>=0 THEN 790
770 LET I6= I6-1
780 LET I7=I7+30
790 IF I6>=0 THEN 820
800 LET I5=I5-1
810 LET I6=I6+12
820 IF I5<0 THEN 1310
830 IF I7 <> 0 THEN 850
835 IF I6 <> 0 THEN 850
840 PRINT"***HAPPY BIRTHDAY***"
850 PRINT " "," ","YEARS","MONTHS","DAYS"
855 PRINT " "," ","-----","------","----"
860 PRINT "YOUR AGE (IF BIRTHDATE) ",I5,I6,I7
870 LET A8 = (I5*365)+(I6*30)+I7+INT(I6/2)
880 LET K5 = I5
890 LET K6 = I6
900 LET K7 = I7
910 REM CALCULATE RETIREMENT DATE.
920 LET E = Y+65
930 REM CALCULATE TIME SPENT IN THE FOLLOWING FUNCTIONS.
940 LET F = .35
950 PRINT "YOU HAVE SLEPT ",
960 GOSUB 1370
970 LET F = .17
980 PRINT "YOU HAVE EATEN ",
990 GOSUB 1370
1000 LET F = .23
1010 IF K5 > 3 THEN 1040
1020 PRINT "YOU HAVE PLAYED",
1030 GOTO 1080
1040 IF K5 > 9 THEN 1070
1050 PRINT "YOU HAVE PLAYED/STUDIED",
1060 GOTO 1080
1070 PRINT "YOU HAVE WORKED/PLAYED",
1080 GOSUB 1370
1085 GOTO 1530
1090 PRINT "YOU HAVE RELAXED ",K5,K6,K7
1100 PRINT
1110 PRINT TAB(16);"*** YOU MAY RETIRE IN";E;" ***"
1120 PRINT
1140 PRINT
1200 PRINT
1210 PRINT
1220 PRINT
1230 PRINT
1240 END
1250 IF D=13 THEN 1280
1260 PRINT "FRIDAY."
1270 GOTO 710
1280 PRINT "FRIDAY THE THIRTEENTH---BEWARE!"
1290 GOTO 710
1300 PRINT "NOT PREPARED TO GIVE DAY OF WEEK PRIOR TO MDLXXXII. "
1310 GOTO 1140
1320 REM TABLE OF VALUES FOR THE MONTHS TO BE USED IN CALCULATIONS.
1330 DATA 0, 3, 3, 6, 1, 4, 6, 2, 5, 0, 3, 5
1340 REM THIS IS THE CURRENT DATE USED IN THE CALCULATIONS.
1350 REM THIS IS THE DATE TO BE CALCULATED ON.
1360 REM CALCULATE TIME IN YEARS, MONTHS, AND DAYS
1370 LET K1=INT(F*A8)
1380 LET I5 = INT(K1/365)
1390 LET K1 = K1- (I5*365)
1400 LET I6 = INT(K1/30)
1410 LET I7 = K1 -(I6*30)
1420 LET K5 = K5-I5
1430 LET K6 =K6-I6
1440 LET K7 = K7-I7
1450 IF K7>=0 THEN 1480
1460 LET K7=K7+30
1470 LET K6=K6-1
1480 IF K6>0 THEN 1510
1490 LET K6=K6+12
1500 LET K5=K5-1
1510 PRINT I5,I6,I7
1520 RETURN
1530 IF K6=12 THEN 1550
1540 GOTO 1090
1550 LET K5=K5+1
1560 LET K6=0
1570 GOTO 1090
1580 REM
1590 END
This program takes the current date and a date of birth and returns some statistics, e.g. how long you have lived and how many days you have slept.
For part of the assignment, I have to explain what each variable means in the old BASIC program. In the old days, variable names could only be things like A1, B3, etc.
In this program there is an array, which I'll call DATA:
DATA = [0, 3, 3, 6, 1, 4, 6, 2, 5, 0, 3, 5]
There are 12 numbers in this array. I realized that the program reads each number and matches them to the months from January to December, and I also found out that the array is used to work out which day of the week a date falls on, e.g. Monday or Tuesday.
I have worked out that much so far, but can anybody explain to me what the numbers in the DATA array mean exactly?
Thanks.

Without pulling all the code apart, it looks like it's the offset for the start of the week for a given month...
Assume Jan 1st is a Tuesday (like 2013)...
Jan 0 Tuesday
Feb 3 Friday (Tuesday + 3)
Mar 3 Friday (Tuesday + 3)
Apr 6 Monday (Tuesday + 6)
etc...
This seems to assume it's not a leap year; otherwise the numbers from March onwards would need to be increased by 1 to allow for the extra day.
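One way to check this interpretation is to recompute the table: the offset for a month is the number of days that precede it in the year, modulo 7. A short sketch, assuming a non-leap year (month_offsets is a hypothetical helper name):

```python
# Days in each month of a non-leap year, January to December.
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def month_offsets(days_in_month):
    """Weekday shift of the 1st of each month relative to January 1st."""
    offsets = []
    days_before = 0
    for days in days_in_month:
        # Days elapsed before this month, reduced to whole weeks + remainder.
        offsets.append(days_before % 7)
        days_before += days
    return offsets


print(month_offsets(DAYS_IN_MONTH))
```

The result matches the values on the DATA line (1330) of the BASIC listing, which supports reading them as month-start offsets for a non-leap year.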
