Get financial data using Python
I have managed to write some Python code, using Selenium, that navigates to a webpage containing financial data in several tables.
I want to be able to extract the data and put it into Excel.
The tables appear to be plain HTML tables; a sample of the markup is below:
<tr>
<td class="bc2T bc2gt">Last update</td>
<td class="bc2V bc2D">03/15/2018</td><td class="bc2V bc2D">03/14/2019</td><td class="bc2V bc2D">03/12/2020</td><td class="bc2V bc2D" style="background-color:#DEFEFE;">05/22/2020</td><td class="bc2V bc2D" style="background-color:#DEFEFE;">05/20/2020</td><td class="bc2V bc2D" style="background-color:#DEFEFE;">05/18/2020</td>
</tr>
</table>
The table has the following class name:
<table class='BordCollapseYear2' style="margin-right:20px; font-size:12px; width:100%;" cellspacing=0>
Is there a way I can extract this data? Ideally I want this to be dynamic so that it can extract information for different companies.
I've never used it before, but I've seen BeautifulSoup library mentioned a few times.
https://www.marketscreener.com/MICROSOFT-CORPORATION-4835/financials/
Microsoft is one example: I'd want to extract the income statement data, the balance sheet, etc.
This script scrapes all tables found on the page and pretty-prints them:
import requests
from bs4 import BeautifulSoup

url = 'https://www.marketscreener.com/MICROSOFT-CORPORATION-4835/financials/'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

all_data = {}

# for every table found on page...
for table in soup.select('table.BordCollapseYear2'):
    table_name = table.find_previous('b').text
    all_data[table_name] = []
    # ..scrape every row
    for tr in table.select('tr'):
        row = [td.get_text(strip=True, separator=' ') for td in tr.select('td')]
        if len(row) == 7:
            all_data[table_name].append(row)

# pretty print all data:
for k, v in all_data.items():
    print('Table name: {}'.format(k))
    print('-' * 160)
    for row in v:
        print(('{:<25}'*7).format(*row))
    print()
Prints:
Table name: Valuation
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Fiscal Period: June 2017 2018 2019 2020 2021 2022
Capitalization 1 532 175 757 640 1 026 511 1 391 637 - -
Entreprise Value (EV) 1 485 388 700 112 964 870 1 315 823 1 299 246 1 276 659
P/E ratio 25,4x 46,3x 26,5x 32,3x 29,7x 25,8x
Yield 2,26% 1,70% 1,37% 1,10% 1,18% 1,31%
Capitalization / Revenue 5,51x 6,87x 8,16x 9,81x 8,89x 7,95x
EV / Revenue 5,02x 6,34x 7,67x 9,28x 8,30x 7,30x
EV / EBITDA 12,7x 15,4x 17,7x 20,2x 18,3x 15,9x
Cours sur Actif net 7,46x 9,15x 10,0x 12,1x 10,1x 8,49x
Nbr of stocks (in thousands)7 720 510 7 683 198 7 662 818 7 583 440 - -
Reference price (USD) 68,9 98,6 134 184 184 184
Last update 07/20/2017 07/19/2018 07/18/2019 05/08/2020 04/30/2020 04/30/2020
Table name: Annual Income Statement Data
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Fiscal Period: June 2017 2018 2019 2020 2021 2022
Net sales 1 96 657 110 360 125 843 141 818 156 534 174 945
EBITDA 1 38 117 45 319 54 641 65 074 70 966 80 445
Operating profit (EBIT) 129 339 35 058 42 959 52 544 57 045 65 289
Operating Margin 30,4% 31,8% 34,1% 37,1% 36,4% 37,3%
Pre-Tax Profit (EBT) 1 23 149 36 474 43 688 52 521 57 042 65 225
Net income 1 21 204 16 571 39 240 43 693 47 223 53 905
Net margin 21,9% 15,0% 31,2% 30,8% 30,2% 30,8%
EPS 2 2,71 2,13 5,06 5,68 6,18 7,11
Dividend per Share 2 1,56 1,68 1,84 2,02 2,16 2,41
Last update 07/20/2017 07/19/2018 07/18/2019 05/22/2020 05/22/2020 05/22/2020
Table name: Balance Sheet Analysis
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Fiscal Period: June 2017 2018 2019 2020 2021 2022
Net Debt 1 - - - - - -
Net Cash position 1 46 787 57 528 61 641 75 814 92 392 114 978
Leverage (Debt / EBITDA) -1,23x -1,27x -1,13x -1,17x -1,30x -1,43x
Free Cash Flow 1 31 378 32 252 38 260 41 953 46 887 53 155
ROE (Net Profit / Equities)29,4% 19,4% 42,4% 36,6% 34,5% 36,1%
Shareholders' equity 1 72 195 85 215 92 524 119 417 136 690 149 484
ROA (Net Profit / Asset) 9,76% 6,51% 14,4% 18,5% 14,6% 14,7%
Assets 1 217 276 254 580 272 703 235 800 323 445 366 702
Book Value Per Share 2 9,24 10,8 13,4 15,2 18,2 21,6
Cash Flow per Share 2 5,04 5,63 6,73 7,03 8,02 9,79
Capex 1 8 129 11 632 13 925 15 698 17 922 19 507
Capex / Sales 8,41% 10,5% 11,1% 11,1% 11,4% 11,2%
Last update 07/20/2017 07/19/2018 07/18/2019 05/22/2020 05/22/2020 05/04/2020
EDIT (to save all_data as csv file):
import csv

with open('data.csv', 'w', newline='') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    for k, v in all_data.items():
        spamwriter.writerow([k])
        for row in v:
            spamwriter.writerow(row)
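Before loading the CSV into a spreadsheet, it may help to turn the scraped strings into numbers. The printed cells use spaces as thousands separators, decimal commas, and suffixes such as x or %. The helper below is only a sketch under those assumptions: parse_cell is a hypothetical name, and it does not try to remove the footnote markers (the stray 1 or 2) that end up attached to some row labels.

```python
def parse_cell(text):
    """Convert a scraped cell such as '1 532 175', '25,4x' or '2,26%' to a float.

    Hypothetical helper: returns None for the '-' placeholder and returns
    the original text unchanged when it does not look like a number.
    """
    cleaned = text.strip()
    if cleaned == '-':
        return None
    # Drop a trailing unit marker, remove the spaces used as thousands
    # separators, and switch the decimal comma to a decimal point.
    number = cleaned.rstrip('x%').replace(' ', '').replace(',', '.')
    try:
        return float(number)
    except ValueError:
        return text


row = ['P/E ratio', '25,4x', '46,3x', '26,5x', '32,3x', '29,7x', '25,8x']
print([row[0]] + [parse_cell(cell) for cell in row[1:]])
```

Note that percentages come back as plain numbers (2.26, not 0.0226), so the unit has to be tracked separately if it matters.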
Related
text file rows into CSV column python
I have a text file containing data like this:

A 34 45 7789 3475768 443 67 8999 3343 656 8876 802 383358 873 36789 2374859 485994 86960 32838459 3484549 24549 58423
T 3445 574649 68078 59348604 45959 64585304 56568 595 49686 656564 55446 665 677 778 433 545 333 65665 3535

and so on. I want to make a CSV file from this text file, with A and T as column headings and the numbers below them, like this:

A        T
34       3445
45       574649
7789     68078
3475768  59348604
443      45959
EDIT (a much simpler solution, inspired by Michael Butscher's comment):

import pandas as pd

df = pd.read_csv("filename.txt", delimiter=" ")
df.T.to_csv("filename.csv", header=False)

Here is the code:

import pandas as pd

# Read file
with open("filename.txt", "r") as f:
    data = f.read()

# Split data by lines and remove empty lines
columns = data.split("\n")
columns = [x.split() for x in columns if x != ""]

# Row sizes are different in your example so find max number of rows
column_lengths = [len(x) for x in columns]
max_col_length = max(column_lengths)

data = {}
for i in columns:
    # Add None to end for columns that have less values
    if len(i) < max_col_length:
        i += [None] * (max_col_length - len(i))
    data[i[0]] = i[1:]

# Create dataframe
df = pd.DataFrame(data)

# Create csv
df.to_csv("filename.csv", index=False)

Output should look like this:

           A         T
0         34      3445
1         45    574649
2       7789     68078
3    3475768  59348604
4        443     45959
5         67  64585304
6       8999     56568
7       3343       595
8        656     49686
9       8876    656564
10       802     55446
11    383358       665
12       873       677
13     36789       778
14   2374859       433
15    485994       545
16     86960       333
17  32838459     65665
18   3484549      3535
19     24549      None
20     58423      None
Here is my code:

import pandas as pd

data = pd.read_csv("text (3).txt", header = None)
Our_Data = pd.DataFrame(data)

for rows in Our_Data:
    New_Data = pd.DataFrame(Our_Data[rows].str.split(' ').tolist()).T
    New_Data.columns = New_Data.iloc[0]
    New_Data = New_Data[1:]

New_Data.to_csv("filename.csv", index=False)

The output:

           A         T
1         34      3445
2         45    574649
3       7789     68078
4    3475768  59348604
5        443     45959
6         67  64585304
7       8999     56568
8       3343       595
9        656     49686
10      8876    656564
11       802     55446
12    383358       665
13       873       677
14     36789       778
15   2374859       433
16    485994       545
17     86960       333
18  32838459     65665
19   3484549      3535
20     24549      None
21     58423      None
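For comparison, the same transposition can be sketched with only the standard library. This is not taken from either answer: rows_to_columns is a hypothetical helper name, and the sample text is a shortened version of the data in the question. itertools.zip_longest pads the shorter row, so the two lines do not need the same number of values.

```python
import csv
import io
from itertools import zip_longest


def rows_to_columns(text):
    """Turn whitespace-separated lines into columns, padding short ones with ''."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    # zip_longest transposes the rows; fillvalue pads the shorter rows.
    return list(zip_longest(*rows, fillvalue=''))


sample = "A 34 45 7789\nT 3445 574649\n"
columns = rows_to_columns(sample)

# Write to an in-memory buffer here; a real script would open a file instead.
buffer = io.StringIO()
csv.writer(buffer).writerows(columns)
print(buffer.getvalue())
```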
Create new Pandas.DataFrame with .groupby(...).agg(sum) then recover unsummed columns
I'm starting with a dataframe of baseball seasons, a section of which looks similar to this:

                    Name  Season   AB    H  SB  playerid
13047    A.J. Pierzynski    2013  503  137   1       746
6891     A.J. Pierzynski    2006  509  150   1       746
1374           Rod Carew    1977  616  239  23   1001942
1422         Stan Musial    1948  611  230   7   1009405
1507         Todd Helton    2000  580  216   5       432
1508   Nomar Garciaparra    2000  529  197   5       190
1509       Ichiro Suzuki    2004  704  262  36      1101

From these seasons, I want to create a dataframe of career stats; that is, one row for each player which is a sum of their AB, H, etc. This dataframe should still include the names of the players. The playerid in the above is a unique key for each player and should either be an index or an unchanged value in a column after creating the career stats dataframe.

My hypothetical starting point is

df_careers = df_seasons.groupby('playerid').agg(sum)

but this leaves out all the non-numeric data. With numeric_only = False I can get some sort of mess in the names columns like 'Ichiro SuzukiIchiro SuzukiIchiro Suzuki' from concatenation, but that just requires a bunch of cleaning. This is something I'd like to be able to do with other data sets, and the actual data I have is more like 25 columns, so I'd rather understand a specific routine for getting the Name data back or preserving it from the outset rather than write a specific function and use groupby('playerid').agg(func) (or a similar process) to do it, if possible.

I'm guessing there's a fairly simple way to do this, but I only started learning Pandas a week ago, so there are gaps in my knowledge.
You can write your own condition for how you want to include the non-summed columns:

col = df.columns.tolist()
col.remove('playerid')
df.groupby('playerid').agg({i: lambda x: x.iloc[0] if x.dtypes == 'object' else x.sum() for i in col})

Result:

                       Name  Season    AB    H  SB
playerid
190       Nomar_Garciaparra    2000   529  197   5
432             Todd_Helton    2000   580  216   5
746         A.J._Pierzynski    4019  1012  287   2
1101          Ichiro_Suzuki    2004   704  262  36
1001942           Rod_Carew    1977   616  239  23
1009405         Stan_Musial    1948   611  230   7
If there is a one-to-one relationship between 'playerid' and 'Name', as appears to be the case, you can just include 'Name' in the groupby columns:

stat_cols = ['AB', 'H', 'SB']
groupby_cols = ['playerid', 'Name']
results = df.groupby(groupby_cols)[stat_cols].sum()

Results:

                              AB    H  SB
playerid Name
190      Nomar Garciaparra   529  197   5
432      Todd Helton         580  216   5
746      A.J. Pierzynski    1012  287   2
1101     Ichiro Suzuki       704  262  36
1001942  Rod Carew           616  239  23
1009405  Stan Musial         611  230   7

If you'd prefer to group only by 'playerid' and add the 'Name' data back in afterwards, you can instead create a 'playerid' to 'Name' mapping as a dictionary, and look it up using map:

results = df.groupby('playerid')[stat_cols].sum()
name_map = pd.Series(df.Name.to_numpy(), df.playerid).to_dict()
results['Name'] = results.index.map(name_map)

Results:

            AB    H  SB               Name
playerid
190        529  197   5  Nomar Garciaparra
432        580  216   5        Todd Helton
746       1012  287   2    A.J. Pierzynski
1101       704  262  36      Ichiro Suzuki
1001942    616  239  23          Rod Carew
1009405    611  230   7        Stan Musial
groupby.agg() can accept a dictionary that maps column names to functions. So, one solution is to pass a dictionary to agg, specifying which functions to apply to each column. Using the sample data above, one might use

mapping = {'AB': sum, 'H': sum, 'SB': sum, 'Season': max, 'Name': max}
df_1 = df.groupby('playerid').agg(mapping)

The choice to use 'max' for the columns that shouldn't be summed is arbitrary. You could define a lambda function to apply to a column if you want to handle it in a certain way. DataFrameGroupBy.agg can work with any function that will work with DataFrame.apply.

To expand this to larger data sets, you might use a dictionary comprehension. This would work well:

dictionary = {x: sum for x in df.columns}
dont_sum = {'Name': max, 'Season': max}
dictionary.update(dont_sum)
df_1 = df.groupby('playerid').agg(dictionary)
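As a variation on the dictionary approach, the sketch below uses the built-in aggregation names 'sum' and 'first' instead of Python's sum and max, so a label column such as Name keeps one representative value per player rather than an arbitrary maximum. The three-row frame is a cut-down copy of the sample data, and the split into label_cols and stat_cols is an assumption about which columns should be summed.

```python
import pandas as pd

# Cut-down copy of the sample seasons from the question.
df_seasons = pd.DataFrame({
    'Name': ['A.J. Pierzynski', 'A.J. Pierzynski', 'Rod Carew'],
    'Season': [2013, 2006, 1977],
    'AB': [503, 509, 616],
    'H': [137, 150, 239],
    'SB': [1, 1, 23],
    'playerid': [746, 746, 1001942],
})

label_cols = ['Name']           # assumed constant per player: keep the first value
stat_cols = ['AB', 'H', 'SB']   # assumed counting stats: add them up

# Build the column -> aggregation mapping, then aggregate once per player.
how = {col: 'first' for col in label_cols}
how.update({col: 'sum' for col in stat_cols})
df_careers = df_seasons.groupby('playerid').agg(how)
print(df_careers)
```

Columns left out of the mapping (Season here) are simply dropped from the result.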
Nested loop to replace rows in dataframe
I'm trying to write a for loop that takes each row in a dataframe and compares it to the rows in a second dataframe. If the row in the second dataframe:

- isn't in the first dataframe already
- has a higher value in the total points column
- has a lower cost than the available budget (row_budget)

then I want to remove the row from the first dataframe and add the row from the second dataframe in its place.

Example data:

df
       code team_name  total_points  now_cost
78    93284       BHA            38        50
395  173514       WAT            42        50
342   20452       SOU            66        50
92    17761       BUR            97        50
427   18073       WHU            99        50
69    61933       BHA           115        50
130  116594       CHE           116        50

pos_pool
       code team_name  total_points  now_cost
438   90585       WOL           120        50
281   67089       NEW           131        50
419   37096       WHU           143        50
200   97032       LIV           208        65
209  110979       LIV           231       115

My expected output for the first three loops should be:

df
       code team_name  total_points  now_cost
92    17761       BUR            97        50
427   18073       WHU            99        50
69    61933       BHA           115        50
130  116594       CHE           116        50
438   90585       WOL           120        50
281   67089       NEW           131        50
419   37096       WHU           143        50

Here is the nested for loop that I've tried:

for index, row in df.iterrows():
    budget = squad['budget']
    team_limits = squad['team_limits']
    pos_pool = players_1920.loc[players_1920['position'] == row['position']].sort_values('total_points', ascending=False)
    row_budget = row.now_cost + 1000 - budget
    for index2, row2 in pos_pool.iterrows():
        if (row2 not in df) and (row2.total_points > row.total_points) and (row2.now_cost <= row_budget):
            team_limits[row.team_name] += 1
            team_limits[row2.team_name] -= 1
            budget += row.now_cost - row2.now_cost
            df = df.append(row2)
            df = df.drop(row)
        else:
            pass
return df

At the moment the code only seems to iterate through the first dataframe; it doesn't appear to do anything with the second.
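One possible direction, offered as a rough sketch rather than a drop-in fix: the in operator on a DataFrame tests column labels rather than rows, and DataFrame.append was removed in pandas 2.0, so the version below compares the code column and uses pd.concat. The function name swap_in_better_players and its spare_budget and max_swaps parameters are hypothetical, and the team limits from the question are ignored. A candidate counts as affordable when it costs no more than the player it replaces plus the spare budget.

```python
import pandas as pd

df = pd.DataFrame(
    {'code': [93284, 173514, 20452, 17761, 18073, 61933, 116594],
     'team_name': ['BHA', 'WAT', 'SOU', 'BUR', 'WHU', 'BHA', 'CHE'],
     'total_points': [38, 42, 66, 97, 99, 115, 116],
     'now_cost': [50, 50, 50, 50, 50, 50, 50]},
    index=[78, 395, 342, 92, 427, 69, 130])
pos_pool = pd.DataFrame(
    {'code': [90585, 67089, 37096, 97032, 110979],
     'team_name': ['WOL', 'NEW', 'WHU', 'LIV', 'LIV'],
     'total_points': [120, 131, 143, 208, 231],
     'now_cost': [50, 50, 50, 65, 115]},
    index=[438, 281, 419, 200, 209])


def swap_in_better_players(squad, pool, spare_budget=0, max_swaps=3):
    """Replace the weakest squad rows with the best affordable pool rows."""
    squad = squad.copy()
    ranked_pool = pool.sort_values('total_points', ascending=False)
    swaps = 0
    # Visit the weakest squad members first.
    for label in squad.sort_values('total_points').index:
        if swaps >= max_swaps:
            break
        current = squad.loc[label]
        limit = current['now_cost'] + spare_budget
        candidates = ranked_pool[
            ~ranked_pool['code'].isin(squad['code'])
            & (ranked_pool['total_points'] > current['total_points'])
            & (ranked_pool['now_cost'] <= limit)
        ]
        if candidates.empty:
            continue
        best = candidates.iloc[[0]]  # one-row DataFrame, highest points first
        spare_budget += current['now_cost'] - best['now_cost'].iloc[0]
        squad = pd.concat([squad.drop(label), best])
        swaps += 1
    return squad


result = swap_in_better_players(df, pos_pool)
print(result)
```

With the example data this keeps the four strongest original rows and adds the WHU, NEW and WOL rows from pos_pool, which are the same seven players as the expected output, though in a different row order.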
How to display a sequence of numbers in column-major order?
Program description:

Find all the prime numbers between 1 and 4,027 and print them in a table which "reads down", using as few rows as possible, and using as few sheets of paper as possible. (This is because I have to print them out on paper to turn it in.) All numbers should be right-justified in their column. The height of the columns should all be the same, except for perhaps the last column, which might have a few blank entries towards its bottom row.

The plan for my first function is to find all prime numbers in the range above and put them in a list. Then I want my second function to display the list in a table that reads from top to bottom:

 2  23  59
 3  29  61
 5  31  67
 7  37  71
11  41  73
13  43  79
17  47  83
19  53  89
etc.

This is all I've been able to come up with myself:

def findPrimes(n):
    """ Adds calculated prime numbers to a list. """
    prime_list = list()
    for number in range(1, n + 1):
        prime = True
        for i in range(2, number):
            if(number % i == 0):
                prime = False
        if prime:
            prime_list.append(number)
    return prime_list

def displayPrimes():
    pass

print(findPrimes(4027))

I'm not sure how to make a row/column display in Python. I remember using Java in my previous class and we had to use a for loop inside a for loop, I believe. Do I have to do something similar to that?
Although I frequently don't answer questions where the original poster hasn't even made an attempt to solve the problem themselves, I decided to make an exception for yours, mostly because I found it an interesting (and surprisingly challenging) problem that required solving a number of somewhat tricky sub-problems.

I also optimized your find_primes() function slightly by taking advantage of some relatively well-known computational shortcuts for calculating them.

For testing and demo purposes, I made the tables only 15 rows high to force more than one page to be generated, as shown in the output at the end.

from itertools import zip_longest
import locale
import math

locale.setlocale(locale.LC_ALL, '')  # enable locale-specific formatting

def zip_discard(*iterables, _NULL=object()):
    """ Like zip_longest() but doesn't fill out all rows to equal length.
        https://stackoverflow.com/questions/38054593/zip-longest-without-fillvalue
    """
    return [[entry for entry in iterable if entry is not _NULL]
                for iterable in zip_longest(*iterables, fillvalue=_NULL)]

def grouper(n, seq):
    """ Group elements in sequence into groups of "n" items. """
    for i in range(0, len(seq), n):
        yield seq[i:i+n]

def tabularize(width, height, numbers):
    """ Print list of numbers in column-major tabular form given the dimensions
        of the table in characters (rows and columns). Will create multiple
        tables if required to display all numbers.
    """
    # Determine number of chars needed to hold longest formatted numeric value
    gap = 2  # including space between numbers
    col_width = len('{:n}'.format(max(numbers))) + gap

    # Determine number of columns that will fit within the table's width.
    num_cols = width // col_width
    chunk_size = num_cols * height  # maximum numbers in each table

    for i, chunk in enumerate(grouper(chunk_size, numbers), start=1):
        print('---- Page {} ----'.format(i))
        num_rows = int(math.ceil(len(chunk) / num_cols))  # rounded up
        table = zip_discard(*grouper(num_rows, chunk))
        for row in table:
            print(''.join(('{:{width}n}'.format(num, width=col_width)
                               for num in row)))

def find_primes(n):
    """ Create list of prime numbers from 1 to n. """
    prime_list = []
    for number in range(1, n+1):
        for i in range(2, int(math.sqrt(number)) + 1):
            if not number % i:  # Evenly divisible?
                break  # Not prime.
        else:
            prime_list.append(number)
    return prime_list

primes = find_primes(4027)
tabularize(80, 15, primes)

Output:

---- Page 1 ----
      1     47    113    197    281    379    463    571    659    761    863
      2     53    127    199    283    383    467    577    661    769    877
      3     59    131    211    293    389    479    587    673    773    881
      5     61    137    223    307    397    487    593    677    787    883
      7     67    139    227    311    401    491    599    683    797    887
     11     71    149    229    313    409    499    601    691    809    907
     13     73    151    233    317    419    503    607    701    811    911
     17     79    157    239    331    421    509    613    709    821    919
     19     83    163    241    337    431    521    617    719    823    929
     23     89    167    251    347    433    523    619    727    827    937
     29     97    173    257    349    439    541    631    733    829    941
     31    101    179    263    353    443    547    641    739    839    947
     37    103    181    269    359    449    557    643    743    853    953
     41    107    191    271    367    457    563    647    751    857    967
     43    109    193    277    373    461    569    653    757    859    971
---- Page 2 ----
    977  1,069  1,187  1,291  1,427  1,511  1,613  1,733  1,867  1,987  2,087
    983  1,087  1,193  1,297  1,429  1,523  1,619  1,741  1,871  1,993  2,089
    991  1,091  1,201  1,301  1,433  1,531  1,621  1,747  1,873  1,997  2,099
    997  1,093  1,213  1,303  1,439  1,543  1,627  1,753  1,877  1,999  2,111
  1,009  1,097  1,217  1,307  1,447  1,549  1,637  1,759  1,879  2,003  2,113
  1,013  1,103  1,223  1,319  1,451  1,553  1,657  1,777  1,889  2,011  2,129
  1,019  1,109  1,229  1,321  1,453  1,559  1,663  1,783  1,901  2,017  2,131
  1,021  1,117  1,231  1,327  1,459  1,567  1,667  1,787  1,907  2,027  2,137
  1,031  1,123  1,237  1,361  1,471  1,571  1,669  1,789  1,913  2,029  2,141
  1,033  1,129  1,249  1,367  1,481  1,579  1,693  1,801  1,931  2,039  2,143
  1,039  1,151  1,259  1,373  1,483  1,583  1,697  1,811  1,933  2,053  2,153
  1,049  1,153  1,277  1,381  1,487  1,597  1,699  1,823  1,949  2,063  2,161
  1,051  1,163  1,279  1,399  1,489  1,601  1,709  1,831  1,951  2,069  2,179
  1,061  1,171  1,283  1,409  1,493  1,607  1,721  1,847  1,973  2,081  2,203
  1,063  1,181  1,289  1,423  1,499  1,609  1,723  1,861  1,979  2,083  2,207
---- Page 3 ----
  2,213  2,333  2,423  2,557  2,687  2,789  2,903  3,037  3,181  3,307  3,413
  2,221  2,339  2,437  2,579  2,689  2,791  2,909  3,041  3,187  3,313  3,433
  2,237  2,341  2,441  2,591  2,693  2,797  2,917  3,049  3,191  3,319  3,449
  2,239  2,347  2,447  2,593  2,699  2,801  2,927  3,061  3,203  3,323  3,457
  2,243  2,351  2,459  2,609  2,707  2,803  2,939  3,067  3,209  3,329  3,461
  2,251  2,357  2,467  2,617  2,711  2,819  2,953  3,079  3,217  3,331  3,463
  2,267  2,371  2,473  2,621  2,713  2,833  2,957  3,083  3,221  3,343  3,467
  2,269  2,377  2,477  2,633  2,719  2,837  2,963  3,089  3,229  3,347  3,469
  2,273  2,381  2,503  2,647  2,729  2,843  2,969  3,109  3,251  3,359  3,491
  2,281  2,383  2,521  2,657  2,731  2,851  2,971  3,119  3,253  3,361  3,499
  2,287  2,389  2,531  2,659  2,741  2,857  2,999  3,121  3,257  3,371  3,511
  2,293  2,393  2,539  2,663  2,749  2,861  3,001  3,137  3,259  3,373  3,517
  2,297  2,399  2,543  2,671  2,753  2,879  3,011  3,163  3,271  3,389  3,527
  2,309  2,411  2,549  2,677  2,767  2,887  3,019  3,167  3,299  3,391  3,529
  2,311  2,417  2,551  2,683  2,777  2,897  3,023  3,169  3,301  3,407  3,533
---- Page 4 ----
  3,539  3,581  3,623  3,673  3,719  3,769  3,823  3,877  3,919  3,967  4,019
  3,541  3,583  3,631  3,677  3,727  3,779  3,833  3,881  3,923  3,989  4,021
  3,547  3,593  3,637  3,691  3,733  3,793  3,847  3,889  3,929  4,001  4,027
  3,557  3,607  3,643  3,697  3,739  3,797  3,851  3,907  3,931  4,003
  3,559  3,613  3,659  3,701  3,761  3,803  3,853  3,911  3,943  4,007
  3,571  3,617  3,671  3,709  3,767  3,821  3,863  3,917  3,947  4,013
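The core of the "reads down" layout can also be shown in isolation. The sketch below is a simplified illustration, not the answer's code: column_major_rows is a hypothetical helper that relies on slicing with a step. Taking every num_rows-th item starting at offset r gives row r of a table whose columns read top to bottom.

```python
def column_major_rows(items, num_rows):
    """Return rows such that reading down each column yields items in order."""
    # Row r holds items r, r + num_rows, r + 2 * num_rows, and so on.
    return [items[r::num_rows] for r in range(num_rows)]


primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
for row in column_major_rows(primes, 4):
    # Right-justify every number in a 4-character column.
    print(''.join('{:>4}'.format(n) for n in row))
```

With 10 items and 4 rows, the last column is shorter than the others, which matches the requirement that only the final column may have blank entries at the bottom.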
program to calculate days of the week
This may be tricky to explain. I have to "translate" an old BASIC program into Python. The program is called weekdays:

10 PRINT TAB(32);"WEEKDAY"
20 PRINT TAB(15);"CREATIVE COMPUTING MORRISTOWN, NEW JERSEY"
30 PRINT:PRINT:PRINT
100 PRINT "WEEKDAY IS A COMPUTER DEMONSTRATION THAT"
110 PRINT"GIVES FACTS ABOUT A DATE OF INTEREST TO YOU."
120 PRINT
130 PRINT "ENTER TODAY'S DATE IN THE FORM: 3,24,1979 ";
140 INPUT M1,D1,Y1
150 REM THIS PROGRAM DETERMINES THE DAY OF THE WEEK
160 REM FOR A DATE AFTER 1582
170 DEF FNA(A)=INT(A/4)
180 DIM T(12)
190 DEF FNB(A)=INT(A/7)
200 REM SPACE OUTPUT AND READ IN INITIAL VALUES FOR MONTHS.
210 FOR I= 1 TO 12
220 READ T(I)
230 NEXT I
240 PRINT"ENTER DAY OF BIRTH (OR OTHER DAY OF INTEREST)";
250 INPUT M,D,Y
260 PRINT
270 LET I1 = INT((Y-1500)/100)
280 REM TEST FOR DATE BEFORE CURRENT CALENDAR.
290 IF Y-1582 <0 THEN 1300
300 LET A = I1*5+(I1+3)/4
310 LET I2=INT(A-FNB(A)*7)
320 LET Y2=INT(Y/100)
330 LET Y3 =INT(Y-Y2*100)
340 LET A =Y3/4+Y3+D+T(M)+I2
350 LET B=INT(A-FNB(A)*7)+1
360 IF M > 2 THEN 470
370 IF Y3 = 0 THEN 440
380 LET T1=INT(Y-FNA(Y)*4)
390 IF T1 <> 0 THEN 470
400 IF B<>0 THEN 420
410 LET B=6
420 LET B = B-1
430 GOTO 470
440 LET A = I1-1
450 LET T1=INT(A-FNA(A)*4)
460 IF T1 = 0 THEN 400
470 IF B <>0 THEN 490
480 LET B = 7
490 IF (Y1*12+M1)*31+D1<(Y*12+M)*31+D THEN 550
500 IF (Y1*12+M1)*31+D1=(Y*12+M)*31+D THEN 530
510 PRINT M;"/";D;"/";Y;" WAS A ";
520 GOTO 570
530 PRINT M;"/";D;"/";Y;" IS A ";
540 GOTO 570
550 PRINT M;"/";D;"/";Y;" WILL BE A ";
560 REM PRINT THE DAY OF THE WEEK THE DATE FALLS ON.
570 IF B <>1 THEN 590
580 PRINT "SUNDAY."
590 IF B<>2 THEN 610
600 PRINT "MONDAY."
610 IF B<>3 THEN 630
620 PRINT "TUESDAY."
630 IF B<>4 THEN 650
640 PRINT "WEDNESDAY."
650 IF B<>5 THEN 670
660 PRINT "THURSDAY."
670 IF B<>6 THEN 690
680 GOTO 1250
690 IF B<>7 THEN 710
700 PRINT "SATURDAY."
710 IF (Y1*12+M1)*31+D1=(Y*12+M)*31+D THEN 1120
720 LET I5=Y1-Y
730 PRINT
740 LET I6=M1-M
750 LET I7=D1-D
760 IF I7>=0 THEN 790
770 LET I6= I6-1
780 LET I7=I7+30
790 IF I6>=0 THEN 820
800 LET I5=I5-1
810 LET I6=I6+12
820 IF I5<0 THEN 1310
830 IF I7 <> 0 THEN 850
835 IF I6 <> 0 THEN 850
840 PRINT"***HAPPY BIRTHDAY***"
850 PRINT " "," ","YEARS","MONTHS","DAYS"
855 PRINT " "," ","-----","------","----"
860 PRINT "YOUR AGE (IF BIRTHDATE) ",I5,I6,I7
870 LET A8 = (I5*365)+(I6*30)+I7+INT(I6/2)
880 LET K5 = I5
890 LET K6 = I6
900 LET K7 = I7
910 REM CALCULATE RETIREMENT DATE.
920 LET E = Y+65
930 REM CALCULATE TIME SPENT IN THE FOLLOWING FUNCTIONS.
940 LET F = .35
950 PRINT "YOU HAVE SLEPT ",
960 GOSUB 1370
970 LET F = .17
980 PRINT "YOU HAVE EATEN ",
990 GOSUB 1370
1000 LET F = .23
1010 IF K5 > 3 THEN 1040
1020 PRINT "YOU HAVE PLAYED",
1030 GOTO 1080
1040 IF K5 > 9 THEN 1070
1050 PRINT "YOU HAVE PLAYED/STUDIED",
1060 GOTO 1080
1070 PRINT "YOU HAVE WORKED/PLAYED",
1080 GOSUB 1370
1085 GOTO 1530
1090 PRINT "YOU HAVE RELAXED ",K5,K6,K7
1100 PRINT
1110 PRINT TAB(16);"*** YOU MAY RETIRE IN";E;" ***"
1120 PRINT
1140 PRINT
1200 PRINT
1210 PRINT
1220 PRINT
1230 PRINT
1240 END
1250 IF D=13 THEN 1280
1260 PRINT "FRIDAY."
1270 GOTO 710
1280 PRINT "FRIDAY THE THIRTEENTH---BEWARE!"
1290 GOTO 710
1300 PRINT "NOT PREPARED TO GIVE DAY OF WEEK PRIOR TO MDLXXXII. "
1310 GOTO 1140
1320 REM TABLE OF VALUES FOR THE MONTHS TO BE USED IN CALCULATIONS.
1330 DATA 0, 3, 3, 6, 1, 4, 6, 2, 5, 0, 3, 5
1340 REM THIS IS THE CURRENT DATE USED IN THE CALCULATIONS.
1350 REM THIS IS THE DATE TO BE CALCULATED ON.
1360 REM CALCULATE TIME IN YEARS, MONTHS, AND DAYS
1370 LET K1=INT(F*A8)
1380 LET I5 = INT(K1/365)
1390 LET K1 = K1- (I5*365)
1400 LET I6 = INT(K1/30)
1410 LET I7 = K1 -(I6*30)
1420 LET K5 = K5-I5
1430 LET K6 =K6-I6
1440 LET K7 = K7-I7
1450 IF K7>=0 THEN 1480
1460 LET K7=K7+30
1470 LET K6=K6-1
1480 IF K6>0 THEN 1510
1490 LET K6=K6+12
1500 LET K5=K5-1
1510 PRINT I5,I6,I7
1520 RETURN
1530 IF K6=12 THEN 1550
1540 GOTO 1090
1550 LET K5=K5+1
1560 LET K6=0
1570 GOTO 1090
1580 REM
1590 END

This program takes the current date and a date of birth and returns some statistics, e.g. how long you have lived and how many days you have slept.

For part of the assignment, I have to explain what each variable means in the old BASIC program. In the old days, variable names could only be things like A1, B3, etc.

In this program there is an array called DATA = [0, 3, 3, 6, 1, 4, 6, 2, 5, 0, 3, 5]. There are 12 numbers in this array. I realized that the program reads each number and matches it to a month from Jan to Dec, and I also found out that this is used in calculating which day of the week a date is, e.g. Monday or Tuesday.

I have found out that much so far, but can anybody explain to me what the numbers in the DATA array mean exactly? Thanks.
Without pulling all the code apart, it looks like each number is the weekday offset of the first day of that month, relative to the first day of the year.

Assume Jan 1st is a Tuesday (as in 2013):

Jan  0  Tuesday
Feb  3  Friday (Tuesday + 3)
Mar  3  Friday (Tuesday + 3)
Apr  6  Monday (Tuesday + 6)
etc.

This seems to assume it's not a leap year; otherwise the numbers from March onwards would need to be increased by 1 to allow for the extra day.
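This reading can be checked with a few lines of Python. The sketch below is an illustration, not part of the BASIC program: month_offsets is a hypothetical helper that adds up the days before each month in a non-leap year and takes the remainder modulo 7.

```python
# Days in each month of a non-leap year, January through December.
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def month_offsets(days_in_month):
    """Weekday shift of the 1st of each month relative to January 1st."""
    offsets = []
    elapsed = 0  # days that passed before the current month started
    for days in days_in_month:
        offsets.append(elapsed % 7)
        elapsed += days
    return offsets


print(month_offsets(DAYS_IN_MONTH))
# [0, 3, 3, 6, 1, 4, 6, 2, 5, 0, 3, 5]
```

The result matches the DATA line (1330) of the program value for value.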