Is it possible to nest the apply function for a dataframe? - python

I am returning to my quest to replace an old VBA program with Python code, but as I get the code working, I'm conscious that it needs to be Pythonic. The code below works, but since it uses a for loop inside the apply function, is it possible to clean this up with another apply based around the find function?
cheers
import pandas as pd
dfWorking = pd.DataFrame({
    'Date': ['2023-01-25', '2023-01-24', '2023-01-24', '2023-01-23'],
    'Item': ['Visa Purchase 21Jan Hanaro Northlakes Mango Hil ',
             'Visa Purchase 21Jan Event Cinemas North North Lak',
             'Visa Purchase 21Jan Event Cinemas North North Lak',
             'Mcare Benefits 4880000027 Eywq'],
    'Debit': [67.0, 10.2, 7.65, 39.75],
    'Credit': [0.0, 0, 0, 0],
    'Balance': [1830.0, 1897.99, 1908.019, 1915.84],
    'Description': ['a', 'b', 'c', 'd']
})
dfAcctHist = pd.DataFrame({
    'Date': ['2023-01-23', '2023-01-23', '2023-01-23', '2023-01-23'],
    'Item': ['Csc R555558Df Nett',
             'Tfr Wdl BPAY Internet 25Jan05:31 208655638973732700058Deft Payments',
             'Eftpos Debit 25Jan15:19 Sq *Becs Cafe Kippa-Ring Qldau',
             'Mcare Benefits 4880000027 Eywq'],
    'Debit': [0, 168.0, 9.0, 39.75],
    'Credit': [907.92, 0, 0, 0],
    'Balance': [2053.09, 1885.07, 1876.09, 1915.84],
    'Description': ['z', 'x', 's', 'f']
})
dfIncome = pd.DataFrame({
    'Job': ['Pension'],
    'Income': [1000.00],
    'Tax': [0.0],
    'SearchName': ['R555558Df']
})
# This Function checks if the income has changed for a Job in tblIncome.
def chk_income(row):
    for i in dfIncome.index:
        income = dfIncome.loc[i, 'SearchName']
        if row.loc['Item'].find(income) != -1:
            print("Item is an income")
            if row.loc['Credit'] != dfIncome.loc[i, 'Income']:
                print(" income has changed")
                # NOTE Dialog box required here to ask user if they want to update the income table
                print(f"\nOld Income = {dfIncome.loc[i, 'Income']} New Income = {row.loc['Credit']}")
                dfIncome.loc[i, 'Income'] = row.loc['Credit']
                print(f"old income has been changed to {dfIncome.loc[i, 'Income']}")
            else:
                print(" income has not changed")
        else:
            print("Item is not an income")
# Check and change if an income value changes
dfAcctHist.apply(chk_income, axis=1).copy()
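One possible direction, sketched here only as an untested starting point rather than a drop-in replacement: keep the loop over dfIncome but replace the inner row scan with a vectorized Series.str.contains over the Item column (the "take the first changed Credit as the new income" rule is an assumption).
# A possible vectorized sketch (assumptions: plain substring matching is enough,
# and the first matching row's Credit becomes the new income).
for i in dfIncome.index:
    search = dfIncome.loc[i, 'SearchName']
    hits = dfAcctHist[dfAcctHist['Item'].str.contains(search, regex=False)]
    changed = hits[hits['Credit'] != dfIncome.loc[i, 'Income']]
    if not changed.empty:
        print(f"Income for {dfIncome.loc[i, 'Job']} changed to {changed.iloc[0]['Credit']}")
        dfIncome.loc[i, 'Income'] = changed.iloc[0]['Credit']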

Related

Python yahoo finance data optimization

I've found some code here that works pretty well to retrieve the data I need (Python yahoo finance error market_cap=int(data.get_quote_yahoo(str)['marketCap']) TypeError: 'int' object is not callable):
tickers=["AAPL","GOOG","RY","HPQ"]
# Get market cap (not really necessary for you)
market_cap_data = web.get_quote_yahoo(tickers)['marketCap']
# Get the P/E ratio directly
pe_data = web.get_quote_yahoo(tickers)['trailingPE']
# print stock and p/e ratio
for stock, pe in zip(tickers, pe_data):
    print(stock, pe)
# More keys that can be used
['language', 'region', 'quoteType', 'triggerable', 'quoteSourceName',
'currency', 'preMarketChange', 'preMarketChangePercent',
'preMarketTime', 'preMarketPrice', 'regularMarketChange',
'regularMarketChangePercent', 'regularMarketTime', 'regularMarketPrice',
'regularMarketDayHigh', 'regularMarketDayRange', 'regularMarketDayLow',
'regularMarketVolume', 'regularMarketPreviousClose', 'bid', 'ask',
'bidSize', 'askSize', 'fullExchangeName', 'financialCurrency',
'regularMarketOpen', 'averageDailyVolume3Month',
'averageDailyVolume10Day', 'fiftyTwoWeekLowChange',
'fiftyTwoWeekLowChangePercent', 'fiftyTwoWeekRange',
'fiftyTwoWeekHighChange', 'fiftyTwoWeekHighChangePercent',
'fiftyTwoWeekLow', 'fiftyTwoWeekHigh', 'dividendDate',
'earningsTimestamp', 'earningsTimestampStart', 'earningsTimestampEnd',
'trailingAnnualDividendRate', 'trailingPE',
'trailingAnnualDividendYield', 'marketState', 'epsTrailingTwelveMonths',
'epsForward', 'sharesOutstanding', 'bookValue', 'fiftyDayAverage',
'fiftyDayAverageChange', 'fiftyDayAverageChangePercent',
'twoHundredDayAverage', 'twoHundredDayAverageChange',
'twoHundredDayAverageChangePercent', 'marketCap', 'forwardPE',
'priceToBook', 'sourceInterval', 'exchangeDataDelayedBy', 'tradeable',
'firstTradeDateMilliseconds', 'priceHint', 'exchange', 'shortName',
'longName', 'messageBoardId', 'exchangeTimezoneName',
'exchangeTimezoneShortName', 'gmtOffSetMilliseconds', 'market',
'esgPopulated', 'price']
I would like to retrieve most of the fields commented at the end of the previous code, but this is all I've done so far:
import pandas_datareader as web
tickers = ["AAPL", "GOOG", "RY", "SAB.MC"]
market_cap_data = web.get_quote_yahoo(tickers)['marketCap']
pe_data = web.get_quote_yahoo(tickers)['trailingPE']
fiftytwo_low_data = web.get_quote_yahoo(tickers)['fiftyTwoWeekLowChangePercent']
for stock, mcap, pe, fiftytwo_low in zip(tickers, market_cap_data, pe_data, fiftytwo_low_data):
    print(stock, mcap, pe, fiftytwo_low)
Obviously I could continue with brute force, but do you know of a way to make the code more elegant and retrieve the whole set of fields above, with column names?
thanks
Using a set, you can collect all the items that can be retrieved for each ticker; by taking the union across tickers you build the full list of item names that have a value for the issues you want to retrieve.
import pandas_datareader as web
import pandas as pd
tickers = ["AAPL", "GOOG", "RY", "SAB.MC"]
names = set()
for t in tickers:
    market_cap_data = web.get_quote_yahoo(t)
    names |= set(market_cap_data.columns.to_list())
names
{'ask',
'askSize',
'averageAnalystRating',
'averageDailyVolume10Day',
'averageDailyVolume3Month',
'bid',
'bidSize',
'bookValue',
'cryptoTradeable',
'currency',
'customPriceAlertConfidence',
'displayName',
...
'trailingAnnualDividendYield',
'trailingPE',
'triggerable',
'twoHundredDayAverage',
'twoHundredDayAverageChange',
'twoHundredDayAverageChangePercent',
'typeDisp'}
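Building on that, a minimal sketch of pulling several of those fields in one call rather than one request per field (hedged: this assumes get_quote_yahoo still works in your pandas_datareader version, and reuses the tickers list from above).
# get_quote_yahoo already returns one DataFrame per request, with every
# available field as a column, so a single call covers all tickers.
quotes = web.get_quote_yahoo(tickers)
print(quotes[['marketCap', 'trailingPE', 'fiftyTwoWeekLowChangePercent']])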
I know this post is pretty old, but I just came across it now. Check out the 'yfinance' library. There's all kinds of stuff available over there!!
import pandas_datareader as web
import pandas as pd
df = web.DataReader('AAPL', data_source='yahoo', start='2011-01-01', end='2021-01-12')
df.head()
import yfinance as yf
aapl = yf.Ticker("AAPL")
aapl
# get stock info
aapl.info
# get historical market data
hist = aapl.history(period="max")
# show actions (dividends, splits)
aapl.actions
# show dividends
aapl.dividends
# show splits
aapl.splits
# show financials
aapl.financials
aapl.quarterly_financials
# show major holders
aapl.major_holders
# show institutional holders
aapl.institutional_holders
# show balance sheet
aapl.balance_sheet
aapl.quarterly_balance_sheet
# show cashflow
aapl.cashflow
aapl.quarterly_cashflow
# show earnings
aapl.earnings
aapl.quarterly_earnings
# show sustainability
aapl.sustainability
# show analysts recommendations
aapl.recommendations
# show next event (earnings, etc)
aapl.calendar
# show ISIN code - *experimental*
# ISIN = International Securities Identification Number
aapl.isin
# show options expirations
aapl.options
# get option chain for specific expiration
opt = aapl.option_chain('YYYY-MM-DD')
Result:
{'zip': '95014',
'sector': 'Technology',
'fullTimeEmployees': 164000,
'longBusinessSummary': 'Apple Inc. designs, manufactures, and markets smartphones, personal computers, tablets, wearables, and accessories worldwide. It also sells various related services. In addition, the company offers iPhone, a line of smartphones; Mac, a line of personal computers; iPad, a line of multi-purpose tablets; and wearables, home, and accessories comprising AirPods, Apple TV, Apple Watch, Beats products, and HomePod. Further, it provides AppleCare support and cloud services store services; and operates various platforms, including the App Store that allow customers to discover and download applications and digital content, such as books, music, video, games, and podcasts. Additionally, the company offers various services, such as Apple Arcade, a game subscription service; Apple Fitness+, a personalized fitness service; Apple Music, which offers users a curated listening experience with on-demand radio stations; Apple News+, a subscription news and magazine service; Apple TV+, which offers exclusive original content; Apple Card, a co-branded credit card; and Apple Pay, a cashless payment service, as well as licenses its intellectual property. The company serves consumers, and small and mid-sized businesses; and the education, enterprise, and government markets. It distributes third-party applications for its products through the App Store. The company also sells its products through its retail and online stores, and direct sales force; and third-party cellular network carriers, wholesalers, retailers, and resellers. Apple Inc. was incorporated in 1977 and is headquartered in Cupertino, California.',
'city': 'Cupertino',
'phone': '408 996 1010',
'state': 'CA',
'country': 'United States',
'companyOfficers': [],
'website': 'https://www.apple.com',
'maxAge': 1,
'address1': 'One Apple Park Way',
'industry': 'Consumer Electronics',
'ebitdaMargins': 0.33105,
'profitMargins': 0.2531,
'grossMargins': 0.43310001,
'operatingCashflow': 122151002112,
'revenueGrowth': 0.081,
'operatingMargins': 0.30289,
'ebitda': 130541002752,
'targetLowPrice': 122,
'recommendationKey': 'buy',
'grossProfits': 170782000000,
'freeCashflow': 90215251968,
'targetMedianPrice': 180,
'currentPrice': 151.29,
'earningsGrowth': 0.048,
'currentRatio': 0.879,
'returnOnAssets': 0.21214001,
'numberOfAnalystOpinions': 41,
'targetMeanPrice': 178.15,
'debtToEquity': 261.446,
'returnOnEquity': 1.75459,
'targetHighPrice': 214,
'totalCash': 48304001024,
'totalDebt': 132480000000,
'totalRevenue': 394328014848,
'totalCashPerShare': 3.036,
'financialCurrency': 'USD',
'revenuePerShare': 24.317,
'quickRatio': 0.709,
'recommendationMean': 1.9,
'exchange': 'NMS',
'shortName': 'Apple Inc.',
'longName': 'Apple Inc.',
'exchangeTimezoneName': 'America/New_York',
'exchangeTimezoneShortName': 'EST',
'isEsgPopulated': False,
'gmtOffSetMilliseconds': '-18000000',
'quoteType': 'EQUITY',
'symbol': 'AAPL',
'messageBoardId': 'finmb_24937',
'market': 'us_market',
'annualHoldingsTurnover': None,
'enterpriseToRevenue': 6.317,
'beta3Year': None,
'enterpriseToEbitda': 19.081,
'52WeekChange': -0.06042725,
'morningStarRiskRating': None,
'forwardEps': 6.82,
'revenueQuarterlyGrowth': None,
'sharesOutstanding': 15908100096,
'fundInceptionDate': None,
'annualReportExpenseRatio': None,
'totalAssets': None,
'bookValue': 3.178,
'sharesShort': 103178670,
'sharesPercentSharesOut': 0.0064999997,
'fundFamily': None,
'lastFiscalYearEnd': 1663977600,
'heldPercentInstitutions': 0.60030997,
'netIncomeToCommon': 99802996736,
'trailingEps': 6.11,
'lastDividendValue': 0.23,
'SandP52WeekChange': -0.15323704,
'priceToBook': 47.60541,
'heldPercentInsiders': 0.00071999995,
'nextFiscalYearEnd': 1727136000,
'yield': None,
'mostRecentQuarter': 1663977600,
'shortRatio': 1.14,
'sharesShortPreviousMonthDate': 1664496000,
'floatShares': 15891414476,
'beta': 1.246644,
'enterpriseValue': 2490915094528,
'priceHint': 2,
'threeYearAverageReturn': None,
'lastSplitDate': 1598832000,
'lastSplitFactor': '4:1',
'legalType': None,
'lastDividendDate': 1667520000,
'morningStarOverallRating': None,
'earningsQuarterlyGrowth': 0.008,
'priceToSalesTrailing12Months': 6.103387,
'dateShortInterest': 1667174400,
'pegRatio': 2.71,
'ytdReturn': None,
'forwardPE': 22.183283,
'lastCapGain': None,
'shortPercentOfFloat': 0.0064999997,
'sharesShortPriorMonth': 103251184,
'impliedSharesOutstanding': 0,
'category': None,
'fiveYearAverageReturn': None,
'previousClose': 150.72,
'regularMarketOpen': 152.305,
'twoHundredDayAverage': 155.0841,
'trailingAnnualDividendYield': 0.005971337,
'payoutRatio': 0.14729999,
'volume24Hr': None,
'regularMarketDayHigh': 152.57,
'navPrice': None,
'averageDailyVolume10Day': 84360340,
'regularMarketPreviousClose': 150.72,
'fiftyDayAverage': 147.0834,
'trailingAnnualDividendRate': 0.9,
'open': 152.305,
'toCurrency': None,
'averageVolume10days': 84360340,
'expireDate': None,
'algorithm': None,
'dividendRate': 0.92,
'exDividendDate': 1667520000,
'circulatingSupply': None,
'startDate': None,
'regularMarketDayLow': 149.97,
'currency': 'USD',
'trailingPE': 24.761045,
'regularMarketVolume': 74496725,
'lastMarket': None,
'maxSupply': None,
'openInterest': None,
'marketCap': 2406736461824,
'volumeAllCurrencies': None,
'strikePrice': None,
'averageVolume': 89929545,
'dayLow': 149.97,
'ask': 150.95,
'askSize': 1000,
'volume': 74496725,
'fiftyTwoWeekHigh': 182.94,
'fromCurrency': None,
'fiveYearAvgDividendYield': 1,
'fiftyTwoWeekLow': 129.04,
'bid': 150.82,
'tradeable': False,
'dividendYield': 0.0061000003,
'bidSize': 1100,
'dayHigh': 152.57,
'coinMarketCapLink': None,
'regularMarketPrice': 151.29,
'preMarketPrice': None,
'logo_url': 'https://logo.clearb
Just pick/choose what you want.
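If you want those fields back in tabular form with column names, like get_quote_yahoo gave, a small sketch (assuming the aapl Ticker object from above): wrap the info dict in a one-row DataFrame and slice the columns you care about.
# Sketch: turn the info dict into a one-row table with column names.
import pandas as pd
info_df = pd.DataFrame([aapl.info])
print(info_df[['shortName', 'marketCap', 'trailingPE']])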

Python generators and how to iterate over them correctly to drop records based on a key within the dictionary being present in a separate list

I'm new to the concept of generators and I'm struggling with how to apply my changes to the records within the generator object returned from the RISparser module.
I understand that a generator only reads a record at a time and doesn't actually store the data in memory but I'm having a tough time iterating over it effectively and applying my changes.
My changes involve dropping records whose 'doi' values are not contained within a list of DOIs (doi_match).
doi_match = ['10.1002/14651858.CD008259.pub2','10.1002/14651858.CD011552','10.1002/14651858.CD011990']
The generator object returned from RISparser contains the following information; these are just the first 2 of a few hundred records. I want to iterate over it and compare each record's 'doi' key with the list of DOIs.
{'type_of_reference': 'JOUR', 'title': "The CoRe Outcomes in WomeN's health (CROWN) initiative: Journal editors invite researchers to develop core outcomes in women's health", 'secondary_title': 'Neurourology and Urodynamics', 'alternate_title1': 'Neurourol. Urodyn.', 'volume': '33', 'number': '8', 'start_page': '1176', 'end_page': '1177', 'year': '2014', 'doi': '10.1002/nau.22674', 'issn': '07332467 (ISSN)', 'authors': ['Khan, K.'], 'keywords': ['Bias (epidemiology)', 'Clinical trials', 'Consensus', 'Endpoint determination/standards', 'Evidence-based medicine', 'Guidelines', 'Research design/standards', 'Systematic reviews', 'Treatment outcome', 'consensus', 'editor', 'female', 'human', 'medical literature', 'Note', 'outcomes research', 'peer review', 'randomized controlled trial (topic)', 'systematic review (topic)', "women's health", 'outcome assessment', 'personnel', 'publication', 'Female', 'Humans', 'Outcome Assessment (Health Care)', 'Periodicals as Topic', 'Research Personnel', "Women's Health"], 'publisher': 'John Wiley and Sons Inc.', 'notes': ['Export Date: 14 July 2020', 'CODEN: NEURE'], 'type_of_work': 'Note', 'name_of_database': 'Scopus', 'custom2': '25270392', 'language': 'English', 'url': 'https://www.scopus.com/inward/record.uri?eid=2-s2.0-84908368202&doi=10.1002%2fnau.22674&partnerID=40&md5=b220702e005430b637ef9d80a94dadc4'}
{'type_of_reference': 'JOUR', 'title': "The CROWN initiative: Journal editors invite researchers to develop core outcomes in women's health", 'secondary_title': 'Gynecologic Oncology', 'alternate_title1': 'Gynecol. Oncol.', 'volume': '134', 'number': '3', 'start_page': '443', 'end_page': '444', 'year': '2014', 'doi': '10.1016/j.ygyno.2014.05.005', 'issn': '00908258 (ISSN)', 'authors': ['Karlan, B.Y.'], 'author_address': 'Gynecologic Oncology and Gynecologic Oncology Reports, India', 'keywords': ['clinical trial (topic)', 'decision making', 'Editorial', 'evidence based practice', 'female infertility', 'health care personnel', 'human', 'outcome assessment', 'outcomes research', 'peer review', 'practice guideline', 'premature labor', 'priority journal', 'publication', 'systematic review (topic)', "women's health", 'editorial', 'female', 'outcome assessment', 'personnel', 'publication', 'Female', 'Humans', 'Outcome Assessment (Health Care)', 'Periodicals as Topic', 'Research Personnel', "Women's Health"], 'publisher': 'Academic Press Inc.', 'notes': ['Export Date: 14 July 2020', 'CODEN: GYNOA', 'Correspondence Address: Karlan, B.Y.; Gynecologic Oncology and Gynecologic Oncology ReportsIndia'], 'type_of_work': 'Editorial', 'name_of_database': 'Scopus', 'custom2': '25199578', 'language': 'English', 'url': 'https://www.scopus.com/inward/record.uri?eid=2-s2.0-84908351159&doi=10.1016%2fj.ygyno.2014.05.005&partnerID=40&md5=ab5a4d26d52c12d081e38364b0c79678'}
I tried iterating over the generator and applying the changes. But the records that have matches are not being placed in the match list.
match = []
for entry in ris_records:
    if entry['doi'] in doi_match:
        match.append(entry)
    else:
        del entry
Any advice on how to iterate over a generator correctly? Thanks.
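For what it's worth, a minimal sketch of one way to do this: there is nothing to delete from a generator, you just keep the entries that match and let the rest go (records you skip are never stored). This assumes ris_records is the generator from RISparser and that some records may lack a 'doi' key.
# Filter lazily, then materialize only the matches.
matches = [entry for entry in ris_records if entry.get('doi') in doi_match]
print(len(matches))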

Can't get data in table form using Selenium Python

I'm new to scraping using Selenium with Python. I could retrieve some of the data, but I want it in table form, as it is displayed on the web page:
Here is what i have so far:
from selenium import webdriver
import time

url = 'https://definitivehc.maps.arcgis.com/home/item.html?id=1044bb19da8d4dbfb6a96eb1b4ebf629&view=list&showFilters=false#data'
browser = webdriver.Chrome(r"C:\task\chromedriver")
browser.get(url)
time.sleep(25)
rows_in_table = browser.find_elements_by_xpath('//table[@class="dgrid-row-table"]//tr[th or td]')
for element in rows_in_table:
    print(element.text.replace('\n', ''))
result snippet:
Hospital NameHospital TypeCityState AbrvZip CodeCounty NameState Name
Phoenix VA Health Care System (AKA Carl T Hayden VA Medical Center)VA HospitalPhoenixAZ85012MaricopaArizona040130401362620000.001
Southern Arizona VA Health Care SystemVA HospitalTucsonAZ85723PimaArizona04019040192952952202.002
VA Central California Health Care SystemVA HospitalFresnoCA93703FresnoCalifornia060190601954542202.003
VA Connecticut Healthcare System - West Haven Campus (AKA West Haven VA Medical Center)VA HospitalWest HavenCT6516New HavenConnecticut09009090092162161102.004
I would really appreciate help from an expert on this. Thanks.
This is an updated version of what @Andrej answered; this code downloads the table and, instead of printing it, saves it as an Excel document.
import json
import requests
import pandas as pd

config_url = 'https://definitivehc.maps.arcgis.com/sharing/rest/portals/self?culture=en-us&f=json'
page_url = 'https://services7.arcgis.com/{_id}/arcgis/rest/services/Definitive_Healthcare_USA_Hospital_Beds/FeatureServer/0/query?f=json&where=1%3D1&returnGeometry=false&spatialRel=esriSpatialRelIntersects&outFields=*&orderByFields=OBJECTID%20ASC&resultOffset={offset}&resultRecordCount=50&cacheHint=true&quantizationParameters=%7B%22mode%22%3A%22edit%22%7D'

_id = requests.get(config_url).json()['id']
required = []
offset = 0
while True:
    data = requests.get(page_url.format(_id=_id, offset=offset)).json()
    # uncomment this to print all data:
    # print(json.dumps(data, indent=4))
    for i, f in enumerate(data['features'], offset + 1):
        required.append(f['attributes'])
    if i % 50:
        break
    offset += 50

df = pd.json_normalize(required)
with pd.ExcelWriter('dataFunction.xlsx') as writer:
    df.to_excel(writer)
I tried this and uploaded the excel sheet HERE(LINK TO EXCEL SHEET)!
The data is loaded dynamically using JavaScript. You can use the requests module to simulate those requests:
import json
import requests

config_url = 'https://definitivehc.maps.arcgis.com/sharing/rest/portals/self?culture=en-us&f=json'
page_url = 'https://services7.arcgis.com/{_id}/arcgis/rest/services/Definitive_Healthcare_USA_Hospital_Beds/FeatureServer/0/query?f=json&where=1%3D1&returnGeometry=false&spatialRel=esriSpatialRelIntersects&outFields=*&orderByFields=OBJECTID%20ASC&resultOffset={offset}&resultRecordCount=50&cacheHint=true&quantizationParameters=%7B%22mode%22%3A%22edit%22%7D'

_id = requests.get(config_url).json()['id']
offset = 0
while True:
    data = requests.get(page_url.format(_id=_id, offset=offset)).json()
    # uncomment this to print all data:
    # print(json.dumps(data, indent=4))
    for i, f in enumerate(data['features'], offset + 1):
        print(i, f['attributes'])
        print('-' * 160)
    if i % 50:
        break
    offset += 50
Prints all 6624 records:
...
6614 {'OBJECTID': 6614, 'HOSPITAL_NAME': 'Walter E Washington Convention Center Field Hospital (Temporarily Open due to COVID-19)', 'HOSPITAL_TYPE': 'Short Term Acute Care Hospital', 'HQ_ADDRESS': '801 Mount Vernon Pl Nw', 'HQ_ADDRESS1': None, 'HQ_CITY': 'Washington', 'HQ_STATE': 'DC', 'HQ_ZIP_CODE': '20001', 'COUNTY_NAME': 'District of Columbia', 'STATE_NAME': 'District of Columbia', 'STATE_FIPS': '11', 'CNTY_FIPS': '001', 'FIPS': '11001', 'NUM_LICENSED_BEDS': None, 'NUM_STAFFED_BEDS': None, 'NUM_ICU_BEDS': 0, 'ADULT_ICU_BEDS': 0, 'PEDI_ICU_BEDS': None, 'BED_UTILIZATION': None, 'Potential_Increase_In_Bed_Capac': 0, 'AVG_VENTILATOR_USAGE': None}
----------------------------------------------------------------------------------------------------------------------------------------------------------------
6615 {'OBJECTID': 6615, 'HOSPITAL_NAME': 'Joint Base Cape Cod Field Hospital (Temporarily Open due to COVID-19)', 'HOSPITAL_TYPE': 'Short Term Acute Care Hospital', 'HQ_ADDRESS': 'Connery Ave', 'HQ_ADDRESS1': None, 'HQ_CITY': 'Buzzards Bay', 'HQ_STATE': 'MA', 'HQ_ZIP_CODE': '2542', 'COUNTY_NAME': 'Barnstable', 'STATE_NAME': 'Massachusetts', 'STATE_FIPS': '25', 'CNTY_FIPS': '001', 'FIPS': '25001', 'NUM_LICENSED_BEDS': None, 'NUM_STAFFED_BEDS': None, 'NUM_ICU_BEDS': 0, 'ADULT_ICU_BEDS': 0, 'PEDI_ICU_BEDS': None, 'BED_UTILIZATION': None, 'Potential_Increase_In_Bed_Capac': 0, 'AVG_VENTILATOR_USAGE': None}
----------------------------------------------------------------------------------------------------------------------------------------------------------------
6616 {'OBJECTID': 6616, 'HOSPITAL_NAME': 'UMass Lowell Recreation Center Field Hospital (Temporarily Open due to COVID-19)', 'HOSPITAL_TYPE': 'Short Term Acute Care Hospital', 'HQ_ADDRESS': '322 Aiken St', 'HQ_ADDRESS1': None, 'HQ_CITY': 'Lowell', 'HQ_STATE': 'MA', 'HQ_ZIP_CODE': '1854', 'COUNTY_NAME': 'Middlesex', 'STATE_NAME': 'Massachusetts', 'STATE_FIPS': '25', 'CNTY_FIPS': '017', 'FIPS': '25017', 'NUM_LICENSED_BEDS': None, 'NUM_STAFFED_BEDS': None, 'NUM_ICU_BEDS': 0, 'ADULT_ICU_BEDS': 0, 'PEDI_ICU_BEDS': None, 'BED_UTILIZATION': None, 'Potential_Increase_In_Bed_Capac': 0, 'AVG_VENTILATOR_USAGE': None}
----------------------------------------------------------------------------------------------------------------------------------------------------------------
6617 {'OBJECTID': 6617, 'HOSPITAL_NAME': 'Miami Beach Convention Center Field Hospital (Temporarily Open due to COVID-19)', 'HOSPITAL_TYPE': 'Short Term Acute Care Hospital', 'HQ_ADDRESS': '1901 Convention Center Dr', 'HQ_ADDRESS1': None, 'HQ_CITY': 'Miami Beach', 'HQ_STATE': 'FL', 'HQ_ZIP_CODE': '33139', 'COUNTY_NAME': 'Miami-Dade', 'STATE_NAME': 'Florida', 'STATE_FIPS': '12', 'CNTY_FIPS': '086', 'FIPS': '12086', 'NUM_LICENSED_BEDS': None, 'NUM_STAFFED_BEDS': None, 'NUM_ICU_BEDS': 0, 'ADULT_ICU_BEDS': 0, 'PEDI_ICU_BEDS': None, 'BED_UTILIZATION': None, 'Potential_Increase_In_Bed_Capac': 0, 'AVG_VENTILATOR_USAGE': None}
...

Sort, add and delete similar indexes of a list of dictionaries, using FIFO

I am trying to solve this, and I have a list of dictionaries that looks something like this (this is the data from the 'purchase' method):
[{'qty': 20, 'price': 2000.0, 'product': 'Computer', 'date': '2017-03-05'},
{'qty': 22, 'price': 5000.0, 'product': 'Computer', 'date': '2017-11-11'},
{'qty': 6, 'price': 1523.0, 'product': 'Computer', 'date': '2018-02-03'},
{'qty': 10, 'price': 1000.0, 'product': 'Computer', 'date': '2018-12-05'},
{'qty': 20, 'price': 2000.0, 'product': 'Computer', 'date': '2019-11-06'},
{'qty': 10, 'price': 1000.0, 'product': 'Computer', 'date': '2019-08-02'}
]
I am trying to create a method def sale(quantity, date). In this function I want to pass in a quantity and a date; if enough stock is available before the sale date, it should allow me to sell that quantity.
e.g. if I pass quantity = 30 and date = 2018-01-01, it should allow me to sell because there is enough quantity, and after this the remaining quantity and price should be calculated accordingly and added to the above list of dictionaries.
e.g. in our case:
{'qty': 12, 'price': 2000.0, 'product': 'Computer', 'date': '2018-01-01'}
and the first 2 dictionaries should be deleted, because we have already sold them!
(this is like inventory FIFO thing)
Here's the code I'm using to try to do this. However, I'm getting errors and not getting my desired output. Any other possibilities? How do I make it work?
import datetime
from collections import Counter

class Supplier:
    def add_supplier(self, name, address, email, contact_no):
        self.name = name
        self.address = address
        self.email = email
        self.contact_no = contact_no

class Product:
    def add_product(self, name):
        self.name = name

class Company(Supplier, Product):
    data_dict = []

    def purchase(self, product_obj, qty, price, date=datetime.date.today()):
        self.data_dict.append({'product': product_obj.name, 'qty': qty, 'price': price, 'date': str(date)})

    def sale(self, sell_qty, sell_date=datetime.date.today()):
        a = 0
        p = 0
        unit_val = 0
        new_price = 0
        newdict = sorted(self.data_dict, key=lambda x: x['date'])
        for dt in newdict:
            a += dt['qty']
            p += dt['price']
            if sell_date > dt['date']:
                if sell_qty <= a:
                    unit_val = float(p / a)
                    new_price = unit_val * a
                    a -= sell_qty
                    self.data_dict.append({'product': product_obj.name, 'qty': a, 'price': new_price, 'date': str(sell_date)})
                    print("sold!")
                else:
                    print("Sorry, not enough qty.\n")

C = Company()
PRODUCT_OBJ = Product()
PRODUCT_OBJ.add_product('Computer')
while True:
    option = int(input(" 1. You want to add stock of the product!\n2. Want to sell product?\n"))
    if option == 1:
        qty = int(input("Enter the qty of the product.\n"))
        price = float(input("Enter the price of the product.\n"))
        purchase_date = input("Enter purchase date.\n")
        C.purchase(PRODUCT_OBJ, qty, price, purchase_date)
    elif option == 2:
        qty = int(input("Enter the qty you wanna sell, pal!"))
        sale_date = input("Enter sell date.\n")
        C.sale(qty)
getting errors :
Traceback (most recent call last):
File "G:/python/test.py", line 63, in <module> C.sale(qty)
File "G:/python/test.py", line 33, in sale if sell_date > dt['date']:
TypeError: '>' not supported between instances of 'datetime.date' and 'str'
desired output :
[{'qty': 12, 'price': 2000.0, 'product': 'Computer', 'date': '2018-01-01'},
{'qty': 6, 'price': 1523.0, 'product': 'Computer', 'date': '2018-02-03'},
{'qty': 10, 'price': 1000.0, 'product': 'Computer', 'date': '2018-12-05'},
{'qty': 20, 'price': 2000.0, 'product': 'Computer', 'date': '2019-11-06'},
{'qty': 10, 'price': 1000.0, 'product': 'Computer', 'date': '2019-08-02'}
]
@adrtam showed you how to fix the error you posted. But your code has serious issues. I won't fix everything, but here are a few hints:
class Supplier: add_supplier should be __init__. You create a new supplier with: s = Supplier("foo", "bar", "baz@baz", "no")
class Product: idem
class Company:
should not inherit from Product and Supplier (a company IS A product and a supplier ? No)
misses an __init__ method: data_dict is currently a class field, should be an instance field
avoid names like a, p: always use significant names.
in the desired output, there's no reason to set, after the sale, the date of the first row to '2018-01-01'
prefer verbs to name your method (depends on the context): sell instead of sale.
Now, let's look at the sell method. I assume that this is a FIFO stock: you sell first the products that were purchased first.
Here's a simple idea:
Iterate over the rows and sum the qty available.
As soon as the qty is sufficient, exit the loop.
Then remove the rows browsed and fix the qty of the last browsed row.
Example code (I assume that the data is always ordered by date: to enforce this, you may use a priority queue):
data = [{'qty': 20, 'price': 2000.0, 'product': 'Computer', 'date': '2017-03-05'},
{'qty': 22, 'price': 5000.0, 'product': 'Computer', 'date': '2017-11-11'},
{'qty': 6, 'price': 1523.0, 'product': 'Computer', 'date': '2018-02-03'},
{'qty': 10, 'price': 1000.0, 'product': 'Computer', 'date': '2018-12-05'},
{'qty': 20, 'price': 2000.0, 'product': 'Computer', 'date': '2019-11-06'},
{'qty': 10, 'price': 1000.0, 'product': 'Computer', 'date': '2019-08-02'}]
wanted = {'qty': 30, 'date': '2018-01-01'}
def sell(wanted):
    global data  # NEVER do this: just for the example
    assert wanted['qty'] > 0
    qty = 0
    for i, row in enumerate(data):
        # too late!
        if row['date'] > wanted['date']:
            raise Exception("Sorry, not enough qty. Operation cancelled")
        qty += row['qty']
        # we have enough Computers!
        if qty >= wanted['qty']:
            break
    else:  # loop completes normally
        raise Exception("Sorry, not enough qty. Operation cancelled")
    remaining_qty_in_last_row = qty - wanted['qty']
    # copy of the last row with a new quantity + the remaining rows
    data = [{**row, 'qty': remaining_qty_in_last_row}] + data[i+1:]
    print("Sold!")

for wanted in [{'qty': 30, 'date': '2018-01-01'}, {'qty': 30, 'date': '2018-01-01'}]:
    try:
        sell(wanted)
    except Exception as e:
        print(e)
    print("data", data)
That's just a sketch of the algorithm: you have to practice to extract the right design of classes and methods.
Let's just fix the error you posted:
Traceback (most recent call last):
File "G:/python/test.py", line 63, in <module> C.sale(qty)
File "G:/python/test.py", line 33, in sale if sell_date > dt['date']:
TypeError: '>' not supported between instances of 'datetime.date' and 'str'
The issue is that sell_date is a datetime.date object, but the date you have in the dictionary is a string. You need to convert them to the same type in order to make the comparison work. Two ways to do it:
Make your datetime object a fixed-format string that can be compared:
if sell_date.strftime("%Y-%m-%d") > dt['date']:
Parse the date string into a date object (note the trailing .date(): a datetime.date can't be compared directly with a datetime.datetime):
import datetime
if sell_date > datetime.datetime.strptime(dt['date'], '%Y-%m-%d').date():
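For example, a quick standalone check of the second approach (the sample date is taken from the purchase data above, and the format is assumed to always be 'YYYY-MM-DD'):
import datetime

sell_date = datetime.date(2018, 1, 1)
row_date = datetime.datetime.strptime('2017-03-05', '%Y-%m-%d').date()
print(sell_date > row_date)  # True: the purchase happened before the sale date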

Dictionary data is not properly appended to another dictionary

Dictionary data is not properly appended to another dictionary. Here acc_grp is grouped pandas data.
acc_grp
amount_currency balance credit debit lid
ldate
2018-04-01 0.0 -27359.250 30219.25 1115.0000 643259
2018-04-02 0.0 -208574.742 5000.00 1194.0005 872275
Here template_dict is my dictionary. When I print result, both lines of my acc_grp are correctly available.
Flow (From Terminal)
result (1st iteration)
{'date': '2018-04-01', 'credit': 30219.25, 'balance': -29104.25, 'debit': 1115.0}
template_dict
{'code': u'300103', 'lines': [{'date': '2018-04-01', 'credit': 30219.25, 'balance': -29104.25, 'debit': 1115.0}], 'name': u'CASH COLLECTION'}
In the first case, result is correctly appended to template_dict.
result (2nd iteration)
{'date': '2018-04-02', 'credit': 5000.0, 'balance': -3805.9994999999999, 'debit': 1194.0005}
template_dict
{'code': u'300103', 'lines': [{'date': '2018-04-02', 'credit': 5000.0, 'balance': -3805.9994999999999, 'debit': 1194.0005}, {'date': '2018-04-02', 'credit': 5000.0, 'balance': -3805.9994999999999, 'debit': 1194.0005}], 'name': u'CASH COLLECTION'}
When we look here, the value of template_dict['lines'] is supposed to be result1, result2, but the data comes out as result2, result2.
code
result = {}
template_dict = dict()
template_dict['lines'] = []
template_dict['code'] = line['code']
template_dict['name'] = line['name']
for index, row in acc_grp.iterrows():
    balance = 0
    row.balance = row.debit.item() - row.credit.item()
    result['date'] = row.name
    result['debit'] = row.debit.item()
    result['credit'] = row.credit.item()
    result['balance'] = row.balance
    print result
    template_dict['lines'].append(result)
    print template_dict
You need to create a new dictionary for each line. Otherwise, you're always changing the very same dictionary:
...
for index, row in acc_grp.iterrows():
    result = {}  # Create a brand new dictionary for each row
    balance = 0
    row.balance = row.debit.item() - row.credit.item()
    result['date'] = row.name
    ...
    template_dict['lines'].append(result)
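If it helps to see why, here is a minimal, self-contained demonstration of the aliasing behaviour described above (the names are illustrative only, not from the original code):
result = {}
lines = []
for value in (1, 2):
    result['v'] = value   # mutates the same dict object each time
    lines.append(result)  # appends a reference to it, not a copy
print(lines)              # [{'v': 2}, {'v': 2}] -- both entries point at one dict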
