I'm quite new to Python; I've been learning the language mainly to automate some processes and update/populate Excel spreadsheets with real-time data. Is there a way (e.g. through openpyxl) to update specific cells with data that's extracted through Python packages such as pandas, or via web scraping with BeautifulSoup?
I already have the necessary code to extract the data series I need in Python, but I'm stuck entirely on how to link this data to Excel.
import pandas as pd
import pandas_datareader.data as web
import datetime as dt

start = dt.datetime(2000, 1, 1)
end = dt.datetime.today()

tickers = ["PYPL", "BABA", "SAP"]

df = web.DataReader(tickers, 'yahoo', start, end)
print(df.tail(365)['Adj Close'])
Pandas has a method to export a DataFrame to Excel: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_excel.html
filename = "output.xlsx"
df.to_excel(filename)
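If you need to update specific cells in an existing workbook rather than export a whole DataFrame, openpyxl (which you mention) can load a workbook and write individual cells. A minimal sketch, assuming the file and sheet names below already exist:

import openpyxl

wb = openpyxl.load_workbook("output.xlsx")  # hypothetical existing file
ws = wb["Sheet1"]                           # hypothetical sheet name
ws["B2"] = 42.0                             # write a single cell by address
ws.cell(row=3, column=2, value="PYPL")      # or address cells by row/column
wb.save("output.xlsx")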
One option is to run your Python script on a schedule and output to .csv or another format that Excel can link to. This way the data is updated whenever the Python script is executed.
Setup:
Output your DataFrame to CSV, a database, or another Excel-readable format
Set up your Python file to run on a schedule (either via a scheduler, or a loop with a delay; a minimal sketch follows this list)
Create a data connection from Excel to your Python-outputted file/database
Build pivot tables based on the table in Excel
Refresh the data connection/pivot tables in Excel to get the new data
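A minimal sketch of the loop-with-a-delay variant, reusing the data-fetching code from the question (the output file name and refresh interval are arbitrary choices):

import time
import datetime as dt
import pandas_datareader.data as web

tickers = ["PYPL", "BABA", "SAP"]

while True:
    df = web.DataReader(tickers, 'yahoo', dt.datetime(2000, 1, 1), dt.datetime.today())
    df.to_csv("prices.csv")  # the Excel data connection points at this file
    time.sleep(300)          # wait five minutes before fetching again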
(Appreciate that this is an old question.) Real-time data in Excel is possible with xlOil. xlOil lets you define an Excel RTD (real-time data) function in Python very easily. Excel's RTD functions operate outside the normal calc cycle and can push data onto a sheet.
Your example could be written as:
import xloil, datetime as dt, asyncio
import pandas_datareader.data as web

start = dt.datetime(2000, 1, 1)

@xloil.func
async def pyGetTickers(names: list, fetch_every_secs=10):
    while True:
        yield web.DataReader(
            names, 'yahoo', start, dt.datetime.now())
        await asyncio.sleep(fetch_every_secs)
This would appear as a worksheet function pyGetTickers.
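In a sheet you could then try something like =pyGetTickers({"PYPL","BABA","SAP"}, 10), using Excel's array-constant syntax for the list argument; exactly how arguments are converted depends on xlOil's registered converters, so treat this as a sketch rather than confirmed syntax.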
One easy solution is the xlwings library:
import xlwings as xw

# ... build your DataFrame df first ...
xw.Book(file_path).sheets['Sheet_name'].range('A1').value = df
This writes your df to cell A1 of an Excel file via COM, which means it actually writes the values while the file is open.
Hope this is helpful
I have code that downloads data from Yahoo Finance into a df for a list of stocks. Then I create a new worksheet for each stock, but I cannot manage to copy the data from the df into that worksheet.
n = number_of_stocks
m = 0
while n > 0:
    x = Input_Stock_Names[m]
    m += 1
    n -= 1
    df = pdr.get_data_yahoo(x, starting_date, ending_date)
    df = df.reset_index()
    ExcelWrksht = ExcelWrkbook.Worksheets.Add()
    ExcelWrksht.Name = x
    ExcelWrksht = ExcelWrkbook.Worksheets(x)
Also, the Excel file is open while the code is running.
If the task is simply to store downloaded data, then using win32com is over-engineering. Simply use the facilities within pandas to write to an Excel file in the .xlsx format:
import pandas as pd
import yfinance as yf
from datetime import date, timedelta

# 90 days of history from today
end_date = date.today()
start_date = end_date - timedelta(days=90)

Input_Stock_Names = ['AAPL', 'TSLA', 'MSFT']

with pd.ExcelWriter('c:\\somepath\\stocks.xlsx', mode='w') as ew:
    for stock in Input_Stock_Names:
        df = yf.download(stock, start=start_date, end=end_date)
        df.to_excel(ew, sheet_name=stock)
This will create a new Excel file, with one sheet for each stock.
win32com allows you to 'drive' Excel and do pretty much everything you would do with the Excel application open. However, it is relatively slow: the Excel application has to be started (and closed), and all the data and commands have to cross the 'process boundary' from one process (Python) to the other (Excel).
With ExcelWriter you simply write data to a file in the .xlsx format so that Excel can read it later. If all you want to do is store data, this is much more efficient than using win32com.
I have an Excel sheet in the .xls format containing live streaming stock data from a piece of software.
I want to read and process the data from the sheet in Python every 5 seconds.
Python only picks up refreshed data when I manually save the .xls file; on subsequent runs the script does not automatically get the new data points.
Any help?
This should help you:
import threading
import pandas as pd

def main_task():
    # Schedule main_task to run again in 5 seconds
    threading.Timer(5.0, main_task).start()
    df = pd.read_excel("filename.xls")  # Re-read the Excel file
    # ... process df here ...

main_task()  # Start the first run
This re-reads the file into a pandas DataFrame every 5 seconds, so df will hold whatever values are in the file at each pass.
I'm new to programming (that's my main challenge), and so far Python is my favorite programming language. Anyway, I would like to finish my first project using Quandl, but I'm stuck and need support.
I have this code to pull Quandl stocks; however, I don't want to type or paste my tickers into my text editor (Sublime) by hand.
THE GOAL: I want to pull my ticker data from a spreadsheet, then run the .py file.
Is there a way to read the cell data and feed it into this line of code as a list of strings, so the cell contents show up as ['AAPL', 'MSFT', ..., 'FB'] (i.e. whatever is in cells A1 through A10)?
data = quandl.get_table('WIKI/PRICES', ticker=['AAPL', 'MSFT', 'WMT'])
I'd like the ticker argument to come from the tickers read out of the spreadsheet (cells A1 through A10) instead of a hand-typed list.
I've read some of the package docs and watched YouTube examples but can't seem to solve it, so I'm reaching out for your wisdom!
Here's an example of what it looks like in Python when I want to analyze 10+ stocks (maybe more below, I'm not sure, as I copied and pasted here). It is absolutely crazy to type all the info into Python by hand or to copy and paste it. There has to be a faster method using one line of code?
GetQuandlData.py
data = quandl.get_table('WIKI/PRICES', ticker=['AAPL', 'MSFT', 'NFLX', 'FB', 'GS', 'TSLA', 'BAC', 'TWTR', 'COF', 'TOL', 'EA', 'PFE', 'MS', 'C', 'SKX', 'GLD', 'SPY', 'EEM'])
Further background: my secondary file for reading the Excel data:
readstockdata.py
df = pd.read_excel("StockTickerData.xlsx", sheet_name = 'Control')
df = tickers
Essentially, I'd like the ticker list in that get_table call to come straight from the spreadsheet, as described above.
You have the right idea with pulling the data in from Excel as a DataFrame.
Store the tickers in one column with the first row as a header (e.g. Tickers), then read it as a DataFrame with pandas like you have. After that, to get your list, you can run:
tickerList = list(df['Tickers'])
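Putting it together with the file from your question, a minimal sketch (it assumes the tickers sit in a column headed Tickers on the Control sheet):

import pandas as pd
import quandl

# Read the ticker column from the spreadsheet
df = pd.read_excel("StockTickerData.xlsx", sheet_name="Control")
tickerList = list(df['Tickers'])  # assumes the column header is 'Tickers'

# Pass the list straight to Quandl
data = quandl.get_table('WIKI/PRICES', ticker=tickerList)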
I need to extract the domain from a list of websites in an Excel sheet (e.g. http://www.example.com/example-page, http://test.com/test-page) and reduce each one to its bare domain (example.com, test.com). I have the code part figured out, but I still need to get these commands to run automatically over the cells of a column in the Excel sheet.
here's_the_code
I think you should read the data in as a pandas DataFrame (pd.read_excel), wrap your code in a function, then apply it to the DataFrame (df.apply). Then it is easy to save back to Excel with df.to_excel().
Of course you will need pandas to be installed.
Something like:
import pandas as pd

dframe = pd.read_excel(io='', sheet_name='')
dframe['domains'] = dframe['urls col name'].apply(your_function)
dframe.to_excel('your path')
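For the domain extraction itself, a minimal sketch using urllib.parse from the standard library (the file and column names are hypothetical):

from urllib.parse import urlparse
import pandas as pd

def extract_domain(url):
    # urlparse('http://www.example.com/page').netloc -> 'www.example.com'
    netloc = urlparse(url).netloc
    # strip a leading 'www.' to get the bare domain
    return netloc[4:] if netloc.startswith('www.') else netloc

dframe = pd.read_excel('websites.xlsx', sheet_name='Sheet1')
dframe['domains'] = dframe['urls'].apply(extract_domain)
dframe.to_excel('websites_with_domains.xlsx')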
Best
I've been spending the better part of the weekend trying to figure out the best way to transfer data from an MS Access table into an Excel sheet using Python. I've found a few modules that may help (execsql, python-excel), but with my limited knowledge and the modules I have to use to create certain data (I'm a GIS professional, so I create spatial data in an Access table with the ArcGIS arcpy module), I'm not sure what the best approach should be.
All I need to do is copy 4 columns of data from Access to Excel and then format the Excel output. I have the formatting part solved.
Should I:
Iterate through the rows using a cursor and somehow load the rows into Excel?
Copy the columns from Access to Excel?
Export the whole Access table into a sheet in Excel?
Thanks for any suggestions.
I eventually found a way to do this and thought I'd post my code for anyone who runs into the same situation. I use some GIS files, but if you don't, you can set a variable to a directory path instead of using env.workspace, and use a standard cursor instead of the arcpy.SearchCursor function; then this is doable.
import arcpy, xlwt
from arcpy import env
from xlwt import Workbook

# Set the workspace to the location of the feature class or dbf file. I used a dbf file.
env.workspace = r"C:\data"

# Use row objects to get and set field values
cur = arcpy.SearchCursor("SMU_Areas.dbf")

# Set up the workbook and sheets
book = Workbook()
sheet1 = book.add_sheet('Sheet 1')
book.add_sheet('Sheet 2')

# Row counter (row 0 is left free for a header)
rowx = 0

# Loop through the rows in the dbf file
for row in cur:
    rowx += 1
    # Write each row to the sheet; one sheet column per .dbf column
    sheet1.write(rowx, 0, row.ID)
    sheet1.write(rowx, 1, row.SHAPE_Area / 10000)

book.save(r'C:\data\MyExcel.xls')
del cur, row
I currently use the xlrd module to pull data in from an Excel spreadsheet and an insert cursor to create a feature class, which works very well.
You should be able to use a search cursor to iterate through the feature class records and then use the xlwt Python module (http://www.python-excel.org/) to write the records to Excel.
You can use ADO to read the data from Access (the connection strings differ between Access 2007+ .accdb files and Access 2003 and earlier .mdb files) and then use Excel's Range.CopyFromRecordset method (assuming you are using Excel via COM) to copy the entire recordset into Excel.
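A rough sketch of that approach with win32com (the paths, table and column names are hypothetical; the ACE provider string shown is for .accdb files):

import win32com.client

# Open the Access database via ADO (ACE provider, Access 2007+)
conn = win32com.client.Dispatch('ADODB.Connection')
conn.Open(r'Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\data\mydb.accdb;')

rs = win32com.client.Dispatch('ADODB.Recordset')
rs.Open('SELECT Col1, Col2, Col3, Col4 FROM MyTable', conn)

# Drive Excel via COM and paste the whole recordset in one call
excel = win32com.client.Dispatch('Excel.Application')
excel.Visible = True
ws = excel.Workbooks.Add().Worksheets(1)
ws.Range('A1').CopyFromRecordset(rs)

rs.Close()
conn.Close()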
The best approach might be not to use Python for this task.
You could use the macro recorder in Excel to record the import of the external data into Excel.
After starting the macro recorder, click Data -> Get External Data -> New Database Query and enter your criteria. Once the data import is complete, you can look at the generated code and replace the hard-coded search criteria with variables.
Another idea: how important is the formatting part? If you can ditch the formatting, you can output your data as CSV. Excel can open CSV files, and the CSV format is much simpler than the Excel format; it's so simple you can write it directly from Python like a text file, and that way you won't need to mess with Office COM objects.
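For example, a minimal sketch using the csv module from the standard library (the rows here stand in for whatever your Access cursor returns, with the 4 columns from the question):

import csv

# Hypothetical rows pulled from the Access table
rows = [
    (1, 'Area A', 12.5, 'SMU-01'),
    (2, 'Area B', 7.3, 'SMU-02'),
]

with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['ID', 'Name', 'Area', 'Code'])  # header row
    writer.writerows(rows)

Excel can then open output.csv directly.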