I have code that downloads data from Yahoo Finance into a DataFrame (df) for a list of stocks. Then I create a new worksheet for each stock, but I cannot manage to copy the data from df into that worksheet.
n = number_of_stocks
m = 0
while n > 0:
    x = Input_Stock_Names[m]
    m += 1
    n -= 1
    df = pdr.get_data_yahoo(x, starting_date, ending_date)
    df = df.reset_index()
    ExcelWrksht = ExcelWrkbook.Worksheets.Add()
    ExcelWrksht.Name = x
    ExcelWrksht = ExcelWrkbook.Worksheets(x)
Also, the Excel file is open while the code is running.
If the task is to simply store downloaded data, then using win32com is over-engineering. Simply use the facilities within pandas to write to the Excel file in the .xlsx format:
import pandas as pd
import yfinance as yf
from datetime import date, timedelta
# 90 days of history from today
end_date = date.today()
start_date = end_date - timedelta(days=90)
Input_Stock_Names = ['AAPL', 'TSLA', 'MSFT']
with pd.ExcelWriter('c:\\somepath\\stocks.xlsx', mode='w') as ew:
    for stock in Input_Stock_Names:
        df = yf.download(stock, start=start_date, end=end_date)
        # one sheet per ticker; sheet_name is passed by keyword
        df.to_excel(ew, sheet_name=stock)
This will create a new Excel file, with one sheet for each stock.
win32com allows you to 'drive' Excel, and do pretty much everything you would do if you had the Excel application open. However, it is relatively slow: the Excel application has to be started (and closed) and all the data and commands have to cross the 'process boundary' from one process (python) to the other (Excel).
With ExcelWriter, you simply write the data to a file in the .xlsx format so that Excel can read it later. If all you want to do is store data, this is much more efficient than using win32com.
Related
For the past few days I've been trying to do a relatively simple task but I'd always encounter some errors so I'd really appreciate some help on this. Here goes:
I have an Excel file which contains a specific column (Column F) that has a list of IDs.
What I want to do is for the program to read this excel file and allow the user to input any of the IDs they would like.
When the user types in one of the IDs, I want the program to return a bunch of IDs that contain the text the user entered. After that, I'd like to export those IDs to a new, separate Excel file where all the IDs are displayed in one column but in separate rows.
Here's my code so far, I've tried using arrays and stuff but nothing seems to be working for me :/
import pandas as pd
import numpy as np
import re
import xlrd
import os.path
import xlsxwriter
import openpyxl as xl;
from pandas import ExcelWriter
from openpyxl import load_workbook
# LOAD EXCEL TO DATAFRAME
xls = pd.ExcelFile('N:/TEST/TEST UTILIZATION/IA 2020/Dev/SCS-FT-IE-Report.xlsm')
df = pd.read_excel(xls, 'FT')
# GET USER INPUT (USE AD1852 AS EXAMPLE)
value = input("Enter a Part ID:\n")
print(f'You entered {value}\n\n')
i = 0
x = df.loc[i, "MFG Device"]
df2 = np.array(['', 'MFG Device', 'Loadboard Group','Socket Group', 'ChangeKit Group'])
for i in range(17367):
    # x = df.loc[i, "MFG Device"]
    if value in x:
        df = np.array[x]
        df2.append(df)
    i += 1
print(df2)
# create excel writer object
writer = pd.ExcelWriter('N:/TEST/TEST UTILIZATION/IA 2020/Dev/output.xlsx')
# write dataframe to excel
df2.to_excel(writer)
# save the excel
writer.save()
print('DataFrame is written successfully to Excel File.')
Any help would be appreciated, thanks in advance! :)
It looks like you're doing much more than you need to do. Rather than monkeying around with xlsxwriter, pandas.DataFrame.to_excel is your friend.
Just do
df2.to_excel("output.xlsx")
You don't need xlsxwriter. Simply calling df.to_excel() would work. In your code, df2 is a NumPy array. First convert it into a pandas DataFrame with the index and columns you need, then write it to Excel.
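As a rough sketch of that conversion, assuming the IDs live in a column named "MFG Device" as in the question: a boolean mask from str.contains keeps the matching rows as a DataFrame, so neither a NumPy array nor a manual loop is needed. The helper name filter_ids and the sample values are hypothetical.

```python
import pandas as pd


def filter_ids(df: pd.DataFrame, column: str, text: str) -> pd.DataFrame:
    """Return the rows whose `column` contains `text` as a plain substring."""
    # regex=False treats the user input literally; na=False skips empty cells
    mask = df[column].str.contains(text, regex=False, na=False)
    return df.loc[mask, [column]].reset_index(drop=True)


# Made-up sample data standing in for the 'FT' sheet
sample = pd.DataFrame({"MFG Device": ["AD1852JRS", "AD1852JRSZ", "LT1028", None]})
matches = filter_ids(sample, "MFG Device", "AD1852")
print(matches)
```

The result can then be written with matches.to_excel("output.xlsx", index=False), which puts each matching ID in its own row of a single column.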
I'm new to programming (my main challenge), and so far Python is my favorite programming language. Anyway, I would like to finish my first project using Quandl, but I'm stuck and need support.
I have this code to pull Quandl stock data. However, I don't want to enter my tickers in my text editor (Sublime) by typing or pasting them in.
THE GOAL: I want to pull my ticker data from a spreadsheet, then run the .py file.
Is there a way to read cell data and input into this line of code as a string, so I get the cell data to show as ['AAPL', 'MSFT'....'FB'] or ['A1', 'A2', 'A3'.......'A10']
data = quandl.get_table('WIKI/PRICES', ticker = ['AAPL', 'MSFT', 'WMT']
TO:
data = quandl.get_table('WIKI/PRICES', ticker = **(READ TICKERS PULLED FROM EXCELSTOCKDATA.PY** ['read/write cell A1', 'read/write cell A2'....''read/write cell A10'],
I've read some of the package docs, seen YouTube examples and can't seem to solve, so I'm reaching out for your wisdom!
Here's an example of what it would look like in Python if I wanted to analyze 10+ stocks (maybe more than shown below; I'm not sure, as I copied and pasted it here). It is absolutely crazy to type all the info into Python by hand, or to copy and paste it, since the text is also unwieldy. There has to be a faster method using one line of code?
GetQuandlData.py
data = quandl.get_table(['AAPL', 'MSFT','NFLX','FB','GS','TSLA','BAC','TWTR','COF','TOL','EA','PFE','MS','C','SKX','GLD','SPY','EEM']
further background:
My secondary file on reading excel data:
readstockdata.py
df = pd.read_excel("StockTickerData.xlsx", sheet_name = 'Control')
df = tickers
I guess I'd like to do this:
data = quandl.get_table('WIKI/PRICES', ticker = **(READ TICKERS PULLED FROM EXCELSTOCKDATA.PY** ['read/write cell A1', 'read/write cell A2'....''read/write cell A10'],
You have the right idea with pulling in the data from excel as a dataframe.
You should store the data in one column with the first line as a header (e.g. Tickers). Then read it as a dataframe with pandas like you have. After that to get your list, you can run:
tickerList = list(df['Tickers'])
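A minimal sketch of that step, assuming the column header is "Tickers" as suggested above. An in-memory DataFrame with invented values stands in for the result of pd.read_excel, and the extra cleanup (dropping empty cells, trimming whitespace) is an optional addition rather than something the answer requires.

```python
import pandas as pd

# Stand-in for pd.read_excel("StockTickerData.xlsx", sheet_name="Control");
# the "Tickers" header is the assumed column name
df = pd.DataFrame({"Tickers": ["AAPL", " MSFT ", None, "WMT"]})

# Drop empty cells and trim stray whitespace before building the list
ticker_list = df["Tickers"].dropna().astype(str).str.strip().tolist()
print(ticker_list)
```

The resulting list can then be passed as the ticker argument in place of the hand-typed list in the question.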
I have a "database in the form of an Excel file" that is used to generate reports. I want to build an app on Windows that generates the reports by exchanging data with this database, and I'm trying to learn pandas through this.
1- Am I wasting my time, and should I be using another approach instead?
2- I need to create a DataFrame automatically, so I'm trying to extract the column names for it from my Excel table.
3- I managed to store the first row (the column names) in a list. How can I create a DataFrame with headers using this list?
import pandas as pd
import os
import xlrd
cwd = os.getcwd()
book = xlrd.open_workbook('LISTERC.xlsx')
sheet = book.sheet_by_name("Sheet1")
count = 0
hdr = []
for c in range(10):
    if (sheet.cell(0,c).value != ''):
        count +=1
        hdr.append (sheet.cell(0,c).value)
DF = pd.DataFrame(*zip(hdr))
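One possible way to handle point 3, sketched with a made-up hdr list standing in for the one read from LISTERC.xlsx: pass the list as the columns argument of pd.DataFrame. The header names and row values below are invented for illustration.

```python
import pandas as pd

# Hypothetical header names standing in for the first row of the sheet
hdr = ["Name", "Date", "Amount"]

# An empty DataFrame that only carries the headers
empty = pd.DataFrame(columns=hdr)

# The same headers applied to some rows of data (values are invented)
rows = [["alpha", "2020-01-01", 10], ["beta", "2020-01-02", 20]]
filled = pd.DataFrame(rows, columns=hdr)
print(filled)
```

Note that pd.read_excel uses the first row of the sheet as column names by default, so reading the file directly with pandas may make the manual header loop unnecessary.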
I'm quite new to Python and have mostly targeted learning the language exactly to automate some processes and update / populate excel spreadsheets with realtime data. Is there a way (e.g. through openpyxl) to update specific cells with data that's extracted through python packages such as pandas or web scraping through BeautifulSoup ?
I already have the necessary code to extract the data-series that I need for my project in Python but am stuck entirely on how to link this data to excel.
import pandas as pd
import pandas_datareader.data as web
import datetime as dt
start = dt.datetime(2000,1,1)
end = dt.datetime.today()
tickers = [
"PYPL",
"BABA",
"SAP"
]
df = web.DataReader (tickers, 'yahoo', start, end)
print (df.tail(365)['Adj Close'])
Pandas has a method to export a Dataframe to Excel. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_excel.html
filename = "output.xlsx"
df.to_excel(filename)
One option is to run your Python script on a schedule and output to .csv or another format that Excel can link to. This option allows the data to be updated whenever the Python script is executed.
Setup:
Output your DataFrame to CSV, a database, or another Excel-readable format
Set up your Python file to run on a schedule (either with a scheduler, or with a loop and a delay)
Create a data connection from Excel to the file or database that Python writes
Build pivot tables based on the table in Excel
Refresh the data connection and pivot tables in Excel to get the new data
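The Python side of this setup (the first two steps) might look roughly like the sketch below, using the "loop with a delay" variant. The names export_on_schedule and fake_fetch, the output file name, and the sample values are all hypothetical, and the Excel data connection and pivot tables still have to be configured in Excel itself.

```python
import tempfile
import time
from pathlib import Path
from typing import Callable

import pandas as pd


def export_on_schedule(fetch: Callable[[], pd.DataFrame], out_path: Path,
                       interval_secs: float, runs: int) -> None:
    """Call `fetch` `runs` times, overwriting `out_path` with the latest data."""
    for i in range(runs):
        fetch().to_csv(out_path, index=False)
        if i < runs - 1:
            time.sleep(interval_secs)  # wait before the next refresh


def fake_fetch() -> pd.DataFrame:
    # Placeholder for the real download; values are invented
    return pd.DataFrame({"ticker": ["PYPL", "BABA"], "close": [1.0, 2.0]})


# Excel would point its data connection at this file
out_file = Path(tempfile.gettempdir()) / "latest_prices.csv"
export_on_schedule(fake_fetch, out_file, interval_secs=0, runs=2)
```

In practice the interval would be minutes or hours, or the loop would be dropped entirely in favor of an operating-system scheduler that runs the script once per invocation.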
(Appreciate that this is an old question). Real time data in Excel is possible with xlOil. xlOil allows you to very easily define an Excel RTD (real time data) function in python. Excel's RTD functions operate outside the normal calc cycle and can push data onto a sheet.
Your example could be written as:
import xloil, datetime as dt, asyncio
import pandas_datareader.data as web
start = dt.datetime(2000,1,1)
@xloil.func
async def pyGetTickers(names:list, fetch_every_secs=10):
    while True:
        yield web.DataReader(
            names, 'yahoo', start, dt.datetime.now())
        await asyncio.sleep(fetch_every_secs)
Which would appear as a worksheet function pyGetTickers.
One easy solution is to use the xlwings library:
import xlwings as xw
..
xw.Book(file_path).sheets['Sheet_name'].range('A1').value = df
This writes your df to the sheet starting at cell A1 via COM, which means it actually writes the values while the file is open.
Hope this is helpful
I have a 2 column CSV with download links in the first column and company symbols in the second column. For example:
http://data.com/data001.csv, BHP
http://data.com/data001.csv, TSA
I am trying to loop through the list so that Python opens each CSV via the download link and saves it separately as the company name. Therefore each file should be downloaded and saved as follows:
BHP.csv
TSA.csv
Below is the code I am using. It currently exports the entire CSV into a single row tabbed format, then loops back and does it again and again in an infinite loop.
import pandas as pd
data = pd.read_csv('download_links.csv', names=['download', 'symbol'])
file = pd.DataFrame()
cache = []
for d in data.download:
    df = pd.read_csv(d,index_col=None, header=0)
    cache.append(df)
    file = pd.DataFrame(cache)
    for s in data.symbol:
        file.to_csv(s+'.csv')
print("done")
Up until I convert the list 'cache' into the DataFrame 'file' to export it, the data is formatted perfectly. It's only when it gets converted to a DataFrame when the trouble starts.
I'd love some help on this one as I've been stuck on it for a few hours.
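import pandas as pd
# the links file has no header row, so name the columns explicitly (as in the question)
data = pd.read_csv('download_links.csv', names=['download', 'symbol'])
links = data.download
file_names = data.symbol
for link, file_name in zip(links, file_names):
    # to_csv returns None when given a path, so there is nothing to assign
    pd.read_csv(link).to_csv(file_name + '.csv', index=False)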
Iterate over both fields in parallel:
for download, symbol in data.itertuples(index=False):
    # read from the link in this row, then save under the matching symbol
    df = pd.read_csv(download, index_col=None, header=0)
    df.to_csv('{}.csv'.format(symbol))
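To try the parallel iteration without network access, here is a self-contained sketch in which two small local CSV files stand in for the download links. The temporary directory, the source_ file names, and the price values are all invented; with real data, the download column would hold URLs, which pd.read_csv also accepts.

```python
import tempfile
from pathlib import Path

import pandas as pd

work_dir = Path(tempfile.mkdtemp())

# Two small local CSVs stand in for the remote download links
sources = {}
for symbol, price in [("BHP", 40.1), ("TSA", 7.5)]:
    src = work_dir / f"source_{symbol}.csv"
    pd.DataFrame({"date": ["2020-01-02"], "close": [price]}).to_csv(src, index=False)
    sources[symbol] = src

# Equivalent of download_links.csv: one link and one symbol per row
data = pd.DataFrame({"download": [str(p) for p in sources.values()],
                     "symbol": list(sources.keys())})

# Walk both columns together and save each file under its symbol
for download, symbol in zip(data["download"], data["symbol"]):
    pd.read_csv(download).to_csv(work_dir / f"{symbol}.csv", index=False)

saved = sorted(p.name for p in work_dir.glob("*.csv")
               if not p.name.startswith("source_"))
print(saved)
```

Each source is read and written exactly once, which avoids the nested loop that rewrote every symbol's file on every pass.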