Import CSV file to Google Sheets via Python

I am trying to set up a Python script that will automatically take a rather large CSV file and upload it to a Google Sheet, overwriting all the data on the current sheet.
For example, let's say a new file of data comes out every day; it is currently around 13,000 rows and goes out to column AH. I want my Raspberry Pi to read that CSV file, which is already on the Pi, and overwrite the cells on the Google Sheet left over from the previous day.
Any advice will be greatly appreciated.
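One possible approach, sketched with the gspread library (an assumption; the question doesn't name a library). It needs a Google Cloud service account whose JSON key is stored on the Pi and whose e-mail address the spreadsheet is shared with; every path and name below is a placeholder.

import csv
import gspread

# Authenticate with a service-account key stored on the Pi.
gc = gspread.service_account(filename="/home/pi/creds.json")
ws = gc.open("Daily Data").sheet1  # placeholder spreadsheet name

with open("/home/pi/data/daily.csv", newline="") as f:
    rows = list(csv.reader(f))

ws.clear()       # drop yesterday's data so no stale rows survive
ws.update(rows)  # write today's ~13,000 rows x 34 columns (A..AH)

Clearing first matters: if today's file is shorter than yesterday's, a plain overwrite would leave the extra old rows in place.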

Related

Tableau data and Python

Hoping someone can help with this; please go easy, I'm new to Python. What I'm trying to do is download data from Tableau as a CSV (which I have done) and then read that CSV file using Python. The problem I'm having is that no matter what method I use to read the file (csv, pandas, etc.), all of the data from the CSV ends up in the first 'cell' (column 1, row 1), with question marks in diamonds between every character. These also appear when the CSV file is imported into Google Sheets or Excel. What am I doing wrong, or what do I need to do to fix this? Thank you in advance.
Also, when the CSV file is opened in something like Notepad it looks like a normal CSV file.
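The symptoms described above (replacement-character diamonds and everything landing in one cell) usually mean a UTF-16 file is being decoded as UTF-8; Tableau commonly exports CSV as UTF-16. A minimal sketch under that assumption, with a placeholder filename:

import pandas as pd

# Tableau CSV exports are often UTF-16; decoding them as UTF-8 is what
# produces the "question marks in diamonds" replacement characters.
df = pd.read_csv("tableau_export.csv", encoding="utf-16")

# If all columns still land in one cell, the delimiter may be a tab:
# df = pd.read_csv("tableau_export.csv", encoding="utf-16", sep="\t")
print(df.head())

Notepad looking "normal" fits this theory: Notepad auto-detects UTF-16 from the byte-order mark, while csv and pandas default to UTF-8.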

Python: Issue Updating Data Links In Excel After Python Dataframe Export

Situation
I'm working on a data project integrating Python in Google Colab and Excel 365 on Windows 8.1. My Python code collects new data updates on a regimented schedule and then exports/writes (i.e. overwrites, rather than appends, the data) to a report on an Excel spreadsheet.
I have no issue getting this to work when writing to a standalone spreadsheet.
I know I could potentially do all this in Python and not use Excel at all, but I prefer not to reinvent the wheel and not spend hours hardcoding all the formulas and links already existing in Excel.
Goal
My goal is to:
1. Use new data from my Python export to populate/overwrite a data table on Sheet A in an existing Excel workbook.
2. I also have a separate Sheet B in the same workbook performing calculations via pre-existing links to the original data table on Sheet A. I want those links to auto-update each time my Python export refreshes the data table on the first sheet.
Problem
The issue I am running into is that when I use the df.to_excel function to export the data, even with the sheet name parameter set, the export overwrites the data table and names the tab correctly, but wipes out all other pre-existing sheets in the same workbook.
So I attempted a workaround: exporting to an external workbook and then trying to update the links in the second workbook automatically. The problem is that the links don't appear to update unless both the source data file and the second workbook containing the links are manually opened, and the updated file is then manually saved.
I tried using openpyxl to control the Excel files, but it appeared to have no effect and no data was updated. (See the code block and result at the end of this post.)
Assistance
Does anybody know a way to use Python to:
1. Overwrite a specific sheet within an Excel workbook without wiping out the other existing sheets, and then have the links on another sheet that are connected to the new data update automatically? (See the sketch after this list.)
Or
2. Auto update external links between separate Excel workbooks while the files are unopened?
Or
3. Control an instance of Excel that can open both files so the links auto-update, and then save and close the files automatically?
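Addressing option 1, here is a sketch assuming pandas >= 1.3 with openpyxl installed (the workbook path and sheet name are placeholders): opening the writer in append mode with if_sheet_exists="replace" rewrites only the target sheet and leaves every other sheet, including Sheet B's formulas, untouched.

import pandas as pd

df = pd.DataFrame({"date": ["2023-01-01"], "value": [42]})  # stand-in data

# mode="a" opens the existing workbook instead of recreating it;
# if_sheet_exists="replace" rewrites only the one target sheet.
with pd.ExcelWriter(
    "/content/drive/MyDrive/Data/report.xlsx",  # placeholder path
    engine="openpyxl",
    mode="a",
    if_sheet_exists="replace",
) as writer:
    df.to_excel(writer, sheet_name="Sheet A", index=False)

Sheet B's formulas that reference Sheet A recalculate the next time the workbook is opened in Excel.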
I found a post from some years ago that identified a win32 package for Python that appeared to be able to control instances of Excel. When I tried doing a pip install in Colab, I got an error that the package was unrecognized or doesn't exist.
Ideally, I would prefer not to use VBA if at all possible to solve this.
Any solutions are much appreciated.
Thanks in advance.
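On the win32 point: the package is most likely pywin32 (imported as win32com), which drives a real Excel instance over COM, so it only installs and runs on Windows with Excel present; that is why the pip install failed in Colab's Linux runtime. A sketch of option 3 under those assumptions, with placeholder paths:

import win32com.client  # pip install pywin32 (Windows only)

excel = win32com.client.Dispatch("Excel.Application")
excel.Visible = False

# UpdateLinks=3 asks Excel to refresh external references on open.
wb = excel.Workbooks.Open(r"C:\Data\report_with_links.xlsx", UpdateLinks=3)
wb.RefreshAll()  # also refresh any data connections
wb.Save()
wb.Close()
excel.Quit()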
Sample Code that isn't producing any results:
import openpyxl
# Example code
from openpyxl import load_workbook
from openpyxl import Workbook
wb = load_workbook('/content/drive/MyDrive/Data/Series/AC5M.xlsx', keep_links=True)
ws = wb.active
Workbook.save
Workbook.close
print(ws)
Result:
"function openpyxl.workbook.workbook.Workbook.close"

I can't modify Excel data I read from Python

I am currently trying to develop a program; to use it I need to read data from an Excel file with pandas.
The problem is that once I open Anaconda and Jupyter and run the program, it won't let me go back and modify the Excel file it gets its data from.
The program works and reads the initial data, but I can't modify the Excel sheet and save it so the program can run with other input data.
excel = pd.ExcelFile(r'C:\Users\ADURAN3\Anaconda3\python.xlsx')
df = pd.read_excel(excel, 'Sheet1', index_col=0)
When I try to save the Excel sheet with the new changes, it forces me to rename it.
I would love it if you could help me; I am very new to Python.
Thank you very much.
To read an Excel file, you don't need pd.ExcelFile(r'C:\Users\ADURAN3\Anaconda3\python.xlsx'). pd.ExcelFile keeps an open handle on the file until it is closed, which is why Excel forces you to save under a new name.
pd.read_excel opens and closes the file in a single call, so just putting your file path into read_excel will work:
df = pd.read_excel(r'C:\Users\ADURAN3\Anaconda3\python.xlsx', sheet_name='Sheet1', index_col=0)
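If you do want pd.ExcelFile (for example, to read several sheets from one open), a context manager releases the file handle as soon as reading is done, so the file can be edited and saved in Excel again; a sketch under that assumption:

import pandas as pd

with pd.ExcelFile(r'C:\Users\ADURAN3\Anaconda3\python.xlsx') as xls:
    df = pd.read_excel(xls, sheet_name='Sheet1', index_col=0)
# The handle is closed here, so the file is no longer locked.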

How to use a cache while working with a heavy Excel file for extracting data using Python

Hi, I have a rather huge Excel file (.xlsx) that has multiple tabs I need to access for various purposes. Every time I have to read from the Excel file it slows the process down. Is there any way I can load selected tabs into a cache the first time I read the workbook? Thanks.
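One minimal approach (a sketch, not from the thread; the path and sheet names are placeholders) is to parse each tab once and serve all later reads from memory:

import pandas as pd
from functools import lru_cache

PATH = "big_workbook.xlsx"  # placeholder

@lru_cache(maxsize=None)
def get_sheet(name: str) -> pd.DataFrame:
    # Only the first call per sheet pays the slow .xlsx parse.
    return pd.read_excel(PATH, sheet_name=name)

sales = get_sheet("Sales")        # parsed from disk
sales_again = get_sheet("Sales")  # returned instantly from the cache

To cache across program runs rather than within one, a common variant is to save each parsed sheet to a faster on-disk format (df.to_parquet / pd.read_parquet) and re-read that file on subsequent runs.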

Python: pandas.read_csv on a large CSV file with 10 million rows from a Google Drive file

I extracted a .csv file from Google BigQuery with 2 columns and 10 million rows.
I downloaded the file locally as a .csv of about 170 MB, then uploaded it to Google Drive, and I want to use the pandas.read_csv() function to read it into a pandas DataFrame in my Jupyter Notebook.
Here is the code I used, with the specific file ID that I want to read.
# read into pandasDF from .csv stored on Google Drive.
follow_network_df = pd.read_csv("https://drive.google.com/uc?export=download&id=1WqHWdgMVLPKVbFzIIprBBhe3I9faq4HA")
Then here is what I got: it seems the 170 MB CSV file is read back as an HTML page rather than as data?
Meanwhile, when I tried the same code with another CSV file of 40 MB, it worked perfectly:
# another csv file of 40Mb.
user_behavior_df = pd.read_csv("https://drive.google.com/uc?export=download&id=1NT3HZmrrbgUVBz5o6z_JwW5A5vRXOgJo")
Can anyone give me a hint on the root cause of the difference?
Any ideas on how to read a CSV file of 10 million rows and 170 MB from online storage? I know it's possible to read the 10 million rows into a pandas DataFrame via the BigQuery interface or from my local machine, but I have to include this as part of my submission, so I can only read from an online source.
The problem is that your first file is too large for Google Drive to scan for viruses, so a user prompt gets displayed instead of the actual file, and that prompt's HTML is what pandas ends up reading. You can see this if you open the first file's link in a browser.
I'd say click through the user prompt and use the resulting download URL with pd.read_csv.
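The exact URL differs per file, but a commonly cited workaround (an assumption here, since Google can change this behavior at any time) is to append confirm=t to the download URL so the virus-scan interstitial is skipped:

import pandas as pd

file_id = "1WqHWdgMVLPKVbFzIIprBBhe3I9faq4HA"  # file ID from the question
url = f"https://drive.google.com/uc?export=download&id={file_id}&confirm=t"
follow_network_df = pd.read_csv(url)
print(follow_network_df.shape)  # expecting roughly (10_000_000, 2)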
