Is it possible to pass an xlwings object to multiprocessing? - python

My goal is to write a program in python that opens an Excel workbook and starts a macro. If the macro is not completed within a certain amount of time or if it encounters an error, the program should exit. This should not only terminate the process, but also close the Excel workbook. Unfortunately, it is not possible to pass the created Excel workbook to the multiprocessing process.
import pythoncom
import xlwings as xw
import multiprocessing as mp
import time
def ExcelFunktionStarten(pfad, startanweisung, maxWartezeit):
    pythoncom.CoInitialize()
    xl_app = xw.App(visible=True, add_book=False)
    t1 = mp.Process(target=ExcelRun, args=(pfad, startanweisung, xl_app, ))
    t1.start()
    vergangeneZeit = 0
    while t1.is_alive():
        if vergangeneZeit > maxWartezeit:
            t1.terminate()
            time.sleep(1)
            xl_app.quit()
            break
        else:
            print(vergangeneZeit, t1.is_alive())
            print(xl_app)
            vergangeneZeit += 1
            time.sleep(1)
def ExcelRun(pfad, startanweisung, xl_app):
    try:
        wb = xl_app.books.open(pfad)
        run_macro = wb.app.macro(startanweisung)
        run_macro()
        wb.save()
        wb.close()
        xl_app.quit()
    except:
        xl_app.quit()
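One possible way around the limitation described in the question, sketched here under assumptions: instead of passing xl_app across the process boundary, start the worker as its own Python process that creates its own Excel instance, and enforce the time limit from the parent. Note that subprocess.run is swapped in for multiprocessing.Process because it needs no pickling at all. The helper name run_with_timeout and the stand-in child commands are hypothetical, chosen so the sketch runs without Excel.

```python
import subprocess
import sys


def run_with_timeout(command, timeout):
    """Run `command` as a child process; return True only if it succeeded in time.

    subprocess.run kills the child and raises TimeoutExpired when the limit
    is exceeded, and check=True raises CalledProcessError on a non-zero exit.
    """
    try:
        subprocess.run(command, timeout=timeout, check=True)
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
        return False
    return True


# Stand-in children (assumption). A real child would be a script that creates
# its own xw.App, opens the workbook and runs the macro.
fast_child = [sys.executable, "-c", "pass"]
slow_child = [sys.executable, "-c", "import time; time.sleep(30)"]

print(run_with_timeout(fast_child, timeout=10))   # child exits on its own
print(run_with_timeout(slow_child, timeout=0.5))  # child is killed at the limit
```

On a timeout this kills only the Python child. The Excel instance that the child started is a separate process and would still have to be closed, which this sketch does not attempt.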

Related

Python code doesn't call class from another program when Import statement is used

I'm trying to write code that gets user input via a pop-up and uses it in a different program.
Below is the code that gets the user input.
Excel_connection.py
import openpyxl
import tkinter as tk
class App(tk.Frame):
    def __init__(self,master=None,**kw):
        #Create a blank dictionary
        self.answers = {}
        tk.Frame.__init__(self,master=master,**kw)
        tk.Label(self,text="Give Input Sheet Path").grid(row=0,column=0)
        self.Input_From_User1 = tk.Entry(self)
        self.Input_From_User1.grid(row=0,column=1)
        tk.Button(self,text="Feed into Program",command =
                  self.collectAnswers).grid(row=2,column=1)
    def collectAnswers(self):
        self.answers['Input_Path'] = self.Input_From_User1.get()
        global Input_Path
        Input_Path = self.answers['Input_Path']
        functionThatUsesAnswers(self.answers)
        quit()
def quit():
    root.destroy()
if __name__ == '__main__':
    root = tk.Tk()
    App(root).grid()
    root.mainloop()
wb = openpyxl.load_workbook(Input_Path) # trying to open the open the input sheet from the below path
ws = wb["Sheet1"]
Below is the code where I import the above program and do some operations with it.
Execution.py
import pandas
from Excel_Connection import *
from Snowflake_Connection import *
all_rows = list(ws.rows)
cur = ctx.cursor()
# Pull information from specific cells.
for row in all_rows[1:400]:
    scenario = row[1].value
    query = row[2].value
    if_execute = row[3].value
    if if_execute == 'Y':
        try:
            cur.execute(query)
            df = cur.fetch_pandas_all()
        except:
            print(scenario," Failed")
        else:
            print("CREATED",scenario,".csv successfully")
print("ALL INDIVDUAL REPORT GENERATED")
When I execute Execution.py, the program does not show the pop-up window; instead it throws the error below:
wb = openpyxl.load_workbook(Input_Path) # trying to open the open the input sheet from the below path
NameError: name 'Input_Path' is not defined
I tried:
- executing Excel_connection.py separately, and it worked fine.
- placing the code directly in Execution.py instead of importing the program, and again it worked as expected.
The only time I face the issue is when I import Excel_connection.py into Execution.py.
Could somebody kindly help me out here?
When you import Excel_connection.py, Python runs the code in it.
So when you run Execution.py:
1. import pandas -> "runs" the pandas code to define its functions, etc.
2. from Excel_connection import * -> You import everything from Excel_connection, so the interpreter opens this file, parses it, and runs it:
2.1 Class App is defined.
2.2 It runs wb = openpyxl.load_workbook(Input_Path), which cannot work: as far as I can see, Input_Path is only defined in App.collectAnswers(), which has not been executed yet. So there is no Input_Path to use, and your program terminates here and tells you so.
If you run Excel_connection.py directly, it works, because __name__ == '__main__' is True in that case, so that section runs too. If you import the file, the condition is False and that part of the code is skipped.
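The import behaviour described above can be reproduced in isolation. The following is a minimal sketch, not the asker's code: it writes a tiny throwaway module to a temporary directory and imports it, showing that top-level statements run on import while the `if __name__ == "__main__":` block is skipped. The names MODULE_SOURCE, import_from_source and demo_module are hypothetical.

```python
import importlib.util
import pathlib
import tempfile
import textwrap

# Source of a throwaway module (assumption: stands in for Excel_connection.py).
MODULE_SOURCE = textwrap.dedent('''
    print("top-level code runs on import")
    defined_at_top_level = True

    if __name__ == "__main__":
        # Only reached when the file is run directly, not when it is imported.
        defined_in_main_guard = True
''')


def import_from_source(source, name="demo_module"):
    """Write `source` to a temporary .py file and import it under `name`."""
    with tempfile.TemporaryDirectory() as tmp:
        path = pathlib.Path(tmp) / f"{name}.py"
        path.write_text(source)
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # executes the module's top-level code
        return module


demo = import_from_source(MODULE_SOURCE)
print(hasattr(demo, "defined_at_top_level"))   # name created by top-level code
print(hasattr(demo, "defined_in_main_guard"))  # name created only inside the guard
```

In the imported module `__name__` is "demo_module", so the guarded assignment never happens. In the question, the same thing happens to the code that would have shown the window and set Input_Path.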
You should move this to the Execution.py file, before the all_rows = list(ws.rows) line:
wb = openpyxl.load_workbook(Input_Path) # trying to open the open the input sheet from the below path
ws = wb["Sheet1"]
It will still break, since there is no Input_Path yet, so you must define it somehow; how is up to you. You could create an App the same way you do in Excel_connection.py.
But I wouldn't do that, since it is a little ugly. I would do something like:
Excel_Connection.py
import openpyxl
import tkinter as tk
class App(tk.Frame):
    def __init__(self,master=None,**kw):
        #Create a blank dictionary
        self.answers = {}
        tk.Frame.__init__(self,master=master,**kw)
        tk.Label(self,text="Give Input Sheet Path").grid(row=0,column=0)
        self.Input_From_User1 = tk.Entry(self)
        self.Input_From_User1.grid(row=0,column=1)
        tk.Button(self,text="Feed into Program",command =
                  self.collectAnswers).grid(row=2,column=1)
    def collectAnswers(self):
        self.answers['Input_Path'] = self.Input_From_User1.get()
        global Input_Path
        Input_Path = self.answers['Input_Path']
        functionThatUsesAnswers(self.answers)
        self.quit()
# def quit():
#     root.destroy()
def main():
    root = tk.Tk()
    App(root).grid()
    root.mainloop()
    wb = openpyxl.load_workbook(Input_Path) # trying to open the open the input sheet from the below path
    return wb["Sheet1"]
if __name__ == '__main__':
    main()
and then
Execution.py
import pandas
import Excel_Connection
from Snowflake_Connection import *
# Now we call the main() function from Excel_Connection, which will return the worksheet for us.
ws = Excel_Connection.main()
all_rows = list(ws.rows)
cur = ctx.cursor()
# Pull information from specific cells.
for row in all_rows[1:400]:
    scenario = row[1].value
    query = row[2].value
    if_execute = row[3].value
    if if_execute == 'Y':
        try:
            cur.execute(query)
            df = cur.fetch_pandas_all()
        except:
            print(scenario," Failed")
        else:
            print("CREATED",scenario,".csv successfully")
print("ALL INDIVDUAL REPORT GENERATED")
You can also simply use CSV files, or a similar file format, to store and transfer data between the two scripts.

How do I implement and execute threading with multiple classes in Python?

I'm very new to Python (with most of my previous programming experience being in intermediate C++ and Java) and am trying to develop a script which will read sensor data and log it to a .csv file. To do this I created separate classes for the code-- one will read the sensor data and output it to the console, while the other is supposed to take that data and log it-- and combined them together into a master script containing each class. Separately, they work perfectly, but together only the sensorReader class functions. I am trying to get each class to run in its own thread, while passing the sensor data from the first class (sensorReader) to the second class (csvWriter) as well. I've posted some of my pseudocode below, but I'd be happy to clarify any questions with the actual source code if needed.
import time
import sensorStuff
import csv
import threading
import datetime
class sensorReader:
    # Initializers for the sensors.
    this.code(initializes the sensors)
    while True:
        try:
            this.code(prints the sensor data to the console)
            this.code(throws exceptions)
        this.code(waits 60 seconds)
class csvWriter:
    this.code(fetches the date and time)
    this.code(writes the headers for the excel sheet once)
    while True:
        this.code(gets date and time)
        this.code(writes the time and one row of data to excel)
        this.code(writes a message to console then repeats every minute)
r = sensorReader()
t = threading.Thread(target = r, name = "Thread #1")
t.start()
t.join
w = csvWriter()
t = threading.Thread(target = w, name = "Thread #2")
t.start()
I realize the last part doesn't really make sense, but I'm really punching above my weight here, so I'm not even sure why only the first class works and not the second, let alone how to implement threading for multiple classes. I would really appreciate it if anyone could point me in the right direction.
Thank you!
EDIT
I've decided to put up the full source code:
import time
import board
import busio
import adafruit_dps310
import adafruit_dht
import csv
import threading
import datetime
# import random
class sensorReader:
    # Initializers for the sensors.
    i2c = busio.I2C(board.SCL, board.SDA)
    dps310 = adafruit_dps310.DPS310(i2c)
    dhtDevice = adafruit_dht.DHT22(board.D4)
    while True:
        # Print the values to the console.
        try:
            global pres
            pres = dps310.pressure
            print("Pressure = %.2f hPa"%pres)
            global temperature_c
            temperature_c = dhtDevice.temperature
            global temperature_f
            temperature_f = temperature_c * (9 / 5) + 32
            global humidity
            humidity = dhtDevice.humidity
            print("Temp: {:.1f} F / {:.1f} C \nHumidity: {}% "
                  .format(temperature_f, temperature_c, humidity))
            print("")
        # Errors happen fairly often with DHT sensors, and will occasionally throw exceptions.
        except RuntimeError as error:
            print("n/a")
            print("")
        # Waits 60 seconds before repeating.
        time.sleep(10)
class csvWriter:
    # Fetches the date and time for future file naming and data logging operations.
    starttime=time.time()
    x = datetime.datetime.now()
    # Writes the header for the .csv file once.
    with open('Weather Log %s.csv' % x, 'w', newline='') as f:
        fieldnames = ['Time', 'Temperature (F)', 'Humidity (%)', 'Pressure (hPa)']
        thewriter = csv.DictWriter(f, fieldnames=fieldnames)
        thewriter.writeheader()
    # Fetches the date and time.
    while True:
        from datetime import datetime
        now = datetime.now()
        current_time = now.strftime("%H:%M:%S")
        # Writes incoming data to the .csv file.
        with open('Weather Log %s.csv', 'a', newline='') as f:
            fieldnames = ['TIME', 'TEMP', 'HUMI', 'PRES']
            thewriter = csv.DictWriter(f, fieldnames=fieldnames)
            thewriter.writerow({'TIME' : current_time, 'TEMP' : temperature_f, 'HUMI' : humidity, 'PRES' : pres})
        # Writes a message confirming the data's entry into the log, then sets a 60 second repeat cycle.
        print("New entry added.")
        time.sleep(10.0 - ((time.time() - starttime) % 10.0)) # Repeat every ten seconds.
r = sensorReader()
t = threading.Thread(target = r, name = "Thread #1")
t.start()
t.join
w = csvWriter()
t = threading.Thread(target = w, name = "Thread #2")
t.start()
It would work better structured like this. If you put the first loop in a function, you can delay its evaluation until you're ready to start the thread. But in a class body it would run immediately and you never get to the second definition.
def sensor_reader():
    # Initializers for the sensors.
    this.code(initializes the sensors)
    while True:
        try:
            this.code(prints the sensor data to the console)
        except:
            print()
        this.code(waits 60 seconds)
threading.Thread(target=sensor_reader, name="Thread #1", daemon=True).start()
this.code(fetches the date and time)
this.code(writes the headers for the excel sheet once)
while True:
    this.code(gets date and time)
    this.code(writes the time and one row of data to excel)
    this.code(writes a message to console then repeats every minute)
I made it a daemon so it will stop when you terminate the program. Note also that we only needed to create one thread, since we already have the main thread.
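The structure above can be sketched as runnable code. This is only an illustration under assumptions: a queue.Queue is swapped in for the question's global variables to hand readings from the reader thread to the main thread, fake_sensor is a hypothetical stand-in for the real DPS310/DHT22 reads, and an in-memory buffer stands in for the .csv file.

```python
import csv
import io
import queue
import threading
import time


def sensor_reader(out_queue, read_sensor, interval, stop_event):
    """Producer thread: read the sensor periodically and queue each reading."""
    while not stop_event.is_set():
        try:
            out_queue.put(read_sensor())
        except RuntimeError:
            pass  # DHT-style sensors fail occasionally; skip this reading
        stop_event.wait(interval)  # sleeps, but wakes early once stop is set


def log_readings(in_queue, csv_file, count):
    """Consumer (main thread): write the header, then `count` rows of readings."""
    writer = csv.writer(csv_file)
    writer.writerow(["Time", "Temperature (F)", "Humidity (%)", "Pressure (hPa)"])
    for _ in range(count):
        temperature_f, humidity, pressure = in_queue.get()  # blocks until data arrives
        writer.writerow([time.strftime("%H:%M:%S"), temperature_f, humidity, pressure])


def fake_sensor():
    # Hypothetical stand-in for the real sensor reads.
    return (72.5, 40.0, 1013.25)


readings = queue.Queue()
stop = threading.Event()
threading.Thread(target=sensor_reader, args=(readings, fake_sensor, 0.01, stop),
                 name="Thread #1", daemon=True).start()

log = io.StringIO()  # a real script would use open(..., "w", newline="") instead
log_readings(readings, log, count=3)
stop.set()
print(log.getvalue())
```

As in the answer, only one extra thread is created; the logging loop runs in the main thread. The queue means the writer never sees a half-updated set of values, which can happen with separate globals.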

Python script keeps running in the background

This is a simple script used to calculate Z scores for some neuropsychological tests.
Recently, however, the code seems to keep running in the background even after I exit the program. This problem didn't exist before. I use the startup code below to make sure the program runs with elevation and at the correct size for the display.
The program can be shut down with an "exit" command that raises SystemExit, or just by pressing the X button on the top bar. Either way, it keeps running in the background.
Where did I go wrong?
def progStructure():
    first_run = True
    if first_run:
        import os
        os.system("mode con: cols=160 lines=50")
        print("""
================================================
PROGRAM GREETING
================================================
""")
        mainStartup()
        first_run = False
    while settings("auto_run"):
        mainStartup()
        #this function contains the main program, but irrelevant to the question
    print("Auto shutdown enabled, program is shutting down.")
    wait(2)
    exit()
    #informs the user data has been saved then restarts
import ctypes, sys
def is_admin():
    try:
        return ctypes.windll.shell32.IsUserAnAdmin()
    except:
        return False
if is_admin():
    progStructure()
else:
    ctypes.windll.shell32.ShellExecuteW(None, "runas", sys.executable, "", None, 0)
    progStructure()
My excel writing function:
def excelWriter(excel_path, data_num, printable_list):
    from time import strftime
    date = strftime("%Y-%m-%d")
    time = strftime("%H:%M:%S")
    if settings("excel_output_subjectNames"):
        patient_name_local = patient_name
    else:
        patient_name_local = "N/A"
    demographic_data = [patient_ID, patient_name_local, patient_admin, date, time, patient_age, patient_sex, patient_edu]
    from openpyxl import Workbook
    from openpyxl import load_workbook
    wb = Workbook(write_only=False)
    try:
        test_name_list = [
            "(1)MMT", "(2)MOCA", "(3)3MS", "(4)GISD", "(5)ECR",
            "(6)Sözel Bellek Süreçleri", "(7)Rey Karmaşık Figür", "(8)İz Sürme", "(9)Stroop",
            "(10)Wisconsin", "(11)Görsel Sözel Test", "(12)Renkli İz Sürme",
            "(13)Wechsler", "(14)Wechsler-Sayı Dizisi", "(15)Sözel Akıcılık",
            "(16)Semantik Akıcılık", "(17)Saat Çizme", "(18)SDOT", "(19)Ayları İleri-Geri Sayma"
        ]
        while True:
            try:
                data_workbook = load_workbook(filename = excel_path + settings("excel_name"), read_only=False)
                active_sheet = data_workbook.get_sheet_by_name(test_name_list[data_num-1])
                active_sheet.append(demographic_data + printable_list)
                data_workbook.save(filename = excel_path + settings("excel_name"))
                break
            except:
                for i in range(len(test_name_list)):
                    wb.create_sheet(title = test_name_list[i])
                wb.save(filename = excel_path + settings("excel_name"))
                data_workbook = load_workbook(filename = excel_path + settings("excel_name"), read_only=False)
                active_sheet = data_workbook.get_sheet_by_name("Sheet")
                data_workbook.remove_sheet(active_sheet)
                data_workbook.save(filename = excel_path + settings("excel_name"))
                continue
    except:
        raise
I had forgotten to close the Excel file after writing. Closing it seems to have fixed the problem. Thanks to everyone who helped.

redirect sys.stdout to specific Jupyter Notebook cell

Jupyter==4.1.0, Python==2.7.10, IPython==4.2.0
I'm writing a SQL UI for my Jupyter Notebooks and would like to incorporate multithreading so that I can run a query in one cell and continue to work in other cells while the query is running.
The problem I'm having is that if I execute a query in one cell, the output will be displayed in the last-executed cell's output prompt instead of in the output prompt of the cell that executed the query.
I scoured the interwebs and discovered this clever trick, but I think it's outdated and/or no longer works in my version of Jupyter. When I run it, I only get output for whatever cell was last executed. So if I run both, I only get the last-executed output, instead of the output printing to separate cells simultaneously.
So I have my context manager which sets the parent_header:
import sys
import threading
from contextlib import contextmanager
# we need a lock so that other threads don't snatch control
# while we have set a temporary parent
stdout_lock = threading.Lock()
@contextmanager
def set_stdout_parent(parent):
    """a context manager for setting a particular parent for sys.stdout
    the parent determines the destination cell of the output
    """
    save_parent = sys.stdout.parent_header
    with stdout_lock:
        sys.stdout.parent_header = parent
        try:
            yield
        finally:
            # the flush is important, because that's when the parent_header actually has its effect
            sys.stdout.flush()
            sys.stdout.parent_header = save_parent
I essentially want to be able to get the parent_header of a cell In[1] and redirect the output of cell In[2] to the output of In[1].
Example:
Get parent_header of In[1]:
In[1]: t = sys.stdout.parent_header
Then the following code will run, but the output should print to Out[1] (currently, I get no output when I run this code):
In [2]: with set_stdout_parent(t):
            print 'FOO'
Which should produce:
In[1]: t = sys.stdout.parent_header
Out[1]:'FOO'
The documentation for ipywidgets.Output has a section about interacting with output widgets from background threads. With the Output.append_stdout method there is no need for locking. The final cell (In[4]) of the ipywidgets.Output-and-locking answer below can then be replaced with:
def t1_main():
    for i in range(10):
        output1.append_stdout(f'thread1 {i}\n')
        time.sleep(0.5)
def t2_main():
    for i in range(10):
        output2.append_stdout(f'thread2 {i}\n')
        time.sleep(0.5)
output1.clear_output()
output2.clear_output()
t1 = Thread(target=t1_main)
t2 = Thread(target=t2_main)
t1.start()
t2.start()
t1.join()
t2.join()
You can use a combination of ipywidgets.Output (docs) and locking:
Code in jupyter cells:
# In[1]:
from threading import Thread, Lock
import time
from ipywidgets import Output
# In[2]:
output1 = Output()
output1
# In[3]:
output2 = Output()
output2
# In[4]:
print_lock = Lock()
def t1_main():
    for i in range(10):
        with print_lock, output1:
            print('thread1', i)
        time.sleep(0.5)
def t2_main():
    for i in range(10):
        with print_lock, output2:
            print('thread2', i)
        time.sleep(0.5)
output1.clear_output()
output2.clear_output()
t1 = Thread(target=t1_main)
t2 = Thread(target=t2_main)
t1.start()
t2.start()
t1.join()
t2.join()

Multiprocessing: Use Win32 API to modify Excel-Cells from Python

I'm really looking for a good solution here; maybe the whole concept of how I did it, or at least tried to do it, is wrong!?
I want to make my code capable of using all my cores. In the code I'm modifying Excel cells using the Win32 API. I wrote a small xls class that checks whether the desired file is already open (and opens it if not) and sets values in cells. My stripped-down code looks like this:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import win32com.client as win32
from multiprocessing import Pool
from time import sleep
class xls:
    excel = None
    filename = None
    wb = None
    ws = None
    def __init__(self, file):
        self.filename = file
    def getNumOpenWorkbooks(self):
        return self.excel.Workbooks.Count
    def openExcelOrActivateWb(self):
        self.excel = win32.gencache.EnsureDispatch('Excel.Application')
        # Check whether one of the open files is the desired file (self.filename)
        if self.getNumOpenWorkbooks() > 0:
            for i in range(self.getNumOpenWorkbooks()):
                if self.excel.Workbooks.Item(i+1).Name == os.path.basename(self.filename):
                    self.wb = self.excel.Workbooks.Item(i+1)
                    break
        else:
            self.wb = self.excel.Workbooks.Open(self.filename)
    def setCell(self, row, col, val):
        self.ws.Cells(row, col).Value = val
    def setLastWorksheet(self):
        self.ws = self.wb.Worksheets(self.wb.Worksheets.Count)
if __name__ == '__main__':
    dat = zip(range(1, 11), [1]*10)
    # Create Object
    xls = xls('blaa.xls')
    xls.openExcelOrActivateWb()
    xls.setLastWorksheet()
    for (row, col) in dat:
        # Calculate some value here (only depending on row,col):
        # val = some_func(row, col)
        val = 'test'
        xls.setCell(row, col, val)
Now, as the loop depends ONLY on the two iteration variables, I wanted to run it in parallel on many cores. I had heard of threading and multiprocessing; the latter seemed easier to me, so I gave it a go.
So I changed the code like this:
import os
import win32com.client as win32
from multiprocessing import Pool
from time import sleep
class xls:
    ### CLASS_DEFINITION LIKE BEFORE ###
''' Define Multiprocessing Worker '''
def multiWorker((row, col)):
    xls.setCell(row, col, 'test')
if __name__ == '__main__':
    # Create Object
    xls = xls('StockDatabase.xlsm')
    xls.openExcelOrActivateWb()
    xls.setLastWorksheet()
    dat = zip(range(1, 11), [1]*10)
    p = Pool()
    p.map(multiWorker, dat)
This didn't work. After some reading I learned that multiprocessing starts new processes, so xls is not known to the workers.
Unfortunately, I can't pass xls to them as a third parameter either, because the Win32 object can't be pickled :( Like this:
def multiWorker((row, col, xls)):
    xls.setCell(row, col, 'test')
if __name__ == '__main__':
    # Create Object
    xls = xls('StockDatabase.xlsm')
    xls.openExcelOrActivateWb()
    xls.setLastWorksheet()
    dat = zip(range(1, 11), [1]*10, [xls]*10)
    p = Pool()
    p.map(multiWorker, dat)
The only way would be to initialize the Win32 for each process right before the definition of the multiWorker:
# Create Object
xls = xls('StockDatabase.xlsm')
xls.openExcelOrActivateWb()
xls.setLastWorksheet()
def multiWorker((row, col, xls)):
    xls.setCell(row, col, 'test')
if __name__ == '__main__':
    dat = zip(range(1, 11), [1]*10, [xls]*10)
    p = Pool()
    p.map(multiWorker, dat)
But I don't like it, because my xls constructor has some more logic that automatically tries to find column ids for known header substrings. So that is a bit more effort than I wanted (and I don't think each process should really open its own Win32 COM interface). It also gives me an error, because gencache.EnsureDispatch apparently can't be called that often.
What should I do? What would a solution look like?
Thanks!!
While Excel can use multiple cores when recalculating spreadsheets, its programmatic interface is strongly tied to the UI model, which is single threaded. The active workbook, worksheet, and selection are all singleton objects; this is why you cannot interact with the Excel UI at the same time you're driving it using COM (or VBA, for that matter).
tl;dr
Excel doesn't work that way.
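Given that constraint, one arrangement that may still use several cores is to parallelize only the calculation and leave every Excel call in a single process. The following is a sketch under that assumption, not a tested Excel solution: compute_value, write_cells and main are hypothetical names, the placeholder string stands in for the real per-cell calculation, and set_cell would be something like the question's xls.setCell.

```python
from multiprocessing import Pool


def compute_value(cell):
    """Pure computation that depends only on (row, col); it never touches Excel."""
    row, col = cell
    return row, col, f"test-{row}-{col}"  # placeholder for the real calculation


def write_cells(set_cell, results):
    """Apply all results through one writer, serially, in the calling process."""
    for row, col, val in results:
        set_cell(row, col, val)


def main(set_cell):
    """Compute in worker processes, then write from this process only."""
    cells = list(zip(range(1, 11), [1] * 10))
    with Pool() as pool:
        results = pool.map(compute_value, cells)  # parallel part: no COM objects involved
    write_cells(set_cell, results)                # serial part: the only code that would use Excel
```

Only plain tuples and strings cross the process boundary, so nothing unpicklable has to be passed. A script would call main(xls.setCell) under an if __name__ == '__main__': guard after opening the workbook. This pays off only if the calculation, rather than the cell writes, dominates the run time.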
