I'm teaching myself Python 3. To practice what I've learned, I'm writing a command-line backup program. I'm trying to save the default backup and save locations with the shelve module, but it seems to forget the variables I save whenever I close or restart the program.
Here is the main function that works with the shelves:
def WorkShelf(key, mode='get', variable=None):
    """Either stores a variable to the shelf or gets one from it.
    Possible modes are 'store' and 'get'"""
    config = shelve.open('Config')
    if mode.strip() == 'get':
        print(config[key])
        return config[key]
    elif mode.strip() == 'store':
        config[key] = variable
        print(key,'holds',variable)
    else:
        print("mode has not been reconginzed. Possible modes:\n\t- 'get'\n\t-'store'")
    config.close()
So, whenever I call this function to store variables and call the function just after that, it works perfectly. I tried to access the shelf manually and everything is there.
This is the code used to store the variables:
WorkShelf('BackUpPath','store', bupath)
WorkShelf('Path2BU', 'store', path)
The problem comes when I try to get my variables from the shelf after restarting the script. This code:
config = shelve.open('Config')
path = config['Path2BU']
bupath = config['BackUpPath']
Gives me this error:
Traceback (most recent call last):
File "C:\Python35-32\lib\shelve.py", line 111, in __getitem__
value = self.cache[key]
KeyError: 'Path2BU'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<pyshell#2>", line 1, in <module>
config['Path2BU']
File "C:\Python35-32\lib\shelve.py", line 113, in __getitem__
f = BytesIO(self.dict[key.encode(self.keyencoding)])
File "C:\Python35-32\lib\dbm\dumb.py", line 141, in __getitem__
pos, siz = self._index[key] # may raise KeyError
KeyError: b'Path2BU'
Basically, this is an error I could reproduce by calling ShelveObject['ThisKeyDoesNotExist'].
I am really lost right now. When I manually create a shelf, close it and access it again, it now seems to work (even though I got an error doing that before). I've read every post about this, I've considered shelf corruption (but it's unlikely to happen every time), and I've read my script from A to Z around 20 times now.
Thanks for any help (and I hope I asked my question the right way this time)!
EDIT
Okay so this is making me crazy. Isolating WorkShelf() does work perfectly, like so:
import shelve
def WorkShelf(key, mode='get', variable=None):
    """Either stores a variable to the shelf or gets one from it.
    Possible modes are 'store' and 'get'"""
    config = shelve.open('Config')
    if mode.strip() == 'get':
        print(config[key])
        return config[key]
    elif mode.strip() == 'store':
        config[key] = variable
        print(key,'holds',variable)
    else:
        print("mode has not been reconginzed. Possible modes:\n\t- 'get'\n\t-'store'")
    config.close()
if False:
    print('Enter path n1: ')
    path1 = input("> ")
    WorkShelf('Path1', 'store', path1)
    print ('Enter path n2: ')
    path2 = input("> ")
    WorkShelf('Path2', 'store', path2)
else:
    path1, path2 = WorkShelf('Path1'), WorkShelf('Path2')
print (path1, path2)
No problem, perfect.
But when I use the same function in my script, I get this output. It basically tells me it does write the variables to the shelve files ('This key holds this variable' message). I can even call them with the same code I use when restarting. But when calling them after a program reset it's all like 'What are you talking about m8? I never saved these'.
Welcome to this backup manager.
We will walk you around creating your backup and backup preferences
What directory would you like to backup?
Enter path here: W:\Users\Damien\Documents\Code\Code Pyth
Would you like to se all folders and files at that path? (enter YES|NO)
> n
Okay, let's proceed.
Where would you like to create your backup?
Enter path here: N:\
Something already exists there:
19 folders and 254 documents
Would you like to change your location?
> n
Would you like to save this destination (N:\) as your default backup location ? That way you don't have to type it again.
> y
BackUpPath holds N:\
If you're going to be backing the same data up we can save the files location W:\Users\Damien\Documents\Code\Code Pyth so you don't have to type all the paths again.
Would you like me to remember the backup file's location?
> y
Path2BU holds W:\Users\Damien\Documents\Code\Code Pyth
>>>
======== RESTART: W:\Users\Damien\Documents\Code\Code Pyth\Backup.py ========
Welcome to this backup manager.
Traceback (most recent call last):
File "C:\Python35-32\lib\shelve.py", line 111, in __getitem__
value = self.cache[key]
KeyError: 'Path2BU'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "W:\Users\Damien\Documents\Code\Code Pyth\Backup.py", line 198, in <module>
path, bupath = WorkShelf('Path2BU'), WorkShelf('BackUpPath')
File "W:\Users\Damien\Documents\Code\Code Pyth\Backup.py", line 165, in WorkShelf
print(config[key])
File "C:\Python35-32\lib\shelve.py", line 113, in __getitem__
f = BytesIO(self.dict[key.encode(self.keyencoding)])
File "C:\Python35-32\lib\dbm\dumb.py", line 141, in __getitem__
pos, siz = self._index[key] # may raise KeyError
KeyError: b'Path2BU'
Please help, I'm going crazy and might develop a class to store and get variables from text files.
Okay so I just discovered what went wrong.
I had an os.chdir() in my setup code. shelve.open('Config') uses a relative path, so it resolves against the current working directory. During setup the shelves were written to the directory os.chdir() pointed to, but after a restart the script looked for them in the starting directory instead.
Because shelve.open() creates a missing shelf by default, I ended up with new, empty shelves in the directory where I expected the real ones to be. That cost me a day of debugging.
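One way to avoid this class of problem is sketched below, assuming the shelf should always live in one fixed directory: build an absolute path once and pass it to shelve.open(), so a later os.chdir() cannot redirect it. The helper names (shelf_path, store, get) and the base_dir parameter are hypothetical and not part of the original script.

```python
import os
import shelve


def shelf_path(base_dir, name="Config"):
    """Return an absolute path for the shelf, independent of the current directory."""
    return os.path.join(os.path.abspath(base_dir), name)


def store(base_dir, key, value):
    """Save value under key in the shelf located in base_dir."""
    with shelve.open(shelf_path(base_dir)) as config:
        config[key] = value


def get(base_dir, key, default=None):
    """Read key from the shelf in base_dir, or return default if it is missing."""
    with shelve.open(shelf_path(base_dir)) as config:
        return config.get(key, default)
```

In a script, base_dir could for example be os.path.dirname(os.path.abspath(__file__)), so the shelf always sits next to the script. The with block also closes the shelf on every path, including an early return.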
That's all for today folks!
Here is my problem. We have an Excel-based report in which business users enter comments into two separate fields and select a code from a drop-down. We then have a manual process that collects those files and pushes the comments and codes to a Snowflake table for use in various reports.
I am trying to improve the process with a Python script that will collect the files, copy them to a staging_folder location, then read in the data from the sheet, append it all together, do some cleanup and push to Snowflake. The plan is that this would be completely automated - but this is where we run into issues.
Initial step works perfectly. I have a loop that grabs the files based on the previous business day date, copies them to a staging folder. There are typically 32 files each day.
Next step reads those files to append to a dataframe. Here is the function that is loading the Excel files in my Python script.
def load_files():
    file_list = glob.glob(file_path + r'\*')
    df = pd.DataFrame()
    print("Importing data to Pandas DF...")
    for file in file_list:
        try:
            wb = load_workbook(file)
            ws = wb["Daily Outs"]
            data = ws.values
            cols = next(data)[1:]
            data = list(data)
            idx = [r[0] for r in data]
            data = (islice(r, 1, None) for r in data)
            data_1 = pd.DataFrame(data, index=idx, columns=cols)
            df = df.append(data_1, sort=False)
            print(file + " Imported to Df...")
        except Exception as e:
            print("Error: " + e + " When attempting to open file: " + file)
            # error_notify(e)
    print(df.head(10))
    return df
The problem is when we have files that have some sort of corruption. The files when opened manually will show an error like the one below.
I thought with my try, except code above this would catch an error like this and alert me with the error_notify(e) function. However, we get a result where the Python script crashes with an error like this: zipfile.BadZipFile: File is not a zip file
During handling of the above exception, another exception occurred.
There is more to the error, but I only copied and pasted this part in some communication with folks in the office. It is impossible to replicate the error on our own. I have no idea how the files get corrupted in this way, except that multiple people access the files throughout the day.
The way to make the file readable is completely manual - we must open the file, get that error, hit yes, and save the file over the existing one. Then re-launch the script. But since the try, except isn't catching it and alerting us to the failure, we have to run the script manually to see if it works or not.
Two questions - am I doing something incorrect in my try, except command? I am admittedly weak in error catching so my first thought is there is more I can do there to make that work. Secondly, is there a Python way to get past that error in the Excel workbook files?
Here is the error text:
Traceback (most recent call last):
File "G:/Replenishment/Reporting/00 - I&A Replenishment/02 - Service Level/Daily Outs Comment Capture/Python/daily_outs_missed_files.py", line 48, in load_files
wb = load_workbook(file)
File "C:\ProgramData\Anaconda3\lib\site-packages\openpyxl\reader\excel.py", line 314, in load_workbook
data_only, keep_links)
File "C:\ProgramData\Anaconda3\lib\site-packages\openpyxl\reader\excel.py", line 124, in __init__
self.archive = _validate_archive(fn)
File "C:\ProgramData\Anaconda3\lib\site-packages\openpyxl\reader\excel.py", line 96, in _validate_archive
archive = ZipFile(filename, 'r')
File "C:\ProgramData\Anaconda3\lib\zipfile.py", line 1222, in __init__
self._RealGetContents()
File "C:\ProgramData\Anaconda3\lib\zipfile.py", line 1289, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "G:/Replenishment/Reporting/00 - I&A Replenishment/02 - Service Level/Daily Outs Comment Capture/Python/daily_outs_missed_files.py", line 123, in <module>
main()
File "G:/Replenishment/Reporting/00 - I&A Replenishment/02 - Service Level/Daily Outs Comment Capture/Python/daily_outs_missed_files.py", line 86, in main
df_output = df_clean()
File "G:/Replenishment/Reporting/00 - I&A Replenishment/02 - Service Level/Daily Outs Comment Capture/Python/daily_outs_missed_files.py", line 68, in df_clean
df = load_files()
File "G:/Replenishment/Reporting/00 - I&A Replenishment/02 - Service Level/Daily Outs Comment Capture/Python/daily_outs_missed_files.py", line 61, in load_files
print("Error: " + e + " When attempting to open file: " + file)
TypeError: can only concatenate str (not "BadZipFile") to str
Your try/except structure is correct. All user-defined exceptions in Python should be classes derived from Exception. See BaseException and Exception in the Python documentation:
"Exception (..) All user-defined exceptions should also be derived from this class". See also the exception class hierarchy tree at the end of that section of the Python docs.
If your Python script "crashes" despite an except Exception handler, then either a library raised an exception that is not derived from Exception (something that "should not" happen), or a new exception was raised inside the handler itself. The last frames of your Traceback show the second case: the print call in the except block raises a TypeError because it concatenates a str with the exception object e. More generally, you could look at the Traceback and try catching the offending exception type separately, or find which part of the source code and which library is the cause, fix it and submit a PR. Here are two examples of a bad and a good way of deriving your own exceptions:
class MyBadError(BaseException):
    """
    my bad exception, do not make yours that way
    """
    pass
instead of recommended
class MyGoodError(Exception):
    """
    exception based on the Exception
    """
    pass
Where and what exactly fails is still a bit of a mystery, but the problem with the exception in your Traceback is not new; see the zipfile.BadZipfile issue in the pandas discussion. Note that xlrd, which pandas uses to read Excel workbook data, is currently "no-maintainer-ware" according to its authors, and in case of any issues the recommendation is to use openpyxl instead or to fix the issues yourself (the pandas maintainers are doing a Pontius Pilate on that, but happily use xlrd as a dependency). I suggest you catch BadZipFile separately from all other exceptions, as a special known corruption error; see the Python error handling tutorial for example code (you have probably already seen it; this is for other readers). If that does not work, I can trace it in the source code of your libraries / Python modules to the exact offending section and find the culprit, if you reach out directly.
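A minimal sketch of catching the corruption error separately is shown here. It uses the standard-library zipfile module directly instead of openpyxl so that it runs on its own; the function name check_workbook_archive and the message wording are hypothetical. Note that the messages are built with f-strings, which format the exception object as text, whereas "Error: " + e raises the TypeError seen at the end of the Traceback in the question.

```python
import zipfile


def check_workbook_archive(path):
    """Return (ok, message) describing whether path opens as a zip archive.

    .xlsx files are zip archives, so a file that fails here is the kind of
    file that makes load_workbook raise BadZipFile.
    """
    try:
        with zipfile.ZipFile(path, "r") as archive:
            archive.namelist()
        return True, f"{path} opened"
    except zipfile.BadZipFile as exc:
        # Known corruption case: report it and let the caller skip the file.
        return False, f"Corrupt workbook {path}: {exc}"
    except Exception as exc:
        # Anything else; {exc!r} keeps the exception type in the message.
        return False, f"Unexpected error for {path}: {exc!r}"
```

In a loop over the staged files, a False result could be passed to a notification function and the loop continued with the next file, so one corrupt workbook does not stop the whole run.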
I recently made a program that uses an external document with pickle. But when it tries to load the file with pickle, I get this error (the file already exists, but it also fails when the file doesn't exist):
python3.6 check-graph_amazon.py
a
b
g
URL to follow www.amazon.com
Product to follow Pool_table
h
i
[' www.amazon.com', ' Pool_table', []]
p
Traceback (most recent call last):
File "check-graph_amazon.py", line 17, in <module>
tab_simple = pickle.load(doc_simple)
io.UnsupportedOperation: read
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "check-graph_amazon.py", line 42, in <module>
pickle.dump(tab_simple, 'simple_data.dat')
TypeError: file must have a 'write' attribute
Here is the code :
import pickle5 as pickle
#import os
try:
    print("a")
    with open('simple_data.dat', 'rb') as doc_simple:
        print("b")
        tab_simple = pickle.load(doc_simple)
        print("c")
        print(tab_simple)
        print("d")
        URL = tab_simple[0]
        produit_nom = tab_simple[1]
        tous_jours = tab_simple[2]
        print("f")
except :
    print("g")
    URL = str(input("URL to follow"))
    produit_nom = str(input("Product to follow"))
    with open('simple_data.dat', 'wb+') as doc_simple:
        print("h")
        #os.system('chmod +x simple_data.dat')
        tab_simple = []
        tab_simple.append(URL)
        tab_simple.append(produit_nom)
        tab_simple.append([])
        print(tab_simple)
        print("c'est le 2")
        print("p")
        pickle.dump(tab_simple, 'simple_data.dat')
        print("q")
The prints are here to show which lines are executed. The os.system is here to allow writing on the file but the error is persisting.
I don't understand why it's said that the document doesn't have a write attribute because I opened it in writing mode. And I neither understand the first error where it can't load the file.
If it can help you the goal of this script is to initialise the program, with a try. It tries to open the document in reading mode in the try part and then set variables. If the document doesn't exist (because the program is lauched for the first time) it goes in the except part and create the document, before writing informations on it.
I hope you will have any clue, including changing the architecture of the code if you have a better way to make an initialisation for the 1st time the program is launched.
Thanks you in advance and sorry if the code isn't well formated, I'm a beginner with this website.
Quote from the docs for pickle.dump:
pickle.dump(obj, file, protocol=None, *, fix_imports=True)
Write a pickled representation of obj to the open file object file. ...
...
The file argument must have a write() method that accepts a single bytes argument. It can thus be an on-disk file opened for binary writing, an io.BytesIO instance, or any other custom object that meets this interface.
So, you should pass to this function a file object, not a file name, like this:
with open("simple_data.dat", "wb") as File:
    pickle.dump(tab_simple, File)
Yeah, in your case the file has already been opened, so you should write to doc_simple.
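For the first-run initialisation you asked about, one possible structure is sketched below, using the standard pickle module. The function name load_or_create and the make_default callback are hypothetical. Catching only FileNotFoundError, instead of a bare except, means that other errors stay visible rather than silently triggering the "first run" branch.

```python
import pickle
from pathlib import Path


def load_or_create(path, make_default):
    """Load pickled data from path; on first run, create it from make_default()."""
    path = Path(path)
    try:
        with path.open("rb") as fh:
            return pickle.load(fh)
    except FileNotFoundError:
        # First launch: build the initial data and write it to the file object.
        data = make_default()
        with path.open("wb") as fh:
            pickle.dump(data, fh)
        return data
```

In your script, make_default could be a small function that asks for the URL and product name with input() and returns the list [URL, produit_nom, []].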
I want to search a Perforce depot for files.
I do this from a python script and use the p4python library command:
list = p4.run("files", "//mypath/myfolder/*")
This works fine as long as myfolder contains some files. I get a python list as a return value. But when there is no file in myfolder the program stops running and no error message is displayed. My goal is to get an empty python list, so that I can see that this folder doesn't contain any files.
Does anybody have any ideas? I could not find information in the p4 files documentation or on StackOverflow.
I'm going to guess you've got an exception handler around that command execution that's eating the exception and exiting. I wrote a very simple test script and got this:
C:\Perforce\test>C:\users\samwise\AppData\local\programs\python\Python36-32\python files.py
Traceback (most recent call last):
File "files.py", line 6, in <module>
print(p4.run("files", "//depot/no such path/*"))
File "C:\users\samwise\AppData\local\programs\python\Python36-32\lib\site-packages\P4.py", line 611, in run
raise e
File "C:\users\samwise\AppData\local\programs\python\Python36-32\lib\site-packages\P4.py", line 605, in run
result = P4API.P4Adapter.run(self, *flatArgs)
P4.P4Exception: [P4#run] Errors during command execution( "p4 files //depot/no such path/*" )
[Error]: "//depot/no such path/* - must refer to client 'Samwise-dvcs-1509687817'."
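If the goal is an empty list rather than an exception, one option is to catch the exception at the call site. The sketch below is library-independent so that it runs without a Perforce server: run_or_empty, FakeP4Error and fake_run are all hypothetical stand-ins. With P4Python you would presumably pass p4.run and the P4Exception class shown in the traceback above; whether an empty folder actually raises depends on your connection's settings, which is an assumption here.

```python
def run_or_empty(run, *args, error_types=(Exception,)):
    """Call run(*args) and return its result, or [] if it raises one of error_types."""
    try:
        return run(*args)
    except error_types:
        return []


class FakeP4Error(Exception):
    """Stand-in for the library's exception type (hypothetical)."""


def fake_run(command, path):
    """Stand-in for p4.run: raises for an 'empty' folder, else returns one entry."""
    if "empty" in path:
        raise FakeP4Error("no such file(s)")
    return [{"depotFile": path.replace("*", "a.txt")}]


found = run_or_empty(fake_run, "files", "//mypath/myfolder/*", error_types=(FakeP4Error,))
missing = run_or_empty(fake_run, "files", "//mypath/empty/*", error_types=(FakeP4Error,))
```

Keeping error_types narrow matters: catching every exception would also hide real problems such as a wrong client or a lost connection.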
Try something like this ?
import os
if len(os.listdir('//mypath/myfolder/') ) == 0: # Do not execute p4.run if directory is empty
    list = []
else:
    list = p4.run("files", "//mypath/myfolder/*")
This is the code snippet causing the problem:
if str(sys.argv[2]) + '.pickle' in os.listdir(os.curdir): #os.path.isfile(str(sys.argv[2]) + '.pickle'):
    path = sys.argv[2] + '.pickle'
    #print path
    instance = cPickle.load(open(str(path)))
This is the traceback:
Traceback (most recent call last):
File "parent_cls.py", line 92, in <module>
instance = cPickle.load(open(str(path)))
EOFError
If this keeps happening because file.close() is not being called or because of some other silly mistake, please let me know if there is a way to access the pickle file using subprocess. Thanks.
UPDATE: Another thing I noticed. The if condition that checks whether filename.pickle exists actually seems to create filename.pickle, even though it wasn't there before.
I don't want to create it, only to check that it exists. Is this a separate problem?
Open it in binary mode :
open(str(path), 'rb')
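A short sketch combining the existence check with binary mode follows, written for Python 3 with the standard pickle module (the question uses Python 2's cPickle). The function name load_pickle_if_present is hypothetical. It also treats a zero-byte file as absent, because pickle.load raises EOFError on an empty file, which is what happens if some other code has already opened the file for writing and created it empty.

```python
import os
import pickle


def load_pickle_if_present(path):
    """Return the unpickled object, or None if path is missing or empty."""
    # os.path.isfile only checks for the file; it never creates it.
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return None
    with open(path, "rb") as fh:  # binary mode; the file is closed on exit
        return pickle.load(fh)
```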
I have been stuck on this error for a couple of hours now and am not sure what is wrong. Below are the error and the relevant piece of code.
NameError: global name 'GetText' is not defined
class BaseScreen(object):
    def GetTextFromScreen(self, x, y, a, b, noofrows = 0):
        count = 0
        message = ""
        while (count < noofrows):
            line = Region(self.screen.x + x, self.screen.y + y + (count * 20), a, b)
            message = message + "\n" + line.text()
            count += 1
        return message
class HomeScreen(BaseScreen):
    def GetSearchResults(self):
        if self.screen.exists("Noitemsfound.png"):
            return 'No Items Found'
        else:
            return self.GetTextFromScreen(36, 274, 680, 20, 16)
class HomeTests(unittest.TestCase):
    def test_001S(self):
        Home = HomeScreen()
        Home.ResetSearchCriteria()
        Home.Search("0009", "Key")
        self.assertTrue("0009" in Home.GetSearchResults(), "Key was not returned")
The BaseScreen class has all the reusable methods applicable across different screens.
HomeScreen inherits from BaseScreen.
In the HomeTests test case class, the last step calls Home.GetSearchResults(), which in turn calls a base class method and raises the error.
Note:
I have another screen class and test case class doing the same thing, and they work without issues.
I have checked all the import statements and they are OK.
'GetText' in the error message was the method's original name, which I later changed to GetTextFromScreen.
Error message is still pointing to a line 88 in code which is not there any more. Module import/reloading issue?
Try clearing out your *.pyc files (or __pycache__ if using 3+).
You asked:
Error message is still pointing to a line 88 in code which is not there any more. Module import/reloading issue?
Yes. The traceback (error messages) will show the current (newest saved) file, even if you haven't run it yet. You must reload/reimport to get the new file.
The discrepancy comes from the fact that traceback printouts read from the script file (scriptname.py) saved on your drive. However, the program is run either from the module saved in memory, or sometimes from the .pyc file. If you fix an error by changing your script, and save it to your drive, then the same error will still occur if you don't reload it.
If you're running interactively for testing, you can use the reload function (a builtin in Python 2; importlib.reload in Python 3):
>>> import mymodule
>>> mymodule.somefunction()
Traceback (most recent call last):
File "mymodule.py", line 3, in somefunction
Here is a broken line
OhNoError: Problem with your file
Now you fix the error, save mymodule.py and return to your interactive session. You still get the error, but the traceback shows the fixed line:
>>> mymodule.somefunction()
Traceback (most recent call last):
File "mymodule.py", line 3, in somefunction
Here is the fixed line
OhNoError: Problem with your file
So you have to reload the module:
>>> reload(mymodule)
<module 'mymodule' from '/path/to/mymodule.py'>
>>> mymodule.somefunction()
Success!
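The same effect can be reproduced in a self-contained Python 3 sketch using importlib.reload. The module name demo_reload_mod and the function demo_reload are hypothetical; the script writes a throwaway module to a temporary directory, edits it on disk, and shows that the running program keeps the old code until the module is reloaded.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_reload():
    """Return answer() results: before the edit, after the edit, after reload."""
    previous_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # no .pyc, so the rewritten source is re-read
    try:
        with tempfile.TemporaryDirectory() as tmp:
            source = Path(tmp) / "demo_reload_mod.py"
            source.write_text("def answer():\n    return 'old'\n")
            sys.path.insert(0, tmp)
            try:
                importlib.invalidate_caches()
                module = importlib.import_module("demo_reload_mod")
                before = module.answer()
                # Edit the file on disk; the module in memory is unchanged.
                source.write_text("def answer():\n    return 'new'\n")
                stale = module.answer()
                module = importlib.reload(module)
                fresh = module.answer()
            finally:
                sys.path.remove(tmp)
                sys.modules.pop("demo_reload_mod", None)
    finally:
        sys.dont_write_bytecode = previous_flag
    return before, stale, fresh


results = demo_reload()
```

Only the third value reflects the edited file; the second call still runs the code that was loaded before the edit.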