Beginner Python: Saving an excel file while it is open - python

I have a simple problem that I hope will have a simple solution.
I am writing Python (2.7) code using the xlwt package to write Excel files. The program takes data and writes it out to a file that is saved constantly. The problem is that whenever I have the file open to check the data and Python tries to save it, the program crashes.
Is there any way to make Python save the file while I have it open for reading?

My experience is that sashkello is correct: Excel locks the file. Even OpenOffice/LibreOffice do this. They lock the file on disk and create a temp version as a working copy. ANY program trying to access the open file will be denied by the OS. The reason is that many corporations treat Excel files as databases, but the users have no understanding of the issues involved in concurrency and synchronisation.
I am on Linux and I get this behaviour (at least when the file is on a Samba share). Look in the same directory as your file: if a file called .~lock.[filename]# exists, you will be unable to read your file from another program. I'm not sure what enforces this lock, but I suspect it's an NTFS attribute. Note that even a simple cp or cat fails: cp: error reading ‘CATALOGUE.ods’: Input/output error
UPDATE: The actual locking mechanism appears to be 'oplocks', a concept connected to Windows shares: http://oreilly.com/openbook/samba/book/ch05_05.html . If the share is managed by Samba, the workaround is to disable oplocks on certain file types, e.g.:
veto oplock files = /*.xlsx/
If you aren't using a share or NTFS on Linux, then I guess you should be able to read and write the file as long as your script has write permissions. By default only the user who created the file has write access.
WORKAROUND 2: The restriction only seems to apply if you have the file open in Excel/LO as writable; however, LO at least allows you to open a file as read-only (go to File -> Properties -> Security, set Read-Only, save and re-open the file). I don't know if this will also make it read-only for xlwt, though.
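The lock-file check described in this answer can be scripted, so that the saving program can postpone a save instead of crashing. This is only a minimal sketch: it assumes the .~lock.[filename]# naming mentioned above, the function names are made up for illustration, and Excel itself may use a different marker.

```python
import os


def libreoffice_lock_path(path):
    """Return the path of the lock file LibreOffice is assumed to create."""
    directory, name = os.path.split(path)
    return os.path.join(directory, ".~lock.{}#".format(name))


def is_open_in_libreoffice(path):
    """Report whether the assumed lock file currently exists next to `path`."""
    return os.path.exists(libreoffice_lock_path(path))
```

A saving loop could call is_open_in_libreoffice(filename) before each save and skip that round when it returns True.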

Hah, funny I ran across your post. I actually just implemented this tonight.
The issue is that with these libraries a workbook object is either for reading or for writing, not both. You cannot read and write through the same object. So if you have another way to save your data, please use it. I'm in a position where I don't have that option, and you might be too.
You're going to need xlutils; it's the bread and butter of this.
Here's some example code:
import xlrd
from xlutils.copy import copy
wb_filename = 'example.xls'
# xlrd gives you a read-only view of the workbook.
wb_object = xlrd.open_workbook(wb_filename)
# You can read from wb_object to your heart's content.
# To write, copy it into a writable workbook and work on the copy.
write_object = copy(wb_object)
# Write to write_object all you want, then save it, e.g. write_object.save(wb_filename).
And that's it. Note that if you read the object, write to the copy, and then read the original object again, it won't be updated. You either need to recreate wb_object or keep some sort of table in memory to track your changes as you work.

Related

Python - can't find file specified & permission denied on file operation

Hi, I am checking to see whether an Excel file has been modified and, if it has, saving it under another name and opening it. It works the first time around, but the second time I modify the file I get this error: The system cannot find the file specified: 'C:\example.xlsx'
Sometimes it would also throw: Permission denied: 'C:\Todolist2.xlsx'
Please help. Newbie here. Thank You
import time, os.path, os, openpyxl
from openpyxl import Workbook
# Remember the modification time seen at startup.
currentFD = os.stat("C:\\example.xlsx").st_mtime
while True:
    # Compare modification times rather than whole os.stat results.
    modDate = os.stat("C:\\example.xlsx").st_mtime
    if (modDate > currentFD):
        print('yes it does')
        wb=openpyxl.load_workbook("C:\\example.xlsx")
        wb.save("C:\\Todolist2.xlsx")
        os.startfile("C:\\Todolist2.xlsx")
        currentFD = modDate
You seem to have two different problems here but they may be related.
Since you gave a traceback for the Permission Denied error on C:\Todolist2.xlsx, let's look at that one.
On Windows, many programs put a lock on a file when they open it. This is especially true for application programs like Excel and Notepad. If one program has a file locked, any other program that tries to open that file to overwrite it will get a permissions error.
And that's exactly what you're seeing: the first time through, your code overwrites Todolist2.xlsx, then uses startfile to tell Excel (or whatever application is registered for Excel files) to open it, which works. Then it tries to overwrite the same file, which Excel presumably still has open and locked, and that fails.
Depending on what you're actually trying to do here, there are a few possible workarounds:
Copy Todolist2.xlsx to a temporary file, then start that temporary.
Create new files Todolist2.xlsx, Todolist2-1.xlsx, etc., and keep opening them.
Use either COM automation or a GUI scripting framework like autogui to make Excel open a copy of the file rather than opening the file itself.
Use either of the above to make Excel close the file before overwriting the file and launching it.
Launch a new Excel instance to open the file using subprocess.Popen, so you can kill it and launch a new one.
Rewrite your whole code to build the spreadsheet using Excel COM automation, rather than by building a file to pass to it.
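The first workaround in the list (copy the file to a temporary file and open the copy) might look roughly like this. It is a sketch under assumptions: make_viewing_copy is a hypothetical helper name, and it relies only on the standard tempfile and shutil modules.

```python
import os
import shutil
import tempfile


def make_viewing_copy(source_path):
    """Copy source_path to a new, uniquely named temp file and return its path."""
    suffix = os.path.splitext(source_path)[1]  # keep ".xlsx" so Excel is chosen
    fd, copy_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # mkstemp hands back an open descriptor; close it first
    shutil.copyfile(source_path, copy_path)
    return copy_path
```

On Windows the loop could then call os.startfile(make_viewing_copy("C:\\Todolist2.xlsx")), so Excel only ever locks the throwaway copy and the script stays free to overwrite the original. The temporary copies are not cleaned up here.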

Attribute system similar to HTTP Headers for local files

I am in the process of writing a program and need some guidance. Essentially, I am trying to determine whether a file has some marker or flag attached to it, somewhat like the fields of an HTTP header.
If such a marker exists, that file will be manipulated in some way (moved to another directory).
My question is:
Where exactly should I be storing this flag/marker? Do files have a system similar to HTTP headers? I don't want to access or manipulate the contents of the file, just some kind of property of the file that can be edited without corrupting the actual file, and it must be fairly universal across file types, as the set of file types I may encounter is unbounded. I have some experience with Web APIs, so I am familiar with HTTP headers and JSON. Does any similar system exist for local files in Windows? I am especially interested in hearing from anyone who has professional/industry knowledge of common techniques that programmers use to store metadata for files in order to access it later. Or if anyone can point me in the right direction, as I am unsure what I should be researching.
For the record, I am going to write a program for Windows, probably using Golang or Python. The files I am going to manipulate will potentially be all the common types (.docx, .txt, .pdf, etc.).
The metadata you wish to add is best kept in a separate file or database covering all files.
Alternatively, keep it in another file with the same name and a different extension or prefix, which you can make hidden.
Relying on the file system is very tricky: your data will be bound by the restrictions and capabilities of whatever file system the file is stored on.
You also cannot count on your data remaining intact, as any application may change these flags.
And some of them have a very specific, clearly defined use, such as creation time, modification time, access time...
If you only need to flag the document, you could abuse the creation time, which stays unchanged throughout the life of the document (until it is copied), to store your flags. :D
Very dirty business, unprofessional, unreliable and all that.
But it's a solution. A poor one, but it exists.
As far as I know, neither FAT32 nor NTFS supports any extra flag bits beyond those already used by the OS.
The ext family of file systems on Unix-like systems does support some extra bits. Even then you should be careful in case some other important application uses them for something.
Mac OS may support some metadata by itself, but I am not 100% sure.
On Windows you have one more option for associating extra data with a file, but I wouldn't use that either.
The NTFS file system (FAT doesn't support this) has a feature called streams.
In essence, the same file can have multiple data streams under it, i.e. more than one set of contents under the same file node.
To put it more plainly: the same file contains two different files.
When you open the file normally, only the main stream is visible to the application. Applications must check whether other streams are present and choose the one they want to read.
So you could store the metadata in a second stream of the file.
But what if another application is already using the stream you picked?
Moreover, anti-virus programs may block your access to the metadata out of paranoia, or at least ask for permission.
I don't know why MS included that option, probably for file duplication or something, but malicious hackers made use of the fact that you can store data under an existing regular file without anybody being aware of it.
Imagine a virus writing a copy of itself into another stream of one of the programs already there.
All it then needs in order to start instead of your old program the next time you run it is a batch script, added to the task scheduler, that swaps the two streams and makes the virus data the main one.
Nasty trick! So when this feature started to be abused, anti-virus software started restricting files with multiple streams, so it's almost as if the feature doesn't exist.
If you want to add some metadata using the OS's own technology, use the Windows registry,
but even that is unwise.
What to tell you?
Don't add metadata to the files themselves. Keep a separate file, or index your data in special files with the same name as the file you are referring to, in the same folder.
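The "special file with the same name in the same folder" idea could be sketched as a JSON sidecar file. The .meta.json suffix and the function names are assumptions made up for this illustration, not an established convention.

```python
import json

SIDECAR_SUFFIX = ".meta.json"  # hypothetical naming convention


def sidecar_path(path):
    """Return the path of the sidecar file that sits next to `path`."""
    return path + SIDECAR_SUFFIX


def write_flags(path, flags):
    """Store `flags` (a JSON-serialisable dict) without touching `path` itself."""
    with open(sidecar_path(path), "w") as handle:
        json.dump(flags, handle)


def read_flags(path):
    """Return the stored flags, or an empty dict if no sidecar file exists."""
    try:
        with open(sidecar_path(path)) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {}
```

A mover script could then call read_flags(path).get("move") for each candidate file. The sidecar has to be moved or deleted together with the document, which is the main cost of this approach.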
If you are dealing with binary files like docx and pdf, you're best off storing the metadata in separate files or in an SQLite file.
Metadata is usually stored separately from files, in data structures called inodes (at least on Unix systems; Windows probably has something similar). But you probably don't want to go that deep down the rabbit hole.
If your goal is to query the system based on metadata, then it would be easier and more efficient to use something like SQLite. Having the metadata in the file would mean that you would need to open the file, read it into memory from disk, and then check the metadata, i.e. slower queries.
If you don't need to query based on metadata, then storing metadata in the file might make sense. It would reduce the dependencies in your application, but in order to access the contents of the file through Word or Adobe Reader, you'd need to strip the metadata before handing it off to the application. Not worth the hassle, usually.
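For the SQLite route, a small sketch using Python's standard sqlite3 module is shown here. The table name file_flags, its columns, and the helper names are invented for the example.

```python
import sqlite3


def open_store(db_path=":memory:"):
    """Open (or create) the metadata database and make sure the table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_flags ("
        "path TEXT NOT NULL, flag TEXT NOT NULL, PRIMARY KEY (path, flag))"
    )
    return conn


def set_flag(conn, path, flag):
    """Attach `flag` to `path`; setting the same flag twice is harmless."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO file_flags (path, flag) VALUES (?, ?)",
            (path, flag),
        )


def files_with_flag(conn, flag):
    """Return all paths carrying `flag`, sorted alphabetically."""
    rows = conn.execute(
        "SELECT path FROM file_flags WHERE flag = ? ORDER BY path", (flag,)
    )
    return [row[0] for row in rows]
```

Querying by flag never opens the documents themselves, which is the efficiency point made above. The stored paths go stale if files are renamed outside the program.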

Multiple Editing of the same file

What would I have to do to allow multiple programs/users to read/write to the same file ?
Use Case
I have a CSV file and I want to enable multiple users to edit it more or less in real time. I want to be able to write and read the small changes in the file, but I also want to be able to refresh the data loaded in my program in the event that the entire file is replaced by some careless soul.
Background
I have seen that certain programs will refresh a file if the time stamp is changed or the file is overwritten by another program/user. (I've used this myself when editing a file in two different editors leveraging their different features).
Home Work
I would imagine this requires my application to duplicate the original file when it is initially opened. That way any updates to the original can be diffed against the copy to get the modifications to the current data. Then, when the temporary file is updated, the primary file can be rewritten. Each user/program could then reload the updated files themselves. Is this a sensible approach/best practice, or are there better means to this end?
Alternatively, from what I understand, one could cache the file.
Is it better to block/lock the file? Must I be wary of race conditions?
Environment
I plan to do this in Python. I would also like this to be platform independent e.g. linux, windows and mac (expensive linux).
Related
It seems these are related here, here and here.
If the edit rate is low, you can pull it off with a CSV file, but only by locking the entire file so that users don't overwrite each other's edits. If the file cannot stay locked until the edit is applied, you will be better off using a database, where specific records are locked instead of the entire file.
When a user opens the file, you actually serve a copy of it, file_userid-1.csv, and let them edit that one so that users don't overwrite each other's work. When the user saves, you overwrite the original. In between you keep a hook to see whether the original was modified while the current user was also modifying their copy. If the original file was modified, you do a diff or something; I don't know.
I think what you need is a tiny replica of how svn or git works.
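The copy-per-user scheme with a whole-file lock could be sketched as follows. Everything here is an assumption for illustration: the helper names, the .lock suffix, and the use of the modification time as the "was it changed meanwhile" hook. A crashed process would leave a stale lock file behind, and the timestamp check can miss changes made within the file system's timestamp resolution.

```python
import os
import shutil
from contextlib import contextmanager


@contextmanager
def file_lock(path):
    """Hold an exclusive lock by creating path + '.lock'; raise if it exists."""
    lock_path = path + ".lock"
    # O_CREAT | O_EXCL makes creation atomic: it fails if the lock file exists.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


def check_out(original, user_id):
    """Give a user a private working copy and remember the original's mtime."""
    root, ext = os.path.splitext(original)
    working = "{}_{}{}".format(root, user_id, ext)
    shutil.copy2(original, working)
    return working, os.stat(original).st_mtime_ns


def check_in(original, working, mtime_at_checkout):
    """Overwrite the original with the working copy unless it changed meanwhile."""
    with file_lock(original):
        if os.stat(original).st_mtime_ns != mtime_at_checkout:
            return False  # someone else saved first; caller must merge or reload
        shutil.copyfile(working, original)
        return True
```

When check_in returns False, the program would reload the original and diff it against the user's working copy, which is the part a version control system such as svn or git automates.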

How to open Excel instance in python on MAC?

I think this question has been asked before, but it's not clear: in the original question the user provided excel.exe, which is a Windows executable and not for Mac.
I need to open a new Excel instance in Python on a Mac.
Which module should I import?
I'm a newbie. I have finished learning the Python language, but I have trouble understanding documentation.
If all you need to do is launch Excel, the best way is to use LaunchServices.
If you have PyObjC (which you do if you're using the Python that Apple pre-installs on 10.6 and later; otherwise, you may have to install it):
from AppKit import NSWorkspace  # NSWorkspace is part of AppKit, not Foundation
ws = NSWorkspace.sharedWorkspace()
ws.launchApplication_('Microsoft Excel')
If not, you can always use the open tool:
import subprocess
subprocess.check_call(['open', '-a', 'Microsoft Excel'])
Either way, you're effectively launching Excel the same way as if the user double-clicked the app icon in Finder.
If you want to make Excel do something simple like open a specific document, that's not much harder. Look at the NSWorkspace or open documentation to see how to do whatever you want.
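Opening a specific document with the open tool only takes one more argument. A small sketch, with hypothetical helper names; the command list is built separately so it can be inspected without launching anything.

```python
import subprocess


def excel_open_command(document=None, app_name="Microsoft Excel"):
    """Build the argument list for the macOS `open` tool."""
    command = ["open", "-a", app_name]
    if document is not None:
        command.append(document)  # `open -a App file` opens file in that app
    return command


def launch_excel(document=None):
    """Launch Excel (macOS only), optionally opening `document`."""
    subprocess.check_call(excel_open_command(document))
```

For example, launch_excel("/Users/me/report.xlsx") would run open -a 'Microsoft Excel' /Users/me/report.xlsx; the path is a placeholder.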
If you actually want to control Excel—e.g., open a document, make some changes, and save it—you'll want to use its AppleScript interface.
Apple's recommended way of doing that is via ScriptingBridge, or using a dual-language approach (write AppleScripts and execute them via NSAppleScript—which, in Python, you do through PyObjC). However, I'd probably use appscript (get the code from here). Despite the fact that it's been abandoned by its original creator, and is only being sparsely maintained, and will probably eventually stop working with some future OS X version, it's still much better than the official solutions.
Here's a sample (untested, because I don't have Excel here):
import appscript
excel = appscript.app('Microsoft Excel')
excel.workbooks[1].column[2].row[2].formula.set('=A2+1')
From the comments it is not completely clear if you need to 'update' an Excel file with data, and just assume that you need Excel to do so, or that you need to change some excel files to include new data.
It is usually much easier, and certainly faster (with respect to execution speed), to 'update' an Excel file without starting Excel. However, updating is not quite the right word: you have to read in the file and write it out anew. You can of course overwrite the original file, so it looks like an update.
For 'updating' you can use the trio xlrd, xlwt, xlutils if the files you work with are .xls files (Excel 2003). IIRC xlwt does not support .xlsx for writing (but xlrd versions before 2.0 can read those files).
For .xlsx files I use openpyxl.
Both are good enough for writing things like data, formula and basic formatting.
If you have existing Excel files which you use as 'templates' with information that would get lost if you read/write using one of the above packages, then you have to go with updating the file in Excel. I had to do so because I had no easy way to include Visual Basic macros and very specific formatting specified by a client. And sometimes it is just easier to visually setup a spreadsheet and then just fill the cells programmatically. But this was all done on Windows.
If you really have to drive Excel on a Mac because you need to use existing files as templates, I suggest you look at AppleScript. Or, if it is an option, look at the OpenOffice/LibreOffice PyUNO interface.

How can I load byte strings into a file and put them in another location (python)

Sorry if the title doesn't describe what I need to do, but I need to make a program or script that goes to the 'Pictures' folder on a Windows system, grabs a picture (I assume using a byte string), stores it in a file (pickle or...), and loads the file into another folder...
Long story short, I have a program that would be complete if I could add a function that can be run on a computer (mine or anyone's with my program installed), goes to their 'Pictures' folder on a Windows OS, takes a picture file and stores it in a transportable file (pickle), and then takes that file and unloads (unpickles) it on my or another computer, preferably using the same function.
As I mentioned in my earlier comment you could write something that open()ed both the source image file and a destination file in binary mode, and then use the file read() and write() methods to copy the bytes from one file to the other.
However, that's a somewhat low-level approach and would be reinventing the wheel.
A better alternative would be to just use one of the existing copyfile...() or even higher-level copy...() file copying functions in the shutil module, which you can read about here.
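A minimal sketch of the shutil approach, assuming a hypothetical helper named copy_picture. shutil.copy2 copies the bytes and, where possible, the file's timestamps, so no pickling is needed.

```python
import os
import shutil


def copy_picture(source_path, dest_dir):
    """Copy one image file into dest_dir and return the path of the new copy."""
    os.makedirs(dest_dir, exist_ok=True)  # create the target folder if missing
    return shutil.copy2(source_path, dest_dir)
```

On Windows the source would typically be a file under os.path.expanduser("~/Pictures"), though that location is an assumption and can be redirected by the user.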
