Python 2.7.2 shelve fails on OSX - python

Using the shelve standard lib on Python 2.7.2, I wrote an extremely simple test to create a persistent data file and then immediately open it for printing:
import os
import shelve
shelf_filename = str(__file__.split('.')[0] + '.dat')
#delete the shelf file if it exists already.
try:
os.remove(shelf_filename)
print "DELETED LEFTOVER SHELF FILE", shelf_filename
except OSError:
pass
#create a new shelf, write some data, and flush it to disk
shelf_handle = shelve.open(shelf_filename)
print "OPENED", shelf_filename, "WITH SHELF"
shelf_handle['foo'] = 'bar'
shelf_handle.close()
print "FLUSHED AND CLOSED THE SHELF"
#re-open the shelf we just wrote, read/print the data, and close it
shelf_handle = shelve.open(shelf_filename)
print "RE-OPENED", shelf_filename, "WITH SHELF"
print 'foo:', shelf_handle.get('foo')
shelf_handle.close()
#delete the shelf file
os.remove(shelf_filename)
However this script fails unexpectedly when it tries to re-open the shelf it just created:
DELETED LEFTOVER SHELF FILE shelve_test.dat
OPENED shelve_test.dat WITH SHELF
FLUSHED AND CLOSED THE SHELF
Traceback (most recent call last):
File "shelve_test.py", line 21, in <module>
shelf_handle = shelve.open(shelf_filename)
File ".../shelve.py", line 239, in open
return DbfilenameShelf(filename, flag, protocol, writeback)
File ".../shelve.py", line 223, in __init__
Shelf.__init__(self, anydbm.open(filename, flag), protocol, writeback)
File ".../anydbm.py", line 82, in open
raise error, "db type could not be determined"
anydbm.error: db type could not be determined
This is about the most basic possible usage of shelve so I really don't understand what I'm missing, and AFAIK I'm following the docs to the letter. Can somebody with a little shelve experience please tell me what's going on?
UPDATE: This script apparently works in certain platforms and not others, which is really kind of surprising for a Python standard lib. Definitely confirmed this does NOT work on:
Python 2.7.2 (default, May 15 2013, 13:46:05)
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] on darwin

This version of Python shelve is using a deprecated dbm package, and the issue can be solved by specifying a different dbm as follows:
import anydbm
anydbm._defaultmod = __import__('dumbdbm')
If those two lines are added to the script above then everything works as expected. This really needs to be fixed in shelve.

shelve doesn't expect you to specify the extension on the filename. Instead it gives the file an extension based on which db was used to create the file.
So when you call shelve.open("shelve_test.dat"), you are creating a file that shelve expects to be created in using dumbdbm, but is instead created with a different (better) database manager.
Instead, you should be calling shelve.open("shelve_test"), that way shelve can choose to save the file using the correct extension. This also applies when opening an already created file.

Related

sqlanydb windows could not load dbcapi

I am trying to connect to a SQL Anywhere database via python. I have created the DSN and I can use command prompt to connect to the database using dbisql - c "DNS=myDSN". When I try to connect through python using con = sqlanydb.connect(DSN= "myDSN")
I get
`Traceback (most recent call last):
File "<pyshell#5>", line 1, in <module>
con = sqlanydb.connect(DSN= "RPS Integration")
File "C:\Python27\lib\site-packages\sqlanydb.py", line 522, in connect
return Connection(args, kwargs)
File "C:\Python27\lib\site-packages\sqlanydb.py", line 538, in __init__
parent = Connection.cls_parent = Root("PYTHON")
File "C:\Python27\lib\site-packages\sqlanydb.py", line 464, in __init__
'libdbcapi_r.dylib')
File "C:\Python27\lib\site-packages\sqlanydb.py", line 456, in load_library
raise InterfaceError("Could not load dbcapi. Tried: " + ','.join(map(str, names)))
InterfaceError: (u'Could not load dbcapi. Tried: None,dbcapi.dll,libdbcapi_r.so,libdbcapi_r.dylib', 0)`
I was able to resolve the problem. I was never able to use the sqlanydb.connect. I ended up using pyodbc. So my final connection string was con = pydobc.connect(dsn="myDSN"). I think that sqlanydb is only fully functional with sqlanydb 17 and I was using a previous version.
Depending on how Sybase is locally installed, it could also be that the Python module really cannot find the (correct) library when looking up dbcapi.dll in your PATH. A quick fix for that is to create a new SQLANY_API_DLL environment variable (that's the initial None in the names the error message says it tried using; it takes priority over all the others) containing the correct path, usually something like %SQLANY16%\Bin64\dbcapi.dll depending on what version you have installed (Sybase usually creates an environment variable pointing to its installation folder on a per version basis).
One way to encounter the backtrace shown by the OP is running 64-bit Python interpreter, which is unable to load the 32-bit dbcapi.dll. Try launching with "py -X-32" to force a 32-bit engine, or reinstall using a 32-bit python engine.
Unfortunately sqlanydb's code that tries to be smart about finding dbcapi also ends up swallowing the exceptions thrown during loading. The sqlanydb author probably assumed that failure to load implies file not found, which isn't always the case.
I was stoked in this same problem, both windows system32 and 64.
I'm using Python 3.9 and I found that since Python 3.8 the cdll.LoadLibrary function does no longer search PATH to find library files.
To fix this, I created the enviroment variable SQLANY_API_DLL pointing to the route of sqlanywhere installation, in my case version 17, bin32 or bin64 depending on your system.

Use python 2 shelf in python 3

I have data stored in a shelf file created with python 2.7
When I try to access the file from python 3.4, I get an error:
>>> import shelve
>>> population=shelve.open('shelved.shelf')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python34\lib\shelve.py", line 239, in open
return DbfilenameShelf(filename, flag, protocol, writeback)
File "C:\Python34\lib\shelve.py", line 223, in __init__
Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
File "C:\Python34\lib\dbm\__init__.py", line 88, in open
raise error[0]("db type could not be determined")
dbm.error: db type could not be determined
I'm still able to access the shelf with no problem in python 2.7, so there seems to be a backward-compatibility issue. Is there any way to directly access the old format with the new python version?
As I understand now, here is the path that lead to my problem:
The original shelf was created with Python 2 in Windows
Python 2 Windows defaults to bsddb as the underlying database for shelving, since dbm is not available on the Windows platform
Python 3 does not ship with bsddb. The underlying database is dumbdbm in Python 3 for Windows.
I at first looked into installing a third party bsddb module for Python 3, but it quickly started to turn into a hassle. It then seemed that it would be a recurring hassle any time I need to use the same shelf file on a new machine. So I decided to convert the file from bsddb to dumbdbm, which both my python 2 and python 3 installations can read.
I ran the following in Python 2, which is the version that contains both bsddb and dumbdbm:
import shelve
import dumbdbm
def dumbdbm_shelve(filename,flag="c"):
return shelve.Shelf(dumbdbm.open(filename,flag))
out_shelf=dumbdbm_shelve("shelved.dumbdbm.shelf")
in_shelf=shelve.open("shelved.shelf")
key_list=in_shelf.keys()
for key in key_list:
out_shelf[key]=in_shelf[key]
out_shelf.close()
in_shelf.close()
So far it looks like the dumbdbm.shelf files came out ok, pending a double-check of the contents.
The shelve module uses Python's pickle, which may require a protocol version when being accessed between different versions of Python.
Try supplying protocol version 2:
population = shelve.open('shelved.shelf', protocol=2)
According to the documentation:
Protocol version 2 was introduced in Python 2.3. It provides much more efficient pickling of new-style classes. Refer to PEP 307 for information about improvements brought by protocol 2.
This is most likely the protocol used in the original serialization (or pickling).
Edited: You may need to rename your database. Read on...
Seems like pickle is not the culprit here. shelve relies also in anydbm (Python 2.x) or dbm (Python 3) to create/open a database and store the pickled information.
I created (manually) a database file using the following:
# Python 2.7
import anydbm
anydbm.open('database2', flag='c')
and
# Python 3.4
import dbm
dbm.open('database3', flag='c')
In both cases, it creates the same kind of database (may be distribution dependent, this is on Debian 7):
$ file *
database2: Berkeley DB (Hash, version 9, native byte-order)
database3.db: Berkeley DB (Hash, version 9, native byte-order)
anydbm can open database3.db without problems, as expected:
>>> anydbm.open('database3')
<dbm.dbm object at 0x7fb1089900f0>
Notice the lack of .db when specifying the database name, though. But dbm chokes on database2, which is weird:
>>> dbm.open('database2')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.4/dbm/__init__.py", line 88, in open
raise error[0]("db type could not be determined")
dbm.error: db type could not be determined
unless I change the name of the name of the database to database2.db:
$ mv database2 database2.db
$ python3
>>> import dbm
>>> dbm.open('database2')
<_dbm.dbm object at 0x7fa7eaefcf50>
So, I suspect a regression on the dbm module, but I haven't checked the documentation. It may be intended :-?
NB: Notice that in my case, the extension is .db, but that depends on the database being used by dbm by default! Create an empty shelf using Python 3 to figure out which one are you using and what is it expecting.
I don't think it's possible to use a Python 2 shelf with Python 3's shelve module. The underlying files are completely different, at least in my tests.
In Python 2*, a shelf is represented as a single file with the filename you originally gave it.
In Python 3*, a shelf consists of three files: filename.bak, filename.dat, and filename.dir. Without any of these files present, the shelf cannot be opened by the Python 3 library (though it appears that just the .dat file is sufficient for opening, if not actual reading).
#Ricardo Cárdenes has given an overview of why this may be--it's likely an issue with the underlying database modules used in storing the shelved data. It's possible that the databases are backwards compatible, but I don't know and a quick search hasn't turned up any obvious answers.
I think it's likely that some of the possible databases implemented by dbm are backwards-compatible, whereas others are not: this could be the cause of the discrepancy between answers here, where some people, but not all, are able to open older databases directly by specifying a protocol.
*On every machine I've tested, using Python 2.7.6 vs Pythons 3.2.5, 3.3.4, and 3.4.1
The berkeleydb module includes a retro-compatible
implementation of the shelve object. (You will also need to install Oracle Berkeley DB)
You just need to:
import berkeleydb
from berkeleydb import dbshelve as shelve
population = shelve.open('shelved.shelf')

Can't access temporary files created with tempfile

I am using tempfile.NamedTemporaryFile() to store some text until the program ends. On Unix is working without any issues but on Windows the file returned isn't accessible for reading or writing: python gives Errno 13. The only way is to set delete=False and manually delete the file with os.remove(). Why?
This causes the IOError because the file can be opened only once after it is created.
The reason is because NamedTemporaryFile creates the file with FILE_SHARE_DELETE flag on Windows. On Windows when a file has been created/opened with specific share flag all subsequent open operations have to pass this share flag. It's not the case with Python's open function which does not pass FILE_SHARE_DELETE flag. See my answer on How to create a temporary file that can be read by a subprocess? question for more details and a workaround.
Take a look: http://docs.python.org/2/library/tempfile.html
tempfile.NamedTemporaryFile([mode='w+b'[, bufsize=-1[, suffix=''[, prefix='tmp'[, dir=None[, delete=True]]]]]])
This function operates exactly as TemporaryFile() does, except that the file is guaranteed to have a visible name in the file system (on Unix, the directory entry is not unlinked). That name can be retrieved from the name attribute of the file object. Whether the name can be used to open the file a second time, while the named temporary file is still open, varies across platforms (it can be so used on Unix; it cannot on Windows NT or later). If delete is true (the default), the file is deleted as soon as it is closed.
Thanks to #Rnhmjoj here is a working solution:
file = NamedTemporaryFile(delete=False)
file.close()
You have to keep the file with the delete-flag and then close it after creation. This way, Windows will unlock the file and you can do stuff with it!

Connecting to SQLServer 2005 with adodbapi

I'm very new to Python and I have Python 3.2 installed on a Win 7-32 workstation. Trying to connect to MSSQLServer 2005 Server using adodbapi-2.4.2.2, the latest update to that package.
The code/connection string looks like this:
conn = adodbapi.connect('Provider=SQLNCLI.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=XXX;Data Source=123.456.789');
From adodbapi I continually get the error (this is entire error message from Wing IDE shell):
Traceback (most recent call last):
File "D:\Program Files\Wing IDE 4.0\src\debug\tserver_sandbox.py", line 2, in
if name == 'main':
File "D:\Python32\Lib\site-packages\adodbapi\adodbapi.py", line 298, in connect
raise InterfaceError #Probably COM Error
adodbapi.adodbapi.InterfaceError:
I can trace through the code and see the exception as it happens.
I also tried using conn strings with OLEDB provider and integrated Windows security, with same results.
All of these connection strings work fine from a UDL file on my workstation, and from SSMS, but fail with the same error in adodbapi.
How do I fix this?
Try this connection string:
Initial Catalog=XXX; Data Source=<servername>\\<SQL Instance name>; Provider=SQLOLEDB.1; Integrated Security=SSPI
Update
Umm ok. Looking at the source for adodbapi I would have to say that you are suffering a COM error. (yeah I know the traceback says that). But specifically with initialising the relevant COM objects.
This means that your connection string has nothing to do with the traceback. I think a good place to start would be to make sure that your copy of pythoncom is up-to-date.
It could be that win32com/pythoncom does not support Python 3K (3.0 onwards) yet, but after a minute of googleing I have not found anything useful on that, I'll leave it to you.
This code should run successfully when you have fixed your problem (and should fail at the moment).
import win32com.client
import pythoncom
pythoncom.CoInitialize()
win32com.client.Dispatch('ADODB.Connection')
Also any exception that code throws would be useful to help debug your problem.
In case anyone else finds this thread looking for the resolution to a similar error that I saw with Python 2.7:
Traceback (most recent call last):
File "get_data.py", line 10, in <module>
connection = get_connection(r"XXX\YYY", "DB")
File "get_data.py", line 7, in get_connection
conn = adodbapi.connect(connstring)
File "C:\Python27\lib\site-packages\adodbapi\adodbapi.py", line 116, in connect
raise api.OperationalError(e, message)
adodbapi.apibase.OperationalError: (InterfaceError("Windows COM Error: Dispatch('ADODB.Connection') failed.",), 'Error opening connection to "Data Source=XXX\\YYY; Initial Catalog=DB; Provider=SQLOLEDB.1; Integrated Security=SSPI"')
In my case the simple solution was to install Python for Windows Extensions (pywin32) from here:
http://sourceforge.net/projects/pywin32/files/pywin32/Build%20219/
I had the same problem, and I tracked it down to a failure to load win32com.pyd, because of some system DLLs that was not in the "dll load path", such as msvcp100.dll
I solved the problem by copying a lot of these dll's (probably too many) into C:\WinPython-64bit-3.3.3.2\python-3.3.3.amd64\Lib\site-packages\win32

Detect file handle leaks in python?

My program appears to be leaking file handles. How can I find out where?
My program uses file handles in a few different places—output from child processes, call ctypes API (ImageMagick) opens files, and they are copied.
It crashes in shutil.copyfile, but I'm pretty sure this is not the place it is leaking.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Python25\Lib\site-packages\magpy\magpy.py", line 874, in main
magpy.run_all()
File "C:\Python25\Lib\site-packages\magpy\magpy.py", line 656, in run_all
[operation.operate() for operation in operations]
File "C:\Python25\Lib\site-packages\magpy\magpy.py", line 417, in operate
output_file = self.place_image(output_file)
File "C:\Python25\Lib\site-packages\magpy\magpy.py", line 336, in place_image
shutil.copyfile(str(input_file), str(self.full_filename))
File "C:\Python25\Lib\shutil.py", line 47, in copyfile
fdst = open(dst, 'wb')
IOError: [Errno 24] Too many open files: 'C:\\Documents and Settings\\stuart.axon\\Desktop\\calzone\\output\\wwtbam4\\Nokia_NCD\\nl\\icon_42x42_V000.png'
Press any key to continue . . .
I had similar problems, running out of file descriptors during subprocess.Popen() calls. I used the following script to debug on what is happening:
import os
import stat
_fd_types = (
('REG', stat.S_ISREG),
('FIFO', stat.S_ISFIFO),
('DIR', stat.S_ISDIR),
('CHR', stat.S_ISCHR),
('BLK', stat.S_ISBLK),
('LNK', stat.S_ISLNK),
('SOCK', stat.S_ISSOCK)
)
def fd_table_status():
result = []
for fd in range(100):
try:
s = os.fstat(fd)
except:
continue
for fd_type, func in _fd_types:
if func(s.st_mode):
break
else:
fd_type = str(s.st_mode)
result.append((fd, fd_type))
return result
def fd_table_status_logify(fd_table_result):
return ('Open file handles: ' +
', '.join(['{0}: {1}'.format(*i) for i in fd_table_result]))
def fd_table_status_str():
return fd_table_status_logify(fd_table_status())
if __name__=='__main__':
print fd_table_status_str()
You can import this module and call fd_table_status_str() to log the file descriptor table status at different points in your code.
Also, make sure that subprocess.Popen instances are destroyed. Keeping references of Popen instances in Windows prevent the GC from running. And if the instances are kept, the associated pipes are not closed. More info here.
Use Process Explorer, select your process, View->Lower Pane View->Handles - then look for what seems out of place - usually lots of the same or similar files open points to the problem.
lsof -p <process_id> works well on several UNIX-like systems including FreeBSD.
Look at the output from ls -l /proc/$pid/fd/ (substituting the PID of your process, of course) to see which files are open [or, on win32, use Process Explorer to list open files]; then figure out where in your code you're opening them, and make sure that close() is being called. (Yes, the garbage collector will eventually close things, but it's not always fast enough to avoid running out of fds).
Checking for any circular references which might be preventing garbage collection is also a good practice. (The cycle collector will eventually dispose of these -- but it may not run frequently enough to avoid file descriptor exhaustion; I've been bitten by this personally).
While the OP has a Windows system, I'm sure plenty of people here (such as myself) are looking for others too (it's not even tagged Windows).
Google has a psutil package with a get_open_files() method. It looks like an excellent interface, but it hasn't been maintained in a couple years it seems. I actually wrote an implementation for my own Python 2 project on Linux. I'm using it with unittest to make sure my functions clean up their resources.
import os
# calling this **synchronously** will accurately relay open files on Linux
def get_open_files(pid):
# directory spawned by Python process, containing its file descriptors
path = "/proc/%d/fd" % pid
# list the abspaths belonging to that directory
links = ["%s/%s" % (path, f) for f in os.listdir(path)]
# filter out the bad ones returned by os.listdir()
valid_links = filter(lambda f: os.path.exists(f), links)
# these links are fd integers, so map them to their actual file devices
devices = map(lambda f: os.readlink(f), valid_links)
# remove any ones that are stdin, stdout, stderr, etc.
return filter(lambda f: "/dev/pts" not in f, devices)
Python's own test suite has a refleak module that utilizes fd_count. Works across operating systems and is available on full installs:
>>> from test.support.os_helper import fd_count
>>> fd_count()
27
On Python 3.9 and earlier, the os_helper doesn't exist, so from test.support import fd_count.

Categories

Resources