I am trying to use ZODB 3.10.2 on my web server, which is running Debian and Python 2.7.1. It seems like every time I try to access the same database from two different processes, I get a mysterious exception. I tried accessing a database from an interactive Python session and everything seemed to work fine:
>>> import ZODB
>>> from ZODB.FileStorage import FileStorage
>>> storage = FileStorage("test.db")
>>>
But then I tried the same series of commands from another session running at the same time and it didn't seem to work:
>>> import ZODB
>>> from ZODB.FileStorage import FileStorage
>>> storage = FileStorage("test.db")
No handlers could be found for logger "zc.lockfile"
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/site-packages/ZODB3-3.10.2-py2.7-linux-x86_64.egg/ZODB/FileStorage/FileStorage.py", line 125, in __init__
self._lock_file = LockFile(file_name + '.lock')
File "/usr/local/lib/python2.7/site-packages/zc.lockfile-1.0.0-py2.7.egg/zc/lockfile/__init__.py", line 76, in __init__
_lock_file(fp)
File "/usr/local/lib/python2.7/site-packages/zc.lockfile-1.0.0-py2.7.egg/zc/lockfile/__init__.py", line 59, in _lock_file
raise LockError("Couldn't lock %r" % file.name)
zc.lockfile.LockError: Couldn't lock 'test.db.lock'
>>>
Why is this happening? What can be done about it?
ZODB's FileStorage does not support access from multiple processes at once. That is why you get the lock error: the file storage has been locked by the first process to prevent other processes from altering it.
There are several ways around this. The easiest option is to use ZEO. ZEO extends the ZODB machinery to provide access to objects over a network, and you can easily configure your ZODB to access a ZEO server instead of a local FileStorage file:
<zodb>
  <zeoclient>
    server localhost:9100
  </zeoclient>
</zodb>
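From Python, connecting to that ZEO server looks roughly like this minimal sketch (it assumes a ZEO server is already running on localhost:9100, e.g. started with runzeo -a localhost:9100 -f test.db):
from ZEO.ClientStorage import ClientStorage
from ZODB import DB
import transaction

# Connect to the ZEO server instead of opening the FileStorage directly;
# many processes can do this concurrently.
storage = ClientStorage(('localhost', 9100))
db = DB(storage)
connection = db.open()
root = connection.root()

root['key'] = 'value'  # store any pickleable object
transaction.commit()

connection.close()
db.close()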
Another option is to use RelStorage, which stores the ZODB data in a relational database (PostgreSQL, Oracle and MySQL backends are supported) and takes care of concurrent access from different ZODB clients. Here is an example configuration:
<zodb>
  <relstorage>
    <postgresql>
      # The dsn is optional, as are each of the parameters in the dsn.
      dsn dbname='zodb' user='username' host='localhost' password='pass'
    </postgresql>
  </relstorage>
</zodb>
RelStorage requires more up-front setup work but can outperform ZEO in many scenarios.
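Either kind of configuration text can be loaded directly from Python with ZODB.config. A hedged sketch for the RelStorage case (it assumes RelStorage is installed; the %import line registers its ZConfig components):
import ZODB.config

config = """
%import relstorage
<zodb>
  <relstorage>
    <postgresql>
      dsn dbname='zodb' user='username' host='localhost' password='pass'
    </postgresql>
  </relstorage>
</zodb>
"""
db = ZODB.config.databaseFromString(config)
connection = db.open()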
You cannot access the same database file from two processes at the same time; that is why you get this error. If you need to work on the same data.fs file from two or more processes, use ZEO.
## Let the program run only once (if it's already running, don't start it again)
## Run the program, open the form
import sys
import zc.lockfile

try:
    # Acquire an exclusive lock file; this raises LockError if another
    # instance of the program already holds it.
    lock = zc.lockfile.LockFile('lock', content_template='{pid};{hostname}')
except zc.lockfile.LockError:
    # Another instance is already running, so quit silently.
    sys.exit()

if __name__ == '__main__':
    mainForm()  # the application's entry point, defined elsewhere

## Thanks to zc.lockfile:
## https://pypi.org/project/zc.lockfile/#detailed-documentation
I'm using Peewee in one of my projects. Specifically, I'm using SqliteQueueDatabase, and I need to create a backup (i.e. another *.db file) without stopping my application. I saw that there are two methods that could work for me (backup and backup_to_file), but they're methods of CSqliteExtDatabase, and SqliteQueueDatabase is a subclass of SqliteExtDatabase. I've found solutions for manually creating a dump of the file, but I need a *.db file (not a *.csv file, for example). I couldn't find any similar question or relevant answer.
Thanks!
You can just import the backup_to_file() helper from playhouse._sqlite_ext and pass it your connection and a filename:
from playhouse.sqliteq import SqliteQueueDatabase
from playhouse._sqlite_ext import backup_to_file

db = SqliteQueueDatabase('...')
conn = db.connection()  # get the underlying pysqlite connection
backup_to_file(conn, 'dest.db')
Also, if you're using pysqlite3, then there are also backup methods available on the connection itself.
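For reference, here is a sketch of that connection-level backup using the standard library's sqlite3 module, whose API pysqlite3 mirrors (Connection.backup() needs Python 3.7+):
import sqlite3

src = sqlite3.connect('app.db')   # the live database
dst = sqlite3.connect('dest.db')  # the backup file
with dst:
    src.backup(dst)  # copies all pages while other writers keep working
dst.close()
src.close()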
I have data stored in a shelf file created with python 2.7
When I try to access the file from python 3.4, I get an error:
>>> import shelve
>>> population=shelve.open('shelved.shelf')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python34\lib\shelve.py", line 239, in open
return DbfilenameShelf(filename, flag, protocol, writeback)
File "C:\Python34\lib\shelve.py", line 223, in __init__
Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
File "C:\Python34\lib\dbm\__init__.py", line 88, in open
raise error[0]("db type could not be determined")
dbm.error: db type could not be determined
I'm still able to access the shelf with no problem in python 2.7, so there seems to be a backward-compatibility issue. Is there any way to directly access the old format with the new python version?
As I understand it now, here is the path that led to my problem:
The original shelf was created with Python 2 in Windows
Python 2 Windows defaults to bsddb as the underlying database for shelving, since dbm is not available on the Windows platform
Python 3 does not ship with bsddb. The underlying database is dumbdbm in Python 3 for Windows.
At first I looked into installing a third-party bsddb module for Python 3, but it quickly started to turn into a hassle, and it seemed it would be a recurring hassle any time I needed to use the same shelf file on a new machine. So I decided to convert the file from bsddb to dumbdbm, which both my Python 2 and Python 3 installations can read.
I ran the following in Python 2, which is the version that contains both bsddb and dumbdbm:
import shelve
import dumbdbm

def dumbdbm_shelve(filename, flag="c"):
    return shelve.Shelf(dumbdbm.open(filename, flag))

out_shelf = dumbdbm_shelve("shelved.dumbdbm.shelf")
in_shelf = shelve.open("shelved.shelf")

key_list = in_shelf.keys()
for key in key_list:
    out_shelf[key] = in_shelf[key]

out_shelf.close()
in_shelf.close()
So far it looks like the dumbdbm.shelf files came out ok, pending a double-check of the contents.
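To automate that double-check, here is a small sketch (run in the same Python 2 session, reusing dumbdbm_shelve from above) that compares the two shelves key by key:
in_shelf = shelve.open("shelved.shelf")
out_shelf = dumbdbm_shelve("shelved.dumbdbm.shelf")

# Same keys and same values, or the assertion names the offending key.
assert sorted(in_shelf.keys()) == sorted(out_shelf.keys())
for key in in_shelf.keys():
    assert out_shelf[key] == in_shelf[key], key

in_shelf.close()
out_shelf.close()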
The shelve module uses Python's pickle, which may need an explicit protocol version when a shelf is shared between different versions of Python.
Try supplying protocol version 2:
population = shelve.open('shelved.shelf', protocol=2)
According to the documentation:
Protocol version 2 was introduced in Python 2.3. It provides much more efficient pickling of new-style classes. Refer to PEP 307 for information about improvements brought by protocol 2.
This is most likely the protocol used in the original serialization (or pickling).
Edited: You may need to rename your database. Read on...
It seems pickle is not the culprit here. shelve also relies on anydbm (Python 2.x) or dbm (Python 3) to create/open a database and store the pickled information.
I created (manually) a database file using the following:
# Python 2.7
import anydbm
anydbm.open('database2', flag='c')
and
# Python 3.4
import dbm
dbm.open('database3', flag='c')
In both cases, it creates the same kind of database (may be distribution dependent, this is on Debian 7):
$ file *
database2: Berkeley DB (Hash, version 9, native byte-order)
database3.db: Berkeley DB (Hash, version 9, native byte-order)
anydbm can open database3.db without problems, as expected:
>>> anydbm.open('database3')
<dbm.dbm object at 0x7fb1089900f0>
Notice the lack of .db when specifying the database name, though. But dbm chokes on database2, which is weird:
>>> dbm.open('database2')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.4/dbm/__init__.py", line 88, in open
raise error[0]("db type could not be determined")
dbm.error: db type could not be determined
unless I change the name of the database to database2.db:
$ mv database2 database2.db
$ python3
>>> import dbm
>>> dbm.open('database2')
<_dbm.dbm object at 0x7fa7eaefcf50>
So I suspect a regression in the dbm module, but I haven't checked the documentation; it may be intended :-?
NB: Notice that in my case the extension is .db, but that depends on the database dbm uses by default! Create an empty shelf using Python 3 to figure out which one you are using and what it is expecting.
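One way to do that check is dbm.whichdb, which reports which backend created a given file (a diagnostic sketch; the Python 2 equivalent is whichdb.whichdb):
import dbm

print(dbm.whichdb('shelved.shelf'))  # e.g. 'dbm.gnu', 'dbm.ndbm', 'dbm.dumb'
print(dbm.whichdb('database2'))      # None if unreadable, '' if unrecognized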
I don't think it's possible to use a Python 2 shelf with Python 3's shelve module. The underlying files are completely different, at least in my tests.
In Python 2*, a shelf is represented as a single file with the filename you originally gave it.
In Python 3*, a shelf consists of three files: filename.bak, filename.dat, and filename.dir. Without any of these files present, the shelf cannot be opened by the Python 3 library (though it appears that just the .dat file is sufficient for opening, if not actual reading).
@Ricardo Cárdenes has given an overview of why this may be: it's likely an issue with the underlying database modules used to store the shelved data. It's possible that the databases are backwards compatible, but I don't know, and a quick search hasn't turned up any obvious answers.
I think it's likely that some of the possible databases implemented by dbm are backwards-compatible, whereas others are not: this could be the cause of the discrepancy between answers here, where some people, but not all, are able to open older databases directly by specifying a protocol.
*On every machine I've tested, using Python 2.7.6 vs Pythons 3.2.5, 3.3.4, and 3.4.1
The berkeleydb module includes a backward-compatible implementation of the shelve object (you will also need to install Oracle Berkeley DB itself).
You just need to:
import berkeleydb
from berkeleydb import dbshelve as shelve
population = shelve.open('shelved.shelf')
I've used boto to interact with S3 with no problems, but now I'm attempting to connect to the AWS Support API to pull back info on open tickets, Trusted Advisor results, etc. It seems that the boto library has a different connect method for each AWS service? For example, with S3 it is:
conn = S3Connection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
According to the boto docs, the following should work to connect to AWS Support API:
>>> from boto.support.connection import SupportConnection
>>> conn = SupportConnection('<aws access key>', '<aws secret key>')
However, there are a few problems I see after digging through the source code. First, boto.support.connection doesn't actually exist. boto.connection does, but it doesn't contain a class SupportConnection. boto.support.layer1 exists, and DOES have the class SupportConnection, but it doesn't accept the access keys as arguments the way the docs suggest. Instead it takes one argument, an AWSQueryConnection object. That class is defined in boto.connection. AWSQueryConnection takes one argument, an AWSAuthConnection object, a class also defined in boto.connection. Lastly, AWSAuthConnection takes a generic object, with requirements defined in __init__ as:
class AWSAuthConnection(object):
    def __init__(self, host, aws_access_key_id=None,
                 aws_secret_access_key=None,
                 is_secure=True, port=None, proxy=None, proxy_port=None,
                 proxy_user=None, proxy_pass=None, debug=0,
                 https_connection_factory=None, path='/',
                 provider='aws', security_token=None,
                 suppress_consec_slashes=True,
                 validate_certs=True, profile_name=None):
So, for kicks, I tried creating an AWSAuthConnection by passing keys, followed by AWSQueryConnection(awsauth), followed by SupportConnection(awsquery), with no luck. This was inside a script.
Last item of interest is that, with my keys defined in a .boto file in my home directory, and running python interpreter from the command line, I can make a direct import and call to SupportConnection() (no arguments) and it works. It clearly is picking up my keys from the .boto file and consuming them but I haven't analyzed every line of source code to understand how, and frankly, I'm hoping to avoid doing that.
Long story short, I'm hoping someone has some familiarity with boto and connecting to AWS API's other than S3 (the bulk of material that exists via google) to help me troubleshoot further.
This should work:
import boto.support
conn = boto.support.connect_to_region('us-east-1')
This assumes you have credentials in your boto config file or in an IAM Role. If you want to pass explicit credentials, do this:
import boto.support
conn = boto.support.connect_to_region(
    'us-east-1',
    aws_access_key_id='<access key>',
    aws_secret_access_key='<secret key>')
This basic incantation should work for all services in all regions. Just import the correct module (e.g. boto.support or boto.ec2 or boto.s3 or whatever) and then call its connect_to_region method, supplying the name of the region you want as a parameter.
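For example, the same pattern with EC2 (a sketch; it assumes credentials are configured as above):
import boto.ec2

ec2 = boto.ec2.connect_to_region('us-west-2')
reservations = ec2.get_all_instances()  # each Reservation wraps its instances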
Is it possible to manually close an SQLObject Connection once it has been opened? I am trying to delete a database file once it has been used, but it seems that the open connection to the database file is stopping me from doing so.
For example:
from sqlobject import *
import os
# Create and open connection to a database file.
sqlhub.processConnection = connectionForURI('sqlite:path_to_db')
SomeObject.createTable()
# ...
# Delete database when finished.
os.remove('path_to_db')
Gives the following error:
WindowsError: [Error 32] The process cannot access the file because
it is being used by another process: 'path_to_db'
It seems that just calling .close() on the database connection does the trick:
from sqlobject import *
import os

# Create and open a connection to a database file.
sqlhub.processConnection = connectionForURI('sqlite:path_to_db')

# ... do something with the connection ...

# Close the connection before deleting the file.
sqlhub.processConnection.close()

# Now the database file can be removed.
os.remove('path_to_db')
I could only find a little bit on the close method here, but it's fair to say you can treat it like any other file object. I don't have much experience with sqlobject though, and in the interpreter, you can still remove the db right after the processConnection assignment, without closing it, so who knows.
My program appears to be leaking file handles. How can I find out where?
My program uses file handles in a few different places: output from child processes, ctypes API calls (ImageMagick) that open files, and files that are copied.
It crashes in shutil.copyfile, but I'm pretty sure this is not the place it is leaking.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Python25\Lib\site-packages\magpy\magpy.py", line 874, in main
magpy.run_all()
File "C:\Python25\Lib\site-packages\magpy\magpy.py", line 656, in run_all
[operation.operate() for operation in operations]
File "C:\Python25\Lib\site-packages\magpy\magpy.py", line 417, in operate
output_file = self.place_image(output_file)
File "C:\Python25\Lib\site-packages\magpy\magpy.py", line 336, in place_image
shutil.copyfile(str(input_file), str(self.full_filename))
File "C:\Python25\Lib\shutil.py", line 47, in copyfile
fdst = open(dst, 'wb')
IOError: [Errno 24] Too many open files: 'C:\\Documents and Settings\\stuart.axon\\Desktop\\calzone\\output\\wwtbam4\\Nokia_NCD\\nl\\icon_42x42_V000.png'
Press any key to continue . . .
I had similar problems, running out of file descriptors during subprocess.Popen() calls. I used the following script to debug on what is happening:
import os
import stat

_fd_types = (
    ('REG', stat.S_ISREG),
    ('FIFO', stat.S_ISFIFO),
    ('DIR', stat.S_ISDIR),
    ('CHR', stat.S_ISCHR),
    ('BLK', stat.S_ISBLK),
    ('LNK', stat.S_ISLNK),
    ('SOCK', stat.S_ISSOCK),
)

def fd_table_status():
    result = []
    for fd in range(100):
        try:
            s = os.fstat(fd)
        except OSError:
            # fd is not open
            continue
        for fd_type, func in _fd_types:
            if func(s.st_mode):
                break
        else:
            fd_type = str(s.st_mode)
        result.append((fd, fd_type))
    return result

def fd_table_status_logify(fd_table_result):
    return ('Open file handles: ' +
            ', '.join(['{0}: {1}'.format(*i) for i in fd_table_result]))

def fd_table_status_str():
    return fd_table_status_logify(fd_table_status())

if __name__ == '__main__':
    print fd_table_status_str()
You can import this module and call fd_table_status_str() to log the file descriptor table status at different points in your code.
Also, make sure that subprocess.Popen instances are destroyed. Keeping references to Popen instances on Windows prevents them from being garbage collected, and if the instances are kept alive, the associated pipes are not closed. More info here.
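For example, a sketch of eager cleanup around a Popen call (the command is a placeholder; the point is to drain the pipes and drop the reference promptly):
import subprocess

# 'some_tool' is a placeholder command; stdout/stderr are piped.
p = subprocess.Popen(['some_tool', 'arg'],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = p.communicate()  # drains both pipes and waits for the child
p.stdout.close()            # release the pipe handles explicitly
p.stderr.close()
del p                       # drop the reference so nothing pins the handles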
Use Process Explorer, select your process, then View -> Lower Pane View -> Handles, and look for what seems out of place; usually many open handles to the same or similar files point to the problem.
lsof -p <process_id> works well on several UNIX-like systems including FreeBSD.
Look at the output from ls -l /proc/$pid/fd/ (substituting the PID of your process, of course) to see which files are open [or, on win32, use Process Explorer to list open files]; then figure out where in your code you're opening them, and make sure that close() is being called. (Yes, the garbage collector will eventually close things, but it's not always fast enough to avoid running out of fds).
Checking for any circular references which might be preventing garbage collection is also a good practice. (The cycle collector will eventually dispose of these, but it may not run frequently enough to avoid file descriptor exhaustion; I've been bitten by this personally.)
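While debugging, you can nudge the cycle collector yourself (a sketch, in Python 2 syntax to match the question):
import gc

gc.collect()      # force a full cycle-collection pass now
print gc.garbage  # objects the collector found but could not free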
While the OP has a Windows system, I'm sure plenty of people here (such as myself) are looking for others too (it's not even tagged Windows).
There is a psutil package (originally hosted on Google Code) with a get_open_files() method. It looks like an excellent interface, but it doesn't seem to have been maintained in a couple of years. I actually wrote an implementation for my own Python 2 project on Linux. I'm using it with unittest to make sure my functions clean up their resources.
import os

# calling this **synchronously** will accurately relay open files on Linux
def get_open_files(pid):
    # directory spawned by the Python process, containing its file descriptors
    path = "/proc/%d/fd" % pid
    # list the abspaths belonging to that directory
    links = ["%s/%s" % (path, f) for f in os.listdir(path)]
    # filter out the bad ones returned by os.listdir()
    valid_links = filter(lambda f: os.path.exists(f), links)
    # these links are fd integers, so map them to their actual file devices
    devices = map(lambda f: os.readlink(f), valid_links)
    # remove any that are stdin, stdout, stderr, etc.
    return filter(lambda f: "/dev/pts" not in f, devices)
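For instance, a hypothetical unittest built on that helper (do_work stands in for the function under test):
import os
import unittest

class TestFdHygiene(unittest.TestCase):
    def test_no_leaked_descriptors(self):
        pid = os.getpid()
        before = sorted(get_open_files(pid))
        do_work()  # hypothetical function that should close what it opens
        after = sorted(get_open_files(pid))
        self.assertEqual(before, after)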
Python's own test suite has a refleak module that utilizes fd_count. Works across operating systems and is available on full installs:
>>> from test.support.os_helper import fd_count
>>> fd_count()
27
On Python 3.9 and earlier, os_helper doesn't exist, so use from test.support import fd_count instead.