dreaded "not the same object error" pickling a queryset.query object - python

I have a queryset that I need to pickle lazily and I am having some serious troubles. cPickle.dumps(queryset.query) throws the following error:
Can't pickle <class 'myproject.myapp.models.myfile.QuerySet'>: it's not the same object as myproject.myapp.models.myfile.QuerySet
Strangely (or perhaps not so strangely), I only get that error when I call cPickle from another method or a view, but not when I call it from the command line.
I made the method below after reading PicklingError: Can't pickle <class 'decimal.Decimal'>: it's not the same object as decimal.Decimal and Django mod_wsgi PicklingError while saving object:
import cPickle
from copy import deepcopy
def dump_queryset(queryset, model):
    from segment.segmentengine.models.segment import QuerySet
    # Deep-copy the queryset, then its query, and wrap the copy in a fresh QuerySet
    memo = {}
    new_queryset = deepcopy(queryset, memo)
    memo = {}
    new_query = deepcopy(new_queryset.query, memo)
    queryset = QuerySet(model=model, query=new_query)
    return cPickle.dumps(queryset.query)
As you can see, I am getting extremely desperate -- that method still yields the same error. Is there a known, non-hacky solution to this problem?
EDIT: Tried using --noreload running on the django development server, but to no avail.
EDIT2: I had a typo in the error I displayed above -- it was models.QuerySet, not models.mymodel.QuerySet that it was complaining about. There is another nuance here, which is that my models file is broken out into multiple modules, so the error is ACTUALLY:
Can't pickle <class 'myproject.myapp.models.myfile.QuerySet'>: it's not the same object as myproject.myapp.models.myfile.QuerySet
Where myfile is one of the modules under models. I have an __init__.py in models with the following line:
from myfile import *
I wonder if this is contributing to my issue. Is there some way to change my init to protect myself against this? Are there any other tests to try?
EDIT3: Here is a little more background on my use case: I have a model called Context that I use to populate a UI element with instances of mymodel. The user can add/remove/manipulate the objects on the UI side, changing their context, and when they return, they can keep their changes, because the context serialized everything. A context has a generic foreign key to different types of filters/ways the user can manipulate the object, all of which must implement a few methods that the context uses to figure out what it should display. One such filter takes a queryset that can be passed in and displays all of the objects in that queryset. This provides a way to pass in arbitrary querysets that are produced elsewhere and have them displayed in the UI element. The model that uses the Context is hierarchical (using mptt for this), and the UI element makes a request to get children each time the user clicks around; we can then take the children and determine whether they should be displayed based on whether or not they are included in the Context. Hope that helps!
EDIT4: I am able to dump an empty queryset, but as soon as I add anything of value, it fails.
EDIT5: I am on Django 1.2.3

This may not be the case for everyone, but I was using an IPython notebook and had a similar issue pickling my own class.
The problem turned out to be caused by a reload call:
from dir.my_module import my_class
reload(dir.my_module)
Removing the reload call and then re-running the import and the cell where the instance of that object was created then allowed it to be pickled.
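The effect is easy to reproduce outside a notebook. The following is a minimal sketch in Python 3 (importlib.reload and pickle stand in for the Python 2 reload and cPickle); the module name reload_demo_mod and the class MyClass are made up for the demonstration.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Create a throwaway module on disk (hypothetical name) so it can be imported and reloaded.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "reload_demo_mod.py").write_text("class MyClass:\n    pass\n")
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import reload_demo_mod
from reload_demo_mod import MyClass

obj = MyClass()                      # instance of the class as first imported
importlib.reload(reload_demo_mod)    # re-executes the module, creating a *new* MyClass

# The module attribute now points at a different class object than type(obj).
same_class = reload_demo_mod.MyClass is type(obj)

try:
    pickle.dumps(obj)
    stale_error = None
except pickle.PicklingError as exc:
    stale_error = exc                # pickle refuses: the class it finds by name is not type(obj)

# Instances created from the reloaded class pickle without trouble.
fresh_payload = pickle.dumps(reload_demo_mod.MyClass())
```

Pickle stores classes by module and name, so it checks that the name still resolves to the very same class object; after a reload it no longer does for instances created earlier.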

Not so elegant, but perhaps it works:
Add the directory of the myfile module to sys.path and use only import myfile in each module where you use myfile (remove every from segment.segmentengine.models.segment import, anywhere in your project).

According to this doc, pickling a QuerySet should not be a problem. Thus, the problem must come from somewhere else.
Since you mentioned:
EDIT2: I had a typo in the error I displayed above -- it was models.QuerySet, not models.mymodel.QuerySet that it was complaining about. There is another nuance here, which is that my models file is broken out into multiple modules, so the error is ACTUALLY:
The second error message you provided looks the same as the previous one; is that what you mean?
The error message you provided looks weird. Since you are pickling "queryset.query", the error should be related to the django.db.models.sql.Query class instead of the QuerySet class.
Some modules or classes may have the same name. They will then override each other and cause this kind of issue. To make things easier, I recommend using "import ooo.xxx" instead of "from ooo import *".

You could also try
import ooo.xxx as othername
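To illustrate the name-clash point, here is a small Python 3 sketch. The modules demo_a and demo_b are hypothetical stand-ins built in memory, each defining its own QuerySet; the sketch only shows the shadowing effect of star imports and does not claim to reproduce the asker's exact setup.

```python
import sys
import types


def make_module(name):
    """Build an in-memory module (hypothetical) that defines a QuerySet class."""
    module = types.ModuleType(name)
    exec("class QuerySet:\n    pass\n", module.__dict__)
    sys.modules[name] = module
    return module


demo_a = make_module("demo_a")
demo_b = make_module("demo_b")

# Star imports: the second import silently rebinds the name QuerySet.
star_ns = {}
exec("from demo_a import *\nfrom demo_b import *", star_ns)
shadowed = star_ns["QuerySet"] is demo_b.QuerySet

# Qualified imports keep both classes reachable under distinct names.
qualified_ns = {}
exec("import demo_a\nimport demo_b as othername", qualified_ns)
distinct = qualified_ns["demo_a"].QuerySet is not qualified_ns["othername"].QuerySet
```

With qualified names it is always visible which module a class came from, which makes this kind of mix-up easier to spot.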

Related

objc.error: NSInternalInconsistencyException - readFromData:ofType:error: is a subclass responsibility but has not been overridden

I am using PyObjC and am writing some methods in a Python file to read and write files using NSDocument, which uses the abstract NSFileCoordinator class. Accessing files this way instead of just using Python's open lets these classes handle things for me, such as preventing files from being edited by more than one program at a time or giving read/write operations enough time to finish before they could deadlock.
These features are very important, and I want the app I am building to be as up to standard as I can make it.
I have this code that instantiates a NSDocument object that contains the content of whatever file path you put into it, as a function:
@classmethod
def write(cls, file: str):
    path = NSURL.fileURLWithPath_(file)
    ext = file.split('.')[-1]
    doc = NSDocument.alloc().initWithContentsOfURL_ofType_error_(path, ext, None)
When I call this function with a valid file path I get this error:
  File "/Users/user123/PycharmProjects/shoutout/src/sutils/cfiles.py", line 27, in write
    doc = NSDocument.alloc().initWithContentsOfURL_ofType_error_(path, ext, None)
objc.error: NSInternalInconsistencyException - readFromData:ofType:error: is a subclass responsibility but has not been overridden.
I have searched Objective-C, Swift, and PyObjC forums, googled phrases such as "objective-c is a subclass responsibility but has not been overridden", and checked Stack Overflow and GitHub for existing posts on this error, but I could find none.
As I understand it, because Objective-C is polymorphic, my call to initWithContentsOfURL:ofType:error: ends up calling readFromData:ofType:error:, among other methods. However, I don't understand exactly what it means when it says a method "is a subclass responsibility but has not been overridden." I am also not sure what it means to override something or for a method to be a "responsibility", so that doesn't help.
An NSInternalInconsistencyException is raised "when an internal assertion fails and implies an unexpected condition within the called code." I am not sure what an internal "assertion" is either, or what this could mean.
Any idea of what I could do to fix this?
NSDocument is an abstract class that requires you to subclass it and implement a number of methods to make it usable. This is documented in Apple's documentation for the class.
The older Document-Based App Programming Guide for Mac gives more information on this.
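As a rough analogy in plain Python (this is not PyObjC code, and the class and method names below are invented), an abstract base class can drive the loading workflow while leaving the actual parsing as a "subclass responsibility". Using the base class directly fails; a subclass that overrides the method works.

```python
class AbstractDocument:
    """Hypothetical stand-in for an abstract document class."""

    def init_with_contents(self, data):
        # The base class drives the workflow but delegates parsing to subclasses.
        self.read_from_data(data)
        return self

    def read_from_data(self, data):
        # Subclasses are expected to override this method.
        raise NotImplementedError(
            "read_from_data is a subclass responsibility but has not been overridden"
        )


class TextDocument(AbstractDocument):
    """Concrete subclass that supplies the missing method."""

    def read_from_data(self, data):
        self.text = data.decode("utf-8")


try:
    AbstractDocument().init_with_contents(b"hello")
    base_error = None
except NotImplementedError as exc:
    base_error = exc

doc = TextDocument().init_with_contents(b"hello")
```

In PyObjC terms the fix goes along the same lines: define a Python subclass of NSDocument that implements the required read/write methods, and instantiate that subclass rather than NSDocument itself.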

Django dumpdata: "Unable to serialize database" error due to a BitFlagField var

I've been trying to create a fixture of a table, but it's always been failing with the following message: CommandError: Unable to serialize database: __str__ returned non-string (type method). The stacktrace was equally unhelpful, pointing to one of the Django files as the culprit.
After some fiddling about, I've managed to pinpoint the culprit in the models.py:
class UserExtra(models.Model):
    (...)
    blocked = BitFlagField(
        flags=(
            'manual', 'system', 'tries', 'expired', 'inactivity',
            'nosys_nobypass'
        ),
        db_column='ind_block'
    )
The class is only a list of vars and lacks any sort of function. If I remove that var and run the dumpdata command, it works. How do I serialize this field?
As Ian Shelvington helped me figure out in the comments above, the BitFlagField is a custom made type and the problem was in its __str__ function's return, as it was calling the __repr__ method in a wrong manner (return self.__repr__).
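The mistake is easy to reproduce in isolation. In this sketch (Python 3; the two Flags classes are hypothetical and unrelated to the real BitFlagField implementation), __str__ returns the bound method instead of calling it:

```python
class BrokenFlags:
    def __repr__(self):
        return "<Flags manual|system>"

    def __str__(self):
        # Bug: returns the bound method object rather than its result.
        return self.__repr__


class FixedFlags(BrokenFlags):
    def __str__(self):
        # Fix: call the method so that a string comes back.
        return self.__repr__()


try:
    str(BrokenFlags())
    broken_error = None
except TypeError as exc:
    broken_error = str(exc)   # mentions that __str__ returned a non-string

fixed_text = str(FixedFlags())
```

Any code path that calls str() on the value, which serialization typically does, will trip over the broken version.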
Closing this because the problem of the original question was solved, though this led to another problem.

Overwriting methods via mixin pattern does not work as intended

I am trying to introduce a mod/mixin for a problem. In particular I am focusing here on a SpeechRecognitionProblem. I intend to modify this problem and therefore I seek to do the following:
class SpeechRecognitionProblemMod(speech_recognition.SpeechRecognitionProblem):
    def hparams(self, defaults, model_hparams):
        SpeechRecognitionProblem.hparams(self, defaults, model_hparams)
        vocab_size = self.feature_encoders(model_hparams.data_dir)['targets'].vocab_size
        p = defaults
        p.vocab_size['targets'] = vocab_size
    def feature_encoders(self, data_dir):
        # ...
So this one does not do much. It calls the hparams() function from the base class and then changes some values.
Now, there are already some ready-to-go problems e.g. Libri Speech:
@registry.register_problem()
class Librispeech(speech_recognition.SpeechRecognitionProblem):
    # ..
However, in order to apply my modifications I am doing this:
@registry.register_problem()
class LibrispeechMod(SpeechRecognitionProblemMod, Librispeech):
    # ..
This should, if I am not mistaken, overwrite everything (with identical signatures) in Librispeech and instead call functions of SpeechRecognitionProblemMod.
Since I was able to train a model with this code I am assuming that it's working as intended so far.
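That expectation matches Python's method resolution order. A stripped-down sketch with stand-in classes (none of this is tensor2tensor code; the method bodies are placeholders) shows the mixin listed first taking precedence:

```python
class Problem:
    def hparams(self):
        return "base"

    def feature_encoders(self):
        return "base"


class ProblemMod(Problem):
    def hparams(self):
        return "mod"

    def feature_encoders(self):
        return "mod"


class Libri(Problem):
    def hparams(self):
        return "libri"


class LibriMod(ProblemMod, Libri):
    pass


# The MRO lists ProblemMod before Libri, so its methods win.
mro_names = [cls.__name__ for cls in LibriMod.__mro__]
instance = LibriMod()
resolved = (instance.hparams(), instance.feature_encoders())
```

Inspecting `type(self).__mro__` and `type(self).hparams.__qualname__` in the debugger is a quick way to check which definition the running code actually resolves to.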
Now here comes my problem:
After training I want to serialize the model. This usually works. However, it does not with my mod and I actually know why:
At a certain point hparams() gets called. Debugging to that point will show me the following:
self # {LibrispeechMod}
self.hparams # <bound method SpeechRecognitionProblem.hparams of ..>
self.feature_encoders # <bound method SpeechRecognitionProblemMod.feature_encoders of ..>
self.hparams should be <bound method SpeechRecognitionProblemMod.hparams of ..>! It would seem that for some reason hparams() of SpeechRecognitionProblem gets called directly instead of SpeechRecognitionProblemMod. But please note that it's the correct type for feature_encoders()!
The thing is that I know this is working during training. I can see that the hyper-parameters (hparams) are applied accordingly simply because the model's graph node names change through my modifications.
There is one peculiarity I need to point out. tensor2tensor can dynamically load a t2t_usr_dir, a directory of additional Python modules that get loaded by import_usr_dir. I make use of that function in my serialization script as well:
if usr_dir:
    logging.info('Loading user dir %s' % usr_dir)
    import_usr_dir(usr_dir)
This could be the only culprit I can see at the moment although I would not be able to tell why this may cause the problem.
If anybody sees something I do not I'd be glad to get a hint what I'm doing wrong here.
So what is the error you're getting?
For the sake of completeness, this is the result of the wrong hparams() method being called:
NotFoundError (see above for traceback): Restoring from checkpoint failed.
Key transformer/symbol_modality_256_256/softmax/weights_0 not found in checkpoint
symbol_modality_256_256 is wrong. It should be symbol_modality_<vocab-size>_256 where <vocab-size> is a vocabulary size which gets set in SpeechRecognitionProblemMod.hparams.
So, this weird behavior came from the fact that I was remote debugging and that the source files of the usr_dir were not correctly synchronized. Everything works as intended, but the source files were not matching.
Case closed.

AssertionError, although the expected call looks the same as the actual call

I made a command in django which calls a function.
That function does a django orm call:
def get_notes():
    notes = Note.objects.filter(number=2, new=1)
    return [x.note for x in notes]
I want to patch the actual lookup:
@mock.patch('Note.objects.filter', autospec=True)
def test_get_all_notes(self, notes_mock):
    get_notes()
    notes_mock.assert_called_once_with(number=2, new=1)
I get the following assertion error:
AssertionError: Expected call: filter(number=2, new=1)
Actual call: filter(number=2, new=1)
I searched Google and Stack Overflow for hours, but I still haven't a clue.
Can anyone point me in the right direction, I think it might be an obvious mistake I'm making...
AFAIK you can't use patch() like this. Patch target should be a string in the form package.module.ClassName. I don't know much about django but I suppose Note is a class so Note.objects.filter is not something you can import and hence use in patch(). Also I don't think patch() can handle attributes. Actually I don't quite understand why the patch works at all.
Try using patch.object() which is specifically designed to patch class attributes. It implies Note is already imported in your test module.
@mock.patch.object(Note, 'objects')
def test_get_all_notes(self, objects_mock):
    get_notes()
    objects_mock.filter.assert_called_once_with(number=2, new=1)
I've removed autospec because I'm not sure it will work correctly in this case. You can try putting it back if it works.
Another option might be to use patch() on whatever you get with type(Note.objects) (probably some django class).
As I've said I don't know much about django so I'm not sure if these things work.

shelve gives strange error

I'm trying to put some sites I crawled into a shelve, but the shelve won't accept any Site objects. It will accept lists, strings, tuples, what have you, but as soon as I put in a Site object, it crashes when I try to get the contents of the shelve.
So when I fill up my shelve like this:
def add_to_shelve(self, site):
    db = shelve.open("database")
    print site, site.url
    for word in site.content:
        db[word] = site.url  # site.url is a string, word has to be one too
shelve.open("database")['whatever'] works perfectly.
But if I do this:
def add_to_shelve(self, site):
    db = shelve.open("database")
    print site, site.url
    for word in site.content:
        db[word] = site  # site is now an object of Site
shelve.open("database")['whatever'] errors out with this error message:
AttributeError: 'module' object has no attribute 'Site'
I'm completely stumped, and the pythondocs, strangely, don't have much info either. All they say is that the key in a shelve has to be a string, but the value or data can be "an arbitrary object"
It looks like you refactored your code after saving objects in the shelve. When retrieving objects from the shelve, Python rebuilds the object, and it needs to find the original class that, presumably, you have moved. This problem is typical when working with pickle (as the shelve module does).
The solution, as pduel suggests, is to provide a backwards-compatibility reference to the class in the same location where it used to be, so that pickle can find it. If you re-save all the objects, thereby rebuilding the pickles, you can remove that backwards-compatibility reference.
It seems that Python is looking for a constructor for a 'Site' object, and not finding it. I have not used shelve, but I recall the rules for what can be pickled are byzantine, and suspect the shelve rules are similar.
Try adding the line:
Site = sitemodule.Site
(with the name of the module providing 'Site') before you try unshelving. This ensures that a Site class can be found.
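The failure and the alias fix can be reproduced with pickle alone, which shelve uses under the hood. This is a Python 3 sketch under stated assumptions: sitemodule_demo is an in-memory module invented for the example, and deleting the attribute merely simulates the class having been moved elsewhere.

```python
import pickle
import sys
import types

# Build a throwaway module (hypothetical name) that defines Site.
old_mod = types.ModuleType("sitemodule_demo")
exec(
    "class Site:\n"
    "    def __init__(self, url):\n"
    "        self.url = url\n",
    old_mod.__dict__,
)
sys.modules["sitemodule_demo"] = old_mod

# Pickle an instance while the class still lives at sitemodule_demo.Site.
data = pickle.dumps(old_mod.Site("http://example.com"))

# Simulate a refactoring: the class is no longer available under its old name.
site_class = old_mod.Site
del old_mod.Site

try:
    pickle.loads(data)
    load_error = None
except AttributeError as exc:
    load_error = exc          # pickle cannot find Site in the recorded module

# Backwards-compatibility alias: put the class back where the pickle expects it.
old_mod.Site = site_class
restored = pickle.loads(data)
```

The pickle only records the module and class name, so the lookup happens at load time; that is why the alias has to exist before unshelving.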
