I am using h5py with mpi4py. I am opening an HDF5 file with h5py.File(fname, 'w', driver='mpio', comm=MPI.COMM_WORLD), but I get a NameError.
I checked the source code that the error comes from: it needs h5py.h5.get_config().mpi to be True in order to import mpi4py, but it is set to False.
I have mpi4py installed and it works well.
The problems began when I updated numpy. I tried to go back to the previous version, but that didn't solve the problem; before this update I had no problems with h5py.
The full error message is:
File "main.py", line 87, in <module>
memory = H5_memory(MEM_SIZE, STATE_SHAPE , fname)
File "/My/work/dir/memory.py", line 185, in __init__
self.f = h5py.File(fname, 'w', driver='mpio', comm=MPI.COMM_WORLD)
File "/home/miniconda/envs/lib/python3.5/site-packages/h5py/_hl/files.py", line 270, in __init__
fapl = make_fapl(driver, libver, **kwds)
File "/hom/miniconda/envs/lib/python3.5/site-packages/h5py/_hl/files.py", line 73, in make_fapl
kwds.setdefault('info', mpi4py.MPI.Info())
NameError: name 'mpi4py' is not defined
Do you have any idea on how to solve this problem? I didn't find any answer that could help me online.
Thank you
Looking at the installation documentation for h5py, building against a parallel (MPI-enabled) HDF5 library is an opt-in option, so you might have installed h5py without that option or misconfigured an environment variable such as HDF5_MPI=ON.
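Before reinstalling, you can confirm how your current h5py build was compiled; a minimal check (it only inspects the existing installation):

import h5py

# False means this h5py build has no MPI support, so driver='mpio' cannot work
# regardless of whether mpi4py itself is installed and working.
print(h5py.h5.get_config().mpi)

If it prints False, rebuilding h5py from source against a parallel HDF5 build (with HDF5_MPI set as described in the h5py installation docs) should make the mpio driver available again.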
I am getting the following error when I deserialize a causalml (0.10.0) model on Linux (x86-64) that was serialized on OS X (darwin):
ValueError: numpy.ndarray size changed, may indicate binary incompatibility.
Unexpectedly, trying to deserialize it again in the same Python session does succeed!
The environment
On the serializing machine:
Python 3.8, in a poetry .venv
numpy 1.18.5 (the latest version compatible with causalml 0.10.0)
os-x
On the deserializing machine:
docker based on AWS lambda python 3.8
Python 3.8
linux x86_64
Both have cython version 0.28, causalml version 0.10.0.
With cython version 0.29.26 (compatible according to pip), rerunning does not succeed.
The error gets raised in causaltree.cpython-38-x86_64-linux-gnu.so.
Joblib or Pickle
I tried both Python's pickle and joblib; both raise the error.
In the case of using joblib, the following stacktrace occurs:
File "/var/task/joblib/numpy_pickle.py", line 577, in load
obj = _unpickle(fobj)
File "/var/task/joblib/numpy_pickle.py", line 506, in _unpickle
obj = unpickler.load()
File "/var/lang/lib/python3.8/pickle.py", line 1212, in load
dispatch[key[0]](self)
File "/var/lang/lib/python3.8/pickle.py", line 1537, in load_stack_global
self.append(self.find_class(module, name))
File "/var/lang/lib/python3.8/pickle.py", line 1579, in find_class
__import__(module, level=0)
File "/var/task/causalml/inference/tree/__init__.py", line 3, in <module>
from .causaltree import CausalMSE, CausalTreeRegressor
File "__init__.pxd", line 238, in init causalml.inference.tree.causaltree
Using a more recent numpy version
Other answers mention that upgrading (in the deserializing environment) to a more recent numpy, which should be backwards compatible, could help. In my case it did not.
After installing causalml, I separately ran pip3 install --upgrade numpy==XXX to replace the numpy version on the deserializing machine.
With both numpy 1.18.5 and 1.19.5, the error mentions: Expected 96 from C header, got 80 from PyObject
With numpy 1.20.3, the error mentions: Expected 96 from C header, got 88 from PyObject
Can other numpy arrays be serialized & deserialized?: Yes
To verify whether numpy serialization & deserialization is possible at all, I tested serializing a random array (both with pickle and joblib):
import pickle, joblib
import numpy as np

arr = np.random.rand(100)  # the random test array
with open(str(path / "numpy.pkl"), "wb") as f:
    pickle.dump(arr, f, protocol=5)
with open(str(path / "numpy.joblib"), "wb") as f:
    joblib.dump(arr, f, compress=True)
These actually deserialize without errors:
with open(str(path / "numpy.pkl"), "rb") as f:
    read_object = pickle.load(f)
with open(str(path / "numpy.joblib"), "rb") as f:
    read_object = joblib.load(f)
Numpy source
If I look at the source code of numpy at this line, it seems the error only gets raised when the retrieved size is bigger than the expected size.
Some other (older) Stack Overflow answers mention that the warnings can be silenced as follows, but that didn't help either:
import warnings

warnings.filterwarnings("ignore", message="numpy.dtype size changed")
warnings.filterwarnings("ignore", message="numpy.ufunc size changed")
Trying twice solves it
I found one way to solve this: in the same python session, load the serialized model twice. The first time raises the error, the second time it does not.
The loaded model then does behave as expected.
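For completeness, here is a minimal sketch of that load-twice workaround (assuming joblib and a hypothetical model_path; plain pickle behaves the same way in my case):

import joblib

def load_model_with_retry(model_path):
    # The first attempt may raise the binary-incompatibility ValueError;
    # within the same interpreter session a second attempt then succeeds.
    try:
        return joblib.load(model_path)
    except ValueError:
        return joblib.load(model_path)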
What is happening? And is there a way to make it succeed the first time?
I am trying to run code that was written for older versions of torch and torchtext, and I have adjusted a lot of it to make it work. I was able to pre-process and train my data. Lastly I tried to run the test script; after solving multiple errors, I am now getting this one:
Batch size > 1 not implemented! Falling back to batch_size = 1 ...
Traceback (most recent call last):
File "translate_mm.py", line 166, in <module>
main()
File "translate_mm.py", line 84, in main
onmt.ModelConstructor.load_test_model(opt, dummy_opt.__dict__)
File "/onmt/ModelConstructor.py", line 145, in load_test_model
checkpoint['vocab'], data_type=opt.data_type)
File "/onmt/io/IO.py", line 57, in load_fields_from_vocab
fields = get_fields(data_type, n_src_features, n_tgt_features)
File "/onmt/io/IO.py", line 43, in get_fields
return TextDataset.get_fields(n_src_features, n_tgt_features)
File "/onmt/io/TextDataset.py", line 218, in get_fields
postprocessing=make_src, sequential=False)
TypeError: __init__() got an unexpected keyword argument 'tensor_type'
I have tried downgrading to older versions of PyTorch; however, when doing this I get a ModuleNotFoundError, namely:
ModuleNotFoundError: No module named 'torchtext.legacy'
I have also tried running it on Anaconda, with the proper pytorch and torchtext versions according to the requirements, but there I get an entirely different error:
import torch._dl as _dl_flags
ImportError: No module named _dl
I just need to test the data at this point, everything else seems to have worked out. Any help would be greatly appreciated.
The older versions of torchtext do not have a legacy module, so if you remove that part of the call that should fix the error.
i.e. torchtext.legacy.___ -> torchtext.___
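If the script needs to work across both old and new torchtext versions, a small compatibility shim is also possible (a sketch only; substitute whatever names the code actually imports):

try:
    # torchtext >= 0.9 keeps the old Field/Iterator API under .legacy
    from torchtext.legacy import data, datasets
except ImportError:
    # older torchtext exposes the same API at the top level
    from torchtext import data, datasets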
After training my model for almost 2 days, 3 files were generated:
best_model.ckpt.data-00000-of-00001
best_model.ckpt.index
best_model.ckpt.meta
where best_model is my model name.
When I try to import my model using the following command
import tensorflow as tf

with tf.Session() as sess:
    saver = tf.train.import_meta_graph('best_model.ckpt.meta')
    saver.restore(sess, "best_model.ckpt")
I get the following error
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "/home/shreyash/.local/lib/python2.7/site-
packages/tensorflow/python/training/saver.py", line 1577, in
import_meta_graph
**kwargs)
File "/home/shreyash/.local/lib/python2.7/site-
packages/tensorflow/python/framework/meta_graph.py", line 498, in import_scoped_meta_graph
producer_op_list=producer_op_list)
File "/home/shreyash/.local/lib/python2.7/site-packages/tensorflow/python/framework/importer.py", line 259, in import_graph_def
raise ValueError('No op named %s in defined operations.' % node.op)
ValueError: No op named attn_add_fun_f32f32f32 in defined operations.
How to fix this?
I have referred to this post: TensorFlow, why there are 3 files after saving the model?
Tensorflow version 1.0.0 installed using pip
Linux version 16.04
python 2.7
The importer can't find a very specific op in your graph, namely attn_add_fun_f32f32f32, which is likely one of the attention functions.
You have probably stepped into this issue. However, they say it's bundled in tensorflow 1.0. Double-check that your installed tensorflow version contains attention_decoder_fn.py (or, if you are using another library, check that the corresponding file is there).
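A quick way to check that without guessing the module path is to search the installed package for the file (a hedged sketch; it only reports whether the file ships with your TensorFlow install):

import os
import tensorflow as tf

tf_root = os.path.dirname(tf.__file__)
hits = [os.path.join(root, name)
        for root, _, names in os.walk(tf_root)
        for name in names if name == "attention_decoder_fn.py"]
print(hits or "attention_decoder_fn.py not found in this TensorFlow install")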
If it's there, here are your options:
Rename this operation, if possible. You might want to read this discussion for workarounds.
Recreate your graph definition in code, so that you won't have to call import_meta_graph at all and can restore the checkpoint into the current graph instead (see the sketch below).
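A minimal sketch of that second option, assuming a hypothetical build_model() function that recreates the same variables and ops in code (TF 1.x API):

import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    build_model()              # hypothetical: rebuilds the same variables/ops in code
    saver = tf.train.Saver()   # restores variable values only; no MetaGraph import
with tf.Session(graph=graph) as sess:
    saver.restore(sess, "best_model.ckpt")

Because import_meta_graph is never called, the unknown attn_add_fun_f32f32f32 op from the saved MetaGraph never has to be resolved.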
The overall goal is to use NumbaPro to run some functions on the GPU (on OS X 10.8.3).
Before starting, I just wanted to get everything set up. According to this page I installed CUDA, registered as a CUDA developer, downloaded the Compiler SDK and set up the NUMBAPRO_NVVM=/path/to/libnvvm.dylib environment variable.
However, running this basic test function:
from numbapro import autojit

@autojit(target='gpu')
def my_function(x):
    if x == 0.0:
        return 1.0
    else:
        return x*x*x

print my_function(4.4)
exit()
Brings up this error:
File ".../anaconda/lib/python2.7/site-packages/numba/decorators.py", line 207, in compile_function
compiled_function = dec(f)
File "...lib/python2.7/site-packages/numbapro/cudapipeline/decorators.py", line 35, in _jit_decorator
File "...lib/python2.7/site-packages/numbapro/cudapipeline/decorators.py", line 128, in __init__
File "...lib/python2.7/site-packages/numbapro/cudapipeline/environment.py", line 31, in generate_ptx
File "...lib/python2.7/site-packages/numbapro/cudapipeline/environment.py", line 186, in _link_llvm_math_intrinsics
KeyError: 1
I've tried @vectorize'ing instead of autojit, same error.
@autojit by itself with no target works fine.
Any ideas?
For posterity's sake, I asked Continuum Support. They responded:
It seems that you are running a CUDA GPU with compute capability 1.x. NVVM only supports CC2.0 and above. We definitely should have a better error reporting and make it clear in the NumbaPro documentation for the supported compute capability.
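If you want to confirm the compute capability of your card from Python, the current numba CUDA API exposes it (a sketch only; NumbaPro itself is long deprecated and its query API may have differed):

from numba import cuda

dev = cuda.get_current_device()
# NVVM requires compute capability 2.0 or higher, so a (1, x) result here
# explains the failure.
print(dev.name, dev.compute_capability)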
I am using rss2email to convert a number of RSS feeds into mail for easier consumption. That is, I was using it, because it broke in a horrible way today: on every run, it only gives me this backtrace:
Traceback (most recent call last):
File "/usr/share/rss2email/rss2email.py", line 740, in <module>
elif action == "list": list()
File "/usr/share/rss2email/rss2email.py", line 681, in list
feeds, feedfileObject = load(lock=0)
File "/usr/share/rss2email/rss2email.py", line 422, in load
feeds = pickle.load(feedfileObject)
TypeError: ("'str' object is not callable", 'sxOYAAuyzSx0WqN3BVPjE+6pgPU', ((2009, 3, 19, 1, 19, 31, 3, 78, 0), {}))
The only helpful fact that I have been able to glean from this backtrace is that the file ~/.rss2email/feeds.dat, in which rss2email keeps all its configuration and runtime state, is somehow broken. Apparently, rss2email reads its state and dumps it back using cPickle on every run.
I have even found the line containing that 'sxOYAAuyzSx0WqN3BVPjE+6pgPU' string mentioned above in the giant (>12MB) feeds.dat file. To my untrained eye, the dump does not appear to be truncated or otherwise damaged.
What approaches could I try in order to reconstruct the file?
The Python version is 2.5.4 on a Debian/unstable system.
EDIT
Peter Gibson and J.F. Sebastian have suggested directly loading from the
pickle file and I had tried that before. Apparently, a Feed class
that is defined in rss2email.py is needed, so here's my script:
#!/usr/bin/python
import sys
# import pickle
import cPickle as pickle
sys.path.insert(0,"/usr/share/rss2email")
from rss2email import Feed
feedfile = open("feeds.dat", 'rb')
feeds = pickle.load(feedfile)
The "plain" pickle variant produces the following traceback:
Traceback (most recent call last):
File "./r2e-rescue.py", line 8, in <module>
feeds = pickle.load(feedfile)
File "/usr/lib/python2.5/pickle.py", line 1370, in load
return Unpickler(file).load()
File "/usr/lib/python2.5/pickle.py", line 858, in load
dispatch[key](self)
File "/usr/lib/python2.5/pickle.py", line 1133, in load_reduce
value = func(*args)
TypeError: 'str' object is not callable
The cPickle variant produces essentially the same thing as calling
r2e itself:
Traceback (most recent call last):
File "./r2e-rescue.py", line 10, in <module>
feeds = pickle.load(feedfile)
TypeError: ("'str' object is not callable", 'sxOYAAuyzSx0WqN3BVPjE+6pgPU', ((2009, 3, 19, 1, 19, 31, 3, 78, 0), {}))
EDIT 2
Following J.F. Sebastian's suggestion of adding "printf debugging" to Feed.__setstate__ in my test script, these are the last few lines before Python bails out.
u'http:/com/news.ars/post/20080924-everyone-declares-victory-in-smutfree-wireless-broadband-test.html': u'http:/com/news.ars/post/20080924-everyone-declares-victory-in-smutfree-wireless-broadband-test.html'},
'to': None,
'url': 'http://arstechnica.com/'}
Traceback (most recent call last):
File "./r2e-rescue.py", line 23, in ?
feeds = pickle.load(feedfile)
TypeError: ("'str' object is not callable", 'sxOYAAuyzSx0WqN3BVPjE+6pgPU', ((2009, 3, 19, 1, 19, 31, 3, 78, 0), {}))
The same thing happens on a Debian/etch box using python 2.4.4-2.
How I solved my problem
A Perl port of pickle.py
Following J.F. Sebastian's comment about how simple the pickle format is, I went out to port parts of pickle.py to Perl. A couple of quick regular expressions would have been a faster way to access my data, but I felt that the hack value and the opportunity to learn more about Python would be worth it. Plus, I still feel much more comfortable using (and debugging code in) Perl than Python.
Most of the porting effort (simple types, tuples, lists, dictionaries) went very smoothly. Perl's and Python's different notions of classes and objects have been the only issue so far where more than a simple translation of idioms was needed. The result is a module called Pickle::Parse which, after a bit of polishing, will be published on CPAN.
A module called Python::Serialise::Pickle existed on CPAN, but I
found its parsing capabilities lacking: It spews debugging output all
over the place and doesn't seem to support classes/objects.
Parsing, transforming data, detecting actual errors in the stream
Based upon Pickle::Parse, I tried to parse the feeds.dat file. After a few iterations of fixing trivial bugs in my parsing code, I got an error message that was strikingly similar to pickle.py's original "object not callable" error message:
Can't use string ("sxOYAAuyzSx0WqN3BVPjE+6pgPU") as a subroutine
ref while "strict refs" in use at lib/Pickle/Parse.pm line 489,
<STDIN> line 187102.
Ha! Now we're at a point where it's quite likely that the actual data
stream is broken. Plus, we get an idea where it is broken.
It turned out that the first line of the following sequence was wrong:
g7724
((I2009
I3
I19
I1
I19
I31
I3
I78
I0
t(dtRp62457
Position 7724 in the "memo" pointed to that string
"sxOYAAuyzSx0WqN3BVPjE+6pgPU". From similar records earlier in the
stream, it was clear that a time.struct_time object was needed
instead. All later records shared this wrong pointer. With a simple
search/replace operation, it was trivial to fix this.
I find it ironic that I found the source of the error by accident
through Perl's feature that tells the user its position in the input
data stream when it dies.
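For what it's worth, Python's standard-library pickletools module could also have disassembled the stream and reported the byte offset of every opcode, which would have exposed the memo slot pointing at a string where a time.struct_time was expected. A minimal sketch (not what I actually used):

import pickletools

with open("feeds.dat", "rb") as f:
    # Prints every opcode with its byte offset; a bad memo reference shows up
    # as a GET pointing at the wrong kind of object.
    pickletools.dis(f.read())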
Conclusion
I will move away from rss2email as soon as I find time to
automatically transform its pickled configuration/state mess to
another tool's format.
pickle.py needs more meaningful error messages that tell the user about the position in the data stream (not the position in its own code) where things go wrong.
Porting parts of pickle.py to Perl was fun and, in the end, rewarding.
Have you tried manually loading the feeds.dat file using both cPickle and pickle? If the output differs it might hint at the error.
Something like (from your home directory):
import cPickle, pickle
f = open('.rss2email/feeds.dat', 'r')
obj1 = cPickle.load(f)
f.seek(0)  # rewind so the second load starts from the beginning of the file
obj2 = pickle.load(f)
(you might need to open in binary mode 'rb' if rss2email doesn't pickle in ascii).
Pete
Edit: The fact that cPickle and pickle give the same error suggests that the feeds.dat file is the problem. Probably a change in the Feed class between versions of rss2email as suggested in the Ubuntu bug J.F. Sebastian links to.
Sounds like the internals of cPickle are getting tangled up. This thread (http://bytes.com/groups/python/565085-cpickle-problems) looks like it might have a clue.
'sxOYAAuyzSx0WqN3BVPjE+6pgPU' is most probably unrelated to the pickle problem.
Post an error traceback for the following (to determine what class defines the attribute that can't be called, i.e. the one that leads to the TypeError):
python -c "import pickle; pickle.load(open('feeds.dat'))"
EDIT:
Add the following to your code and run (redirect stderr to file then use 'tail -2' on it to print last 2 lines):
import sys
from pprint import pprint

def setstate(self, dict_):
    pprint(dict_, stream=sys.stderr, depth=None)
    self.__dict__.update(dict_)

Feed.__setstate__ = setstate
If the above doesn't yield an interesting output then use general troubleshooting tactics:
Confirm that 'feeds.dat' is the problem:
backup ~/.rss2email directory
install rss2email into virtualenv/pip sandbox (or use zc.buildout) to isolate the environment (make sure you are using feedparser.py from the trunk).
add a couple of feeds; keep adding feeds until 'feeds.dat' is larger than the current one. Run some tests.
try old 'feeds.dat'
try new 'feeds.dat' on existing rss2email installation
See r2e bails out with TypeError bug on Ubuntu.