I am getting the error ValueError: buffer is smaller than requested size, but have been unable to find what it means. I am new to NumPy and its .frombuffer() function. The code that triggers the error is:
ARRAY_1 = 400000004
fid = open(fn, 'rb')
fid.seek(w+x+y+z)  # w+x+y+z=
if (condition == 0):
    b = fid.read(struct.calcsize(fmt+str(ARRAY_1)+'b'))
    myClass.y = numpy.frombuffer(b, 'b', struct.calcsize(fmt+str(ARRAY_1)+'b'))
else:
    b = fid.read(struct.calcsize(fmt+str(ARRAY_1)+'h'))
    myClass.y = numpy.frombuffer(b, 'h', struct.calcsize(fmt+str(ARRAY_1)+'h'))  # error on this line
where fmt is '>' when condition == 0 and '<' when condition != 0; this controls whether the binary file is read as big endian or little endian. fid is a binary file that has already been opened.
Debugging shows that condition = 1 at this point, so I suspect the corresponding line in the if branch has the same problem; I just don't see it yet.
As I said, I tried to find out what the error means but haven't had any luck. If anyone knows why it's failing, I'd really appreciate the help.
calcsize gives the number of bytes that the buffer will have given the format.
In [421]: struct.calcsize('>100h')
Out[421]: 200
In [422]: struct.calcsize('>100b')
Out[422]: 100
h takes 2 bytes per item, so for 100 items, it gives 200 bytes.
For frombuffer, the 3rd argument is
count : int, optional
Number of items to read. ``-1`` means all data in the buffer.
So the count should be 100 (the number of items), not 200 (the number of bytes).
Reading a simple bytestring (in Py3):
In [429]: np.frombuffer(b'one two three ','b',14)
Out[429]: array([111, 110, 101, 32, 116, 119, 111, 32, 116, 104, 114, 101, 101, 32], dtype=int8)
In [430]: np.frombuffer(b'one two three ','h',14)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-430-30077e924a4c> in <module>()
----> 1 np.frombuffer(b'one two three ','h',14)
ValueError: buffer is smaller than requested size
In [431]: np.frombuffer(b'one two three ','h',7)
Out[431]: array([28271, 8293, 30580, 8303, 26740, 25970, 8293], dtype=int16)
To read the same buffer as 'h', the count has to be half of what it was for the 'b' read.
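Putting that back into the shape of the code in the question, a minimal standalone sketch (the format, sizes and values below are made up for illustration):
import struct
import numpy as np

n_items = 5
fmt = '>'                                                       # big endian, as in the question
raw = struct.pack(fmt + str(n_items) + 'h', *range(n_items))   # stand-in for fid.read()

# calcsize() is still right for read(), which wants a byte count ...
nbytes = struct.calcsize(fmt + str(n_items) + 'h')              # 2 bytes per 'h' item
assert len(raw) == nbytes
# ... but frombuffer()'s third argument is an item count (or -1 for "all"),
# and the byte order belongs in the dtype so '>' files are decoded correctly
arr = np.frombuffer(raw, fmt + 'h', n_items)
print(arr)                                                      # [0 1 2 3 4]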
Maybe the question is dumb, but so far I have not been able to find a solution.
I have been handed code from another person who was probably working with a different setup than mine (e.g. Python 2 instead of 3, etc.).
So I have made some small changes to get things working, but I am stuck on what is probably a simple problem related to h5py.
The part of the code where it crashes looks like this:
labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']
for i in range(len(Labels)):
    labels_ALL.append(Labels[i])
    units_ALL.append('(mol/L)')
base.create_dataset('Labels', data=labels_ALL)
base.create_dataset('Units', data=units_ALL)
The problem seems to be in base.create_dataset:
Traceback (most recent call last):
File "C:\Users\DaniJ\Documents\PostDoc_Jena\Trips, Conf, etc\Sinfonia Workshop\Exercise_1\exercise_1_SINFONIA_for_One\NR_chem_SINGLE_NoEu.py", line 252, in <module>
base.create_dataset('Labels', data=labels_ALL)
File "C:\Users\DaniJ\anaconda3\lib\site-packages\h5py\_hl\group.py", line 136, in create_dataset
dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)
File "C:\Users\DaniJ\anaconda3\lib\site-packages\h5py\_hl\dataset.py", line 118, in make_new_dset
tid = h5t.py_create(dtype, logical=1)
File "h5py\h5t.pyx", line 1634, in h5py.h5t.py_create
File "h5py\h5t.pyx", line 1656, in h5py.h5t.py_create
File "h5py\h5t.pyx", line 1717, in h5py.h5t.py_create
TypeError: No conversion path for dtype: dtype('<U10')
The variable base seems to be an h5py._hl.files.File object.
Does somebody know how I can solve this problem?
Thanks
Best regards,
Dani
Did you solve your problem? I'm 99.9% sure it's related to your Labels data -- likely it's in a NumPy array instead of a List. I wrote 3 short examples to demonstrate the difference.
The first code segment uses a List and successfully creates the datasets in file SO_69900543_1.h5.
The second code segment reproduces your error. It converts the List to a NumPy array and then fails when attempting to create the datasets in file SO_69900543_2.h5. Notice that it gives the same error message you encountered: TypeError: No conversion path for dtype: dtype('<U10').
The third code segment shows how to convert numpy.str_ elements to str (which solves the problem in segment #2). Note that each Labels value is converted with str() before it is appended to labels_ALL.
Maybe this will help you find (and fix) your problem with Unicode data.
Code segment 1 (works):
import h5py

Labels = ['H+','Na+','Cl-','OH-','>SOH_x','>SO-_x','>SONa_x','>SOH2+_x','>SOH2Cl_x','>SOH_y','>SO-_y','>SONa_y']
labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']
for i in range(len(Labels)):
    labels_ALL.append(Labels[i])
    units_ALL.append('(mol/L)')
with h5py.File('SO_69900543_1.h5','w') as base:
    base.create_dataset('Labels', data=labels_ALL)
    base.create_dataset('Units', data=units_ALL)
Code segment 2 (returns TypeError):
import h5py
import numpy as np

Labels = ['H+','Na+','Cl-','OH-','>SOH_x','>SO-_x','>SONa_x','>SOH2+_x','>SOH2Cl_x','>SOH_y','>SO-_y','>SONa_y']
# Convert Labels List to NumPy array
# This will trigger the error when creating the dataset
Labels = np.array(Labels)
labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']
for i in range(len(Labels)):
    labels_ALL.append(Labels[i])
    units_ALL.append('(mol/L)')
for i in range(len(labels_ALL)):
    print(i, type(labels_ALL[i]), type(units_ALL[i]))
with h5py.File('SO_69900543_2.h5','w') as base:
    base.create_dataset('Labels', data=labels_ALL)
    base.create_dataset('Units', data=units_ALL)
Code segment 3 (works):
import h5py
import numpy as np

Labels = ['H+','Na+','Cl-','OH-','>SOH_x','>SO-_x','>SONa_x','>SOH2+_x','>SOH2Cl_x','>SOH_y','>SO-_y','>SONa_y']
# Convert Labels List to NumPy array
# This will trigger the error when creating the dataset if not modified
Labels = np.array(Labels)
labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']
for i in range(len(Labels)):
    # use str() to convert from 'numpy.str_' to 'str'
    labels_ALL.append(str(Labels[i]))
    units_ALL.append('(mol/L)')
for i in range(len(labels_ALL)):
    print(i, type(labels_ALL[i]), type(units_ALL[i]))
with h5py.File('SO_69900543_2.h5','w') as base:
    base.create_dataset('Labels', data=labels_ALL)
    base.create_dataset('Units', data=units_ALL)
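As a side note (my addition, not something the segments above do): if Labels really is a NumPy unicode array, its .tolist() method already returns plain Python str elements, so the per-element str() conversion can be replaced by a one-liner:
import numpy as np

Labels = np.array(['H+', 'Na+', 'Cl-'])   # hypothetical small example
labels = Labels.tolist()                  # numpy.str_ -> str for every element
print(type(labels[0]))                    # <class 'str'>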
I am trying to implement my own version of the MATLAB function imhmin() in Python using OpenCV and (naturally) NumPy. If you are not familiar with this MATLAB function, it's extremely useful for segmentation. MATLAB's documentation explains it much better than I can:
https://it.mathworks.com/help/images/ref/imhmin.html
Here is what I have so far:
(For the sake of keeping this short, I did not include the local_min function. It takes one image parameter and returns an image of the same size where local minima are 1s and everything else is 0.)
from volume import show
import cv2
import numpy
def main():
    arr = numpy.array([[5,5,5,5,5,5,5],
                       [5,0,3,1,4,2,5],
                       [5,5,5,5,5,5,5]]) + 1
    res = imhmin(arr, 3)
    print(res)

def imhmin(src, h):
    # TODO: speed up function by cropping image
    edm = src.copy()
    # d is the domain / all values contained in the array
    d = numpy.unique(edm)
    # for the index of each local minima (sorted gtl)
    indices = numpy.nonzero(local_min(edm))  # get indices
    indices = numpy.dstack((indices[0], indices[1]))[0].tolist()  # zip
    # sort based on the value of edm[] at that index
    indices.sort(key=lambda _: edm[_[0], _[1]], reverse=True)
    for (x, y) in indices:
        start = edm[x, y]  # remember original value of minima
        # for each in a list of heights greater than the starting height
        for i in range(*numpy.where(d == edm[x, y])[0], d.shape[0]-1):
            # prevent exceeding target height
            step = start + h if (d[i+1] - start > h) else d[i+1]
            # -------------- WORKS UNTIL HERE -------------- #
            # complete floodFill syntax:
            # cv2.floodFill(image, mask, seed, newVal[, loDiff[, upDiff[, flags]]]) → retval, rect
            # fill UPWARD onto image (and onto mask?)
            cv2.floodFill(edm, None, (y, x), step, 0, step - d[i], 4)
            # fill DOWNWARD NOT onto image
            # have you overflowed?

if __name__ == "__main__":
    main()
This works fine until it gets to the floodFill line, where it barks this error back:
Traceback (most recent call last):
File "edm.py", line 94, in <module>
main()
File "edm.py", line 14, in main
res = imhmin(arr, 3)
File "edm.py", line 66, in imhmin
cv2.floodFill(edm, None, (y,x), step, 0, step-d[i], 4)
TypeError: Layout of the output array image is incompatible with cv::Mat (step[ndims-1] != elemsize or step[1] != elemsize*nchannels)
At first I thought the way I laid out the parameters was wrong because of the mentions of step in the traceback, but after renaming that variable I concluded that step is a variable name inside OpenCV's own code. The message is about the output array, and I'm not using a mask, so something must be wrong with the array edm.
I can suppress this error by replacing the floodFill line with this one:
cv2.floodFill(edm.astype(numpy.double), None, (y,x), step, 0, step-d[i], 4)
The only difference is that I am casting the NumPy array to a float array. Then I am left with this error:
Traceback (most recent call last):
File "edm.py", line 92, in <module>
main()
File "edm.py", line 14, in main
res = imhmin(arr, 3)
File "edm.py", line 64, in imhmin
cv2.floodFill(edm.astype(numpy.double), None, (y,x), step, 0, step-d[i], 4)
TypeError: Scalar value for argument 'newVal' is not numeric
This is where I started suspecting something was seriously wrong, because step is "obviously" going to be an integer here (maybe it isn't obvious, but I did try printing it and it looks like it's just an integer, not an array of one integer or anything weird like that).
To entertain the error message, I typecast the newVal parameter to a float. I got pretty much the exact same error message about the upDiff parameter, so I just typecast that too, resulting in this line of code:
cv2.floodFill(edm.astype(numpy.double), None, (y,x), float(step), 0, float(step-d[i]), 4)
I know this isn't how I want to be doing things, but I just wanted to see what would happen. What happened was I got this scary looking error:
Traceback (most recent call last):
File "edm.py", line 92, in <module>
main()
File "edm.py", line 14, in main
res = imhmin(arr, 3)
File "edm.py", line 64, in imhmin
cv2.floodFill(edm.astype(numpy.double), None, (y,x), float(step), 0, float(step-d[i]), 4)
cv2.error: OpenCV(3.4.2) /opt/concourse/worker/volumes/live/9523d527-1b9e-48e0-7ed0-a36adde286f0/volume/opencv-suite_1535558719691/work/modules/imgproc/src/floodfill.cpp:587: error: (-210:Unsupported format or combination of formats) in function 'floodFill'
I don't even know where to start with this. I've used OpenCV's floodfill function many times before and have never run into problems like this. Can anyone provide any insight?
Thanks in advance
Antonio
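For what it's worth, here is a minimal sketch that avoids all three errors, under the assumption that floodFill only accepts single-channel 8-bit or 32-bit float images (so the 64-bit double produced by astype(numpy.double) is rejected) and wants plain Python scalars for newVal/loDiff/upDiff. The array and values below are illustrative, not the original imhmin code:
import numpy as np
import cv2

arr = np.array([[5, 5, 5, 5, 5],
                [5, 0, 3, 1, 5],
                [5, 5, 5, 5, 5]]) + 1

edm = arr.astype(np.float32)       # persistent float32 working copy, not a throwaway temporary
seed = (1, 1)                      # (x, y) order, i.e. (column, row)
new_val = float(edm[1, 1] + 3)     # cast NumPy scalars to plain Python floats
cv2.floodFill(edm, None, seed, new_val, 0.0, 2.0, 4)
print(edm)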
Python struct.pack_into with format char 'x' requires more bytes.
I am trying to learn about Python byte arrays so I can write my own IP, TCP, and UDP headers. I use the struct module in Python to pack and unpack binary data as the specified types, given a format string.
ba2 = bytearray(2)
print(ba2, "The size: ", ba2.__len__())
struct.pack_into(">bx", ba2, 1, 1)
print(struct.unpack(">bx", ba2))
Now, when I try to pack into a buffer of length 2 with ">bx" as the format, as in the code above, I get this error:
bytearray(b'\x00\x00') The size: 2
Traceback (most recent call last):
File "D:/User/Documents/Python/Network/Main.py", line 58, in <module>
bitoperations_bytes_bytearrays_test()
File "D:/User/Documents/Python/Network/Main.py", line 49, in bitoperations_bytes_bytearrays_test
struct.pack_into(">bx", ba2, 1, 1)
struct.error: pack_into requires a buffer of at least 2 bytes
but I have a byte array of 2 bytes.
What am I doing wrong?
Please also point me to the relevant documentation if I have missed it (I have read the Python docs, but may have overlooked it).
Edit:
Sorry if I was unclear, but I just want to change the second byte in the byte array, hence the 'x' pad in the format.
As stupid as it sounds, the fix is simply to exclude the 'x' from the format, like this:
struct.pack_into(">b", ba2, 1, 1)
and the right packing is made, with this output:
bytearray(b'\x00\x00') The size: 2
A pack with one byte shift: 0001
(0, 1)
The problem is the combination of format and offset: the format '>bx' packs two bytes (the signed byte plus the pad byte), so writing it at offset 1 into a 2-byte buffer would need a third byte (refer to https://docs.python.org/2/library/struct.html). If you want to fill both bytes explicitly, pack two values starting at offset 0 instead. The following code fixes your problem:
import struct
ba2 = bytearray(2)
print(ba2, "The size: ", ba2.__len__())
struct.pack_into("bb", ba2, 0, 1, 1)
print(struct.unpack("bb", ba2))
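To make the byte accounting concrete, here is a small sketch (the values are just examples):
import struct

ba2 = bytearray(2)
print(struct.calcsize('>bx'))        # 2: the 'b' plus the pad byte
# at offset 1 those 2 bytes would overrun the buffer; at offset 0 they fit
struct.pack_into('>bx', ba2, 0, 1)
print(ba2)                           # bytearray(b'\x01\x00')
# to touch only the second byte, drop the pad and pack a single 'b' at offset 1
struct.pack_into('>b', ba2, 1, 7)
print(ba2)                           # bytearray(b'\x01\x07')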
I'm having some trouble writing a small function for feature scaling.
data = [115, 140, 175]

def featureScaling(arr):
    for a in arr:
        return (a - min(data)) / float((max(data) - min(data)))

print featureScaling(data)
I know there is a potential corner case if all values in data are the same (division by zero).
I just don't know why my idea does not work, since min(data) on its own works.
I get the error:
Traceback (most recent call last):
  File "vm_main.py", line 33, in <module>
    import main
  File "/tmp/vmuser_czbvlzqfxl/main.py", line 2, in <module>
    import studentMain
  File "/tmp/vmuser_czbvlzqfxl/studentMain.py", line 29, in <module>
    elif not compare_numbers(student_output[0], solution_output[0]):
TypeError: 'float' object has no attribute '__getitem__'
As Tichodroma noted, this code does not throw an error on its own; however, I do not believe it performs as you intended. It returns a single value (the return exits the loop on the first iteration), but I believe you actually want each data point scaled, hence the following modification:
data = [115, 140, 175]

def featureScaling(arr):
    scaled = []
    for a in arr:
        scaled.append((a - min(data)) / (max(data) - min(data)))
    return scaled

print(featureScaling(data))
This code provides the result [0.0, 0.4166666666666667, 1.0]
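For reference, the same scaling can also be written as a vectorized NumPy one-liner (a sketch; this assumes NumPy is acceptable for the exercise):
import numpy as np

data = np.array([115, 140, 175], dtype=float)
scaled = (data - data.min()) / (data.max() - data.min())
print(scaled)   # [0.         0.41666667 1.        ]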
I have the following dict which I want to write to a file in binary:
data = {(7, 190, 0): {0: 0, 1: 101, 2: 7, 3: 0, 4: 0},
(7, 189, 0): {0: 10, 1: 132, 2: 17, 3: 20, 4: 40}}
I went ahead and used the struct module in this way:
import struct

packed = []
for ssd, add_val in data.iteritems():
    # I am trying to use 0xcafe as a marker to tell me where to grab the keys
    pack_ssd = struct.pack('HBHB', 0xcafe, *ssd)
    packed.append(pack_ssd)
    for add, val in data[ssd].iteritems():
        pack_add_val = struct.pack('HH', add, val)
        packed.append(pack_add_val)
The output of this is packed = ['\xfe\xca\x07\x00\xbe\x00\x00', '\x00\x00\x00\x00', '\x01\x00e\x00', '\x02\x00\x07\x00', '\x03\x00\x00\x00', '\x04\x00\x00\x00', '\xfe\xca\x07\x00\xbd\x00\x00', '\x00\x00\n\x00', '\x01\x00\x84\x00', '\x02\x00\x11\x00', '\x03\x00\x14\x00', '\x04\x00(\x00']
After which I write it to a binary file:
ifile = open('test.bin', 'wb')
for pack in packed:
    ifile.write(pack)
Here is what the binary file looks like:
'\xfe\xca\x07\x00\xbe\x00\x00\x00\x00\x00\x00\x01\x00e\x00\x02\x00\x07\x00\x03\x00\x00\x00\x04\x00\x00\x00\xfe\xca\x07\x00\xbd\x00\x00\x00\x00\n\x00\x01\x00\x84\x00\x02\x00\x11\x00\x03\x00\x14\x00\x04\x00(\x00'
It's all OK until I try to unpack the data. Now I want to read the contents of the binary file back and arrange it into the dict I started with. This is how I tried to unpack it, but I always get an error:
unpack = []
while True:
    chunk = ifile.read(log_size)
    if len(chunk) == log_size:
        str = struct.unpack('HBHB', chunk)
        unpack.append(str)
        chunk = ifile.read(log1_size)
        str = struct.unpack('HH', chunk)
        unpack.append(str)
Traceback (most recent call last):
File "<interactive input>", line 7, in ?
error: unpack str size does not match format
I realize the way I tried to unpack will always run into problems, but I can't seem to find a good way to unpack the contents of the binary file. Any help is much appreciated.
If you need to write something custom, I would suggest doing the following:
1) 64-bit integer: number of keys
2) 64-bit integer * 3 * number of keys: the key tuple data
for each key i:
  3i) 64-bit integer: number of entries in dictionary i
  4i) 64-bit integer * 2 * number of entries for i: key data, value data, key data, value data, ...
After that, just make sure you read and write with the same endianness, and that an invalid length at any point (too high or too low) doesn't crash your program, and you are good.
The idea is that at any state the unpacker is either expecting a length or reading data of a known type, so it is 100% unambiguous where everything starts and ends as long as you follow the format. A minimal sketch of this layout follows below.
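This is a sketch of the layout described above, not the asker's original marker-based format; it assumes little-endian signed 64-bit integers ('<q') for every field:
import struct

def pack_data(data):
    keys = list(data)
    out = struct.pack('<q', len(keys))                 # 1) number of keys
    for key in keys:
        out += struct.pack('<3q', *key)                # 2) all key tuples, 3 integers each
    for key in keys:
        sub = data[key]
        out += struct.pack('<q', len(sub))             # 3i) number of entries in dictionary i
        for k, v in sub.items():
            out += struct.pack('<2q', k, v)            # 4i) key, value, key, value, ...
    return out

def unpack_data(buf):
    off = 0
    n_keys, = struct.unpack_from('<q', buf, off); off += 8
    keys = []
    for _ in range(n_keys):
        keys.append(struct.unpack_from('<3q', buf, off)); off += 24
    data = {}
    for key in keys:
        n_sub, = struct.unpack_from('<q', buf, off); off += 8
        sub = {}
        for _ in range(n_sub):
            k, v = struct.unpack_from('<2q', buf, off); off += 16
            sub[k] = v
        data[key] = sub
    return data

data = {(7, 190, 0): {0: 0, 1: 101, 2: 7, 3: 0, 4: 0},
        (7, 189, 0): {0: 10, 1: 132, 2: 17, 3: 20, 4: 40}}
assert unpack_data(pack_data(data)) == data            # round-trips cleanly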