Reading BSD500 groundTruth with scipy - python

I am trying to load a ground truth file using scipy's loadmat; it returns a numpy ndarray of type object (dtype='O').
From that object I manage to access each element (these are also ndarrays), but from that point I am struggling to access either the segmentation or the boundaries image.
I would like to transform this into a list of lists of ndarrays of numerical types. How can I do that?
Thanks in advance for any help

I found a way to fix my issue.
I do not think it is optimal but it works.
from scipy.io import loadmat

def load_bsd_gt(filename):
    gt = loadmat(filename)
    gt = gt['groundTruth']  # (1, n) object array, one entry per human annotation
    cols = gt.shape[1]
    what = ['Segmentation', 'Boundaries']
    ret = list()
    for i in range(cols):
        tmp = list()
        for w in what:
            # loadmat wraps each struct field in two extra object layers
            tmp.append(gt[0][i][w][0][0][:])
        ret.append(tmp)
    return ret
If someone has a better way to do it, please feel free to add a comment or an answer.
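The same indexing also fits in a single comprehension, if you prefer it compact (a sketch assuming the usual BSDS500 layout, i.e. the same gt[0][i][w][0][0] access as above):

from scipy.io import loadmat

def load_bsd_gt(filename):
    gt = loadmat(filename)['groundTruth']
    # one [Segmentation, Boundaries] pair per human annotation
    return [[gt[0][i][w][0][0][:] for w in ['Segmentation', 'Boundaries']]
            for i in range(gt.shape[1])]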

How do I read a text file of numbers into an array of arrays

In python, using the OpenCV library, I need to create some polylines. The example code for the polylines method shows:
cv2.polylines(img,[pts],True,(0,255,255))
I have all the 'pts' laid out in a text file in the format:
x1,y1,x2,y2,x3,y3,x4,y4
x1,y1,x2,y2,x3,y3,x4,y4
x1,y1,x2,y2,x3,y3,x4,y4
How can I read this file and provide the data to the [pts] variable in the method call?
I've tried np.array(csv.reader(...)) as well as a few other approaches I've found examples of. I can successfully read the file, but the result isn't in the format the polylines method wants. (I am a newbie when it comes to Python; if this were C++ or Java, it wouldn't be a problem.)
I would try to use numpy to read the csv as an array.
from numpy import genfromtxt
p = genfromtxt('myfile.csv', delimiter=',')
cv2.polylines(img, p, True, (0, 255, 255))
You may have to pass a dtype argument to genfromtxt if you need to coerce the data to a specific format.
https://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html
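For instance, to read the rows above straight into integer point arrays (assuming four x,y pairs per row, as in the question):

import numpy as np

# one (4, 2) array of points per row of the file
pts = np.genfromtxt('myfile.csv', delimiter=',', dtype=np.int32).reshape(-1, 4, 2)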
In case you know there is a fixed number of items in each row:
import csv

with open('myfile.csv') as csvfile:
    rows = csv.reader(csvfile)
    res = list(zip(*rows))
    print(res)
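Note that zip(*rows) transposes the data: each tuple in res holds one column of the file (still as strings), which may or may not be the orientation you want for polylines.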
I know it's not pretty and there is probably a MUCH BETTER way to do this, but it works. That being said, if someone could show me a better way, it would be much appreciated.
import numpy as np

pointlist = []
with open(args["slots"]) as f:
    data = f.read().split()   # one whitespace-separated chunk per line
for row in data:
    tmp = []
    col = row.split(";")      # each ';'-separated group is one "x,y" point
    for points in col:
        xy = points.split(",")
        tmp += [[int(pt) for pt in xy]]
    pointlist += [tmp]
slots = np.asarray(pointlist)
You might need to draw each polyline individually (to expand on @Chris's answer):
import numpy as np
import cv2

lines = np.genfromtxt('myfile.csv', delimiter=',', dtype=np.int32)
for line in lines:
    cv2.polylines(img, [line.reshape((-1, 2))], True, (0, 255, 255))
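Note that polylines expects a list of point arrays with integer coordinates, which is why each line is wrapped in a list and read as np.int32 above.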

Numpy (n, 1, m) to (n,m)

I am working on a problem which involves a batch of 19 tokens, each with 400 features. I get the shape (19, 1, 400) when concatenating two vectors of size (1, 200) into the final feature vector. If I squeeze the 1 out I am left with (19,), but I am trying to get (19, 400). I have tried converting to a list, squeezing, and raveling, but nothing has worked.
Is there a way to convert this array to the correct shape?
def attn_output_concat(sample):
    out_h, state_h = get_output_and_state_history(agent.model, sample)
    attns = get_attentions(state_h)
    inner_outputs = get_inner_outputs(state_h)
    if len(attns) != len(inner_outputs):
        print('Length err')
    else:
        tokens = [np.zeros((400))] * largest
        print(tokens.shape)
        for j, (attns_token, inner_token) in enumerate(zip(attns, inner_outputs)):
            tokens[j] = np.concatenate([attns_token, inner_token], axis=1)
        print(np.array(tokens).shape)
        return tokens
The easiest way would be to declare tokens as a numpy array of shape (19, 400) to start with. That's also more memory/time efficient. Here's the relevant portion of your code, revised...
import numpy as np

attns_token = np.zeros(shape=(1, 200))
inner_token = np.zeros(shape=(1, 200))
largest = 19

tokens = np.zeros(shape=(largest, 400))
for j in range(largest):
    tokens[j] = np.concatenate([attns_token, inner_token], axis=1)

print(tokens.shape)
BTW... it makes it difficult for people to help you if you don't include a self-contained, runnable segment of code (which is probably why you haven't gotten a response on this yet). Something like the snippet above is preferred and will help you get better answers, because there's less guessing at what you're trying to accomplish.
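Alternatively, if you keep building a list of (1, 400) rows as in the original code, np.concatenate along axis 0 stacks them straight into a (19, 400) array (a small sketch with dummy data):

import numpy as np

rows = [np.concatenate([np.zeros((1, 200)), np.ones((1, 200))], axis=1)
        for _ in range(19)]
tokens = np.concatenate(rows, axis=0)
print(tokens.shape)  # (19, 400)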

applying healpy mask to array of maps

I have a series of maps with two different indices, i and j. Let this be indexed like map_series[i][j].
EDIT 1/21: A minimal working example would be something like
map_series=np.array([np.array([np.arange(12) + 0.1*(i+1) + 0.01*(j+1) for j in range(3)]) for i in range(5)])
I'd like to apply the same mask to each map; when map_series is one-dimensional, each of the approaches below works.
I can imagine a few different ways of applying the mask:
(A) Applying the mask to the whole array:
map_series_ma = hp.ma(map_series)
map_series_ma.mask = predefined_mask
(B1) Applying the mask to each element of the array:
map_series_ma = np.zeros_like(map_series)
for i in range(len(map_series)):
    for j in range(len(map_series[0])):
        temp = hp.ma(map_series[i][j])
        temp.mask = predefined_mask
        map_series_ma[i][j] = temp
(B2) Applying the mask to each element of the array:
map_series_ma = np.zeros_like(map_series)
for i in range(len(map_series)):
    for j in range(len(map_series[0])):
        map_series_ma[i][j] = hp.ma(map_series[i][j])
        map_series_ma[i][j].mask = predefined_mask
(C) Pythonically enumerating the list:
map_series_ma = np.array([hp.ma(map_series[i][j]) for j in range(j_max) for i in range(i_max)])
map_series_ma.mask = predefined_mask
All of these fail to give my desired output, however.
Upon trying (A) or (C) I get an error after the first step, telling me TypeError: bad number of pixels.
Upon trying (B1) I don't get an error, but none of the elements of map_series_ma have masks; in fact, they do not even appear to be hp.ma objects. Oddly enough, though, when I return temp it does have the appropriate mask.
Upon trying (B2) I get the error
AttributeError: 'numpy.ndarray' object has no attribute 'mask' (which, after looking at my syntax, I totally understand!)
I'm a little confused how to go about this. Both (A) and (B1) seem acceptable to me...
Any help is much appreciated,
Thanks,
Sam
this works for me:
import numpy as np
import healpy as hp

map_series = np.array([np.array([np.arange(12) + 0.1*(i+1) + 0.01*(j+1) for j in range(3)]) for i in range(5)])
# list comprehension so the masked maps persist (map() would be a lazy iterator in Python 3)
map_series_ma = [hp.ma(x) for x in map_series]
pm = [True, True, True, True, True, True, False, False, False, False, False, False]
for m in map_series_ma:
    for mm in m:
        mm.mask = pm
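If you don't need healpy's UNSEEN handling, a plain-numpy alternative (my assumption, not part of the original approach) is to broadcast one mask over the whole stack of maps:

import numpy as np

pm = np.array([True]*6 + [False]*6)
map_series_ma = np.ma.masked_array(map_series, mask=np.broadcast_to(pm, map_series.shape))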

MemoryError with large sparse matrices

For a project I have built a program that constructs large matrices.
import numpy as np
import scipy.sparse as ssp

# MS and H2 are assumed to be defined elsewhere in the program
def ExpandSparse(LNew):
    SpId = ssp.csr_matrix(np.identity(MS))
    Sz = MS**LNew
    HNew = ssp.csr_matrix((Sz, Sz))
    Bulk = dict()
    for i in range(LNew-1):
        for j in range(LNew-1):
            if i == j:
                Bulk[(i, j)] = H2
            else:
                Bulk[(i, j)] = SpId
    Ha = ssp.csr_matrix((8, 8))
    try:
        for i in range(LNew-1):
            for j in range(LNew-2):
                if j < 1:
                    Ha = ssp.csr_matrix(ssp.kron(Bulk[(i, j)], Bulk[(i, j+1)]))
                else:
                    Ha = ssp.csr_matrix(ssp.kron(Ha, Bulk[(i, j+1)]))
            HNew = HNew + Ha
    except MemoryError:
        print('The matrix you tried to build requires too much memory space.')
        return
    return HNew
This does the job; however, it does not work as well as I would have expected. The problem is that it won't allow for really large matrices: when LNew is larger than 13 I get a MemoryError. My experience with numpy suggests that, memory-wise, I should be able to get LNew up to 18 or 19 before hitting this error. Does this have to do with my code, or with the way scipy.sparse.kron() works with these matrices?
Another note that might be important: I use Windows, not Linux.
After some more reading on how scipy.sparse.kron() works, I noticed that it takes a third argument named format. The default is None, but when it is set to 'csr' (or another supported format) the computation keeps everything in sparse format, making it a lot more efficient; now it can build a 2097152 x 2097152 matrix for me. Here LNew is 21.
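For illustration, the difference is just the extra argument (a minimal sketch, not the full ExpandSparse code):

import scipy.sparse as ssp

A = ssp.identity(2, format='csr')
B = ssp.identity(2, format='csr')

# format='csr' keeps the result in CSR rather than the default intermediate format
C = ssp.kron(A, B, format='csr')
print(C.shape)  # (4, 4)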

python program to export numpy/lists in svmlight format

Is there any way to export a Python array into SVMlight format?
There is one in scikit-learn:
http://scikit-learn.org/stable/modules/generated/sklearn.datasets.dump_svmlight_file.html
It's basic but it works both for numpy arrays and scipy.sparse matrices.
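Usage is a one-liner (a small sketch; the file name is just an example):

import numpy as np
from sklearn.datasets import dump_svmlight_file

X = np.array([[0.0, 1.5, 0.0], [2.0, 0.0, 3.0]])
y = np.array([0, 1])
dump_svmlight_file(X, y, 'data.svmlight', zero_based=False)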
I wrote this totally un-optimized script a while ago, maybe it can help! Data and labels must be in two separate numpy arrays.
def save_svmlight_data(data, labels, data_filename, data_folder=''):
    with open(data_folder + data_filename, 'w') as f:
        for i, x in enumerate(data):
            indexes = x.nonzero()[0]
            values = x[indexes]
            label = '%i' % labels[i]
            # svmlight feature indices are 1-based
            pairs = ['%i:%f' % (indexes[j] + 1, values[j]) for j in range(len(indexes))]
            sep_line = [label]
            sep_line.extend(pairs)
            sep_line.append('\n')
            f.write(' '.join(sep_line))
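Calling it would look something like this (hypothetical data and file name):

import numpy as np

X = np.array([[0.0, 1.5], [2.0, 0.0]])
y = np.array([1, -1])
save_svmlight_data(X, y, 'train.svmlight')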
The svmlight-loader module can load an svmlight file into a numpy array. I don't think anything exists for the other direction, but the module is probably a good starting point for extending its functionality.
