I would like to convert a Python list of lists to a MATLAB sparse matrix. Each element of the Python list is a list with three elements: a row, a column, and a value. My code and the error message are below:
from __future__ import division
import matlab.engine
A = [[1,2,1],[1,3,1],[2,2,1]]
A = matlab.double(A)
eng = matlab.engine.start_matlab()
eng.spconvert(A)
And the error message:
File "testMatlab.py", line 14, in <module>
eng.spconvert(A)
File "/Library/Python/2.7/site-packages/matlab/engine/matlabengine.py", line 71, in __call__
_stderr, feval=True).result()
File "/Library/Python/2.7/site-packages/matlab/engine/futureresult.py", line 67, in result
return self.__future.result(timeout)
File "/Library/Python/2.7/site-packages/matlab/engine/fevalfuture.py", line 82, in result
self._result = pythonengine.getFEvalResult(self._future,self._nargout, None, out=self._out, err=self._err)
TypeError: unsupported data type returned from MATLAB
The error goes away if we amend the call as below, so the problem is that Python cannot handle the sparse matrix returned from MATLAB:
eng.spconvert(A,nargout=0)
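With nargout=0 the sparse result is simply never returned to Python, so nothing fails, but you also get no data back. One possible workaround, sketched below under the assumption that a dense copy fits in memory, is to keep the sparse matrix in MATLAB's workspace and return only a full copy:

import matlab.engine

eng = matlab.engine.start_matlab()
A = matlab.double([[1, 2, 1], [1, 3, 1], [2, 2, 1]])

# Build the sparse matrix inside MATLAB; only a dense copy crosses
# the engine boundary, which Python can handle.
eng.workspace['A'] = A
eng.eval('S = spconvert(A);', nargout=0)
dense = eng.eval('full(S)', nargout=1)
print(dense)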
I have the following problem. I am following this example about spatial regression in Python:
import numpy
import libpysal
import spreg
import pickle
# Read spatial data
ww = libpysal.io.open(libpysal.examples.get_path("baltim_q.gal"))
w = ww.read()
ww.close()
w_name = "baltim_q.gal"
w.transform = "r"
The example above works, but I would like to read my own spatial matrix, which I currently have as a list of lists. This was my approach:
ww = libpysal.io.open(matrix)
But I got this error message:
Traceback (most recent call last):
File "/usr/lib/python3.8/code.py", line 90, in runcode
exec(code, self.locals)
File "<input>", line 1, in <module>
File "/home/vojta/Desktop/INTERNET_HANDEL/ZASILKOVNA/optimal-delivery-branches/venv/lib/python3.8/site-packages/libpysal/io/fileio.py", line 90, in __new__
cls.__registry[cls.getType(dataPath, mode, dataFormat)][mode][0]
File "/home/vojta/Desktop/INTERNET_HANDEL/ZASILKOVNA/optimal-delivery-branches/venv/lib/python3.8/site-packages/libpysal/io/fileio.py", line 105, in getType
ext = os.path.splitext(dataPath)[1]
File "/usr/lib/python3.8/posixpath.py", line 118, in splitext
p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not list
This is what matrix looks like:
[[0, 2, 1], [2, 0, 4], [1, 4, 0]]
EDIT:
If I try to pass my matrix to GM_Lag like this:
model = spreg.GM_Lag(
    y,
    X,
    w=matrix,
)
I get the following error:
warn("w must be API-compatible pysal weights object")
Traceback (most recent call last):
File "/usr/lib/python3.8/code.py", line 90, in runcode
exec(code, self.locals)
File "<input>", line 2, in <module>
File "/home/vojta/Desktop/INTERNET_HANDEL/ZASILKOVNA/optimal-delivery-branches/venv/lib/python3.8/site-packages/spreg/twosls_sp.py", line 469, in __init__
USER.check_weights(w, y, w_required=True)
File "/home/vojta/Desktop/INTERNET_HANDEL/ZASILKOVNA/optimal-delivery-branches/venv/lib/python3.8/site-packages/spreg/user_output.py", line 444, in check_weights
if w.n != y.shape[0] and time == False:
AttributeError: 'list' object has no attribute 'n'
EDIT 2:
This is how I read the list of lists:
import pickle
with open("weighted_matrix.pkl", "rb") as f:
matrix = pickle.load(f)
How can I pass a list of lists to spreg.GM_Lag? Thanks.
Why do you want to pass it to the libpysal.io.open method? If I understand the code correctly, you first open a file and then read it (the read method seems to return a list). So in your case, where you already have the matrix, you don't need to open or read any file.
What is needed, though, is to know what w is supposed to look like after w = ww.read(). If it is a simple matrix, you can just set w = matrix. If the read method also formats the data in a certain way, you will need to replicate that. If you could describe the expected behavior of the read method (e.g. what the input file contains and what is returned), that would be useful.
As mentioned, the data must be formatted into a libpysal.weights object, so you must build one yourself. This can supposedly be done with libpysal.weights.W (I read the docs too fast the first time).
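A minimal sketch of that approach, assuming the list of lists is the full n-by-n weights matrix; full2W from libpysal.weights converts a dense array into a W object:

import numpy as np
from libpysal.weights import full2W

matrix = [[0, 2, 1], [2, 0, 4], [1, 4, 0]]

# Build a pysal W object directly from the dense array.
w = full2W(np.asarray(matrix, dtype=float))
w.transform = "r"   # row-standardize, as in the original example

print(w.n)   # 3, the attribute that spreg's check_weights looks for

The resulting w can then be passed as model = spreg.GM_Lag(y, X, w=w).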
I have code in Python that looks something like the code pasted below. For context, all the csv files print [15 rows x 16 columns]; I just changed the names for privacy purposes.
import numpy as np
import pandas as pd
C = pd.read_csv('/Users/name/Desktop/filename1.csv')
Chome = pd.read_csv('/Users/name/Desktop/filename2.csv')
Cwork = pd.read_csv('/Users/name/Desktop/filename3.csv')
Cschool = pd.read_csv('/Users/name/Desktop/filename4.csv')
Cother = pd.read_csv('/Users/name/Desktop/filename5.csv')
Cf = np.zeros([17,17])
Cf = C
Cf[0:15,16] = C[0:15,15]
Cf[16,0:15] = C[15,0:15]
Cf[16,16] = C[15,15]
print(Cf)
When I run the code I get the following error:
runfile('/Users/name/.spyder-py3/untitled12.py', wdir='/Users/name/.spyder-py3')
Traceback (most recent call last):
File "/Users/name/.spyder-py3/untitled12.py", line 23, in <module>
Cf[0:15,16] = C[0:15,15]
File "/opt/anaconda3/lib/python3.8/site-packages/pandas/core/frame.py", line 2800, in __getitem__
indexer = self.columns.get_loc(key)
File "/opt/anaconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 2646, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 116, in pandas._libs.index.IndexEngine.get_loc
TypeError: '(slice(0, 15, None), 15)' is an invalid key
I am not exactly sure what this error means. I am pretty new to Python, so debugging is a skill I am trying to get better at. Any advice on what I can do to fix this error, or on what it means, would be helpful. Thank you.
Note the following sequence in your code sample:
C = pd.read_csv(...)
... # Other cases of pd.read_csv
Cf = np.zeros([17,17])
So, at least up to this point, C is a DataFrame and Cf is a NumPy array. Then Cf = C is probably a logical error, since it overwrites the NumPy array (full of zeroes) with another reference to C.
Now, as far as the offending instruction (Cf[0:15,16] = C[0:15,15]) is concerned: note that C[0:15,15] is wrong (run it on its own to see). With pandas DataFrames you can use "positional addressing", including slices, via iloc; the bare-bracket notation is only allowed for NumPy arrays. So, assuming that Cf = C is not needed and Cf should remain a NumPy array, you should correct this instruction to:
Cf[0:15,16] = C.iloc[0:15,15]
and make analogous corrections in the remaining instructions in your code.
Edit
Another option is to refer to the underlying NumPy array of the C DataFrame using its values attribute. In this case you can use NumPy-style addressing, e.g.:
C.values[0:15,15]
causes no error.
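Putting the pieces together, a sketch of the corrected question code, assuming Cf should stay a NumPy array (so Cf = C is dropped) and that the frames really have the 16 rows and 16 columns the indexing implies; the path is the placeholder from the question:

import numpy as np
import pandas as pd

C = pd.read_csv('/Users/name/Desktop/filename1.csv')

Cf = np.zeros([17, 17])
# Positional indexing on a DataFrame needs .iloc;
# bare C[0:15, 15] raises the "invalid key" TypeError.
Cf[0:15, 16] = C.iloc[0:15, 15]
Cf[16, 0:15] = C.iloc[15, 0:15]
Cf[16, 16] = C.iloc[15, 15]
print(Cf)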
For distributed training, I have code like the following:
training_sampler = DistributedSampler(training_set, num_replicas=2, rank=0)
training_generator = data.DataLoader(training_set, **params, sampler=training_sampler)
for x, y, z in training_generator:  # Error occurs here.
    ...
Overall, I get the following message:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/home/ubuntu/VC/ppg_training_extraction/ppg_training_scripts/train_ASR_trim_scp.py", line 336, in train
for local_batch_src, local_batch_tgt, lengths in dataloaders[phase]:
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 352, in __iter__
return self._get_iterator()
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 294, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 827, in __init__
self._reset(loader, first_iter=True)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 857, in _reset
self._try_put_index()
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1091, in _try_put_index
index = self._next_index()
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 427, in _next_index
return next(self._sampler_iter) # may raise StopIteration
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/utils/data/sampler.py", line 227, in __iter__
for idx in self.sampler:
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/utils/data/distributed.py", line 97, in __iter__
indices = torch.randperm(len(self.dataset), generator=g).tolist() # type: ignore
RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'
Now at that line, I ran the following instructions in pdb:
(Pdb) g = torch.Generator()
(Pdb) g.manual_seed(0)
<torch._C.Generator object at 0x7ff7f8143110>
(Pdb) indices = torch.randperm(4556, generator=g).tolist()
(Pdb) indices = torch.randperm(455604, generator=g).tolist()
*** RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'
Why am I getting the runtime error when the upper-bound integer is high, but not when it is low enough?
Note: I ran the same thing in a clean Python session and found that it worked fine:
>>> import torch
>>> g = torch.Generator()
>>> g.manual_seed(0)
<torch._C.Generator object at 0x7f9d2dfb39f0>
>>> indices = torch.randperm(455604, generator=g).tolist()
Is it some configuration issue in how I'm handling distributed training across multiple GPUs? Any insights would be appreciated!
I just hit the same problem using a DataLoader, and I found that the following helps without removing torch.set_default_tensor_type('torch.cuda.FloatTensor'):
data.DataLoader(..., generator=torch.Generator(device='cuda'))
since I didn't want to manually add .to('cuda') for the tons of tensors in my code.
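For concreteness, a minimal runnable sketch of that fix on a CUDA machine; the TensorDataset here is a stand-in for the real dataset:

import torch
from torch.utils.data import DataLoader, TensorDataset

torch.set_default_tensor_type('torch.cuda.FloatTensor')  # kept in place

dataset = TensorDataset(torch.randn(100, 8), torch.randn(100, 1))

# A CUDA generator lets the shuffling sampler call torch.randperm
# on the same device the default tensor type points at.
loader = DataLoader(
    dataset,
    batch_size=16,
    shuffle=True,
    generator=torch.Generator(device='cuda'),
)

for x, y in loader:
    pass  # training step goes here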
I had the same issue. It looks like somebody found the source of the problem here:
torch.set_default_tensor_type('torch.cuda.FloatTensor')
I fixed my problem by getting rid of this line in my code and using .to(device) manually.
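A short sketch of that alternative, with stand-in model and data, assuming explicit device placement is acceptable:

import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# With the global default-tensor-type line removed, new tensors start on
# the CPU (so the sampler's CPU generator is fine) and are moved explicitly.
model = nn.Linear(16, 4).to(device)
x = torch.randn(8, 16)         # created on the CPU by default
out = model(x.to(device))      # moved to the GPU only where needed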
I'm having some trouble writing a small function for feature scaling.
data = [115, 140, 175]

def featureScaling(arr):
    for a in arr:
        return (a-min(data))/float((max(data)-min(data)))

print featureScaling(data)
I know there is a potential corner case if all values in data are the same (division by zero). I just don't know why my idea does not work, since min(data) on its own works.
I get the error:
Traceback (most recent call last):
File "vm_main.py", line 33, in
import main
File "/tmp/vmuser_czbvlzqfxl/main.py", line 2, in
import studentMain
File "/tmp/vmuser_czbvlzqfxl/studentMain.py", line 29, in
elif not compare_numbers(student_output[0], solution_output[0]):
TypeError: 'float' object has no attribute '__getitem__'
As Tichodroma noted, this code does not throw an error; however, I do not believe it performs as you intended. It returns a single value, but I believe you actually want each data point scaled, hence the following modification:
data = [115, 140, 175]

def featureScaling(arr):
    scaled = []
    lo, hi = min(arr), max(arr)   # compute the range once
    for a in arr:
        scaled.append((a - lo) / float(hi - lo))   # float() forces true division on Python 2
    return scaled

print(featureScaling(data))
This code provides the result [0.0, 0.4166666666666667, 1.0]
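To also cover the corner case the question mentions (all values identical, which would divide by zero), one possible guard, returning zeros by convention:

def featureScaling(arr):
    lo, hi = min(arr), max(arr)
    if hi == lo:
        # scaling is undefined when all values are equal; zeros are one convention
        return [0.0] * len(arr)
    return [(a - lo) / float(hi - lo) for a in arr]

print(featureScaling([115, 140, 175]))  # [0.0, 0.4166666666666667, 1.0]
print(featureScaling([7, 7, 7]))        # [0.0, 0.0, 0.0]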
I am using netCDF4 with Python 2.7 on a Windows 7 machine. I have loaded numpy recarrays into a netCDF file I created and have subsequently retrieved the data several times. Then, for some unknown reason, when I try to retrieve the data I get ValueError: could not convert string to float.
The code that is being used to retrieve the data is:
import os
from netCDF4 import Dataset

def getNetCDFGroupVarData(NCfilename, GroupPath, Variable):
    """ ==============================================================
    TITLE:  getNetCDFGroupVarData
    DESCR:  for a valid variable on the specified path in a NetCDF file
            returns a data vector
    ARGS:   NCfilename : netcdf4 file path and name
            GroupPath  : group path
            Variable   : variable name
    RETURN: VarData: vector of variable data
    DEPEND: netCDF4.Dataset
    ==============================================================
    """
    # get rootgroup and group from which to return attributes
    if os.path.isfile(NCfilename):
        RG = Dataset(NCfilename, 'a')
        G = giveListEndGroup(RG, GroupPath)   # helper defined elsewhere in the module

    # retrieve variable data from group
    keyVar = G.variables.keys()
    print(keyVar)
    kvlen = len(keyVar)
    var = unicode(Variable)
    if kvlen > 0:
        print('variable name: ', var)
        V = G.variables[var]
        print V.dtype
        print V.shape
        print V.dimensions
        VarData = V[:]  # ====== Error raised here ==============
    else:
        print('no keys found')
        VarData = None
    RG.close()
    return VarData
The print outputs and error stack I get when calling this function are:
[u'time', u'SECONDS', u'NANOSECONDS', u'Rg', u'Ts1', u'Ts2', u'Ts3', u'V_log', u'T_log']
('variable name: ', u'time')
float64
(88872,)
(u'time',)
variable: time does not exist
Unexpected error: <type 'exceptions.ValueError'>
Traceback (most recent call last):
File "C:\Users\rclement\Documents\My Dropbox\Code\python\NCTSutil\Panel_NCTS_structure.py", line 69, in tree_path_changed
pub.sendMessage('NetcdfTS.group.specified', arg1=pathlist )
File "C:\Python27\lib\site-packages\pubsub\core\kwargs\publisher.py", line 27, in sendMessage
topicObj.publish(**kwargs)
File "C:\Python27\lib\site-packages\pubsub\core\kwargs\publishermixin.py", line 24, in publish
self._publish(msgKwargs)
File "C:\Python27\lib\site-packages\pubsub\core\topicobj.py", line 376, in _publish
self.__sendMessage(data, self, iterState)
File "C:\Python27\lib\site-packages\pubsub\core\topicobj.py", line 397, in __sendMessage
self._mix_callListener(listener, data, iterState)
File "C:\Python27\lib\site-packages\pubsub\core\kwargs\publishermixin.py", line 64, in _mix_callListener
listener(iterState.filteredArgs, self, msgKwargs)
File "C:\Python27\lib\site-packages\pubsub\core\kwargs\listenerimpl.py", line 43, in __call__
cb(**kwargs)
File "C:\Users\rclement\Documents\My Dropbox\Code\python\NCTSutil\NetcdfTimeSeries.py", line 70, in listner_group
atime = self.GetSelectedVariableData(pathlist, u'time')
File "C:\Users\rclement\Documents\My Dropbox\Code\python\NCTSutil\NetcdfTimeSeries.py", line 307, in GetSelectedVariableData
VarData = MNU.getNetCDFGroupVarData(self.filename, GroupPathList, variable )
File "C:\Users\rclement\Documents\My Dropbox\Code\python\NCTSutil\MyNetcdfUtil.py", line 304, in getNetCDFGroupVarData
VarData = V[:]
File "netCDF4.pyx", line 2949, in netCDF4.Variable.__getitem__ (netCDF4.c:36472)
File "netCDF4.pyx", line 2969, in netCDF4.Variable._toma (netCDF4.c:36814)
ValueError: could not convert string to float:
When I use other netCDF utilities (e.g. Panoply) I can access the data.
Does anyone have a clue why netCDF4 would be throwing this exception, or worse, how it could have inserted a string into my float32 field in the netCDF file?
From the traceback, the problem was occurring in the _toma netCDF4 function, which converts the data to a masked array. When reading the file with other utilities (e.g. NCDUMP) I had no problem accessing the data.
At the moment I believe the problem occurred because I had an unassigned 'missing_value' attribute on the variable. Apparently, if there is no 'missing_value' attribute, netCDF4 defaults to a missing value appropriate for the dtype. In my implementation the 'missing_value' attribute was exposed for editing via a wxPython grid control. When the edited attributes in the grid were written back to the netCDF file, the empty grid cell returned either a None object or wx.EmptyString, which netCDF4 then attempted to insert into the float32 field in the netCDF file.
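Based on that diagnosis, a sketch of one possible repair pass, assuming an empty-string missing_value attribute is the culprit; the file name is a placeholder and only root-group variables are walked:

from netCDF4 import Dataset

rg = Dataset('myfile.nc', 'a')   # placeholder path

# Drop any missing_value attribute that came back as an empty string,
# so netCDF4 falls back to the dtype-appropriate default fill value.
for name, var in rg.variables.items():
    mv = getattr(var, 'missing_value', None)
    if isinstance(mv, basestring) and not mv.strip():
        var.delncattr('missing_value')

rg.close()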