Python feature scaling, short function

I'm having some trouble writing a small function for feature scaling.
data = [115, 140, 175]
def featureScaling(arr):
    for a in arr:
        return (a-min(data))/float((max(data)-min(data)))
print featureScaling(data)
I know there is a potential corner case if all values in data are the same (division by zero).
I just don't know why my idea does not work, since min(data) on its own works.
I get the error:
Traceback (most recent call last):
File "vm_main.py", line 33, in <module>
import main
File "/tmp/vmuser_czbvlzqfxl/main.py", line 2, in <module>
import studentMain
File "/tmp/vmuser_czbvlzqfxl/studentMain.py", line 29, in <module>
elif not compare_numbers(student_output[0], solution_output[0]):
TypeError: 'float' object has no attribute '__getitem__'

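As Tichodroma noted, this code does not throw an error by itself; the traceback appears to come from the grader, which indexes the result (student_output[0]) and therefore expects a list. Your function returns during the first loop iteration, so it yields a single float instead of one scaled value per data point. I believe you want each data point scaled, hence the following modification:
data = [115, 140, 175]
def featureScaling(arr):
    lo, hi = min(arr), max(arr)  # use the argument, not the global data
    scaled = []
    for a in arr:
        # float() keeps true division under Python 2 as well
        scaled.append((a - lo) / float(hi - lo))
    return scaled
print(featureScaling(data))
This code gives the result [0.0, 0.4166666666666667, 1.0].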
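The division-by-zero corner case mentioned in the question can be handled with a small guard. The following is only a sketch: the name feature_scaling and the choice to map a constant input to 0.0 are assumptions, not something the question or the grader specifies.

```python
def feature_scaling(values):
    """Min-max scale values into [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Assumption: a constant feature is mapped to 0.0 instead of raising
        # ZeroDivisionError; another convention (e.g. 0.5) may suit you better.
        return [0.0 for _ in values]
    # float() forces true division on Python 2 as well as Python 3
    return [(v - lo) / float(span) for v in values]


print(feature_scaling([115, 140, 175]))
print(feature_scaling([7, 7, 7]))
```

Computing min and max once, outside the loop, also avoids rescanning the list for every element.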

Related

spatial regression in Python - read matrix from list

I have the following problem. I am following this example about spatial regression in Python:
import numpy
import libpysal
import spreg
import pickle
# Read spatial data
ww = libpysal.io.open(libpysal.examples.get_path("baltim_q.gal"))
w = ww.read()
ww.close()
w_name = "baltim_q.gal"
w.transform = "r"
The example above works, but I would like to read my own spatial matrix, which I currently have as a list of lists. This is my approach:
ww = libpysal.io.open(matrix)
But I got this error message:
Traceback (most recent call last):
File "/usr/lib/python3.8/code.py", line 90, in runcode
exec(code, self.locals)
File "<input>", line 1, in <module>
File "/home/vojta/Desktop/INTERNET_HANDEL/ZASILKOVNA/optimal-delivery-branches/venv/lib/python3.8/site-packages/libpysal/io/fileio.py", line 90, in __new__
cls.__registry[cls.getType(dataPath, mode, dataFormat)][mode][0]
File "/home/vojta/Desktop/INTERNET_HANDEL/ZASILKOVNA/optimal-delivery-branches/venv/lib/python3.8/site-packages/libpysal/io/fileio.py", line 105, in getType
ext = os.path.splitext(dataPath)[1]
File "/usr/lib/python3.8/posixpath.py", line 118, in splitext
p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not list
This is what the matrix looks like:
[[0, 2, 1], [2, 0, 4], [1, 4, 0]]
EDIT:
If I try to pass my matrix to GM_Lag like this:
model = spreg.GM_Lag(
    y,
    X,
    w=matrix,
)
I get the following error:
warn("w must be API-compatible pysal weights object")
Traceback (most recent call last):
File "/usr/lib/python3.8/code.py", line 90, in runcode
exec(code, self.locals)
File "<input>", line 2, in <module>
File "/home/vojta/Desktop/INTERNET_HANDEL/ZASILKOVNA/optimal-delivery-branches/venv/lib/python3.8/site-packages/spreg/twosls_sp.py", line 469, in __init__
USER.check_weights(w, y, w_required=True)
File "/home/vojta/Desktop/INTERNET_HANDEL/ZASILKOVNA/optimal-delivery-branches/venv/lib/python3.8/site-packages/spreg/user_output.py", line 444, in check_weights
if w.n != y.shape[0] and time == False:
AttributeError: 'list' object has no attribute 'n'
EDIT 2:
This is how I read the list of lists:
import pickle
with open("weighted_matrix.pkl", "rb") as f:
    matrix = pickle.load(f)
How can I pass a list of lists to spreg.GM_Lag? Thanks
Why do you want to pass it to the libpysal.io.open method? If I understand this code correctly, you first open a file and then read it (and the read method seems to return a list). In your case you already have the matrix, so you do not need to open or read any file.
What matters is what w is supposed to look like after w = ww.read(). If it is a simple matrix, you can set w = matrix. If the read method also formats the data in a certain way, you will need another approach. It would help if you could describe the expected behavior of the read method (e.g. what the input file contains and what is returned).
Edit: as mentioned, read returns the data as a libpysal.weights object, so you must build one yourself. This can apparently be done with libpysal.weights.W. (I read the docs too fast at first.)
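As a rough sketch of that last suggestion: libpysal.weights.W can be constructed from a neighbors dict and a weights dict, so a dense list of lists can be converted into those two dicts first. The helper names below are hypothetical, and treating zero entries as "not neighbors" is an assumption about how your matrix should be read.

```python
def matrix_to_neighbors_weights(matrix):
    """Convert a dense square matrix (list of lists) into neighbors and
    weights dicts keyed by row index.

    Assumption: a zero entry means the two observations are not neighbors.
    """
    neighbors, weights = {}, {}
    for i, row in enumerate(matrix):
        neighbors[i] = [j for j, value in enumerate(row) if value != 0]
        weights[i] = [value for value in row if value != 0]
    return neighbors, weights


def build_w(matrix):
    """Build a libpysal weights object from the dense matrix."""
    # Imported lazily so the helper above also works without libpysal installed.
    from libpysal.weights import W
    neighbors, weights = matrix_to_neighbors_weights(matrix)
    return W(neighbors, weights)


matrix = [[0, 2, 1], [2, 0, 4], [1, 4, 0]]
neighbors, weights = matrix_to_neighbors_weights(matrix)
print(neighbors)
print(weights)
```

The object returned by build_w(matrix) could then be row-standardized with w.transform = "r", as in the question, and passed as w= to spreg.GM_Lag. Its number of observations has to match the number of rows in y.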

problem related to h5py and create_dataset

Maybe the question is dumb, but so far I have not been able to find a solution.
I have been handed code from another person who was probably working with a different setup than mine (e.g. Python 2 instead of 3).
I have made some small changes to make things work, but I am stuck on a probably simple problem related to h5py.
The part of the code where it crashes looks like this:
labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']
for i in range(len(Labels)):
    labels_ALL.append(Labels[i])
    units_ALL.append('(mol/L)')
base.create_dataset('Labels', data=labels_ALL)
base.create_dataset('Units', data=units_ALL)
The problem seems to be in base.create_dataset:
Traceback (most recent call last):
File "C:\Users\DaniJ\Documents\PostDoc_Jena\Trips, Conf, etc\Sinfonia Workshop\Exercise_1\exercise_1_SINFONIA_for_One\NR_chem_SINGLE_NoEu.py", line 252, in <module>
base.create_dataset('Labels', data=labels_ALL)
File "C:\Users\DaniJ\anaconda3\lib\site-packages\h5py\_hl\group.py", line 136, in create_dataset
dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)
File "C:\Users\DaniJ\anaconda3\lib\site-packages\h5py\_hl\dataset.py", line 118, in make_new_dset
tid = h5t.py_create(dtype, logical=1)
File "h5py\h5t.pyx", line 1634, in h5py.h5t.py_create
File "h5py\h5t.pyx", line 1656, in h5py.h5t.py_create
File "h5py\h5t.pyx", line 1717, in h5py.h5t.py_create
TypeError: No conversion path for dtype: dtype('<U10')
The variable base seems to be an h5py._hl.files.File object.
Does somebody know how I can solve this problem?
Thanks
Best regards,
Dani
Did you solve your problem? I'm 99.9% sure it's related to your Labels data: it is likely a NumPy array instead of a list. I wrote 3 short examples to demonstrate the difference.
The first code segment uses a list and successfully creates the datasets in file SO_69900543_1.h5.
The second code segment reproduces your error. It converts the list to a NumPy array, then fails when attempting to create the datasets in file SO_69900543_2.h5. Notice that it gives the same error message you encountered: TypeError: No conversion path for dtype: dtype('<U10').
The third code segment shows how to convert numpy.str_ elements to str (which solves the problem in segment #2). Note that each Labels value is converted with str() before it is added to labels_ALL.
Maybe this will help you find (and fix) your problem with Unicode data.
Code segment 1 (works):
import h5py
Labels = ['H+','Na+','Cl-','OH-','>SOH_x','>SO-_x','>SONa_x','>SOH2+_x','>SOH2Cl_x','>SOH_y','>SO-_y','>SONa_y']
labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']
for i in range(len(Labels)):
    labels_ALL.append(Labels[i])
    units_ALL.append('(mol/L)')
with h5py.File('SO_69900543_1.h5','w') as base:
    base.create_dataset('Labels', data=labels_ALL)
    base.create_dataset('Units', data=units_ALL)
Code segment 2 (returns TypeError):
import h5py
import numpy as np
Labels = ['H+','Na+','Cl-','OH-','>SOH_x','>SO-_x','>SONa_x','>SOH2+_x','>SOH2Cl_x','>SOH_y','>SO-_y','>SONa_y']
# Convert Labels List to NumPy array
# This will trigger the error when creating the dataset
Labels = np.array(Labels)
labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']
for i in range(len(Labels)):
    labels_ALL.append(Labels[i])
    units_ALL.append('(mol/L)')
for i in range(len(labels_ALL)):
    print(i, type(labels_ALL[i]), type(units_ALL[i]))
with h5py.File('SO_69900543_2.h5','w') as base:
    base.create_dataset('Labels', data=labels_ALL)
    base.create_dataset('Units', data=units_ALL)
Code segment 3 (works):
import h5py
import numpy as np
Labels = ['H+','Na+','Cl-','OH-','>SOH_x','>SO-_x','>SONa_x','>SOH2+_x','>SOH2Cl_x','>SOH_y','>SO-_y','>SONa_y']
# Convert Labels List to NumPy array
# This will trigger the error when creating the dataset if not modified
Labels = np.array(Labels)
labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']
for i in range(len(Labels)):
    # use str() to convert from 'numpy.str_' to 'str'
    labels_ALL.append(str(Labels[i]))
    units_ALL.append('(mol/L)')
for i in range(len(labels_ALL)):
    print(i, type(labels_ALL[i]), type(units_ALL[i]))
with h5py.File('SO_69900543_2.h5','w') as base:
    base.create_dataset('Labels', data=labels_ALL)
    base.create_dataset('Units', data=units_ALL)
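The conversion in segment 3 can also be pulled into a small helper, so that every string list is normalized just before it is written. This is only a sketch: the function names and the output path are hypothetical, and the explicit dtype=h5py.string_dtype() argument assumes a recent h5py (3.x) and is not something the segments above rely on.

```python
def to_plain_strings(values):
    """Return a list of built-in str objects, whatever string-like type
    (str, numpy.str_, ...) each element had originally."""
    return [str(v) for v in values]


def write_string_dataset(path, name, values):
    """Write values as a variable-length string dataset."""
    # Imported lazily so to_plain_strings can be used without h5py installed.
    import h5py
    with h5py.File(path, "w") as f:
        # Assumption: h5py 3.x, where an explicit string dtype is accepted
        # together with a list of str.
        f.create_dataset(name, data=to_plain_strings(values),
                         dtype=h5py.string_dtype())


labels = to_plain_strings(["ionic_str", "psi0", "H+"])
print([type(v).__name__ for v in labels])
```

A call such as write_string_dataset('labels_demo.h5', 'Labels', labels_ALL) would then replace the direct create_dataset call.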

Passing arguments to this function which uses pvmismatch: Why is this not working?

I am using a library called pvmismatch which measures the impact of imperfect shading on solar cells, which I think will soon be compatible with pvlib. I am not sure if this is a question related to python in general or just the library, but probably the former.
I would like to create a function that takes a list of "shadings" to apply with setSuns, along with the indices of the cells to be shaded. My code is below:
def shade_into_powers(shades_list = [], temperatures_list = [], cells_list = []):
    length_of_lists = len(shades_list)
    list_of_powers = []
    for i in range(0, length_of_lists):
        my_module_shaded.setSuns(Ee = shades_list[i], cells = cells_list[i])
        my_module_shaded.setTemps(Tc=temperatures_list[i], cells= cells_list[i])
        list_of_powers[i] = my_module_shaded.pvcells[i].Igen*max(my_module_shaded.pvcells[i].Vcell)
    return list_of_powers
I then tried the function as follows:
shadez = [0.43, 0.43, 0.43]
tempez = [88, 81, 77]
cellz = [30, 31, 32]
powers_listed = shade_into_powers(shadez, tempez, cellz)
The error I get is "'int' object is not iterable". What am I doing wrong here?
All help appreciated.
The traceback is below:
Traceback (most recent call last):
File "/home/abed/.config/JetBrains/PyCharmCE2020.2/scratches/scratch_2.py", line 176, in <module>
powers_listed = shade_into_powers(shadez, tempez, cellz)
File "/home/abed/.config/JetBrains/PyCharmCE2020.2/scratches/scratch_2.py", line 168, in shade_into_powers
my_module_shaded.setSuns(Ee = shades_list[i], cells = cells_list[i])
File "/home/abed/.local/lib/python3.7/site-packages/pvmismatch/pvmismatch_lib/pvmodule.py", line 323, in setSuns
cells_to_update = [self.pvcells[i] for i in cells]
TypeError: 'int' object is not iterable
Thank you for using PVmismatch. As @carcigenicate says in their comment, you are getting TypeError: 'int' object is not iterable because the cells argument of setSuns() is expected to be a list, as documented in the API.
I think you are trying to set the irradiance and temperature for 3 cells in a module. If so, you can do this in a single call to setSuns followed by a single call to setTemps. Note that cell temperatures are in Kelvin, not Celsius. You can also get the max cell powers by calling the NumPy max() function on the array of IV-curve powers, Pcell[cell_idx].
>>> from pvmismatch import *
>>> shadez = [0.43, 0.43, 0.43]
>>> tempez = [88, 81, 77]
>>> cellz = [30, 31, 32]
>>> my_module_shaded = pvmodule.PVmodule()
>>> my_module_shaded.Pmod.max() # module max power
321.2733629193704
# power of cells 30, 31, & 32, same for all cells in module
>>> [cellpower.max() for cellpower in my_module_shaded.Pcell[cellz]]
[3.3466338806725577, 3.3466338806725577, 3.3466338806725577]
>>> my_module_shaded.setSuns(Ee=shadez, cells=cellz)
>>> my_module_shaded.Pmod.max() # module max power, after irradiance change
217.32753929640674
# NOTE: cell temperature is in Kelvin, not Celsius!
>>> tempez = [tc + 273.15 for tc in tempez] # convert to Kelvin
>>> my_module_shaded.setTemps(Tc=tempez, cells=cellz)
>>> my_module_shaded.Pmod.max() # module max power, after temperature change
215.93464636002747
# power of cells 30, 31, & 32, same for all cells in module
>>> [cellpower.max() for cellpower in my_module_shaded.Pcell[cellz]]
[1.0892289330819398, 1.1230533440517434, 1.1424662134689452]
Also, list_of_powers starts out as an empty list, so assigning to list_of_powers[i] will raise an IndexError once the call to setSuns is fixed. Try replacing list_of_powers[i] = my_module_shaded.pvcells[i].Igen*max(my_module_shaded.pvcells[i].Vcell) with list_of_powers.append(my_module_shaded.pvcells[i].Igen*max(my_module_shaded.pvcells[i].Vcell))
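Both points can be shown without PVmismatch at all. In the sketch below, as_cell_list and collect_powers are hypothetical helpers, and the multiplication is only a placeholder for the real power calculation.

```python
def as_cell_list(cells):
    """Wrap a single int index in a list so that code which iterates over
    the cells argument (as setSuns does in the traceback) also accepts it."""
    return [cells] if isinstance(cells, int) else list(cells)


def collect_powers(values):
    """Build the result with append; result[i] = ... on an empty list
    would raise IndexError instead."""
    result = []
    for value in values:
        result.append(value * 2.0)  # placeholder, not a PVmismatch calculation
    return result


print(as_cell_list(30))
print(as_cell_list([30, 31, 32]))
print(collect_powers([0.43, 0.43, 0.43]))
```

Inside the original loop this would correspond to passing cells=[cells_list[i]] rather than cells=cells_list[i], although the single call with full lists shown in the other answer is simpler.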

Python-OpenCV floodfill function; strange type errors

I am trying to implement my own version of the MATLAB function imhmin() in Python using OpenCV and (naturally) NumPy. If you are not familiar with this MATLAB function, it's extremely useful for segmentation. MATLAB's documentation can explain it much better than I can:
https://it.mathworks.com/help/images/ref/imhmin.html
Here is what I have so far:
(For the sake of keeping this short, I did not include the local_min function. It takes one image parameter and returns an image of the same size where local minima are 1s and everything else is 0.)
from volume import show
import cv2
import numpy
def main():
    arr = numpy.array( [[5,5,5,5,5,5,5],
                        [5,0,3,1,4,2,5],
                        [5,5,5,5,5,5,5]] ) + 1
    res = imhmin(arr, 3)
    print(res)
def imhmin(src, h):
    # TODO: speed up function by cropping image
    edm = src.copy()
    # d is the domain / all values contained in the array
    d = numpy.unique(edm)
    # for the index of each local minima (sorted gtl)
    indices = numpy.nonzero(local_min(edm)) # get indices
    indices = numpy.dstack((indices[0], indices[1]))[0].tolist() # zip
    # sort based on the value of edm[] at that index
    indices.sort(key = lambda _: edm[_[0],_[1]], reverse = True)
    for (x,y) in indices:
        start = edm[x,y] # remember original value of minima
        # for each in a list of heights greater than the starting height
        for i in range(*numpy.where(d==edm[x,y])[0], d.shape[0]-1):
            # prevent exceeding target height
            step = start + h if (d[i+1] - start > h) else d[i+1]
            #-------------- WORKS UNTIL HERE --------------#
            # complete floodFill syntax:
            # cv2.floodFill(image, mask, seed, newVal[, loDiff[, upDiff[, flags]]]) → retval, rect
            # fill UPWARD onto image (and onto mask?)
            cv2.floodFill(edm, None, (y,x), step, 0, step-d[i], 4)
            # fill DOWNWARD NOT onto image
            # have you overflowed?
if __name__ == "__main__":
    main()
This works fine until it reaches the floodFill line, where it raises this error:
Traceback (most recent call last):
File "edm.py", line 94, in <module>
main()
File "edm.py", line 14, in main
res = imhmin(arr, 3)
File "edm.py", line 66, in imhmin
cv2.floodFill(edm, None, (y,x), step, 0, step-d[i], 4)
TypeError: Layout of the output array image is incompatible with cv::Mat (step[ndims-1] != elemsize or step[1] != elemsize*nchannels)
At first I thought maybe the way I laid out the parameters was wrong because of the stuff about step in the traceback, but I tried changing that variable's name and have come to the conclusion that step is some variable name in OpenCV's code. It's talking about the output array, and I'm not using a mask, so something must be wrong with the array edm.
I can suppress this error by replacing the floodfill line with this one:
cv2.floodFill(edm.astype(numpy.double), None, (y,x), step, 0, step-d[i], 4)
The difference is that I am casting the NumPy array to a float array. Then I am left with this error:
Traceback (most recent call last):
File "edm.py", line 92, in <module>
main()
File "edm.py", line 14, in main
res = imhmin(arr, 3)
File "edm.py", line 64, in imhmin
cv2.floodFill(edm.astype(numpy.double), None, (y,x), step, 0, step-d[i], 4)
TypeError: Scalar value for argument 'newVal' is not numeric
This is where I started suspecting something was seriously wrong, because step is "obviously" going to be an integer here (maybe it isn't obvious, but I did try printing it and it looks like it's just an integer, not an array of one integer or anything weird like that).
To entertain the error message, I typecast the newVal parameter to a float. I got pretty much the exact same error message about the upDiff parameter, so I just typecast that too, resulting in this line of code:
cv2.floodFill(edm.astype(numpy.double), None, (y,x), float(step), 0, float(step-d[i]), 4)
I know this isn't how I want to be doing things, but I just wanted to see what would happen. What happened was I got this scary looking error:
Traceback (most recent call last):
File "edm.py", line 92, in <module>
main()
File "edm.py", line 14, in main
res = imhmin(arr, 3)
File "edm.py", line 64, in imhmin
cv2.floodFill(edm.astype(numpy.double), None, (y,x), float(step), 0, float(step-d[i]), 4)
cv2.error: OpenCV(3.4.2) /opt/concourse/worker/volumes/live/9523d527-1b9e-48e0-7ed0-a36adde286f0/volume/opencv-suite_1535558719691/work/modules/imgproc/src/floodfill.cpp:587: error: (-210:Unsupported format or combination of formats) in function 'floodFill'
I don't even know where to start with this. I've used OpenCV's floodfill function many times before and have never run into problems like this. Can anyone provide any insight?
Thanks in advance
Antonio

ifft function gives "'str' object is not callable" error

I am trying to take the inverse Fourier transform of a list, and for some reason I keep getting the following error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "simulating_coherent_data.py", line 238, in <module>
exec('ift%s = np.fft.ifft(nd.array(FTxSQRT_PS%s))'(x,x))
TypeError: 'str' object is not callable
And I can't figure out where I have a string. The part of my code it relates to is as follows
def FTxSQRT_PS(FT,PS):
    # Import: The Fourier Transform and the Power Spectrum, both as lists
    # Export: The result of FTxsqrt(PS), as a list
    # Function:
    # Takes each element in the FT and PS and finds FTxsqrt(PS) for each
    # appends each results to a list called signal
    signal = []
    print type(PS)
    for x in range(len(FT)):
        indiv_signal = np.abs(FT[x])*math.sqrt(PS[x])
        signal.append(indiv_signal)
    return signal
for x in range(1,number_timesteps+1):
    exec('FTxSQRT_PS%s = FTxSQRT_PS(fshift%s,power_spectrum%s)'%(x,x,x))
    exec('ift%s = np.fft.ifft(FTxSQRT_PS%s)'(x,x))
Where FTxSQRT_PS%s are all lists. fshift%s is a np.array and power_spectrum%s is a list. I've also tried setting the type for FTxSQRT_PS%s as a np.array but that did not help.
I have very similar code a few lines up that works fine;
for x in range(1,number_timesteps+1):
    exec('fft%s = np.fft.fft(source%s)'%(x,x))
where source%s are all of type np.array
The only thing I can think of is that maybe np.fft.ifft is not how I should be taking the inverse Fourier transform for Python 2.7.6 but I also cannot find an alternative.
Let me know if you'd like to see the whole code, there is about 240 lines up to where I'm having trouble, though a lot of that is commenting.
Thanks for any help,
Teresa
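You are missing a % between the format string and the tuple, so Python tries to call the string as if it were a function:
exec('ift%s = np.fft.ifft(FTxSQRT_PS%s)'(x,x))
It should be:
exec('ift%s = np.fft.ifft(FTxSQRT_PS%s)'%(x,x))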
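The difference is easy to reproduce in isolation. The sketch below also shows a dictionary keyed by timestep as one possible alternative to creating numbered variables with exec. All names in it (template, compute_all, the sample inputs) are made up for illustration, and sum merely stands in for np.fft.ifft.

```python
template = 'ift%s = compute(data%s)'


def format_ok(x):
    # The % operator substitutes the tuple into the format string
    return template % (x, x)


def format_missing_percent(x):
    # Without %, Python parses template(x, x) as a function call on a str
    try:
        return template(x, x)
    except TypeError as err:
        return str(err)


def compute_all(inputs, func):
    """Apply func to the data of every timestep and keep the results in a
    dict, so no variable names need to be generated with exec."""
    return {step: func(values) for step, values in inputs.items()}


print(format_ok(1))
print(format_missing_percent(1))
print(compute_all({1: [1, 2], 2: [3]}, sum))
```

With a dict, results[x] plays the role that ift1, ift2, ... play in the original code, and a missing % can no longer turn into a runtime error.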
