I've recently been learning Python through a course. Everything works smoothly except when I use the view method. Is anybody else having this problem?
I even used the sample code from https://pythonhosted.org/scikit-fuzzy/auto_examples/plot_tipping_problem_newapi.html#example-plot-tipping-problem-newapi-py (link updated).
import numpy as np
import skfuzzy as fuzz
from skfuzzy import control as ctrl
quality = ctrl.Antecedent(np.arange(0, 11, 1), 'quality')
service = ctrl.Antecedent(np.arange(0, 11, 1), 'service')
tip = ctrl.Consequent(np.arange(0, 26, 1), 'tip')
quality.automf(3)
service.automf(3)
tip['low'] = fuzz.trimf(tip.universe, [0, 0, 13])
tip['medium'] = fuzz.trimf(tip.universe, [0, 13, 25])
tip['high'] = fuzz.trimf(tip.universe, [13, 25, 25])
# HERE COMES MY PROBLEM
quality['average'].view()
Whenever I get to the view call, all I get is a little square box that should show me the graph, but it just keeps on loading. Any advice is greatly appreciated. Thank you!
This is the complete example running:
import matplotlib.pyplot as plt
import numpy as np
import skfuzzy as fuzz
from skfuzzy import control as ctrl
quality = ctrl.Antecedent(np.arange(0, 11, 1), 'quality')
service = ctrl.Antecedent(np.arange(0, 11, 1), 'service')
tip = ctrl.Consequent(np.arange(0, 26, 1), 'tip')
quality.automf(3)
service.automf(3)
tip['low'] = fuzz.trimf(tip.universe, [0, 0, 13])
tip['medium'] = fuzz.trimf(tip.universe, [0, 13, 25])
tip['high'] = fuzz.trimf(tip.universe, [13, 25, 25])
# Here is my problem
quality['average'].view()
service.view()
tip.view()
rule1 = ctrl.Rule(quality['poor'] | service['poor'], tip['low'])
rule2 = ctrl.Rule(service['average'], tip['medium'])
rule3 = ctrl.Rule(service['good'] | quality['good'], tip['high'])
rule1.view()
tipping_ctrl = ctrl.ControlSystem([rule1, rule2, rule3])
tipping = ctrl.ControlSystemSimulation(tipping_ctrl)
# Pass inputs to the ControlSystem using Antecedent labels with Pythonic API
# Note: if you like passing many inputs all at once, use .inputs(dict_of_data)
tipping.input['quality'] = 6.5
tipping.input['service'] = 9.8
# Crunch the numbers
tipping.compute()
print(tip)
tip.view(sim=tipping)
plt.show()
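A note on the print(tip) line above: it only prints the Consequent object itself. The crisp, defuzzified number lives on the simulation object, per the scikit-fuzzy docs:
# The defuzzified result, available after tipping.compute():
print(tipping.output['tip'])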
Since skfuzzy uses matplotlib and NetworkX under the hood, you can try this code to show your figure:
import matplotlib.pyplot as plt
import numpy as np
import skfuzzy as fuzz
from skfuzzy import control as ctrl
quality = ctrl.Antecedent(np.arange(0, 11, 1), 'quality')
service = ctrl.Antecedent(np.arange(0, 11, 1), 'service')
tip = ctrl.Consequent(np.arange(0, 26, 1), 'tip')
quality.automf(3)
service.automf(3)
tip['low'] = fuzz.trimf(tip.universe, [0, 0, 13])
tip['medium'] = fuzz.trimf(tip.universe, [0, 13, 25])
tip['high'] = fuzz.trimf(tip.universe, [13, 25, 25])
# HERE COMES MY PROBLEM
quality['average'].view()
plt.show()
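If the figure box still hangs after plt.show(), one possibility (an assumption, since the question doesn't say how the script is being run) is that no working GUI backend is selected. Forcing one before importing pyplot is a quick test; TkAgg is just one commonly available choice:
# A sketch: force a specific matplotlib backend (TkAgg is an assumption)
import matplotlib
matplotlib.use('TkAgg')  # must be called before importing pyplot
import matplotlib.pyplot as plt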
I have a list of tuples, where each tuple is a datetime and a float. I wish to clip the float values so that they are all above a threshold value. For example, if I have:
a = [
(datetime.datetime(2021, 11, 1, 0, 0, tzinfo=tzutc()), 100),
(datetime.datetime(2021, 11, 1, 1, 0, tzinfo=tzutc()), 9.0),
(datetime.datetime(2021, 11, 1, 2, 0, tzinfo=tzutc()), 100.0)
]
and if I want to clip at 10.0, this would give me:
b = [
(datetime.datetime(2021, 11, 1, 0, 0, tzinfo=tzutc()), 100),
(datetime.datetime(2021, 11, 1, 0, ?, tzinfo=tzutc()), 10.0),
(datetime.datetime(2021, 11, 1, 1, ?, tzinfo=tzutc()), 10.0),
(datetime.datetime(2021, 11, 1, 2, 0, tzinfo=tzutc()), 100.0)
]
So if I were to plot the a data (before clipping), I would get a V-shaped graph. However, if I clip the data at 10.0 to give me the b data and plot that, I will have a \_/-shaped graph instead. There is a bit of math involved in calculating the new times, so I'm hoping there is already functionality available to do this kind of thing. The datetimes are sorted in order and are unique. I can fix the data so the difference between consecutive times is equal, should that be necessary.
Apologies for not posting a full answer yesterday; my SO account is still rate-limited.
I have made a slightly more complex custom dataset to showcase several values in a row being below the threshold.
import pandas as pd
from datetime import datetime
from matplotlib import pyplot as plt
from scipy.interpolate import InterpolatedUnivariateSpline
df = pd.DataFrame([
(datetime(2021, 10, 31, 23, 0), 0),
(datetime(2021, 11, 1, 0, 0), 80),
(datetime(2021, 11, 1, 1, 0), 100),
(datetime(2021, 11, 1, 2, 0), 6),
(datetime(2021, 11, 1, 3, 0), 105),
(datetime(2021, 11, 1, 4, 0), 70),
(datetime(2021, 11, 1, 5, 0), 200),
(datetime(2021, 11, 1, 6, 0), 0),
(datetime(2021, 11, 1, 7, 0), 7),
(datetime(2021, 11, 1, 8, 0), 0),
(datetime(2021, 11, 1, 9, 0), 20),
(datetime(2021, 11, 1, 10, 0), 100),
(datetime(2021, 11, 1, 11, 0), 0)
], columns=['time', 'whatever'])
THRESHOLD = 10
The first thing to do here is to express the index in terms of timedeltas, so that it behaves like any ordinary number we can do calculations with. For convenience, I am also extracting the values as a Series; an even better approach would be to create the data as a Series from the get-go, save the initial timestamp, and reindex.
start_time = df['time'][0]
df.set_index((df['time'] - start_time).dt.total_seconds(), inplace=True)
series = df['whatever']
Then, I've tried InterpolatedUnivariateSpline from scipy:
roots = InterpolatedUnivariateSpline(df.index, series.values - THRESHOLD).roots()
threshold_crossings = pd.Series([THRESHOLD] * len(roots), index=roots)
new_series = pd.concat([series[series > THRESHOLD], threshold_crossings]).sort_index()
Let's test it out:
fig, ax = plt.subplots(figsize=(12, 8))
ax.plot(series)
ax.plot(df.index, [THRESHOLD] * len(df.index), 'k-.', label='threshold')
ax.plot(new_series)
ax.set_xlabel('$t-t_0$, s')
axins = ax.inset_axes([0.6, 0.6, 0.35, 0.3])
axins.plot(series)
axins.plot(df.index, [THRESHOLD] * len(df.index), 'k-.')
axins.plot(new_series)
axins.set_ylim(0, 20)
ax.indicate_inset_zoom(axins, edgecolor="black")
ax.set_ylabel('whatever, a.u.')
ax.legend(loc='upper left')
ax.set_title('Roots from InterpolatedUnivariateSpline')
Not so great. The spline-root interpolation is quite a bit off (after all, it uses a cubic B-spline under the hood, and it can't find roots if the order is set to 1). Ah well. For monotonic functions we could just invert the interpolation, but that is not the case here. I hope someone finds a better way to do it, but my next step was rolling out a custom function:
def my_interp(series: pd.Series, thr: float) -> pd.Series:
    needs_interp = series > thr
    # XOR means we are only considering transition points
    needs_interp = (needs_interp ^ needs_interp.shift(-1)).fillna(False)
    # The last point will never be interpolated
    x = series.index.to_series()
    k = series.diff(periods=-1) / x.diff(periods=-1)
    b = series - k * x
    x_fill = ((thr - b) / k)[needs_interp]
    fill_series = pd.Series(data=[thr] * x_fill.size, index=x_fill.values)
    # NB! needs_interp is a wrong mask to use for series here
    return pd.concat([series[series > thr], fill_series]).sort_index()
new_series = my_interp(series, THRESHOLD)
It achieves what you want to do with good precision.
To get back to timestamp representation, one would simply do
new_series.index = (start_time + pd.to_timedelta(new_series.index, unit='s'))
With that said, there are a couple of caveats:
- The function above assumes the timestamps are sorted (which can be achieved with sort_index) and that no duplicates are present in the series.
- Edge conditions are nasty, as usual. I have tested the function a little; the logic seems sound, it does not break if either end of the series is above/below the threshold, and it handles irregular data just fine. Still, watch out for NaNs in your data and consider how you should handle all the edge conditions, sorting, etc.
- There is no logic dedicated to handling data points exactly at the threshold, or to ensuring any regularity in the new timestamps. This could lead to bugs, too: e.g. if some portion of your code relies on having at least 2 data points every day, that might not hold after the transformation.
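To tie this back to the original list-of-tuples format, here is a minimal sketch (assuming a is the sorted [(datetime, value), ...] list from the question):
# Convert the tuples to a Series indexed by seconds from the start,
# clip with my_interp, then convert back to (timestamp, value) tuples.
times, values = zip(*a)
s = pd.Series(values, index=pd.DatetimeIndex(times))
start = s.index[0]
s.index = (s.index - start).total_seconds()
clipped = my_interp(s, 10.0)
b = [(start + pd.Timedelta(seconds=t), v) for t, v in clipped.items()]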
I have speed data in which I need to detect the peaks with values greater than 20, searching between the valleys where the value is 0. I used this code for peak detection, but I am getting an IndexError.
import numpy as np
from scipy.signal import find_peaks, find_peaks_cwt
import matplotlib.pyplot as plt
import pandas as pd
import sys
np.set_printoptions(threshold=sys.maxsize)
x = np.array([1, 9, 18, 24, 26, 5, 26, 25, 26, 16, 20, 16, 23, 5, 1, 27,
              22, 26, 27, 26, 25, 24, 25, 26, 3, 25, 26, 24, 23, 12, 22, 11, 15, 24, 11,
              26, 26, 26, 24, 25, 24, 24, 22, 22, 22, 23, 24])
zero_locs = np.where(x==0)
search_lims = np.append(zero_locs, len(x)) # limits for search area
diff_x = np.diff(x)
diff_x_mapped = diff_x > 0
peak_locs = []
for i in range(len(search_lims)-1):
    peak_loc = search_lims[i] + np.where(diff_x_mapped[search_lims[i]:search_lims[i+1]]==0)[0][0]
    if x[peak_loc] > 20:
        peak_locs.append(peak_loc)
fig= plt.figure(figsize=(10,4))
plt.plot(x)
plt.plot(np.array(peak_locs), x[np.array(peak_locs)], "x", color = 'r')
I tried using a peak detection algorithm, but it is not detecting the peaks whose value is above 20. I need to detect the peaks between the points where x is 0, with peak values above 20.
Expected output: the marked peaks have to be detected.
By running the above script I am getting this error:
IndexError: arrays used as indices must be of integer (or boolean) type
How do I get rid of this error? Any suggestions? Thanks in advance.
You found no peaks.
That is, len(peak_locs) is zero: your x contains no zeros, so search_lims holds a single element and the loop body never runs.
So you wind up with this array, whose type defaulted to float:
>>> np.array(peak_locs)
array([], dtype=float64)
To fix it?
Find more peaks!
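For the underlying goal (peaks above 20 between the zero valleys), a simpler route worth trying is scipy.signal.find_peaks with its height argument; a sketch, reusing the x array and the imports from the question:
# find_peaks returns the indices of local maxima; height=20 keeps only
# maxima whose value is at least 20 (raise it slightly for strictly > 20)
peak_locs, _ = find_peaks(x, height=20)
plt.plot(x)
plt.plot(peak_locs, x[peak_locs], "x", color='r')
plt.show()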
This is my code:
image_array.append(image)
label_array.append(i)
image_array = np.array(image_array)
label_array = np.array(label_array, dtype="float")
This is the error:
AttributeError: 'numpy.ndarray' object has no attribute 'append'
numpy.append expects at least two inputs. See this example:
import numpy as np
# define a NumPy array
x = np.array([1, 4, 4, 6, 7, 12, 13, 16, 19, 22, 23])
# append the value 25 to the end of the array
x = np.append(x, 25)
# view the updated array
x
# array([ 1,  4,  4,  6,  7, 12, 13, 16, 19, 22, 23, 25])
From what I can recall, you are writing the append the wrong way (check the example in the docs: https://numpy.org/doc/stable/reference/generated/numpy.append.html):
image_array = np.append(image_array, [image])
label_array = np.append(label_array, [i])
Note that the arrays must have the same dimensions.
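For completeness, the usual fix for the original AttributeError is to append to plain Python lists and convert to arrays once, after the loop; a sketch, where images stands in for whatever iterable produced image in the question:
# Append to lists (cheap), convert to arrays once at the end
image_list, label_list = [], []
for i, image in enumerate(images):  # `images` is a hypothetical iterable
    image_list.append(image)
    label_list.append(i)
image_array = np.array(image_list)
label_array = np.array(label_list, dtype="float")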
I created a Sankey diagram using plotly (python) and it looks like this:
As you can see, some links overlap, but this plot can be easily changed (manually) to this:
I think the overlapping result comes from the 3rd column of nodes being centered on Y. Is there a way for me to align the 3rd column to the top (or bottom) to fix this problem? (or any other fix is also welcome of course)
The only thing I've found is setting x and y for the nodes manually, but I don't seem to be able to set only the y, and this would also involve calculating all those coordinates.
Thank you for the help!
Edit: My code
import plotly.graph_objects as go
sources = [23, 23, 23, 23, 23, 23, 23, 24, 8, 23, 23, 23, 30, 17, 5, 12, 20, 20, 23, 18, 18, 18, 18, 23, 33, 33, 33, 33, 33, 23, 16, 16, 23]
targets = [7, 13, 6, 21, 1, 2, 15, 23, 23, 32, 25, 19, 23, 23, 23, 23, 27, 22, 20, 31, 4, 0, 3, 18, 11, 26, 9, 14, 28, 33, 29, 10, 16]
values = [50.0, 1542.78, 287.44, 2619.76, 1583.26, 722.1, 5133.69, 6544.0, 2563.35, 6476.59, 4314.0, 82.87, 650.0, 1773.68, 16723.0, 32297.7, 81.64, 266.92, 348.56, 388.57, 743.2, 5403.24, 5821.52, 12356.53, 12905.68, 316.12, 497.68, 354.42, 3830.44, 17904.34, 175.95, 1224.46, 1400.41]
fig = go.Figure(data=[go.Sankey(
    node = dict(
        pad = 5,
        thickness = 10,
        line = dict(color = "black", width = 0.5),
        label = list(range(len(values))),
        color = "blue"
    ),
    link = dict(
        source = sources,
        target = targets,
        value = values
    ))])
fig.update_layout(title_text="Basic Sankey Diagram", font_size=8)
fig.write_html("test.html")
There's an open issue on GitHub that both x and y positions have to be set in order for manual positioning to work. Does manually adding y coordinates along with x coordinates address your problem?
In general, there are other issues with sankey sorting as well.
I have been working with problems in this area only in plotly.R, so I'm afraid I can't offer specific Python suggestions to modify your code.
If you're also looking for suggestions about calculating the coordinates manually, you can calculate this as
1 - (cumulative_sum_of_higher_nodes + current_node_size/2)
or
1 - (cumulative_sum_of_all_nodes_including_current_node - current_node_size/2)
assuming y = 0 is at the bottom of the plot area.
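As a rough sketch of that formula in Python (assuming the per-column node sizes have already been normalized so they sum to at most 1):
def node_y_positions(node_sizes):
    # Stack nodes downward from the top (y = 0 is the bottom, y = 1 the top),
    # placing each node's centre below the cumulative size of the nodes above it
    positions = []
    cumulative = 0.0
    for size in node_sizes:
        positions.append(1 - (cumulative + size / 2))
        cumulative += size
    return positions

# e.g. three nodes taking 50%, 30% and 20% of the column height:
# node_y_positions([0.5, 0.3, 0.2]) -> [0.75, 0.35, 0.1]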
Lately I've been doing a lot of processing on 8x8 blocks of image data. The standard approach has been to use nested for-loops to extract the blocks, e.g.
for y in xrange(0, height, 8):
    for x in xrange(0, width, 8):
        d = image_data[y:y+8, x:x+8]
        # further processing on the 8x8 block
I can't help but wonder if there is a way to vectorize this operation, or another approach using numpy/scipy that I could use instead? An iterator of some kind?
A MWE [1]:
#!/usr/bin/env python
import sys
import numpy as np
from scipy.fftpack import dct, idct
import scipy.misc
import matplotlib.pyplot as plt
def dctdemo(coeffs=1):
    unzig = np.array([
        0, 1, 8, 16, 9, 2, 3, 10,
        17, 24, 32, 25, 18, 11, 4, 5,
        12, 19, 26, 33, 40, 48, 41, 34,
        27, 20, 13, 6, 7, 14, 21, 28,
        35, 42, 49, 56, 57, 50, 43, 36,
        29, 22, 15, 23, 30, 37, 44, 51,
        58, 59, 52, 45, 38, 31, 39, 46,
        53, 60, 61, 54, 47, 55, 62, 63])
    lena = scipy.misc.lena()
    width, height = lena.shape
    # reconstructed
    rec = np.zeros(lena.shape, dtype=np.int64)
    # Can this part be vectorized?
    for y in xrange(0, height, 8):
        for x in xrange(0, width, 8):
            d = lena[y:y+8, x:x+8].astype(np.float)
            D = dct(dct(d.T, norm='ortho').T, norm='ortho').reshape(64)
            Q = np.zeros(64, dtype=np.float)
            Q[unzig[:coeffs]] = D[unzig[:coeffs]]
            Q = Q.reshape([8, 8])
            q = np.round(idct(idct(Q.T, norm='ortho').T, norm='ortho'))
            rec[y:y+8, x:x+8] = q.astype(np.int64)
    plt.imshow(rec, cmap='gray')
    plt.show()

if __name__ == '__main__':
    try:
        c = int(sys.argv[1])
    except ValueError:
        sys.exit()
    else:
        if 1 <= c <= 64:
            dctdemo(c)
Footnotes:
[1] Actual application: https://github.com/figgis/dctdemo
There's a function view_as_windows for this in Scikit Image
http://scikit-image.org/docs/dev/api/skimage.util.html#view-as-windows
Unfortunately I will have to finish this answer another time, but you can grab the windows in a form that you can pass to dct with:
from skimage.util import view_as_windows
# your code...
# step=8 gives non-overlapping 8x8 blocks (the default step of 1 would
# give densely overlapping windows)
d = view_as_windows(lena.astype(np.float), (8, 8), step=8).reshape(-1, 8, 8)
# 2-D DCT of every block at once: transform along both block axes
D = dct(dct(d, axis=-1, norm='ortho'), axis=-2, norm='ortho')
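And to reassemble processed blocks into an image afterwards (my sketch, assuming the non-overlapping step=8 layout above and a processed array with the same shape as d):
# Undo reshape(-1, 8, 8): restore the block grid, then interleave
# block rows/columns back into pixel rows/columns
h, w = lena.shape
rec = processed.reshape(h // 8, w // 8, 8, 8).swapaxes(1, 2).reshape(h, w)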
There is a function called extract_patches in the scikit-learn feature extraction routines. You need to specify a patch_shape and an extraction_step. The result will be a view on your image as patches, which may overlap. The resulting array is 4D: the first 2 axes index the patch, and the last two index the pixels of the patch. Try this:
from sklearn.feature_extraction.image import extract_patches
patches = extract_patches(image_data, patch_shape=(8, 8), extraction_step=(4, 4))
This gives (8, 8) size patches that overlap by half.
Note that up until now this uses no extra memory, because it is implemented using stride tricks. You can force a copy by reshaping
patches = patches.reshape(-1, 8, 8)
which will basically yield a list of patches.
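If you'd rather avoid the scikit-learn dependency (extract_patches has since been removed from scikit-learn), numpy itself can build the same strided view; a sketch requiring numpy 1.20+:
from numpy.lib.stride_tricks import sliding_window_view
# All 8x8 windows with step 1; slice [::4, ::4] for half-overlapping patches,
# or [::8, ::8] for non-overlapping blocks
windows = sliding_window_view(image_data, (8, 8))
patches = windows[::4, ::4].reshape(-1, 8, 8)  # reshape forces a copy here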