Matplotilb- Need to find source data from a class attributes - python

I have a lines object which was created with the following:
junk = plt.plot([xxxx], [yyyy])
for x in junk:
print type(x)
<class 'matplotlib.lines.Line2D'>
I need to find the names of the two lists 'xxxx' and 'yyyy'. How can I get them from the class attributes?

You can use dir to see the content of an object in python, or check the docs for the class. I guess the objects you are looking for are xdata and ydata (although I'm a bit confused, in your post you ask for the names of the lists?)
In [27]:
import numpy as np
import matplotlib.pyplot as plt
​
x = np.arange(0, 5, 0.1);
y = np.sin(x)
junk = plt.plot(x, y)
for x in junk:
#print(dir(x))
print(x.get_xdata())
print(x.get_ydata())
[ 0. 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. 1.1 1.2 1.3 1.4
1.5 1.6 1.7 1.8 1.9 2. 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9
3. 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4. 4.1 4.2 4.3 4.4
4.5 4.6 4.7 4.8 4.9]
[ 0. 0.09983342 0.19866933 0.29552021 0.38941834 0.47942554
0.56464247 0.64421769 0.71735609 0.78332691 0.84147098 0.89120736
0.93203909 0.96355819 0.98544973 0.99749499 0.9995736 0.99166481
0.97384763 0.94630009 0.90929743 0.86320937 0.8084964 0.74570521
0.67546318 0.59847214 0.51550137 0.42737988 0.33498815 0.23924933
0.14112001 0.04158066 -0.05837414 -0.15774569 -0.2555411 -0.35078323
-0.44252044 -0.52983614 -0.61185789 -0.68776616 -0.7568025 -0.81827711
-0.87157577 -0.91616594 -0.95160207 -0.97753012 -0.993691 -0.99992326
-0.99616461 -0.98245261]
Hope it helps.

Related

Subtracting a List from a Dictionary

I am trying to subtract a list of values from each key in a dictionary. Each key in the dictionary contains 20 y-values for a predicted line. I want to find the difference between these y-values and a different set of given values.
ydata contains 20 points. ycalc has a length of 100 to which keys are assigned for, from L1-L99. Each Key contains 20 points as well. I want to subtract each key from ydata. This is what I have tried, the main issue is that my method return a list of 20 values, when I expect a list of 100 values where each value is a list of 20 points.
ydata = [ 1.2 1.8 1.7 3.0 3.5 3.2 4.5 4.8 5.3 6.2 5.7 6.8 7.0 7.8 8.5 8.6 9.1 11.5 10.3 10.8]
ycalc = 'L0': array([-0.8, -0.6, -0.4, -0.2, 0. , 0.2, 0.4, 0.6, 0.8, 1. , 1.2,
1.4, 1.6, 1.8, 2. , 2.2, 2.4, 2.6, 2.8, 3. ]), 'L1': array([-0.57777778, -0.37777778, -0.17777778, 0.02222222, 0.22222222,
0.42222222, 0.62222222, 0.82222222, 1.02222222, 1.22222222,
1.42222222, 1.62222222, 1.82222222, 2.02222222, 2.22222222,
2.42222222, 2.62222222, 2.82222222, 3.02222222, 3.22222222]), 'L2': array([-0.35555556, -0.15555556, 0.04444444, 0.24444444, 0.44444444,
0.64444444, 0.84444444, 1.04444444, 1.24444444, 1.44444444,
1.64444444, 1.84444444, 2.04444444, 2.24444444, 2.44444444,
2.64444444, 2.84444444, 3.04444444, 3.24444444, 3.44444444]), 'L3': array([-0.13333333, 0.06666667, 0.26666667, 0.46666667, 0.66666667,
0.86666667, 1.06666667, 1.26666667, 1.46666667, 1.66666667,
1.86666667, 2.06666667, 2.26666667, 2.46666667, 2.66666667,
2.86666667, 3.06666667, 3.26666667, 3.46666667, 3.66666667]), 'L4': array([0.08888889, 0.28888889, 0.48888889, 0.68888889, 0.88888889,
1.08888889, 1.28888889, 1.48888889, 1.68888889, 1.88888889,
2.08888889, 2.28888889, 2.48888889, 2.68888889, 2.88888889,
3.08888889, 3.28888889, 3.48888889, 3.68888889, 3.88888889]), etc.
for i in ycalc:
ydiff = - i + array(ydata)
print(ydiff)
returns [-0.2 0. -0.5 0.4 0.5 -0.2 0.7 0.6 0.7 1.2 0.3 1. 0.8 1.2
1.5 1.2 1.3 3.3 1.7 1.8]
but I want something like this:
([-0.2 0. -0.5 0.4 0.5 -0.2 0.7 0.6 0.7 1.2 0.3 1. 0.8 1.2 1.5 1.2 1.3 3.3 1.7 1.8]), ([-0.3 0.1 -0.6 0.4 0.5 -0.2 0.2 0.6 0.8 1.2 0.5 1. 0.8 1.2 1.5 1.2 1.3 3.3 1.7 1.8]), etc.

Regex expression to remove specific characters

I'm currently trying to get into regex expressions - at the moment I want to write one which acts as follows:
import regex
a = '[[0.1 0.1 0.1 0.1]\n [1.2 1.2 1.2 1.2]\n [2.3 2.3 2.3 2.3]\n [3.4 3.4 3.4 3.4]]'
a_transformed = re.sub(regex_expression, a)
# a_transformed = '0.1 0.1 0.1 0.1 1.2 1.2 1.2 1.2 2.3 2.3 2.3 2.3 3.4 3.4 3.4 3.4'
Basically I only need to sub all occurences of (,n,[,]), but currently I'm struggling to get the expression right.
Thanks for the help in advance!
You can try the following:
>>> re.sub(r'[^\d. ]', '', a)
'0.1 0.1 0.1 0.1 1.2 1.2 1.2 1.2 2.3 2.3 2.3 2.3 3.4 3.4 3.4 3.4'
Here '[^\d. ]' means anything except a digit, '.' and space like characters. ^ inside [] means negate this character group.

how to generate a list within a list delimited by a space

how do i replicate the structure of result of itertools.product?
so as you know itertools.product gives us an object and we need to put them in a list so we can print it
.. something like this.. right?
import itertools
import numpy as np
CN=np.asarray((itertools.product([0,1], repeat=5)))
print(CN)
i want to be able to make something like that but i want the data to be from a csv file.. so i want to make something like this
#PSEUDOCODE
import pandas as pd
df = pd.read_csv('csv here')
#a b c d are the columns that i want to get
x = list(df['a'] df['c'] df['c'] df['d'])
print(x)
so the result will be something like this
[[5.1 3.5 1.4 0.2]
[4.9 3. 1.4 0.2]
[4.7 3.2 1.3 0.2]
[4.6 3.1 1.5 0.2]
[5. 3.6 1.4 0.2]
[5.4 3.9 1.7 0.4]
[4.6 3.4 1.4 0.3]
[5. 3.4 1.5 0.2]
[4.4 2.9 1.4 0.2]
[4.9 3.1 1.5 0.1]]
how can i do that?
EDIT:
i am trying to learn how to do recursive feature elimination and i saw in some codes in google that they use the iris data set..
from sklearn import datasets
dataset = datasets.load_iris()
x = dataset.data
print(x)
and when printed it looked something like this
[[5.1 3.5 1.4 0.2]
[4.9 3. 1.4 0.2]
[4.7 3.2 1.3 0.2]
[4.6 3.1 1.5 0.2]
[5. 3.6 1.4 0.2]
[5.4 3.9 1.7 0.4]
[4.6 3.4 1.4 0.3]
[5. 3.4 1.5 0.2]
[4.4 2.9 1.4 0.2]
[4.9 3.1 1.5 0.1]]
how could i make my dataset something like that so i can use this RFE template ?
# Recursive Feature Elimination
from sklearn import datasets
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
# load the iris datasets
dataset = datasets.load_iris()
# create a base classifier used to evaluate a subset of attributes
model = LogisticRegression()
# create the RFE model and select 3 attributes
rfe = RFE(model, 3)
print(rfe)
rfe = rfe.fit(dataset.data, dataset.target)
print("features:",dataset.data)
print("target:",dataset.target)
print(rfe)
# summarize the selection of the attributes
print(rfe.support_)
print(rfe.ranking_)
You don't have to. If you want to use rfe.fit function, you need to feed features and target seperately.
So if your df is like:
a b c d target
0 5.1 3.5 1.4 0.2 1
1 4.9 3.0 1.4 0.2 1
2 4.7 3.2 1.3 0.2 0
3 4.6 3.1 1.5 0.2 0
4 5.0 3.6 1.4 0.2 1
5 5.4 3.9 1.7 0.4 1
6 4.6 3.4 1.4 0.3 0
7 5.0 3.4 1.5 0.2 0
8 4.4 2.9 1.4 0.2 1
9 4.9 3.1 1.5 0.1 1
you can use:
...
rfe = rfe.fit(df[['a', 'b', 'c', 'd']], df['target'])
...

Python numpy.linspace behaves strangely with floats

I have an issue with numpy linspace
import numpy as np
temp = np.linspace(1,2,11)
for t in temp:
print(t)
This return :
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7000000000000002
1.8
1.9
2.0
The 1.7 value looks definitely wrong.
It seems related to this issue https://github.com/numpy/numpy/issues/8909
Does anybody ever had such a problem with numpy.linspace ? is it a known issue ?
François
This is nothing to do with numpy, consider:
>>> temp = np.linspace(1,2,11)
>>> temp
array([1. , 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2. ])
>>> # ^ look, numpy displays it fine
>>> for t in temp:
... print(t)
...
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7000000000000002
1.8
1.9
2.0
The "issue" is with how computers represent floats in general. See: https://docs.python.org/3/tutorial/floatingpoint.html.

Reshape subsections of Pandas DataFrame into a wide format

I am importing data from a PDF which has not been optimised for analysis.
The data has been imported into the following dataframe
NaN NaN Plant_A NaN Plant_B NaN
Pre 1,2 1.1 1.2 6.1 6.2
Pre 3,4 1.3 1.4 6.3 6.4
Post 1,2 2.1 2.2 7.1 7.2
Post 3,4 2.3 2.4 7.3 7.4
and I would like to reorganise it into the following form:
Pre_1 Pre_2 Pre_3 Pre_4 Post_1 Post_2 Post_3 Post_4
Plant_A 1.1 1.2 1.3 1.4 2.1 2.2 2.3 2.4
Plant_B 6.1 6.2 6.3 6.4 7.1 7.2 7.3 7.4
I started by splitting the 2nd column by commas, and then combining that with the first column to give me Pre_1 and Pre_2 for instance. However I have struggled to match that with the data in the rest of the columns. For instance, Pre_1 with 1.1 and Pre_2 with 1.2
Any help would be greatly appreciated.
I had to make some assumptions in regards to consistency of your data
from itertools import cycle
import pandas as pd
tracker = {}
for temporal, spec, *data in df.itertuples(index=False):
data = data[::-1]
cycle_plant = cycle(['Plant_A', 'Plant_B'])
spec_i = spec.split(',')
while data:
plant = next(cycle_plant)
for i in spec_i:
tracker[(plant, f"{temporal}_{i}")] = data.pop()
pd.Series(tracker).unstack()
Post_1 Post_2 Post_3 Post_4 Pre_1 Pre_2 Pre_3 Pre_4
Plant_A 2.1 2.2 2.3 2.4 1.1 1.2 1.3 1.4
Plant_B 7.1 7.2 7.3 7.4 6.1 6.2 6.3 6.4

Categories

Resources