I am plotting a dataframe:
ax = df.plot()
fig = ax.get_figure()
fig.savefig("{}/{}ts.png".format(IMGPATH, series[pfxlen:]))
It works fine. But, on the console, I get:
/usr/lib64/python2.7/site-packages/matplotlib/axes.py:2542: UserWarning: Attempting to set identical left==right results in singular transformations; automatically expanding. left=736249.924955, right=736249.924955 + 'left=%s, right=%s') % (left, right))
Basic searching hasn't showed me how to solve this error. So, I want to suppress these errors, since they garbage up the console. How can I do this?
Those aren't errors, but warnings. If you aren't concerned by those and just want to silence them, it's as simple as:
import warnings
warnings.filterwarnings('ignore')
Additionally, pandas and other libraries may trigger NumPY floating-point errors. If you encounter those, you have to silence them as well:
import numpy as np
np.seterr('ignore')
Related
I have an array A of shape (1000, 2000). I use matplotlib.pyplot to plot the array, which means 1000 curves, using
import matplotlib.pyplot as plt
plt(A)
The figure is fine but there are a thousand lines of:
<matplotlib.lines.Line2D at 0xXXXXXXXX>
Can I disable this output?
This output is what the plt function is returning (I presume here you meant to write plt.plot(A)). To suppress this output assign the return object a name:
_ = plt.plot(A)
_ is often used to indicate a temporary object which is not going to be used later on. Note that this output you are seeing will only appear in the interpreter, and not when you run the script from outside the interpreter.
You can also suppress the output by use of ; at the end (assuming you are doing this in some sort of interactive environment)
plot(A);
plt.show()
This way there is no need to create unnecessary variables.
E.g.:
import matplotlib.pyplot as plt
plt.plot(A)
plt.show()
use a semi-colon after the plot command
eg:
plt.imshow(image,cmap);
will display the graph and stop the verbose
To ignore warnings
import warnings
warnings.filterwarnings("ignore")
This will resolve your issue.
I am new to Python, so I am not sure if this problem is due to my inexperience or whether this is a glitch.
I am running this code multiple times on the same data (no random number generation) and getting different results. This has occurred with more than one variable so far, and obviously I cannot proceed with the analysis until I figure out which results are trustworthy. Here is a short sample of the results I have obtained after running the code four times. Why is there such a discrepancy between these outputs? I am puzzled and greatly appreciate your advice.
Linear Regression
from scipy.stats import linregress
import scipy.stats
from scipy.signal import welch
import matplotlib
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import scipy.signal as signal
part_022_o = pd.read_excel(r'C:\Users\Me\Desktop\Behavioral Data Processed\part_022_combined_other.xlsx')
distance_o = part_022_o["distance"]
fs = 200
f, Pwelch_spec = signal.welch(distance_o, fs=fs, window='hanning',nperseg=400, noverlap=200, scaling='density', average='mean')
log_f = np.log(f, where=f>0)
log_pwelch = np.log(Pwelch_spec, where=Pwelch_spec>0)
idx = np.isfinite(log_f) & np.isfinite(log_pwelch)
polynomial_coefficients = np.polyfit(log_f[idx],log_pwelch[idx],1)
print(polynomial_coefficients)
scipy.stats.linregress(log_f[idx], log_pwelch[idx])
Results First Attempt
[ 0.00324568 -2.82962602]
Results Second Attempt
[-2.70137164 6.97117509]
Results Third Attempt
[-2.70137164 6.97117509]
Results Fourth Attempt
[-2.28028005 5.53839502]
The same thing happens when I use scipy.stats.linregress().
Thank you,
Confused
Edit: full code added.
Also, the issue appears to be related to np.log(), since only the values of "log_f" array seem to be changing with the different outputs. It is hard to be certain that nothing else is changing (e.g. log_pwelch), but differences in output clearly correspond to differences in the first value of the "log_f" array.
Edit: I have narrowed the issue down to np.log(f, where=f>0). The first value in the f array is zero. According to the documentation of numpy log, "...Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized." Apparently this means that the value or variable is unpredictable and can vary from trial to trial, which is exactly what I am observing. Given my inexperience with Python, I am not sure what the best solution is (e.g. specifying the out-array in the log function, use a random seed, just note the regression coefficients whenever the value of zero is unchanged after log, etc.)
Try to use a random seed to reproduce results. Do this with the following code at the top of your program:
import numpy as np
np.random.seed(123) or any number you want
see here for more info: https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.seed.html
A random seed ensures you get repeatable results when some part of your program is generating numbers at random.
Try finding out what the functions (np.polyfit(), np.log()) are actually doing using documentation.
This is standard practice for scikit-learn and ML to use a seed value.
In Python DataFrame, Im trying to generate histogram, it gets generated the first time when the function is called. However, when the create_histogram function is called second time it gets stuck at h = df.hist(bins=3, column="amount"). When I say "stuck", I mean to say that it does not finish executing the statement and the execution does not continue to the next line but at the same time it does not give any error or break out from the execution. What is exactly the problem here and how can I fix this?
import matplotlib.pyplot as plt
...
...
def create_histogram(self, field):
df = self.main_df # This is DataFrame
h = df.hist(bins=20, column="amount")
fileContent = StringIO()
plt.savefig(fileContent, dpi=None, facecolor='w', edgecolor='w',
orientation='portrait', papertype=None, format="png",
transparent=False, bbox_inches=None, pad_inches=0.5,
frameon=None)
content = fileContent.getvalue()
return content
Finally I figured this out myself.
Whenever I executed the function I was always getting the following log message but I was ignoring it due to my lack of awareness.
Backend TkAgg is interactive backend. Turning interactive mode on.
But then I realised that may be its running in interactive mode (which was not my purpose). So, I found out that there is a way to turn it off, which is given below.
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
And this fixed my issue.
NOTE: the use should be called immediately after importing matplotlib in the sequence given here.
How can I see a warning again without restarting python. Now I see them only once.
Consider this code for example:
import pandas as pd
pd.Series([1]) / 0
I get
RuntimeWarning: divide by zero encountered in true_divide
But when I run it again it executes silently.
How can I see the warning again without restarting python?
I have tried to do
del __warningregistry__
but that doesn't help.
Seems like only some types of warnings are stored there.
For example if I do:
def f():
X = pd.DataFrame(dict(a=[1,2,3],b=[4,5,6]))
Y = X.iloc[:2]
Y['c'] = 8
then this will raise warning only first time when f() is called.
However, now when if do del __warningregistry__ I can see the warning again.
What is the difference between first and second warning? Why only the second one is stored in this __warningregistry__? Where is the first one stored?
How can I see the warning again without restarting python?
As long as you do the following at the beginning of your script, you will not need to restart.
import pandas as pd
import numpy as np
import warnings
np.seterr(all='warn')
warnings.simplefilter("always")
At this point every time you attempt to divide by zero, it will display
RuntimeWarning: divide by zero encountered in true_divide
Explanation:
We are setting up a couple warning filters. The first (np.seterr) is telling NumPy how it should handle warnings. I have set it to show warnings on all, but if you are only interested in seeing the Divide by zero warnings, change the parameter from all to divide.
Next we change how we want the warnings module to always display warnings. We do this by setting up a warning filter.
What is the difference between first and second warning? Why only the second one is stored in this __warningregistry__? Where is the first one stored?
This is described in the bug report reporting this issue:
If you didn't raise the warning before using the simple filter, this
would have worked. The undesired behavior is because of
__warningsregistry__. It is set the first time the warning is emitted.
When the second warning comes through, the filter isn't even looked at.
I think the best way to fix this is to invalidate __warningsregistry__
when a filter is used. It would probably be best to store warnings data
in a global then instead of on the module, so it is easy to invalidate.
Incidentally, the bug has been closed as fixed for versions 3.4 and 3.5.
warnings is a pretty awesome standard library module. You're going to enjoy getting to know it :)
A little background
The default behavior of warnings is to only show a particular warning, coming from a particular line, on its first occurrence. For instance, the following code will result in two warnings shown to the user:
import numpy as np
# 10 warnings, but only the first copy will be shown
for i in range(10):
np.true_divide(1, 0)
# This is on a separate line from the other "copies", so its warning will show
np.true_divide(1, 0)
You have a few options to change this behavior.
Option 1: Reset the warnings registry
when you want python to "forget" what warnings you've seen before, you can use resetwarnings:
# warns every time, because the warnings registry has been reset
for i in range(10):
warnings.resetwarnings()
np.true_divide(1, 0)
Note that this also resets any warning configuration changes you've made. Which brings me to...
Option 2: Change the warnings configuration
The warnings module documentation covers this in greater detail, but one straightforward option is just to use a simplefilter to change that default behavior.
import warnings
import numpy as np
# Show all warnings
warnings.simplefilter('always')
for i in range(10):
# Now this will warn every loop
np.true_divide(1, 0)
Since this is a global configuration change, it has global effects which you'll likely want to avoid (all warnings anywhere in your application will show every time). A less drastic option is to use the context manager:
with warnings.catch_warnings():
warnings.simplefilter('always')
for i in range(10):
# This will warn every loop
np.true_divide(1, 0)
# Back to normal behavior: only warn once
for i in range(10):
np.true_divide(1, 0)
There are also more granular options for changing the configuration on specific types of warnings. For that, check out the docs.
I have an array A of shape (1000, 2000). I use matplotlib.pyplot to plot the array, which means 1000 curves, using
import matplotlib.pyplot as plt
plt(A)
The figure is fine but there are a thousand lines of:
<matplotlib.lines.Line2D at 0xXXXXXXXX>
Can I disable this output?
This output is what the plt function is returning (I presume here you meant to write plt.plot(A)). To suppress this output assign the return object a name:
_ = plt.plot(A)
_ is often used to indicate a temporary object which is not going to be used later on. Note that this output you are seeing will only appear in the interpreter, and not when you run the script from outside the interpreter.
You can also suppress the output by use of ; at the end (assuming you are doing this in some sort of interactive environment)
plot(A);
plt.show()
This way there is no need to create unnecessary variables.
E.g.:
import matplotlib.pyplot as plt
plt.plot(A)
plt.show()
use a semi-colon after the plot command
eg:
plt.imshow(image,cmap);
will display the graph and stop the verbose
To ignore warnings
import warnings
warnings.filterwarnings("ignore")
This will resolve your issue.