I have a text file of points, each with 4 dimensions.
The file is like this:
4.8 3.4 1.6 0.2
4.8 3.0 1.4 0.1
4.3 3.0 1.1 0.1
5.8 4.0 1.2 0.2
5.7 4.4 1.5 0.4
5.4 3.9 1.3 0.4
5.1 3.5 1.4 0.3
I want to read the file and store each line as a separate list. For instance, point1 = [4.8, 3.4, 1.6, 0.2].
What I have done so far is :
f = open('points.txt', 'r')
data = f.readlines()
for line in data:
    pList = line.rstrip()
    print(pList)
I get a list of all the points.
You might find Python's CSV module useful for this:
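import csv
with open('points.txt', 'r') as f_input:
    # The sample rows are space-separated; use delimiter='\t' for a tab-separated file
    points = list(csv.reader(f_input, delimiter=' '))
# To convert to floats
points = [[float(value) for value in row] for row in points]
print(points)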
This would display the following:
[[4.8, 3.4, 1.6, 0.2], [4.8, 3.0, 1.4, 0.1], [4.3, 3.0, 1.1, 0.1], [5.8, 4.0, 1.2, 0.2], [5.7, 4.4, 1.5, 0.4], [5.4, 3.9, 1.3, 0.4], [5.1, 3.5, 1.4, 0.3]]
Try with :
f = open('points.txt', 'r')
data = f.readlines()
for line in data:
    points = line.split()
    print(points)
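Putting the two answers together, here is a minimal sketch of reading the file into one list of floats per line. The helper name read_points and the temporary demo file are made up for illustration; the sketch assumes whitespace-separated columns, as in the sample above.

```python
import os
import tempfile


def read_points(path):
    """Return one list of floats per non-empty line of a whitespace-separated file."""
    points = []
    with open(path, "r") as f:
        for line in f:
            fields = line.split()  # splits on spaces or tabs
            if fields:             # skip blank lines
                points.append([float(value) for value in fields])
    return points


# Demo with a temporary copy of the first three sample rows
sample = "4.8 3.4 1.6 0.2\n4.8 3.0 1.4 0.1\n4.3 3.0 1.1 0.1\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
points = read_points(tmp.name)
os.remove(tmp.name)
print(points)
```

Each point is then available by position, for example points[0] for the first line.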
I am trying to subtract a list of values from each key in a dictionary. Each key in the dictionary holds 20 y-values for a predicted line. I want to find the difference between these y-values and a separate set of given values.
ydata contains 20 points. ycalc has 100 keys, L0 to L99, and each key also holds 20 points. I want to subtract each key's values from ydata. This is what I have tried; the main issue is that my method returns a single list of 20 values, when I expect 100 lists of 20 values each.
ydata = [1.2, 1.8, 1.7, 3.0, 3.5, 3.2, 4.5, 4.8, 5.3, 6.2, 5.7, 6.8, 7.0, 7.8, 8.5, 8.6, 9.1, 11.5, 10.3, 10.8]
ycalc = 'L0': array([-0.8, -0.6, -0.4, -0.2, 0. , 0.2, 0.4, 0.6, 0.8, 1. , 1.2,
1.4, 1.6, 1.8, 2. , 2.2, 2.4, 2.6, 2.8, 3. ]), 'L1': array([-0.57777778, -0.37777778, -0.17777778, 0.02222222, 0.22222222,
0.42222222, 0.62222222, 0.82222222, 1.02222222, 1.22222222,
1.42222222, 1.62222222, 1.82222222, 2.02222222, 2.22222222,
2.42222222, 2.62222222, 2.82222222, 3.02222222, 3.22222222]), 'L2': array([-0.35555556, -0.15555556, 0.04444444, 0.24444444, 0.44444444,
0.64444444, 0.84444444, 1.04444444, 1.24444444, 1.44444444,
1.64444444, 1.84444444, 2.04444444, 2.24444444, 2.44444444,
2.64444444, 2.84444444, 3.04444444, 3.24444444, 3.44444444]), 'L3': array([-0.13333333, 0.06666667, 0.26666667, 0.46666667, 0.66666667,
0.86666667, 1.06666667, 1.26666667, 1.46666667, 1.66666667,
1.86666667, 2.06666667, 2.26666667, 2.46666667, 2.66666667,
2.86666667, 3.06666667, 3.26666667, 3.46666667, 3.66666667]), 'L4': array([0.08888889, 0.28888889, 0.48888889, 0.68888889, 0.88888889,
1.08888889, 1.28888889, 1.48888889, 1.68888889, 1.88888889,
2.08888889, 2.28888889, 2.48888889, 2.68888889, 2.88888889,
3.08888889, 3.28888889, 3.48888889, 3.68888889, 3.88888889]), etc.
for i in ycalc:
    ydiff = - i + array(ydata)
print(ydiff)
returns [-0.2 0. -0.5 0.4 0.5 -0.2 0.7 0.6 0.7 1.2 0.3 1. 0.8 1.2
1.5 1.2 1.3 3.3 1.7 1.8]
but I want something like this:
([-0.2 0. -0.5 0.4 0.5 -0.2 0.7 0.6 0.7 1.2 0.3 1. 0.8 1.2 1.5 1.2 1.3 3.3 1.7 1.8]), ([-0.3 0.1 -0.6 0.4 0.5 -0.2 0.2 0.6 0.8 1.2 0.5 1. 0.8 1.2 1.5 1.2 1.3 3.3 1.7 1.8]), etc.
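One possible approach, sketched under assumptions: iterating over a dictionary yields its keys, and reassigning ydiff on every pass keeps only the last result, so a dictionary comprehension over ycalc.items() keeps one difference array per key. The ycalc built below is a made-up stand-in with the same shape as the one described (100 keys, 20 values each), not the original data.

```python
import numpy as np

ydata = np.array([1.2, 1.8, 1.7, 3.0, 3.5, 3.2, 4.5, 4.8, 5.3, 6.2,
                  5.7, 6.8, 7.0, 7.8, 8.5, 8.6, 9.1, 11.5, 10.3, 10.8])

# Hypothetical stand-in for ycalc: keys 'L0'..'L99', 20 values per key
base_line = np.linspace(-0.8, 3.0, 20)
ycalc = {f"L{i}": base_line + 0.1 * i for i in range(100)}

# One difference array per key instead of overwriting a single variable
ydiff = {key: ydata - values for key, values in ycalc.items()}

# Optional: stack everything into a (100, 20) array, in key insertion order
ydiff_matrix = np.array(list(ydiff.values()))
print(ydiff_matrix.shape)
```

Keeping the result as a dictionary preserves the link between each key and its differences; the stacked array is convenient when the keys are no longer needed.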
I have a pandas dataframe with index 3 to 15 with 0.5 steps and want to reindex it to 0.1 steps.
I tried this code, but it doesn't work:
import numpy as np
import pandas as pd
# create data and set index and print for verification
df = pd.DataFrame({'A':np.arange(3,5,0.5),'B':np.arange(3,5,0.5)})
df.set_index('A', inplace = True)
df.reindex(np.arange(3,5,0.1)).head(15)
The above code outputs this:
       B
A
3.0  3.0
3.1  NaN
3.2  NaN
3.3  NaN
3.4  NaN
3.5  NaN   <- expected 3.5 here, since it exists in the original df
3.6  NaN
3.7  NaN
3.8  NaN
Strangely the problem is fixed when reindexing from 0 instead of 3 as it's shown in the code below:
df = pd.DataFrame({'A':np.arange(3,5,0.5),'B':np.arange(3,5,0.5)})
df.set_index('A', inplace = True)
print(df.head())
df.reindex(np.arange(0,5,0.1)).head(60)
The output now correctly shows
       B
A
0.0  NaN
...  ...
3.0  3.0
3.1  NaN
3.2  NaN
3.3  NaN
3.4  NaN
3.5  3.5
3.6  NaN
3.7  NaN
3.8  NaN
I'm running python 3.8.5 on Windows 10.
Pandas version is 1.4.07
Numpy version is 1.22.1
Does anyone know why this happens? Is it a known or a new bug, and has it been fixed in a newer version of Python, pandas, or NumPy?
Thanks
Good question.
The reason is that np.arange(3,5,0.1) produces a value that is not exactly 3.5; it is 3.5000000000000004. By contrast, np.arange(0,5,0.1) does produce a value that is exactly 3.5, and so does np.arange(3,5,0.5).
pd.Index(np.arange(3,5,0.1))
Float64Index([ 3.0, 3.1, 3.2,
3.3000000000000003, 3.4000000000000004, 3.5000000000000004,
3.6000000000000005, 3.7000000000000006, 3.8000000000000007,
3.900000000000001, 4.000000000000001, 4.100000000000001,
4.200000000000001, 4.300000000000001, 4.400000000000001,
4.500000000000002, 4.600000000000001, 4.700000000000001,
4.800000000000002, 4.900000000000002],
dtype='float64')
and
pd.Index(np.arange(0,5,0.1))
Float64Index([ 0.0, 0.1, 0.2,
0.30000000000000004, 0.4, 0.5,
0.6000000000000001, 0.7000000000000001, 0.8,
0.9, 1.0, 1.1,
1.2000000000000002, 1.3, 1.4000000000000001,
1.5, 1.6, 1.7000000000000002,
1.8, 1.9000000000000001, 2.0,
2.1, 2.2, 2.3000000000000003,
2.4000000000000004, 2.5, 2.6,
2.7, 2.8000000000000003, 2.9000000000000004,
3.0, 3.1, 3.2,
3.3000000000000003, 3.4000000000000004, 3.5,
3.6, 3.7, 3.8000000000000003,
3.9000000000000004, 4.0, 4.1000000000000005,
4.2, 4.3, 4.4,
4.5, 4.6000000000000005, 4.7,
4.800000000000001, 4.9],
dtype='float64')
and
pd.Index(np.arange(3,5,0.5))
Float64Index([3.0, 3.5, 4.0, 4.5], dtype='float64')
This is definitely related to Numpy:
np.arange(3,5,0.1)[5]
3.5000000000000004
and
np.arange(3,5,0.1)[5] == 3.5
False
This situation is documented in the Numpy arange doc:
https://numpy.org/doc/stable/reference/generated/numpy.arange.html
The length of the output might not be numerically stable.
Another stability issue is due to the internal implementation of
numpy.arange. The actual step value used to populate the array is
dtype(start + step) - dtype(start) and not step. Precision loss can
occur here, due to casting or due to using floating points when start
is much larger than step. This can lead to unexpected behaviour.
It looks like np.linspace might be able to help you out here:
pd.Index(np.linspace(3,5,num=21))
Float64Index([3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2,
4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0],
dtype='float64')
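As an alternative to np.linspace, here is a small sketch of two other ways to get labels that compare equal to 3.5: rounding the arange result, or building the grid from integers and dividing once. The variable names are made up for illustration, and either array could then be passed to df.reindex in place of np.arange(3,5,0.1).

```python
import numpy as np

# arange accumulates rounding error, so the sixth value is not exactly 3.5
stepped = np.arange(3, 5, 0.1)

# Option 1: round to the intended number of decimals
rounded = np.round(stepped, 1)

# Option 2: build the grid from integers and divide once
scaled = np.arange(30, 50) / 10

print(stepped[5] == 3.5, rounded[5] == 3.5, scaled[5] == 3.5)

# The original index labels can now be found in the new grid
original_labels = np.arange(3, 5, 0.5)
print(np.isin(original_labels, scaled))
```

Exact float matching stays fragile in general, so this only helps when the labels are meant to sit on a fixed decimal grid.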
I have an issue with numpy linspace
import numpy as np
temp = np.linspace(1,2,11)
for t in temp:
    print(t)
This returns:
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7000000000000002
1.8
1.9
2.0
The 1.7 value looks definitely wrong.
It seems related to this issue https://github.com/numpy/numpy/issues/8909
Has anybody else had this problem with numpy.linspace? Is it a known issue?
François
This has nothing to do with NumPy. Consider:
>>> temp = np.linspace(1,2,11)
>>> temp
array([1. , 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2. ])
>>> # ^ look, numpy displays it fine
>>> for t in temp:
...     print(t)
...
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7000000000000002
1.8
1.9
2.0
The "issue" is with how computers represent floats in general. See: https://docs.python.org/3/tutorial/floatingpoint.html.
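If the goal is only clean output or a reliable comparison, a minimal sketch along these lines may help: format the value when printing, and compare with a tolerance instead of ==. The names labels and close_to_target are made up for this example.

```python
import numpy as np

temp = np.linspace(1, 2, 11)

# Format for display instead of printing the full float repr
labels = [f"{t:.1f}" for t in temp]
print(labels[7])

# Compare with a tolerance instead of ==
close_to_target = np.isclose(temp[7], 1.7)
print(close_to_target)
```

The stored value is unchanged; only the way it is shown and compared differs.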
My Python 3.5.2 output in the terminal (on a Mac) is limited to a width of about 80 characters, even if I increase the size of the terminal window.
This narrow width causes a lot of line breaks when printing long arrays, which is a real hassle. How do I tell Python to use the full width of the command-line window?
For the record, I am not seeing this problem in any other program; for instance, my C++ output looks just fine.
For NumPy, it turns out you can enable the full output by setting
np.set_printoptions(suppress=True,linewidth=np.nan,threshold=np.nan).
Newer NumPy releases reject np.nan for threshold; sys.maxsize works in its place.
In Python 3.7 and above, you can use
from shutil import get_terminal_size
pd.set_option('display.width', get_terminal_size()[0])
I have the same problem while using pandas. So if this is what you are trying to solve, I fixed mine by doing
pd.set_option('display.width', pd.util.terminal.get_terminal_size()[0])
Default output of a 2x15 matrix is broken:
a.T
array([[ 0.2, -1.4, -0.8, 1.3, -1.5, -1.4, 0.6, -1.5, 0.4, -0.9, 0.3,
1.1, 0.5, -0.3, 1.1],
[ 1.3, -1.2, 1.6, -1.4, 0.9, -1.2, -1.9, 0.9, 1.8, -1.8, 1.7,
-1.3, 1.4, -1.7, -1.3]])
Output is fixed using numpy set_printoptions() command
import sys
np.set_printoptions(suppress=True,linewidth=sys.maxsize,threshold=sys.maxsize)
a.T
[[ 0.2 -1.4 -0.8 1.3 -1.5 -1.4 0.6 -1.5 0.4 -0.9 0.3 1.1 0.5 -0.3 1.1]
[ 1.3 -1.2 1.6 -1.4 0.9 -1.2 -1.9 0.9 1.8 -1.8 1.7 -1.3 1.4 -1.7 -1.3]]
System and numpy versions:
sys.version = 3.8.3 (default, Jul 2 2020, 17:30:36) [MSC v.1916 64 bit (AMD64)]
numpy.__version__ = 1.18.5
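Instead of an effectively unlimited line width, NumPy's line width can also be matched to the current terminal. This is a minimal sketch assuming the standard-library shutil.get_terminal_size; the fallback size and the sample array are made up for illustration.

```python
import shutil

import numpy as np

# Fall back to 80 columns when no terminal is attached (e.g. output is piped)
columns = shutil.get_terminal_size(fallback=(80, 24)).columns
np.set_printoptions(suppress=True, linewidth=columns)

# Sample 15x2 array; its transpose prints as 2 rows of 15 values
a = np.round(np.linspace(-2, 2, 30).reshape(15, 2), 1)
print(a.T)
```

The width is read once, so the call has to be repeated if the terminal window is resized later.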
I'm having trouble removing the elements in a range a through b from an array. The solutions I've found online seem to work only for individual elements, adjacent elements, or whole numbers. I'm dealing with floats.
self.genx = np.arange(0, 5, 0.1)
temp_select = self.genx[1:3] #I want to remove numbers from 1 - 3 from genx
print(temp_select)
self.genx = list(set(self.genx)-set(temp_select))
print(self.genx)
plt.plot(self.genx,self.geny)
However, I get the following in the console. I think this is because I'm subtracting floats rather than whole numbers, so it literally subtracts instead of removing the elements, which is what it would do with whole numbers:
genx: [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9]
temp_select: [ 0.1 0.2]
genx(after subtracted): [0.0, 0.5, 2.0, 3.0, 4.0, 1.5, 1.0, 1.1000000000000001, 0.70000000000000007, 0.90000000000000002, 2.7000000000000002, 0.30000000000000004, 2.9000000000000004, 1.9000000000000001, 3.3000000000000003, 0.40000000000000002, 4.7000000000000002, 3.4000000000000004, 2.2000000000000002, 2.8000000000000003, 1.4000000000000001, 0.60000000000000009, 3.6000000000000001, 1.3, 1.2000000000000002, 4.2999999999999998, 4.2000000000000002, 4.9000000000000004, 3.9000000000000004, 3.8000000000000003, 2.3000000000000003, 4.8000000000000007, 3.2000000000000002, 1.7000000000000002, 2.5, 3.5, 1.8, 4.1000000000000005, 2.4000000000000004, 4.4000000000000004, 1.6000000000000001, 0.80000000000000004, 2.6000000000000001, 4.6000000000000005, 2.1000000000000001, 3.1000000000000001, 3.7000000000000002, 4.5]
I didn't test this but you should be able to do something like the following:
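# exclusive bounds: keeps values equal to range_min or range_max
self.genx = [item for item in self.genx if not range_min < item < range_max]
# inclusive bounds: also removes values equal to range_min or range_max
self.genx = [item for item in self.genx if not range_min <= item <= range_max]
Is this what you want?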
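The same idea can be sketched with a NumPy boolean mask. Note that slicing with [1:3] selects by position, which is why temp_select came out as [0.1 0.2] in the question. The geny values and the range bounds below are made up for illustration; applying one mask to both arrays keeps x and y the same length for plotting.

```python
import numpy as np

genx = np.arange(0, 5, 0.1)
geny = genx ** 2  # hypothetical y-values, one per x

range_min, range_max = 1.0, 3.0  # assumed bounds of the range to remove

# True where the x-value lies outside [range_min, range_max]
keep = ~((genx >= range_min) & (genx <= range_max))

genx_kept = genx[keep]
geny_kept = geny[keep]
print(genx_kept)
```

Because the comparison is by value rather than by exact equality of set members, floating-point noise in the array does not cause elements to be missed.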