I have a .txt file and would like to plot the data.
Here is my code:
import matplotlib.pyplot as plt
with open('/home/tont_fe/Desktop/Extra_for_paper/Lifbase OH emission2.txt') as f:
    lines = f.readlines()
    x = [(line.split()[0]) for line in lines]
    item = [float(item) for item in x]
    y = [(line.split()[1]) for line in lines]
I receive the following error:
ValueError: invalid literal for float(): 281,2
example of data:
281,228 0,01097
281,2289 0,0096
281,2297 0,00888
281,2306 0,00883
281,2315 0,00932
281,2324 0,01008
281,2333 0,01062
281,2341 0,01058
281,235 0,01013
281,2359 0,00981
281,2367 0,01013
281,2376 0,01141
281,2385 0,01377
ValueError: invalid literal for float(): 281,2
float() expects . as the decimal separator but got , which caused the ValueError. You can either replace , with . before passing the string to float(), or use the built-in locale module as follows:
import locale
locale.setlocale(locale.LC_NUMERIC, 'de_DE') # German locale here, but any installed locale that uses , as the decimal separator works
value_str = "281,2"
value_float = locale.atof(value_str)
print(value_float)
output
281.2
You are trying to convert a string containing , to float. That's why you get the error!
Replace the , with . and then convert to float. (Simply removing the comma would turn 281,228 into 281228.)
with open('/home/tont_fe/Desktop/Extra_for_paper/Lifbase OH emission2.txt') as f:
    lines = f.readlines()
    x = [(line.split()[0]) for line in lines]
    # This line replaces ',' with '.' in each string of x
    x = [i.replace(',', '.') for i in x]
    item = [float(item) for item in x]
    y = [(line.split()[1]) for line in lines]
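Putting the pieces together, here is a minimal sketch that parses both decimal-comma columns into floats ready for plotting. The helper name parse_decimal_comma and the inline sample text are assumptions for illustration; in practice the lines would come from the file.

```python
def parse_decimal_comma(lines):
    """Return two lists of floats from whitespace-separated columns
    that use ',' as the decimal mark."""
    xs, ys = [], []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        xs.append(float(fields[0].replace(',', '.')))
        ys.append(float(fields[1].replace(',', '.')))
    return xs, ys

# Hypothetical stand-in for the contents of the data file
sample = """281,228 0,01097
281,2289 0,0096
281,2297 0,00888"""

x, y = parse_decimal_comma(sample.splitlines())
print(x)
print(y)
# To plot: import matplotlib.pyplot as plt; plt.plot(x, y); plt.show()
```

Both columns are converted here, since y would otherwise still be a list of strings when passed to the plot call.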
I am attempting to import a CSV file into Python. After importing the CSV, I want to add up every ['Spent Past 6 Months'] value; however, the "$" symbol that the CSV includes in front of that value is causing me problems. I've tried a number of things to get rid of that symbol and I'm honestly lost at this point!
I'm really new to Python, so I apologize if there is something very simple here that I am missing.
What I have coded is listed below. My output is listed first:
File "customer_regex2.py", line 24, in <module>
top20Cust = top20P(data)
File "customer_regex2.py", line 15, in top20P
data1 += data1 + int(a[i]['Spent Past 6 Months'])
ValueError: invalid literal for int() with base 10: '$2099.83'
import csv
import re
data = []
with open('customerData.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        data.append(row)
def top20P(a):
    outputList=[]
    data1=0
    for i in range(0,len(a)):
        data1 += data1 + int(a[i]['Spent Past 6 Months'])
    top20val= int(data1*0.8)
    for j in range(0,len(a)):
        if data[j]['Spent Past 6 Months'] >= top20val:
            outputList.append('a[j]')
    return outputList
top20Cust = top20P(data)
print(outputList)
It looks like a datatype issue.
You could strip the $ characters like so:
someString = '$2099.83'
someString = someString.strip('$')
print(someString)
2099.83
Now the last step is to wrap in float() since you have decimal values.
print(type(someString))
<class 'str'>
someFloat = float(someString)
print(type(someFloat))
<class 'float'>
Hope that helps.
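Applied to the CSV in the question, the same idea might look like the sketch below. The helper name parse_money, the inline sample rows, and the handling of thousands separators are assumptions for illustration; only the column name comes from the question.

```python
import csv
import io

def parse_money(text):
    """Convert a string such as '$2099.83' to a float.
    Assumes a leading '$' and optional ',' thousands separators."""
    return float(text.strip().lstrip('$').replace(',', ''))

# Hypothetical stand-in for customerData.csv
sample_csv = io.StringIO(
    "Name,Spent Past 6 Months\n"
    "Alice,$2099.83\n"
    "Bob,$150.00\n"
)

rows = list(csv.DictReader(sample_csv))
total = sum(parse_money(row['Spent Past 6 Months']) for row in rows)
print(round(total, 2))
```

Converting once, at the point where the value is read, keeps the later comparisons numeric instead of comparing strings with numbers.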
I need to split a string into three values (x, y, z). The string looks something like this: (48,25,19)
I used "re.findall" and it works fine, but sometimes it produces this error:
plane_X, plane_Y, plane_Z = re.findall("\d+.\d+", planepos)
ValueError: not enough values to unpack (expected 3, got 0)
This is the code:
def read_data():
    # reading from file
    file = open("D:/Cs/Grad/Tests/airplane test/Reading/Positions/PlanePos.txt", "r")
    planepos = file.readline()
    file.close()
    file = open("D:/Cs/Grad/Tests/airplane test/Reading/Positions/AirportPosition.txt", "r")
    airportpos = file.readline()
    file.close()
    # ==================================================================
    # splitting and getting numbers
    plane_X, plane_Y, plane_Z = re.findall("\d+\.\d+", planepos)
    airport_X, airport_Y, airport_Z = re.findall("\d+\.\d+", airportpos)
    return plane_X,plane_Y,plane_Z,airport_X,airport_Y,airport_Z
What I need is to split the string (48,25,19) into x=48, y=25, z=19.
If someone knows a better way to do this, or how to solve this error, it would be appreciated.
Your regex only works for numbers with a decimal point and not for integers, hence the error. You can instead strip the string of parentheses and white spaces, then split the string by commas, and map the resulting sequence of strings to the float constructor:
x, y, z = map(float, planepos.strip('() \n').split(','))
You can use ast.literal_eval which safely evaluates your string:
import ast
s = '(48,25,19)'
x, y, z = ast.literal_eval(s)
# x => 48
# y => 25
# z => 19
If your numbers are integers, you can use the regex:
re.findall(r"\d+","(48,25,19)")
['48', '25', '19']
If there are mixed numbers:
re.findall(r"\d+(?:\.\d+)?","(48.2,25,19.1)")
['48.2', '25', '19.1']
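Since the error in the question only shows up "sometimes" (for example when the file yields an empty line), it may help to check the number of matches before unpacking. Below is a minimal sketch along those lines; the function name parse_triplet and the optional leading minus sign in the pattern are assumptions, not part of the answers above.

```python
import re

# Integers or decimals, with an optional leading minus sign (assumption)
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")

def parse_triplet(text):
    """Extract exactly three numbers from a string like '(48,25,19)'."""
    values = [float(m) for m in NUMBER.findall(text)]
    if len(values) != 3:
        raise ValueError("expected 3 numbers, got %d in %r" % (len(values), text))
    return tuple(values)

print(parse_triplet("(48,25,19)"))
print(parse_triplet("(48.2, -25, 19.1)\n"))
```

An empty or malformed line then fails with a message that shows the offending text, rather than the less informative unpacking error.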
How can I read a text file in which each line has three floating point numbers, each with three digits after the decimal point? The numbers are separated by commas followed by one or more white spaces.
The text file (first four observations) looks like this:
-0.340, 1.572, 0.616
-0.948, 1.701, 0.377
0.105, 2.426, 1.265
-0.509, 2.668, 1.079
Desired output:
array = [[-0.340 1.572 0.616],
[-0.948 1.701 0.377],
[0.105 2.426 1.265],
[-0.509 2.668 1.079]]
with open("YourFileName") as fh:
    raw = fh.read()
# splitlines() plus the filter skips the empty string left by a trailing newline
data = [[float(i) for i in k.split(",")] for k in raw.splitlines() if k.strip()]
Use the csv module and convert to float; it's simple:
import csv
with open("test.csv") as f:
    array = [[float(x) for x in row] for row in csv.reader(f)]
In this simple case you can get the same result without csv:
    array = [[float(x) for x in row.split(",")] for row in f]
In both cases the result is:
[[-0.34, 1.572, 0.616], [-0.948, 1.701, 0.377], [0.105, 2.426, 1.265], [-0.509, 2.668, 1.079]]
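A slightly more defensive variant of the csv approach is sketched below. The function name read_float_rows and the in-memory sample are assumptions for illustration. Note that skipinitialspace=True is not strictly required, because float() tolerates surrounding whitespace, but it keeps the fields clean if they are used for anything else.

```python
import csv
import io

def read_float_rows(fileobj):
    """Parse comma-separated floats, ignoring spaces after commas
    and skipping blank lines."""
    reader = csv.reader(fileobj, skipinitialspace=True)
    # csv.reader yields an empty list for a blank line, which `if row` drops
    return [[float(x) for x in row] for row in reader if row]

# Hypothetical stand-in for the text file, with a trailing blank line
sample = io.StringIO("-0.340, 1.572, 0.616\n-0.948,  1.701, 0.377\n\n")
array = read_float_rows(sample)
print(array)
```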
You should read the whole file, split it into lines, and then split each line into values separated by commas. Finally, use numpy.array to turn the result into an array:
import numpy as np
filename = r'****.txt'
with open(filename) as f:
    txt = f.read()
ls = []
for line in txt.split('\n'):
    if not line.strip():
        continue  # skip blank lines, e.g. after a trailing newline
    sub_ls = line.split(',')
    ls.append(sub_ls)
# np.float was removed in NumPy 1.24; the builtin float is equivalent
print(np.array(ls, dtype=float))
# omitting the ", dtype=float" would result in an array of strings
OUTPUT:
[[-0.34 1.572 0.616]
[-0.948 1.701 0.377]
[ 0.105 2.426 1.265]
[-0.509 2.668 1.079]]
Use numpy's loadtxt with the comma as delimiter:
import numpy as np
x = np.loadtxt('YOUR_TXT_FILE.txt', comments='#', delimiter=',')
Just use numpy.
import numpy as np
arr= np.loadtxt('data.txt',delimiter=',')
print("arr = {}".format(arr))
'''
arr = [[-0.34 1.572 0.616]
[-0.948 1.701 0.377]
[ 0.105 2.426 1.265]
[-0.509 2.668 1.079]]
'''
Given the following script to read in latitude, longitude, and magnitude data:
#!/usr/bin/env python
# Read in latitudes and longitudes
eq_data = open('lat_long')
lats, lons = [], []
for index, line in enumerate(eq_data.readlines()):
    if index > 0:
        lats.append(float(line.split(',')[0]))
        lons.append(float(line.split(',')[1]))
#Build the basemap
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np
antmap = Basemap(projection='spstere', boundinglat=-20, lon_0=-60, resolution='f')
antmap.drawcoastlines(color='0.50', linewidth=0.25)
antmap.fillcontinents(color='0.95')
x,y = antmap(lons, lats)
antmap.plot(x,y, 'r^', markersize=4)
plt.show()
I receive the following error when attempting to read in the latitudes, longitudes, and magnitudes:
Traceback (most recent call last):
File "./basic_eqplot.py", line 10, in <module>
lats.append(float(line.split(',')[0]))
ValueError: invalid literal for float(): -18.381 -172.320 5.9
The input file looks something like:
-14.990,167.460,5.6
-18.381,-172.320,5.9
-33.939,-71.868,5.9
-22.742,-63.571,5.9
-2.952,129.219,5.7
Any ideas for why this would cause a hiccup?
It appears you have one or more lines of corrupt data in your input file. Your traceback says as much:
ValueError: invalid literal for float(): -18.381 -172.320 5.9
Specifically, this is what is happening:
The line -18.381 -172.320 5.9 is read in from eq_data.
split(',') is called on the string "-18.381 -172.320 5.9". Since there is no comma in the string, the split method returns a list with a single element, the original string.
You attempt to parse the first element of the returned list as a float. The string "-18.381 -172.320 5.9" cannot be parsed as a float and a ValueError is raised.
To fix this issue, double check the format of your input data. You might also try surrounding this code snippet in a try/except block to give you a bit more useful information as to the specific source of the problem:
for index, line in enumerate(eq_data.readlines()):
    if index > 0:
        try:
            lats.append(float(line.split(',')[0]))
            lons.append(float(line.split(',')[1]))
        except ValueError:
            raise ValueError("Unable to parse input file line #%d: '%s'" % (index + 1, line))
What is probably going on is that your input file has a malformed line where a space is used to separate fields instead of a comma.
As a consequence, the result of line.split(',')[0] is the whole input line (in your case "-18.381 -172.320 5.9").
More generally: for these types of problems I really like to use the Python csv module to parse the input file:
import csv
with open('lat_long', 'r') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        assert len(row) == 3
        lat, lon, mag = row
        ...
An alternative would be to use tools like pandas; but that might be overkill in some cases.
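If the file may contain more than one malformed line, collecting them all in a single pass can be more convenient than stopping at the first failure. The following is a minimal sketch of that idea; the function name read_quakes, the header row in the sample, and the report format are assumptions for illustration.

```python
import csv
import io

def read_quakes(fileobj, skip_header=True):
    """Return (good_rows, bad_lines) so malformed lines are reported
    instead of crashing the script."""
    good, bad = [], []
    reader = csv.reader(fileobj)
    if skip_header:
        next(reader, None)  # the original script also skips the first line
    for row in reader:
        try:
            if len(row) != 3:
                raise ValueError("expected 3 fields, got %d" % len(row))
            lat, lon, mag = (float(field) for field in row)
        except ValueError as exc:
            # reader.line_num is the number of the line just read
            bad.append((reader.line_num, row, str(exc)))
        else:
            good.append((lat, lon, mag))
    return good, bad

# Hypothetical stand-in for the lat_long file, including one corrupt line
sample = io.StringIO(
    "lat,lon,mag\n"
    "-14.990,167.460,5.6\n"
    "-18.381 -172.320 5.9\n"
    "-33.939,-71.868,5.9\n"
)
good, bad = read_quakes(sample)
print(good)
print(bad)
```

The bad list then points directly at the line numbers that need fixing in the input file.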
I've been reading an ASCII data file using Python. Then I convert the data into a numpy array.
However, I've noticed that the numbers are being rounded.
E.g. my original value from the file is: 2368999.932089
which Python has rounded to: 2368999.93209
Here is an example of my code:
import numpy as np
datafil = open("test.txt",'r')
tempvar = []
header = datafil.readline()
for line in datafil:
    word = line.split()
    char = word[0] # take the first element word[0] of the list
    word.pop() # remove the last element from the list "word"
    if char[0:3] >= '224' and char[0:3] < '225':
        tempvar.append(word)
strvar = np.array(tempvar,dtype = np.longdouble) # Here I want to read all data as double
print(strvar.shape)
var = strvar[:,0:23]
print(var[0,22]) # here it prints 2368999.93209 but the actual value is 2368999.932089
Any ideas guys?
Abedin
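I think this is not a problem with your code. It's the usual way Python displays floating point numbers. See
https://docs.python.org/2/tutorial/floatingpoint.html
I think that when you print it, print has already formatted your number with str(), which in Python 2 keeps only 12 significant digits; repr() shows that the stored value is unchanged: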
In [1]: a=2368999.932089
In [2]: print a
2368999.93209
In [3]: str(a)
Out[3]: '2368999.93209'
In [4]: repr(a)
Out[4]: '2368999.932089'
In [5]: a-2368999.93209
Out[5]: -9.997747838497162e-07
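The same distinction between the stored value and its displayed form can be reproduced on Python 3, where print already shows the full value. This is only a sketch: the '.12g' format is used here to mimic the 12-significant-digit display of Python 2's str(), and the variable name is taken from the session above.

```python
a = float("2368999.932089")

# '.12g' mimics the 12-significant-digit display used by Python 2's str()/print
short = format(a, ".12g")
print(short)

# repr() shows enough digits to round-trip the stored value
full = repr(a)
print(full)

# An explicit format gives full control over the displayed precision
fixed = "%.6f" % a
print(fixed)

# Converting the repr() back to float recovers the identical value,
# so nothing was lost when the number was stored
print(float(full) == a)
```

Choosing an explicit format such as "%.6f" when printing avoids depending on the default display rules of a particular Python version.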
I'm not totally sure what you're trying to do, but simplified with test.txt containing only
asdf
2368999.932089
and then the code:
import numpy as np
datafil = open("test.txt",'r')
tempvar = []
header = datafil.readline()
for line in datafil:
    tempvar.append(line)
print(tempvar)
# np.float was removed in NumPy 1.24; the builtin float is equivalent
strvar = np.array(tempvar, dtype=float)
print(strvar.shape)
print(strvar)
I get the following output:
$ python3 so.py
['2368999.932089']
(1,)
[ 2368999.932089]
which seems to be working fine.
Edit: Updated with your provided line, so test.txt is
asdf
t JD a e incl lasc aper truean rdnnode RA Dec RArate Decrate metdr1 metddr1 metra1 metdec1 metbeta1 metdv1 metsl1 metarrJD1 beta JDej name 223.187263 2450520.619348 3.12966 0.61835 70.7196 282.97 171.324 -96.2738 1.19968 325.317 35.8075 0.662368 0.364967 0.215336 3.21729 -133.586 46.4884 59.7421 37.7195 282.821 2450681.900221 0 2368999.932089 EH2003
and the code
import numpy as np
datafil = open("test.txt",'r')
tempvar = []
header = datafil.readline()
for line in datafil:
    tempvar.append(line.split(' '))
print(tempvar)
# np.float was removed in NumPy 1.24; the builtin float is equivalent
strvar = np.array(tempvar[0][-2], dtype=float)
print(strvar)
The last print still outputs 2368999.932089 for me, so I'm guessing this is a platform issue. What happens if you force dtype=np.float64 or, where your platform provides it, dtype=np.float128? Some other sanity checks: have you tried printing the text before it is converted to a float? And what do you get from doing something like:
>>> np.array('2368999.932089')
array('2368999.932089',
dtype='<U14')
>>> float('2368999.932089')
2368999.932089