Tuples | Converting Strings to Floats - python

I'm following along with a Great Course tutorial for learning Python, and this code doesn't seem to work.
#Open File
filename = input("Enter the name of the data file: ")
infile = open(filename, 'r')
#Read in file
datalist = []
for line in infile:
    #Get data from line
    date, l, h, rf = (line.split(','))
    rainfall = float(rf)
    max_temp = float(h)
    min_temp = float(l)
    m, d, y = (date.split('/'))
    month = int(m)
    day = int(d)
    year = int(y)
    #Put data into list
    datalist.append([day,month,year,min_temp,max_temp,rainfall])
I'm trying to import a csv file, then create a tuple. The problem occurs when I'm converting the values in the tuple to floats. It works fine until it runs through the file. Then it presents me with this error:
Traceback (most recent call last):
  File "C:/Users/Devlin/PycharmProjects/untitled3/James' Programs/Weather.py", line 16, in <module>
    rainfall = float(rf)
ValueError: could not convert string to float:
Any ideas on what I am doing wrong?

It's hard to tell exactly what you are doing wrong without seeing the input file itself. Your values appear to be comma-separated, so you might be better off using the csv module from Python's standard library. The immediate problem, though, is that somewhere while iterating over the lines you encounter a string that does not represent a number, and passing it to float() raises a ValueError:
>>> float('spam')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float: 'spam'
One solution is to simply skip the strings that cannot be converted. There are two approaches: LBYL (Look Before You Leap) and EAFP (Easier to Ask for Forgiveness than Permission). In a nutshell, I suggest the latter. It is generally preferred in Python because of the gap between checking and using, during which things can change out from under you, and because you often have to handle the error anyway, even if you check first. Apart from that, instead of manually closing file-like objects later on (which you seem to have forgotten to do), I suggest using a with statement as a context manager, which closes the file for you automatically. Taking all that into account, something along the following lines should do the trick (disclaimer: not thoroughly tested):
import csv
data = []
filename = input('Enter the name of the data file: ')
with open(filename) as f:
    reader = csv.reader(f, delimiter=',', skipinitialspace=True)
    for line in reader:
        try:
            date, (l, h, rf) = line[0], map(float, line[1:])
            m, d, y = map(int, date.split('/'))
        except (ValueError, IndexError) as e:
            # IndexError covers blank lines, which csv.reader yields as []
            print('Skipping line: %s [because of: %s]' % (line, e))
            continue
        data.append([d, m, y, l, h, rf])
Hopefully this is enough to get you back on track ;-)
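To make the LBYL/EAFP contrast above a bit more concrete, here is a minimal sketch. The helper names to_float_lbyl and to_float_eafp are made up for illustration, and the LBYL check is deliberately naive, so treat it as a comparison of styles rather than a recommended validator.

```python
def to_float_lbyl(text):
    """LBYL: inspect the string first, convert only if it looks numeric."""
    candidate = text.strip()
    # Drop one optional leading sign before checking the digits.
    unsigned = candidate[1:] if candidate[:1] in ('+', '-') else candidate
    # Naive check (assumption for illustration): digits with at most one dot.
    # It rejects valid inputs such as '1e3', which is a typical LBYL pitfall.
    if unsigned.replace('.', '', 1).isdecimal():
        return float(candidate)
    return None


def to_float_eafp(text):
    """EAFP: attempt the conversion and handle the failure."""
    try:
        return float(text)
    except ValueError:
        return None
```

Both return None for an empty string, but only the EAFP version accepts everything float() itself accepts, because it does not try to re-implement float's parsing rules.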

Review your csv file.
It is hard to say without seeing what's in the file, but the most likely explanation (according to the error message) is that you have a line for which the fourth value is empty, e.g.:
2018-01-30,5,12,
So the rf variable would be an empty string when parsing that line, and you would then get that ValueError when trying to convert the value to a float.
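A quick way to check this diagnosis is to split such a line by hand. The sketch below reuses the example line above; the variable names are only illustrative.

```python
line = "2018-01-30,5,12,\n"
fields = line.strip().split(',')   # the trailing comma yields an empty last field

try:
    rainfall = float(fields[3])
except ValueError as exc:
    rainfall = None                # conversion failed for the empty string
    message = str(exc)
```

Printing fields for the offending line in the real file should show whether the last item is indeed an empty string.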
Also, some advice on how to do it better:
You may want to split your line first, then count how many data fields it has, and discard the line if the count is wrong before assigning the fields to date, l, h, rf. Something like this:
for line in infile:
    # Get data from line. Use strip() to avoid whitespace
    items = line.strip().split(',')
    if len(items) != 4:
        # skip the iteration - do nothing with that malformed line
        continue
    date, l, h, rf = items
You may want to have a look at the csv module for reading/writing csv files easily.

The error means that the string you are trying to convert to a float is not actually a number. In your case, it looks like it's an empty string, probably because the last line of your file is empty. You can check for that at the beginning of your loop and break or continue if it is. Another strategy would be to catch the error, but that would also silently ignore a malformed line that you might want to be alerted to, so it's up to you to pick the one that suits you.
Also, using square brackets puts your values in a list, not in a tuple. You need parentheses for that.
And you should also close your files when you are done.
Python also has a CSV module you may find useful.
#Open File
filename = input("Enter the name of the data file: ")
infile = open(filename, 'r')
#Read in file
datalist = []
for line in infile:
    if line.strip() == '': # If the line only contains spaces
        continue # Then, just skip it
    # Some stuff ...
    # Put everything in a tuple that we add to our list
    datalist.append((day,month,year,min_temp,max_temp,rainfall))
infile.close() # Close the file
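Putting the pieces of this answer together, a self-contained sketch could look like the following. The function name read_weather and the sample line are assumptions for illustration; the field order (m/d/y date, min, max, rainfall) is taken from the question's code.

```python
def read_weather(lines):
    """Parse 'm/d/y,min,max,rainfall' lines into a list of tuples."""
    records = []
    for line in lines:
        if not line.strip():       # blank line: nothing to parse, skip it
            continue
        date, low, high, rain = line.strip().split(',')
        month, day, year = (int(part) for part in date.split('/'))
        # Parentheses build a tuple; square brackets would build a list.
        records.append((day, month, year, float(low), float(high), float(rain)))
    return records


sample = ["1/30/2018,5,12,0.3\n", "\n"]   # made-up data with a trailing blank line
records = read_weather(sample)
```

With a real file you could call it as `with open(filename) as infile: datalist = read_weather(infile)`, which also takes care of closing the file.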

Related

How to convert a list of strings and numbers to float in python?

I need to read data from a CSV file. I have done that and appended the data to a list, but I don't know how to convert every item in the list to a float.
I tried the following code and it did not work:
import csv
with open('csv1.csv', 'r') as f:
    lines = f.readlines()[6]
    new_list = []
    for i in lines:
        print(lines)
        new_list.append(float(i))
    print(new_list)
I got this error: ValueError: could not convert string to float: '"', which is weird since I don't understand where it is getting the " from.
The CSV file I am using is from the Bureau of Economic Analysis; here's the link to the exact file I am using: https://apps.bea.gov/iTable/iTable.cfm?reqid=19&step=2#reqid=19&step=2&isuri=1&1921=survey
When you execute lines = f.readlines()[6] you are grabbing only the seventh line of the file. That is a single str (string) containing that whole line. When you then iterate with for i in lines, you are going through that seventh line character by character and attempting the float() conversion on each character. The first character of that line is the ", which is what causes the error.
To go through all of the lines of the file starting with the seventh, change the index to a slice: lines = f.readlines()[6:]. That gives you a list of lines to process one by one.
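The difference between indexing and slicing can be seen on a small made-up list of lines (the contents below are invented for illustration and are not taken from the BEA file):

```python
rows = ['header\n', '"1","GDP","2.5"\n', '"2","Imports","3.1"\n']

one_line = rows[1]                 # indexing returns a single str
first_chars = list(one_line)[:3]   # iterating a str yields single characters

try:
    float(first_chars[0])          # the first character is a double quote
    quote_converted = True
except ValueError:
    quote_converted = False

remaining = rows[1:]               # slicing returns a list of whole lines
```

Here one_line behaves like the result of f.readlines()[6], and remaining like the result of f.readlines()[6:].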
But that only explains what was going on. As @Razzle Shazl points out, you're not using the CSV reader.
That said, I think this is what you're actually trying to accomplish:
import csv
with open('csv1.csv', 'r') as f:
    csv_data = csv.reader(f, delimiter=',', quotechar='"')
    for i in range(6): # this is to skip over header rows in the data (I saw 6 in the file I downloaded)
        next(csv_data)
    for row in csv_data: # going through the rest of the rows of data
        new_list = []
        for i in range(2, len(row)): # go through each column after the line number and description
            new_list.append(float(row[i])) # convert element to float and append to array
        print(f"CSV Input: {row}")
        print(f"Converted: {new_list}")
You cannot convert an empty string to a float. Probably the data is empty somewhere.
In [1]: float("")
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-1-603e80e58999> in <module>
----> 1 float("")
ValueError: could not convert string to float:
If you want to ignore the exception and continue processing, use a try/except block.
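A minimal sketch of that try/except approach is below. The helper name floats_or_none is hypothetical, and substituting None for unconvertible fields is just one possible choice; it keeps the columns aligned instead of dropping values.

```python
def floats_or_none(fields):
    """Convert each field to float, using None where conversion fails."""
    result = []
    for field in fields:
        try:
            result.append(float(field))
        except ValueError:
            result.append(None)    # keep the position so columns stay aligned
    return result


converted = floats_or_none(['1.5', '', '2'])
```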

ValueError: not enough values to unpack (expected 2, got 1) is keeping me from finishing my code

Hello, I am trying to make a dictionary from a file in Python and I keep getting this error. I have no idea how to fix it. Help would be much appreciated.
Here is my code:
dict = {}
f = open('example.txt', 'r')
row = f.read()
lines = row.split("\n")
for line in lines:
    name, number = line.split(")")
    dict[name] = number
f.close()
return dict
And here is what the file looks like:
a)b
b)c
abc)d
d)dac
Thanks in advance
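read() itself does not add anything. The file ends with a newline, so splitting its contents on "\n" leaves an empty string as the last item, and that empty string cannot be unpacked into name and number.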
To read a file’s contents, call f.read(size), which reads some quantity of data and returns it as a string or bytes object. size is an optional numeric argument. When size is omitted or negative, the entire contents of the file will be read and returned; it’s your problem if the file is twice as large as your machine’s memory. Otherwise, at most size bytes are read and returned. If the end of the file has been reached, f.read() will return an empty string ('').
Change your program to check that the line is not empty:
dict = {}
f = open('example.txt', 'r')
row = f.read()
lines = row.split("\n")
for line in lines:
    if line:
        name, number = line.split(")")
        dict[name] = number
f.close()
return dict
reference: https://docs.python.org/3.3/tutorial/inputoutput.html#methods-of-file-objects
Important: please do not use the builtin name dict for a variable.
You are getting an empty string as the last item of row.split. You can remove it by filtering with if line:
lines = [line for line in row.split("\n") if line]
In addition, you shouldn't reuse the builtin name dict for a variable. And you are using return outside a function, which is invalid.
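Combining the points raised in these answers (skip empty lines, avoid the name dict, put return inside a function), a sketch might look like this. The function name parse_pairs is an assumption, and the sample text simply mirrors the file shown in the question.

```python
def parse_pairs(text):
    """Build a dictionary from lines of the form 'name)number'."""
    pairs = {}
    for line in text.splitlines():     # splitlines() leaves no trailing ''
        if not line.strip():           # still skip blank lines in the middle
            continue
        name, number = line.split(')', 1)
        pairs[name] = number
    return pairs


text = "a)b\nb)c\nabc)d\nd)dac\n"
last_piece = text.split("\n")[-1]      # the empty string that caused the error
result = parse_pairs(text)
```

A line without a ")" would still raise the same ValueError, so you may want a try/except around the unpacking if the file can contain such lines.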

Use of readline()?

I have a question about this program:
%%file data.csv
x1,x2,y
0.4946,5.7661,0
4.7206,5.7661,1
1.2888,5.3433,0
4.2898,5.3433,1
1.4293,4.5592,0
4.2286,4.5592,1
1.1921,5.8563,0
3.1454,5.8563,1
f = open('data.csv')
data = []
f.readline()
for line in f:
    (x1,x2,y) = line.split(',')
    x1 = float(x1)
    x2 = float(x2)
    y = int(y)
    data.append((x1,x2,y))
What is the purpose of readline here? I have seen different examples, but here it seems that it deletes the first line.
Python reads the file sequentially, so once a line has been read, the next read continues from the following line. The f.readline() call reads the first line, so the loop never sees it.
That's precisely the point: to skip the first line. If you notice, the file has the names of the columns as its first line (x1,x2,y), and the program wants to ignore that line.
Calling the readline() method before reading the lines of the file in a loop is equivalent to:
for line in f.readlines()[1:]:
    ...
This can be used, for example, to skip a table header.
In your file, converting x1 to float would raise a ValueError on the first iteration, because x1 would hold the non-numeric string "x1". To avoid that error, readline() is used to advance the iterator to the second line, which contains only numbers.
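The effect can be reproduced without a file on disk. In this sketch io.StringIO stands in for open('data.csv') (an assumption made so the example is self-contained), using the first lines of the data from the question.

```python
import io

# io.StringIO behaves like a text file opened for reading.
f = io.StringIO("x1,x2,y\n0.4946,5.7661,0\n4.7206,5.7661,1\n")

header = f.readline()              # consumes the column names
data = []
for line in f:                     # iteration resumes at the second line
    x1, x2, y = line.split(',')
    data.append((float(x1), float(x2), int(y)))
```

Removing the f.readline() call makes the first float(x1) fail, because x1 would then be the string 'x1'.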

map function in Python

Content of file scores.txt that lists the performance of players at a certain game:
80,55,16,26,37,62,49,13,28,56
43,45,47,63,43,65,10,52,30,18
63,71,69,24,54,29,79,83,38,56
46,42,39,14,47,40,72,43,57,47
61,49,65,31,79,62,9,90,65,44
10,28,16,6,61,72,78,55,54,48
The following program reads the file and stores the scores into a list
f = open('scores.txt','r')
L = []
for line in f:
    L = L + map(float,str.split(line[:-1],','))
print(L)
But it leads to error messages. I was given the code in class, so I'm quite confused, as I'm very new to Python.
How can I fix the code?
It appears you've adapted Python 2.x code for use in Python 3.x. Note that map does not return a list in Python 3.x; it returns a lazy iterator (a map object), which you have to convert to a list explicitly.
Furthermore, I'd recommend using list.extend instead of adding the two lists together. Why? Addition creates a new list object every time you perform it, which is wasteful in terms of time and space.
numbers = []
for line in f:
    numbers.extend(list(map(float, line.rstrip().split(','))))
print(numbers)
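To see why the original L + map(...) fails in Python 3, here is a small sketch on a single made-up line of scores; the variable names are only illustrative.

```python
scores = []
line = "80,55,16\n"

mapped = map(float, line.rstrip().split(','))   # a lazy map object, not a list

try:
    scores + mapped                # list + map object is not supported
    concat_failed = False
except TypeError:
    concat_failed = True

scores.extend(mapped)              # extend accepts any iterable
```

Since extend takes any iterable, the list(...) wrapper is optional there; it is only required when a real list is needed, as with the + operator.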
An alternative way of doing this would be:
for line in f:
    numbers.extend([float(x) for x in line.rstrip().split(',')])
Which happens to be slightly more readable. You could also choose to get rid of the outer for loop using a nested list comprehension.
numbers = [float(x) for line in f for x in line.rstrip().split(',')]
Also, forgot to mention this (thanks to chris in the comments), but you really should be using a context manager to handle file I/O.
with open('scores.txt', 'r') as f:
    ...
It's cleaner, because it closes your files automatically when you're done with them.
After seeing your ValueError message, it's clear there are issues with your data (invalid characters, etc.). Let's try something a little more aggressive.
numbers = []
with open('scores.txt', 'r') as f:
    for line in f:
        for x in line.strip().split(','):
            try:
                numbers.append(float(x.strip()))
            except ValueError:
                pass
If even that doesn't work, something even more aggressive with a regex might do it:
import re
numbers = []
with open('scores.txt', 'r') as f:
    for line in f:
        # raw string, so \d and \s are passed to the regex engine unchanged
        line = re.sub(r'[^\d\s,.+-]', '', line)
        ... # the rest remains the same
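For reference, the regex clean-up combined with the try/except loop could be sketched as one function. The name parse_scores and the sample lines are invented for illustration; the function takes any iterable of lines, so an open file object works too.

```python
import re


def parse_scores(lines):
    """Strip unexpected characters, then convert what is left to floats."""
    numbers = []
    for line in lines:
        # Keep only digits, whitespace, commas, dots and signs.
        cleaned = re.sub(r'[^\d\s,.+-]', '', line)
        for item in cleaned.strip().split(','):
            try:
                numbers.append(float(item.strip()))
            except ValueError:
                pass               # e.g. an empty field between two commas
    return numbers


numbers = parse_scores(['a80,55;,16\n', '43,,4x5\n'])
```

Note that stripping characters can silently merge digits: '4x5' becomes 45.0 here, so this is only appropriate when the stray characters are known to be noise.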

reading a file and parse them into section

Okay, so I have a file that contains an ID number followed by a name, like this:
10 alex de souza
11 robin van persie
9 serhat akin
I need to read this file and break each record up into 2 fields, the id and the name. I need to store the entries in a dictionary where the ID is the key and the name is the satellite data. Then I need to output, in 2 columns, one entry per line, all the entries in the dictionary, sorted (numerically) by ID. dict.keys and list.sort might be helpful (I guess). Finally, the input filename needs to be the first command-line argument.
Thanks for your help!
I have this so far, but I can't get any further.
fin = open("ids","r") #Read the file
for line in fin: #Split lines
    string = str.split()
    if len(string) > 1: #Separate names and grades
        id = map(int, string[0]
        name = string[1:]
        print(id, name) #Print results
We need sys.argv to get the command-line argument (careful: the name of the script is always the 0th element of that list).
Now we open the file (no error handling; you should add that) and read in the lines individually. This gives us a 'number firstname secondname' string for each line in the list lines.
Then create an empty dictionary out and loop over the individual strings in lines, splitting each on whitespace and storing the pieces in the temporary variable tmp (which is now a list of strings: ['number', 'firstname', 'secondname']).
Following that we just fill the dictionary, using the number as key and the space-joined rest of the names as value.
To print the dictionary sorted, just loop over the list of keys returned by sorted(out), using the key=int option for numerical sorting. Then print the id (the number) followed by the corresponding value, looked up in the dictionary with a string representation of the id.
import sys
try:
    infile = sys.argv[1]
except IndexError:
    # raw_input, because Python 2's input() would evaluate the typed text
    infile = raw_input('Enter file name: ')
with open(infile, 'r') as file:
    lines = file.readlines()
out = {}
for fullstr in lines:
    tmp = fullstr.split()
    out[tmp[0]] = ' '.join(tmp[1:])
for id in sorted(out, key=int):
    print id, out[str(id)]
This works for Python 2.7 with ASCII strings. I'm pretty sure it can handle other encodings as well (German umlauts work at least), but I can't test that any further. You may also want to add a lot of error handling in case the input file is formatted differently.
Just a suggestion, this code is probably simpler than the other code posted:
import sys
with open(sys.argv[1], "r") as handle:
    lines = handle.readlines()
data = dict([i.strip().split(' ', 1) for i in lines])
for idx in sorted(data, key=int):
    print idx, data[idx]
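Both answers above target Python 2. For Python 3, the same idea could be sketched as follows; the function names parse_ids and format_entries are assumptions, and storing the ids as int keys is a design choice that makes a plain sorted() numeric without key=int.

```python
def parse_ids(lines):
    """Map each integer id to the name that follows it on the line."""
    entries = {}
    for line in lines:
        parts = line.split(None, 1)        # split once, on any whitespace
        if len(parts) == 2:                # ignore blank or id-only lines
            entries[int(parts[0])] = parts[1].strip()
    return entries


def format_entries(entries):
    """Return one 'id name' string per entry, sorted numerically by id."""
    return ['{} {}'.format(key, entries[key]) for key in sorted(entries)]


sample = ['10 alex de souza\n', '11 robin van persie\n', '9 serhat akin\n']
report = format_entries(parse_ids(sample))
```

In a script you would open the file named by sys.argv[1] in a with statement, pass the file object to parse_ids, and print each line of the report.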
