This is my file text:
Covid-19 Data
Country / Number of infections / Number of Death
USA 124.356 2.236
Netherlands 10.866 771
Georgia 90 NA
Germany 58.247 455
I created a function to calculate the ratio of deaths compared to the infections, however it does not work, because some of the values aren't floats.
f=open("myfile.txt","w+")
x="USA" + " " + " " + "124.356" + " " + " " + "2.236"
y="Netherlands" + " " + " " + "10.866" + " " + " " + "771"
z="Georgia" + " " + " " + "90" + " " + " " + "NA"
w="Germany" + " " + " " + "58.247" + " " + " " + "455"
f.write("Covid-19 Data" + "\n" + "Country" + " " + "/" + " " + "Number of infections" + " " + "/" + " " + "Number of Death" + "\n")
f.write(x + "\n")
f.write(y + "\n")
f.write(z + "\n")
f.write(w)
f.close()
with open("myfile.txt", "r") as file:
    try:
        for i in file:
            t = i.split()
            result=float(t[-1])/float(t[-2])
            print(results)
    except:
        print("fail")
file.close()
Does anyone have an idea how to solve this problem?
You can do the following:
with open("myfile.txt", "r") as file:
    for i in file:
        t = i.split()
        try:
            result = float(t[-1]) / float(t[-2])
            print(result)
        except ValueError:
            pass
At that point you don't know whether the values you are trying to divide are numeric, so wrapping the operation in a try/except should solve your problem.
If you want something a bit cleaner, you can do the following:
def is_float(value):
    try:
        float(value)
    except ValueError:
        return False
    return True

with open("myfile.txt", "r") as file:
    for i in file:
        t = i.split()
        if is_float(t[-1]) and is_float(t[-2]):
            result = float(t[-1]) / float(t[-2])
            print(result)
The idea is the same, however.
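A possible variation on the same idea, as a sketch only: instead of checking with is_float and then converting a second time, a helper can return the parsed number or None. The name to_float and the sample lines are made up for illustration.

```python
from typing import Optional


def to_float(value: str) -> Optional[float]:
    """Return value as a float, or None if it is not numeric (e.g. 'NA')."""
    try:
        return float(value)
    except ValueError:
        return None


# Hypothetical sample lines in the same layout as the file above.
lines = ["Covid-19 Data", "USA 124.356 2.236", "Georgia 90 NA"]

for line in lines:
    fields = line.split()
    infections = to_float(fields[-2])
    deaths = to_float(fields[-1])
    # Skip header lines, 'NA' rows and rows that would divide by zero.
    if infections is None or deaths is None or infections == 0:
        continue
    print(fields[0], deaths / infections)
```

Each value is converted exactly once, and the "is it numeric?" decision stays in one place.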
I used the same file that you attached in your example. I wrote this snippet; hopefully it helps:
with open("test.txt","r") as reader:
    lines = reader.readlines()
    for line in lines[2:]:
        line = line.replace(".","") # Remove points to have the full value
        country, number_infections, number_deaths = line.strip().split()
        try:
            number_infections = float(number_infections)
            number_deaths = float(number_deaths)
        except Exception as e:
            print(f"[WARNING] Could not convert Number of Infections {number_infections} or Number of Deaths {number_deaths} to float for Country: {country}\n")
            continue
        ratio = number_deaths/number_infections
        print(f"Country: {country} D/I ratio: {ratio}")
As you can see, I skipped the headers of your file with lines[2:], which means processing starts at the third row of the file. I also added try/except logic so that values which cannot be converted to float are reported and skipped. Hope this helps!
Edit
Just noticed that "." is used as the thousands separator instead of ",", so the periods are removed with line.replace(".","") before the values are converted.
The output of this run is:
Country: USA D/I ratio: 0.017980636237897647
Country: Netherlands D/I ratio: 0.07095527332965212
[WARNING] Could not convert Number of Infections 90.0 or Number of Deaths NA to float for Country: Georgia
Country: Germany D/I ratio: 0.007811561110443456
I fixed the following:
The first two lines in your text file are headers. These need to be skipped.
'NA' can't be converted to a float.
If there were a 0 in your data, your program would crash. Now it doesn't.
f=open("myfile.txt","w+")
x="USA" + " " + " " + "124.356" + " " + " " + "2.236"
y="Netherlands" + " " + " " + "10.866" + " " + " " + "771"
z="Georgia" + " " + " " + "90" + " " + " " + "NA"
w="Germany" + " " + " " + "58.247" + " " + " " + "455"
f.write("Covid-19 Data" + "\n" + "Country" + " " + "/" + " " + "Number of infections" + " " + "/" + " " + "Number of Death" + "\n")
f.write(x + "\n")
f.write(y + "\n")
f.write(z + "\n")
f.write(w)
f.close()
with open("myfile.txt", "r") as file:
    #Skipping headers
    next(file)
    next(file)
    try:
        for i in file:
            t = i.split()
            #Make sure your code keeps working when one of the numbers is zero
            x = 0
            y = 0
            #There are some NA's in your file. Strings not representing
            #a number can't be converted to float
            if t[1] != "NA":
                x = t[1]
            if t[2] != "NA":
                y = t[2]
            if x == 0 or y == 0:
                result = 0
            else:
                result=float(x)/float(y)
            print(t[0] + ": " + str(result))
    except:
        print("fail")
file.close()
Output:
USA: 55.615384615384606
Netherlands: 0.014093385214007782
Georgia: 0
Germany: 0.12801538461538461
The first header line in your file is Covid-19 Data. When you call t = i.split() on it, you get the list t = ['Covid-19', 'Data'].
You cannot convert these to floats since they contain letters. Instead, you should read the two header lines before the loop and do nothing with them. However, you will then run into trouble with Georgia, as "NA" also cannot be converted to a float.
A few other points: it's not good practice to have a catch-all exception. Also, you don't need to close the file explicitly if you open it using a with statement.
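The points above could be combined roughly as follows. This is a sketch under assumptions, not a definitive fix: it reads from an in-memory copy of the file (SAMPLE), treats "." as a thousands separator, and death_ratios is a made-up helper name.

```python
import io
from itertools import islice

# In-memory stand-in for myfile.txt, with the same content as in the question.
SAMPLE = """Covid-19 Data
Country / Number of infections / Number of Death
USA 124.356 2.236
Netherlands 10.866 771
Georgia 90 NA
Germany 58.247 455
"""


def death_ratios(lines):
    """Map country -> deaths / infections, skipping rows that cannot be parsed."""
    ratios = {}
    for line in islice(lines, 2, None):  # skip the two header lines
        parts = line.split()
        if len(parts) != 3:
            continue  # blank or malformed line
        country, infections, deaths = parts
        try:
            # Assumption: "." is a thousands separator, as in "124.356".
            ratio = int(deaths.replace(".", "")) / int(infections.replace(".", ""))
        except (ValueError, ZeroDivisionError):
            continue  # "NA" or zero infections: catch only what we expect
        ratios[country] = ratio
    return ratios


ratios = death_ratios(io.StringIO(SAMPLE))
for country, ratio in ratios.items():
    print(f"{country}: {ratio:.4f}")
```

With a real file, the same function can be called as death_ratios(file) inside a with open("myfile.txt") as file: block, and no explicit close() is needed.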
Related
I'm currently using openpyxl to read an Excel file that has a multi-index (two levels of headers), and I'm trying to perform operations depending on the subheaders a header has.
I have some experience doing this in pandas, but for this project I have to use openpyxl, which I have barely used before.
The only thing I could think of is the manual way:
iterating over the rows
saving the first row as the header and the second row as the subheader
doing some cleaning
manually saving the headers with their subheaders in dicts, then filling in the values by iterating over all the columns
My code is as follows:
from openpyxl import load_workbook

#reading the excel file
path = r'path to file'
wb = load_workbook(path) #loading the excel table
ws = wb.active #grab the active worksheet
#Setting the doc Header
for h in ws.iter_rows(max_row = 1, values_only = True): #getting the first row (Headers) in the table
    header = list(h)
for sh in ws.iter_rows(min_row = 2, max_row = 2, values_only = True): #getting the second row (Subheaders)
    sub_header = list(sh)
#removing all of the None values
header = list(filter(None, header))
sub_header = list(filter(None, sub_header))
print(header)
print(sub_header)
#creating a list of all the Columns in the excel file
col_list = []
for col in ws.iter_cols(min_row=3,min_col = 1): #Iterating over every column, starting from the third row since the first two are the headers
    col = [cell.value for cell in col] #Creating a list from each column
    col = list(filter(None, col)) #removing the None values from each column
    col_list.append(col) #creating a list of all columns (rows 3 onward)
#print (col_list)
But I'm sure there must be a better way that I wasn't able to find in the docs or by searching this website.
Thanks in advance!
My goal in the end is to automate this part of my code by iterating over the headers and using each header's subheaders and their values each time.
code:
#building the templates using yattag "yattag.org"
from yattag import Doc, indent
doc , tag , text = Doc().tagtext()
#building the tags of the xml file
with tag("Data"): #root tag
    for row in row_list :
        with tag("Row"):
            with tag("Input"):
                with tag(header[0].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the " + header[0] + " the Precentage of Students with "+ sub_header[0] + " is "+ str(row[1]) + " whereas the " + sub_header[1] + " are " + str(row[2]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | " + header[0] + " | " + sub_header[0]+ " | " + str(row[1]) + " | " + sub_header[1] + " | " + str(row[2]))
                with tag(header[1].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the " + header[1] + " the Precentage of Students with "+ sub_header[2] + " is "+ str(row[3]) + " whereas the " + sub_header[3] + " are " + str(row[4]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | " + header[1] + " | " + sub_header[2]+ " | " + str(row[3]) + " | " + sub_header[3] + " | " + str(row[4]))
                with tag(header[2].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the " + header[2] + " the Precentage of Students with "+ sub_header[4] + " is "+ str(row[5]) + " whereas the " + sub_header[5] + " are " + str(row[6]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | " + header[2] + " | " + sub_header[4]+ " | " + str(row[5]) + " | " + sub_header[5] + " | " + str(row[6]))
                with tag(header[3].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the " + header[3] + " the Precentage of Students with "+ sub_header[6] + " is "+ str(row[7]) + " whereas the " + sub_header[7] + " are " + str(row[8]) +" and for " + sub_header[8] + str(row[9]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | " + header[3] + " | " + sub_header[6]+ " | " + str(row[7]) + " | " + sub_header[7] + " | " + str(row[8]) + " | " + sub_header[8] + " | " + str(row[9]))
                with tag(header[4].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the " + header[4] + " the Precentage of Students with "+ sub_header[9] + " is "+ str(row[10]) + " whereas the " + sub_header[10] + " are " + str(row[11]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | " + header[4] + " | " + sub_header[9]+ " | " + str(row[10]) + " | " + sub_header[10] + " | " + str(row[11]))
                with tag(header[5].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the " + header[5] + " the Precentage of Students with "+ sub_header[11] + " is "+ str(row[12]) + " whereas the " + sub_header[12] + " are " + str(row[13]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | " + header[5] + " | " + sub_header[11]+ " | " + str(row[12]) + " | " + sub_header[12] + " | " + str(row[13]))
                with tag(header[6].replace(' ','_').replace('\n','_')):
                    text("In " + dic[row[0]]+" the precentage of Students " + " regarding the " + header[6] + " the Precentage of Students with "+ sub_header[13] + " is "+ str(row[14]) + " whereas the " + sub_header[14] + " are " + str(row[15]) )
                    with tag("Row_Data"):
                        text(dic[row[0]] + " | " + header[6] + " | " + sub_header[13]+ " | " + str(row[14]) + " | " + sub_header[14] + " | " + str(row[15]))
#print(doc.getvalue())
result = indent(
    doc.getvalue(),
    indentation=' ',
    indent_text=True
)
#saving the xml file
with open("output.xml", "w") as f:
    f.write(result)
Unless pandas is completely off the table, I think you might be able to combine pandas and openpyxl. The openpyxl documentation describes reading worksheet data into a pandas DataFrame: Working with Pandas and NumPy.
Could you use:
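from pandas import DataFrame, MultiIndex
data = list(ws.values) # ws.values is a generator of row tuples, so materialise it before slicing
df = DataFrame(data[2:], columns=MultiIndex.from_arrays([data[0], data[1]])) # first two rows become the two header levels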
There may be some filtering necessary with regards to the None values.
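If pandas really cannot be used, the pairing of headers with their subheaders can be done on plain row tuples, which is the shape that ws.iter_rows(values_only=True) yields. The following is only a sketch: group_subheaders and the sample rows are made up for illustration, and it assumes that a merged header cell shows up as None in every column except its first.

```python
def group_subheaders(header_row, subheader_row):
    """Map each header to a list of (column_index, subheader) pairs.

    Assumption: merged header cells are None except in their first column,
    so the last non-None header is carried forward.
    """
    groups = {}
    current = None
    for col, (head, sub) in enumerate(zip(header_row, subheader_row)):
        if head is not None:
            current = head
        if current is None or sub is None:
            continue  # columns without a subheader, e.g. the row-label column
        groups.setdefault(current, []).append((col, sub))
    return groups


# Hypothetical data shaped like list(ws.iter_rows(values_only=True)).
rows = [
    ("Region", "Gender", None, "Age", None, None),
    (None, "Female", "Male", "<20", "20-30", ">30"),
    ("North", 48, 52, 10, 60, 30),
]

groups = group_subheaders(rows[0], rows[1])
for row in rows[2:]:
    for head, subs in groups.items():
        parts = [f"{sub}: {row[col]}" for col, sub in subs]
        print(row[0], "|", head, "|", " | ".join(parts))
```

Because each subheader keeps its column index, the per-header blocks can be generated in a loop instead of being written out by hand for header[0], header[1], and so on.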
I have code writing data to 3 different files. I need to add sales tax to the price written to f3, which will change the price value. What would be the cleanest way to do this?
Edit: price structure is (00.00)
# write compiled product to buylist.txt
def write_output(self):
    with open(self.outputfile, 'w') as f, open(self.outputfile2, 'w') as f2, open(self.outputfile3, 'w') as f3:
        for item in self.buylist:
            f.write(str(item['status']) + " " + str(item['item']) + " " + str(item['quantity'])+ " " + str(item['price']) + "\n")
            f2.write(str(item['item']) + " " + str(item['quantity']) + "\n")
            f3.write(str(item['item']) + " " + str(item['quantity'])+ " " + str(item['price']) + "\n")
    f.close(), f2.close(), f3.close()
You can use the code below, although the question would be easier to answer if you had mentioned the data type of price or given an example.
You only need to edit the f3.write line. Since the price has the form 00.00, convert it with float rather than int, which would either fail on a string such as "12.50" or drop the cents:
salesTax = 18
f3.write(str(item['item']) + " " + str(item['quantity'])+ " " + "{:.2f}".format(float(item['price']) * (1 + salesTax / 100.0)) + "\n")
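For money values it may be worth doing the arithmetic with the standard decimal module, so the result is rounded to cents in a predictable way. A minimal sketch, assuming an 18% rate as above; price_with_tax and SALES_TAX_PERCENT are illustrative names, not part of the original code.

```python
from decimal import Decimal, ROUND_HALF_UP

SALES_TAX_PERCENT = Decimal("18")  # hypothetical rate, as in the answer above


def price_with_tax(price, tax_percent=SALES_TAX_PERCENT):
    """Return price plus tax as a Decimal rounded to two decimal places."""
    # str() keeps float inputs such as 19.99 from dragging in binary noise.
    gross = Decimal(str(price)) * (1 + tax_percent / 100)
    return gross.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(price_with_tax("10.00"))  # 11.80
print(price_with_tax(19.99))    # 23.59
```

The f3.write line would then use str(price_with_tax(item['price'])) in place of str(item['price']).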
I'm new to Python and am confused by the "list index out of range" error I get with the code below. Each line of my text file contains only 4 items: first name, last name, hourly salary, and total hours worked. Should this be changed to something that's not a while loop? If need be, I can give the entire code. Any help would be greatly appreciated!
while line2 != "":
    line2 = " "
    line2 = line2.split( " " )
    if (line2[ 0 ]+ " " + line2[ 1 ]) != name1.rstrip( " \n " ):
        empFile3.write(line2[ 0 ] + " " + line2[ 1 ] + " " + line2[ 2 ] + " " + line2[ 3 ] + " \n " )
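The problem is the second line of the loop, line2 = " ". Remove it. It overwrites whatever was read with a single space, and splitting a single space on " " yields only two empty strings, so line2[ 2 ] raises the "list index out of range" error.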
while line2 != "":
    line2 = line2.split( " " )
    if (line2[ 0 ]+ " " + line2[ 1 ]) != name1.rstrip( " \n " ):
        empFile3.write(line2[ 0 ] + " " + line2[ 1 ] + " " + line2[ 2 ] + " " + line2[ 3 ] + " \n " )
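On the question of whether a while loop is needed at all: iterating directly over the file with a for loop avoids the manual end-of-file check. Since the full program was not posted, this is only a sketch; copy_without_employee is a made-up name and io.StringIO objects stand in for the real employee files.

```python
import io


def copy_without_employee(source, target, full_name):
    """Copy every 4-field record from source to target except full_name's."""
    wanted = full_name.strip()
    for line in source:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines instead of raising IndexError
        first, last, rate, hours = fields
        if f"{first} {last}" != wanted:
            target.write(f"{first} {last} {rate} {hours}\n")


# In-memory stand-ins for the input and output files.
source = io.StringIO("Ann Lee 20.5 40\n\nBob Ray 18 35\n")
target = io.StringIO()
copy_without_employee(source, target, "Ann Lee\n")
print(target.getvalue(), end="")
```

The length check is what protects against the "list index out of range" error when a line is empty or incomplete.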
I'm trying to analyze a sqlite3 file and print the results to a text file. If I test the code with print, it all works fine. When I write to a file, the output cuts off at the same point every time.
import sqlite3
import datetime
import time

conn = sqlite3.connect("History.sqlite")
curs = conn.cursor()
results = curs.execute("SELECT visits.id, visits.visit_time, urls.url, urls.visit_count \
FROM visits INNER JOIN urls ON urls.id = visits.url \
ORDER BY visits.id;")
exportfile = open('chrome_report.txt', 'w')
for row in results:
    timestamp = row[1]
    epoch_start = datetime.datetime(1601,1,1)
    delta = datetime.timedelta(microseconds=int(timestamp))
    fulltime = epoch_start + delta
    string = str(fulltime)
    timeprint = string[:19]
    exportfile.write("ID: " + str(row[0]) + "\t")
    exportfile.write("visit time: " + str(timeprint) + "\t")
    exportfile.write("Url: " + str(row[2]) + "\t")
    exportfile.write("Visit count: " + str(row[3]))
    exportfile.write("\n")
    print "ID: " + str(row[0]) + "\t"
    print "visit time: " + str(timeprint) + "\t"
    print "Url: " + str(row[2]) + "\t"
    print "Visit count: " + str(row[3])
    print "\n"
conn.close()
So the print results give the proper result but the export to the file stops in the middle of a url.
OK, I would start by replacing the for loop with the one below
with open('chrome_report.txt', 'w') as exportfile:
    for row in results:
        try:
            timestamp = row[1]
            epoch_start = datetime.datetime(1601,1,1)
            delta = datetime.timedelta(microseconds=int(timestamp))
            fulltime = epoch_start + delta
            string = str(fulltime)
            timeprint = string[:19]
            exportfile.write("ID: " + str(row[0]) + "\t")
            exportfile.write("visit time: " + str(timeprint) + "\t")
            exportfile.write("Url: " + str(row[2]) + "\t")
            exportfile.write("Visit count: " + str(row[3]))
            exportfile.write("\n")
            print "ID: " + str(row[0]) + "\t"
            print "visit time: " + str(timeprint) + "\t"
            print "Url: " + str(row[2]) + "\t"
            print "Visit count: " + str(row[3])
            print "\n"
        except Exception as err:
            print(err)
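By using the with statement (a context manager), the file is closed, and its buffer flushed, automatically when the block ends. The original code never called exportfile.close(), which would explain output that stops partway through a line. The try/except captures any error and prints it, which will show you where your code is failing and why.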
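For reference, here is how the same export might look in Python 3. It is a sketch under assumptions: an in-memory database with two minimal tables stands in for History.sqlite, an io.StringIO buffer stands in for the report file, and webkit_to_datetime and export_visits are illustrative names.

```python
import datetime
import io
import sqlite3

# Chrome stores visit_time as microseconds since 1601-01-01, as in the code above.
WEBKIT_EPOCH = datetime.datetime(1601, 1, 1)


def webkit_to_datetime(microseconds):
    """Convert a microseconds-since-1601 timestamp to a datetime."""
    return WEBKIT_EPOCH + datetime.timedelta(microseconds=int(microseconds))


def export_visits(conn, out):
    """Write one tab-separated line per visit to the file-like object out."""
    query = (
        "SELECT visits.id, visits.visit_time, urls.url, urls.visit_count "
        "FROM visits INNER JOIN urls ON urls.id = visits.url "
        "ORDER BY visits.id"
    )
    for visit_id, visit_time, url, count in conn.execute(query):
        stamp = webkit_to_datetime(visit_time).strftime("%Y-%m-%d %H:%M:%S")
        out.write(f"ID: {visit_id}\tvisit time: {stamp}\tUrl: {url}\tVisit count: {count}\n")


# Hypothetical stand-in for History.sqlite with only the columns used here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT, visit_count INTEGER)")
conn.execute("CREATE TABLE visits (id INTEGER PRIMARY KEY, url INTEGER, visit_time INTEGER)")
conn.execute("INSERT INTO urls VALUES (1, 'https://example.com/', 3)")
conn.execute("INSERT INTO visits VALUES (1, 1, 13000000000000000)")

buffer = io.StringIO()
export_visits(conn, buffer)
conn.close()
print(buffer.getvalue(), end="")
```

With a real file, export_visits(conn, exportfile) would be called inside with open('chrome_report.txt', 'w', encoding='utf-8') as exportfile:, so the buffer is flushed when the block exits.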
How can I print a new line in the output file? When I try to add the new line with "/n", it just prints /n.
This is what I have so far.
inputFile = open("demofile1.txt", "r")
outFile = open("Ji
string = line.split(',')
go =(string)[3::]
bo = [float(i) for i in go]
total = sum(bo)
pine = ("%8.2f"%total)
name = string[2] + "," + " " + string[1]
kale = (string[0] + " " + name + " " + "/n")
se)
Current Result
8
53 Baul
A999999
You need to use \n, not /n. So this line:
kale = (string[0] + " " + name + " " + "/n")
Should be:
kale = (string[0] + " " + name + " " + "\n")
Also, please consider using string formatting, so that all these lines:
go =(string)[3::]
bo = [float(i) for i in go]
total = sum(bo)
pine = ("%8.2f"%total)
name = string[2] + "," + " " + string[1]
kale = (string[0] + " " + name + " " + "/n")
str1 = ''.join(kale)
str2 = ''.join(pine)
outFile.write(str1 + " " + str2 + " ")
Will become:
outFile.write("{} {} {:8.2f}\n".format(string[0], string[2] + ", " + string[1], sum(bo)))
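To see the difference between the two spellings, here is a small self-contained sketch. The input line is invented, since the original data file was not posted, and an io.StringIO buffer stands in for the output file.

```python
import io

# Hypothetical record: an id, two name fields, then the numbers to be summed.
line = "A999999,Baul,53,8.5,10,7.25"

fields = line.strip().split(",")
scores = [float(value) for value in fields[3:]]
name = f"{fields[2]}, {fields[1]}"

out = io.StringIO()  # stands in for outFile
# "\n" (backslash) is the newline escape; {:8.2f} pads the total to width 8.
out.write("{} {} {:8.2f}\n".format(fields[0], name, sum(scores)))
# "/n" (forward slash) is just the two characters "/" and "n".
out.write("no line break here: /n")

print(repr(out.getvalue()))
```

Printing the repr makes the real newline show up as \n while the /n stays as literal text.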