Here is my code:
with open('life-expectancy.csv') as file:
for row in file:
row = row.strip() #trim
parts = row.split(',')
value = float(parts[3])
max_value = float(-1.0)
if value > max_value:
max_value = value
min_value = float(100.0)
if value < min_value:
min_value = value
# print(sum(value))
print(max_value)
print(min_value)
The life expectancy file contains rows that are all like this:
Afghanistan,AFG,1981,43.923
With different countries, years, etcetera. My goal is to find the highest and lowest life expectancy with the corresponding country and year, but my code is just giving me the life expectancy of the last item in the list (I haven't attempted to add the country and year yet obviously).
What am I missing?
Move the max_value = float(-1.0) and min_value = float(100.0) above the with statement. This way you reset it on each line.
Related
im trying to iterate this and cant figureout how. Theres .csv file.
QUESTION: so i finds LOW_num's data[0], and got to get TOP_num for TOP_num's data[0] < LOW_num's data[0] What the formula could be? The example:
for line in file:
data = line.split(sep)
A line looks like this:
2022-04-05T08:34:39+02:00, 1.2024, 1.2024, 1.2024, 1.2024, 1.2185, 1.2059028833000065, 1.2024784243912705, 1.2004400559932131, 1.198116316019428
So data[0] means Column A, data[1] is Column B, data[2] is Column C, (...)
memory["high"] = {}
memory["low"] = {}
for line in file:
data = line.split(sep)
if data[5] < data[9]:
memory["high"][float(data[2])] = str(data[0])
memory["low"][float(data[3])] = str(data[0])
# those are collecing data[2] and data[3] only between events when
# values changes from column F > J, to F < J, in that .csv file
then in the same "for line in file:", but different if:
LOW_num = min(memory["low"]) # it gets lowest number of all collected data[3] (Column D)
TOP_num = max(memory["high"]) # it gets biggest number of all collected data[2] (Column C)
#so TOP_num is for example: "1.555"
#but that TOP_num got day, month, and year attached to it as well:
#ex: memory["high"]["1.555"]["2022-04-05T08:34:39+02:00"]
TOP_data0 = str(memory["high"][TOP_num])
LOW_data0 = str(memory["low"][LOW_num])
i tried some things but, cant get it righ, example:
for i in memory["high"][i][j]:
if memory["high"][i][memory["high"][TOP_num][TOP_data0] < memory["low"][LOW_num]LOW_data0]:
print(memory["high"][i][TOP_num])
The .csv file represents some coin's price data ex: ADAUSDT frome exchange,
(time, open, high, low, close, somthing, somthing, somthing, somthing, somthing)
I finds Lowest price of given time period (Low_num), starting from some start price earlier.
And must find the biggest price between those start point and Low_num point.
That biggets price is the Stop loss numer had to be set, in order to achive the Lowest point for this example, it was a short.
figured out!
memory["SL"] = {}
for number in memory["high"]:
if number > LOW_num: # so its only numbers higher than Lowest obviously
x = number
if memory["high"][x] < LOW_data0: # and among them, with date only earlier than LOW_date0
memory["SL"][x] = str(memory["high"][x]) # and saving it to new memory set for later max() or min() upon it
Wow python can compare dates!
I'm working on trying to calculate the greatest increase/decrease in a change to profits/losses over time from a CSV.
The data set in csv is as follows (extract only):
Date,Profit/Losses
Jan-2010,867884
Feb-2010,984655
Mar-2010,322013
Apr-2010,-69417
So far, i've imported the csv file and added the items to a dictionary. Calculated total months, total profit/loss, calculated the change in profit/loss from month to month but now need to find the greatest and smallest change in the month and have the code return both the month and the change figure.
The output when trying to print the greatest increase/decrease returns only the final month on the list and all change values (instead of just the biggest change value and it's corresponding month)
Here is the code. Would appreciate any perspective:
budget = {}
total_months = 0
total_pnl = 0
date = 0
pnl = 0
monthly_change = []
previous_pnl = 0
greatest_increase = ["Date",[0]]
greatest_decrease = ["Date",[100000000000000]]
with open(csvpath, 'r') as csvfile:
csvreader = csv.reader(csvfile, delimiter=',')
header = next(csvreader)
for row in csvreader:
date = 0
pnl = 1
budget[row[date]] = int(row[pnl])
for date, pnl in budget.items():
total_months = total_months + 1
total_pnl = total_pnl + pnl
pnlchange = pnl - previous_pnl
if total_months > 1:
monthly_change.append(pnlchange)
previous_pnl = pnl
if (monthly_change > greatest_increase[1]):
greatest_increase[1] = monthly_change
greatest_increase[0] = row[0]
if (monthly_change < greatest_decrease[1]):
greatest_decrease[1] = monthly_change
greatest_decrease[0] = row[0]
print(greatest_increase)
The primary problem is the final part of the code (the if statement). When I print 'greatest_increase' this currently returns the final value in the list rather than the highest value of change.
current output is:
[['Feb-2017', '671099'], [116771, -662642, -391430, 379920, 212354, 510239, -428211, -821271, 693918, 416278, -974163, 860159, -1115009, 1033048, 95318, -308093, 99052, -521393, 605450, 231727, -65187, -702716, 177975, -1065544, 1926159, -917805, 898730, -334262, -246499, -64055, -1529236, 1497596, 304914, -635801, 398319, -183161, -37864, -253689, 403655, 94168, 306877, -83000, 210462, -2196167, 1465222, -956983, 1838447, -468003, -64602, 206242, -242155, -449079, 315198, 241099, 111540, 365942, -219310, -368665, 409837, 151210, -110244, -341938, -1212159, 683246, -70825, 335594, 417334, -272194, -236462, 657432, -211262, -128237, -1750387, 925441, 932089, -311434, 267252, -1876758, 1733696, 198551, -665765, 693229, -734926, 77242, 532869]]
What i am trying to get is the bold value being the highest value (along with the relevant month)
Apologies if this isn't clear, I'm still fairly new (3rd week learning!)
I'm still new to python, so forgive me if my code seems rather messy or out of place. However, I need help with an assignment for university. I was wondering how I am able to find specific items in a CSV file? Here is what the assignment says:
Allow the user to type in a year, then, find the average life expectancy for that year. Then find the country with the minimum and the one with the maximum life expectancies for that year.
import csv
country = []
digit_code = []
year = []
life_expectancy = []
count = 0
lifefile = open("life-expectancy.csv")
with open("life-expectancy.csv") as lifefile:
for line in lifefile:
count += 1
if count != 1:
line.strip()
parts = line.split(",")
country.append(parts[0])
digit_code.append(parts[1])
year.append(parts[2])
life_expectancy.append(float(parts[3]))
highest_expectancy = max(life_expectancy)
country_with_highest = country[life_expectancy.index(max(life_expectancy))]
print(f"The country that has the highest life expectancy is {country_with_highest} at {highest_expectancy}!")
lowest_expectancy = min(life_expectancy)
country_with_lowest = country[life_expectancy.index(min(life_expectancy))]
print(f"The country that has the lowest life expectancy is {country_with_lowest} at {lowest_expectancy}!")
It looks like you only want the first and fourth tokens from each row in your CSV. Therefore, let's simplify it like this:
Hong Kong,,,85.29
Japan,,,85.03
Macao,,,84.68
Switzerland,,,84.25
Singapore,,,84.07
You can then process it like this:
FILE = 'life-expectancy.csv'
data = []
with open(FILE) as csv:
for line in csv:
tokens = line.split(',')
data.append((float(tokens[3]), tokens[0]))
hi = max(data)
lo = min(data)
print(f'The country with the highest life expectancy {hi[0]:.2f} is {hi[1]}')
print(f'The country with the lowest life expectancy {lo[0]:.2f} is {lo[1]}')
I have a text file consisting of some stocks and their prices and what not, I am trying to print out the stock which has the lowest value along with the name of the company here is my code.
stocks = open("P:\COM661\stocks.txt")
name_lowest = ""
price_lowest = 0
for line in stocks:
rows = line.split("\t")
price = float(rows[2])
if price>price_lowest:
price_lowest = price
name_lowest = rows[1]
print(name_lowest + "\t" + str(price_lowest))
I'm trying to go through the file and compare each numeric value to the one before it to see if it is higher or lower and then at the end it should have saved the lowest price and print it along with the name of the company.
Instead it prints the value of the last company in the file along with its name.
How can I fix this?
You made 2 mistakes.
First is initialised the initial value to 0
You should initialise the initial value to the max available number in python float.
import sys
price_lowest = sys.float_info.max
Or else you could initialise it to the first element
Second your should if statement should be
if price<price_lowest:
Initialize:
price_lowest = 999999 # start with absurdly high value, or take first one
Plus your if check is the opposite.
Should be:
if price < price_lowest
Others already suggested a solution that fixes your current code. However, using Python you can have a shorter solution:
with open('file') as f:
print min(
[(i.split('\t')[0], float(i.split('\t')[1])) for i in f.readlines()],
key=lambda t: t[1]
)
Your "if" logic is backwards, it should be price<lowest_pre.
Just make a little adjustment start your price_lowest at None then set it to your first encounter and compare from there on
stocks = open("P:\COM661\stocks.txt")
name_lowest = ""
price_lowest = None
for line in stocks:
rows = line.split("\t")
price = float(rows[2])
if price_lowest = None:
price = price_lowest
name_lowest = rows[1]
elif price < price_lowest:
price_lowest = price
name_lowest = rows[1]
print(name_lowest + "\t" + str(price_lowest))
So I'm currently using a loop to search through my csv data to find the "high" and "low" values of a group of days and then calculate the averages of each day. With those averages, I want to find the highest one amongst them but I've been having trouble doing so. This is currently what I have.
for row in reversed(list(reader1)):
openNAS, closeNAS = row['Open'], row['Close']
highNAS, lowNAS = row['High'], row['Low']
dateNAS = row['Date']
averageNAS = (float(highNAS) + float(lowNAS)) / 2
bestNAS = max(averageNAS)
I have indeed realized that the max(averageNAS) doesn't work because averageNAS is not a list and since the average isn't found in the csv file, I can't do max(row['Average']) either.
When the highest average is found, I'd also like to be able to include the date of it as well so my program can print out the date of which the highest average occurred. Thanks in advance.
One possible solution is to create a dictionary of average values where the date is the key and the average is the value:
averageNAS = {}
Then calculate the average and insert it into this dict:
for row in reversed(list(reader1)):
highNAS, lowNAS = row['High'], row['Low']
dateNAS = row['Date']
averageNAS[dateNAS] = (float(highNAS) + float(lowNAS)) / 2 # Insertion
Now you can get the maximum by finding the highest value:
import operator
bestNAS = max(averageNAS.items(), key=operator.itemgetter(1))
The result will be a tuple like:
# (1, 8.0)
which means that day 1 had the highest average. And the average was 8.
If you don't need the day then you could create a list instead of a dictionary and append to it. That makes finding the maximum a bit easier:
averageNAS = []
for ...
averageNAS.append((float(highNAS) + float(lowNAS)) / 2)
bestNAS = max(averageNAS)
There are a few solutions that come to mind.
Solution 1
The method most similar to your existing solution would be to create a list of the averages as you calculate them, and then take the maximum from that list. The code, based on your example, looks something like this:
averageNAS = []
for row in reversed(list(reader1)):
openNAS, closeNAS = row['Open'], row['Close']
highNAS, lowNAS = row['High'], row['Low']
dateNAS = row['Date']
averageNAS.append((float(highNAS) + float(lowNAS)) / 2)
# the maximum of the list only needs to be done once (at the end)
bestNAS = max(averageNAS)
Solution 2
Instead of creating an entire list, you could just maintain a variable of the maximum average NAS that you've "seen" so far, and the dateNAS associated with it. That would look something like:
bestNAS = float('-inf')
bestNASdate = None
for row in reversed(list(reader1)):
openNAS, closeNAS = row['Open'], row['Close']
highNAS, lowNAS = row['High'], row['Low']
dateNAS = row['Date']
averageNAS = (float(highNAS) + float(lowNAS)) / 2
if averageNAS > bestNAS:
bestNAS = averageNAS
bestNASdate = dateNAS
Solution 3
If you want to use a package as a solution, I'm fairly certain that the pandas package can do this easily and efficiently. I'm not 100% certain that the pandas syntax is exact, but the library has everything that you'd need to get this done. It's based on numpy, so the operations are faster/more efficient than a vanilla python loop.
from pandas import DataFrame, read_csv
import pandas as pd
df = pd.read_csv(r'file location')
df['averageNAS'] = df[["High", "Low"]].mean(axis=1)
bestNASindex = df['averageNAS'].argmax() # 90% sure this is the right syntax
bestNAS = df['averageNAS'][bestNASindex]
bestNASdate = df['date'][bestNASindex]