Arguments when Creating Functions - python

I am trying to write a function that works on weather data from a .csv file: given a temperature and a location, the function should return the number of entries that exceed that temperature at that location. I am unsure what to write after the condition statement.
The dictionary was read in a previous cell.
import csv
given_location = input ('Enter given location:')
given_temp = input('Enter given temp:')
count = 0
def daysOver (smalldict, location, temp):
    reader = csv.Dictreader(dataFile)
    for row in reader:
        if row ['Location'] == given_location and line['MaxTemp'] > given_temp:
            count = row
    return count
print('Number of days over',given_temp, 'in', given_location,':', count)

You probably want to replace count = row with count += 1.
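A few other details in the question's code would also stop it from running: the class is spelled csv.DictReader, the condition mixes row and line, input() returns strings (so the temperatures need converting to numbers before comparing), and the function never uses its own parameters. A minimal sketch of a counting function follows, assuming the file has Location and MaxTemp columns as in the question. The name days_over and the sample rows are made up for illustration, and the sketch reads from an in-memory string rather than the real file.

```python
import csv
import io


def days_over(rows, location, temp):
    """Count rows for `location` whose MaxTemp is greater than `temp`."""
    count = 0
    for row in rows:
        if row.get("Location") != location:
            continue
        try:
            max_temp = float(row["MaxTemp"])
        except (KeyError, TypeError, ValueError):
            # Skip rows with a missing or non-numeric temperature.
            continue
        if max_temp > temp:
            count += 1
    return count


# Hypothetical sample data standing in for the real weather file.
sample = io.StringIO(
    "Location,MaxTemp\n"
    "Sydney,31.5\n"
    "Sydney,24.0\n"
    "Perth,35.2\n"
    "Sydney,NA\n"
    "Sydney,30.1\n"
)
count = days_over(csv.DictReader(sample), "Sydney", 30.0)
print("Number of days over", 30.0, "in", "Sydney", ":", count)
```

With a real file, the same function could be called as days_over(csv.DictReader(f), given_location, float(given_temp)) inside a with open(...) block.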

Related

How to write CSV files with Python using print statements with variables in them

I'm new here and have a fairly straightforward question about writing CSV output files with Python. I've been searching for a while and can't find an answer. I have a set of tasks that must print their answers to the terminal and also write them to a CSV output file. The terminal output is correct, but I can't figure out how to write it to a CSV file. My print statements contain variables, and I need the value of each variable written out. For example, "The total profit/loss for this period is: $22564198" should be written to the CSV, not the unformatted statement 'The total profit/loss for this period is: ${total}'.
I'm copying my code below.
import os
import csv
date = []
profloss = []
changes = []
total = 0
totalChange = 0
mo2mo = {}
budget_csv = os.path.join(xxxxxx)
with open(budget_csv) as csvfile:
    csvreader = csv.reader(csvfile, delimiter=",")
    #splitting the two columns into separate lists
    for row in csvreader:
        date.append(row[0])
        profloss.append(row[1])
#removing header rows
date.pop(0)
profloss.pop(0)
#printing how many months are in the data set
dataLen = len(date)
countMonths = "This data set has " + str(len(date)) + " months."
#calculating total profit/loss
for i in range(0, len(profloss)):
    profloss[i] = int(profloss[i])
    total = total + (profloss[i])
print(f'The total profit/loss for this period is: ${total}')
#calculating the difference between months and adding it to a list
for i in range(0, len(profloss)-1):
    difference = (profloss[i+1]) - (profloss[i])
    changes.append(difference)
#removing the first element in date to make a dictionary for dates: changes
date.pop(0)
#creating a dictionary of months as keys and change as values, starting with the second month
mo2mo = {date[i]: changes[i] for i in range(len(date))}
#calculating the average change from one month to the next
for i in range(0, len(changes)):
    totalChange = totalChange + changes[i]
avChange = totalChange/len(changes)
print(f'The average change from month to month for this dataset is: {round((avChange),2)}')
#getting the month with the maximum increase
keyMax = max(mo2mo, key= lambda x: mo2mo[x])
for key,value in mo2mo.items():
    if key == keyMax:
        print(f'The month with the greatest increase was: {key} ${value}')
#getting the month with the maximum decrease
keyMin = min(mo2mo, key= lambda x: mo2mo[x])
for key, value in mo2mo.items():
    if key == keyMin:
        print(f'The maximum decrease in profits was: {key} ${value}')
outputCSV = ("countMonths",)
output_path = os.path.join("..", "output", "PyBankAnswers.csv")
#writing outcomes to csv file
with open(output_file,"w") as datafile:
    writer = csv.writer(datafile)
    for row in writer.writerow()
I only have experience writing whole lists to CSV files, not statements of text, and I haven't been able to find out how to do this. Is there a way to do it without typing out each sentence by hand for the CSV writer? Or do I have to copy the sentences from the terminal and write them row by row?
The print() function accepts a file keyword argument that sends the output to an open file stream. You could therefore issue every print() call twice, once for the terminal and once for the file. To avoid that duplication, put both calls into a function and call it wherever you would have called print().
import os
import csv
date = []
profloss = []
changes = []
total = 0
totalChange = 0
mo2mo = {}
def print_to_terminal_and_file(f, *args, **kwargs):
    print(*args, **kwargs)
    print(*args, file=f, **kwargs)
budget_csv = os.path.join(xxxxxx)
with open(budget_csv) as csvfile:
    csvreader = csv.reader(csvfile, delimiter=",")
    #skipping the header row so that int() only sees numbers
    next(csvreader)
    #splitting the two columns into separate lists
    for row in csvreader:
        date.append(row[0])
        profloss.append(int(row[1]))
output_path = os.path.join("..", "output", "PyBankAnswers.txt")
#writing outcomes to text file
with open(output_path,"w") as datafile:
    #printing how many months are in the data set
    print_to_terminal_and_file(datafile, f"This data set has {len(date)} months.")
    #calculating total profit/loss
    total = sum(profloss)
    print_to_terminal_and_file(datafile, f'The total profit/loss for this period is: ${total}')
    #calculating the difference between months and adding it to a list
    for i in range(0, len(profloss)-1):
        difference = (profloss[i+1]) - (profloss[i])
        changes.append(difference)
    #removing the first element in date to make a dictionary for dates: changes
    date.pop(0)
    #creating a dictionary of months as keys and change as values, starting with the second month
    mo2mo = {date[i]: changes[i] for i in range(len(date))}
    #calculating the average change from one month to the next
    for i in range(0, len(changes)):
        totalChange = totalChange + changes[i]
    avChange = totalChange/len(changes)
    print_to_terminal_and_file(datafile, f'The average change from month to month for this dataset is: {round((avChange),2)}')
    #getting the month with the maximum increase
    keyMax = max(mo2mo, key= lambda x: mo2mo[x])
    for key,value in mo2mo.items():
        if key == keyMax:
            print_to_terminal_and_file(datafile, f'The month with the greatest increase was: {key} ${value}')
    #getting the month with the maximum decrease
    keyMin = min(mo2mo, key= lambda x: mo2mo[x])
    for key, value in mo2mo.items():
        if key == keyMin:
            print_to_terminal_and_file(datafile, f'The maximum decrease in profits was: {key} ${value}')
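If the output really has to be a CSV file rather than a text file, one option is to keep each result as a (label, value) pair and pass the pairs to csv.writer. The sketch below is only an illustration: write_summary is a hypothetical helper, the figures are invented, and an in-memory buffer stands in for the real output file.

```python
import csv
import io


def write_summary(stream, results):
    """Write (label, value) pairs as CSV rows and echo them to the terminal."""
    writer = csv.writer(stream)
    writer.writerow(["Measure", "Value"])
    for label, value in results:
        print(f"{label}: {value}")
        writer.writerow([label, value])


# Invented figures, used only to show the shape of the output.
results = [
    ("Total months", 3),
    ("Total profit/loss", 1500),
    ("Average change", -250.0),
]
buffer = io.StringIO()  # stands in for open(output_path, "w", newline="")
write_summary(buffer, results)
```

Because each value is written when the row is built, the file receives the computed numbers rather than the placeholder text of the f-string.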

user input reads through dictionary find number of occurrences of a word

I want the user to enter a name that is stored in my dictionary, which is built from field 3 of each row. The program should look through that field for the name and count how many times it appears, with each occurrence on a separate line of the file.
I would also like the program to find all the zip codes in which this type of tree is found and the borough that has the greatest number of this type of tree.
Enter a command: treeinfo white oak
Entry: white oak
Total number of such trees: 642
Zip codes in which this tree is found: 10011,11103,11375,10002,10463
Borough containing the largest number of such trees: Bronx, with 432
Here is what I have so far. It asks me to enter a tree name, nothing happens, and then it loops and asks me to enter a tree name again.
import csv
from collections import Counter
# variables
# Load data into string variable
# file is variable called data
with open('nyc_street_trees.csv','r') as my_file:
    data = my_file.read()
# Create a list of each row
rows = data.split("\n")
# Create a dictionary
tree_type = {}
# Iterate through each row, this prints out each row on a separate line
for row in rows:
    # Split the row into info
    info = row.split(",")
    # Check if the row is the proper length
    if len(info) == 9:
        # separate out the data
        tree_id = info[0]
        tree_dbh = info[1]
        health = info[2]
        spc_common = info[3]
        zipcode = info[4]
        boroname = info[5]
        nta_name = info[6]
        latitude = info[7]
        longitude = info[8]
        # Populate the dictionary
        tree_type[spc_common] = info[3]
    userinput = input('Enter tree species:')
    count = 0
    for key in tree_type:
        if tree_type[spc_common] == 'userinput':
            count += 1
            print(count)
#print(tree_type)
# prints out the row with tree names
# print(f'Spc: {spc_common} ')
#print(data)
#print(rows)
# print(row)
#print(type(rows))
# print(type(info), info)
What I see when I run the program:
Enter tree species:mulberry
Enter tree species:
I'm still a beginner, so maybe someone can catch what I'm doing wrong here.
I tried to swap
with open('nyc_street_trees.csv','r') as my_file:
    data = my_file.read()
with
inputFile = open("data.txt", "r")
I also tried
count = 0
for key in tree_type:
    if tree_type[spc_common] == 'data':
        count += 1
The easiest way to import and manipulate data from a CSV file is with the pandas library: import the CSV file into a dataframe, then extract the relevant information from the dataframe.
As you have not provided the headers for the CSV, I have used row and column index notation to retrieve the data, assuming that the column positions used in your code are correct. It would be safer to use the column header names, but the following code should work nonetheless.
df.iloc[:,3] and df.columns[3] refer to the spc_common column
df.iloc[:,4] refers to the zipcode column
df.iloc[:,5] and df.columns[5] refer to the borough column
Code:
import pandas as pd
# Import the data from CSV file
df = pd.read_csv('nyc_street_trees.csv')
# Get user to enter the tree species
userinput = input("Enter tree species:")
# Find the number of trees of that species
number_of_trees = df.iloc[:,3][df.iloc[:,3] == userinput].count()
# Find the zip codes containing that species
zip_codes = ",".join(map(str, list(set(df.iloc[:,4][df.iloc[:,3] == userinput]))))
# Find the borough containing the most number of trees of that species
borough_with_most_trees = df.iloc[:,[3,5]][df.iloc[:,3] == userinput].groupby(df.columns[5]).count().nlargest(1, columns=(df.columns[3]))
# Display output
print(f"Total number of {userinput} trees: {number_of_trees}")
print(f"Zip codes in which {userinput} trees are found: {zip_codes}")
print(f"Borough containing the largest number of {userinput} trees: {borough_with_most_trees.index[0]}, with {borough_with_most_trees.iloc[0,0]}")
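For comparison, the same three results can be produced with only the standard library, looking columns up by header name as the answer recommends. This is a sketch under assumptions: it supposes the file's header row uses the names spc_common, zipcode and boroname (taken from the variable names in the question, not confirmed by the poster), tree_info is a hypothetical helper, and the sample rows are invented.

```python
import csv
import io
from collections import Counter


def tree_info(lines, species):
    """Summarise one species: total count, zip codes, and busiest borough."""
    wanted = species.strip().lower()
    zip_codes = []          # unique zip codes, in order of first appearance
    boroughs = Counter()    # borough name -> number of matching trees
    for row in csv.DictReader(lines):
        if row["spc_common"].strip().lower() != wanted:
            continue
        if row["zipcode"] not in zip_codes:
            zip_codes.append(row["zipcode"])
        boroughs[row["boroname"]] += 1
    total = sum(boroughs.values())
    top = boroughs.most_common(1)[0] if boroughs else None
    return total, zip_codes, top


# Invented rows standing in for nyc_street_trees.csv.
sample = io.StringIO(
    "spc_common,zipcode,boroname\n"
    "white oak,10463,Bronx\n"
    "white oak,10463,Bronx\n"
    "white oak,10011,Manhattan\n"
    "mulberry,11375,Queens\n"
)
total, zip_codes, top = tree_info(sample, "White Oak")
print(f"Total number of such trees: {total}")
print(f"Zip codes in which this tree is found: {','.join(zip_codes)}")
if top is not None:
    print(f"Borough containing the largest number of such trees: {top[0]}, with {top[1]}")
```

The comparison is made case-insensitive here so that input such as "White Oak" still matches "white oak" in the data. The prompt for the species name would be issued once, before calling the function, rather than inside the loop over rows.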

How to get a running total for a column in a csv file while depending on a unique variable in a different column?

import csv
def getDataFromFile(filename, dataList):
    file = open(filename, "r")
    csvReader = csv.reader(file)
    for aList in csvReader:
        dataList.append(aList)
    file.close()
def getTotalByYear(expendDataList):
    total = 0
    for row in expendDataList:
        expenCount = float(row[2])
        total += expenCount
    Rtotal = input("Enter 'every' or a particular year. ")
    if Rtotal == 'every' or Rtotal == 'Every':
        print(total)
As you can see, I get the total for column 2 if you type every or Every, but I don't understand how to compute a running total for column 2 that depends on a particular value in column one.
My CSV file has three columns of data: a year field, an item field, and an expenditure field. How do I get a running total of the expenditure field for a given year?
expendDataList = []
fname = "expenditures.csv"
getDataFromFile(fname, expendDataList)
getTotalByYear(expendDataList)
Producing a running total is a good task for a generator function. This example uses the built-in filter function to filter out unwanted years (a generator expression or list comprehension could be used instead), then iterates over the selected rows to produce the results. When the year is 'every', the predicate is None, so filter keeps every row.
import csv
def running_totals(year):
    with open('year-item-expenditure.csv') as f:
        reader = csv.DictReader(f)
        predicate = None if year.lower() == 'every' else lambda row: row['Year'] == year
        total = 0
        for row in filter(predicate, reader):
            total += float(row['Expenditure'])
            yield total
totals = running_totals('2019')
for total in totals:
    print(total)
Another approach would be to use itertools.accumulate, though you still have to perform all of the filtering, so there's not much benefit to doing this unless you need performance.
import csv
import itertools
def running_totals(year):
    with open('year-item-expenditure.csv') as f:
        reader = csv.DictReader(f)
        predicate = None if year.lower() == 'every' else lambda row: row['Year'] == year
        # Create a generator expression that yields expenditures as floats
        expenditures = (float(row['Expenditure']) for row in filter(predicate, reader))
        for total in itertools.accumulate(expenditures):
            yield total
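If only the final figure for each year is needed, rather than every intermediate running total, the rows can be summed into a dictionary keyed by year in a single pass. The sketch below assumes the same Year and Expenditure headers as the answers above. totals_by_year is a hypothetical helper and the sample rows are invented.

```python
import csv
import io
from collections import defaultdict


def totals_by_year(lines):
    """Return {year: total expenditure}, accumulated row by row."""
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        totals[row["Year"]] += float(row["Expenditure"])
    return dict(totals)


# Invented rows standing in for the expenditures file.
sample = io.StringIO(
    "Year,Item,Expenditure\n"
    "2018,Paper,10.5\n"
    "2019,Ink,20.0\n"
    "2019,Desk,99.5\n"
)
totals = totals_by_year(sample)
print(totals)
```

A single year can then be looked up with totals.get('2019'), and sum(totals.values()) gives the figure for 'every'.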

Counting an NCAA basketball team's wins

I am trying to count the wins of certain college basketball teams, using a csv file that contains the data. No matter what I try, this code always returns 0.
import csv
f = open("data.csv", 'r')
data = list(csv.reader(f))
def ncaa(team):
    count = 0
    for row in data:
        if row[2] == team:
            count += 1
    return count
airforce_wins = ncaa("Air force")
akron_wins = ncaa("Akron")
print(akron_wins)
This will give you "1".
import csv
f = open("C:\\users/alex/desktop/data.csv", 'r')
data = list(csv.reader(f))
def ncaa(team):
    count = 0
    for row in data:
        if row[1] == team: #corrected index here
            count += 1
    return count
airforce_wins = ncaa("Air force")
akron_wins = ncaa("Akron")
print(akron_wins)
However, I don't think you are counting the wins correctly. You are counting occurrences of a row in the file but, since each team only has one row, you will always get "1" for any team. Perhaps, your wins are in another column and that's the value you need to look up when you find your team.
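Try this instead before the function definition:
import csv
with open("data1.csv", 'r') as f:
    data = list(csv.reader(f, delimiter=','))
The with block closes the file automatically. The rows have to be copied into a list inside the block, because the reader can no longer be used once the file is closed.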
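If each team has a single row and the wins sit in their own column, the function can return that value instead of counting rows. The layout is not shown in the question, so the column positions below (team name in column 0, wins in column 1), the helper name wins_for, and the sample figures are all assumptions made for illustration.

```python
import csv
import io


def wins_for(rows, team, team_col=0, wins_col=1):
    """Return the wins recorded for `team`, or None if the team is not found."""
    wanted = team.strip().lower()
    for row in rows:
        if len(row) <= max(team_col, wins_col):
            continue  # skip short or blank rows
        if row[team_col].strip().lower() == wanted:
            return int(row[wins_col])
    return None


# Invented rows standing in for data.csv.
sample = io.StringIO("Team,Wins\nAir Force,12\nAkron,21\n")
data = list(csv.reader(sample))
akron_wins = wins_for(data, "Akron")
airforce_wins = wins_for(data, "Air force")
print(akron_wins, airforce_wins)
```

Matching is case-insensitive in this sketch, since a mismatch such as "Air force" against "Air Force" in the file is one possible reason for a lookup to find nothing.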

Check whether a string is within a CSV and run an absolute MAX formula using Python

I have run into a problem trying to read several CSV files, find a specific string within them, and run a formula on the matching data.
The CSV files all have the following main fields (water level, flow, ID and a value):
Water Level, Flow, Water Level,Flow
NEU_NEU_065,NEU_NEU_065,NEU_NEU_065,NEU_NEU_065
(274.4925,0,261.3318,-3.2)
In the example above, the ID NEU_NEU_065 appears twice under Flow, once with the value 0 and once with the value -3.2. At present I search the CSV for this ID by hand and apply an absolute MAX formula to that column range. In this case I manually separate the two columns, so that the first becomes NEU_NEU_065a = 0 and the second becomes NEU_NEU_065b = 3.2.
I don't need the absolute max for every ID in the CSV, only for the particular list of IDs I have in another sheet. Running absolute max over everything in the CSV will not give the right result, because it treats NEU_NEU_065 as a single ID and returns just 0, which is not what I am trying to achieve. I need the columns extracted separately as NEU_NEU_065a = 0 and NEU_NEU_065b = 3.2 (the absolute max of -3.2).
import csv
# Python 3: open in text mode with newline="" as the csv module expects
ifile = open('BCC_R_002c_E_00005Y_0090m_5m_01_PO.csv', newline="")
reader = csv.reader(ifile)
rownum = 0
for row in reader:
    # Save header row.
    if rownum == 0:
        header = row
    else:
        colnum = 0
        for col in row:
            print('%-8s: %s' % (header[colnum], col))
            colnum += 1
    rownum += 1
ifile.close()
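One way to approach the task described above is to label each Flow column of a wanted ID with a letter suffix (a, b, ...) in order of appearance, and then track the largest absolute value seen in each labelled column. The sketch below assumes the layout from the question: a first row of field types, a second row of IDs, and plain numeric rows after that, without the surrounding parentheses. abs_max_flows is a hypothetical helper, and the suffix scheme as written only covers 26 columns per ID.

```python
import csv
import io
from collections import defaultdict
from string import ascii_lowercase


def abs_max_flows(lines, wanted_ids):
    """Return {suffixed_id: absolute max} for each Flow column of a wanted ID."""
    reader = csv.reader(lines)
    kinds = [cell.strip() for cell in next(reader)]  # e.g. "Water Level", "Flow"
    ids = [cell.strip() for cell in next(reader)]    # e.g. "NEU_NEU_065"

    # Label the n-th Flow column of each wanted ID with the n-th letter.
    seen = defaultdict(int)
    labels = {}
    for col, (kind, col_id) in enumerate(zip(kinds, ids)):
        if kind == "Flow" and col_id in wanted_ids:
            labels[col] = col_id + ascii_lowercase[seen[col_id]]
            seen[col_id] += 1

    # Track the largest absolute value in each labelled column.
    results = {label: 0.0 for label in labels.values()}
    for row in reader:
        for col, label in labels.items():
            if col < len(row) and row[col].strip():
                results[label] = max(results[label], abs(float(row[col])))
    return results


# The example rows from the question, written as plain CSV.
sample = io.StringIO(
    "Water Level,Flow,Water Level,Flow\n"
    "NEU_NEU_065,NEU_NEU_065,NEU_NEU_065,NEU_NEU_065\n"
    "274.4925,0,261.3318,-3.2\n"
)
result = abs_max_flows(sample, {"NEU_NEU_065"})
print(result)
```

The set of wanted IDs would come from the separate sheet mentioned in the question. For several files, the function can be called once per file inside a with open(...) block.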
