Python the total sales price from a text file - python

I would like to subtract two numbers that are written in my text file.I have to calculate the total sales in each line by doing sales - cost price.
the text file contains:
200 123
300 189
111 77
I would like to subtract these values to get the output. Each line gives a different sales profit.

Here's a way to do it:
input_file = open('input.txt', 'r')
sales_profits = []
for line in input_file:
sales, cost = map(int, line.split(" "))
sales_profits.append(sales - cost)
print(sales_profits) # Your result is there - do what you want with :)

Related

How could I create a tuple with for-loops to create a list variable using a file with headers and its corresponding data

So I have a text file containing 22 lines and three headers which is:
economy name
unique economy code given the World Bank standard (3 uppercase letters)
Trade-to-GDP from year 1990 to year 2019 (30 years, 30 data points); 0.3216 means that trade-to-gdp ratio for Australia in 1990
is 32.16%
The code I have used to import this file and open/read it is:
def Input(filename):
f = open(filename, 'r')
lines = f.readlines()
lines = [l.strip() for l in lines]
f.close()
return lines
However once I have done that I have to create a code with for-loops to create a list variable named result. It should contain 22 tuples, and each tuple contains four elements:
economy name,
World Bank economy code,
average trade-to-gdp ratio for this economy from 1990 to 2004,
average trade-to-gdp ratio for this economy from 2005 to 2019.
Coming out like
('Australia', 'AUS', '0.378', '0.423')
So far the code I have written looks like this:
def result:
name, age, height, weight = zip(*[l.split() for l in text_file.readlines()])
I am having trouble starting this and knowing how to grapple with the multiple years required and output all the countries with corresponding ratios.Here is the table of all the data I have on the text file.
I would suggest to use Pandas for this.
You can simply do:
import pandas as pd
df = read_csv('filename.csv')
for index, row in df.iterrows():
***Do something***
In for loop you can use row['columnName'] and get the data, For example: row['code'] or row['1999'].
This approach will be lot easier for you to carry operations and process the data.
Also to answer your approach:
You can iter over the lines and extract the data using index.
Try the below code:
def Input(filename):
f = open(filename, 'r')
lines = f.readlines()
lines = [l.strip().split() for l in lines]
f.close()
return lines
for line in lines[1:]:
total = sum([float(x) for x in line[2:17])# this will give you sum of values from 1990 to 2004
total2 = sum([float(x) for x in line[17:])# this will give you sum of values from 2005 to 2019
val= (line[0], line[1], total, total1) #This will give you tuple
You can continue the approach and create a tuple in each for loop.

Trying to get sums in lists to print out with strings in output

I have been working on a Python project analyzing a CSV file and cannot get the output to show me sums with my strings, just lists of the numbers that should be summed.
Code I'm working with:
import pandas as pd
data = pd.read_csv('XML_projectB.csv')
#inserted column headers since the raw data doesn't have any
data.columns = ['name','email','category','amount','date']
data['date'] = pd.to_datetime(data['date'])
#Calculate the total budget by cateogry
category_wise = data.groupby('category').agg({'amount':['sum']})
category_wise.reset_index(inplace=True)
category_wise.columns = ['category','total_budget']
#Determine which budget category people spent the most money in
max_budget = category_wise[category_wise['total_budget']==max(category_wise['total_budget'])]['category'].to_list()
#Tally the total amounts for each year-month (e.g., 2017-05)
months_wise = data.groupby([data.date.dt.year, data.date.dt.month])['amount'].sum()
months_wise = pd.DataFrame(months_wise)
months_wise.index.names = ['year','month']
months_wise.reset_index(inplace=True)
#Determine which person(s) spent the most money on a single item.
person = data[data['amount'] == max(data['amount'])]['name'].to_list()
#Tells user in Shell that text file is ready
print("Check your folder!")
#Get all this info into a text file
tfile = open('output.txt','a')
tfile.write(category_wise.to_string())
tfile.write("\n\n")
tfile.write("The type with most budget is " + str(max_budget) + " and the value for the same is " + str(max(category_wise['total_budget'])))
tfile.write("\n\n")
tfile.write(months_wise.to_string())
tfile.write("\n\n")
tfile.write("The person who spent most on a single item is " + str(person) + " and he/she spent " + str(max(data['amount'])))
tfile.close()
The CSV raw data looks like this (there are almost 1000 lines of it):
Walker Gore,wgore8i#irs.gov,Music,$77.98,2017-08-25
Catriona Driussi,cdriussi8j#github.com,Garden,$50.35,2016-12-23
Barbara-anne Cawsey,bcawsey8k#tripod.com,Health,$75.38,2016-10-16
Henryetta Hillett,hhillett8l#pagesperso-orange.fr,Electronics,$59.52,2017-03-20
Boyce Andreou,bandreou8m#walmart.com,Jewelery,$60.77,2016-10-19
My output in the txt file looks like this:
category total_budget
0 Automotive $53.04$91.99$42.66$1.32$35.07$97.91$92.40$21.28$36.41
1 Baby $93.14$46.59$31.50$34.86$30.99$70.55$86.74$56.63$84.65
2 Beauty $28.67$97.95$4.64$5.25$96.53$50.25$85.42$24.77$64.74
3 Books $4.03$17.68$14.21$43.43$98.17$23.96$6.81$58.33$30.80
4 Clothing $64.07$19.29$27.23$19.78$70.50$8.81$39.36$52.80$80.90
year month amount
0 2016 9 $97.95$67.81$80.64
1 2016 10 $93.14$6.08$77.51$58.15$28.31$2.24$12.83$52.22$48.72
2 2016 11 $55.22$95.00$34.86$40.14$70.13$24.82$63.81$56.83
3 2016 12 $13.32$10.93$5.95$12.41$45.65$86.69$31.26$81.53
I want the total_budget column to be the sum of the list for each category, not the individual values you see here. It's the same problem for months_wise, it gives me the individual values, not the sums.
I tried the {} .format in the write lines, .apply(str), .format on its own, and just about every other Python permutation of the conversion to string from a list I could think of, but I'm stumped.
What am I missing here?
As #Barmar said, the source has $XX so it is not treated as numbers. You could try following this approach to parse the values as integers/floats instead of strings with $ in them.

How to count occurances and calculate a rating with the csv module?

You have a CSV file of individual song ratings and you'd like to know the average rating for a particular song. The file will contain a single 1-5 rating for a song per line.
Write a function named average_rating that takes two strings as parameters where the first string represents the name of a CSV file containing song ratings in the format: "YouTubeID, artist, title, rating" and the second parameter is the YouTubeID of a song. The YouTubeID, artist, and title are all strings while the rating is an integer in the range 1-5. This function should return the average rating for the song with the inputted YouTubeID.
Note that each line of the CSV file is individual rating from a user and that each song may be rated multiple times. As you read through the file you'll need to track a sum of all the ratings as well as how many times the song has been rated to compute the average rating. (My code below)
import csv
def average_rating(csvfile, ID):
with open(csvfile) as f:
file = csv.reader(f)
total = 0
total1 = 0
total2 = 0
for rows in file:
for items in ID:
if rows[0] == items[0]:
total = total + int(rows[3])
for ratings in total:
total1 = total1 + int(ratings)
total2 = total2 + 1
return total1 / total2
I am getting error on input ['ratings.csv', 'RH5Ta6iHhCQ']: division by zero. How would I go on to resolve the problem?
You can do this by using pandas DataFrame.
import pandas as pd
df = pd.read_csv('filename.csv')
total_sum = df[df['YouTubeID'] == 'RH5Ta6iHhCQ'].rating.sum()
n_rating = len(df[df['YouTubeID'] == 'RH5Ta6iHhCQ'].rating)
average = total_sum/n_rating
There are a few confusing things, I think renaming variables and refactoring would be a smart decision. It might even make things more obvious if one function was tasked with getting all the rows for a specific youtube id and another function for calculating the average.
def average_rating(csvfile, id):
'''
Calculate the average rating of a youtube video
params: - csvfile: the location of the source rating file
- id: the id of the video we want the average rating of
'''
total_ratings = 0
count = 0
with open(csvfile) as f:
file = csv.reader(f)
for rating in file:
if rating[0] == id:
count += 1
total_ratings += rating[3]
if count == 0:
return 0
return total_ratings / count
import csv
def average_rating(csvfile, ID) :
with open(csvfile) as f:
file = csv.reader(f)
cont = 0
total = 0
for rows in file:
if rows[0] == ID:
cont = cont + 1
total = total + int(rows[3])
return total/cont
this works guyx

Python: Find keywords in a text file from another text file

Take this invoice.txt for example
Invoice Number
INV-3337
Order Number
12345
Invoice Date
January 25, 2016
Due Date
January 31, 2016
And this is what dict.txt looks like:
Invoice Date
Invoice Number
Due Date
Order Number
I am trying to find keywords from 'dict.txt' in 'invoice.txt' and then add it and the text which comes after it (but before the next keyword) in a 2 column datatable.
So it would look like :
col1 ----- col2
Invoice number ------ INV-3337
order number ---- 12345
Here is what I have done till now
with open('C:\invoice.txt') as f:
invoices = list(f)
with open('C:\dict.txt') as f:
for line in f:
dict = line.strip()
for invoice in invoices:
if dict in invoice:
print invoice
This is working but the ordering is all wrong (it is as in dict.txt and not as in invoice.txt)
i.e.
The output is
Invoice Date
Invoice Number
Due Date
Order Number
instead of the order in the invoice.txt , which is
invoice number
order number
invoice date
due date
Can you help me with how I should proceed further ?
Thank You.
This should work. You can load your invoice data into a list, and your dict data into a set for easy lookup.
with open('C:\invoice.txt') as f:
invoice_data = [line.strip() for line in f if line.strip()]
with open('C:\dict.txt') as f:
dict_data = set([line.strip() for line in f if line.strip()])
Now iterate over invoices, 2 at a time and print out the line sets that match.
for i in range(0, len(invoice_data), 2):
if invoice_data[i] in dict_data:
print(invoive_data[i: i + 2])

How to parse csv file and compute stats based on that data

I have a task which requires me to make a program in python that reads a text file which has information about people (name, weight and height).
Then I need the program to ask for the user to enter a name then look for that name in the text file and print out the line which includes that name and the person's height and weight.
Then the program has to work out the average weight of the people and average height.
The text file is:
James,73,1.82,M
Peter,78,1.80,M
Jay,90,1.90,M
Beth,65,1.53.F
Mags,66,1.50,F
Joy,62,1.34,F
So far I have this code which prints out the line using the name that has been typed by the user but I don't know to assign the heights and the weights:
search = input("Who's information would you like to find?")
with open("HeightAndWeight.txt", "r") as f:
for line in f:
if search in line:
print(line)
Using the pandas library as suggested, you can do as follows:
import pandas as pd
df = pd.read_csv('people.txt', header=None, index_col=0)
df.columns = ['weight', 'height', 'sex']
print(df)
weight height sex
0
James 73 1.82 M
Peter 78 1.80 M
Jay 90 1.90 M
Beth 65 1.53 F
Mags 66 1.50 F
Joy 62 1.34 F
print(df.mean())
weight 72.333333
height 1.648333
You could use Python's built in csv module to split each line in the file into a list of columns as follows:
import csv
with open('HeightAndWeight.txt', 'rb') as f_input:
csv_input = csv.reader(f_input)
total_weight = 0
total_height = 0
for index, row in enumerate(csv_input, start=1):
total_weight += float(row[1])
total_height += float(row[2])
print "Average weight: {:.2f}".format(total_weight / index)
print "Average height: {:.2f}".format(total_height / index)
This would display the following output:
Average weight: 72.33
Average height: 1.65
The answer is actually in your question's title : use the standard lib's csv module to parse your file
Use:
splitted_line = line.split(',', 4)
to split the line you just found into four parts, using the comma , as a delimiter. You then can get the first part (the name) with splitted_line[0], the second part (age) with splitted_line[1] and so on. So, to print out the persons name, height and weight:
print('The person %s is %s years old and %s meters tall.' % (splitted_line[0], splitted_line[1], splitted_line[2]))
To get the average on height and age, you need to know how many entries are in your file, and then just add up age and height and divide it by the number of entries/persons. The whole thing would look like:
search = input("Who's information would you like to find?")
total = 0
age = 0
height = 0
with open("HeightAndWeight.txt", "r") as f:
for line in f:
total += 1
splitted_line = line.split(',', 4)
age += int(splitted_line[1])
height += int(splitted_line[2])
if search in line:
print('The person %s is %s years old and %s meters tall.' % (splitted_line[0], splitted_line[1], splitted_line[2]))
average_age = age / total
average_height = height / total
That's one straightforward way to do it, and hopefully easy to understand as well.
search = input("Who's information would you like to find?")
heights = []
weights = []
with open("HeightAndWeight.txt", "r") as f:
for line in f:
if search in line:
print(line)
heights.append(int(line.split(',')[2]))
weights.append(int(line.split(',')[1]))
# your calculation stuff

Categories

Resources