Adding keys to defaultdict of int while iterating


The script needs to read input from a text/csv file but as soon as I try and implement the functionality, everything breaks.
Here is my code:
from collections import defaultdict
#from csv import reader
data = """Lions 3, Snakes 3
Tarantulas 1, FC Awesome 0
Lions 1, FC Awesome 1
Tarantulas 3, Snakes 1
Lions 4, Grouches 0"""
# with open('sample_input.csv') as data:
#     csv = reader(data)
#     list_csv = [line.rstrip('\n') for line in data]
data_list = data.splitlines()
def splitter(row):
    left_team, right_team = row.split(',')
    return {
        'left': left_team[:-2].strip(),
        'left_score': int(left_team[-2:].strip()),
        'right': right_team[:-2].strip(),
        'right_score': int(right_team[-2:].strip())
    }
data_dicts = [splitter(row) for row in data_list]
team_scores = defaultdict(int)
for game in data_dicts:
    if game['left_score'] == game['right_score']:
        team_scores[game['left']] += 1
        team_scores[game['right']] += 1
    elif game['left_score'] > game['right_score']:
        team_scores[game['left']] += 3
    else:
        team_scores[game['right']] += 3
print(team_scores)
teams_sorted = sorted(team_scores.items(), key=lambda team: team[1], reverse=True)
# for line in teams_sorted:
#     print(line)
Also, the expected output that I need to have is:
1. Tarantulas, 6 pts
2. Lions, 5 pts
3. FC Awesome, 1 pt
3. Snakes, 1 pt
4. Grouches, 0 pts
I just can't seem to figure out how to get to this step. I checked most parts of my code with print statements and it seems the dictionary is working correctly, but it is not printing the last team and its score (Grouches, 0 pts).
I am currently getting this output:
('Tarantulas', 6)
('Lions', 5)
('Snakes', 1)
('FC Awesome', 1)
Any help would be greatly appreciated!

Well done for getting this far. You have implemented the logic but got stuck on a specific behaviour of defaultdict. There are two main points to note:
A defaultdict only creates a key when that key is first accessed, so a team that never wins or draws (Grouches here) is never added to the dictionary. You can initialize such a key simply by adding 0 to it.
For the specific formatting you require, you can use enumerate in a loop after sorting.
Putting these together, amend your loop as below:
for game in data_dicts:
    if game['left_score'] == game['right_score']:
        team_scores[game['left']] += 1
        team_scores[game['right']] += 1
    elif game['left_score'] > game['right_score']:
        team_scores[game['left']] += 3
        team_scores[game['right']] += 0
    else:
        team_scores[game['left']] += 0
        team_scores[game['right']] += 3
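To see why the `+= 0` lines matter, here is a minimal sketch of how a defaultdict creates keys. The team names are taken from the question; the scores are only illustrative.

```python
from collections import defaultdict

scores = defaultdict(int)
scores["Lions"] += 3             # first access creates the key with int() == 0, then adds 3
print("Grouches" in scores)      # a membership test does not create the key

scores["Grouches"] += 0          # accessing the key creates it with the default value 0
print(dict(scores))
```

A team that only ever loses is never touched by the original loop, which is why it is missing from the final dictionary.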
Then use enumerate in a loop. You can use operator.itemgetter and f-strings (the latter in Python 3.6+) to make your logic cleaner:
from operator import itemgetter
teams_sorted = sorted(team_scores.items(), key=itemgetter(1), reverse=True)
for idx, (team, score) in enumerate(teams_sorted, 1):
    print(f'{idx}. {team} {score} pts')
1. Tarantulas 6 pts
2. Lions 5 pts
3. Snakes 1 pts
4. FC Awesome 1 pts
5. Grouches 0 pts
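The output above does not yet match the expected output in the question, where tied teams share a rank, ties are listed alphabetically, and a single point is printed as "pt". A minimal sketch of one way to get there follows; the helper name `rank_lines` is made up for illustration, and the assumption that the rank after a tie continues with the next integer is read off the expected output.

```python
def rank_lines(team_scores):
    """Return ranking lines where tied teams share a rank."""
    # Sort by score descending, then by team name ascending to break ties.
    ordered = sorted(team_scores.items(), key=lambda item: (-item[1], item[0]))
    lines = []
    rank = 0
    previous_score = None
    for team, score in ordered:
        if score != previous_score:
            rank += 1                    # only advance the rank when the score changes
            previous_score = score
        unit = "pt" if score == 1 else "pts"
        lines.append(f"{rank}. {team}, {score} {unit}")
    return lines


team_scores = {"Lions": 5, "Snakes": 1, "Tarantulas": 6, "FC Awesome": 1, "Grouches": 0}
for line in rank_lines(team_scores):
    print(line)
```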

Have you tried the csv module from the Python standard library? Extracted from the docs (https://docs.python.org/3/library/csv.html):
import csv
with open('data.csv', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
    for row in spamreader:
        print(', '.join(row))

As for the code breaking when you add CSV reading: csv.reader does the split(',') for you, so left_team = row[0] and right_team = row[1].
So your code changes to something like
import csv
def splitter(row):
    left_team, right_team = row
    return {
        'left': left_team[:-2].strip(),
        'left_score': int(left_team[-2:].strip()),
        'right': right_team[:-2].strip(),
        'right_score': int(right_team[-2:].strip())
    }
with open('data.csv') as data_obj:
    reader = csv.reader(data_obj)
    data_dicts = [splitter(row) for row in reader]
You can go for plain-text reading if you want to split(',') manually.
with open('data.csv') as data_obj:
    data_list = [line.rstrip('\n') for line in data_obj.readlines()]
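The slicing with [:-2] and [-2:] only works for single-digit scores. A minimal sketch of a more tolerant parser splits each side at its last space instead. The helper names `parse_side` and `parse_row` and the two-digit score in the sample are assumptions for illustration, and `io.StringIO` stands in for the real file.

```python
import csv
import io


def parse_side(text):
    """Split 'FC Awesome 0' into ('FC Awesome', 0) at the last space."""
    name, score = text.strip().rsplit(" ", 1)
    return name, int(score)


def parse_row(row):
    """Turn a two-column csv row into the dictionary used in the question."""
    left, left_score = parse_side(row[0])
    right, right_score = parse_side(row[1])
    return {"left": left, "left_score": left_score,
            "right": right, "right_score": right_score}


# Hypothetical sample input; the second row has a two-digit score.
sample = io.StringIO("Lions 3, Snakes 3\nTarantulas 10, FC Awesome 0\n")
games = [parse_row(row) for row in csv.reader(sample)]
print(games)
```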

Related

how to create a list and then print it in ascending order

def list():
    list_name = []
    list_name_second = []
    with open('CoinCount.txt', 'r', encoding='utf-8') as csvfile:
        num_lines = 0
        for line in csvfile:
            num_lines = num_lines + 1
        i = 0
        while i < num_lines:
            for x in volunteers[i].name:
                if x not in list_name: # l
                    f = 0
                    while f < num_lines:
                        addition = []
                        if volunteers[f].true_count == "Y":
                            addition.append(1)
                        else:
                            addition.append(0)
                            f = f + 1
                    if f == num_lines:
                        decimal = sum(addition) / len(addition)
                        d = decimal * 100
                        percentage = float("{0:.2f}".format(d))
                        list_name_second.append({'Name': x , 'percentage': str(percentage)})
                        list_name.append(x)
            i = i + 1
            if i == num_lines:
                def sort_percentages(list_name_second):
                    return list_name_second.get('percentage')
                print(list_name_second, end='\n\n')
Above is a segment of my code. It essentially means:
If the name on the nth line hasn't been listed already, find the percentage of accurately counted coins for it, add that to a list, then print that list.
The issue is that when I run this, the program gets stuck in a while loop, continuously executing addition.append(1). I'm not sure why, so please can you (using the code displayed) let me know how to update the code to make it run as intended? In case it helps, the first two lines of the txt file read:
Abena,5p,325.00,Y
Malcolm,1p,3356.00,N
This doesn't matter much, but just in case you need it: I suspect the reason it is stuck looping on addition.append(1) is that the first line has a "Y" as its true_count.
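The intended result described above can be sketched without nested while loops. This is a minimal sketch, assuming every line has the form name,coin,weight,flag as in the two sample lines; the function name `accuracy_by_name` and the third sample line are made up for illustration, and it reads plain lines instead of the `volunteers` objects from the question.

```python
from collections import defaultdict


def accuracy_by_name(lines):
    """Return one {'Name', 'percentage'} entry per name, sorted ascending."""
    counts = defaultdict(lambda: [0, 0])     # name -> [correct counts, total counts]
    for line in lines:
        line = line.strip()
        if not line:
            continue                         # ignore blank lines
        name, _coin, _weight, flag = line.split(",")
        counts[name][1] += 1
        if flag == "Y":
            counts[name][0] += 1
    results = [
        {"Name": name, "percentage": round(100 * correct / total, 2)}
        for name, (correct, total) in counts.items()
    ]
    return sorted(results, key=lambda entry: entry["percentage"])


# The first two lines come from the question; the third is hypothetical.
sample = ["Abena,5p,325.00,Y", "Malcolm,1p,3356.00,N", "Abena,1p,356.00,N"]
print(accuracy_by_name(sample))
```

Keeping the percentage as a number rather than a string means the list sorts numerically.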

Writing a file to a dictionary

I'm taking a CSC 110 project. I am trying to use dictionaries for our assignment even though we haven't learned them yet.
I have a file of countries and how many medals they won separated by new line characters. EX:
Afghanistan
0
0
0
Albania
0
0
0
Algeria
0
2
0
Each line after the country is the medals they earned starting with gold and working its way down to bronze.
I want to take these and store them in a dictionary with the structure looking something like this.
dict={Afghanistan: [0,0,0], Albania: [0,0,0]}
What I have :
olympic_stats = {}
fileIn = open('test.txt', 'r')
line = fileIn.readline()#Initialize Loop
counter = 0
while line != '':
    if counter == 4:
        counter = 0
    if counter%4 == 0:#First Pass, COUNTRY
        country_name = line.rstrip()
    elif counter%4 == 1:#Second Pass, GOLD
        gold_medals = int(line)
    elif counter%4 == 2:#Third Pass, SILVER
        silver_medals = int(line)
    else: #Fourth Pass, BRONZE
        bronze_medals = int(line)
    #update Counter
    counter += 1
    if counter == 4:
        olympic_stats[country_name] = [gold_medals, silver_medals, bronze_medals]
    line = fileIn.readline()#Update Loop
While this works, it is nasty and overcomplicated. I'm trying to come up with a new way to do this.

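While your code isn't super concise, it's not 'bad' per se. I might do something like this:
olympic_stats = {}
with open('test.txt') as fileIn:
    for line in fileIn:
        line_str = line.strip()
        if not line_str:
            continue  # skip blank lines
        if line_str[0].isalpha():
            # a line starting with a letter is a country name
            country = line_str
            olympic_stats[country] = []
        else:
            # any other line is a medal count for the current country
            olympic_stats[country].append(int(line_str))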

Your loop here is pretty clumsy - you can do better. You could, for example,
read the entire file at once into a list (using file.readlines())
count through the list four items at a time
which I have done here:
olympic_stats = {}
with open('test.txt', 'r') as fileIn:
    fileLines = fileIn.readlines()
counter = 0
while counter < len(fileLines):
    country_name = fileLines[counter].rstrip()
    gold_medals = int(fileLines[counter + 1])
    silver_medals = int(fileLines[counter + 2])
    bronze_medals = int(fileLines[counter + 3])
    olympic_stats[country_name] = [gold_medals, silver_medals, bronze_medals]
    counter += 4
There are more concise but much more complicated methods of doing this, by involving list comprehension and numpy or itertools, but those are more advanced topics and this should suffice for the time being.
While implementing this you might come up against errors when the number of lines in the file isn't easily divisible by four - I'll leave you to figure out how to fix that issue on your own, as it's a valuable learning experience and not too hard.
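For reference, one of the more concise approaches mentioned above can be sketched with plain `zip` over a single iterator, with no extra libraries. The function name `parse_medals` is made up for illustration. Note one design assumption: an incomplete trailing group is silently dropped, which may or may not be what the assignment wants.

```python
def parse_medals(lines):
    """Group lines into country -> [gold, silver, bronze]."""
    cleaned = [line.strip() for line in lines if line.strip()]
    it = iter(cleaned)
    stats = {}
    # Zipping the same iterator four times yields consecutive groups of four lines.
    for country, gold, silver, bronze in zip(it, it, it, it):
        stats[country] = [int(gold), int(silver), int(bronze)]
    return stats


sample = "Afghanistan\n0\n0\n0\nAlbania\n0\n0\n0\nAlgeria\n0\n2\n0\n".splitlines()
print(parse_medals(sample))
```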

Need help understanding why this value is staying as 1? Python CSV

So this block of code is supposed to open the CSV file and get the values from columns 1-3 (not 0). Once it has the values for the three columns of each row, it is supposed to add them up and divide by 3. I thought this code would work, but the addition of the three columns in each row doesn't seem to be working. If anyone could tell me why and how I can fix this, that would be great, thank you. I'm pretty certain the problem lies at the line for index, summedValues in enumerate (sums): specifically, the summedValues value.
if order ==("average score"):
    askclass = str(input("what class?"))
    if askclass == ('1'):
        with open("Class1.csv") as f:
            columns = f.readline().strip().split(" ")
            sums = [1] * len(columns)
            for line in f:
                # Skip empty lines
                if not line.strip():
                    continue
                values = line.split(" ")
                for i in range(1,len(values)):
                    sums[i] += int(values[i])
            for index, summedValues in enumerate (sums):
                print (columns[index], 1.0 * (summedValues) / 3)

from statistics import mean
import csv
with open("Class1.csv") as f:
    # create reader object
    r = csv.reader(f)
    # skip headers
    headers = next(r)
    # extract name from row and use statistics.mean to average row[1:],
    # mapping scores to ints
    avgs = ((row[0], mean(map(int, row[1:]))) for row in r)
    # unpack name and average and print
    for name, avg in avgs:
        print(name,avg)
Unless you have written empty lines to your CSV file, there won't be any. I'm not sure how the header fits into it, but you can use it if necessary.
You can also unpack with the * syntax in Python 3, which I think is a bit nicer:
    avgs = ((name, mean(map(int, row))) for name, *row in r)
    for name, avg in avgs:
        print(name,avg)
To order, just sort by the average, using reverse=True to sort from highest to lowest:
from statistics import mean
import csv
from operator import itemgetter
with open("Class1.csv") as f:
    r = csv.reader(f)
    avgs = sorted(((name, mean(map(int, row))) for name, *row in r),key=itemgetter(1),reverse=True)
for name, avg in avgs:
    print(name,avg)
Passing key=itemgetter(1) means we sort by the second subelement which is the average in each tuple.

Using
1, 2, 3
4, 2, 3
4, 5, 3
1, 6, 3
1, 6, 6
6, 2, 3
as Class1.csv
and
askclass = str(input("what class?"))
if askclass == ('1'):
    with open("Class1.csv") as f:
        columns = f.readline().strip().split(",")
        sums = [1] * len(columns)
        for line in f:
            # Skip empty lines
            if not line.strip():
                continue
            values = line.split(",")
            for i in range(1,len(values)):
                sums[i] += int(values[i])
        for index, summedValues in enumerate (sums):
            print (columns[index], 1.0 * (summedValues) / 3)
I obtain the expected result:
what class?1
('1', 0.3333333333333333)
(' 2', 7.333333333333333)
(' 3', 6.333333333333333)
[update] Observations:
sums, defined as sums = [1] * len(columns), has one entry per column, but you skip the first column in your loop, so sums[0] will always be 1. Initializing the list with 1 does not seem necessary.
For float division, summedValues / 3.0 is sufficient instead of 1.0 * (summedValues) / 3.
Maybe this is what you want
for line in f:
    # Skip empty lines
    if not line.strip():
        continue
    values = line.split(" ")
    for i in range(1,len(values)):
        sums[i] += int(values[i])
for index, summedValues in enumerate (sums):
    print (columns[index], 1.0 * (summedValues) / 3)

Python: count values within defined intervals

I import data from a CSV which looks like this:
3.13
3.51
3.51
4.01
2.13
1.13
1.13
1.13
1.63
1.88
What I would like to do now is to COUNT the values within those intervals:
0-1, 1-2, 2-3, >3
So the result would be
0-1: 0
1-2: 5
2-3: 1
>3: 4
Apart from this main task I would like to calculate the outcome into percent of total numbers (e.g. 0-1: 0%, 1-2: 50%,...)
I am quite new to Python so I got stuck in my attempts at solving this. Maybe there is a predefined function for this that I don't know of?
Thanks a lot for your help!!!
+++ UPDATE: +++
Thanks for all the replies.
I have tested a bunch of them, but I guess I am doing something wrong when reading the CSV file. Referring to the code snippets using a, b, c, d for the different intervals: these variables always stay 0 for me.
Here is my actual code:
import csv
a=b=c=0
with open('winter.csv', 'rb') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',')
    for row in spamreader:
        if row in range(0,1):
            a += 1
        elif row in range (1,2):
            b += 1
print a,b
I also converted all values in the CSV to Integers without success. In the CSV there is just one single column.
Any ideas what I am doing wrong???

Here's how to do it in a very concise way with numpy:
import sys
import csv
import numpy as np
with open('winter.csv') as csvfile:
    field = 0 # (zero-based) field/column number containing the required values
    float_list = [float(row[field]) for row in csv.reader(csvfile)]
#float_list = [3.13, 3.51, 3.51, 4.01, 2.13, 1.13, 1.13, 1.13, 1.63, 1.88]
hist, bins = np.histogram(float_list, bins=[0,1,2,3,sys.maxint])
bin_counts = zip(bins, bins[1:], hist) # [(bin_start, bin_end, count), ... ]
for bin_start, bin_end, count in bin_counts[:-1]:
    print '{}-{}: {}'.format(bin_start, bin_end, count)
# different output required for last bin
bin_start, bin_end, count = bin_counts[-1]
print '>{}: {}'.format(bin_start, count)
Which outputs:
0-1: 0
1-2: 5
2-3: 1
>3: 4
Most of the effort is in massaging the data for output.
It's also quite flexible as it is easy to use different intervals by changing the bins argument to np.histogram(), e.g. add another interval by changing bins:
hist, bins = np.histogram(float_list, bins=[0,1,2,3,4,sys.maxint])
outputs:
0-1: 0
1-2: 5
2-3: 1
3-4: 3
>4: 1
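The code above is written for Python 2 (print statements, `sys.maxint`, a list-returning `zip`); `sys.maxint` no longer exists in Python 3. As a minimal sketch that runs on Python 3 without NumPy, the same binning can be done with the standard `bisect` module, including the percentages the question asks for. The function name `count_intervals` is made up for illustration, and the sketch assumes no negative values (they would be counted in the first interval).

```python
from bisect import bisect_right


def count_intervals(values, edges):
    """Count values per interval: below edges[0], between edges, and from edges[-1] up.

    Each interval includes its lower edge and excludes its upper edge.
    """
    counts = [0] * (len(edges) + 1)
    for value in values:
        counts[bisect_right(edges, value)] += 1
    return counts


values = [3.13, 3.51, 3.51, 4.01, 2.13, 1.13, 1.13, 1.13, 1.63, 1.88]
edges = [1, 2, 3]
labels = ["0-1", "1-2", "2-3", ">3"]
counts = count_intervals(values, edges)
for label, count in zip(labels, counts):
    print("{}: {} ({:.0f}%)".format(label, count, 100.0 * count / len(values)))
```

Adding another interval only requires one more entry in `edges` and `labels`.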

This should do, provided the data from the CSV is in values:
from collections import defaultdict
# compute a histogram
histogram = defaultdict(int)
interval = 1.
max_bin = 3
for v in values:
    bin_index = int(v / interval)
    bin_index = max_bin if bin_index >= max_bin else bin_index
    histogram[bin_index] += 1
# output
total = sum(histogram.values())
for k, v in sorted(histogram.items()):
    share = 100. * v / total
    if k >= max_bin:
        print("{}+ : {}, {}%".format(k, v, share))
    else:
        print("{}-{}: {}, {}%".format(k, k+interval, v, share))

import csv
a=b=c=d=0
with open('cf.csv', 'r') as csvfile:
    spamreader = csv.reader(csvfile)
    for row in spamreader:
        value = float(row[0])
        if 0 <= value < 1:
            a += 1
        elif 1 <= value < 2:
            b += 1
        elif 2 <= value < 3:
            c += 1
        elif value >= 3:
            d += 1
print("0-1:{}\n1-2:{}\n2-3:{}\n>3:{}".format(a,b,c,d))
Output:
0-1:0
1-2:5
2-3:1
>3:4
Because each row is a list, we use the [0] index to access the value and convert the string to a float with float().

After you get the entries into a list (called values here):
zero_to_one = 0
one_to_two = 0
two_to_three = 0
over_three = 0
for v in values:
    if 0 <= v < 1:
        zero_to_one += 1
    elif 1 <= v < 2:
        one_to_two += 1
So on and so forth...
And to find the breakdown:
total_values = zero_to_one + one_to_two + two_to_three + over_three
perc_zero_to_one = 100.0 * zero_to_one / total_values
perc_one_to_two = 100.0 * one_to_two / total_values
perc_two_to_three = 100.0 * two_to_three / total_values
perc_over_three = 100.0 * over_three / total_values
+++++ Response to Update +++++++
import csv
a=b=c=0
with open('winter.csv', 'rb') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',')
    for row in spamreader:
        for i in row:
            i = float(i.strip()) # .strip() removes blank spaces before converting it to float
            if 0 <= i < 1:
                a += 1
            elif 1 <= i < 2:
                b += 1
            # add more elif statements here as desired.
Hope that works.
Side note, I like that a=b=c=0 thing. Didn't realize you could do that after all this time haha.

How to add specific lines from a file into List in Python?

I have an input file:
3
PPP
TTT
QPQ

TQT
QTT
PQP

QQQ
TXT
PRP
I want to read this file and group these cases into proper boards.
To read the count (number of boards) I have this code:
board = []
count =''
def readcount():
    fp = open("input.txt")
    for i, line in enumerate(fp):
        if i == 0:
            count = int(line)
            break
    fp.close()
But I don't have any idea how to parse these blocks into a list:
TQT
QTT
PQP
I tried using
def readboard():
    fp = open('input.txt')
    for c in (1, count): # To Run loop to total no. of boards available
        for k in (c+1, c+3): #To group the boards into board[]
            board[c].append(fp.readlines)
But that is the wrong way. I know the basics of lists, but here I am not able to parse the file.
These boards are in lines 2 to 4, 6 to 8 and so on. How do I get them into lists?
I want to parse these into a count and boards so that I can process them further.
Please suggest.

I don't know if I understand your desired outcome. I think you want a list of lists.
Assuming that you want boards to be:
[[data,data,data],[data,data,data],[data,data,data]], then you would need to define how to parse your input file... specifically:
line 1 is the count number
data is entered per line
boards are separated by blank lines.
If that is the case, this should parse your files correctly:
board = []
count = 0
currentBoard = 0
fp = open('input.txt')
for i,line in enumerate(fp.readlines()):
    if i == 0:
        count = int(line)
        board.append([])
    else:
        if len(line.rstrip('\n')) == 0:
            currentBoard += 1
            board.append([])
        else: #this has board data
            board[currentBoard].append(line.rstrip('\n'))
fp.close()
import pprint
pprint.pprint(board)
If my assumptions are wrong, then this can be modified to accommodate.
Personally, I would use a dictionary (or ordered dict) and get the count from len(board):
from collections import OrderedDict
currentBoard = 0
board = {}
board[currentBoard] = []
fp = open('input.txt')
lines = fp.readlines()
fp.close()
for line in lines[1:]:
    if len(line.rstrip('\n')) == 0:
        currentBoard += 1
        board[currentBoard] = []
    else:
        board[currentBoard].append(line.rstrip('\n'))
count = len(board)
print(count)
import pprint
pprint.pprint(board)
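Under the same assumptions (first line is the count, boards are separated by blank lines, rows contain no spaces), the parsing can also be sketched by splitting the text on blank lines. The function name `parse_boards` is made up for illustration, and the sample string stands in for the contents of input.txt.

```python
def parse_boards(text):
    """Return (count, boards); each board is a list of its row strings."""
    first_line, _, rest = text.partition("\n")
    count = int(first_line)
    # A blank line shows up as two consecutive newlines in the text.
    boards = [block.split() for block in rest.strip().split("\n\n") if block.strip()]
    return count, boards


sample = "3\nPPP\nTTT\nQPQ\n\nTQT\nQTT\nPQP\n\nQQQ\nTXT\nPRP\n"
count, boards = parse_boards(sample)
print(count)
print(boards)
```

Comparing `count` with `len(boards)` is a cheap sanity check that the file was split as expected.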

If you just want to take specific line numbers (zero-based) and put them into a list:
line_nums = [3, 4, 5, 1]
with open('input.txt') as fp:
    selected = [line for i, line in enumerate(fp) if i in line_nums]
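A minimal sketch of the same idea with one-based line numbers, as they are used in the question ("line 2 to 4"). The function name `pick_lines` is made up for illustration, and `io.StringIO` stands in for the opened file.

```python
import io


def pick_lines(fileobj, line_numbers):
    """Return the lines whose one-based number is in line_numbers, in file order."""
    wanted = set(line_numbers)           # set membership tests are fast
    return [
        line.rstrip("\n")
        for number, line in enumerate(fileobj, start=1)
        if number in wanted
    ]


sample = io.StringIO("3\nPPP\nTTT\nQPQ\n")
print(pick_lines(sample, [2, 4]))
```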
