Get averages of lists in a dictionary

I have a dictionary that has lists inside it eg:
dict = {'Monday,': [10, 20], 'Tuesday,': [20, 20], 'Wednesday,': [30, 40], 'Thursday,': [5, 25], 'Friday,': [100, 20], 'Saturday,': [40, 10], 'Sunday,': [35, 25]}
And I need to get an average of each day eg:
{'Monday,': [15.00], 'Tuesday,': [20.00], 'Wednesday,': [35.00], 'Thursday,': [15.00], 'Friday,': [60.00], 'Saturday,': [25.00], 'Sunday,': [30.00]}
I have this at the moment but it is not working
average = sum(dict) / len(dict)

First, please don't name your dictionary dict, because that shadows the Python built-in of the same name.
This will work for you:
average = {key: sum(val)/len(val) for key, val in dict.items()}
Beware that this assumes every list contains only numbers and that no list is empty.
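A minimal sketch of how the empty-list case could be guarded, assuming an empty list should map to None. That choice, the 'Holiday' entry and the names daily_values and averages are illustrative and do not come from the question:

```python
# Hypothetical data: two days from the question plus one empty list.
daily_values = {
    'Monday': [10, 20],
    'Tuesday': [20, 20],
    'Holiday': [],
}

# Average each list; an empty list maps to None instead of raising
# ZeroDivisionError.
averages = {
    day: (sum(values) / len(values) if values else None)
    for day, values in daily_values.items()
}

print(averages)  # {'Monday': 15.0, 'Tuesday': 20.0, 'Holiday': None}
```

Skipping empty days or using 0.0 would work just as well; which is right depends on what the averages are used for.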

Related

Finding the lowest value or fewest items in a python dictionary that has lists as values

I have a dictionary with lists as values; I need to find the key whose list has the fewest items.
I also need to find the key whose list contains the lowest individual score.
I've really got no idea how to approach the first problem, and my attempts at the second are not bringing back the results I would expect.
Thanks for any help or any pointers
results = {
'a': [21, 23, 24, 19],
'b': [16, 15, 12, 19],
'c': [23, 22, 23],
'd': [18, 20, 26, 24],
'e': [17, 22],
'f': [22, 24],
'g': [21, 21, 28, 25]
}
#Attempt1
print(min(results.items()))
#attempt2
print(results[min(results,key=results.get)])

You can use:
Fewest items:
min(results, key=lambda x: len(results[x]))
output: 'e'
Lowest min score:
min(results, key=lambda x: min(results[x]))
output: 'b'
Why your attempts failed:
min(results.items())
compares (key, list) tuples, so it picks the lowest key first and would only compare the lists in case of a tie (which cannot happen, because keys are unique). The result is therefore simply the item whose key is min(results.keys()).
min(results, key=results.get)
compares the lists themselves and returns the key whose list has the smallest first item, moving on to the second item in case of a tie, then the third, and so on.
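The behaviour described above can be checked directly. This is a small sketch using the question's data; the variable names attempt1, attempt2, fewest_items and lowest_score are only illustrative:

```python
results = {
    'a': [21, 23, 24, 19],
    'b': [16, 15, 12, 19],
    'c': [23, 22, 23],
    'd': [18, 20, 26, 24],
    'e': [17, 22],
    'f': [22, 24],
    'g': [21, 21, 28, 25],
}

# Attempt 1: (key, list) tuples are compared key first, so 'a' wins.
attempt1 = min(results.items())

# Attempt 2: the lists are compared element by element, so the key whose
# list starts with the smallest number wins (16 for 'b').
attempt2 = min(results, key=results.get)

# The two working versions from the answer.
fewest_items = min(results, key=lambda k: len(results[k]))
lowest_score = min(results, key=lambda k: min(results[k]))

print(attempt1)      # ('a', [21, 23, 24, 19])
print(attempt2)      # b
print(fewest_items)  # e
print(lowest_score)  # b
```

With this data, attempt 2 happens to return 'b' as well, but only because the list for 'b' also starts with the smallest first element. 'e' and 'f' both have two items, and min returns the first of them in iteration order.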

Try the following for the key with the shortest list:
>>> min(results, key=lambda x: len(results[x]))
'e'
Try the following for the key whose list contains the lowest value:
>>> min(results, key=lambda x: min(results[x]))
'b'
How does this work?
min(results, key=lambda x: ...)
# ---------------------^^^ x is a key of your dict: 'a', 'b', ...
# With key=lambda x: min(results[x]), each key is scored by the minimum of its list:
# 'a' -> min(results['a']) = 19, 'b' -> min(results['b']) = 12, ...
# min() then returns the key with the lowest score, so when 12 is the lowest we get 'b'.
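To make the role of the key function more concrete, here is a rough hand-written equivalent of min with a key. The helper min_by is hypothetical and only meant as an illustration of the idea, not as how min is implemented:

```python
def min_by(mapping, score):
    """Return the first key whose score(value) is smallest."""
    best_key, best_score = None, None
    for key, value in mapping.items():
        current = score(value)
        # Keep the first key seen, or any key with a strictly lower score.
        if best_score is None or current < best_score:
            best_key, best_score = key, current
    return best_key


results = {
    'a': [21, 23, 24, 19],
    'b': [16, 15, 12, 19],
    'c': [23, 22, 23],
    'd': [18, 20, 26, 24],
    'e': [17, 22],
    'f': [22, 24],
    'g': [21, 21, 28, 25],
}

print(min_by(results, len))  # e  (shortest list)
print(min_by(results, min))  # b  (lowest individual score)
```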

Python - 2D list - find duplicates in one column and sum values in another column

I have a 2D list that contains soccer player names, the number of times they scored a goal, and the number of times they attempted a shot on goal, respectively.
player_stats = [['Adam', 5, 10], ['Kyle', 12, 18], ['Jo', 20, 35], ['Adam', 15, 20], ['Charlie', 31, 58], ['Jo', 6, 14], ['Adam', 10, 15]]
From this list, I'm trying to return another list that shows only one instance of each player with their respective total goals and total attempts on goal, like so:
player_stats_totals = [['Adam', 30, 45], ['Kyle', 12, 18], ['Jo', 26, 49], ['Charlie', 31, 58]]
After searching on Stack Overflow I was able to learn (from this thread) how to return the indexes of the duplicate players
x = [player_stats[i][0] for i in range(len(player_stats))]
for i in range(len(x)):
    if (x[i] in x[:i]) or (x[i] in x[i+1:]):
        print(x[i], i)
but got stuck on how to proceed thereafter and if indeed this method is strictly relevant for what I need(?)
What's the most efficient way to return the desired list of totals?

Use a dictionary to accumulate the values for a given player:
player_stats = [['Adam', 5, 10], ['Kyle', 12, 18], ['Jo', 20, 35], ['Adam', 15, 20], ['Charlie', 31, 58], ['Jo', 6, 14], ['Adam', 10, 15]]
lookup = {}
for player, first, second in player_stats:
    # if the player has not been seen add a new list with 0, 0
    if player not in lookup:
        lookup[player] = [0, 0]
    # get the accumulated total so far
    first_total, second_total = lookup[player]
    # add the current values to the accumulated total, and update the values
    lookup[player] = [first_total + first, second_total + second]
# create the output in the expected format
res = [[player, first, second] for player, (first, second) in lookup.items()]
print(res)
Output
[['Adam', 30, 45], ['Kyle', 12, 18], ['Jo', 26, 49], ['Charlie', 31, 58]]
A more advanced, and pythonic, version is to use a collections.defaultdict:
from collections import defaultdict
player_stats = [['Adam', 5, 10], ['Kyle', 12, 18], ['Jo', 20, 35],
['Adam', 15, 20], ['Charlie', 31, 58], ['Jo', 6, 14], ['Adam', 10, 15]]
lookup = defaultdict(lambda: [0, 0])
for player, first, second in player_stats:
    # get the accumulated total so far
    first_total, second_total = lookup[player]
    # add the current values to the accumulated total, and update the values
    lookup[player] = [first_total + first, second_total + second]
# create the output in the expected format
res = [[player, first, second] for player, (first, second) in lookup.items()]
print(res)
This approach has the advantage of skipping the initialisation. Both approaches are O(n).
Notes
The expression:
res = [[player, first, second] for player, (first, second) in lookup.items()]
is a list comprehension, equivalent to the following for loop:
res = []
for player, (first, second) in lookup.items():
    res.append([player, first, second])
Additionally, read this for understanding unpacking.

What you want to do is use a dictionary where the key is the player name and the value is a list containing [goals, shots]. Constructing it would look like this:
all_games_stats = {}
for stat in player_stats:
    player, goals, shots = stat
    if player not in all_games_stats:
        all_games_stats[player] = [goals, shots]
    else:
        stat_list = all_games_stats[player]
        stat_list[0] += goals
        stat_list[1] += shots
Then, if you want to represent the players and their stats as a list, you would do:
list(all_games_stats.items())
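Note that list(all_games_stats.items()) yields (name, [goals, shots]) tuples rather than the flat [name, goals, shots] rows asked for in the question. A minimal sketch of flattening them, starting from an already accumulated dictionary (the literal values below are the totals from the question; as_items is an illustrative name):

```python
# Totals as they would look after the accumulation loop has run.
all_games_stats = {
    'Adam': [30, 45],
    'Kyle': [12, 18],
    'Jo': [26, 49],
    'Charlie': [31, 58],
}

# items() pairs each name with its [goals, shots] list.
as_items = list(all_games_stats.items())

# Unpack each pair to get one flat row per player.
player_stats_totals = [
    [player, goals, shots]
    for player, (goals, shots) in all_games_stats.items()
]

print(as_items[0])          # ('Adam', [30, 45])
print(player_stats_totals)
```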

You can convert the list to a dictionary. (It can always be changed back once done) This works:
player_stats = [['Adam', 5, 10], ['Kyle', 12, 18], ['Jo',
20, 35], ['Adam', 15, 20], ['Charlie', 31, 58], ['Jo', 6,
14], ['Adam', 10, 15]]
new_stats = {}
for item in player_stats:
    if item[0] not in new_stats:
        new_stats[item[0]] = [item[1], item[2]]
    else:
        new_stats[item[0]][0] += item[1]
        new_stats[item[0]][1] += item[2]
print(new_stats)

I might as well submit something, too. Here's yet another method with some list comprehension worked in:
# Unique values to new dictionary with goal and shots on goal default entries
agg_stats = dict.fromkeys(set([p[0] for p in player_stats]), [0, 0])
# Iterate over the player stats list
for player in player_stats:
    # Set entry to sum of current and next stats values for the corresponding player.
    agg_stats[player[0]] = [sum([agg_stats.get(player[0])[i], stat]) for i, stat in enumerate(player[1:])]
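One thing to be aware of with dict.fromkeys and a list default: every key starts out pointing at the same list object. The code above is safe because it assigns a brand-new list to each key, but an in-place update would leak into every player. A short sketch of the difference (the names shared and separate are illustrative):

```python
# All keys share ONE list object when a mutable default is passed.
shared = dict.fromkeys(['Adam', 'Jo'], [0, 0])
shared['Adam'][0] += 5          # in-place change on the shared list
print(shared)                   # {'Adam': [5, 0], 'Jo': [5, 0]}

# A dict comprehension creates a separate list per key.
separate = {name: [0, 0] for name in ['Adam', 'Jo']}
separate['Adam'][0] += 5
print(separate)                 # {'Adam': [5, 0], 'Jo': [0, 0]}
```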

Yet another way, storing the whole triples (including the name) in the dict and updating them:
stats = {}
for name, goals, attempts in player_stats:
    entry = stats.setdefault(name, [name, 0, 0])
    entry[1] += goals
    entry[2] += attempts
player_stats_totals = list(stats.values())
And for fun a solution with complex numbers, which makes adding nice but requires annoying conversion back:
from collections import defaultdict
tmp = defaultdict(complex)
for name, *stats in player_stats:
    tmp[name] += complex(*stats)
player_stats_totals = [[name, int(stats.real), int(stats.imag)]
                       for name, stats in tmp.items()]

2d arrays and how to populate them with one dimensional arrays

This is my code:
def SetUpScores():
    scoreBoard= []
    names = ["George", "Paul", "John", "Ringo", "Bryan"]
    userScores = [17, 19, 23, 25, 35]
    for i in range(0,5):
        scoreBoard.append([])
        for j in range(0,2):
            scoreBoard[i].append(names[i])
            scoreBoard[i][1]= userScores[i]
I'm basically trying to create a two-dimensional array that holds the name and the userScore. I have looked this up a lot, and so far I keep getting the error "list assignment index out of range" or "'list' cannot be called".
If I remove the last line from my code, i.e.:
def SetUpScores():
    scoreBoard= []
    names = ["George", "Paul", "John", "Ringo", "Bryan"]
    userScores = [17, 19, 23, 25, 35]
    for i in range(0,5):
        scoreBoard.append([])
        for j in range(0,2):
            scoreBoard[i].append(names[i])
I get
[['George', 'George'], ['Paul', 'Paul'], ['John', 'John'], ['Ringo', 'Ringo'], ['Bryan', 'Bryan']] without any errors (this is just to test if the array was made).
I would like to make something like:
[['George', 17], ['Paul', 19], ['John', 23], ['Ringo', 25], ['Bryan', 35]]
Any help would be appreciated, thank you!

With the line scoreBoard[i].append(names[i]), you append a single element to the inner list, not a [name, score] pair. The next line, scoreBoard[i][1] = userScores[i], then raises "list assignment index out of range" whenever it runs while scoreBoard[i] still holds only one element, because index 1 does not exist yet.
The most compact way to do what you want would be
for name, score in zip(names, userScores):
    scoreBoard.append([name, score])
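Put together as a complete function, the zip approach might look like the sketch below. The snake_case names and the decision to return the list are my own assumptions, not part of the original code:

```python
def set_up_scores():
    """Build a list of [name, score] pairs from two parallel lists."""
    names = ["George", "Paul", "John", "Ringo", "Bryan"]
    user_scores = [17, 19, 23, 25, 35]
    # zip pairs the items position by position; each pair becomes a row.
    return [[name, score] for name, score in zip(names, user_scores)]


score_board = set_up_scores()
print(score_board)
# [['George', 17], ['Paul', 19], ['John', 23], ['Ringo', 25], ['Bryan', 35]]
```

zip stops at the shorter input, so a missing score silently drops the corresponding name instead of raising an error.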

names = ["George", "Paul", "John", "Ringo", "Bryan"]
userScores = [17, 19, 23, 25, 35]
L3 = []
for i in range(len(names)):
    L3.append([userScores[i], names[i]])
print(L3)
Output:
[[17, 'George'], [19, 'Paul'], [23, 'John'], [25, 'Ringo'], [35, 'Bryan']]

Removing duplicates in list of lists

I have a list consisting of lists, and each sublist has 4 items (integers and floats) in it. My problem is that I want to remove those sublists whose values at index 1 and index 3 both match those of another sublist.
[[1, 2, 0, 50], [2, 19, 0, 25], [3, 12, 25, 0], [4, 18, 50, 50], [6, 19, 50, 67.45618854993529], [7, 4, 50, 49.49657024231138], [8, 12, 50, 41.65340802385248], [9, 12, 50, 47.80600357035001], [10, 18, 50, 47.80600357035001], [11, 18, 50, 53.222014760339356], [12, 18, 50, 55.667812693447615], [13, 12, 50, 41.65340802385248], [14, 12, 50, 47.80600357035001], [15, 13, 50, 47.80600357035001], [16, 3, 50, 49.49657024231138], [17, 3, 50, 49.49657024231138], [18, 4, 50, 49.49657024231138], [19, 5, 50, 49.49657024231138]]
For example, [7, 4, 50, 49.49657024231138] and [18, 4, 50, 49.49657024231138] have the same values at index 1 and 3. So I want to remove one of them; which one doesn't matter.
I have looked at codes which allow me to do this on the basis of single index.
def unique_items(L):
    found = set()
    for item in L:
        if item[1] not in found:
            yield item
            found.add(item[1])
I have been using this code, which lets me remove sublists, but only on the basis of a single index. (I haven't really understood the code completely, but it is working.)
Hence, the problem is removing sublists only on the basis of duplicate values of index=1 and index=3 in the list of lists.

If you need to compare (item[1], item[3]), use a tuple. A tuple is a hashable type, so it can be used as a set member or dict key.
def unique_items(L):
    found = set()
    for item in L:
        key = (item[1], item[3])  # use tuple as key
        if key not in found:
            yield item
            found.add(key)
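A small usage sketch of the same idea, generalised so the compared positions are a parameter. The name unique_by, the indices argument and the shortened sample rows are all illustrative assumptions:

```python
def unique_by(rows, indices=(1, 3)):
    """Yield the first row seen for each combination of values at `indices`."""
    seen = set()
    for row in rows:
        # A tuple of the compared values is hashable, so it can live in a set.
        key = tuple(row[i] for i in indices)
        if key not in seen:
            seen.add(key)
            yield row


# Shortened sample in the shape of the question's data.
rows = [
    [7, 4, 50, 49.5],
    [16, 3, 50, 49.5],
    [18, 4, 50, 49.5],   # same values at index 1 and 3 as the first row
    [19, 5, 50, 49.5],
]

deduped = list(unique_by(rows))
print(deduped)  # [[7, 4, 50, 49.5], [16, 3, 50, 49.5], [19, 5, 50, 49.5]]
```

The floats are compared for exact equality, so two values that differ only by rounding error would count as different keys.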

This is how you could make it work:
def unique_items(L):
    # Build a set to keep track of all the (index 1, index 3) pairs we've found so far
    found = set()
    for item in L:
        # Now check if the items at index 1 and 3 of the current sublist already are in the set
        if (item[1], item[3]) not in found:
            # if it's new, then add the items at index 1 and 3 as a tuple to our set
            found.add((item[1], item[3]))
            # and give back the current item
            # (I find this order more logical, but it doesn't matter much)
            yield item

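This should work, assuming lists is your list of sublists (on Python 3, wrap d.values() in list() so the sublists themselves are printed):
from pprint import pprint
d = {}
for sublist in lists:
    # build a string key from the items at index 1 and 3
    k = str(sublist[1]) + ',' + str(sublist[3])
    if k not in d:
        d[k] = sublist
pprint(list(d.values()))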

Sorting a list of lists in Python

c2=[]
row1=[1,22,53]
row2=[14,25,46]
row3=[7,8,9]
c2.append(row2)
c2.append(row1)
c2.append(row3)
c2 is now:
[[14, 25, 46], [1, 22, 53], [7, 8, 9]]
How do I sort c2 in such a way that, for example:
for row in c2:
    sort on row[2]
the result would be:
[[7,8,9],[14,25,46],[1,22,53]]
The other question is: how do I first sort by row[2] and, within that, by row[1]?

The key argument to sort specifies a function of one argument that is used to extract a comparison key from each list element. So we can create a simple lambda that returns the last element from each row to be used in the sort:
c2.sort(key = lambda row: row[2])
A lambda is a simple anonymous function. It's handy when you want to create a simple single use function like this. The equivalent code not using a lambda would be:
def sort_key(row):
return row[2]
c2.sort(key = sort_key)
If you want to sort on more entries, just make the key function return a tuple containing the values you wish to sort on in order of importance. For example:
c2.sort(key = lambda row: (row[2],row[1]))
or:
c2.sort(key = lambda row: (row[2],row[1],row[0]))
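A short sketch of the tuple key in action. An extra row, [3, 5, 46], is added here (it is not in the question) so that there is a tie on row[2] for the second key to break:

```python
rows = [[14, 25, 46], [1, 22, 53], [7, 8, 9], [3, 5, 46]]

# Sort by row[2]; rows with equal row[2] are then ordered by row[1].
by_last_then_middle = sorted(rows, key=lambda row: (row[2], row[1]))
print(by_last_then_middle)
# [[7, 8, 9], [3, 5, 46], [14, 25, 46], [1, 22, 53]]

# For numeric columns, negating one component reverses just that component:
# row[2] descending, row[1] ascending.
mixed = sorted(rows, key=lambda row: (-row[2], row[1]))
print(mixed)
# [[1, 22, 53], [3, 5, 46], [14, 25, 46], [7, 8, 9]]
```

sorted returns a new list and leaves rows untouched, whereas list.sort sorts in place as in the answer above.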

>>> from operator import itemgetter
>>> c2 = [[14, 25, 46], [1, 22, 53], [7, 8, 9]]
>>> c2.sort(key=itemgetter(2))
>>> c2
[[7, 8, 9], [14, 25, 46], [1, 22, 53]]

Well, your desired example seems to indicate that you want to sort by the last item of each row, which could be done with this (Python 3 no longer accepts a comparison function here, so a key function is used):
sorted_c2 = sorted(c2, key=lambda row: row[-1])
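If a two-argument comparison function is preferred, Python 3 can still use one through functools.cmp_to_key. A minimal sketch, where compare_last is an illustrative name:

```python
from functools import cmp_to_key

c2 = [[14, 25, 46], [1, 22, 53], [7, 8, 9]]


def compare_last(l1, l2):
    """Negative if l1 sorts first, positive if l2 does, zero on a tie."""
    return l1[-1] - l2[-1]


# cmp_to_key wraps the comparison so sorted() can use it as a key function.
sorted_c2 = sorted(c2, key=cmp_to_key(compare_last))
print(sorted_c2)  # [[7, 8, 9], [14, 25, 46], [1, 22, 53]]
```

For a simple case like this, a plain key function such as lambda row: row[-1] is usually shorter and faster; cmp_to_key is mainly useful when the ordering cannot easily be expressed as a key.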
