Sum of first elements in nested lists - python

I am trying to get the first element of each sublist in a nested list and sum those values.
For example:
nested_list = [[1, 'a'], [2, 'b'], [3, 'c'], [4, 'd']]
print sum(i[0] for i in nested_list)
However, there are times when the first element of a sublist is None instead:
nested_list = [[1, 'a'], [None, 'b'], [3, 'c'], [4, 'd']]
new = []
for nest in nested_list:
    if not nest[0]:
        pass
    else:
        new.append(nest[0])
print sum(new)
Is there a better way to write this?

Just filter them; in this case, test for values that are not None:
sum(i[0] for i in nested_list if i[0] is not None)
A generator expression (and list, dict and set comprehensions) takes any number of nested for loops and if statements. The above is the equivalent of:
for i in nested_list:
    if i[0] is not None:
        i[0] # used to sum()
Note how this mirrors your own code; rather than use if not ...: pass and else, I inverted the test to only allow values you can actually sum. If you need more loops or filters, add more for loops or if statements in the same left-to-right order in which you would nest them as statements, or use and or or to combine multiple tests in a single if filter.
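As a small illustration of that left-to-right ordering, here is a sketch with two for clauses and one combined filter. The name groups and its data are made up for the example; they are not from the question.

```python
# Hypothetical data: a list of groups, each holding [value, label] pairs.
groups = [
    [[1, 'a'], [None, 'b']],
    [[3, 'c'], [4, 'd']],
]

# The clauses read left to right, in the same order as the nested statements below.
total = sum(
    pair[0]
    for group in groups                      # outer loop
    for pair in group                        # inner loop
    if pair[0] is not None and pair[0] > 1   # two tests joined with `and`
)

# Equivalent explicit loops, for comparison.
expected = 0
for group in groups:
    for pair in group:
        if pair[0] is not None and pair[0] > 1:
            expected += pair[0]
```

Both total and expected end up as 7 here, since only 3 and 4 pass the filter.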
In your specific case, just testing for if i[0] would also suffice; this would filter out None and 0, but the latter value would not make a difference to the sum anyway:
sum(i[0] for i in nested_list if i[0])
You already approached this in your own if test in the loop code.

First of all, Python has no null; the equivalent of null in languages like Java, C#, and JavaScript is None.
Secondly, we can use a filter in the generator expression. The most generic filter is probably an isinstance check against numbers.Number:
from numbers import Number
print sum(i[0] for i in nested_list if isinstance(i[0], Number))
Number will usually make sure that we accept ints, longs, floats, complexes, etc. So we do not have to keep track of all objects in the Python world that are numerical ourselves.
Since it is also possible that the list contains empty sublists, we can also check for that:
from numbers import Number
print sum(i[0] for i in nested_list if i and isinstance(i[0], Number))
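To make the effect of both checks visible, here is a minimal sketch that wraps the expression in a function and runs it on mixed input. The function name sum_first_numbers and the sample data are assumptions for illustration only.

```python
from numbers import Number

def sum_first_numbers(rows):
    """Sum the first element of each non-empty row, if that element is numeric."""
    return sum(row[0] for row in rows if row and isinstance(row[0], Number))

# Hypothetical input mixing numbers, None, a string and an empty sublist.
nested_list = [[1, 'a'], [None, 'b'], [], [2.5, 'c'], ['x', 'd'], [4, 'e']]
total = sum_first_numbers(nested_list)
```

One thing to keep in mind: bool is a subclass of int, so a leading True would pass the Number check and count as 1.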

Related

set item at multiple indexes in a list

I am trying to find a way to use a list of indexes to set values at multiple places in a list (as is possible with numpy arrays).
I found that I can map __getitem__ and a list of indexes to return the values at those indexes:
# something like
a_list = ['a', 'b', 'c']
idxs = [0, 1]
get_map = map(a_list.__getitem__, idxs)
print(list(get_map)) # yields ['a', 'b']
However, when applying this same line of thought to __setitem__, the setting fails. This probably has something to do with pass-by-reference vs pass-by-value, which I have never fully understood no matter how many times I've read about it.
Is there a way to do this?
b_list = ['a', 'b', 'c']
idxs = [0, 1]
put_map = map(b_list.__setitem__, idxs, ['YAY', 'YAY'])
print(b_list) # want ['YAY', 'YAY', 'c'], but the list is unchanged
For my use case, I only want to set one value at multiple locations. Not multiple values at multiple locations.
EDIT: I know how to use list comprehension. I am trying to mimic numpy's capability to accept a list of indexes for both getting and setting items in an array, except for lists.
The difference between the get and set cases is that in the get case you are interested in the result of map itself, whereas in the set case you want a side effect. In Python 3, map is lazy, so as long as you never consume the map iterator, the __setitem__ calls are never executed. Once you consume it, b_list changes as expected.
>>> put_map = map(b_list.__setitem__, idxs, ['YAY', 'YAY'])
>>> b_list
['a', 'b', 'c']
>>> list(put_map)
[None, None]
>>> b_list
['YAY', 'YAY', 'c']
That said, the idiomatic way to get is a list comprehension, and the idiomatic way to set is a simple for loop. The loop also has the advantage that you do not have to repeat the value n times.
>>> for i in idxs: b_list[i] = "YAY"
>>> [b_list[i] for i in idxs]
['YAY', 'YAY']
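If you want something closer to numpy's index-list interface, the loop and the comprehension can be wrapped in two small helpers. This is only a sketch; set_at and get_at are hypothetical names, not part of the standard library.

```python
def set_at(target, indexes, value):
    """Assign the same value at every index in indexes (mutates target in place)."""
    for i in indexes:
        target[i] = value

def get_at(source, indexes):
    """Return the items of source found at the given indexes, as a new list."""
    return [source[i] for i in indexes]

b_list = ['a', 'b', 'c']
set_at(b_list, [0, 1], 'YAY')
picked = get_at(b_list, [0, 2])
```

After this runs, b_list is ['YAY', 'YAY', 'c'] and picked is ['YAY', 'c'].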

How to solve this task with lists?

Making a list from mainlist and from sublists
As chepner said, try writing your own code without a list comprehension first, and then convert it.
def pyramid(base, char):
    return [[''] * num_empty + [char] * num_char + [''] * num_empty
            for num_empty, num_char in enumerate(range(base, 0, -2))][::-1]
I'd recommend breaking up the problems into two steps:
1. Generate the list of pyramid bricks (without the empty spaces)
2. Add in the empty spaces (as needed) to each element of that list
As for Step 1, you can easily accomplish it with this function:
def pyramid(base, char):
    result = [[char] * i for i in range(1, base+1, 2)]
    return result
Do you see what it's doing? It's looping through a range of odd numbers, and for each number, it is constructing a list of chars. Each constructed list will be an element of the returned result list.
So if you call pyramid(5, 'A'), you'll get:
[['A'], ['A', 'A', 'A'], ['A', 'A', 'A', 'A', 'A']]
This solution does not account for the empty spaces, however. To handle those empty spaces, you could either:
Run the result through a second list comprehension, or:
Edit the first (and only) list comprehension to include the proper number of spaces at the beginning and end of each sub-list.
I'll let you decide how to implement this for yourself. I hope this helps!
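For readers who want to check their own attempt, here is one possible sketch of the "second list comprehension" route. It assumes an empty cell is represented by the empty string '' (as in the one-line version earlier in this thread) and that base is odd; treat it as an illustration rather than the intended solution.

```python
def pyramid(base, char):
    # Step 1: bricks only, one row per odd width from 1 up to base.
    bricks = [[char] * width for width in range(1, base + 1, 2)]
    # Step 2: pad every row on both sides up to the width of the widest row.
    widest = len(bricks[-1]) if bricks else 0
    return [
        [''] * ((widest - len(row)) // 2) + row + [''] * ((widest - len(row)) // 2)
        for row in bricks
    ]

rows = pyramid(5, 'A')
```

With base 5, the first row has two empty cells on each side of a single 'A', and the last row is five 'A's with no padding.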

How to arrange the output of set based on predefined list

list1=['f','l','a','m','e','s'] #This is the predefined list
list2=['e','e','f','a','s','a'] #This is the list with repetition
x=list(set(list2)) # I want to remove duplicates
print(x)
Here I want the variable x to retain the order which list1 has. For example, if at one instance set(list2) produces the output as ['e','f','a','s'], I want it to produce ['f','a','e','s'] (Just by following the order of list1).
Can anyone help me with this?
Construct a dictionary that maps characters to their position in list1. Use its get method as the sort-key.
>>> dict1 = dict(zip(list1, range(len(list1))))
>>> sorted(set(list2), key=dict1.get)
['f', 'a', 'e', 's']
This is one way using dictionary:
list1=['f','l','a','m','e','s'] #This is the predefined list
list2=['e','e','f','a','s','a'] #This is the list with repetition
x=list(set(list2)) # I want to remove duplicates
d = {key:value for value, key in enumerate(list1)}
x.sort(key=d.get)
print(x)
# ['f', 'a', 'e', 's']
Method index from the list class can do the job:
sorted(set(list2), key=list1.index)
What is best usually depends on actual use. For this problem, the expected sizes of the lists determine which approach is most efficient. If most elements of list1 end up in the result, the following comprehension works well and has the additional benefit of being easy to read.
set2 = set(list2)
x = [i for i in list1 if i in set2]
It would also work without turning list2 into a set first. However, this would run much slower with a large list2.
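The comprehension can be wrapped in a small function, sketched below. The name ordered_unique is hypothetical, and the sketch assumes the reference list itself contains no duplicates; the extra 'z' in the sample data is added here only to show what happens to items missing from the reference.

```python
def ordered_unique(reference, items):
    """Return the distinct items, ordered by their position in reference.

    Items that do not occur in reference are dropped.
    """
    wanted = set(items)  # set membership tests are O(1) on average
    return [r for r in reference if r in wanted]

list1 = ['f', 'l', 'a', 'm', 'e', 's']
list2 = ['e', 'e', 'f', 'a', 's', 'a', 'z']  # 'z' does not occur in list1

x = ordered_unique(list1, list2)
```

Here x is ['f', 'a', 'e', 's']. By contrast, the sorted(..., key=dict1.get) variant would get None as the key for 'z', which on Python 3 cannot be compared with the integer keys of the other items.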

Multi Dimensional List - Sum Integer Element X by Common String Element Y

I have a multi dimensional list:
multiDimList = [['a',1],['a',1],['a',1],['b',2],['c',3],['c',3]]
I'm trying to sum the instances of element [1] where element [0] is common.
To put it more clearly, my desired output is another multi dimensional list:
multiDimListSum = [['a',3],['b',2],['c',6]]
I see that I can access, say, the value 2 in multiDimList with
x = multiDimList[3][1]
so I can grab the individual elements, and could probably build some sort of function to do this job, but it would be disgusting.
Does anyone have a suggestion of how to do this pythonically?
Assuming your actual sequence has similar elements grouped together as in your example (all instances of 'a', 'b' etc. together), you can use itertools.groupby() and operator.itemgetter():
from itertools import groupby
from operator import itemgetter
[[k, sum(v[1] for v in g)] for k, g in groupby(multiDimList, itemgetter(0))]
# result: [['a', 3], ['b', 2], ['c', 6]]
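groupby() only merges adjacent items with equal keys, so if the input is not already grouped, one option is to sort by the key first. The sketch below uses made-up, unsorted data (unsorted_pairs is not from the question).

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input where equal keys are not adjacent.
unsorted_pairs = [['c', 3], ['a', 1], ['b', 2], ['a', 1], ['c', 3], ['a', 1]]

key = itemgetter(0)
sums = [
    [k, sum(item[1] for item in group)]
    for k, group in groupby(sorted(unsorted_pairs, key=key), key)
]
```

The result is [['a', 3], ['b', 2], ['c', 6]], ordered by key rather than by first appearance, at the cost of an O(n log n) sort.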
Zero Piraeus's answer covers the case when field entries are grouped in order. If they're not, then the following is short and reasonably efficient.
from collections import Counter
from functools import reduce  # reduce is a builtin on Python 2; Python 3 needs this import
reduce(lambda c,x: c.update({x[0]: x[1]}) or c, multiDimList, Counter())
This returns a Counter, which you can index by label. If you prefer a list, you can call the .items() method on it, but note that the order of the labels in the output may differ from the order in the input, even when the input was consistently ordered.
You could use a dict to accumulate the total associated with each string:
d = {}
multiDimList = [['a',1],['a',1],['a',1],['b',2],['c',3],['c',3]]
for string, value in multiDimList:
    # Retrieve the current total for this string, or 0 if it is not in the dict yet
    current_value = d.get(string, 0)
    d[string] = current_value + value
print d # {'a': 3, 'b': 2, 'c': 6}
You can then access the value for b by using d["b"].
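To get from the dict back to the list-of-lists shape asked for in the question, something like the following sketch could work. It swaps in collections.defaultdict for the d.get(...) lookup and sorts the keys so that the output order does not depend on dict ordering; both choices are assumptions of this sketch rather than part of the answer above.

```python
from collections import defaultdict

multiDimList = [['a', 1], ['a', 1], ['a', 1], ['b', 2], ['c', 3], ['c', 3]]

totals = defaultdict(int)  # missing keys start at 0
for string, value in multiDimList:
    totals[string] += value

# Build [key, total] pairs, sorted by key.
multiDimListSum = [[k, totals[k]] for k in sorted(totals)]
```

This yields [['a', 3], ['b', 2], ['c', 6]] for the sample data.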

How to count number of unique lists within list?

I've tried using Counter and itertools, but since a list is unhashable, they don't work.
My data looks like this: [ [1,2,3], [2,3,4], [1,2,3] ]
I would like to know that the list [1,2,3] appears twice, but I can't figure out how to do this. I was thinking of just converting each list to a tuple, then hashing with that. Is there a better way?
>>> from collections import Counter
>>> li=[ [1,2,3], [2,3,4], [1,2,3] ]
>>> Counter(str(e) for e in li)
Counter({'[1, 2, 3]': 2, '[2, 3, 4]': 1})
The method that you describe also works, as long as there are no nested mutables in any sublist (such as in [ [1,2,3], [2,3,4,[11,12]], [1,2,3] ]):
>>> Counter(tuple(e) for e in li)
Counter({(1, 2, 3): 2, (2, 3, 4): 1})
If you do have other unhashable types nested in the sublists, use the str or repr approach, since that handles all nested sublists as well. Or recursively convert everything to tuples (more work).
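The recursive conversion could look roughly like the sketch below. The helper name to_hashable is hypothetical, and it only handles nested lists and tuples; other unhashable types such as dicts or sets would need their own cases.

```python
from collections import Counter

def to_hashable(obj):
    """Recursively turn lists (and tuples) into tuples so the result can be hashed."""
    if isinstance(obj, (list, tuple)):
        return tuple(to_hashable(item) for item in obj)
    return obj

li = [[1, 2, 3], [2, 3, 4, [11, 12]], [1, 2, 3]]
counts = Counter(to_hashable(e) for e in li)
```

Here counts maps (1, 2, 3) to 2 and (2, 3, 4, (11, 12)) to 1.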
ll = [ [1,2,3], [2,3,4], [1,2,3] ]
print(len(set(map(tuple, ll))))
Also, if you wanted to count the occurrences of a unique* list:
print(ll.count([1,2,3]))
(*unique by value, not by reference)
I think using the Counter class on tuples, like
Counter(tuple(item) for item in li)
is optimal in terms of elegance and "Pythonicity": it is probably the shortest solution, it is perfectly clear what you want to achieve and how it is done, and it combines standard tools (and thus avoids reinventing the wheel).
The only performance drawback I can see is that every element has to be converted to a tuple (in order to be hashable), which more or less means that all elements of all sublists have to be copied once. Also, the built-in hash function for tuples may be suboptimal if you know that the list elements will, for example, always be integers.
In order to improve on performance, you would have to:
1. Implement some kind of hash algorithm that works directly on lists (more or less reimplementing the hashing of tuples, but for lists)
2. Somehow reimplement the Counter class to use this hash algorithm and provide suitable output (this class would probably use a dictionary with the hash values as keys and a combination of the "original" list and the count as values)
At least the first step would need to be done in C/C++ in order to match the speed of the built-in hash function. If you know the type of the list elements, you could probably improve the performance even further.
As for the Counter class, I do not know whether its standard implementation is in Python or in C; if the latter is the case, you will probably also have to reimplement it in C in order to achieve the same (or better) performance.
So the question "Is there a better solution" cannot be answered (as always) without knowing your specific requirements.
lists = [ [1,2,3], [2,3,4], [1,2,3] ]  # renamed from "list" to avoid shadowing the builtin
repeats = []
unique = 0
for i in lists:
    count = 0
    if i not in repeats:
        for i2 in lists:
            if i == i2:
                count += 1
    if count > 1:
        repeats.append(i)
    elif count == 1:
        unique += 1
print "Repeated Items"
for r in repeats:
    print r,
print "\nUnique items:", unique
This loops through the list to find repeated sublists, skipping any sublist already recorded in repeats, and counts the sublists that occur exactly once.
