How can I sum a column of a list? - python

I have a Python array, like so:
[[1,2,3],
[1,2,3]]
I can add the row by doing sum(array[i]), how can I sum a column, using a double for loop?
I.E. for the first column, I could get 2, then 4, then 6.

Using a for loop (in a generator expression):
data = [[1,2,3],
[1,2,3]]
column = 1
print(sum(row[column] for row in data)) # -> 4

Try this:
a = [[1,2,3],
[1,2,3]]
print([sum(x) for x in zip(*a)])  # -> [2, 4, 6]
zip function description

You don't need a loop, use zip() to transpose the list, then take the desired column:
sum(list(zip(*data))[i])
(Note in 2.x, zip() returns a list, so you don't need the list() call).
Edit: The simplest solution to this problem, without using zip(), would probably be:
column_sum = 0
for row in data:
    column_sum += row[i]
We just loop through the rows, taking the element and adding it to our total.
This is, however, less efficient and rather pointless given we have built-in functions to do this for us. In general, use zip().

[sum(row[i] for row in array) for i in range(len(array[0]))]
That should do it. len(array[0]) is the number of columns, so i iterates through those. The generator expression row[i] for row in array goes through all of the rows and selects a single column, for each column number.
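As a minimal sketch of the two approaches discussed above, here are both wrapped as functions. The names `column_sums_zip` and `column_sums_index` are made up for illustration, and the rows are assumed to have equal length.

```python
def column_sums_zip(rows):
    """Sum every column by transposing the rows with zip()."""
    return [sum(column) for column in zip(*rows)]


def column_sums_index(rows):
    """Sum every column by indexing each row; assumes a non-empty list of rows."""
    return [sum(row[i] for row in rows) for i in range(len(rows[0]))]


data = [[1, 2, 3],
        [1, 2, 3]]
print(column_sums_zip(data))    # [2, 4, 6]
print(column_sums_index(data))  # [2, 4, 6]
```

One difference worth knowing: zip() stops at the shortest row, so ragged input is silently truncated, while the indexed version raises IndexError if a later row is shorter than the first.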

I think the easiest way is this, assuming data is a NumPy array (a plain list has no sum method):
sumcolumn = data.sum(axis=0)
print(sumcolumn)

You can use zip() (the session below is Python 2; in Python 3, wrap map() in list() to see the result):
In [16]: lis=[[1,2,3],
....: [1,2,3]]
In [17]: map(sum,zip(*lis))
Out[17]: [2, 4, 6]
or with simple for loops:
In [25]: for i in xrange(len(lis[0])):
   ....:     summ=0
   ....:     for x in lis:
   ....:         summ+=x[i]
   ....:     print summ
   ....:
2
4
6

You may be interested in numpy, which has more advanced array features.
One of which is to easily sum a column:
from numpy import array
a = array([[1,2,3],
[1,2,3]])
column_idx = 1
a[:, column_idx].sum() # ":" selects every row, so this sums column 1 -> 4

You can use numpy:
import numpy as np
a = np.array([[1,2,3],[1,2,3]])
a.sum(0)

Related

Randomly remove 'x' elements from a list

I'd like to randomly remove a fraction of elements from a list without changing the order of the list.
Say I had some data and I wanted to remove 1/4 of them:
data = [1,2,3,4,5,6,7,8,9,10]
n = len(data) / 4
I'm thinking I need a loop to run through the data and delete a random element 'n' times? So something like:
for i in xrange(n):
    random = np.randint(1,len(data))
    del data[random]
My question is, is this the most 'pythonic' way of doing this? My list will be ~5000 elements long and I want to do this multiple times with different values of 'n'.
Thanks!
Sequential deleting is a bad idea since deletion in a list is O(n). Instead do something like this:
import random
def delete_rand_items(items,n):
    to_delete = set(random.sample(range(len(items)),n))
    return [x for i,x in enumerate(items) if i not in to_delete]
You can use random.sample like this:
import random
a = [1,2,3,4,5,6,7,8,9,10]
no_elements_to_delete = len(a) // 4
no_elements_to_keep = len(a) - no_elements_to_delete
b = set(random.sample(a, no_elements_to_keep)) # the `if i in b` on the next line would benefit from b being a set for large lists
b = [i for i in a if i in b] # you need this to restore the order
print(len(a)) # 10
print(b) # e.g. [1, 2, 3, 4, 5, 8, 9, 10] (varies between runs)
print(len(b)) # 8
Some notes on the above:
You are not modifying the original list in place, but you could.
You are not actually deleting elements but choosing which ones to keep, which amounts to the same thing (you just have to adjust the ratios).
The drawback is the list comprehension needed to restore the order of the elements.
As @koalo says in the comments, the above will not work properly if the elements in the original list are not unique. I could easily fix that, but then my answer would be identical to the one posted by @JohnColeman. So if that might be the case, just use his instead.
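To illustrate the point about non-unique elements: sampling positions rather than values keeps duplicates safe. The sketch below reuses the index-based function from the earlier answer; the sample data is made up for the demonstration.

```python
import random


def delete_rand_items(items, n):
    """Return a copy of items with n randomly chosen positions removed, order kept."""
    to_delete = set(random.sample(range(len(items)), n))
    return [x for i, x in enumerate(items) if i not in to_delete]


data = [7, 7, 7, 1, 2, 2, 3, 3]  # deliberately contains duplicates
result = delete_rand_items(data, len(data) // 4)
print(len(result))  # 6
```

Because the membership test is on the index and not on the value, removing one 7 never removes the other 7s.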
Is the order meaningful?
If not, you can do something like:
from random import shuffle
shuffle(data)
data = data[:len(data)-n]
I suggest using numpy indexing as in
import numpy as np
data = np.array([1,2,3,4,5,6,7,8,9,10])
n = len(data) // 4  # integer division, so n can be used as a count
indices = sorted(np.random.choice(len(data),len(data)-n,replace=False))
result = data[indices]
I think it will be more convenient this way:
import random
n = round(len(data) *0.3)
for i in range(n):
    data.pop(random.randrange(len(data)))

Enumerate list to make a new list of indices?

I'm trying to make a new list of indices by enumerated a previous list. Basically, what I want is:
To enumerate a list of elements to obtain indices for each element. I coded this:
board = ["O","O","O","O","O"]
for index,y in enumerate(board):
    print(index,end=" ")
which gives:
0 1 2 3 4
I now want to make those numbers into a new list, but have no clue how to do that.
Thanks! Sorry for the question, I'm still a beginner and am just trying to get the hang of things.
You should probably just make a range of the right length:
board = ["O","O","O","O","O"]
indices = list(range(len(board)))
print(indices)
> [0, 1, 2, 3, 4]
Use list comprehension:
indices = [index for index, y in enumerate(board)]
If board is always an object that implements the __len__ method, you can also use range:
indices = list(range(len(board)))
If you just want all the numbers you can use this:
indices = list(range(len(board)))
If you pass one number to range, it returns a range object covering the numbers from 0 up to, but not including, that number. We then turn it into a list with the list function.
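A short sketch of what range gives you in Python 3: a lazy range object that list() turns into an actual list. The variable names `lazy` and `indices` are just for illustration.

```python
board = ["O", "O", "O", "O", "O"]

lazy = range(len(board))   # a range object; no list is built yet
indices = list(lazy)       # materialize the numbers into a list

print(lazy)                            # range(0, 5)
print(indices)                         # [0, 1, 2, 3, 4]
print(3 in lazy, lazy[2], len(lazy))   # True 2 5
```

The range object supports indexing, len() and membership tests, and can be iterated more than once, so the list() call is only needed when you want a real list.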
You can use list comprehension to do that:
result = [index for index,y in enumerate(board)]
Alternatively you can use the range function:
result = list(range(len(board)))  # list() is needed in Python 3 to get an actual list
I would just use numpy arange, which creates an array that looks like the one you are looking for:
Numpy Arange
import numpy as np
enumerated = np.arange(len(board))
The straightforward way is:
board = ["O","O","O","O","O"]
newlist = []
for index,y in enumerate(board):
    newlist.append(index)
A more advanced way using list comprehensions would be:
newlist = [index for index, value in enumerate(board)]

Can I use python slicing to access one "column" of a nested tuple?

I have a nested tuple that is basically a 2D table (returned from a MySQL query). Can I use slicing to get a list or tuple of one "column" of the table?
For example:
t = ((1,2,3),(3,4,5),(1,4,5),(9,8,7))
x = 6
How do I efficiently check whether x appears in the 3rd position of any of the tuples?
All the examples of slicing I can find only operate within a single tuple. I don't want to slice a "row" out of t. I want to slice it the other way -- vertically.
Your best bet here is to use a generator expression with the any() function:
if any(row[2] == x for row in t):
    # x appears in the third position of at least one tuple; do something
    pass
As far as using slicing to just get a column, here are a couple of options:
Using zip():
>>> list(zip(*t))[2]  # in Python 2, zip(*t)[2] works directly
(3, 5, 5, 7)
Using a list comprehension:
>>> [row[2] for row in t]
[3, 5, 5, 7]
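Putting the options above together, here is a small Python 3 sketch. The variable names `third_column` and `transposed` are only for illustration.

```python
t = ((1, 2, 3), (3, 4, 5), (1, 4, 5), (9, 8, 7))

third_column = tuple(row[2] for row in t)  # one column, by indexing each row
transposed = list(zip(*t))                 # all columns at once

print(third_column)                    # (3, 5, 5, 7)
print(transposed[2])                   # (3, 5, 5, 7)
print(any(row[2] == 6 for row in t))   # False
print(any(row[2] == 5 for row in t))   # True
```

For a plain membership check, any() has the advantage of stopping at the first match without building the whole column.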
I'll chime in with the numpy solution
import numpy
t = ((1,2,3),(3,4,5),(1,4,5),(9,8,7))
x = 6
col_id = 2
a = numpy.array(t)
print(a[a[:,col_id] == x])  # prints the rows whose third element equals x (none here)

Two dimensional array in python

I want to know how to declare a two dimensional array in Python.
arr = [[]]
arr[0].append("aa1")
arr[0].append("aa2")
arr[1].append("bb1")
arr[1].append("bb2")
arr[1].append("bb3")
The first two assignments work fine. But when I try to do, arr[1].append("bb1"), I get the following error:
IndexError: list index out of range.
Am I doing anything silly in trying to declare the 2-D array?
Edit:
but I do not know the number of elements in the array (both rows and columns).
You do not "declare" arrays or anything else in python. You simply assign to a (new) variable. If you want a multidimensional array, simply add a new array as an array element.
arr = []
arr.append([])
arr[0].append('aa1')
arr[0].append('aa2')
or
arr = []
arr.append(['aa1', 'aa2'])
There aren't multidimensional arrays as such in Python, what you have is a list containing other lists.
>>> arr = [[]]
>>> len(arr)
1
What you have done is declare a list containing a single list. So arr[0] contains a list but arr[1] is not defined.
You can define a list containing two lists as follows:
arr = [[],[]]
Or to define a longer list you could use:
>>> arr = [[] for _ in range(5)]
>>> arr
[[], [], [], [], []]
What you shouldn't do is this:
arr = [[]] * 3
As this puts the same list in all three places in the container list:
>>> arr[0].append('test')
>>> arr
[['test'], ['test'], ['test']]
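The difference can be checked directly with the `is` operator. This is a small sketch; the names `shared` and `independent` are made up for the comparison.

```python
shared = [[]] * 3                      # three references to ONE inner list
independent = [[] for _ in range(3)]   # three distinct inner lists

shared[0].append("test")
independent[0].append("test")

print(shared)                            # [['test'], ['test'], ['test']]
print(independent)                       # [['test'], [], []]
print(shared[0] is shared[1])            # True
print(independent[0] is independent[1])  # False
```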
What you're using here are not arrays, but lists (of lists).
If you want multidimensional arrays in Python, you can use Numpy arrays. You'd need to know the shape in advance.
For example:
import numpy as np
arr = np.empty((3, 2), dtype=object)
arr[0, 1] = 'abc'
You are trying to append to the second element of the list, but it does not exist yet.
Create it first:
arr = [[]]
arr[0].append("aa1")
arr[0].append("aa2")
arr.append([])
arr[1].append("bb1")
arr[1].append("bb2")
arr[1].append("bb3")
We can create multidimensional array dynamically as follows,
Create 2 variables to read x and y from standard input:
print("Enter the value of x: ")
x=int(input())
print("Enter the value of y: ")
y=int(input())
Create a list of lists with x rows and y columns, with the initial values filled with 0 (or anything else), using the following code:
z=[[0 for col in range(y)] for row in range(x)]
This creates x rows of y columns each, which matches the z[i][j] indexing with i in range(x) and j in range(y).
Read data from standard input:
for i in range(x):
    for j in range(y):
        z[i][j]=input()
Display the Result:
for i in range(x):
    for j in range(y):
        print(z[i][j],end=' ')
    print("\n")
or use another way to display above dynamically created array is,
for row in z:
    print(row)
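A minimal sketch of the same idea as a reusable helper, without the console input. The name `make_grid` and its parameters are hypothetical, not part of the answer. The inner comprehension builds one row and the outer one repeats it once per row, so every row is a distinct list.

```python
def make_grid(rows, cols, fill=0):
    """Build a rows x cols list of lists; each row is a separate list object."""
    return [[fill for _ in range(cols)] for _ in range(rows)]


grid = make_grid(2, 3)
grid[1][2] = 9                   # only the last cell of the second row changes
print(len(grid), len(grid[0]))   # 2 3
for row in grid:
    print(row)
# [0, 0, 0]
# [0, 0, 9]
```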
When constructing multi-dimensional lists in Python I usually use something similar to ThiefMaster's solution, but rather than appending items to index 0, then appending items to index 1, etc., I always use index -1 which is automatically the index of the last item in the array.
i.e.
arr = []
arr.append([])
arr[-1].append("aa1")
arr[-1].append("aa2")
arr.append([])
arr[-1].append("bb1")
arr[-1].append("bb2")
arr[-1].append("bb3")
will produce the 2D-array (actually a list of lists) you're after.
You can first append elements to the initialized array and then for convenience, you can convert it into a numpy array.
import numpy as np
a = [] # start with an empty list
a.append(['aa1']) # append elements
a.append(['aa2'])
a.append(['aa3'])
print(a)
a_np = np.asarray(a) # convert to numpy array
print(a_np)
a = [[] for index in range(1, n)]  # builds n - 1 empty rows; use range(n) for n rows
For competitive programming (Python 3):
1) Reading the values of a 2D array
row = int(input())  # number of rows
main_list = []
for i in range(row):
    temp_list = list(map(int, input().split()))
    main_list.append(temp_list)
2) Displaying the 2D array
for i in range(row):
    for j in range(len(main_list[0])):
        print(main_list[i][j], end=" ")
    print()
The above method did not work for me inside a for loop, where I wanted to copy data from a 2D array to a new array under an if condition. This method works:
a_2d_list = [[1, 2], [3, 4]]
a_2d_list.append([5, 6])
print(a_2d_list)
OUTPUT - [[1, 2], [3, 4], [5, 6]]
x = 3  # rows
y = 3  # columns
a = []  # create an empty list first
for i in range(x):
    a.append([0] * y)  # append a new row of y zeros
    for j in range(y):
        a[i][j] = input("Enter the value")
In my case I had to do this:
table_body = []
for index, user in enumerate(users):
    table_body.append([])
    table_body[index].append(user.user.id)
    table_body[index].append(user.user.username)
Output:
[[1, 'john'], [2, 'bill']]

Picking out items from a python list which have specific indexes

I'm sure there's a nice way to do this in Python, but I'm pretty new to the language, so forgive me if this is an easy one!
I have a list, and I'd like to pick out certain values from that list. The values I want to pick out are the ones whose indexes in the list are specified in another list.
For example:
indexes = [2, 4, 5]
main_list = [0, 1, 9, 3, 2, 6, 1, 9, 8]
the output would be:
[9, 2, 6]
(i.e., the elements with indexes 2, 4 and 5 from main_list).
I have a feeling this should be doable using something like list comprehensions, but I can't figure it out (in particular, I can't figure out how to access the index of an item when using a list comprehension).
[main_list[x] for x in indexes]
This will return a list of the objects, using a list comprehension.
def pick_items(main_list, indexes):  # illustrative wrapper so that return is valid
    t = []
    for i in indexes:
        t.append(main_list[i])
    return t
list(map(lambda x:main_list[x],indexes))  # list() is needed in Python 3, where map is lazy
If you're good with numpy:
import numpy as np
main_array = np.array(main_list) # converting to numpy array
out_array = main_array.take([2, 4, 5])
out_list = out_array.tolist() # if you want a list specifically
I think Yuval A's solution is pretty clear and simple. But if you actually want a one-line list comprehension:
[e for i, e in enumerate(main_list) if i in indexes]
As an alternative to a list comprehension, you can use map with list.__getitem__. For large lists you should see better performance:
import random
n = 10**7
L = list(range(n))
idx = random.sample(range(n), int(n/10))
x = [L[x] for x in idx]
y = list(map(L.__getitem__, idx))
assert all(i==j for i, j in zip(x, y))
%timeit [L[x] for x in idx] # 474 ms per loop
%timeit list(map(L.__getitem__, idx)) # 417 ms per loop
For a lazy iterator, you can just use map(L.__getitem__, idx). Note in Python 2.7, map returns a list, so there is no need to pass to list.
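For comparison, here are the comprehension and map variants on the question's data, plus one further option that is not mentioned in the answers: `operator.itemgetter` from the standard library. Treat this as a sketch, not a benchmark.

```python
from operator import itemgetter

indexes = [2, 4, 5]
main_list = [0, 1, 9, 3, 2, 6, 1, 9, 8]

by_comprehension = [main_list[i] for i in indexes]
by_map = list(map(main_list.__getitem__, indexes))
# itemgetter returns a tuple only when given two or more indexes;
# with a single index it returns the bare item instead.
by_itemgetter = list(itemgetter(*indexes)(main_list))

print(by_comprehension)  # [9, 2, 6]
print(by_map)            # [9, 2, 6]
print(by_itemgetter)     # [9, 2, 6]
```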
I have noticed that there are two ways to do this job: with a loop (list comprehension) or by converting to np.array. I timed both, and on a large dataset
[main_list[x] for x in indexes] was about 3 to 5 times faster than
np.array.take().
If your code is sensitive to computation time, the highest-voted answer is a good choice.
