I want to write a function in Python that takes as arguments a matrix, a row index and a column index. For example, given the matrix m=[[1,2,3], [4,5,6]], the function will receive the arguments (m,0,0).
It should return 1 (which is the number located at position 0,0 in the matrix).
Think of it as a list of lists rather than a "matrix", and the logic becomes more obvious. Matrix m has two elements: m[0] = [1, 2, 3] and m[1] = [4, 5, 6]. So accessing a single value from within those lists requires another index. For example, m[0][1] = 2.
def matrix(m, a, b):
    return m[a][b] # element b from list a in list m
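To check the list-of-lists indexing described above, here is a minimal sketch. The name get_element and the parameter names row and col are only illustrative choices, not part of the question.

```python
def get_element(m, row, col):
    """Return the value at (row, col) in a list of lists."""
    return m[row][col]

m = [[1, 2, 3], [4, 5, 6]]
print(get_element(m, 0, 0))  # 1
print(get_element(m, 1, 2))  # 6
```

An index outside the nested lists raises IndexError, just as plain list indexing does.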
If you really want to use a matrix, consider numpy.
a=[1,2,3]
b=[3,4,5,2]
c=[60,70,80]
sum(zip(a,b,c),())
What's the logic of the sum function here? Why does it return a single tuple? In particular, why doesn't the following work?
sum(zip(a,b,c))
The sum() function simply adds items together with "+", starting from an initial value. The zip() function, in turn, groups the items at the same position into tuples. Explicitly:
list(zip(a,b,c)) # [(1, 3, 60), (2, 4, 70), (3, 5, 80)]
sum([1,2,3],0) # 0 + 1 + 2 + 3
sum(zip(a,b,c),()) # () + (1,3,60) + (2,4,70) + (3,5,80)
Hope this helps explain the sum() and zip() functions. It can be tricky to see what zip() is doing, since it produces an iterator rather than a finished result. If you want to see what zip() does, wrap it in list().
sum(zip(a,b,c)) fails because the default initial value is 0. Hence, Python tries to do 0 + (1,3,60) + ..., which fails because 0 cannot be added to a tuple.
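The behaviour described above can be reproduced with the lists from the question. This is only a small sketch; the variable names flat and error_message are illustrative.

```python
a = [1, 2, 3]
b = [3, 4, 5, 2]   # one element longer; zip() stops at the shortest input
c = [60, 70, 80]

# With an empty tuple as the start value, "+" concatenates the tuples.
flat = sum(zip(a, b, c), ())
print(flat)  # (1, 3, 60, 2, 4, 70, 3, 5, 80)

# With the default start value 0, Python attempts 0 + (1, 3, 60) and fails.
try:
    sum(zip(a, b, c))
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```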
The other answers are useful in resolving any confusion, but perhaps the result you might be looking for is achieved by doing this:
sum(a+b+c)
because the + operator, when applied to lists, concatenates them into a single list, whereas zip does not.
zip() does not do what you think it does. sum() will add the items of its input and return the result. In your case, you want to sum numbers from 3 lists. zip() returns tuples containing elements of the same index from the inputs, and when the result of this is passed to sum, it concatenates the tuples, leaving you with your undesired result. The fix is to use itertools.chain to combine the lists, then use sum to sum the numbers in those lists.
To show exactly how zip() works, an example should be useful:
a = ["a", "b", "c"]
b = [1, 2, 3]
list(zip(a, b)) # [('a', 1), ('b', 2), ('c', 3)]
zip returns an iterator of tuples (converted to a list here), each containing the elements of the inputs at the same index, i.e., list(zip(a, b))[index] == (a[index], b[index])
What you want is this:
sum(itertools.chain(a, b, c))
EDIT: Make sure to import itertools first.
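Put together with the three numeric lists from the question, a self-contained sketch might look like this. The names total and same_total are illustrative.

```python
import itertools

a = [1, 2, 3]
b = [3, 4, 5, 2]
c = [60, 70, 80]

# chain() walks the lists one after another without building a new list.
total = sum(itertools.chain(a, b, c))
print(total)  # 230

# Concatenating with "+" gives the same sum, at the cost of a temporary list.
same_total = sum(a + b + c)
```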
I have a function that returns many output arrays of varying size.
arr1,arr2,arr3,arr4,arr5, ... = func(data)
I want to run this function many times over a time series of data, and combine each output variable into one array that covers the whole time series.
To elaborate: If the output arr1 has dimensions (x,y) when the function is called, I want to run the function 't' times and end up with an array that has dimensions (x,y,t). A list of 't' arrays with size (x,y) would also be acceptable, but not preferred.
Again, the output arrays do not all have the same dimensions, or even the same number of dimensions. arr2 might have size (x2,y2), while arr3 might be only a vector of length x3. I do not know the sizes of all of these arrays beforehand.
My current solution is something like this:
arr1 = []
arr2 = []
arr3 = []
...
for t in range(t_max):
    arr1_t, arr2_t, arr3_t, ... = func(data[t])
    arr1.append(arr1_t)
    arr2.append(arr2_t)
    arr3.append(arr3_t)
    ...
and so on. However, this looks inelegant when repeated for each of the 27 output arrays.
Is there a better way to do this?
You can just make arr1, arr2, etc. a list of lists (of vectors or matrices or whatever). Then use a loop to iterate the results obtained from func and add them to the individual lists.
arrN = [[] for _ in range(N)] # N being number of results from func
for t in range(t_max):
    results = func(data[t])
    for i, res in enumerate(results):
        arrN[i].append(res)
The elements in the different sub-lists do not have to have the same dimensions.
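As a runnable illustration of this approach, here is a sketch with a made-up func. The function body, the data values and the names collected and n_outputs are assumptions for demonstration only; the real func and data come from the question.

```python
def func(x):
    """Hypothetical stand-in: returns three outputs of different shapes."""
    return [[x, x], [x, x]], [x, x, x], x * 10

data = [1, 2, 3]   # stand-in for the time series
n_outputs = 3      # number of values func returns

# One list per output position; each collects that output across all calls.
collected = [[] for _ in range(n_outputs)]
for sample in data:
    for i, res in enumerate(func(sample)):
        collected[i].append(res)

print(collected[1])  # [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(collected[2])  # [10, 20, 30]
```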
Not sure if it counts as "elegant", but you can build a list of the result tuples then use zip to group them into tuples by return position instead of by call number, then optionally map to convert those tuples to the final data type. For example, with numpy array:
from future_builtins import map, zip # Only on Python 2, to minimize temporaries
import numpy as np
def func(x):
    'Dumb function to return tuple of powers of x from 1 to 27'
    return tuple(x ** i for i in range(1, 28))
# Example inputs for func
data = [np.array([[x]*10]*10, dtype=np.uint8) for x in range(10)]
# Output is generator of results for each call to func
outputs = map(func, data)
# Pass each complete result of func as a positional argument to zip via star
# unpacking to regroup, so the first return from each func call is the first
# group, then the second return the second group, etc.
positional_groups = zip(*outputs)
# Convert regrouped data (`tuple`s of 2D results) to numpy 3D result type, unpack to names
arr1,arr2,arr3,arr4,arr5, ...,arr27 = map(np.array, positional_groups)
If the elements returned from func at a given position might have inconsistent dimensions (e.g. one call might return 10x10 as the first return, and another 5x5), you'd skip the final map step (since the array wouldn't have consistent dimensions) and just replace the second-to-last step with:
arr1,arr2,arr3,arr4,arr5, ...,arr27 = zip(*outputs)
making arr# a tuple of 2D arrays, or if they need to be mutable:
arr1,arr2,arr3,arr4,arr5, ...,arr27 = map(list, zip(*outputs))
to make them lists of 2D arrays.
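The regrouping step is easier to see without numpy. The following sketch uses a made-up three-value func (an assumption, not the function from the question) to show how zip(*outputs) turns per-call results into per-position groups.

```python
def func(x):
    """Hypothetical stand-in returning three values per call."""
    return x, x * x, [x, x]

outputs = [func(x) for x in range(4)]    # one result tuple per call
firsts, squares, pairs = zip(*outputs)   # regroup by return position

print(firsts)   # (0, 1, 2, 3)
print(squares)  # (0, 1, 4, 9)

# Mutable variant: each group becomes a list instead of a tuple.
as_lists = list(map(list, zip(*outputs)))
```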
This answer gives a solution using structured arrays. It has the following requirement: given a function f that returns N arrays, the returned arrays may differ in size from one another, but for all results of f, len(array_i) must always be the same. E.g.
arrs_a = f("a")
arrs_b = f("b")
for sub_arr_a, sub_arr_b in zip(arrs_a, arrs_b):
    assert len(sub_arr_a) == len(sub_arr_b)
If the above is true, then you can use structured arrays. A structured array is like a normal array, just with a complex data type. For instance, I could specify a data type that is made up of one array of ints of shape 5, and a second array of floats of shape (2, 2). eg.
# define what a record looks like
dtype = [
    # tuples of (field_name, data_type)
    ("a", "5i4"), # array of five 4-byte ints
    ("b", "(2,2)f8"), # 2x2 array of 8-byte floats
]
Using dtype you can create a structured array, and set all the results on the structured array in one go.
import numpy as np
def func(n):
    "mock implementation of func"
    return (
        np.ones(5) * n,
        np.ones((2,2))* n
    )
# define what a record looks like
dtype = [
    # tuples of (field_name, data_type)
    ("a", "5i4"), # array of five 4-byte ints
    ("b", "(2,2)f8"), # 2x2 array of 8-byte floats
]
size = 5
# create array
arr = np.empty(size, dtype=dtype)
# fill in values
for i in range(size):
    # func must return a tuple
    # or you must convert the returned value to a tuple
    arr[i] = func(i)
# alternate way of instantiating arr
arr = np.fromiter((func(i) for i in range(size)), dtype=dtype, count=size)
# How to use structured arrays
# access individual record
print(arr[1]) # prints ([1, 1, 1, 1, 1], [[1, 1], [1, 1]])
# access specific value -- get second record -> get b field -> get value at 0,0
assert arr[2]['b'][0,0] == 2
# access all values of a specific field
print(arr['a']) # prints all the a arrays
I'm new to Python and have a list of numbers. e.g.
5,10,32,35,64,76,23,53....
and I've grouped them into fours (5,10,32,35, 64,76,23,53 etc..) using the code from this post.
def group_iter(iterator, n=2, strict=False):
    """ Transforms a sequence of values into a sequence of n-tuples.
    e.g. [1, 2, 3, 4, ...] => [(1, 2), (3, 4), ...] (when n == 2)
    If strict, then it will raise ValueError if there is a group of fewer
    than n items at the end of the sequence. """
    accumulator = []
    for item in iterator:
        accumulator.append(item)
        if len(accumulator) == n: # tested as fast as separate counter
            yield tuple(accumulator)
            accumulator = [] # tested faster than accumulator[:] = []
            # and tested as fast as re-using one list object
    if strict and len(accumulator) != 0:
        raise ValueError("Leftover values")
How can I access the individual arrays so that I can perform functions on them? For example, I'd like to get the average of the first values of every group (e.g. 5 and 64 in my example numbers).
Let's say you have the following tuple of tuples:
a=((5,10,32,35), (64,76,23,53))
To access the first element of each tuple, use a for-loop:
for i in a:
    print(i[0])
To calculate average for the first values:
elements=[i[0] for i in a]
avg=sum(elements)/float(len(elements))
Ok, this is yielding a tuple of four numbers each time it's iterated. So, convert the whole thing to a list:
L = list(group_iter(your_list, n=4))
Then you'll have a list of tuples:
>>> L
[(5, 10, 32, 35), (64, 76, 23, 53), ...]
You can get the first item in each tuple this way:
firsts = [tup[0] for tup in L]
(There are other ways, of course.)
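For a complete run on the numbers from the question, here is a minimal sketch. The chunks helper is a hypothetical, simplified stand-in for group_iter (it slices a list instead of consuming an iterator), used only to keep the example self-contained.

```python
def chunks(values, n):
    """Hypothetical helper: split a list into consecutive n-tuples."""
    return [tuple(values[i:i + n]) for i in range(0, len(values), n)]

numbers = [5, 10, 32, 35, 64, 76, 23, 53]
groups = chunks(numbers, 4)

# First value of every group, then their average.
firsts = [group[0] for group in groups]
average = sum(firsts) / len(firsts)

print(groups)   # [(5, 10, 32, 35), (64, 76, 23, 53)]
print(average)  # 34.5
```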
You've created a tuple of tuples, or a list of tuples, or a list of lists, or a tuple of lists, or whatever...
You can access any element of any nested list directly:
toplist[x][y] # yields the yth element of the xth nested list
You can also access the nested structures by iterating over the top structure:
for sublist in lists:
    print(sublist[y])
Might be overkill for your application but you should check out my library, pandas. Stuff like this is pretty simple with the GroupBy functionality:
http://pandas.sourceforge.net/groupby.html
To do the 4-at-a-time thing you would need to compute a bucketing array:
import numpy as np
bucket_size = 4
n = len(your_list)
buckets = np.arange(n) // bucket_size
Then, assuming data is a pandas Series or DataFrame holding your values, it's as simple as:
data.groupby(buckets).mean()
I have a series of Python tuples representing coordinates:
tuples = [(1,1), (0,1), (1,0), (0,0), (2,1)]
I want to create the following list:
l = []
for t in tuples:
    l[ t[0] ][ t[1] ] = something
I get an IndexError: list index out of range.
My background is in PHP and I expected that in Python you can create lists that start with index > 0, i.e. make gaps and then fill them up, but it seems you can't.
The idea is to have the lists sorted afterwards. I know I can do this with a dictionary, but as far as I know dictionaries cannot be sorted by keys.
Update: I now know they can - see the accepted solution.
Edit:
What I want to do is to create a 2D array that will represent the matrix described with the tuple coordinates, then iterate it in order.
If I use a dictionary, I have no guarantee that iterating over the keys will be in order -> (0,0) (0,1) (0,2) (1,0) (1,1) (1,2) (2,0) (2,1) (2,2)
Can anyone help?
No, you cannot create a list with gaps. But you can create a dictionary with tuple keys:
tuples = [(1,1), (0,1), (1,0), (0,0), (2,1)]
l = {}
for t in tuples:
    l[t] = something
Update:
Try using NumPy; it provides a wide range of operations over matrices and arrays. A quote from the free PDF on NumPy available on the site (3.4.3 Flat Iterator indexing): "As mentioned previously, X.flat returns an iterator that will iterate over the entire array (in C-contiguous style with the last index varying the fastest)". It looks like what you need.
You should look at dicts for something like that.
for t in tuples:
    if t[0] not in l:
        l[t[0]] = {}
    l[t[0]][t[1]] = something
Iterating over the dict is a bit different than iterating over a list, though. You'll have the keys(), values() and items() functions to help with that.
EDIT: try something like this for ordering:
for x in sorted(l.keys()):
    for y in sorted(l[x].keys()):
        print(l[x][y])
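A self-contained version of the nested-dict idea with ordered iteration could look like the sketch below. It uses dict.setdefault in place of the explicit membership check, and the stored strings and the names grid and visited are placeholders chosen for illustration.

```python
tuples = [(1, 1), (0, 1), (1, 0), (0, 0), (2, 1)]

grid = {}
for x, y in tuples:
    # setdefault creates the inner dict the first time a row index appears.
    grid.setdefault(x, {})[y] = "value at %d,%d" % (x, y)  # placeholder payload

# Visit the cells in row-major order by sorting the keys at each level.
visited = []
for x in sorted(grid):
    for y in sorted(grid[x]):
        visited.append((x, y))

print(visited)  # [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1)]
```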
You create a one-dimensional list l and want to use it as a two-dimensional list.
That's why you get an index error.
You have the following options:
create a map and use the tuple t as index:
l = {}
l[t] = something
and you will get entries in l as:
{(1, 1): something}
if you want a traditional array structure I'll advise you to look at numpy. With numpy you get n-dimensional arrays with "traditional" indexing.
As I mentioned, use numpy.
With numpy you can create a 2-dimensional array, filled with zeros or ones or ...
Then you can set any desired value by indexing with [x,y].
Of course you can iterate over rows and columns, or over the whole array as a list.
If you know the size you need beforehand, you can make a list of lists like this:
>>> x = 3
>>> y = 3
>>> l = [[None] * x for i in range(y)]
>>> l
[[None, None, None], [None, None, None], [None, None, None]]
Which you can then iterate like you originally suggested.
What do you mean exactly by "but as far as I know dictionaries cannot be sorted by keys"?
While this is not strictly the same as a "sorted dictionary", you can easily turn a dictionary into a list, sorted by the key, which seems to be what you're after:
>>> tuples = [(1,1), (0,1), (1,0), (0,0), (2,1)]
>>> l = {}
>>> for t in tuples:
...     l[t] = "something"
...
>>> sorted(l) # equivalent to sorted(l.keys())
[(0, 0), (0, 1), (1, 0), (1, 1), (2, 1)]
>>> sorted(l.items()) # make a list of (key, value) tuples, and sort by key
[((0, 0), 'something'), ((0, 1), 'something'), ((1, 0), 'something'), ((1, 1), 'something'), ((2, 1), 'something')]
(I turned something into the string "something" just to make the code work)
To make use of this for your case, however (if I understand it correctly, that is), you would still need to fill the dictionary with None values or something similar for every "empty" coordinate tuple.
Extending Nathan's answer:
tuples = [(1,1), (0,1), (1,0), (0,0), (2,1)]
x = max(tuples, key = lambda z : z[0])[0] + 1
y = max(tuples, key = lambda z : z[1])[1] + 1
l = [[None] * y for i in range(x)]
And then you can do whatever you want
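For instance, the pre-sized list of lists can then be filled by coordinate without an IndexError. This is a small sketch; the string "filled" and the names grid, n_rows and n_cols are placeholders, not part of the original answer.

```python
tuples = [(1, 1), (0, 1), (1, 0), (0, 0), (2, 1)]

# Size each dimension from the largest coordinate that occurs.
n_rows = max(t[0] for t in tuples) + 1
n_cols = max(t[1] for t in tuples) + 1
grid = [[None] * n_cols for _ in range(n_rows)]

for x, y in tuples:
    grid[x][y] = "filled"   # placeholder for `something`

print(grid)
# [['filled', 'filled'], ['filled', 'filled'], [None, 'filled']]
```

Cells that never appear in tuples keep their None default, which marks the gaps.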
As mentioned earlier, you can't make lists with gaps, and dictionaries may be the better choice here. The trick is to make sure that l[t[0]] exists when you put something in position t[1]. For this, I'd use a defaultdict.
import collections
tuples = [(1,1), (0,1), (1,0), (0,0), (2,1)]
l = collections.defaultdict(dict)
for t in tuples:
    l[t[0]][t[1]] = something
Since l is a defaultdict, if l[t[0]] doesn't exist, it will create an empty dict for you to put your something in at position t[1].
Note: this ends up being the same as @unwesen's answer, without the minor tedium of hand-checking for existence of the inner dict. Chalk it up to concurrent answering.
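A runnable variant of the defaultdict approach, with a placeholder value standing in for something, might look like this. The name grid and the stored tuples are illustrative assumptions.

```python
import collections

tuples = [(1, 1), (0, 1), (1, 0), (0, 0), (2, 1)]

# Missing row keys are created on first access as empty dicts.
grid = collections.defaultdict(dict)
for x, y in tuples:
    grid[x][y] = (x, y)   # placeholder value

print(sorted(grid))   # [0, 1, 2]
print(grid[2])        # {1: (2, 1)}
```

One thing to keep in mind: merely reading grid[5] on a defaultdict also creates an empty entry for key 5, so use "5 in grid" when you only want to test for existence.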
The dict solutions given are probably best for most purposes. For your issue of iterating over the keys in order, generally you would instead iterate over the coordinate space, not the dict keys, exactly the same way you would have for your list of lists. Use .get and you can specify the default value to use for the blank cells, or alternatively use "collections.defaultdict" to define a default at dict creation time. eg.
for y in range(10):
    for x in range(10):
        value = mydict.get((x,y), some_default_value)
        # or just "value = mydict[x,y]" if used defaultdict
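Here is a small sketch of iterating over the coordinate space with .get. The dictionary contents, the 3x2 extent and the "." default are made-up values for illustration.

```python
cells = {(0, 0): "a", (1, 0): "b", (2, 1): "c"}

rows = []
for y in range(2):
    # Blank cells fall back to the default passed to .get().
    row = [cells.get((x, y), ".") for x in range(3)]
    rows.append(row)

print(rows)  # [['a', 'b', '.'], ['.', '.', 'c']]
```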
If you do need an actual list of lists, you can construct it directly as below:
max_x, max_y = map(max, zip(*tuples))
l=[[something if (x,y) in tuples else 0 for y in range(max_y+1)]
   for x in range(max_x+1)]
If the list of tuples is likely to be long, then for performance reasons you may want to use a set for the lookup, as "(x,y) in tuples" performs a scan of the list rather than a fast lookup by hash. I.e., change the second line to:
tuple_set = set(tuples)
l=[[something if (x,y) in tuple_set else 0 for y in range(max_y+1)]
   for x in range(max_x+1)]
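With the coordinates from the question and 1 standing in for something (an assumption made only so the result is printable), the set-based construction can be run as follows.

```python
tuples = [(1, 1), (0, 1), (1, 0), (0, 0), (2, 1)]

# Largest x and largest y across all coordinates.
max_x, max_y = map(max, zip(*tuples))
tuple_set = set(tuples)   # hash-based membership test

grid = [[1 if (x, y) in tuple_set else 0 for y in range(max_y + 1)]
        for x in range(max_x + 1)]

print(grid)  # [[1, 1], [1, 1], [0, 1]]
```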
I think you have only declared a one dimensional list.
I think you should declare it as
l = [][]
Edit: That's a syntax error
>>> l = [][]
  File "<stdin>", line 1
    l = [][]
          ^
SyntaxError: invalid syntax
>>>