Populate a list in python - python

I have a series of Python tuples representing coordinates:
tuples = [(1,1), (0,1), (1,0), (0,0), (2,1)]
I want to create the following list:
l = []
for t in tuples:
    l[t[0]][t[1]] = something
I get an IndexError: list index out of range.
My background is in PHP and I expected that in Python you can create lists that start with index > 0, i.e. make gaps and then fill them up, but it seems you can't.
The idea is to have the lists sorted afterwards. I know I can do this with a dictionary, but as far as I know dictionaries cannot be sorted by keys.
Update: I now know they can - see the accepted solution.
Edit:
What I want to do is to create a 2D array that will represent the matrix described with the tuple coordinates, then iterate it in order.
If I use a dictionary, I have no guarantee that iterating over the keys will be in order: (0,0) (0,1) (0,2) (1,0) (1,1) (1,2) (2,0) (2,1) (2,2)
Can anyone help?

No, you cannot create a list with gaps. But you can create a dictionary with tuple keys:
tuples = [(1,1), (0,1), (1,0), (0,0), (2,1)]
l = {}
for t in tuples:
    l[t] = something
Update:
Try using NumPy; it provides a wide range of operations on matrices and arrays. To quote the free PDF on NumPy available on the site (3.4.3 Flat Iterator indexing): "As mentioned previously, X.flat returns an iterator that will iterate over the entire array (in C-contiguous style with the last index varying the fastest)." That looks like what you need.

You should look at dicts for something like that.
l = {}
for t in tuples:
    if t[0] not in l:  # dict.has_key() was removed in Python 3
        l[t[0]] = {}
    l[t[0]][t[1]] = something
Iterating over the dict is a bit different than iterating over a list, though. You'll have the keys(), values() and items() functions to help with that.
EDIT: try something like this for ordering:
for x in sorted(l.keys()):
    for y in sorted(l[x].keys()):
        print(l[x][y])

You create a one-dimensional list l and want to use it as a two-dimensional list.
That's why you get an index error.
You have the following options:
create a map and use the tuple t as index:
l = {}
l[t] = something
and you will get entries in l as:
{(1, 1): something}
If you want a traditional array structure, I'd advise you to look at NumPy. With NumPy you get n-dimensional arrays with "traditional" indexing.
You can create a two-dimensional array filled with zeros, ones, or another value.
Then you can assign any desired value with [x, y] indexing.
Of course, you can iterate over rows and columns, or over the whole array as a flat sequence.

If you know the size beforehand, you can make a list of lists like this:
>>> x = 3
>>> y = 3
>>> l = [[None] * x for i in range(y)]
>>> l
[[None, None, None], [None, None, None], [None, None, None]]
Which you can then iterate like you originally suggested.

What do you mean exactly by "but as far as I know dictionaries cannot be sorted by keys"?
While this is not strictly the same as a "sorted dictionary", you can easily turn a dictionary into a list, sorted by the key, which seems to be what you're after:
>>> tuples = [(1,1), (0,1), (1,0), (0,0), (2,1)]
>>> l = {}
>>> for t in tuples:
...     l[t] = "something"
...
>>> sorted(l) # equivalent to sorted(l.keys())
[(0, 0), (0, 1), (1, 0), (1, 1), (2, 1)]
>>> sorted(l.items()) # make a list of (key, value) tuples, and sort by key
[((0, 0), 'something'), ((0, 1), 'something'), ((1, 0), 'something'), ((1, 1), 'something'), ((2, 1), 'something')]
(I turned something into the string "something" just to make the code work)
To make use of this for your case, however (if I understand it correctly), you would still need to fill the dictionary with None values or similar for every "empty" coordinate tuple.
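As a rough sketch of that last point, the gaps do not have to be stored at all: one option is to walk the full coordinate space and read missing cells with a default. The names `cells` and `ordered` are made up for this illustration, and the grid extent is assumed to be given by the largest coordinates present.

```python
from itertools import product

tuples = [(1, 1), (0, 1), (1, 0), (0, 0), (2, 1)]

# Map each known coordinate to a value ("something" is a placeholder).
cells = {t: "something" for t in tuples}

# Assume the grid extends to the largest coordinate seen on each axis.
max_x = max(x for x, _ in tuples)
max_y = max(y for _, y in tuples)

# Visit every coordinate in row-major order; missing cells read as None.
ordered = [((x, y), cells.get((x, y)))
           for x, y in product(range(max_x + 1), range(max_y + 1))]

for coord, value in ordered:
    print(coord, value)
```

Here (2, 0) is absent from the input, so it shows up with the value None while the dictionary itself stays sparse.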

Extending Nathan's answer,
tuples = [(1,1), (0,1), (1,0), (0,0), (2,1)]
x = max(tuples, key = lambda z : z[0])[0] + 1
y = max(tuples, key = lambda z : z[1])[1] + 1
l = [[None] * y for i in range(x)]
And then you can do whatever you want
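For instance, a minimal sketch of filling such a pre-sized grid and then reading it back in order might look like the following. The names `rows`, `cols`, and `grid` are chosen for this example only, and the string "something" stands in for the real value.

```python
tuples = [(1, 1), (0, 1), (1, 0), (0, 0), (2, 1)]

# Size the grid from the largest coordinate on each axis.
rows = max(t[0] for t in tuples) + 1
cols = max(t[1] for t in tuples) + 1

# Build each row separately so the rows do not share one list object.
grid = [[None] * cols for _ in range(rows)]

for r, c in tuples:
    grid[r][c] = "something"  # placeholder value

# Row-major iteration visits (0, 0), (0, 1), (1, 0), ... in order.
for r, row in enumerate(grid):
    for c, value in enumerate(row):
        print((r, c), value)
```

The cell at (2, 0) was never assigned, so it keeps the initial None.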

As mentioned earlier, you can't make lists with gaps, and dictionaries may be the better choice here. The trick is to make sure that l[t[0]] exists when you put something in position t[1]. For this, I'd use a defaultdict.
import collections
tuples = [(1,1), (0,1), (1,0), (0,0), (2,1)]
l = collections.defaultdict(dict)
for t in tuples:
    l[t[0]][t[1]] = something
Since l is a defaultdict, if l[t[0]] doesn't exist, it will create an empty dict for you to put your something in at position t[1].
Note: this ends up being the same as @unwesen's answer, without the minor tedium of hand-checking for existence of the inner dict. Chalk it up to concurrent answering.
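To get the ordered traversal the question asks for, the nested dictionary can be walked with sorted keys at both levels. This is only a sketch: `nested` and `visited` are names invented here, and "something" is a placeholder value.

```python
import collections

tuples = [(1, 1), (0, 1), (1, 0), (0, 0), (2, 1)]

# Inner dicts are created on first access, so no existence check is needed.
nested = collections.defaultdict(dict)
for x, y in tuples:
    nested[x][y] = "something"

# Sort the outer keys, then the inner keys, to visit cells in order.
visited = []
for x in sorted(nested):
    for y in sorted(nested[x]):
        visited.append((x, y))

print(visited)
```

Only coordinates that were actually stored are visited; gaps such as (2, 0) are skipped rather than reported as empty.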

The dict solutions given are probably best for most purposes. For your issue of iterating over the keys in order, you would generally iterate over the coordinate space instead of the dict keys, exactly the same way you would for your list of lists. Use .get to specify the default value for the blank cells, or alternatively use collections.defaultdict to define a default at dict creation time, e.g.:
for y in range(10):
    for x in range(10):
        value = mydict.get((x,y), some_default_value)
        # or just "value = mydict[x,y]" if you used a defaultdict
If you do need an actual list of lists, you can construct it directly as below:
max_x, max_y = map(max, zip(*tuples))
l = [[something if (x,y) in tuples else 0 for y in range(max_y+1)]
     for x in range(max_x+1)]
If the list of tuples is likely to be long, then for performance reasons you may want to use a set for the lookup, as "(x,y) in tuples" performs a scan of the list rather than a fast lookup by hash. I.e., change the second line to:
tuple_set = set(tuples)
l = [[something if (x,y) in tuple_set else 0 for y in range(max_y+1)]
     for x in range(max_x+1)]

I think you have only declared a one dimensional list.
I think you declare it as
l = [][]
Edit: That's a syntax error
>>> l = [][]
File "<stdin>", line 1
l = [][]
^
SyntaxError: invalid syntax
>>>

Related

Python Comprehensions

I've got a list of tuples, each tuple looks like (i,x).
i = index
x = value
I need to return a new list (using a comprehension only) that each value will be in the "right" index. If index is missing, we'll put the value -1000 to fill the gap.
For example:
Input: [(4,9), (0,2), (1,4), (3,2)]
Output should be: [2, 4, -1000, 2, 9]
I was trying to use index function, I'm trying to get the index of the tuple (1,2), while I "know" only the first element, the second can be anything.
I want to get the index of the tuple (1,2) by search (1,___), is that possible?
___ is a positive integer
return [sorted(L)[sorted(L).index((i,))][1] if i in [sorted(L)[j][0] for j in range(0,len(L))] else -1000 for i in range(sorted(L)[len(L)-1][0]+1)]
I can use list/dict/set comprehension, single line.
Thank you all for help!
Build a dictionary that maps indices to values, so we can easily and efficiently look up the value for an index:
[g(i, -1000) for g in [dict(L).get] for i in range(max(L)[0] + 1)]
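Unrolled for readability, the same idea might be sketched as below. The function name `place_by_index` and its `fill` parameter are invented for this illustration, and the helper is not a single expression, so it does not satisfy the question's "comprehension only" constraint by itself.

```python
def place_by_index(pairs, fill=-1000):
    """Return a list where each value sits at its index; gaps get `fill`."""
    lookup = dict(pairs)  # index -> value
    # max(lookup) is the largest index, so the result has max + 1 slots.
    return [lookup.get(i, fill) for i in range(max(lookup) + 1)]


result = place_by_index([(4, 9), (0, 2), (1, 4), (3, 2)])
print(result)  # [2, 4, -1000, 2, 9]
```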

Why does a loop or list comprehension work to initialize an array, but individually initializing the elements does not?

Why does initializing the array arr work when it is done as a list comprehension (I think that is what the following example is --not sure), but not when each array location is initialized individually?
For example, this works:
(a)
arr=[]
arr=[0 for i in range(5)]
but (b),
arr=[]
arr[0]=0
arr[1]=0
etc, doesn't.
Isn't the arr=[0 for i in range(5)] instruction essentially doing what is done in (b) above in one fell swoop?
I realize that array sizes need to be predefined (or allocated). So, I can understand something like
arr= [0]*5
or using numpy,
arr = np.empty(10, dtype=object)
work.
However, I don't see how (a) preallocates the array dimension "ahead of time". How does python interpret (a) vs. (b) above?
Firstly, there is no point in declaring a variable if you rebind it later anyway:
arr = [] # <-- this line is entirely pointless
arr = [0 for i in range(5)]
Secondly, the two expressions
[0 for i in range(5)]
[0] * 5
create a new list object, whereas
arr[0] = 0
mutates an existing one, namely it wants to reassign the first element of arr. Since this doesn't exist, you will see an error. You could do instead:
arr = []
arr.append(0)
arr.append(0)
to fill an initially empty list incrementally.
Note that a Python list is not an Array in, let's say, the Java sense that it has a predefined size. It is more like an ArrayList.
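A small sketch of the difference, assuming nothing beyond the built-in list type: assigning to an index that does not exist yet raises IndexError, while append grows the list, after which index assignment works.

```python
arr = []

# Index assignment only replaces an existing element; it never grows the list.
try:
    arr[0] = 0
except IndexError as exc:
    error_message = str(exc)
    print("IndexError:", error_message)

# append() grows the list by one element each time.
arr.append(0)
arr.append(0)

# Now index 0 exists, so assignment succeeds.
arr[0] = 7
print(arr)  # [7, 0]
```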
It doesn't pre-allocate. It's basically just appending in a loop, in a nicer form (syntactic sugar).
Why doesn't it pre-allocate? Because to pre-allocate, we would need to know the length of the iterable, which may be a generator, and finding the length would use it up. Also, a comprehension can have an if clause, limiting what eventually gets into the list. (See also generator expressions, which create generators: no pre-allocation because they are lazily evaluated.)
Let's take a look at documentation:
https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions
A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses. The result will be a new list resulting from evaluating the expression in the context of the for and if clauses which follow it. For example, this listcomp combines the elements of two lists if they are not equal:
>>> [(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
and it’s equivalent to:
>>> combs = []
>>> for x in [1,2,3]:
...     for y in [3,1,4]:
...         if x != y:
...             combs.append((x, y))
...
>>> combs
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
See? Equivalent to append, not to pre-allocated (n*[0]) list.
I don't see how (a) preallocates the array dimension "ahead of time".
It doesn't. This:
arr=[]
arr=[0 for i in range(5)]
creates an empty array (I think it's more accurately called a list but I'm not a strong Python person) and stores it in arr, then creates an entirely new unrelated array and puts the new one in arr, throwing the old array away. It doesn't initialize the array you created with arr=[]. You can remove the first line (arr=[]) entirely; it doesn't do anything useful.
You can see that they're different arrays like this:
# Create a blank array, store it in `a`
a = []
# Store that same array in `b`
b = a
# Show that they're the same array (`is` tests object identity)
print(a is b)
# Create a new array and put it in `a`
a = [0 for i in range(5)]
# Show that they aren't the same array
print(a is b)
The output is
True
False
So just use arr=[0 for i in range(5)] or, if you want to do it separately, use append:
a = []
a.append(0)
a.append(0)
print(a)
which outputs [0, 0].

Fast, pythonic way to get all tuples obtained by dropping the elements from a given tuple?

Given a tuple T that contains all different integers, I want to get all the tuples that result from dropping individual integers from T. I came up with the following code:
def drop(T):
    S = set(T)
    for i in S:
        yield tuple(S.difference({i}))

for t in drop((1,2,3)):
    print(t)
# (2,3)
# (1,3)
# (1,2)
I'm not unhappy with this, but I wonder if there is a better/faster way because with large tuples, difference() needs to look for the item in the set, but I already know that I'll be removing items sequentially. However, this code is only 2x faster:
def drop(T):
    for i in range(len(T)):
        yield T[:i] + T[i+1:]
and in any case, neither scales linearly with the size of T.
Instead of looking at it as "remove one item each time", you can look at it as "use all but one", and then with itertools it becomes straightforward:
from itertools import combinations
T = (1, 2, 3, 4)
for t in combinations(T, len(T)-1):
    print(t)
Which gives:
(1, 2, 3)
(1, 2, 4)
(1, 3, 4)
(2, 3, 4)
* Assuming the order doesn't really matter
From your description, you're looking for combinations of the elements of T. With itertools.combinations, you can ask for all r-length tuples, in sorted order, without repeated elements. For example:
import itertools
T = [1,2,3]
for i in itertools.combinations(T, len(T) - 1):
    print(i)
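As a quick check that the two approaches agree, the sketch below compares the slicing generator from the question (renamed `drop_by_slicing` here for clarity) with itertools.combinations. For a tuple of distinct items they produce the same tuples, only in opposite order.

```python
from itertools import combinations


def drop_by_slicing(T):
    """Yield T with one element removed, for each position in turn."""
    for i in range(len(T)):
        yield T[:i] + T[i + 1:]


T = (1, 2, 3, 4)
by_slicing = list(drop_by_slicing(T))
by_combinations = list(combinations(T, len(T) - 1))

print(by_slicing)
print(by_combinations)
# Slicing drops the first element first; combinations drops the last first.
print(by_slicing == by_combinations[::-1])
```

Either way, the total output holds n tuples of n - 1 items each, so producing all of it cannot be linear in the size of T.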

Sort a list then give the indexes of the elements in their original order

I have an array of n numbers, say [1,4,6,2,3]. The sorted array is [1,2,3,4,6], and the indexes of these numbers in the old array are 0, 3, 4, 1, and 2. What is the best way, given an array of n numbers, to find this array of indexes?
My idea is to run order statistics for each element. However, since I have to rewrite this function many times (in contest), I'm wondering if there's a short way to do this.
>>> a = [1,4,6,2,3]
>>> [b[0] for b in sorted(enumerate(a),key=lambda i:i[1])]
[0, 3, 4, 1, 2]
Explanation:
enumerate(a) returns an enumeration over tuples consisting of the indexes and values in the original list: [(0, 1), (1, 4), (2, 6), (3, 2), (4, 3)]
Then sorted with a key of lambda i:i[1] sorts based on the original values (item 1 of each tuple).
Finally, the list comprehension [b[0] for b in ...] returns the original indexes (item 0 of each tuple).
Using numpy arrays instead of lists may be beneficial if you are doing a lot of statistics on the data. If you choose to do so, this would work:
import numpy as np
a = np.array( [1,4,6,2,3] )
b = np.argsort( a )
argsort() can operate on lists as well, but I believe that in this case it simply copies the data into an array first.
Here is another way:
>>> sorted(range(len(a)), key=lambda ix: a[ix])
[0, 3, 4, 1, 2]
This approach sorts not the original list, but its indices (created with range, or xrange in Python 2), using the original list as the sort keys.
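Wrapped as a small reusable helper for Python 3, this could look like the sketch below. The name `argsort` is borrowed from NumPy for familiarity, but this is a plain-Python stand-in, not NumPy's function.

```python
def argsort(values):
    """Return the indices that would sort `values` (stable for ties)."""
    # Sort the positions, using the element at each position as the key.
    return sorted(range(len(values)), key=values.__getitem__)


a = [1, 4, 6, 2, 3]
order = argsort(a)
print(order)                  # [0, 3, 4, 1, 2]
print([a[i] for i in order])  # [1, 2, 3, 4, 6]
```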
This should do the trick:
from operator import itemgetter
indices = list(zip(*sorted(enumerate(my_list), key=itemgetter(1))))[0]  # list() is needed in Python 3, where zip returns an iterator
Here is the long way, without a list comprehension, for beginners like me:
a = [1,4,6,2,3]
b = enumerate(a)
c = sorted(b, key = lambda i:i[1])
d = []
for e in c:
    d.append(e[0])
print(d)

Accessing grouped items in arrays

I'm new to Python and have a list of numbers. e.g.
5,10,32,35,64,76,23,53....
and I've grouped them into fours (5,10,32,35, 64,76,23,53 etc..) using the code from this post.
def group_iter(iterator, n=2, strict=False):
    """ Transforms a sequence of values into a sequence of n-tuples.
    e.g. [1, 2, 3, 4, ...] => [(1, 2), (3, 4), ...] (when n == 2)
    If strict, then it will raise ValueError if there is a group of fewer
    than n items at the end of the sequence. """
    accumulator = []
    for item in iterator:
        accumulator.append(item)
        if len(accumulator) == n: # tested as fast as separate counter
            yield tuple(accumulator)
            accumulator = [] # tested faster than accumulator[:] = []
            # and tested as fast as re-using one list object
    if strict and len(accumulator) != 0:
        raise ValueError("Leftover values")
How can I access the individual arrays so that I can perform functions on them? For example, I'd like to get the average of the first values of every group (e.g. 5 and 64 in my example numbers).
Let's say you have the following tuple of tuples:
a=((5,10,32,35), (64,76,23,53))
To access the first element of each tuple, use a for-loop:
for i in a:
    print(i[0])
To calculate average for the first values:
elements=[i[0] for i in a]
avg=sum(elements)/float(len(elements))
Ok, this is yielding a tuple of four numbers each time it's iterated. So, convert the whole thing to a list:
L = list(group_iter(your_list, n=4))
Then you'll have a list of tuples:
>>> L
[(5, 10, 32, 35), (64, 76, 23, 53), ...]
You can get the first item in each tuple this way:
firsts = [tup[0] for tup in L]
(There are other ways, of course.)
You've created a tuple of tuples, or a list of tuples, or a list of lists, or a tuple of lists, or whatever...
You can access any element of any nested list directly:
toplist[x][y] # yields the yth element of the xth nested list
You can also access the nested structures by iterating over the top structure:
for sublist in lists:  # avoid shadowing the built-in name `list`
    print(sublist[y])
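Putting the pieces together on the sample numbers from the question, a rough end-to-end sketch could be the following. It groups by slicing instead of using group_iter, and it assumes the list length is a multiple of the group size; the variable names are chosen for this example.

```python
numbers = [5, 10, 32, 35, 64, 76, 23, 53]
n = 4

# Cut the list into consecutive groups of n items.
groups = [tuple(numbers[i:i + n]) for i in range(0, len(numbers), n)]

# First value of every group, and their average.
firsts = [g[0] for g in groups]
average_of_firsts = sum(firsts) / len(firsts)

# zip(*groups) transposes the groups, giving one tuple per position.
column_means = [sum(col) / len(col) for col in zip(*groups)]

print(groups)             # [(5, 10, 32, 35), (64, 76, 23, 53)]
print(average_of_firsts)  # 34.5
print(column_means)       # [34.5, 43.0, 27.5, 44.0]
```

If the length is not a multiple of n, slicing leaves a shorter final group, and zip would then truncate every column to that group's length, so leftovers would need handling first.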
Might be overkill for your application but you should check out my library, pandas. Stuff like this is pretty simple with the GroupBy functionality:
http://pandas.sourceforge.net/groupby.html
To do the 4-at-a-time thing you would need to compute a bucketing array:
import numpy as np
bucket_size = 4
n = len(your_list)
buckets = np.arange(n) // bucket_size
Then it's as simple as:
data.groupby(buckets).mean()
