copying iterators and producing combinations - python

Say I have a list, and I want to produce a list of all unique pairs of elements without considering the order. One way to do this is:
mylist = ['W','X','Y','Z']
for i in xrange(len(mylist)):
    for j in xrange(i+1,len(mylist)):
        print mylist[i],mylist[j]
W X
W Y
W Z
X Y
X Z
Y Z
I want to do this with iterators. I thought of the following, even though it is not as brief:
import copy
it1 = iter(mylist)
for a in it1:
    it2 = copy.copy(it1)
    for b in it2:
        print a,b
But this doesn't even work. What is a more pythonic and efficient way of doing this, with iterators or zip, etc.?

This has already been done and is included in the standard library as of Python 2.6:
import itertools
mylist = ['W', 'X', 'Y', 'Z']
for pair in itertools.combinations(mylist, 2):
    print pair # pair is a tuple of 2 elements
Seems pretty Pythonic to me ;-)
Note that even if you're calculating a lot of combinations, the combinations() function returns an iterator so that you can start printing them right away. See the docs.
Also, you are referring to the result as a Cartesian product between the list and itself, but this is not strictly correct: The Cartesian product would have 16 elements (4x4). Your output is a subset of that, namely only the 2-element combinations (with repetition not allowed) of the values of the list.
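To make the size difference concrete, here is a minimal sketch comparing the two. It is written for Python 3 (unlike the Python 2 snippets above), and the names pairs and cartesian are only for illustration:

```python
from itertools import combinations, product

mylist = ['W', 'X', 'Y', 'Z']

# 2-element combinations: order ignored, no element paired with itself
pairs = list(combinations(mylist, 2))

# Cartesian product of the list with itself: every ordered pair
cartesian = list(product(mylist, repeat=2))

print(len(pairs))      # 6
print(len(cartesian))  # 16
```

Every tuple in pairs also appears in cartesian, which is why the output in the question is a subset of the product.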

@Cameron's answer is correct.
I just wanted to point out that
for i in range(len(mylist)):
    do_something_to(mylist[i])
is nastily un-Pythonic; if your operation is read-only (does not have to be stored back to the array), do
for i in mylist:
    do_something_to(i)
otherwise
mylist = [do_something_to(i) for i in mylist]

Related

How to iterate a zip list and print each index of it after joining two lists of the same size?

Given the variables:
X = ['a', 'b', 'c']
Y = [1, 2, 3]
complete the following statement:
[print(pair) for pair in ...]
so as to print to the screen pairs of elements from X and Y which occupy the same position in the index.
I know I can join X and Y into a list using list(zip(X,Y)), but adding that to the statement gives an empty list.
I can't solve this problem using the required form. Any help?
Thanks!
Not really clear what you're trying to achieve. If you need to print pairs, zip works, i.e.
for pair in zip(X, Y):
    print(pair)
[print(pair) for pair in ...] is a list comprehension, which is meant for creating lists, not for printing data:
pairs_list = [pair for pair in zip(X, Y)] # don't do this
which is simply pairs_list = list(zip(X, Y)). Does this make sense to you?
Using a list comprehension for its side effect (like printing something) is frowned upon. If you don't need a list, don't build one.
[print(pair) for pair in zip(X,Y)] # no need to list(zip(...))
will produce a list full of None, because the return value of print() is None.
Use a simple loop:
for p in zip(X,Y):
    print(p)
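A small sketch of what the comprehension version actually leaves behind (the name leftovers is just for illustration):

```python
X = ['a', 'b', 'c']
Y = [1, 2, 3]

# The comprehension prints each pair as a side effect...
leftovers = [print(pair) for pair in zip(X, Y)]

# ...but the list it builds only holds print()'s return values.
print(leftovers)  # [None, None, None]
```

So the pairs do get printed, but you also pay for a throwaway list, which is why the plain loop is preferred.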

Python 3: How to check if the item from list is in list and print it

I have a list of lists, which contains coordinates
points = [[5821, 293], [3214, 872], [4218, 820], [1223, 90], [7438, 820]]
and I need to find pairs of lists with the same point[i][1] and then print both of them. These coordinates are just an example; in the code they're generated randomly.
How to do this?
You can use itertools.combinations to create a series of pairs between each two items, and filter out only those with the same second item:
from itertools import combinations
result = [x for x in combinations(points, 2) if x[0][1] == x[1][1]]
There's no easy and efficient way to do what you want with your current data structure.
You can either use an inefficient method (O(N**2)), or convert your data to another format where you can use a more efficient algorithm.
Mureinik's answer is a good way to do an O(N**2) search, so I'll offer a solution that uses a dictionary to make the checking fast:
def find_matches(points):
    dct = {}
    for x, y in points:
        for other_x in dct.setdefault(y, []):
            yield (x, y), (other_x, y)
        dct[y].append(x)
This is a generator, which will yield pairs of points with matching y values. It should use O(N) space and O(N+R) time (for N input points and R pairs of matches).
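For reference, here is how the generator could be used on the sample data from the question. This is a self-contained sketch, so the function is repeated; the name matches is only for illustration:

```python
def find_matches(points):
    dct = {}  # maps each y value to the x values seen so far
    for x, y in points:
        # pair the current point with every earlier point sharing its y
        for other_x in dct.setdefault(y, []):
            yield (x, y), (other_x, y)
        dct[y].append(x)

points = [[5821, 293], [3214, 872], [4218, 820], [1223, 90], [7438, 820]]

matches = list(find_matches(points))
for first, second in matches:
    print(first, second)  # (7438, 820) (4218, 820)
```

Note that the pairs come back as tuples, with the later point first.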
Not sure I understand the question correctly, but here is my approach.
I use python 3.5.2, by the way.
If your intent is to capture all lists with [1] or y-coordinate (depending on how you look at it) with a value of 820, then the code could be:
for i in points:
    if i[1] == 820:
        print(i)
Here is code that will work for you:
second_list = []
the_list = [[5821, 293], [3214, 872], [4218, 820], [1223, 90],
            [7438, 820]]
for i in the_list:
    second_list.append(i[1])
repeated = []
the_new_list = sorted(second_list, key=int)
for i in range(len(the_new_list)):
    if i + 1 < len(the_new_list):
        if the_new_list[i] == the_new_list[i+1]:
            repeated.append(the_new_list[i])
for i in the_list:
    if i[1] in repeated:
        print(i)
second_list stores the y-coordinates of your list. The program then sorts those y-coordinates in ascending order and stores them in the_new_list. Next, we loop over the_new_list and check whether any adjacent numbers are equal; if so, we append the value to the list repeated. Finally, we loop over the_list and print every point whose y-coordinate is in repeated. I hope that helps.

compactly generate list of list python

I have to create a list of lists (each inner list has n fixed elements). Right now, for n=3 I am doing this:
my_list = []
for x in range(min_inner, max_inner + 1):
    for y in range(min_outer, max_outer + 1):
        for z in range(fixed_param):
            my_list.append([x, y, z])
When I tried list comprehension, something like:
[[x,y,z] for x in range(1,4), y in range(1,4), z in range (4)]
I get a name error
NameError: name 'z' is not defined
Is there a list comprehension way of doing that? Considering that n can be any number (though not necessarily arbitrarily large)
You need to loop over your range objects inside the list comprehension too.
[[x,y,z] for x in range(1,4) for y in range(1,4) for z in range (4)]
Also as a more concise way you could use itertools.product() to achieve the same result:
from itertools import product
list(product(range(1,4),range(1,4),range(4)))
Note that itertools.product() returns an iterator, which is more memory-efficient than a list comprehension, since the comprehension builds the whole list up front. If you just want to iterate over the result, you don't need to convert it to a list. Otherwise, the list comprehension will perform faster.
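Since the question mentions that n can vary, here is a minimal sketch of how product() could handle any number of ranges by unpacking a list of them. The name ranges is an assumption for illustration. One difference worth knowing: product() yields tuples, so they are converted to lists here to match the comprehension:

```python
from itertools import product

# One range per position; add or remove entries to change n
ranges = [range(1, 4), range(1, 4), range(4)]

via_product = [list(combo) for combo in product(*ranges)]

via_comprehension = [[x, y, z]
                     for x in range(1, 4)
                     for y in range(1, 4)
                     for z in range(4)]

print(via_product == via_comprehension)  # True
print(len(via_product))                  # 36
```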

Using variable tuple to access elements of list

Disclaimer:beginner, self-teaching Python user.
A pretty cool feature of ndarrays is their ability to accept a tuple of integers as indices (e.g. myNDArray[(1,2)] == myNDArray[1][2]). This allows me to leave the indices unspecified as a variable (e.g. indicesTuple ) until a script determines what part of an ndarray to work with, in which case the variable is specified as a tuple of integers and used to access part of an ndarray (e.g. myNDArray[indicesTuple]). The utility in using a variable is that the LENGTH of the tuple can be varied depending on the dimensions of the ndarray.
However, this limits me to working with arrays of numerical values. I tried using lists, but they can't take in a tuple as indices (e.g. myList[(1,2)] gives an error.). Is there a way to "unwrap" a tuple for list indices as one could for function arguments? Or something far easier or more efficient?
UPDATE: Holy shite I forgot this existed. Basically I eventually learned that you can initialize the ndarray with the argument dtype=object, which allows the ndarray to contain multiple types of Python objects, much like a list. As for accessing a list, as a commenter pointed out, I could use a for-loop to iterate through the variable indicesTuple to access increasingly nested elements of the list. For in-place editing, see the accepted comment, really went the extra mile there.
I'm interpreting your question as:
I have an N-dimensional list, and a tuple containing N values (T1, T2... TN). How can I use the tuple values to access the list? I don't know what N will be ahead of time.
I don't know of a built-in way to do this, but you can write a method that iteratively digs into the list until you reach the innermost value.
def get(seq, indices):
    for index in indices:
        seq = seq[index]
    return seq
seq = [
    [
        ["a","b"],
        ["c","d"]
    ],
    [
        ["e","f"],
        ["g","h"]
    ]
]
indices = [0,1,0]
print get(seq, indices)
Result:
c
You could also do this in one* line with reduce, although it won't be very clear to the reader what you're trying to accomplish.
print reduce(lambda s, idx: s[idx], indices, seq)
(*if you're using 3.X, you'll need to import reduce from functools. So, two lines.)
If you want to set values in the N-dimensional list, use get to access the second-deepest level of the list, and assign to that.
def set(seq, indices, value):
    innermost_list = get(seq, indices[:-1])
    innermost_list[indices[-1]] = value
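For anyone on Python 3, here is a rough equivalent of the getter and setter above, using reduce from functools. The names get_nested and set_nested are hypothetical, chosen so the built-in set is not shadowed:

```python
from functools import reduce

def get_nested(seq, indices):
    # Apply each index in turn, digging one level deeper every time
    return reduce(lambda s, idx: s[idx], indices, seq)

def set_nested(seq, indices, value):
    # Fetch the list that holds the target, then assign into it
    get_nested(seq, indices[:-1])[indices[-1]] = value

seq = [[["a", "b"], ["c", "d"]],
       [["e", "f"], ["g", "h"]]]

print(get_nested(seq, (0, 1, 0)))  # c
set_nested(seq, (0, 1, 0), "z")
print(get_nested(seq, (0, 1, 0)))  # z
```

The indices can be a tuple or a list of any length, as long as the nesting is at least that deep.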
Say you have a list of (i,j) indexes
indexList = [(1,1), (0,1), (1,2)]
And some 2D list you want to index from
l = [[1,2,3],
     [4,5,6],
     [7,8,9]]
You could get those elements using a list comprehension as follows
>>> [l[i][j] for i,j in indexList]
[5, 2, 6]
Then your indexes can be whatever you want them to be. They will be unpacked in the list comprehension, and used as list indices. For your specific application, we'd have to see where your index variables were coming from, but that's the general idea.
Python doesn't have multidimensional lists, so myList[(1,2)] could only conceivably be considered a shortcut for (myList[1], myList[2]) (which would be pretty convenient sometimes, although you can use import operator; x = operator.itemgetter(1,2)(myList) to accomplish the same).
If your myList looks something like
myList = [ ["foo", "bar", "baz"], ["a", "b", "c" ] ]
then myList[(1,2)] won't work (or make sense) because myList is not a two-dimensional list: it's a list that contains references to lists. You use myList[1][2] because the first index myList[1] returns the references to ["a", "b", "c"], to which you apply the second index [2] to get "c".
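A quick sketch of the two cases, in Python 3. The names flat, nested and picked are only for illustration:

```python
import operator

# itemgetter with several indices picks several items from ONE list
flat = ["foo", "bar", "baz"]
picked = operator.itemgetter(1, 2)(flat)
print(picked)  # ('bar', 'baz')

# A nested list needs one index per level instead
nested = [["foo", "bar", "baz"], ["a", "b", "c"]]
print(nested[1][2])  # c

# Indexing a list with a tuple is rejected outright
try:
    nested[(1, 2)]
except TypeError as exc:
    print("TypeError:", exc)
```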
Slightly related, you could use a dictionary to simulate a sparse array precisely by using tuples as keys to a default dict.
import collections
d = collections.defaultdict(str)
d[(1,2)] = "foo"
d[(4,5)] = "bar"
Any other tuple you try to use as a key would return the empty string. It's not a perfect simulation, as you can't access full rows or columns of the array without using something like
row1 = [d[1, x] for x in range(C)] # where C is the number of columns
col3 = [d[x, 3] for x in range(R)] # where R is the number of rows
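One caveat with this approach: looking up a missing key in a defaultdict stores the default value, so reading a whole row makes the dictionary grow. A small sketch, where C = 6 is an arbitrary column count assumed for illustration:

```python
import collections

d = collections.defaultdict(str)
d[(1, 2)] = "foo"
d[(4, 5)] = "bar"

C = 6  # assumed number of columns, for illustration only

size_before = len(d)
row1 = [d[1, x] for x in range(C)]  # each missing key gets inserted as ''
size_after = len(d)

print(row1)                     # ['', '', 'foo', '', '', '']
print(size_before, size_after)  # 2 7

# d.get() reads without inserting, which keeps the dict sparse
row4 = [d.get((4, x), "") for x in range(C)]
print(len(d))                   # still 7
```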
Use dictionaries indexed by tuple
>>> width, height = 7, 6
>>> grid = dict(
...     ((x,y),"x={} y={}".format(x,y))
...     for x in range(width)
...     for y in range(height))
>>> print grid[3,1]
x=3 y=1
Use lists of lists
>>> width, height = 7, 6
>>> grid = [
...     ["x={} y={}".format(x,y) for x in range(width)]
...     for y in range(height)]
>>> print grid[1][3]
x=3 y=1
In this case, you could make a getter and setter function:
def get_grid(grid, index):
    x, y = index
    return grid[y][x]
def set_grid(grid, index, value):
    x, y = index
    grid[y][x] = value
You could go a step further and create your own class that contains a list of lists and defines an indexer that takes tuples as indexes and does this same process. It can do slightly more sensible bounds-checking and give better diagnostics than the dictionary, but it takes a bit of setup. I think the dictionary approach is fine for quick exploration.
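A minimal sketch of what such a class might look like, in Python 3. The class name Grid, its attributes and the error message are all hypothetical; this is one possible shape, not a definitive implementation:

```python
class Grid:
    """Hypothetical wrapper around a list of lists, indexed as grid[x, y]."""

    def __init__(self, width, height, fill=None):
        self.width = width
        self.height = height
        # One inner list per row, so a cell lives at self._rows[y][x]
        self._rows = [[fill] * width for _ in range(height)]

    def _check(self, index):
        x, y = index
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise IndexError("index {} is outside a {}x{} grid".format(
                index, self.width, self.height))
        return x, y

    def __getitem__(self, index):
        x, y = self._check(index)
        return self._rows[y][x]

    def __setitem__(self, index, value):
        x, y = self._check(index)
        self._rows[y][x] = value


grid = Grid(7, 6)
grid[3, 1] = "x=3 y=1"
print(grid[3, 1])  # x=3 y=1
```

Unlike a bare list of lists, this rejects negative indices too, so an off-by-one mistake fails loudly instead of wrapping around.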

How to count number of unique lists within list?

I've tried using Counter and itertools, but since a list is unhashable, they don't work.
My data looks like this: [ [1,2,3], [2,3,4], [1,2,3] ]
I would like to know that the list [1,2,3] appears twice, but I can't figure out how to do this. I was thinking of just converting each list to a tuple, then hashing with that. Is there a better way?
>>> from collections import Counter
>>> li=[ [1,2,3], [2,3,4], [1,2,3] ]
>>> Counter(str(e) for e in li)
Counter({'[1, 2, 3]': 2, '[2, 3, 4]': 1})
The method that you state also works, as long as there are no nested mutables in each sublist (such as [ [1,2,3], [2,3,4,[11,12]], [1,2,3] ]):
>>> Counter(tuple(e) for e in li)
Counter({(1, 2, 3): 2, (2, 3, 4): 1})
If you do have other unhashable types nested in the sublists, use the str or repr method, since that deals with all sublists as well. Or recursively convert everything to tuples (more work).
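The recursive conversion could look something like this. It is a sketch that only handles nested lists; the helper name freeze is hypothetical:

```python
from collections import Counter

def freeze(item):
    # Turn lists into tuples at every depth; leave other values alone
    if isinstance(item, list):
        return tuple(freeze(element) for element in item)
    return item

li = [[1, 2, 3], [2, 3, 4, [11, 12]], [1, 2, 3]]

counts = Counter(freeze(e) for e in li)
print(counts[(1, 2, 3)])              # 2
print(counts[(2, 3, 4, (11, 12))])    # 1
```

Other unhashable types (dicts, sets) would need their own branches in freeze.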
ll = [ [1,2,3], [2,3,4], [1,2,3] ]
print(len(set(map(tuple, ll))))
Also, if you wanted to count the occurrences of a unique* list:
print(ll.count([1,2,3]))
(*value unique, not reference unique)
I think using the Counter class on tuples, like
Counter(tuple(item) for item in li)
will be optimal in terms of elegance and "Pythonicity": it's probably the shortest solution, it's perfectly clear what you want to achieve and how it's done, and it combines standard tools (and thus avoids reinventing the wheel).
The only performance drawback I can see is that every element has to be converted to a tuple (in order to be hashable), which more or less means that all elements of all sublists have to be copied once. Also, the internal hash function on tuples may be suboptimal if you know that list elements will, e.g., always be integers.
In order to improve on performance, you would have to:
1. Implement some kind of hash algorithm working directly on lists (more or less reimplementing the hashing of tuples but for lists)
2. Somehow reimplement the Counter class in order to use this hash algorithm and provide some suitable output (this class would probably use a dictionary using the hash values as key and a combination of the "original" list and the count as value)
At least the first step would need to be done in C/C++ in order to match the speed of the internal hash function. If you know the type of the list elements you could probably even improve the performance.
As for the Counter class, I do not know if its standard implementation is in Python or in C. If the latter is the case, you'll probably also have to reimplement it in C in order to achieve the same (or better) performance.
So the question "Is there a better solution" cannot be answered (as always) without knowing your specific requirements.
list = [ [1,2,3], [2,3,4], [1,2,3] ]
repeats = []
unique = 0
for i in list:
    count = 0
    if i not in repeats:
        for i2 in list:
            if i == i2:
                count += 1
        if count > 1:
            repeats.append(i)
        elif count == 1:
            unique += 1
print "Repeated Items"
for r in repeats:
    print r,
print "\nUnique items:", unique
This loops through the list to find repeated sequences, skipping items that have already been detected as repeats. It adds each repeated sequence to the repeats list and counts the number of lists that appear only once.
