Python: determine length of sequence of equal items in list

I have a list as follows:
l = [0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,2,2,2]
I want to determine the length of each run of equal items, i.e., for the given list I want the output to be:
[(0, 6), (1, 6), (0, 4), (2, 3)]
(or a similar format).
I thought about using a defaultdict, but it counts the occurrences of each item across the entire list, since I cannot have more than one key 0.
Right now, my solution looks like this:
out = []
cnt = 0
last_x = l[0]
for x in l:
    if x == last_x:
        cnt += 1
    else:
        out.append((last_x, cnt))
        cnt = 1
        last_x = x
out.append((last_x, cnt))
print(out)
I am wondering if there is a more pythonic way of doing this.

You almost surely want to use itertools.groupby:
import itertools
l = [0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,2,2,2]
answer = []
for key, group in itertools.groupby(l):
    # group is a lazy iterator over one run of equal items
    answer.append((key, len(list(group))))
# answer is [(0, 6), (1, 6), (0, 4), (2, 3)]
If you want to make it more memory efficient, at the cost of some extra complexity, you can add a length function:
import itertools
def length(l):
    if hasattr(l, '__len__'):
        return len(l)
    else:
        # count the items without building a list
        i = 0
        for _ in l:
            i += 1
        return i
l = [0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,2,2,2]
answer = []
for key, group in itertools.groupby(l):
    answer.append((key, length(group)))
# answer is [(0, 6), (1, 6), (0, 4), (2, 3)]
Note though that I have not benchmarked the length() function, and it's quite possible it will slow you down.
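A rough way to check this is to time both counting approaches with timeit. The sketch below is only an illustration: the helper names with_list and with_sum are made up for this example, and the timings depend on the machine, the Python version and the length of the runs, so no numbers are quoted here.

```python
import timeit
from itertools import groupby

data = [0] * 6 + [1] * 6 + [0] * 4 + [2] * 3

def with_list(items):
    # materialises each run as a list, then takes its length
    return [(key, len(list(group))) for key, group in groupby(items)]

def with_sum(items):
    # counts each run lazily without building a list
    return [(key, sum(1 for _ in group)) for key, group in groupby(items)]

# both approaches must agree before timing them is meaningful
assert with_list(data) == with_sum(data)

for fn in (with_list, with_sum):
    seconds = timeit.timeit(lambda: fn(data), number=10000)
    print(fn.__name__, round(seconds, 4))
```

For short runs like these the difference is likely to be small; it is worth measuring on data that resembles your own.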

Mike's answer is good, but the itertools._grouper returned by groupby will never have a __len__ method, so there is no point testing for it.
I use sum(1 for _ in i) to get the length of the itertools._grouper:
>>> import itertools as it
>>> L = [0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,2,2,2]
>>> [(k, sum(1 for _ in i)) for k, i in it.groupby(L)]
[(0, 6), (1, 6), (0, 4), (2, 3)]
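For reuse, the one-liner above can be wrapped in a small function. This is a minimal sketch; run_lengths is a hypothetical name chosen for this example, not something itertools provides. Unlike the loop in the question, which reads l[0] up front, it also accepts an empty list.

```python
from itertools import groupby

def run_lengths(items):
    """Return (value, run_length) pairs for consecutive equal items."""
    return [(key, sum(1 for _ in group)) for key, group in groupby(items)]

print(run_lengths([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 2, 2, 2]))
# [(0, 6), (1, 6), (0, 4), (2, 3)]
print(run_lengths([]))
# []
```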


Find an element by inner tuple in a list of a tuple of tuples

Alright. I've been through some SO answers, such as Find an element in a list of tuples in python, but they don't seem specific to my case, and I can't work out how to apply them to my problem.
Let us say I have a list of tuples of tuples; i.e., the list stores several data points, each referring to a Cartesian point. Each outer tuple represents the entire data of the point, and the inner tuple inside it is the point itself. For example, take the point (1,2) and let 5 denote some meaning attached to this point. The outer tuple will be ((1,2),5).
It is easy to figure out how to generate this. However, I want to search for an outer tuple based on the value of the inner tuple. That is, I want to do:
for y in range(0, 10):
    for x in range(0, 10):
        if (x, y) in ###:
            print("Found")
or something along those lines. How can this be done?
Based on the suggestion posted as a comment by @timgen, here is some pseudo-sample data.
The list is gonna be
selectPointSet = [((9, 2), 1), ((4, 7), 2), ((7, 3), 0), ((5, 0), 0), ((8, 1), 2)]
So I may wanna iterate through the whole domain of points which ranges from (0,0) to (9,9) and do something if the point is one among those in selectPointSet; i.e. if it is (9, 2), (4, 7), (7, 3), (5, 0) or (8, 1)

Using your current data structures, you can do it like this:
listTuple = [((1,1),5),((2,3),5)] # dummy list of tuples
for y in range(0, 10):
    for x in range(0, 10):
        for i in listTuple: # loop through the list of tuples
            if (x, y) in i: # test whether (x, y) is an element of this outer tuple
                print(str((x,y)), "Found")

You can make use of a dictionary.
temp = [((1,2),3),((2,3),4),((6,7),4)]
newDict = {}
# a dictionary with the inner tuple as key
for t in temp:
    newDict[t[0]] = t[1]
for y in range(0, 10):
    for x in range(0, 10):
        if (x, y) in newDict:
            print("Found")
I hope this is what you are asking for.

Make a set from the two-element tuples for O(1) lookup.
>>> data = [((1,2),3),((2,3),4),((6,7),4)]
>>> tups = {x[0] for x in data}
Now you can query tups with any tuple you like.
>>> (6, 7) in tups
True
>>> (3, 2) in tups
False
Searching for values from 0 to 9:
>>> from itertools import product
>>> for x, y in product(range(10), range(10)):
...     if (x, y) in tups:
...         print('found ({}, {})'.format(x, y))
...
found (1, 2)
found (2, 3)
found (6, 7)
If you need to retain information about the third number (and the two-element inner tuples in data are unique) then you can also construct a dictionary instead of a set.
>>> d = dict(data)
>>> d
{(1, 2): 3, (2, 3): 4, (6, 7): 4}
>>> (2, 3) in d
True
>>> d[(2, 3)]
4
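Applied to the sample data from the question, the dictionary approach might look like the sketch below. The variable names labels and found are made up for this example, and it assumes the inner points are unique, since dict() keeps only the last value for a repeated key.

```python
from itertools import product

select_point_set = [((9, 2), 1), ((4, 7), 2), ((7, 3), 0), ((5, 0), 0), ((8, 1), 2)]

# map each inner point to the value stored next to it
labels = dict(select_point_set)

found = []
# same iteration order as the nested loops in the question: y outer, x inner
for y, x in product(range(10), repeat=2):
    if (x, y) in labels:
        found.append(((x, y), labels[(x, y)]))

print(found)
# [((5, 0), 0), ((8, 1), 2), ((9, 2), 1), ((7, 3), 0), ((4, 7), 2)]
```

Each membership test is a hash lookup, so the whole scan does 100 lookups regardless of how many points the list holds.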

all possible permutations for a string

Given the string 'january',
how can I generate the following cases?
Case 1 (replacing 1 character): take j and replace it with every ASCII letter (a-z), then do the same with a, n, u, a, r, y.
Basically we would have
(Aanuary, Banuary, ..., Zanuary) + (jAnuary, jBnuary, ..., jZnuary) + ... + (januarA, januarB, ..., januarZ)
I have done this part using the following code. However, I have no idea how to do it for more than one letter, since there are lots of permutations.
monthName= 'january'
asci_letters = ['a' , 'b' , .... , 'z']
lst = list(monthName)
indxs = [i for i , _ in enumerate(monthName)]
oneLetter=[]
for i in indxs:
    word = monthName
    pos = list(word)
    for j in asci_letters:
        pos[i] = j
        changed = ("".join(pos))
        oneLetter.append(changed)
Case 2: taking 2 characters and replacing them:
(AAnuary, ABnuary, ..., AZnuary) + (BAnuary, BBnuary, ..., BZnuary) + (AaAuary, AaBuary, ..., AaZuary) + ... + (januaAB, ..., januaAZ)
Case 3: doing the same for 3 characters
Case 7: doing the same for 7 characters (the length of the string)
To summarize, I want to create all possible cases of replacing 1 letter, 2 letters, 3 letters, up to all letters of a string.

It's very likely that you can't hold all these permutations in memory because it will quickly become very crowded.
But to get all indices for the cases you can use itertools.combinations. For 1 it will give the single indices:
from itertools import combinations
string_ = 'january'
length = len(string_)
print(list(combinations(range(length), 1)))
# [(0,), (1,), (2,), (3,), (4,), (5,), (6,)]
Likewise you can get the indices for case 2-7:
print(list(combinations(range(length), 2)))
# [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (1, 2), (1, 3), (1, 4),
# (1, 5), (1, 6), (2, 3), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6),
# (4, 5), (4, 6), (5, 6)]
Then it's just a matter of inserting the itertools.product of string.ascii_uppercase at the given indices:
from itertools import product
import string
print(list(product(string.ascii_uppercase, repeat=1)))
# [('A',), ('B',), ('C',), ('D',), ('E',), ('F',), ('G',), ('H',), ('I',),
# ('J',), ('K',), ('L',), ('M',), ('N',), ('O',), ('P',), ('Q',), ('R',),
# ('S',), ('T',), ('U',), ('V',), ('W',), ('X',), ('Y',), ('Z',)]
Likewise for different repeats given the "case".
Putting this all together:
def all_combinations(a_string, case):
    lst = list(a_string)
    length = len(lst)
    for combination in combinations(range(length), case):
        for inserter in product(string.ascii_uppercase, repeat=case):
            return_string = lst.copy()
            for idx, newchar in zip(combination, inserter):
                return_string[idx] = newchar
            yield ''.join(return_string)
Then you can get all desired permutations for each case by:
list(all_combinations('january', 2)) # case2
list(all_combinations('january', 4)) # case4
list(all_combinations('january', 7)) # case7
Or if you need all of them:
res = []
for case in [1, 2, 3, 4, 5, 6, 7]:
    res.extend(all_combinations('january', case))
But that will require a lot of memory.
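To get a feel for how much "a lot" is, the number of strings can be counted without generating any of them: for each case there are C(length, case) ways to choose the positions and 26**case ways to fill them. The sketch below assumes Python 3.8+ for math.comb; count_replacements is a hypothetical helper name used only here.

```python
from math import comb

def count_replacements(length, case, alphabet_size=26):
    # choose which positions to replace, then one letter for each of them
    return comb(length, case) * alphabet_size ** case

counts = {case: count_replacements(7, case) for case in range(1, 8)}
for case, count in counts.items():
    print(case, count)

total = sum(counts.values())
print(total)  # 10460353202, which equals 27**7 - 1
```

With more than ten billion seven-character strings in total, iterating over the generator lazily is the only practical option on ordinary hardware.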

You can use itertools.combinations_with_replacement for this, which gives you an iterator over the combinations:
from itertools import combinations_with_replacement
# First param is an iterable of possible values, second the length of the
# resulting combinations
combinations = combinations_with_replacement('ABCDEFGHIJKLMNOPQRSTUVWXYZ', 7)
# Then you can iterate like this:
for combination in combinations:
    pass  # Do stuff here
Don't try to convert this iterator to a list of all values, because you will probably get a MemoryError.
For your distance you might want to use the Python distance package. (You need to install it via pip first.)
For your case, where you want all combinations of the characters A-Z up to length 7 (because of January):
import distance
from itertools import combinations_with_replacement
str_to_compare_with = "JANUARY"
for i in range(len(str_to_compare_with)):
    combinations = combinations_with_replacement('ABCDEFGHIJKLMNOPQRSTUVWXYZ', i+1)
    # Then you can iterate like this:
    for combination in combinations:
        # This calculates the Hamming distance between the combination and the string you want to compare to
        # Here you have to figure out yourself whether you want to save that output to a file or whatever you want to do with the distance
        hamming_dist = distance.hamming(''.join(combination), str_to_compare_with)

This should do everything that you wanted with help of product and permutations:
from itertools import product, permutations
monthName= 'january'
letters = list('abcdefghijklmnopqrstuvwxyz')
n = len(monthName)
indxs = range(n)
mn = list(monthName)
cases = {k: [] for k in range(2, n+1)}
for num in range(2, n+1):
    letter_combos = list(product(*[letters for _ in range(num)]))
    positions = permutations(indxs, num)
    for p in positions:
        for l in letter_combos:
            l = iter(l)
            for i in p:
                mn[i] = next(l)
            mn = ''.join(mn)
            cases[num].append(mn)
            mn = list(monthName)

If you want to see how it works, you can test this with a subset of letters, say A-E:
x = []
for i in range(65,70): # subset of letters: 'A' to 'E'
    x.append(chr(i))
def recurse(string,index,arr):
    if(index>len(string)-1):
        return
    for i in range(index,len(string)):
        for item in x:
            temp = string[:i]+item+string[i+1:]
            arr.append(temp)
            recurse(temp,i+1,arr)
arr = []
recurse('abc',0,arr)
print(arr)

Making number from elements of list

How can I check whether I can make a given number from the elements of a list?
For example:
list=[1,1,3,3,3,3,5,10,23,53]
Here we can make 9 from [1,3,5] or [3,3,3].
I tried something like this:
list=[1,1,3,3,3,3,5,10,23,53]
tmp=[]
sum=0
for i in range(len(list)):
    tmpChange=9
    tmpChange -= list[i]+sum
    if tmpChange == 0:
        break
    elif tmpChange > 0:
        tmp.append(list[i])
        sum += list[i]
        print(tmpChange)
        print(tmp)
    else:
        tmp.pop(i)

A naive way to approach this is to enumerate all of the subsets of your original list, which you can do using itertools.combinations. For each subset, check whether it sums to your target value and, if so, add it to a set.
import itertools
l = [1,1,3,3,3,3,5,10,23,53]
total = 9
values = set()
for r in range(1, len(l) + 1): # + 1 so the full list is also tried
    for c in itertools.combinations(l, r):
        if sum(c) == total:
            values.add(tuple(c))
The result is then
>>> values
{(1, 3, 5), (3, 3, 3)}
As another example using the following data
l = [1,1,3,3,3,3,4,5,9,10,23,53]
The result would be
>>> values
{(4, 5), (3, 3, 3), (1, 1, 3, 4), (1, 3, 5), (9,)}
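If you only need a yes/no answer rather than the subsets themselves, enumerating every combination does more work than necessary. A minimal sketch, assuming all numbers are non-negative integers: keep the set of sums reachable so far and extend it one element at a time. The name can_make is a hypothetical helper chosen for this example.

```python
def can_make(numbers, target):
    """Return True if some subset of numbers sums to target.

    Assumes non-negative integers; each list element is used at most once.
    """
    reachable = {0}  # sums that can be built from the elements seen so far
    for n in numbers:
        # the comprehension is built from the old set before the update,
        # so the current element is never added twice
        reachable |= {s + n for s in reachable if s + n <= target}
    return target in reachable

print(can_make([1, 1, 3, 3, 3, 3, 5, 10, 23, 53], 9))  # True
print(can_make([5, 10], 9))                             # False
```

The set never holds more than target + 1 values, so the work grows with len(numbers) * target instead of with the number of subsets.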

Sort out pairs with same members but different order from list of pairs

From the list
l =[(3,4),(2,3),(4,3),(3,2)]
I want to filter out every second appearance of a pair that has the same members in reverse order. I.e., the result should be
[(3,4),(2,3)]
What's the most concise way to do that in Python?

Alternatively, one might do it in a more verbose way:
l = [(3,4),(2,3),(4,3),(3,2)]
L = []
omega = set()
for a,b in l:
    key = (min(a,b), max(a,b))
    if key in omega:
        continue
    omega.add(key)
    L.append((a,b))
print(L)

If we want to keep only the first tuple of each pair:
l =[(3,4),(2,3),(4,3),(3,2), (3, 3), (5, 6)]
def first_tuples(l):
    # We could use a list to keep track of the already seen
    # tuples, but checking if they are in a set is faster
    already_seen = set()
    out = []
    for tup in l:
        # As sets can only contain hashables, we use a
        # frozenset here
        key = frozenset(tup)
        if key not in already_seen:
            out.append(tup)
            already_seen.add(key)
    return out
print(first_tuples(l))
# [(3, 4), (2, 3), (3, 3), (5, 6)]

This ought to do the trick:
[x for i, x in enumerate(l) if any(y[::-1] == x for y in l[i:])]
Out[23]: [(3, 4), (2, 3)]
Expanding the initial list a little bit with different orderings:
l =[(3,4),(2,3),(4,3),(3,2), (1,3), (3,1)]
[x for i, x in enumerate(l) if any(y[::-1] == x for y in l[i:])]
Out[25]: [(3, 4), (2, 3), (1, 3)]
And, depending on whether each tuple is guaranteed to have an accompanying "sister" reversed tuple, the logic may change in order to keep "singleton" tuples:
l = [(3, 4), (2, 3), (4, 3), (3, 2), (1, 3), (3, 1), (10, 11), (10, 12)]
[x for i, x in enumerate(l) if any(y[::-1] == x for y in l[i:]) or not any(y[::-1] == x for y in l)]
Out[35]: [(3, 4), (2, 3), (1, 3), (10, 11), (10, 12)]

IMHO, this should be both shorter and clearer than anything posted so far:
my_tuple_list = [(3,4),(2,3),(4,3),(3,2)]
set((left, right) if left < right else (right, left) for left, right in my_tuple_list)
>>> {(2, 3), (3, 4)}
It simply makes a set of all tuples, whose members are exchanged beforehand if first member is > second member.
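Note that a set keeps neither the original order nor the original orientation of the pairs. If both matter, one possible sketch is to key a dict on the sorted pair and keep the first tuple seen for each key. It assumes Python 3.7+ (where dicts preserve insertion order) and that the two members of a pair can be compared with each other; first_of_each_pair is a hypothetical name.

```python
def first_of_each_pair(pairs):
    first_seen = {}
    for pair in pairs:
        # setdefault stores the pair only if its sorted form is a new key
        first_seen.setdefault(tuple(sorted(pair)), pair)
    return list(first_seen.values())

print(first_of_each_pair([(3, 4), (2, 3), (4, 3), (3, 2)]))
# [(3, 4), (2, 3)]
print(first_of_each_pair([(4, 3), (3, 4)]))
# [(4, 3)]
```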

Pairwise circular Python 'for' loop

Is there a nice Pythonic way to loop over a list, returning a pair of elements? The last element should be paired with the first.
So for instance, if I have the list [1, 2, 3], I would like to get the following pairs:
1 - 2
2 - 3
3 - 1

A Pythonic way to access a list pairwise is: zip(L, L[1:]). To connect the last item to the first one:
>>> L = [1, 2, 3]
>>> list(zip(L, L[1:] + L[:1]))  # list() is needed on Python 3, where zip is lazy
[(1, 2), (2, 3), (3, 1)]

I would use a deque with zip to achieve this.
>>> from collections import deque
>>>
>>> l = [1,2,3]
>>> d = deque(l)
>>> d.rotate(-1)
>>> list(zip(l, d))
[(1, 2), (2, 3), (3, 1)]

I'd use a slight modification to the pairwise recipe from the itertools documentation:
import itertools
def pairwise_circle(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ... (s<last>,s0)"
    a, b = itertools.tee(iterable)
    first_value = next(b, None)
    return itertools.zip_longest(a, b, fillvalue=first_value)
This simply keeps a reference to the first value, and when the second iterator is exhausted, zip_longest fills the last place with the first value.
(Also note that it works with iterators like generators as well as iterables like lists/tuples.)
Note that @Barry's solution is very similar to this, but in my opinion a bit easier to understand and easier to extend beyond one element.

I would pair itertools.cycle with zip:
import itertools
def circular_pairwise(l):
    second = itertools.cycle(l)
    next(second)
    return zip(l, second)
cycle returns an iterable that yields the values of its argument in order, looping from the last value to the first.
We skip the first value, so it starts at position 1 (rather than 0).
Next, we zip it with the original, unmutated list. zip is good, because it stops when any of its argument iterables are exhausted.
Doing it this way avoids the creation of any intermediate lists: cycle holds a reference to the original, but doesn't copy it. zip operates in the same way.
It's important to note that this will break if the input is an iterator, such as a file, (or a map or zip in python-3), as advancing in one place (through next(second)) will automatically advance the iterator in all the others. This is easily solved using itertools.tee, which produces two independently operating iterators over the original iterable:
def circular_pairwise(it):
    first, snd = itertools.tee(it)
    second = itertools.cycle(snd)
    next(second)
    return zip(first, second)
tee can use large amounts of additional storage, for example, if one of the returned iterators is used up before the other is touched, but as we only ever have one step difference, the additional storage is minimal.
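As a quick check of the tee-based version, the sketch below feeds it a generator instead of a list. It repeats the function so that it runs on its own, with one change that is not in the answer above: next(second, None) is used so that an empty input yields no pairs instead of raising StopIteration.

```python
import itertools

def circular_pairwise(iterable):
    first, snd = itertools.tee(iterable)
    second = itertools.cycle(snd)
    # skip one item; the default (an addition in this sketch) covers empty input
    next(second, None)
    return zip(first, second)

numbers = (n for n in [1, 2, 3])  # a one-shot iterator, not a list
print(list(circular_pairwise(numbers)))  # [(1, 2), (2, 3), (3, 1)]
print(list(circular_pairwise([])))       # []
```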

There are more efficient ways (that don't build temporary lists), but I think this is the most concise:
> l = [1,2,3]
> list(zip(l, (l+l)[1:]))
[(1, 2), (2, 3), (3, 1)]

Pairwise circular Python 'for' loop
If you like the accepted answer,
zip(L, L[1:] + L[:1])
you can go much more memory light with semantically the same code using itertools:
from itertools import islice, chain #, izip as zip # uncomment if Python 2
And this barely materializes anything in memory beyond the original list (assuming the list is relatively large):
zip(l, chain(islice(l, 1, None), islice(l, None, 1)))
To use, just consume (for example, with a list):
>>> list(zip(l, chain(islice(l, 1, None), islice(l, None, 1))))
[(1, 2), (2, 3), (3, 1)]
This can be made extensible to any width:
def cyclical_window(l, width=2):
    return zip(*[chain(islice(l, i, None), islice(l, None, i)) for i in range(width)])
and usage:
>>> l = [1, 2, 3, 4, 5]
>>> cyclical_window(l)
<itertools.izip object at 0x112E7D28>
>>> list(cyclical_window(l))
[(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
>>> list(cyclical_window(l, 4))
[(1, 2, 3, 4), (2, 3, 4, 5), (3, 4, 5, 1), (4, 5, 1, 2), (5, 1, 2, 3)]
Unlimited generation with itertools.tee with cycle
You can also use tee to avoid making a redundant cycle object:
from itertools import cycle, tee
ic1, ic2 = tee(cycle(l))
next(ic2) # must still queue up the next item
and now:
>>> [(next(ic1), next(ic2)) for _ in range(10)]
[(1, 2), (2, 3), (3, 1), (1, 2), (2, 3), (3, 1), (1, 2), (2, 3), (3, 1), (1, 2)]
This is incredibly efficient, an expected usage of iter with next, and elegant usage of cycle, tee, and zip.
Don't pass cycle directly to list unless you have saved your work and have time for your computer to creep to a halt as you max out its memory - if you're lucky, after a while your OS will kill the process before it crashes your computer.
Pure Python Builtin Functions
Finally, no standard lib imports, but this only works for up to the length of original list (IndexError otherwise.)
>>> [(l[i], l[i - len(l) + 1]) for i in range(len(l))]
[(1, 2), (2, 3), (3, 1)]
You can continue this with modulo:
>>> len_l = len(l)
>>> [(l[i % len_l], l[(i + 1) % len_l]) for i in range(10)]
[(1, 2), (2, 3), (3, 1), (1, 2), (2, 3), (3, 1), (1, 2), (2, 3), (3, 1), (1, 2)]

I would use a list comprehension, and take advantage of the fact that l[-1] is the last element.
>>> l = [1,2,3]
>>> [(l[i-1],l[i]) for i in range(len(l))]
[(3, 1), (1, 2), (2, 3)]
You don't need a temporary list that way.

Amazing how many different ways there are to solve this problem.
Here's one more. You can use the pairwise recipe but instead of zipping with b, chain it with the first element that you already popped off. Don't need to cycle when we just need a single extra value:
from itertools import chain, tee  # on Python 2, also import izip and use it in place of zip
def pairwise_circle(iterable):
    a, b = tee(iterable)
    first = next(b, None)
    return zip(a, chain(b, (first,)))

I like a solution that does not modify the original list and does not copy the list to temporary storage:
def circular(a_list):
    for index in range(len(a_list) - 1):
        yield a_list[index], a_list[index + 1]
    yield a_list[-1], a_list[0]
for x in circular([1, 2, 3]):
    print(x)
Output:
(1, 2)
(2, 3)
(3, 1)
I can imagine this being used on some very large in-memory data.

This one will work even if the list l has consumed most of the system's memory. (If something guarantees this case to be impossible, then zip as posted by chepner is fine)
l.append( l[0] )
for i in range( len(l)-1):
    pair = l[i],l[i+1]
    # stuff involving pair
del l[-1]
or more generalizably (works for any offset n i.e. l[ (i+n)%len(l) ] )
for i in range( len(l)):
    pair = l[i], l[ (i+1)%len(l) ]
    # stuff
provided you are on a system with decently fast modulo division (i.e. not some pea-brained embedded system).
There seems to be an often-held belief that indexing a list with an integer subscript is un-pythonic and best avoided. Why?
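The offset generalisation mentioned above can be sketched as a small function. The name circular_pairs and its offset parameter are made up for this example; it assumes items supports len() and integer indexing, as a list does.

```python
def circular_pairs(items, offset=1):
    size = len(items)
    # for an empty list range(size) is empty, so the modulo is never evaluated
    return [(items[i], items[(i + offset) % size]) for i in range(size)]

print(circular_pairs([1, 2, 3]))            # [(1, 2), (2, 3), (3, 1)]
print(circular_pairs([1, 2, 3], offset=2))  # [(1, 3), (2, 1), (3, 2)]
```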

This is my solution, and it looks Pythonic enough to me:
l = [1,2,3]
for n,v in enumerate(l):
    try:
        print(v,l[n+1])
    except IndexError:
        print(v,l[0])
prints:
1 2
2 3
3 1
The generator function version:
def f(iterable):
    for n,v in enumerate(iterable):
        try:
            yield(v,iterable[n+1])
        except IndexError:
            yield(v,iterable[0])
>>> list(f([1,2,3]))
[(1, 2), (2, 3), (3, 1)]

How about this?
li = li+[li[0]]
pairwise = [(li[i],li[i+1]) for i in range(len(li)-1)]

from itertools import izip, chain, islice
itr = izip(l, chain(islice(l, 1, None), islice(l, 1)))
(As above with @j-f-sebastian's "zip" answer, but using itertools.)
NB: edited after a helpful nudge from @200_success. Previously it was:
itr = izip(l, chain(l[1:], l[:1]))

If you don't want to consume too much memory, you can try my solution:
[(l[i], l[(i+1) % len(l)]) for i, v in enumerate(l)]
It's a little slower, but consumes less memory.

Starting in Python 3.10, the new pairwise function provides a way to create sliding pairs of consecutive elements:
from itertools import pairwise
# l = [1, 2, 3]
list(pairwise(l + l[:1]))
# [(1, 2), (2, 3), (3, 1)]
or simply pairwise(l + l[:1]) if you don't need the result as a list.
Note that we pairwise on the list appended with its head (l + l[:1]) so that rolling pairs are circular (i.e. so that we also include the (3, 1) pair):
list(pairwise(l)) # [(1, 2), (2, 3)]
l + l[:1] # [1, 2, 3, 1]

Just another try
>>> L = [1,2,3]
>>> list(zip(L,L[1:])) + [(L[-1],L[0])]
[(1, 2), (2, 3), (3, 1)]

L = [1, 2, 3]
a = zip(L, L[1:]+L[:1])
for i in a:
    b = list(i)
    print(b)

This seems like a job for combinations.
from itertools import combinations
x=combinations([1,2,3],2)
This yields an iterator, which can then be iterated over like this:
for i in x:
    print(i)
The results look like this:
(1, 2)
(1, 3)
(2, 3)
