Is there a better way to check whether a value changed in a sequence?

I have a list like the one below:
list = ['A', 'A', 'B', 'A', 'B', 'A', 'B']
I want to count how many times the first value (here, 'A') occurs consecutively before a different value ('B') appears.
So I wrote code like this:
history = list[0]
number_A = 0
number_B = 0
for i in list:
    if history != i:
        break
    if i == 'A':
        number_A += 1
        history = 'A'
    else:
        number_B += 1
        history = 'B'
However, I think this is very untidy.
Is there a simpler way to do this?
Thank you for reading.

Using groupby with the default key function, you can count the number of items in the first grouper:
from itertools import groupby
def count_first(lst):
    if not lst:
        return 0
    _, grouper = next(groupby(lst))
    return sum(1 for _ in grouper)
print(count_first(['A', 'A', 'B', 'A', 'B', 'A', 'B']))
# 2

There is no need for the "else" clause: you will never count any 'B's, because the loop breaks before it gets there.
lst = ['A', 'A', 'B', 'A', 'B', 'A', 'B']
count = 0
for i in lst:
    if i != lst[0]:
        break
    count += 1
print("list starts with %d of %s's" % (count, lst[0]))

You could use takewhile:
from itertools import takewhile
my_list = ['A', 'A', 'B', 'A', 'B', 'A', 'B']
res = takewhile(lambda x: x == my_list[0], my_list)
print(len(list(res)))
OUT: 2

I renamed your list to lst in order to not override the builtin name list.
>>> lst = ['A', 'A', 'B', 'A', 'B', 'A', 'B']
>>> string = ''.join(lst)
>>> len(string) - len(string.lstrip('A'))
2
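
The approaches above can be folded into one small helper. The following is only a sketch (the name leading_run_length is made up for illustration, and Python 3 is assumed); it accepts any iterable, not just a list of single-character strings, and returns 0 for empty input:

```python
from itertools import takewhile


def leading_run_length(iterable):
    """Count how many times the first item repeats at the start."""
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        return 0  # empty input: nothing to count
    # 1 for the first item, plus every immediately following item equal to it.
    return 1 + sum(1 for _ in takewhile(lambda x: x == first, it))


print(leading_run_length(['A', 'A', 'B', 'A', 'B', 'A', 'B']))  # 2
print(leading_run_length([]))  # 0
```

Because it never indexes into the input, the helper also works on generators and other one-shot iterators.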

Related

Python - For Loop - Print only if the above line is equal

I have the following code:
characters = ['a', 'b', 'b', 'c','d', 'b']
for i in characters:
    if i[0] == i[-1]:
        print(i)
Basically, I only want to extract the characters that are equal to the one from the line above. For example, in my case I only want to extract the 'b' at positions 1 and 2.
How can I do that?
Thanks!

import collections
a = ['a', 'b', 'b', 'c', 'd', 'b']
b = ['a', 'b', 'b', 'c', 'd', 'b', 'd']
print([item for item, count in collections.Counter(a).items() if count > 1])
print([item for item, count in collections.Counter(b).items() if count > 1])
Output:
['b']
['b', 'd']

This does it without iterating over the same list more than once:
characters = ['a', 'b', 'b', 'c','d', 'b']
last_char = None
output = []
for char in characters:
    if char == last_char:
        output.append(char)
    last_char = char
print(output)
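
The same comparison of each element with its predecessor can also be written with zip. This is a minimal sketch, assuming that "the line above" means the previous list element; the variable name repeated is chosen here for illustration:

```python
characters = ['a', 'b', 'b', 'c', 'd', 'b']

# Pair every element with the one before it and keep those that match.
repeated = [cur for prev, cur in zip(characters, characters[1:]) if prev == cur]

print(repeated)  # ['b']
```

Unlike the loop with last_char = None, this version cannot mistake a leading None in the data for a repeat, at the cost of building a sliced copy of the list.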

To extract the characters from the list that match the last character of the list, you can do the following:
characters = ['a', 'b', 'b', 'c','d', 'b']
for i in range(0, len(characters) - 1):
    if characters[i] == characters[-1]:
        print(characters[i])
In your snippet, i is an individual character from the list on each iteration, while it looks like you were trying to access the first and last items of the list.

equal = [a for a in characters[0:-1] if a == characters[-1]]
Unless you also want the last character which will always be equal to itself, then do:
equal = [a for a in characters if a == characters[-1]]

A small modification to your code:
characters = ['a', 'b', 'b', 'c','d', 'b']
ch = characters[-1]
for i in characters:
    if i == ch:
        print(i)

Python: Keep first occurrence of an item in a list

How can I remove all occurrences of a specific value in a list except for the first occurrence?
E.g. I have a list:
letters = ['a', 'b', 'c', 'c', 'c', 'd', 'c', 'a', 'a', 'c']
And I need a function that looks something like this:
preserve_first(letters, 'c')
And returns this:
['a', 'b', 'c', 'd', 'a', 'a']
That is, it removes all but the first occurrence of the given value while otherwise preserving the order. If there is a way to do this with a pandas.Series, that would be even better.

You want to remove duplicates of 'c' only. So you want to keep the entries where the series is either not a duplicate or not equal to 'c'. I like to use pd.Series.ne in place of pd.Series != because needing fewer wrapping parentheses adds to readability (my opinion).
import pandas as pd
s = pd.Series(letters)
s[s.ne('c') | ~s.duplicated()]
0 a
1 b
2 c
5 d
7 a
8 a
dtype: object
To do exactly what was asked for:
def preserve_first(letters, letter):
    s = pd.Series(letters)
    return s[s.ne(letter) | ~s.duplicated()].tolist()
preserve_first(letters, 'c')
['a', 'b', 'c', 'd', 'a', 'a']

A general Python solution:
def keep_first(iterable, value):
    it = iter(iterable)
    for val in it:
        yield val
        if val == value:
            yield from (el for el in it if el != value)
This yields all items up to and including the first value if found, then yields the rest of the iterable, filtering out items matching the value.
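
For reference, here is a usage sketch of that generator with the list from the question. The function body is repeated so the snippet runs on its own; yield from requires Python 3.3 or later:

```python
def keep_first(iterable, value):
    it = iter(iterable)
    for val in it:
        yield val
        if val == value:
            # The inner generator drains the rest of `it`,
            # so the outer loop ends right after it.
            yield from (el for el in it if el != value)


letters = ['a', 'b', 'c', 'c', 'c', 'd', 'c', 'a', 'a', 'c']
print(list(keep_first(letters, 'c')))  # ['a', 'b', 'c', 'd', 'a', 'a']
print(list(keep_first(letters, 'z')) == letters)  # True: 'z' never occurs
```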

You can try this using generators:
def conserve_first(l, s):
    last_seen = False
    for i in l:
        if i == s and not last_seen:
            last_seen = True
            yield i
        elif i != s:
            yield i
letters = ['a', 'b', 'c', 'c', 'c', 'd', 'c', 'a', 'a', 'c']
print(list(conserve_first(letters, "c")))
Output:
['a', 'b', 'c', 'd', 'a', 'a']

Late to the party, but
letters = ['a', 'b', 'c', 'c', 'c', 'd', 'c', 'a', 'a', 'c']
def preserve_first(data, letter):
    new = []
    count = 0
    for i in data:
        if i not in new:
            if i == letter and count == 0:
                new.append(i)
                count += 1
            elif i == letter and count == 1:
                continue
            else:
                new.append(i)
        else:
            if i == letter and count == 1:
                continue
            else:
                new.append(i)
    return new  # without this the function returns None
l = preserve_first(letters, "c")

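You can use a list filter and slices:
def preserve_first(letters, elem):
    if elem in letters:
        index = letters.index(elem)
        # list(...) keeps this working on Python 3, where filter returns an iterator
        return letters[:index + 1] + list(filter(lambda a: a != elem, letters[index + 1:]))
    return letters[:]  # elem not present: return an unchanged copy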

This doesn't use pandas, but it is a simple algorithm that does the job.
def preserve_firsts(letters, target):
    firsts = []
    seen = False
    for letter in letters:
        if letter == target:
            if not seen:
                firsts.append(letter)
                seen = True
        else:
            firsts.append(letter)
    return firsts
> letters = ['a', 'b', 'c', 'c', 'c', 'd', 'c', 'a', 'a']
> preserve_firsts(letters, 'c')
['a', 'b', 'c', 'd', 'a', 'a']

Simplest solution I could come up with.
letters = ['a', 'b', 'c', 'c', 'c', 'd', 'c', 'a', 'a', 'c']
key = 'c'
def preserve_first(letters, key):
    first_occurrence = letters.index(key)
    return [item for i, item in enumerate(letters) if i == first_occurrence or item != key]
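
One thing to keep in mind with list.index is that it raises ValueError when the key is not in the list. A possible variant that returns the items unchanged in that case might look like this (preserve_first_safe is a hypothetical name, not part of the answer above):

```python
def preserve_first_safe(letters, key):
    try:
        first_occurrence = letters.index(key)
    except ValueError:
        return list(letters)  # key not present: nothing to remove
    return [item for i, item in enumerate(letters)
            if i == first_occurrence or item != key]


letters = ['a', 'b', 'c', 'c', 'c', 'd', 'c', 'a', 'a', 'c']
print(preserve_first_safe(letters, 'c'))  # ['a', 'b', 'c', 'd', 'a', 'a']
print(preserve_first_safe(letters, 'z') == letters)  # True
```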

How can I group a list of objects by continuity?

Given a very large (gigabytes) list of arbitrary objects (I've seen a similar solution to this for ints), can I easily group it into sublists by equivalence, either in place or with a generator that consumes the original list?
l0 = [A,B, A,B,B, A,B,B,B,B, A, A, A,B] #spaces for clarity
Desired result:
[['A', 'B'], ['A', 'B', 'B'], ['A', 'B', 'B', 'B', 'B'], ['A'], ['A'], ['A', 'B']]
I wrote a looping version like so:
#find boundaries
b0 = []
l1 = []
prev = A
group = A
for idx, elem in enumerate(l0):
    if elem == group:
        b0.append(idx)
    prev = elem
b0.append(len(l0)-1)
for idx, b in enumerate(b0):
    try:
        c = b0[idx+1]
    except:
        break
    if c == len(l0)-1:
        l1.append(l0[b:])
    else:
        l1.append(l0[b:c])
Can this be done as a generator gen(l) that will work like this:
for g in gen(l0):
    print g
....
['A', 'B']
['A', 'B', 'B']
['A', 'B', 'B', 'B', 'B']
....
etc?
EDIT: using Python 2.6 or 2.7
EDIT: preferred solution, mostly based on the accepted answer:
def gen_group(f, items):
    out = [items[0]]
    while items:
        for elem in items[1:]:
            if f(elem, out[0]):
                break
            else:
                out.append(elem)
        for _i in out:
            items.pop(0)
        yield out
        if items:
            out = [items[0]]
g = gen_group(lambda x, y: x == y, l0)
for out in g:
    print out

Maybe something like this:
def subListGenerator(f,items):
    i = 0
    n = len(items)
    while i < n:
        sublist = [items[i]]
        i += 1
        while i < n and not f(items[i]):
            sublist.append(items[i])
            i += 1
        yield sublist
Used like:
>>> items = ['A', 'B', 'A', 'B', 'B', 'A', 'B', 'B', 'B', 'B', 'A', 'A', 'A', 'B']
>>> g = subListGenerator(lambda x: x == 'A',items)
>>> for x in g: print(x)
['A', 'B']
['A', 'B', 'B']
['A', 'B', 'B', 'B', 'B']
['A']
['A']
['A', 'B']

I assume that A is your breakpoint.
>>> A, B = 'A', 'B'
>>> x = [A,B, A,B,B, A,B,B,B,B, A, A, A,B]
>>> map(lambda arr: [i for i in arr[0]], map(lambda e: ['A'+e], ''.join(x).split('A')[1:]))
[['A', 'B'], ['A', 'B', 'B'], ['A', 'B', 'B', 'B', 'B'], ['A'], ['A'], ['A', 'B']]

Here's a simple generator to perform your task:
def gen_group(L):
    DELIMETER = "A"
    out = [DELIMETER]
    while L:
        for ind, elem in enumerate(L[1:]):
            if elem == DELIMETER:
                break
            else:
                out.append(elem)
        for i in range(ind + 1):
            L.pop(0)
        yield out
        out = [DELIMETER]
The idea is to cut down the list and yield the sublists until there is nothing left. This assumes the list starts with "A" (the DELIMETER variable).
Sample output:
for out in gen_group(l0):
    print out
Produces
['A', 'B']
['A', 'B', 'B']
['A', 'B', 'B', 'B', 'B']
['A']
['A']
['A', 'B']
['A']
Comparative Timings:
timeit.timeit(s, number=100000) is used to test each of the current answers, where s is the multiline string of the code (listed below):
                        Trial 1  Trial 2  Trial 3  Trial 4 | Avg
This answer (s1):       0.08247  0.07968  0.08635  0.07133 | 0.07995
Dilara Ismailova (s2):  0.77282  0.72337  0.73829  0.70574 | 0.73506
John Coleman (s3):      0.08119  0.09625  0.08405  0.08419 | 0.08642
This answer is the fastest, but it is very close. I suspect the difference is the additional argument and anonymous function in John Coleman's answer.
s1="""l0 = ["A","B", "A","B","B", "A","B","B","B","B", "A", "A", "A","B"]
def gen_group(L):
    out = ["A"]
    while L:
        for ind, elem in enumerate(L[1:]):
            if elem == "A":
                break
            else:
                out.append(elem)
        for i in range(ind + 1):
            L.pop(0)
        yield out
        out = ["A"]
out =gen_group(l0)"""
s2 = """A, B = 'A', 'B'
x = [A,B, A,B,B, A,B,B,B,B, A, A, A,B]
map(lambda arr: [i for i in arr[0]], map(lambda e: ['A'+e], ''.join(x).split('A')[1:]))"""
s3 = """def subListGenerator(f,items):
    i = 0
    n = len(items)
    while i < n:
        sublist = [items[i]]
        i += 1
        while i < n and not f(items[i]):
            sublist.append(items[i])
            i += 1
        yield sublist
items = ['A', 'B', 'A', 'B', 'B', 'A', 'B', 'B', 'B', 'B', 'A', 'A', 'A', 'B']
g = subListGenerator(lambda x: x == 'A',items)"""

The following works in this case. You could change the l[0] != boundary condition to whatever you need. The boundary is passed as an argument so that the function can be reused elsewhere.
def gen(l_arg, boundary):
    l = list(l_arg)  # Copy so the caller's list survives; use l = l_arg instead to save memory
    while l:
        sub_list = [l.pop(0)]
        while l and l[0] != boundary:  # Here boundary = 'A'
            sub_list.append(l.pop(0))
        yield sub_list
It assumes that there is an 'A' at the beginning of your list. It also copies the list, which isn't great when the list is gigabytes in size. You could remove the copy to save memory if you don't care about keeping the original list.
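
The answers above either index into a list or pop from its front, and list.pop(0) costs O(n) per call. As an alternative, here is a sketch of a single-pass generator that neither copies nor mutates its input and accepts any iterable. The name split_before and the is_boundary parameter are invented for this example:

```python
def split_before(iterable, is_boundary):
    """Yield lists, starting a new one at each item where is_boundary(item) is true."""
    group = []
    for item in iterable:
        if is_boundary(item) and group:
            yield group
            group = []
        group.append(item)
    if group:  # emit the final, still-open group
        yield group


l0 = ['A', 'B', 'A', 'B', 'B', 'A', 'B', 'B', 'B', 'B', 'A', 'A', 'A', 'B']
for g in split_before(l0, lambda x: x == 'A'):
    print(g)
```

Only the current group is held in memory, so the input can be a generator that reads from disk. Items that appear before the first boundary are emitted as a group of their own, which differs from the answers that assume the list starts with 'A'.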

Start loop after certain element in list is reached

How do I start executing code in a for loop after a certain element in the list has been reached? I've got something that works, but is there a more Pythonic or faster way of doing this?
list = ['a', 'b', 'c', 'd', 'e', 'f']
condition = 0
for i in list:
    if i == 'c' or condition == 1:
        condition = 1
        print i

One way would be to iterate over a generator combining dropwhile and islice:
from itertools import dropwhile, islice
data = ['a', 'b', 'c', 'd', 'e', 'f']
for after in islice(dropwhile(lambda L: L != 'c', data), 1, None):
    print after
If you want to include the matching element as well, drop the islice.
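
To make the difference between the two variants concrete, here is a small side-by-side sketch in Python 3 syntax (the names from_c and after_c are chosen for this example):

```python
from itertools import dropwhile, islice

data = ['a', 'b', 'c', 'd', 'e', 'f']

# Everything from the first 'c' onward, 'c' included.
from_c = list(dropwhile(lambda x: x != 'c', data))

# Same, but skip the matching element itself.
after_c = list(islice(dropwhile(lambda x: x != 'c', data), 1, None))

print(from_c)   # ['c', 'd', 'e', 'f']
print(after_c)  # ['d', 'e', 'f']
```

If 'c' never occurs, both results are simply empty, whereas an approach based on list.index would raise ValueError.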

A slightly simplified version:
lst = ['a', 'b', 'c', 'd', 'e', 'f']
start_index = lst.index('c')
for i in lst[start_index:]:
    print i

Python Remove SOME duplicates from a list while maintaining order?

I want to remove certain duplicates in my python list.
I know there are ways to remove all duplicates, but I wanted to remove only consecutive duplicates, while maintaining the list order.
For example, I have a list such as the following:
list1 = [a,a,b,b,c,c,f,f,d,d,e,e,f,f,g,g,c,c]
However, I want to remove the duplicates, and maintain order, but still keep the 2 c's and 2 f's, such as this:
wantedList = [a,b,c,f,d,e,f,g,c]
So far, I have this:
z = 0
j = 0
list2 = []
for i in list1:
    if i == "c":
        z = z+1
        if (z==1):
            list2.append(i)
        if (z==2):
            list2.append(i)
        else:
            pass
    elif i == "f":
        j = j+1
        if (j==1):
            list2.append(i)
        if (j==2):
            list2.append(i)
        else:
            pass
    else:
        if i not in list2:
            list2.append(i)
However, this method gives me something like:
wantedList = [a,b,c,c,d,e,f,f,g]
Thus, not maintaining the order.
Any ideas would be appreciated! Thanks!

Not completely sure if c and f are special cases, or if you want to compress consecutive duplicates only. If it is the latter, you can use itertools.groupby():
>>> import itertools
>>> list1
['a', 'a', 'b', 'b', 'c', 'c', 'f', 'f', 'd', 'd', 'e', 'e', 'f', 'f', 'g', 'g', 'c', 'c']
>>> [k for k, g in itertools.groupby(list1)]
['a', 'b', 'c', 'f', 'd', 'e', 'f', 'g', 'c']

To remove consecutive duplicates from a list, you can use the following generator function:
def remove_consecutive_duplicates(a):
    last = None
    for x in a:
        if x != last:
            yield x
        last = x
With your data, this gives:
>>> list1 = ['a','a','b','b','c','c','f','f','d','d','e','e','f','f','g','g','c','c']
>>> list(remove_consecutive_duplicates(list1))
['a', 'b', 'c', 'f', 'd', 'e', 'f', 'g', 'c']
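
One caveat with the generator above: because last starts out as None, a None at the very beginning of the input is silently dropped. If the data may contain None, a private sentinel object avoids this. The following is a sketch of that idea; the name _MISSING is an assumption made for this example:

```python
_MISSING = object()  # unique sentinel that is not equal to any real item


def remove_consecutive_duplicates(items):
    last = _MISSING
    for x in items:
        # Always emit the first item; afterwards emit only on a change.
        if last is _MISSING or x != last:
            yield x
        last = x


print(list(remove_consecutive_duplicates([None, None, 1, 1, None])))
# [None, 1, None]
```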

If you want to ignore certain items when removing duplicates...
list2 = []
for item in list1:
    if item not in list2 or item in ('c','f'):
        list2.append(item)
EDIT: Note that this doesn't remove consecutive items

EDIT
Never mind, I read your question wrong. I thought you were wanting to keep only certain sets of doubles.
I would recommend something like this. It provides a general way to keep certain doubles once.
list1 = ['a','a','b','b','c','c','f','f','d','d','e','e','f','f','g','g','c','c']
doubleslist = ['c', 'f']
def remove_duplicate(firstlist, doubles):
    newlist = []
    for x in firstlist:
        if x not in newlist:
            newlist.append(x)
        elif x in doubles:
            newlist.append(x)
            doubles.remove(x)
    return newlist
print remove_duplicate(list1, doubleslist)

The simple solution is to compare each element to the next or previous element:
a=1
b=2
c=3
d=4
e=5
f=6
g=7
list1 = [a,a,b,b,c,c,f,f,d,d,e,e,f,f,g,g,c,c]
output_list=[list1[0]]
for ctr in range(1, len(list1)):
    if list1[ctr] != list1[ctr-1]:
        output_list.append(list1[ctr])
print output_list

list1 = ['a', 'a', 'b', 'b', 'c', 'c', 'f', 'f', 'd', 'd', 'e', 'e', 'f', 'f', 'g', 'g', 'c', 'c']
wantedList = []
for item in list1:
    if len(wantedList) == 0:
        wantedList.append(item)
    elif len(wantedList) > 0:
        if wantedList[-1] != item:
            wantedList.append(item)
print(wantedList)
Fetch each item from the main list (list1).
If wantedList is empty, add the item.
If not, check whether the last item in wantedList differs from the item fetched from list1.
If the items are different, append the item to wantedList.
