Breaking out of a while loop when hitting a duplicate

I have this code:
steps = [['A', 'B', 'C', 'C', 'C'], ['D', 'E', 'F', 'F', 'F']]
for step in steps:
    while True:
        last_item = ""
        for item in step:
            if item != last_item:
                print(item)
                last_item = item
            else:
                break
The desired result is for the loop to print A, then B, then C, but when hitting the first duplicate C it should move on to printing D, then E, then F, and then stop when hitting the first duplicate F.
This is a minimal reproducible example of a loop to be used in a web scraping job, so solutions that involve calling set(steps) or otherwise transforming the example steps will not solve it. My question has to do with the architecture of the loop.

steps = [['A', 'B', 'C', 'C', 'C'], ['D', 'E', 'F', 'F', 'F']]
for step in steps:
    last_item = ""
    for item in step:
        if item != last_item:
            print(item)
            last_item = item
        else:
            break
If you keep while True, the break in the inner for loop exits only that for loop. Control returns to the while loop, which restarts the same step, so the outer for loop never advances to the next item (['D', 'E', 'F', 'F', 'F']) and the program loops forever.
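Since break only leaves the innermost loop, one way to make the control flow explicit is to move the inner loop into its own function. The following is a minimal sketch, not the asker's scraping code; the name items_until_repeat is made up for illustration, and a sentinel object stands in for the empty-string start value so that an empty first item is not mistaken for a repeat.

```python
def items_until_repeat(step):
    """Collect items from `step` until one equals the item just before it."""
    collected = []
    last_item = object()  # sentinel that never compares equal to real data
    for item in step:
        if item == last_item:
            break  # leaves this for loop only; the caller's loop continues
        collected.append(item)
        last_item = item
    return collected


steps = [['A', 'B', 'C', 'C', 'C'], ['D', 'E', 'F', 'F', 'F']]
results = [items_until_repeat(step) for step in steps]
print(results)  # [['A', 'B', 'C'], ['D', 'E', 'F']]
```

Because the function returns after the first consecutive repeat, the outer iteration over steps always moves on to the next sublist.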

Option with while loop, accessing objects by index:
steps = [['A', 'B', 'C', 'C', 'C'], ['D', 'E', 'F', 'F', 'F']]
i = 0
ii = 0
memo = []
res = []
while True:
    if i == len(steps): break
    e = steps[i][ii]
    if e in memo:
        res.append(memo)
        memo = []
        ii = 0
        i += 1
    else:
        memo.append(e)
        print(e)
        ii += 1
It prints out:
# A
# B
# C
# D
# E
# F
While res value is:
print(res) #=> [['A', 'B', 'C'], ['D', 'E', 'F']]
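Note that the index-based loop relies on every sublist containing a repeat: if a sublist ends without one, steps[i][ii] raises IndexError. A possible variant with the same "already seen in this sublist" test, written with plain for loops so it stops at the end of each sublist, might look like this (collect_until_seen is a hypothetical name):

```python
def collect_until_seen(steps):
    """For each sublist, keep items until one was already seen in that sublist."""
    res = []
    for step in steps:
        memo = []
        for e in step:
            if e in memo:
                break  # first repeat: stop reading this sublist
            memo.append(e)
        res.append(memo)  # also reached when the sublist has no repeat
    return res


print(collect_until_seen([['A', 'B', 'C', 'C', 'C'], ['D', 'E']]))
# [['A', 'B', 'C'], ['D', 'E']]
```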

You do not need while True. Except for that part your code works as expected:
steps = [['A', 'B', 'C', 'C', 'C'], ['D', 'E', 'F', 'F', 'F']]
for step in steps:
    # while True:
    last_item = ""
    for item in step:
        if item != last_item:
            print(item)
            last_item = item
        else:
            break
Output:
A
B
C
D
E
F

Remove the while loop from your code. A break exits only the innermost enclosing loop, which here is the inner for loop, so with while True wrapped around it the same step is restarted forever. To get your desired output, the break needs to hand control back to the outer for loop:
steps = [['A', 'B', 'C', 'C', 'C'], ['D', 'E', 'F', 'F', 'F']]
for step in steps:
    # while True:
    last_item = ""
    for item in step:
        if item != last_item:
            print(item)
            last_item = item
        else:
            break

Related

Variable value is changing between print and append

The variable changes between the print and the append: one value is printed, but a different one ends up saved.
If I run this code:
def swap(string,x,y):
    string[y], string[x] = string[x], string[y]
def permutations(string ,i=0):
    if i == len(string):
        yield string
    for x in range(i, len(string)):
        perm = string
        swap(perm,x,i)
        yield from permutations(perm, i+1)
        swap(perm,i,x)
result = []
test = permutations(['a','b','c'])
for x in test:
    print(x)
    result.append(x)
print(result)
It prints this, and I don't know why:
['a', 'b', 'c']
['a', 'c', 'b']
['b', 'a', 'c']
['b', 'c', 'a']
['c', 'b', 'a']
['c', 'a', 'b']
[['a', 'b', 'c'], ['a', 'b', 'c'], ['a', 'b', 'c'], ['a', 'b', 'c'], ['a', 'b', 'c'], ['a', 'b', 'c']]

You're mutating the same x in place, so only the final version of it is printed after the loop.
result.append(x) does not copy the object (x in this case), it just places a reference to it into the result list.
Do e.g. result.append(x[:]) or result.append(list(x)) to put copies of x into the result list.
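The difference between storing a reference and storing a copy can be seen in isolation. This is a small illustrative sketch; the variable names are made up and are not part of the question's code.

```python
row = ['a', 'b']
by_reference = []
by_copy = []

by_reference.append(row)   # stores a reference to the same list object
by_copy.append(row[:])     # stores a shallow copy taken at this moment

row[0], row[1] = row[1], row[0]  # mutate the original list in place

print(by_reference)            # [['b', 'a']]
print(by_copy)                 # [['a', 'b']]
print(by_reference[0] is row)  # True
```

The entry in by_reference follows every later change to row, while the entry in by_copy keeps the state it had when it was appended.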

Every yield hands back a reference to the same list, so each later in-place swap also changes the lists you have already stored. The quick fix is to yield a copy of the list:
def swap(string,x,y):
    string[y], string[x] = string[x], string[y]
def permutations(string ,i=0):
    if i == len(string):
        yield string.copy()
    for x in range(i, len(string)):
        perm = string
        swap(perm,x,i)
        yield from permutations(perm, i+1)
        swap(perm,i,x)
result = []
test = permutations(['a','b','c'])
for x in test:
    print(x)
    result.append(x)
print(result)
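One way to sanity-check a copy-yielding generator like this is to compare its output with itertools.permutations from the standard library. The sketch below re-implements the swap approach under a different, hypothetical name (swap_permutations) so that it does not clash with the imported function; the two produce the orderings in a different sequence, so the comparison sorts both.

```python
from itertools import permutations as std_permutations


def swap_permutations(items, i=0):
    """Yield every ordering of `items`, each as a fresh list."""
    if i == len(items):
        yield items.copy()  # copy, so later swaps cannot alter stored results
        return
    for x in range(i, len(items)):
        items[i], items[x] = items[x], items[i]
        yield from swap_permutations(items, i + 1)
        items[i], items[x] = items[x], items[i]  # undo the swap


ours = sorted(swap_permutations(['a', 'b', 'c']))
expected = sorted(list(p) for p in std_permutations(['a', 'b', 'c']))
print(ours == expected)  # True
```

If the order of results does not matter, itertools.permutations alone may be enough; it yields tuples rather than lists.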

Replacing a slice of several elements in a list with one element

I am trying to find a slice, of variable size, in a list and replace it with one element:
ls = ['c', 'b', 'c', 'd', 'c']
lt = ['b', 'c']
r = 'bc'
for s,next_s in zip(ls, ls[1:]):
    for t, next_t in zip(lt, lt[1:]):
        if (s, next_s) == (t, next_t):
            i = ls.index(s)
            ii = ls.index(next_s)
            del ls[i]
            del ls[ii]
            ls.insert(i, r)
print (ls)
This works only sometimes, producing:
['c', 'bc', 'd', 'c']
but if lt = ['d', 'c'] and r = 'dc', it fails producing:
['b', 'c', 'c', 'dc']
How to fix that? Or what is a better way to handle this?

Simple way that might work for you (depends on whether lt can appear multiple times and on what to do then).
ls = ['c', 'b', 'c', 'd', 'c']
lt = ['b', 'c']
r = 'bc'
for i in range(len(ls)):
    if ls[i:i+len(lt)] == lt:
        ls[i:i+len(lt)] = [r]
print(ls)
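If lt can appear several times and every non-overlapping occurrence should be replaced, a version that builds a new list instead of editing ls in place may be easier to reason about. This is a sketch under that assumption; replace_sublist is a made-up name, and both arguments are assumed to be lists so that the slice comparison works.

```python
def replace_sublist(seq, target, replacement):
    """Return a new list with each non-overlapping run equal to `target` replaced."""
    if not target:
        return list(seq)  # nothing to search for
    out = []
    i = 0
    n = len(target)
    while i < len(seq):
        if seq[i:i + n] == target:
            out.append(replacement)
            i += n  # skip past the matched run
        else:
            out.append(seq[i])
            i += 1
    return out


ls = ['c', 'b', 'c', 'd', 'c']
print(replace_sublist(ls, ['b', 'c'], 'bc'))  # ['c', 'bc', 'd', 'c']
print(replace_sublist(ls, ['d', 'c'], 'dc'))  # ['c', 'b', 'c', 'dc']
```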

Python: Keep first occurrence of an item in a list

How can I remove all occurrences of a specific value in a list except for the first occurrence?
E.g. I have a list:
letters = ['a', 'b', 'c', 'c', 'c', 'd', 'c', 'a', 'a', 'c']
And I need a function that looks something like this:
preserve_first(letters, 'c')
And returns this:
['a', 'b', 'c', 'd', 'a', 'a']
That is, it should remove all but the first occurrence of the given value while otherwise preserving the order. If there is a way to do this with a pandas.Series, that would be even better.

You want to remove duplicates of 'c' only. So you want to keep the entries where the series is either not duplicated at all or not equal to 'c'. I like to use pd.Series.ne in place of != because the reduction in wrapping parentheses adds to readability (my opinion).
import pandas as pd
s = pd.Series(letters)
s[s.ne('c') | ~s.duplicated()]
0    a
1    b
2    c
5    d
7    a
8    a
dtype: object
To do exactly what was asked for:
def preserve_first(letters, letter):
    s = pd.Series(letters)
    return s[s.ne(letter) | ~s.duplicated()].tolist()
preserve_first(letters, 'c')
['a', 'b', 'c', 'd', 'a', 'a']

A general Python solution:
def keep_first(iterable, value):
    it = iter(iterable)
    for val in it:
        yield val
        if val == value:
            yield from (el for el in it if el != value)
This yields all items up to and including the first value if found, then yields the rest of the iterable filtering out items matching the value.
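A short usage sketch of this generator follows, with the function repeated so the example runs on its own. The sample inputs other than letters are made up for illustration.

```python
def keep_first(iterable, value):
    it = iter(iterable)
    for val in it:
        yield val
        if val == value:
            # From here on, the same iterator is drained through a filter.
            yield from (el for el in it if el != value)


letters = ['a', 'b', 'c', 'c', 'c', 'd', 'c', 'a', 'a', 'c']
print(list(keep_first(letters, 'c')))         # ['a', 'b', 'c', 'd', 'a', 'a']
print(list(keep_first(iter('abcabc'), 'c')))  # ['a', 'b', 'c', 'a', 'b']
print(list(keep_first(letters, 'z')) == letters)  # True: nothing to remove
```

Because it works on any iterable and never indexes into it, the same function also accepts generators and other one-shot iterators.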

You can try this using generators:
def conserve_first(l, s):
    last_seen = False
    for i in l:
        if i == s and not last_seen:
            last_seen = True
            yield i
        elif i != s:
            yield i
letters = ['a', 'b', 'c', 'c', 'c', 'd', 'c', 'a', 'a', 'c']
print(list(conserve_first(letters, "c")))
Output:
['a', 'b', 'c', 'd', 'a', 'a']

Late to the party, but
letters = ['a', 'b', 'c', 'c', 'c', 'd', 'c', 'a', 'a', 'c']
def preserve_first(data, letter):
    new = []
    count = 0
    for i in data:
        if i not in new:
            if i == letter and count == 0:
                new.append(i)
                count+=1
            elif i == letter and count == 1:
                continue
            else:
                new.append(i)
        else:
            if i == letter and count == 1:
                continue
            else:
                new.append(i)
    return new
l = preserve_first(letters, "c")

You can use a list filter and slices:
def preserve_first(letters, elem):
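    if elem not in letters:
        return list(letters)  # nothing to remove
    index = letters.index(elem)
    # Keep everything up to the first match, then drop later matches.
    return letters[:index + 1] + [a for a in letters[index + 1:] if a != elem]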

Doesn't use pandas but this is a simple algorithm to do the job.
def preserve_firsts(letters, target):
    firsts = []
    seen = False
    for letter in letters:
        if letter == target:
            if not seen:
                firsts.append(letter)
                seen = True
        else:
            firsts.append(letter)
    return firsts
> letters = ['a', 'b', 'c', 'c', 'c', 'd', 'c', 'a', 'a']
> preserve_firsts(letters, 'c')
['a', 'b', 'c', 'd', 'a', 'a']

Simplest solution I could come up with.
letters = ['a', 'b', 'c', 'c', 'c', 'd', 'c', 'a', 'a', 'c']
key = 'c'
def preserve_first(letters, key):
    first_occurrence = letters.index(key)
    return [item for i, item in enumerate(letters) if i == first_occurrence or item != key]

Start loop after certain element in list is reached

How do I start executing code in a for loop after a certain element in the list has been reached? I've got something that works, but is there a more Pythonic or faster way of doing this?
list = ['a', 'b', 'c', 'd', 'e', 'f']
condition = 0
for i in list:
    if i == 'c' or condition == 1:
        condition = 1
        print(i)

One way would be to iterate over a generator combining dropwhile and islice:
from itertools import dropwhile, islice
data = ['a', 'b', 'c', 'd', 'e', 'f']
for after in islice(dropwhile(lambda L: L != 'c', data), 1, None):
    print(after)
If you want to include the matching element itself, drop the islice.

A little simplified code:
lst = ['a', 'b', 'c', 'd', 'e', 'f']
start_index = lst.index('c')
for i in lst[start_index:]:
    print(i)
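Keep in mind that lst.index('c') raises ValueError when 'c' is not in the list, and that slicing copies the tail of the list. If either matters, an iterator-based sketch along these lines could be used instead; from_marker is a hypothetical helper name, not something from the answers above.

```python
def from_marker(items, marker):
    """Yield `marker` and everything after its first occurrence.

    Yields nothing if `marker` never appears.
    """
    it = iter(items)
    for item in it:
        if item == marker:
            yield item
            break
    yield from it  # whatever the first loop did not consume


lst = ['a', 'b', 'c', 'd', 'e', 'f']
print(list(from_marker(lst, 'c')))  # ['c', 'd', 'e', 'f']
print(list(from_marker(lst, 'z')))  # []
```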

Python Remove SOME duplicates from a list while maintaining order?

I want to remove certain duplicates in my python list.
I know there are ways to remove all duplicates, but I wanted to remove only consecutive duplicates, while maintaining the list order.
For example, I have a list such as the following:
list1 = [a,a,b,b,c,c,f,f,d,d,e,e,f,f,g,g,c,c]
However, I want to remove the duplicates, and maintain order, but still keep the 2 c's and 2 f's, such as this:
wantedList = [a,b,c,f,d,e,f,g,c]
So far, I have this:
z = 0
j=0
list2=[]
for i in list1:
    if i == "c":
        z = z+1
        if (z==1):
            list2.append(i)
        if (z==2):
            list2.append(i)
        else:
            pass
    elif i == "f":
        j = j+1
        if (j==1):
            list2.append(i)
        if (j==2):
            list2.append(i)
        else:
            pass
    else:
        if i not in list2:
            list2.append(i)
However, this method gives me something like:
wantedList = [a,b,c,c,d,e,f,f,g]
Thus, not maintaining the order.
Any ideas would be appreciated! Thanks!

Not completely sure if c and f are special cases, or if you want to compress consecutive duplicates only. If it is the latter, you can use itertools.groupby():
>>> import itertools
>>> list1
['a', 'a', 'b', 'b', 'c', 'c', 'f', 'f', 'd', 'd', 'e', 'e', 'f', 'f', 'g', 'g', 'c', 'c']
>>> [k for k, g in itertools.groupby(list1)]
['a', 'b', 'c', 'f', 'd', 'e', 'f', 'g', 'c']

To remove consecutive duplicates from a list, you can use the following generator function:
def remove_consecutive_duplicates(a):
    last = None
    for x in a:
        if x != last:
            yield x
        last = x
With your data, this gives:
>>> list1 = ['a','a','b','b','c','c','f','f','d','d','e','e','f','f','g','g','c','c']
>>> list(remove_consecutive_duplicates(list1))
['a', 'b', 'c', 'f', 'd', 'e', 'f', 'g', 'c']

If you want to ignore certain items when removing duplicates...
list2 = []
for item in list1:
    if item not in list2 or item in ('c','f'):
        list2.append(item)
EDIT: Note that this doesn't remove consecutive duplicates of the exempted items ('c' and 'f').

EDIT
Never mind, I read your question wrong. I thought you were wanting to keep only certain sets of doubles.
I would recommend something like this. It allows a general form to keep certain doubles once.
list1 = ['a','a','b','b','c','c','f','f','d','d','e','e','f','f','g','g','c','c']
doubleslist = ['c', 'f']
def remove_duplicate(firstlist, doubles):
    newlist = []
    for x in firstlist:
        if x not in newlist:
            newlist.append(x)
        elif x in doubles:
            newlist.append(x)
            doubles.remove(x)
    return newlist
print(remove_duplicate(list1, doubleslist))

The simple solution is to compare this element to the next or previous element
a=1
b=2
c=3
d=4
e=5
f=6
g=7
list1 = [a,a,b,b,c,c,f,f,d,d,e,e,f,f,g,g,c,c]
output_list=[list1[0]]
for ctr in range(1, len(list1)):
    if list1[ctr] != list1[ctr-1]:
        output_list.append(list1[ctr])
print(output_list)

list1 = ['a', 'a', 'b', 'b', 'c', 'c', 'f', 'f', 'd', 'd', 'e', 'e', 'f', 'f', 'g', 'g', 'c', 'c']
wantedList = []
for item in list1:
    if len(wantedList) == 0:
        wantedList.append(item)
    elif len(wantedList) > 0:
        if wantedList[-1] != item:
            wantedList.append(item)
print(wantedList)
Fetch each item from the main list (list1).
If wantedList is empty, add that item.
If not, check whether the last item in wantedList differs from the item fetched from list1.
If the items are different, append the item to wantedList.
