Python: find smallest missing positive integer in ordered list - python

I need to find the first missing number in a sorted list. If no number is missing, the result should be the last number + 1.
It should first check whether the first number is > 1, and if so the new number should be 1.
Here is what I tried. The problem is this line: if next_value - items > 1:
It raises an error because previous_value is None at the beginning of the list and next_value is None at the end.
list = [1,2,5]
vlans3=list
for items in vlans3:
    if items in vlans3:
        index = vlans3.index(items)
        previous_value = vlans3[index-1] if index -1 > -1 else None
        next_value = vlans3[index+1] if index + 1 < len(vlans3) else None
        first = vlans3[0]
        last = vlans3[-1]
        #print ("index: ", index)
        print ("prev item:", previous_value)
        print ("-cur item:", items)
        print ("nxt item:", next_value)
        #print ("_free: ", _free)
        #print ("...")
        if next_value - items > 1:
            _free = previous_value + 1
            print ("free: ",_free)
            break
print ("**************")
print ("first item:", first)
print ("last item:", last)
print ("**************")
Another method:
L = vlans3
free = ([x + 1 for x, y in zip(L[:-1], L[1:]) if y - x > 1][0])
This gives the correct number if there is a gap between the numbers, but if there is no gap it fails with IndexError: list index out of range. I need to specify somehow that if there is no free number it should give a new number (last + 1). But the code below gives an error and I do not know why.
if free = []:
    print ("no free")
else:
    print ("free: ", free)

To get the smallest integer that is not a member of vlans3:
ints_list = range(min(vlans3), max(vlans3) + 1)
missing_list = [x for x in ints_list if x not in vlans3]
first_missing = min(missing_list)
However you want to return 1 if the smallest value in your list is greater than 1, and the last value + 1 if there are no missing values, so this becomes:
ints_list = [1] + list(range(min(vlans3), max(vlans3) + 2))
missing_list = [x for x in ints_list if x not in vlans3]
first_missing = min(missing_list)
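Because the list is already sorted, the same result can also be found in a single pass, without the repeated `x not in vlans3` membership test. The sketch below is one possible way to do that; the name `first_missing_id` and its parameter are hypothetical, and it assumes a sorted list of positive integers.

```python
def first_missing_id(sorted_ids):
    """Return the smallest positive integer not present in a sorted list."""
    expected = 1
    for value in sorted_ids:
        if value > expected:
            # A gap starts here, so `expected` is the first free number.
            break
        if value == expected:
            expected += 1
    return expected


print(first_missing_id([1, 2, 5]))  # gap after 2
print(first_missing_id([2, 3]))     # list does not start at 1
print(first_missing_id([1, 2, 3]))  # no gap: last + 1
```

An empty list yields 1 under this sketch, since 1 is then the smallest free positive integer.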

First, avoid using the built-in name list for a variable.
Second, use try/except to handle this kind of issue quickly and neatly.
def free(l):
    if l == []: return 0
    if l[0] > 1: return 1
    if l[-1] - l[0] + 1 == len(l): return l[-1] + 1
    for i in range(len(l)):
        try:
            if l[i+1] - l[i] > 1: break
        except IndexError:
            break
    return l[i] + 1

How about a numpy solution? The code below works if your input is a sorted list of positive integers without duplicates (or is empty).
nekomatic's solution is a bit faster for small inputs, but the difference is only a fraction of a second and doesn't really matter. However, it does not scale to large inputs: e.g. list(range(1,100000)) completely freezes on the list comprehension with its inclusion check. The code below does not have this issue.
import numpy as np
def first_free_id(array):
    # np.int was removed from recent NumPy releases; the builtin int is equivalent here
    array = np.concatenate((np.array([-1, 0], dtype=int), np.array(array, dtype=int)))
    where_sequence_breaks = np.where(np.diff(array) > 1)[0]
    return where_sequence_breaks[0] if len(where_sequence_breaks)>0 else array[-1]+1
Prepend the array with -1 and 0 so np.diff works for empty and 1-element lists without breaking the existing sequence's continuity.
Compute the differences between consecutive values. The discontinuity we are looking for (a "hole") is where the difference is bigger than 1.
If there are any "holes", return the id of the first one; otherwise return the integer succeeding the last element.

Related

If, else return else value even when the condition is true, inside a for loop

Here is the function i defined:
def count_longest(field, data):
    l = len(field)
    count = 0
    final = 0
    n = len(data)
    for i in range(n):
        count = 0
        if data[i:i + l] is field:
            while data[i - l: i] == data[i:i + l]:
                count = count + 1
                i = i + 1
        else:
            print("OK")
        if final == 0 or count >= final:
            final = count
    return final
a = input("Enter the field - ")
b = input("Enter the data - ")
print(count_longest(a, b))
It works in some cases but gives incorrect output in most. I checked by printing the strings being compared, and even when they match, the loop prints "OK", which should only happen when the condition is not true. I don't get it!
Taking the simplest example: if I enter 'as' when prompted for the field and 'asdf' when prompted for the data, I should get count = 1, as the longest run of the substring 'as' in the string 'asdf' has length one. But I still get final as 0 at the end of the program. I added the else statement just to check whether the if condition was being satisfied, and the program printed 'OK', meaning it was not, even though at the very beginning data[0 : 0 + 2] is equal to 'as', 2 being the length of the field.
There are a few things I notice when looking at your code.
First, use == rather than is to test for equality. The is operator checks if the left and right are referring to the very same object, whereas you want to properly compare them.
The following code shows that even numerical results that are equal might not be one and the same Python object:
print(2 ** 31 is 2 ** 30 + 2 ** 30) # <- False
print(2 ** 31 == 2 ** 30 + 2 ** 30) # <- True
(note: the first expression could either be False or True—depending on your Python interpreter).
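The difference between identity and equality is easiest to see with objects that are never cached by the interpreter. The following is a small sketch using lists; the variable names are hypothetical and only serve the illustration.

```python
field = [1, 2]
copy_of_field = list(field)  # equal contents, but a separate object
alias = field                # a second name for the very same object

same_value = field == copy_of_field    # compares contents
same_object = field is copy_of_field   # compares identity
alias_is_same = alias is field

print(same_value, same_object, alias_is_same)
```

A slice such as data[i:i + l] generally produces a new object as well, which is why comparing it to field with is is unreliable and == is the right operator.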
Second, the while-loop looks rather suspicious. If you know you have found your sequence "as" at position i, you are repeating the while-loop as long as it is the same as in position i-1—which is probably something else, though. So, a better way to do the while-loop might be like so:
while data[i: i + l] == field:
    count = count + 1
    i = i + l # <- increase by l (length of field) !
Finally, something that might be surprising: changing the variable i inside the while-loop has no effect on the for-loop. That is, in the following example, the output will still be 0, 1, 2, 3, ..., 9, although it looks like it should skip every other element.
for i in range(10):
    print(i)
    i += 1
It does not affect the outcome of the function, but when debugging you might observe that the function seems to go backward after having found a run and go through parts of it again, resulting in additional "OK"s being printed.
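To make the behaviour concrete, here is a short sketch that records which indices each kind of loop actually visits. The list names `visited_by_for` and `visited_by_while` are hypothetical; the point is only that a for loop rebinds i on every iteration, whereas a while loop keeps whatever value the body assigned.

```python
visited_by_for = []
for i in range(5):
    visited_by_for.append(i)
    i += 2  # only changes i until the next iteration rebinds it

visited_by_while = []
i = 0
while i < 5:
    visited_by_while.append(i)
    i += 2  # here the change persists, so elements really are skipped

print(visited_by_for)
print(visited_by_while)
```

If skipping ahead is the intent, a while loop with an explicit index is one way to get it.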
UPDATE: Here is the complete function according to my remarks above:
def count_longest(field, data):
    l = len(field)
    count = 0
    final = 0
    n = len(data)
    for i in range(n):
        count = 0
        while data[i: i + l] == field:
            count = count + 1
            i = i + l
        if count >= final:
            final = count
    return final
Note that I made two additional simplifications. With my changes, you end up with an if and a while that share the same condition, i.e.:
if data[i:i+l] == field:
    while data[i:i+l] == field:
        ...
In that case, the if is superfluous since it is already included in the condition of the while.
Secondly, the condition if final == 0 or count >= final: can be simplified to just if count >= final:.

having error while jumping in list

You are given a list jumps of positive and negative integers which signify forward or backward jumps.
Starting at index 0, you jump to index 0+jumps[0]. In general, if you are at index k, you jump to index k+jumps[k]. Let's say jumps[0] was 2: the index becomes 2. Assuming jumps[2] was -1, the index would then become 1.
Write the function list_jumps(jumps) where jumps is the aforementioned list. The function should return the string 'cycle' if the index will never leave the boundaries of the input list; otherwise it must return 'out-of-bounds'. The starting index is always 0.
dm=[3,0,0,-2]
def list_jump(jumps):
    xs=jumps
    xy=jumps
    max=len(xs)
    while(True):
        p = int(xs[0])
        print p
        for i in range (0,max,p):
            print i,"iii"
            p = xs[i]
            if xs[i]=="visited":
                return False
            else:
                xs[i]="visited"
        print xs
    return True
if list_jump(dm):
    print "not cycle"
else:
    print "cycle"
I really don't need a solution. I just want to know what the error is.
Note that I would just make a set called seen and add each index to it.
def list_jump(jumps):
    i = 0
    seen = {0}
    while True:
        try:
            i += jumps[i]
            if i < 0:
                # negative indices are legal, but we should exit if we have one.
                return "out-of-bounds"
        except IndexError:
            # we've escaped the list: non-cyclic!
            return "out-of-bounds"
        else:
            if i in seen:
                return "cycle"
            else:
                seen.add(i)
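The same idea can also be written with an explicit bounds check instead of relying on IndexError. This is only a sketch of that variant, using the name list_jumps from the problem statement; the local names index and seen are illustrative.

```python
def list_jumps(jumps):
    """Return 'cycle' if the walk revisits an index, else 'out-of-bounds'."""
    index = 0
    seen = set()
    # Keep walking only while the index is a valid, non-negative position.
    while 0 <= index < len(jumps):
        if index in seen:
            return "cycle"
        seen.add(index)
        index += jumps[index]
    return "out-of-bounds"


print(list_jumps([3, 0, 0, -2]))
print(list_jumps([2, 0, 5]))
```

Since the walk is deterministic, reaching any index a second time means it will repeat forever, which is why a set of visited indices is enough to detect the cycle.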

How to run a function for each row in a nested list?

I'm a Python newbie and trying to write code that checks if a nested list has a valid set of numbers. Each row and each column have to be valid. I have written a function called check_sequence which validates if a list has a valid set of numbers. How would I call that function from another to check to see if the row is valid? So for example, I need something like this for check_rows:
check_sequence(list):
    checks if list is valid
check_rows(list):
    For each of the rows in the nested list call check_sequence
Here is my code for check_sequence:
def check_sequence(mylist):
    pos = 0
    sequence_counter = 1
    while pos < len(mylist):
        print "The pos is: " + " " + str(pos)
        print "The sequence_counter is:" + " " + str(sequence_counter)
        for number in mylist:
            print "The number is:" + " " + str(number)
            if number == sequence_counter:
                sequence_counter = sequence_counter + 1
                pos = pos + 1
                break
            else:
                # if list is at the last position on the last item
                if sequence_counter not in mylist:
                    print "The pos is:" + " " + str(pos) + " and the last position is:" + " " + str(mylist[len(mylist) - 1])
                    print "False"
                    return False
    print "True"
    return True
So I'd call the main method like below:
check_square([[1, 2, 3],
              [2, 3, 1],
              [3, 1, 2]])
def check_square(list):
    if check_rows() and check_columns() == True:
        return True
    else:
        return False
Here's a solution that'll work for any arbitrary 2D list.
from functools import reduce  # reduce is no longer a builtin in Python 3
l = [[1,2,3],[1,2],[1,4,5,6,7]]
try:
    if len([1 for x in reduce(lambda x, y :x + y, l) if type(x) != type(0)]) > 0:
        raise Exception
except Exception:
    pass # error, do something
The intuition is to flatten the list and then check whether each element is an int.
Given the nested list is row oriented (the rows are the lowest dimension), you can simply use:
def check_rows(nested):
    return all(check_sequence(sublist) for sublist in nested)
Here we use the all(..) builtin: it evaluates to True if and only if every element produced by the generator is truthy; otherwise the result is False. So as soon as one of the rows is not valid, the matrix is not valid.
If on the other hand the nested list is column oriented (the columns are the lowest dimension), we first need to transpose it using zip:
def check_rows(nested):
    return all(check_sequence(list(sublist)) for sublist in zip(*nested))
The zip(*..) transposes the list and we use list(..) to make sure that check_sequence(..) is still working with lists (if any iterable is sufficient, the list(..) part can be omitted).
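Putting the row check and the transposed column check together could look like the sketch below. The check_sequence shown here is a simplified stand-in (an assumption, not the asker's implementation): it treats a sequence as valid when it contains each of 1..n exactly once. The names check_columns and square are likewise illustrative.

```python
def check_sequence(values):
    # Simplified stand-in: valid when the values are exactly 1..n in any order.
    return sorted(values) == list(range(1, len(values) + 1))


def check_rows(square):
    return all(check_sequence(row) for row in square)


def check_columns(square):
    # zip(*square) yields the columns of the nested list as tuples.
    return all(check_sequence(column) for column in zip(*square))


def check_square(square):
    return check_rows(square) and check_columns(square)


print(check_square([[1, 2, 3],
                    [2, 3, 1],
                    [3, 1, 2]]))
print(check_square([[1, 2, 3],
                    [1, 2, 3],
                    [1, 2, 3]]))
```

Both helpers stop at the first invalid row or column, because all(..) short-circuits on the first falsy result.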
Are you looking for an iterative for loop?
def check_sequence(row):
    pass  # your check here
def check_rows(rows):
    for row in rows:
        if not check_sequence(row):
            return False
    return True
You have to split this into two functions, where the first one combines the result of the second one over every row:
def check_sequence(lis):
    ret = True
    for row in lis:
        ret = ret and check_rows(row)
    return ret
def check_rows(row):
    ret = True
    for elem in row:
        pass #do your checking
    return ret
A concrete example could be:
l = [[1,2,3],[1,2],[1,4,5,6,7]]
def check_sequence(lis):
    ret = True
    for row in lis:
        ret = ret and check_rows(row)
    return ret
def check_rows(row):
    return 1 in row #ask if 1 belongs to the list
check_sequence(l)  # ---> True
check_sequence([[1],[2,3]])  # ---> False

Assign new values to an array based on a function ran on other 3 dimensional array

I have a multiband raster where I want to apply a function to the values that each pixel has across all the bands. Depending on the result, a new value is assigned, and a new single-band raster is generated from these new values. For example, if a pixel has increasing values across the bands, the value "1" will be assigned to that pixel in the resulting raster. I am doing some tests on a three-dimensional array using numpy, but I am not able to resolve the last part, where the new values are assigned.
The function to be applied to the three-dimensional array is Trend(List), which I have defined at the beginning. To make it easier to iterate through the array values on the z (or 0) axis, I have used np.swapaxes (thank you @Fabricator for this). The problem comes when assigning new values to the new_band[i,j] array, so that the result of Trend(List) over the list:
[myArraySw[0,0]] will be assigned to new_band[0,0]
[myArraySw[0,1]] will be assigned to new_band[0,1]
[myArraySw[0,2]] will be assigned to new_band[0,2]
[myArraySw[0,3]] will be assigned to new_band[0,3]
................................................
[myArraySw[3,3]] will be assigned to new_band[3,3]
Some values are assigned, but some are not. For example, new_band[0,1] should be "2" but is "0". The same goes for new_band[3,0], new_band[3,1], new_band[3,2] and new_band[3,3], which should be "5" but are "0". Other values look alright. Where could the problem be?
Thank you
Here is the code:
import os
import numpy as np
def decrease(A):
    return all(A[i] >= A[i+1] for i in range(len(A)-1))
def increase(A):
    return all(A[i] <= A[i+1] for i in range(len(A)-1))
def Trend(List):
    if all(List[i] == List[i+1] for i in range(len(List)-1))==1:
        return "Trend: Stable"
    else:
        a=[value for value in List if value !=0]
        MaxList = a.index(max(a)) #max of the list
        MinList=a.index(min(a)) #min of the list
        SliceInc=a[:MaxList] #slice until max
        SliceDec=a[MaxList:] #slice after max value
        SliceDec2=a[:MinList] #slice until min value
        SliceInc2=a[MinList:] #slice after min value
        if all(a[i] <= a[i+1] for i in range(len(a)-1))==1:
            return "Trend: increasing"
        elif all(a[i] >= a[i+1] for i in range(len(a)-1))==1:
            print "Trend: decreasing"
        elif increase(SliceInc)==1 and decrease(SliceDec)==1:
            return "Trend: Increasing and then Decreasing"
        elif decrease(SliceDec2)==1 and increase(SliceInc2)==1:
            return "Trend: Decreasing and then Increasing"
        else:
            return "Trend: mixed"
myArray = np.zeros((4,4,4)) # generating an example array to try the above functions on
myArray[1,0,0] = 2
myArray[3,0,0] = 4
myArray[1,0,1] = 10
myArray[3,0,1] = 8
myArray[0,1,2] = 5
myArray[1,1,2] = 7
myArray[2,1,2] = 4
print "\n"
print "This is the original: "
print "\n"
print myArray
print "\n"
print "\n"
myArraySw = np.swapaxes(np.swapaxes(myArray,0,2),0,1) # swapping axes so that I can iterate through the lists
print "\n"
print "This is the swapped: "
print "\n"
print myArraySw
print "\n"
new_band = np.zeros_like(myArray[0]) # create a new array to store the results of the functions
for j in range(3):
    for i in range(3):
        if Trend(myArraySw[i,j]) == "Trend: increasing":
            new_band[i,j] = 1
        elif Trend(myArraySw[i,j]) == "Trend: decreasing":
            new_band[i,j] = 2
        elif Trend(myArraySw[i,j]) == "Trend: Increasing and then Decreasing":
            new_band[i,j] = 3
        elif Trend(myArraySw[i,j]) == "Trend: Decreasing and then Increasing":
            new_band[i,j] = 4
        elif Trend(myArraySw[i,j]) == "Trend: Stable":
            new_band[i,j] = 5
        elif Trend(myArraySw[i,j]) == "Trend: mixed":
            new_band[i,j] = 6
print "\n"
print "The new array is: "
print "\n"
print new_band
At least part of the problem is that when you typed:
        elif all(a[i] >= a[i+1] for i in range(len(a)-1))==1:
            print "Trend: decreasing"
you probably meant to type this:
        elif all(a[i] >= a[i+1] for i in range(len(a)-1))==1:
            return "Trend: decreasing"
            ^^^^^^
Also, if you don't mind a little unsolicited advice, the code you've posted has a pretty strong "code smell" - you're doing a lot of things in unnecessarily complicated ways. Good on you for getting it done anyways, but I think you'll find this sort of thing easier if you work through some python tutorial problem sets, and read over the given solutions to see how more experienced programmers handle common tasks. You'll discover easier ways to implement many of the elements of this and future projects.
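As one illustration of a simpler structure, the classifier can return the integer code directly and the loops can take their bounds from the data rather than from a hard-coded range(3), which on a 4x4 band never reaches index 3. The sketch below is a deliberately reduced version under stated assumptions: it uses plain nested lists instead of a numpy array, it keeps only the stable, increasing, decreasing and mixed cases with the codes 5, 1, 2 and 6 from the question, and the names trend_code and pixels are hypothetical.

```python
def trend_code(series):
    """Return 5 for stable, 1 for increasing, 2 for decreasing, 6 otherwise."""
    pairs = list(zip(series, series[1:]))  # consecutive (previous, next) values
    if all(a == b for a, b in pairs):
        return 5
    if all(a <= b for a, b in pairs):
        return 1
    if all(a >= b for a, b in pairs):
        return 2
    return 6


# Hypothetical 2x2 "raster": each pixel holds its values across 3 bands.
pixels = [[[1, 2, 3], [3, 2, 1]],
          [[4, 4, 4], [1, 3, 2]]]

rows, cols = len(pixels), len(pixels[0])
# Iterate over every row and column so that no pixel is left unassigned.
new_band = [[trend_code(pixels[i][j]) for j in range(cols)]
            for i in range(rows)]

print(new_band)
```

Returning codes from a single function also means each pixel is classified once, instead of calling the classifier again in every elif branch.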

How to find the position of first instance of duplicates in two equal length, sorted lists

I have two random lists of same length, in range of 0 to 99.
lista = [12,34,45,56,66,80,89,90]
listb = [13,30,56,59,72,77,80,85]
I need to find the first instance of a duplicate number, and in what list it is from.
In this example, I need to find the number '56' in listb, and get the index i = 2
Thanks.
Update:
After running it a couple of times, I got this error:
    if list_a[i] == list_b[j]:
IndexError: list index out of range
As @Asterisk suggested, my two lists are of equal length and sorted, and both i and j are set to 0 at the beginning.
That bit is part of a genetic crossover function:
def crossover(r1,r2):
    i=random.randint(1,len(domain)-1) # Indices not at edges of domain
    if set(r1) & set(r2) == set([]): # If lists are different, splice at random
        return r1[0:i]+r2[i:]
    else: # Lists have duplicates
        # Duplicates At Edges
        if r1[0] == r2[0]: # If [0] is double, retain r1
            return r1[:1]+r2[1:]
        if r1[-1] == r2[-1]: # If [-1] is double, retain r2
            return r1[:-1]+r2[-1:]
        # Duplicates In Middle
        else: # Splice at first duplicate point
            i1, i2 = 0, 0
            index = ()
            while i1 < len(r1):
                if r1[i1] == r2[i2]:
                    if i1 < i2:
                        index = (i1, r1, r2)
                    else:
                        index = (i2, r2, r1)
                    break
                elif r1[i1] < r2[i2]:
                    i1 += 1
                else:
                    i2 += 1
            # Return A Splice In Relation To What List It Appeared First
            # Eliminates Further Duplicates In Original Lists
            return index[2][:index[0]+1]+index[1][index[0]+1:]
The function takes 2 lists and returns one.
domain is a list of 10 tuples: (0,99).
As I said, the error doesn't happen every time, only once in a while.
I appreciate your help.
I'm not a python guy, but this is an algorithm question...
You maintain an index into each list and you look at the elements at those two list positions.
Whichever list has the smallest element at the current position, you move to the next element in that list.
When you find an element that is the same as the other list's current element, that is your smallest duplicate.
If you reach the end of either list, there are no duplicates.
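The steps above can be sketched in Python as follows. The function name first_common and its return shape are assumptions made for illustration; the sketch assumes both lists are sorted in ascending order. Checking both indices in the loop condition keeps either one from running past the end of its list.

```python
def first_common(list_a, list_b):
    """Return (value, index_in_a, index_in_b) of the smallest shared value, or None."""
    i, j = 0, 0
    while i < len(list_a) and j < len(list_b):
        if list_a[i] == list_b[j]:
            return list_a[i], i, j
        if list_a[i] < list_b[j]:
            i += 1  # advance in the list holding the smaller element
        else:
            j += 1
    return None  # reached the end of one list: no shared value


lista = [12, 34, 45, 56, 66, 80, 89, 90]
listb = [13, 30, 56, 59, 72, 77, 80, 85]
print(first_common(lista, listb))
```

Each step advances exactly one index, so the loop runs at most len(list_a) + len(list_b) times.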
If you're looking for all the duplicates, you can use something like this:
list_a = [12,34,45,56,66,80,89,90]
list_b = [13,30,56,59,72,77,80,85]
set_a = set(list_a)
set_b = set(list_b)
duplicates = set_a.intersection(set_b)
# or just this:
# duplicates = [n for n in list_a if n in list_b]
for duplicate in duplicates:
    print list_a.index(duplicate)
To get the smallest index of a duplicate in either list:
a_min = min(map(list_a.index, duplicates))
b_min = min(map(list_b.index, duplicates))
if a_min < b_min:
    print 'a', a_min, list_a[a_min]
else:
    print 'b', b_min, list_b[b_min]
If not, this should work a bit better:
duplicate = None
for n in set_a:
    if n in set_b:
        duplicate = n
        break
if duplicate is not None:
    print list_a.index(duplicate)
lista = [12,34,45,56,66,80,89,90]
listb = [13,30,56,59,72,77,80,85]
i, j = 0, 0
while i < len(lista):
    if lista[i] == listb[j]:
        if i < j:
            print i, lista
        else:
            print j, listb
        break
    elif lista[i] < listb[j]:
        i += 1
    else:
        j += 1
>>>
2 [13, 30, 56, 59, 72, 77, 80, 85]
Assumptions: both lists have the same length, and they are sorted.
Just scan all the lists at position 0, then 1, then 2, ... Keep track of what you've seen (you can query a set in O(1) time).
def firstDuplicate(*lists):
    seen = {}
    for i,tup in enumerate(zip(*lists)):
        for listNum,value in enumerate(tup):
            position = (listNum,i)
            if value in seen:
                return value, [seen[value], position]
            else:
                seen[value] = position
Demo:
>>> value,positions = firstDuplicate(lista,listb)
>>> value
56
>>> positions
[(1, 2), (0, 3)]
(Does not generalize to N lists... yet. Would need a minor tweak to use a defaultdict(set), insert all indices as a tuple together, then check for duplicates.)
