Iterating and indexing

I am currently stuck with this program. I am attempting to determine the molecular weight of a compound given the molecular equation (only Cs, Hs, and Os). I also am unsure of how to correctly format [index +1], as I am trying to determine what the next character after "x" is to see if it is a number or another molecule
def main():
    C1 = 0
    H1 = 0
    O1 = 0
    num = 0
    chemicalFormula = input("Enter the chemical formula, or enter key to quit: ")
    while True:
        cformula = list(chemicalFormula)
        for index, x in enumerate(cformula):
            if x == 'C':
                if cformula[index + 1] == 'H' or cformula[index + 1] == 'O':
                    C1 += 1
                else:
                    for index, y in range(index + 1, 1000000000):
                        if cformula[index + 1] != 'H' or cformula[index + 1] != 'O':
                            num = int(y)
                            num = num*10 + int(cformula[index + 1])
                        else:
                            C1 += num
                            break
This is the error I keep getting:
Enter the chemical formula, or enter key to quit: C2
File "/Users/ykasznik/Documents/ykasznikp7.py", line 46, in main
for index, y in range(index + 1, 1000000000):
TypeError: 'int' object is not iterable
>>>

You should change this line
for index, y in range(index + 1, 1000000000):
to
for y in range(index + 1, 1000000000):

The answers provided here focus on two different aspects of solving your problem:
1. A very specific solution to your error (int is not iterable), by correcting some code.
2. A somewhat bigger perspective on how to structure your code.
Regarding 1, a comment to your question noted the issue: the syntax of tuple unpacking in your inner loop.
An example of tuple unpacking would be
a, b = ['a', 'b']
Here, Python takes the first element of the right-hand side (RHS) and assigns it to the first name on the left-hand side (LHS), then takes the second element of the RHS and assigns it to the second name on the LHS.
Your inner loop that faults,
for index, y in range(index + 1, 1000000000),
is, on each iteration, equivalent to trying to do
index, y = 1
Now, an integer is not a collection of elements, so this would not work.
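To make the difference concrete, here is a small sketch (the variable names pairs and error_message are mine, purely for illustration). It shows unpacking working on a two-element list and on the pairs produced by enumerate(), and failing on a bare int:

```python
# Unpacking works when the right-hand side has exactly two elements.
a, b = ['a', 'b']

# enumerate() yields (index, item) pairs, so each pair can be unpacked.
pairs = list(enumerate("C2"))

# range() yields plain ints, and a single int cannot be unpacked.
error_message = ""
try:
    index, y = 1
except TypeError as exc:
    error_message = str(exc)

print(a, b)
print(pairs)
print(error_message)
```

The exact wording of the TypeError message differs between Python versions, so the sketch only prints it rather than relying on it.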
Regarding 2, you should focus on the strategy of modularization, which basically means you write a function for each sub-problem. Python was almost born for this. (Note, this strategy does not necessarily mean writing Python modules for each sub-problem.)
In your case, your main goal can be divided into several sub-problems:
1. Getting the molecular sequences.
2. Splitting the sequences into individual sequences.
3. Splitting a sequence into its H, C, and O elements.
4. Given the number of H, C and O atoms, calculating the molecular weight.
Steps 3 and 4 are excellent candidates for independent functions, as their core problem is isolated from the remaining context.
Here, I assume we only get 1 sequence at a time, and that they can be of the form:
CH4
CHHHH
CP4H3OH
Step 3:
def GetAtoms(sequence):
    '''
    Counts the number of C's, H's and O's in sequence and returns a dictionary.
    Only works with numeric suffixes up to 9, e.g. C10H12 would not work.
    '''
    atoms = ['C','H','O'] # list of which atoms we want to count.
    res = {atom:0 for atom in atoms}
    last_c = None
    for c in sequence:
        if c in atoms:
            res[c] += 1
            last_c = c
        elif c.isdigit() and last_c is not None:
            # the atom itself was already counted once, so add the suffix minus 1
            res[last_c] += int(c) - 1
            last_c = None
        else:
            last_c = None
    return res
You can see, that regardless of how you obtain the sequence and how the molecular weight is calculated, this method works (under the preconditions). If you later need to extend the capabilities of how you obtain the atom-count, this can be altered without affecting the remaining logic.
Step 4:
def MolecularWeight(atoms):
    # approximate atomic weights: H = 1, C = 12, O = 16
    return atoms['H']*1 + atoms['C']*12 + atoms['O']*16
Now your total logic could be this:
while True:
    chemicalFormula = input("Enter the chemical formula, or enter key to quit: ")
    if len(chemicalFormula) == 0:
        break
    print('Molecular weight of', chemicalFormula, 'is', MolecularWeight(GetAtoms(chemicalFormula)))

Here's my idea on how to solve the problem. Basically, you keep track of the current 'state' and iterate through each character exactly once, so you can't lose track of where you are or anything like that.
def getWeightFromChemical(chemical):
    # approximate atomic weights (not atomic numbers)
    chemicals = {"C" : 12, "H" : 1, "O" : 16}
    return chemicals.get(chemical, 0)
def chemicalWeight(chemicalFormula):
    lastchemical = ""
    currentnumber = ""
    weight = 0
    for c in chemicalFormula:
        if str.isalpha(c): # prepare new chemical
            if len(lastchemical) > 0:
                weight += getWeightFromChemical(lastchemical)*int("1" if currentnumber == "" else currentnumber)
            lastchemical = c
            currentnumber = ""
        elif str.isdigit(c): # build up number for previous chemical
            currentnumber += c
    # one last check
    if len(lastchemical) > 0:
        weight += getWeightFromChemical(lastchemical)*int("1" if currentnumber == "" else currentnumber)
    return weight
By the way, can anyone see how to refactor this to not have that piece of code twice? It bugs me.
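One possible way to avoid the duplicated line is to let a regular expression pair each element letter with its optional count, so there is no "one last check" after the loop. This is only a sketch: the names ATOMIC_WEIGHTS and chemical_weight are mine, the weights are rounded, and it assumes single-letter uppercase element symbols as in the question.

```python
import re

# Rounded atomic weights; only C, H and O are handled, as in the question.
ATOMIC_WEIGHTS = {"C": 12, "H": 1, "O": 16}

def chemical_weight(formula):
    """Sum the weights of all atoms in a formula such as 'C2H6O'."""
    total = 0
    # Each match is (symbol, digits); digits is '' when no count follows.
    for symbol, digits in re.findall(r"([A-Z])(\d*)", formula):
        count = int(digits) if digits else 1
        total += ATOMIC_WEIGHTS.get(symbol, 0) * count
    return total

print(chemical_weight("CH4"))
print(chemical_weight("C10H12"))
```

Because the digits are captured as a group, multi-digit counts such as C10H12 work without any extra state.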

Change
for index, y in range(index + 1, 1000000000):
to
for index, y in enumerate(range(index + 1, 1000000000)):
You may also want to rename the index of either your outer loop or your inner loop for clarity.

for index, x in enumerate(cformula):
    if x == 'C':
        if cformula[index + 1] == 'H' or cformula[index + 1] == 'O':
            C1 += 1
        else:
            for index, y in range(index + 1, 1000000000):
This is a Really Bad Idea. You are overwriting the value of index from the outer loop with the value of index from the inner loop.
You should use a different name, say index2, for the inner loop.
Also, when you say for index, y in range(index + 1, 1000000000): you are acting as if you expect range() to produce a sequence of 2-tuples. But range always produces a sequence of ints.
Roger has suggested for y in range(index + 1, 1000000000): but I think you intend to get the value of y from somewhere else (it's not clear where). Maybe you want to use the second argument of enumerate() to specify the value to start from, instead?
That is,
for index2, y in enumerate(whereeveryoumeanttogetyfrom, index + 1):
so that index2 equals index + 1 on the first step through the loop, index + 2 on the second, etc.
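As a sketch of what that could look like, assuming the characters after the current position are what you meant to iterate over (the names cformula, digits and index2 here are only illustrative):

```python
cformula = list("C12H4")
index = 0  # position of the 'C' found by the outer loop

digits = []
# Start counting at index + 1 so index2 is a real position in cformula.
for index2, y in enumerate(cformula[index + 1:], index + 1):
    if not y.isdigit():
        break
    digits.append((index2, y))

print(digits)
```

Here the slice supplies the characters and the second argument of enumerate() keeps index2 aligned with the positions in the original list.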

Range returns either a list of int, or an iterable of int, depending on which version of Python you are using. The for loop takes one int from it at a time; attempting to assign that single int to two names makes Python try to iterate over the int for automatic tuple unpacking, which fails.
So, change the
for index, y in range(index + 1, 1000000000):
to
for y in range(index + 1, 1000000000):
Also, you use index + 1 repeatedly, but mostly to look up the next symbol in your cformula. Since that doesn't change over the course of your outer loop, just assign it its own name once, and keep using that name:
for index, x in enumerate(cformula):
    next_index = index + 1
    next_symbol = cformula[next_index]
    if x == 'C':
        if next_symbol == 'H' or next_symbol == 'O':
            C1 += 1
        else:
            for y in range(next_index, 1000000000):
                if next_symbol != 'H' or next_symbol != 'O':
                    num = y*10 + int(next_symbol)
                else:
                    C1 += num
                    break
I've also refactored out some constants to make the code cleaner. Your inner loop as written was failing on tuple assignment, and would only be counting up the y. Also, your index would be reset again once you exited the inner loop, so you would be processing all of your digits repeatedly.
If you want to iterate over the substring after your current symbol, you could just use slice notation to get all of those characters: for subsequent in cformula[next_index:]
For example:
>>> chemical = 'CH3OOCH3'
>>> chemical[2:]
'3OOCH3'
>>> for x in chemical[2:]:
...     print(x)
...
3
O
O
C
H
3

Related

scope of a variable in python for this question

I don't know why, but the value of final comes out as 0, and even len(s) seems to be zero, at the last line of the countfrequency(s) function.
import collections
def countfrequency(s):
    final = 0
    flag = 1
    d = dict(collections.Counter(s))
    for item in d:
        if d[item] <= k:
            flag = 0
    if flag == 1: #Here
        final = max(final, len(s))
    print(final)
s = "ababbc"
k = 2
for x in range(len(s)):
    for y in range(1, len(s)):
        countfrequency(s[x:y + 1])

It is because of two reasons:
1. The value of flag is 0 at the end of the loop, so the if block is skipped and the value of final is never changed.
2. len(s) is only used inside that if block, so when the block is skipped it never contributes and final stays at 0.
So you can either make flag 1 so that control goes inside the if condition, or print the value of len(s) outside the if condition.

In addition to the answer posted by shaktiraj jadeja, the modified code is as follows:
import collections
def countfrequency(s, k):
    final = 0
    flag = 0
    d = dict(collections.Counter(s))
    # print(d)
    for item in d:
        if d[item] > k:
            flag = 1
            break
    if flag == 1: #Here
        # print("Inside:", final, len(s))
        final = max(final, len(s))
    print(final)
s = "ababbc"
k = 2
for x in range(len(s)):
    for y in range(1, len(s)):
        # print(s[x:y])
        countfrequency(s[x:y + 1], k)

To start with, there is no problem of scope.
Now let's get back to the problem.
Let's define a rule.
Rule: If every character in a substring is repeated more than k (= 2) times in it, then it is a good substring. Otherwise it is a bad substring.
Your code then simply prints the length of a good substring, or 0 in the case of a bad substring.
In short, your example string s = "ababbc" contains no good substring.
If you try s = "aaaaaa" you will see many numbers printed other than 0 (exactly 20 zeros and 10 other numbers).
Now either this was your confusion, or you wrote the wrong code for some logic.
I hope this helps.
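If the goal is the length of the longest good substring under that rule, a brute-force sketch could return a value instead of printing inside the helper. The function names is_good and longest_good_substring are mine, and I am assuming an empty substring should not count as good:

```python
from collections import Counter

def is_good(substring, k):
    """True if the substring is non-empty and every character occurs more than k times."""
    counts = Counter(substring)
    return bool(counts) and all(n > k for n in counts.values())

def longest_good_substring(s, k):
    """Length of the longest good substring of s, or 0 if there is none."""
    best = 0
    for start in range(len(s)):
        for end in range(start + 1, len(s) + 1):
            if is_good(s[start:end], k):
                best = max(best, end - start)
    return best

print(longest_good_substring("ababbc", 2))
print(longest_good_substring("aaaaaa", 2))
```

Keeping the running maximum in the caller, rather than resetting final on every call, is what lets the result survive across substrings.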

If, else return else value even when the condition is true, inside a for loop

Here is the function i defined:
def count_longest(field, data):
    l = len(field)
    count = 0
    final = 0
    n = len(data)
    for i in range(n):
        count = 0
        if data[i:i + l] is field:
            while data[i - l: i] == data[i:i + l]:
                count = count + 1
                i = i + 1
        else:
            print("OK")
        if final == 0 or count >= final:
            final = count
    return final
a = input("Enter the field - ")
b = input("Enter the data - ")
print(count_longest(a, b))
It works in some cases and gives incorrect output in most cases. I checked by printing the strings being compared, and even when they match, the loop prints "OK", which should only be printed when the condition is not true. I don't get it! Taking the simplest example, if I enter 'as' when prompted for field and 'asdf' when prompted for data, I should get count = 1, as the longest run of the substring 'as' in the string 'asdf' is one. But I still get final as 0 at the end of the program. I added the else statement just to check whether the condition was being satisfied, and the program printed 'OK', indicating that the if condition was not satisfied, even though right at the beginning data[0:0 + 2] is equal to 'as', 2 being the length of field.

There are a few things I notice when looking at your code.
First, use == rather than is to test for equality. The is operator checks if the left and right are referring to the very same object, whereas you want to properly compare them.
The following code shows that even numerical results that are equal might not be one and the same Python object:
print(2 ** 31 is 2 ** 30 + 2 ** 30) # <- False
print(2 ** 31 == 2 ** 30 + 2 ** 30) # <- True
(note: the first expression could either be False or True—depending on your Python interpreter).
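The same thing can be seen with the strings from the question. This is a small sketch using the 'as' / 'asdf' example; the name piece is mine:

```python
field = "as"
data = "asdf"

# Slicing builds a string with the same characters as field.
piece = data[0:2]

# Equality compares contents, which is what the function needs.
print(piece == field)

# Identity asks whether both names refer to one and the same object.
# The result depends on the interpreter, so code should not rely on it.
print(piece is field)
```

Only the first comparison has a guaranteed answer (True); the second may differ between interpreters and versions.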
Second, the while-loop looks rather suspicious. If you know you have found your sequence "as" at position i, you are repeating the while-loop as long as it is the same as in position i-1—which is probably something else, though. So, a better way to do the while-loop might be like so:
while data[i: i + l] == field:
    count = count + 1
    i = i + l # <- increase by l (length of field) !
Finally, something that might be surprising: changing the variable i inside the while-loop has no effect on the for-loop. That is, in the following example, the output will still be 0, 1, 2, 3, ..., 9, although it looks like it should skip every other element.
for i in range(10):
    print(i)
    i += 1
It does not affect the outcome of the function, but when debugging you might observe that the function seems to go backward after having found a run and go through parts of it again, resulting in additional "OK"s printed out.
UPDATE: Here is the complete function according to my remarks above:
def count_longest(field, data):
    l = len(field)
    count = 0
    final = 0
    n = len(data)
    for i in range(n):
        count = 0
        while data[i: i + l] == field:
            count = count + 1
            i = i + l
        if count >= final:
            final = count
    return final
Note that I made two additional simplifications. With my changes, you end up with an if and while that share the same condition, i.e.:
if data[i:i+l] == field:
    while data[i:i+l] == field:
        ...
In that case, the if is superfluous since it is already included in the condition of while.
Secondly, the condition if final == 0 or count >= final: can be simplified to just if count >= final:.
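For comparison, here is a variant sketch that leaves the loop variable alone and walks a separate position instead, which avoids the confusion about changing i inside the for-loop. The name count_longest_run and the guard for an empty field are my additions, not part of the original function:

```python
def count_longest_run(field, data):
    """Longest number of back-to-back repetitions of field inside data."""
    if not field:
        # An empty field would match forever, so treat it as no run.
        return 0
    step = len(field)
    best = 0
    for start in range(len(data)):
        run = 0
        pos = start  # separate cursor; the for-loop variable stays untouched
        while data[pos:pos + step] == field:
            run += 1
            pos += step
        best = max(best, run)
    return best

print(count_longest_run("as", "asdf"))
print(count_longest_run("ab", "ababxab"))
```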

Check result using 4 operations based with Python

I'm struggling to make a Python program that can solve riddles such as:
get 23 using [1,2,3,4] and the 4 basic operations however you'd like.
I expect the program to output something such as
# 23 reached by 4*(2*3)-1
So far I've come up with the following approach: reduce the input list by one item at a time, by checking every possible pair of numbers that can be picked and every possible result you can get from that pair.
With [1,2,3,4] you can pick:
[1,2],[1,3],[1,4],[2,3],[2,4],[3,4]
With x and y you can get to:
(x+y),(x-y),(y-x),(x*y),(x/y),(y/x)
Then I'd store the operation computed so far in a variable, and run the 'reducing' function again onto every result it has returned, until the arrays are just 2 items long: then I can just run the x,y -> possible outcomes function.
My problem is this "recursive" approach isn't working at all, because my function ends as soon as I return an array.
If I input [1,2,3,4] I'd get
[(1+2),3,4] -> [3,3,4]
[(3+3),4] -> [6,4]
# [10,2,-2,24,1.5,0.6666666666666666]
My code so far:
from collections import Counter
def genOutputs(x,y,op=None):
    results = []
    if op == None:
        op = str(y)
    else:
        op = "("+str(op)+")"
    ops = ['+','-','*','/','rev/','rev-']
    z = 0
    #will do every operation to x and y now.
    #op stores the last computated bit (of other functions)
    while z < len(ops):
        if z == 4:
            try:
                results.append(eval(str(y) + "/" + str(x)))
                #yield eval(str(y) + "/" + str(x)), op + "/" + str(x)
            except:
                continue
        elif z == 5:
            results.append(eval(str(y) + "-" + str(x)))
            #yield eval(str(y) + "-" + str(x)), op + "-" + str(x)
        else:
            try:
                results.append(eval(str(x) + ops[z] + str(y)))
                #yield eval(str(x) + ops[z] + str(y)), str(x) + ops[z] + op
            except:
                continue
        z = z+1
    return results
def pickTwo(array):
    #returns an array with every 2-combo
    #from input array
    vomit = []
    a,b = 0,1
    while a < (len(array)-1):
        choice = [array[a],array[b]]
        vomit.append((choice,list((Counter(array) - Counter(choice)).elements())))
        if b < (len(array)-1):
            b = b+1
        else:
            b = a+2
            a = a+1
    return vomit
def reduceArray(array):
    if len(array) == 2:
        print("final",array)
        return genOutputs(array[0],array[1])
    else:
        choices = pickTwo(array)
        print(choices)
        for choice in choices:
            opsofchoices = genOutputs(choice[0][0],choice[0][1])
            for each in opsofchoices:
                newarray = list([each] + choice[1])
                print(newarray)
                return reduceArray(newarray)
reduceArray([1,2,3,4])

The biggest issues when dealing with problems like this are handling operator precedence and parenthesis placement so that every possible number from a given set is produced. The easiest way to do this is to handle operations on a stack, corresponding to the reverse Polish notation of the infix expression. Once you do this, you can draw numbers and/or operations recursively until all n numbers and n-1 operations have been exhausted, and store the result. The code below generates all possible permutations of numbers (without replacement), operators (with replacement), and parenthesis placements to generate every possible expression. Note that this is highly inefficient: operators such as addition and multiplication commute, so a + b equals b + a and only one of the two is necessary. Similarly, by the associative property a + (b + c) equals (a + b) + c. The algorithm below is meant to be a simple example, and as such does not make these optimizations.
def expr_perm(values, operations="+-*/", stack=[]):
    solution = []
    if len(stack) > 1:
        for op in operations:
            new_stack = list(stack)
            new_stack.append("(" + new_stack.pop() + op + new_stack.pop() + ")")
            solution += expr_perm(values, operations, new_stack)
    if values:
        for i, val in enumerate(values):
            new_values = values[:i] + values[i+1:]
            solution += expr_perm(new_values, operations, stack + [str(val)])
    elif len(stack) == 1:
        return stack
    return solution
Usage:
result = expr_perm([4,5,6])
print("\n".join(result))
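To get from a list of expressions to the riddle's answer, the search has to keep going after a branch fails instead of returning from inside the loop, which is where the question's reduceArray stops early. The following is a sketch of the pair-reduction idea from the question, not the answer's stack method: the names solve and _search are mine, and Fraction is used so that division stays exact.

```python
from fractions import Fraction
from itertools import combinations

def solve(numbers, target):
    """Return an expression using every number once that equals target, or None."""
    items = [(Fraction(n), str(n)) for n in numbers]
    return _search(items, Fraction(target))

def _search(items, target):
    if len(items) == 1:
        value, expr = items[0]
        return expr if value == target else None
    # Pick every pair, replace it by one combined value, and recurse.
    for i, j in combinations(range(len(items)), 2):
        (a, ea), (b, eb) = items[i], items[j]
        rest = [items[k] for k in range(len(items)) if k not in (i, j)]
        candidates = [
            (a + b, f"({ea}+{eb})"),
            (a - b, f"({ea}-{eb})"),
            (b - a, f"({eb}-{ea})"),
            (a * b, f"({ea}*{eb})"),
        ]
        if b != 0:
            candidates.append((a / b, f"({ea}/{eb})"))
        if a != 0:
            candidates.append((b / a, f"({eb}/{ea})"))
        for value, expr in candidates:
            found = _search(rest + [(value, expr)], target)
            if found is not None:
                return found  # only return once a branch actually succeeds
    return None

print(solve([1, 2, 3, 4], 23))
```

The key difference from the question's code is that the return sits behind the found is not None check, so a failed branch falls through to the next candidate.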

Python: find smallest missing positive integer in ordered list

I need to find the first missing number in a list. If there is no number missing, the next number should be the last +1.
It should first check to see if the first number is > 1, and if so then the new number should be 1.
Here is what I tried. The problem is here: if next_value - items > 1:
results in an error because at the end and in the beginning I have a None.
list = [1,2,5]
vlans3=list
for items in vlans3:
    if items in vlans3:
        index = vlans3.index(items)
        previous_value = vlans3[index-1] if index -1 > -1 else None
        next_value = vlans3[index+1] if index + 1 < len(vlans3) else None
        first = vlans3[0]
        last = vlans3[-1]
        #print ("index: ", index)
        print ("prev item:", previous_value)
        print ("-cur item:", items)
        print ("nxt item:", next_value)
        #print ("_free: ", _free)
        #print ("...")
        if next_value - items > 1:
            _free = previous_value + 1
            print ("free: ",_free)
            break
print ("**************")
print ("first item:", first)
print ("last item:", last)
print ("**************")
Another method:
L = vlans3
free = ([x + 1 for x, y in zip(L[:-1], L[1:]) if y - x > 1][0])
This results in the correct number if there is a gap between the numbers, but if there is no gap an error occurs: IndexError: list index out of range. I need to specify somehow that if there is no free number it should give a new number (last + 1). But the code below gives an error and I do not know why.
if free = []:
    print ("no free")
else:
    print ("free: ", free)

To get the smallest integer that is not a member of vlans3:
ints_list = range(min(vlans3), max(vlans3) + 1)
missing_list = [x for x in ints_list if x not in vlans3]
first_missing = min(missing_list)
However, you want to return 1 if the smallest value in your list is greater than 1, and the last value + 1 if there are no missing values, so this becomes:
ints_list = [1] + list(range(min(vlans3), max(vlans3) + 2))
missing_list = [x for x in ints_list if x not in vlans3]
first_missing = min(missing_list)
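An alternative sketch that avoids building the candidate list, and also copes with an empty input, is to count upward from 1 against a set. The function name first_missing_positive is mine:

```python
def first_missing_positive(values):
    """Smallest positive integer that does not appear in values."""
    present = set(values)  # set membership tests are fast
    candidate = 1
    while candidate in present:
        candidate += 1
    return candidate

print(first_missing_positive([1, 2, 5]))
print(first_missing_positive([2, 3]))
print(first_missing_positive([1, 2, 3]))
```

This covers all three cases from the question: a gap in the middle, a list that does not start at 1, and a list with no gap at all.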

First, avoid using the built-in name list as a variable name.
Second, use try/except to quickly and neatly avoid this kind of issue.
def free(l):
    if l == []: return 0
    if l[0] > 1: return 1
    if l[-1] - l[0] + 1 == len(l): return l[-1] + 1
    for i in range(len(l)):
        try:
            if l[i+1] - l[i] > 1: break
        except IndexError:
            break
    return l[i] + 1

How about a numpy solution? Below code works if your input is a sorted integer list with non-duplicating positive values (or is empty).
nekomatic's solution is a bit faster for small inputs, but it's just a fraction of a second, doesn't really matter. However, it does not work for large inputs - e.g. list(range(1,100000)) completely freezes on list comprehension with inclusion check. Below code does not have this issue.
import numpy as np
def first_free_id(array):
    # np.int was removed from recent NumPy releases; the built-in int works as the dtype
    array = np.concatenate((np.array([-1, 0], dtype=int), np.array(array, dtype=int)))
    where_sequence_breaks = np.where(np.diff(array) > 1)[0]
    return where_sequence_breaks[0] if len(where_sequence_breaks) > 0 else array[-1] + 1
Prepend the array with -1 and 0 so np.diff works for empty and 1-element lists without breaking the existing sequence's continuity.
Compute differences between consecutive values. The discontinuity we are looking for (a "hole") is where the difference is bigger than 1.
If there are any "holes", return the id of the first one, otherwise return the integer succeeding the last element.

Python Dynamic Knapsack

Right now I am attempting to code the knapsack problem in Python 3.2. I am trying to do this dynamically with a matrix. The algorithm that I am trying to use is as follows:
Implements the memory function method for the knapsack problem
Input: A nonnegative integer i indicating the number of the first
       items being considered and a nonnegative integer j indicating the knapsack's capacity
Output: The value of an optimal feasible subset of the first i items
Note: Uses as global variables input arrays Weights[1..n], Values[1..n]
      and table V[0..n, 0..W] whose entries are initialized with -1's except for
      row 0 and column 0 initialized with 0's
if V[i, j] < 0
    if j < Weights[i]
        value <-- MFKnapsack(i - 1, j)
    else
        value <-- max(MFKnapsack(i - 1, j),
                      Values[i] + MFKnapsack(i - 1, j - Weights[i]))
    V[i, j] <-- value
return V[i, j]
If you run my code below, you can see that it tries to insert the weight into the list. Since this uses recursion, I am having a hard time spotting the problem. I also get an error saying that an integer cannot be added to a list using '+'. I have the matrix initialized with all 0's in the first row and first column; everything else is initialized to -1. Any help will be much appreciated.
#Knapsack Problem
def knapsack(weight,value,capacity):
    weight.insert(0,0)
    value.insert(0,0)
    print("Weights: ",weight)
    print("Values: ",value)
    capacityJ = capacity+1
    ## ------ initialize matrix F ---- ##
    dimension = len(weight)+1
    F = [[-1]*capacityJ]*dimension
    #first column zeroed
    for i in range(dimension):
        F[i][0] = 0
    #first row zeroed
    F[0] = [0]*capacityJ
    #-------------------------------- ##
    d_index = dimension-2
    print(matrixFormat(F))
    return recKnap(F,weight,value,d_index,capacity)
def recKnap(matrix, weight,value,index, capacity):
    print("index:",index,"capacity:",capacity)
    if matrix[index][capacity] < 0:
        if capacity < weight[index]:
            value = recKnap(matrix,weight,value,index-1,capacity)
        else:
            value = max(recKnap(matrix,weight,value,index-1,capacity),
                        value[index] +
                        recKnap(matrix,weight,value,index-1,capacity-(weight[index])))
        matrix[index][capacity] = value
        print("matrix:",matrix)
    return matrix[index][capacity]
def matrixFormat(*doubleLst):
    matrix = str(list(doubleLst)[0])
    length = len(matrix)-1
    temp = '|'
    currChar = ''
    nextChar = ''
    i = 0
    while i < length:
        if matrix[i] == ']':
            temp = temp + '|\n|'
        #double digit
        elif matrix[i].isdigit() and matrix[i+1].isdigit():
            temp = temp + (matrix[i]+matrix[i+1]).center(4)
            i = i+2
            continue
        #negative double digit
        elif matrix[i] == '-' and matrix[i+1].isdigit() and matrix[i+2].isdigit():
            temp = temp + (matrix[i]+matrix[i+1]+matrix[i+2]).center(4)
            i = i + 2
            continue
        #negative single digit
        elif matrix[i] == '-' and matrix[i+1].isdigit():
            temp = temp + (matrix[i]+matrix[i+1]).center(4)
            i = i + 2
            continue
        elif matrix[i].isdigit():
            temp = temp + matrix[i].center(4)
        #updates next round
        currChar = matrix[i]
        nextChar = matrix[i+1]
        i = i + 1
    return temp[:-1]
def main():
    print("Knapsack Program")
    #num = input("Enter the weights you have for objects you would like to have:")
    #weightlst = []
    #valuelst = []
    ## for i in range(int(num)):
    ## value , weight = eval(input("What is the " + str(i) + " object value, weight you wish to put in the knapsack? ex. 2,3: "))
    ## weightlst.append(weight)
    ## valuelst.append(value)
    weightLst = [2,1,3,2]
    valueLst = [12,10,20,15]
    capacity = 5
    value = knapsack(weightLst,valueLst,5)
    print("\n Max Matrix")
    print(matrixFormat(value))
main()

F = [[-1]*capacityJ]*dimension
does not properly initialize the matrix. [-1]*capacityJ is fine, but [...]*dimension creates dimension references to the exact same list. So modifying one list modifies them all.
Try instead
F = [[-1]*capacityJ for _ in range(dimension)]
This is a common Python pitfall. See this post for more explanation.
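A small sketch of the difference, with made-up sizes and the illustrative names aliased and independent:

```python
rows, cols = 3, 4

# Outer multiplication copies the reference, not the inner list.
aliased = [[-1] * cols] * rows
aliased[0][0] = 0

# A comprehension builds a fresh inner list for every row.
independent = [[-1] * cols for _ in range(rows)]
independent[0][0] = 0

print(aliased)      # the change shows up in every row
print(independent)  # the change shows up in the first row only
```

In the aliased version every row is the same list object, which is why zeroing the first column one row at a time appears to work while later writes leak into all rows.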

For the purpose of illustrating a cache, I generally use a defaultdict as follows:
from collections import defaultdict
CS = defaultdict(lambda: defaultdict(int)) # if I want the default values to be 0
### or
CACHE_1 = defaultdict(lambda: defaultdict(lambda: -1)) # if I want the default values to be -1 (or something else)
This keeps me from having to build 2D arrays in Python on the fly...
To see an answer to z1knapsack using this approach:
http://ideone.com/fUKZmq
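As a rough sketch of how such a cache could drive the memory function algorithm from the question (this is my own illustration, not the code behind the link; the names knapsack, best and cache are assumptions, and items are 0-indexed here):

```python
from collections import defaultdict

def knapsack(weights, values, capacity):
    """Best total value of items that fit in capacity (memory function method)."""
    # cache[i][j] is the best value using the first i items with capacity j;
    # -1 marks an entry that has not been computed yet.
    cache = defaultdict(lambda: defaultdict(lambda: -1))

    def best(i, j):
        if i == 0 or j == 0:
            return 0
        if cache[i][j] < 0:
            item_weight, item_value = weights[i - 1], values[i - 1]
            without_item = best(i - 1, j)
            if j < item_weight:
                cache[i][j] = without_item
            else:
                with_item = item_value + best(i - 1, j - item_weight)
                cache[i][j] = max(without_item, with_item)
        return cache[i][j]

    return best(len(weights), capacity)

print(knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5))
```

With the weights and values from the question and a capacity of 5, this prints 37 (the items with weights 2, 1 and 2).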

def zeroes(n,m):
    v=[['-' for i in range(0,n)]for j in range(0,m)]
    return v
value=[0,12,10,20,15]
w=[0,2,1,3,2]
v=zeroes(6,5)
def knap(i,j):
    global v
    if i==0 or j==0:
        v[i][j]= 0
    elif j<w[i] :
        v[i][j]=knap(i-1,j)
    else:
        v[i][j]=max(knap(i-1,j),value[i]+knap(i-1,j-w[i]))
    return v[i][j]
x=knap(4,5)
print (x)
for i in range (0,len(v)):
    for j in range(0,len(v[0])):
        print(v[i][j],end="\t\t")
    print()
    print()
#now these calls are for filling all the boxes in the matrix as in the above call only few v[i][j]were called and returned
knap(4,1)
knap(4,2)
knap(4,3)
knap(4,4)
for i in range (0,len(v)):
    for j in range(0,len(v[0])):
        print(v[i][j],end="\t\t")
    print()
    print()
