This is a Python question in a GIS context: I have a table in ArcGIS and want to classify the values of a field into categories.
I have a field named Elevation.
The data contains ranges stored as text, for example:
1 - 2
3 - 6
2 - 3
8.5 - 12
11 - 12
I need to categorize it using the following rule:
if Elevation < 1 then Index = 0.3; if Elevation = 2 - 3 then Index = 0.6; if Elevation > 3 then Index = 1
I have this code:
def Reclass( Elevation ):
    r_min, r_max = (float(s.strip()) for s in Elevation.split('-'))
    print "r_min: {0}, r_max: {1}".format(r_min,r_max)
    if r_min < 1 and r_max < 1:
        return 0.333
    elif r_min >= 1 and r_max >= 1 and r_min <= 3 and r_max <= 3:
        return 0.666
    elif r_min > 3 and r_max > 3:
        return 1
    elif r_min <= 3 and r_max > 3:
        return 1
    else:
        return 999
My question is how to parse the range and categorize it using the rule above.
Thanks in advance.
Based on comments, your field is a string that contains ranges of the form you describe above.
Firstly, this is horrible database design. The minimum and maximum should be separate columns of integer types. (Shakes fist at ESRI for discouraging good database design.)
Furthermore, your rule is insufficient for dealing with a range. A range check needs to compare against either one end of the range or both ends, so you will have to clarify exactly what you want for your "indexing" rule.
Given that you have strings representing ranges, your only option is to parse the range into its minimum and maximum and work with those. That's not too hard in Python:
>>> r = "3 - 6"
>>> r_min, r_max = (int(s.strip()) for s in r.split('-'))
>>> r_min
3
>>> r_max
6
What does this do?
It's pretty simple, actually. It splits the string by the -. Then it loops over the resulting list, and each element has its leading and trailing whitespace removed and is then converted into an int. Finally, Python unpacks the generator on the right to fill in the variables on the left.
Be aware that malformed data will cause errors.
Once you've clarified your "index" rule, you can figure out how to use this minimum and maximum to get your "index".
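Since malformed data will raise errors, here is a minimal sketch of a more defensive parser. The function name parse_range and the choice to return None for bad input are assumptions made for illustration, not part of the original code. It uses float rather than int because the sample data includes 8.5.

```python
def parse_range(text):
    """Return (minimum, maximum) as floats, or None if text is malformed."""
    parts = text.split('-')
    if len(parts) != 2:
        # Not exactly one separator (this also rejects negative numbers).
        return None
    try:
        r_min, r_max = (float(p.strip()) for p in parts)
    except ValueError:
        # One of the two pieces is not a number.
        return None
    return r_min, r_max


print(parse_range("3 - 6"))
print(parse_range("8.5 - 12"))
print(parse_range("n/a"))
```

Returning None lets the calling code decide what to do with bad rows (skip them, flag them, or assign a sentinel index) instead of stopping at the first error.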
I have borrowed code from you and @jpmc26. This code (minus the print statements, which are just there for testing) should work for you in the Field Calculator of ArcMap, but it is simply Python code. The problem is that you have not told us what you want to do when the two ends of a range fall into different categories, so for now I have used an else statement to return 999.
def Reclass( Elevation ):
    r_min, r_max = (float(s.strip()) for s in Elevation.split('-'))
    print "r_min: {0}, r_max: {1}".format(r_min,r_max)
    if r_min < 1 and r_max < 1:
        return 0.333
    elif r_min >= 1 and r_max >= 1 and r_min <= 3 and r_max <= 3:
        return 0.666
    elif r_min > 3 and r_max > 3:
        return 1
    else:
        return 999

print Reclass("0 - 1.1")
print Reclass("5.2 - 10")
print Reclass("2 - 3")
print Reclass("0 - 0")
I'm trying to find out how many times you have to throw the dice to land on field 5 a total of 100 times (the board runs from 0 to 5). This is how I tried it (I know the answer is 690, but I don't know what I'm doing wrong).
from random import *
seed(8)
five = 0
count = 0
add = 0
while five < 100:
    count = count + 1
    print(randint(1,6))
    add = add + randint(1,6)
    if add % 5 == 0 :
        five = five + 1
    else: add = add + randint(1,6)
print(count)
This is the code I think you were trying to write. This does average about 600. Is it possible your "answer" came from Python 2? The random seed algorithm is quite likely different.
from random import *
seed(8)
five = 0
count = 0
add = 0
while five < 100:
    count += 1
    r = randint(0,5)
    if r == 5:
        five += 1
    else:
        add += r
print(count, add)
You're adding a second die throw every time you don't land on 5. This makes the probability distribution irregular (e.g. advancing by 7 becomes more probable, at 1/6, than any other value, such as 1/9 for 5), so your result will not be the same as counting single throws.
BTW, there is no fixed result for this, just a higher probability around a given number of throws. However, given that you seeded the random number generator with a constant, every run should give the same result, and it should be the right one if you don't double-throw the dice.
Here is an example of the process that arrives at 690:
import random
random.seed(8)
fiveCount = 0
throwCount = 0
position = 0
while fiveCount < 100:
    position = (position + random.randint(1,6)) % 6
    throwCount += 1
    fiveCount += position == 5
print(throwCount) # 690
Other observations:
Updating the position wraps around using modulo 6 (there are 6 positions, from 0 to 5 inclusive).
Your check of add%5 == 0 does not reflect this. It should have been add%6 == 5 instead, but it is always preferable to model the computation as closely as possible on the real-world process (so keep the position in the 0...5 range).
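To illustrate the point that there is no fixed result without a fixed seed, here is a small sketch that repeats the single-throw process with many different seeds. The helper name throws_until and the choice of 200 seeds are assumptions for illustration. Each run uses its own random.Random instance, so a given seed always reproduces the same count.

```python
import random


def throws_until(target_hits, rng):
    """Count throws until field 5 has been landed on target_hits times."""
    position = 0
    hits = 0
    throws = 0
    while hits < target_hits:
        # One throw per loop iteration; the board wraps around after field 5.
        position = (position + rng.randint(1, 6)) % 6
        throws += 1
        if position == 5:
            hits += 1
    return throws


# One independent run per seed (200 seeds is an arbitrary choice).
results = [throws_until(100, random.Random(seed)) for seed in range(200)]
print(min(results), max(results), sum(results) / len(results))
```

Because every throw lands on field 5 with probability 1/6, the average over many seeds should sit near 600, while individual runs scatter around that value.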
I have a homework assignment in which we have to write a program that outputs the change to be given by a vending machine using the lowest number of coins. E.g. £3.67 can be dispensed as 1x£2 + 1x£1 + 1x50p + 1x10p + 1x5p + 1x2p.
However, my program is outputting the wrong numbers. I know there will probably be rounding issues, but I think the current issue is to do with my method of coding this.
change=float(input("Input change"))
twocount=0
onecount=0
halfcount=0
pttwocount=0
ptonecount=0
while change!=0:
    if change-2>-1:
        change=change-2
        twocount+=1
    else:
        if change-1>-1:
            change=change-1
            onecount+=1
        else:
            if change-0.5>-1:
                change=change-0.5
                halfcount+=1
            else:
                if change-0.2>-1:
                    change=change-0.2
                    pttwocount+=1
                else:
                    if change-0.1>-1:
                        change=change-0.1
                        ptonecount+=1
                    else:
                        break
print(twocount,onecount,halfcount,pttwocount,ptonecount)
RESULTS:
Input: 2.3
Output: 11010
i.e. 3.2
Input: 3.2
Output:20010
i.e. 4.2
Input: 2
Output: 10001
i.e. 2.1
All your comparisons use >-1, so you give out change as long as you have more than -1 balance.
This would be correct if you were only dealing with integers, since for integers >-1 is equivalent to >=0.
For floating point numbers however, we have for example -0.5>-1, so we will give out change for negative balance (which we do not want).
So the correct way would be to replace all >-1 comparisons by >=0 (larger or equal to 0) comparisons.
The problem is how the change is calculated by your if/else statements. If you walk through the first example (2.3), change-2>-1 is true, so change becomes 0.3. On the next loop you expect the check change-1>-1 to be false, but it is not: 0.3 - 1 is -0.7, which is greater than -1. A better way to do this is with Python's floor division (//) and modulo (%) operators. You have to round some of the calculations because of the way floating-point numbers behave.
change=float(input("Input change"))
twocount=0
onecount=0
halfcount=0
pttwocount=0
ptonecount=0
twocount = int(change//2)
change = round(change%2,1)
if change//1 > 0:
    onecount = int(change//1)
    change = round(change%1,1)
if change//0.5 > 0:
    halfcount = int(change//0.5)
    change = round(change%0.5, 1)
if change//0.2 > 0:
    pttwocount = int(change//0.2)
    change = round(change%0.2, 1)
if change//0.1 > 0:
    ptonecount = int(change//0.1)
    change = round(change%0.1,1)
print(twocount,onecount,halfcount,pttwocount,ptonecount)
But given the inputs this produces
Input: 2.3
Output: 1 0 0 1 1
Input: 3.2
Output:1 1 0 1 0
Input: 2
Output: 1 0 0 0 0
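One way to sidestep the rounding issues entirely is to work in whole pence. The following is a minimal sketch under a few assumptions: the amount is read as text and converted with decimal.Decimal, the coin list is the UK set implied by the example in the question, and the names COINS and make_change are made up for illustration. It picks the largest coin first, as the approaches above do.

```python
from decimal import Decimal

# UK coin values in pence, largest first (assumed from the question's example).
COINS = [200, 100, 50, 20, 10, 5, 2, 1]


def make_change(amount_text):
    """Return {coin_in_pence: count} for an amount in pounds given as text."""
    # Decimal parses "3.67" exactly, so the conversion to pence is exact too.
    pence = int(Decimal(amount_text) * 100)
    counts = {}
    for coin in COINS:
        # divmod gives how many of this coin fit and what is left over.
        count, pence = divmod(pence, coin)
        if count:
            counts[coin] = count
    return counts


print(make_change("3.67"))
print(make_change("2.3"))
```

For 3.67 this yields one each of 200, 100, 50, 10, 5 and 2 pence, which matches the breakdown given in the question.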
I have written the following python code to solve one of the Rosalind problems (http://rosalind.info/problems/cons/) and for some reason, Rosalind says the answer is wrong but I did some spot-checking and it appears right.
The problem is as follows:
Given: A collection of at most 10 DNA strings of equal length (at most 1 kbp) in FASTA format.
Return: A consensus string and profile matrix for the collection. (If several possible consensus strings exist, then you may return any one of them.)
A sample dataset is:
>Rosalind_1
ATCCAGCT
>Rosalind_2
GGGCAACT
>Rosalind_3
ATGGATCT
>Rosalind_4
AAGCAACC
>Rosalind_5
TTGGAACT
>Rosalind_6
ATGCCATT
>Rosalind_7
ATGGCACT
A sample solution is:
ATGCAACT
A: 5 1 0 0 5 5 0 0
C: 0 0 1 4 2 0 6 1
G: 1 1 6 3 0 1 0 0
T: 1 5 0 0 0 1 1 6
My attempt to solve this:
from Bio import SeqIO
A,C,G,T = [],[],[],[]
consensus=""
for i in range(0,len(record.seq)):
    countA,countC,countG,countT=0,0,0,0
    for record in SeqIO.parse("fasta.txt", "fasta"):
        if record.seq[i]=="A":
            countA=countA+1
        if record.seq[i]=="C":
            countC=countC+1
        if record.seq[i]=="G":
            countG=countG+1
        if record.seq[i]=="T":
            countT=countT+1
    A.append(countA)
    C.append(countC)
    G.append(countG)
    T.append(countT)
    if countA >= max(countC,countG,countT):
        consensus=consensus+"A"
    elif countC >= max(countA,countG,countT):
        consensus=consensus+"C"
    elif countG >= max(countA,countC,countT):
        consensus=consensus+"G"
    elif countT >= max(countA,countC,countG):
        consensus=consensus+"T"
print("A: "+" ".join([str(i) for i in A]))
print("C: "+" ".join([str(i) for i in C]))
print("G: "+" ".join([str(i) for i in G]))
print("T: "+" ".join([str(i) for i in T]))
print(consensus)
Would be great if someone can take a look and suggest what I am doing wrong? Many thanks!
For your consensus string, check how your code handles a tie, i.e., two nucleotides in a given position being equally frequent. As written, the >= comparisons in the if/elif chain pick the first tied nucleotide in the order A, C, G, T, so exactly one letter is appended per position.
in this part
    if countA >= max(countC,countG,countT):
        consensus=consensus+"A"
    elif countC >= max(countA,countG,countT):
        consensus=consensus+"C"
    elif countG >= max(countA,countC,countT):
        consensus=consensus+"G"
    elif countT >= max(countA,countC,countG):
        consensus=consensus+"T"
Use this instead and you will get your consensus sequence correctly:
    if countA[i] >= max(countC[i],countG[i],countT[i]):
        consensus+="A"
    if countC[i] >= max(countA[i],countG[i],countT[i]):
        consensus+="C"
    if countG[i] >= max(countA[i],countC[i],countT[i]):
        consensus+="G"
    if countT[i] >= max(countA[i],countC[i],countG[i]):
        consensus+="T"
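For comparison, here is a self-contained sketch that builds the profile in a single pass and always appends exactly one base per position. It does not use Biopython; the helpers parse_fasta and profile_and_consensus are hypothetical names, and the input is the sample dataset from the question. It prints the consensus string before the matrix, as the sample solution does, whereas the question's code prints it last; that ordering may be worth checking against what Rosalind expects.

```python
def parse_fasta(text):
    """Return the sequences from FASTA-formatted text, in file order."""
    sequences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # A header line closes the previous record, if any.
            if current:
                sequences.append("".join(current))
            current = []
        elif line:
            current.append(line)
    if current:
        sequences.append("".join(current))
    return sequences


def profile_and_consensus(sequences):
    """Return (consensus, profile); profile maps each base to its counts."""
    length = len(sequences[0])
    profile = {base: [0] * length for base in "ACGT"}
    for seq in sequences:
        for i, base in enumerate(seq):
            profile[base][i] += 1
    # max() returns the first maximal item, so ties resolve in A, C, G, T order.
    consensus = "".join(
        max("ACGT", key=lambda b: profile[b][i]) for i in range(length)
    )
    return consensus, profile


sample = """>Rosalind_1
ATCCAGCT
>Rosalind_2
GGGCAACT
>Rosalind_3
ATGGATCT
>Rosalind_4
AAGCAACC
>Rosalind_5
TTGGAACT
>Rosalind_6
ATGCCATT
>Rosalind_7
ATGGCACT
"""

consensus, profile = profile_and_consensus(parse_fasta(sample))
print(consensus)
for base in "ACGT":
    print(base + ": " + " ".join(str(n) for n in profile[base]))
```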
I have a data set with float values:
dog-auto dog-bird dog-cat dog-dog Result
41.9579761457 41.7538647304 36.4196077068 33.4773590373 0
46.0021331807 41.33958925 38.8353268874 32.8458495684 0
42.9462290692 38.6157590853 36.9763410854 35.0397073189 0
41.6866060048 37.0892269954 34.575072914 33.9010327697 0
39.2269664935 38.272288694 34.778824791 37.4849250909 0
40.5845117698 39.1462089236 35.1171578292 34.945165344 0
45.1067352961 40.523040106 40.6095830913 39.0957278345 0
41.3221140974 38.1947918393 39.9036867306 37.7696131032 0
41.8244654995 40.1567131661 38.0674700168 35.1089144603 0
45.4976929401 45.5597962603 42.7258732951 43.2422832585 0
This is an SFrame. I have attempted to write a function that uses an if/else statement to determine if the value for dog-dog is less than the values for dog-cat AND dog-auto AND dog-bird.
I've gone through this for the better part of 4 hours. Admittedly I'm a newbie to Python - I'm making a silly mistake and just not seeing it.
If statement:
def is_dog_correct(row):
    if (dog_distances[dog_distances['dog-dog']] < dog_distances[dog_distances['dog-cat']]) & (dog_distances[dog_distances['dog-dog']] < dog_distances[dog_distances['dog-bird']]) & (dog_distances[dog_distances['dog-dog']] < dog_distances[dog_distances['dog-auto']]):
        dog_distances['Result'] = 1
    else:
        dog_distances['Result'] = 0
then I call the function with:
dog_distances.apply(is_dog_correct)
If this was working correctly, I would see "0" in every row but the fifth record. What is wrong with my if statement?
Full disclosure - this is coursework, but after spending 4 hours on this, I'm reaching for help!
Change & to and as indicated by the previous comments. Also, I recommend you break up such long if statements into multiple lines so it's clearer and easier to read.
def is_dog_correct(row):
    # The outer parentheses let the condition span several lines.
    if ((dog_distances[dog_distances['dog-dog']] < dog_distances[dog_distances['dog-cat']]) and
            (dog_distances[dog_distances['dog-dog']] < dog_distances[dog_distances['dog-bird']]) and
            (dog_distances[dog_distances['dog-dog']] < dog_distances[dog_distances['dog-auto']])):
        dog_distances['Result'] = 1
    else:
        dog_distances['Result'] = 0
Make your first if statement cleaner by finding the min (minimum) of all of the other values. This makes sure that 'dog-dog' is less than all of the rest:
def is_dog_correct(row):
    if dog_distances[dog_distances['dog-dog']] < min([dog_distances[dog_distances['dog-'+x]] for x in ['cat','bird','auto']]):
        dog_distances['Result'] = 0
    else:
        dog_distances['Result'] = 1
EDIT: For debugging purposes use the following:
def is_dog_correct(row):
    print 'dog is {}'.format(dog_distances[dog_distances['dog-dog']])
    print 'everyone else is {}'.format([dog_distances[dog_distances['dog-'+x]] for x in ['cat','bird','auto']])
    if dog_distances[dog_distances['dog-dog']] < min([dog_distances[dog_distances['dog-'+x]] for x in ['cat','bird','auto']]):
        print 'Yay dog is faster'
        dog_distances['Result'] = 0
    else:
        print 'Awww, dog is not faster'
        dog_distances['Result'] = 1
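Note that every version above indexes the whole dog_distances object inside the function and never uses the row argument. Here is a minimal sketch of a row-based version, assuming that apply hands each row to the function as a dict-like mapping from column name to value and collects the returned values. Plain dicts stand in for SFrame rows here so the example runs without the library; the two rows are the first and fifth rows of the data in the question.

```python
def is_dog_correct(row):
    """Return 1 if 'dog-dog' is the smallest distance in this row, else 0."""
    others = [row['dog-' + name] for name in ('cat', 'bird', 'auto')]
    return 1 if row['dog-dog'] < min(others) else 0


# Plain dicts standing in for SFrame rows (an assumption for illustration).
rows = [
    {'dog-auto': 41.9579761457, 'dog-bird': 41.7538647304,
     'dog-cat': 36.4196077068, 'dog-dog': 33.4773590373},
    {'dog-auto': 39.2269664935, 'dog-bird': 38.272288694,
     'dog-cat': 34.778824791, 'dog-dog': 37.4849250909},
]
print([is_dog_correct(row) for row in rows])
```

With an actual SFrame, the returned column would presumably be assigned back with something like dog_distances['Result'] = dog_distances.apply(is_dog_correct); treat that line as an assumption to check against the SFrame documentation.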
I have an assignment to do. The problem is something like this. You give a number, say x. The program calculates the square of the numbers starting from 1 and prints it only if it's a palindrome. The program continues to print such numbers till it reaches the number x provided by you.
I have solved the problem. It works fine up to x = 10000000, in the sense that it executes in a reasonable amount of time. I want to improve the efficiency of my code, and I am open to changing the entire code if required. My aim is a program that can handle 10^20 within around 5 minutes.
limit = int(input("Enter a number"))
def palindrome(limit):
    count = 1
    base = 1
    while count < limit:
        base = base * base #square the number
        base = list(str(base)) #convert the number into a list of strings
        rbase = base[:] #make a copy of the number
        rbase.reverse() #reverse this copy
        if len(base) > 1:
            i = 0
            flag = 1
            while i < len(base) and flag == 1:
                if base[i] == rbase[i]: #compare the values at the indices
                    flag = 1
                else:
                    flag = 0
                i += 1
            if flag == 1:
                print(''.join(base)) #print if values match
        base = ''.join(base)
        base = int(base)
        base = count + 1
        count = count + 1
palindrome(limit)
Here's my version:
import sys
def palindrome(limit):
    for i in range(limit):
        istring = str(i*i)
        if istring == istring[::-1]:
            print(istring,end=" ")
    print()
palindrome(int(sys.argv[1]))
Timings for your version on my machine:
pu@pumbair: ~/Projects/Stackexchange time python3 palin1.py 100000
121 484 676 10201 12321 14641 40804 44944 69696 94249 698896 1002001 1234321
4008004 5221225 6948496 100020001 102030201 104060401 121242121 123454321 125686521
400080004 404090404 522808225 617323716 942060249
real 0m0.457s
user 0m0.437s
sys 0m0.012s
and for mine:
pu@pumbair: ~/Projects/Stackexchange time python3 palin2.py 100000
0 1 4 9
121 484 676 10201 12321 14641 40804 44944 69696 94249 698896 1002001 1234321
4008004 5221225 6948496 100020001 102030201 104060401 121242121 123454321 125686521
400080004 404090404 522808225 617323716 942060249
real 0m0.122s
user 0m0.104s
sys 0m0.010s
BTW, my version gives more results (0, 1, 4, 9).
Surely something like this will perform better (avoiding the unnecessary extra list operations) and is more readable:
def palindrome(limit):
    base = 1
    while base < limit:
        squared = str(base * base)
        reversed_str = squared[::-1]  # renamed to avoid shadowing the built-in reversed()
        if squared == reversed_str:
            print(squared)
        base += 1
limit = int(input("Enter a number: "))
palindrome(limit)
I think we can do it a little bit easier.
def palindrome(limit):
    count = 1
    while count < limit:
        base = count * count # square the number
        base = str(base) # convert the number into a string
        rbase = base[::-1] # make a reverse of the string
        if base == rbase:
            print(base) #print if values match
        count += 1
limit = int(input("Enter a number: "))
palindrome(limit)
The conversions from string to number and back were unnecessary. Strings can be compared directly, so you don't need a loop to compare them character by character.
You can keep a list of square palindromes up to a certain limit (say L) in memory. If the input number x is less than sqrt(L), you can simply iterate over the list of palindromes and print them. This way you won't have to iterate over every number and check whether its square is a palindrome.
You can find a list of square palindromes here : http://www.fengyuan.com/palindrome.html
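The lookup idea can be sketched as follows. This is only an illustration under stated assumptions: the list is built once by brute force up to a small bound (MAX_ROOT = 1000, chosen arbitrarily), whereas a real program might load a precomputed list instead, and the names build_roots and palindromic_squares are made up for the example.

```python
from bisect import bisect_right

MAX_ROOT = 1000  # arbitrary bound for this sketch


def build_roots(max_root):
    """Return the sorted roots r in 1..max_root whose squares are palindromes."""
    return [r for r in range(1, max_root + 1)
            if str(r * r) == str(r * r)[::-1]]


# Built once up front; every later query is just a slice of this list.
ROOTS = build_roots(MAX_ROOT)


def palindromic_squares(x):
    """Return the palindromic squares r*r for 1 <= r <= x."""
    if x > MAX_ROOT:
        raise ValueError("x exceeds the precomputed range")
    # bisect_right finds how many cached roots are <= x.
    return [r * r for r in ROOTS[:bisect_right(ROOTS, x)]]


print(palindromic_squares(30))
```

Each query then costs a binary search plus the size of the answer, rather than one string reversal per candidate number.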
OK, here's my program. It caches valid suffixes for squares (i.e. the values of n^2 mod 10^k for a fixed k), and then searches for squares which have both that suffix and start with the suffix reversed. This program is very fast: in 24 seconds, it lists all the palindromic squares up to 10^24.
from collections import defaultdict
# algorithm will print palindromic squares x**2 up to x = 10**n.
# efficiency is O(max(10**k, n*10**(n-k)))
n = 16
k = 6
cache = defaultdict(list)
print 0, 0 # special case
# Calculate everything up to 10**k; these will be the prefix/suffix pairs we use later
tail = 10**k
for i in xrange(tail):
    if i % 10 == 0: # can't end with 0 and still be a palindrome
        continue
    sq = i*i
    s = str(sq)
    if s == s[::-1]:
        print i, s
    prefix = int(str(sq % tail).zfill(k)[::-1])
    cache[prefix].append(i)
prefixes = sorted(cache)
# Loop through the rest, but only consider matching prefix/suffix pairs
for l in xrange(k*2+1, n*2+1):
    for p in prefixes:
        low = (p * 10**(l-k))**.5
        high = ((p+1) * 10**(l-k))**.5
        low = int(low / tail) * tail
        high = (int(high / tail) + 1) * tail
        for n in xrange(low, high, tail):
            for suf in cache[p]:
                x = n + suf
                s = str(x*x)
                if s == s[::-1]:
                    print x, s
Sample output:
0 0
1 1
2 4
3 9
11 121
22 484
26 676
101 10201
111 12321
121 14641
202 40804
212 44944
<snip>
111010010111 12323222344844322232321
111100001111 12343210246864201234321
111283619361 12384043938083934048321
112247658961 12599536942224963599521
128817084669 16593841302620314839561
200000000002 40000000000800000000004
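The two facts the program relies on can be checked in isolation with a short Python 3 sketch. The helper names square_suffixes and ends_match are hypothetical. The first shows that only some endings are possible for a square, and the second shows that a palindromic square starts with its own ending reversed, using one value from the sample output.

```python
def square_suffixes(k):
    """Return the set of possible values of n*n mod 10**k."""
    modulus = 10 ** k
    # n*n mod 10**k depends only on n mod 10**k, so one pass is enough.
    return {(n * n) % modulus for n in range(modulus)}


def ends_match(square, k):
    """True if the first k digits equal the last k digits reversed."""
    s = str(square)
    return len(s) >= 2 * k and s[:k] == s[-k:][::-1]


print(sorted(square_suffixes(1)))
print(ends_match(12343210246864201234321, 6))
```

Fixing the last k digits therefore also fixes the leading k digits, which narrows the range of square roots that have to be tried for each suffix.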