Binary search: Not getting upper & lower bound for very large values - python

I'm trying to solve this competitive programming problem, UVA - The Playboy Chimp, using Python, but for some reason the answer comes out wrong for very large values, for example with this input:
5
3949 45969 294854 9848573 2147483647
5
10000 6 2147483647 4959 5949583
Accepted output:
3949 45969
X 3949
9848573 X
3949 45969
294854 9848573
My output:
X 294854
X 294854
9848573 X
X 294854
45969 9848573
My code:
def bs(target, search_space):
    l, r = 0, len(search_space) - 1
    while l <= r:
        m = (l + r) >> 1
        if target == search_space[m]:
            return m - 1, m + 1
        elif target > search_space[m]:
            l = m + 1
        else:
            r = m - 1
    return r, l
n = int(input())
f_heights = list(set([int(a) for a in input().split()]))
q = int(input())
heights = [int(b) for b in input().split()]
for h in heights:
    a, b = bs(h, f_heights)
    print(f_heights[a] if a >= 0 else 'X', f_heights[b] if b < len(f_heights) else 'X')
Any help would be appreciated!

This is because you are passing the first input line through set, which does not preserve the order of the numbers, so the list you binary-search is no longer sorted. dict maintains insertion order (guaranteed from Python 3.7, and an implementation detail of CPython 3.6), so you can use dict.fromkeys to remove duplicates while keeping the order:
f_heights = list(dict.fromkeys(int(a) for a in input().split()))
Example:
f_heights = list(set([int(a) for a in input().split()]))
print(f_heights) # [294854, 3949, 45969, 9848573, 2147483647]
f_heights = list(dict.fromkeys(int(a) for a in input().split()))
print(f_heights) # [3949, 45969, 294854, 9848573, 2147483647]
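If the heights are meant to be searched in ascending order, another option is to sort the de-duplicated values and let the standard bisect module locate both neighbours. This is only a minimal sketch under that assumption; the helper name neighbours is made up for illustration and does not appear in the question.

```python
from bisect import bisect_left, bisect_right

def neighbours(sorted_heights, h):
    """Return (tallest value < h, shortest value > h), using 'X' when missing."""
    lo = bisect_left(sorted_heights, h)    # index of the first element >= h
    hi = bisect_right(sorted_heights, h)   # index of the first element > h
    shorter = sorted_heights[lo - 1] if lo > 0 else 'X'
    taller = sorted_heights[hi] if hi < len(sorted_heights) else 'X'
    return shorter, taller

# Sample data from the question; sorted(set(...)) removes duplicates
# and restores ascending order regardless of how set iterates.
f_heights = sorted(set([3949, 45969, 294854, 9848573, 2147483647]))
for h in [10000, 6, 2147483647, 4959, 5949583]:
    print(*neighbours(f_heights, h))
```

Because bisect_left and bisect_right bracket any run of values equal to h, a query that matches a height exactly is handled without a special case.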

Related

Rhombus shape based on user input

I am trying to create a rhombus made out of letters that a user selects, using Python 3. So if a user selects "B" then the rhombus is
 A
B B
 A
If the user selects "D" the rhombus would be:
   A
  B B
 C C C
D D D D
 C C C
  B B
   A
Can anyone help me get started on this? As of now, I am thinking that if a user selects D, that corresponds to 4, and you would use the equation 2k-1 to determine the size of the "square." I would also create a list containing all the letters,
so letter = ['A', 'B', 'C', 'D', ..., 'Z'] (or would a dictionary be better?)
so:
def rhombus(n):
    squareSize = 2n-1
    for i in range(1,squareSize):
        for l in letter:
            print l + "/n"
golfing time \o/
edit: there's of course an SE for code golf and i'll do as in rome
Python 3, 106 bytes
n=26
for x in range(-n, n):
    x = abs(x)
    print(' '*x+' '.join([chr(64+n-x) for _ in range(n-x)]))
explanation
for x in range(-n, n): generate the rows
' '*x: generate the space before each first letter in the row
chr(64+n-x): display the letter, with chr(65) = "A"
' '.join: join all letters with a space between each of them
for _ in range(n-x): will generate the right number of letters. the value itself is useless.
output for n=4:
   A
  B B
 C C C
D D D D
 C C C
  B B
   A
domochevski's answer is great but we don't actually need those imports.
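def rhombus(char):
    A = 64      # ord('A') - 1, so x - A is the 1-based letter position
    Z = A + 26  # ord('Z')
    try:
        val = ord(char)
    except TypeError:  # not a single character
        return None
    if val <= A or val > Z:
        return None
    # Top half without spaces: 'A', 'BB', 'CCC', ...
    L = [chr(x) * (x - A) for x in range(A + 1, val + 1)]
    L = [' '.join(x) for x in L]
    max_len = max(len(x) for x in L)
    L = [x.center(max_len) for x in L]
    # Mirror the top half (minus the widest row) to get the bottom half
    L += L[-2::-1]
    return '\n'.join(L)
print(rhombus('Z'))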
Well, that was an interesting question. Here is a quick and dirty way to do it:
from string import ascii_uppercase
def rhombus(c): # Where c is the chosen character
    # Get the position (1-based)
    n = ascii_uppercase.find(c.upper()) + 1
    if 0 < n <= 26:
        # Get the strings for the top of the rhombus without spaces
        l = [ascii_uppercase[i] * ((i)+1) for i in range(n)]
        # Add in the spaces
        l = [' '.join(list(e)) for e in l]
        # Align everything
        max_len = max(len(e) for e in l)
        l = [e.center(max_len) for e in l]
        # Get the bottom from the top
        l += l[-2::-1]
        # Print the rhombus
        for e in l:
            print(e)
As I mentioned this is not beautiful code but it should work.

Approximate periods of strings - port Python code to F#

Given two strings u and v we can compute the edit distance using the popular Levenshtein algorithm. Using a method introduced in [1] by Sim et al. I was able to compute k-approximate periods of strings in Python with the following code
def wagnerFischerTable(a, b):
    D = [[0]]
    [D.append([i]) for i, s in enumerate(a, 1)]
    [D[0].append(j) for j, t in enumerate(b, 1)]
    for j, s in enumerate(b, 1):
        for i, t in enumerate(a, 1):
            if s == t:
                D[i].append(D[i-1][j-1])
            else:
                D[i].append(
                    min(
                        D[i-1][j] + 1,
                        D[i][j-1] + 1,
                        D[i-1][j-1] +1
                    )
                )
    return D
def simEtAlTables(s, p):
    D = []
    for i in xrange(len(s)):
        D.append(wagnerFischerTable(p, s[i:]))
    return D
def approx(s, p):
    D = simEtAlTables(s, p)
    t = [0]
    for i in xrange(1, len(s)+1):
        cmin = 9000
        for h in xrange(0, i):
            cmin = min(
                cmin,
                max(t[h], D[h][-1][i-h])
            )
        t.append(cmin)
    return t[len(s)]
I wanted to port this to F#, but I haven't been successful yet, and I would appreciate some feedback on what might be wrong.
let inline min3 x y z =
    min (min x y) z
let wagnerFischerTable (u: string) (v: string) =
    let m = u.Length
    let n = v.Length
    let d = Array2D.create (m + 1) (n + 1) 0
    for i = 0 to m do d.[i, 0] <- i
    for j = 0 to n do d.[0, j] <- j
    for j = 1 to n do
        for i = 1 to m do
            if u.[i-1] = v.[j-1] then
                d.[i, j] <- d.[i-1, j-1]
            else
                d.[i, j] <-
                    min3
                        (d.[i-1, j ] + 1) // a deletion
                        (d.[i , j-1] + 1) // an insertion
                        (d.[i-1, j-1] + 1) // a substitution
    d
let simEtAlTables (u: string) (v: string) =
    let rec tabulate n lst =
        if n <> u.Length then
            tabulate (n+1) (lst @ [wagnerFischerTable (u.Substring(n)) v])
        else
            lst
    tabulate 0 []
let approx (u: string) (v: string) =
    let tables = simEtAlTables u v
    let rec kApprox i (ks: int list) =
        if i = u.Length + 1 then
            ks
        else
            let mutable curMin = 9000
            for h = 0 to i-1 do
                curMin <- min curMin (max (ks.Item h) ((tables.Item h).[i-h, v.Length - 1]))
            kApprox (i+1) (ks @ [curMin])
    List.head (List.rev (kApprox 1 [0]))
The reason why it "doesn't work" is simply that I am getting wrong values. The Python code passes all test cases while the F# code fails every test. I presume that I have errors in the functions simEtAlTables and/or approx, probably something with the indices, especially when accessing the three-dimensional list of tables in approx.
So here are three test cases which should cover different results:
Test 1: approx "abcdabcabb" "abc" -> 1
Test 2: approx "abababababab" "ab" -> 0
Test 3: approx "abcdefghijklmn" "xyz" -> 3
[1] http://www.lirmm.fr/~rivals/ALGOSEQ/DOC/SimApprPeriodsTCS262.pdf
This isn't functional in the least (neither is your Python solution), but here's a more direct translation to F#. Maybe you can use it as a starting point and make it more functional from there (although I'll hazard a guess it won't improve performance).
let wagnerFischerTable (a: string) (b: string) =
    let d = ResizeArray([ResizeArray([0])])
    for i = 1 to a.Length do d.Add(ResizeArray([i]))
    for j = 1 to b.Length do d.[0].Add(j)
    for j = 1 to b.Length do
        for i = 1 to a.Length do
            let s, t = b.[j-1], a.[i-1]
            if s = t then
                d.[i].Add(d.[i-1].[j-1])
            else
                d.[i].Add(
                    Seq.min [
                        d.[i-1].[j] + 1
                        d.[i].[j-1] + 1
                        d.[i-1].[j-1] + 1
                    ])
    d
let simEtAlTables (s: string) (p: string) =
    let d = ResizeArray()
    for i = 0 to s.Length - 1 do
        d.Add(wagnerFischerTable p s.[i..])
    d
let approx (s: string) (p: string) =
    let d = simEtAlTables s p
    let t = ResizeArray([0])
    for i = 1 to s.Length do
        let mutable cmin = 9000
        for h = 0 to i - 1 do
            let dh = d.[h]
            cmin <- min cmin (max t.[h] dh.[dh.Count-1].[i-h])
        t.Add(cmin)
    t.[s.Length]
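For checking indices in either port, it may help to see the shape of the table in isolation. The sketch below is a Python 3 rewrite of the table-building step (the name wagner_fischer_table and the "kitten"/"sitting" example are illustrative, not from the question). The table has len(a)+1 rows and len(b)+1 columns, so the cell for the whole of b is at column len(b), not len(b) - 1. The question's F# approx reads column v.Length - 1, which looks like an off-by-one, though this is an observation and has not been run against the test cases.

```python
def wagner_fischer_table(a, b):
    """Return the (len(a)+1) x (len(b)+1) unit-cost edit-distance table."""
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i          # delete i characters of a
    for j in range(cols):
        table[0][j] = j          # insert j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1]
            else:
                table[i][j] = 1 + min(table[i - 1][j],      # deletion
                                      table[i][j - 1],      # insertion
                                      table[i - 1][j - 1])  # substitution
    return table

table = wagner_fischer_table("kitten", "sitting")
# The full distance sits in the last row and last column.
print(table[len("kitten")][len("sitting")])  # 3
```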
This code may help:
let levenshtein word1 word2 =
    let preprocess = fun (str : string) -> str.ToLower().ToCharArray()
    let chars1, chars2 = preprocess word1, preprocess word2
    let m, n = chars1.Length, chars2.Length
    let table : int[,] = Array2D.zeroCreate (m + 1) (n + 1)
    for i in 0..m do
        for j in 0..n do
            match i, j with
            | i, 0 -> table.[i, j] <- i
            | 0, j -> table.[i, j] <- j
            | _, _ ->
                let delete = table.[i-1, j] + 1
                let insert = table.[i, j-1] + 1
                //cost of substitution is 2
                let substitute =
                    if chars1.[i - 1] = chars2.[j - 1]
                    then table.[i-1, j-1] //same character
                    else table.[i-1, j-1] + 2
                table.[i, j] <- List.min [delete; insert; substitute]
    table.[m, n], table //return tuple of the distance and the table
//test
levenshtein "intention" "execution" //|> ignore
You might also want to check this blog posting from Rick Minerich.

What's wrong with my Extended Euclidean Algorithm (python)?

My algorithm to find the HCF of two numbers, with a displayed justification in the form r = a*aqr + b*bqr, is only partially working, even though I'm pretty sure that I have entered all the correct formulae. It can and will find the HCF, but I am also trying to provide a demonstration of Bezout's Lemma, so I need to display that justification at every step. The program:
# twonumbers.py
inp = 0
a = 0
b = 0
mul = 0
s = 1
r = 1
q = 0
res = 0
aqc = 1
bqc = 0
aqd = 0
bqd = 1
aqr = 0
bqr = 0
res = 0
temp = 0
fin_hcf = 0
fin_lcd = 0
seq = []
inp = input('Please enter the first number, "a":\n')
a = inp
inp = input('Please enter the second number, "b":\n')
b = inp
mul = a * b # Will come in handy later!
if a < b:
    print 'As you have entered the first number as smaller than the second, the program will swap a and b before proceeding.'
    temp = a
    a = b
    b = temp
else:
    print 'As the inputted value a is larger than or equal to b, the program has not swapped the values a and b.'
print 'Thank you. The program will now compute the HCF and simultaneously demonstrate Bezout\'s Lemma.'
print `a`+' = ('+`aqc`+' x '+`a`+') + ('+`bqc`+' x '+`b`+').'
print `b`+' = ('+`aqd`+' x '+`a`+') + ('+`bqd`+' x '+`b`+').'
seq.append(a)
seq.append(b)
c = a
d = b
while r != 0:
    if s != 1:
        c = seq[s-1]
        d = seq[s]
    res = divmod(c,d)
    q = res[0]
    r = res[1]
    aqr = aqc - (q * aqd)#These two lines are the main part of the justification
    bqr = bqc - (q * aqd)#-/
    print `r`+' = ('+`aqr`+' x '+`a`+') + ('+`bqr`+' x '+`b`+').'
    aqd = aqr
    bqd = bqr
    aqc = aqd
    bqc = bqd
    s = s + 1
    seq.append(r)
fin_hcf = seq[-2] # Finally, the HCF.
fin_lcd = mul / fin_hcf
print 'Using Euclid\'s Algorithm, we have now found the HCF of '+`a`+' and '+`b`+': it is '+`fin_hcf`+'.'
print 'We can now also find the LCD (LCM) of '+`a`+' and '+`b`+' using the following method:'
print `a`+' x '+`b`+' = '+`mul`+';'
print `mul`+' / '+`fin_hcf`+' (the HCF) = '+`fin_lcd`+'.'
print 'So, to conclude, the HCF of '+`a`+' and '+`b`+' is '+`fin_hcf`+' and the LCD (LCM) of '+`a`+' and '+`b`+' is '+`fin_lcd`+'.'
I would greatly appreciate it if you could help me to find out what is going wrong with this.
Hmm, your program is rather verbose and hence hard to read. For example, you don't need to initialise lots of those variables in the first few lines. There is no need to assign to the inp variable and then copy that into a and then b. And the seq list and the s variable are only used to fetch the previous two remainders, which two plain variables would do.
Anyway that's not the problem. There are two bugs. I think that if you had compared the printed intermediate answers to a hand-worked example you should have found the problems.
The first problem is that you have a typo in the second line here:
aqr = aqc - (q * aqd)#These two lines are the main part of the justification
bqr = bqc - (q * aqd)#-/
In the second line, aqd should be bqd.
The second problem is that in this bit of code
aqd = aqr
bqd = bqr
aqc = aqd
bqc = bqd
you make aqd be aqr and then aqc be aqd. So aqc and aqd end up the same. Whereas you actually want the assignments in the other order:
aqc = aqd
bqc = bqd
aqd = aqr
bqd = bqr
Then the code works. But I would prefer to see it written more like this, which I think is a lot clearer. I have left out the prints, but I'm sure you can add them back:
a = input('Please enter the first number, "a":\n')
b = input('Please enter the second number, "b":\n')
if a < b:
    a,b = b,a
r1,r2 = a,b
s1,s2 = 1,0
t1,t2 = 0,1
while r2 > 0:
    q,r = divmod(r1,r2)
    r1,r2 = r2,r
    s1,s2 = s2,s1 - q * s2
    t1,t2 = t2,t1 - q * t2
    print r1,s1,t1
Finally, it might be worth looking at a recursive version which expresses the structure of the solution even more clearly, I think.
Hope this helps.
Here is a simple version of Bezout's identity; given a and b, it returns x, y, and g = gcd(a, b):
function bezout(a, b)
    if b == 0
        return 1, 0, a
    else
        q, r := divide(a, b)
        x, y, g := bezout(b, r)
        return y, x - q * y, g
The divide function returns both the quotient and remainder.
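The pseudocode above translates almost line for line into Python, with the built-in divmod playing the role of divide. This is a sketch for non-negative integers only; the sample values 240 and 46 are chosen here for illustration.

```python
def bezout(a, b):
    """Return (x, y, g) such that a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return 1, 0, a
    q, r = divmod(a, b)       # quotient and remainder in one call
    x, y, g = bezout(b, r)    # b*x + r*y == g, where r == a - q*b
    return y, x - q * y, g    # regroup the terms around a and b

x, y, g = bezout(240, 46)
print(x, y, g)                # -9 47 2
print(240 * x + 46 * y == g)  # True
```

The return line follows from substituting r = a - q*b into b*x + r*y = g, which gives a*y + b*(x - q*y) = g.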
The Python program that does what you want (please note that the extended Euclidean algorithm gives only one pair of Bezout coefficients) might be:
import sys
def egcd(a, b):
    if a == 0:
        return (b, 0, 1)
    g, y, x = egcd(b % a, a)
    return (g, x - (b // a) * y, y)
def main():
    if len(sys.argv) != 3:
        print '''%s program calculates HCF, LCM and Bezout identity of two integers
usage %s a b''' % (sys.argv[0], sys.argv[0])
        sys.exit(1)
    a = int(sys.argv[1])
    b = int(sys.argv[2])
    g, x, y = egcd(a, b)
    print 'HCF =', g
    print 'LCM =', a*b/g
    print 'Bezout identity: %i * (%i) + %i * (%i) = %i' % (a, x, b, y, g)
main()

Counting intersections for all combinations in a list of sets

I have a collection of sets. I want to find the number of items that are found only in the intersection for each combination of sets. I basically want to do the same thing as creating the numbers in a Venn diagram.
A basic example might make it clearer.
a = set([1,2,5,10,12])
b = set([1,2,6,9,12,15])
c = set([1,2,7,8,15])
I should end up with a count of items found only in:
a
b
c
the intersection of a and b
the intersection of a and c
the intersection of b and c
the intersection of a, b and c
A non-extensible way of doing this is
num_a = len(a - b - c) # len(set([5,10])) -> 2
num_b = len(b - a - c) # len(set([6,9])) -> 2
num_c = len(c - a - b) # len(set([7,8])) -> 2
num_ab = len((a & b) - c) # 1
num_ac = len((a & c) - b) # 0
num_bc = len((b & c) - a) # 1
num_abc = len(a & b & c) # 2
While this works for 3 sets my collection of sets is not static.
IIUC, something like this should work:
from itertools import combinations
def venn_count(named_sets):
    names = set(named_sets)
    for i in range(1, len(named_sets)+1):
        for to_intersect in combinations(sorted(named_sets), i):
            others = names.difference(to_intersect)
            intersected = set.intersection(*(named_sets[k] for k in to_intersect))
            unioned = set.union(*(named_sets[k] for k in others)) if others else set()
            yield to_intersect, others, len(intersected - unioned)
ns = {"a": {1,2,5,10,12}, "b": {1,2,6,9,12,15}, "c": {1,2,7,8,15}}
for intersected, unioned, count in venn_count(ns):
    print 'len({}{}) = {}'.format(' & '.join(sorted(intersected)),
                                  ' - ' + ' - '.join(sorted(unioned)) if unioned else '',
                                  count)
which gives
len(a - b - c) = 2
len(b - a - c) = 2
len(c - a - b) = 2
len(a & b - c) = 1
len(a & c - b) = 0
len(b & c - a) = 1
len(a & b & c) = 2
You can use itertools.combinations to get all the possible combinations.
http://docs.python.org/2/library/itertools.html
I'd try using bit masks:
sets = [
    set([1,2,5,10,12]),
    set([1,2,6,9,12,15]),
    set([1,2,7,8,15]),
]
d = {}
for n, s in enumerate(sets):
    for i in s:
        d[i] = d.get(i, 0) | (1 << n)
for mask in range(1, 2**len(sets)):
    cnt = sum(1 for x in d.values() if x & mask == mask)
    num = ','.join(str(j) for j in range(len(sets)) if mask & (1 << j))
    print 'number of items in set(s) %s = %d' % (num, cnt)
Results for your input:
number of items in set(s) 0 = 5
number of items in set(s) 1 = 6
number of items in set(s) 0,1 = 3
number of items in set(s) 2 = 5
number of items in set(s) 0,2 = 2
number of items in set(s) 1,2 = 3
number of items in set(s) 0,1,2 = 2
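Note that the counts above include items that also belong to other sets. If the exclusive Venn-region counts from the question are wanted, one possible variation is to tally each item's full membership mask directly, which amounts to testing x == mask instead of x & mask == mask. The function name exclusive_counts is made up for this sketch.

```python
from collections import Counter

def exclusive_counts(sets):
    """Map each membership bit mask to the number of items with exactly that mask."""
    membership = {}
    for n, s in enumerate(sets):
        for item in s:
            # Bit n is set when the item belongs to sets[n].
            membership[item] = membership.get(item, 0) | (1 << n)
    return Counter(membership.values())

sets = [
    {1, 2, 5, 10, 12},
    {1, 2, 6, 9, 12, 15},
    {1, 2, 7, 8, 15},
]
counts = exclusive_counts(sets)
for mask in range(1, 2 ** len(sets)):
    names = ','.join(str(j) for j in range(len(sets)) if mask & (1 << j))
    print('items only in set(s) %s = %d' % (names, counts[mask]))
```

A Counter returns 0 for masks that never occur, so empty regions such as "only in sets 0 and 2" need no special handling.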

how to group a data in python

I have a file with data like:
Entry Freq.
2 4.5
3 3.4
5 4.9
8 9.1
12 11.1
16 13.1
18 12.2
22 11.2
Now the problem I am trying to solve is this: I want to group the data (with a range of 10) based on the Entry column and add up the frequencies that fall within each range.
For example, if I group the table above, it should look like:
Range SumFreq.
0-10 21.9(i.e. 4.5 + 3.4 + 4.9 + 9.1)
11-20 36.4
I got as far as separating the columns with the following code, but I can't work out how to do the range separation:
my code is:
inp = open("c:/usr/ovisek/desktop/file.txt",'r').read().strip().split('\n')
for line in map(str.split,inp):
    k = int(line[0])
    l = float(line[-1])
So far this is fine, but how can I group the data into ranges of 10?
One way would be to [ab]use the fact that integer division will give you the right bins:
import collections
bin_size = 10
d = collections.defaultdict(float)
for line in map(str.split,inp):
    k = int(line[0])
    l = float(line[-1])
    d[bin_size * (k // bin_size)] += l
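To see what the integer-division approach produces, here is a self-contained sketch that uses the question's sample rows inline instead of reading a file; the function name group_frequencies is an assumption made for illustration. Note that k // bin_size yields bins 0-9, 10-19, ..., which differs slightly from the 0-10, 11-20 labelling in the question (an entry of exactly 10 would land in the second bin).

```python
from collections import defaultdict

def group_frequencies(rows, bin_size=10):
    """Sum frequencies per bin, keyed by the lower edge of each bin."""
    totals = defaultdict(float)
    for entry, freq in rows:
        totals[bin_size * (entry // bin_size)] += freq
    return dict(totals)

# Sample data from the question as (Entry, Freq.) pairs.
rows = [(2, 4.5), (3, 3.4), (5, 4.9), (8, 9.1),
        (12, 11.1), (16, 13.1), (18, 12.2), (22, 11.2)]
bin_size = 10
totals = group_frequencies(rows, bin_size)
for start in sorted(totals):
    print('%d-%d\t%.1f' % (start, start + bin_size - 1, totals[start]))
```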
How about just adding to your code there:
def group_data(bin_size):
    grouped_data = {}
    inp = open("c:/usr/ovisek/desktop/file.txt",'r').read().strip().split('\n')
    for line in map(str.split,inp):
        k = int(line[0])
        l = float(line[-1])
        range_value = k // bin_size
        if range_value in grouped_data:
            grouped_data[range_value]['freq'] += l
        else:
            grouped_data[range_value] = {'freq': l, 'value': str(range_value * bin_size) + '-' + str((range_value + 1) * bin_size)}
    return grouped_data
This should give you a dictionary like:
{0 : {'value':'0-10', 'freq':21.9} , .... }
This should get you started, tested fine:
inp = open("/tmp/input.txt",'r').read().strip().split('\n')
interval = 10
index = 0
resultDict = {}
for line in map(str.split,inp):
    k = int(line[0])
    l = float(line[-1])
    rangeNum = (int) ((k-1)/10 )
    rangeKeyName = str(rangeNum*10+1)+"-"+str((rangeNum+1)*10)
    if(rangeKeyName in resultDict):
        resultDict[rangeKeyName] += l
    else:
        resultDict[rangeKeyName] = l
print(str(resultDict))
Would output:
{'21-30': 11.199999999999999, '11-20': 36.399999999999999, '1-10': 21.899999999999999}
you can do something like this:
fr = {}
inp = open("file.txt",'r').read().strip().split('\n')
for line in map(str.split,inp):
    k = int(line[0])
    l = float(line[-1])
    key = abs(k-1) / 10 * 10
    if fr.has_key(key):
        fr[key] += l
    else:
        fr[key] = l
for k in sorted(fr.keys()):
    sum = fr[k]
    print '%d-%d\t%f' % (k+1 if k else 0, k+10, sum)
output:
0-10 21.900000
11-20 36.400000
21-30 11.200000
