Thinking Python

I've been learning Python lately and tonight I was playing around with a couple of examples and I just came up with the following for fun:
#!/usr/bin/env python
a = range(1,21) # Range of numbers to print
max_length = 1 # String length of largest number
num_row = 5 # Number of elements per row
for l in a:
    ln = len(str(l))
    if max_length <= ln:
        max_length = ln
for x in a:
    format_string = '{:>' + str(max_length) + 'd}'
    print (format_string).format(x),
    if not x % num_row and x != 0:
        print '\n',
Which outputs the following:
 1  2  3  4  5
 6  7  8  9 10
11 12 13 14 15
16 17 18 19 20
The script is doing what I want, which is to print aligned rows of 5 numbers per row, calculating the largest width plus one; but I'm almost convinced that there is either a:
more "pythonic" way to do this
more efficient way to do this.
I'm not an expert in big O by any means but I believe that my two for loops change this from an O(n) to at least O(2n), so I would really like to see if it's possible to combine them somehow. I'm also not too keen on my format_string declaration, is there a better way to do that? You aren't helping me cheat on homework or anything, I think this would pass most Python classes, I just want to wrap my head more around the Python way of thinking as I'm coming primarily from Perl (not sure if it shows :). Thanks in advance!

You don't need to build format_string on every iteration. With str.rjust, you don't need a format string at all.
Instead of testing x % num_row (x is an element of the list), test i, a 1-based index obtained from enumerate(a, 1). Think about the case a = range(3, 34).
You can drop the x != 0 check because i will never be 0.
not x % num_row is hard to understand. Use an explicit comparison such as x % num_row == 0 instead.
a = range(1,21)
num_row = 5
a = map(str, a)
max_length = len(max(a, key=len))
for i, x in enumerate(a, 1):
    print x.rjust(max_length),
    if i % num_row == 0:
        print
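For readers on Python 3, where print is a function and map returns an iterator, the same idea can be sketched as below. This is only an illustration under those assumptions: format_grid is a hypothetical helper name, not something from the answer above, and it returns the text instead of printing cell by cell.

```python
def format_grid(values, per_row):
    """Return the values right-aligned in rows of `per_row` items."""
    cells = [str(v) for v in values]
    if not cells:
        return ""
    # Width of the widest cell, computed once
    width = max(len(c) for c in cells)
    rows = []
    # Slice by position, so the result does not depend on the values themselves
    for start in range(0, len(cells), per_row):
        row = cells[start:start + per_row]
        rows.append(" ".join(c.rjust(width) for c in row))
    return "\n".join(rows)


print(format_grid(range(1, 21), 5))
```

Because rows are cut by position, a range such as range(3, 34) still breaks after every fifth item.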

I think you could calculate max_length in a more pythonic way :)
max_length = len(str(max(a)))
or, if your numbers could be negative or floats:
max_length = max([len(str(x)) for x in a])

Another entry. Just to add one with a little functional programming. :-)
n = 20
f = lambda x: str(x).rjust(len(str(n+1))) + (" " if x % 5 else "\n")
print "".join(map(f, range(1,n+1))),

I'm not sure this is not more of a pessimization :-) but building on falsetru's answer a bit, we can use itertools.groupby to group each row by its row index. Because groupby needs a key, we have to enumerate the values, and then discard the enumeration index afterward:
a = range(1,21)
num_row = 5
a = map(str, a)
max_length = len(max(a, key=len))
(same as before, but now:)
from itertools import groupby
# // is integer (floor) division, so each run of num_row indices shares one key
for _, g in groupby(enumerate(a), lambda x: x[0] // num_row):
    print ' '.join(x.rjust(max_length) for _, x in g)
Here each group g consists of all the (enumerated) values that make up one row, each with its enumeration index in front, so the inner generator for ' '.join needs to discard that index (for _, x in g). That leaves just the string x, which gets right-adjusted as before, and then the right-adjusted strings are joined with spaces between them. The resulting string is ready to be printed as a complete line.

Related

Python: Given a string s and an integer C I have to find the summation of difference of number of each alphabet and C

If the number of occurrences is less than C it should be ignored, i.e. the answer is the sum of max(number of occurrences - C, 0).
For example, if the string is aabbaacdddd and C is 2, then the output should be 4. There are 4 a's, 4 d's and 2 b's, so the sum is (4-2) + (4-2) + (2-2) = 4. There is 1 c, but since C > 1 its difference is taken to be zero.
Below is my code.
T = int(input())
for _ in range(0,T):
    N,Q = map(int,input().split())
    s = input()
    #print(ord("a"))
    #print(ord("z"))
    for q in range(0,Q):
        C = int(input())
        m = [0] * 26
        for i in s:
            m[ord(i)-97] = m[ord(i)-97] + 1
        #print(m)
        ans = []
        m.sort(reverse=True)
        for i in m:
            if i < C:
                break
            ans.append(i-C)
        #print(ans)
        print(sum(ans))
I am getting a time limit exceeded in this. What would be a faster way to do this?
I would prefer a solution that does not use built-ins or a dictionary
The constraints are:
All characters in s are lowercase letters
T, N, Q < 10^5
C <= 10^9

This will work under the given constraints, assuming the string consists only of lowercase letters.
for _ in range(int(input().strip())):
    N, Q = map(int, input().strip().split())
    s = input().strip()
    frequencies = [0] * 26
    for k in s:  # count the letters once per test case
        frequencies[ord(k) - 97] += 1
    # To get something like [4,4,1,2,0,0,0,0,0,....]
    counts = sorted(x for x in frequencies if x > 0)
    # counts = [1,2,4,4]
    for __ in range(Q):  # there are Q queries (N is the string length)
        C = int(input().strip())
        # Find the index of the first value in counts that is greater than C.
        for i, c in enumerate(counts):
            if c > C:
                # Sum from there on, subtracting C from each individual count.
                print(sum(counts[i:]) - C * len(counts[i:]))
                break
        else:
            # No count exceeds C (or the string is empty).
            print(0)
And to answer why your code exceeds the time limit:
You are iterating over the entire string s inside the for loop over the queries.
So if len(s) = 10^5 and Q = 10^5 you will make 10^10 iterations, i.e. O(n^2) when both are of size n.
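A minimal sketch of the "count once, then answer each query from the 26 counts" idea, written without a dictionary or sorting as the question prefers. The function names count_letters and answer_query are hypothetical, and reading the input is left out; the string is assumed to contain only lowercase a-z.

```python
def count_letters(s):
    """Count each lowercase letter once: O(len(s))."""
    counts = [0] * 26
    for ch in s:
        counts[ord(ch) - ord("a")] += 1
    return counts


def answer_query(counts, c):
    """Sum of max(count - c, 0) over the 26 letters: O(26) per query."""
    total = 0
    for n in counts:
        if n > c:
            total += n - c
    return total


counts = count_letters("aabbaacdddd")  # done once per test case
print(answer_query(counts, 2))         # done once per query
```

With this split, the work per test case is roughly len(s) + 26 * Q steps instead of len(s) * Q.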

Ok. First of all sorting is a no-no because that will push your complexity to O(n log(n)).
Make an array for taking the count of the 26 alphabets.
Then, count the alphabets in the string using a single loop i.e. O(n).
Go through your count array and apply the formula you stated above.
Overall complexity would be O(n), which I think is what they require.
If you want to leverage all the Python power, then,
from collections import Counter
a = Counter(s) # it will return a dict-like object with counts of all unique characters in the string s
Iterate through the returned dict and you will get your answer.
But, I suggest you use the O(n) approach above. Your code will pass then.

How to increment a variable inside of a list comprehension

I have a Python function which increments a character's ASCII value by 1 if its position is odd and decreases it by 1 if its position is even (counting positions from 1). I want to convert it into a function which does the same increment and decrement, but for the next pair I want to increment by 2 and then decrement by 2, then by 3 and 3, and so on.
What I am trying to do is increment the counter variable by 1 each time an even position occurs, after performing the ASCII decrement.
I also don't want to do it with a for loop. Is there any way to do it in the list comprehension itself?
In my function if the input is
input :'abcd' output: is 'badc' what i want is 'baeb'
input :'cgpf' output: is 'dfqe' what i want is 'dfrd'
def changer(s):
    b=list(s)
    count=1
    d=[chr(ord(b[i])+count) if i%2==0 else chr(ord(b[i])-count) for i in range(0,len(b))]
    return ''.join(d)
I need something like count++ as shown below, but sadly Python doesn't support it.
def changer(s):
    b=list(s)
    count=1
    d=[chr(ord(b[i])+count) if i%2==0 else chr(ord(b[i])-count++) for i in range(0,len(b))]
    return ''.join(d)

If I correctly understood what you're after, something like this (compact form) should do:
def changer(s):
    return "".join(chr(ord(c) + ((i // 2) + 1) * (-1 if i % 2 else 1))
                   for i, c in enumerate(s))
We get the index and character from the string by means of enumerate() and feed them, as index (i) and character (c), into a generator expression (a comprehension, as asked for).
To the ASCII value of each c we add the result of the integer division of i by 2, incremented by one (as the index is 0-based), multiplied by (-1 if i % 2 else 1), which flips +/- based on an even/odd index: we multiply by -1 when i modulo 2 is non-zero (so bool() evaluates it as True), otherwise by 1.
Needless to say: such comprehension is not necessarily easy to read, so if you'd use it in your script, it would deserve a good comment for future reference. ;)
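For comparison, here is a sketch of the same rule written as an explicit loop, which can be handy for checking the comprehension even though the question asks to avoid a for loop. The name changer_loop is hypothetical; the step variable plays the role of the count++ the question wished for.

```python
def changer_loop(s):
    """Shift characters by +1, -1, +2, -2, +3, -3, ..."""
    out = []
    step = 1
    for i, c in enumerate(s):
        if i % 2 == 0:
            # Even 0-based index: move the character up by the current step
            out.append(chr(ord(c) + step))
        else:
            # Odd 0-based index: move it down, then grow the step
            out.append(chr(ord(c) - step))
            step += 1
    return "".join(out)


print(changer_loop("abcd"))
print(changer_loop("cgpf"))
```

For index i the step equals i // 2 + 1, which is exactly the closed form the comprehension uses.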

Combine a stream of plus/minus with the string.
import itertools
s = 'abcdefg'
x = range(1,len(s)+1)
y = range(-1,-len(s)-1,-1) # -1, -2, ..., -len(s), so short strings work too
z = itertools.chain.from_iterable(zip(x,y))
r = (n + ord(c) for (n,c) in zip(z,s))
''.join(chr(n) for n in r)
I don't think I'd try to put it all in one line. Uses generator expressions instead of list comprehensions.

Try the code below,
def changer(s):
    return ''.join([chr(ord(j) + (i // 2 + 1)) if i % 2 == 0 else chr(ord(j) - (i // 2 + 1)) for i, j in enumerate(s)])
s = 'abcd'
changer(s)
output
baeb

Python sorting arrays to get two digit values

I have an array A = [1 - 100] and I need to find the sum of all the two digit values in this array. How would I approach this? I have tried :
def solution(A):
A =array[0-100])
while A > 9 & A < 99
total = sum(A)
print "%s" % total
)
Is there a function that, given an array consisting of N integers, returns the sum of all two-digit numbers? E.g. for A = [1,1000,80, -91] the function should return -11 (as the two-digit numbers are 80 and -91). The input is an arbitrary array, not a range.

You can use a list comprehension and check if the length of the string-format is equal to 2, like so:
sum([x if len(str(x))==2 else 0 for x in xrange(1,101)])

Use the keyword and rather than the bitwise &.
Edit: a fuller answer, as that's not the only thing wrong:
def solution():
    A = range(101)
    total = sum([a for a in A if 9 < a <= 99])
    print total
This uses list comprehension and chained inequalities, so is pretty 'pythonic'.

There are tons of errors in your code. Next time, before posting, please spend some time trying to figure it out yourself, and make sure that your code at least doesn't contain any obvious syntax errors.
By array, I assume you're talking about a list. Change it to range(101) to get every number from 0 to 100.
def solution(A):
    return sum([x for x in range(A) if len(str(abs(x))) == 2])
print(solution(101))
As a side note, use and instead of & since that's the bitwise AND operator.

Here are a couple of ways to go about the problem, the first is most similar to the approach you appear to be trying:
def solution1(array):
    total = 0
    for a in array:
        if 9 < a < 100:
            total += a
    return total
print(solution1(range(101)))
And here's a more compact solution using a comprehension (actually, a generator expression):
def solution2(array):
    return sum(a for a in array if 9 < a < 100)
print(solution2(range(101)))
Note that in your original you're confusing loops and conditionals.
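The question's own example, A = [1,1000,80, -91] with expected result -11, includes a negative two-digit number, which a test like 9 < a < 100 would skip. A minimal sketch that compares the absolute value instead (sum_two_digit is a hypothetical name, and treating -91 as a two-digit number is taken from that example):

```python
def sum_two_digit(values):
    """Sum the integers whose absolute value has exactly two digits."""
    return sum(v for v in values if 10 <= abs(v) <= 99)


print(sum_two_digit([1, 1000, 80, -91]))
print(sum_two_digit(range(101)))
```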

Is there a better way to write the following method in python?

I am writing a small program, in python, which will find a lone missing element from an arithmetic progression (where the starting element could be both positive and negative and the series could be ascending or descending).
so for example: if the input is 1 3 5 9 11, then the function should return 7 as this is the lone missing element in the above AP series.
The input format: the input elements are separated by 1 white space and not commas as is commonly done.
Here is the code:
def find_missing_elm_ap_series(n, series):
    ap = series
    ap = ap.split(' ')
    ap = [int(i) for i in ap]
    cd = []
    for i in range(n-1):
        cd.append(ap[i+1]-ap[i])
    common_diff = 0
    if len(set(cd)) == 1:
        print 'The series is complete'
        return series
    else:
        cd = [abs(i) for i in cd]
        common_diff = min(cd)
        if ap[0] > ap[1]:
            common_diff = (-1)*common_diff
        new_ap = []
        for i in range(n+1):
            new_ap.append(ap[0] + i*common_diff)
        missing_element = set(new_ap).difference(set(ap))
        return missing_element
where n is the length of the series provided (the series with the missing element:5 in the above example).
I am sure there are other shorter and more elegant way of writing this code in python. Can anybody help ?
Thanks
BTW: i am learning python by myself and hence the question.

This is based on the fact that, if an element is missing, it is exactly the expected sum of the series minus its actual sum. The expected sum for a series with n elements starting at a and ending at b is (a+b)*n/2. The rest is Python:
def find_missing(series):
    A = map(int, series.split(' '))
    a, b, n, sumA = A[0], A[-1], len(A), sum(A)
    if (a+b)*n/2 == sumA:
        return None #no element missing
    return (a+b)*(n+1)/2-sumA
print find_missing("1 3 5 9") #7
print find_missing("-1 1 3 5 9") #7
print find_missing("9 6 0") #3
print find_missing("1 2 3") #None
print find_missing("-3 1 3 5") #-1
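A Python 3 sketch of the same sum-based idea, with one change that is my own assumption rather than part of the answer: completeness is decided by checking that all differences are equal instead of comparing sums. The sum comparison cannot tell a complete series from one whose missing element is the midpoint of the two ends (for "1 2 4 5" the sums match even though 3 is missing). The name find_missing_by_sum is hypothetical.

```python
def find_missing_by_sum(series):
    """Return the single missing term of an arithmetic progression, or None."""
    terms = [int(tok) for tok in series.split()]
    n = len(terms)
    # A complete progression has a single distinct difference
    diffs = {terms[i + 1] - terms[i] for i in range(n - 1)}
    if len(diffs) <= 1:
        return None
    first, last = terms[0], terms[-1]
    # Expected sum of the full (n + 1)-term series minus the actual sum
    return (first + last) * (n + 1) // 2 - sum(terms)


print(find_missing_by_sum("1 3 5 9"))
print(find_missing_by_sum("1 2 4 5"))
```

The floor division is exact here because the sum of a complete integer progression, (first + last) * (n + 1) / 2, is always an integer.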

Well... You can do simpler, but it would completely change your algorithm.
First, you can prove that the step for the arithmetic progression is ap[1] - ap[0], unless ap[2] - ap[1] is lower in magnitude than it, in which case the missing element is between terms 0 and 1. (This is true as there is a single missing element.)
Then you can just compare each term with ap[0] + i * step and return the first expected value that doesn't match.
Here is the source code (also implementing some minor shortcuts, such as grouping your first three lines into one):
def find_missing_elm_ap_series(n, series):
    ap = [int(i) for i in series.split(' ')]
    step = ap[1] - ap[0]
    if abs(ap[2] - ap[1]) < abs(step): # Missing element is between terms 0 and 1
        return ap[0] + ap[2] - ap[1]
    for i, val in enumerate(ap): # Otherwise, find the position of the missing element
        if ap[0] + i * step != val:
            return ap[0] + i * step
    return series # missing element not found

The code appears to work, but there is perhaps a slightly easier way to get it done: you don't have to look through all of the values to find the common difference. The following code simply looks at the difference between the 1st and 2nd items and between the last and second-to-last items.
This works provided only a single value is missing (and the list has at least 3 items), because the smaller of those two differences is then the common difference.
def find_missing(prog):
    # First we cast them to numbers.
    items = [int(x) for x in prog.split()]
    #Then we compare the first and second
    first_to_second = items[1] - items[0]
    #then we compare the last to second last
    last_to_second_last = items[-1] - items[-2]
    #Now we have to care about which one is closest
    # to zero
    if abs(first_to_second) < abs(last_to_second_last):
        change = first_to_second
    else:
        change = last_to_second_last
    #Iterate through the list. As soon as we find a gap
    #that is larger than change, we fill in and return
    for i in range(1, len(items)):
        comp = items[i] - items[i-1]
        if comp != change:
            return items[i-1] + change
    #There was no gap
    return None
print(find_missing("1 3 5 9")) #7
print(find_missing("-1 1 3 5 9")) #7
print(find_missing("9 6 0")) #3
print(find_missing("1 2 3")) #None
The code above first finds the common difference (change) from the two end gaps, then iterates until it meets a gap that differs from change and returns the value that was expected there.

Here's the way I thought about it: find the position of the maximum difference between the elements of the array; then regenerate the expected number in the sequence from the other differences (which should be all the same and the minimum number in the differences list):
def find_missing(a):
    d = [a[i+1] - a[i] for i in range(len(a)-1)]
    # key=abs compares magnitudes, so descending series work too
    i = d.index(max(d, key=abs))
    x = min(d, key=abs)
    return a[0] + (i+1)*x
print find_missing([1,3,5,9,11])
7
print find_missing([1,5,7,9,11])
3

Here are some ideas:
Passing the length of the series seems like a bad idea. The function can more easily calculate the length
There is no reason to assign series to ap, just do a function using series and assign the result to ap
When splitting the string, don't give the sep argument. If you don't give the argument, then consecutive white space will also be removed and leading and trailing white space will also be ignored. This is more friendly on the format of the data.
I've combined a few operations. For example the split and the list comprehension converting to integer make sense to group together. There is also no need to create cd as a list and then convert that to a set. Just build it as a set to start with.
I don't like that the function returns the original series in the case of no missing element. The value None would be more in keeping with the name of the function.
Your original function returned a one item set as the result. That seems odd, so I've used pop() to extract that item and return just the missing element.
The last item was more of an experiment with combining all of the code at the bottom into a single statement. Don't know if it is better, but it's something to think about. I built a set with all the correct numbers and a set with the given numbers and then subtracted them and returned the number that was missing.
Here's the code that I came up with:
def find_missing_elm_ap_series(series):
    ap = [int(i) for i in series.split()]
    n = len(ap)
    cd = {ap[i+1]-ap[i] for i in range(n-1)}
    if len(cd) == 1:
        print 'The series is complete'
        return None
    else:
        common_diff = min([abs(i) for i in cd])
        if ap[0] > ap[1]:
            common_diff = (-1)*common_diff
        return set(range(ap[0],ap[0]+common_diff*n,common_diff)).difference(set(ap)).pop()

Assuming the first & last items are not missing, we can also make use of range() or xrange() with the step of the common difference, getting rid of the n altogether, it can also return more than 1 missing item (although not reliably depending on number of items missing):
In [13]: def find_missing_elm(series):
   ....:     ap = map(int, series.split())
   ....:     cd = map(lambda x: x[1]-x[0], zip(ap[:-1], ap[1:]))
   ....:     if len(set(cd)) == 1:
   ....:         print 'complete series'
   ....:         return ap
   ....:     mcd = min(cd) if ap[0] < ap[1] else max(cd)
   ....:     sap = set(ap)
   ....:     return filter(lambda x: x not in sap, xrange(ap[0], ap[-1], mcd))
   ....:
In [14]: find_missing_elm('1 3 5 9 11 15')
Out[14]: [7, 13]
In [15]: find_missing_elm('15 11 9 5 3 1')
Out[15]: [13, 7]

Find repeats with certain length within a string using python

I am trying to use the regex module to find non-overlapping repeats (duplicated sub-strings) within a given string (30 char), with the following requirements:
I am only interested in non-overlapping repeats that are 6-15 char long.
allow 1 mis-match
return the positions for each match
One way I thought of is that for each possible repeat length, let python loop through the 30char string input. For example,
import regex # the third-party regex module, not the standard re
string = "ATAGATATATGGCCCGGCCCATAGATATAT" #input
#for 6char repeats, first one in loop would be for the following event:
text = "ATAGAT"
text2 ="(" + text + ")"+ "{e<=1}" #this is to allow 1 mismatch later in regex
string2="ATATGGCCCGGCCCATAGATATAT" #string after excluding text
for x in regex.finditer(text2,string2,overlapped=True):
    print x.span()
#then still for 6char repeats, I will move on to text = "TAGATA"...
#after 6char, loop again for 7char...
There should be two outputs for this particular string = "ATAGATATATGGCCCGGCCCATAGATATAT": 1. The two "ATAGATATAT" repeats, plus 1 mismatch: "ATAGATATATG" & "CATAGATATAT", with position index returned as (0,10) & (19, 29); 2. "TGGCCC" & "GGCCCA" (one mismatch needs to be added to reach at least 6 char), with index (9,14) & (15,20). Numbers can be in a list or table.
I'm sorry that I didn't include a real loop, but I hope the idea is clear... As you can see, this is a very inefficient method, not to mention that it would create redundancy, e.g. 10-char repeats would be counted more than once, because they would also match in the 9-, 8-, 7- and 6-char repeat loops. Moreover, I have a lot of such 30-char strings to work with, so I would appreciate your advice on some cleaner methods.
Thank you very much:)

I'd try a straightforward algorithm instead of regex (which is quite confusing in this instance):
s = "ATAGATATATGGCCCGGCCCATAGATATAT"
def fuzzy_compare(s1, s2):
    # sanity check
    if len(s1) != len(s2):
        return False
    diffs = 0
    for a, b in zip(s1, s2):
        if a != b:
            diffs += 1
            if diffs > 1:
                return False
    return True
slen = len(s) # 30
for l in range(6, 16):
    i = 0
    while (i + l * 2) <= slen:
        sub1 = s[i:i+l]
        # + 1 so that a repeat ending exactly at the end of s is also tried
        for j in range(i+l, slen - l + 1):
            sub2 = s[j:j+l]
            if fuzzy_compare(sub1, sub2):
                # checking if this could be partial
                partial = False
                if i + l < j and j + l < slen:
                    extsub1 = s[i:i+l+1]
                    extsub2 = s[j:j+l+1]
                    # if it is partial, we'll get it later in the main loop
                    if fuzzy_compare(extsub1, extsub2):
                        partial = True
                if not partial:
                    print (i, i+l), (j, j+l)
        i += 1
It's a first draft, so feel free to experiment with it. It also seems to be clunky and not optimal, but try running it first - it may be sufficient enough.
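A compact Python 3 sketch of the same brute-force search, collecting the spans in a list instead of printing them. It is a simplification, not the draft above: the names hamming_at_most and find_fuzzy_repeats are hypothetical, and the "partial" suppression is left out, so a long repeat is also reported at every shorter length.

```python
def hamming_at_most(s1, s2, limit=1):
    """True if equal-length strings differ in at most `limit` positions."""
    if len(s1) != len(s2):
        return False
    return sum(a != b for a, b in zip(s1, s2)) <= limit


def find_fuzzy_repeats(s, min_len=6, max_len=15):
    """Return ((i, i+l), (j, j+l)) spans of non-overlapping near-repeats."""
    hits = []
    for length in range(min_len, max_len + 1):
        # i must leave room for a second, non-overlapping window
        for i in range(len(s) - 2 * length + 1):
            first = s[i:i + length]
            # j starts right after the first window and may end at len(s)
            for j in range(i + length, len(s) - length + 1):
                if hamming_at_most(first, s[j:j + length]):
                    hits.append(((i, i + length), (j, j + length)))
    return hits


for spans in find_fuzzy_repeats("ATAGATATATGGCCCGGCCCATAGATATAT", 10, 10):
    print(spans)
```

For 30-character inputs the triple loop is small enough that the quadratic number of window pairs should not be a concern.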
