Performance of concatenating strings vs joining lists? [duplicate]

This question already has answers here:
What is the most efficient string concatenation method in Python?
(12 answers)
Closed 1 year ago.
I'm doing a small Python exercise for fun.
I need to print something that looks like this based on some input, and some logic:
.P....
.I....
.D....
.Z....
BANANA
.M....
.A....
And right now I'm struggling a bit with constructing the strings with the dots in them.
So what I need is a function that takes a number n, a number i, and a char, like this:
def buildstring(n,i, char):
And then returns a string of length n, consisting only of dots, except that the i'th char is the char given.
I currently have this attempt:
def buildprint(n,i,char):
    start = "".join(['.']*(i))
    mid = char
    end = "".join(['.']*(n - i-1))
    print(start+mid+end)
buildprint(10,3,"j")
which produces:
...j......
Which is what I want. But my solution will be graded on time, and I'm not too sure that this is the most efficient way of concatenating the strings, since I remember something about string concatenation often being the slow part of a program. Would this be more performant?
def buildprint(n,i,char):
    start = ['.']*(i)
    mid = [char]
    end = ['.']*(n - i-1)
    print("".join(start+mid+end))
So the question is really about the way string concatenation works, vs joining lists

I would use a completely different approach, using string interpolation with an f-string:
def buildprint(n,i,char):
    start = "".join(['.']*(i))
    mid = char
    end = "".join(['.']*(n - i-1))
    print(start+mid+end)
def build_f_string(total_char, dots, char):
    total_char -= dots + 1
    print(f'{"." * dots}{char}{"." * total_char}')
test_1 = (10, 3, "j")
buildprint(*test_1)
build_f_string(*test_1)
One Liner
build_f_string = lambda total_char, dots, char : print(f'{"." * dots}{char}{"." * (total_char - (dots + 1 ))}')
build_f_string(*test_1)
Performance
Make a timer function
from time import perf_counter
from functools import wraps
def timer(runs):
    def _timer(f,):
        @wraps(f)
        def wrapper(*args, **kwargs):
            start = perf_counter()
            for test in range(runs):
                res = f(*args, **kwargs)
            end = perf_counter()
            print(f'{f.__name__}: Total Time: {end - start}')
            return res
        return wrapper
    return _timer
Decorate our functions
TEST_RUNS = 100_000
@timer(runs=TEST_RUNS)
def buildprint(n, i, char):
    start = "".join(['.'] * (i))
    mid = char
    end = "".join(['.'] * (n - i - 1))
    return start + mid + end
@timer(runs=TEST_RUNS)
def build_f_string(total_char, dots, char):
    # the tail needs total_char - dots - 1 dots so the result has length total_char
    return f'{"." * dots}{char}{"." * (total_char - dots - 1)}'
Run Tests
test_1 = (10, 3, "j")
buildprint(*test_1)
build_f_string(*test_1)
# Output
# buildprint: Total Time: 0.06150109999999999
# build_f_string: Total Time: 0.027191400000000004
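As a cross-check on the hand-rolled timer, the same comparison can be sketched with the standard-library timeit module. The names build_join and build_fstring are placeholders introduced here (they are not from the question), and the run count is an arbitrary choice; absolute timings will differ from machine to machine.

```python
import timeit

def build_join(n, i, char):
    # Variant from the question: join lists of dots, then concatenate.
    return "".join(["."] * i) + char + "".join(["."] * (n - i - 1))

def build_fstring(n, i, char):
    # Variant from this answer: repeat the dot with * inside an f-string.
    return f'{"." * i}{char}{"." * (n - i - 1)}'

if __name__ == "__main__":
    for func in (build_join, build_fstring):
        # timeit.timeit accepts a zero-argument callable and a repeat count.
        seconds = timeit.timeit(lambda: func(10, 3, "j"), number=100_000)
        print(f"{func.__name__}: {seconds:.4f} s")
```

timeit takes care of the loop and uses a high-resolution clock, so no decorator is needed.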
Going to next level
We can use Python Deque
Deques are a generalization of stacks and queues (the name is pronounced “deck” and is short for “double-ended queue”). Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction.
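One detail worth checking before using extendleft this way: it pushes items onto the left one at a time, so the argument ends up in reverse order. That makes no difference for identical dots, but it would for mixed characters. A small sketch:

```python
from collections import deque

d = deque(["j"])
d.extendleft("abc")   # pushes 'a', then 'b', then 'c' onto the left
d.extend("xyz")       # appends on the right in the given order
result = "".join(d)
print(result)  # cbajxyz
```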
def build_withdeque(n, i, char):
    hyper_list = deque([char]) # initialize the deque with the char, is in the middle already
    hyper_list.extendleft(["." for _ in range(i)]) # Add left side
    hyper_list.extend(["." for _ in range(n - i - 1)])
    return "".join(hyper_list)
Running test
from collections import deque
TEST_RUNS = 10_00_000
@timer(runs=TEST_RUNS)
def build_withdeque(n, i, char):
    hyper_list = deque([char]) # initialize the deque with the char, is in the middle already
    hyper_list.extendleft(["." for _ in range(i)]) # Add left side
    hyper_list.extend(["." for _ in range(n - i - 1)])
    return "".join(hyper_list)
test_1 = (10, 3, "j")
buildprint(*test_1)
build_f_string(*test_1)
build_withdeque(*test_1)
# buildprint: Total Time: 0.546178
# build_f_string: Total Time: 0.29445559999999993
# build_withdeque: Total Time: 1.4074019
Still not so good..
Pre build a list
@timer(runs=TEST_RUNS)
def build_list_complete(n, i, char):
    return ''.join(["." * i, char, "." * (n - i - 1)])
test_1 = (10, 3, "j")
buildprint(*test_1)
build_f_string(*test_1)
build_withdeque(*test_1)
build_list_complete(*test_1)
# buildprint: Total Time: 0.5440142
# build_f_string: Total Time: 0.3002815999999999
# build_withdeque: Total Time: 1.4215970999999998
# build_list_complete: Total Time: 0.30323829999999985
Final Result
# TEST_RUNS = 10_00_000
test_1 = (10, 3, "j")
buildprint(*test_1)
build_f_string(*test_1)
build_withdeque(*test_1)
build_list_complete(*test_1)
# buildprint: Total Time: 0.6512364
# build_f_string: Total Time: 0.2695955000000001
# build_withdeque: Total Time: 14.0086889
# build_list_complete: Total Time: 3.049139399999998
Wait, not so fast
As @MisterMiyagi points out, using return "." * i + char + ("." * (n - i - 1)) could be equivalent to using return f'{"." * dots}{char}{"." * (total_char - dots - 1)}'.
However, it is actually faster.
Look at this:
def SuperHyperMegaUltraFastButBasicallyLikeFStringThatMisterMiyagiSayd(n, i, char):
    return "." * i + char + ("." * (n - i - 1))
Run some tests with 100_00_000 repetitions.
@timer(runs=TEST_RUNS)
def SuperHyperMegaUltraFastButBasicallyLikeFStringThatMisterMiyagiSayd(n, i, char):
    return "." * i + char + ("." * (n - i - 1))
test_1 = (10, 3, "j")
buildprint(*test_1)
build_f_string(*test_1)
build_withdeque(*test_1)
build_list_complete(*test_1)
SuperHyperMegaUltraFastButBasicallyLikeFStringThatMisterMiyagiSayd(*test_1)
# buildprint: Total Time: 5.3067210000000005
# build_f_string: Total Time: 2.721678
# build_withdeque: Total Time: 14.302031600000001
# build_list_complete: Total Time: 3.0364287999999995
# SuperHyperMegaUltraFastButBasicallyLikeFStringThatMisterMiyagiSayd: Total Time: 2.440598699999999
Plain string concatenation, closely followed by the f-string, is the clear winner here, until proven otherwise.
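Before trusting any of these timings, it is worth confirming that every variant returns the same string. The sketch below restates the variants without the timer decorator (the short names are placeholders introduced here) and assumes the intended tail length is n - i - 1, as in the question.

```python
from collections import deque

def with_join(n, i, char):
    return "".join(["."] * i) + char + "".join(["."] * (n - i - 1))

def with_fstring(n, i, char):
    return f'{"." * i}{char}{"." * (n - i - 1)}'

def with_deque(n, i, char):
    d = deque([char])
    d.extendleft("." * i)
    d.extend("." * (n - i - 1))
    return "".join(d)

def with_concat(n, i, char):
    return "." * i + char + "." * (n - i - 1)

variants = (with_join, with_fstring, with_deque, with_concat)
outputs = {f.__name__: f(10, 3, "j") for f in variants}
# Every variant should produce the same 10-character line.
assert set(outputs.values()) == {"...j......"}
```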

Related

Confused about Backtracking?

I'm trying to solve this Leetcode Question but I'm getting an error where I'm exceeding the time limit.
class Solution:
    def readBinaryWatchHelper(self,hours, minutes, num, pastHours, pastMinutes):
        if num == 0:
            hour, minute = sum(pastHours), sum(pastMinutes)
            if self.isValidTime(hour, minute):
                time = str(hour) + ":" + str(minute).zfill(2)
                self.times.add(time)
        else:
            for i in minutes:
                cMinutesTemp = list(minutes)
                pMinutesTemp = list(pastMinutes)
                pMinutesTemp.append(i)
                cMinutesTemp.remove(i)
                if self.isValidTime(sum(pastHours), sum(pMinutesTemp)):
                    self.readBinaryWatchHelper(hours, cMinutesTemp, num - 1, pastHours, pMinutesTemp)
            for i in hours:
                cHoursTemp = list(hours)
                pHoursTemp = list(pastHours)
                pHoursTemp.append(i)
                cHoursTemp.remove(i)
                if self.isValidTime(sum(pHoursTemp), sum(pastMinutes)):
                    self.readBinaryWatchHelper(cHoursTemp, minutes, num - 1, pHoursTemp, pastMinutes)
    @staticmethod
    def isValidTime(hours, minutes):
        if hours < 12 and minutes < 60:
            return True
        return False
    def readBinaryWatch(self, num):
        self.times = set()
        hChoices = [1,2,4,8]
        mChoices = [1,2,4,8,16,32]
        self.readBinaryWatchHelper(hChoices[::-1], mChoices[::-1], num, [],[])
        return list(self.times)
That's the solution I've written up using Backtracking. I was hoping I could get feedback on why it's too slow? One of the valid solutions is just getting all the combinations of hours from 0 - 12 and minutes from 0 - 60 and checking the sum of the bits to see if they add up to the correct sum. I'm confused as to how that solution is faster than mine? Shouldn't mine be faster due to the "tree pruning"? Thanks guys.
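For comparison, the brute-force approach described above can be sketched as follows. It checks all 12 × 60 = 720 candidate times and counts set bits with bin(...).count("1"). The function name read_binary_watch is a placeholder, not the signature LeetCode expects.

```python
def read_binary_watch(num):
    # Try every valid time and keep those whose hour and minute
    # together have exactly `num` bits set.
    times = []
    for hour in range(12):
        for minute in range(60):
            if bin(hour).count("1") + bin(minute).count("1") == num:
                times.append(f"{hour}:{minute:02d}")
    return times
```

The amount of work here is fixed at 720 checks regardless of num. The recursive version, by contrast, copies lists at every step and can reach the same set of LEDs through several different pick orders, which is one plausible reason it ends up slower despite the pruning.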

A solution w/o backtracking but using itertools and list comp would be:
class Solution:
    def readBinaryWatch(self, num):
        """
        :type num: int
        :rtype: List[str]
        """
        from itertools import combinations, product
        hhmin = min(num,3)+1 # only generate as much as needed
        mmmin = min(num,5)+1 # only generate as much as needed
        nBits = [1,2,4,8,16,32]
        # {0: ['0'], 1: ['1','2','4','8'], 2: ['3','5','9','6','10'], 3: ['7','11']} for>2
        h = { bitCount: list(str(sum(x)) for x in combinations(nBits,bitCount)
                             if sum(x) < 12) for bitCount in range(hhmin)}
        m = { bitCount: list(str(sum(x)).zfill(2) for x in combinations(nBits,bitCount)
                             if sum(x) < 60) for bitCount in range(mmmin)}
        ranges = ((h[subl[0]],m[subl[1]]) for subl in (x for x in product(range(num + 1),
                  range(num + 1)) if sum(x) == num and x[0]<hhmin and x[1]<mmmin))
        return ["{0}:{1}".format(hh,mm) for (hhhh,mmmm) in ranges
                for hh in hhhh for mm in mmmm]
Test with:
s = Solution()
for n in range(8):
    print(s.readBinaryWatch(n))
    print()
It clocks in between 55 ms and 64 ms, varying between submissions. According to the page this reaches 77%; there are some much shorter and more elegant solutions, and you can inspect some of them once you submit one. Yours unfortunately does not finish in time: the recursion seems to be too slow.
Funny enough the "brute force" if.. elif .. elif ... else with prefabbed lists is below 50% - just had to try it ;)

Time measurement - function with an input inside

My task is to measure time of a function which contains an input inside (the user is to input a list).
Here is the code:
import numpy
import itertools
def amount(C):
    N = numpy.array(input().strip().split(" "),int)
    N = list(N)
    N = sorted(N)
    while C < max(N):
        N.remove(max(N))
    res = []
    for i in range(1, C):
        for j in list(itertools.combinations_with_replacement(N, i)):
            res.append(sum(list(j)))
    m = 0
    for z in range (0, len(res)):
        if res[z] == C:
            m += 1
    if N[0] == 1:
        return m + 1
    else:
        return m
Of course it is not optimized (but that is not the point right now).
Because of the input list, the standard way of timing, by which I mean:
import time
start = time.time()
amount(10)
end = time.time()
print(end - start)
does not work.
How can I measure the time? For example for:
C in range [0, 11]
N = [1, 2 , 5]
I would be grateful for any help! :)

Do you want something like this?
import time
# measure process time (time.clock() was removed in Python 3.8)
t0 = time.process_time()
amount(10)
print(time.process_time() - t0, "seconds process time")
# measure wall time
t0 = time.time()
amount(10)
print(time.time() - t0, "seconds wall time")
UPDATE: The solution could be this:
times = []
for x in range(12):
    t0 = time.time()
    amount(x)
    times.append(time.time() - t0)
or, better, use timeit.
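If the goal is to time amount without typing the list on every call, one option is to substitute input() temporarily using unittest.mock.patch from the standard library. In the sketch below, amount is a simplified stand-in for the function in the question (it only counts the values not larger than C), and time_with_fake_input is a hypothetical helper. Note that the time spent parsing the fake input line is still part of the measurement.

```python
import time
from unittest.mock import patch

def amount(C):
    # Stand-in for the question's function: it reads its list via input().
    N = sorted(int(x) for x in input().strip().split())
    return sum(1 for x in N if x <= C)

def time_with_fake_input(func, arg, fake_line):
    # While the patch is active, input() returns fake_line immediately.
    with patch("builtins.input", return_value=fake_line):
        start = time.perf_counter()
        result = func(arg)
        elapsed = time.perf_counter() - start
    return result, elapsed

timings = []
for C in range(12):
    _, elapsed = time_with_fake_input(amount, C, "1 2 5")
    timings.append(elapsed)
```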

What do you think about this:
import time
def time_counter(amount, n=100, M=100):
    res = list(range(n))
    def count_once():
        start = time.perf_counter()
        amount(res)
        return time.perf_counter() - start
    return [count_once() for m in range(M)]
It is meant to take M measurements (e.g. M = 1000), for C in range(0, n). Is that all right?

Interpolate between elements in an array of floats

I'm getting a list of 5 floats which I would like to use as values to send pwm to an LED. I want to ramp smoothly in a variable amount of milliseconds between the elements in the array.
So if this is my array...
list = [1.222, 3.111, 0.456, 9.222, 22.333]
I want to ramp from 1.222 to 3.111 over say 3000 milliseconds, then from 3.111 to 0.456 over the same amount of time, and when it gets to the end of the list I want the 5th element of the list to ramp to the 1st element of the list and continue indefinitely.

Are you thinking of something like this?
import time
l = [1.222, 3.111, 0.456, 9.222, 22.333]
def play_led(value):
    # here should be the LED code
    print(value)
def calc_ramp(given_list, interval_count):
    new_list = []
    len_list = len(given_list)
    for i in range(len_list):
        first = given_list[i]
        second = given_list[(i+1) % len_list]
        delta = (second - first) / interval_count
        for j in range(interval_count):
            new_list.append(first + j * delta)
    return new_list
def endless_play_led(ramp_list,count):
    endless = count == 0
    count = abs(count)
    while endless or count!=0:
        for i in range(len(ramp_list)):
            play_led(ramp_list[i])
            #time.sleep(1)
        if not endless:
            count -= 1
        print('##############',count)
endless_play_led(calc_ramp(l, 3),2)
endless_play_led(calc_ramp(l, 3),-2)
endless_play_led(calc_ramp(l, 3),0)

Another version, similar to the version of dsgdfg (based on his/her idea), but without timing lag:
import time
list_of_ramp = [1.222, 3.111, 0.456, 9.222, 22.333]
def play_LED(value):
    s = ''
    for i in range(int(value*4)):
        s += '*'
    print(s, value)
def interpol(first, second, fract):
    return first + (second - first)*fract
def find_borders(list_of_values, total_time, time_per_step):
    len_list = len(list_of_values)
    total_steps = total_time // time_per_step
    fract = (total_time - total_steps * time_per_step) / float(time_per_step)
    index1 = int(total_steps % len_list)
    return [list_of_values[index1], list_of_values[(index1 + 1) % len_list], fract]
def start_program(list_of_values, time_per_step, relax_time):
    total_start = time.time()
    while True:
        last_time = time.time()
        while time.time() - last_time < relax_time:
            pass
        x = find_borders(list_of_values,time.time(),time_per_step)
        play_LED(interpol(x[0],x[1],x[2]))
start_program(list_of_ramp,time_per_step=5,relax_time=0.5)
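Both answers boil down to linear interpolation between neighbouring values, wrapping from the last element back to the first. A minimal generator-based sketch, assuming a fixed number of steps per segment (for example 3000 ms divided by the chosen update interval); ramp_cycle and steps_per_segment are names introduced here, and the actual PWM call and the sleep between steps are left out:

```python
from itertools import islice

def ramp_cycle(values, steps_per_segment):
    # Yield interpolated values forever, wrapping last -> first.
    n = len(values)
    while True:
        for i in range(n):
            a = values[i]
            b = values[(i + 1) % n]
            for step in range(steps_per_segment):
                yield a + (b - a) * step / steps_per_segment

# Take the first five values of a two-point ramp with two steps per segment.
first_values = list(islice(ramp_cycle([0.0, 10.0], 2), 5))
print(first_values)  # [0.0, 5.0, 10.0, 5.0, 0.0]
```

Because it is a generator, the caller decides how long to wait between values and when to stop.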

Python recursion timings in return statement

I am currently trying to time recursions of factorials and I cannot find a way around printing every factorial in each recursion step. Now I have tried printing it just in the return statement which would solve my problem, but that just ended up in a mess of wall of text with timings being fragmented.
EDIT: I should mention that I am trying to get the cumulative timings of the whole process and not fragmented results like I have below with the print statement.
I tried something like:
return (str(n) + '! = ' + (str(FactResult)) +
' - Runtime = %.9f seconds' % (end-start))
But here is what I have below as of now.
import time
def factorial(n):
    """Factorial function that uses recursion and returns factorial of
    number given."""
    start = time.clock()
    if n < 1:
        return 1
    else:
        FactResult = n * factorial(n - 1)
        end = time.clock()
        print(str(n) + '! - Runtime = %.9f seconds' % (end-start))
        return FactResult

It seems to work fine after fixing the indentation and minor (cosmetic) changes:
import time
def factorial(n):
    """Factorial function that uses recursion and returns factorial of number given."""
    start = time.clock()
    if n < 1:
        return 1
    else:
        FactResult = n * factorial(n - 1)
        end = time.clock()
        print(str(n) + '! =', FactResult, '- Runtime = %.9f seconds' % (end-start))
        return FactResult
factorial(10)
It prints for me... without printing the result value:
c:\tmp\___python\BobDunakey\so12828669>py a.py
1! - Runtime = 0.000001440 seconds
2! - Runtime = 0.000288474 seconds
3! - Runtime = 0.000484790 seconds
4! - Runtime = 0.000690225 seconds
5! - Runtime = 0.000895181 seconds
6! - Runtime = 0.001097736 seconds
7! - Runtime = 0.001294052 seconds
8! - Runtime = 0.001487008 seconds
9! - Runtime = 0.001683804 seconds
10! - Runtime = 0.001884920 seconds
... and with printing the value:
c:\tmp\___python\BobDunakey\so12828669>py a.py
1! = 1 - Runtime = 0.000001440 seconds
2! = 2 - Runtime = 0.001313252 seconds
3! = 6 - Runtime = 0.002450827 seconds
4! = 24 - Runtime = 0.003409847 seconds
5! = 120 - Runtime = 0.004300708 seconds
6! = 720 - Runtime = 0.005694598 seconds
7! = 5040 - Runtime = 0.006678577 seconds
8! = 40320 - Runtime = 0.007579038 seconds
9! = 362880 - Runtime = 0.008463659 seconds
10! = 3628800 - Runtime = 0.009994826 seconds
EDIT
For the cumulative timing, you have to measure outside the call. Otherwise you are not able to capture the start time. It is also more natural:
import time
def factorial(n):
    """Factorial function that uses recursion and returns factorial of number given."""
    if n < 1:
        return 1
    else:
        return n * factorial(n - 1)
n = 10
start = time.clock()
result = factorial(n)
end = time.clock()
print(str(n) + '! =', result, '- Runtime = %.9f seconds' % (end-start))
It prints:
c:\tmp\___python\BobDunakey\so12828669>py a.py
10! = 3628800 - Runtime = 0.000007200 seconds

Move the "end = time.clock()" and the print statement right before the "return 1" in the block that catches n<1. This is the last execution at the biggest depth of the recursion stack, so all you will miss is the backing up out of it. To get the most proper result, you should follow the suggestion of NullUserException and time outside the recursion method.
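The "time outside the recursion" suggestion can be sketched with a small wrapper. Since time.clock() was removed in Python 3.8, this version uses time.perf_counter(); timed_call is a hypothetical helper name, not something from the answers above.

```python
import time

def factorial(n):
    """Recursive factorial with no timing code inside."""
    if n < 1:
        return 1
    return n * factorial(n - 1)

def timed_call(func, *args):
    # Start and stop the clock around the outermost call only,
    # so the result is the cumulative time of the whole recursion.
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

result, elapsed = timed_call(factorial, 10)
print(f"10! = {result} - Runtime = {elapsed:.9f} seconds")
```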

leading number groups between two numbers

(Python) Given two numbers A and B. I need to find all nested "groups" of numbers:
range(2169800, 2171194)
leading numbers: 21698XX, 21699XX, 2170XX, 21710XX, 217110X, 217111X,
217112X, 217113X, 217114X, 217115X, 217116X, 217117X, 217118X, 2171190X,
2171191X, 2171192X, 2171193X, 2171194X
or like this:
range(1000, 1452)
leading numbers: 10XX, 11XX, 12XX, 13XX, 140X, 141X, 142X, 143X,
144X, 1450, 1451, 1452

Harder than it first looked - pretty sure this is solid and will handle most boundary conditions. :) (There are few!!)
from functools import reduce
def leading(a, b):
    # generate digit pairs a=123, b=456 -> [(1, 4), (2, 5), (3, 6)]
    zip_digits = [(int(x), int(y)) for x, y in zip(str(a), str(b))]
    # this ignores problems where the last matching digits are 0 and 9
    # leading (12000, 12999) is same as leading(12, 12)
    while(zip_digits[-1] == (0,9)):
        zip_digits.pop()
    # start recursion
    return compute_leading(zip_digits)
def compute_leading(zip_digits):
    if(len(zip_digits) == 1): # 1 digit case is simple!! :)
        (a,b) = zip_digits.pop()
        return list(range(a, b+1))
    #now we partition the problem
    # given leading(123,456) we decompose this into 3 problems
    # lows -> leading(123,129)
    # middle -> leading(130,449) which we can recurse to leading(13,44)
    # highs -> leading(450,456)
    last_digits = zip_digits.pop()
    low_prefix = reduce(lambda x, y : 10 * x + y, [tup[0] for tup in zip_digits]) * 10 # base for lows e.g. 120
    high_prefix = reduce(lambda x, y : 10 * x + y, [tup[1] for tup in zip_digits]) * 10 # base for highs e.g. 450
    lows = list(range(low_prefix + last_digits[0], low_prefix + 10))
    highs = list(range(high_prefix + 0, high_prefix + last_digits[1] + 1))
    #check for boundary cases where lows or highs have all ten digits
    (a,b) = zip_digits.pop() # pop last digits of middle so they can be adjusted
    if len(lows) == 10:
        lows = []
    else:
        a = a + 1
    if len(highs) == 10:
        highs = []
    else:
        b = b - 1
    zip_digits.append((a,b)) # push back last digits of middle after adjustments
    return lows + compute_leading(zip_digits) + highs # and recurse - woohoo!!
print(leading(199,411))
print(leading(2169800, 2171194))
print(leading(1000, 1452))

def foo(start, end):
    index = 0
    is_lower = False
    while index < len(start):
        if is_lower and start[index] == '0':
            break
        if not is_lower and start[index] < end[index]:
            first_lower = index
            is_lower = True
        index += 1
    return index-1, first_lower
start = '2169800'
end = '2171194'
result = []
while int(start) < int(end):
    index, first_lower = foo(start, end)
    range_end = index > first_lower and 10 or int(end[first_lower])
    for x in range(int(start[index]), range_end):
        result.append(start[:index] + str(x) + 'X'*(len(start)-index-1))
    if range_end == 10:
        start = str(int(start[:index])+1)+'0'+start[index+1:]
    else:
        start = start[:index] + str(range_end) + start[index+1:]
result.append(end)
print("Leading numbers:")
print(result)
I tested the examples you've given and the output is right. Hope this helps.

This should give you a good starting point :
def leading(start, end):
    leading = []
    hundreds = start // 100
    while (end - hundreds * 100) > 100:
        i = hundreds * 100
        leading.append(range(i,i+100))
        hundreds += 1
    c = hundreds * 100
    tens = 1
    while (end - c - tens * 10) > 10:
        i = c + tens * 10
        leading.append(range(i, i + 10))
        tens += 1
    c += tens * 10
    ones = 1
    while (end - c - ones) > 0:
        i = c + ones
        leading.append(i)
        ones += 1
    leading.append(end)
    return leading
Ok, the whole could be one loop-level deeper. But I thought it might be clearer this way. Hope, this helps you...
Update :
Now I see what you want. Furthermore, maria's code doesn't seem to be working for me. (Sorry...)
So please consider the following code :
def leading(start, end):
    depth = 2
    while 10 ** depth > end : depth -=1
    leading = []
    const = 0
    coeff = start // 10 ** depth
    while depth >= 0:
        while (end - const - coeff * 10 ** depth) >= 10 ** depth:
            # integer division keeps the prefix an int on Python 3
            leading.append(str(const // 10 ** depth + coeff) + "X" * depth)
            coeff += 1
        const += coeff * 10 ** depth
        coeff = 0
        depth -= 1
    leading.append(end)
    return leading
print(leading(199,411))
print(leading(2169800, 2171194))
print(leading(1000, 1453))
print(leading(1,12))
Now, let me try to explain the approach here.
The algorithm will try to find "end" starting from value "start" and check whether "end" is in the next 10^2 (which is 100 in this case). If it fails, it will make a leap of 10^2 until it succeeds. When it succeeds it will go one depth level lower. That is, it will make leaps one order of magnitude smaller. And loop that way until the depth is equal to zero (= leaps of 10^0 = 1). The algorithm stops when it reaches the "end" value.
You may also notice that I have implemented the wrapping loop I mentioned, so it is now possible to define the starting depth (or leap size) in a variable.
The first while loop makes sure the first leap does not overshoot the "end" value.
If you have any questions, just feel free to ask.
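The "leap, then drop one order of magnitude" idea can also be written as a greedy loop: at each position, take the largest power-of-ten block that starts exactly there and still fits inside the range. This is a separate sketch, not the code from any of the answers above; leading_groups is a hypothetical name, and it assumes a <= b with both ends included.

```python
def leading_groups(a, b):
    groups = []
    cur = a
    while cur <= b:
        k = 0
        # Grow the block while it stays aligned to 10**(k+1)
        # and does not run past the upper bound.
        while cur % 10 ** (k + 1) == 0 and cur + 10 ** (k + 1) - 1 <= b:
            k += 1
        # The block covers cur .. cur + 10**k - 1; write it as prefix + X's.
        groups.append(str(cur // 10 ** k) + "X" * k)
        cur += 10 ** k
    return groups

print(leading_groups(1000, 1452))
# ['10XX', '11XX', '12XX', '13XX', '140X', '141X', '142X', '143X',
#  '144X', '1450', '1451', '1452']
```

For the second example in the question this reproduces the expected list exactly.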
