Profiling of a Python function

Do you have any idea how I can make this function more time-efficient?
def c(n):
    word = 32
    #l = []
    c = 0
    for i in range(0, 2**word):
        #print(str(bin(i)))#.count('1')
        if str(bin(i)).count('1') == n:
            c = c + 1
            print(c)
        if i == 2**28:
            print('6 %')
        if i == 2**29:
            print('12 %')
        if i == 2**30:
            print('25 %')
        if i == 2**31:
            print('50 %')
        if i == 2**32:
            print('100 %')
    return c
135274023 function calls in 742.161 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 391.662 391.662 742.161 742.161 <pyshell#3>:1(c)
1 0.000 0.000 742.161 742.161 <string>:1(<module>)
4816 0.014 0.000 0.014 0.000 rpc.py:149(debug)
688 0.010 0.000 3.162 0.005 rpc.py:208(remotecall)
688 0.017 0.000 0.107 0.000 rpc.py:218(asynccall)
688 0.019 0.000 3.043 0.004 rpc.py:238(asyncreturn)
688 0.002 0.000 0.002 0.000 rpc.py:244(decoderesponse)
688 0.007 0.000 3.018 0.004 rpc.py:279(getresponse)
688 0.006 0.000 0.010 0.000 rpc.py:287(_proxify)
688 0.025 0.000 3.000 0.004 rpc.py:295(_getresponse)
688 0.002 0.000 0.002 0.000 rpc.py:317(newseq)
688 0.023 0.000 0.062 0.000 rpc.py:321(putmessage)
688 0.007 0.000 0.011 0.000 rpc.py:546(__getattr__)
688 0.002 0.000 0.002 0.000 rpc.py:587(__init__)
688 0.004 0.000 3.166 0.005 rpc.py:592(__call__)
1376 0.008 0.000 0.011 0.000 threading.py:1012(current_thread)
688 0.004 0.000 0.019 0.000 threading.py:172(Condition)
688 0.009 0.000 0.015 0.000 threading.py:177(__init__)
688 0.019 0.000 2.962 0.004 threading.py:226(wait)
688 0.002 0.000 0.002 0.000 threading.py:45(__init__)
688 0.002 0.000 0.002 0.000 threading.py:50(_note)
688 0.004 0.000 0.004 0.000 threading.py:88(RLock)
688 0.004 0.000 0.004 0.000 {built-in method allocate_lock}
67620326 162.442 0.000 162.442 0.000 {built-in method bin}
688 0.007 0.000 0.007 0.000 {built-in method dumps}
1 0.000 0.000 742.161 742.161 {built-in method exec}
1376 0.003 0.000 0.003 0.000 {built-in method get_ident}
1376 0.004 0.000 0.004 0.000 {built-in method isinstance}
2064 0.005 0.000 0.005 0.000 {built-in method len}
688 0.002 0.000 0.002 0.000 {built-in method pack}
344 0.009 0.000 3.187 0.009 {built-in method print}
688 0.008 0.000 0.008 0.000 {built-in method select}
688 0.003 0.000 0.003 0.000 {method '_acquire_restore' of '_thread.RLock' objects}
688 0.002 0.000 0.002 0.000 {method '_is_owned' of '_thread.RLock' objects}
688 0.002 0.000 0.002 0.000 {method '_release_save' of '_thread.RLock' objects}
688 0.003 0.000 0.003 0.000 {method 'acquire' of '_thread.RLock' objects}
1376 2.929 0.002 2.929 0.002 {method 'acquire' of '_thread.lock' objects}
688 0.002 0.000 0.002 0.000 {method 'append' of 'list' objects}
67620325 184.869 0.000 184.869 0.000 {method 'count' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
688 0.002 0.000 0.002 0.000 {method 'get' of 'dict' objects}
688 0.002 0.000 0.002 0.000 {method 'release' of '_thread.RLock' objects}
688 0.015 0.000 0.015 0.000 {method 'send' of '_socket.socket' objects}
What I am trying to achieve is to count how many numbers from 0 to 2**32 have exactly n ones in their binary representation.

You are counting how many 32-bit numbers have a given number of 1s. This is the binomial coefficient "32 choose n", which can be calculated directly:
from math import factorial
print(factorial(32) // (factorial(n) * factorial(32 - n)))
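On Python 3.8+, math.comb gives the same value directly. As a quick sanity check (a sketch added here, not part of the original answer), the closed form can be compared against a brute-force count for a smaller word size:
from math import comb

def brute_force(word, n):
    # Count numbers in [0, 2**word) whose binary representation has exactly n ones.
    return sum(1 for i in range(2 ** word) if bin(i).count('1') == n)

assert brute_force(12, 5) == comb(12, 5)  # 792 either way
print(comb(32, 7))  # how many numbers below 2**32 have exactly seven 1-bits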

Related

Speed up order_by and pagination in Django

Currently I have this result:
That's not bad (I guess), but I'm wondering if I can speed things up a little bit.
I've looked at the penultimate query and don't really know how to speed it up; I guess I should get rid of the join, but I don't know how:
I'm already using prefetch_related in my viewset, my viewset is:
class GameViewSet(viewsets.ModelViewSet):
    queryset = Game.objects.prefetch_related(
        "timestamp",
        "fighters",
        "score",
        "coefs",
        "rounds",
        "rounds_view",
        "rounds_view_f",
        "finishes",
        "rounds_time",
        "round_time",
        "time_coef",
        "totals",
    ).all()
    serializer_class = GameSerializer
    permission_classes = [AllowAny]
    pagination_class = StandardResultsSetPagination

    @silk_profile(name="Get Games")
    def list(self, request):
        qs = self.get_queryset().order_by("-timestamp__ts")
        page = self.paginate_queryset(qs)
        if page is not None:
            serializer = GameSerializer(page, many=True)
            return self.get_paginated_response(serializer.data)
        serializer = self.get_serializer(qs, many=True)
        return Response(serializer.data)
Join is happening because I'm ordering on a related field?
My models look like:
class Game(models.Model):
    id = models.AutoField(primary_key=True)
    ...

class Timestamp(models.Model):
    id = models.AutoField(primary_key=True)
    game = models.ForeignKey(Game, related_name="timestamp", on_delete=models.CASCADE)
    ts = models.DateTimeField(db_index=True)
    time_of_day = models.TimeField()
And my serializers:
class TimestampSerializer(serializers.Serializer):
    ts = serializers.DateTimeField(read_only=True)
    time_of_day = serializers.TimeField(read_only=True)

class GameSerializer(serializers.Serializer):
    id = serializers.IntegerField(read_only=True)
    timestamp = TimestampSerializer(many=True)
    fighters = FighterSerializer(many=True)
    score = ScoreSerializer(many=True)
    coefs = CoefsSerializer(many=True)
    rounds = RoundsSerializer(many=True)
    rounds_view = RoundsViewSerializer(many=True)
    rounds_view_f = RoundsViewFinishSerializer(many=True)
    finishes = FinishesSerializer(many=True)
    rounds_time = RoundTimesSerializer(many=True)
    round_time = RoundTimeSerializer(many=True)
    time_coef = TimeCoefsSerializer(many=True)
    totals = TotalsSerializer(many=True)
Also, the results of profiling:
166039 function calls (159016 primitive calls) in 3.226 seconds
Ordered by: internal time
List reduced from 677 to 100 due to restriction <100>
ncalls tottime percall cumtime percall filename:lineno(function)
20959/20958 0.206 0.000 0.283 0.000 {built-in method builtins.isinstance}
2700 0.123 0.000 0.359 0.000 fields.py:62(is_simple_callable)
390/30 0.113 0.000 1.211 0.040 serializers.py:493(to_representation)
8943/8473 0.098 0.000 0.307 0.000 {built-in method builtins.getattr}
2700 0.096 0.000 0.650 0.000 fields.py:85(get_attribute)
14 0.068 0.005 0.130 0.009 traceback.py:388(format)
7653 0.065 0.000 0.065 0.000 {method 'append' of 'list' objects}
28 0.064 0.002 0.072 0.003 {method 'execute' of 'psycopg2.extensions.cursor' objects}
390 0.062 0.000 0.153 0.000 base.py:406(__init__)
3090 0.060 0.000 0.257 0.000 serializers.py:359(_readable_fields)
3090 0.059 0.000 0.093 0.000 _collections_abc.py:760(__iter__)
6390 0.055 0.000 0.055 0.000 {built-in method builtins.hasattr}
1440 0.054 0.000 0.112 0.000 related.py:652(get_instance_value_for_fields)
2749 0.052 0.000 0.078 0.000 abc.py:96(__instancecheck__)
388 0.052 0.000 0.107 0.000 query.py:303(clone)
2700 0.049 0.000 0.072 0.000 inspect.py:158(isfunction)
2700 0.048 0.000 0.699 0.000 fields.py:451(get_attribute)
2702 0.048 0.000 0.071 0.000 inspect.py:80(ismethod)
2701 0.047 0.000 0.070 0.000 inspect.py:285(isbuiltin)
14 0.047 0.003 0.189 0.014 traceback.py:321(extract)
3786/3426 0.043 0.000 0.123 0.000 {built-in method builtins.setattr}
4445/247 0.038 0.000 1.936 0.008 {built-in method builtins.len}
360 0.035 0.000 0.374 0.001 related_descriptors.py:575(_apply_rel_filters)
2247 0.034 0.000 0.088 0.000 traceback.py:285(line)
3203 0.029 0.000 0.029 0.000 {method 'copy' of 'dict' objects}
12 0.028 0.002 1.836 0.153 query.py:1831(prefetch_one_level)
2749 0.026 0.000 0.026 0.000 {built-in method _abc._abc_instancecheck}
2700 0.024 0.000 0.024 0.000 serializer_helpers.py:154(__getitem__)
2780 0.024 0.000 0.024 0.000 {method 'get' of 'dict' objects}
360 0.023 0.000 0.069 0.000 related_lookups.py:26(get_normalized_value)
720 0.022 0.000 0.458 0.001 related_descriptors.py:615(get_queryset)
744 0.022 0.000 0.087 0.000 related_descriptors.py:523(__get__)
360 0.022 0.000 0.057 0.000 related_descriptors.py:203(__set__)
470 0.022 0.000 0.081 0.000 local.py:46(_get_context_id)
749 0.020 0.000 0.048 0.000 linecache.py:15(getline)
720 0.019 0.000 0.031 0.000 lookups.py:252(resolve_expression_parameter)
361/1 0.018 0.000 1.211 1.211 serializers.py:655(to_representation)
296/14 0.018 0.000 0.084 0.006 copy.py:128(deepcopy)
470 0.018 0.000 0.106 0.000 local.py:82(_get_storage)
732 0.017 0.000 0.043 0.000 related_descriptors.py:560(__init__)
720 0.017 0.000 0.040 0.000 related_descriptors.py:76(__set__)
1185/1151 0.017 0.000 0.032 0.000 {method 'join' of 'str' objects}
1501 0.017 0.000 0.017 0.000 {method 'format' of 'str' objects}
732 0.016 0.000 0.026 0.000 manager.py:26(__init__)
372 0.016 0.000 0.414 0.001 query.py:951(_filter_or_exclude)
14 0.016 0.001 0.028 0.002 traceback.py:369(from_list)
759 0.015 0.000 0.040 0.000 query.py:178(__init__)
387 0.015 0.000 0.143 0.000 query.py:1308(_clone)
732 0.015 0.000 0.022 0.000 manager.py:20(__new__)
1710 0.015 0.000 0.015 0.000 {built-in method __new__ of type object at 0x7fa87d9ad940}
720 0.015 0.000 0.034 0.000 __init__.py:1818(get_prep_value)
470 0.014 0.000 0.033 0.000 sync.py:469(get_current_task)
390 0.014 0.000 0.174 0.000 base.py:507(from_db)
1365 0.014 0.000 0.014 0.000 {method 'update' of 'dict' objects}
749 0.013 0.000 0.022 0.000 linecache.py:37(getlines)
744 0.013 0.000 0.044 0.000 lookups.py:266()
12 0.013 0.001 0.054 0.004 lookups.py:230(get_prep_lookup)
749 0.013 0.000 0.020 0.000 linecache.py:147(lazycache)
720 0.013 0.000 0.066 0.000 related.py:646(get_local_related_value)
720 0.013 0.000 0.071 0.000 related.py:649(get_foreign_related_value)
720 0.013 0.000 0.019 0.000 __init__.py:824(get_prep_value)
1506 0.013 0.000 0.013 0.000 {method 'strip' of 'str' objects}
372/12 0.013 0.000 0.026 0.002 query.py:1088(resolve_lookup_value)
638 0.013 0.000 0.018 0.000 threading.py:1306(current_thread)
732 0.013 0.000 0.021 0.000 reverse_related.py:200(get_cache_name)
720 0.013 0.000 0.018 0.000 base.py:573(_get_pk_val)
360 0.012 0.000 0.021 0.000 __init__.py:543(__hash__)
470 0.011 0.000 0.117 0.000 local.py:101(__getattr__)
28 0.011 0.000 0.089 0.003 compiler.py:199(get_select)
385 0.011 0.000 0.857 0.002 query.py:265(__iter__)
320 0.011 0.000 0.019 0.000 __init__.py:515(__eq__)
387 0.011 0.000 0.120 0.000 query.py:354(chain)
780 0.011 0.000 0.021 0.000 dispatcher.py:159(send)
360 0.010 0.000 0.031 0.000 related.py:976(get_prep_value)
387 0.010 0.000 0.157 0.000 query.py:1296(_chain)
372 0.010 0.000 0.014 0.000 query.py:151(__init__)
403 0.010 0.000 0.914 0.002 query.py:45(__iter__)
1204 0.010 0.000 0.010 0.000 {built-in method builtins.iter}
360 0.010 0.000 0.016 0.000 :1017(_handle_fromlist)
372 0.010 0.000 0.434 0.001 query.py:935(filter)
360 0.010 0.000 0.023 0.000 query.py:1124(check_query_object_type)
360 0.009 0.000 0.017 0.000 mixins.py:21(is_cached)
1152 0.009 0.000 0.009 0.000 query.py:194(query)
104 0.009 0.000 0.017 0.000 fields.py:323(__init__)
732 0.009 0.000 0.009 0.000 manager.py:120(_set_creation_counter)
470 0.009 0.000 0.013 0.000 :389(parent)
470 0.009 0.000 0.014 0.000 tasks.py:34(current_task)
459 0.009 0.000 0.013 0.000 deconstruct.py:14(__new__)
870 0.009 0.000 0.009 0.000 fields.py:810(to_representation)
322 0.009 0.000 0.019 0.000 linecache.py:53(checkcache)
318/266 0.009 0.000 0.145 0.001 compiler.py:434(compile)
763 0.008 0.000 0.008 0.000 traceback.py:292(walk_stack)
494 0.008 0.000 0.014 0.000 compiler.py:417(quote_name_unless_alias)
396 0.008 0.000 0.019 0.000 related.py:710(get_path_info)
374 0.008 0.000 0.011 0.000 utils.py:237(_route_db)
796 0.008 0.000 0.008 0.000 tree.py:21(__init__)
322 0.008 0.000 0.008 0.000 {built-in method posix.stat}
413/365 0.008 0.000 1.938 0.005 query.py:1322(_fetch_all)
12 0.008 0.001 1.244 0.104 related_descriptors.py:622(get_prefetch_queryset)
732 0.008 0.000 0.008 0.000 reverse_related.py:180(get_accessor_name)
And visual representation:
So, my question is: how can I speed it up?
On the database side, you could set up a materialized view for your query and trigger a refresh any time a new timestamp is added (or whatever happens in your application that requires a refresh). That way the results are pre-calculated for your lookup. However, I suppose there could be edge cases where an update and a lookup happen at the same time and you end up with a result from before the refresh? It would be a trade-off, and only you know whether that is worth it...
In any case, I did not come up with this myself; check out e.g. https://hashrocket.com/blog/posts/materialized-view-strategies-using-postgresql
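As a rough illustration of that idea (assuming PostgreSQL and Django's default table naming; the view name, app label, and columns below are hypothetical), the materialized view could be created in a migration and refreshed from whatever code path inserts new timestamps:
from django.db import connection, migrations


class Migration(migrations.Migration):
    dependencies = [("games", "0001_initial")]  # hypothetical app label / parent migration

    operations = [
        migrations.RunSQL(
            sql="""
                CREATE MATERIALIZED VIEW games_ordered AS
                SELECT g.id AS game_id, t.ts
                FROM games_game g
                JOIN games_timestamp t ON t.game_id = g.id;
                CREATE INDEX games_ordered_ts_idx ON games_ordered (ts DESC);
            """,
            reverse_sql="DROP MATERIALIZED VIEW IF EXISTS games_ordered;",
        ),
    ]


def refresh_games_ordered():
    # Call this wherever new Timestamp rows are written.
    with connection.cursor() as cursor:
        cursor.execute("REFRESH MATERIALIZED VIEW games_ordered;")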

Leetcode 1155. dictionary based memoization vs LRU Cache

I was solving LeetCode 1155, which is about the number of dice rolls with a target sum, using dictionary-based memoization. Here's the exact code:
class Solution:
    def numRollsToTarget(self, dices: int, faces: int, target: int) -> int:
        dp = {}
        def ways(t, rd):
            if t == 0 and rd == 0: return 1
            if t <= 0 or rd <= 0: return 0
            if dp.get((t,rd)): return dp[(t,rd)]
            dp[(t,rd)] = sum(ways(t-i, rd-1) for i in range(1,faces+1))
            return dp[(t,rd)]
        return ways(target, dices)
But this solution invariably times out once faces and dice get to around 15 × 15.
Then I found this solution, which uses functools.lru_cache while the rest is exactly the same, and it runs very fast.
class Solution:
    def numRollsToTarget(self, dices: int, faces: int, target: int) -> int:
        from functools import lru_cache
        @lru_cache(None)
        def ways(t, rd):
            if t == 0 and rd == 0: return 1
            if t <= 0 or rd <= 0: return 0
            return sum(ways(t-i, rd-1) for i in range(1,faces+1))
        return ways(target, dices)
I have compared the two before, and in most cases lru_cache does not outperform a dictionary-based cache by such a margin.
Can someone explain the reason why there is such a drastic performance difference between the two approaches?
First, I ran your original code under cProfile; this is the report for print(numRollsToTarget2(4, 6, 20)) (the OP version).
You can spot right away that there are heavy call counts in ways, its genexpr, and sum. Those are the spots that need a closer look to improve/reduce. The next listing is for a similar memoized version, but with far fewer calls, and that version passed without timing out.
35
2864 function calls (366 primitive calls) in 0.018 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.018 0.018 <string>:1(<module>)
1 0.000 0.000 0.001 0.001 dice_rolls.py:23(numRollsToTarget2)
1075/1 0.001 0.000 0.001 0.001 dice_rolls.py:25(ways)
1253/7 0.001 0.000 0.001 0.000 dice_rolls.py:30(<genexpr>)
1 0.000 0.000 0.018 0.018 dice_rolls.py:36(main)
21 0.000 0.000 0.000 0.000 rpc.py:153(debug)
3 0.000 0.000 0.017 0.006 rpc.py:216(remotecall)
3 0.000 0.000 0.000 0.000 rpc.py:226(asynccall)
3 0.000 0.000 0.016 0.005 rpc.py:246(asyncreturn)
3 0.000 0.000 0.000 0.000 rpc.py:252(decoderesponse)
3 0.000 0.000 0.016 0.005 rpc.py:290(getresponse)
3 0.000 0.000 0.000 0.000 rpc.py:298(_proxify)
3 0.000 0.000 0.016 0.005 rpc.py:306(_getresponse)
3 0.000 0.000 0.000 0.000 rpc.py:328(newseq)
3 0.000 0.000 0.000 0.000 rpc.py:332(putmessage)
2 0.000 0.000 0.001 0.000 rpc.py:559(__getattr__)
3 0.000 0.000 0.000 0.000 rpc.py:57(dumps)
1 0.000 0.000 0.001 0.001 rpc.py:577(__getmethods)
2 0.000 0.000 0.000 0.000 rpc.py:601(__init__)
2 0.000 0.000 0.016 0.008 rpc.py:606(__call__)
4 0.000 0.000 0.000 0.000 run.py:412(encoding)
4 0.000 0.000 0.000 0.000 run.py:416(errors)
2 0.000 0.000 0.017 0.008 run.py:433(write)
6 0.000 0.000 0.000 0.000 threading.py:1306(current_thread)
3 0.000 0.000 0.000 0.000 threading.py:222(__init__)
3 0.000 0.000 0.016 0.005 threading.py:270(wait)
3 0.000 0.000 0.000 0.000 threading.py:81(RLock)
3 0.000 0.000 0.000 0.000 {built-in method _struct.pack}
3 0.000 0.000 0.000 0.000 {built-in method _thread.allocate_lock}
6 0.000 0.000 0.000 0.000 {built-in method _thread.get_ident}
1 0.000 0.000 0.018 0.018 {built-in method builtins.exec}
6 0.000 0.000 0.000 0.000 {built-in method builtins.isinstance}
9 0.000 0.000 0.000 0.000 {built-in method builtins.len}
1 0.000 0.000 0.017 0.017 {built-in method builtins.print}
179/1 0.000 0.000 0.001 0.001 {built-in method builtins.sum}
3 0.000 0.000 0.000 0.000 {built-in method select.select}
3 0.000 0.000 0.000 0.000 {method '_acquire_restore' of '_thread.RLock' objects}
3 0.000 0.000 0.000 0.000 {method '_is_owned' of '_thread.RLock' objects}
3 0.000 0.000 0.000 0.000 {method '_release_save' of '_thread.RLock' objects}
3 0.000 0.000 0.000 0.000 {method 'acquire' of '_thread.RLock' objects}
6 0.016 0.003 0.016 0.003 {method 'acquire' of '_thread.lock' objects}
3 0.000 0.000 0.000 0.000 {method 'append' of 'collections.deque' objects}
2 0.000 0.000 0.000 0.000 {method 'decode' of 'bytes' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
3 0.000 0.000 0.000 0.000 {method 'dump' of '_pickle.Pickler' objects}
2 0.000 0.000 0.000 0.000 {method 'encode' of 'str' objects}
201 0.000 0.000 0.000 0.000 {method 'get' of 'dict' objects}
3 0.000 0.000 0.000 0.000 {method 'getvalue' of '_io.BytesIO' objects}
3 0.000 0.000 0.000 0.000 {method 'release' of '_thread.RLock' objects}
3 0.000 0.000 0.000 0.000 {method 'send' of '_socket.socket' objects}
Then I ran the modified/simplified version and compared the results.
35
387 function calls (193 primitive calls) in 0.006 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.006 0.006 <string>:1(<module>)
1 0.000 0.000 0.006 0.006 dice_rolls.py:36(main)
1 0.000 0.000 0.000 0.000 dice_rolls.py:5(numRollsToTarget)
195/1 0.000 0.000 0.000 0.000 dice_rolls.py:8(dp)
21 0.000 0.000 0.000 0.000 rpc.py:153(debug)
3 0.000 0.000 0.006 0.002 rpc.py:216(remotecall)
3 0.000 0.000 0.000 0.000 rpc.py:226(asynccall)
3 0.000 0.000 0.006 0.002 rpc.py:246(asyncreturn)
3 0.000 0.000 0.000 0.000 rpc.py:252(decoderesponse)
3 0.000 0.000 0.006 0.002 rpc.py:290(getresponse)
3 0.000 0.000 0.000 0.000 rpc.py:298(_proxify)
3 0.000 0.000 0.006 0.002 rpc.py:306(_getresponse)
3 0.000 0.000 0.000 0.000 rpc.py:328(newseq)
3 0.000 0.000 0.000 0.000 rpc.py:332(putmessage)
2 0.000 0.000 0.001 0.000 rpc.py:559(__getattr__)
3 0.000 0.000 0.000 0.000 rpc.py:57(dumps)
1 0.000 0.000 0.001 0.001 rpc.py:577(__getmethods)
2 0.000 0.000 0.000 0.000 rpc.py:601(__init__)
2 0.000 0.000 0.005 0.003 rpc.py:606(__call__)
4 0.000 0.000 0.000 0.000 run.py:412(encoding)
4 0.000 0.000 0.000 0.000 run.py:416(errors)
2 0.000 0.000 0.006 0.003 run.py:433(write)
6 0.000 0.000 0.000 0.000 threading.py:1306(current_thread)
3 0.000 0.000 0.000 0.000 threading.py:222(__init__)
3 0.000 0.000 0.006 0.002 threading.py:270(wait)
3 0.000 0.000 0.000 0.000 threading.py:81(RLock)
3 0.000 0.000 0.000 0.000 {built-in method _struct.pack}
3 0.000 0.000 0.000 0.000 {built-in method _thread.allocate_lock}
6 0.000 0.000 0.000 0.000 {built-in method _thread.get_ident}
1 0.000 0.000 0.006 0.006 {built-in method builtins.exec}
6 0.000 0.000 0.000 0.000 {built-in method builtins.isinstance}
9 0.000 0.000 0.000 0.000 {built-in method builtins.len}
34 0.000 0.000 0.000 0.000 {built-in method builtins.max}
1 0.000 0.000 0.006 0.006 {built-in method builtins.print}
3 0.000 0.000 0.000 0.000 {built-in method select.select}
3 0.000 0.000 0.000 0.000 {method '_acquire_restore' of '_thread.RLock' objects}
3 0.000 0.000 0.000 0.000 {method '_is_owned' of '_thread.RLock' objects}
3 0.000 0.000 0.000 0.000 {method '_release_save' of '_thread.RLock' objects}
3 0.000 0.000 0.000 0.000 {method 'acquire' of '_thread.RLock' objects}
6 0.006 0.001 0.006 0.001 {method 'acquire' of '_thread.lock' objects}
3 0.000 0.000 0.000 0.000 {method 'append' of 'collections.deque' objects}
2 0.000 0.000 0.000 0.000 {method 'decode' of 'bytes' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
3 0.000 0.000 0.000 0.000 {method 'dump' of '_pickle.Pickler' objects}
2 0.000 0.000 0.000 0.000 {method 'encode' of 'str' objects}
2 0.000 0.000 0.000 0.000 {method 'get' of 'dict' objects}
3 0.000 0.000 0.000 0.000 {method 'getvalue' of '_io.BytesIO' objects}
3 0.000 0.000 0.000 0.000 {method 'release' of '_thread.RLock' objects}
3 0.000 0.000 0.000 0.000 {method 'send' of '_socket.socket' objects}
The profiling code is here:
import cProfile
from typing import List

def numRollsToTarget(d, f, target):
    memo = {}
    def dp(d, target):
        if d == 0:
            return 0 if target > 0 else 1
        if (d, target) in memo:
            return memo[(d, target)]
        result = 0
        for k in range(max(0, target - f), target):
            result += dp(d - 1, k)
        memo[(d, target)] = result
        return result
    return dp(d, target) % (10**9 + 7)

def numRollsToTarget2(dices: int, faces: int, target: int) -> int:
    dp = {}
    def ways(t, rd):
        if t == 0 and rd == 0: return 1
        if t <= 0 or rd <= 0: return 0
        if dp.get((t,rd)): return dp[(t,rd)]
        dp[(t,rd)] = sum(ways(t-i, rd-1) for i in range(1,faces+1))
        return dp[(t,rd)]
    return ways(target, dices)

def numRollsToTarget3(dices: int, faces: int, target: int) -> int:
    from functools import lru_cache
    @lru_cache(None)
    def ways(t, rd):
        if t == 0 and rd == 0: return 1
        if t <= 0 or rd <= 0: return 0
        return sum(ways(t-i, rd-1) for i in range(1,faces+1))
    return ways(target, dices)

def main():
    print(numRollsToTarget(4, 6, 20))
    #print(numRollsToTarget2(4, 6, 20))
    #print(numRollsToTarget3(4, 6, 20))  # not faster than first

if __name__ == '__main__':
    cProfile.run('main()')
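One thing worth pointing out about the dictionary version (an observation on the code itself, separate from the profiles above): dp.get((t, rd)) returns a cached value of 0, which is falsy, so every state whose answer is 0 is treated as a cache miss and recomputed on each visit. Testing key membership instead lets the dictionary cache keep up with lru_cache; a minimal sketch:
def numRollsToTarget2_fixed(dices: int, faces: int, target: int) -> int:
    dp = {}
    def ways(t, rd):
        if t == 0 and rd == 0: return 1
        if t <= 0 or rd <= 0: return 0
        if (t, rd) in dp:  # membership test, so cached zeros also count as hits
            return dp[(t, rd)]
        dp[(t, rd)] = sum(ways(t - i, rd - 1) for i in range(1, faces + 1))
        return dp[(t, rd)]
    return ways(target, dices)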

Performance problems w/ many widgets respectively stacked QTableWidgets

I need to display a complex table and decided to use stacked QTableWidgets. With an increasing number of rows, the program needs a lot of time to create all the widgets and almost the same time again to display them.
The maintable looks like this: MainTable
The stacked TableWidget in the table:
StackedTables
If a cell contains data, there is at least one TableWidget in that cell of the MainTable, and in the worst case there are 2 more TableWidgets nested inside it. That means I could have 3 TableWidgets in one cell.
Time measurement with cProfile and time.time for 80 rows (with 48 of the complex cells per row):
complete update time: 15s (manually stopped)
time to create the table: 7.548534870147705s (time.time over complete function)
display time: 7.5s (complete update time - function time)
rows: 80
63600 function calls in 7.462 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
1896 2.455 0.001 3.882 0.002 DigitalePlanungstafel.py:6054(grundWidgetErstellen) -- (create table in cell)
3936 2.027 0.001 2.027 0.001 {built-in method setCellWidget}
2535 1.306 0.001 1.306 0.001 {built-in method setColumnCount}
630 0.770 0.001 1.183 0.002 DigitalePlanungstafel.py:6035(obenWidgetErstellen) -- (create table in table in cell)
2607 0.674 0.000 0.674 0.000 {built-in method setRowCount}
2528 0.059 0.000 0.059 0.000 {built-in method horizontalHeader}
2526 0.021 0.000 0.021 0.000 {built-in method verticalHeader}
163 0.019 0.000 0.019 0.000 {method 'execute' of 'sqlite3.Cursor' objects}
2526 0.016 0.000 0.016 0.000 {built-in method setFrameShape}
1410 0.014 0.000 0.014 0.000 {built-in method setStyleSheet}
4502 0.013 0.000 0.013 0.000 {built-in method setRowHeight}
2526 0.009 0.000 0.009 0.000 {built-in method setFixedSize}
2546 0.009 0.000 0.009 0.000 {built-in method setColumnWidth}
5052 0.009 0.000 0.009 0.000 {built-in method setVisible}
1329 0.007 0.000 0.007 0.000 {built-in method setItem}
2181 0.006 0.000 0.006 0.000 {built-in method cellWidget}
80 0.005 0.000 0.005 0.000 {built-in method addWidget}
2526 0.004 0.000 0.004 0.000 {built-in method setEditTriggers}
929 0.004 0.000 0.004 0.000 {built-in method setBackground}
1330 0.003 0.000 0.003 0.000 {method 'format' of 'str' objects}
414 0.003 0.000 0.003 0.000 {built-in method _pickle.loads}
336 0.003 0.000 0.003 0.000 {method 'strftime' of 'datetime.date' objects}
2526 0.003 0.000 0.003 0.000 {built-in method setHorizontalScrollBarPolicy}
1410 0.002 0.000 0.002 0.000 {built-in method setFixedHeight}
1377 0.002 0.000 0.002 0.000 {built-in method setTextAlignment}
83 0.002 0.000 0.004 0.000 _strptime.py:321(_strptime)
2 0.002 0.001 0.002 0.001 {built-in method setSortingEnabled}
2526 0.001 0.000 0.001 0.000 {built-in method setShowGrid}
1570 0.001 0.000 0.001 0.000 {built-in method rowHeight}
2526 0.001 0.000 0.001 0.000 {built-in method setSelectionMode}
163 0.001 0.000 0.001 0.000 {method 'fetchall' of 'sqlite3.Cursor' objects}
240 0.001 0.000 0.001 0.000 DigitalePlanungstafel.py:7494(__init__)
2526 0.001 0.000 0.001 0.000 {built-in method setVerticalScrollBarPolicy}
1 0.001 0.001 0.001 0.001 {built-in method sortByColumn}
80 0.001 0.000 0.001 0.000 {built-in method setLayout}
83 0.001 0.000 0.001 0.000 {built-in method _locale.setlocale}
1 0.001 0.001 0.001 0.001 {built-in method _sqlite3.connect}
89 0.000 0.000 0.000 0.000 {built-in method setForeground}
83 0.000 0.000 0.000 0.000 {method 'match' of '_sre.SRE_Pattern' objects}
83 0.000 0.000 0.004 0.000 _strptime.py:562(_strptime_datetime)
48 0.000 0.000 0.000 0.000 {built-in method setHorizontalHeaderItem}
80 0.000 0.000 0.000 0.000 {built-in method setContentsMargins}
83 0.000 0.000 0.005 0.000 {built-in method strptime}
83 0.000 0.000 0.000 0.000 locale.py:379(normalize)
88 0.000 0.000 0.000 0.000 {built-in method setFont}
160 0.000 0.000 0.000 0.000 {built-in method setData}
83 0.000 0.000 0.000 0.000 {method 'groupdict' of '_sre.SRE_Match' objects}
80 0.000 0.000 0.000 0.000 {built-in method setAlignment}
83 0.000 0.000 0.001 0.000 _strptime.py:29(_getlang)
83 0.000 0.000 0.001 0.000 locale.py:565(getlocale)
83 0.000 0.000 0.001 0.000 locale.py:462(_parse_localename)
80 0.000 0.000 0.000 0.000 {built-in method setUnderline}
1 0.000 0.000 0.000 0.000 {built-in method io.open}
249 0.000 0.000 0.000 0.000 {method 'get' of 'dict' objects}
1 0.000 0.000 0.000 0.000 {method 'close' of 'sqlite3.Connection' objects}
160 0.000 0.000 0.000 0.000 {built-in method item}
742 0.000 0.000 0.000 0.000 DigitalePlanungstafel.py:7499(__lt__)
48 0.000 0.000 0.000 0.000 {built-in method today}
165 0.000 0.000 0.000 0.000 {method 'toordinal' of 'datetime.date' objects}
475 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}
167 0.000 0.000 0.000 0.000 {built-in method builtins.len}
166 0.000 0.000 0.000 0.000 {built-in method builtins.isinstance}
83 0.000 0.000 0.000 0.000 {method 'end' of '_sre.SRE_Match' objects}
84 0.000 0.000 0.000 0.000 {method 'lower' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'close' of '_io.TextIOWrapper' objects}
2 0.000 0.000 0.000 0.000 {built-in method builtins.print}
83 0.000 0.000 0.000 0.000 {method 'keys' of 'dict' objects}
47 0.000 0.000 0.000 0.000 {built-in method columnCount}
83 0.000 0.000 0.000 0.000 {method 'weekday' of 'datetime.date' objects}
96 0.000 0.000 0.000 0.000 {method 'date' of 'datetime.datetime' objects}
20 0.000 0.000 0.000 0.000 {built-in method columnWidth}
1 0.000 0.000 0.000 0.000 {built-in method _locale._getdefaultlocale}
3 0.000 0.000 0.000 0.000 {method 'split' of 'str' objects}
1 0.000 0.000 0.000 0.000 _strptime.py:284(_calc_julian_from_U_or_W)
1 0.000 0.000 0.000 0.000 _bootlocale.py:11(getpreferredencoding)
1 0.000 0.000 0.000 0.000 {built-in method sortIndicatorOrder}
2 0.000 0.000 0.000 0.000 {built-in method time.time}
1 0.000 0.000 0.000 0.000 {method 'cursor' of 'sqlite3.Connection' objects}
2 0.000 0.000 0.000 0.000 {method 'index' of 'list' objects}
1 0.000 0.000 0.000 0.000 {built-in method fromordinal}
1 0.000 0.000 0.000 0.000 {built-in method sortIndicatorSection}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.000 0.000 0.000 0.000 codecs.py:259(__init__)
The time for calling the function and creating the table is okay, but the program needs almost the same time again before it reacts.
The goal would be to reduce the update time by approx. 50%. I need to display around 200 rows.
Is the stacked QTableWidget the right approach? If so, what do I have to do to optimize the update times?
I have already thought about changing the presentation from QTableWidget to a QGraphicsView and simply drawing the rectangles.
Or maybe a combination, for example: using the QTableWidget for the header and the first columns, then merging all the complex cells and inserting a QGraphicsView. But I am not sure whether I could get the right size for the drawn cells, and I do not know whether the display time would be shorter.
What do you think is the right approach for a table like this?
If you need it, I can append the function that updates the table.
EDIT:
I am now using 2 rows in the MainTable for one block, which reduced the number of stacked widgets from 2,526 to 7!
The time measurement for the same rows now looks like this:
complete update time: ~2s (manually stopped)
time to create the table: 0.572490930557251 (time.time over complete function)
display time: ~1.5s (complete update time - function time)
rows: 160
20912 function calls in 0.534 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
1417 0.258 0.000 0.258 0.000 {built-in method setCellWidget}
88 0.209 0.002 0.209 0.002 {built-in method setRowCount}
163 0.017 0.000 0.017 0.000 {method 'execute' of 'sqlite3.Cursor' objects}
1410 0.010 0.000 0.010 0.000 {built-in method setStyleSheet}
80 0.005 0.000 0.005 0.000 {built-in method addWidget}
1336 0.004 0.000 0.004 0.000 {built-in method setItem}
7 0.003 0.000 0.004 0.001 DigitalePlanungstafel.py:6037(obenWidgetErstellen)
501 0.003 0.000 0.003 0.000 {built-in method _pickle.loads}
1990 0.003 0.000 0.003 0.000 {built-in method cellWidget}
336 0.002 0.000 0.002 0.000 {method 'strftime' of 'datetime.date' objects}
83 0.002 0.000 0.004 0.000 _strptime.py:321(_strptime)
1330 0.002 0.000 0.002 0.000 {method 'format' of 'str' objects}
1410 0.002 0.000 0.002 0.000 {built-in method setFixedHeight}
929 0.001 0.000 0.001 0.000 {built-in method setBackground}
1377 0.001 0.000 0.001 0.000 {built-in method setTextAlignment}
240 0.001 0.000 0.001 0.000 DigitalePlanungstafel.py:7452(__init__)
163 0.001 0.000 0.001 0.000 {method 'fetchall' of 'sqlite3.Cursor' objects}
16 0.001 0.000 0.001 0.000 {built-in method setColumnCount}
770 0.001 0.000 0.001 0.000 {built-in method setSpan}
2127 0.001 0.000 0.001 0.000 {built-in method item}
1570 0.001 0.000 0.001 0.000 {built-in method rowHeight}
80 0.001 0.000 0.001 0.000 {built-in method setLayout}
83 0.000 0.000 0.000 0.000 {built-in method _locale.setlocale}
1 0.000 0.000 0.000 0.000 {built-in method _sqlite3.connect}
167 0.000 0.000 0.000 0.000 {built-in method setRowHeight}
89 0.000 0.000 0.000 0.000 {built-in method setForeground}
83 0.000 0.000 0.000 0.000 {method 'match' of '_sre.SRE_Pattern' objects}
48 0.000 0.000 0.000 0.000 {built-in method setHorizontalHeaderItem}
83 0.000 0.000 0.004 0.000 _strptime.py:562(_strptime_datetime)
7 0.000 0.000 0.000 0.000 {built-in method takeItem}
80 0.000 0.000 0.000 0.000 {built-in method setContentsMargins}
83 0.000 0.000 0.004 0.000 {built-in method strptime}
88 0.000 0.000 0.000 0.000 {built-in method setFont}
160 0.000 0.000 0.000 0.000 {built-in method setData}
83 0.000 0.000 0.000 0.000 locale.py:379(normalize)
80 0.000 0.000 0.000 0.000 {built-in method setAlignment}
83 0.000 0.000 0.001 0.000 _strptime.py:29(_getlang)
83 0.000 0.000 0.000 0.000 {method 'groupdict' of '_sre.SRE_Match' objects}
83 0.000 0.000 0.001 0.000 locale.py:565(getlocale)
1 0.000 0.000 0.000 0.000 {method 'close' of 'sqlite3.Connection' objects}
1 0.000 0.000 0.000 0.000 {built-in method io.open}
7 0.000 0.000 0.000 0.000 {built-in method horizontalHeader}
80 0.000 0.000 0.000 0.000 {built-in method setUnderline}
249 0.000 0.000 0.000 0.000 {method 'get' of 'dict' objects}
83 0.000 0.000 0.000 0.000 locale.py:462(_parse_localename)
48 0.000 0.000 0.000 0.000 {built-in method today}
7 0.000 0.000 0.000 0.000 {built-in method verticalHeader}
475 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}
167 0.000 0.000 0.000 0.000 {built-in method builtins.len}
165 0.000 0.000 0.000 0.000 {method 'toordinal' of 'datetime.date' objects}
1 0.000 0.000 0.000 0.000 {method 'close' of '_io.TextIOWrapper' objects}
166 0.000 0.000 0.000 0.000 {built-in method builtins.isinstance}
7 0.000 0.000 0.000 0.000 {built-in method setFrameShape}
27 0.000 0.000 0.000 0.000 {built-in method setColumnWidth}
84 0.000 0.000 0.000 0.000 {method 'lower' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'sort' of 'list' objects}
83 0.000 0.000 0.000 0.000 {method 'end' of '_sre.SRE_Match' objects}
87 0.000 0.000 0.000 0.000 DigitalePlanungstafel.py:6059(<lambda>)
83 0.000 0.000 0.000 0.000 {method 'keys' of 'dict' objects}
2 0.000 0.000 0.000 0.000 {built-in method builtins.print}
7 0.000 0.000 0.000 0.000 {built-in method setFixedSize}
14 0.000 0.000 0.000 0.000 {built-in method setVisible}
83 0.000 0.000 0.000 0.000 {method 'weekday' of 'datetime.date' objects}
96 0.000 0.000 0.000 0.000 {method 'date' of 'datetime.datetime' objects}
7 0.000 0.000 0.000 0.000 {built-in method setEditTriggers}
20 0.000 0.000 0.000 0.000 {built-in method columnWidth}
7 0.000 0.000 0.000 0.000 {built-in method setHorizontalScrollBarPolicy}
1 0.000 0.000 0.000 0.000 {built-in method _locale._getdefaultlocale}
The updating time is now more than good, but the splitting causes 2 problems.
Sorting (via the header) no longer works; the merging messes everything up. Is there a way to keep the two associated rows together when sorting?
I only want one row to be selected at a time. Because of the splitting, both associated rows need to be selected no matter which one is clicked. Not really a big deal, but it doesn't look good (see pictures).
MainTable selected row
MainTable selected row
You do not want to do any of this - instead:
Use a model for your data.
Use a viewmodel that adapts the model to the view you wish to have.
Use a QTableView to display the viewmodel.
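A minimal sketch of that model/view setup, assuming PyQt5 and a plain list-of-lists data source (the model class and column names below are hypothetical). The view only renders the visible cells, so no per-cell widgets are created up front, and sorting happens in a proxy model instead of on widgets:
import sys

from PyQt5.QtCore import Qt, QAbstractTableModel, QModelIndex, QSortFilterProxyModel
from PyQt5.QtWidgets import QApplication, QTableView


class BlockTableModel(QAbstractTableModel):
    # Holds one block per row as plain Python data (hypothetical structure).

    def __init__(self, rows, headers, parent=None):
        super().__init__(parent)
        self._rows = rows        # list of lists, one inner list per row
        self._headers = headers  # column captions

    def rowCount(self, parent=QModelIndex()):
        return len(self._rows)

    def columnCount(self, parent=QModelIndex()):
        return len(self._headers)

    def data(self, index, role=Qt.DisplayRole):
        if role == Qt.DisplayRole:
            return str(self._rows[index.row()][index.column()])
        return None

    def headerData(self, section, orientation, role=Qt.DisplayRole):
        if role == Qt.DisplayRole and orientation == Qt.Horizontal:
            return self._headers[section]
        return None


if __name__ == "__main__":
    app = QApplication(sys.argv)
    model = BlockTableModel([[i, "block %d" % i] for i in range(200)], ["Nr", "Block"])
    proxy = QSortFilterProxyModel()   # sorting is done here, not on widgets
    proxy.setSourceModel(model)
    view = QTableView()
    view.setModel(proxy)              # only visible cells get rendered
    view.setSortingEnabled(True)
    view.show()
    sys.exit(app.exec_())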

Improve performance of MongoDB client (sockets)

I am using Python 2.7 (Anaconda distribution) on Windows 8.1 Pro.
I have a database of articles with their respective topics.
I am building an application which queries textual phrases in my database and associates article topics to each queried phrase. The topics are assigned based on the relevance of the phrase for the article.
The bottleneck seems to be Python socket communication with the localhost.
Here are my cProfile outputs:
topics_fit (PhraseVectorizer_1_1.py:668)
function called 1 times
1930698 function calls (1929630 primitive calls) in 148.209 seconds
Ordered by: cumulative time, internal time, call count
List reduced from 286 to 40 due to restriction <40>
ncalls tottime percall cumtime percall filename:lineno(function)
1 1.224 1.224 148.209 148.209 PhraseVectorizer_1_1.py:668(topics_fit)
206272 0.193 0.000 146.780 0.001 cursor.py:1041(next)
601 0.189 0.000 146.455 0.244 cursor.py:944(_refresh)
534 0.030 0.000 146.263 0.274 cursor.py:796(__send_message)
534 0.009 0.000 141.532 0.265 mongo_client.py:725(_send_message_with_response)
534 0.002 0.000 141.484 0.265 mongo_client.py:768(_reset_on_error)
534 0.019 0.000 141.482 0.265 server.py:69(send_message_with_response)
534 0.002 0.000 141.364 0.265 pool.py:225(receive_message)
535 0.083 0.000 141.362 0.264 network.py:106(receive_message)
1070 1.202 0.001 141.278 0.132 network.py:127(_receive_data_on_socket)
3340 140.074 0.042 140.074 0.042 {method 'recv' of '_socket.socket' objects}
535 0.778 0.001 4.700 0.009 helpers.py:88(_unpack_response)
535 3.828 0.007 3.920 0.007 {bson._cbson.decode_all}
67 0.099 0.001 0.196 0.003 {method 'sort' of 'list' objects}
206187 0.096 0.000 0.096 0.000 PhraseVectorizer_1_1.py:705(<lambda>)
206187 0.096 0.000 0.096 0.000 database.py:339(_fix_outgoing)
206187 0.074 0.000 0.092 0.000 objectid.py:68(__init__)
1068 0.005 0.000 0.054 0.000 server.py:135(get_socket)
1068/534 0.010 0.000 0.041 0.000 contextlib.py:21(__exit__)
1068 0.004 0.000 0.041 0.000 pool.py:501(get_socket)
534 0.003 0.000 0.028 0.000 pool.py:208(send_message)
534 0.009 0.000 0.026 0.000 pool.py:573(return_socket)
567 0.001 0.000 0.026 0.000 socket.py:227(meth)
535 0.024 0.000 0.024 0.000 {method 'sendall' of '_socket.socket' objects}
534 0.003 0.000 0.023 0.000 topology.py:134(select_server)
206806 0.020 0.000 0.020 0.000 collection.py:249(database)
418997 0.019 0.000 0.019 0.000 {len}
449 0.001 0.000 0.018 0.000 topology.py:143(select_server_by_address)
534 0.005 0.000 0.018 0.000 topology.py:82(select_servers)
1068/534 0.001 0.000 0.018 0.000 contextlib.py:15(__enter__)
534 0.002 0.000 0.013 0.000 thread_util.py:83(release)
207307 0.010 0.000 0.011 0.000 {isinstance}
534 0.005 0.000 0.011 0.000 pool.py:538(_get_socket_no_auth)
534 0.004 0.000 0.011 0.000 thread_util.py:63(release)
534 0.001 0.000 0.011 0.000 mongo_client.py:673(_get_topology)
535 0.003 0.000 0.010 0.000 topology.py:57(open)
206187 0.008 0.000 0.008 0.000 {method 'popleft' of 'collections.deque' objects}
535 0.002 0.000 0.007 0.000 topology.py:327(_apply_selector)
536 0.003 0.000 0.007 0.000 topology.py:286(_ensure_opened)
1071 0.004 0.000 0.007 0.000 periodic_executor.py:50(open)
In particular: {method 'recv' of '_socket.socket' objects} seems to cause trouble.
According to suggestions found in What can I do to improve socket performance in Python 3?, I tried gevent.
I added this snippet at the beginning of my script (before importing anything):
from gevent import monkey
monkey.patch_all()
This resulted in even slower performance...
*** PROFILER RESULTS ***
topics_fit (PhraseVectorizer_1_1.py:671)
function called 1 times
1956879 function calls (1951292 primitive calls) in 158.260 seconds
Ordered by: cumulative time, internal time, call count
List reduced from 427 to 40 due to restriction <40>
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 158.170 158.170 hub.py:358(run)
1 0.000 0.000 158.170 158.170 {method 'run' of 'gevent.core.loop' objects}
2/1 1.286 0.643 158.166 158.166 PhraseVectorizer_1_1.py:671(topics_fit)
206272 0.198 0.000 156.670 0.001 cursor.py:1041(next)
601 0.192 0.000 156.203 0.260 cursor.py:944(_refresh)
534 0.029 0.000 156.008 0.292 cursor.py:796(__send_message)
534 0.012 0.000 150.514 0.282 mongo_client.py:725(_send_message_with_response)
534 0.002 0.000 150.439 0.282 mongo_client.py:768(_reset_on_error)
534 0.017 0.000 150.437 0.282 server.py:69(send_message_with_response)
551/535 0.002 0.000 150.316 0.281 pool.py:225(receive_message)
552/536 0.079 0.000 150.314 0.280 network.py:106(receive_message)
1104/1072 0.815 0.001 150.234 0.140 network.py:127(_receive_data_on_socket)
2427/2395 0.019 0.000 149.418 0.062 socket.py:381(recv)
608/592 0.003 0.000 48.541 0.082 socket.py:284(_wait)
552 0.885 0.002 5.464 0.010 helpers.py:88(_unpack_response)
552 4.475 0.008 4.577 0.008 {bson._cbson.decode_all}
3033 2.021 0.001 2.021 0.001 {method 'recv' of '_socket.socket' objects}
7/4 0.000 0.000 0.221 0.055 hub.py:189(_import)
4 0.127 0.032 0.221 0.055 {__import__}
67 0.104 0.002 0.202 0.003 {method 'sort' of 'list' objects}
536/535 0.003 0.000 0.142 0.000 topology.py:57(open)
537/536 0.002 0.000 0.139 0.000 topology.py:286(_ensure_opened)
1072/1071 0.003 0.000 0.138 0.000 periodic_executor.py:50(open)
537/536 0.001 0.000 0.136 0.000 server.py:33(open)
537/536 0.001 0.000 0.135 0.000 monitor.py:69(open)
20/19 0.000 0.000 0.132 0.007 topology.py:342(_update_servers)
4 0.000 0.000 0.131 0.033 hub.py:418(_get_resolver)
1 0.000 0.000 0.122 0.122 resolver_thread.py:13(__init__)
1 0.000 0.000 0.122 0.122 hub.py:433(_get_threadpool)
206187 0.081 0.000 0.101 0.000 objectid.py:68(__init__)
206187 0.100 0.000 0.100 0.000 database.py:339(_fix_outgoing)
206187 0.098 0.000 0.098 0.000 PhraseVectorizer_1_1.py:708(<lambda>)
1 0.073 0.073 0.093 0.093 threadpool.py:2(<module>)
2037 0.003 0.000 0.092 0.000 hub.py:159(get_hub)
2 0.000 0.000 0.090 0.045 thread.py:39(start_new_thread)
2 0.000 0.000 0.090 0.045 greenlet.py:195(spawn)
2 0.000 0.000 0.090 0.045 greenlet.py:74(__init__)
1 0.000 0.000 0.090 0.090 hub.py:259(__init__)
1102 0.004 0.000 0.078 0.000 pool.py:501(get_socket)
1068 0.005 0.000 0.074 0.000 server.py:135(get_socket)
This performance is somewhat unacceptable for my application - I would like it to be much faster (this is timed and profiled for a subset of ~20 documents, and I need to process a few tens of thousands).
Any ideas on how to speed it up?
Much appreciated.
Edit:
Code snippet that I profiled:
# also tried monkey patching all here, see profiler
from pymongo import MongoClient

def topics_fit(self):
    client = MongoClient()
    # tried motor for multithreading - also slow
    #client = motor.motor_tornado.MotorClient()
    # initialize DB cursors
    db_wiki = client.wiki
    # initialize topic feature dictionary
    self.topics = OrderedDict()
    self.topic_mapping = OrderedDict()
    vocabulary_keys = self.vocabulary.keys()
    num_categories = 0
    for phrase in vocabulary_keys:
        phrase_tokens = phrase.split()
        if len(phrase_tokens) > 1:
            # query for current phrase
            AND_phrase = "\"" + phrase + "\""
            cursor = db_wiki.categories.find({ "$text" : { "$search": AND_phrase } }, { "score": { "$meta": "textScore" } })
            cursor = list(cursor)
            if cursor:
                cursor.sort(key=lambda k: k["score"], reverse=True)
                added_categories = cursor[0]["category_ids"]
                for added_category in added_categories:
                    if not (added_category in self.topics):
                        self.topics[added_category] = num_categories
                        if not (self.vocabulary[phrase] in self.topic_mapping):
                            self.topic_mapping[self.vocabulary[phrase]] = [num_categories, ]
                        else:
                            self.topic_mapping[self.vocabulary[phrase]].append(num_categories)
                        num_categories += 1
                    else:
                        if not (self.vocabulary[phrase] in self.topic_mapping):
                            self.topic_mapping[self.vocabulary[phrase]] = [self.topics[added_category], ]
                        else:
                            self.topic_mapping[self.vocabulary[phrase]].append(self.topics[added_category])
Edit 2: output of index_information():
{u'_id_':
{u'ns': u'wiki.categories', u'key': [(u'_id', 1)], u'v': 1},
u'article_title_text_article_body_text_category_names_text': {u'default_language': u'english', u'weights': SON([(u'article_body', 1), (u'article_title', 1), (u'category_names', 1)]), u'key': [(u'_fts', u'text'), (u'_ftsx', 1)], u'v': 1, u'language_override': u'language', u'ns': u'wiki.categories', u'textIndexVersion': 2}}

Optimizing point - circle distance method

I'm implementing a RANSAC algorithm for circle detection in images. I profiled the execution and I get:
13699392 function calls in 799.981 seconds
Random listing order was used
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 {time.time}
579810 0.564 0.000 0.564 0.000 {getattr}
289905 2.343 0.000 8.661 0.000 /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/blas.py:226(_get_funcs)
579810 0.124 0.000 0.124 0.000 {method 'get' of 'dict' objects}
289905 0.645 0.000 2.676 0.000 {map}
2954 0.005 0.000 0.005 0.000 {method 'transpose' of 'numpy.ndarray' objects}
2954 0.023 0.000 0.464 0.000 /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/shape_base.py:179(vstack)
2954 2.373 0.001 2.373 0.001 {method 'read' of 'cv2.VideoCapture' objects}
579810 0.966 0.000 2.031 0.000 /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/function_base.py:550(asarray_chkfinite)
289905 10.164 0.000 24.316 0.000 /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/basic.py:456(lstsq)
2954 1.090 0.000 1.090 0.000 {normalize}
1455433 3.827 0.000 3.827 0.000 {numpy.core.multiarray.array}
579810 2.899 0.000 3.148 0.000 /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/numerictypes.py:949(_can_coerce_all)
1 0.000 0.000 0.000 0.000 {numpy.core.multiarray.empty}
2954 32.544 0.011 795.875 0.269 git/tra-python-processer/tra/ransac.py:31(image_search)
289905 0.714 0.000 38.644 0.000 git/tra-python-processer/tra/features.py:44(__init__)
289905 2.157 0.000 2.157 0.000 {method 'randint' of 'mtrand.RandomState' objects}
1 0.005 0.005 0.005 0.005 {VideoCapture}
289905 1.026 0.000 1.026 0.000 {method 'astype' of 'numpy.generic' objects}
2954 0.006 0.000 0.010 0.000 /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/fromnumeric.py:495(transpose)
289905 11.303 0.000 37.930 0.000 git/tra-python-processer/tra/features.py:48(__gen)
3496584 0.343 0.000 0.343 0.000 {len}
2954 0.344 0.000 0.344 0.000 {numpy.core.multiarray.concatenate}
2954 3.214 0.001 3.214 0.001 {numpy.core.multiarray.where}
869715 0.575 0.000 0.575 0.000 /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/fromnumeric.py:2514(size)
869715 0.778 0.000 2.278 0.000 /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/numeric.py:394(asarray)
289905 716.946 0.002 716.946 0.002 git/tra-python-processer/tra/features.py:89(points_distance)
5908 0.015 0.000 0.031 0.000 /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/numeric.py:464(asanyarray)
289905 0.275 0.000 0.275 0.000 {isinstance}
289905 0.342 0.000 9.003 0.000 /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/lapack.py:255(get_lapack_funcs)
5908 0.058 0.000 0.097 0.000 /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/shape_base.py:60(atleast_2d)
295813 0.089 0.000 0.089 0.000 {method 'append' of 'list' objects}
289905 0.645 0.000 3.793 0.000 /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/numerictypes.py:970(find_common_type)
2954 0.221 0.000 0.221 0.000 {threshold}
1 0.000 0.000 0.000 0.000 {method 'get' of 'cv2.VideoCapture' objects}
1 0.000 0.000 0.000 0.000 git/tra-python-processer/tra/ransac.py:24(__init__)
2954 0.009 0.000 0.009 0.000 {numpy.core.multiarray.zeros}
579810 0.143 0.000 0.143 0.000 /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/misc.py:126(_datacopied)
1 0.201 0.201 799.981 799.981 git/tra-python-processer/tra/ransac.py:122(video_processing)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
2954 1.528 0.001 1.528 0.001 {cvtColor}
289905 1.280 0.000 5.346 0.000 /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/blas.py:182(find_best_blas_type)
289905 0.198 0.000 0.198 0.000 {method 'index' of 'list' objects}
This is the first time I have used the profiler, but as far as I can tell the heaviest function is features.py:89(points_distance), which turns out to be a very simple implementation:
def points_distance(self, points):
    d = n.abs(
        n.sqrt(
            n.power(self.xc - points[:, 0], 2) + n.power(self.yc - points[:, 1], 2)
        )
        - self.radius
    )
    return d
Any suggestions? Maybe Cython?
Use scipy.spatial.distance.cdist for the distance calculation in points_distance.
First, optimize your code in pure Python and NumPy. Then, if necessary, port the critical parts to Cython. Since a number of functions are called repeatedly, on the order of 100,000 times each, you should get some speed-up from Cython for those parts. Unless, of course, the computational bottleneck is in the distance calculation itself, which will then limit the overall execution time.
By the way, you should sort your profiler results by tottime so they are easier to read.
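A minimal sketch of the cdist suggestion above, assuming points is an (N, 2) array (np stands in for the n alias used in the question):
import numpy as np
from scipy.spatial.distance import cdist

def points_distance(self, points):
    # |distance(point, centre) - radius| for every point, vectorized via cdist
    centre = np.array([[self.xc, self.yc]])
    return np.abs(cdist(points, centre).ravel() - self.radius)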
