Related
Trying to solidify my knowledge about Time Complexity. I think I know the answer to this, but would like to hear some good explanations.
main = []
while len(main) < 5:
sub = []
while len(sub) < 5:
sub.append(random.randint(1,10))
main.append(sub)
VS
main = []
sub = []
while len(main) < 5:
sub.append(random.randint(1,10))
if len(sub) == 5:
main.append(list(sub))
sub = []
There's no difference, since the time complexity is constant in both cases - you perform a constant amount of operations both times.
The time complexity in both are O(1) - constant time because they both perform a constant number of operations as #Yakov Dan already stated.
This is because time complexity is usually expressed as a function of a variable number(say n) and tends to show how changing the value of n will change the time the algorithm will take.
Now, assuming you had n instead of 5, then you would have O(n^2) for both cases. It may be tricky for the second case since a basic way of checking the polynomial complexity is to count the number of nested loops and you can be lead to conclude that the second version is O(n) since it has a single loop.
However, carefully looking at it will show you that the loop runs n(5 in this case) times for sub for each value appended to main, so it is essentially the same.
This of course assumes that the in-built list.append is atomic or runs in a constant time.
I have a list of numbers, e.g. [50,100,150,200,250]. I need to increment (or decrement) each number from a specified index and by a specified amount. I have been able to do this in two ways:
from itertools import islice
l = [50,100,150,200,250]
start_increment_index = 3
l[start_increment_index:] = [e+100 for e in l[start_increment_index:]]
print (l)
l = [50,100,150,200,250]
l[start_increment_index:] = [e+100 for e in islice(l,start_increment_index,len(l))]
print (l)
Both print: [50, 100, 150, 300, 350].
However, my real list contains millions of numbers and this operation is performed repeatedly with different indexes and different increments/decrements. Would there be a faster way of doing this using a Python list? I have been considering writing my own C/C++ extension to deal with this.
Edit: Would this be a useful module for Python in general? Having a function written in C which can take parameters (python_list_object, increment_amount, start_index, end_index)?
Main problem in your solution that you creates(allocating memory + copy) two lists. First it's list comprehension by itself and second l[start_increment_index:] inside it.
If you data source is python list, you can do you operation for O(n):
for i in range(start_increment_index, len(l)):
l[i] += increment
NB: define increment first.
It depends specifically on your goals. I suppose that you can use segment tree for this case. For more information see https://en.m.wikipedia.org/wiki/Segment_tree.
Just for brief description. This structure represents array upon which will be performed range operations (like addition/substraction subarray with number) This structure is optimized for case where you have very big number of such range queries.
Note: if you want to use only python list structure, then you can implement sparse table (it is another view of segment tree with implicit storing of tree in arrays)
These codes gives the sum of even integers in a list without using loop statement. I would like to know the time complexity and space complexity of both codes. which is best?
CODE 1:
class EvenSum:
#Initialize the class
def __init__(self):
self.res = 0
def sumEvenIntegers(self, integerlist):
if integerlist:
if not integerlist[0] % 2:
self.res += integerlist[0]
del integerlist[0]
self.sumEvenIntegers(integerlist)
else:
del integerlist[0]
self.sumEvenIntegers(integerlist)
return self.res
#main method
if __name__ == "__main__":
l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
even = EvenSum()
print even.sumEvenIntegers(l)
CODE 2:
import numpy as np
def sum_of_all_even_integers(list):
list_sum = sum(list)
bin_arr = map(lambda x:x%2, list)
return list_sum - sum(list*bin_arr)
if __name__ == "__main__":
list = np.array([1,2,3,4,5,6,7,8,9,10])
print sum_of_all_even_integers(list)
According to the Python wiki, deleting an item from a list takes linear time proportional to the number of elements in the list. Since you delete every item in the list, and each deletion takes linear time, the overall runtime is proportional to the square of number of items in the list.
In your second code snippet, both sum as well as map take linear time. So the overall complexity is linear proportional to the number of elements in the list. Interestingly, sum_of_elements isn't used at all (but it doesn't sum all even elements either).
what about the following?
import numpy as np
a = np.arange(20)
print np.sum(a[a%2==0])
It seems to be much more lightweight compared to your two code snippets.
Small timings with an np.arange(998):
Pure numpy:
248502
0.0
Class recursion:
248502
0.00399994850159
List/Numpy one:
248502
0.00200009346008
And, if there's a 999 element array, your class runs in failure, because the maximum recursion depth is reached.
First code use item deletion in list and recursivity, two thing at which python is not so good : time deletion take an O(n) time, since you recreate the whole list, and python does not optimize recursive calls (to keep full info about the traceback I think).
So I would go for the second code (which I think actually use "for loops", only the loops are hidden in the reduce and map).
If you use numpy, you could actually do something like :
a = np.array([1,2,3,4,5,6,7,8,9,10])
np.sum(np.where((a+1)%2,a,0))
Or like anki proposed :
np.sum( a[a%2 == 0] )
Which I think would be best since numpy is optimized for array manipulation.
By the way, never name an object list, as it overwrites the list constructor.
EDIT :
If you just want the sum of all even number in [0,n], you don't need a sum or anything. There is a mathematical formula for that :
s=(n//2)*(n//2+1)
First have O(N^2) time complexity and O(N) space complexity. The second have O(N) time complexity and space complexity.
The first uses one stack frame (one piece of stack memory of constant but quite large size) for each element in the array. In addition it executes the function for each element, but for each time it deletes the first element of the array, which is an O(N) operation.
The second happens much behind the scene. The map function generates a new list of the same size of the original, in addition it calls a function for each element - giving the complexity directly. Similarily for the reduce and sum functions - they do the same operation for each element in the list, although they don't use more memory than constant. Adding up these doesn't make the complexities worse - twice or thrice O(N) is still O(N).
The best is probably the later, but then again - it depends on what your preferences are. Maybe you want to consume a lot of time and stack space? In that case the first would suit your preferences much better.
Also note that the first solution modifies the input data - they don't do the same thing in other words. After calling the first the list sent to the function would be empty (which may or may not be a bad thing).
Recently I read a problem to practice DP. I wasn't able to come up with one, so I tried a recursive solution which I later modified to use memoization. The problem statement is as follows :-
Making Change. You are given n types of coin denominations of values
v(1) < v(2) < ... < v(n) (all integers). Assume v(1) = 1, so you can
always make change for any amount of money C. Give an algorithm which
makes change for an amount of money C with as few coins as possible.
[on problem set 4]
I got the question from here
My solution was as follows :-
def memoized_make_change(L, index, cost, d):
if index == 0:
return cost
if (index, cost) in d:
return d[(index, cost)]
count = cost / L[index]
val1 = memoized_make_change(L, index-1, cost%L[index], d) + count
val2 = memoized_make_change(L, index-1, cost, d)
x = min(val1, val2)
d[(index, cost)] = x
return x
This is how I've understood my solution to the problem. Assume that the denominations are stored in L in ascending order. As I iterate from the end to the beginning, I have a choice to either choose a denomination or not choose it. If I choose it, I then recurse to satisfy the remaining amount with lower denominations. If I do not choose it, I recurse to satisfy the current amount with lower denominations.
Either way, at a given function call, I find the best(lowest count) to satisfy a given amount.
Could I have some help in bridging the thought process from here onward to reach a DP solution? I'm not doing this as any HW, this is just for fun and practice. I don't really need any code either, just some help in explaining the thought process would be perfect.
[EDIT]
I recall reading that function calls are expensive and is the reason why bottom up(based on iteration) might be preferred. Is that possible for this problem?
Here is a general approach for converting memoized recursive solutions to "traditional" bottom-up DP ones, in cases where this is possible.
First, let's express our general "memoized recursive solution". Here, x represents all the parameters that change on each recursive call. We want this to be a tuple of positive integers - in your case, (index, cost). I omit anything that's constant across the recursion (in your case, L), and I suppose that I have a global cache. (But FWIW, in Python you should just use the lru_cache decorator from the standard library functools module rather than managing the cache yourself.)
To solve for(x):
If x in cache: return cache[x]
Handle base cases, i.e. where one or more components of x is zero
Otherwise:
Make one or more recursive calls
Combine those results into `result`
cache[x] = result
return result
The basic idea in dynamic programming is simply to evaluate the base cases first and work upward:
To solve for(x):
For y starting at (0, 0, ...) and increasing towards x:
Do all the stuff from above
However, two neat things happen when we arrange the code this way:
As long as the order of y values is chosen properly (this is trivial when there's only one vector component, of course), we can arrange that the results for the recursive call are always in cache (i.e. we already calculated them earlier, because y had that value on a previous iteration of the loop). So instead of actually making the recursive call, we replace it directly with a cache lookup.
Since every component of y will use consecutively increasing values, and will be placed in the cache in order, we can use a multidimensional array (nested lists, or else a Numpy array) to store the values instead of a dictionary.
So we get something like:
To solve for(x):
cache = multidimensional array sized according to x
for i in range(first component of x):
for j in ...:
(as many loops as needed; better yet use `itertools.product`)
If this is a base case, write the appropriate value to cache
Otherwise, compute "recursive" index values to use, look up
the values, perform the computation and store the result
return the appropriate ("last") value from cache
I suggest considering the relationship between the value you are constructing and the values you need for it.
In this case you are constructing a value for index, cost based on:
index-1 and cost
index-1 and cost%L[index]
What you are searching for is a way of iterating over the choices such that you will always have precalculated everything you need.
In this case you can simply change the code to the iterative approach:
for each choice of index 0 upwards:
for each choice of cost:
compute value corresponding to index,cost
In practice, I find that the iterative approach can be significantly faster (e.g. *4 perhaps) for simple problems as it avoids the overhead of function calls and checking the cache for preexisting values.
This may be more of an 'approach' or conceptual question.
Basically, I have a python a multi-dimensional list like so:
my_list = [[0,1,1,1,0,1], [1,1,1,0,0,1], [1,1,0,0,0,1], [1,1,1,1,1,1]]
What I have to do is iterate through the array and compare each element with those directly surrounding it as though the list was layed out as a matrix.
For instance, given the first element of the first row, my_list[0][0], I need to know know the value of my_list[0][1], my_list[1][0] and my_list[1][1]. The value of the 'surrounding' elements will determine how the current element should be operated on. Of course for an element in the heart of the array, 8 comparisons will be necessary.
Now I know I could simply iterate through the array and compare with the indexed values, as above. I was curious as to whether there was a more efficient way which limited the amount of iteration required? Should I iterate through the array as is, or iterate and compare only values to either side and then transpose the array and run it again. This, however would ignore those values to the diagonal. And should I store results of the element lookups, so I don't keep determining the value of the same element multiple times?
I suspect this may have a fundamental approach in Computer Science, and I am eager to get feedback on the best approach using Python as opposed to looking for a specific answer to my problem.
You may get faster, and possibly even simpler, code by using numpy, or other alternatives (see below for details). But from a theoretical point of view, in terms of algorithmic complexity, the best you can get is O(N*M), and you can do that with your design (if I understand it correctly). For example:
def neighbors(matrix, row, col):
for i in row-1, row, row+1:
if i < 0 or i == len(matrix): continue
for j in col-1, col, col+1:
if j < 0 or j == len(matrix[i]): continue
if i == row and j == col: continue
yield matrix[i][j]
matrix = [[0,1,1,1,0,1], [1,1,1,0,0,1], [1,1,0,0,0,1], [1,1,1,1,1,1]]
for i, row in enumerate(matrix):
for j, cell in enumerate(cell):
for neighbor in neighbors(matrix, i, j):
do_stuff(cell, neighbor)
This has takes N * M * 8 steps (actually, a bit less than that, because many cells will have fewer than 8 neighbors). And algorithmically, there's no way you can do better than O(N * M). So, you're done.
(In some cases, you can make things simpler—with no significant change either way in performance—by thinking in terms of iterator transformations. For example, you can easily create a grouper over adjacent triplets from a list a by properly zipping a, a[1:], and a[2:], and you can extend this to adjacent 2-dimensional nonets. But I think in this case, it would just make your code more complicated that writing an explicit neighbors iterator and explicit for loops over the matrix.)
However, practically, you can get a whole lot faster, in various ways. For example:
Using numpy, you may get an order of magnitude or so faster. When you're iterating a tight loop and doing simple arithmetic, that's one of the things that Python is particularly slow at, and numpy can do it in C (or Fortran) instead.
Using your favorite GPGPU library, you can explicitly vectorize your operations.
Using multiprocessing, you can break the matrix up into pieces and perform multiple pieces in parallel on separate cores (or even separate machines).
Of course for a single 4x6 matrix, none of these are worth doing… except possibly for numpy, which may make your code simpler as well as faster, as long as you can express your operations naturally in matrix/broadcast terms.
In fact, even if you can't easily express things that way, just using numpy to store the matrix may make things a little simpler (and save some memory, if that matters). For example, numpy can let you access a single column from a matrix naturally, while in pure Python, you need to write something like [row[col] for row in matrix].
So, how would you tackle this with numpy?
First, you should read over numpy.matrix and ufunc (or, better, some higher-level tutorial, but I don't have one to recommend) before going too much further.
Anyway, it depends on what you're doing with each set of neighbors, but there are three basic ideas.
First, if you can convert your operation into simple matrix math, that's always easiest.
If not, you can create 8 "neighbor matrices" just by shifting the matrix in each direction, then perform simple operations against each neighbor. For some cases, it may be easier to start with an N+2 x N+2 matrix with suitable "empty" values (usually 0 or nan) in the outer rim. Alternatively, you can shift the matrix over and fill in empty values. Or, for some operations, you don't need an identical-sized matrix, so you can just crop the matrix to create a neighbor. It really depends on what operations you want to do.
For example, taking your input as a fixed 6x4 board for the Game of Life:
def neighbors(matrix):
for i in -1, 0, 1:
for j in -1, 0, 1:
if i == 0 and j == 0: continue
yield np.roll(np.roll(matrix, i, 0), j, 1)
matrix = np.matrix([[0,0,0,0,0,0,0,0],
[0,0,1,1,1,0,1,0],
[0,1,1,1,0,0,1,0],
[0,1,1,0,0,0,1,0],
[0,1,1,1,1,1,1,0],
[0,0,0,0,0,0,0,0]])
while True:
livecount = sum(neighbors(matrix))
matrix = (matrix & (livecount==2)) | (livecount==3)
(Note that this isn't the best way to solve this problem, but I think it's relatively easy to understand, and likely to illuminate whatever your actual problem is.)