What is the time complexity of operations in SortedList implementation of sortedcontainers module?
As I understand it, the underlying data structure is an array list. So does insertion take O(n) time, since the index can be found in O(log n) but inserting the element at the correct location is O(n)?
Similarly, popping an element at a given index must be O(n) as well.
Insert, remove, get index, bisect right and left, and finding an element inside the list are all O(log n) operations. It's similar to TreeSet and multiset in Java and C++, which are implemented with AVL or red-black trees.
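For reference, a minimal sketch of those operations (assuming sortedcontainers is installed, e.g. via pip install sortedcontainers):

```python
from sortedcontainers import SortedList

sl = SortedList([5, 1, 3])
sl.add(2)                 # insert while keeping order, ~O(log n) amortized
print(sl)                 # SortedList([1, 2, 3, 5])
print(sl.bisect_left(3))  # 2 -- leftmost index where 3 could be inserted
print(sl.index(5))        # 3 -- position lookup
sl.remove(3)              # remove by value
print(sl.pop(0))          # 1 -- pop the smallest element
```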
I want to pop the first (or the nth) element of a list (or deque). My friend told me that popping from a deque is an O(1) operation. I am not talking about popping the last element; I am talking about popping the first or the nth element.
Yes, removing the first or the last element (as well as inserting at either end) is O(1).
You can see more information about that here, quoting from there:
The complexity (efficiency) of common operations on deques is as follows:
Random access - constant O(1)
Insertion or removal of elements at the end or beginning - constant O(1)
Insertion or removal of elements - linear O(n)
Popping out the first element is O(1), but popping out the nth element is not: indexed access in the middle of a deque is O(n).
As written in the Python documentation:
https://docs.python.org/3/library/collections.html#deque-objects
Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction.
It is possible to build a data structure that can pop the nth element in constant time, but the space complexity will not be satisfactory.
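A short sketch contrasting the two cases with the standard library deque:

```python
from collections import deque

d = deque(range(10))

d.popleft()   # remove the first element -- O(1)
d.pop()       # remove the last element  -- O(1)

# There is no O(1) removal of the nth element; both approaches
# below touch O(n) elements internally:
del d[4]      # delete by index

d.rotate(-3)  # bring index 3 to the front ...
d.popleft()   # ... pop it ...
d.rotate(3)   # ... and rotate back
```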
I am looking for a Python data structure that functions as a sorted list and has the following asymptotics:
O(1) pop from beginning (pop smallest element)
O(1) pop from end (pop largest element)
O(log n) or better insert
Does such a datastructure with an efficient implementation exist? If so, is there a library that implements it in Python?
A regular red/black tree or B-tree can do this in an amortized sense. If you store pointers to the smallest and biggest elements of the tree, then the cost of deleting those elements is amortized O(1), meaning that any series of d deletions will take time O(d), though individual deletions may take longer than this. The cost of insertions is O(log n), which is as good as possible, because otherwise you could sort n items in less than O(n log n) time with your data structure.
As for libraries that implement this - that I’m not sure of.
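That said, sortedcontainers.SortedList (mentioned above) comes close in practice: per its documentation, add and pop from either end are amortized O(log n), which meets the insert bound and nearly meets the pop bounds. A minimal sketch:

```python
from sortedcontainers import SortedList

sl = SortedList()
for x in [7, 2, 9, 4]:
    sl.add(x)          # insert: ~O(log n) amortized

smallest = sl.pop(0)   # pop the smallest element
largest = sl.pop(-1)   # pop the largest element
print(smallest, largest, list(sl))  # 2 9 [4, 7]
```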
The heapq documentation says:
These two make it possible to view the heap as a regular Python list without surprises: heap[0] is the smallest item, and heap.sort() maintains the heap invariant!
So is the heapq implementation really just heap.sort() after every push/pop, or is it implemented as a traditional min-heap (which would make sense, since that would be O(log n) instead of O(n log n) for pop and push)?
Firstly, heappush() and heappop() in the heapq library are definitely O(log n).
Secondly, heap.sort() sorts the items in increasing order, which means the min-heap invariant (each parent's value is less than or equal to its children's values) is still maintained.
The heapq implementation is definitely not heap.sort() after every push() and pop(), because that would be O(n log n) and suboptimal compared to the O(log n) it provides. For more information, take a look at https://github.com/python/cpython/blob/master/Lib/heapq.py
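A quick sketch showing both points: heappush() builds a heap (not a sorted list), and a fully sorted list happens to satisfy the heap invariant:

```python
import heapq

heap = []
for x in [5, 1, 4, 2, 3]:
    heapq.heappush(heap, x)  # each push is O(log n): a sift-up, not a full sort

print(heap)                   # [1, 2, 4, 5, 3] -- a valid min-heap, not sorted

# A sorted list also satisfies the heap invariant
# (heap[k] <= heap[2k+1] and heap[k] <= heap[2k+2]),
# which is what "heap.sort() maintains the heap invariant" means:
heap.sort()
print(heapq.heappop(heap))    # 1 -- pop is O(log n): swap then sift-down
```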
I'm trying to determine whether the complexity of converting a collections.deque object into a Python list object is O(n). I imagine it would have to copy every element into the list, but I cannot seem to find the implementation code behind deque. So has Python built something more efficient under the hood that could allow for O(1) conversion to a list?
Edit: based on the following, I do not believe it could be any faster than O(n):
"Indexed access is O(1) at both ends but slows to O(n) in the middle. For fast random access, use lists instead."
If it cannot access a middle node in O(1) time, it will not be able to convert without at least the same complexity.
You have to access every node. O(1) time is impossible for that fact alone.
I believe Python's deque follows the same principles as conventional deques, in that accessing the first element is constant time. You have to do that for each of n elements, so the total runtime is O(n).
Here is the implementation of deque
However, that is irrelevant for determining the complexity of converting a deque to a list in Python.
If Python is not somehow reusing the data structure internally, conversion into a list will require a walk through the deque, which is O(n).
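A rough empirical check of the linear cost (a sketch; the absolute numbers are machine-dependent, but the time should roughly double as n doubles):

```python
from collections import deque
import timeit

# If list(d) is O(n), doubling n should roughly double the time.
for n in (100_000, 200_000, 400_000):
    d = deque(range(n))
    t = timeit.timeit(lambda: list(d), number=100)
    print(f"n={n}: {t:.3f}s for 100 conversions")
```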
I've read that Python's lists are implemented using pointers. I then see this module http://docs.python.org/2/library/bisect.html which does efficient insertion into a sorted list. How does it do that efficiently? If the list is implemented using pointers and not a contiguous array, then how can it be efficiently searched for the insertion point? And if the list is backed by a contiguous array, then there would have to be element shifting when inserting an element. So how does bisect work efficiently?
I believe a list stores pointers to its elements, but the "list" itself is really a contiguous array (in C). They're called lists, but they're not linked lists.
Actually, finding an element in a sorted list is pretty good - it's O(log n). But inserting is not that good - it's O(n).
If you need an O(log n) data structure, it'd be better to use a treap or a red-black tree.
It's the searching that's efficient, not the actual insertion. The fast searching makes the whole operation "adding a value and keeping all values in order" fast compared to, for example, appending and then sorting again: O(n) rather than O(n log n).
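A short sketch of what this looks like with bisect: the search is binary, but the insert still shifts trailing elements:

```python
import bisect

data = [1, 3, 5, 7]
i = bisect.bisect_left(data, 4)  # O(log n) binary search for the insertion point
data.insert(i, 4)                # O(n) shift of the trailing elements

# bisect.insort does both steps in one call:
bisect.insort(data, 6)
print(data)  # [1, 3, 4, 5, 6, 7]
```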