I am looking for a Python data structure that functions as a sorted list with the following asymptotics:
O(1) pop from beginning (pop smallest element)
O(1) pop from end (pop largest element)
O(log n) (or better) insert
Does such a data structure with an efficient implementation exist? If so, is there a library that implements it in Python?
A regular red/black tree or B-tree can do this in an amortized sense. If you store pointers to the smallest and largest elements of the tree, then the cost of deleting those elements is amortized O(1), meaning that any series of d deletions will take time O(d), though individual deletions may take longer than this. The cost of insertion is O(log n), which is as good as possible: otherwise you could sort n items in less than O(n log n) time with your data structure.
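To make that lower bound concrete, here is a sketch using heapq as a stand-in for the hypothetical structure: n inserts followed by n pop-smallest calls is a comparison sort, so the inserts cannot all run faster than O(log n):

    import heapq

    def sort_via_structure(items):
        # n inserts plus n pop-smallest calls is a comparison sort, so if
        # inserts beat O(log n) while pops stay O(1), we could sort in
        # o(n log n) comparisons, which comparison sorting cannot do.
        h = []
        for x in items:
            heapq.heappush(h, x)                  # the O(log n) insert
        return [heapq.heappop(h) for _ in items]  # pop smallest, n times

    print(sort_via_structure([3, 1, 4, 1, 5]))    # [1, 1, 3, 4, 5]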
As for libraries that implement this, I'm not sure.
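One candidate worth evaluating is SortedList from the third-party sortedcontainers package; its documented add and pop complexities are amortized O(log n) rather than the O(1) pops asked for, but pops at either end are cheap in practice. A quick sketch:

    from sortedcontainers import SortedList  # third-party: pip install sortedcontainers

    sl = SortedList([5, 1, 9, 3])
    smallest = sl.pop(0)     # pop smallest element
    largest = sl.pop(-1)     # pop largest element
    sl.add(7)                # insertion keeps the list sorted
    print(smallest, largest, list(sl))  # 1 9 [3, 5, 7]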
Related
I want to pop the first (or the nth) element of a list (or deque). My friend told me that deque operations are O(1). I am not talking about popping the last element; I am talking about popping the first or the nth element.
Yes, removing the first or the last element (as well as inserting at either end) is O(1).
You can see more information about that here; quoting from there:
The complexity (efficiency) of common operations on deques is as follows:
Random access - constant O(1)
Insertion or removal of elements at the end or beginning - constant O(1)
Insertion or removal of elements [in the middle] - linear O(n)
Popping the first element is O(1), but popping the nth element is not: for collections.deque, indexed access is O(1) at both ends but slows to O(n) in the middle.
As written in the Python documentation:
https://docs.python.org/3/library/collections.html#deque-objects
Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction.
It is possible to build a data structure that can pop the nth element in constant time, but the space complexity will not be satisfactory.
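A quick illustration with collections.deque (the O(n) cost of access away from the ends is documented behavior):

    from collections import deque

    d = deque([10, 20, 30, 40, 50])
    first = d.popleft()    # O(1): pop from the left end
    last = d.pop()         # O(1): pop from the right end
    middle = d[1]          # fast near the ends, but O(n) toward the middle
    print(first, last, middle)  # 10 50 30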
I am having trouble finding the Big-O runtime of this. It builds a heap by calling the insert function to insert the elements one at a time.
buildHeap(A)
    h = new empty heap
    for each element e in A
        h.insert(e)
What is the Big-O runtime of this version of buildHeap?
Written this way, for a typical binary heap, it would be O(n log n): you're inserting one element at a time, and each insertion is O(log n). There are optimized ways to build a heap from an array of n elements all at once in O(n) time (referred to as the "heapify" operation), but that is not done by repeated single-element insertions.
The big-O could change depending on the type of heap; some variant heap designs have O(1) insertion, though of course they come with other trade-offs that differ by type, e.g. memory fragmentation, implementation complexity, higher constant costs per operation, etc.
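With Python's heapq module, the two build strategies look like this (heapify is documented to run in linear time):

    import heapq

    data = [5, 3, 8, 1, 9, 2]

    # O(n log n): build by repeated insertion, one heappush per element
    h1 = []
    for e in data:
        heapq.heappush(h1, e)

    # O(n): heapify transforms an existing list into a heap in place
    h2 = list(data)
    heapq.heapify(h2)

    print(h1[0], h2[0])  # both print 1, the smallest element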
What is the time complexity of operations in the SortedList implementation of the sortedcontainers module?
As I understand it, the underlying data structure is an array-based list. So does insertion take O(n) time, since the index can be found in O(log n) but inserting the element at the correct location is O(n)?
Similarly, popping an element from an index must be O(n) as well.
Insert, remove, get index, bisect left/right, and finding an element in the list are all O(log n) (amortized) operations. It's similar to TreeSet in Java and multiset in C++, which are typically implemented with an AVL or red-black tree; sortedcontainers itself uses a list of sorted sublists.
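A short demonstration with sortedcontainers (the complexities are amortized because SortedList internally keeps a list of sorted sublists rather than a tree):

    from sortedcontainers import SortedList  # third-party: pip install sortedcontainers

    sl = SortedList([4, 1, 7, 3])
    sl.add(5)              # amortized O(log n) insert
    i = sl.index(4)        # O(log n) position lookup
    j = sl.bisect_left(6)  # O(log n) bisection
    found = 3 in sl        # O(log n) membership test
    sl.remove(7)           # amortized O(log n) removal
    print(list(sl), i, j, found)  # [1, 3, 4, 5] 2 4 True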
There's already a question regarding this, and the answer says that the asymptotic complexity is O(n). But I observed that if an unsorted list is converted into a set, the set can be printed out in sorted order, which would mean that at some point in the middle of these operations the list has been sorted. Then, since any comparison sort has a lower bound of Omega(n log n), the asymptotic complexity of this operation should also be Omega(n log n). So what exactly is the complexity of this operation?
A set in Python is an unordered collection, so any order you see is coincidental (small integers can come out looking sorted because in CPython they hash to themselves). As both dict and set are implemented as hash tables in CPython, insertion is average case O(1) and worst case O(N).
So list(set(...)) is always O(N) and set(list(...)) is average case O(N).
You can browse the source code for set here.
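A small demonstration: with CPython's default hash randomization, the iteration order of a set of strings can change from run to run, while small integers may come out looking sorted because they hash to themselves:

    data = [3, 1, 2, 1, 3]

    s = set(data)    # average O(n): each insertion is average O(1)
    back = list(s)   # O(n): walk the hash table once
    print(back)      # e.g. [1, 2, 3], since small ints hash to themselves
    print(list({"pear", "apple", "plum"}))  # arbitrary order; varies across runs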
I was studying hash tables and a thought came to me:
Why not use a dictionary to search for an element instead of first sorting the list and then doing binary search? (Assume that I want to search multiple times.)
We can convert a list to a dictionary in O(n) (I think) time because we have to go through all the elements.
We add each of those elements to the dictionary, and each insertion takes O(1) average time.
Once the dictionary is ready, we can search for any element in O(1) time on average, with O(n) as the worst case.
Now, in the average case, O(n) is better than the sorting approach, because sorting alone takes at best O(n log n). And if I am right about all of this, why not do it this way?
I know there are various other things you can do with sorted elements that cannot be done with an unsorted dictionary or array. But if we stick only to searching, is this not a better way than sorting first and then searching?
Right, a well-designed hash table can beat sorting and searching.
For a proper choice, many factors come into play, such as any in-place requirement, the dynamism of the data set, the ratio of searches to insertions/deletions, and the ease of building an effective hash function...
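To make the comparison concrete, here is a sketch of the two approaches to repeated membership queries:

    from bisect import bisect_left

    items = [17, 4, 42, 8, 23]

    # Approach 1: sort once (O(n log n)), then O(log n) binary searches
    sorted_items = sorted(items)

    def binary_contains(x):
        i = bisect_left(sorted_items, x)
        return i < len(sorted_items) and sorted_items[i] == x

    # Approach 2: hash once (average O(n)), then average O(1) lookups
    lookup = set(items)

    print(binary_contains(23), 23 in lookup)  # True True
    print(binary_contains(5), 5 in lookup)    # False False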
Binary search is a searching technique that exploits the fact that the list of keys to be searched is already sorted; it doesn't require you to sort and then search, which makes its worst-case search time O(log n).
If you do not have a sorted list of keys and want to search for a key, then you will have to fall back on linear search, which in the worst case runs in O(n); there is no need to sort and then search, which is definitely slower, since the best known comparison sorts take O(n log n) time.
Building a dictionary from a list of keys and then performing a single lookup is of no advantage here, because a linear search yields the same or better performance without the auxiliary memory a dictionary needs. However, if you have multiple lookups and the key space is small, a dictionary can be advantageous: building it is one-time O(n) work, and each subsequent lookup is O(1) on average, at the expense of the memory used by the dictionary.
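In code, the trade-off described above looks roughly like this:

    # Single lookup: a linear scan is O(n) and needs no extra memory
    def contains_once(items, x):
        return any(e == x for e in items)

    items = list(range(1000))
    print(contains_once(items, 500))         # True, but O(n) every time

    # Many lookups: pay O(n) once to build a set, then average O(1) per query
    lookup = set(items)                      # one-time build, O(n) extra memory
    queries = [3, 999, 1500]
    print([q in lookup for q in queries])    # [True, True, False]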