MongoDB - using numericOrdering in an aggregation makes the query slow - Python

I have an issue: when I run my query normally, everything works perfectly, but when I add numericOrdering it does not.
aggregation = [
    {'$match': {'store': 'fourcom', 'closed': 0}},
    {'$lookup': {
        'from': 'ordre_open',
        'localField': 'reference_number',
        'foreignField': 'order_number',
        'as': 'ordre_open'
    }},
    {'$lookup': {
        'from': 'ordre_backend_open',
        'localField': 'reference_number',
        'foreignField': 'ordre-id',
        'as': 'ordre_backend'
    }},
    {'$project': {
        '_id': 1,
        'store': 1,
        'order_number': 1,
        'document_number': 1,
        'reference_number': 1,
        'document_date': 1,
        'invoice_group': 1,
        'account': 1,
        'name': 1,
        'our_ref': 1,
        'your_ref': 1,
        'payment': 1,
        'vat': 1,
        'total_price': 1,
        'currency': 1,
        'department': 1,
        'closed': 1,
        'deleted': 1,
        'arch': 1,
        'locations': 1,
        'lines.price': 1,
        'lines.qty': 1,
        'ordre_open._id': 1,
        'ordre_open.order_number': 1,
        'ordre_open.closed': 1,
        'ordre_open.deleted': 1,
        'ordre_open.type': 1,
        'ordre_open.arch': 1,
        'ordre_open.fcomputer_synced': 1,
        'ordre_backend.ordre-id': 1,
        'ordre_backend.error': 1,
        'ordre_backend.done': 1,
        'ordre_backend.cancel': 1
    }},
    {'$sort': {'order_number': 1}},
    {'$skip': 0},
    {'$limit': 1000}
]
mongodb_limit = 1000
resualt = db.command('aggregate', 'ordre_purchase_open', pipeline=aggregation, explain=False, cursor={
    'batchSize': mongodb_limit
})
But when I decide to sort the way humans do and add
collation={
    'numericOrdering': True,
    'locale': "da"
}
it goes completely wrong: my query slows down and does not work as I expect. I have used numericOrdering many times before, but this is the first time I am using it with db.command and aggregate.
I hope someone can explain what I am doing wrong here?
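One thing worth checking, offered as a sketch rather than a confirmed diagnosis: string comparisons made under a collation can only use an index that was created with the same collation, and in the pipeline above the $sort runs after two $lookup stages. Neither $lookup changes the number of documents, and order_number is a field of the root document, so the sort and paging stages can be moved directly after $match without changing the result. The collation is then passed as a top-level field of the aggregate command. The helper name build_aggregate_command is hypothetical, and the $project stage is omitted here only for brevity.

```python
def build_aggregate_command(collection, pipeline, collation, batch_size):
    """Build the document for the aggregate database command (hypothetical helper)."""
    return {
        'aggregate': collection,   # the command name has to be the first key
        'pipeline': pipeline,
        'collation': collation,    # used for string comparisons in the whole operation
        'cursor': {'batchSize': batch_size},
    }


# Sort and page first, so the lookups only run for the documents that are returned.
pipeline = [
    {'$match': {'store': 'fourcom', 'closed': 0}},
    {'$sort': {'order_number': 1}},
    {'$skip': 0},
    {'$limit': 1000},
    {'$lookup': {
        'from': 'ordre_open',
        'localField': 'reference_number',
        'foreignField': 'order_number',
        'as': 'ordre_open'
    }},
    {'$lookup': {
        'from': 'ordre_backend_open',
        'localField': 'reference_number',
        'foreignField': 'ordre-id',
        'as': 'ordre_backend'
    }},
    # the original $project stage would follow here unchanged
]

command = build_aggregate_command(
    'ordre_purchase_open',
    pipeline,
    {'locale': 'da', 'numericOrdering': True},
    1000,
)

# With a live pymongo Database object this would be sent as:
# resualt = db.command(command)
```

If the query is still slow, comparing the explain output with and without the collation should show whether an index is being used for the $match and $sort stages.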

Related

Create a nested dictionary for every distinct words in a list

I have a nested list of words. I want to build a dictionary whose keys are the distinct words and whose values are dictionaries that map each related word (a word appearing in the same inner list) to the number of times it appears. For example, given
sentences = [["i", "am", "a", "sick", "man"],
["i", "am", "a", "spiteful", "man"],
["i", "am", "an", "unattractive", "man"],
["i", "believe", "my", "liver", "is", "diseased"],
["however", "i", "know", "nothing", "at", "all", "about", "my",
"disease", "and", "do", "not", "know", "for", "certain", "what", "ails", "me"]]
part of the dictionary returned would be:
{ "man": {"i": 3, "am": 3, "a": 2, "sick": 1, "spiteful": 1, "an": 1, "unattractive": 1}, "liver": {"i": 1, "believe": 1, "my": 1, "is": 1, "diseased": 1}...}
with as many keys as there are distinct words in the passage.
I've tried this:
d = {}
for row in sentences:
    for word in row:
        if word not in d:
            d[word] = 1
        else:
            d[word] += 1
But this only counts the words. How could I use d as a value in another dictionary?
from collections import defaultdict
data = {}
for sentence in sentences:
    for word in sentence:
        data[word] = defaultdict(lambda: 0)
for sentence in sentences:
    length = len(sentence)
    for index1, word1 in enumerate(sentence):
        for num in range(0, length - 1):
            index2 = (index1 + 1 + num) % length
            word2 = sentence[index2]
            data[word1][word2] += 1
print(data)
sentences = [["i", "am", "a", "sick", "man"],
             ["i", "am", "a", "spiteful", "man"],
             ["i", "am", "an", "unattractive", "man"],
             ["i", "believe", "my", "liver", "is", "diseased"],
             ["however", "i", "know", "nothing", "at", "all", "about", "my",
              "disease", "and", "do", "not", "know", "for", "certain", "what", "ails", "me"]]
# "as many keys as there are distinct words in the passage"
# Well then we need to start by finding the distinct words.
# sets always help for this.
# first we flatten the list. If you don't know what this is doing,
# search "flatten nested list Python". This is a common pattern:
flat_list = [term for group in sentences for term in group]
# now use set to find distinct words
distinct_words = set(flat_list)
# variable for final dictionary
result = {}
# define this function first. See invocation below
def find_related_counts(word):
    # a nice way to do counts is with
    # setdefault. If the term has already
    # been counted, then it just increments.
    # otherwise, it will create the key and
    # initialise it to the default
    related_counts = {}
    for group in sentences:
        # is "word" related to the terms in this group?
        if word in group:
            # yes it is! add the other terms:
            for other in group:
                # except, presumably, the word itself
                if other != word:
                    related_counts.setdefault(other, 0)
                    related_counts[other] += 1
    return related_counts
# for each word we have a key, and must find the value
for word in distinct_words:
    # when dealing with nested anythings, it helps to
    # make a function, so you don't have so much
    # nesting in one place and separate things out
    # nicely instead
    value = find_related_counts(word)
    result[word] = value
print(result)
print(result["man"])
OUTPUT:
{'spiteful': {'i': 1, 'am': 1, 'a': 1, 'man': 1}, 'and': {'however': 1, 'i': 1, 'know': 2, 'nothing': 1, 'at': 1, 'all': 1, 'about': 1, 'my': 1, 'disease': 1, 'do': 1, 'not': 1, 'for': 1, 'certain': 1, 'what': 1, 'ails': 1, 'me': 1}, 'unattractive': {'i': 1, 'am': 1, 'an': 1, 'man': 1}, 'nothing': {'however': 1, 'i': 1, 'know': 2, 'at': 1, 'all': 1, 'about': 1, 'my': 1, 'disease': 1, 'and': 1, 'do': 1, 'not': 1, 'for': 1, 'certain': 1, 'what': 1, 'ails': 1, 'me': 1}, 'diseased': {'i': 1, 'believe': 1, 'my': 1, 'liver': 1, 'is': 1}, 'sick': {'i': 1, 'am': 1, 'a': 1, 'man': 1}, 'man': {'i': 3, 'am': 3, 'a': 2, 'sick': 1, 'spiteful': 1, 'an': 1, 'unattractive': 1}, 'do': {'however': 1, 'i': 1, 'know': 2, 'nothing': 1, 'at': 1, 'all': 1, 'about': 1, 'my': 1, 'disease': 1, 'and': 1, 'not': 1, 'for': 1, 'certain': 1, 'what': 1, 'ails': 1, 'me': 1}, 'believe': {'i': 1, 'my': 1, 'liver': 1, 'is': 1, 'diseased': 1}, 'i': {'am': 3, 'a': 2, 'sick': 1, 'man': 3, 'spiteful': 1, 'an': 1, 'unattractive': 1, 'believe': 1, 'my': 2, 'liver': 1, 'is': 1, 'diseased': 1, 'however': 1, 'know': 2, 'nothing': 1, 'at': 1, 'all': 1, 'about': 1, 'disease': 1, 'and': 1, 'do': 1, 'not': 1, 'for': 1, 'certain': 1, 'what': 1, 'ails': 1, 'me': 1}, 'certain': {'however': 1, 'i': 1, 'know': 2, 'nothing': 1, 'at': 1, 'all': 1, 'about': 1, 'my': 1, 'disease': 1, 'and': 1, 'do': 1, 'not': 1, 'for': 1, 'what': 1, 'ails': 1, 'me': 1}, 'an': {'i': 1, 'am': 1, 'unattractive': 1, 'man': 1}, 'my': {'i': 2, 'believe': 1, 'liver': 1, 'is': 1, 'diseased': 1, 'however': 1, 'know': 2, 'nothing': 1, 'at': 1, 'all': 1, 'about': 1, 'disease': 1, 'and': 1, 'do': 1, 'not': 1, 'for': 1, 'certain': 1, 'what': 1, 'ails': 1, 'me': 1}, 'a': {'i': 2, 'am': 2, 'sick': 1, 'man': 2, 'spiteful': 1}, 'am': {'i': 3, 'a': 2, 'sick': 1, 'man': 3, 'spiteful': 1, 'an': 1, 'unattractive': 1}, 'however': {'i': 1, 'know': 2, 'nothing': 1, 'at': 1, 'all': 1, 'about': 1, 'my': 1, 'disease': 1, 'and': 1, 'do': 1, 'not': 1, 'for': 1, 
'certain': 1, 'what': 1, 'ails': 1, 'me': 1}, 'about': {'however': 1, 'i': 1, 'know': 2, 'nothing': 1, 'at': 1, 'all': 1, 'my': 1, 'disease': 1, 'and': 1, 'do': 1, 'not': 1, 'for': 1, 'certain': 1, 'what': 1, 'ails': 1, 'me': 1}, 'not': {'however': 1, 'i': 1, 'know': 2, 'nothing': 1, 'at': 1, 'all': 1, 'about': 1, 'my': 1, 'disease': 1, 'and': 1, 'do': 1, 'for': 1, 'certain': 1, 'what': 1, 'ails': 1, 'me': 1}, 'for': {'however': 1, 'i': 1, 'know': 2, 'nothing': 1, 'at': 1, 'all': 1, 'about': 1, 'my': 1, 'disease': 1, 'and': 1, 'do': 1, 'not': 1, 'certain': 1, 'what': 1, 'ails': 1, 'me': 1}, 'liver': {'i': 1, 'believe': 1, 'my': 1, 'is': 1, 'diseased': 1}, 'know': {'however': 1, 'i': 1, 'nothing': 1, 'at': 1, 'all': 1, 'about': 1, 'my': 1, 'disease': 1, 'and': 1, 'do': 1, 'not': 1, 'for': 1, 'certain': 1, 'what': 1, 'ails': 1, 'me': 1}, 'at': {'however': 1, 'i': 1, 'know': 2, 'nothing': 1, 'all': 1, 'about': 1, 'my': 1, 'disease': 1, 'and': 1, 'do': 1, 'not': 1, 'for': 1, 'certain': 1, 'what': 1, 'ails': 1, 'me': 1}, 'all': {'however': 1, 'i': 1, 'know': 2, 'nothing': 1, 'at': 1, 'about': 1, 'my': 1, 'disease': 1, 'and': 1, 'do': 1, 'not': 1, 'for': 1, 'certain': 1, 'what': 1, 'ails': 1, 'me': 1}, 'disease': {'however': 1, 'i': 1, 'know': 2, 'nothing': 1, 'at': 1, 'all': 1, 'about': 1, 'my': 1, 'and': 1, 'do': 1, 'not': 1, 'for': 1, 'certain': 1, 'what': 1, 'ails': 1, 'me': 1}, 'ails': {'however': 1, 'i': 1, 'know': 2, 'nothing': 1, 'at': 1, 'all': 1, 'about': 1, 'my': 1, 'disease': 1, 'and': 1, 'do': 1, 'not': 1, 'for': 1, 'certain': 1, 'what': 1, 'me': 1}, 'me': {'however': 1, 'i': 1, 'know': 2, 'nothing': 1, 'at': 1, 'all': 1, 'about': 1, 'my': 1, 'disease': 1, 'and': 1, 'do': 1, 'not': 1, 'for': 1, 'certain': 1, 'what': 1, 'ails': 1}, 'what': {'however': 1, 'i': 1, 'know': 2, 'nothing': 1, 'at': 1, 'all': 1, 'about': 1, 'my': 1, 'disease': 1, 'and': 1, 'do': 1, 'not': 1, 'for': 1, 'certain': 1, 'ails': 1, 'me': 1}, 'is': {'i': 1, 'believe': 1, 'my': 1, 'liver': 
1, 'diseased': 1}}
{'i': 3, 'am': 3, 'a': 2, 'sick': 1, 'spiteful': 1, 'an': 1, 'unattractive': 1}
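As a more compact alternative, here is a minimal sketch that uses collections.Counter to count each sentence once and then credits those counts to every distinct word in that sentence. It is intended to give the same counts as the function-based answer above (a word is never counted as related to itself). The function name related_word_counts is an illustrative choice.

```python
from collections import Counter, defaultdict


def related_word_counts(sentences):
    """Map each distinct word to {other word: occurrences in shared sentences}."""
    related = defaultdict(Counter)
    for sentence in sentences:
        counts = Counter(sentence)          # occurrences of each word in this sentence
        for word in counts:                 # every distinct word in the sentence
            bucket = related[word]          # creates the key even if nothing is related
            for other, n in counts.items():
                if other != word:           # skip the word itself
                    bucket[other] += n
    # convert to plain dictionaries for printing and comparison
    return {word: dict(counter) for word, counter in related.items()}


sentences = [["i", "am", "a", "sick", "man"],
             ["i", "am", "a", "spiteful", "man"],
             ["i", "am", "an", "unattractive", "man"],
             ["i", "believe", "my", "liver", "is", "diseased"],
             ["however", "i", "know", "nothing", "at", "all", "about", "my",
              "disease", "and", "do", "not", "know", "for", "certain", "what", "ails", "me"]]

result = related_word_counts(sentences)
print(result["man"])
```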

list.append copies the last item only

This might end up being a very silly question, but as a newbie in Python I am not able to find a good solution to the following problem.
import numpy as np
import pandas as pd

class Preprocessor:
    mPath = None
    df = None
    def __init__(self, path):
        self.mPath = path
    def read(self):
        self.df = pd.read_csv(self.mPath)
        return self.df
    def __findUniqueGenres(self):
        setOfGenres = set()
        for index, genre in self.df['genres'].iteritems():
            listOfGenreInMovie = genre.lower().split("|")
            for i, _genre in np.ndenumerate(listOfGenreInMovie):
                setOfGenres.add(_genre)
        return setOfGenres
    def __prepareDataframe(self, genres):
        all_columns = set(["title", "movieId"]).union(genres)
        _df = pd.DataFrame(columns=all_columns)
        return _df
    def __getRowTemplate(self, listOfColumns):
        _rowTemplate = {}
        for col in listOfColumns:
            _rowTemplate[col] = 0
        return _rowTemplate
    def __createRow(self, rowTemplate, row):
        rowTemplate['title'] = row.title
        rowTemplate['movieId'] = row.movieId
        movieGenres = row.genres.lower().split("|")
        for movieGenre in movieGenres:
            rowTemplate[movieGenre] = 1
        return rowTemplate
    def tranformDataFrame(self):
        genres = self.__findUniqueGenres()
        print('### List of genres...', genres)
        __df = self.__prepareDataframe(genres)  # Data frame with all required columns.
        rowTemplate = self.__getRowTemplate(__df.columns)
        print('### Row template looks like -->', rowTemplate)
        collection = []
        for index, row in self.df.iterrows():
            _rowToAdd = self.__createRow(rowTemplate, row)
            print('### Row looks like', _rowToAdd)
            collection.append(_rowToAdd)
        print('### Collection looks like', collection)
        return __df.append(collection)
Here, when I try to append _rowToAdd to collection, I end up with a collection that contains only copies of the last item (the last row of self.df).
Below are the logs (self.df has 3 rows here):
### List of genres... {'mystery', 'horror', 'comedy', 'drama', 'thriller', 'children', 'adventure'}
### Row template looks like --> {'title': 0, 'horror': 0, 'comedy': 0, 'drama': 0, 'children': 0, 'mystery': 0, 'movieId': 0, 'thriller': 0, 'adventure': 0}
### Row looks like {'title': 'Big Night (1996)', 'horror': 0, 'comedy': 1, 'drama': 1, 'children': 0, 'mystery': 0, 'movieId': 994, 'thriller': 0, 'adventure': 0}
### Row looks like {'title': 'Grudge, The (2004)', 'horror': 1, 'comedy': 1, 'drama': 1, 'children': 0, 'mystery': 1, 'movieId': 8947, 'thriller': 1, 'adventure': 0}
### Row looks like {'title': 'Cheetah (1989)', 'horror': 1, 'comedy': 1, 'drama': 1, 'children': 1, 'mystery': 1, 'movieId': 2039, 'thriller': 1, 'adventure': 1}
### Collection looks like [{'title': 'Cheetah (1989)', 'horror': 1, 'comedy': 1, 'drama': 1, 'children': 1, 'mystery': 1, 'movieId': 2039, 'thriller': 1, 'adventure': 1}, {'title': 'Cheetah (1989)', 'horror': 1, 'comedy': 1, 'drama': 1, 'children': 1, 'mystery': 1, 'movieId': 2039, 'thriller': 1, 'adventure': 1}, {'title': 'Cheetah (1989)', 'horror': 1, 'comedy': 1, 'drama': 1, 'children': 1, 'mystery': 1, 'movieId': 2039, 'thriller': 1, 'adventure': 1}]
I want my collection to look like
### [
{'title': 'Big Night (1996)', 'horror': 0, 'comedy': 1, 'drama': 1, 'children': 0, 'mystery': 0, 'movieId': 994, 'thriller': 0, 'adventure': 0},
{'title': 'Grudge, The (2004)', 'horror': 1, 'comedy': 0, 'drama': 0, 'children': 0, 'mystery': 1, 'movieId': 8947, 'thriller': 1, 'adventure': 0},
{'title': 'Cheetah (1989)', 'horror': 0, 'comedy': 0, 'drama': 0, 'children': 1, 'mystery': 0, 'movieId': 2039, 'thriller': 0, 'adventure': 1}
]
Dataset - https://grouplens.org/datasets/movielens/
I understand the issue now: I was mutating the same dictionary object on every iteration, so every entry in collection referred to that one object.
def tranformDataFrame(self):
    genres = self.__findUniqueGenres()
    print('### List of genres...', genres)
    __df = self.__prepareDataframe(genres)  # Data frame with all required columns.
    rowTemplate = self.__getRowTemplate(__df.columns)
    print('### Row template looks like -->', rowTemplate)
    collection = []
    for index, row in self.df.iterrows():
        # Creating a fresh row template on every iteration prevents the shared mutation.
        _rowToAdd = self.__createRow(self.__getRowTemplate(__df.columns), row)
        print('### Row looks like', _rowToAdd)
        collection.append(_rowToAdd)
    print('### Collection looks like', collection)
    return __df.append(collection)
There must be some way to cache the template and clone it each time (instead of running the logic that rebuilds the dictionary), but this solution resolves this particular issue at least.
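One way to cache the template and clone it, shown as a minimal sketch without pandas: build the template once and call dict.copy() for each row. A shallow copy is enough here because every value in the template is an integer. The function name make_row and the tuple-based input are assumptions made for the illustration; the genre values are taken from the desired output above.

```python
def make_row(template, title, movie_id, genres):
    """Return a new row dict based on template; the template itself is not modified."""
    row = template.copy()        # fresh dictionary for every row
    row['title'] = title
    row['movieId'] = movie_id
    for genre in genres.lower().split("|"):
        row[genre] = 1
    return row


columns = ['title', 'movieId', 'horror', 'comedy', 'drama', 'children',
           'mystery', 'thriller', 'adventure']
template = dict.fromkeys(columns, 0)     # built once, reused for every row

movies = [
    ('Big Night (1996)', 994, 'Comedy|Drama'),
    ('Grudge, The (2004)', 8947, 'Horror|Mystery|Thriller'),
    ('Cheetah (1989)', 2039, 'Adventure|Children'),
]

collection = [make_row(template, *movie) for movie in movies]
for row in collection:
    print(row)
```

Each entry in collection is now a separate object, so setting a genre flag for one movie no longer leaks into the others.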

pandas - pd.replace and TypeError

I have all_data dataframe. I want to replace some categorical values in certain columns with numerical values. I'm trying to use this nested dictionary notation (I've checked that the brackets and curly brackets are in place, I don't think that's the issue):
all_data = all_data.replace({'Street': {'Pave': 1, 'Grvl': 0}},
{'LotShape': {'IR3': 1, 'IR2': 2, 'IR1': 3, 'Reg': 4}},
{'Utilities': {'ELO': 0, 'NoSeWa': 0, 'NoSewr': 0, 'AllPub': 1}},
{'LandSlope': {'Sev': 1, 'Mod': 2, 'Gtl': 3}},
{'ExterQual': {'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4, 'Ex': 5}},
{'ExterCond': {'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4, 'Ex': 5}},
{'BsmtQual': {'NA': 0, 'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4,'Ex': 5}},
{'BsmtCond': {'NA': 0, 'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4,'Ex': 5}},
{'BsmtExposure': {'NA': 0, 'No': 1, 'Mn': 2, 'Av': 3, 'Gd': 4}},
{'BsmtFinType1': {'NA': 0, 'Unf': 1, 'LwQ': 2, 'Rec': 3, 'BLQ': 4, 'ALQ': 5, 'GLQ': 6}},
{'BsmtFinType2': {'NA': 0, 'Unf': 1,'LwQ': 2,'Rec': 3, 'BLQ': 4,'ALQ': 5, 'GLQ': 6}},
{'HeatingQC': {'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'CentralAir': {'No': 0,'Yes': 1}},
{'KitchenQual': {'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'Functional': {'Sal': -7,'Sev': -6,'Maj1': -5,'Maj2': -4,'Mod': -3,'Min2': -2,'Min1': -1,
'Typ': 0}},
{'FireplaceQu': {'NA': 0,'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'GarageFinish': {'NA': 0,'Unf': 1,'RFn': 2, 'Fin': 3}},
{'GarageQual': {'NA': 0, 'Po': 1,'Fa': 2, 'TA': 3,'Gd': 4, 'Ex': 5}},
{'GarageCond': {'NA': 0,'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'PavedDrive': {'N': 0,'P': 0, 'Y': 1}},
{'Fence': {'NA': 0, 'MnWw': 1,'GdWo': 2,'MnPrv': 3,'GdPrv': 4}},
{'SaleCondition': {'Abnorml': 1, 'Alloca': 1, 'AdjLand': 1, 'Family': 1, 'Normal': 0,
'Partial': 0}}
)
Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-40-f9c9c28b7237> in <module>()
22 {'Fence': {'NA': 0, 'MnWw': 1,'GdWo': 2,'MnPrv': 3,'GdPrv': 4}},
23 {'SaleCondition': {'Abnorml': 1, 'Alloca': 1, 'AdjLand': 1, 'Family': 1, 'Normal': 0,
---> 24 'Partial': 0}}
25 )
TypeError: replace() takes from 1 to 8 positional arguments but 23 were given
If I remove the 'SaleCondition' row from the above code, the error is again there but this time referring to 'Fence', and so on, for each line of code from bottom up. I've googled but have no idea what this means. Help MUCH appreciated.
You should do something like this:
df.replace({'Fence': {'NA': 0, 'MnWw': 1, 'GdWo': 2, 'MnPrv': 3, 'GdPrv': 4},
            'SaleCondition': {'Abnorml': 1, 'Alloca': 1, 'AdjLand': 1, 'Family': 1, 'Normal': 0,
                              'Partial': 0}})
The format should be .replace({'col1':{},'col2':{}}), not .replace({'col1':{}},{'col2':{}}). Each extra dictionary is passed as a separate positional argument, which is why the error counts 23 arguments.
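If the per-column mappings already exist as separate dictionaries, they can be merged into the single nested dictionary that replace expects. The sketch below does the merge in plain Python and uses only three of the columns from the question; the names mappings and combined are illustrative, and the pandas call is shown as a comment because it needs the all_data frame.

```python
# One single-key dictionary per column, as in the question
mappings = [
    {'Street': {'Pave': 1, 'Grvl': 0}},
    {'LandSlope': {'Sev': 1, 'Mod': 2, 'Gtl': 3}},
    {'CentralAir': {'No': 0, 'Yes': 1}},
]

# Merge them into one {column: {old value: new value}} dictionary
combined = {}
for mapping in mappings:
    combined.update(mapping)

print(combined)

# The merged dictionary is then passed as ONE argument:
# all_data = all_data.replace(combined)
```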

Adding node elements to json object in Python from NetworkX

I have a json object that I made using networkx:
json_data = json_graph.node_link_data(network_object)
It is structured like this (mini version of my output):
>>> json_data
{'directed': False,
'graph': {'name': 'compose( , )'},
'links': [{'source': 0, 'target': 7, 'weight': 1},
{'source': 0, 'target': 2, 'weight': 1},
{'source': 0, 'target': 12, 'weight': 1},
{'source': 0, 'target': 9, 'weight': 1},
{'source': 2, 'target': 18, 'weight': 25},
{'source': 17, 'target': 25, 'weight': 1},
{'source': 29, 'target': 18, 'weight': 1},
{'source': 30, 'target': 18, 'weight': 1}],
'multigraph': False,
'nodes': [{'bipartite': 1, 'id': 'Icarus', 'node_type': 'Journal'},
{'bipartite': 1,
'id': 'A Giant Step: from Milli- to Micro-arcsecond Astrometry',
'node_type': 'Journal'},
{'bipartite': 1,
'id': 'The Astrophysical Journal Supplement Series',
'node_type': 'Journal'},
{'bipartite': 1,
'id': 'Astronomy and Astrophysics Supplement Series',
'node_type': 'Journal'},
{'bipartite': 1, 'id': 'Astronomy and Astrophysics', 'node_type': 'Journal'},
{'bipartite': 1,
'id': 'Astronomy and Astrophysics Review',
'node_type': 'Journal'}]}
What I want to do is add the following elements to each of the nodes so I can use this data as an input for sigma.js:
"x": 0,
"y": 0,
"size": 3,
"centrality": 0
I can't seem to find an efficient way to do this though using add_node(). Is there some obvious way to add this that I'm missing?
While you have your data as a networkx graph, you could use the set_node_attributes method to add the attributes (e.g. stored in a python dictionary) to all the nodes in the graph.
In my example the new attributes are stored in the dictionary attr:
import networkx as nx
from networkx.readwrite import json_graph
# example graph
G = nx.Graph()
G.add_nodes_from(["a", "b", "c", "d"])
# your data
#G = json_graph.node_link_graph(json_data)
# dictionary of new attributes
attr = {"x": 0,
"y": 0,
"size": 3,
"centrality": 0}
for name, value in attr.items():
    # networkx 2.x expects (G, values, name); networkx 1.x used (G, name, values)
    nx.set_node_attributes(G, value, name)
# check new node attributes
print(G.nodes(data=True))
You can then export the new graph in JSON with node_link_data.
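If the graph has already been exported, another option is to work on the node-link dictionary directly, since each entry of json_data['nodes'] is an ordinary Python dictionary. This is a minimal sketch that assumes the structure shown in the question; the sample below is a shortened copy of that output.

```python
# Shortened copy of the node-link structure from the question
json_data = {
    'directed': False,
    'multigraph': False,
    'links': [{'source': 0, 'target': 2, 'weight': 1}],
    'nodes': [
        {'bipartite': 1, 'id': 'Icarus', 'node_type': 'Journal'},
        {'bipartite': 1, 'id': 'Astronomy and Astrophysics', 'node_type': 'Journal'},
        {'bipartite': 1, 'id': 'Astronomy and Astrophysics Review', 'node_type': 'Journal'},
    ],
}

# Attributes that sigma.js needs on every node
attr = {"x": 0, "y": 0, "size": 3, "centrality": 0}

for node in json_data['nodes']:
    node.update(attr)   # adds the four keys to this node's dictionary

print(json_data['nodes'][0])
```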

cx_Oracle ignores order by clause

I've created a complex query builder in my project, and during tests I stumbled upon a strange issue: the same query with the same plan produces different results on different clients. cx_Oracle ignores the order by clause, while Oracle SQL Developer processes the query correctly, even though the order by is present in both plans.
Query in question is:
select *
from
(
    select
        a.*,
        ROWNUM tmp__rnum
    from
    (
        select base.*
        from
        (
            select id
            from
            (
                (
                    select
                        profile_id as id,
                        surname as sort__col
                    from names
                )
                /* here usually are several other subqueries chained by unions */
            )
            group by id
            order by min(sort__col) asc
        ) tmp
        left join (profiles) base
            on tmp.id = base.id
        where exists
        (
            select t.object_id
            from object_rights t
            where
                t.object_id = base.id
                and t.subject_id = :a__subject_id
                and t.rights in ('r','w')
        )
    ) a
    where ROWNUM < :rows_to
)
where tmp__rnum >= :rows_from
And the plan from cx_Oracle, in case I missed anything:
{'operation': 'SELECT STATEMENT', 'position': 9225, 'cardinality': 2164, 'time': 1, 'cost': 9225, 'depth': 0, 'bytes': 84396, 'optimizer': 'ALL_ROWS', 'id': 0, 'cpu_cost': 1983805801},
{'operation': 'VIEW', 'position': 1, 'filter_predicates': '"TMP__RNUM">=TO_NUMBER(:ROWS_FROM)', 'parent_id': 0, 'object_instance': 1, 'cardinality': 2164SEL$1', 'projection': '"from$_subquery$_001"."ID"[NUMBER,22], "from$_subquery$_001"."CREATION_TIME"[TIMESTAMP,11], "TMP__RNUM"[NUMBER,22]', 'time': 1, 'cost': 9225, 'depth': 1, 'bytes': 84396, 'id': 1, 'cpu_cost': 1983805801},
{'operation': 'COUNT', 'position': 1, 'filter_predicates': 'ROWNUM<TO_NUMBER(:ROWS_TO)', 'parent_id': 1, 'projection': '"BASE"."ID"[NUMBER,22], "BASE"."CREATION_TIME"[TIMESTAMP,11], ROWNUM[8]', 'options': 'STOPKEY', 'depth': 2, 'id': 2,
{'operation': 'HASH JOIN', 'position': 1, 'parent_id': 2, 'access_predicates': '"TMP"."ID"="BASE"."ID"', 'cardinality': 2164, 'projection': '(#keys=1) "BASE"."ID"[NUMBER,22], "BASE"."CREATION_TIME"[TIMESTAMP,11]', 'time': 1, 'cost': 9225, 'depth': 3, 'bytes': 86560, 'id': 3, 'cpu_cost': 1983805801},
{'operation': 'JOIN FILTER', 'position': 1, 'parent_id': 3, 'object_owner': 'SYS', 'cardinality': 2219, 'projection': '"BASE"."ID"[NUMBER,22], "BASE"."CREATION_TIME"[TIMESTAMP,11]', 'object_name': ':BF0000', 'time': 1, 'cost': 662, 'options': 'CREATE', 'depth': 4, 'bytes': 59913, 'id': 4, 'cpu_cost': 223290732},
{'operation': 'HASH JOIN', 'position': 1, 'parent_id': 4, 'access_predicates': '"T"."OBJECT_ID"="BASE"."ID"', 'cardinality': 2219, 'projection': '(#keys=1) "BASE"."ID"[NUMBER,22], "BASE"."CREATION_TIME"[TIMESTAMP,11]', 'time': 1, 'cost': 662, 'options': 'RIGHT SEMI', 'depth': 5, 'bytes': 59913, 'id': 5, 'cpu_cost': 223290732},
{'operation': 'TABLE ACCESS', 'position': 1, 'filter_predicates': '"T"."SUBJECT_ID"=TO_NUMBER(:A__SUBJECT_ID) AND ("T"."RIGHTS"=\'r\' OR "T"."RIGHTS"=\'w\')', 'parent_id': 5, 'object_type': 'TABLE', 'object_instance': 8, 'cardinality': 2219, 'projection': '"T"."OBJECT_ID"[NUMBER,22]', 'object_name': 'OBJECT_RIGHTS', 'time': 1, 'cost': 5, 'options': 'FULL', 'depth': 6, 'bytes': 24409, 'optimizer': 'ANALYZED', 'id': 6, 'cpu_cost': 1823386},
{'operation': 'TABLE ACCESS', 'position': 2, 'parent_id': 5, 'object_type': 'TABLE', 'object_instance': 6, 'cardinality': 753862, 'projection': '"BASE"."ID"[NUMBER,22], "BASE"."CREATION_TIME"[TIMESTAMP,11]', 'object_name': 'PROFILES', 'time': 1, 'cost': 654, 'options': 'FULL', 'depth': 6, 'bytes': 12061792, 'optimizer': 'ANALYZED', 'id': 7, 'cpu_cost': 145148296},
{'operation': 'VIEW', 'position': 2, 'parent_id': 3, 'object_instance': 3, 'cardinality': 735296, 'projection': '"TMP"."ID"[NUMBER,22]', 'time': 1, 'cost': 8559, 'depth': 4, 'bytes': 9558848, 'id': 8, 'cpu_cost': 1686052619},
{'operation': 'SORT', 'position': 1, 'parent_id': 8, 'cardinality': 735296, 'projection': '(#keys=1) MIN("SURNAME")[50], "PROFILE_ID"[NUMBER,22]', 'time': 1, 'cost': 8559, 'options': 'ORDER BY', 'temp_space': 18244000, 'depth': 5, 'bytes': 10294144, 'id': 9, 'cpu_cost': 1686052619},
{'operation': 'HASH', 'position': 1, 'parent_id': 9, 'cardinality': 735296, 'projection': '(#keys=1; rowset=200) "PROFILE_ID"[NUMBER,22], MIN("SURNAME")[50]', 'time': 1, 'cost': 8559, 'options': 'GROUP BY', 'temp_space': 18244000, 'depth': 6, 'bytes': 10294144, 'id': 10, 'cpu_cost': 1686052619},
{'operation': 'JOIN FILTER', 'position': 1, 'parent_id': 10, 'object_owner': 'SYS', 'cardinality': 756586, 'projection': '(rowset=200) "PROFILE_ID"[NUMBER,22], "SURNAME"[VARCHAR2,50]', 'object_name': ':BF0000', 'time': 1, 'cost': 1202, 'options': 'USE', 'depth': 7, 'bytes': 10592204, 'id': 11, 'cpu_cost': 190231639},
{'operation': 'TABLE ACCESS', 'position': 1, 'filter_predicates': 'SYS_OP_BLOOM_FILTER(:BF0000,"PROFILE_ID")', 'parent_id': 11, 'object_type': 'TABLE', 'object_instance': 5, 'cardinality': 756586, 'projection': '(rowset=200) "PROFILE_ID"[NUMBER,22], "SURNAME"[VARCHAR2,50]', 'object_name': 'NAMES', 'time': 1, 'cost': 1202, 'options': 'FULL', 'depth': 8, 'bytes': 10592204, 'optimizer': 'ANALYZED', 'id': 12, 'cpu_cost': 190231639}
cx_Oracle output (appears to be ordered by id):
ID, Created, rownum
(1829, 2016-08-24, 1)
(2438, 2016-08-24, 2)
SQLDeveloper Output (ordered by surname, as expected):
ID, Created, rownum
(518926, 2016-08-28, 1)
(565556, 2016-08-29, 2)
I don't see an ORDER BY clause that would affect the ordering of the results of the query. In SQL, the only way to guarantee the ordering of a result set is to have an ORDER BY clause for the outer-most SELECT.
In almost all cases, an ORDER BY in a subquery is not necessarily respected (Oracle makes an exception when there are rownum comparisons in the next level of the query -- and even that is now out of date with the support of FETCH FIRST <n> ROWS).
So, there is no reason to expect that an ORDER BY in the innermost subquery would have any effect, particularly with the JOIN that then happens.
Suggestions:
Move the ORDER BY to the outermost query.
Use FETCH FIRST syntax, if you are using Oracle 12c+.
Move the ORDER BY after the JOIN.
Use ROW_NUMBER() instead of rownum.
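The last two suggestions can be combined, as in the following sketch of the query from the question: the sort key is carried out of the grouped subquery, and ROW_NUMBER() numbers the rows after the join, so the paging no longer depends on the order of an inner subquery. This is an untested rewrite under the assumption that only the names subquery is involved; the constant name PAGED_QUERY, the bind values, and the extra base.id tie-breaker are illustrative.

```python
import re

# Rewritten query: ordering is applied after the join via ROW_NUMBER()
PAGED_QUERY = """
select *
from (
    select
        base.*,
        row_number() over (order by tmp.sort__col asc, base.id asc) as tmp__rnum
    from (
        select profile_id as id, min(surname) as sort__col
        from names
        group by profile_id
    ) tmp
    left join profiles base
        on tmp.id = base.id
    where exists (
        select t.object_id
        from object_rights t
        where t.object_id = base.id
          and t.subject_id = :a__subject_id
          and t.rights in ('r', 'w')
    )
)
where tmp__rnum >= :rows_from
  and tmp__rnum < :rows_to
order by tmp__rnum
"""

# Example bind values (made up for the illustration)
params = {'a__subject_id': 1, 'rows_from': 1, 'rows_to': 51}

# Sanity check that every bind variable in the SQL has a value
bind_names = set(re.findall(r":(\w+)", PAGED_QUERY))
print(sorted(bind_names))

# With an open cx_Oracle cursor this would run as:
# cursor.execute(PAGED_QUERY, params)
```

The outermost order by tmp__rnum is what guarantees the order of the rows that the client receives.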
