Container to Use for Sorting in Python [closed] - python

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
The task is to create a spam filter using machine learning. For feature selection, I have implemented a method that calculates the mutual information (MI) of each word. I now want to return the N words with the highest MI and, where words tie, choose between them based on how many times they appear in spam emails.
The additional requirement exists because we are using the small lingspam set, where the results differ very little: about 3000 words share the same top MI value.
We are required to do this in Python. I have currently implemented this using dictionaries, but I can't find a container type that would let me do what I need.

You can sort the items of a dictionary with sorted(), which returns a list of (key, value) tuples. By default the tuples are ordered by key; to order them by value, pass a custom key function.
>>> some_dictionary = {"a": 1, "b": 5, "c": 0, "e": 2}
>>> sorted(some_dictionary.items())
[('a', 1), ('b', 5), ('c', 0), ('e', 2)]
>>> sorted(some_dictionary.items(), key=lambda i:i[1])
[('c', 0), ('a', 1), ('e', 2), ('b', 5)]
>>>
Here .items() gives you the items in the dictionary (in arbitrary order before Python 3.7, in insertion order from 3.7 onward):
>>> some_dictionary.items()
dict_items([('a', 1), ('b', 5), ('e', 2), ('c', 0)])
Note that dict_items is an iterable view of the dictionary's entries, not a list. sorted() accepts any iterable and always returns a new list.
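For the problem as described (top N words by MI, ties broken by how often a word appears in spam), one possible sketch is to sort on a tuple key. The names top_n_words, mi_scores and spam_counts are hypothetical, and so are the sample values; the asker's actual data structures are not shown, so this assumes two dictionaries keyed by word.

```python
def top_n_words(mi_scores, spam_counts, n):
    """Return the n words with the highest MI, breaking ties by spam count.

    mi_scores and spam_counts are assumed to be dicts keyed by word.
    """
    ranked = sorted(
        mi_scores,
        # Negate both numbers so larger MI and larger counts sort first;
        # the word itself is a final tie-breaker for a deterministic order.
        key=lambda w: (-mi_scores[w], -spam_counts.get(w, 0), w),
    )
    return ranked[:n]


# Made-up sample data: three words share the top MI value.
mi_scores = {"free": 0.9, "money": 0.9, "hello": 0.9, "meeting": 0.2}
spam_counts = {"free": 40, "money": 55, "hello": 3, "meeting": 1}
print(top_n_words(mi_scores, spam_counts, 2))  # ['money', 'free']
```

No special container is needed for this: plain dictionaries plus sorted() are enough, and the ranking logic lives entirely in the key function.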

Related

Python: Sort list consisting of (int, string)-pairs descending by int and ascending by string? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
How do I sort a list consisting of (int, string)-pairs descending by int and ascending by string?
You can use sorted with a custom key argument. Negating the int gives descending order for that field, while leaving the str unchanged gives ascending order. Putting both in a tuple makes the comparison lexicographic, so the string only decides the order when the ints are equal.
>>> data = [(1, 'hello'), (7, 'bar'), (4, 'foo'), (4, 'world')]
>>> sorted(data, key=lambda i: (-i[0], i[1]))
[(7, 'bar'), (4, 'foo'), (4, 'world'), (1, 'hello')]
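Negation only works for numeric fields. When the field that should be descending cannot be negated, a common alternative relies on Python's sort being stable: sort by the secondary key first, then by the primary key. A minimal sketch using the same sample data (the names by_string and result are illustrative only):

```python
data = [(1, 'hello'), (7, 'bar'), (4, 'foo'), (4, 'world')]

# Pass 1: secondary key, ascending by string.
by_string = sorted(data, key=lambda pair: pair[1])
# Pass 2: primary key, descending by int. The sort is stable, so pairs
# with equal ints keep the order established in pass 1.
result = sorted(by_string, key=lambda pair: pair[0], reverse=True)

print(result)  # [(7, 'bar'), (4, 'foo'), (4, 'world'), (1, 'hello')]
```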

Why is 'key' function appended to the end of my OrderedDict when sorting? [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 4 years ago.
I am using Python 3.6.5 and sorting an OrderedDict, e.g. tmp.py:
from collections import OrderedDict
d = OrderedDict()
d[6] = 'a'
d[5] = 'b'
d[3] = 'c'
d[4] = 'd'
print(d)
print("keys : {}".format(d.keys()))
d = OrderedDict(sorted(d.items()), key=lambda t: t[1])
print(d)
print("keys : {}".format(d.keys()))
When I run tmp.py, I get :
OrderedDict([(6, 'a'), (5, 'b'), (3, 'c'), (4, 'd')])
keys : odict_keys([6, 5, 3, 4])
OrderedDict([(3, 'c'), (4, 'd'), (5, 'b'), (6, 'a'), ('key', <function <lambda> at 0x2ab444506bf8>)])
keys : odict_keys([3, 4, 5, 6, 'key'])
Clearly the process of sorting has appended the key() function to my new OrderedDict. I believe that I am sorting this in the same fashion prescribed in this post.
QUESTION:
Why does this happen, and how do I properly sort an OrderedDict?
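Your brackets are in the wrong place. The key argument is being passed to OrderedDict rather than to sorted, and OrderedDict (like dict) stores keyword arguments as extra entries, which is why an item named 'key' appears.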
d = OrderedDict(sorted(d.items()), key=lambda t: t[1])
should be:
d = OrderedDict(sorted(d.items(), key=lambda t: t[1]))
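A short sketch that runs both versions side by side on the data from the question. The variable names wrong and right are illustrative only.

```python
from collections import OrderedDict

d = OrderedDict([(6, 'a'), (5, 'b'), (3, 'c'), (4, 'd')])

# Misplaced parenthesis: key=... goes to OrderedDict, which stores it as an item.
# The items themselves are sorted by dictionary key, the default for tuples.
wrong = OrderedDict(sorted(d.items()), key=lambda t: t[1])
print(list(wrong.keys()))   # [3, 4, 5, 6, 'key']

# Correct: key=... goes to sorted(), so the items are ordered by value.
right = OrderedDict(sorted(d.items(), key=lambda t: t[1]))
print(list(right.items()))  # [(6, 'a'), (5, 'b'), (3, 'c'), (4, 'd')]
```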

Aligning two lists with duplicate keys [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
This question is similar to "Aligning to Lists in python" question, but I have a problem with using a dictionary because of repeated numbers for potential keys.
Here is an example. Start with these 2 lists:
If I used a dictionary, these would be the keys:
[5,6,6,1,6,1,6,1,1,2,1,2,1,2,2,1]
[13,14,15,10,16,11,17,12,12,13,13,14,14,15,16,17]
I am able to rearrange the first list the way I want it, which is:
[5,6,6,6,6,1,1,1,1,1,1,2,2,2,2,1]
I want the second list to keep its alignment with the first list and look exactly like this:
[13,14,15,16,17,10,11,12,12,13,14,13,14,15,16,17]
Notice that it matters that the repeated values in the list of potential keys stay aligned by position with the corresponding values in the second list.
Like the other commenters, I don't completely understand your problem (could you be more specific about the relation you want to obtain?), but maybe zip is the answer to your question:
>>> a = [5,6,6,6,6,1,1,1,1,1,1,2,2,2,2,1]
>>> b = [13,14,15,16,17,10,11,12,12,13,14,13,14,15,16,17]
>>> alignment = list(zip(a, b))
>>> alignment
[(5, 13), (6, 14), (6, 15), (6, 16), (6, 17), (1, 10), (1, 11), (1, 12), (1, 12), (1, 13), (1, 14), (2, 13), (2, 14), (2, 15), (2, 16), (1, 17)]
Edited:
key_list = [5,6,6,1,6,1,6,1,1,2,1,2,1,2,2,1]
values_list = [13,14,15,10,16,11,17,12,12,13,13,14,14,15,16,17]
zipped_lists = zip(key_list, values_list)
sorted_zip = sorted(zipped_lists)
pattern = [5,6,6,6,6,1,1,1,1,1,1,2,2,2,2,1]
temp_dict = {}
for key, value in sorted_zip:
    if key not in temp_dict:
        temp_dict[key] = [value]
    else:
        temp_dict[key].append(value)
final_list = []
for i in pattern:
    final_list.append((i, temp_dict[i].pop(0)))
And, of course, final_list is your result.
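The same idea can be written more compactly with defaultdict and deque from the standard library. This is only a sketch of one possible variant, and the names groups and aligned_values are made up for illustration. It produces the second list directly rather than a list of pairs.

```python
from collections import defaultdict, deque

key_list = [5, 6, 6, 1, 6, 1, 6, 1, 1, 2, 1, 2, 1, 2, 2, 1]
values_list = [13, 14, 15, 10, 16, 11, 17, 12, 12, 13, 13, 14, 14, 15, 16, 17]
pattern = [5, 6, 6, 6, 6, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 1]

# Group the values by key, smallest value first within each key.
groups = defaultdict(deque)
for key, value in sorted(zip(key_list, values_list)):
    groups[key].append(value)

# Walk the pattern, taking the next unused value for each key.
aligned_values = [groups[key].popleft() for key in pattern]
print(aligned_values)
# [13, 14, 15, 16, 17, 10, 11, 12, 12, 13, 14, 13, 14, 15, 16, 17]
```

A deque is used because popleft() removes from the front in constant time, whereas list.pop(0) has to shift every remaining element.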

Python - why is this wrong? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
This doesn't work and I would like to know what I did wrong.
targets= ["a","b","c"]
tmplist= ["d","z","x"]
value = [ (x,y) for x in targets for y in tmplist]
I know this issue can be solved with zip function, but I want to do it without zip. Thanks for any help
EDIT: I'm very sorry for not being clear, I have been distracted.
Luckily my crystal ball is working today so I can guess what you mean when you say it isn't working. Of course, you might have made it easier by actually explaining, but there we go.
If you just want a list of (x, y) pairs then zip is the way to go. The syntax you have does something else: for each element in targets it iterates completely through all elements in tmplist. This is exactly equivalent to:
value = []
for x in targets:
    for y in tmplist:
        value.append((x, y))
So for a pair of lists ['a', 'b', 'c'] and [1, 2, 3] you would get:
[('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3), ('c', 1), ('c', 2), ('c', 3)]
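If the goal really is position-wise pairs without calling zip, one way is to iterate over indices instead of nesting two loops. This is a sketch that assumes the asker wanted the pairing zip would produce.

```python
targets = ["a", "b", "c"]
tmplist = ["d", "z", "x"]

# Index both lists in step; stop at the shorter one, as zip would.
value = [(targets[i], tmplist[i]) for i in range(min(len(targets), len(tmplist)))]
print(value)  # [('a', 'd'), ('b', 'z'), ('c', 'x')]
```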

need max, min, avg value for duplicate keys in tuple, as well as count of keys [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
Using python, I am trying to figure out how to extract unique keys from a list of tuple pairs, with the highest, lowest and average values, as well as a count of how many keys, for example with this list:
[('a', 1), ('b', 3,), ('a', 9), ('b', 0), ('b', 9), ('a', 10), ('c', 2)]
I need to extract this information:
a: max = 10, min = 1, avg = 7, count = 3
b: max = 9, min = 0, avg = 4, count = 3
c: max = 2, min = 2, avg = 2, count = 1
You can use a defaultdict to gather the information.
from collections import defaultdict
data = [('a', 1), ('b', 3,), ('a', 9), ('b', 0), ('b', 9), ('a', 10), ('c', 2)]
pool = defaultdict(list)
for key, value in data:
    pool[key].append(value)
print(pool)
You should have no problem implementing the calculation of min, max and average (sum/len) yourself.
Your target looks like a dictionary of dictionaries. So build a dictionary, taking the first element of each tuple as the key. Iterate through the tuples and build up the values for each key.
You should end up with something like:
tally = {'a': {'count': 3, 'max': 10, 'avg': 7, 'min': 1}, ... etc.}
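Putting the two suggestions together, a minimal sketch might look like the following. The layout of tally follows the answer above; computing avg with true division is an assumption, since the question shows a rounded value (7 for 'a', where the exact mean is 20/3).

```python
from collections import defaultdict

data = [('a', 1), ('b', 3), ('a', 9), ('b', 0), ('b', 9), ('a', 10), ('c', 2)]

# Step 1: collect all values seen for each key.
pool = defaultdict(list)
for key, value in data:
    pool[key].append(value)

# Step 2: summarise each key's values in a nested dictionary.
tally = {
    key: {
        'count': len(values),
        'max': max(values),
        'min': min(values),
        'avg': sum(values) / len(values),  # true division; round if needed
    }
    for key, values in pool.items()
}

for key in sorted(tally):
    print(key, tally[key])
```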
