python / sets / dictionary / initialization


Can someone help me understand how this bit of code works? Particularly how the myHeap assignment works. I know the freq variable is assigned a dictionary. But what about myHeap? Is it a set?
exe_Data = {
'e' : 0.124167,
't' : 0.0969225,
'a' : 0.0820011,
'i' : 0.0768052,
}
freq = exe_Data)
myHeap = [[pct, [symbol, ""]] for symbol, pct in freq.items()]

freq is a reference to the dictionary, as you said.
myHeap is constructed using a list comprehension, and so it is a list. The general form of a list comprehension is:
[ expr for x in iterable ]
So myHeap will be a list, each element of which is a list with the first element being the value of the corresponding dictionary entry, and the second element being another list whose first element is the corresponding key of the dictionary, and whose second element is "".
There are no sets in your given code sample.
You can see this working like so (I edited the number output for readability):
>>> [ symbol for symbol, pct in freq.items() ]
['a', 'i', 'e', 't']
>>> from pprint import pprint # Yay, pretty printing
>>> pprint([ [pct, symbol] for symbol, pct in freq.items() ])
[[0.0820011, 'a'],
[0.0768052, 'i'],
[0.1241670, 'e'],
[0.0969225, 't']]
>>> pprint([ [pct, [symbol, ""]] for symbol, pct in freq.items() ])
[[0.0820011, ['a', '']],
[0.0768052, ['i', '']],
[0.1241670, ['e', '']],
[0.0969225, ['t', '']]]
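Note that dictionaries in Python versions before 3.7 don't preserve the order of their elements, so there (as in the output above) there's no guarantee what order the freq elements will end up in within myHeap. From Python 3.7 on, dictionaries preserve insertion order, so myHeap follows the order of the dictionary literal.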
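To check the type directly, here is a minimal sketch. The use of heapq is an assumption on my part, suggested only by the variable name myHeap; the question itself never shows what the list is used for. It shows that myHeap is a plain list, and that putting pct first makes the entries order by frequency, because lists compare element by element.

```python
import heapq

exe_Data = {'e': 0.124167, 't': 0.0969225, 'a': 0.0820011, 'i': 0.0768052}
freq = exe_Data  # second name for the same dictionary object

myHeap = [[pct, [symbol, ""]] for symbol, pct in freq.items()]
print(type(myHeap).__name__)  # list
print(freq is exe_Data)       # True

# Assumption: the list is later treated as a heap. heapify reorders it in place
# so that the entry with the smallest pct can be popped first.
heapq.heapify(myHeap)
smallest = heapq.heappop(myHeap)
print(smallest)  # [0.0768052, ['i', '']]
```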

exe_Data = {
'e' : 0.124167,
't' : 0.0969225,
'a' : 0.0820011,
'i' : 0.0768052,
}
The above code creates a dictionary called 'exe_Data'. Another way to do this is to use the built-in constructor dict() with keyword arguments, as follows: exe_Data = dict(e=0.124167, t=0.0969225, a=0.0820011, i=0.0768052)
freq = exe_Data)
I think the above bit should read freq=exe_Data. It makes another reference to the dictionary created in the previous bit.
myHeap = [[pct, [symbol, ""]] for symbol, pct in freq.items()]
This last part creates a list using a list comprehension. It creates a list of two-element lists. The first element is a value from the dictionary created and referenced above, and the second element is a list containing the corresponding key and an empty string.
EDIT:
In answer to the comment, it would be the equivalent of writing:
myHeap = []
for key, val in freq.items():
    myHeap.append([val, [key, ""]])

I assume you meant
freq = exe_Data
In this case, myHeap will look like this:
[ [0.124167, ['e', ""]],
[0.0969225, ['t', ""]],
[0.0820011, ['a', ""]],
[0.0768052, ['i', ""]]
]
Note that the order here is arbitrary, but I wanted to write it plainly so you can see what you have in the end results.
Basically it just swaps the order of each key/value pair of your dictionary, and puts the key in a sub-list (together with an empty string) for some reason.

Related

Reorder the position of the string in the list based on similarity

I have a list of string like this -
list = ["A","V","C","D","X","Y","V_RT","D_RT"]
I want to reorder the strings with the suffix "_RT" so that each comes right after its parent string (the string without the suffix).
For example, the above list should become something like this -
list = ["A","V","V_RT","C","D","D_RT","X","Y"] #notice how the strings with _RT moved after the string without _RT.
My approach-
Right now I am finding the strings with _RT, then searching for the index of the parent string without _RT and inserting it there. Finally, I delete the original suffixed string.
The above approach works, but I believe there must be a shorter (one- or two-line) way of doing this that I don't know of.
Please help.
Thanks.
EDIT
I forgot to mention that I can't change the order of appearance. After "A", there will be "V" then "V_RT", "C", "D", "D_RT", etc. The strings are not necessarily of length 1. The above is just an example.

Another approach using for-loop
Check if current element + '_RT' is in original list
if True add current element and current element + '_RT' to the new list
if False and also if substring '_RT' is not in current element add the element to the new list
Code:
l = ["A","V","C","D","X","Y","V_RT","D_RT"]
l2 = []
for x in l:
    if x+'_RT' in l:
        l2+=[x, x+'_RT']
    elif '_RT' not in x:
        l2.append(x)
print(l2)
Output:
['A', 'V', 'V_RT', 'C', 'D', 'D_RT', 'X', 'Y']
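Since the question asks for something shorter, the same idea can be sketched as a single comprehension. This is only one possible variant: the name present is introduced here for illustration, and it assumes "_RT" only ever appears as a suffix and that each suffixed string's parent is in the list.

```python
l = ["A", "V", "C", "D", "X", "Y", "V_RT", "D_RT"]
present = set(l)  # set membership avoids rescanning the list for every element

l2 = [
    y
    for x in l
    if not x.endswith("_RT")  # suffixed strings are emitted next to their parent
    for y in ([x, x + "_RT"] if x + "_RT" in present else [x])
]
print(l2)  # ['A', 'V', 'V_RT', 'C', 'D', 'D_RT', 'X', 'Y']
```

The order of appearance of the parent strings is kept, and nothing depends on the strings having length 1.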

This does it
list1 = ["A","V","C" ,"D","X","Y","V_RT","D_RT"]
dict1={}
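for x in list1:
    # key on the parent name (the string without "_RT") rather than x[0],
    # so that strings longer than one character also work
    parent = x[:-3] if x.endswith("_RT") else x
    dict1[parent]=x
list2=[]
for key,value in dict1.items():
    list2.append(key)
    if key!=value:
        list2.append(value)
print(list2)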

Subtle difference when apply `set()` to find max count items in a list

A friend asked me this question, and I just cannot find a good explanation for it. (He knows how max() and key work in this case.)
Given a list of scores like this:
lst = ['A', 'B', 'B', 'B', 'C', 'C', 'C', 'E']
>>> max(lst, key=lst.count)
'B'
>>> max(set(lst), key=lst.count)
'C'
# if run min - will return different results - w/ and w/o set():
>>> min(lst, key=lst.count)
'A'
>>> min(set(lst), key=lst.count)
'E'
>>>

max and min return the first maximal / minimal element in an iterable.
lst.count("A") and lst.count("E") are equal (evaluating to 1), and so are lst.count("B") and lst.count("C") (evaluating to 3). A set is unordered in Python, and converting a list to a set does not preserve its order. (The internal order of a set is not exactly random, but arbitrary.)
This is the reason why the results differ.
If you want to keep the order, but have unique elements, you could do:
unique_lst = sorted(set(lst), key=lst.index)
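As a quick check of that suggestion, here is a small sketch (the names most and least are just for illustration). With the order-preserving unique list, max and min again pick the first of the tied elements, the same as on the original list.

```python
lst = ['A', 'B', 'B', 'B', 'C', 'C', 'C', 'E']

# Unique elements, sorted by first position in lst, so list order is kept
unique_lst = sorted(set(lst), key=lst.index)
print(unique_lst)  # ['A', 'B', 'C', 'E']

# Ties are resolved in favour of the element that comes first
most = max(unique_lst, key=lst.count)
least = min(unique_lst, key=lst.count)
print(most, least)  # B A
```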

Returning max of string after comparison with other sub-strings - Python

I have a list that looks like this:
json_file_list = ['349148424_20180312071059_20190402142033.json','349148424_20180312071059_20190405142033.json','360758678_20180529121334_20190402142033.json']
and an empty list:
list2 = []
What I want to do is compare the characters up until the second underscore '_', and if they are the same I only want to append the max of the full string, to the new list. In the case above, the first 2 entries are duplicates (until second underscore) so I want to base the max off the numbers after the second underscore. So the final list2 would have only 2 entries and not 3
I tried this:
for row in json_file_list:
    if row[:24] == row[:24]:
        list2.append(max(row))
    else:
        list2.append(row)
but that is just returning:
['s', 's', 's']
Final output should be:
['349148424_20180312071059_20190405142033.json','360758678_20180529121334_20190402142033.json']
Any ideas? I also realize this code is brittle with the way I am slicing it (what happens if the string gets longer/shorter?), so I need to come up with a better way to do that. Maybe base it off the second underscore instead. The strings will always end with '.json'.

I'd use a dictionary to do this:
from collections import defaultdict
d = defaultdict(list)
for x in json_file_list:
    d[tuple(x.split("_")[:2])].append(x)
new_list = [max(x) for x in d.values()]
new_list
Output:
['349148424_20180312071059_20190405142033.json',
'360758678_20180529121334_20190402142033.json']

The if statement in this snippet:
for row in json_file_list:
    if row[:24] == row[:24]:
        list2.append(max(row))
    else:
        list2.append(row)
always resolves to True. Think about it, how could row[:24] be different from itself? Given that it's resolving to True, it's adding the farthest letter in the alphabet (and in your string), s in this case, to list2. That's why you're getting an output of ['s', 's', 's'].
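You can see both effects on a single row. The last line is only a sketch of one way to avoid the hard-coded 24, assuming the part to compare is always everything before the last underscore:

```python
row = '349148424_20180312071059_20190402142033.json'

# A value always equals itself, so the else branch can never run
always_true = (row[:24] == row[:24])
print(always_true)  # True

# max() on a string compares its characters and returns the largest one
largest_char = max(row)
print(largest_char)  # s

# Assumption: the prefix is everything before the last underscore
prefix = row.rsplit('_', 1)[0]
print(prefix)  # 349148424_20180312071059
```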
Maybe I'm understanding your request incorrectly, but couldn't you just append all the elements of the row to a list and then remove duplicates?
for row in json_file_list:
    for elem in row:
        list2.append(elem)
list2 = sorted(list(set(list2)))

I suppose you can slice out the part you want to compare, and use the built-in set to find the unique prefixes:
set([x[:24] for x in json_file_list])
set(['360758678_20180529121334', '349148424_20180312071059'])
It is then a simple matter of picking, for each prefix, the largest full string that starts with it:
list2=[]
for unique in set([x[:24] for x in json_file_list]):
    list2.append(max(x for x in json_file_list if x[:24] == unique))
list2
['360758678_20180529121334_20190402142033.json',
'349148424_20180312071059_20190405142033.json']

Dynamically append sliced list items to a dictionary in python

I have a result dictionary with pre-defined keys that should be populated based on slices of a list, without explicitly accessing the dictionary keys. Below is an example of my approach:
my_list = ['a','b','c','d','e']
my_dict = {'first_item':'', 'middle_items':'','last_item':''}
for key in my_dict.keys():
    value = my_list.pop()
    my_dict.update({key:''.join(value)})
This approach obviously does not work because pop does not slice the array. And if I want to use slicing, I will have to explicitly access the dictionary variables and assign the corresponding list slices.
I have tried using the list length to slice through the list, but was unable to find a general solution, here is my other approach
for key in my_dict.keys():
    value = ''.join(my_list[:-len(my_list)+1])
    del my_list[0]
    my_dict.update({key:value})
How can I slice a list in a general way such that it splits into a first item, last item, and middle items? Below is what the updated dictionary should look like:
my_dict = {'first_item':'a', 'middle_items':'b c d','last_item':'e'}
Edit: if I use [0],[1:-1], and [-1] slices then that means that I will have to access each dictionary key individually and update it rather than loop over it (which is the desired approach)

If you are using Python 3, then you should try this idiom:
my_list = ['a','b','c','d','e']
first_item, *middle_items, last_item = my_list
my_dict = dict(
first_item=first_item,
middle_items=' '.join(middle_items),
last_item=last_item
)
print(my_dict)
# {'last_item': 'e', 'middle_items': 'b c d', 'first_item': 'a'}

To get a slice of my_list, use the slice notation.
my_list = ['a', 'b', 'c', 'd', 'e']
my_dict = {
'first_item': my_list[0],
'middle_items': ' '.join(my_list[1:-1]),
'last_item': my_list[-1]
}
Output
{'first_item': 'a',
'middle_items': 'b c d',
'last_item': 'e'}
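If you do want to loop over the keys, as the edit to the question asks, one option is to pair each key with a slice object. This is only a sketch: the slices mapping is a name introduced here, and it assumes the list has at least two items so that the first and last slices do not overlap.

```python
my_list = ['a', 'b', 'c', 'd', 'e']

# Hypothetical helper mapping: each key is paired with the slice it should receive
slices = {
    'first_item': slice(0, 1),
    'middle_items': slice(1, -1),
    'last_item': slice(-1, None),
}

# Slicing always returns a list, so ' '.join works for every key
my_dict = {key: ' '.join(my_list[s]) for key, s in slices.items()}
print(my_dict)
# {'first_item': 'a', 'middle_items': 'b c d', 'last_item': 'e'}
```

The slices still have to be written down once, but the dictionary itself is filled in a single loop.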

How do I append items from a list to another stopping at a specific value?

I'm building a game of Othello, so I have a list of coordinates that are either b or w on the board.
For example I have lists of coordinates within a list as such
list_a = [ [[1,2],[1,3],[1,4],[1,5],[1,6]], [[2,3],[3,4],[4,5]], [[3,5],[2,5],[1,5]] ]
list 1 is [[1,2],[1,3],[1,4],[1,5],[1,6]] ### [1,4] is 'b' everything else is 'w'
list 2 is [[2,3],[3,4],[4,5]] ### everything is 'w', no 'b'
third list is [[3,5],[2,5],[1,5]] ### [1,5] is 'b'
list_b = []
I want to add all the coordinates that are w into list_b stopping at b, but I don't want to append any w if it doesn't have a b after it.
Ideally, I want my list_b to be
[ [[1,2],[1,3]], [[3,5],[2,5]] ]
I don't mind if it's
[ [[1,2],[1,3],[1,4]], [[3,5],[2,5],[1,5]] ]
I can just remove the b coordinates later.
What's the best way to do this? I'm currently building this within a class and using for loops, while True, if statements etc.

You never stated what the board was or how the coordinates are connected to it (perhaps something like [1, 4] for board[1][4] == 'b'?), so I'll just use the string values you're referring to, e.g. 'wwbww' instead of [[1,2],[1,3],[1,4],[1,5],[1,6]].
import itertools
list_a = ['wwbww', 'www', 'wwb']
result = [list(itertools.takewhile(lambda x: x!='b', item)) for item in list_a if 'b' in item]
Result:
>>> result
[['w', 'w'], ['w', 'w']]
result is a list comprehension where each item first checks if it contains a 'b' (since you only want the 'w's if there's a 'b' in there somewhere), and then creates a list out of an itertools.takewhile() call. This function grabs every element from an iterable until the given function returns a falsey value. To expand:
import itertools
def okay_to_add(x):
    return x != 'b'
result = []
for item in list_a:
    if 'b' in item:
        temp = []
        for c in itertools.takewhile(okay_to_add, item):
            temp.append(c)
        result.append(temp)
The result is identical.
You'll have to create an okay_to_add function of your own that, instead of a simple x != 'b', looks up the given coordinate in the board and returns whether it's the 'b' (or 'w', for the other player) that you're interested in.
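As a rough sketch of what that could look like with the coordinates from the question: the board representation is not given, so black_cells and colour_at below are hypothetical stand-ins. I also assume [1,5] is 'b' everywhere, which does not change the outcome because the first line already stops at [1,4].

```python
import itertools

list_a = [
    [[1, 2], [1, 3], [1, 4], [1, 5], [1, 6]],
    [[2, 3], [3, 4], [4, 5]],
    [[3, 5], [2, 5], [1, 5]],
]

# Hypothetical stand-in for the real board: cells holding 'b'; all others are 'w'
black_cells = {(1, 4), (1, 5)}

def colour_at(coord):
    # Look up the colour of a coordinate such as [1, 4]
    return 'b' if tuple(coord) in black_cells else 'w'

def okay_to_add(coord):
    # Keep taking coordinates until the first 'b' is reached
    return colour_at(coord) != 'b'

list_b = [
    list(itertools.takewhile(okay_to_add, line))
    for line in list_a
    if any(colour_at(c) == 'b' for c in line)  # skip lines with no 'b' at all
]
print(list_b)  # [[[1, 2], [1, 3]], [[3, 5], [2, 5]]]
```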
