Combinations of elements at various keys of a dict

I have a Python dict with several keys, say dict.keys() = [1, 2, 3], and each key holds an array of possibly different size: dict[1] = [1, 3, 6], dict[2] = ['a', 'b'], dict[3] = [x]. I want a new array containing all possible combinations of n elements taken from each of the arrays.
For example, if the arrays were provided beforehand,
arr_1 = itertools.combinations(dict[1],n)
arr_2 = itertools.combinations(dict[2],n)
arr_3 = itertools.combinations(dict[3],n)
and then finally,
final = itertools.product(arr_1, arr_2, arr_3)
In a scenario where I do not know the keys of the dict or the array sizes in advance, how can I create the final array?

If I understand correctly
itertools.product(*[itertools.combinations(v, min(n, len(v))) for v in dic.values()])
should do the trick.
edit: adjusted w.r.t. comments.
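As a minimal sketch of the one-liner above, assuming the dictionary from the question (with the string 'x' standing in for the undefined x) and an assumed n = 2:

```python
import itertools

# Example data modeled on the question; 'x' is a placeholder string.
data = {1: [1, 3, 6], 2: ['a', 'b'], 3: ['x']}
n = 2  # assumed number of elements to pick from each list

# One combinations iterator per value; min() guards lists shorter than n.
per_key = [itertools.combinations(v, min(n, len(v))) for v in data.values()]

# Cartesian product across keys: one combination taken from each list.
final = list(itertools.product(*per_key))

print(len(final))  # 3
print(final[0])    # ((1, 3), ('a', 'b'), ('x',))
```

Wrapping the product in list() is only for inspection here; the iterator can also be consumed lazily.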

Your question is a bit coarsely stated. If you are asking, how you can dynamically form the final array when you need to determine dict keys and values on the fly, you could do it like this:
import itertools
def combo_dict(d):  # <--- d is your starting dictionary
    arrays = []
    for v in d.values():  # <--- all values from your dict
        arrays.append(itertools.combinations(v, len(v)))
    # now arrays should be populated and you can make your final product:
    final = itertools.product(*arrays)
    return final
# now call this function:
final_array = combo_dict(your_dict)
Now, if I understood correctly, this is the spelled-out version of the algorithm. You could actually do a one-liner with a list comprehension:
final_array = itertools.product(*[itertools.combinations(v, len(v)) for v in your_dict.values()])
but mind that the order of the values follows the dictionary's iteration order (insertion order since Python 3.7, arbitrary in older versions). You may want to use sorted() in one of the steps, depending on your needs.
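If a reproducible order matters, one option is to visit the keys in sorted order. The following is only a sketch: the function name combos_in_key_order and the parameter n are made up for illustration, and it assumes the keys are mutually comparable.

```python
import itertools

def combos_in_key_order(data, n):
    """Product of per-key combinations, visiting keys in sorted order."""
    per_key = [
        itertools.combinations(data[k], min(n, len(data[k])))
        for k in sorted(data)
    ]
    return itertools.product(*per_key)

# Keys are inserted out of order on purpose.
example = {2: ['a', 'b'], 1: [1, 3]}
result = list(combos_in_key_order(example, 1))
print(result)
# [((1,), ('a',)), ((1,), ('b',)), ((3,), ('a',)), ((3,), ('b',))]
```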

Related

How to group elements in a list of lists in Python?

I'm trying to efficiently group / nest a list of vertices by their neighbourhood size, into a list of lists of vertices.
The neighbourhood size is a property of a vertex v and can be obtained by calling len(v.neighbours).
The input I have is an unsorted list of vertices. The output I'm trying to obtain should look like this:
[[all vertices with len(v.neighbours) == 1], [... == 2], [... == 4]]
It should be a list of lists where each sublist contains vertices with the same neighbourhood size, sorted from small to big, without empty lists. I don't need the indices of the sublists to map to the neighbourhood size of the contained vertices.
I know how to achieve this with list comprehension, but it's rather inefficient:
def _group(V: List[Vertex], max: int) -> List[List[Vertex]]:
    return [[v for v in V if v.label == i] for i in range(max)]
Additionally, I don't want to pass the maximum neighbourhood size as a parameter, but calculate that during the grouping, and I'm looking for a way to filter out empty lists during the grouping as well.
I've looked into more efficient ways to group the vertices, for example by using a dictionary as intermediary step, but I haven't managed to produce a working result.
Could anyone tell me the most efficient way to group / nest the list of vertices?
Thanks in advance, and sorry if this has been posted before, but I couldn't find what I was looking for in another question.

One pass over the input, put the result in an intermediate dictionary, work the dictionary into the output you want.
from collections import defaultdict
temp_result = defaultdict(list)
for v in vertices:
    temp_result[neighborhood_size(v)].append(v)
max_size = max(temp_result.keys())
return_val = list()
for i in range(max_size + 1):  # + 1 so the largest size is included
    if temp_result[i]:  # check if empty
        return_val.append(temp_result[i])

You can build it this way:
from collections import defaultdict
# Create a dict {nb_of_neighbours:[corresponding vertices]}
vertices_by_neighbours = defaultdict(list)
for v in vertices:
    vertices_by_neighbours[len(v.neighbours)].append(v)
# Create the output list, sorted by number of neighbours
out = []
for nb_neighbours in sorted(vertices_by_neighbours):
    out.append(vertices_by_neighbours[nb_neighbours])
# Max number of neighbours, in case you want it...
max_neighbours = max(vertices_by_neighbours)
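To try the dictionary-based grouping end to end, here is a self-contained sketch. The Vertex class is a hypothetical stand-in for the question's vertex type (only the neighbours attribute comes from the question), and the function name is made up.

```python
from collections import defaultdict

class Vertex:
    """Hypothetical stand-in for the question's vertex type."""
    def __init__(self, name, neighbours):
        self.name = name
        self.neighbours = neighbours

def group_by_neighbourhood_size(vertices):
    groups = defaultdict(list)
    for v in vertices:  # single pass over the input
        groups[len(v.neighbours)].append(v)
    # Only sizes that actually occur become keys, so no empty sublists appear.
    return [groups[size] for size in sorted(groups)]

vertices = [
    Vertex('a', [1, 2]),
    Vertex('b', [1]),
    Vertex('c', [3, 4]),
    Vertex('d', [1, 2, 3, 4]),
]
grouped = group_by_neighbourhood_size(vertices)
print([[v.name for v in g] for g in grouped])  # [['b'], ['a', 'c'], ['d']]
```

Sorting only the distinct sizes keeps the work close to one pass over the vertices plus a small sort.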

Python: Conversion from dictionary to array

I have a Python dictionary (say D) where every key corresponds to some predefined list. I want to create an array with two columns where the first column corresponds to the keys of the dictionary D and the second column corresponds to the sum of the elements in the corresponding lists. As an example, if,
D = {1: [5,55], 2: [25,512], 3: [2, 18]}
Then, the array that I wish to create should be,
A = array( [[1,60], [2,537], [3, 20]] )
I have given a small example here, but I would like to know of a way where the implementation is the fastest. Presently, I am using the following method:
A_List = map( lambda x: [x,sum(D[x])] , D.keys() )
I realize that the output from my method is in the form of a list. I can convert it into an array in another step, but I don't know if that will be a fast method (I presume that the use of arrays will be faster than the use of lists). I will really appreciate an answer where I can know what's the fastest way of achieving this aim.

You can use a list comprehension to create the desired output:
>>> [(k, sum(v)) for k, v in D.items()] # Py2 use D.iteritems()
[(1, 60), (2, 537), (3, 20)]
On my computer, this runs about 50% quicker than the map(lambda:.., D) version.
Note: On py3 map returns a lazy iterator, so you need list(map(...)) to measure the real time it takes.
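A small sketch of why the list() call matters when timing under Python 3, using the dictionary D from the question (the variable names lazy and as_list are made up):

```python
D = {1: [5, 55], 2: [25, 512], 3: [2, 18]}

# In Python 3 this line only builds a map object; no sums are computed yet.
lazy = map(lambda k: [k, sum(D[k])], D)

# The actual work happens when the iterator is consumed.
as_list = list(lazy)
print(as_list)     # [[1, 60], [2, 537], [3, 20]]

# A map object can be consumed only once.
print(list(lazy))  # []
```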

I hope that helps:
Build an array with the values of the keys of D:
first_column = list(D.keys())
Build an array with the sum of values in each key:
second_column = [sum(D[key]) for key in D.keys()]
Build an array with shape [first_column,second_column]
your_array = list(zip(first_column,second_column))

You can try this also:
a = []
for i in D.keys():
    a += [[i, sum(D[i])]]

Extracting keys-values from dictionary

import random
dictionary = {'dog': 1,'cat': 2,'animal': 3,'horse': 4}
keys = random.shuffle(list(dictionary.keys())*3)
values = list(dictionary.values())*3
random_key = []
random_key_value = []
random_key.append(keys.pop())
random_key_value.append(???)
For random_key_value.append, I need to add the value that corresponds to the key that was popped. How can I achieve this? I need to make use of multiples of the list and I can't multiply a dictionary directly, either.

I'm assuming Python (you should specify the language in your question).
If I understand, you want to multiply the elements in the dictionary. So
list(dictionary.keys()) * 3
is not your solution: [1,2] * 3 results in [1,2,1,2,1,2]
Try instead list comprehension:
[i * 3 for i in dictionary.keys()]
To take the order into account (because you shuffle it), shuffle the keys before the multiplication, then create the values list (in the same order as the shuffled keys) and finally multiply the keys:
keys = list(dictionary.keys())  # a real list, so it can be shuffled in place
random.shuffle(keys)
values = [dictionary[i]*3 for i in keys]
keys = [i * 3 for i in keys]
And finally:
random_key.append(keys.pop())
random_key_value.append(values.pop())
Also take care with random.shuffle: it doesn't work the way you are using it. It shuffles the list in place and returns None, so you cannot assign its result. See the documentation.
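If the goal is instead to keep the repeated key list from the question and fetch the value belonging to whichever key was popped, a plain dictionary lookup may be enough. This is a sketch of that alternative reading of the question, not the approach described above:

```python
import random

dictionary = {'dog': 1, 'cat': 2, 'animal': 3, 'horse': 4}

keys = list(dictionary.keys()) * 3  # each key appears three times
random.shuffle(keys)                # shuffles in place and returns None

random_key = []
random_key_value = []

key = keys.pop()
random_key.append(key)
random_key_value.append(dictionary[key])  # look the value up by the popped key
```

With this reading there is no need for a separate values list, since the dictionary already maps every key to its value.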

Using variable tuple to access elements of list

Disclaimer: beginner, self-teaching Python user.
A pretty cool feature of ndarrays is their ability to accept a tuple of integers as indices (e.g. myNDArray[(1,2)] == myNDArray[1][2]). This allows me to leave the indices unspecified as a variable (e.g. indicesTuple ) until a script determines what part of an ndarray to work with, in which case the variable is specified as a tuple of integers and used to access part of an ndarray (e.g. myNDArray[indicesTuple]). The utility in using a variable is that the LENGTH of the tuple can be varied depending on the dimensions of the ndarray.
However, this limits me to working with arrays of numerical values. I tried using lists, but they can't take in a tuple as indices (e.g. myList[(1,2)] gives an error.). Is there a way to "unwrap" a tuple for list indices as one could for function arguments? Or something far easier or more efficient?
UPDATE: Holy shite I forgot this existed. Basically I eventually learned that you can initialize the ndarray with the argument dtype=object, which allows the ndarray to contain multiple types of Python objects, much like a list. As for accessing a list, as a commenter pointed out, I could use a for-loop to iterate through the variable indicesTuple to access increasingly nested elements of the list. For in-place editing, see the accepted comment, really went the extra mile there.

I'm interpreting your question as:
I have an N-dimensional list, and a tuple containing N values (T1, T2... TN). How can I use the tuple values to access the list? I don't know what N will be ahead of time.
I don't know of a built-in way to do this, but you can write a method that iteratively digs into the list until you reach the innermost value.
def get(seq, indices):
    for index in indices:
        seq = seq[index]
    return seq
seq = [
    [
        ["a","b"],
        ["c","d"]
    ],
    [
        ["e","f"],
        ["g","h"]
    ]
]
indices = [0,1,0]
print(get(seq, indices))
Result:
c
You could also do this in one* line with reduce, although it won't be very clear to the reader what you're trying to accomplish.
print(reduce(lambda s, idx: s[idx], indices, seq))
(*if you're using 3.X, you'll need to import reduce from functools. So, two lines.)
If you want to set values in the N-dimensional list, use get to access the second-deepest level of the list, and assign to that.
def set(seq, indices, value):
    innermost_list = get(seq, indices[:-1])
    innermost_list[indices[-1]] = value
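The getter and setter can be combined into one runnable sketch for Python 3. The names get_nested and set_nested are made up here to avoid shadowing the built-in set, and operator.getitem replaces the lambda:

```python
from functools import reduce
import operator

def get_nested(seq, indices):
    """Follow each index in turn: seq[i0][i1]...[ik]."""
    return reduce(operator.getitem, indices, seq)

def set_nested(seq, indices, value):
    """Assign through the second-deepest level of the nesting."""
    *path, last = indices
    get_nested(seq, path)[last] = value

seq = [[["a", "b"], ["c", "d"]], [["e", "f"], ["g", "h"]]]

print(get_nested(seq, (0, 1, 0)))  # c
set_nested(seq, (1, 0, 1), "z")
print(seq[1][0][1])                # z
```

Because the index tuple can have any length, the same two functions work for lists nested to any depth.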

Say you have a list of (i,j) indexes
indexList = [(1,1), (0,1), (1,2)]
And some 2D list you want to index from
l = [[1,2,3],
[4,5,6],
[7,8,9]]
You could get those elements using a list comprehension as follows
>>> [l[i][j] for i,j in indexList]
[5, 2, 6]
Then your indexes can be whatever you want them to be. They will be unpacked in the list comprehension, and used as list indices. For your specific application, we'd have to see where your index variables were coming from, but that's the general idea.

Python doesn't have multidimensional lists, so myList[(1,2)] could only conceivably be considered a shortcut for (myList[1], myList[2]) (which would be pretty convenient sometimes, although you can use import operator; x = operator.itemgetter(1,2)(myList) to accomplish the same).
If your myList looks something like
myList = [ ["foo", "bar", "baz"], ["a", "b", "c"] ]
then myList[(1,2)] won't work (or make sense) because myList is not a two-dimensional list: it's a list that contains references to lists. You use myList[1][2] because the first index myList[1] returns the references to ["a", "b", "c"], to which you apply the second index [2] to get "c".
Slightly related, you could use a dictionary to simulate a sparse array precisely by using tuples as keys to a default dict.
import collections
d = collections.defaultdict(str)
d[(1,2)] = "foo"
d[(4,5)] = "bar"
Any other tuple you try to use as a key would return the empty string. It's not a perfect simulation, as you can't access full rows or columns of the array without using something like
row1 = [d[1, x] for x in range(C)] # where C is the number of columns
col3 = [d[x, 3] for x in range(R)] # where R is the number of rows

Use dictionaries indexed by tuple
>>> width, height = 7, 6
>>> grid = dict(
    ((x,y),"x={} y={}".format(x,y))
    for x in range(width)
    for y in range(height))
>>> print(grid[3,1])
x=3 y=1
Use lists of lists
>>> width, height = 7, 6
>>> grid = [
    ["x={} y={}".format(x,y) for x in range(width)]
    for y in range(height)]
>>> print(grid[1][3])
x=3 y=1
In this case, you could make a getter and setter function:
def get_grid(grid, index):
    x, y = index
    return grid[y][x]
def set_grid(grid, index, value):
    x, y = index
    grid[y][x] = value
You could go a step further and create your own class that contains a list of lists and defines an indexer that takes tuples as indexes and does this same process. It can do slightly more sensible bounds-checking and give better diagnostics than the dictionary, but it takes a bit of setup. I think the dictionary approach is fine for quick exploration.
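A minimal sketch of such a class, assuming (x, y) tuple indices stored row by row as in the list-of-lists example. The class name Grid, the fill parameter, and the error message are all made up for illustration:

```python
class Grid:
    """List-of-lists grid that is indexed with (x, y) tuples."""

    def __init__(self, width, height, fill=None):
        self.width = width
        self.height = height
        self._rows = [[fill] * width for _ in range(height)]

    def _check(self, index):
        x, y = index
        # Reject negative and too-large indices with a descriptive message.
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise IndexError(
                "index {} outside {}x{} grid".format(index, self.width, self.height))
        return x, y

    def __getitem__(self, index):
        x, y = self._check(index)
        return self._rows[y][x]

    def __setitem__(self, index, value):
        x, y = self._check(index)
        self._rows[y][x] = value

grid = Grid(7, 6)
grid[3, 1] = "x=3 y=1"
print(grid[3, 1])  # x=3 y=1
```

Python passes grid[3, 1] to __getitem__ as the tuple (3, 1), which is what lets the class accept tuple indices directly.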

Correspondence between list indices originated from dictionary

I wrote the below code working with dictionary and list:
d = computeRanks() # dictionary of id : interestRank pairs
lst = list(d.items()) # tuples (id, interestRank)
interestingIds = []
for i in range(20): # randomly choose 20 highly ranked ids
    choice = randomWeightedChoice(d.values()) # returns random index from list
    interestingIds.append(lst[choice][0])
There may be an error here, because I'm not sure whether the indices in lst correspond to those in d.values().
Do you know how to write this better?

One of the guarantees of dict is that the results of dict.keys() and dict.values() will correspond so long as the contents of the dictionary are not modified.

As @Ignacio says, the index choice does correspond to the intended element of lst, so your code's logic is correct. But your code should be much simpler: d already contains IDs for the elements, so rewrite randomWeightedChoice to take a dictionary and return an ID.
Perhaps it will help you to know that you can iterate over a dictionary's key-value pairs with d.items():
for k, v in d.items():
    ...  # etc.
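One way to sketch a dictionary-based weighted choice is with the standard library's random.choices (Python 3.6+). This is an assumption about what randomWeightedChoice is meant to do: it treats the rank values as non-negative weights and draws with replacement. The function name and the sample ranks are made up.

```python
import random

def random_weighted_ids(ranks, k):
    """Draw k ids (with replacement), weighted by their rank values."""
    ids = list(ranks)                   # keys, in dictionary order
    weights = [ranks[i] for i in ids]   # matching weights, same order
    return random.choices(ids, weights=weights, k=k)

d = {'a': 5.0, 'b': 1.0, 'c': 0.0}  # made-up id : interestRank pairs
interesting_ids = random_weighted_ids(d, 20)
print(interesting_ids)
```

Building ids and weights from the same dictionary in one place removes any doubt about whether the two sequences line up.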
