Using a python dictionary of arrays

Using a python dictionary of arrays - python

EDIT: I'm sorry, I realized that I didn't give the full question, this edit should contain the full question. I thougth that my previous question was similar enough to the full question that I could get the answer that way without having to add too many additional details, but I was wrong.
I have a dictionary params in python, where each entry is a numpy array of some sort. But the arrays in this dictionary are not all of the same shape. Some are size (n,m), some are size (1,m), some still are totally different sizes. I can call a specific number in this dictionary via:
params[key][i,j]
I need to loop over x of the keys in this dictionary, so I have a loop like:
for i in range(0,x):
for j in range(0,(dim of 0th axis of array at key x)):
for k in range(0,(dim of 1 axis of array at key x)):
stuff = f(params['W'+str(i)][i,j])
I want to create a new dictionary paramscheck which will be contain the elements that I'm going to loop over, but the specific numbers in paramscheck will be some function of the corresponding numbers in params.
I can't quite figure out the best way to do this.
If I write code like:
paramscheck = {}
for i in range(0,x):
for j in range(0,n):
for k in range(0,m):
paramscheck['W'+str(i)][i,j] = f(params['W'+str(i)][i,j])
Then I will get that I'm outside the range of the index of paramscheck since that dictionary started out empty.
If I copy params to paramscheck first and then do the loop with a line like
paramscheck = params
Then, I won't have issue accessing any entry in paramscheck, but my params dictionary will be modified simultaneously (from what I understand of Python, dictionaries are not actually copied, it just makes an extra label that references to the one underlying dictionary).
I've tried something like paramscheck = params.copy() and paramscheck = dict(params), but still the params dictionary is being modified when I go to modify paramscheck.
How should I accomplish this?

If your operation is elementwise by nature (e.g. x + 1), then you could just use something like
def elementwise_f(x):
return f(x)
paramscheck = {k: elementwise_f(v) for k, v in array_dict.items()}
otherwise something like the following might also work well
import numpy as np
from functools import partial
from collections import defaultdict
paramscheck = defaultdict(partial(np.zeros, (10, 20))) # instead of {}
if you want to just initialize new one with old values without changing old values while editing the new one, try deepcopy instead of copy
from copy import deepcopy
paramscheck = deepcopy(params)

paramscheck = {}
for key in params:
paramscheck[key] = {} # this line
for i in range(0,n):
for j in range(0,m):
paramscheck[key][i,j] = f(params[key][i,j])

Related

Key-Range does not stay rounded when put into dictionary in Python

I am trying to bin the values in my data and put them in a dictionary in Python.
However, after creating the dictionary, its key-range produces weird aritfacts, like 0.6900000000000001 instead of 0.69. They only appear after creating the dictionary, though, the initial array "key_range" has only normal values. Therefore, the last two lines of my code produce KeyErrors, since the value 0.69 does not exist.
Does anyone know what is going on? Is it wrong to use the zip-function? Can I not create a functioning dictionary like this? I suppose I can iterate through the key values and round them manually, but I imagine there are more elegant solutions.
Cheers, and thanks
import numpy as np
key_range = np.arange(0, 1, 0.01) # these numbers are perfectly OK.
values = [0] * len(key_range)
value_dict = dict(zip(key_range, values)) # and here, I get weird artifacts.
print(value_dict)
for i in range(0, len(data)):
value_dict[data[i]] = value_dict[data[i]] + 1

I suppose I can iterate through the key values and round them manually, but I imagine there are more elegant solutions.. For what it is worth, you can fix them within your expression that creates value_dict, which still looks pretty elegant to me:
value_dict = dict(zip(map(lambda x: round(x,2), key_range), values))

Initialize a list using inline for loop

I am initializing my list object using following code.
list = [
func1(centroids[0],value),
func1(centroids[1],value),
....,
func1(centroids[n],value)]
I am trying to do it a more elegant way using some inline iteration. Following is the pseudo code of one possible way.
list = [value for value in func1(centroids[n],value)]
I am not clear how to call func1 in an iterative way. Can you suggest a possible implementation?

For a list of objects, Python knows how to iterate over it directly so you can eliminate the index shown in most of the other answers,
res = [func1(c, value) for c in centroids]
That's all there is to it.

A simple list comprehension consists of the "template" list element, followed by the iterator needed to step through the desired values.
my_list = [func1(centroids[0],value)
for n in range(n+1)]

Use this code:
list = [func1(centroids[x], value) for x in range(n)]
This is called a list comprehension. Put the values that you want the list to contain up front, then followed by the for loop. You can use the iterating variable of the for loop with the value. In this code, you set up n number(s) of variable(s) from the function call func1(centroids[x], value). If the variable n equals to, let's say, 4, list = [func1(centroids[0], value), func1(centroids[0], value), func1(centroids[0], value), func1(centroids[0], value)] would be equal to the code above

Best method to store in python [duplicate]

I'm trying to add items to an array in python.
I run
array = {}
Then, I try to add something to this array by doing:
array.append(valueToBeInserted)
There doesn't seem to be a .append method for this. How do I add items to an array?

{} represents an empty dictionary, not an array/list. For lists or arrays, you need [].
To initialize an empty list do this:
my_list = []
or
my_list = list()
To add elements to the list, use append
my_list.append(12)
To extend the list to include the elements from another list use extend
my_list.extend([1,2,3,4])
my_list
--> [12,1,2,3,4]
To remove an element from a list use remove
my_list.remove(2)
Dictionaries represent a collection of key/value pairs also known as an associative array or a map.
To initialize an empty dictionary use {} or dict()
Dictionaries have keys and values
my_dict = {'key':'value', 'another_key' : 0}
To extend a dictionary with the contents of another dictionary you may use the update method
my_dict.update({'third_key' : 1})
To remove a value from a dictionary
del my_dict['key']

If you do it this way:
array = {}
you are making a dictionary, not an array.
If you need an array (which is called a list in python ) you declare it like this:
array = []
Then you can add items like this:
array.append('a')

Arrays (called list in python) use the [] notation. {} is for dict (also called hash tables, associated arrays, etc in other languages) so you won't have 'append' for a dict.
If you actually want an array (list), use:
array = []
array.append(valueToBeInserted)

Just for sake of completion, you can also do this:
array = []
array += [valueToBeInserted]
If it's a list of strings, this will also work:
array += 'string'

In some languages like JAVA you define an array using curly braces as following but in python it has a different meaning:
Java:
int[] myIntArray = {1,2,3};
String[] myStringArray = {"a","b","c"};
However, in Python, curly braces are used to define dictionaries, which needs a key:value assignment as {'a':1, 'b':2}
To actually define an array (which is actually called list in python) you can do:
Python:
mylist = [1,2,3]
or other examples like:
mylist = list()
mylist.append(1)
mylist.append(2)
mylist.append(3)
print(mylist)
>>> [1,2,3]

You can also do:
array = numpy.append(array, value)
Note that the numpy.append() method returns a new object, so if you want to modify your initial array, you have to write: array = ...

Isn't it a good idea to learn how to create an array in the most performant way?
It's really simple to create and insert an values into an array:
my_array = ["B","C","D","E","F"]
But, now we have two ways to insert one more value into this array:
Slow mode:
my_array.insert(0,"A") - moves all values to the right when entering an "A" in the zero position:
"A" --> "B","C","D","E","F"
Fast mode:
my_array.append("A")
Adds the value "A" to the last position of the array, without touching the other positions:
"B","C","D","E","F", "A"
If you need to display the sorted data, do so later when necessary. Use the way that is most useful to you, but it is interesting to understand the performance of each method.

I believe you are all wrong. you need to do:
array = array[] in order to define it, and then:
array.append ["hello"] to add to it.

Optimised Python dictionary / negative index storage

Raised by this question's comments (I can see that this is irrelevant), I am now aware that using dictionaries for data that needs to be queried/accessed regularly is not good, speedwise.
I have a situation of something like this:
someDict = {}
someDict[(-2, -2)] = something
somedict[(3, -10)] = something else
I am storing keys of coordinates to objects that act as arrays of tiles in a game. These are going to be negative at some point, so I can't use a list or some kind of sparse array (I think that's the term?).
Can I either:
Speed up dictionary lookups, so this would not be an issue
Find some kind of container that will support sparse, negative indices?
I would use a list, but then the querying would go from O(log n) to O(n) to find the area at (x, y). (I think my timings are off here too).

Python dictionaries are very very fast, and using a tuple of integers is not going to be a problem. However your use case seems that sometimes you need to do a single-coordinate check and doing that traversing all the dict is of course slow.
Instead of doing a linear search you can however speed up the data structure for the access you need using three dictionaries:
class Grid(object):
def __init__(self):
self.data = {} # (i, j) -> data
self.cols = {} # i -> set of j
self.rows = {} # j -> set of i
def __getitem__(self, ij):
return self.data[ij]
def __setitem__(self, ij, value):
i, j = ij
self.data[ij] = value
try:
self.cols[i].add(j)
except KeyError:
self.cols[i] = set([j])
try:
self.rows[j].add(i)
except KeyError:
self.rows[j] = add([i])
def getRow(self, i):
return [(i, j, data[(i, j)])
for j in self.cols.get(i, [])]
def getCol(self, j):
return [(i, j, data[(i, j)])
for i in self.rows.get(j, [])]
Note that there are many other possible data structures depending on exactly what you are trying to do, how frequent is reading, how frequent is updating, if you query by rectangles, if you look for nearest non-empty cell and so on.

To start off with
Speed up dictionary lookups, so this would not be an issue
Dictionary lookups are pretty fast O(1), but (from your other question) you're not relying on the hash-table lookup of the dictionary, your relying on a linear search of the dictionary's keys.
Find some kind of container that will support sparse, negative indices?
This isn't indexing into the dictionary. A tuple is an immutable object, and you are hashing the tuple as a whole. The dictionary really has no idea of the contents of the keys, just their hash.
I'm going to suggest, as others did, that you restructure your data.
For example, you could create objects that encapsulate the data you need, and arrange them in a binary tree for O(n lg n) searches. You can even go so far as to wrap the entire thing in a class that will give you the nice if foo in Bar: syntax your looking for.
You probably need a couple coordinated structures to accomplish what you want. Here's a simplified example using dicts and sets (tweaking user 6502's suggestion a bit).
# this will be your dict that holds all the data
matrix = {}
# and each of these will be a dict of sets, pointing to coordinates
cols = {}
rows = {}
def add_data(coord, data)
matrix[coord] = data
try:
cols[coord[0]].add(coord)
except KeyError:
# wrap coords in a list to prevent set() from iterating over it
cols[coord[0]] = set([coord])
try:
rows[coord[1]].add(coord)
except KeyError:
rows[coord[1]] = set([coord])
# now you can find all coordinates from a row or column quickly
>>> add_data((2, 7), "foo4")
>>> add_data((2, 5), "foo3")
>>> 2 in cols
True
>>> 5 in rows
True
>>> [matrix[coord] for coord in cols[2]]
['foo4', 'foo3']
Now just wrap that in a class or a module, and you'll be off, and as always, if it's not fast enough profile and test before you guess.

Dictionary lookups are very fast. Searching for part of the key (e.g. all tiles in row x) is what's not fast. You could use a dict of dicts. Rather than a single dict indexed by a 2-tuple, use nested dicts like this:
somedict = {0: {}, 1:{}}
somedict[0][-5] = "thingy"
somedict[1][4] = "bing"
Then if you want all the tiles in a given "row" it's just somedict[0].
You will need some logic to add the secondary dictionaries where necessary and so on. Hint: check out getitem() and setdefault() on the standard dict type, or possibly the collections.defaultdict type.
This approach gives you quick access to all tiles in a given row. It's still slow-ish if you want all the tiles in a given column (though at least you won't need to look through every single cell, just every row). However, if needed, you could get around that by having two dicts of dicts (one in column, row order and the other in row, column order). Updating then becomes twice as much work, which may not matter for a game where most of the tiles are static, but access is very easy in either direction.
If you only need to store numbers and most of your cells will be 0, check out scipy's sparse matrix classes.

One alternative would be to simply shift the index so it's positive.
E.g. if your indices are contiguous like this:
...
-2 -> a
-1 -> c
0 -> d
1 -> e
2 -> f
...
Just do something like LookupArray[Index + MinimumIndex], where MinimumIndex is the absolute value of the smallest index you would use.
That way, if your minimum was say, -50, it would map to 0. -20 would map to 30, and so forth.
Edit:
An alternative would be to use a trick with how you use the indices. Define the following key function
Key(n) = 2 * n (n >= 0)
Key(n) = -2 * n - 1. (n < 0)
This maps all positive keys to the positive even indices, and all negative elements to the positive odd indices. This may not be practical though, since if you add 100 negative keys, you'd have to expand your array by 200.
One other thing to note: If you plan on doing look ups and the number of keys is constant (or very slowly changing), stick with an array. Otherwise, dictionaries aren't bad at all.

Use multi-dimensional lists -- usually implemented as nested objects. You can easily make this handle negative indices with a little arithmetic. It might use a more memory than a dictionary since something has to be put in every possible slot (usually None for empty ones), but access will be done via simple indexing lookup rather than hashing as it would with a dictionary.

How to get value on a certain index, in a python list?

I have a list which looks something like this
List = [q1,a1,q2,a2,q3,a3]
I need the final code to be something like this
dictionary = {q1:a1,q2:a2,q3:a3}
if only I can get values at a certain index e.g List[0] I can accomplish this, is there any way I can get it?

Python dictionaries can be constructed using the dict class, given an iterable containing tuples. We can use this in conjunction with the range builtin to produce a collection of tuples as in (every-odd-item, every-even-item), and pass it to dict, such that the values organize themselves into key/value pairs in the final result:
dictionary = dict([(List[i], List[i+1]) for i in range(0, len(List), 2)])

Using extended slice notation:
dictionary = dict(zip(List[0::2], List[1::2]))

The range-based answer is simpler, but there's another approach possible using the itertools package:
from itertools import izip
dictionary = dict(izip(*[iter(List)] * 2))
Breaking this down (edit: tested this time):
# Create instance of iterator wrapped around List
# which will consume items one at a time when called.
iter(List)
# Put reference to iterator into list and duplicate it so
# there are two references to the *same* iterator.
[iter(List)] * 2
# Pass each item in the list as a separate argument to the
# izip() function. This uses the special * syntax that takes
# a sequence and spreads it across a number of positional arguments.
izip(* [iter(List)] * 2)
# Use regular dict() constructor, same as in the answer by zzzeeek
dict(izip(* [iter(List)] * 2))
Edit: much thanks to Chris Lutz' sharp eyes for the double correction.

d = {}
for i in range(0, len(List), 2):
d[List[i]] = List[i+1]

You've mentioned in the comments that you have duplicate entries. We can work with this. Take your favorite method of generating the list of tuples, and expand it into a for loop:
from itertools import izip
dictionary = {}
for k, v in izip(List[::2], List[1::2]):
if k not in dictionary:
dictionary[k] = set()
dictionary[k].add(v)
Or we could use collections.defaultdict so we don't have to check if a key is already initialized:
from itertools import izip
from collections import defaultdict
dictionary = defaultdict(set)
for k, v in izip(List[::2], List[1::2]):
dictionary[k].add(v)
We'll end with a dictionary where all the keys are sets, and the sets contain the values. This still may not be appropriate, because sets, like dictionaries, cannot hold duplicates, so if you need a single key to hold two of the same value, you'll need to change it to a tuple or a list. But this should get you started.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Using a python dictionary of arrays - python

paramscheck = {} for key in params: paramscheck[key] = {} # this line for i in range(0,n): for j in range(0,m): paramscheck[key][i,j] = f(params[key][i,j])

Related

Key-Range does not stay rounded when put into dictionary in Python

Initialize a list using inline for loop

Best method to store in python [duplicate]

Optimised Python dictionary / negative index storage

How to get value on a certain index, in a python list?

Categories

Resources