mapping in SD in python - python

Hi, I want to map SD keys and values from the results of two different queries. To make it clearer, here is some code.
rv = plpy.execute("select id from ABC")
Suppose this returns 1, 2, 3.
rv = plpy.execute("select name from XYZ")
Suppose this returns A, B, C.
Now I need a way to map these two results, so that each id retrieved from the first query is used as a key and each name retrieved from the second query is used as its value, giving something like:
SD[1] = A
SD[2] = B
SD[3] = C
This is needed because I am trying to build a dynamic SD for my application.
Can somebody suggest a solution?

I'm unfamiliar with plpy, so I may be off. But if you want to create a Python dictionary of key: value pairs from the results of two queries, this is my suggestion.
If these are your query results:
a = ['a', 'b', 'c']
b = [1, 2, 3]
Then:
dict(zip(a, b))
gives you a dictionary like this:
{'a': 1, 'b': 2, 'c': 3}
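Applied to the question, the same idea might look like the sketch below. It assumes each row returned by plpy.execute can be read like a dictionary keyed by column name; the lists id_rows and name_rows are stand-ins for the two query results, and SD is created by hand here because the real one only exists inside PL/Python. The pairing is purely positional, so both queries would need an ORDER BY that makes their rows line up.

```python
# Stand-ins for the two plpy.execute() results (assumption: rows behave like dicts).
id_rows = [{"id": 1}, {"id": 2}, {"id": 3}]
name_rows = [{"name": "A"}, {"name": "B"}, {"name": "C"}]

# Inside PL/Python, SD already exists; it is created here only so the sketch runs.
SD = {}

# Pair the n-th id with the n-th name.
for id_row, name_row in zip(id_rows, name_rows):
    SD[id_row["id"]] = name_row["name"]

print(SD)  # {1: 'A', 2: 'B', 3: 'C'}
```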

Related

Update multiple key/value pairs in python at once

Say I have
d = {"a":0,"b":0,"c":0}
is there a way to update the keys a and b at the same time, instead of looping over them, such like
update_keys = ["a","b"]
d.some_function(update_keys) +=[10,5]
print(d)
{"a":10,"b":5,"c":0}
Yes, you can use update like this:
d.update({'a':10, 'b':5})
Thus, your code would look this way:
d = {"a":0,"b":0,"c":0}
d.update({'a':10, 'b':5})
print(d)
and shows:
{"a":10,"b":5,"c":0}
If you mean a function that adds a new value to the existing value without an explicit loop, you can do it like this:
add_value = lambda d,k,v: d.update(zip(k,list(map(lambda _k,_v:d[_k]+_v,k,v)))) or d
and you can use it like this
>>> d = {"a":2,"b":3}
>>> add_value(d,["a","b"],[2,-3])
{'a': 4, 'b': 0}
There is nothing tricky here: I replace the loop with map and a lambda that compute the new values, and wrap the result in list so Python evaluates the map immediately. Then I use zip to build the updated key-value pairs and the dict's update method to update the dictionary. However, I doubt this has any practical use, since it is more complex than a for loop and harder to read.
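For comparison, here is the plain for loop that the one-liner replaces. This is only a sketch; the function name add_values is made up for illustration.

```python
def add_values(d, keys, increments):
    """Add each increment to the value stored under the matching key."""
    for key, inc in zip(keys, increments):
        d[key] += inc
    return d

d = {"a": 2, "b": 3}
print(add_values(d, ["a", "b"], [2, -3]))  # {'a': 4, 'b': 0}
```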
Update values of multiple keys in dictionary
d = {"a":0,"b":0,"c":0}
d.update({'a': 40, 'b': 41, 'c': 89})
print(d)
{'a': 40, 'b': 41, 'c': 89}
If you are just storing integer values, then you can use the Counter class provided by the python module "collections":
from collections import Counter
d = Counter({"a":0,"b":0,"c":0})
result = d + Counter({"a":10, "b":5})
'result' will have the value of
Counter({'a': 10, 'b': 5})
And since Counter is a subclass of dict, you probably do not have to change anything else in your code.
>>> isinstance(result, dict)
True
You do not see the 'c' key in the result because the "+" operator drops entries whose count is zero or negative.
See the documentation for collections.Counter for more details.
Storing other numeric types is supported, with some conditions:
"For in-place operations such as c[key] += 1, the value type need only support addition and subtraction. So fractions, floats, and decimals would work and negative values are supported. The same is also true for update() and subtract() which allow negative and zero values for both inputs and outputs."
To undo a "+" while keeping zero and negative counts, use the method "subtract", which modifies the Counter in place. This is a noteworthy "gotcha".
>>> d = Counter({"a":10, "b":5})
>>> result.subtract(d)
>>> result
Counter({'a': 0, 'b': 0})
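If the zero-valued 'c' key has to survive, one option is Counter.update, which adds counts in place and does not discard zero entries. A small sketch using the question's data:

```python
from collections import Counter

d = Counter({"a": 0, "b": 0, "c": 0})

# update() adds the given counts to the existing ones, in place.
d.update({"a": 10, "b": 5})

print(d["a"], d["b"], d["c"])  # 10 5 0
print("c" in d)                # True
```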

What is the fastest way to return a specific list within a dictionary within a dictionary?

I have a list within a dictionary within a dictionary. The data set is very large. How can I most quickly return the list nested in the two dictionaries if I am given a List that is specific to the key, dict pairs?
{"Dict1":{"Dict2": ['UNIQUE LIST'] }}
Is there an alternate data structure to use for this for efficiency?
I do not believe a more efficient data structure exists in Python. Simply retrieving the list using the regular indexing operator should be a very fast operation, even if both levels of dictionaries are very large.
nestedDict = {"Dict1":{"Dict2": ['UNIQUE LIST'] }}
uniqueList = nestedDict["Dict1"]["Dict2"]
My only thought for improving performance was to try flattening the data structure into a single dictionary with tuples for keys. This would take more memory than the nested approach since the keys in the top-level dictionary will be replicated for every entry in the second-level dictionaries, but it will only compute the hash function once for every lookup. But this approach is actually slower than the nested approach in practice:
nestedDict = {i: {j: ['UNIQUE LIST'] for j in range(1000)} for i in range(1000)}
flatDict = {(i, j): ['UNIQUE LIST'] for i in range(1000) for j in range(1000)}
import random
def accessNested():
    i = random.randrange(1000)
    j = random.randrange(1000)
    return nestedDict[i][j]
def accessFlat():
    i = random.randrange(1000)
    j = random.randrange(1000)
    # look up the tuple key in the flattened dictionary
    return flatDict[(i, j)]
import timeit
print(timeit.timeit(accessNested))
print(timeit.timeit(accessFlat))
Output:
2.0440238649971434
2.302736301004188
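The timings above also include the cost of generating random indices. A variant that draws the keys once and times only the lookups could look like this sketch; the size N, the number of key pairs, and the repeat count are arbitrary choices kept small so it runs quickly, and the resulting numbers will differ between machines.

```python
import random
import timeit

N = 100  # arbitrary size, much smaller than in the benchmark above
nested = {i: {j: [i, j] for j in range(N)} for i in range(N)}
flat = {(i, j): [i, j] for i in range(N) for j in range(N)}

# Draw the keys once so that only the dictionary lookups are timed.
pairs = [(random.randrange(N), random.randrange(N)) for _ in range(1000)]

def lookup_nested():
    return [nested[i][j] for i, j in pairs]

def lookup_flat():
    return [flat[pair] for pair in pairs]

nested_time = timeit.timeit(lookup_nested, number=200)
flat_time = timeit.timeit(lookup_flat, number=200)
print("nested:", nested_time)
print("flat:  ", flat_time)
```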
The fastest way to access the list within the nested dictionary is:
d = {"Dict1":{"Dict2": ['UNIQUE LIST'] }}
print(d["Dict1"]["Dict2"])
Output:
['UNIQUE LIST']
If you need to iterate over the list inside the nested dictionary, you can use the following code as an example:
d = {"a":{"b": ['1','2','3','4'] }}
for i in d["a"]["b"]:
    print(i)
Output :
1
2
3
4
If I understand correctly, you want to access a nested dictionary structure if...
if I am given a List that is specific to the key
So, here you have a sample dictionary and key that you want to access
d = {'a': {'a': 0, 'b': 1},
     'b': {'a': {'a': 2}, 'b': 3}}
key = ('b', 'a', 'a')
The lazy approach
This is fast if you know Python dictionaries already, no need to learn other stuff!
>>> value = d
>>> for level in key:
...     value = value[level]
>>> value
2
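The same walk can be written with functools.reduce and operator.getitem from the standard library. A minimal sketch; the helper name get_path is made up for illustration.

```python
from functools import reduce
from operator import getitem

d = {'a': {'a': 0, 'b': 1},
     'b': {'a': {'a': 2}, 'b': 3}}
key = ('b', 'a', 'a')

def get_path(mapping, path):
    """Follow the keys in path one level at a time, starting from mapping."""
    return reduce(getitem, path, mapping)

print(get_path(d, key))  # 2
```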
NestedDict from the ndicts package
If you pip install ndicts then you get the same "lazy approach" implementation in a nicer interface.
>>> from ndicts import NestedDict
>>> nd = NestedDict(d)
>>> nd[key]
2
>>> nd["b", "a", "a"]
2
This option is fast because you can't really write less code than nd[key] to get what you want.
Pandas dataframes
This is the solution that will give you performance. Lookups in dataframes should be quick, especially if you have a sorted index.
In this case we have hierarchical data with multiple levels, so I will create a MultiIndex first. I will use the NestedDict for ease, but anything else to flatten the dictionary will do.
>>> keys = list(nd.keys())
>>> values = list(nd.values())
>>> from pandas import DataFrame, MultiIndex
>>> index = MultiIndex.from_tuples(keys)
>>> df = DataFrame(values, index=index, columns=["Data"]).sort_index()
>>> df
         Data
a a NaN     0
  b NaN     1
b a a       2
  b NaN     3
Use the loc method to get a row.
>>> df.loc[key]
Data 2
Name: (b, a, a), dtype: int64

I have a file with content like [(a,b)(c,d)(a,e)] i want to map a to b & e, c to d. How to keep this as a program?

I have retrieved some data from a database consisting of user names and the locations each user will be working at. Some users might use multiple locations. I have to put this data into Excel in a clear layout using Python.
input :
[(a,b)(c,d)(e,f)(a,g)] content is in file
output:
a:b,g c:d e:f
One possible solution is to create a dictionary and append the second value of each tuple to the list stored under the first. I took the liberty of turning the elements into strings to make a minimal working code snippet.
lst = [('a','b'),('c','d'),('e','f'), ('a','g')]
d = {}
for k,v in lst:
    d.setdefault(k, []).append(v)
Output
{'a': ['b', 'g'], 'c': ['d'], 'e': ['f']}
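To go from the raw file content to the requested output line, something like the following sketch may help. It assumes the file holds exactly the bracketed text shown in the question (given here as a string rather than read from disk) and that names contain no commas or parentheses.

```python
import re
from collections import defaultdict

# Assumed file content, taken from the question.
text = "[(a,b)(c,d)(e,f)(a,g)]"

# Each "(user,location)" group becomes a (user, location) tuple.
pairs = re.findall(r"\(([^,()]+),([^,()]+)\)", text)

grouped = defaultdict(list)
for user, location in pairs:
    grouped[user.strip()].append(location.strip())

line = " ".join(f"{user}:{','.join(locations)}" for user, locations in grouped.items())
print(line)  # a:b,g c:d e:f
```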

pack multiple variables of different datatypes in list/array python

I have multiple variables that I need to pack as one and hold it sequentially like in a array or list. This needs to be done in Python and I am still at Python infancy.
E.g. in Python:
a = "Tom"
b = 100
c = 3.14
d = {'x':1, 'y':2, 'z':3}
All of the above should go in one sequential data structure. For the sake of clarity, here is a similar implementation as I would have written it in C++.
struct node
{
    string a;
    int b;
    float c;
    map<char, int> d; // just as an example for a dictionary in Python
};
vector<node> v; // looking for something like this which can be iterable
If someone can give me a similar implementation for storing, iterating over, and modifying the contents, that would be really helpful. Any pointers in the right direction are also welcome.
Thanks
You can either use a dictionary like Michael proposes (but then you need to access the contents of v with v['a'], which is a little cumbersome), or you can use the equivalent of C++'s struct: a named tuple:
import collections
node = collections.namedtuple('node', 'a b c d')
v = node("Tom", 100, 3.14, {'x':1, 'y':2, 'z':3})
print(v)  # node(a='Tom', b=100, c=3.14, d={'x': 1, 'y': 2, 'z': 3})
print(v.c)  # 3.14
print(v[2])  # 3.14 (works too, but is less meaningful and robust than something like v.last_name)
This is similar to, but simpler than defining your own class: type(v) == node, etc. Note however, as volcano pointed out, that the values stored in a namedtuple cannot be changed (a namedtuple is immutable).
If you indeed need to modify the values inside your records, the best option is a class:
class node(object):
    def __init__(self, *arg_list):
        # assign the positional arguments to the attributes a, b, c, d in order
        for (name, arg) in zip('a b c d'.split(), arg_list):
            setattr(self, name, arg)
v = node(1, 20, 300, "Eric")
print(v.d)  # "Eric"
v.d = "Ajay"  # Works
The last option, which I do not recommend, is indeed to use a list or a tuple, like ATOzTOA mentions: elements must be accessed in a not-so-legible way: node[3] is less meaningful than node.last_name; also, you cannot easily change the order of the fields, when using a list or a tuple (whereas the order is immaterial if you access a named tuple or custom class attributes).
Multiple node objects are customarily put in a list, the standard Python structure for such a purpose:
all_nodes = [node(…), node(…),…]
or
all_nodes = []
for … in …:
    all_nodes.append(node(…))
or
all_nodes = [node(…) for … in …]
etc. The best method depends on how the various node objects are created, but in many cases a list is likely to be the best structure.
Note, however, that if you need to store something akin to a spreadsheet table and need speed and facilities for accessing its columns, you might be better off with NumPy's record arrays, or a package like Pandas.
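If the records need to be mutable, a dataclass (Python 3.7+) is another way to get struct-like fields with less boilerplate than a hand-written class. A minimal sketch; the class name Node and the second record are made up for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    a: str
    b: int
    c: float
    d: dict = field(default_factory=dict)  # each instance gets its own dict

# A plain list plays the role of the C++ vector<node>.
nodes = [Node("Tom", 100, 3.14, {"x": 1, "y": 2, "z": 3})]
nodes.append(Node("Ann", 7, 2.72))

# Iterate over the records and modify them in place.
for n in nodes:
    n.b += 1

print(nodes[0].b)  # 101
print(nodes[1])    # Node(a='Ann', b=8, c=2.72, d={})
```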
You could put all the values in a dictionary, and have a list of these dictionaries.
{'a': a, 'b': b, 'c': c, 'd': d}
Otherwise, if this data is something that could be represented by a class, for example a 'Person'; create a class of type Person and create an object of that class with your data:
http://docs.python.org/2/tutorial/classes.html
Just use lists:
a = "Tom"
b = 100
c = 3.14
d = {'x':1, 'y':2, 'z':3}
data = [a, b, c, d]
print(data)
for item in data:
    print(item)
Output:
['Tom', 100, 3.14, {'x': 1, 'y': 2, 'z': 3}]
Tom
100
3.14
{'x': 1, 'y': 2, 'z': 3}

Destructuring-bind dictionary contents

I am trying to 'destructure' a dictionary and associate values with variables names after its keys. Something like
params = {'a':1,'b':2}
a,b = params.values()
But since dictionaries are not ordered, there is no guarantee that params.values() will return values in the order of (a, b). Is there a nice way to do this?
from operator import itemgetter
params = {'a': 1, 'b': 2}
a, b = itemgetter('a', 'b')(params)
Instead of elaborate lambda functions or dictionary comprehension, may as well use a built in library.
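One detail worth knowing about itemgetter: with several keys it returns a tuple, but with a single key it returns the bare value, so a one-name unpacking would fail. A short sketch:

```python
from operator import itemgetter

params = {"a": 1, "b": 2, "c": 3}

pair = itemgetter("a", "b")(params)  # several keys: a tuple
single = itemgetter("a")(params)     # one key: the value itself, not a 1-tuple

print(pair)    # (1, 2)
print(single)  # 1
```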
One way to do this with less repetition than Jochen's suggestion is with a helper function. This gives the flexibility to list your variable names in any order and only destructure a subset of what is in the dict:
pluck = lambda dict, *args: (dict[arg] for arg in args)
things = {'blah': 'bleh', 'foo': 'bar'}
foo, blah = pluck(things, 'foo', 'blah')
Also, instead of joaquin's OrderedDict you could sort the keys and get the values. The only catches are you need to specify your variable names in alphabetical order and destructure everything in the dict:
sorted_vals = lambda dict: (t[1] for t in sorted(dict.items()))
things = {'foo': 'bar', 'blah': 'bleh'}
blah, foo = sorted_vals(things)
How come nobody posted the simplest approach?
params = {'a':1,'b':2}
a, b = params['a'], params['b']
Python is only able to "destructure" sequences, not dictionaries. So, to write what you want, you will have to map the needed entries to a proper sequence. The closest match I could find is the (not very sexy):
a,b = [d[k] for k in ('a','b')]
This works with generators too:
a,b = (d[k] for k in ('a','b'))
Here is a full example:
>>> d = dict(a=1,b=2,c=3)
>>> d
{'a': 1, 'c': 3, 'b': 2}
>>> a, b = [d[k] for k in ('a','b')]
>>> a
1
>>> b
2
>>> a, b = (d[k] for k in ('a','b'))
>>> a
1
>>> b
2
Here's another way to do it similarly to how a destructuring assignment works in JS:
params = {'b': 2, 'a': 1}
a, b, rest = (lambda a, b, **rest: (a, b, rest))(**params)
What we did was to unpack the params dictionary into key values (using **) (like in Jochen's answer), then we've taken those values in the lambda signature and assigned them according to the key name - and here's a bonus - we also get a dictionary of whatever is not in the lambda's signature so if you had:
params = {'b': 2, 'a': 1, 'c': 3}
a, b, rest = (lambda a, b, **rest: (a, b, rest))(**params)
After the lambda has been applied, the rest variable will now contain:
{'c': 3}
Useful for omitting unneeded keys from a dictionary.
Hope this helps.
Maybe you really want to do something like this?
def some_func(a, b):
    print(a, b)
params = {'a':1,'b':2}
some_func(**params) # equiv to some_func(a=1, b=2)
If you are wary of the issues involved in using the locals dictionary and prefer to follow your original strategy, collections.OrderedDict (available from Python 2.7 and 3.1) lets you retrieve your dictionary items in the order in which they were first inserted.
(Ab)using the import system
The from ... import statement lets us destructure and bind attribute names of an object. Of course, it only works for objects in the sys.modules dictionary, so one could use a hack like this:
import sys, types
mydict = {'a':1,'b':2}
sys.modules["mydict"] = types.SimpleNamespace(**mydict)
from mydict import a, b
A somewhat more serious hack would be to write a context manager to load and unload the module:
with obj_as_module(mydict, "mydict_module"):
    from mydict_module import a, b
By pointing the __getattr__ method of the module directly to the __getitem__ method of the dict, the context manager can also avoid using SimpleNamespace(**mydict).
See this answer for an implementation and some extensions of the idea.
One can also temporarily replace the entire sys.modules dict with the dict of interest, and do import a, b without from.
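One possible shape for such a context manager is sketched below. This is a hypothetical implementation written for illustration, not the one from the linked answer; it copies the dictionary into a temporary module object instead of redirecting __getattr__, and it assumes every key is a valid Python identifier.

```python
import sys
import types
from contextlib import contextmanager

@contextmanager
def obj_as_module(mapping, name):
    """Temporarily expose a dict as an importable module (illustrative hack)."""
    module = types.ModuleType(name)
    module.__dict__.update(mapping)  # keys become module attributes
    sys.modules[name] = module
    try:
        yield module
    finally:
        # Always unregister the fake module again.
        sys.modules.pop(name, None)

mydict = {"a": 1, "b": 2}
with obj_as_module(mydict, "mydict_module"):
    from mydict_module import a, b

print(a, b)  # 1 2
```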
Warning 1: as stated in the docs, this is not guaranteed to work on all Python implementations:
CPython implementation detail: This function relies on Python stack frame support
in the interpreter, which isn’t guaranteed to exist in all implementations
of Python. If running in an implementation without Python stack frame support
this function returns None.
Warning 2: this function does make the code shorter, but it probably contradicts the Python philosophy of being as explicit as you can. Moreover, it doesn't address the issues pointed out by John Christopher Jones in the comments, although you could make a similar function that works with attributes instead of keys. This is just a demonstration that you can do that if you really want to!
import inspect

def destructure(dict_):
    if not isinstance(dict_, dict):
        raise TypeError(f"{dict_} is not a dict")
    # the parent frame will contain the information about
    # the current line
    parent_frame = inspect.currentframe().f_back
    # so we extract that line (by default the code context
    # only contains the current line)
    (line,) = inspect.getframeinfo(parent_frame).code_context
    # "hello, key = destructure(my_dict)"
    # -> ("hello, key ", "=", " destructure(my_dict)")
    lvalues, _equals, _rvalue = line.strip().partition("=")
    # -> ["hello", "key"]
    keys = [s.strip() for s in lvalues.split(",") if s.strip()]
    if missing := [key for key in keys if key not in dict_]:
        raise KeyError(*missing)
    for key in keys:
        yield dict_[key]
In [5]: my_dict = {"hello": "world", "123": "456", "key": "value"}
In [6]: hello, key = destructure(my_dict)
In [7]: hello
Out[7]: 'world'
In [8]: key
Out[8]: 'value'
This solution allows you to pick some of the keys, not all, like in JavaScript. It's also safe for user-provided dictionaries.
With Python 3.10, you can do:
d = {"a": 1, "b": 2}
match d:
    case {"a": a, "b": b}:
        print(f"A is {a} and b is {b}")
but it adds two extra levels of indentation, and you still have to repeat the key names.
Look for other answers, as this won't cater to an unexpected order in the dictionary. I will update this with a correct version sometime soon.
try this
data = {'a':'Apple', 'b':'Banana','c':'Carrot'}
keys = data.keys()
a,b,c = [data[k] for k in keys]
result:
a == 'Apple'
b == 'Banana'
c == 'Carrot'
Well, if you want these in a class you can always do this:
class AttributeDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttributeDict, self).__init__(*args, **kwargs)
        self.__dict__.update(self)
d = AttributeDict(a=1, b=2)
Based on @ShawnFumo's answer I came up with this:
def destruct(dict): return (t[1] for t in sorted(dict.items()))
d = {'b': 'Banana', 'c': 'Carrot', 'a': 'Apple' }
a, b, c = destruct(d)
(Notice the order of items in dict)
An old topic, but I found this to be a useful method:
data = {'a':'Apple', 'b':'Banana','c':'Carrot'}
for key in data.keys():
    locals()[key] = data[key]
This method loops over every key in your dictionary and sets a variable to that name and then assigns the value from the associated key to this new variable.
Testing:
print(a)
print(b)
print(c)
Output
Apple
Banana
Carrot
An easy and simple way to destruct dict in python:
params = {"a": 1, "b": 2}
a, b = [params[key] for key in ("a", "b")]
print(a, b)
# Output:
# 1 2
I don't know whether it's good style, but
locals().update(params)
will do the trick. You then have a, b and whatever was in your params dict available as corresponding local variables.
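Note that this only behaves as described at module level, where locals() is the module's namespace. Inside a function in CPython, writing to the dictionary returned by locals() does not create real local variables, as this sketch shows (the function name and the key "fruit" are made up for illustration):

```python
def unpack_in_function(params):
    locals().update(params)  # has no effect on the function's actual locals
    try:
        return fruit  # compiled as a global lookup; no such global exists
    except NameError:
        return "fruit is not defined"

result = unpack_in_function({"fruit": "Apple"})
print(result)  # fruit is not defined
```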
Since dictionaries are guaranteed to keep their insertion order in Python >= 3.7, it's completely safe and idiomatic to just do this nowadays:
params = {'a': 1, 'b': 2}
a, b = params.values()
print(a)
print(b)
Output:
1
2
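This relies on the names on the left matching the insertion order and the number of keys exactly. A short sketch of what happens otherwise, assuming Python 3.7+; the names safe_a, safe_b, and extra are made up for illustration:

```python
# Same keys, different insertion order: values are bound by position, not by name.
params = {"b": 2, "a": 1}
a, b = params.values()
print(a, b)  # 2 1

# Looking the keys up explicitly is independent of the order.
safe_a, safe_b = params["a"], params["b"]
print(safe_a, safe_b)  # 1 2

# An extra key makes the unpacking fail.
extra = {"a": 1, "b": 2, "c": 3}
try:
    a, b = extra.values()
except ValueError as exc:
    error = str(exc)
    print(error)
```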
