Related
If I have a dictionary such as:
clues = {'w':'e','r':'t'}
How do I get the first of each letter in the two to join together in a string, it is something like...
for clue in clues:
clue = ''.join(
However I don't know how to get them into a string from this...
Edit:
You can use a list comprehension for that:
>>> clues = {'w':'e','r':'t'}
>>> [''.join(x) for x in (clues, clues.values())]
['wr', 'et']
>>>
how would you get the first of each letter in the two to join together
in a string
I think you are talking about the dictionary's keys. If so, then you can use str.join:
>>> clues = {'w':'e','r':'t'}
>>> ''.join(clues)
'wr'
>>>
Also, iterating over a dictionary (which is what str.join is doing) will yield its keys. Thus, there is no need to do:
''.join(clues.keys())
Finally, #DSM made a good point. Dictionaries are naturally unordered in Python. Meaning, you could get rw just as easily as you get wr.
If you want a dictionary with guarunteed order, check out collections.OrderedDict.
And if you want all the keys joined into a string, try this:
''.join(clues.keys())
It's not entirely clear what your question is, but if you want to join the key and value together, storing that result into a new set, this would be the solution:
>>> {''.join(key_value) for key_value in clues.items()}
set(['rt', 'we'])
Written long hand for clarity:
out_set = set()
for key_value in clues.items():
key_value_joined = ''.join(key_value)
out_set.add(key_value_joined)
This should do the trick:
''.join(clues.keys())
I have a list of the form
['A', 'B', 'C', 'D']
which I want to mutate into:
[('Option1','A'), ('Option2','B'), ('Option3','C'), ('Option4','D')]
I can iterate over the original list and mutate successfully, but the closest that I can come to what I want is this:
["('Option1','A')", "('Option2','B')", "('Option3','C')", "('Option4','D')"]
I need the single quotes but don't want the double quotes around each list.
[EDIT] - here is the code that I used to generate the list; although I've tried many variations. Clearly, I've turned 'element' into a string--obviously, I'm not thinking about it the right way here.
array = ['A', 'B', 'C', 'D']
listOption = 0
finalArray = []
for a in array:
listOption += 1
element = "('Option" + str(listOption) + "','" + a + "')"
finalArray.append(element)
Any help would be most appreciated.
[EDIT] - a question was asked (rightly) why I need it this way. The final array will be fed to an application (Indigo home control server) to populate a drop-down list in a config dialog.
[('Option{}'.format(i+1),item) for i,item in enumerate(['A','B','C','D'])]
# EDIT FOR PYTHON 2.5
[('Option%s' % (i+1), item) for i,item in enumerate(['A','B','C','D'])]
This is how I'd do it, but honestly I'd probably try not to do this and instead want to know why I NEEDED to do this. Any time you're making a variable with a number in it (or in this case a tuple with one element of data and one element naming the data BY NUMBER) think instead how you could organize your consuming code to not need that instead.
For instance: when I started coding professionally the company I work for had an issue with files not being purged on time at a few of our locations. Not all the files, mind you, just a few. In order to provide our software developer with the information to resolve the problem, we needed a list of files from which sites the purge process was failing on.
Because I was still wet behind the ears, instead of doing something SANE like making a dictionary with keys of the files and values of the sizes, I used locals() to create new variables WITH MEANING. Don't do this -- your variables should mean nothing to anyone but future coders. Basically I had a whole bunch of variables named "J_ITEM" and "J_INV" and etc with a value 25009 and etc, one for each file, then I grouped them all together with [item for item in locals() if item.startswith("J_")]. THAT'S INSANITY! Don't do this, build a saner data structure instead.
That said, I'm interested in how you put it all together. Do you mind sharing your code by editing your answer? Maybe we can work together on a better solution than this hackjob.
x = ['A','B','C','D']
option = 1
answer = []
for element in x:
t = ('Option'+str(option),element) #Creating the tuple
answer.append(t)
option+=1
print answer
A tuple is different from a string, in that a tuple is an immutable list. You define it by writing:
t = (something, something_else)
You probably defined t to be a string "(something, something_else)" which is indicated by the quotations surrounding the expression.
In addition to adsmith great answer, I would add the map way:
>>> map(lambda (index, item): ('Option{}'.format(index+1),item), enumerate(['a','b','c', 'd']))
[('Option1', 'a'), ('Option2', 'b'), ('Option3', 'c'), ('Option4', 'd')]
I am brand-new to Python, having decided to make the jump from Matlab. I have tried to find the answer to my question for days but without success!
The problem: I have a bunch of objects with certain attributes. Note that I am not talking about objects and attributes in the programming sense of the word - I am talking about literal astronomical objects about which I have different types of numerical data and physical attributes for.
In a loop in my script, I go through each source/object in my catalogue, do some calculations, and stick the results in a huge dictionary. The form of the script is like this:
for i in range ( len(ObjectCatalogue) )
calculate quantity1 for source i
calculate quantity2 for source i
determine attribute1 for source i
sourceDataDict[i].update( {'spectrum':quantity1} )
sourceDataDict[i].update( {'peakflux':quantity2} )
sourceDataDict[i].update( {'morphology':attribute1} )
So once I have gone through a hundred sources or so, I can, say, access the spectrum for object no. 20 with spectrumSource20 = sourceData[20]['spectrum'] etc.
What I want to do is be able to select all objects in the dictionary based on the value of the keyword 'morphology' say. So say the keyword for 'morphology' can take on the values 'simple' or 'complex'. Is there anyway I can do this without resorting to a loop? I.e. - could I do something like create a new dictionary that contains all the sources that take the 'complex' value for the 'morphology' keyword?
Hard to explain, but using logical indexing that I am used to from Matlab, it would look something like
complexSourceDataDict = sourceDataDict[*]['morphology'=='complex']
(where * indicates all objects in the dictionary)
Anyway - any help would be greatly appreciated!
Without a loop, no. With a list comprehension, yes:
complex = [src for src in sourceDataDict.itervalues() if src.get('morphology') == 'complex']
If sourceDataDict happens to really be a list, you can drop the itervalues:
complex = [src for src in sourceDataDict if src.get('morphology') == 'complex']
If you think about it, evaluating a * would imply a loop operation under the hood anyways (assuming it were valid syntax). So your trick is to do the most efficient looping you can with the data structure you are using.
The only way to get more efficient would be to index all of the data objects "morphology" keys ahead of time and keep them up to date.
There's not a direct way to index nested dictionaries out of order, like your desired syntax wants to do. However, there are a few ways to do it in Python, with varying interfaces and performance characteristics.
The best performing solution would probably be to create an additional dictionary which indexes by whatever characteristic you care about. For instance, to find values with the 'morphology' value is 'complex', you'd d something like this:
from collections import defaultdict
# set up morphology dict (you could do this as part of generating the morphology)
morph_dict = defaultdict(list)
for data in sourceDataDict.values():
morph_dict[data["morphology"]].append(data)
# later, you can access a list of the values with any particular morphology
complex_morph = morph_dict["complex"]
While this is high-performance, it may be annoying to need to set up the reverse indexes for everything ahead of time. An alternative might be to use a list comprehension or generator expression to iterate over your dictionary and finding the appropriate values:
complex = (d for d in sourceDataDict.values() if d["morphology"] == "complex")
for c in complex:
do_whatever(c)
I believe you are dealing with a structure similar to the following
sourceDataDict = [
{'spectrum':1,
'peakflux':10,
'morphology':'simple'
},
{'spectrum':2,
'peakflux':11,
'morphology':'comlex'
},
{'spectrum':3,
'peakflux':12,
'morphology':'simple'
},
{'spectrum':4,
'peakflux':13,
'morphology':'complex'
}
]
you can do something similar using List COmprehension
>>> [e for e in sourceDataDict if e.get('morphology',None) == 'complex']
[{'morphology': 'complex', 'peakflux': 13, 'spectrum': 4}]
Using itertools.ifilter, you can achieve a similar result
>>> list(itertools.ifilter(lambda e:e.get('morphology',None) == 'complex', sourceDataDict))
[{'morphology': 'complex', 'peakflux': 13, 'spectrum': 4}]
Please note, the use of get instead of indexing is to ensure that the functionality wont fail even when the key 'morphology' does not exist. In case, its definite to exist, you can rewrite the above as
>>> [e for e in sourceDataDict if e['morphology'] == 'complex']
[{'morphology': 'complex', 'peakflux': 13, 'spectrum': 4}]
>>> list(itertools.ifilter(lambda e:e['morphology'] == 'complex', sourceDataDict))
[{'morphology': 'complex', 'peakflux': 13, 'spectrum': 4}]
Working with huge amount of data, you may want to store it somewhere, so some sort of database and ORM (for instance), but latter is a matter of taste. Sort of RDBMS may be solution.
As for raw python, there is no built-in solution except of functional routines like filter. Anyway you face iteration at some step (implicitly or not).
The simpliest way is is keeping additional dict with keys as attribute values:
objectsBy['morphology'] = {'complex': set(), 'simple': set()}
for item in sources:
...
objMorphology = compute_morphology(item)
objectsBy['morphology'][objMorphology] += item
...
I need to make a dictionary where you can reference [[1,2],[3,4]] --> ([1,2]:0, [2,3]:0)
I've tried different ways but I can't use a list in a dictionary. So i tried using tuples, but its still the same. Any help is appreciated!
You need to use tuples:
dict.fromkeys((tuple(i) for i in [[1,2],[3,4]]), 0)
or (for python2.7+)
{tuple(i): 0 for i in [[1,2], [3,4]]}
Edit:
Reading the comments, OP probably want to count occurrences of a list:
>>> collections.Counter(tuple(i) for i in [[1,2], [1,2], [3,4]])
Counter({(1, 2): 2, (3, 4): 1})
Lists can't be used as dictionary keys since they aren't hashable (probably because they can be mutated so coming up with a reasonable hash function is impossible). tuple however poses no problem:
d = {(1,2):0, (3,4):0}
Note that in your example, you seem to imply that you're trying to build a dictionary like this:
((1,2):0, (3,4):0)
That won't work. You need curly brackets to make a dictionary.
([1,2]:0, [2,3]:0) is not a dictionary. I think you meant to use: {(1,2):0, (2,3):1}
I have a problem which requires a reversable 1:1 mapping of keys to values.
That means sometimes I want to find the value given a key, but at other times I want to find the key given the value. Both keys and values are guaranteed unique.
x = D[y]
y == D.inverse[x]
The obvious solution is to simply invert the dictionary every time I want a reverse-lookup: Inverting a dictionary is very easy, there's a recipe here but for a large dictionary it can be very slow.
The other alternative is to make a new class which unites two dictionaries, one for each kind of lookup. That would most likely be fast but would use up twice as much memory as a single dict.
So is there a better structure I can use?
My application requires that this should be very fast and use as little as possible memory.
The structure must be mutable, and it's strongly desirable that mutating the object should not cause it to be slower (e.g. to force a complete re-index)
We can guarantee that either the key or the value (or both) will be an integer
It's likely that the structure will be needed to store thousands or possibly millions of items.
Keys & Valus are guaranteed to be unique, i.e. len(set(x)) == len(x) for for x in [D.keys(), D.valuies()]
The other alternative is to make a new
class which unites two dictionaries,
one for each kind of lookup. That
would most likely be fast but would
use up twice as much memory as a
single dict.
Not really. Have you measured that? Since both dictionaries would use references to the same objects as keys and values, then the memory spent would be just the dictionary structure. That's a lot less than twice and is a fixed ammount regardless of your data size.
What I mean is that the actual data wouldn't be copied. So you'd spend little extra memory.
Example:
a = "some really really big text spending a lot of memory"
number_to_text = {1: a}
text_to_number = {a: 1}
Only a single copy of the "really big" string exists, so you end up spending just a little more memory. That's generally affordable.
I can't imagine a solution where you'd have the key lookup speed when looking by value, if you don't spend at least enough memory to store a reverse lookup hash table (which is exactly what's being done in your "unite two dicts" solution).
class TwoWay:
def __init__(self):
self.d = {}
def add(self, k, v):
self.d[k] = v
self.d[v] = k
def remove(self, k):
self.d.pop(self.d.pop(k))
def get(self, k):
return self.d[k]
The other alternative is to make a new class which unites two dictionaries, one for each > kind of lookup. That would most likely use up twice as much memory as a single dict.
Not really, since they would just be holding two references to the same data. In my mind, this is not a bad solution.
Have you considered an in-memory database lookup? I am not sure how it will compare in speed, but lookups in relational databases can be very fast.
Here is my own solution to this problem: http://github.com/spenthil/pymathmap/blob/master/pymathmap.py
The goal is to make it as transparent to the user as possible. The only introduced significant attribute is partner.
OneToOneDict subclasses from dict - I know that isn't generally recommended, but I think I have the common use cases covered. The backend is pretty simple, it (dict1) keeps a weakref to a 'partner' OneToOneDict (dict2) which is its inverse. When dict1 is modified dict2 is updated accordingly as well and vice versa.
From the docstring:
>>> dict1 = OneToOneDict()
>>> dict2 = OneToOneDict()
>>> dict1.partner = dict2
>>> assert(dict1 is dict2.partner)
>>> assert(dict2 is dict1.partner)
>>> dict1['one'] = '1'
>>> dict2['2'] = '1'
>>> dict1['one'] = 'wow'
>>> assert(dict1 == dict((v,k) for k,v in dict2.items()))
>>> dict1['one'] = '1'
>>> assert(dict1 == dict((v,k) for k,v in dict2.items()))
>>> dict1.update({'three': '3', 'four': '4'})
>>> assert(dict1 == dict((v,k) for k,v in dict2.items()))
>>> dict3 = OneToOneDict({'4':'four'})
>>> assert(dict3.partner is None)
>>> assert(dict3 == {'4':'four'})
>>> dict1.partner = dict3
>>> assert(dict1.partner is not dict2)
>>> assert(dict2.partner is None)
>>> assert(dict1.partner is dict3)
>>> assert(dict3.partner is dict1)
>>> dict1.setdefault('five', '5')
>>> dict1['five']
'5'
>>> dict1.setdefault('five', '0')
>>> dict1['five']
'5'
When I get some free time, I intend to make a version that doesn't store things twice. No clue when that'll be though :)
Assuming that you have a key with which you look up a more complex mutable object, just make the key a property of that object. It does seem you might be better off thinking about the data model a bit.
"We can guarantee that either the key or the value (or both) will be an integer"
That's weirdly written -- "key or the value (or both)" doesn't feel right. Either they're all integers, or they're not all integers.
It sounds like they're all integers.
Or, it sounds like you're thinking of replacing the target object with an integer value so you only have one copy referenced by an integer. This is a false economy. Just keep the target object. All Python objects are -- in effect -- references. Very little actual copying gets done.
Let's pretend that you simply have two integers and can do a lookup on either one of the pair. One way to do this is to use heap queues or the bisect module to maintain ordered lists of integer key-value tuples.
See http://docs.python.org/library/heapq.html#module-heapq
See http://docs.python.org/library/bisect.html#module-bisect
You have one heapq (key,value) tuples. Or, if your underlying object is more complex, the (key,object) tuples.
You have another heapq (value,key) tuples. Or, if your underlying object is more complex, (otherkey,object) tuples.
An "insert" becomes two inserts, one to each heapq-structured list.
A key lookup is in one queue; a value lookup is in the other queue. Do the lookups using bisect(list,item).
It so happens that I find myself asking this question all the time (yesterday in particular). I agree with the approach of making two dictionaries. Do some benchmarking to see how much memory it's taking. I've never needed to make it mutable, but here's how I abstract it, if it's of any use:
class BiDict(list):
def __init__(self,*pairs):
super(list,self).__init__(pairs)
self._first_access = {}
self._second_access = {}
for pair in pairs:
self._first_access[pair[0]] = pair[1]
self._second_access[pair[1]] = pair[0]
self.append(pair)
def _get_by_first(self,key):
return self._first_access[key]
def _get_by_second(self,key):
return self._second_access[key]
# You'll have to do some overrides to make it mutable
# Methods such as append, __add__, __del__, __iadd__
# to name a few will have to maintain ._*_access
class Constants(BiDict):
# An implementation expecting an integer and a string
get_by_name = BiDict._get_by_second
get_by_number = BiDict._get_by_first
t = Constants(
( 1, 'foo'),
( 5, 'bar'),
( 8, 'baz'),
)
>>> print t.get_by_number(5)
bar
>>> print t.get_by_name('baz')
8
>>> print t
[(1, 'foo'), (5, 'bar'), (8, 'baz')]
How about using sqlite? Just create a :memory: database with a two-column table. You can even add indexes, then query by either one. Wrap it in a class if it's something you're going to use a lot.