How to combine initialization and assignment of dictionary in Python? - python

I would like to figure out if any deal is selected twice or more.
The following example is stripped down for sake of readability. But in essence I thought the best solution would be using a dictionary, and whenever any deal-container (e.g. deal_pot_1) contains the same deal twice or more, I would capture it as an error.
The following code served me well, however by itself it throws an exception...
if deal_pot_1:
duplicates[deal_pot_1.pk] += 1
if deal_pot_2:
duplicates[deal_pot_2.pk] += 1
if deal_pot_3:
duplicates[deal_pot_3.pk] += 1
...if I didn't initialize this before hand like the following.
if deal_pot_1:
duplicates[deal_pot_1.pk] = 0
if deal_pot_2:
duplicates[deal_pot_2.pk] = 0
if deal_pot_3:
duplicates[deal_pot_3.pk] = 0
Is there anyway to simplify/combine this?

There are basically two options:
Use a collections.defaultdict(int). Upon access of an unknown key, it will initialise the correposnding value to 0.
For a dictionary d, you can do
d[x] = d.get(x, 0) + 1
to initialise and increment in a single statement.
Edit: A third option is collections.Counter, as pointed out by Mark Byers.

It looks like you want collections.Counter.

Look at collections.defaultdict. It looks like you want defaultdict(int).

So you only want to know if there are duplicated values? Then you could use a set:
duplicates = set()
for value in values:
if value in duplicates():
raise Exception('Duplicate!')
duplicates.add(value)
If you would like to find all duplicated:
maybe_duplicates = set()
confirmed_duplicates = set()
for value in values:
if value in maybe_duplicates():
confirmed_duplicates.add(value)
else:
maybe_duplicates.add(value)
if confirmed_duplicates:
raise Exception('Duplicates: ' + ', '.join(map(str, confirmed_duplicates)))

A set is probably the way to go here - collections.defaultdict is probably more than you need.
Don't forget to come up with a canonical order for your hands - like sort the cards from least to greatest, by suit and face value. Otherwise you might not detect some duplicates.

Related

Testing if a dictionary key ends in a letter?

tools = {"Wooden_Sword1" : 10, "Bronze_Helmet1 : 20}
I have code written to add items, i'm adding an item like so:
tools[key_to_find] = int(b)
the key_to_find is the tool and the b is the durability and i need to find a way so if i'm adding and Wooden_Sword1 already exists it adds a Wooden_Sword2 instead. This has to work with other items as well
As user3483203 and ShadowRanger commented, it's probably a bad idea to use numbers in your key string as part of the data. Manipulating those numbers will be awkward, and there are better alternatives. For instance, rather than storing a single value for each numbered key, use simple keys and store a list. The index into the list will take the place of the number in the key.
Here's how you could implement it:
tools = {"Wooden_Sword" : [10], "Bronze_Helmet" : [20]}
Add a new wooden sword with durability 10:
tools.setdefault("Wooden_Sword", []).append(10)
Find how many bronze helmets we have:
helmets = tools.get("Bronze_Helmet", [])
print("we have {} helmets".format(len(helmets)))
Find the first bronze helmet with a non-zero durability, and reduce it by 1:
helmets = tools.get("Bronze_Helmet", [])
for i, durability in helmets:
if durability > 0:
helmets[i] -= 1
break
else: # this runs if the break statement was never reached and the loop ran to completion
take_extra_damage() # or whatever
You could simplify some of this code by using a collections.defaultdict instead of a regular dictionary, but if you learn how to use get and setdefault it's not too hard to get by with the regular dict.
To ensure a key name is not taken yet, and add a number if it is, create the new name and test. Then increment the number if it is already in your list. Just repeat until none is found.
In code:
def next_name(basename, lookup):
if basename not in lookup:
return basename
number = 1
while basename+str(number) in lookup:
number += 1
return basename+str(number)
While this code does what you ask, you may want to look at other methods. A possible drawback is that there is no association between, say, WoodenShoe1 and WoodenShoe55 – if 'all wooden shoes' need their price increased, you'd have to iterate over all possible names between 1 and 55, just in case these existed at some time.
From what I understand of the question, your keys have 2 parts: "Name" and "ID". The ID is just an integer that starts at 1, so you can initialize a counter for every name:
numOfWoodenSwords = 0
And to add to the array:
numOfWoodenSwords += 1
tools["wodden_sword" + str(numOfWoodenSwords)] = int(b)
If you need to have an unknown amount of tools, I recommend looking at the re module: https://docs.python.org/3/library/re.html.
Or you could iterate over tools.keys to see if the entry exists.
You could write a function that determines if a character is a letter:
def is_letter(char):
return 65 <= ord(char) <= 90 or 97 <= ord(char) <= 122
Then when you are looking at a key in your dictionary, simply:
if is_letter(key[-1]):
...

Constantly getting IndexError and am unsure why in Python

I am new to python and really programming in general and am learning python through a website called rosalind.info, which is a website that aims to teach through problem solving.
Here is the problem, wherein you're asked to calculate the percentage of guanine and thymine to the string of DNA given to for each ID, then return the ID of the sample with the greatest percentage.
I'm working on the sample problem on the page and am experiencing some difficulty. I know my code is probably really inefficient and cumbersome but I take it that's to be expected for those who are new to programming.
Anyway, here is my code.
gc = open("rosalind_gcsamp.txt","r")
biz = gc.readlines()
i = 0
gcc = 0
d = {}
for i in xrange(biz.__len__()):
if biz[i].startswith(">"):
biz[i] = biz[i].replace("\n","")
biz[i+1] = biz[i+1].replace("\n","") + biz[i+2].replace("\n","")
del biz[i+2]
What I'm trying to accomplish here is, given input such as this:
>Rosalind_6404
CCTGCGGAAGATCGGCACTAGAATAGCCAGAACCGTTTCTCTGAGGCTTCCGGCCTTCCC
TCCCACTAATAATTCTGAGG
Break what's given into a list based on the lines and concatenate the two lines of DNA like so:
['>Rosalind_6404', 'CCTGCGGAAGATCGGCACTAGAATAGCCAGAACCGTTTCTCTGAGGCTTCCGGCCTTCCCTCCCACTAATAATTCTGAGG', 'TCCCACTAATAATTCTGAGG\n']
And delete the entry two indices after the ID, which is >Rosalind. What I do with it later I still need to figure out.
However, I keep getting an index error and can't, for the life of me, figure out why. I'm sure it's a trivial reason, I just need some help.
I've even attempted the following to limited success:
for i in xrange(biz.__len__()):
if biz[i].startswith(">"):
biz[i] = biz[i].replace("\n","")
biz[i+1] = biz[i+1].replace("\n","") + biz[i+2].replace("\n","")
elif biz[i].startswith("A" or "C" or "G" or "T") and biz[i+1].startswith(">"):
del biz[i]
which still gives me an index error but at least gives me the biz value I want.
Thanks in advance.
It is very easy do with itertools.groupby using lines that start with > as the keys and as the delimiters:
from itertools import groupby
with open("rosalind_gcsamp.txt","r") as gc:
# group elements using lines that start with ">" as the delimiter
groups = groupby(gc, key=lambda x: not x.startswith(">"))
d = {}
for k,v in groups:
# if k is False we a non match to our not x.startswith(">")
# so use the value v as the key and call next on the grouper object
# to get the next value
if not k:
key, val = list(v)[0].rstrip(), "".join(map(str.rstrip,next(groups)[1],""))
d[key] = val
print(d)
{'>Rosalind_0808': 'CCACCCTCGTGGTATGGCTAGGCATTCAGGAACCGGAGAACGCTTCAGACCAGCCCGGACTGGGAACCTGCGGGCAGTAGGTGGAAT', '>Rosalind_5959': 'CCATCGGTAGCGCATCCTTAGTCCAATTAAGTCCCTATCCAGGCGCTCCGCCGAAGGTCTATATCCATTTGTCAGCAGACACGC', '>Rosalind_6404': 'CCTGCGGAAGATCGGCACTAGAATAGCCAGAACCGTTTCTCTGAGGCTTCCGGCCTTCCCTCCCACTAATAATTCTGAGG'}
If you need order use a collections.OrderedDict in place of d.
You are looping over the length of biz. So in your last iteration biz[i+1] and biz[i+2] don't exist. There is no item after the last.

accessing values of a dictionary with duplicate keys

I have a dictionary that looks like this:
reply = {icon:[{name:whatever,url:logo1.png},{name:whatever,url:logo2.png}]}
how do i access logo1.png ?
I tried :
print reply[icon][url]
and it gives me a error:
list indices must be integers, not str
EDIT:
Bear in mind sometimes my dictionary changes to this :
reply = {icon:{name:whatever,url:logo1.png}}
I need a general solution which will work for both kinds of dictionaries
EDIT2:
My solution was like this :
try:
icon = reply['icon']['url']
print icon
except Exception:
icon = reply['icon'][0]['url']
print ipshit,icon
This works but looks horrible. I was wondering if there was an easier way than this
Have you tried this?
reply[icon][0][url]
If you know for sure all the different kinds of responses that you will get, you'll have to write a parser where you're explicitly checking if the values are lists or dicts.
You could try this if it is only the two possibilities that you've described:
def get_icon_url(reply):
return reply['icon'][0]['url']\
if type(reply['icon']) is list else reply['icon']['url']
so in this case, icon is the key to a list, that has two dictionaries with two key / value pairs in each. Also, it looks like you might want want your keys to be strings (icon = 'icon', name='name').. but perhaps they are variables in which case disregard, i'm going to use strings below because it seems the most correct
so:
reply['icon'] # is equal to a list: []
reply['icon'][0] # is equal to a dictionary: {}
reply['icon'][0]['name'] # is equal to 'whatever'
reply['icon'][0]['url'] # is equal to 'logo1.png'
reply['icon'][1] # is equal to the second dictionary: {}
reply['icon'][1]['name'] # is equal to 'whatever'
reply['icon'][1]['url'] # is equal to 'logo2.png'
you can access elements of those inner dictionaries by either knowing how many items are in the list, and reference theme explicitly as done above, or you can iterating through them:
for picture_dict in reply['icon']:
name = picture_dict['name'] # is equal to 'whatever' on both iterations
url = picture_dict['url'] #is 'logo1.png' on first iteration, 'logo2.png' on second.
Cheers!
Not so different, but maybe it looks better (KeyError gives finer control):
icon_data = reply['icon']
try:
icon = icon_data['url']
print icon
except KeyError:
icon = icon_data[0]['url']
print ipshit,icon
or:
icon_data = reply['icon']
if isinstance(icon_data, list):
icon_data = icon_data[0]
icon = icon_data['url']

Python dictionary key error.. big mess of dictionaries in dictionaries in lists

This is kind of convoluted, so if I'm missing out on an easy construct for this, please let me know :)
I'm analysing the results of some matching experiments. At the end game, I want to be able to query things such as experiments[0]["cat"]["cat"], which yields the number of times "cat" was matched against "cat". Conversely, I could do experiments[0]["cat"]["dog"], when the first query was a cat and the match attempt was a dog.
The following is my code to populate this structure:
# initializing the first layer, a list of dictionaries.
experiments = []
for assignment in assignments:
match_sums = {}
experiments.append(match_sums)
for i in xrange(len(classes)):
for experiment in xrange(len(experiments)):
# experiments[experiment][classes[i]] should hold a dictionary,
# where the keys are the things that were matched against classes[i],
# and the value is the number of times this occurred.
experiments[experiment][classes[i]] = collections.defaultdict(dict)
# matches[experiment][i] is an integer for what the i'th match was in an experiment.
# classes[j] for some integer j is the string name of the i'th match. could be "dog" or "cat".
experiments[experiment][classes[i]][classes[matches[experiment][i]]] += 1
total_class_sums[classes[i]] = total_class_sums.get(classes[i], 0) + 1
print experiments[0]["cat"]["cat"]
exit()
So clearly this is a bit convoluted. And I'm getting a value of "1" for the last match, rather than a full dictionary at experiments[0]["cat"]. Have I approached this wrong? What could the bug here be? Sorry for the craziness and thanks for any possible help!
Two points:
Dictionary keys can be tuples; and
If you're counting things, use collections.Counter. (You can use defaultdict(int), but Counter is more useful.)
So, instead of
experiments[experiment][classes[i]][classes[matches[experiment][i]]] += 1
write
experiments = Counter()
...
experiments[experiment, classes[i], classes[matches[experiment][i]]] += 1
I just trying to guess your needs, so i tried to change order of your dimensions.
for className, classIdx in enumerate(classes):
experiment = collections.defaultdict(list)
experiments[className] = experiment
for assignment,assignmentIdx in enumerate(assignments):
counterpart = classes[matches[assignmentIdx][classIdx]]
experiment[counterpart].append((assignment,assignmentIdx))
print(len(experiments["cat"]["cat"]), len(experiments["cat"]))

Is it possible, in python, to update or initialize a dictionary key with a single command?

For instance, say I want to build an histogram, I would go like that:
hist = {}
for entry in data:
if entry["location"] in hist:
hist[entry["location"]] += 1
else:
hist[entry["location"]] = 1
Is there a way to avoid the existence check and initialize or update the key depending on its existence?
What you want here is a defaultdict:
from collections import defaultdict
hist = defaultdict(int)
for entry in data:
hist[entry["location"]] += 1
defaultdict default-constructs any entry that doesn't already exist in the dict, so for ints they start out at 0 and you just add one for every item.
Yes, you can do:
hist[entry["location"]] = hist.get(entry["location"], 0) + 1
With reference types, you can often use setdefault for this purpose, but this isn't appropriate when the right hand side of your dict is just an integer.
Update( hist.setdefault( entry["location"], MakeNewEntry() ) )
I know you've already accepted an answer but just so you know, since Python 2.7, there's also the Counter module, which is explicitly made for such situations.
from collections import Counter
hist = Counter()
for entry in data:
hist[entry['location']] += 1
http://docs.python.org/library/collections.html#collections.Counter
Is ternary operator one command?
hist[entry["location"]] = hist[entry["location"]]+1 if entry["location"] in hist else 1
(edited because I messed it up the first time)

Categories

Resources