I hope you are all well.
This is how my data looks:
dictionary1 = {2876: 1, 9212: 1, 953997: 1, 9205: 1, 9206: 1, 9207: 1, 9208: 1, 9209: 1, 9210: 1, 9211: 1, 6908: 1, 1532: 1, 945237: 1, 6532: 2, 6432: 4}
data1 = [[2876, 5423],[2312, 4532],[953997, 5643]...]
I am trying to run a statement that looks like this:
for y in data1:
    if y[0] in dictionary1 and dictionary1[y[0]] == 1:
        dictionary1[y[1]] = 2
Presumably this would create a new dataset looking like this:
dictionary1 = {5423: 2, 953997: 2, 2876: 1, 9212: 1, 953997: 1, 9205: 1, 9206: 1, 9207: 1, 9208: 1, 9209: 1, 9210: 1, 9211: 1, 6908: 1, 1532: 1, 945237: 1, 6532: 2, 6432: 4}
What am I doing wrong? Is dictionary1[y[0]] == 1 the correct way to check a key's value?
Thank you everyone.
A dictionary comprehension converts the list of lists to a dictionary:
dict1 = {t[0]: t[1] for t in data1}
Then it should be easy to do what you want:
for key, target in dict1.items():
    if dictionary1.get(key) == 1:
        dictionary1[target] = 2
You can use dict.get(key, default) to avoid an exception for missing values, and provide a safe default. This reduces your loop to a single condition:
#!python3
dictionary1 = {2876: 1, 9212: 1, 953997: 1, 9205: 1, 9206: 1, 9207: 1, 9208: 1, 9209: 1, 9210: 1, 9211: 1, 6908: 1, 1532: 1, 945237: 1, 6532: 2, 6432: 4}
data1 = [[2876, 5423],[2312, 4532],[953997, 5643]]
for x,y in data1:
    if dictionary1.get(x, 0) == 1:
        dictionary1[y] = 2
print(dictionary1)
You could use dict.update(other) to bulk-overwrite the values in dictionary1 with a one-liner dict comprehension:
dictcompr = {b:2 for a,b in data1 if dictionary1.get(a,0) == 1}
dictionary1.update(dictcompr)
And then you can combine them into one single, unholy, unmaintainable, barely-readable mess:
dictionary1.update({b:2 for a,b in data1 if dictionary1.get(a,0) == 1})
Update:
To delete all keys having a value of 1, you have some choices:
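# Iterate over a snapshot: deleting from a dict while iterating its live view raises RuntimeError in Python 3
for k,v in list(dictionary1.items()):
    if v == 1:
        del dictionary1[k]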
# Versus:
d2 = dict(filter(lambda item: item[1] != 1, dictionary1.items()))
dictionary1 = d2
# or
dictionary1.clear()
dictionary1.update(d2)
Frankly, for your purposes the for loop is probably better. The filter approach can take the lambda as a parameter, to configure what gets filtered. Using clear()/update() is a win if you expect multiple references to the dictionary. That is, A = B = dictionary1. In this case, clear/update would keep the same underlying object, so the linkage still holds. (This is also true of the for loop - the benefit is solely for the filter which requires a temporary.)
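To make the point about multiple references concrete, here is a minimal sketch. The names `alias` and `filtered` and the shortened sample dictionary are made up for the illustration; only `dictionary1` comes from the question.

```python
# Two names bound to the same dict object.
dictionary1 = {2876: 1, 6532: 2, 6432: 4}
alias = dictionary1

# Build a filtered temporary that keeps every entry whose value is not 1.
filtered = {k: v for k, v in dictionary1.items() if v != 1}

# Rebinding (dictionary1 = filtered) would leave `alias` on the old dict.
# clear()/update() mutates the existing object, so every reference sees it.
dictionary1.clear()
dictionary1.update(filtered)

print(alias)                 # {6532: 2, 6432: 4}
print(alias is dictionary1)  # True
```

If nothing else holds a reference to the dictionary, plain rebinding is simpler and does the same job.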
Please try this:
for y in data1:
    if y[0] in dictionary1 and dictionary1[y[0]] == 1:
        dictionary1[y[1]] = 2
You can simply use:
for y in data1:
    if y[0] in dictionary1:  # dict.has_key() was removed in Python 3; note this skips the value check
        dictionary1[y[1]] = 2
Hope this is what you are looking for.
Related
Hi all, I have a dictionary and a list of keys:
Diz = {'X080213_2_0004_2_000005': {'cHMW': 1, 'sRib': 9}, 'X280113_1_0002_2_000003': {'cMMW': 1, 'sRib': 7}}
L = ['Triangle','Traingle5R','Rectangle','CircularMMW','CircularHMW']
I would like to fill each inner dictionary with the keys from the list that are missing, setting them to zero, while keeping the existing keys with their respective values:
Diz = {'X080213_2_0004_2_000005': {'cHMW': 1, 'sRib': 9}, 'X280113_1_0002_2_000003': {'cMMW': 1, 'sRib': 7}}
L = ['Triangle','Traingle5R','Rectangle','cMMW','cHMW',"sRib"]
I am trying this code, but it sets all keys to zero, including the ones that already have a value:
for el in L:
    for k,v in Diz.items():
        for k2,v2 in v.items():
            if el not in k2:
                Diz[k][el] = 0
print(Diz)
I would like to have this output:
Diz = {'X080213_2_0004_2_000005': {'cHMW': 1, 'sRib': 9,'Triangle':0,'Traingle5R':0,'Rectangle':0,'cMMW':0}, 'X280113_1_0002_2_000003': {'cMMW': 1, 'sRib': 7,'Triangle':0,'Traingle5R':0,'Rectangle':0,"cHMW":0}}
At the end I would also like to produce a table in a txt file, with one line per key holding that key and its dictionary values.
You can iterate through the keys of the outer dictionary and, for each item in the list, call dict.get on the inner dictionary with 0 as the default value, assigning the result back to the inner dictionary. Existing values are kept; missing keys become 0.
for key in Diz:
    for item in L:
        Diz[key][item] = Diz[key].get(item, 0)
OUTPUT:
{'X080213_2_0004_2_000005': {'cHMW': 1, 'sRib': 9, 'Triangle': 0, 'Traingle5R': 0, 'Rectangle': 0, 'cMMW': 0}, 'X280113_1_0002_2_000003': {'cMMW': 1, 'sRib': 7, 'Triangle': 0, 'Traingle5R': 0, 'Rectangle': 0, 'cHMW': 0}}
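The question also asks for a text table with one line per key. The following is only a sketch of one way to do that: the function name `write_table`, the tab-separated layout, the `name` header, and the file name `table.txt` mentioned in the comment are all assumptions, not something the original posts specify. Because it reads values with dict.get and a default of 0, it works whether or not the inner dictionaries were filled beforehand.

```python
import io

Diz = {
    'X080213_2_0004_2_000005': {'cHMW': 1, 'sRib': 9},
    'X280113_1_0002_2_000003': {'cMMW': 1, 'sRib': 7},
}
L = ['Triangle', 'Traingle5R', 'Rectangle', 'cMMW', 'cHMW', 'sRib']

def write_table(diz, columns, out):
    """Write a header plus one tab-separated line per outer key; missing columns count as 0."""
    out.write('\t'.join(['name'] + columns) + '\n')
    for name, inner in diz.items():
        row = [str(inner.get(col, 0)) for col in columns]
        out.write('\t'.join([name] + row) + '\n')

# An in-memory buffer stands in for a file here. With a real file you would
# pass the object returned by open('table.txt', 'w') instead (hypothetical name).
buffer = io.StringIO()
write_table(Diz, L, buffer)
table_text = buffer.getvalue()
print(table_text)
```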
I'm newbie to python. I have this program:
wordlist = ['pea', 'rpai', 'rpai', 'schiai', 'pea', 'rpe', 'zoi', 'zoi', 'briai', 'rpe']
dictionary = {}
counter = 0
result = list(map(lambda x: dictionary[wordlist[x]] = dictionary.get(wordlist[x], counter +=1), wordlist))
print(result)
Result has to be:
result = [0, 1, 1, 2, 0, 3, 4, 4, 5, 3]
What I have to do is add every element of the list (as a key) to the dictionary, with an incremental counter as the value of the key. With this code I get "lambda cannot contain assignment". How can I do this? Thank you!
EDIT FOR EXPLANATION:
From the list of strings I have to create a dictionary with each string as a key and a counter as its value.
The value is calculated like this:
The first element of the list gets 0.
A following element that is a new string (never seen before) gets the last value (in this case 0) plus 1.
If instead the element is a duplicate string (there is already one in the dictionary), it takes the same value as its first occurrence.
The dictionary will be:
{'pea': 0, 'rpai': 1, 'rpai': 1, 'schiai': 2, 'pea': 0, 'rpe': 3,
'zoi': 4, 'zoi': 4, 'briai': 5,'rpe': 3}
And result instead with list will be:
[0, 1, 1, 2, 0, 3, 4, 4, 5, 3]
I guess the easiest solution with vanilla Python is to use defaultdict:
from collections import defaultdict
wordlist = ["pea","rpai","rpai","schiai","pea","rpe", "zoi","zoi","briai","rpe"]
vocab = defaultdict(lambda: len(vocab))
# result will be [0, 1, 1, 2, 0, 3, 4, 4, 5, 3]
result = [vocab[word] for word in wordlist]
A more verbose equivalent, leading to the same result:
vocab = {}
result = []
for word in wordlist:
    if word not in vocab:
        vocab[word] = len(vocab)
    result.append(vocab[word])
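As for why the defaultdict version works: the factory is called only when a key is missing, and at that moment len(vocab) still equals the number of distinct words stored so far. A small sketch of that behaviour, using two words from the question (the variable names `first`, `second` and `again` are just for illustration):

```python
from collections import defaultdict

vocab = defaultdict(lambda: len(vocab))

first = vocab['pea']    # missing: factory runs while vocab is empty, giving 0
second = vocab['rpai']  # missing: vocab already holds one key, giving 1
again = vocab['pea']    # present: factory is not called, the stored 0 is returned

print(first, second, again)  # 0 1 0
print(dict(vocab))           # {'pea': 0, 'rpai': 1}
```

One thing to keep in mind with this approach is that merely looking up an unknown word inserts it, so a plain dict with an explicit membership test is safer if the vocabulary must stay fixed later on.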
Update:
Use dictionary's setdefault then.
wordlist = ["pea","rpai","rpai","schiai","pea","rpe", "zoi","zoi","briai","rpe"]
dic = {}
res = list(map(lambda x: dic.setdefault(x, len(dic)), wordlist))
print(res)
A dictionary can't have duplicate keys. You just need a for loop:
wordlist = ["pea","rpai","rpai","schiai","pea","rpe", "zoi","zoi","briai","rpe"]
c = 0
dic = {}
res = []
for word in wordlist:
    if word in dic:
        res.append(dic[word])
    else:
        dic[word] = c
        res.append(c)
        c += 1
print(res)
Once you have the dictionary built the code for the lambda will be as follows.
list(map(lambda x: dictionary[x], wordlist))
This assumes you already have the keys and values of the dictionary populated. Is this the case, like so?
{'pea': 0, 'rpai': 1, 'schiai': 2, 'rpe': 3, 'zoi': 4, 'briai': 5, 'rpei': 6}
All you need to do is make a dict with the first occurrence of each word, then look up each word in the dict. Don't use map and lambda, they'll only make it harder, or at least less readable.
first_occ = {}
counter = 0
for word in wordlist:
    if word not in first_occ:
        first_occ[word] = counter
        counter += 1
result = [first_occ[w] for w in wordlist]
print(result) # -> [0, 1, 1, 2, 0, 3, 4, 4, 5, 3]
I have two dictionaries, and I want to compare them and see what is different between the two. Where I am getting confused is the dict. Is there a name for this?
Everything is working fine, I just don't really understand why it works or what it is doing.
x = {"#04": 0, "#05": 0, "#07": 0, "#08": 1, "#09": 0, "#10": 0, "#11": 1, "#12": 1, "#14": 1, "#15": 1, "#17": 0, "#18": 1, "#19": 1, "#20": 1}
y = {"#04": 1, "#05": 0, "#07": 0, "#08": 1, "#09": 0, "#10": 0, "#11": 1, "#12": 1, "#14": 1, "#15": 0, "#17": 1, "#18": 1, "#19": 0, "#20": 1}
dict = {k: x[k] for k in x if y[k] != x[k]}
list = []
for k, v in dict.items():
    if v==0:
        difference = k + ' became ' + '0'
        list.append(difference)
    else:
        difference = k + ' became ' + '1'
        list.append(difference)
print(list)
It should print ['#04 became 0', '#15 became 1', '#17 became 0', '#19 became 1'] but I don't understand how the dict works to loop through the x and y dictionaries.
The procedure implemented is comparing two dictionaries assuming that both have the same keys (potentially y could have more entries).
To make this comparison quick, and facilitate the next code block, they decided to generate a dictionary that only contains the keys that have different values.
To generate such dictionary, they use a "dictionary comprehension", which is very efficient.
Now, this construct:
d = {k: x[k] for k in x if y[k] != x[k]}
can be rewritten as:
d = {}
for k,v in x.items(): # for each key->value pair in dictionary x
    if y[k] != x[k]: # if the corresponding elements are different
        d[k] = x[k] # store the key->value pair in the new dictionary
You could replace x[k] with v above.
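The comprehension relies on every key of x also being present in y; otherwise y[k] raises a KeyError. If that cannot be guaranteed, one possible variation (a sketch, not part of the original code) is to compare only the keys both dictionaries share. The sample data below is shortened and the key "#99" is made up to show the effect.

```python
x = {"#04": 0, "#05": 0, "#15": 1, "#99": 1}   # "#99" exists only in x
y = {"#04": 1, "#05": 0, "#15": 0}

# Key views support set operations, so & yields the keys present in both.
shared = x.keys() & y.keys()

# Same filter as the comprehension, restricted to the shared keys.
# sorted() only makes the resulting order predictable.
changed = {k: x[k] for k in sorted(shared) if x[k] != y[k]}

print(changed)  # {'#04': 0, '#15': 1}
```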
I have a series of functions that end up giving a list, with the first item containing a number, derived from the dictionaries, and the second and third items are dictionaries.
These dictionaries have been previously randomly generated.
The function I am using generates a given number of these dictionaries, trying to get the highest number possible as the first item. (It's designed to optimise dice rolls).
This all works fine, and I can print the value of the highest first item from all iterations. However, when I try and print the two dictionaries associated with this first number (bearing in mind they're all in a list together), it just seemingly randomly generates the two other dictionaries.
def repeat(type, times):
    best = 0
    for i in range(0, times):
        x = rollForCharacter(type)
        if x[0] > best:
            print("BEST:", x)
            best = x[0]
    print("The highest average success is", best)
    return best
This works great. The last thing shown is:
BEST: (3.58, [{'strength': 4, 'intelligence': 1, 'charisma': 1, 'stamina': 4, 'willpower': 2, 'dexterity': 2, 'wits': 5, 'luck': 2}, {'agility': 1, 'brawl': 2, 'investigation': 3, 'larceny': 0, 'melee': 1, 'survival': 0, 'alchemy': 3, 'archery': 0, 'crafting': 0, 'drive': 1, 'magic': 0, 'medicine': 0, 'commercial': 0, 'esteem': 5, 'instruction': 2, 'intimidation': 2, 'persuasion': 0, 'seduction': 0}])
The highest average success is 3.58
But if I try something to store the list which gave this number:
def repeat(type, times):
    best = 0
    bestChar = []
    for i in range(0, times):
        x = rollForCharacter(type)
        if x[0] > best:
            print("BEST:", x)
            best = x[0]
            bestChar = x
    print("The highest average success is", best)
    print("Therefore the best character is", bestChar)
    return best, bestChar
I get this as the last result, which is fine:
BEST: (4.15, [{'strength': 2, 'intelligence': 3, 'charisma': 4, 'stamina': 4, 'willpower': 1, 'dexterity': 2, 'wits': 4, 'luck': 1}, {'agility': 1, 'brawl': 0, 'investigation': 5, 'larceny': 0, 'melee': 0, 'survival': 0, 'alchemy': 7, 'archery': 0, 'crafting': 0, 'drive': 0, 'magic': 0, 'medicine': 0, 'commercial': 1, 'esteem': 0, 'instruction': 3, 'intimidation': 0, 'persuasion': 0, 'seduction': 0}])
The highest average success is 4.15
but the last line is
Therefore the best character is (4.15, [{'strength': 1, 'intelligence': 3, 'charisma': 4, 'stamina': 4, 'willpower': 1, 'dexterity': 2, 'wits': 2, 'luck': 3}, {'agility': 1, 'brawl': 0, 'investigation': 1, 'larceny': 4, 'melee': 2, 'survival': 0, 'alchemy': 2, 'archery': 4, 'crafting': 0, 'drive': 0, 'magic': 0, 'medicine': 0, 'commercial': 1, 'esteem': 0, 'instruction': 0, 'intimidation': 2, 'persuasion': 1, 'seduction': 0}])
As you can see this doesn't match with what I want, and what is printed literally right above it.
Through a little bit of checking, I realised what it gives out as the "Best Character" is just the last one generated, which is not the best, just the most recent. However, it isn't that simple, because the first element IS the highest result that was recorded, just not from the character in the rest of the list. This is really confusing because it means the list is somehow being edited but at no point can I see where that would happen.
Am I doing something stupid whereby the character is randomly generated every time? I wouldn't think so since x[0] gives the correct result and is stored fine, so what changes when it's the whole list?
From the function rollForCharacter() it returns rollResult, character which is just the number and then the two dictionaries.
I would greatly appreciate it if anyone could figure out and explain where I'm going wrong and why it can print the correct answer to the console yet not store it correctly a line below!
EDIT:
Dictionary 1 Code:
from random import randint, shuffle
attributes = {}
def assignRow(row, p): # p is the number of points you have to assign to each row
    rowValues = {}
    for i in range(0, len(row)-1):
        val = randint(0, p)
        rowValues[row[i]] = val + 1
        p -= val
    rowValues[row[-1]] = p + 1
    return attributes.update(rowValues)
def getPoints():
    points = [7, 5, 3]
    shuffle(points)
    row1 = ['strength', 'intelligence', 'charisma']
    row2 = ['stamina', 'willpower']
    row3 = ['dexterity', 'wits', 'luck']
    for i in range(0, len(points)):
        row = eval("row" + str(i+1))
        assignRow(row, points[i])
Dictionary 2 Code:
from random import randint, shuffle
skills = {}
def assignRow(row, p): # p is the number of points you have to assign to each row
    rowValues = {}
    for i in range(0, len(row) - 1):
        val = randint(0, p)
        rowValues[row[i]] = val
        p -= val
    rowValues[row[-1]] = p
    return skills.update(rowValues)
def getPoints():
    points = [11, 7, 4]
    shuffle(points)
    row1 = ['agility', 'brawl', 'investigation', 'larceny', 'melee', 'survival']
    row2 = ['alchemy', 'archery', 'crafting', 'drive', 'magic', 'medicine']
    row3 = ['commercial', 'esteem', 'instruction', 'intimidation', 'persuasion', 'seduction']
    for i in range(0, len(points)):
        row = eval("row" + str(i + 1))
        assignRow(row, points[i])
It does look like the dictionary is being re-generated. That could easily happen if rollForCharacter returns a generator, or if it updates a global variable that a subsequent cycle of the loop then overwrites.
A simple-but-hacky way to solve the problem would be to take a deep copy of the result at the time of storing, so that you're sure you're keeping the values at that point:
import copy
def repeat(type, times):
    best = 0
    bestChar = []
    for i in range(0, times):
        x = rollForCharacter(type)
        if x[0] > best:
            print("BEST:", x)
            best = x[0]
            # x[1] is a list of dicts, so a shallow .copy() would still share them;
            # deepcopy snapshots the number and both dictionaries
            bestChar = copy.deepcopy(x)
    return best, bestChar
The correct answer, however, would be to pass a unique dictionary that is not affected by later code.
See this SO answer with a bit more context about how passing a reference to a dictionary can be risky as it's still a mutable object.
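To show the suspected mechanism in isolation, here is a minimal sketch. The function `roll_for_character` below is a hypothetical stand-in, not the asker's rollForCharacter: it simply assumes, as the posted code suggests, that each roll updates a module-level dictionary and returns a reference to it.

```python
import copy

attributes = {}  # module-level dict reused by every roll, as in the question

def roll_for_character(strength):
    """Hypothetical stand-in: mutates the shared dict and returns a reference to it."""
    attributes.update({'strength': strength})
    return (strength, [attributes])

first = roll_for_character(4)
kept_reference = first             # stores the very same dict object
kept_copy = copy.deepcopy(first)   # stores an independent snapshot

second = roll_for_character(1)     # a later, worse roll overwrites the shared dict

print(kept_reference)  # (4, [{'strength': 1}])  score kept, dict from the last roll
print(kept_copy)       # (4, [{'strength': 4}])  snapshot still matches its score
```

This reproduces the symptom described in the question: the stored number is right because it is an immutable float inside the tuple, while the dictionaries reflect whichever roll happened last.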
I am trying to find the sub-substring(s) with a certain frequency in a list of substrings (i.e. windows) of a larger string in Python. That is, the goal is to find which sub-substrings (of a fixed length) are present, with a certain required frequency, in at least one of the substrings:
strand='JJJKJKHHGHJKKLHHGJJJHHGJJJ'
#now, I break the string by windows (substrings) and define the patterns to look (sub-substrings) :
A=20 #(fixed length of each window (substring) moving along the string in a one-by-one way)
B=3 #(fixed length of the pattern (sub-substring))
C=3 #(frequency of pattern (sub-substring))
pattcount = {}
for i in range(0, len(strand)-A+1):
    win=strand[i:i+A]
    for n in range(0, len(win)-B+1):
        patt=win[n:n+B]
        pattcount[patt] = pattcount[patt] + 1 if patt in pattcount else 1
pattgroup = []
for p,f in pattcount.items():
    if f != C:
        pattgroup = pattgroup
    elif f == C:
        pattgroup += [p]
print (" ".join(pattgroup))
therefore, I obtain as a result:
JKJ
When the answer should be only:
HHG (it is contained C=3 times in a window of length 20)
And not JKJ or JJJ (the latter is contained C=3 times, but in the entire string, not in a window of length 20)
What am I doing wrong? How can I find just the patterns present in the needed frequency but in at least one window? (without adding any match of the pattern from other windows to the final count)
Thanks in advance.
The problem is that the program runs through the windows (7 in this case) and counts the frequency of every pattern without resetting. So in the end, it finds that HHG is encountered 18 times (like badc0re says). You ask that it finds 3 repetitions so JKJ is provided because it is found only 3 times throughout all the windows.
So basically, you have to reset that pattcount variable every time you start a new window. You can store the pattcount of each window in another variable such as pattcounttotal. I've made a rough example:
strand='JJJKJKHHGHJKKLHHGJJJHHGJJJ'
#now, I break the string by windows (substrings) and define the patterns to look (sub-substrings) :
A=20 #(fixed length of each window (substring) moving along the string in a one-by-one way)
B=3 #(fixed length of the pattern (sub-substring))
C=3 #(frequency of pattern (sub-substring))
pattcounttotal = {} #So in this var everything will be stored
for i in range(0, len(strand)-A+1):
    pattcount = {} #This one is moved inside the for loop so it is emptied with every window
    win=strand[i:i+A]
    for n in range(0, len(win)-B+1):
        patt=win[n:n+B]
        #I've partly rewritten your code to make it more readable (for me)
        if patt in pattcount:
            pattcount[patt] = pattcount[patt] + 1
        else:
            pattcount[patt] = 1
    pattcounttotal[i] = pattcount #This pattcount is stored into the total one
Now you have to go through pattcounttotal (instead of pattcount) to find a pattern which is repeated three times over only one window.
The output, I receive when I "print" the pattcounttotal var, is given below:
{0: {'HGH': 1, 'KJK': 1, 'KLH': 1, 'HJK': 1, 'LHH': 1, 'KHH': 1, 'KKL': 1, 'GJJ': 1, 'HGJ': 1, 'JJK': 1, 'JJJ': 2, 'JKJ': 1, 'JKK': 1, 'JKH': 1, 'HHG': 2, 'GHJ': 1},
1: {'JJH': 1, 'HGH': 1, 'KJK': 1, 'JKK': 1, 'KLH': 1, 'LHH': 1, 'KHH': 1, 'KKL': 1, 'GJJ': 1, 'HGJ': 1, 'JJK': 1, 'HJK': 1, 'JKJ': 1, 'JKH': 1, 'HHG': 2, 'GHJ': 1, 'JJJ': 1},
2: {'JHH': 1, 'HGH': 1, 'KJK': 1, 'JJH': 1, 'KLH': 1, 'LHH': 1, 'KHH': 1, 'KKL': 1, 'GJJ': 1, 'HGJ': 1, 'JKH': 1, 'HJK': 1, 'JKJ': 1, 'JKK': 1, 'HHG': 2, 'GHJ': 1, 'JJJ': 1},
3: {'JHH': 1, 'KJK': 1, 'JJH': 1, 'KLH': 1, 'LHH': 1, 'KHH': 1, 'KKL': 1, 'GJJ': 1, 'HGJ': 1, 'JKH': 1, 'HJK': 1, 'HGH': 1, 'JKK': 1, 'HHG': 3, 'GHJ': 1, 'JJJ': 1},
4: {'JHH': 1, 'JJH': 1, 'KLH': 1, 'LHH': 1, 'KHH': 1, 'KKL': 1, 'GJJ': 1, 'HGJ': 2, 'HHG': 3, 'GHJ': 1, 'HGH': 1, 'JKK': 1, 'JKH': 1, 'HJK': 1, 'JJJ': 1},
5: {'JHH': 1, 'JJH': 1, 'KLH': 1, 'LHH': 1, 'KHH': 1, 'KKL': 1, 'GJJ': 2, 'HHG': 3, 'GHJ': 1, 'HGH': 1, 'JKK': 1, 'HGJ': 2, 'HJK': 1, 'JJJ': 1},
6: {'JHH': 1, 'JJH': 1, 'KLH': 1, 'LHH': 1, 'KKL': 1, 'GJJ': 2, 'HHG': 3, 'GHJ': 1, 'HGH': 1, 'JKK': 1, 'HGJ': 2, 'HJK': 1, 'JJJ': 2}}
So basically each window is numbered from 0-6 and the frequencies of each pattern are presented. So if I glance over it quickly, I see that in windows 3-5 the HHG pattern is encountered 3 times.
(If you want an "automated" output, then you still have to write a piece of code that cycles through each window (so through the upper level of the dictionary) and keeps track of the patterns that are encountered three times.)
I thought it was a fun problem so I made that "automated" output too. It was very similar to what you already had. Here is the code I used (hopefully it's not a spoiler :) )
pattgroup = []
for window, counts in pattcounttotal.items(): #window is the window's number and counts is the dictionary that contains the frequencies
    for p,f in counts.items():
        if f == C and p not in pattgroup: #This makes sure HHG is not already in pattgroup.
            pattgroup += [p]
print (" ".join(pattgroup))
If I put these two codes together and run the program, I get HHG back.
You can simplify to
from collections import defaultdict
strand='JJJKJKHHGHJKKLHHGJJJHHGJJJ'
print(strand)
A=20
B=3
C=3
D = A-B+1 # number of pattern positions inside one window
pattcount = defaultdict(int)
res = set()
for i in range(len(strand)-A+1):
    pattcount.clear() # start every window with fresh counts
    win=strand[i:i+A]
    for n in range(D):
        pattcount[win[n:n+B]] += 1
    res.update(k for k,v in pattcount.items() if v==C)
    print('i==%d res==%s' % (i,res))
print (" ".join(res))
Edit
We can even avoid using win:
from collections import defaultdict
strand='JJJKJKHHGHJKKLHHGJJJHHGJJJ'
print(strand)
A=20
B=3
C=3
D = A-B+1 # number of pattern positions inside one window
pattcount = defaultdict(int)
res = set()
for i in range(len(strand)-A+1):
    pattcount.clear() # start every window with fresh counts
    for n in range(i,i+D):
        pattcount[strand[n:n+B]] += 1
    res.update(k for k,v in pattcount.items() if v==C)
print (" ".join(res))
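For comparison, the same per-window counting can be sketched with collections.Counter. The function name `patterns_with_frequency` and its parameter names are made up for this illustration; the logic is the same idea as in the answers above (count inside each window separately, then collect the patterns that hit the required frequency in at least one window).

```python
from collections import Counter

strand = 'JJJKJKHHGHJKKLHHGJJJHHGJJJ'
A = 20  # window length
B = 3   # pattern length
C = 3   # required frequency within a single window

def patterns_with_frequency(text, window, size, freq):
    """Return the patterns that occur exactly `freq` times in at least one window."""
    found = set()
    for start in range(len(text) - window + 1):
        win = text[start:start + window]
        # A fresh Counter per window, so counts never leak between windows.
        counts = Counter(win[n:n + size] for n in range(window - size + 1))
        found.update(p for p, f in counts.items() if f == freq)
    return found

result = patterns_with_frequency(strand, A, B, C)
print(" ".join(sorted(result)))  # HHG
```

Returning a set keeps each pattern once even when it qualifies in several windows, which replaces the explicit "not already in pattgroup" check.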