I am trying to bin the values in my data and put them in a dictionary in Python.
However, after creating the dictionary, its keys show weird artifacts, like 0.6900000000000001 instead of 0.69. They only appear after creating the dictionary, though; the initial array "key_range" has only normal values. Therefore, the last two lines of my code produce KeyErrors, since the key 0.69 does not exist.
Does anyone know what is going on? Is it wrong to use the zip-function? Can I not create a functioning dictionary like this? I suppose I can iterate through the key values and round them manually, but I imagine there are more elegant solutions.
Cheers, and thanks
import numpy as np
key_range = np.arange(0, 1, 0.01) # these numbers are perfectly OK.
values = [0] * len(key_range)
value_dict = dict(zip(key_range, values)) # and here, I get weird artifacts.
print(value_dict)
for i in range(0, len(data)):
    value_dict[data[i]] = value_dict[data[i]] + 1
"I suppose I can iterate through the key values and round them manually, but I imagine there are more elegant solutions." For what it is worth, you can round them within the expression that creates value_dict, which still looks pretty elegant to me:
value_dict = dict(zip(map(lambda x: round(x,2), key_range), values))
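As for what is going on: the keys were never exactly 0.69 to begin with. NumPy prints arrays with 8 digits of precision by default, so the array only looks clean, while the dictionary shows the full float repr. A minimal sketch of one way to avoid float keys altogether is to key the bins by integer index; the sample `data` below is hypothetical, since the question does not show it.

```python
# Integer bin indices 0..99 stand in for the keys 0.00..0.99.
counts = {i: 0 for i in range(100)}

# Hypothetical sample data; the question's `data` is not shown.
data = [0.69, 0.07, 0.69, 0.3]

for x in data:
    # Rounding x * 100 to the nearest integer absorbs the float error.
    counts[int(round(x * 100))] += 1

print(counts[69])  # number of values that fell into the 0.69 bin
```

Integer keys compare exactly, so the lookup no longer depends on how a particular float happened to be computed.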
I am writing a script to calculate the Euclidean distances between an X value and many other values in a dictionary. Each distance is a float, which I then put in a list. The problem is that I don't get one list with all the results, but many lists with a single element each.
My script for the moment is:
single_mineral = {}
for k in new_dict.keys():
    single_mineral = new_dict[k]
    Zeff = single_mineral["Zeff_norm"]
    rhoe = single_mineral["Rhoe_norm"]
    eucl_Zeff = (calculated_Zeff_norm, Zeff)
    eucl_rhoe = (calculated_rhoe_norm, rhoe)
    dst = [(distance.euclidean(eucl_Zeff, eucl_rhoe))]
    print(dst)
I obtain something like that:
[0.29205348037179407]
[0.23436642937625374]
[0.3835446564476642]
[0.11616594912309205]
[0.21792958584034935]
and they are not linked in any way (so I can't use itertools.chain).
I want to create a single list with all these lists (the final goal is the ascending order...for this reason I need only one list).
I guess the solution is a for loop but I have no idea how to do it. I don't understand where it needs to run and how can I add my outcomes, which are always called "dst"?
Please, help me! Thank you very much in advance!
If you want everything in one list, then you need:
# before loop
dst = []
# loop
for k in new_dict.keys():
    # ... code ...
    #dst.append( [distance.euclidean(eucl_Zeff, eucl_rhoe)] )
    dst.append( distance.euclidean(eucl_Zeff, eucl_rhoe) )
# after loop
print(dst)
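Since the stated goal is ascending order, a rough sketch of the full loop with a final sort might look like the following. The mineral names and numbers are made up for illustration, and `math.dist` from the standard library (Python 3.8+) is swapped in for `distance.euclidean` so the example runs without SciPy.

```python
import math

# Hypothetical stand-ins for the question's data.
new_dict = {
    "quartz": {"Zeff_norm": 0.40, "Rhoe_norm": 0.55},
    "calcite": {"Zeff_norm": 0.62, "Rhoe_norm": 0.58},
    "halite": {"Zeff_norm": 0.75, "Rhoe_norm": 0.47},
}
calculated_Zeff_norm = 0.60
calculated_rhoe_norm = 0.50

dst = []  # created once, before the loop
for name, mineral in new_dict.items():
    eucl_Zeff = (calculated_Zeff_norm, mineral["Zeff_norm"])
    eucl_rhoe = (calculated_rhoe_norm, mineral["Rhoe_norm"])
    # math.dist is used here in place of scipy's distance.euclidean.
    dst.append((math.dist(eucl_Zeff, eucl_rhoe), name))

dst.sort()  # tuples sort by distance first, so this is ascending by distance
print(dst)
```

Storing `(distance, name)` pairs is an optional design choice: it keeps track of which entry each distance belongs to after sorting.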
EDIT: I'm sorry, I realized that I didn't give the full question; this edit should contain the full question. I thought that my previous question was similar enough to the full question that I could get the answer that way without having to add too many additional details, but I was wrong.
I have a dictionary params in python, where each entry is a numpy array of some sort. But the arrays in this dictionary are not all of the same shape. Some are size (n,m), some are size (1,m), some still are totally different sizes. I can call a specific number in this dictionary via:
params[key][i,j]
I need to loop over x of the keys in this dictionary, so I have a loop like:
for i in range(0, x):
    W = params['W' + str(i)]
    for j in range(0, W.shape[0]):  # size of axis 0 of the array at this key
        for k in range(0, W.shape[1]):  # size of axis 1 of the array at this key
            stuff = f(W[j, k])
I want to create a new dictionary paramscheck which will contain the elements that I'm going to loop over, but the specific numbers in paramscheck will be some function of the corresponding numbers in params.
I can't quite figure out the best way to do this.
If I write code like:
paramscheck = {}
for i in range(0, x):
    for j in range(0, n):
        for k in range(0, m):
            paramscheck['W' + str(i)][j, k] = f(params['W' + str(i)][j, k])
Then I get an error (a KeyError), because paramscheck started out empty and has no entry for that key yet.
If I copy params to paramscheck first and then do the loop with a line like
paramscheck = params
Then, I won't have issue accessing any entry in paramscheck, but my params dictionary will be modified simultaneously (from what I understand of Python, dictionaries are not actually copied, it just makes an extra label that references to the one underlying dictionary).
I've tried something like paramscheck = params.copy() and paramscheck = dict(params), but still the params dictionary is being modified when I go to modify paramscheck.
How should I accomplish this?
If your operation is elementwise by nature (e.g. x + 1), then you could just use something like
def elementwise_f(x):
    return f(x)
paramscheck = {k: elementwise_f(v) for k, v in params.items()}
otherwise something like the following might also work well
import numpy as np
from functools import partial
from collections import defaultdict
paramscheck = defaultdict(partial(np.zeros, (10, 20))) # instead of {}
If you just want to initialize the new dictionary with the old values, and edit it without changing the old one, try deepcopy instead of copy:
from copy import deepcopy
paramscheck = deepcopy(params)
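To illustrate why `params.copy()` and `dict(params)` were not enough: both create a new dictionary, but its values are still the same objects as in the original. A small sketch, using nested lists as a stand-in for the arrays (the same aliasing applies to NumPy arrays):

```python
from copy import deepcopy

# Nested lists stand in for the arrays in the question.
params = {"W0": [[1.0, 2.0], [3.0, 4.0]]}

shallow = params.copy()       # new dict, but the same inner objects
shallow["W0"][0][0] = 99.0    # this also changes params["W0"]

deep = deepcopy(params)       # inner objects are copied as well
deep["W0"][0][0] = -1.0       # params is unaffected

print(params["W0"][0][0])     # 99.0, changed through the shallow copy only
```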
paramscheck = {}
for key in params:
    paramscheck[key] = {} # this line
    for i in range(0, n):
        for j in range(0, m):
            paramscheck[key][i,j] = f(params[key][i,j])
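Because the arrays have different shapes, one possible variation is to allocate each output array from the shape of its source rather than from fixed n and m. The sketch below assumes 2-D arrays; the function `f` and the sample arrays are hypothetical placeholders, since the question does not define them.

```python
import numpy as np

def f(value):
    # Hypothetical scalar function; the question does not define f.
    return 2.0 * value

# Hypothetical sample data with two different shapes.
params = {
    "W0": np.arange(6, dtype=float).reshape(2, 3),
    "W1": np.ones((1, 4)),
}

paramscheck = {}
for key, arr in params.items():
    out = np.empty_like(arr)            # same shape and dtype as the source
    for i in range(arr.shape[0]):
        for j in range(arr.shape[1]):
            out[i, j] = f(arr[i, j])
    paramscheck[key] = out              # params itself is never written to
```

Each entry of paramscheck is a fresh array, so editing it later cannot modify params.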
I have a set of values that get modified like so:
iskDropped = irrelevant1
iskDestroyed = irrelevant2
iskTotal = irrelevant3
iskDropped = condense_value(int(iskDropped[:-7].replace(',','')))
iskDestroyed = condense_value(int(iskDestroyed[:-7].replace(',','')))
iskTotal = condense_value(int(iskTotal[:-7].replace(',','')))
As you can see, all three values go through the same changes (shortened, commas removed, and condensed) before overwriting their original value.
I want to condense those three lines if possible because it feels inefficient.
I was trying something like this:
for value in [iskDropped, iskDestroyed, iskTotal]:
    value = condense_value(int(value[:-7].replace(',','')))
If you change the assignment into a print statement, this prints the correct values, but it does not overwrite the variables (iskDropped, iskDestroyed, and iskTotal) that I need later in the program.
Is it possible to condense these lines in Python? If so can someone point me in the right direction?
You can do it like this:
iskDropped, iskDestroyed, iskTotal = [condense_value(int(value[:-7].replace(',',''))) for value in [iskDropped, iskDestroyed, iskTotal]]
This works by looping through the list of your three variables, applying condense_value to each, collecting the results in a list, and finally unpacking that list back into the original names.
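The loop in the question fails because `value = ...` only rebinds the loop variable; it never touches the three original names. A sketch of the unpacking approach with the repeated expression pulled into a helper follows. Both `parse_isk` and this version of `condense_value` are hypothetical, as is the input format with its 7-character ".00 ISK" suffix.

```python
def condense_value(n):
    # Hypothetical stand-in; the question does not show condense_value.
    return n // 1000

def parse_isk(text):
    # Hypothetical helper: drop the 7-character suffix and the commas.
    return condense_value(int(text[:-7].replace(',', '')))

# Hypothetical raw strings in place of irrelevant1..irrelevant3.
iskDropped = "12,345,678.00 ISK"
iskDestroyed = "2,000,000.00 ISK"
iskTotal = "14,345,678.00 ISK"

# Unpacking assigns the results back to the original names.
iskDropped, iskDestroyed, iskTotal = map(
    parse_isk, (iskDropped, iskDestroyed, iskTotal)
)
```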
I have a dict containing 2 key-value pairs; one of the values is a list:
dict_special = {'money': 100,
                'sweets': ['bonbons', 'sherbet', 'toffee', 'pineapple cube']}
I would like to turn the first value into a list also, so I can append items
i.e.
dict_special = {'money': [100, 250, 400],
                'sweets': ['bonbons', 'sherbet', 'toffee', 'pineapple cube']}
This is what I have tried so far:
newlist = [dict_special['money']]
newlist.append(250)
dict_special['money'] = newlist
But I feel that there must be a more succinct and Pythonic way to get there.
Any suggestions?
A more concise way to write this:
newlist = [dict_special['money']]
newlist.append(250)
dict_special['money'] = newlist
… would be:
dict_special['money'] = [dict_special['money'], 250]
However, it's worth looking at why you're trying to do this. How did you create the dict in the first place? Maybe you should have been creating it with [100] in the first place, instead of 100. If not, maybe you should have another step for converting the input dictionary (with 100) into the one you want to use (with [100]) generically rather than doing it on the fly here. Maybe you even want to use a "multidict" class instead of using a dict directly.
Without knowing more about your code and your problem, it's hard to say, but trying to make these kinds of changes in an ad-hoc way is usually a sign that something is wrong somewhere else in the code.
How about:
newList = {key: value if isinstance(value, list) else [value]
           for key, value in dict_special.items()}
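Putting the two ideas together, a short sketch of normalizing every value to a list once and then appending freely could look like this. The name `normalized` is only illustrative.

```python
dict_special = {'money': 100,
                'sweets': ['bonbons', 'sherbet', 'toffee', 'pineapple cube']}

# Wrap every non-list value in a list; existing lists are kept as they are.
normalized = {key: value if isinstance(value, list) else [value]
              for key, value in dict_special.items()}

# Now every value supports append/extend.
normalized['money'].extend([250, 400])
print(normalized['money'])  # [100, 250, 400]
```

Doing the conversion in one place means the rest of the code can assume every value is a list.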
I'm using Python 3. I have two lists of strings and I'm looking for mismatches between the two. The code I have works for smaller lists but not the larger lists I'm writing it for.
Input from the non-working lists is in the following format:
mmec11.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org
mmec13.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org
mmec12.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org
mmec14.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org
My function to compare two lists of data in the above format is:
result = []
for x in mmeList1:
    if x not in mmeList2:
        result.append(x)
return result
The problem is it's not working. I get an output file of both lists combined into one long list. When I added a test to print "Hi" every time a match was made, nothing happened. Does anyone have any ideas where I'm going wrong? I work for a telecommunications company and we're trying to go through large database dumps to find missing MMEs.
I'm wondering if maybe my input function is broken? The function is:
for line in input:
    field = line.split()
    tempMME = field[0]
    result.append(tempMME)
I'm not very experienced with this stuff and I'm wondering if the line.split() function is messing up due to the periods in the MME names?
Thank you for any and all help!
If you don't need to preserve ordering, the following will result in all mmes that exist in list2 but not list1.
result = list(set(mmeList2) - set(mmeList1))
I tested your compare function and it's working fine, assuming that the data in mmeList1 and mmeList2 is correct.
For example, I ran a test of your compare function using the following data.
mmeList1:
mmec11.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org
mmec13.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org
mmec12.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org
mmec14.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org
mmeList2:
mmec11.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org
mmec13.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org
mmec12.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org
mmec15.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org
Result contained:
mmec14.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org
I suspect the problem is that mmeList1 and mmeList2 don't contain what you think they contain. Unfortunately, we can't help you more without seeing how mmeList1 and mmeList2 are populated.
If you want to see the differences in both, (i.e. Result should contain mmec14 AND mmec15), then what you want to use is Sets.
For example:
mmeSet1 = set(mmeList1)
mmeSet2 = set(mmeList2)
print(mmeSet1.symmetric_difference(mmeSet2))
will result in (the output is a set, so the order may vary):
{'mmec14.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org', 'mmec15.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org'}
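On the question of whether the periods upset `line.split()`: called without arguments, `split()` separates on whitespace only, so the dots are left alone. A quick sketch to check this, followed by the two-way comparison; the extra column in the sample line is made up, since the real dump format is not shown.

```python
# Hypothetical input line: an MME name followed by some other column.
line = "mmec11.mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org  other-column\n"
name = line.split()[0]  # whitespace split; the periods stay intact

suffix = ".mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org"
mmeList1 = ["mmec" + n + suffix for n in ("11", "13", "12", "14")]
mmeList2 = ["mmec" + n + suffix for n in ("11", "13", "12", "15")]

# Entries present in exactly one of the two lists, sorted for stable output.
both_ways = sorted(set(mmeList1) ^ set(mmeList2))
print(both_ways)
```

If the real lists come out combined rather than compared, printing `repr()` of a few entries from each list is a cheap way to spot stray characters or differing case.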
First, calling set() on each list removes duplicates, which reduces the number of iterations. Try this:
result = []
a = list(set(mmeList1))
b = list(set(mmeList2))
for x in a:
    if x not in b:
        result.append(x)
return result
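For large dumps, a possible refinement is to keep the second collection as a set, because `x not in some_list` scans the whole list each time while `x not in some_set` is a hash lookup. The sketch below also preserves the order of the first list; the function name `missing_from` is hypothetical.

```python
def missing_from(mmeList1, mmeList2):
    """Return entries of mmeList1 that do not appear in mmeList2, in order."""
    lookup = set(mmeList2)  # built once; membership tests are then fast
    return [x for x in mmeList1 if x not in lookup]

suffix = ".mmegifffa.mme.epc.mnc980.mcc310.3gppnetwork.org"
list1 = ["mmec" + n + suffix for n in ("11", "13", "12", "14")]
list2 = ["mmec" + n + suffix for n in ("11", "13", "12", "15")]

print(missing_from(list1, list2))
```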