Long list/array into a dictionary with indices as key - python

I am trying to solve a coding exercise.
Part of it is creating a dictionary from a random list of integers.
The dictionary must have as key the index of the element in the original list and as value the element of the list.
This is my function:
import numpy as np
def my_funct(pricesLst):
    price_dict = {}
    for i in range(0, len(pricesLst)):
        price_dict[i] = pricesLst[i]
    print(price_dict)
a = np.random.randint(1,100,5)
my_funct(a)
The output I get is the right one:
{0: 42, 1: 23, 2: 38, 3: 27, 4: 61}
However, if the list is longer, I get a weird result as output.
Example:
a = np.random.randint(1,1000000000,5000000)
my_funct(a)
The output is:
{2960342: 133712726, 2960343: 58347003, 2960344: 340350742, 949475: 944928187.........4999982: 417669027, 4999983: 650062265, 4999984: 656764316, 4999985: 32618345, 4999986: 213384749, 4999987: 383964739, 4999988: 229138815, 4999989: 203341047, 4999990: 54928779, 4999991: 139476448, 4999992: 244547714, 4999993: 790982769, 4999994: 298507070, 4999995: 715927973, 4999996: 365280953, 4999997: 543382916, 4999998: 532161768, 4999999: 598932697}
I am not sure why this occurs.
Why aren't the keys of my dictionary starting from 0, as they do for the shorter list?
The only thing I can think of is that the list is too long, so Python, instead of using the index starting from 0 as the key, uses the location in memory.

Because dicts in Python are not necessarily ordered. You should use an ordered dictionary, which is declared as:
from collections import OrderedDict
my_ordered_dict = OrderedDict()

Dictionaries preserve insertion order as of Python 3.7. If you are on an older Python version (<3.7), you will have to use an ordered dictionary.
You can use an ordered dictionary as follows:
from collections import OrderedDict
import numpy as np
def my_funct(pricesLst):
    price_dict = OrderedDict()
    for i in range(0, len(pricesLst)):
        price_dict[i] = pricesLst[i]
    print(price_dict)
a = np.random.randint(1,10000,10000)
my_funct(a)
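As a side note, the same index-to-value mapping can be built without an explicit loop. This is only a sketch, assuming Python 3.7+ so that the plain dict keeps insertion order; the helper name index_dict is made up for illustration, and the sample values are the ones from the question's short run. It also checks the first few keys directly instead of printing millions of entries:

```python
from itertools import islice

def index_dict(values):
    """Map each position in `values` to the element stored there."""
    # enumerate yields (index, element) pairs, which dict() accepts directly.
    return dict(enumerate(values))

prices = [42, 23, 38, 27, 61]  # sample values taken from the question
price_dict = index_dict(prices)
print(price_dict)

# Look at the first three keys only, without printing the whole dict.
first_keys = list(islice(price_dict, 3))
print(first_keys)
```

The same function should also accept a NumPy array, since enumerate works on any iterable.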

Related

How to start from second key when iterating over dictionary using for loop in Python

I am computing returns from data in a dictionary. My keys are dates and for every key I have a dataframe with data to compute my returns. To compute the returns I need data today and yesterday (t and t-1), hence I want to initiate from the second observation (key).
Since I do not have much experience my initial thought was to execute like this:
dict_returns = {}
for t, value in dict_data.items()[1:]:
    returns = 'formula'
    dict_returns[t] = returns
Which gave me the error:
TypeError: 'dict_items' object is not subscriptable
Searching for an answer, the only discussion I could find was skipping the first item, e.g. like this:
from itertools import islice
for key, value in islice(largeSet.items(), 1, None):
Is there a simple approach to skip the first key?
Thank you
In Python 3, dict.items() returns a dict_items view object rather than a list, and a view does not support indexing or slicing. That is why you need to convert it to a list first, e.g. list(dict_data.items())[1:].
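A minimal sketch of skipping the first key, assuming Python 3.7+ (so the dict keeps insertion order). The date keys and the plain floats standing in for the DataFrames are hypothetical, as is the simple return formula:

```python
from itertools import islice

# Hypothetical data: one price per date instead of a DataFrame per date.
dict_data = {"2020-01-01": 100.0, "2020-01-02": 110.0, "2020-01-03": 99.0}

# Option A: copy the items into a list, then slice off the first one.
for t, value in list(dict_data.items())[1:]:
    print(t, value)

# Option B: islice skips the first item without building a full copy.
later_keys = [t for t, _ in islice(dict_data.items(), 1, None)]

# Pair each key with the previous one to compute a t / t-1 return.
keys = list(dict_data)
dict_returns = {}
for prev, curr in zip(keys, keys[1:]):
    dict_returns[curr] = dict_data[curr] / dict_data[prev] - 1
```

The zip(keys, keys[1:]) pairing gives access to both today's and yesterday's key in each iteration, which may fit the returns calculation better than only skipping the first entry.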
I can think of 2 options, both require an extra step:
Option 1: Create a second dict without your key and loop over that
loop_dict = dict(dict_data) # copy, so the original dict is left untouched
loop_dict.pop(<key_to_remove>) # pop returns the removed value, not the dict
Then loop over loop_dict as you have done above.
Option 2: Create a list of keys from your dict and loop over that
keys = list(dict_data.keys()) # in Python 3, keys() is a view and cannot be sliced
loop_keys = keys[1:]
for key in loop_keys:
    Etc
If you pass a reference to your dictionary to list() you will get a list of the dictionary's keys. This is because dictionaries are iterable. In your code you're not interested in the key's value so:
dict_data = {'a': 1, 'b': 2} # or whatever
dict_data[list(dict_data)[1]] = 3
print(dict_data)
Output:
{'a': 1, 'b': 3}

Iterate through a dict, store a key's value, then iterate again to find a similar key and delete both from the dict (e.g. Light1on, Light1off) in Python

I have a problem iterating through a dict to find a pair of similar keys, output the pair, and then delete it from the dict.
My intention is to generate random output labels and store them in a dictionary, then iterate through the dictionary and store the first key in a list (or similar), then iterate through the dictionary again to search for a similar key (e.g. Light1on and Light1off both contain Light1) and store the values of both keys in a table, each in its respective column.
For example:
Dict = {Light1on, Light2on, Light1off, ...}
Store the value of Light1on, then iterate through the dictionary to find e.g. Light1off, then store Light1on:value1 and Light1off:value2 in a table or DataFrame with the columns On: value1 and Off: value2.
I don't know how to format the code as code, so I could only provide an image. Sorry for the trouble; it's my first time asking a question here. Thanks.
from collections import defaultdict
import difflib, random
olist = []
input = 10
olist1 = ['Light1on','Light2on','Fan1on','Kettle1on','Heater1on']
olist2 = ['Light2off','Kettle1off','Light1off','Fan1off','Heater1off']
events = list(range(input + 1))
for i in range(len(olist1)):
    output1 = random.choice(olist1)
    print(output1,'1')
    olist1.remove(output1)
    output2 = random.choice(olist2)
    print(output2,'2')
    olist2.remove(output2)
    olist.append(output1)
    olist.append(output2)
print(olist,'3')
outputList = {olist[i]:events[i] for i in range(10)}
print (str(outputList),'4')
# Iterating through the keys finding a pair match
for s in range(5):
    for i in outputList:
        if i == list(outputList)[0]:
            skeys = difflib.get_close_matches(i, outputList, n=2, cutoff=0.75)
            print(skeys,'5')
            del outputList[skeys]
# Modified Dictionary
difflib.get_close_matches('anlmal', ['car', 'animal', 'house', 'animaltion'])
['animal']
Update: I was unable to delete the pair of similar keys from the dictionary after finding the pair.
You're probably getting an error about a dictionary changing size during iteration. That's because you're deleting keys from a dictionary you're iterating over, and Python doesn't like that:
d = {1:2, 3:4}
for i in d:
    del d[i]
That will throw:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration
To work around that, one solution is to store a list of the keys you want to delete, then delete all those keys after you've finished iterating:
keys_to_delete = []
d = {1:2, 3:4}
for i in d:
    if i%2 == 1:
        keys_to_delete.append(i)
for i in keys_to_delete:
    del d[i]
Ta-da! Same effect, but this way avoids the error.
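A shorter variant of the same idea, as a sketch: iterate over a snapshot of the keys made with list(d), so the dict itself can shrink inside the loop. The sample dict here is made up for illustration.

```python
d = {1: 2, 2: 5, 3: 4}

# list(d) copies the keys up front, so deleting from d inside the loop is safe.
for key in list(d):
    if key % 2 == 1:
        del d[key]

print(d)

# Alternative: build a new dict that keeps only the wanted entries.
source = {1: 2, 2: 5, 3: 4}
kept = {k: v for k, v in source.items() if k % 2 == 0}
```

The comprehension leaves the original dict untouched, which can be handy if other code still holds a reference to it.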
Also, your code above doesn't call the difflib.get_close_matches function properly. You can run help(difflib.get_close_matches) to see how you are meant to call that function. You need to provide a second argument that lists the items against which your first argument is compared for possible matches.
All of that said, I have a feeling that you can accomplish your fundamental goals much more simply. If you spend a few minutes describing what you're really trying to do (this shouldn't involve any references to data types, it should just involve a description of your data and your goals), then I bet someone on this site can help you solve that problem much more simply!
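For instance, if every key really does end in "on" or "off", one possible simpler route is to strip that suffix and group by what remains, with no fuzzy matching at all. This is only a sketch under that naming assumption; the function name pair_on_off and the sample values are made up:

```python
# Hypothetical sample data: label -> event number.
readings = {
    "Light1on": 0, "Light2off": 1, "Fan1on": 2,
    "Light1off": 3, "Light2on": 4, "Fan1off": 5,
}

def pair_on_off(events):
    """Group values by device name, assuming each key ends in 'on' or 'off'."""
    table = {}
    for key, value in events.items():
        if key.endswith("off"):
            device, state = key[:-3], "off"
        elif key.endswith("on"):
            device, state = key[:-2], "on"
        else:
            continue  # skip keys that do not follow the naming pattern
        # setdefault creates the per-device row the first time it is seen.
        table.setdefault(device, {})[state] = value
    return table

table = pair_on_off(readings)
print(table)
```

Each entry of table then holds one "on" and one "off" value per device, which maps directly onto a two-column table. Nothing is deleted during iteration, so the RuntimeError does not come up.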

Finding all keys of a multi-key dictionary based on one key

I have a dictionary in which 3 keys are assigned to each value: dictTest[c,pH,T] = value. I would like to retrieve all values corresponding to a given, single key: dictTest[c,*,*] = value(s)
I looked online but could not find any solutions in Python, only C#. I've tried using dictTest[c,*,*] but get a syntax error. Another option I can see is using multi-level keys, i.e. have the first level as c, second as pH and so on, i.e. dictTest[c][pH][T] = value (from http://python.omics.wiki/data-structures/dictionary/multiple-keys)
Here is some test code:
dictTest={}
dictTest[1,100,10]=10
dictTest[1,101,11]=11
The following gives a syntax error:
print(dictTest[1,*,*])
Whilst trying to specify only one key gives a key error:
print(dictTest[1])
I've also tried the above mentioned multi-level keys, but it raises a syntax error when I try and define the dictionary:
dictTest[1][100][10]=10
In the above example, I would like to specify only the first key (i.e. key1=1) and return both values of the dictionary, as the first key of both entries is 1.
Thanks,
Mustafa.
dictTest={}
dictTest[1,100,10]=10
dictTest[1,101,11]=11
dictTest[2,102,11]=12
print([dictTest[i] for i in dictTest.keys() if i[0]==1])
print([dictTest[i] for i in dictTest if i[0]==1]) #more pythonic way
#creating a function instead of printing directly
def get_first(my_dict,val):
    return [my_dict[i] for i in my_dict if i[0]==val]
print(get_first(dictTest,1))
The key of your dictionary is a tuple of 3 values. It's not a "multi-key" dict that you can search efficiently based on one of the elements of the tuple.
You could perform a linear search based on the first key OR you could create another dictionary with the first key only, which would be much more efficient if the access is repeated.
Since the key repeats, you need a list as value. For instance, let the value be a tuple containing the rest of the key and the current value. Like this:
dictTest={}
dictTest[1,100,10]=10
dictTest[1,101,11]=11
dictTest[2,101,11]=30
import collections
newdict = collections.defaultdict(list)
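for (newkey,v2,v3),value in dictTest.items():
    newdict[newkey].append(((v2,v3),value))
Now newdict[1] is [((101, 11), 11), ((100, 10), 10)]: the list of all values matching this key, together with the rest of the original key so that no data is lost. (The entry order follows the dict's iteration order, so on Python 3.7+ the two entries appear in insertion order instead.)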
and the whole dict:
>>> dict(newdict)
{1: [((101, 11), 11), ((100, 10), 10)], 2: [((101, 11), 30)]}
To create a multi-level nested dictionary, you can use recursively created defaultdicts:
from collections import defaultdict
def recursive_defaultdict():
    return defaultdict(recursive_defaultdict)
dictTest = recursive_defaultdict()
dictTest[1][100][10] = 10
dictTest[1][101][11] = 11
print(dictTest[1][100])
Output:
defaultdict(<function recursive_defaultdict at 0x1061fe848>, {10: 10})
Another option to implement is:
from collections import defaultdict
dictTest = defaultdict(lambda: defaultdict(dict))
dictTest[1][100][10] = 10
dictTest[1][101][11] = 11
print(dict(dictTest[1]))
The output is:
{100: {10: 10}, 101: {11: 11}}

Appending to a Dictionary Value List in Loop

I have some code where I am using a list of names, and a file (eventually multiple files) of results (team, name, place). The end result I am looking for is to have each person's name (key) associated with a list of points (values). However, when I use the code below I end up with a result like
'Abe': [100, 80, 90], 'Bob': [100, 80, 90], 'Cam': [100, 80, 90] instead of
'Abe': [100], 'Bob': [80], 'Cam': [90]
f=open("NamesList.txt","r")
lines=f.read().splitlines() #get names
Scores=dict.fromkeys(lines,[]) #make a dictionary with names as keys, but no values yet
f1=open("ResultsTest.txt","r") #open results file: column1-team, column 2- name, column 3-place
lines=f1.read().splitlines()
A={1:100,2:90,3:80} #points assignment, 100 for 1, 90 for 2, 80 for 3
for l in lines:
    a=l.split('\t') #a[0] is team a[1] is name a[2] is place
    score=A.get(int(a[2])) #look up points value corresponding to placing
    Scores[a[1]].append(score)
I can get the result I need by adding in
Scores[a[1]]=[]
before the second last line, but I believe this prevents me from eventually being able to append multiple scores to each key (since I'm re-initializing inside the loop). Any insight into my error would be appreciated.
By using Scores=dict.fromkeys(lines,[]) you're initializing every key of the dict with a reference to the same list, so changes made to the list are reflected across all keys. You can use a dict comprehension for initialization instead:
Scores = {line: [] for line in lines}
Alternatively, you can initialize Scores as a normal dict {} and use the dict.setdefault method to initialize its keys with lists:
Scores.setdefault(a[1], []).append(score)
The problem you encounter comes from the way you create Scores:
Scores=dict.fromkeys(lines,[])
When using dict.fromkeys, the same value is used for all keys of the dict. Here, the same, one and only empty list is the common value for all your keys. So, whichever key you access it through, you always update the same list.
When doing Scores[a[1]]=[], you actually create a new, different empty list, that becomes the value for the key a[1] only, so the problem disappears.
You could create the dict differently, using a dict comprehension:
Scores = {key: [] for key in lines} # here, a new empty list is created for each key
or use a collections.defaultdict
from collections import defaultdict
Scores = defaultdict(list)
which will automatically initialize Scores['your_key'] to an empty list when needed.
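The difference can be checked directly with a small sketch (the three names are taken from the question's expected output; the variable names shared and separate are made up):

```python
names = ["Abe", "Bob", "Cam"]

# dict.fromkeys stores the very same list object under every key.
shared = dict.fromkeys(names, [])
shared["Abe"].append(100)
print(shared)  # every key now shows the appended score

# A dict comprehension evaluates [] once per key, giving independent lists.
separate = {name: [] for name in names}
separate["Abe"].append(100)
print(separate)  # only Abe's list has changed

# Identity check: one shared object versus distinct objects.
same_object = shared["Abe"] is shared["Bob"]
distinct_objects = separate["Abe"] is not separate["Bob"]
```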
IIUC, your dictionary is mapping place to score. You can leverage a defaultdict to replace your fromkeys method:
from collections import defaultdict
# Initialize an empty dictionary as you've done with default list entries
scores = defaultdict(list)
# Using the with context manager allows for safe file handling
with open("ResultsTest.txt", 'r') as f1:
    lines = f1.read().splitlines()
# Points lookup as you've done before
lookup = {1: 100, 2: 90, 3: 80}
for l in lines:
    team, name, place = l.split('\t') # unpacking makes this way more readable
    score = lookup.get(int(place))
    scores[name].append(score) # keyed by name, as in the question

Length of list in dictionary (with defaultdict to create dictionary) python

This is how I generated my dictionary from the lists shown below:
Genes = ['A2M', 'A2M', 'ACADS', 'ACADVL']
Isoforms = ['NM_000014', 'NM_000016', 'NM_000017', 'NM_000018']
ExonPos = ['9220303,9220778,9221335,9222340,9223083,9224954,9225248,9227155,9229351,9229941,9230296,9231839,9232234,9232689,9241795,9242497,9242951,9243796,9246060,9247568,9248134,9251202,9251976,9253739,9254042,9256834,9258831,9259086,9260119,9261916,9262462,9262909,9264754,9264972,9265955,9268359,', '76190031,76194085,76198328,76198537,76199212,76200475,76205664,76211490,76215103,76216135,76226806,76228376,', '121163570,121164828,121174788,121175158,121175639,121176082,121176335,121176622,121176942,121177098,', '7123149,7123440,7123782,7123922,7124084,7124242,7124856,7125270,7125495,7125985,7126451,7126962,7127131,7127286,7127464,7127639,7127798,7127960,7128127,7128275,']
#Length = len(ExonPos)
from collections import defaultdict
d = defaultdict(lambda: defaultdict(list))
for k, iso, exon in zip(Genes, Isoforms, ExonPos):
    d[k][iso] = exon.split(",")
    length = len(d[exon])
    print length
print(d)
This allowed me to make my dictionary with repeated keys. However, now I'm trying to find the length of the individual lists in my dictionary as shown with length = len(d[exon]), however, my output keeps giving me zeros. Is there something special about using defaultdict that I'm not aware of? Maybe it's my version of python (which is 2.7.6)? I've tried multiple different ways, but I feel like the len() function should work.
That's because you are printing the length of d[exon], and your defaultdict has no key named exon. Instead you need:
len(d[k][iso])
Then the result will be :
37
13
11
21
You need to access by the keys:
length = len(d[k][iso])
exon.split(",") is the value; k and iso are the two keys needed to access it.
With a normal dict you would get a KeyError, but because you are using a defaultdict, looking up d[exon] silently creates a new entry whose value is an empty inner defaultdict, so len() returns 0.
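A small sketch of that side effect, using a shortened, made-up position string in place of the full ExonPos entries. It also shows that the trailing comma in each string adds an empty item to the split result, which is worth keeping in mind when reading the lengths:

```python
from collections import defaultdict

d = defaultdict(lambda: defaultdict(list))
positions = "9220303,9220778,"  # shortened stand-in for one ExonPos string
d["A2M"]["NM_000014"] = positions.split(",")

# The trailing comma produces a final empty string, so this is 3, not 2.
with_empty = len(d["A2M"]["NM_000014"])

# Stripping the trailing comma first gives the number of real positions.
without_empty = len(positions.rstrip(",").split(","))

# Looking up a missing key does not raise; it inserts an empty inner dict.
missing_len = len(d["no_such_key"])
was_created = "no_such_key" in d

# .get() reads without inserting anything.
untouched = d.get("another_key")
still_absent = "another_key" not in d
```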
