Mapping list elements into their positions (Python) - python

I have a list of URLs. (We can assume that a given URL is met in the list no more than once.)
I need a fast way to determine which of two URLs comes first in the list.
I think I should create a dict that maps each URL to its position in the list.
Is there an easy way (without writing a for loop that increments a counter by hand) to map the elements of a list to their positions?
The best I have come up with is:
order = {}
i = 0
for item in list:
    order[item] = i
    i += 1
Now to check if url1 is before url2, I check order[url1] < order[url2].
Can this code be shortened?

This creates your order
order = {k: v for v, k in enumerate(list)}
Example:
L = list('abc')
Your version:
order1 = {}
i = 0
for item in L:
    order1[item] = i
    i += 1
print(order1)
My version:
order2 = {k: v for v, k in enumerate(L)}
print(order2)
Output:
{'a': 0, 'b': 1, 'c': 2}
{'a': 0, 'b': 1, 'c': 2}
It is better not to use list as a variable name, because it shadows the built-in.
enumerate returns an iterator that yields the index and the value for each element of your list.
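A minimal sketch of how the position map might be wrapped for the original use case. The helper names build_order and is_before and the sample URLs are made up for illustration; the mapping itself is the same dict comprehension as in the answer.

```python
def build_order(urls):
    """Map each URL to its index in the list (assumes URLs are unique)."""
    return {url: pos for pos, url in enumerate(urls)}


def is_before(order, url1, url2):
    """Return True if url1 appears earlier in the list than url2."""
    return order[url1] < order[url2]


# Hypothetical sample data
urls = ["http://example.com/a", "http://example.com/b", "http://example.com/c"]
order = build_order(urls)
print(is_before(order, urls[0], urls[2]))  # True
print(is_before(order, urls[2], urls[1]))  # False
```

Building the dict takes one pass over the list; each comparison afterwards is two hash lookups. A URL that is not in the list raises KeyError.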

If you want to know which comes first for a specific pair of items, you can use the index method on the list:
a = ['cat', 'dog', 'fish']
a.index('cat') < a.index('dog') # True
a.index('fish') < a.index('dog') # False

List of URLs:
urls = ['A', 'B', 'C', 'D']
List of indices:
index = range(len(urls))
Create the dict:
order = dict(zip(urls, index))
Test:
print(order['A'] < order['B']) # True

Related

Count how many times are items from list 1 in list 2

I have 2 lists:
1. ['a', 'b', 'c']
2. ['a', 'd', 'a', 'b']
And I want dictionary output like this:
{'a': 2, 'b': 1, 'c': 0}
I already made it:
# b = list #1
# words = list #2
c = {}
for i in b:
    c.update({i: words.count(i)})
But it is very slow, and I need to process a text file of about 10 MB.
EDIT: Entire code, currently testing so unused imports..
import string
import os
import operator
import time
from collections import Counter
def getbookwords():
    a = open("wu.txt", encoding="utf-8")
    b = a.read().replace("\n", "").lower()
    a.close()
    # strip punctuation; str.translate returns a new string
    b = b.translate(str.maketrans("", "", string.punctuation))
    b = b.split(" ")
    return b
def wordlist(words):
    a = open("wordlist.txt")
    b = a.read().lower()
    b = b.split("\n")
    a.close()
    t = time.time()
    #c = dict((i, words.count(i)) for i in b )
    c = Counter(words)
    result = {k: v for k, v in c.items() if k in set(b)}
    print(time.time() - t)
    sorted_d = sorted(c.items(), key=operator.itemgetter(1))
    return(sorted_d)
print(wordlist(getbookwords()))
Since speed is currently an issue, it might be worth considering not passing through the list for each thing you want to count. The set() function allows you to only use the unique keys in your list words.
An important thing to remember for speed in all cases is the line unique_words = set(b). Without this, an entire pass through your list is being done to create a set from b at every iteration in whichever kind of data structure you happen to use.
c = {k: 0 for k in set(words)}
for w in words:
    c[w] += 1
unique_words = set(b)
c = {k: c[k] for k in c if k in unique_words}
Alternatively, defaultdicts can be used to eliminate some of the initialization.
from collections import defaultdict
c = defaultdict(int)
for w in words:
    c[w] += 1
unique_words = set(b)
c = {k: c[k] for k in c if k in unique_words}
For completeness' sake, I do like the Counter-based solutions in the other answers (like the one from Reut Sharabani). The code is cleaner, and though I haven't benchmarked it, I wouldn't be surprised if a built-in counting class is faster than home-rolled solutions with dictionaries.
from collections import Counter
c = Counter(words)
unique_words = set(b)
c = {k:v for k, v in c.items() if k in unique_words}
Try using collections.Counter and move b to a set, not a list:
from collections import Counter
c = Counter(words)
b = set(b)
result = {k: v for k, v in c.items() if k in b}
Also, if you can read the words lazily and not create an intermediate list that should be faster.
Counter provides the functionality you want (counting items), and filtering the result against a set uses hashing which should be a lot faster.
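A rough sketch of the "read the words lazily" idea, feeding a generator straight into Counter so that no intermediate word list is built. The helper names iter_words and count_wanted are hypothetical, io.StringIO stands in for an open file, and splitting on \w+ is an assumption that differs from the split(" ") used in the question.

```python
import io
import re
from collections import Counter


def iter_words(lines):
    """Yield lowercase words one at a time instead of building a big list."""
    for line in lines:
        for word in re.findall(r"\w+", line.lower()):
            yield word


def count_wanted(lines, wanted):
    """Count only the words that appear in the `wanted` collection."""
    wanted = set(wanted)  # build the set once, outside the loop
    counts = Counter(w for w in iter_words(lines) if w in wanted)
    # Report explicit zeros for wanted words that never occur
    return {w: counts[w] for w in wanted}


# io.StringIO stands in for an open text file here
book = io.StringIO("A dog and a cat.\nThe dog barked.\n")
result = count_wanted(book, ["a", "dog", "fish"])
print(result == {"a": 2, "dog": 2, "fish": 0})  # True
```

Iterating over a real file object line by line works the same way, so only one line of the book is held in memory at a time.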
You can use collections.Counter on a generator that skips ignored keys using a set lookup.
from collections import Counter
keys = ['a', 'b', 'c']
lst = ['a', 'd', 'a', 'b']
unique_keys = set(keys)
count = Counter(x for x in lst if x in unique_keys)
print(count) # Counter({'a': 2, 'b': 1})
# count['c'] == 0
Note that count['c'] is not printed, but is still 0 by default in a Counter.
Here's an example I just put together in a REPL, assuming you're not counting duplicates in list two. We create a hash table using a dictionary. For each item in the list we're matching against, we create a key-value pair with the item as the key and the value set to 0.
Next we iterate through the second list. For each value, we check whether it has already been defined as a key; if it has, we increment its count. Otherwise, we ignore it.
This is the least number of iterations possible: you hit each item in each list only once.
x = [1, 2, 3, 4, 5]
z = [1, 2, 2, 2, 1]
y = {}
for n in x:
    y[n] = 0  # set the value to zero for each item in the list
for n in z:
    if n in y:  # if the key is already in the hash, increment by one
        y[n] += 1
print(y)
@Makalone, the answers above are good. You can also try the code sample below, which uses Python's Counter() from the collections module.
You can try it at http://rextester.com/OTYG56015.
Python code »
from collections import Counter
list1 = ['a', 'b', 'c']
list2 = ['a', 'd', 'a', 'b']
counter = Counter(list2)
d = {key: counter[key] for key in set(list1)}
print(d)
Output »
{'a': 2, 'c': 0, 'b': 1}

Find dict matching another dict

I have an unbounded list of elements (they are rows in a document DB) containing dicts like:
list = [
    {'a': ''},
    {'a': '', 'b': ''},
    {'a': '', 'b': '', 'c': ''},
]
And I have an input element, with any number of keys, like:
{'a': '', 'c': ''}
I need a function to find the index of the element that matches the most dict keys in the input.
In this case it would be list[2], because it contains both {'a': ''} and {'c': ''}.
Could you help me or point me to how to do it?
You can use the builtin max function and provide a matching key:
# The input to search over
l = [{'a':''}, {'a':'', 'b':''}, {'a':'','b':'','c':''}]
# Extract the keys we'd like to search for
t = set({'a': '', 'c': ''}.keys())
# Find the item in l that shares maximum number of keys with the requested item
match = max(l, key=lambda item: len(t & set(item.keys())))
To extract the index in one pass:
max_index = max(enumerate(l), key=lambda item: len(t & set(item[1].keys())))[0]
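A sketch of the same idea wrapped in a function, mainly to show what happens on ties. The name best_match_index is made up for illustration.

```python
def best_match_index(rows, query):
    """Return the index of the dict in rows that shares the most keys with query."""
    wanted = set(query)
    # max() returns the first maximal element, so ties go to the earliest row
    index, _ = max(
        enumerate(rows),
        key=lambda pair: len(wanted.intersection(pair[1])),
    )
    return index


rows = [{'a': ''}, {'a': '', 'b': ''}, {'a': '', 'b': '', 'c': ''}]
print(best_match_index(rows, {'a': '', 'c': ''}))  # 2
print(best_match_index(rows, {'a': ''}))           # 0 (all rows tie; the first wins)
```

Note that max() raises ValueError on an empty list, so an empty rows argument would need separate handling.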
>>> lst = [{'a':'a'},{'a':'a','b':'b'},{'a':'a','b':'b','c':'c'}]
>>> seen = {}
>>> [seen.update({key:value}) for dct in lst for key,value in dict(dct).items() if key not in seen.keys()]
>>> seen
Output
{'a': 'a', 'c': 'c', 'b': 'b'}

Python. Adding multiple items to keys in a dict

I am trying to build a dict, using a set of unique values as the keys and a zipped list of tuples to provide the items.
set = ("a","b","c")
lst1 = ("a","a","b","b","c","d","d")
lst2 = (1,2,3,3,4,5,6)
zip = [("a",1),("a",2),("b",3),("b",3),("c",4),("d",5),("d",6)]
# what I want
dct = {"a": [1,2], "b": [3,3], "c": [4], "d": [5,6]}
But I am getting:
dct = {"a":1,"b":3,"c":4,"d":5}
here is my code so far:
# make two lists
rtList = ["EVT","EVT","EVT","EVT","EVT","EVT","EVT","HIL"]
raList = ["C64G","C64R","C64O","C32G","C96G","C96R","C96O","RA96O"]
# make a set of unique codes in the first list
routes = set()
for r in rtList:
    routes.add(r)
# zip the lists
RtRaList = zip(rtList,raList)
# print RtRaList
# make a dictionary with list one as the keys and list two as the values
SrvCodeDct = {}
for key, item in RtRaList:
    for r in routes:
        if r == key:
            SrvCodeDct[r] = item
for key, item in SrvCodeDct.items():
    print(key, item)
You don't need any of that. Just use a collections.defaultdict.
import collections
rtList = ["EVT","EVT","EVT","EVT","EVT","EVT","EVT","HIL"]
raList = ["C64G","C64R","C64O","C32G","C96G","C96R","C96O","RA96O"]
d = collections.defaultdict(list)
for k, v in zip(rtList, raList):
    d[k].append(v)
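For reference, here is a sketch of what this grouping produces on the two lists from the question. The wrapper name group_pairs is hypothetical, and converting back to a plain dict at the end is an optional design choice.

```python
from collections import defaultdict


def group_pairs(keys, values):
    """Collect every value under its key, keeping first-seen key order."""
    grouped = defaultdict(list)
    for k, v in zip(keys, values):
        grouped[k].append(v)  # a missing key starts out as an empty list
    return dict(grouped)      # plain dict, so later lookups of missing keys raise KeyError


rtList = ["EVT", "EVT", "EVT", "EVT", "EVT", "EVT", "EVT", "HIL"]
raList = ["C64G", "C64R", "C64O", "C32G", "C96G", "C96R", "C96O", "RA96O"]

for route, codes in group_pairs(rtList, raList).items():
    print(route, codes)
# EVT ['C64G', 'C64R', 'C64O', 'C32G', 'C96G', 'C96R', 'C96O']
# HIL ['RA96O']
```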
You can achieve this using the dict.setdefault method:
my_dict = {}
for i, j in zip(l1, l2):
    my_dict.setdefault(i, []).append(j)
which gives my_dict the value:
>>> my_dict
{'a': [1, 2], 'c': [4], 'b': [3, 3], 'd': [5, 6]}
OR, use collections.defaultdict as mentioned by TigerhawkT3.
Issue with your code: you are not checking for an existing key. Every time you do SrvCodeDct[r] = item, you overwrite the previous value stored under key r with item. To fix this, add an if condition:
l1 = ("a","a","b","b","c","d","d")
l2 = (1,2,3,3,4,5,6,)
my_dict = {}
for i, j in zip(l1, l2):
    if i in my_dict:  # your `if` check
        my_dict[i].append(j)  # append value to existing list
    else:
        my_dict[i] = [j]
>>> my_dict
{'a': [1, 2], 'c': [4], 'b': [3, 3], 'd': [5, 6]}
However this code can be simplified using collections.defaultdict (as mentioned by TigerhawkT3), OR using dict.setdefault method as:
my_dict = {}
for i, j in zip(l1, l2):
    my_dict.setdefault(i, []).append(j)
In dicts, all keys are unique, and each key can only have one value.
The easiest way to solve this is to make each value of the dictionary a list, so as to emulate what is called a multimap. The list holds all the elements that the key maps to.
EDIT:
You might want to check out this PyPI package: https://pypi.python.org/pypi/multidict
Under the hood, however, it probably works as described above.
As far as I know, there is nothing built-in that supports what you are after.

How to add dictionary keys with defined values to a list

I'm trying to add only the keys with a value >= n to my list, but I can't give the key as an argument.
n = 2
dict = {'a': 1, 'b': 2, 'c': 3}
for i in dict:
    if dict[i] >= n:
        list(dict.keys([i]))
When I try this, it tells me I can't give .keys() an argument. But if I remove the argument, all keys are added, regardless of value.
Any help?
You don't need to call the dict's .keys() method, because the for loop already iterates over data_dict's keys.
n = 2
data_dict = {'a': 1, 'b': 2, 'c': 3}
lst = []
for i in data_dict:
    if data_dict[i] >= n:
        lst.append(i)
print(lst)
Results:
['b', 'c']
You can also achieve this with a list comprehension:
result = [k for k, v in data_dict.items() if v >= n]
print(result)
You should read this: Iterating over Dictionaries.
Try using filter:
filtered_keys = filter(lambda x: d[x] >= n, d.keys())
Or using list comprehension:
filtered_keys = [x for x in d.keys() if d[x] >= n]
The error in your code is that dict.keys returns all keys, as the docs mention:
Return a copy of the dictionary’s list of keys.
What you want is one key at a time, which list comprehension gives you. Also, when filtering, which is basically what you do, consider using the appropriate method (filter).
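A short sketch of both suggestions under Python 3, where filter() returns a lazy iterator rather than a list. The names n and d follow the snippets above; same_keys is made up for the comparison.

```python
n = 2
d = {'a': 1, 'b': 2, 'c': 3}

# In Python 3, filter() returns a lazy iterator, so wrap it in list()
filtered_keys = list(filter(lambda k: d[k] >= n, d))

# Equivalent comprehension over (key, value) pairs
same_keys = [k for k, v in d.items() if v >= n]

print(filtered_keys)  # ['b', 'c']
print(same_keys)      # ['b', 'c']
```

Iterating over d directly yields its keys, so calling d.keys() is optional in both forms.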

Appending values to dictionary in Python

I have a dictionary of drugs, and I want to append a list of numbers to each drug. Like this:
append(0), append(1234), append(123), etc.
def make_drug_dictionary(data):
    drug_dictionary={'MORPHINE':[],
                     'OXYCODONE':[],
                     'OXYMORPHONE':[],
                     'METHADONE':[],
                     'BUPRENORPHINE':[],
                     'HYDROMORPHONE':[],
                     'CODEINE':[],
                     'HYDROCODONE':[]}
    prev = None
    for row in data:
        if prev is None or prev==row[11]:
            drug_dictionary.append[row[11][]
    return drug_dictionary
I later want to be able to access the entire set of entries in, for example, 'MORPHINE'.
How do I append a number into the drug_dictionary?
How do I later traverse through each entry?
Just use append:
list1 = [1, 2, 3, 4, 5]
list2 = [123, 234, 456]
d = {'a': [], 'b': []}
d['a'].append(list1)
d['a'].append(list2)
print(d['a'])
You should use append to add to the list. Here are also a few code tips:
I would use dict.setdefault or defaultdict to avoid having to specify the empty list in the dictionary definition.
If you use prev to filter out duplicated values, you can simplify the code using groupby from itertools.
Your code with the amendments looks as follows:
import itertools
def make_drug_dictionary(data):
    drug_dictionary = {}
    for key, row in itertools.groupby(data, lambda x: x[11]):
        drug_dictionary.setdefault(key,[]).append(row[?])
    return drug_dictionary
If you don't know how groupby works just check this example:
>>> list(key for key, val in itertools.groupby('aaabbccddeefaa'))
['a', 'b', 'c', 'd', 'e', 'f', 'a']
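To make the grouping behaviour concrete, here is a small sketch with made-up rows. The two-column layout (drug name, amount) is an assumption for illustration only; the question keeps the name in row[11] and does not say which column holds the number.

```python
import itertools
from operator import itemgetter

# Hypothetical rows: (drug name, amount)
data = [
    ("MORPHINE", 10),
    ("MORPHINE", 5),
    ("CODEINE", 7),
    ("MORPHINE", 42),
]

drug_dictionary = {}
for name, rows in itertools.groupby(data, key=itemgetter(0)):
    # rows is an iterator over the consecutive rows that share this name
    drug_dictionary.setdefault(name, []).extend(amount for _, amount in rows)

print(drug_dictionary)
# {'MORPHINE': [10, 5, 42], 'CODEINE': [7]}
```

groupby only merges consecutive rows with the same key, which is why MORPHINE produces two groups here; setdefault then collects both groups under one dictionary key.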
It sounds as if you are trying to set up a list of lists as each value in the dictionary. Your initial value for each drug in the dict is []. So, assuming that you have a list1 that you want to append to the list for 'MORPHINE', you should do:
drug_dictionary['MORPHINE'].append(list1)
You can then access the various lists as drug_dictionary['MORPHINE'][0] etc.
To traverse the lists stored against a key you would do:
for listx in drug_dictionary['MORPHINE']:
    print(listx)  # do stuff on listx
To append entries to the table:
for row in data:
    name = ???  # figure out the name of the drug
    number = ???  # figure out the number you want to append
    drug_dictionary[name].append(number)
To loop through the data:
for name, numbers in drug_dictionary.items():
    print(name, numbers)
If you want to append to the lists of each key inside a dictionary, you can append new values to them using + operator (tested in Python 3.7):
mydict = {'a':[], 'b':[]}
print(mydict)
mydict['a'] += [1,3]
mydict['b'] += [4,6]
print(mydict)
mydict['a'] += [2,8]
print(mydict)
and the output:
{'a': [], 'b': []}
{'a': [1, 3], 'b': [4, 6]}
{'a': [1, 3, 2, 8], 'b': [4, 6]}
mydict['a'].extend([1,3]) does the same job as += here. Both extend the existing list in place, whereas mydict['a'] = mydict['a'] + [1,3] would build a new list.
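A small sketch to check which of these forms modify the list in place, using object identity. The variable original is introduced only for the comparison.

```python
mydict = {'a': [1, 3]}
original = mydict['a']

mydict['a'] += [2, 8]            # in place: the same list object is extended
print(mydict['a'] is original)   # True

mydict['a'].extend([5])          # also in place
print(mydict['a'] is original)   # True

mydict['a'] = mydict['a'] + [9]  # builds a brand-new list and rebinds the key
print(mydict['a'] is original)   # False
print(mydict['a'])               # [1, 3, 2, 8, 5, 9]
print(original)                  # [1, 3, 2, 8, 5]
```

The difference matters mostly when another variable still refers to the old list, as original does here.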
You can use the update() method as well, which adds new key-value pairs to the dict:
d = {"a": 2}
d.update({"b": 4})
print(d) # {"a": 2, "b": 4}
How do I append a number into the drug_dictionary?
Do you wish to add "a number" or a set of values?
I use dictionaries to build associative arrays and lookup tables quite a bit.
Since Python is so good at handling strings, I often use a string and add the values into a dict as a comma-separated string:
drug_dictionary = {'MORPHINE': '',
                   'OXYCODONE': '',
                   'OXYMORPHONE': '',
                   'METHADONE': '',
                   'BUPRENORPHINE': '',
                   'HYDROMORPHONE': '',
                   'CODEINE': '',
                   'HYDROCODONE': ''}
drug_to_update = 'MORPHINE'
# fall back to an empty string if the drug is not in the dict yet
oldval = drug_dictionary.get(drug_to_update, '')
# to increment a value
try:
    newval = int(oldval) + 1
except ValueError:
    newval = 1
drug_dictionary[drug_to_update] = "%s" % newval
# to append a value instead (each entry is followed by a comma)
try:
    newval = int(oldval) + 1
except ValueError:
    newval = 1
drug_dictionary[drug_to_update] = "%s%s," % (oldval, newval)
The append approach lets you store a list of values, but it leaves you with a trailing comma, which you can remove with
drug_dictionary[drug_to_update][:-1]
Because the values are appended as a string, you can append lists of values as needed, and
print("'%s':'%s'" % (drug_to_update, drug_dictionary[drug_to_update]))
can return
'MORPHINE':'10,5,7,42,12,'
vowels = ("a", "e", "i", "o", "u")  # tuple of vowels
my_str = "this is my dog and a cat"  # sample string to get the vowel count
count = dict.fromkeys(vowels, 0)  # dict with the count of each vowel initialized to 0
for char in my_str:
    if char in count:
        count[char] += 1
print(count)
