How to convert a nested list into a dictionary? - python

I would like to take a nested list such as:
list = [['Name', 'Height', 'Weight'], ['Dan', 175, 75], ['Mark', 165, 64], ['Sam', 183, 83]]
and convert it into a dictionary like:
dict = {'Name': ['Dan', 'Mark', 'Sam'], 'Height': [175, 165, 183], 'Weight': [75, 64, 83]}
my current code is unfortunately not really giving me the dictionary format I'm looking for.
i = 1
z = 0
for items in list[0]:
    dict[items] = [list[i][z]]
    i += 1
    z += 1
can someone please assist me and find where I'm going wrong?

Separate the keys and the rest first, then construct the dictionary with zip:
keys, *rest = list_of_lists
out = dict(zip(keys, zip(*rest)))
where list_of_lists is what you called list (please avoid that name, since it shadows the built-in list). The first * collects every list after the first one into rest. The second *, inside zip, passes those lists to zip as separate arguments, which effectively transposes them so that the values from the same column end up together
to get
>>> out
{"Name": ("Dan", "Mark", "Sam"),
"Height": (175, 165, 183),
"Weight": (75, 64, 83)}
This gives tuples as the values; to get lists instead, you can map list over them:
out = dict(zip(keys, map(list, zip(*rest))))
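To make the two stars less mysterious, here is a small sketch that prints the intermediate values. The names table and columns are only illustrative and do not come from the question.

```python
# Illustrative data: header row first, then one row per person.
table = [
    ["Name", "Height", "Weight"],
    ["Dan", 175, 75],
    ["Mark", 165, 64],
    ["Sam", 183, 83],
]

# Starred assignment: the first row goes to keys, all remaining rows to rest.
keys, *rest = table

# zip(*rest) is equivalent to zip(rest[0], rest[1], rest[2]):
# it groups the items found at the same position, turning rows into columns.
columns = list(zip(*rest))
print(columns)

# Pair each key with its column and convert the tuples to lists.
out = dict(zip(keys, map(list, columns)))
print(out)
```

Note that zip stops at the shortest row, so this assumes every record has as many fields as the header.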

Welcome to Stack Overflow :)
In Python we seldom keep a manual counter (i++ or i += 1) for looping when we can simply write for i in ..., even when we don't know how many lists there are.
Your original data is a list of lists: the first list holds the dictionary keys, and every other list is one record.
When the inner lists all have the same length, we often use zip(*your_list). The zip function rearranges your_list; the * in front of your_list passes each inner list to zip as a separate argument, so each tuple that zip yields starts with a key, followed by that key's values.
Then put it in a for loop, like for rec in zip(*your_list):.
So you can write your code like:
your_dict = {}
for rec in zip(*your_list):
    k = rec[0]  # dict key
    v = list(rec[1:])  # dict value, converted to a list if needed
    your_dict[k] = v  # set key and value
That's it!

@Mustafa's answer is the most concise, but if you are a beginner you might find it easier to break it down into steps.
data =[
['Name', 'Height', 'Weight'],
['Dan', 175, 75], ['Mark', 165, 64], ['Sam', 183, 83]
]
keys = data[0]
values = [list(items) for items in zip(*data[1:])]
results = dict(zip(keys, values))

Related

Efficiently convert a long json array to the corresponding list

I could have a very long JSON object of variable length, like the one below:
{
'1': 230,
'2':240,
'3':100,
'4':20,
...
'670000':100
}
I need to convert the above JSON object to a simple uint8 array, keeping the order of the elements without saving the indexes:
[230,240,100,20,...,100]
Well, I came up with the solution below, using generators:
def f(js):
    for x in js:
        yield js[x]

[x for x in f(js)]
But I wonder whether there is a more efficient solution?
Simply load it into a Python dictionary and collect its values into a list, like the following:
data = {
'1': 230,
'2':340,
'3':100,
'4':20,
'670000':100
}
print([i for i in data.values()])
output
[230, 340, 100, 20, 100]
You can create a generator that steps through the items of a dictionary.
The dict.values() method will return a view object that reflects the current object.
def values_of(obj):
    for value in obj.values():
        yield value
data = {
'1': 230,
'2': 340,
'3': 100,
'4': 20,
'670000': 100
}
print(list(values_of(data)))
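As a rough sketch of the whole path from JSON text to a byte array: the string raw below is a made-up stand-in for the real input, and sorting the keys numerically is an extra precaution, not something the question asked for. The built-in bytes type is used here as the "uint8 array", on the assumption that every value fits in 0..255.

```python
import json

# Hypothetical input: a JSON object whose keys are "1", "2", ... as strings.
raw = '{"1": 230, "2": 240, "3": 100, "4": 20}'

# json.loads builds a dict in document order
# (dicts keep insertion order in Python 3.7+).
js = json.loads(raw)

# If the keys might arrive out of order, sort them numerically
# instead of relying on document order.
values = [js[key] for key in sorted(js, key=int)]

# bytes() accepts only integers in range(256), so it doubles as a uint8 check:
# a value such as 340 would raise ValueError here.
packed = bytes(values)

print(values)
print(len(packed))
```

If the keys are already known to be in order, list(js.values()) is enough and skips the sort.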

Simple Python question with dictionaries and lists

The goal of the function is to make a grade adjustment based off of a dictionary and list. For instance
def adjust_grades(roster, grade_adjustment)
adjust_grades({'ann': 75, 'bob': 80}, [5, -5])
will return
{'ann': 80, 'bob': 75}
I just need a nudge in the right direction. I'm new to Python, so I thought of using a nested for loop over each num in grade_adjustment, but it's not the right way.
Assuming Python 3.7+ (where dicts preserve insertion order) and that the number of adjustments matches the number of items in the dictionary, you can zip them together as follows:
for name, adjustment_amount in zip(roster, grade_adjustment):
    roster[name] += adjustment_amount
>>> roster
{'ann': 80, 'bob': 75}
This is making several assumptions:
the dictionary and the list have the same length (your final code should make sure they do)
you are using a version of python in which the order of the dictionary keys is preserved (if not, you can make grade_adjustment a dictionary as well, as mentioned by other comments)
result = roster.copy()
for index, key in enumerate(roster):
    result[key] += grade_adjustment[index]
You can use
def adjust_grades(roster, grade_adjustment):
    for k, v in enumerate(grade_adjustment):
        roster[list(roster.keys())[k]] = roster[list(roster.keys())[k]] + v
    return roster
This gives output as you said {'ann': 80, 'bob': 75}
assuming 3.7 or ordered dict and equal length:
def adjust_grades(roster, grade_adjustment):
    return {key: value + adjustment for (key, value), adjustment in
            zip(roster.items(), grade_adjustment)}
print(adjust_grades({'ann': 75, 'bob': 80}, [5, -5]))
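Several answers mention that the two inputs must have the same length. One possible way to enforce that, while also leaving the caller's dictionary untouched, is sketched below. Raising ValueError is just one choice for handling a mismatch; the question does not say what should happen in that case.

```python
def adjust_grades(roster, grade_adjustment):
    """Return a new dict with each grade shifted by the adjustment at the same position."""
    # Assumption: adjustments are matched to names by dictionary insertion order.
    if len(roster) != len(grade_adjustment):
        raise ValueError("roster and grade_adjustment must have the same length")
    return {
        name: grade + delta
        for (name, grade), delta in zip(roster.items(), grade_adjustment)
    }


original = {'ann': 75, 'bob': 80}
adjusted = adjust_grades(original, [5, -5])
print(adjusted)   # {'ann': 80, 'bob': 75}
print(original)   # {'ann': 75, 'bob': 80}, the input is not modified
```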

New dict of top n values (and keys) from dictionary (Python)

I have a dictionary of names and the number of times the names appear in the phone book:
names_dict = {
'Adam': 100,
'Anne': 400,
'Britney': 321,
'George': 645,
'Joe': 200,
'John': 1010,
'Mike': 500,
'Paul': 325,
'Sarah': 150
}
Preferably without using sorted(), I want to iterate through the dictionary and create a new dictionary that has the top five names only:
def sort_top_list():
    # create dict of any 5 names first
    new_dict = {}
    for i in names_dict.keys()[:5]:
        new_dict[i] = names_dict[i]
    # Find smallest current value in new_dict
    # and compare to others in names_dict
    # to find bigger ones; replace smaller name in new_dict with bigger name
    for k,v in names_dict.iteritems():
        current_smallest = min(new_dict.itervalues())
        if v > current_smallest:
            # Found a bigger value; replace smaller key/ value in new_dict with larger key/ value
            new_dict[k] = v
            # ?? delete old key/ value pair from new_dict somehow
I seem to be able to create a new dictionary that gets a new key/ value pair whenever we iterate through names_dict and find a name/ count that is higher than what we have in new_dict. I can't figure out, though, how to remove the smaller ones from new_dict after we add the bigger ones from names_dict.
Is there a better way - without having to import special libraries or use sorted() - to iterate through a dict and create a new dict of the top N keys with the highest values?
You should use the heapq.nlargest() function to achieve this:
import heapq
from operator import itemgetter
top_names = dict(heapq.nlargest(5, names_dict.items(), key=itemgetter(1)))
This uses a more efficient algorithm (O(NlogK) for a dict of size N, and K top items) to extract the top 5 items as (key, value) tuples, which are then passed to dict() to create a new dictionary.
Demo:
>>> import heapq
>>> from operator import itemgetter
>>> names_dict = {'Adam': 100, 'Anne': 400, 'Britney': 321, 'George': 645, 'Joe': 200, 'John': 1010, 'Mike': 500, 'Paul': 325, 'Sarah': 150}
>>> dict(heapq.nlargest(5, names_dict.items(), key=itemgetter(1)))
{'John': 1010, 'George': 645, 'Mike': 500, 'Anne': 400, 'Paul': 325}
You probably want to use the collections.Counter() class instead. The Counter.most_common() method would have made your use-case trivial to solve. The implementation for that method uses heapq.nlargest() under the hood.
These are not special libraries; they are part of the Python standard library. Otherwise you would have to implement a binary heap yourself to achieve this. Unless you are specifically studying this algorithm, there is little point in re-implementing your own: the Python implementation is highly optimised, with an extension written in C for some critical functions.
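For completeness, a minimal sketch of the Counter route mentioned above, using the data from the question. Counter accepts an existing mapping of counts, and most_common(5) returns the five largest (key, value) pairs in descending order.

```python
from collections import Counter

names_dict = {
    'Adam': 100, 'Anne': 400, 'Britney': 321, 'George': 645, 'Joe': 200,
    'John': 1010, 'Mike': 500, 'Paul': 325, 'Sarah': 150,
}

# Wrap the existing counts in a Counter, take the five most common
# (name, count) pairs, and turn them back into a plain dict.
top_five = dict(Counter(names_dict).most_common(5))
print(top_five)
# {'John': 1010, 'George': 645, 'Mike': 500, 'Anne': 400, 'Paul': 325}
```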
I don't know why you don't want to use sorted(), and this solution is not perfect and doesn't even match your problem exactly, but I hope it can inspire you to find your own implementation. I assume your example is only a short stand-in for the real problem you have.
But as you have seen in the other answer: normally it is better to use code that has already been written than to do everything yourself.
names_dict = {'Joe' : 200, 'Anne': 400, 'Mike': 500, 'John': 1010, 'Sarah': 150, 'Paul': 325, 'George' : 645, 'Adam' : 100, 'Britney': 321}
def extract_top_n(dictionary, count):
    # first step: find the topmost values
    highest_values = []
    for k, v in dictionary.items():
        print(k, v, highest_values, len(highest_values))  # debug output
        highest_values.append(v)
        l = len(highest_values)
        # bubble the new value towards the front to keep the list in descending order
        for i in range(l-1):
            print(i, l)  # debug output
            if l-i < 1:
                break
            if highest_values[l-i-1] > highest_values[l-i-2]:
                temp = highest_values[l-i-2]
                highest_values[l-i-2] = highest_values[l-i-1]
                highest_values[l-i-1] = temp
        highest_values = highest_values[:count]
    # fill the dictionary with all entries at least as big as the smallest of the biggest
    # but pay attention: if the smallest of the top N values occurs more than once, there will be more than N entries in the dictionary
    last_interesting = highest_values[len(highest_values)-1]
    return_dictionary = {}
    for k, v in dictionary.items():
        if v >= last_interesting:
            return_dictionary[k] = v
    return return_dictionary
print(extract_top_n(names_dict, 3))
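The part the question got stuck on, removing the smaller entry once a bigger one is found, can be handled by looking up the key of the current minimum and deleting it before inserting the new pair. The sketch below is one way to do that without sorted() or any imports. The function name top_n is made up for illustration, and the approach costs roughly O(N*n) comparisons, so it is slower than heapq.nlargest for large n.

```python
def top_n(counts, n):
    """Return a dict holding the n entries of counts with the largest values."""
    if n <= 0:
        return {}
    top = {}
    for name, count in counts.items():
        if len(top) < n:
            # Still filling up: accept every entry.
            top[name] = count
            continue
        # Key of the smallest value currently kept.
        smallest = min(top, key=top.get)
        if count > top[smallest]:
            # Evict the smallest entry, then add the larger one.
            del top[smallest]
            top[name] = count
    return top


names_dict = {
    'Adam': 100, 'Anne': 400, 'Britney': 321, 'George': 645, 'Joe': 200,
    'John': 1010, 'Mike': 500, 'Paul': 325, 'Sarah': 150,
}
print(top_n(names_dict, 5))
```

The result holds the same five names as the heapq answer, though not in descending order, because entries stay in the order they were inserted.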

Keeping name and score together while sorting

so I need to sort some high scores into order and here is the code I already have:
def sortscores():
    namelist = []
    scorelist = []
    hs = open("hst.txt", "r")
    hscounter = 0
    for line in hs:
        if counter%2 !=0:
            name = line
            othername = name[0:len(name)-1]
            namelist.append(othername)
        else:
            scorelist.append(int(line))
This puts the names and scores into lists, so now I need to sort them. I can't use the .sort() method because I have to write the sort myself, so can anyone tell me how I would do this? (Sort the scores into descending order whilst keeping the names with the correct scores.)
If you store your high scores in (name, score) tuples, then you can easily keep them together. Since you need to write the sort function yourself, it might be helpful to look at an example of using tuples in another problem. Here's an example of simply finding the maximum score while keeping the name and scores together.
First, set up the data. You can use zip for this
names = ['John', 'Jane', 'Tim', 'Sara']
scores = [100, 120, 80, 90]
data = list(zip(names, scores)) # For Python 2.X you don't need the 'list' constructor
print(data)
Outputs:
[('John', 100), ('Jane', 120), ('Tim', 80), ('Sara', 90)]
Now find the maximum entry:
max_entry = ('', 0)
for entry in data:
    if entry[1] > max_entry[1]:
        max_entry = entry
print(max_entry)
Outputs:
('Jane', 120)
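If it helps, the same maximum-finding loop can be repeated to build a full ordering, which is essentially a selection sort. This is only one possible hand-written sort and the function name sort_by_score is invented here. It is simple rather than fast, which should be fine for a short high-score table.

```python
def sort_by_score(entries):
    """Return a new list of (name, score) tuples, highest score first (selection sort)."""
    remaining = list(entries)      # copy so the caller's list is left untouched
    ordered = []
    while remaining:
        # Find the entry with the highest score among those not yet placed.
        best = remaining[0]
        for entry in remaining[1:]:
            if entry[1] > best[1]:
                best = entry
        remaining.remove(best)     # removes the first entry equal to best
        ordered.append(best)
    return ordered


data = [('John', 100), ('Jane', 120), ('Tim', 80), ('Sara', 90)]
print(sort_by_score(data))
# [('Jane', 120), ('John', 100), ('Sara', 90), ('Tim', 80)]
```

Because each name travels inside the same tuple as its score, the pairs cannot get separated during the sort.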
You could make a copy of your dict, find the biggest value, save its key to a list, remove the key from the copy, and then repeat until the copied dict is empty.
import copy
scores = {'hawks': 23, 'eagles': 42, 'knights': 33, 'rabbits': 44}  # this, or read from .txt
scorescopy = copy.deepcopy(scores)  # makes a copy of the dict, so you don't change the dict when deleting keys from the copy
rank = []  # the list in which we want the keys ranked by value
def keywithmaxval(scores):  # find the key with the highest value (taken from another Stack Overflow question)
    values = list(scores.values())
    keys = list(scores.keys())
    return keys[values.index(max(values))]
while len(scorescopy) > 0:  # repeats until the copy of the dict is empty
    maxkey = keywithmaxval(scorescopy)
    scorescopy.pop(maxkey)  # deletes the key from the copy of the dict
    rank.append(maxkey)  # puts the key in the ranked list
print('rank', rank)  # list of keys ranked by value
print('copy of dict', scorescopy)  # copy of dict, should be empty after we looped through
print('original dict', scores)  # original dict, should be unchanged
print('\nRank:')
for key in rank:
    print(key, ':', scores[key])  # pretty list of keys and values

How to create a new layer of sublists based on a common key within each sublist in order to categorize the sublists?

How to create a new layer of sublists based on a common key within each sublist in order to categorize the sublists? In other words, how do you place sublists into a new sublist within the list where each item at index 1 is the same?
For example, I'd like to turn the following list of sublists into a list of sublists in which each sublist is in a new sublist where each item at index 1 is the same within that sublist. I'd like to place the sublists of apples, bananas and oranges in this list into a new sublist.
lsta = [['2014W01','apple',21,'apple#gmail.com'],['2014W02','apple',19,'apple#g.com'],['2014W02','banana',51,'b#gmail.com'],['2014W03','apple',100,'apple#gmail.com'],['2014W01','banana',71,'b#yahoo.com'],['2014W02','organge',21,'organge#gmail.com']]
I'd like the three sublists of apples to be contained within a new sublist, as well as the two sublists of bananas into a new sublist, etc.
Desired_List = [[['2014W01','apple',21,'apple#gmail.com'],['2014W02','apple',19,'apple#g.com'],['2014W03','apple',100,'apple#gmail.com']],[['2014W02','banana',51,'b#gmail.com'],['2014W01','banana',71,'b#yahoo.com']],[['2014W02','organge',21,'organge#gmail.com']]]
Bonus points, if you could tell me how to do multiple categorizations (e.g. not only separating by fruit type, but also by week)?
In [43]: import itertools as IT
In [44]: import operator
In [46]: [list(grp) for key, grp in IT.groupby(sorted(lsta, key=operator.itemgetter(1)), key=operator.itemgetter(1))]
Out[46]:
[[['2014W01', 'apple', 21, 'apple#gmail.com'],
['2014W02', 'apple', 19, 'apple#g.com'],
['2014W03', 'apple', 100, 'apple#gmail.com']],
[['2014W02', 'banana', 51, 'b#gmail.com'],
['2014W01', 'banana', 71, 'b#yahoo.com']],
[['2014W02', 'organge', 21, 'organge#gmail.com']]]
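One detail worth spelling out is why the list is sorted first: itertools.groupby only merges neighbouring items that share a key. The sketch below contrasts the unsorted and sorted cases on the question's data. The names by_fruit, unsorted_groups and grouped are just illustrative.

```python
from itertools import groupby
from operator import itemgetter

lsta = [
    ['2014W01', 'apple', 21, 'apple#gmail.com'],
    ['2014W02', 'apple', 19, 'apple#g.com'],
    ['2014W02', 'banana', 51, 'b#gmail.com'],
    ['2014W03', 'apple', 100, 'apple#gmail.com'],
    ['2014W01', 'banana', 71, 'b#yahoo.com'],
    ['2014W02', 'organge', 21, 'organge#gmail.com'],
]
by_fruit = itemgetter(1)

# Without sorting, 'apple' and 'banana' each show up in two separate groups,
# because their rows are not adjacent in lsta.
unsorted_groups = [(key, len(list(grp))) for key, grp in groupby(lsta, key=by_fruit)]
print(unsorted_groups)

# sorted() is stable, so rows keep their original relative order within each fruit.
grouped = [list(grp) for key, grp in groupby(sorted(lsta, key=by_fruit), key=by_fruit)]
print([len(group) for group in grouped])   # [3, 2, 1]
```

To categorise by more than one field, the same pattern should work with a key such as itemgetter(1, 0), which sorts and groups by (fruit, week) pairs.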
Normally, I'd use itertools.groupby on this, but just for fun, here's a method that does all the heavy lifting manually
def transform(lista):
    d = {}
    for subl in lista:
        k = subl.pop(1)
        if k not in d:
            d[k] = []
        d[k].append(subl)
    answer = []
    for k, lists in d.items():
        temp = []
        for l in lists:
            l.insert(1, k)
            temp.append(l)
        answer.append(temp)
    return answer
Output:
In [56]: transform(lsta)
Out[56]:
[[['2014W02', 'organge', 21, 'organge#gmail.com']],
[['2014W01', 'apple', 21, 'apple#gmail.com'],
['2014W02', 'apple', 19, 'apple#g.com'],
['2014W03', 'apple', 100, 'apple#gmail.com']],
[['2014W02', 'banana', 51, 'b#gmail.com'],
['2014W01', 'banana', 71, 'b#yahoo.com']]]
I'll take a bit of a different tack. You probably want your group-by field to be the lookup value in a dict. The value can just be a list of various.. whatever you want to call each sublist here. I'll call each one a FruitPerson.
from collections import defaultdict, namedtuple
FruitPerson = namedtuple('FruitPerson','id age email')
d = defaultdict(list)
for sublist in lsta:
    d[sublist[1]].append(FruitPerson(sublist[0],*sublist[2:]))
Then, for example:
d['apple']
Out[19]:
[FruitPerson(id='2014W01', age=21, email='apple#gmail.com'),
FruitPerson(id='2014W02', age=19, email='apple#g.com'),
FruitPerson(id='2014W03', age=100, email='apple#gmail.com')]
d['apple'][0]
Out[20]: FruitPerson(id='2014W01', age=21, email='apple#gmail.com')
d['apple'][0].id
Out[21]: '2014W01'
Edit: okay, multiple-categorization-bonus-point question. You just need to nest your dictionaries. The syntax gets a little goofy because the argument to defaultdict has to be a callable; you can do this with either lambda or functools.partial:
FruitPerson = namedtuple('FruitPerson','age email') #just removed 'id' field
d = defaultdict(lambda: defaultdict(list))
for sublist in lsta:
    d[sublist[1]][sublist[0]].append(FruitPerson(*sublist[2:]))
d['apple']
Out[37]: defaultdict(<type 'list'>, {'2014W03': [FruitPerson(age=100, email='apple#gmail.com')], '2014W02': [FruitPerson(age=19, email='apple#g.com')], '2014W01': [FruitPerson(age=21, email='apple#gmail.com')]})
d['apple']['2014W01']
Out[38]: [FruitPerson(age=21, email='apple#gmail.com')]
d['apple']['2014W01'][0].email
Out[40]: 'apple#gmail.com'
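The functools.partial alternative to the lambda might look like the sketch below. It stores plain (age, email) tuples instead of FruitPerson to stay short, and it reuses the field names id, age and email from the answer above.

```python
from collections import defaultdict
from functools import partial

lsta = [
    ['2014W01', 'apple', 21, 'apple#gmail.com'],
    ['2014W02', 'apple', 19, 'apple#g.com'],
    ['2014W02', 'banana', 51, 'b#gmail.com'],
    ['2014W03', 'apple', 100, 'apple#gmail.com'],
    ['2014W01', 'banana', 71, 'b#yahoo.com'],
    ['2014W02', 'organge', 21, 'organge#gmail.com'],
]

# partial(defaultdict, list) is a callable that returns a fresh defaultdict(list),
# which is exactly what the outer defaultdict needs as its default factory.
d = defaultdict(partial(defaultdict, list))
for id_, fruit, age, email in lsta:
    d[fruit][id_].append((age, email))

print(d['apple']['2014W01'])   # [(21, 'apple#gmail.com')]
```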
Though honestly at this point you should consider moving up to a real relational database that can understand SELECT whatever FROM whatever WHERE something type queries.
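As a rough illustration of that suggestion, the standard library's sqlite3 module can hold the same rows in an in-memory database. The table name entries and the column names are invented for this sketch.

```python
import sqlite3

lsta = [
    ['2014W01', 'apple', 21, 'apple#gmail.com'],
    ['2014W02', 'apple', 19, 'apple#g.com'],
    ['2014W02', 'banana', 51, 'b#gmail.com'],
    ['2014W03', 'apple', 100, 'apple#gmail.com'],
    ['2014W01', 'banana', 71, 'b#yahoo.com'],
    ['2014W02', 'organge', 21, 'organge#gmail.com'],
]

# Hypothetical schema: one row per sublist, columns named after its fields.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (week TEXT, fruit TEXT, age INTEGER, email TEXT)")
conn.executemany("INSERT INTO entries VALUES (?, ?, ?, ?)", lsta)

# Categorise by fruit: all apple rows, ordered by week.
rows = conn.execute(
    "SELECT week, age, email FROM entries WHERE fruit = ? ORDER BY week",
    ("apple",),
).fetchall()
print(rows)

# Count the rows per fruit.
counts = conn.execute(
    "SELECT fruit, COUNT(*) FROM entries GROUP BY fruit ORDER BY fruit"
).fetchall()
print(counts)   # [('apple', 3), ('banana', 2), ('organge', 1)]

conn.close()
```

Filtering by both fruit and week then becomes a matter of adding another condition to the WHERE clause rather than nesting another dictionary.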
