My code:
import openpyxl
workbook = openpyxl.load_workbook('master.xlsx')
worksheet = workbook.worksheets[0]
result = {}
for k, v in zip(worksheet['A'], worksheet['B']):
    result[k.internal_value] = v.internal_value
print(result)
The output I get:
{'PPPPP': '22', 'bbbbb': '20', 'ccccc': '30', 'ddddd': '40', 'eeeee': '50'}
Excel file:
The output I want:
{'PPPPP': ['22','10'], 'bbbbb': ['20','30'], 'ccccc': ['30','30'], 'ddddd': '40', 'eeeee': '50'}
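You can do it using pandas:
import pandas as pd
# Read the first sheet without a header row and name the two columns A and B
df = pd.read_excel('master.xlsx', sheet_name=0, header=None, names=['A', 'B'])
result = {}
for x, y in zip(df['A'], df['B']):
    if x in result:
        # Wrap an existing single value in a list only once, then append
        if not isinstance(result[x], list):
            result[x] = [result[x]]
        result[x].append(str(y))
    else:
        result[x] = str(y)
print(result)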
{'ppp': ['10', '22'], 'bbb': ['20', '30'], 'ccc': ['30', '30'], 'ddd': '40', 'eee': '50'}
Use a defaultdict, with an empty list as the default, and append each new value:
from collections import defaultdict
import openpyxl
workbook = openpyxl.load_workbook('master.xlsx')
worksheet = workbook.worksheets[0]
result = defaultdict(list)
for k, v in zip(worksheet['A'], worksheet['B']):
    result[k.internal_value].append(v.internal_value)
print(result)
Here every value will be a list, even when a key has only one value; for example, you will get 'ddddd': ['40']. The upside is that you can handle all key-value pairs consistently.
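As a rough illustration of the grouping step on its own, the sketch below applies the same defaultdict pattern to an in-memory list of pairs instead of a workbook. The rows data and the collapse_singletons helper are made up for this example; the helper shows one possible way to get the mixed output from the question, assuming that shape is really wanted.

```python
from collections import defaultdict

# Hypothetical stand-in for the (column A, column B) values of the sheet
rows = [('PPPPP', '22'), ('bbbbb', '20'), ('ccccc', '30'),
        ('ddddd', '40'), ('eeeee', '50'),
        ('PPPPP', '10'), ('bbbbb', '30'), ('ccccc', '30')]

grouped = defaultdict(list)
for key, value in rows:
    grouped[key].append(value)  # a missing key starts out as an empty list


def collapse_singletons(mapping):
    """Return a plain dict where one-element lists are replaced by the element."""
    return {k: v[0] if len(v) == 1 else v for k, v in mapping.items()}


result = collapse_singletons(grouped)
print(result)
```

Keeping every value as a list (that is, skipping the helper) is usually easier to work with later, since no code has to check whether a value is a string or a list.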
I was trying to combine two lists, 'Names' and 'Ages'.
I wanted to do that by appending the elements at each index to another list, one pair at a time.
So instead of ['John', '17', 'Mike', '21'], my goal was for each pair to be its own list element, like this: [['John', '17'], ['Mike', '21']]
(Note: I know I can do that with the zip() function; this is for practice.)
So I ended up with this code:
names = ['Ana', 'John', 'Bob', 'Mike', 'July']
ages = ['17', '22', '33', '8', '76']
a = []
b = []
for i in range(len(names)):
    a.append(names[i])
    a.append(ages[i])
    b.append([] + a)
    a.clear()
print(b)
Output --> [['Ana', '17'], ['John', '22'], ['Bob', '33'], ['Mike', '8'], ['July', '76']]
So as you can see I managed to do that, but the weird thing is the line b.append([] + a). I got what I wanted accidentally; when I write b.append(a) instead, b ends up holding only empty lists.
By following the approach in the code above, I accomplish what I'm trying to do. Can anybody explain why this works? I could not figure it out.
Thanks in advance.
Adding prints to the code shows that b only looks 'cleared' after the loop; it was not storing the correct information inside the loop either. Every element of b is a reference to the same list a, not a copy of it:
names = ['Ana', 'John', 'Bob', 'Mike', 'July']
ages = ['17', '22', '33', '8', '76']
a = []
b = []
for i in range(len(names)):
    a.append(names[i])
    a.append(ages[i])
    print(a)
    b.append(a)
    print(b)
    a.clear()
print(b)
['Ana', '17']
[['Ana', '17']]
['John', '22']
[['John', '22'], ['John', '22']]
['Bob', '33']
[['Bob', '33'], ['Bob', '33'], ['Bob', '33']]
['Mike', '8']
[['Mike', '8'], ['Mike', '8'], ['Mike', '8'], ['Mike', '8']]
['July', '76']
[['July', '76'], ['July', '76'], ['July', '76'], ['July', '76'], ['July', '76']]
[[], [], [], [], []]
This is because lists are mutable in Python, and b.append(a) stores a reference to a rather than a copy of it. When you clear a, every element of b that refers to it becomes empty as well. When you do [] + a, you create a new list that is no longer a reference to a. Changing the code this way gives you what you want:
names = ['Ana', 'John', 'Bob', 'Mike', 'July']
ages = ['17', '22', '33', '8', '76']
b = []
for i in range(len(names)):
    a = []
    a.append(names[i])
    a.append(ages[i])
    b.append(a)
print(b)
To help you understand what I mean by mutable, see the following example:
a = ['some data']
b = [a]
print(b)
a.clear()
print(b)
[['some data']]
[[]]
And this is why a+[] works:
a = ['some data']
b = [a+[]]
print(b)
a.clear()
print(b)
[['some data']]
[['some data']]
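The difference between storing a reference and storing a copy can also be checked directly with the is operator. The sketch below is only an illustration; the variable names same_object and copies are made up, and [] + a is shown next to three other common ways of making a shallow copy.

```python
a = ['some data']

same_object = [a]                             # stores a reference to a
copies = [[] + a, a.copy(), a[:], list(a)]    # four ways to make a shallow copy

# The stored reference is the very same object as a ...
is_same = same_object[0] is a
# ... while every copy is equal in content but a distinct object
are_distinct = all(c == a and c is not a for c in copies)

a.clear()

print(is_same, are_distinct)  # True True
print(same_object)            # [[]]
print(copies)                 # [['some data'], ['some data'], ['some data'], ['some data']]
```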
If both lists have the same number of elements, you can use the zip() function.
Note: zip() stops at the end of the shortest iterable passed to it.
list1 = []
for x, y in zip(names, ages):
    list1.append([x, y])
print(list1)
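To make the note about the shortest list concrete, here is a small sketch with deliberately unequal lists. The shortened data and the names truncated and padded are invented for the example; itertools.zip_longest is one standard-library option when the leftover elements should be kept.

```python
from itertools import zip_longest

names = ['Ana', 'John', 'Bob']
ages = ['17', '22']  # deliberately one element short

# zip() silently drops 'Bob' because ages runs out first
truncated = [[n, a] for n, a in zip(names, ages)]

# zip_longest() keeps going and fills the gap with fillvalue
padded = [[n, a] for n, a in zip_longest(names, ages, fillvalue=None)]

print(truncated)  # [['Ana', '17'], ['John', '22']]
print(padded)     # [['Ana', '17'], ['John', '22'], ['Bob', None]]
```

On Python 3.10 and later, zip(names, ages, strict=True) raises a ValueError instead of truncating, which may be preferable if unequal lengths indicate a bug.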
Following is my solution using a list comprehension.
names = ['Ana', 'John', 'Bob', 'Mike', 'July']
ages = ['17', '22', '33', '8', '76']
new_list = [ [names[i], ages[i]] for i in range(len(names))]
print(new_list)
names = ['Ana', 'John', 'Bob', 'Mike', 'July']
ages = ['17', '22', '33', '8', '76']
a = []
b = []
for i in range(len(names)):
    a.append(names[i])
    a.append(ages[i])
    b.append([]+a)
    a.clear()
print(b)
In your code, b.append([]+a) concatenates an empty list with a on every iteration.
If you check with a print statement like this:
for i in range(len(names)):
    a.append(names[i])
    a.append(ages[i])
    print("=>", a)
    a.clear()
the output is:
=> ['Ana', '17']
=> ['John', '22']
=> ['Bob', '33']
=> ['Mike', '8']
=> ['July', '76']
So a holds exactly one pair each time b.append([]+a) runs.
Starting from
b = []
each call to
b.append([]+a)
builds a new list with the same elements as a and appends that new list to b, so the later a.clear() does not affect it.
I think you can solve your problem more easily by using
zip() for the iteration.
myList = []
for a, b in zip(names, ages):
    myList.append([a, b])
print(myList)
output:
[['Ana', '17'], ['John', '22'], ['Bob', '33'], ['Mike', '8'], ['July', '76']]
I would use zip within a list comprehension:
names = ['Ana', 'John', 'Bob', 'Mike', 'July']
ages = ['17', '22', '33', '8', '76']
b = [[name,age] for name, age in zip(names,ages)]
You can use zip and list to write this in a single line of code (note that each pair is then a tuple rather than a list):
result = list(zip(names, ages))
What is the best way to merge two lists into one and also combine the values of duplicate entries? For example:
list_01 = [['2020-01-02', '2020-01-03', '2020-01-04', '2020-01-06'],
           ['10', '20', '30', '40']]
list_02 = [['2020-01-04', '2020-01-05', '2020-01-06', '2020-01-07'],
           ['10', '20', '30', '40']]
The final list should look like this:
list_03 = [['2020-01-02', '2020-01-03', '2020-01-04', '2020-01-05', '2020-01-06', '2020-01-07'],
           ['10', '20', '40', '30', '70', '40']]
Whenever the dates match, the integer values in the second list are summed together.
Right now, my only real solution is to pass both lists through several loops, but I wonder if there might be a better solution.
Thanks and a great evening for all of you.
Your "integers" should really be ints, not strings, and your lists should probably be Counters, as you seem to be counting things per day. Then you can simply add them:
from collections import Counter
list_01 = [['2020-01-02', '2020-01-03', '2020-01-04', '2020-01-06'],
           ['10', '20', '30', '40']]
list_02 = [['2020-01-04', '2020-01-05', '2020-01-06', '2020-01-07'],
           ['10', '20', '30', '40']]
def to_counter(lst):
    return Counter(dict(zip(lst[0], map(int, lst[1]))))
counter = to_counter(list_01) + to_counter(list_02)
for item in counter.items():
    print(item)
Prints:
('2020-01-02', 10)
('2020-01-03', 20)
('2020-01-04', 40)
('2020-01-06', 70)
('2020-01-05', 20)
('2020-01-07', 40)
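If the two-list layout from the question is still needed afterwards, one possible way back from the Counter is sketched below. The names total and list_03, and the choice to convert the sums back to strings, are assumptions made for illustration. Note that adding Counters keeps only positive totals, so a date whose sum is zero or negative would be dropped.

```python
from collections import Counter

list_01 = [['2020-01-02', '2020-01-03', '2020-01-04', '2020-01-06'],
           ['10', '20', '30', '40']]
list_02 = [['2020-01-04', '2020-01-05', '2020-01-06', '2020-01-07'],
           ['10', '20', '30', '40']]


def to_counter(lst):
    # Pair each date with its value converted to int
    return Counter(dict(zip(lst[0], map(int, lst[1]))))


total = to_counter(list_01) + to_counter(list_02)

# ISO-formatted date strings sort chronologically as plain strings
dates = sorted(total)
list_03 = [dates, [str(total[d]) for d in dates]]
print(list_03)
```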
Try this, make dictionaries. Let me know if this isn't what you want or is confusing.
dict_01 = {list_01[0][i]:int(list_01[1][i]) for i in range(len(list_01[0]))}
dict_02 = {list_02[0][i]:int(list_02[1][i]) for i in range(len(list_02[0]))}
dates = list(set(list_01[0] + list_02[0]))
dates.sort()
list_03 = [dates, [dict_01.get(date, 0) + dict_02.get(date, 0) for date in dates]]
@Tomerikoo points out a more elegant way to form the dictionaries:
dict_01 = dict(zip(*list_01))
dict_02 = dict(zip(*list_02))
As @HeapOverflow points out, if you do this the values stay strings, so you should convert them to int when summing:
list_03 = [dates, [int(dict_01.get(date, 0)) + int(dict_02.get(date, 0)) for date in dates]]
This returns
[['2020-01-02', '2020-01-03', '2020-01-04', '2020-01-05', '2020-01-06', '2020-01-07'], [10, 20, 40, 20, 70, 40]]
I think this is right, and the value for 2020-01-05 should be 20, not 30.
The best way to do it would probably be to use defaultdict, which provides a default value to each key, even if you've never introduced that key to the dictionary before. Then all you have to do is add whatever value belongs to that key (which is the date) from both lists. Then when you have this dictionary, just get the key-value pairs as items and unzip it into two lists.
from collections import defaultdict
mydict = defaultdict(int) # default values are 0
list_01 = [['2020-01-02', '2020-01-03', '2020-01-04', '2020-01-06'], ['10', '20', '30', '40']]
list_02 = [['2020-01-04', '2020-01-05', '2020-01-06', '2020-01-07'], ['10', '20', '30', '40']]
for t in [list_01, list_02]:
    for key, value in zip(t[0], t[1]):
        mydict[key] += int(value)
print(list(zip(*sorted(mydict.items()))))
This prints:
[('2020-01-02', '2020-01-03', '2020-01-04', '2020-01-05', '2020-01-06', '2020-01-07'),
 (10, 20, 40, 20, 70, 40)]
I want to combine the lists of ages of the groups that share a repeated name...
My code:
dic1 = {'g1': ['45', '35', '56', '65'], 'g2': ['67', '76'], 'g3':['8', '96']}
dic2 = {'g1': ['akshay', 'swapnil', 'parth','juhi'], 'g2': ['megha', 'varun'], 'g3': ['gaurav', 'parth']}
for key2, name_list in dic2.items():
    for name in name_list:
        if name == 'parth':
            for key1, age_list in dic1.items():
                if key1 == key2:
                    print(age_list)
The output is:
['45', '35', '56', '65']
['8', '96']
I want the output as:
['45', '35', '56', '65', '8', '96']
Can someone help me with this?
There is a more Pythonic way than that: you need to chain the lists. Also, there is no need for so many loops; a one-liner should do.
dic1 = {'g1': ['45', '35', '56', '65'], 'g2': ['67', '76'], 'g3':['8', '96']}
dic2 = {'g1': ['akshay', 'swapnil', 'parth','juhi'], 'g2': ['megha', 'varun'], 'g3': ['gaurav', 'parth']}
import itertools
result = list(itertools.chain.from_iterable(dic1[k] for k,v in dic2.items() if 'parth' in v))
>>> result
['45', '35', '56', '65', '8', '96']
A variant without itertools would be:
result = [x for k,v in dic2.items() if 'parth' in v for x in dic1[k]]
With a dict of sets instead of a dict of lists:
dic2 = {'g1': {'akshay', 'swapnil', 'parth','juhi'}, 'g2': {'megha', 'varun'}, 'g3': {'gaurav', 'parth'}}
These changes turn your roughly O(N**3) algorithm into a roughly O(N) one, because a membership test with in is O(N) on a list but O(1) on average on a set.
If you have a missing key, just replace dic1[k] by dic1.get(k,[]) or even dic1.get(k) or [].
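Putting the last two suggestions together, a minimal sketch could look like the following. The data is adapted from the question, with the 'g3' entry removed from dic1 on purpose to exercise the missing-key case; that omission is an assumption made for this example.

```python
# Ages per group; 'g3' is deliberately missing here
dic1 = {'g1': ['45', '35', '56', '65'], 'g2': ['67', '76']}

# Names per group, stored as sets so that the "in" test is fast
dic2 = {'g1': {'akshay', 'swapnil', 'parth', 'juhi'},
        'g2': {'megha', 'varun'},
        'g3': {'gaurav', 'parth'}}

# Collect the ages of every group that contains 'parth';
# dic1.get(group, []) yields an empty list for groups without ages
result = [age
          for group, members in dic2.items() if 'parth' in members
          for age in dic1.get(group, [])]

print(result)  # ['45', '35', '56', '65']
```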
You could either use itertools as mentioned in other answers, or just simplify your own code.
There is no need for a three-level nested for loop. Since Python dictionaries only allow
unique keys, you can eliminate the innermost for loop like so:
output_list = []
for key, name_list in dic2.items():
    if "parth" in name_list:
        output_list += dic1[key]
print(output_list)
Whenever you find an age list that should be displayed, add it to output_list with a simple +=.
Though the above code is easier to understand, I recommend using itertools.
I have a dictionary that looks like this:
scores = {'Ben': ['10', '9'], 'Alice': ['10', '10'], 'Tom': ['9', '8']}
I have calculated the average of the values for each person in the dictionary and I want to then store the averages in a separate dictionary. I would like it to look like this:
averages = {'Ben': [9.5], 'Alice': [10], 'Tom': [8.5]}
I have calculated the averages using this code:
for key, values in scores.items():
    avg = float(sum([int(i) for i in values])) / len(values)
    print(avg)
This gives the following output:
9.5
10.0
8.5
How can I output the averages in a separate dictionary as shown above?
Thanks in advance.
averages = {}  # Create a new empty dictionary to hold the averages
for key, values in scores.items():
    # Rather than store the average in a local variable, store it under the appropriate key in your new dictionary.
    averages[key] = float(sum([int(i) for i in values])) / len(values)
Use a dict comprehension:
>>> scores = {'Ben': ['10', '9'], 'Alice': ['10', '10'], 'Tom': ['9', '8']}
>>> {i:[float(sum(int(x) for x in scores[i]))/len(scores[i])] for i in scores}
{'Ben': [9.5], 'Alice': [10.0], 'Tom': [8.5]}
You can use a dictionary comprehension to loop over your items and calculate the proper result:
>>> from __future__ import division
>>> scores = {'Ben': ['10', '9'], 'Alice': ['10', '10'], 'Tom': ['9', '8']}
>>> scores = {k:[sum(map(int,v))/len(v)] for k,v in scores.items()}
>>> scores
{'Ben': [9.5], 'Alice': [10.0], 'Tom': [8.5]}
Note that you need to convert your values to int, which you can do with the map function: map(int, v).
You can do this with a dict comprehension in one line:
averages = {k: sum(float(i) for i in v) / len(v) for k, v in scores.items() if v}
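As an alternative sketch on Python 3.8 or later, statistics.fmean can do the averaging and always returns a float. The 'Eve' entry with an empty list is invented here to show what the trailing if filter does: people without scores are skipped instead of causing a division by zero. Wrapping each average in a list matches the layout the question asked for.

```python
from statistics import fmean

# 'Eve' is a hypothetical entry with no scores
scores = {'Ben': ['10', '9'], 'Alice': ['10', '10'], 'Tom': ['9', '8'], 'Eve': []}

averages = {name: [fmean([int(mark) for mark in marks])]
            for name, marks in scores.items() if marks}

print(averages)  # {'Ben': [9.5], 'Alice': [10.0], 'Tom': [8.5]}
```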