I have created a simple program that takes a list of simplified SKUs and quantities and adds them to another list. However, if the SKU is already in the second list, I would like to just increment the quantity rather than add a copy. I have tried a few different for loops, but I haven't been able to get it to work.
myList = [["a",1],["c",1],["a",1]] #[sku, qty]
newList = [["null",0]] #placeholder value so second for loop functions
for eachItem in myList:
    for eachNew in newList:
        if eachItem[0] == eachNew[0]: # if sku is in list increment qty
            eachNew[1] += eachItem[1]
            myList.remove(eachItem)
        else:
            newList.append([eachItem[0], eachItem[1]]) #else add the sku to the list
            myList.remove(eachItem)
#remove null place holder
for eachItem in newList:
    if eachItem[0] == "null":
        newList.remove(eachItem)
for eachItem in newList:
    print(eachItem)
My desired output would be:
['a', 2]
['c', 1]
EDIT: I just realized I wasn't clear enough in my OP. I don't want to count the number of times a sku appears; I want to add up all the quantities. It is possible that there will be quantities of more than one.
Try using value_counts from the pandas library like so:
import pandas as pd
myList = [["a",1],["c",1],["a",1]]
myList = pd.Series(myList)
myList.value_counts().to_list()
would yield:
[['a',2],['c',1]]
like this example
In [83]: data
array([4, 6, 6, 1, 2, 1, 0, 5, 3, 2, 4, 3, 1, 3, 5, 3, 0, 0, 4, 4, 6, 1, 0,
4, 3, 2, 1, 3, 1, 5, 6, 3, 1, 2, 4, 4, 3, 3, 2, 2, 2, 3, 2, 3, 0, 1,
2, 4, 5, 5])
In [84]: s = Series(data)
In [85]: s.value_counts()
3 11
2 9
4 8
1 8
5 5
0 5
6 4
dtype: int64
The implementation behind it is much more efficient than a Python for loop because the core is written in C. You can call the to_list() method, as in myList.value_counts().to_list(), to convert the result to a plain list.
* this code is untested
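Since value_counts counts how often each value occurs rather than adding up the quantities, a groupby with sum may be closer to what the question asks for. A minimal sketch, assuming pandas is installed; the column names "sku" and "qty" and the helper name sum_by_sku are made up for illustration:

```python
import pandas as pd

def sum_by_sku(pairs):
    """Sum the quantities of [sku, qty] pairs that share the same sku."""
    df = pd.DataFrame(pairs, columns=["sku", "qty"])
    # sort=False keeps the skus in order of first appearance
    totals = df.groupby("sku", sort=False)["qty"].sum()
    # Convert back to plain [sku, qty] lists with Python ints
    return [[sku, int(qty)] for sku, qty in totals.items()]

print(sum_by_sku([["a", 1], ["c", 1], ["a", 2]]))
```

For a handful of items a plain dict is likely simpler; pandas tends to pay off when the data is already in a DataFrame.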
Look into the collections.Counter class in the Python library documentation. It won't require any loops.
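One way this could look, as a sketch rather than the only option: a Counter can accumulate arbitrary integer quantities, not just occurrence counts. The helper name sum_quantities is hypothetical, and this version still uses one explicit loop over the input.

```python
from collections import Counter

def sum_quantities(pairs):
    """Add up the quantities of [sku, qty] pairs per sku."""
    totals = Counter()
    for sku, qty in pairs:
        totals[sku] += qty  # missing keys start at 0
    # Counter keeps insertion order on Python 3.7+
    return [[sku, qty] for sku, qty in totals.items()]

print(sum_quantities([["a", 1], ["c", 1], ["a", 1]]))
```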
How about something like this?
This uses a dict that maps each sku to a quantity, which lets you aggregate the values and keep track of what you have already counted.
Don't overcomplicate it.
myList = [["a",1],["c",1],["a",1]] #[sku, qty]
counter_dict = {} #schema {"sku": quantity:int }
for eachItem in myList:
    if eachItem[0] in counter_dict:
        counter_dict[eachItem[0]] += eachItem[1]
    else:
        counter_dict[eachItem[0]] = eachItem[1]
for key in counter_dict.keys():
    print([key, counter_dict[key]])
Not as elegant as pandas, and I couldn't figure out collections.Counter, so this is more along the lines of what you were doing. It builds a dictionary and then rebuilds the list with a comprehension.
mylist = [["a",1],["c",1],["a",1]]
newlist = [["null",0]]
mydict = {}
for i in mylist+newlist:
    if i[0] in mydict:
        mydict[i[0]]+=i[1]
    elif i[0] != 'null':
        mydict[i[0]]=i[1]
print([[x,mydict[x]] for x in mydict])
[['a', 2], ['c', 1]]
Hi! I have a list of lists, and when the first elements of the sublists are equal, I need to add up the second elements of those sublists and print the results. I have thought about it for a long time, but I just can't seem to figure out how this could be done. Here's an example of my problem:
num_list = [[1, 2], [3, 4], [1, 2], [3, 4], [3, 4]]
# 0th and 2nd sublists both have 1 as their first element.
# sum = 2 + 2. print out 4.
# all the remaining sublists have 3 as their first element.
# sum = 4 + 4 + 4. print out 12.
Thank you very much!
PS: I'm aware that this kind of mapping would be better done with a dictionary, but this is just a simplified version of my problem. My actual program has sublists with more than 2 values, and I need to compare more than 1 value for equality.
You can use defaultdict:
from collections import defaultdict
num_list = [[1, 2], [3, 4], [1, 2], [3, 4], [3, 4]]
d = defaultdict(int)
for item in num_list:
    d[item[0]] += item[1]
And the results are:
>>> d
defaultdict(<type 'int'>, {1: 4, 3: 12})
You can still use a dictionary for this task. Use tuples as keys:
>>> d = {(1,1): (2,2), (3,3): (4,4)}
>>> d
{(1, 1): (2, 2), (3, 3): (4, 4)}
>>> d[(1,1)]
(2, 2)
You might also want to learn about the Counter class. If your elements are more complex, I suggest wrapping them in objects and implementing the __add__ method to customize how they're added together.
from collections import Counter
c = Counter()
c[(1,1)] = 10
c[(2,2)] = 10
c[(1,1)] += 1
c2 = Counter()
c2[(2,2)] = 4
c2[(2,3)] = 5
Which gives:
>>> c
Counter({(1, 1): 11, (2, 2): 10})
>>> c + c2
Counter({(2, 2): 14, (1, 1): 11, (2, 3): 5})
Note that you cannot use lists as keys, as lists are mutable and thus unhashable. You have to use tuples.
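As a rough sketch of the __add__ suggestion, assuming each record carries two key fields followed by two quantities to be summed. The Totals class, its field names, the row layout, and the merge_rows helper are all hypothetical and not taken from the question:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Totals:
    """Hypothetical value object holding two quantities."""
    first: int = 0
    second: int = 0

    def __add__(self, other):
        # Add field by field and return a new object
        return Totals(self.first + other.first, self.second + other.second)

def merge_rows(rows):
    """Sum Totals for rows that share the same (key_a, key_b) tuple key."""
    merged = {}
    for key_a, key_b, first, second in rows:
        key = (key_a, key_b)  # tuples are hashable, so they can be dict keys
        merged[key] = merged.get(key, Totals()) + Totals(first, second)
    return merged

rows = [(1, 1, 2, 2), (3, 3, 4, 4), (1, 1, 5, 6)]
result = merge_rows(rows)
print(result)
```

Keeping the addition logic inside the class means the merging loop stays the same if more fields are added later.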
It seems that you did not describe your problem quite accurately enough.
Your real problem can only be grasped from your comments on both the question and the answer from @Blender. His nice solution does not work right away for what I understand to be your actual case, but it almost does.
Here's a way to extend to suit your needs:
# some toy example data - I understand you want the first 2 sub_list
# to be "merged" because BOTH strings in pos 0 and 2 match
data = [['42x120x1800', 50, '50x90x800', 60],
        ['42x120x1800', 8, '50x90x800', 10],
        ['2x10x800', 5, '5x9x80', 6]]
from collections import defaultdict
# I'm using a lambda to initialize the items of the dict
# to a two-element list of zeros
d = defaultdict(lambda :[0, 0])
for sub_list in data:
    key = (sub_list[0], sub_list[2])
    d[key][0] += sub_list[1]
    d[key][1] += sub_list[3]
for key in d:
    print(key, d[key])
# ('42x120x1800', '50x90x800') [58, 70]
# ('2x10x800', '5x9x80') [5, 6]
If you then want to go back to the initial representation of the data:
new_data = [[key[0], val[0], key[1], val[1]] for key, val in d.items()]
# [['42x120x1800', 58, '50x90x800', 70], ['2x10x800', 5, '5x9x80', 6]]
Using standard dict():
num_list = [[1, 2], [3, 4], [1, 2], [3, 4], [3, 4]]
d = dict()
for e in num_list:
    #get() checks if key exists, if not - returns 0
    d[e[0]] = d.get(e[0], 0) + e[1]
print(d)
It prints:
{1: 4, 3: 12}
I have data like
[2, 2, 2, 2, 2, 3, 13, 113]
which I then want to sort into separate lists by keys generated by myself. In fact I want to generate all possible lists.
Some examples:
values: [2, 2, 2, 2, 2, 3, 13, 113]
keys: [0, 0, 1, 2, 1, 3, 3, 1]
sublists: [2, 2], [2, 2, 113], [2], [3, 13]
values: [2, 2, 2, 2, 2, 3, 13, 113]
keys: [0, 1, 0, 0, 0, 1, 1, 0]
sublists: [2, 2, 2, 2, 113], [2, 3, 13]
values: [2, 2, 2, 2, 2, 3, 13, 113]
keys: [2, 3, 0, 0, 4, 4, 1, 3]
sublists: [2, 2], [13], [2], [2, 113], [2, 3]
All possible keys are generated by
import itertools

def generate_keys(prime_factors):
    key_size = len(prime_factors) - 1
    key_values = [str(i) for i in range(key_size)]
    return list(itertools.combinations_with_replacement(key_values,
                                                        len(prime_factors)))
Then I thought I could use the keys to shift the values into the sublists. That's the part I'm stuck on. I thought itertools.groupby would be my solution but upon further investigation I see no way to use my custom lists as keys for groupby.
How do I split my big list into smaller sublists using these keys? There may even be a way to do this without using keys. Either way, I don't know how to do it, and the other Stack Overflow questions I have looked at have been in the ballpark but not exactly this question.
This does what you want:
import collections

def sift(keys, values):
    answer = collections.defaultdict(list)
    kvs = zip(keys, values)
    for k,v in kvs:
        answer[k].append(v)
    return [answer[k] for k in sorted(answer)]
In [205]: keys = [0, 0, 1, 2, 1, 3, 3, 1]
In [206]: values = [2, 2, 2, 2, 2, 3, 13, 113]
In [207]: sift(keys,values)
Out[207]: [[2, 2], [2, 2, 113], [2], [3, 13]]
Explanation:
collections.defaultdict is a handy dict-like class that lets you define what should happen in the event that a key doesn't exist in the dictionary that you're trying to manipulate. For example, in my code, I have answer[k].append(v). We know that append is a list function, so we know that answer[k] should be a list. However, if I was using a conventional dict and I tried to append to the value of a non-existent key, I would have gotten a KeyError as follows:
In [212]: d = {}
In [213]: d[1] = []
In [214]: d
Out[214]: {1: []}
In [215]: d[1].append('one')
In [216]: d[1]
Out[216]: ['one']
In [217]: d
Out[217]: {1: ['one']}
In [218]: d[2].append('two')
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/Users/USER/<ipython-input-218-cc58f739eefa> in <module>()
----> 1 d[2].append('two')
KeyError: 2
Appending to a missing key works in my code only because I defined answer = collections.defaultdict(list). If I had defined answer = collections.defaultdict(int), I would have gotten a different error, one telling me that int objects don't have an append method.
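The difference between the two default factories can be sketched as follows; the variable names here are illustrative only:

```python
from collections import defaultdict

# With list as the factory, a missing key starts out as an empty list
groups = defaultdict(list)
groups["x"].append(1)

# With int as the factory, a missing key starts out as 0,
# and integers have no append method
counts = defaultdict(int)
try:
    counts["x"].append(1)
except AttributeError as exc:
    message = str(exc)

print(dict(groups))
print(message)
```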
zip, on the other hand, takes two lists (well, actually, it takes any number of iterables), let's call them list1 and list2, and pairs them up into tuples in which the ith tuple contains two objects. The first is list1[i] and the second is list2[i]. In Python 2 zip returns a list of these tuples; in Python 3 it returns a lazy iterator, so wrap it in list() if you need a list. If list1 and list2 are of unequal length, the number of tuples is the smaller of len(list1) and len(list2), i.e. min(len(list1), len(list2)).
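A small sketch of the truncation behaviour, written for Python 3 (hence the list() call); the sample data is made up:

```python
keys = [0, 0, 1]
values = [2, 2, 2, 3]  # one element longer than keys

# zip stops at the end of the shorter input, so the trailing 3 is dropped
pairs = list(zip(keys, values))
print(pairs)
print(len(pairs) == min(len(keys), len(values)))
```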
Once I've zipped keys and values, I want to create a dict that maps each value from keys to a list of values from values. This is why I used a defaultdict, so that I wouldn't have to check for the existence of a key in it before I appended to its value. If I had used a conventional dict, I would have had to do this:
answer = {}
kvs = zip(keys, values)
for k, v in kvs:
    if k in answer:
        answer[k].append(v)
    else:
        answer[k] = [v]
Now that you have a dict (or a dict-like object) that maps values from keys to lists of ints that share the same key, all you need to do is get the lists that are the values of answer, ordered by the keys of answer. sorted(answer) gives me a list of all of answer's keys in sorted order.
Once I have this list of sorted keys, all I have to do is get their values, which are lists of ints, and put all those lists into one big list and return that big list.
… annnnnd Done! Hope that helps
Given an unordered list of values like
a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
How can I get the frequency of each value that appears in the list, like so?
# `a` has 4 instances of `1`, 4 of `2`, 2 of `3`, 1 of `4`, 2 of `5`
b = [4, 4, 2, 1, 2] # expected output
In Python 2.7 (or newer), you can use collections.Counter:
>>> import collections
>>> a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
>>> counter = collections.Counter(a)
>>> counter
Counter({1: 4, 2: 4, 5: 2, 3: 2, 4: 1})
>>> counter.values()
dict_values([2, 4, 4, 1, 2])
>>> counter.keys()
dict_keys([5, 1, 2, 4, 3])
>>> counter.most_common(3)
[(1, 4), (2, 4), (5, 2)]
>>> dict(counter)
{5: 2, 1: 4, 2: 4, 4: 1, 3: 2}
>>> # Get the counts in order matching the original specification,
>>> # by iterating over keys in sorted order
>>> [counter[x] for x in sorted(counter.keys())]
[4, 4, 2, 1, 2]
If you are using Python 2.6 or older, you can download an implementation here.
If the list is sorted, you can use groupby from the itertools standard library (if it isn't, you can just sort it first, although this takes O(n lg n) time):
from itertools import groupby
a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
[len(list(group)) for key, group in groupby(sorted(a))]
Output:
[4, 4, 2, 1, 2]
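To illustrate why the sort matters: groupby only merges consecutive equal items, so unsorted input produces one group per run instead of one per value. A minimal sketch, with run_lengths as a hypothetical helper name:

```python
from itertools import groupby

def run_lengths(items):
    """Return (value, length) for each run of consecutive equal items."""
    return [(key, len(list(group))) for key, group in groupby(items)]

unsorted_runs = run_lengths([1, 2, 1])        # the two 1s are not adjacent
sorted_runs = run_lengths(sorted([1, 2, 1]))  # sorting makes them adjacent
print(unsorted_runs)
print(sorted_runs)
```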
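Python 2.7+ supports dictionary comprehensions. Building the dictionary from the list will get you the counts as well as get rid of duplicates. Note that a.count(x) rescans the whole list for every element, so this takes quadratic time on long lists.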
>>> a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>> d = {x:a.count(x) for x in a}
>>> d
{1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
>>> a, b = d.keys(), d.values()
>>> a
[1, 2, 3, 4, 5]
>>> b
[4, 4, 2, 1, 2]
Count the number of appearances manually by iterating through the list and counting them up, using a collections.defaultdict to track what has been seen so far:
from collections import defaultdict
appearances = defaultdict(int)
for curr in a:
    appearances[curr] += 1
In Python 2.7+, you could use collections.Counter to count items
>>> a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>>
>>> from collections import Counter
>>> c=Counter(a)
>>>
>>> c.values()
[4, 4, 2, 1, 2]
>>>
>>> c.keys()
[1, 2, 3, 4, 5]
Counting the frequency of elements is probably best done with a dictionary:
b = {}
for item in a:
    b[item] = b.get(item, 0) + 1
To remove the duplicates, use a set:
a = list(set(a))
You can do this:
import numpy as np
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
np.unique(a, return_counts=True)
Output:
(array([1, 2, 3, 4, 5]), array([4, 4, 2, 1, 2], dtype=int64))
The first array is values, and the second array is the number of elements with these values.
So if you want just the array of counts, you should use this:
np.unique(a, return_counts=True)[1]
Here's another succinct alternative using itertools.groupby which also works for unordered input:
from itertools import groupby
items = [5, 1, 1, 2, 2, 1, 1, 2, 2, 3, 4, 3, 5]
results = {value: len(list(freq)) for value, freq in groupby(sorted(items))}
results
Format: {value: num_of_occurrences}
{1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
I would simply use scipy.stats.itemfreq in the following manner:
from scipy.stats import itemfreq
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
freq = itemfreq(a)
a = freq[:,0]
b = freq[:,1]
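You can check the documentation here: http://docs.scipy.org/doc/scipy-0.16.0/reference/generated/scipy.stats.itemfreq.html. Note that itemfreq was deprecated in later SciPy releases and subsequently removed; np.unique(a, return_counts=True) is the usual replacement.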
from collections import Counter
import numpy as np
import pandas as pd
a=["E","D","C","G","B","A","B","F","D","D","C","A","G","A","C","B","F","C","B"]
counter=Counter(a)
kk=[list(counter.keys()),list(counter.values())]
pd.DataFrame(np.array(kk).T, columns=['Letter','Count'])
seta = set(a)
b = [a.count(el) for el in seta]
a = list(seta) #Only if you really want it.
Suppose we have a list:
fruits = ['banana', 'banana', 'apple', 'banana']
We can find out how many of each fruit we have in the list like so:
import numpy as np
(unique, counts) = np.unique(fruits, return_counts=True)
{x:y for x,y in zip(unique, counts)}
Result:
{'banana': 3, 'apple': 1}
This answer is more explicit
a = [1,1,1,1,2,2,2,2,3,3,3,4,4]
d = {}
for item in a:
    if item in d:
        d[item] = d.get(item)+1
    else:
        d[item] = 1
for k,v in d.items():
    print(str(k)+':'+str(v))
# output
#1:4
#2:4
#3:3
#4:2
#remove dups
d = set(a)
print(d)
#{1, 2, 3, 4}
For your first question, iterate over the list and use a dictionary to keep track of how often each element has been seen.
For your second question, just use set().
def frequencyDistribution(data):
    return {i: data.count(i) for i in data}
print(frequencyDistribution([1,2,3,4]))
...
{1: 1, 2: 1, 3: 1, 4: 1} # originalNumber: count
I am quite late, but this will also work, and will help others:
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
freq_list = []
a_l = list(set(a))
for x in a_l:
    freq_list.append(a.count(x))
print('Freq', freq_list)
print('number', a_l)
will produce this:
Freq [4, 4, 2, 1, 2]
number [1, 2, 3, 4, 5]
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
counts = dict.fromkeys(a, 0)
for el in a: counts[el] += 1
print(counts)
# {1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
# 1. Get counts and store in another list
output = []
for i in set(a):
    output.append(a.count(i))
print(output)
# 2. Remove duplicates using set constructor
a = list(set(a))
print(a)
A set does not allow duplicates, so passing a list to the set() constructor gives a collection of unique objects. The list's count() method returns how many times a given object occurs in the list. Each unique object is counted this way, and each count is appended to the initially empty list output.
The list() constructor then converts set(a) back into a list, which is assigned to the same variable a.
Output
D:\MLrec\venv\Scripts\python.exe D:/MLrec/listgroup.py
[4, 4, 2, 1, 2]
[1, 2, 3, 4, 5]
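One caveat worth sketching: Python does not guarantee the iteration order of a set, so the counts and the de-duplicated values are only reliably aligned if both come from the same ordered sequence. A minimal variant that sorts the unique values first (counts_by_value is a hypothetical helper name, and it assumes the values are sortable):

```python
def counts_by_value(items):
    """Return the sorted unique values and their counts, aligned by index."""
    unique = sorted(set(items))  # fix the order once, then reuse it
    counts = [items.count(value) for value in unique]
    return unique, counts

unique, counts = counts_by_value([5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2])
print(unique)
print(counts)
```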
Simple solution using a dictionary.
def frequency(l):
    d = {}
    for i in l:
        if i in d.keys():
            d[i] += 1
        else:
            d[i] = 1
    # return the first key with the highest count, plus all distinct keys
    for k, v in d.items():
        if v ==max (d.values()):
            return k,d.keys()
print(frequency([10,10,10,10,20,20,20,20,40,40,50,50,30]))
#!/usr/bin/python
def frq(words):
    freq = {}
    for w in words:
        if w in freq:
            freq[w] = freq.get(w)+1
        else:
            freq[w] =1
    return freq

# read the whole file; avoid shadowing the built-ins list and input
with open("poem","r") as fp:
    text = fp.read()
words = text.split()
print(words)
d = frq(words)
print("frequency of input\n: ")
print(d)
with open("output.txt","w+") as fp1:
    for k,v in d.items():
        fp1.write(str(k)+':'+str(v)+"\n")
from collections import OrderedDict
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
def get_count(lists):
    dictionary = OrderedDict()
    for val in lists:
        dictionary.setdefault(val,[]).append(1)
    return [sum(val) for val in dictionary.values()]
print(get_count(a))
>>>[4, 4, 2, 1, 2]
To remove duplicates and Maintain order:
list(dict.fromkeys(get_count(a)))
>>>[4, 2, 1]
I'm using Counter to generate a frequency dict from the words in a text file in one line of code:
import re
from collections import Counter

def _fileIndex(fh):
    ''' create a dict using Counter of a
    flat list of words (re.findall(re.compile(r"[a-zA-Z]+"), lines)) in (lines in file->for lines in fh)
    '''
    return Counter(
        [wrd.lower() for wrdList in
            [words for words in
                [re.findall(re.compile(r'[a-zA-Z]+'), lines) for lines in fh]]
            for wrd in wrdList])
For the record, a functional answer:
>>> L = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>> import functools
>>> functools.reduce(lambda acc, e: [v+(i==e) for i, v in enumerate(acc,1)] if e<=len(acc) else acc+[0 for _ in range(e-len(acc)-1)]+[1], L, [])
[4, 4, 2, 1, 2]
It's cleaner if you count zeroes too:
>>> functools.reduce(lambda acc, e: [v+(i==e) for i, v in enumerate(acc)] if e<len(acc) else acc+[0 for _ in range(e-len(acc))]+[1], L, [])
[0, 4, 4, 2, 1, 2]
An explanation:
we start with an empty acc list;
if the next element e of L is lower than the size of acc, we just update this element: v+(i==e) means v+1 if the index i of acc is the current element e, otherwise the previous value v;
if the next element e of L is greater than or equal to the size of acc, we have to expand acc to host the new 1.
Unlike with itertools.groupby, the elements do not have to be sorted. You'll get weird results if you have negative numbers.
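The second reduce (the variant that counts zeroes too) can be unrolled into an explicit loop, which may be easier to follow. This is a sketch of the same idea, not the answer's code; positional_counts is a hypothetical name and non-negative integers are assumed:

```python
def positional_counts(values):
    """Return a list where index i holds how often i occurs in values."""
    acc = []
    for e in values:
        if e < len(acc):
            acc[e] += 1  # slot already exists: bump its count
        else:
            # pad with zeroes up to index e, then record the first hit
            acc.extend([0] * (e - len(acc)))
            acc.append(1)
    return acc

print(positional_counts([1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 5, 5]))
# → [0, 4, 4, 2, 1, 2]
```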
Another way to do this, albeit with a heavier but powerful library, is NLTK.
import nltk
fdist = nltk.FreqDist(a)
fdist.values()
fdist.most_common()
Found another way of doing this, using sets.
#ar is the list of elements
#convert ar to set to get unique elements
sock_set = set(ar)
#create dictionary of frequency of socks
sock_dict = {}
for sock in sock_set:
    sock_dict[sock] = ar.count(sock)
For an unordered list you should use:
[a.count(el) for el in set(a)]
The output is
[4, 4, 2, 1, 2]
Yet another solution with another algorithm without using collections:
def countFreq(A):
    if not A:
        return []
    # Size the table by the largest value, not by len(A), so every value is a valid index
    count=[0]*(max(A)+1) # assumes non-negative integers
    for value in A:
        count[value]+= 1 # increase occurrence for this value
    return [x for x in count if x] # return non-zero count
num=[3,2,3,5,5,3,7,6,4,6,7,2]
print ('\nelements are:\t',num)
count_dict={}
for elements in num:
    count_dict[elements]=num.count(elements)
print ('\nfrequency:\t',count_dict)
You can use the built-in list method count:
l.count(l[i])
d=[]
for i in range(len(l)):
    if l[i] not in d:
        d.append(l[i])
        print(l.count(l[i]))
The above code collects the distinct elements of l in d (removing the duplicates) and prints how often each of them occurs in the original list.
Two birds with one stone! XD
This approach can be tried if you don't want to use any library and keep it simple and short!
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
marked = []
b = [(a.count(i), marked.append(i))[0] for i in a if i not in marked]
print(b)
Output:
[4, 4, 2, 1, 2]