Get incremental count of list for all the elements - python

I have a list with 24 million elements. For each element, I want the number of times it has occurred so far (a running count), stored in another list, and I need this to be fast. For example, my list is:
a=['bike','bike','jeep','horse','horse','horse','flight','flight','cycle']
My expected output is
[1, 2, 1, 1, 2, 3, 1, 2, 1]
The code I used is
z=[]
for i in a:
    z.append(a.count(i))
But my output is a bit different:
[2, 2, 1, 3, 3, 3, 2, 2, 1]
The order of the new list also matters and should follow the order of my list (a). Any help is really appreciated.

Based on your expected output, since you need the count of elements till that index of the list at which you are iterating at that point of time, the below code should work:
from collections import defaultdict
a=['bike','bike','jeep','horse','horse','horse','flight','flight','cycle']
a_dict = defaultdict(int)
a_output = []
for x in a:
    a_dict[x] += 1
    a_output.append(a_dict[x])
print(a_output)
Output:
[1, 2, 1, 1, 2, 3, 1, 2, 1]
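The original loop returned totals because a.count(i) scans the whole list on every iteration, which is also slow for millions of elements. As a minimal sketch, the running-count idea can be wrapped in a reusable function (the name running_counts is hypothetical and not part of the answer):

```python
from collections import defaultdict


def running_counts(items):
    """Return, for each position, how often that item has appeared so far."""
    seen = defaultdict(int)  # item -> occurrences up to the current position
    result = []
    for item in items:
        seen[item] += 1
        result.append(seen[item])
    return result


a = ['bike', 'bike', 'jeep', 'horse', 'horse', 'horse', 'flight', 'flight', 'cycle']
print(running_counts(a))  # [1, 2, 1, 1, 2, 3, 1, 2, 1]
```

This visits each element once, so the work grows linearly with the length of the list.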

Here is one solution -
a=['bike','bike','jeep','horse','horse','horse','flight','flight','cycle']
countArr = []
temp = {}
for i in a:
    if i in temp:
        temp[i]+=1
        countArr.append(temp.get(i))
    else:
        temp[i] = 1
        countArr.append(temp.get(i))

You could use a dictionary and a for loop to accomplish this:
counts = {}
a = ['bike','bike','jeep','horse','horse','horse','flight','flight','cycle']
z = []
for i in a:
    if i in counts:
        counts[i] += 1
    else:
        counts[i] = 1
    z.append(counts[i])
print(z)
# [1, 2, 1, 1, 2, 3, 1, 2, 1]
You can also do this fun hacky thing with a list comprehension, which exploits the evaluation order of tuples and does essentially the same as the above for loop but condensed into one line:
counts = {}
z = [(counts.__setitem__(i, counts[i] + 1 if i in counts else 1), counts[i])[1] for i in a]
print(z)
# [1, 2, 1, 1, 2, 3, 1, 2, 1]
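If a one-line comprehension is the goal, a possibly more readable sketch keeps one itertools.count iterator per distinct item instead of calling __setitem__ directly. The name counters is an assumption made for this illustration:

```python
from collections import defaultdict
from itertools import count

a = ['bike', 'bike', 'jeep', 'horse', 'horse', 'horse', 'flight', 'flight', 'cycle']

# Each distinct item gets its own independent 1, 2, 3, ... iterator on first access
counters = defaultdict(lambda: count(1))
z = [next(counters[item]) for item in a]
print(z)  # [1, 2, 1, 1, 2, 3, 1, 2, 1]
```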

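You can count within the sub-list that precedes each index:
a=['bike','bike','jeep','horse','horse','horse','flight','flight','cycle']
z=[]
i = 0
while i < len(a):
    #print(a[0:i])
    #print(a[i])
    z.append(a[0:i].count(a[i]) + 1)
    i+= 1
print(z)
Each iteration copies and scans the first i elements, so the total work grows quadratically with the list length.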

Related

Add a sequence of n numbers, getting n from another list, in Python

I have the following two lists:
numlist = [1, 1, 1, 1, 4, 1, 1, 4, 1, 2]
lenwords = [2,3,5]
I want to look at the number at each index in lenwords, like so:
for number in range(len(lenwords)):
    print(lenwords[number])
And then take that number of items in numlist suggested by each index in lenwords (2,3,5) and add them together, like so:
add 1+1 then 1+1+4 then 1+1+4+1+2
I'm thinking that I could use itertools, but not sure how to do so.
I make an iterable out of numlist, then iterate over lenwords, using itertools.islice to pull out the count you want from the numlist generator.
https://docs.python.org/3/library/itertools.html#itertools.islice
import itertools
def sumlengths(numlist, lenwords):
    numbers = iter(numlist)
    for length in lenwords:
        yield sum(itertools.islice(numbers, length))
numlist = [1, 1, 1, 1, 4, 1, 1, 4, 1, 2]
lenwords = [2,3,5]
print (*sumlengths(numlist, lenwords))
2 6 9
I did not validate the length of the inputs.
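If validation is wanted, one possible sketch checks that each chunk is complete before summing it. The function name sum_lengths_checked and the choice of ValueError are assumptions for illustration, not part of the answer above:

```python
from itertools import islice


def sum_lengths_checked(numlist, lenwords):
    """Yield the sum of consecutive chunks of numlist, sized by lenwords."""
    numbers = iter(numlist)
    for length in lenwords:
        chunk = list(islice(numbers, length))
        if len(chunk) < length:
            # numlist ran out before this chunk could be filled
            raise ValueError(
                f"needed {length} items but only {len(chunk)} were left"
            )
        yield sum(chunk)


numlist = [1, 1, 1, 1, 4, 1, 1, 4, 1, 2]
lenwords = [2, 3, 5]
print(list(sum_lengths_checked(numlist, lenwords)))  # [2, 6, 9]
```

Without the check, islice silently returns a shorter chunk when the input runs out, so the last sum would be computed from fewer items than requested.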
Another approach:
numlist = [1, 1, 1, 1, 4, 1, 1, 4, 1, 2]
lenwords = [2,3,5]
gen = iter(numlist)
result = []
for n in lenwords:
    total = 0
    for _ in range(n):
        total += next(gen)
    result.append(total)
The resulting list result is [2, 6, 9], as desired.
Without using itertools:
numlist = [1, 1, 1, 1, 4, 1, 1, 4, 1, 2]
lenwords = [2, 3, 5]
counter = 0
for number in lenwords:
    q = numlist[counter:counter+number]
    print(sum(q))
    counter += number
Output:
2
6
9

Repeat values in array until specific length [duplicate]

This question already has answers here:
How to replicate array to specific length array
(4 answers)
Closed 2 years ago.
I need some kind of function or little tip for my problem.
So I got a list let's say
[1,2,3,4]
but I need this array to be longer with the same elements repeated so let's say I need an array of length 10 so it becomes:
[1,2,3,4,1,2,3,4,1,2]
So I need to extend the list with the same values as in the list in the same order
returnString = the array or string to return with extended elements
array = the basic array which needs to be extended
length = desired length
EDIT:
returnString = ""
array = list(array)
index = 0
while len(str(array)) != length:
    if index <= length:
        returnString += array[index]
        index += 1
    else:
        toPut = index % length
        returnString.append(array[toPut])
        index += 1
return returnString
This is simple with itertools.cycle and itertools.islice:
from itertools import cycle, islice
input = [1, 2, 3, 4]
output = list(islice(cycle(input), 10))
[1, 2, 3, 4, 1, 2, 3, 4, 1, 2]
You can use itertools.cycle to iterate repeatedly over the list, and take as many values as you want.
from itertools import cycle
lst = [1, 2, 3, 4]
myiter = cycle(lst)
print([next(myiter) for _ in range(10)])
[1, 2, 3, 4, 1, 2, 3, 4, 1, 2]
You can also use it to extend the list (it doesn't matter if you append to the end while you are iterating over it, although removing items would not work).
from itertools import cycle
lst = [1, 2, 3, 4]
myiter = cycle(lst)
for _ in range(6):
    lst.append(next(myiter))
print(lst)
[1, 2, 3, 4, 1, 2, 3, 4, 1, 2]
One way could be:
Iterate length - len(x) times; here that is 10 - 4 = 6 new elements to add. Since the elements should repeat in order, append x[i] for the indices i = 0, 1, 2, 3, 4, 5. By the time i reaches 4, x[4] is already the first appended element, so the pattern continues.
x = [1,2,3,4]
length = 10
for i in range(length - len(x)):
    x.append(x[i])
print(x)
OUTPUT:
[1, 2, 3, 4, 1, 2, 3, 4, 1, 2]
Try this:
n = 10
lst =[1,2,3,4]
new_lst = [lst[i%len(lst)] for i in range(n)]
print(new_lst)
Output:
[1, 2, 3, 4, 1, 2, 3, 4, 1, 2]
Here is a tip:
Given the list [1,2,3,4], create a separate new list and fill it with the repeated values.
How? With a loop: keep an index into the original list, append the element at that index to the new list, and reset the index to 0 whenever it runs past the end.
NumOfValues = int(input("Number of Values: "))
List1 = [1,2,3,4]
List2 = []
Count = 0
while len(List2) < NumOfValues:
    List2.append(List1[Count])
    Count += 1
    if Count > len(List1) - 1:
        Count = 0
print(List2)
First multiply the list by the number of whole times it fits into the desired length. If that does not reach the desired length, extend it with the appropriate slice of the list.
original = [1, 2, 3, 4]
old_len = len(original)
new_len = 10
# Parentheses are required: original * new_len // old_len would try to floor-divide a list
result = original * (new_len // old_len)
if new_len % old_len != 0:
    result += original[:new_len % old_len]
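The multiply-then-slice idea can also be sketched as a small function using divmod, which returns the number of whole copies and the leftover length in one call. The name repeat_to_length and the empty-list check are assumptions added for this illustration:

```python
def repeat_to_length(values, length):
    """Repeat values in order until the result has exactly `length` items."""
    if not values:
        raise ValueError("cannot repeat an empty list")
    full_copies, remainder = divmod(length, len(values))
    # Whole copies first, then the leading slice that fills the rest
    return values * full_copies + values[:remainder]


print(repeat_to_length([1, 2, 3, 4], 10))  # [1, 2, 3, 4, 1, 2, 3, 4, 1, 2]
print(repeat_to_length([1, 2, 3, 4], 3))   # [1, 2, 3]
```

Unlike the in-place append loops above, this builds a new list and leaves the input unchanged.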

How to create a list that represent the number of times a given item was shown?

This problem seems really stupid, but I can't get my head around it.
I have the following list:
a = [2, 1, 3, 1, 1, 2, 3, 2, 3]
I have to produce a second list of the same size as the first, where each value is the number of times the corresponding item has appeared in the list up to that point. For example:
b = [1, 1, 1, 2, 3, 2, 2, 3, 3]
So b[0] = 1 because it is the first time the item 2 appears in a. b[5] = 2 and b[7] = 3 because those are the second and third times the item 2 appears in a.
Here is a solution:
from collections import defaultdict
a = [2, 1, 3, 1, 1, 2, 3, 2, 3]
b = []
d = defaultdict(int)
for x in a:
    d[x] +=1
    b.append(d[x])
print(b)
Output:
[1, 1, 1, 2, 3, 2, 2, 3, 3]
A dictionary helps here: iterate over the list and store the current frequency of each number as you go.
a = [2, 1, 3, 1, 1, 2, 3, 2, 3]
d = {}
z = []
for i in a:
    if i not in d:
        d[i] = 1
        z.append(1)
    else:
        d[i]+=1
        z.append(d[i])
print(z)
Output:
[1, 1, 1, 2, 3, 2, 2, 3, 3]

count list by order

I want to count a given list like:
list = [1 , 1, 2, 2, 4, 2, 3, 3]
and the result will be:
2122141223
The code should count, in order, how many times each number appears in a row. In the example above there is a 1 followed by another 1, which gives 2 (the number of occurrences) followed by 1 (the number itself).
list = [1, 1, 2, 1, 4, 6]
i = 0
n = len(list)
c = 1
list2 =[]
while i in range(0, n) and c in range (1 , n):
    if list[i] == list[i+1]:
        listc= i+c
        listx = str(listc)
        list2.insert(i, i+c)
        i += 1
        c += 1
    else:
        f = i + 1
        i += 1
        c += 1
That's what I've done and I don't know how to continue.
What I'm trying to write is a loop that checks whether consecutive numbers are identical; if they are, it continues to the next number until it reaches a different number.
You can use the Python groupby function as follows:
from itertools import groupby
my_list = [1, 1, 2, 2, 4, 2, 3, 3]
print(''.join('{}{}'.format(len(list(g)), k) for k, g in groupby(my_list)))
Giving you the following output:
2122141223
The k gives you the key (e.g. 1, 2, 4, 2, 3), and the g gives an iterator. By converting this to a list, its length can be determined.
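To make the key and group roles concrete, here is a small sketch that first collects the (key, run length) pairs and then builds the string from them. The names runs and encoded are introduced only for this illustration:

```python
from itertools import groupby

my_list = [1, 1, 2, 2, 4, 2, 3, 3]

# groupby starts a new group whenever the value changes, so the second run of 2 is separate
runs = [(key, len(list(group))) for key, group in groupby(my_list)]
print(runs)  # [(1, 2), (2, 2), (4, 1), (2, 1), (3, 2)]

encoded = ''.join(f'{length}{key}' for key, length in runs)
print(encoded)  # 2122141223
```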
Or without using the groupby function, you could do the following:
my_list = [1, 1, 2, 2, 4, 2, 3, 3]
current = my_list[0]
count = 1
output = []
for value in my_list[1:]:
    if value == current:
        count += 1
    else:
        output.append('{}{}'.format(count, current))
        current = value
        count = 1
output.append('{}{}'.format(count, current))
print(''.join(output))

How to count the frequency of the elements in an unordered list? [duplicate]

This question already has answers here:
Using a dictionary to count the items in a list
(8 answers)
Closed 7 months ago.
Given an unordered list of values like
a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
How can I get the frequency of each value that appears in the list, like so?
# `a` has 4 instances of `1`, 4 of `2`, 2 of `3`, 1 of `4,` 2 of `5`
b = [4, 4, 2, 1, 2] # expected output
In Python 2.7 (or newer), you can use collections.Counter:
>>> import collections
>>> a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
>>> counter = collections.Counter(a)
>>> counter
Counter({1: 4, 2: 4, 5: 2, 3: 2, 4: 1})
>>> counter.values()
dict_values([2, 4, 4, 1, 2])
>>> counter.keys()
dict_keys([5, 1, 2, 4, 3])
>>> counter.most_common(3)
[(1, 4), (2, 4), (5, 2)]
>>> dict(counter)
{5: 2, 1: 4, 2: 4, 4: 1, 3: 2}
>>> # Get the counts in order matching the original specification,
>>> # by iterating over keys in sorted order
>>> [counter[x] for x in sorted(counter.keys())]
[4, 4, 2, 1, 2]
If you are using Python 2.6 or older, you can download an implementation here.
If the list is sorted, you can use groupby from the itertools standard library (if it isn't, you can just sort it first, although this takes O(n lg n) time):
from itertools import groupby
a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
[len(list(group)) for key, group in groupby(sorted(a))]
Output:
[4, 4, 2, 1, 2]
Python 2.7+ introduces Dictionary Comprehension. Building the dictionary from the list will get you the count as well as get rid of duplicates.
>>> a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>> d = {x:a.count(x) for x in a}
>>> d
{1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
>>> a, b = d.keys(), d.values()
>>> a
[1, 2, 3, 4, 5]
>>> b
[4, 4, 2, 1, 2]
Count the number of appearances manually by iterating through the list and counting them up, using a collections.defaultdict to track what has been seen so far:
from collections import defaultdict
appearances = defaultdict(int)
for curr in a:
    appearances[curr] += 1
In Python 2.7+, you could use collections.Counter to count items
>>> a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>>
>>> from collections import Counter
>>> c=Counter(a)
>>>
>>> c.values()
[4, 4, 2, 1, 2]
>>>
>>> c.keys()
[1, 2, 3, 4, 5]
Counting the frequency of elements is probably best done with a dictionary:
b = {}
for item in a:
    b[item] = b.get(item, 0) + 1
To remove the duplicates, use a set:
a = list(set(a))
You can do this:
import numpy as np
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
np.unique(a, return_counts=True)
Output:
(array([1, 2, 3, 4, 5]), array([4, 4, 2, 1, 2], dtype=int64))
The first array is values, and the second array is the number of elements with these values.
So If you want to get just array with the numbers you should use this:
np.unique(a, return_counts=True)[1]
Here's another succinct alternative using itertools.groupby which, because the input is sorted first, also works for unordered input:
from itertools import groupby
items = [5, 1, 1, 2, 2, 1, 1, 2, 2, 3, 4, 3, 5]
results = {value: len(list(freq)) for value, freq in groupby(sorted(items))}
results
Format: {value: number_of_occurrences}
{1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
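I would simply use scipy.stats.itemfreq in the following manner (note that itemfreq was deprecated in SciPy 1.0 and removed in later releases; np.unique(a, return_counts=True) is the suggested replacement):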
from scipy.stats import itemfreq
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
freq = itemfreq(a)
a = freq[:,0]
b = freq[:,1]
you may check the documentation here: http://docs.scipy.org/doc/scipy-0.16.0/reference/generated/scipy.stats.itemfreq.html
from collections import Counter
import numpy as np
import pandas as pd
a=["E","D","C","G","B","A","B","F","D","D","C","A","G","A","C","B","F","C","B"]
counter=Counter(a)
kk=[list(counter.keys()),list(counter.values())]
pd.DataFrame(np.array(kk).T, columns=['Letter','Count'])
seta = set(a)
b = [a.count(el) for el in seta]
a = list(seta) #Only if you really want it.
Suppose we have a list:
fruits = ['banana', 'banana', 'apple', 'banana']
We can find out how many of each fruit we have in the list like so:
import numpy as np
(unique, counts) = np.unique(fruits, return_counts=True)
{x:y for x,y in zip(unique, counts)}
Result:
{'banana': 3, 'apple': 1}
This answer is more explicit
a = [1,1,1,1,2,2,2,2,3,3,3,4,4]
d = {}
for item in a:
    if item in d:
        d[item] = d.get(item)+1
    else:
        d[item] = 1
for k,v in d.items():
    print(str(k)+':'+str(v))
# output
#1:4
#2:4
#3:3
#4:2
#remove dups
d = set(a)
print(d)
#{1, 2, 3, 4}
For your first question, iterate over the list and use a dictionary to keep track of how often each element occurs.
For your second question, just use the set operator.
def frequencyDistribution(data):
    return {i: data.count(i) for i in data}
print(frequencyDistribution([1,2,3,4]))
...
{1: 1, 2: 1, 3: 1, 4: 1} # originalNumber: count
I am quite late, but this will also work, and will help others:
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
freq_list = []
a_l = list(set(a))
for x in a_l:
    freq_list.append(a.count(x))
print('Freq', freq_list)
print('number', a_l)
This will produce:
Freq [4, 4, 2, 1, 2]
number [1, 2, 3, 4, 5]
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
counts = dict.fromkeys(a, 0)
for el in a: counts[el] += 1
print(counts)
# {1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
# 1. Get counts and store in another list
output = []
for i in set(a):
    output.append(a.count(i))
print(output)
# 2. Remove duplicates using set constructor
a = list(set(a))
print(a)
A set does not allow duplicates, so passing a list to the set() constructor gives an iterable of unique objects. The list's count() method returns how many times a given object occurs in the list. Each unique object is counted this way, and its count is appended to the initially empty list output.
The list() constructor then converts set(a) back into a list, which is assigned to the same variable a.
Output
D:\MLrec\venv\Scripts\python.exe D:/MLrec/listgroup.py
[4, 4, 2, 1, 2]
[1, 2, 3, 4, 5]
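Calling a.count() once per unique value rescans the list each time, and the iteration order of a set is not guaranteed. As a hedged alternative sketch, collections.Counter counts everything in one pass, and sorting the keys makes the pairing between values and counts explicit. The names unique_values and output are chosen only for this example:

```python
from collections import Counter

a = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 5, 5]

counts = Counter(a)             # one pass over the list
unique_values = sorted(counts)  # distinct values in a predictable order
output = [counts[value] for value in unique_values]

print(output)         # [4, 4, 2, 1, 2]
print(unique_values)  # [1, 2, 3, 4, 5]
```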
Simple solution using a dictionary.
def frequency(l):
    d = {}
    for i in l:
        if i in d.keys():
            d[i] += 1
        else:
            d[i] = 1
    # Return the first key with the highest count, along with all distinct keys
    for k, v in d.items():
        if v == max(d.values()):
            return k, d.keys()
print(frequency([10,10,10,10,20,20,20,20,40,40,50,50,30]))
#!/usr/bin/python
def frq(words):
    freq = {}
    for w in words:
        if w in freq:
            freq[w] = freq.get(w)+1
        else:
            freq[w] =1
    return freq
fp = open("poem","r")
text = fp.read()
fp.close()
words = text.split()
print(words)
d = frq(words)
print("frequency of input\n: ")
print(d)
fp1 = open("output.txt","w+")
for k,v in d.items():
    fp1.write(str(k)+':'+str(v)+"\n")
fp1.close()
from collections import OrderedDict
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
def get_count(lists):
    dictionary = OrderedDict()
    for val in lists:
        dictionary.setdefault(val,[]).append(1)
    return [sum(val) for val in dictionary.values()]
print(get_count(a))
>>>[4, 4, 2, 1, 2]
To remove duplicates and Maintain order:
list(dict.fromkeys(get_count(a)))
>>>[4, 2, 1]
I'm using Counter to generate a frequency dict from the words of a text file in one expression:
import re
from collections import Counter
def _fileIndex(fh):
    ''' create a dict using Counter of a
    flat list of words (re.findall(re.compile(r"[a-zA-Z]+"), lines)) in (lines in file->for lines in fh)
    '''
    return Counter(
        [wrd.lower() for wrdList in
            [words for words in
                [re.findall(re.compile(r'[a-zA-Z]+'), lines) for lines in fh]]
            for wrd in wrdList])
For the record, a functional answer:
>>> L = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>> import functools
>>> functools.reduce(lambda acc, e: [v+(i==e) for i, v in enumerate(acc,1)] if e<=len(acc) else acc+[0 for _ in range(e-len(acc)-1)]+[1], L, [])
[4, 4, 2, 1, 2]
It's cleaner if you count zeroes too:
>>> functools.reduce(lambda acc, e: [v+(i==e) for i, v in enumerate(acc)] if e<len(acc) else acc+[0 for _ in range(e-len(acc))]+[1], L, [])
[0, 4, 4, 2, 1, 2]
An explanation:
we start with an empty acc list;
if the next element e of L is lower than the size of acc, we just update this element: v+(i==e) means v+1 if the index i of acc is the current element e, otherwise the previous value v;
if the next element e of L is greater than or equal to the size of acc, we have to expand acc to host the new 1.
Unlike with itertools.groupby, the elements do not have to be sorted. You'll get weird results if you have negative numbers.
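The same steps can be sketched as an explicit loop, which may be easier to follow than the lambda. This mirrors the zero-counting variant, where index i of the result holds the count of the value i. The function name counts_by_value and the explicit rejection of negative numbers are assumptions added for this illustration:

```python
def counts_by_value(values):
    """Return a list where index i holds how many times i occurs in values."""
    acc = []
    for e in values:
        if e < 0:
            # Negative indexes would silently update the wrong slot
            raise ValueError("only non-negative integers are supported")
        if e >= len(acc):
            # Grow acc with zeros so that index e exists
            acc.extend([0] * (e - len(acc) + 1))
        acc[e] += 1
    return acc


L = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 5, 5]
print(counts_by_value(L))      # [0, 4, 4, 2, 1, 2]
print(counts_by_value(L)[1:])  # [4, 4, 2, 1, 2]
```

The loop updates acc in place rather than rebuilding it for every element, so it avoids the repeated list copies of the reduce version.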
Another approach of doing this, albeit by using a heavier but powerful library - NLTK.
import nltk
fdist = nltk.FreqDist(a)
fdist.values()
fdist.most_common()
Found another way of doing this, using sets.
#ar is the list of elements
#convert ar to set to get unique elements
sock_set = set(ar)
#create dictionary of frequency of socks
sock_dict = {}
for sock in sock_set:
    sock_dict[sock] = ar.count(sock)
For an unordered list you should use:
[a.count(el) for el in set(a)]
The output is
[4, 4, 2, 1, 2]
Yet another solution with another algorithm without using collections:
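def countFreq(A):
    # Assumes non-negative integers; the table is sized by the largest value, not by len(A)
    if not A:
        return []
    count=[0]*(max(A)+1) # Create a new list initialized with '0'
    for value in A:
        count[value]+= 1 # increase occurrence for this value
    return [x for x in count if x] # return non-zero count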
num=[3,2,3,5,5,3,7,6,4,6,7,2]
print ('\nelements are:\t',num)
count_dict={}
for elements in num:
    count_dict[elements]=num.count(elements)
print ('\nfrequency:\t',count_dict)
You can use the built-in list method count:
l.count(l[i])
d=[]
for i in range(len(l)):
    if l[i] not in d:
        d.append(l[i])
        print(l.count(l[i]))
The code above collects the distinct elements of l in d (the list without duplicates) and prints the frequency of each distinct element in the original list.
Two birds with one stone! XD
This approach can be tried if you don't want to use any library and keep it simple and short!
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
marked = []
b = [(a.count(i), marked.append(i))[0] for i in a if i not in marked]
print(b)
Output:
[4, 4, 2, 1, 2]
