gzip_files=["complete-credit-ctrl-txn-SE06_2013-07-17-00.log.gz","complete-credit-ctrl-txn-SE06_2013-07-17-01.log.gz"]
def input_func():
    num = input("Enter the number of MIN series digits: ")
    return num
for i in gzip_files:
    import gzip
    f=gzip.open(i,'rb')
    file_content=f.read()
    digit = input_func()
    file_content = file_content.split('[')
    series = [] #list of MIN
    for line in file_content:
        MIN = line.split('|')[13:15]
        for x in MIN:
            n = digit
            x = x[:n]
            series.append(x)
            break
    #count the number of occurrences in the list named series
    for i in series:
        print i
    #end count
Result:
63928
63928
63929
63929
63928
63928
That is only part of the result; the actual result is a very long list. Now I want to list only the unique numbers and show how many times each one appears in the list.
So
63928 = 4,
63929 = 2
I would use a collections.Counter class here.
>>> a = [1, 1, 1, 2, 3, 4, 4, 5]
>>> from collections import Counter
>>> Counter(a)
Counter({1: 3, 4: 2, 2: 1, 3: 1, 5: 1})
Just pass your series variable to Counter and you'll get a dictionary where the keys are the unique elements and the values are their occurrences in the list.
collections.Counter was introduced in Python 2.7. Use the following list comprehension for versions below 2.7:
>>> [(elem, a.count(elem)) for elem in set(a)]
[(1, 3), (2, 1), (3, 1), (4, 2), (5, 1)]
You can then just convert this into a dictionary for easy access.
>>> dict((elem, a.count(elem)) for elem in set(a))
{1: 3, 2: 1, 3: 1, 4: 2, 5: 1}
You can use a Counter() for this.
So this will print what you need:
from collections import Counter
c = Counter(series)
for item,count in c.items():
    print "%s = %s" % (item,count)
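As a quick check of the Counter approach, here is a minimal Python 3 sketch using the six sample values from the question. The hard-coded `series` list is an assumption standing in for the list the real script builds from the log files.

```python
from collections import Counter

# Sample prefixes copied from the question's output (assumed stand-in for
# the real `series` list built from the log files).
series = ["63928", "63928", "63929", "63929", "63928", "63928"]

counts = Counter(series)

# most_common() yields (item, count) pairs, highest count first.
lines = ["%s = %s" % (item, count) for item, count in counts.most_common()]
print(", ".join(lines))  # 63928 = 4, 63929 = 2
```

Using most_common() instead of items() is only a choice to get the most frequent prefix first; items() works just as well if order doesn't matter.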
Compile a dictionary using unique numbers as keys, and their total occurrences as values:
d = {} #instantiate dictionary
for s in series:
    # set default key and value if key does not exist in dictionary
    d.setdefault(s, 0)
    # increment by 1 for every occurrence of s
    d[s] += 1
If this problem were any more complex, an implementation of MapReduce (a.k.a. map and fold) might be appropriate.
Map Reduce:
https://en.wikipedia.org/wiki/MapReduce
Python map function:
http://docs.python.org/2/library/functions.html#map
Python reduce function:
http://docs.python.org/2/library/functions.html#reduce
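For what it's worth, a map/reduce version of the same count could look roughly like the sketch below. It is written for Python 3, where reduce lives in functools rather than being a built-in as in the Python 2 docs linked above. The `series` sample and the `add_pair` helper are made up for illustration.

```python
from functools import reduce

# Assumed sample data standing in for the real `series` list.
series = ["63928", "63928", "63929", "63929", "63928", "63928"]

def add_pair(acc, pair):
    """Fold one (key, 1) pair into the running totals (hypothetical helper)."""
    key, one = pair
    acc[key] = acc.get(key, 0) + one
    return acc

# Map step: emit a (key, 1) pair for every element.
pairs = map(lambda s: (s, 1), series)
# Reduce step: sum the ones per key, starting from an empty dict.
totals = reduce(add_pair, pairs, {})
print(totals)  # {'63928': 4, '63929': 2}
```

For a list that fits in memory this is more ceremony than the plain loop; it mostly pays off when the map and reduce steps are distributed.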
Related
I have the following list:
lst= (1,(1,2), 3, (3,4), 1, 3)
and I want to use a dictionary to generate output that counts the number of times each value occurs, so that it looks like this:
{1:3, 2:1, 3:3, 4:1}
I am lost on how to do this.
Thank you!
Below is my attempt:
def f(*args):
    for x in args:
        d = {x:args.count(x) for x in args}
    return d
For arbitrary depth tuples, you could use a recursive function for flattening:
def flatten_nested_tuples(tuples):
    for tup in tuples:
        if isinstance(tup, tuple):
            yield from flatten_nested_tuples(tup)
        else:
            yield tup
The yield from x syntax is equivalent to for item in x: yield item. It's just a shorter way to write generators. You can have a look at this answer and this answer for more information about generators and the yield keyword.
To count we can use collections.Counter to count the flattened tuples:
from collections import Counter
lst= (1,(1,2), 3, (3,4), 1, 3)
print(Counter(flatten_nested_tuples(lst)))
Output:
Counter({1: 3, 3: 3, 2: 1, 4: 1})
Note: Counter is a subclass of dict, so you can treat it like a regular dict.
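To illustrate the note about Counter behaving like a dict, here is a small sketch. The input list is simply the flattened form of lst written out by hand.

```python
from collections import Counter

# The values of lst after flattening, written out by hand.
counts = Counter([1, 1, 2, 3, 3, 4, 1, 3])

print(counts[1])      # 3
print(counts[99])     # 0: a missing key counts as zero instead of raising KeyError
print(dict(counts))   # {1: 3, 2: 1, 3: 3, 4: 1}

# Regular dict methods work too.
pairs = sorted(counts.items())
print(pairs)          # [(1, 3), (2, 1), (3, 3), (4, 1)]
```

The one difference worth remembering is the zero default for missing keys, which a plain dict does not have.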
If you want to count yourself without any modules, you have to do the 0 initializing yourself:
counts = {}
for item in flatten_nested_tuples(lst):
    counts[item] = counts.get(item, 0) + 1
print(counts)
# {1: 3, 2: 1, 3: 3, 4: 1}
Or without using dict.get():
counts = {}
for item in flatten_nested_tuples(lst):
    if item not in counts:
        counts[item] = 0
    counts[item] += 1
print(counts)
# {1: 3, 2: 1, 3: 3, 4: 1}
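One caveat with the recursive generator: very deeply nested tuples could hit Python's recursion limit. If that ever becomes a concern, an explicit stack of iterators is one possible alternative. The sketch below is only an illustration and the name flatten_iterative is made up.

```python
def flatten_iterative(nested):
    """Yield the leaves of arbitrarily nested tuples, left to right."""
    stack = [iter(nested)]
    while stack:
        for item in stack[-1]:
            if isinstance(item, tuple):
                # Descend into the nested tuple; resume the parent later.
                stack.append(iter(item))
                break
            yield item
        else:
            # The innermost iterator is exhausted, so go back up one level.
            stack.pop()

lst = (1, (1, 2), 3, (3, 4), 1, 3)
flat = list(flatten_iterative(lst))
print(flat)  # [1, 1, 2, 3, 3, 4, 1, 3]
```

It can be passed to Counter or to the counting loops in exactly the same way as the recursive version.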
I realize that title may be confusing, so allow me to explain.
I take input from a list that looks like L = [21.123, 22.123, 23.123, 21.123]
I remove the decimals, and sort the list high to low. I also change it to a dictionary with occurrences, which looks like
newlist = {23: 1, 22: 1, 21: 2}
What I need to do is make a list of keys and a list of values, which I can do. This gives me two lists, [23, 22, 21] and [1, 1, 2], one for values and one for occurrences. I need to turn my occurrence list into the number of occurrences that are the same as, or lower than, its corresponding key.
I would like my list to look like [23, 22, 21] (which is easy to do) and [4, 3, 2] because 4 of the times are 23 seconds or less, 3 of the times are 22 seconds or less, and 2 of the times are 21 seconds or less.
I'm pretty sure I need a for loop to iterate through every frequency value, and change that value to be the total number of times entered into the list, and subtract any value more than it. I'm not sure how to go about this, so any help would be greatly appreciated.
You want a dictionary where, for each item in your data, the key is the truncated value (int(item)) and the value is the number of items that are smaller than or equal to this truncated value.
A dictionary comprehension (combined with a list comprehension) can do this:
data = [21.123, 22.123, 23.123, 21.123]
aggregate = {
    item: len([n for n in data if int(n) <= item])
    for item in set(map(int, data))
}
print(aggregate) # -> {21: 2, 22: 3, 23: 4}
which is the single-statement form of writing such a loop:
aggregate = {}
for item in set(map(int, data)):
    aggregate[item] = len([n for n in data if int(n) <= item])
Using set() makes the list unique. This way the loop only runs as often as necessary.
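The comprehension rescans the whole list for every unique key, which is fine for small inputs. If the list gets long, one possible alternative (a sketch, not part of the answer above) is to sort the truncated values once and let bisect do the counting:

```python
from bisect import bisect_right

data = [21.123, 22.123, 23.123, 21.123]

# Truncate and sort once: [21, 21, 22, 23]
truncated = sorted(int(n) for n in data)

# bisect_right returns how many sorted entries are <= the key.
aggregate = {key: bisect_right(truncated, key) for key in set(truncated)}
print(aggregate)
```

This gives the same mapping of 21 to 2, 22 to 3 and 23 to 4.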
Here's a functional solution. The marginally tricky part is the backwards cumulative sum, which is possible by feeding a reversed tuple to itertools.accumulate and then reversing the result.
from collections import Counter
from itertools import accumulate
from operator import itemgetter
L = [21.123, 22.123, 23.123, 21.123]
c = Counter(map(int, L)) # Counter({21: 2, 22: 1, 23: 1})
counter = sorted(c.items(), reverse=True) # [(23, 1), (22, 1), (21, 2)]
keys, counts = zip(*counter) # ((23, 22, 21), (1, 1, 2))
cumsum = list(accumulate(counts[::-1]))[::-1] # [4, 3, 2]
Your desired result is stored in keys and cumsum:
print(keys)
(23, 22, 21)
print(cumsum)
[4, 3, 2]
Assuming you get the counts correctly from [21.123, 22.123, 23.123, 21.123], a simple nested loop with a running sum can do the rest:
from collections import Counter
newlist = {23: 1, 22: 1, 21: 2}
counts = Counter()
for k in newlist:
    for v in newlist:
        if v <= k:
            counts[k] += newlist[v]
print(counts)
# Counter({23: 4, 22: 3, 21: 2})
You could also use itertools.product() to condense the double loops into one:
from itertools import product
from collections import Counter
newlist = {23: 1, 22: 1, 21: 2}
counts = Counter()
for k, v in product(newlist, repeat=2):
    if v <= k:
        counts[k] += newlist[v]
print(counts)
# Counter({23: 4, 22: 3, 21: 2})
The above stores the counts in a collections.Counter(); you can get [4, 3, 2] by calling list(counts.values()).
I found my own solution which seems relatively simple. Code looks like
counter = 0
print(valuelist)
for i in valuelist:
    print(int(solves - counter))
    counter = counter + i
    redonevalues.append(solves - counter + 1)
It takes my values, goes to the first one, adds the occurrences to counter, subtracts counter from solves, and adds 1 to even it out.
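As far as I can tell, the +1 adjustment in the loop above only evens things out when the current occurrence count is 1; with the occurrence list [1, 1, 2] from the question it would produce [4, 3, 1] rather than [4, 3, 2]. A possible variant appends before adding to the counter, so no adjustment is needed. This is only a sketch, and it assumes that solves is the total number of entries and that valuelist is ordered from the highest key to the lowest.

```python
valuelist = [1, 1, 2]      # occurrences for keys [23, 22, 21] (assumed order)
solves = sum(valuelist)    # assumed meaning: total number of entries, here 4

redonevalues = []
counter = 0                # entries seen so far that belong to larger keys
for occurrences in valuelist:
    # Everything not already counted under a larger key is <= this key.
    redonevalues.append(solves - counter)
    counter += occurrences

print(redonevalues)  # [4, 3, 2]
```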
Suppose I have a list a = [-1,-1,-1,1,1,1,2,2,2,-1,-1,-1,1,1,1] in Python. Is there a built-in function that takes a list and returns which elements are present at which index ranges? For example:
>>> index_range(a)
{-1 :'0-2,9-11', 1:'3-5,12-14', 2:'6-8'}
I have tried the Counter class from the collections module, but it only outputs the count of each element.
If there is no such built-in function, can you please guide me on how to achieve this in my own function? I don't need the whole code, just a guideline.
You can create your custom function using itertools.groupby and collections.defaultdict to get the range of numbers in the form of list as:
from itertools import groupby
from collections import defaultdict
def index_range(my_list):
    my_dict = defaultdict(list)
    for i, j in groupby(enumerate(my_list), key=lambda x: x[1]):
        index_range, numlist = list(zip(*j))
        my_dict[numlist[0]].append((index_range[0], index_range[-1]))
    return my_dict
Sample Run:
>>> index_range([-1,-1,-1,1,1,1,2,2,2,-1,-1,-1,1,1,1])
{1: [(3, 5), (12, 14)], 2: [(6, 8)], -1: [(0, 2), (9, 11)]}
In order to get the values as string in your dict, you may either modify the above function, or use the return value of the function in dictionary comprehension as:
>>> result_dict = index_range([-1,-1,-1,1,1,1,2,2,2,-1,-1,-1,1,1,1])
>>> {k: ','.join('{}:{}'.format(*i) for i in v)for k, v in result_dict.items()}
{1: '3:5,12:14', 2: '6:8', -1: '0:2,9:11'}
You can use a dict that uses list items as keys and their indexes as values:
>>> lst = [-1,-1,-1,1,1,1,2,2,2,-1,-1,-1,1,1,1]
>>> indexes = {}
>>> for index, item in enumerate(lst):
...     indexes.setdefault(item, []).append(index)
...
>>> indexes
{1: [3, 4, 5, 12, 13, 14], 2: [6, 7, 8], -1: [0, 1, 2, 9, 10, 11]}
You could then merge the index lists into ranges if that's what you need. I can help you with that too if necessary.
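For the merging step, one possible approach is the groupby trick where consecutive indices share the same difference between value and position. The sketch below assumes the index lists are sorted, which they are when built with enumerate; the helper name to_ranges is made up.

```python
from itertools import groupby

def to_ranges(indices):
    """Collapse sorted indices into a 'start-end,start-end' string."""
    parts = []
    # Within a run of consecutive indices, (index - position) stays constant.
    for _, run in groupby(enumerate(indices), key=lambda p: p[1] - p[0]):
        run = [index for _, index in run]
        parts.append("%d-%d" % (run[0], run[-1]))
    return ",".join(parts)

indexes = {-1: [0, 1, 2, 9, 10, 11], 1: [3, 4, 5, 12, 13, 14], 2: [6, 7, 8]}
result = {item: to_ranges(idx) for item, idx in indexes.items()}
print(result)  # {-1: '0-2,9-11', 1: '3-5,12-14', 2: '6-8'}
```

A run of length one comes out as something like '4-4'; that is easy to special-case if a bare '4' is preferred.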
I thought I had set myself a simple project, but I guess not. I think I'm using the OrderedDict function wrong, because I keep getting:
ValueError: too many values to unpack (expected 2)
Code:
import random
import _collections
shop = {
    'bread': 2,
    'chips': 4,
    'tacos': 5,
    'tuna': 4,
    'bacon': 8,
}
print(shop)
'''
items = list(shop.keys())
random.shuffle(items)
_collections.OrderedDict(items)
'''
n = random.randrange(0, len(shop.keys()))
m = random.randrange(n, len(shop.keys()))
if m <= n:
    m += 1
print(n, " ", m)
for key in shop.keys():
    value = shop[key] * random.uniform(0.7,2.3)
    print(key, "=", int(value))
    if n < m:
        n += 1
    else:
        break
I would like this code to shuffle the dictionary, then multiply the values by a random factor between 0.7 and 2.3, and then loop 0-5 times to give me a few random keys from the dictionary.
I have wrapped the code that I struggle with, and that gives me the error, in ''' '''.
You are very close, but you cannot just give the list of keys to the new OrderedDict; you must give the values too. Try this:
import random
import collections
shop = {
    'bread': 2,
    'chips': 4,
    'tacos': 5,
    'tuna': 4,
    'bacon': 8,
}
print(shop)
items = list(shop.keys())
random.shuffle(items)
print(items)
ordered_shop = collections.OrderedDict()
for item in items:
    ordered_shop[item] = shop[item]
print(ordered_shop)
Example output:
{'chips': 4, 'tuna': 4, 'bread': 2, 'bacon': 8, 'tacos': 5}
['bacon', 'chips', 'bread', 'tuna', 'tacos']
OrderedDict([('bacon', 8), ('chips', 4), ('bread', 2), ('tuna', 4), ('tacos', 5)])
You could also do it like this (as pointed out by @ShadowRanger):
items = list(shop.items())
random.shuffle(items)
oshop = collections.OrderedDict(items)
This works because the OrderedDict constructor takes a list of key-value tuples. On reflection, this is probably what you were after with your initial approach - swap keys() for items().
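To make the cause of the ValueError a bit more concrete, here is a small sketch with a shortened, made-up version of the shop data. Each element passed to the constructor gets unpacked as a (key, value) pair, so a plain string like 'bread' is unpacked character by character and has too many values.

```python
import collections

keys_only = ['bread', 'chips']
try:
    collections.OrderedDict(keys_only)
    error = None
except ValueError as exc:
    # 'bread' is treated as a pair and unpacked into 5 characters, not 2 values.
    error = exc

print(type(error).__name__)  # ValueError

# A list of (key, value) tuples unpacks cleanly.
pairs = [('bread', 2), ('chips', 4)]
ordered = collections.OrderedDict(pairs)
print(list(ordered.items()))  # [('bread', 2), ('chips', 4)]
```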
d = collections.OrderedDict.fromkeys(items)
And then use newly created dict d as you wish.
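Note that fromkeys only sets the keys, so every value in d starts out as None. A minimal sketch of filling the values back in from shop, assuming items is the shuffled list of keys as in the question:

```python
import collections
import random

shop = {'bread': 2, 'chips': 4, 'tacos': 5, 'tuna': 4, 'bacon': 8}

items = list(shop.keys())
random.shuffle(items)

# fromkeys keeps the shuffled key order but sets every value to None.
d = collections.OrderedDict.fromkeys(items)
all_none = all(value is None for value in d.values())

# Copy the real values across, key by key.
for key in d:
    d[key] = shop[key]

print(list(d.items()))
```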
Suppose I have the list
[7,7,7,7,3,1,5,5,1,4]
I would like to remove duplicates and count them while preserving the order of the list. To remove duplicates while preserving order, I use the function
def unique(seq, idfun=None):
    # order preserving
    if idfun is None:
        def idfun(x): return x
    seen = {}
    result = []
    for item in seq:
        marker = idfun(item)
        if marker in seen: continue
        seen[marker] = 1
        result.append(item)
    return result
which gives me the output
[7,3,1,5,1,4]
but the desired output (which may contain repeated values) is:
[7,3,3,1,5,2,4]
7 is written because it's the first item in the list; then the following item is checked to see whether it differs from the previous one. If it does, count the occurrences of the same item until a new one is found, then repeat the procedure. Could anyone more skilled than me give me a hint on how to get the desired output listed above? Thank you in advance.
Perhaps something like this?
>>> from itertools import groupby
>>> seen = set()
>>> out = []
>>> for k, g in groupby(lst):
...     if k not in seen:
...         length = sum(1 for _ in g)
...         if length > 1:
...             out.extend([k, length])
...         else:
...             out.append(k)
...         seen.add(k)
...
>>> out
[7, 4, 3, 1, 5, 2, 4]
Update:
As per your comment I guess you wanted something like this:
>>> out = []
>>> for k, g in groupby(lst):
...     length = sum(1 for _ in g)
...     if length > 1:
...         out.extend([k, length])
...     else:
...         out.append(k)
...
>>> out
[7, 4, 3, 1, 5, 2, 1, 4]
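If the flat list turns out to be awkward to work with, the same groupby call can give (value, run length) pairs instead, which keeps values and counts from being confused with each other. A minimal sketch:

```python
from itertools import groupby

lst = [7, 7, 7, 7, 3, 1, 5, 5, 1, 4]

# groupby only groups adjacent equal items, so 1 appears twice here.
runs = [(k, sum(1 for _ in g)) for k, g in groupby(lst)]
print(runs)  # [(7, 4), (3, 1), (1, 1), (5, 2), (1, 1), (4, 1)]
```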
Try this
import collections as c
lst = [7,7,7,7,3,1,5,5,1,4]
result = c.OrderedDict()
for el in lst:
    if el not in result:
        result[el] = 1
    else:
        result[el] = result[el] + 1
print(result)
prints out: OrderedDict([(7, 4), (3, 1), (1, 2), (5, 2), (4, 1)])
It gives a dictionary though. For a list, use:
lstresult = []
for el in result:
    # print k, v
    lstresult.append(el)
    if result[el] > 1:
        lstresult.append(result[el] - 1)
It doesn't match your desired output, but your desired output also seems to mangle what it is trying to represent.
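On Python 3.7 or later, where dicts keep insertion order, collections.Counter could replace the manual OrderedDict loop. This is a sketch of that alternative rather than what the answer above does:

```python
from collections import Counter

lst = [7, 7, 7, 7, 3, 1, 5, 5, 1, 4]

# Counter is a dict subclass, so keys stay in first-seen order (Python 3.7+).
counts = Counter(lst)
pairs = list(counts.items())
print(pairs)  # [(7, 4), (3, 1), (1, 2), (5, 2), (4, 1)]
```

Unlike the groupby version, this counts every occurrence of a value across the whole list, not just adjacent runs.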