Group consecutive integers together - python

I have the following code:
import sys
ints = [1,2,3,4,5,6,8,9,10,11,14,34,14,35,16,18,39,10,29,30,14,26,64,27,48,65]
ints.sort()
ints = list(set(ints))
c = {}
for i,v in enumerate(ints):
    if i+1 >= len(ints):
        continue
    if ints[i+1] == v + 1 or ints[i-1] == v - 1:
        if len(c) == 0:
            c[v] = [v]
            c[v].append(ints[i+1])
        else:
            added=False
            for x,e in c.items():
                last = e[-1]
                if v in e:
                    added=True
                    break
                if v - last == 1:
                    c[x].append(v)
                    added=True
            if added==False:
                c[v] = [v]
    else:
        if v not in c:
            c[v] = [v]
print('input ', ints)
print('output ', c)
The objective:
Given a list of integers, create a dictionary that contains consecutive integers grouped together to reduce the overall length of the list.
Here is output from my current solution:
input [1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 14, 16, 18, 26, 27, 29, 30, 34, 35, 39, 48, 64, 65]
output {1: [1, 2, 3, 4, 5, 6], 8: [8, 9, 10, 11], 14: [14], 16: [16], 18: [18], 26: [26, 27], 29: [29, 30], 34: [34, 35], 39: [39], 48: [48], 64: [64]}
Conditions/constraints:
If the current integer is either a) in an existing list or b) is the last item in an existing list, we don't want to create another list for this item.
i.e. in the range 1-5 inclusive, when we get to 3, don't create a list 3,4, instead append 3 to the existing list [1,2]
My current iteration works, but it slows down sharply as the list grows (roughly quadratically), because the for x,e in c.items() loop checks every existing list for each integer.
How can I make this faster while still achieving the same result?
New solution (from 13 seconds to 0.03 seconds using an input list of 19,000 integers):
c = {}
i = 0
last_list = None
while i < len(ints):
    cur = ints[i]
    if last_list is None:
        c[cur] = [cur]
        last_list = c[cur]
    else:
        if last_list[-1] == cur-1:
            last_list.append(cur)
        else:
            c[cur] = [cur]
            last_list = c[cur]
    i += 1

Since your lists hold consecutive numbers, I suggest you use range objects instead of lists:
d, head = {}, None
for x in l:  # l is the sorted, deduplicated input list
    if head is None or x != d[head].stop:
        head = x
    d[head] = range(head, x+1)
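A self-contained sketch of the range idea above, assuming the input is deduplicated and sorted first. The function name `group_as_ranges` is made up for illustration. A range stores only its start and stop, so each group takes constant memory however long the run is.

```python
def group_as_ranges(values):
    """Map the first value of each consecutive run to a range covering the run."""
    groups, head = {}, None
    for x in sorted(set(values)):
        # range.stop is one past the last value, so x == stop extends the run
        if head is None or x != groups[head].stop:
            head = x
        groups[head] = range(head, x + 1)
    return groups


groups = group_as_ranges([1, 2, 3, 8, 9, 14])
print(groups)           # {1: range(1, 4), 8: range(8, 10), 14: range(14, 15)}
print(list(groups[8]))  # [8, 9]
print(2 in groups[1])   # True, checked without materialising a list
```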

The solution is simple if you use a for loop and just keep track of your current list. Don't forget to make a new list when you find a gap:
result = {}
cl = None
for i in ints:
    if cl is None or i - 1 != cl[-1]:
        cl = result.setdefault(i, [])
    cl.append(i)
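The same single-pass idea, wrapped in a function so it can be tried on the question's data. This is a sketch that assumes duplicates should be dropped and the input sorted before grouping; `group_consecutive` is a hypothetical name.

```python
def group_consecutive(values):
    """Group sorted, unique integers into runs keyed by the first value of each run."""
    result = {}
    current = None
    for value in sorted(set(values)):
        # Start a new list whenever the previous value is not value - 1
        if current is None or value - 1 != current[-1]:
            current = result.setdefault(value, [])
        current.append(value)
    return result


ints = [1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 14, 34, 14, 35, 16, 18, 39,
        10, 29, 30, 14, 26, 64, 27, 48, 65]
groups = group_consecutive(ints)
print(groups[1])   # [1, 2, 3, 4, 5, 6]
print(groups[64])  # [64, 65]
```

Each integer is visited once after the sort, so the loop itself is linear in the length of the list.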

There is a great library called more_itertools which provides a function called consecutive_groups():
import more_itertools as mit
x = [1,2,3,4,5,6,8,9,10,11,14,34,14,35,16,18,39,10,29,30,14,26,64,27,48,65]
x = [list(j) for j in mit.consecutive_groups(sorted(list(set(x))))]
# [[1, 2, 3, 4, 5, 6], [8, 9, 10, 11], [14], [16], [18], [26, 27], [29, 30], [34, 35], [39], [48], [64, 65]]
dct_x = {i[0]: i for i in x}
print(dct_x)
Output:
{1: [1, 2, 3, 4, 5, 6], 8: [8, 9, 10, 11], 14: [14], 16: [16], 18: [18], 26: [26, 27], 29: [29, 30], 34: [34, 35], 39: [39], 48: [48], 64: [64, 65]}
One more comment, you want to sort after converting to and from a set, since sets are unordered.
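If installing a third-party package is not an option, a similar grouping can be sketched with the standard library alone. This uses the well-known itertools.groupby trick of grouping by value minus index; `consecutive_runs` is a name chosen here for illustration, not part of any library.

```python
from itertools import groupby


def consecutive_runs(values):
    """Return runs of consecutive integers as a list of lists."""
    ordered = sorted(set(values))
    runs = []
    # Inside a run of consecutive integers, value - index stays constant,
    # so groupby splits the sequence exactly where a gap occurs.
    for _, group in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
        runs.append([value for _, value in group])
    return runs


runs = consecutive_runs([1, 2, 3, 8, 9, 14, 3])
print(runs)                            # [[1, 2, 3], [8, 9], [14]]
print({run[0]: run for run in runs})   # {1: [1, 2, 3], 8: [8, 9], 14: [14]}
```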

One can solve this task in O(n) (linear) complexity. Just keep it simple:
integers = [1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 14, 16, 18, 26, 27, 29, 30, 34, 35, 39, 48, 64, 65]
helper = []
counter = 0
while counter < len(integers):
    if not helper or helper[-1] + 1 != integers[counter]:
        print('gap found', integers[counter]) # do your logic
    helper.append(integers[counter])
    counter += 1
The algorithm above assumes that the input list is already sorted, which is what makes a single pass sufficient. If it is not, sort it explicitly before running the algorithm. The total complexity is then O(n * log n) + O(n), which is effectively O(n * log n), the cost of the sort itself.
Sorting the input before tackling a problem is a useful trick that is worth remembering.

Here's a simple implementation that achieves what you are after, using list slicing:
integers = [1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 14, 16, 18, 26, 27, 29, 30, 34, 35, 39, 48, 64, 65]
for i, integer in enumerate(integers):
    if i == 0:
        out_dict = {}
        start = 0
    else:
        if integer != prev_integer + 1:
            out_dict[integers[start]] = integers[start:i]
            start = i
    if i == len(integers) - 1:
        out_dict[integers[start]] = integers[start:]
    prev_integer = integer
>>>out_dict = {1: [1, 2, 3, 4, 5, 6], 8: [8, 9, 10, 11], 14: [14], 16: [16], 18: [18], 26: [26, 27], 29: [29, 30], 34: [34, 35], 39: [39], 48: [48], 64: [64, 65]}
Note: before Python 3.7, dicts do not guarantee any ordering, so the keys may not come out in ascending order. From Python 3.7 on, dicts preserve insertion order, so the keys appear in ascending order here because the input is sorted.

You can try itertools, but I would like to try recursion:
input_dta=[1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 14, 16, 18, 26, 27, 29, 30, 34, 35, 39, 48, 64, 65]
final_=[]
def consecutives(data):
    sub_final=[]
    if not data:
        return 0
    else:
        for i,j in enumerate(data):
            try:
                if abs(data[i]-data[i+1])==1:
                    sub_final.extend([data[i],data[i+1]])
                else:
                    if sub_final:
                        final_.append(set(sub_final))
                        return consecutives(data[i+1:])
            except IndexError:
                pass
        final_.append(set(sub_final))
consecutives(input_dta)
print(final_)
output:
[{1, 2, 3, 4, 5, 6}, {8, 9, 10, 11}, {26, 27}, {29, 30}, {34, 35}, {64, 65}]

Related

Add new element in the next sublist depending in if it has been added or not (involves also a dictionary problem) python

Community of Stackoverflow:
I'm trying to create a list of sublists with a loop based on a random sampling of values of another list; and each sublist has the restriction of not having a duplicate or a value that has already been added to a prior sublist.
Let's say (example) I have a main list:
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
#I get:
[[1,13],[4,1],[8,13]]
#I WANT:
[[1,13],[4,9],[8,14]] #(no duplicates when checking previous sublists)
The real code that I thought would work is the following (as a draft):
matrixvals=list(matrix.index.values) #list where values are obtained
lists=[[]for e in range(0,3)] #list of sublists that I want to feed
vls=[] #stores the values that have been added to prevent adding them again
for e in lists: #initiate main loop
    for i in range(0,5): #each sublist will contain 5 different random samples
        x=random.sample(matrixvals,1) #it doesn't matter if the samples are 1 or 2
        if any(x) not in vls: #if the sample isn't in the evaluation list
            vls.extend(x)
            e.append(x)
        else: #if it IS, then do a sample but without those already added values (line below)
            x=random.sample([matrixvals[:].remove(x) for x in vls],1)
            vls.extend(x)
            e.append(x)
print(lists)
print(vls)
It didn't work as I get the following:
[[[25], [16], [15], [31], [17]], [[4], [2], [13], [42], [13]], [[11], [7], [13], [17], [25]]]
[25, 16, 15, 31, 17, 4, 2, 13, 42, 13, 11, 7, 13, 17, 25]
As you can see, number 13 is repeated 3 times, and I don't understand why
I would want:
[[[25], [16], [15], [31], [17]], [[4], [2], [13], [42], [70]], [[11], [7], [100], [18], [27]]]
[25, 16, 15, 31, 17, 4, 2, 13, 42, 70, 11, 7, 100, 18, 27] #no dups
In addition, is there a way to get the random.sample results as plain values instead of lists? (to obtain):
[[25, 16, 15, 31, 17], [4, 2, 13, 42, 70], [11, 7, 100, 18, 27]]
Also, the final result is not really a list of sublists but a dictionary (the code above is a draft attempt to solve the dict problem). Is there a way to apply the same method to a dict? With my present code I get the following results:
{'1stkey': {'1stsubkey': {'list1': [41,
40,
22,
28,
26,
14,
41,
15,
40,
33],
'list2': [41, 40, 22, 28, 26, 14, 41, 15, 40, 33],
'list3': [41, 40, 22, 28, 26, 14, 41, 15, 40, 33]},
'2ndsubkey': {'list1': [21,
7,
31,
12,
8,
22,
27,...}
Instead of that result, I would want the following:
{'1stkey': {'1stsubkey': {'list1': [41,40,22],
'list2': [28, 26, 14],
'list3': [41, 15, 40, 33]},
'2ndsubkey': {'list1': [21,7,31],
'list2':[12,8,22],
'list3':[27...,...}#and so on
Is there a way to solve both the list and the dict problem? Any help will be very much appreciated; I could make some progress even with only the list problem solved.
Thanks to all
I realize you may be more interested in finding out why your particular approach isn't working. However, if I've understood your desired behavior, I may be able to offer an alternative solution. After posting my answer, I will take a look at your attempt.
random.sample lets you sample k number of items from a population (collection, list, whatever.) If there are no repeated elements in the collection, then you're guaranteed to have no repeats in your random sample:
from random import sample
pool = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
num_samples = 4
print(sample(pool, k=num_samples))
Possible output:
[9, 11, 8, 7]
>>>
It doesn't matter how many times you run this snippet, you will never have repeated elements in your random sample. This is because random.sample doesn't generate random objects, it just randomly picks items which already exist in a collection. This is the same approach you would take when drawing random cards from a deck of cards, or drawing lottery numbers, for example.
In your case, pool is the pool of possible unique numbers to choose your sample from. Your desired output seems to be a list of three lists, where each sublist has two samples in it. Rather than calling random.sample three times, once for each sublist, we should call it once with k=num_sublists * num_samples_per_sublist:
from random import sample
pool = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
num_sublists = 3
samples_per_sublist = 2
num_samples = num_sublists * samples_per_sublist
assert num_samples <= len(pool)
print(sample(pool, k=num_samples))
Possible output:
[14, 10, 1, 8, 6, 3]
>>>
OK, so we have six samples rather than four. No sublists yet. Now you can simply chop this list of six samples up into three sublists of two samples each:
from random import sample
pool = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
num_sublists = 3
samples_per_sublist = 2
num_samples = num_sublists * samples_per_sublist
assert num_samples <= len(pool)
def pairwise(iterable):
    yield from zip(*[iter(iterable)]*samples_per_sublist)
print(list(pairwise(sample(pool, num_samples))))
Possible output:
[(4, 11), (12, 13), (8, 15)]
>>>
Or if you really want sublists, rather than tuples:
def pairwise(iterable):
    yield from map(list, zip(*[iter(iterable)]*samples_per_sublist))
EDIT - just realized that you don't actually want a list of lists, but a dictionary. Something more like this? Sorry I'm obsessed with generators, and this isn't really easy to read:
keys = ["1stkey"]
subkeys = ["1stsubkey", "2ndsubkey"]
num_lists_per_subkey = 3
num_samples_per_list = 5
num_samples = num_lists_per_subkey * num_samples_per_list
min_sample = 1
max_sample = 50
pool = list(range(min_sample, max_sample + 1))
def generate_items():
    def generate_sub_items():
        from random import sample
        samples = sample(pool, k=num_samples)
        def generate_sub_sub_items():
            def chunkwise(iterable, n=num_samples_per_list):
                yield from map(list, zip(*[iter(iterable)]*n))
            for list_num, chunk in enumerate(chunkwise(samples), start=1):
                key = f"list{list_num}"
                yield key, chunk
        for subkey in subkeys:
            yield subkey, dict(generate_sub_sub_items())
    for key in keys:
        yield key, dict(generate_sub_items())
print(dict(generate_items()))
Possible output:
{'1stkey': {'1stsubkey': {'list1': [43, 20, 4, 27, 2], 'list2': [49, 44, 18, 8, 37], 'list3': [19, 40, 9, 17, 6]}, '2ndsubkey': {'list1': [43, 20, 4, 27, 2], 'list2': [49, 44, 18, 8, 37], 'list3': [19, 40, 9, 17, 6]}}}
>>>
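Note that in the output above both subkeys end up with identical lists, because one sample is reused for every subkey. If no value should repeat anywhere in the dictionary, one possible approach is to draw a single sample big enough for the whole structure and hand values out from it. This is only a sketch: `build_nested` and its parameters are hypothetical names, and it assumes the pool itself contains no duplicates.

```python
from random import sample


def build_nested(keys, subkeys, lists_per_subkey, samples_per_list, pool):
    """Fill a key -> subkey -> listN structure with values that never repeat."""
    total = len(keys) * len(subkeys) * lists_per_subkey * samples_per_list
    if total > len(pool):
        raise ValueError("pool is too small to give every slot a unique value")
    # One draw without replacement covers every slot in the structure
    draws = iter(sample(pool, k=total))
    return {
        key: {
            subkey: {
                f"list{n}": [next(draws) for _ in range(samples_per_list)]
                for n in range(1, lists_per_subkey + 1)
            }
            for subkey in subkeys
        }
        for key in keys
    }


nested = build_nested(["1stkey"], ["1stsubkey", "2ndsubkey"], 3, 5, list(range(1, 51)))
print(nested)
```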

Separate array into three new arrays using inequalities in Python

I am trying to split an array into three new arrays using inequalities.
This will give you an idea of what I am trying to achieve:
measurement = [1, 5, 10, 13, 40, 43, 60]
for x in measurement:
    if 0 < x < 6:
        small = measurement
    elif 6 < x < 15:
        medium = measurement
    else:
        large = measurement
Intended Output:
small = [1, 5]
medium = [10, 13]
large = [40, 43, 60]
If your array is sorted, you can do:
measurement = [1, 5, 10, 13, 40, 43, 60]
one_third = len(measurement) // 3
two_thirds = (2 * len(measurement)) // 3
small = measurement[:one_third]
medium = measurement[one_third : two_thirds]
large = measurement[two_thirds:]
You could easily generalize this to any number of splits with a loop. I'm not sure whether you wanted those explicit inequalities or just to split the array in three. If it's the former, my answer is not right.
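The generalization to any number of splits could look like the sketch below. It assumes, like the answer above, that the list is already sorted and that roughly equal-sized parts are wanted; `split_evenly` is a name made up for this example.

```python
def split_evenly(items, parts):
    """Split a list into `parts` contiguous slices of near-equal length."""
    n = len(items)
    # Boundary i sits at i/parts of the way through the list
    bounds = [(i * n) // parts for i in range(parts + 1)]
    return [items[bounds[i]:bounds[i + 1]] for i in range(parts)]


measurement = [1, 5, 10, 13, 40, 43, 60]
print(split_evenly(measurement, 3))  # [[1, 5], [10, 13], [40, 43, 60]]
```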
You can use numpy:
import numpy as np
arr = np.array(measurement)
small = arr[(arr>0)&(arr<6)] # array([1, 5])
medium = arr[(arr>6)&(arr<15)] # array([10, 13])
large = arr[(arr>15)] # array([40, 43, 60])
You can also use a dictionary:
d = {'small':[], 'medium':[], 'large':[]}
for x in measurement:
    if 0 < x < 6:
        d['small'].append(x)
    elif 6 < x < 15:
        d['medium'].append(x)
    else:
        d['large'].append(x)
Output:
{'small': [1, 5], 'medium': [10, 13], 'large': [40, 43, 60]}
With the bisect module you can do something along these lines:
from bisect import bisect
breaks=[0,6,15,float('inf')]
buckets={}
m = [1, 5, 10, 13, 40, 43, 60]
for e in m:
    buckets.setdefault(breaks[bisect(breaks, e)], []).append(e)
You then have a dict of lists matching what you are looking for:
>>> buckets
{6: [1, 5], 15: [10, 13], inf: [40, 43, 60]}
You can also store the break points as (low, high) tuples and the buckets as (name, list) tuples, which then convert to a dict:
m = [1, 5, 10, 13, 40, 43, 60]
buckets=[('small',[]), ('medium',[]), ('large',[]), ('other',[])]
breaks=[(0,6),(6,15),(15,float('inf'))]
for x in m:
    buckets[
        next((i for i,t in enumerate(breaks) if t[0]<=x<t[1]), -1)
    ][1].append(x)
>>> dict(buckets)
{'small': [1, 5], 'medium': [10, 13], 'large': [40, 43, 60], 'other': []}
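The bisect lookup can also return readable labels directly. In this sketch `bucketize`, `upper_bounds` and `labels` are hypothetical names, and a value exactly equal to a bound goes to the higher bucket. That last choice is an assumption, since the question's strict inequalities do not say where 6 and 15 belong.

```python
from bisect import bisect_right


def bucketize(values, upper_bounds, labels):
    """Sort values into labelled buckets; labels has one more entry than upper_bounds."""
    buckets = {label: [] for label in labels}
    for value in values:
        # bisect_right counts how many bounds are <= value, which is the bucket index
        buckets[labels[bisect_right(upper_bounds, value)]].append(value)
    return buckets


measurement = [1, 5, 10, 13, 40, 43, 60]
print(bucketize(measurement, [6, 15], ["small", "medium", "large"]))
# {'small': [1, 5], 'medium': [10, 13], 'large': [40, 43, 60]}
```

Because the lookup is a binary search, it stays cheap even with many break points.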

Count total number of occurrences of given list of integers in another

How do I count the number of times the same integer occurs?
My code so far:
def searchAlgorithm (target, array):
    i = 0 #iterating through elements of target list
    q = 0 #iterating through lists sublists via indexes
    while q < 4:
        x = 0 #counting number of matches
        for i in target:
            if i in array[q]:
                x += 1
            else:
                x == 0
        print(x)
        q += 1
a = [8, 12, 14, 26, 27, 28]
b = [[4, 12, 17, 26, 30, 45], [8, 12, 19, 24, 33, 47], [3, 10, 14, 31, 39, 41], [4, 12, 14, 26, 30, 45]]
searchAlgorithm(a, b)
The output of this is:
2
2
1
3
What I want is to count how many times each result (1, 2 or 3 matches) occurs.
I have tried:
v = 0
if searchAlgorithm(a, b) == 2:
v += 1
print(v)
But that results in 0
You can use intersection of sets to find elements that are common in both lists. Then you can get the length of the sets. Here is how it looks:
num_common_elements = (len(set(a).intersection(i)) for i in b)
You can then iterate over the generator num_common_elements to use the values. Or you can cast it to a list to see the results:
print(list(num_common_elements))
[Out]: [2, 2, 1, 3]
If you want to implement the intersection functionality yourself, you can use the built-in sum function. When x has no repeated values, this is equivalent to len(set(x).intersection(set(y))):
sum(i in y for i in x)
This works because it generates values such as [True, False, False, True, True], marking which values of the first list are present in the second list. sum then treats each True as 1 and each False as 0, which gives the size of the intersection.
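A short sketch of the sum approach on the question's data. Converting each sublist to a set first is an optional tweak added here, so that each membership test is a hash lookup rather than a scan. The last two lines show where sum and a set intersection differ: sum counts repeated values in the first list separately.

```python
a = [8, 12, 14, 26, 27, 28]
b = [[4, 12, 17, 26, 30, 45], [8, 12, 19, 24, 33, 47],
     [3, 10, 14, 31, 39, 41], [4, 12, 14, 26, 30, 45]]

# One set per sublist makes `value in row` a constant-time check
rows = [set(row) for row in b]
counts = [sum(value in row for value in a) for row in rows]
print(counts)  # [2, 2, 1, 3]

# With duplicates in the first list, the two approaches disagree
dup_sum = sum(v in {1, 2} for v in [1, 1, 2])
dup_set = len(set([1, 1, 2]).intersection({1, 2}))
print(dup_sum, dup_set)  # 3 2
```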
This is based on what I understand from your question. Probably you are looking for this:
from collections import Counter
def searchAlgorithm (target, array):
    i = 0 #iterating through elements of target list
    q = 0 #iterating through lists sublists via indexes
    lst = []
    while q < 4:
        x = 0 #counting number of matches
        for i in target:
            if i in array[q]:
                x += 1
            else:
                x == 0
        lst.append(x)
        q += 1
    print(Counter(lst))
a = [8, 12, 14, 26, 27, 28]
b = [[4, 12, 17, 26, 30, 45], [8, 12, 19, 24, 33, 47], [3, 10, 14, 31, 39, 41], [4, 12, 14, 26, 30, 45]]
searchAlgorithm(a, b)
# Counter({2: 2, 1: 1, 3: 1})
Thanks to the helpful feedback, I have since come up with a simpler solution that does exactly what I want.
By storing the match counts in a list, I can return that list from the searchAlgorithm function and simply use .count() to count how often a specific number of matches occurs.
def searchAlgorithm (target, array):
    i = 0
    q = 0
    results = []
    while q < 4:
        x = 0 #counting number of matches
        for i in target:
            if i in array[q]:
                x += 1
            else:
                x == 0
        results.append(x)
        q += 1
    return results
a = [8, 12, 14, 26, 27, 28]
b = [[4, 12, 17, 26, 30, 45], [8, 12, 19, 24, 33, 47], [3, 10, 14, 31, 39, 41], [4, 12, 14, 26, 30, 45]]
d2 = searchAlgorithm(a, b).count(2) # number of sublists with exactly 2 matches
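The earlier attempt `searchAlgorithm(a, b) == 2` produced 0 because the original function only printed its counts and returned None, so the comparison was never true. A more compact sketch of the returning version follows. It is not the poster's code: `match_counts` is a hypothetical name, and it loops over the sublists directly instead of hard-coding the number 4.

```python
def match_counts(target, rows):
    """Return, for each row, how many distinct target values it contains."""
    wanted = set(target)
    return [len(wanted.intersection(row)) for row in rows]


a = [8, 12, 14, 26, 27, 28]
b = [[4, 12, 17, 26, 30, 45], [8, 12, 19, 24, 33, 47],
     [3, 10, 14, 31, 39, 41], [4, 12, 14, 26, 30, 45]]

results = match_counts(a, b)
print(results)           # [2, 2, 1, 3]
print(results.count(2))  # 2
```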

Comparing lists with their indices and content in Python

I have a list of numbers as
N = [13, 14, 15, 25, 27, 31, 35, 36, 43]
After some calculations, for each element in N, I get the following list as the answers.
ndlist = [4, 30, 0, 42, 48, 4, 3, 42, 3]
That is, for the first index in N (which is 13), my answer is 4 in ndlist.
For some indices in N, I get the same answer in ndlist. For example, when N= 13 and 31, the answer is 4 in ndlist.
I need to find the numbers in N (13 and 31 in my example) such that they have the same answer in ndlist.
Can someone help me with that?
You can use a defaultdict and put those into a list keyed by the answer like:
Code:
N = [13, 14, 15, 25, 27, 31, 35, 36, 43]
ndlist = [4, 30, 0, 42, 48, 4, 3, 42, 3]
from collections import defaultdict
answers = defaultdict(list)
for n, answer in zip(N, ndlist):
    answers[answer].append(n)
print(answers)
print([v for v in answers.values() if len(v) > 1])
Results:
defaultdict(<class 'list'>, {4: [13, 31], 30: [14],
0: [15], 42: [25, 36], 48: [27], 3: [35, 43]})
[[13, 31], [25, 36], [35, 43]]
Here is a way using only a nested list comprehension:
[N[idx] for idx, nd in enumerate(ndlist) if nd in [i for i in ndlist if ndlist.count(i)>1]]
#[13, 25, 31, 35, 36, 43]
To explain: the inner list comprehension ([i for i in ndlist if ndlist.count(i)>1]) gets all duplicate values in ndlist, and the rest of the list comprehension extracts the corresponding values in N where those values are found in ndlist
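The nested comprehension above calls ndlist.count() repeatedly, which gets slow on long lists. A sketch of the same filter with collections.Counter, which tallies the answers once up front (the variable names `counts` and `shared` are chosen here for illustration):

```python
from collections import Counter

N = [13, 14, 15, 25, 27, 31, 35, 36, 43]
ndlist = [4, 30, 0, 42, 48, 4, 3, 42, 3]

# Tally each answer once, then keep the N values whose answer occurs more than once
counts = Counter(ndlist)
shared = [n for n, answer in zip(N, ndlist) if counts[answer] > 1]
print(shared)  # [13, 25, 31, 35, 36, 43]
```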

Python: Renumerate elements in multiple lists

Suppose I have a dictionary with lists as follows:
{0: [31, 32, 58, 59], 1: [31, 32, 12, 13, 37, 38], 2: [12, 13]}
I am trying to obtain the following one from it:
{0: [1, 2, 3, 4], 1: [1, 2, 5, 6, 7, 8], 2: [5, 6]}
So I renumber all the entries in order of occurrence, skipping those that have already been renumbered.
What I have now is a bunch of for loops going back and forth. It works, but it doesn't look good at all. Could anyone please tell me how this should be done in Python 2.7?
Thank you
import operator
data = {0: [31, 32, 58, 59], 1: [31, 32, 12, 13, 37, 38], 2: [12, 13]}
# the accumulator is the new dict with renumbered values combined with a list of renumbered numbers so far
# item is a (key, value) element out of the original dict
def reductor(acc, item):
    (out, renumbered) = acc
    (key, values) = item
    def remapper(v):
        try:
            x = renumbered.index(v)
        except ValueError:
            x = len(renumbered)
            renumbered.append(v)
        return x + 1  # 1-based numbering, as in the desired output
    # transform current values to renumbered values
    out[key] = map(remapper, values)
    # return output and updated list of renumbered values
    return (out, renumbered)
# now reduce the original data; element [0] of the result is the renumbered dict
print reduce(reductor, sorted(data.iteritems(), key=operator.itemgetter(0)), ({}, []))[0]
If you're not worried about memory or speed you can use an intermediate dictionary to map the new values:
a = {0: [31, 32, 58, 59], 1: [31, 32, 12, 13, 37, 38], 2: [12, 13]}
b = {}
c = {}
for key in sorted(a.keys()):
    c[key] = [b.setdefault(val, len(b)+1) for val in a[key]]
Just use a function like this:
def renumerate(data):
    ids = {}
    def getid(val):
        if val not in ids:
            ids[val] = len(ids) + 1
        return ids[val]
    return {k : map(getid, data[k]) for k in sorted(data.keys())}
Example
>>> data = {0: [31, 32, 58, 59], 1: [31, 32, 12, 13, 37, 38], 2: [12, 13]}
>>> print renumerate(data)
{0: [1, 2, 3, 4], 1: [1, 2, 5, 6, 7, 8], 2: [5, 6]}
data = {0: [31, 32, 58, 59], 1: [31, 32, 12, 13, 37, 38], 2: [12, 13]}
from collections import defaultdict
numbered = defaultdict(lambda: len(numbered)+1)
result = {key: [numbered[v] for v in val] for key, val in sorted(data.iteritems(), key=lambda item: item[0])}
print result
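The answers above target Python 2.7, as the question asks. For readers on Python 3, where print is a function, iteritems() no longer exists and map() returns an iterator rather than a list, the same idea might be sketched as follows. `renumber` is a name chosen here, not taken from the answers.

```python
def renumber(data):
    """Replace each value by the 1-based order in which it first appears."""
    ids = {}
    return {
        # setdefault assigns the next free id only when the value is new
        key: [ids.setdefault(value, len(ids) + 1) for value in data[key]]
        for key in sorted(data)
    }


data = {0: [31, 32, 58, 59], 1: [31, 32, 12, 13, 37, 38], 2: [12, 13]}
print(renumber(data))
# {0: [1, 2, 3, 4], 1: [1, 2, 5, 6, 7, 8], 2: [5, 6]}
```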
