"Compressing" a list of integers - python

I have a list of integers as follows:
my_list = [2,2,2,2,3,4,2,2,4,4,3]
What I want is to have this as a list of strings, indexed and 'compressed', that is, with each element indicated by its position in the list and with each successive duplicate element indicated as a range, like this:
my_new_list = ['0-3,2', '4,3', '5,4', '6-7,2', '8-9,4', '10,3']
EDIT: The expected output should indicate that list elements 0 to 3 have the number 2, element 4 the number 3, element 5 the number 4, elements 6 and 7 the number 2, elements 8 and 9 the number 4, and element 10 the number 3.
EDIT 2: The output list need not (indeed cannot) be a list of integers, but a list of strings instead.
I could find many examples of finding (and deleting) duplicated elements from lists, but nothing along the lines of what I need.
Could someone point out a relevant example or suggest an algorithm for solving this?
Thanks in advance!

Like most problems involving cascading consecutive duplicates, you can still use groupby() for this. Just group indices by the value at each index.
import itertools

values = [2,2,2,2,3,4,2,2,4,4,3]
result = []
# group the indices 0..len-1 by the value stored at each index
for key, group in itertools.groupby(range(len(values)), values.__getitem__):
    indices = list(group)
    if len(indices) > 1:
        result.append('{}-{},{}'.format(indices[0], indices[-1], key))
    else:
        result.append('{},{}'.format(indices[0], key))
print(result)
Output:
['0-3,2', '4,3', '5,4', '6-7,2', '8-9,4', '10,3']

Here is a lazy version that works on any sequence, and yields slices. Thus it's generic and memory efficient.
def compress(seq):
    missing = object()  # sentinel, so falsy values such as 0 are not skipped
    start_index = 0
    previous = missing
    n = 0
    for i, x in enumerate(seq):
        if previous is not missing and x != previous:
            yield previous, slice(start_index, i)
            start_index = i
        previous = x
        n += 1
    if previous is not missing:
        yield previous, slice(start_index, n)
Usage :
assert list(compress([2, 2, 2, 2, 3, 4, 2, 2, 4, 4, 3])) == [
(2, slice(0, 4)),
(3, slice(4, 5)),
(4, slice(5, 6)),
(2, slice(6, 8)),
(4, slice(8, 10)),
(3, slice(10, 11)),
]
Why slices? Because it's convenient (can be used as-is for indexing) and the semantics (upper bound not included) are more "standard". Changing that to tuples or string with upper bound is easy btw.
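As a rough illustration of that last remark, the (value, slice) pairs can be turned into the strings the question asks for. This is only a sketch: runs and format_runs are hypothetical names, and runs is a stand-in built on itertools.groupby (not the compress function above) so that the snippet runs on its own.

```python
from itertools import groupby


def runs(seq):
    """Yield (value, slice) for each run of equal consecutive items (stand-in helper)."""
    start = 0
    for value, group in groupby(seq):
        length = sum(1 for _ in group)
        yield value, slice(start, start + length)
        start += length


def format_runs(pairs):
    """Turn (value, slice) pairs into 'first-last,value' or 'index,value' strings."""
    out = []
    for value, s in pairs:
        last = s.stop - 1  # slices exclude the upper bound, the strings include it
        if last > s.start:
            out.append('{}-{},{}'.format(s.start, last, value))
        else:
            out.append('{},{}'.format(s.start, value))
    return out


formatted = format_runs(runs([2, 2, 2, 2, 3, 4, 2, 2, 4, 4, 3]))
print(formatted)
```

The only conversion needed is subtracting 1 from the slice's stop, since the requested strings use an inclusive upper bound.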

You could use enumerate with a generator function
def seq(l):
    it = iter(l)
    # get first element and set the start index to 0.
    start, prev = 0, next(it)
    ind = 0  # stays 0 if the list has a single element
    # use enumerate to track the rest of the indexes
    for ind, ele in enumerate(it, 1):
        # if last seen element is not the same the sequence is over
        # if start == ind - 1 the sequence had just a single element.
        if prev != ele:
            yield ("{}-{}, {}".format(start, ind - 1, prev)) \
                if start != ind - 1 else ("{}, {}".format(start, prev))
            start = ind
            prev = ele
    # the final run ends at the last index, ind
    yield ("{}-{}, {}".format(start, ind, prev)) \
        if start != ind else ("{}, {}".format(start, prev))
Output:
In [3]: my_list = [2, 2, 2, 2, 3, 4, 2, 2, 4, 4, 3]
In [4]: list(seq(my_list))
Out[4]: ['0-3, 2', '4, 3', '5, 4', '6-7, 2', '8-9, 4', '10, 3']
I was going to use groupby, but this will be faster:
In [11]: timeit list(seq(my_list))
100000 loops, best of 3: 4.38 µs per loop
In [12]: timeit itools()
100000 loops, best of 3: 9.23 µs per loop

Construct a list that pairs the number of consecutive occurrences with each item. Then iterate over that list to get the index range of each item.
from itertools import groupby
new_list = []
for k, g in groupby([2,2,2,2,3,4,2,2,4,4,3]):
    sum_each = 0
    for i in g:
        sum_each += 1
    # Construct the list with the number of consecutive occurrences of each item, like this: `[(4, 2), (1, 3), (1, 4), (2, 2), (2, 4), (1, 3)]`
    new_list.append((sum_each, k))
x = 0
for (n, item) in enumerate(new_list):
    if item[0] > 1:
        new_list[n] = str(x) + '-' + str(x+item[0]-1) + ',' + str(item[1])
    else:
        new_list[n] = str(x) + ',' + str(item[1])
    x += item[0]
print(new_list)

First off, your requested results are not valid python. I'm going to assume that the following format would work for you:
my_new_list = [ ((0,3),2), ((4,4),3), ((5,5),4), ((6,7),2), ((8,9),4), ((10,10),3) ]
Given that, you can first transform my_list into a list of ((index,index),value) tuples, then use reduce to gather that into ranges:
from functools import reduce  # reduce is not a builtin in Python 3
my_new_list = reduce(
lambda new_list,item:
new_list[:-1] + [((new_list[-1][0][0],item[0][1]),item[1])]
if len(new_list) > 0 and new_list[-1][1] == item[1]
else new_list + [item]
, [((index,index),value) for (index,value) in enumerate(my_list)]
, []
)
This does the following:
transform the list into ((index,index),value) tuples:
[((index,index),value) for (index,value) in enumerate(my_list)]
use reduce to merge adjacent items with the same value: If the list being built has at least 1 item and the last item in the list has the same value as the item being processed, reduce it to the list minus the last item, plus a new item consisting of the first index from the last list item plus the second index of the current item and the value of the current item. If the list being built is empty or the last item in the list is not the same value as the item being processed, just add the current item to the list.
Edited to use new_list instead of list as my lambda parameter; using list as a parameter or variable name is bad form

Here's a generator-based solution similar to Padraic's. However it avoids enumerate()-based index tracking and thus is probably faster for huge lists. I didn't worry about your desired output formatting, either.
def compress_list(ilist):
    """Compresses a list of integers"""
    left, right = 0, 0
    length = len(ilist)
    while right < length:
        if ilist[left] == ilist[right]:
            right += 1
            continue
        yield (ilist[left], (left, right-1))
        left = right
    # at the end of the list, yield the last item
    # (skipped for an empty list, which has no runs)
    if length:
        yield (ilist[left], (left, right-1))
It would be used like this:
my_list = [2,2,2,2,3,4,2,2,4,4,3]
my_compressed_list = [i for i in compress_list(my_list)]
my_compressed_list
Resulting in output of:
[(2, (0, 3)),
(3, (4, 4)),
(4, (5, 5)),
(2, (6, 7)),
(4, (8, 9)),
(3, (10, 10))]

Some good answers here, and I thought I would offer an alternative. We iterate through the list of numbers and keep an updating current value, associated with a list of indices for that value, current_indicies. We then look ahead one element to see whether the next number differs from current; if it does, we go ahead and add it as a 'compressed number'.
def compress_numbers(l):
    result = []
    current = None
    current_indicies = None
    for i, item in enumerate(l):
        if current != item:
            current = item
            current_indicies = [i]
        elif current == item:
            current_indicies.append(i)
        try:
            if l[i+1] != current:
                result.append(format_entry(current_indicies, current))
        except IndexError:
            # no next element: this is the last run
            result.append(format_entry(current_indicies, current))
    return result

# Helper method to format entry in the list.
def format_entry(indicies, value):
    i_range = None
    if len(indicies) > 1:
        i_range = '{}-{}'.format(indicies[0], indicies[-1])
    else:
        i_range = indicies[0]
    return '{},{}'.format(i_range, value)
Sample Output:
>>> print(compress_numbers([2, 2, 2, 2, 3, 4, 2, 2, 4, 4, 3]))
['0-3,2', '4,3', '5,4', '6-7,2', '8-9,4', '10,3']

Related

Python split list before and after specific value

I want to split a list into tuples after and before a specific value.
Example
Input:
list1 = [2, 1, 1, 2, 1, 2, 1, 1]
print(some_func(list1, 2))
Output:
>> [(2,1,1), (1,1,2,1), (1,2,1,1)]
so like I want every tuple to be sliced by the '2' but also keep other values in the tuple. How can I achieve this easily?
Any help is appreciated
def split_on(lst, val):
    try:
        # get a tuple between the start of lst and the second occurrence of val
        first_idx = lst.index(val)
        remainder = lst[first_idx + 1:]
        second_idx = remainder.index(val) + (first_idx + 1)
        # and recur with the rest of the list beyond the first occurrence
        return [tuple(lst[:second_idx])] + split_on(remainder, val)
    except ValueError:
        # base case: there's zero or one occurrences of val,
        # so we just return the whole lst as a tuple
        return [tuple(lst)]
split_on([2,1,1,2,1,2,1,1], 2)
# [(2, 1, 1), (1, 1, 2, 1), (1, 2, 1, 1)]
Note that this is not a terribly efficient solution, and for very large lists it will start to get pretty slow, since every list slice makes a copy. Something in itertools might help with a different, more efficient approach.
You could find the indexes of the 2s and then pair each index with the one that is two over to form sub ranges:
def neighbors(aList,value):
    indices = [-1] + [i for i,v in enumerate(aList) if v == value] + [len(aList)]
    return [ tuple(aList[s+1:e]) for s,e in zip(indices,indices[2:]) ]
list1 = [2, 1, 1, 2, 1, 2, 1, 1]
print(neighbors(list1,2))
[(2, 1, 1), (1, 1, 2, 1), (1, 2, 1, 1)]
Note that this will return an empty list if the value is not in aList. If you want the whole list back in that case, add a condition before the return: if len(indices)<3: return [tuple(aList)].
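For illustration, here is a sketch of that guarded variant. The name neighbors_or_whole is made up for this example; the logic is the neighbors function above plus the extra condition.

```python
def neighbors_or_whole(a_list, value):
    """Like neighbors(), but return the whole list as one tuple when value is absent."""
    # sentinel positions -1 and len(a_list) stand for "before the start" and "past the end"
    indices = [-1] + [i for i, v in enumerate(a_list) if v == value] + [len(a_list)]
    if len(indices) < 3:  # only the two sentinels: value never occurs
        return [tuple(a_list)]
    return [tuple(a_list[s + 1:e]) for s, e in zip(indices, indices[2:])]


print(neighbors_or_whole([2, 1, 1, 2, 1, 2, 1, 1], 2))
print(neighbors_or_whole([1, 1, 1], 2))
```

With exactly one occurrence of the value no guard is needed, because the two sentinels already pair up to cover the whole list.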

Continue loop after the first assignment [duplicate]

How do I access the index while iterating over a sequence with a for loop?
xs = [8, 23, 45]
for x in xs:
    print("item #{} = {}".format(index, x))
Desired output:
item #1 = 8
item #2 = 23
item #3 = 45
Use the built-in function enumerate():
for idx, x in enumerate(xs):
    print(idx, x)
It is non-pythonic to manually index via for i in range(len(xs)): x = xs[i] or manually manage an additional state variable.
Check out PEP 279 for more.
Using a for loop, how do I access the loop index, from 1 to 5 in this case?
Use enumerate to get the index with the element as you iterate:
for index, item in enumerate(items):
    print(index, item)
And note that Python's indexes start at zero, so you would get 0 to 4 with the above. If you want the count, 1 to 5, do this:
count = 0 # in case items is empty and you need it after the loop
for count, item in enumerate(items, start=1):
    print(count, item)
Unidiomatic control flow
What you are asking for is the Pythonic equivalent of the following, which is the algorithm most programmers of lower-level languages would use:
index = 0 # Python's indexing starts at zero
for item in items: # Python's for loops are a "for each" loop
    print(index, item)
    index += 1
Or in languages that do not have a for-each loop:
index = 0
while index < len(items):
    print(index, items[index])
    index += 1
or sometimes more commonly (but unidiomatically) found in Python:
for index in range(len(items)):
    print(index, items[index])
Use the Enumerate Function
Python's enumerate function reduces the visual clutter by hiding the accounting for the indexes, and encapsulating the iterable into another iterable (an enumerate object) that yields a two-item tuple of the index and the item that the original iterable would provide. That looks like this:
for index, item in enumerate(items, start=0): # default is zero
    print(index, item)
This code sample is fairly well the canonical example of the difference between code that is idiomatic of Python and code that is not. Idiomatic code is sophisticated (but not complicated) Python, written in the way that it was intended to be used. Idiomatic code is expected by the designers of the language, which means that usually this code is not just more readable, but also more efficient.
Getting a count
Even if you don't need indexes as you go, but you need a count of the iterations (sometimes desirable) you can start with 1 and the final number will be your count.
count = 0 # in case items is empty
for count, item in enumerate(items, start=1): # default is zero
    print(item)
print('there were {0} items printed'.format(count))
The count seems to be more what you intend to ask for (as opposed to index) when you said you wanted from 1 to 5.
Breaking it down - a step by step explanation
To break these examples down, say we have a list of items that we want to iterate over with an index:
items = ['a', 'b', 'c', 'd', 'e']
Now we pass this iterable to enumerate, creating an enumerate object:
enumerate_object = enumerate(items) # the enumerate object
We can pull the first item out of this iterable that we would get in a loop with the next function:
iteration = next(enumerate_object) # first iteration from enumerate
print(iteration)
And we see we get a tuple of 0, the first index, and 'a', the first item:
(0, 'a')
we can use what is referred to as "sequence unpacking" to extract the elements from this two-tuple:
index, item = iteration
# 0, 'a' = (0, 'a') # essentially this.
and when we inspect index, we find it refers to the first index, 0, and item refers to the first item, 'a'.
>>> print(index)
0
>>> print(item)
a
Conclusion
Python indexes start at zero
To get these indexes from an iterable as you iterate over it, use the enumerate function
Using enumerate in the idiomatic way (along with tuple unpacking) creates code that is more readable and maintainable:
So do this:
for index, item in enumerate(items, start=0): # Python indexes start at zero
    print(index, item)
It's pretty simple to start it from 1 rather than 0:
for index, item in enumerate(iterable, start=1):
    # print index, item  # statement form used in Python < 3.x
    print(index, item)   # function form, Python 3.x+
for i in range(len(ints)):
    print(i, ints[i]) # print updated to print() in Python 3.x+
Here's how you can access the indices and array's elements using for-in loops.
1. Looping elements with counter and += operator.
items = [8, 23, 45, 12, 78]
counter = 0
for value in items:
    print(counter, value)
    counter += 1
Result:
# 0 8
# 1 23
# 2 45
# 3 12
# 4 78
2. Looping elements using enumerate() method.
items = [8, 23, 45, 12, 78]
for i in enumerate(items):
    print("index/value", i)
Result:
# index/value (0, 8)
# index/value (1, 23)
# index/value (2, 45)
# index/value (3, 12)
# index/value (4, 78)
3. Using index and value separately.
items = [8, 23, 45, 12, 78]
for index, value in enumerate(items):
    print("index", index, "for value", value)
Result:
# index 0 for value 8
# index 1 for value 23
# index 2 for value 45
# index 3 for value 12
# index 4 for value 78
4. You can change the index number to any increment.
items = [8, 23, 45, 12, 78]
for i, value in enumerate(items, start=1000):
    print(i, value)
Result:
# 1000 8
# 1001 23
# 1002 45
# 1003 12
# 1004 78
5. Automatic counter incrementation with range(len(...)).
items = [8, 23, 45, 12, 78]
for i in range(len(items)):
    print("Index:", i, "Value:", items[i])
Result:
# Index: 0 Value: 8
# Index: 1 Value: 23
# Index: 2 Value: 45
# Index: 3 Value: 12
# Index: 4 Value: 78
6. Using for-in loop inside function.
items = [8, 23, 45, 12, 78]
def enum(items, start=0):
    counter = start
    for value in items:
        print(counter, value)
        counter += 1
enum(items)
Result:
# 0 8
# 1 23
# 2 45
# 3 12
# 4 78
7. Of course, we can't forget about while loop.
items = [8, 23, 45, 12, 78]
counter = 0
while counter < len(items):
    print(counter, items[counter])
    counter += 1
Result:
# 0 8
# 1 23
# 2 45
# 3 12
# 4 78
8. yield statement returning a generator object.
def createGenerator():
    items = [8, 23, 45, 12, 78]
    for (j, k) in enumerate(items):
        yield (j, k)
generator = createGenerator()
for i in generator:
    print(i)
Result:
# (0, 8)
# (1, 23)
# (2, 45)
# (3, 12)
# (4, 78)
9. Inline expression with for-in loop and lambda.
items = [8, 23, 45, 12, 78]
xerox = lambda upperBound: [(i, items[i]) for i in range(0, upperBound)]
print(xerox(5))
Result:
# [(0, 8), (1, 23), (2, 45), (3, 12), (4, 78)]
10. Iterate over two lists at once using Python's zip() function.
items = [8, 23, 45, 12, 78]
indices = [0, 1, 2, 3, 4]
for item, index in zip(items, indices):
    print("{}: {}".format(index, item))
Result:
# 0: 8
# 1: 23
# 2: 45
# 3: 12
# 4: 78
11. Loop over 2 lists with a while loop and iter() & next() methods.
items = [8, 23, 45, 12, 78]
indices = range(len(items))
iterator1 = iter(indices)
iterator2 = iter(items)
try:
    while True:
        i = next(iterator1)
        element = next(iterator2)
        print(i, element)
except StopIteration:
    pass
Result:
# 0 8
# 1 23
# 2 45
# 3 12
# 4 78
As is the norm in Python, there are several ways to do this. In all examples assume: lst = [1, 2, 3, 4, 5]
Using enumerate (considered most idiomatic)
for index, element in enumerate(lst):
    # Do the things that need doing here
    pass
This is also the safest option in my opinion because the chance of going into an infinite loop has been eliminated. Both the item and its index are held in variables and there is no need to write any further code to access the item.
Creating a variable to hold the index (using for)
for index in range(len(lst)): # or xrange
    # you will have to write extra code to get the element
    pass
Creating a variable to hold the index (using while)
index = 0
while index < len(lst):
    # You will have to write extra code to get the element
    index += 1 # escape the infinite loop
There is always another way
As explained before, there are other ways to do this that have not been explained here and they may even apply more in other situations. For example, using itertools.chain with for. It handles nested loops better than the other examples.
Old fashioned way:
for ix in range(len(ints)):
    print(ints[ix])
List comprehension:
[ (ix, ints[ix]) for ix in range(len(ints))]
>>> ints
[1, 2, 3, 4, 5]
>>> for ix in range(len(ints)): print(ints[ix])
...
1
2
3
4
5
>>> [ (ix, ints[ix]) for ix in range(len(ints))]
[(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
>>> lc = [ (ix, ints[ix]) for ix in range(len(ints))]
>>> for tup in lc:
...     print(tup)
...
(0, 1)
(1, 2)
(2, 3)
(3, 4)
(4, 5)
>>>
Accessing indexes & Performance Benchmarking of approaches
The fastest way to access the indexes of a list within a loop in Python 3.7 is to use the enumerate function, for small, medium and huge lists alike.
Please see the different approaches that can be used to iterate over a list and access the index value, and their performance metrics (which I suppose would be useful for you), in the code samples below:
# Using range
def range_loop(iterable):
    for i in range(len(iterable)):
        1 + iterable[i]
# Using enumerate
def enumerate_loop(iterable):
    for i, val in enumerate(iterable):
        1 + val
# Manual indexing
def manual_indexing_loop(iterable):
    index = 0
    for item in iterable:
        1 + item
        index += 1
See performance metrics for each method below:
from timeit import timeit
def measure(l, number=10000):
    print("Measure speed for list with %d items" % len(l))
    print("range: ", timeit(lambda :range_loop(l), number=number))
    print("enumerate: ", timeit(lambda :enumerate_loop(l), number=number))
    print("manual_indexing: ", timeit(lambda :manual_indexing_loop(l), number=number))
# Measure speed for list with 1000 items
measure(range(1000))
# range: 1.161622366
# enumerate: 0.5661940879999996
# manual_indexing: 0.610455682
# Measure speed for list with 10000 items
measure(range(10000))
# range: 11.794482958
# enumerate: 6.197628574000001
# manual_indexing: 6.935181098000001
# Measure speed for list with 10000000 items
measure(range(10000000), number=100)
# range: 121.416859069
# enumerate: 62.718909123
# manual_indexing: 69.59575057400002
As a result, using enumerate is the fastest way to iterate when the index is needed.
Adding some useful links below:
What is the difference between range and xrange functions in Python 2.X?
What is faster for loop using enumerate or for loop using xrange in Python?
range(len(list)) or enumerate(list)?
You can use enumerate and embed expressions inside string literals to obtain the solution.
This is a simple way:
a=[4,5,6,8]
for b, val in enumerate(a):
    print('item #{} = {}'.format(b+1, val))
First of all, the indexes will be from 0 to 4. Python, like most programming languages, starts counting from 0; don't forget that or you will come across an index-out-of-bounds exception. All you need in the for loop is a variable counting from 0 to 4 like so:
for x in range(0, 5):
Keep in mind that I wrote 0 to 5 because the loop stops one number before the maximum. :)
To get the value of an index, use
list[index]
You can do it with this code:
ints = [8, 23, 45, 12, 78]
index = 0
for value in (ints):
    index +=1
    print(index, value)
Use this code if you need to reset the index value at the end of the loop:
ints = [8, 23, 45, 12, 78]
index = 0
for value in (ints):
    index +=1
    print(index, value)
    if index >= len(ints):  # reset only after the last item has been printed
        index = 0
According to this discussion: object's list index
Loop counter iteration
The current idiom for looping over the indices makes use of the built-in range function:
for i in range(len(sequence)):
    # Work with index i
Looping over both elements and indices can be achieved either by the old idiom or by using the new zip built-in function:
for i in range(len(sequence)):
    e = sequence[i]
    # Work with index i and element e
or
for i, e in zip(range(len(sequence)), sequence):
    # Work with index i and element e
via PEP 212 – Loop Counter Iteration.
In your question, you write "how do I access the loop index, from 1 to 5 in this case?"
However, the index for a list runs from zero. So, then we need to know if what you actually want is the index and item for each item in a list, or whether you really want numbers starting from 1. Fortunately, in Python, it is easy to do either or both.
First, to clarify, the enumerate function iteratively returns the index and corresponding item for each item in a list.
alist = [1, 2, 3, 4, 5]
for n, a in enumerate(alist):
    print("%d %d" % (n, a))
The output for the above is then,
0 1
1 2
2 3
3 4
4 5
Notice that the index runs from 0. This kind of indexing is common among modern programming languages including Python and C.
If you want your loop to span a part of the list, you can use the standard Python syntax for a part of the list. For example, to loop from the second item in a list up to but not including the last item, you could use
for n, a in enumerate(alist[1:-1]):
    print("%d %d" % (n, a))
Note that once again, the output index runs from 0,
0 2
1 3
2 4
That brings us to the start=n switch for enumerate(). This simply offsets the index, you can equivalently simply add a number to the index inside the loop.
for n, a in enumerate(alist, start=1):
    print("%d %d" % (n, a))
for which the output is
1 1
2 2
3 3
4 4
5 5
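A small sketch of the equivalence mentioned above, and of one case where start= is handy: restoring the original positions when enumerating a slice. The variable names here are only for illustration.

```python
alist = [1, 2, 3, 4, 5]

# start=1 and adding 1 by hand produce the same pairs
with_start = list(enumerate(alist, start=1))
by_hand = [(n + 1, a) for n, a in enumerate(alist)]
print(with_start == by_hand)

# when looping over a slice, start= can restore the positions in the original list
inner = list(enumerate(alist[1:-1], start=1))
print(inner)
```

In the second case each index n satisfies alist[n] == a again, which is not true when the slice is enumerated from 0.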
If I were to iterate nums = [1, 2, 3, 4, 5] I would do
for i, num in enumerate(nums, start=1):
    print(i, num)
Or get the length as l = len(nums)
for i in range(l):
    print(i+1, nums[i])
If there is no duplicate value in the list:
for i in ints:
    indx = ints.index(i)
    print(i, indx)
You can also try this:
data = ['itemA.ABC', 'itemB.defg', 'itemC.drug', 'itemD.ashok']
x = []
for (i, item) in enumerate(data):
    a = (i, str(item).split('.'))
    x.append(a)
for index, value in x:
    print(index, value)
The output is
0 ['itemA', 'ABC']
1 ['itemB', 'defg']
2 ['itemC', 'drug']
3 ['itemD', 'ashok']
You can use the index method:
ints = [8, 23, 45, 12, 78]
inds = [ints.index(i) for i in ints]
It is highlighted in a comment that this method doesn’t work if there are duplicates in ints. The method below should work for any values in ints:
ints = [8, 8, 8, 23, 45, 12, 78]
inds = [tup[0] for tup in enumerate(ints)]
Or alternatively
ints = [8, 8, 8, 23, 45, 12, 78]
inds = [tup for tup in enumerate(ints)]
if you want to get both the index and the value in ints as a list of tuples.
It uses the method of enumerate in the selected answer to this question, but with list comprehension, making it faster with less code.
A simple answer using a while loop:
arr = [8, 23, 45, 12, 78]
i = 0
while i < len(arr):
    print("Item ", i + 1, " = ", arr[i])
    i += 1
Output:
Item 1 = 8
Item 2 = 23
Item 3 = 45
Item 4 = 12
Item 5 = 78
You can simply use a variable such as count to count the number of elements in the list:
ints = [8, 23, 45, 12, 78]
count = 0
for i in ints:
    count = count + 1
    print('item #{} = {}'.format(count, i))
To print a tuple of (index, value) in a list comprehension using a for loop:
ints = [8, 23, 45, 12, 78]
print([(i,ints[i]) for i in range(len(ints))])
Output:
[(0, 8), (1, 23), (2, 45), (3, 12), (4, 78)]
In addition to all the excellent answers above, here is a solution to this problem when working with pandas Series objects. In many cases, pandas Series have custom/unique indices (for example, unique identifier strings) that can't be accessed with the enumerate() function.
import pandas as pd
xs = pd.Series([8, 23, 45])
xs.index = ['G923002', 'G923004', 'G923005']
print(xs)
Output:
# G923002 8
# G923004 23
# G923005 45
# dtype: int64
We can see below that enumerate() doesn't give us the desired result:
for id, x in enumerate(xs):
    print("id #{} = {}".format(id, x))
Output:
# id #0 = 8
# id #1 = 23
# id #2 = 45
We can access the indices of a pandas Series in a for loop using .items():
for id, x in xs.items():
    print("id #{} = {}".format(id, x))
Output:
# id #G923002 = 8
# id #G923004 = 23
# id #G923005 = 45
One-liner lovers:
[index for index, datum in enumerate(data) if 'a' in datum]
Explanation:
>>> data = ['a','ab','bb','ba','alskdhkjl','hkjferht','lal']
>>> data
['a', 'ab', 'bb', 'ba', 'alskdhkjl', 'hkjferht', 'lal']
>>> [index for index, datum in enumerate(data) if 'a' in datum]
[0, 1, 3, 4, 6]
>>> [index for index, datum in enumerate(data) if 'b' in datum]
[1, 2, 3]
>>>
Points to take:
A Python list doesn't hand you an index when you iterate over it with for
If you enumerate a list, you get back an enumerate object (an iterator), not another list
It wraps each and every element with its index as a tuple
We can unpack those tuples into variables separated by a comma (,)
Thanks. Keep me in your prayers.
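For reference, a short sketch of what enumerate hands back and how the (index, item) tuples unpack. The variable names are only for illustration.

```python
data = ['a', 'ab', 'bb']
pairs = enumerate(data)

print(type(pairs).__name__)   # the class name of the object enumerate returns
first = next(pairs)           # pull one (index, item) tuple
index, item = first           # tuple unpacking
remaining = list(pairs)       # the rest, materialized as a list of tuples
print(first, remaining)
```

Because the object is an iterator, it is consumed as you go: after next() has taken the first pair, list() only sees the remaining ones.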
You can use range(len(some_list)) and then look up the item by index like this
xs = [8, 23, 45]
for i in range(len(xs)):
    print("item #{} = {}".format(i + 1, xs[i]))
Or use Python's built-in enumerate function, which allows you to loop over a list and retrieve the index and the value of each item in the list
xs = [8, 23, 45]
for idx, val in enumerate(xs, start=1):
    print("item #{} = {}".format(idx, val))
It can be achieved with the following code:
xs = [8, 23, 45]
for x, n in zip(xs, range(1, len(xs)+1)):
    print("item #{} = {}".format(n, x))
Here, range(1, len(xs)+1) is used because, if you expect the output to start from 1 instead of 0, you need to start the range from 1 and add 1 to the length, since range excludes its upper bound.
Final Output:
item #1 = 8
item #2 = 23
item #3 = 45
A loop with a "counter" variable, initialised to 1, that is passed to the format string as the item number.
The for loop iterates over the list "listos". Each element "i" is formatted as the item value (a price, or whatever it is).
listos = [8, 23, 45, 12, 78]
counter = 1
for i in listos:
    print('Item #{} = {}'.format(counter, i))
    counter += 1
Output:
Item #1 = 8
Item #2 = 23
Item #3 = 45
Item #4 = 12
Item #5 = 78
This serves the purpose well enough:
list1 = [10, 'sumit', 43.21, 'kumar', '43', 'test', 3]
for x in list1:
    print('index:', list1.index(x), 'value:', x)

Python - how to find the position of the second lowest value in a list [duplicate]

I have the following list:
list = [7,3,6,4,6]
I know I can use list.index(min(list)) to find the position of the lowest number in the list, but how do I use it to find the second lowest?
don't use list as a var name
edit, originally misread as index of 2nd highest - fixed now, thanx Jean-François Fabre
lst = [7,3,6,4,6]
lst.index(sorted(lst)[1])
Out[161]: 3
lst[3]
Out[162]: 4
sorted(lst)
Out[163]: [3, 4, 6, 6, 7]
The above has a problem with repeated numbers in the input list: .index returns the index of the first match.
lst = [1, 1, 7,3,6,4,6]
lst.index(sorted(lst)[1])
Out[9]: 0 # 0 is wrong, the position of the 2nd smallest is 1
I think this fixes it
n = 1
sorted([*enumerate(lst)], key=lambda x: x[1])[n][0]
Out[11]: 1
looking at the pieces
[*enumerate(lst)]
Out[12]: [(0, 1), (1, 1), (2, 7), (3, 3), (4, 6), (5, 4), (6, 6)]
enumerate pairs a count with the values in the input lst, * forces 'unpacking' of the enumerate object, and the outer square brackets 'catch' this output in a list
inside the Python builtin sorted, the 2nd argument key=lambda x: x[1] tells it to look at the 2nd position of the tuples from [*enumerate(lst)], which are the numbers from lst
sorted([*enumerate(lst)], key=lambda x: x[1])
Out[13]: [(0, 1), (1, 1), (3, 3), (5, 4), (4, 6), (6, 6), (2, 7)]
then indexing that list with [n][0] gets the n-th sorted tuple, and takes the 1st value from the tuple, which is the index assigned in enumerate
try this:
list.index(sorted(list)[1])
You could sort the enumerated list according to value, and pick the second item.
def second_pos(numbers):
    return sorted(enumerate(numbers),key=lambda x:x[::-1])[1][0]
print(second_pos([7,3,6,4,6]))
result: 3, as 4 is the second lowest value
This solution involves one sort operation only, no index operation afterwards to find the index, saving that last O(n) operation.
Note that there's a tiebreaker picking the lowest-positioned item in case 2 values are equal.
Also note that if the list is too small, you can get an IndexError
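As a possible variation on the same idea, here is a sketch that avoids sorting the whole list and returns None instead of raising when the list is too short. The function name nth_lowest_pos is hypothetical, it assumes n is at least 1, and it swaps sorted for heapq.nsmallest from the standard library.

```python
import heapq


def nth_lowest_pos(numbers, n=2):
    """Return the index of the n-th lowest value, or None if there are fewer than n items."""
    # order the (index, value) pairs by (value, index) so ties go to the lower position
    smallest = heapq.nsmallest(n, enumerate(numbers), key=lambda p: (p[1], p[0]))
    if len(smallest) < n:
        return None
    return smallest[-1][0]


print(nth_lowest_pos([7, 3, 6, 4, 6]))
print(nth_lowest_pos([1, 1, 7, 3, 6, 4, 6]))
print(nth_lowest_pos([5]))
```

Like the sorted version, this counts duplicates as separate positions, so for [1, 1, 7, 3, 6, 4, 6] the second lowest is the second 1.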
Use set to get rid of duplicates, like this
lowest_nth = 2
lst.index(sorted(set(lst))[lowest_nth-1])
Edit (important remark):
If you try this code, you will see a wrong result for duplicated values when set is not used:
lst = [7, 6, 6, 4, 3, 8]
def get_index_without_using_set():
    return lst.index(sorted(lst)[lowest_nth-1])
def get_index_using_set():
    return lst.index(sorted(set(lst))[lowest_nth-1])
print(lst)
print()
print('Without set:')
for lowest_nth in range(1,6):
    print('lowest {}: {}'.format(lowest_nth, get_index_without_using_set()))
print()
print('With set:')
for lowest_nth in range(1,6):
    print('lowest {}: {}'.format(lowest_nth, get_index_using_set()))
Output:
[7, 6, 6, 4, 3, 8]
Without set:
lowest 1: 4
lowest 2: 3
lowest 3: 1
lowest 4: 1 <-- wrong index, see below
lowest 5: 0
With set:
lowest 1: 4
lowest 2: 3
lowest 3: 1
lowest 4: 0
lowest 5: 5
This seems appropriate:
def find_2nd_smallest(iterable):
    try:
        st = set(iterable)
        st.discard(min(st))
        return iterable.index(min(st))
    except Exception:  # e.g. fewer than two distinct values
        return None
my_list = [7,3,6,4,6]
print (find_2nd_smallest(my_list))
prints the index of the first occurrence of the 2nd smallest:
3
And if you want a function where you can pass in which smallest value to find (2 means 2nd smallest):
def find_nth_smallest(iterable,n):
    try:
        st = set(iterable)
        for i in range(n-1):
            st.discard(min(st))
        return iterable.index(min(st))
    except Exception:  # e.g. fewer than n distinct values
        return None
my_list = [7,3,6,4,6]
print (find_nth_smallest(my_list,2))
Please note these functions may not be the most efficient solutions.
You can just remove the min and then take the min of what is left, which is the second min. I made a copy of the list so that the index is looked up in the unmodified list; note that lst itself is changed by remove.
lst_copy = list(lst)
lst.remove(min(lst))
lst_copy.index(min(lst))

How to extract from a Python list while also accounting for the position of the extracted elements?

Given a list x e.g.
[4,6,7,21,1,7,3]
I need to extract those values that are less than or equal to 4. This is easily done, but I also need to take some note of where in the list those values occurred. If all values were unique I know I could probably use list.index() in some way. But there will be duplicated values. How best to achieve this?
how about simply
[(i, val) for i, val in enumerate([4,6,7,21,1,7,3]) if val <= 4]
or depending on your use-case, perhaps a dictionary would be more suitable? Either from index to value:
{i:val for i, val in enumerate([4,6,7,21,1,7,3]) if val <= 4}
or from value to index:
from collections import defaultdict
indexes = defaultdict(list)
for i, val in enumerate([4,6,7,21,1,7,3]):
    if val <= 4:
        indexes[val].append(i)
You can make another list that stores tuples, with each element less than or equal to 4 as the first item and its index as the second, like this:
my_list = [4, 6, 7, 21, 1, 7, 3]
req_list = []
for i in range(len(my_list)):
    e = my_list[i]
    if e <= 4:
        req_list.append((e, i))
Here req_list will hold pairs whose first item is an element less than or equal to 4 and whose second item is the index of that element.
e.g.
if
my_list = [4, 6, 7, 21, 1, 7, 3]
then
req_list = [(4, 0), (1, 4), (3, 6)]

how to find the max number of items in a list such that certain pairs are not together in the output?

I have a list of numbers
l = [1,2,3,4,5]
and a list of tuples which describe which items should not be in the output together.
gl_distribute = [(1, 2), (1,4), (1, 5), (2, 3), (3, 4)]
the possible lists are
[1,3]
[2,4,5]
[3,5]
and I want my algorithm to give me the second one [2,4,5]
I was thinking to do it recursively.
In the first case (t1) I call my recursive algorithm with all the items except the 1st, and in the second case (t2) I call it again removing the pairs from gl_distribute where the 1st item appears.
Here is my algorithm
def check_distribute(items, distribute):
    i = sorted(items[:])
    d = distribute[:]
    if not i:
        return []
    if not d:
        return i
    if len(remove_from_distribute(i, d)) == len(d):
        return i
    first = i[0]
    rest = items[1:]
    distr_without_first = remove_from_distribute([first], d)
    t1 = check_distribute(rest, d)
    t2 = check_distribute(rest, distr_without_first)
    t2.append(first)
    if len(t1) >= len(t2):
        return t1
    else:
        return t2
The remove_from_distribute(items, distr_list) removes the pairs from distr_list that include any of the items in items.
def remove_from_distribute(items, distribute_list):
    new_distr = distribute_list[:]
    for item in items:
        for pair in distribute_list:
            x, y = pair
            if x == item or y == item and pair in new_distr:
                new_distr.remove((x,y))
    if new_distr:
        return new_distr
    else:
        return []
My output is [4, 5, 3, 2, 1] which obviously is not correct. Can you tell me what I am doing wrong here? Or can you give me a better way to approach this?
I will suggest an alternative approach.
Assuming your list and your distribution are sorted, that your list has length n, and that your distribution has length m:
First, create a list of 2-tuples with all valid combinations. This should be an O(n^2) step.
Once you have the list, it's just a simple loop through the valid combinations to find the longest list. There are probably better solutions that reduce the complexity further.
Here is my sample code:
def get_valid():
    seq = [1, 2, 3, 4, 5]
    gl_dist = [(1, 2), (1,4), (1, 5), (2, 3), (3, 4)]
    gl_index = 0
    valid = []
    for i in range(len(seq)):
        for j in range(i+1, len(seq)):
            if gl_index < len(gl_dist):
                if (seq[i], seq[j]) != gl_dist[gl_index] :
                    valid.append((seq[i], seq[j]))
                else:
                    gl_index += 1
            else:
                valid.append((seq[i], seq[j]))
    return valid
>>> get_valid()
[(1, 3), (2, 4), (2, 5), (3, 5), (4, 5)]
def get_list():
    total = get_valid()
    start = total[0][0]
    result = [start]
    for i, j in total:
        if i == start:
            result.append(j)
        else:
            start = i
            return_result = list(result)
            result = [i, j]
            yield return_result
    # the generator simply ends here; an explicit raise StopIteration is an error in Python 3.7+
    yield list(result)
>>> list(get_list())
[[1, 3], [2, 4, 5], [3, 5], [4, 5]]
I am not sure I fully understand your output, as I think 4,5 and 5,2 should also be possible lists since they are not in the list of tuples.
If so, you could use itertools to get the combinations, filter them against gl_distribute by using sets to check whether any combination in combs contains two elements that should not be together, and then take the max:
from itertools import combinations
combs = (combinations(l,r) for r in range(2,len(l)))
final = []
for x in combs:
    final += x
res = max(filter(lambda x: not any(len(set(x).intersection(s)) == 2 for s in gl_distribute),final),key=len)
print(res)
(2, 4, 5)
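The same brute-force idea can be sketched as a function that tries the largest sizes first and stops at the first combination with no forbidden pair. The name largest_compatible is made up for this example, and the search is exponential in the number of items, so it only suits small inputs like the one in the question.

```python
from itertools import combinations


def largest_compatible(items, forbidden_pairs):
    """Return one largest subset of items containing no forbidden pair (brute force)."""
    # frozensets make the check independent of the order inside each pair
    forbidden = {frozenset(p) for p in forbidden_pairs}
    # try the biggest sizes first, so the first hit is a maximum
    for size in range(len(items), 0, -1):
        for combo in combinations(items, size):
            if not any(frozenset(pair) in forbidden for pair in combinations(combo, 2)):
                return combo
    return ()


l = [1, 2, 3, 4, 5]
gl_distribute = [(1, 2), (1, 4), (1, 5), (2, 3), (3, 4)]
best = largest_compatible(l, gl_distribute)
print(best)
```

Unlike the max/filter version, this also considers single-item and full-size subsets and avoids building the list of all combinations up front.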
