Given a list of data as follows:
input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
I would like to create an algorithm that is able to offset the list by a certain number of steps. For example, if offset = -1:
def offsetFunc(inputList, offsetList):
    #make something
    return output
where:
output = [0,0,0,0,1,1,5,5,5,5,5,5,3,3,3,2,2]
Important note: the elements of the list are float numbers and they do not follow any progression, so I actually need to shift them; I cannot use any work-around to get the result.
So basically, the algorithm should replace the first set of values (the four 1s) with 0, and then it should:
Detect the length of the next range of values
Create a parallel output vector with the values delayed by one set
The rough description above is how I would do it. However, I'm a newbie to Python (and a beginner in general programming), and I have learned over time that Python has a lot of built-in functions that could make the algorithm lighter and less iterative. Does anyone have any suggestions for a better way to write a script for this kind of job? This is the code I have written so far (assuming a static offset of -1):
input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
output = []
PrevVal = 0
NextVal = input[0]
i = 0
while input[i] == NextVal:
    output.append(PrevVal)
    i += 1
while i < len(input):
    PrevVal = NextVal
    NextVal = input[i]
    while input[i] == NextVal:
        output.append(PrevVal)
        i += 1
        if i >= len(input):
            break
print output
Thanks in advance for any help!
BETTER DESCRIPTION
My list will always be composed of "sets" of values. They are usually float numbers, and they take values such as in the short example below:
Sample = [1.236,1.236,1.236,1.236,1.863,1.863,1.863,1.863,1.863,1.863]
In this example, the first set (the one with value "1.236") has length 4 while the second one has length 6. What I would like to get as output, when offset = -1, is:
The value "0.000" in the first 4 elements;
The value "1.236" in the second 6 elements.
So basically, this "offset" function creates a list with the same "structure" (the same run lengths) but with the values delayed by "offset" sets.
I hope it's clear now; unfortunately the problem is still a bit fuzzy even to me (plus I don't speak English very well :) )
Please don't hesitate to ask any additional info to complete the question and make it clearer.
How about this:
def generateOutput(input, value=0, offset=-1):
    values = []
    for i in range(len(input)):
        if i < 1 or input[i] == input[i-1]:
            yield value
        else: # value change in input detected
            values.append(input[i-1])
            if len(values) >= -offset:
                value = values.pop(0)
            yield value
input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
print list(generateOutput(input))
It will print this:
[0, 0, 0, 0, 1, 1, 5, 5, 5, 5, 5, 5, 3, 3, 3, 2, 2]
And in case you just want to iterate, you do not even need to build the list. Just use for i in generateOutput(input): … then.
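For instance, a minimal sketch that consumes the values lazily and never materializes the list (45 is simply the sum of the shifted sample values, 0*4 + 1*2 + 5*6 + 3*3 + 2*2):
total = sum(generateOutput(input))   # consumes the generator lazily
print(total)   # 45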
For other offsets, use this:
print list(generateOutput(input, 0, -2))
prints:
[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 5, 5, 5, 3, 3]
This uses a deque as the queue, with maxlen defining the shift length, and it only holds unique values: pushing new values in at the end pushes old values out at the start of the queue once the shift length has been reached.
from collections import deque
def shift(it, shift=1):
    q = deque(maxlen=shift+1)
    q.append(0)
    for i in it:
        if q[-1] != i:
            q.append(i)
        yield q[0]
Sample = [1.236,1.236,1.236,1.236,1.863,1.863,1.863,1.863,1.863,1.863]
print list(shift(Sample))
#[0, 0, 0, 0, 1.236, 1.236, 1.236, 1.236, 1.236, 1.236]
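A quick usage sketch with a larger shift, applied to the integer list from the question; the result matches the offset = -2 output shown in an earlier answer:
data = [1, 1, 1, 1, 5, 5, 3, 3, 3, 3, 3, 3, 2, 2, 2, 5, 5]
print(list(shift(data, 2)))   # shift by two sets
# [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 5, 5, 5, 3, 3]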
My try:
#Input
input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
shift = -1
#Build service structures: for each 'set of data' store its length and its value
set_lengths = []
set_values = []
prev_value = None
set_length = 0
for value in input:
    if prev_value is not None and value != prev_value:
        set_lengths.append(set_length)
        set_values.append(prev_value)
        set_length = 0
    set_length += 1
    prev_value = value
else:
    set_lengths.append(set_length)
    set_values.append(prev_value)
#Output the result, shifting the values
output = []
for i, l in enumerate(set_lengths):
    j = i + shift
    if j < 0:
        output += [0] * l
    else:
        output += [set_values[j]] * l
print input
print output
gives:
[1, 1, 1, 1, 5, 5, 3, 3, 3, 3, 3, 3, 2, 2, 2, 5, 5]
[0, 0, 0, 0, 1, 1, 5, 5, 5, 5, 5, 5, 3, 3, 3, 2, 2]
def x(list, offset):
    return [el + offset for el in list]
A completely different approach than my first answer is this:
import itertools
First analyze the input:
values, amounts = zip(*((n, len(list(g))) for n, g in itertools.groupby(input)))
We now have (1, 5, 3, 2, 5) and (4, 2, 6, 3, 2). Now apply the offset:
values = (0,) * (-offset) + values # nevermind that it is longer now.
And synthesize it again:
output = sum([ [v] * a for v, a in zip(values, amounts) ], [])
This is way more elegant, way less understandable and probably way more expensive than my other answer, but I didn't want to hide it from you.
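For completeness, a runnable sketch that puts the three snippets above together (offset and input as in the question):
import itertools

input = [1, 1, 1, 1, 5, 5, 3, 3, 3, 3, 3, 3, 2, 2, 2, 5, 5]
offset = -1

# analyze: run values and run lengths
values, amounts = zip(*((n, len(list(g))) for n, g in itertools.groupby(input)))
# apply the offset by prepending zeros; zip below ignores the surplus values
values = (0,) * (-offset) + values
# synthesize: repeat each shifted value by the original run length
output = sum([[v] * a for v, a in zip(values, amounts)], [])
print(output)   # [0, 0, 0, 0, 1, 1, 5, 5, 5, 5, 5, 5, 3, 3, 3, 2, 2]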
I am trying to write a function in Python. The function is based on an algorithm: a summation using the sides of a polygon with n sides.
For each "loop" you add n[i] + n[1+i].
In Python, can you do this with for loops?
This is a very easy thing to do in languages like Java and C++, but the nature of Python for loops makes it less obvious. Can for loops accomplish this, or should while loops be used?
You can use zip and for-loop here:
>>> lis = range(10)
>>> [x+y for x, y in zip(lis, lis[1:])]
[1, 3, 5, 7, 9, 11, 13, 15, 17]
If the list is huge then you can use itertools.izip and itertools.tee:
from itertools import izip, tee
it1, it2 = tee(lis) #creates two iterators from the list(or any iterable)
next(it2) #drop the first item
print [x+y for x, y in izip(it1, it2)]
#[1, 3, 5, 7, 9, 11, 13, 15, 17]
for i in range(N): # i = 0,1, ... N-1
    val = n[i] + n[i+1]
if you want to 'wrap around', you can write
for i in range(N): # i = 0,1, ... N-1
    val = n[i] + n[(i+1)%N]
.. or use the fact that n[-1] is the same as the last element
for i in range(N): # i = 0,1, ... N-1
    val = n[i-1] + n[i] # [N-1]+[0], [0]+[1], ... [N-2] + [N-1]
This approach will likely be slower but may be easier to follow than zips and iterations.
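A self-contained sketch of the wrap-around variant; the side lengths in n are made-up example data:
n = [3, 1, 4, 1, 5]   # hypothetical side lengths
N = len(n)
sums = [n[i] + n[(i + 1) % N] for i in range(N)]
print(sums)   # [4, 5, 5, 6, 8] -- the last side pairs with the first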
Say I have a list:
l = [1, 2, 3, 4]
And I want to cycle through it. Normally, it would do something like this,
1, 2, 3, 4, 1, 2, 3, 4, 1, 2...
I want to be able to start at a certain point in the cycle, not necessarily an index, but perhaps matching an element. Say I wanted to start at whatever element in the list ==4, then the output would be,
4, 1, 2, 3, 4, 1, 2, 3, 4, 1...
How can I accomplish this?
Look at the itertools module. It provides all the necessary functionality.
from itertools import cycle, islice, dropwhile
L = [1, 2, 3, 4]
cycled = cycle(L) # cycle through the list 'L'
skipped = dropwhile(lambda x: x != 4, cycled) # drop the values until x==4
sliced = islice(skipped, None, 10) # take the first 10 values
result = list(sliced) # create a list from iterator
print(result)
Output:
[4, 1, 2, 3, 4, 1, 2, 3, 4, 1]
Use the arithmetic mod operator. Suppose you're starting from position k, then k should be updated like this:
k = (k + 1) % len(l)
If you want to start from a certain element, not index, you can always look it up like k = l.index(x) where x is the desired item.
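A minimal sketch of that approach, using the list and the start value 4 from the question:
l = [1, 2, 3, 4]
k = l.index(4)            # start at the element equal to 4
for _ in range(10):       # take ten values from the cycle
    print(l[k])
    k = (k + 1) % len(l)
# prints 4, 1, 2, 3, 4, 1, 2, 3, 4, 1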
I'm not such a big fan of importing modules when you can do things by your own in a couple of lines. Here's my solution without imports:
def cycle(my_list, start_at=None):
    start_at = 0 if start_at is None else my_list.index(start_at)
    while True:
        yield my_list[start_at]
        start_at = (start_at + 1) % len(my_list)
This will return an (infinite) iterator looping over your list. To get the next element in the cycle you must use the next() function:
>>> it1 = cycle([101,102,103,104])
>>> next(it1), next(it1), next(it1), next(it1), next(it1)
(101, 102, 103, 104, 101) # and so on ...
>>> it1 = cycle([101,102,103,104], start_at=103)
>>> next(it1), next(it1), next(it1), next(it1), next(it1)
(103, 104, 101, 102, 103) # and so on ...
import itertools as it
l = [1, 2, 3, 4]
list(it.islice(it.dropwhile(lambda x: x != 4, it.cycle(l)), 10))
# returns: [4, 1, 2, 3, 4, 1, 2, 3, 4, 1]
so the iterator you want is:
it.dropwhile(lambda x: x != 4, it.cycle(l))
Hm, http://docs.python.org/library/itertools.html#itertools.cycle doesn't have such a start element.
Maybe you just start the cycle anyway and drop the first elements that you don't like.
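A minimal sketch of that idea: start the cycle, then drop values until the desired one comes around:
from itertools import cycle

L = [1, 2, 3, 4]
c = cycle(L)
first = next(c)
while first != 4:         # drop the elements before the desired start
    first = next(c)
print([first] + [next(c) for _ in range(9)])
# [4, 1, 2, 3, 4, 1, 2, 3, 4, 1]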
Another weird option is that cycling through lists can be accomplished backwards. For instance:
# Run this once
myList = ['foo', 'bar', 'baz', 'boom']
myItem = 'baz'
# Run this repeatedly to cycle through the list
if myItem in myList:
    myItem = myList[myList.index(myItem)-1]
print myItem
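Running that step in a small loop walks the list backwards, wrapping around at the front (a quick sketch of the same idea):
myList = ['foo', 'bar', 'baz', 'boom']
myItem = 'baz'
for _ in range(6):
    myItem = myList[myList.index(myItem) - 1]
    print(myItem)
# bar, foo, boom, baz, bar, foo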
Can use something like this:
def my_cycle(data, start=None):
    k = 0 if not start else start
    while True:
        yield data[k]
        k = (k + 1) % len(data)
Then run:
for val in my_cycle([0,1,2,3], 2):
    print(val)
Essentially the same as one of the previous answers. My bad.
I asked some similar questions [1, 2] yesterday and got great answers, but I am not yet technically skilled enough to write a generator of such sophistication myself.
How could I write a generator that would raise StopIteration if it's the last item, instead of yielding it?
I am thinking I should somehow ask two values at a time, and see if the 2nd value is StopIteration. If it is, then instead of yielding the first value, I should raise this StopIteration. But somehow I should also remember the 2nd value that I asked if it wasn't StopIteration.
I don't know how to write it myself. Please help.
For example, if the iterable is [1, 2, 3], then the generator should return 1 and 2.
Thanks, Boda Cydo.
[1] How do I modify a generator in Python?
[2] How to determine if the value is ONE-BUT-LAST in a Python generator?
This should do the trick:
def allbutlast(iterable):
    it = iter(iterable)
    current = it.next()
    for i in it:
        yield current
        current = i
>>> list(allbutlast([1,2,3]))
[1, 2]
This will iterate through the entire list and yield the previous item, so the last item is never returned.
Note that calling the above on both [] and [1] will return an empty list.
First off, is a generator really needed? This sounds like the perfect job for Python's slice syntax:
result = my_range[:-1]
I.e., take the range from the first item to the one before the last.
The itertools module shows a pairwise() function in its recipes. Adapting from this recipe, you can get your generator:
from itertools import *
def n_apart(iterable, n):
    a, b = tee(iterable)
    for count in range(n):
        next(b)
    return zip(a, b)

def all_but_n_last(iterable, n):
    return (value for value, dummy in n_apart(iterable, n))
The n_apart() function returns pairs of values that are n elements apart in the input iterable. all_but_n_last() returns the first value of each pair, which incidentally ignores the n last elements of the list.
>>> data = range(10)
>>> list(data)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(n_apart(data,3))
[(0, 3), (1, 4), (2, 5), (3, 6), (4, 7), (5, 8), (6, 9)]
>>> list(all_but_n_last(data,3))
[0, 1, 2, 3, 4, 5, 6]
>>>
>>> list(all_but_n_last(data,1))
[0, 1, 2, 3, 4, 5, 6, 7, 8]
The more_itertools project has a tool that emulates itertools.islice with support for negative indices:
import more_itertools as mit
list(mit.islice_extended([1, 2, 3], None, -1))
# [1, 2]
gen = (x for x in iterable[:-1])
I want an algorithm to iterate over list slices. The slice size is set outside the function and can differ.
In my mind it is something like:
for list_of_x_items in fatherList:
    foo(list_of_x_items)
Is there a way to properly define list_of_x_items or some other way of doing this using python 2.5?
Edit 1 (clarification): Both the "partitioning" and "sliding window" terms sound applicable to my task, but I am no expert. So I will explain the problem a bit more deeply and add to the question:
The fatherList is a multilevel numpy.array I am getting from a file. The function has to find averages of series (the user provides the length of the series). For averaging I am using the mean() function. Now to expand the question:
Edit 2: How to modify the function you have provided to store the extra items and use them when the next fatherList is fed to the function?
For example, if the list has length 10 and the chunk size is 3, then the 10th member of the list is stored and appended to the beginning of the next list.
Related:
What is the most “pythonic” way to iterate over a list in chunks?
If you want to divide a list into slices you can use this trick:
list_of_slices = zip(*(iter(the_list),) * slice_size)
For example
>>> zip(*(iter(range(10)),) * 3)
[(0, 1, 2), (3, 4, 5), (6, 7, 8)]
If the number of items is not divisible by the slice size and you want to pad the list with None you can do this:
>>> map(None, *(iter(range(10)),) * 3)
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, None, None)]
It is a dirty little trick
OK, I'll explain how it works. It'll be tricky to explain but I'll try my best.
First a little background:
In Python you can multiply a list by a number like this:
[1, 2, 3] * 3 -> [1, 2, 3, 1, 2, 3, 1, 2, 3]
([1, 2, 3],) * 3 -> ([1, 2, 3], [1, 2, 3], [1, 2, 3])
And an iterator object can be consumed once like this:
>>> l=iter([1, 2, 3])
>>> l.next()
1
>>> l.next()
2
>>> l.next()
3
The zip function returns a list of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables. For example:
zip([1, 2, 3], [20, 30, 40]) -> [(1, 20), (2, 30), (3, 40)]
zip(*[(1, 20), (2, 30), (3, 40)]) -> [(1, 2, 3), (20, 30, 40)]
The * in front of the zip argument is used to unpack it into separate arguments. You can find more details here.
So
zip(*[(1, 20), (2, 30), (3, 40)])
is actually equivalent to
zip((1, 20), (2, 30), (3, 40))
but works with a variable number of arguments
Now back to the trick:
list_of_slices = zip(*(iter(the_list),) * slice_size)
iter(the_list) -> convert the list into an iterator
(iter(the_list),) * N -> will generate a tuple of N references to the same the_list iterator.
zip(*(iter(the_list),) * N) -> will feed those iterators into zip, which in turn groups their items into N-sized tuples. But since all N arguments are in fact references to the same iterator, the result is repeated calls to next() on the original iterator.
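To make those steps concrete, here is the trick unrolled for slice_size = 3 (same data as above, Python 2 semantics where zip returns a list):
the_list = range(10)          # [0, 1, ..., 9] in Python 2
it = iter(the_list)
args = (it,) * 3              # three references to the SAME iterator
list_of_slices = zip(*args)   # zip(it, it, it): each tuple pulls three items
# -> [(0, 1, 2), (3, 4, 5), (6, 7, 8)]   the leftover 9 is silently dropped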
I hope that explains it. I advise you to go with an easier-to-understand solution; I was only tempted to mention this trick because I like it.
If you want to be able to consume any iterable you can use these functions:
from itertools import chain, islice
def ichunked(seq, chunksize):
"""Yields items from an iterator in iterable chunks."""
it = iter(seq)
while True:
yield chain([it.next()], islice(it, chunksize-1))
def chunked(seq, chunksize):
"""Yields items from an iterator in list chunks."""
for chunk in ichunked(seq, chunksize):
yield list(chunk)
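A quick usage sketch (Python 2, since the code above uses it.next(); a Python 3 compatible rewrite appears in a later answer below):
for chunk in chunked("abcdefghij", 3):
    print(chunk)
# ['a', 'b', 'c']
# ['d', 'e', 'f']
# ['g', 'h', 'i']
# ['j']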
Use a generator:
big_list = [1,2,3,4,5,6,7,8,9]
slice_length = 3
def sliceIterator(lst, sliceLen):
    for i in range(len(lst) - sliceLen + 1):
        yield lst[i:i + sliceLen]

for slice in sliceIterator(big_list, slice_length):
    foo(slice)
sliceIterator implements a "sliding window" of width sliceLen over the sequence lst, i.e. it produces overlapping slices: [1,2,3], [2,3,4], [3,4,5], ... Not sure if that is the OP's intention, though.
Do you mean something like:
def callonslices(size, fatherList, foo):
    for i in xrange(0, len(fatherList), size):
        foo(fatherList[i:i+size])
If this is roughly the functionality you want you might, if you desire, dress it up a bit in a generator:
def sliceup(size, fatherList):
    for i in xrange(0, len(fatherList), size):
        yield fatherList[i:i+size]
and then:
def callonslices(size, fatherList, foo):
    for sli in sliceup(size, fatherList):
        foo(sli)
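A quick usage sketch (Python 2, because of the xrange above), with foo standing in for whatever per-slice processing you need; here it just prints:
def foo(chunk):
    print(chunk)

callonslices(4, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], foo)
# [1, 2, 3, 4]
# [5, 6, 7, 8]
# [9, 10]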
Answer to the last part of the question:
question update: How to modify the function you have provided to store the extra items and use them when the next fatherList is fed to the function?
If you need to store state then you can use an object for that.
class Chunker(object):
"""Split `iterable` on evenly sized chunks.
Leftovers are remembered and yielded at the next call.
"""
def __init__(self, chunksize):
assert chunksize > 0
self.chunksize = chunksize
self.chunk = []
def __call__(self, iterable):
"""Yield items from `iterable` `self.chunksize` at the time."""
assert len(self.chunk) < self.chunksize
for item in iterable:
self.chunk.append(item)
if len(self.chunk) == self.chunksize:
# yield collected full chunk
yield self.chunk
self.chunk = []
Example:
chunker = Chunker(3)
for s in "abcd", "efgh":
for chunk in chunker(s):
print ''.join(chunk)
if chunker.chunk: # is there anything left?
print ''.join(chunker.chunk)
Output:
abc
def
gh
I am not sure, but it seems you want to do what is called a moving average. numpy provides facilities for this (the convolve function).
>>> x = numpy.array(range(20))
>>> x
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19])
>>> n = 2 # moving average window
>>> numpy.convolve(numpy.ones(n)/n, x)[n-1:-n+1]
array([ 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5,
9.5, 10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5, 17.5, 18.5])
The nice thing is that it accommodates different weighting schemes nicely (just change numpy.ones(n)/n to something else).
You can find more complete material here:
http://www.scipy.org/Cookbook/SignalSmooth
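To illustrate the weighting remark, a small sketch that swaps in a made-up triangular kernel for n = 3 (the weights only need to sum to 1; x is the same array as above):
n = 3
w = numpy.array([0.25, 0.5, 0.25])          # triangular weights summing to 1
smoothed = numpy.convolve(w, x)[n-1:-n+1]   # same slicing trick as above
# array([ 1., 2., ..., 18.])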
Expanding on the answer of #Ants Aasma: In Python 3.7 the handling of the StopIteration exception changed (according to PEP-479). A compatible version would be:
from itertools import chain, islice
def ichunked(seq, chunksize):
    it = iter(seq)
    while True:
        try:
            yield chain([next(it)], islice(it, chunksize - 1))
        except StopIteration:
            return
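A small usage check of the compatible version (Python 3):
print([list(c) for c in ichunked(range(10), 3)])
# [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]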
Your question could use some more detail, but how about:
def iterate_over_slices(the_list, slice_size):
    for start in range(0, len(the_list)-slice_size):
        slice = the_list[start:start+slice_size]
        foo(slice)
For a near one-liner (after the itertools import) in the vein of Nadia's answer, dealing with sizes that are not divisible by the chunk size, without padding:
>>> import itertools as itt
>>> chunksize = 5
>>> myseq = range(18)
>>> cnt = itt.count()
>>> print [ tuple(grp) for k,grp in itt.groupby(myseq, key=lambda x: cnt.next()//chunksize%2)]
[(0, 1, 2, 3, 4), (5, 6, 7, 8, 9), (10, 11, 12, 13, 14), (15, 16, 17)]
If you want, you can get rid of the itertools.count() requirement by using enumerate(), with a rather uglier version:
[ [e[1] for e in grp] for k,grp in itt.groupby(enumerate(myseq), key=lambda x: x[0]//chunksize%2) ]
(In this example the enumerate() would be superfluous, but not all sequences are neat ranges like this, obviously)
Nowhere near as neat as some other answers, but useful in a pinch, especially if already importing itertools.
A function that slices a list or an iterator into chunks of a given size. Also handles the case correctly if the last chunk is smaller:
def slice_iterator(data, slice_len):
    it = iter(data)
    while True:
        items = []
        for index in range(slice_len):
            try:
                item = next(it)
            except StopIteration:
                if items == []:
                    return # we are done
                else:
                    break # exits the "for" loop
            items.append(item)
        yield items
Usage example:
for slice in slice_iterator([1,2,3,4,5,6,7,8,9,10],3):
    print(slice)
Result:
[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
[10]