Storing randomly generated dictionaries in Python - python

I have a series of functions that end up giving a list, with the first item containing a number, derived from the dictionaries, and the second and third items are dictionaries.
These dictionaries have been previously randomly generated.
The function I am using generates a given number of these dictionaries, trying to get the highest number possible as the first item. (It's designed to optimise dice rolls).
This all works fine, and I can print the value of the highest first item from all iterations. However, when I try and print the two dictionaries associated with this first number (bearing in mind they're all in a list together), it just seemingly randomly generates the two other dictionaries.
def repeat(type, times):
    best = 0
    for i in range(0, times):
        x = rollForCharacter(type)
        if x[0] > best:
            print("BEST:", x)
            best = x[0]
    print("The highest average success is", best)
    return best
This works great. The last thing shown is:
BEST: (3.58, [{'strength': 4, 'intelligence': 1, 'charisma': 1, 'stamina': 4, 'willpower': 2, 'dexterity': 2, 'wits': 5, 'luck': 2}, {'agility': 1, 'brawl': 2, 'investigation': 3, 'larceny': 0, 'melee': 1, 'survival': 0, 'alchemy': 3, 'archery': 0, 'crafting': 0, 'drive': 1, 'magic': 0, 'medicine': 0, 'commercial': 0, 'esteem': 5, 'instruction': 2, 'intimidation': 2, 'persuasion': 0, 'seduction': 0}])
The highest average success is 3.58
But if I try something to store the list which gave this number:
def repeat(type, times):
    best = 0
    bestChar = []
    for i in range(0, times):
        x = rollForCharacter(type)
        if x[0] > best:
            print("BEST:", x)
            best = x[0]
            bestChar = x
    print("The highest average success is", best)
    print("Therefore the best character is", bestChar)
    return best, bestChar
I get this as the last result, which is fine:
BEST: (4.15, [{'strength': 2, 'intelligence': 3, 'charisma': 4, 'stamina': 4, 'willpower': 1, 'dexterity': 2, 'wits': 4, 'luck': 1}, {'agility': 1, 'brawl': 0, 'investigation': 5, 'larceny': 0, 'melee': 0, 'survival': 0, 'alchemy': 7, 'archery': 0, 'crafting': 0, 'drive': 0, 'magic': 0, 'medicine': 0, 'commercial': 1, 'esteem': 0, 'instruction': 3, 'intimidation': 0, 'persuasion': 0, 'seduction': 0}])
The highest average success is 4.15
but the last line is
Therefore the best character is (4.15, [{'strength': 1, 'intelligence': 3, 'charisma': 4, 'stamina': 4, 'willpower': 1, 'dexterity': 2, 'wits': 2, 'luck': 3}, {'agility': 1, 'brawl': 0, 'investigation': 1, 'larceny': 4, 'melee': 2, 'survival': 0, 'alchemy': 2, 'archery': 4, 'crafting': 0, 'drive': 0, 'magic': 0, 'medicine': 0, 'commercial': 1, 'esteem': 0, 'instruction': 0, 'intimidation': 2, 'persuasion': 1, 'seduction': 0}])
As you can see this doesn't match with what I want, and what is printed literally right above it.
Through a little bit of checking, I realised what it gives out as the "Best Character" is just the last one generated, which is not the best, just the most recent. However, it isn't that simple, because the first element IS the highest result that was recorded, just not from the character in the rest of the list. This is really confusing because it means the list is somehow being edited but at no point can I see where that would happen.
Am I doing something stupid whereby the character is randomly generated every time? I wouldn't think so since x[0] gives the correct result and is stored fine, so what changes when it's the whole list?
From the function rollForCharacter() it returns rollResult, character which is just the number and then the two dictionaries.
I would greatly appreciate it if anyone could figure out and explain where I'm going wrong and why it can print the correct answer to the console yet not store it correctly a line below!
EDIT:
Dictionary 1 Code:
from random import randint, shuffle
attributes = {}
def assignRow(row, p):  # p is the number of points you have to assign to each row
    rowValues = {}
    for i in range(0, len(row)-1):
        val = randint(0, p)
        rowValues[row[i]] = val + 1
        p -= val
    rowValues[row[-1]] = p + 1
    return attributes.update(rowValues)
def getPoints():
    points = [7, 5, 3]
    shuffle(points)
    row1 = ['strength', 'intelligence', 'charisma']
    row2 = ['stamina', 'willpower']
    row3 = ['dexterity', 'wits', 'luck']
    for i in range(0, len(points)):
        row = eval("row" + str(i+1))
        assignRow(row, points[i])
Dictionary 2 Code:
from random import randint, shuffle
skills = {}
def assignRow(row, p):  # p is the number of points you have to assign to each row
    rowValues = {}
    for i in range(0, len(row) - 1):
        val = randint(0, p)
        rowValues[row[i]] = val
        p -= val
    rowValues[row[-1]] = p
    return skills.update(rowValues)
def getPoints():
    points = [11, 7, 4]
    shuffle(points)
    row1 = ['agility', 'brawl', 'investigation', 'larceny', 'melee', 'survival']
    row2 = ['alchemy', 'archery', 'crafting', 'drive', 'magic', 'medicine']
    row3 = ['commercial', 'esteem', 'instruction', 'intimidation', 'persuasion', 'seduction']
    for i in range(0, len(points)):
        row = eval("row" + str(i + 1))
        assignRow(row, points[i])

It does look like the dictionaries are being re-generated. That can easily happen if rollForCharacter returns a generator, or if it returns a reference to a global variable that a later iteration of the loop overwrites. The code in your edit points to the latter: attributes and skills are module-level dictionaries that assignRow updates in place.
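A simple-but-hacky way to solve the problem would be to take a deep copy of the result at the time of storing, so that you're sure you're keeping the values at that point. Note that x[1] is a list of dictionaries, so a shallow x[1].copy() would still share the dictionaries themselves:
import copy
def repeat(type, times):
    best = 0
    bestChar = []
    for i in range(0, times):
        x = rollForCharacter(type)
        if x[0] > best:
            print("BEST:", x)
            best = x[0]
            # Store an independent snapshot; deepcopy also copies the nested dicts
            bestChar = copy.deepcopy(x)
    print("The highest average success is", best)
    print("Therefore the best character is", bestChar)
    return best, bestChar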
The proper fix, however, would be to have each call produce its own new dictionary, so that the stored result cannot be affected by later code.
See this SO answer with a bit more context about how passing a reference to a dictionary can be risky as it's still a mutable object.
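The aliasing described above can be reproduced in a few lines. This is only a sketch, assuming (the question does not confirm it) that rollForCharacter returns the module-level dictionaries themselves; roll_for_character_sketch, repeat_sketch, the snapshot flag and the two-key dictionaries are hypothetical stand-ins, not the asker's code:

```python
import copy
import random

# Hypothetical stand-ins for the module-level dictionaries in the question.
attributes = {}
skills = {}

def roll_for_character_sketch(rng):
    """Refill the shared global dicts in place and return references to them."""
    attributes.update(strength=rng.randint(1, 5), wits=rng.randint(1, 5))
    skills.update(brawl=rng.randint(0, 5), alchemy=rng.randint(0, 5))
    score = sum(attributes.values()) + sum(skills.values())
    return score, [attributes, skills]

def repeat_sketch(times, seed=0, snapshot=True):
    rng = random.Random(seed)
    best, best_char = -1, None
    for _ in range(times):
        x = roll_for_character_sketch(rng)
        if x[0] > best:
            best = x[0]
            # deepcopy freezes the dicts; without it best_char keeps pointing
            # at the globals, which the next roll overwrites.
            best_char = copy.deepcopy(x) if snapshot else x
    return best, best_char

best, best_char = repeat_sketch(50)
stored_score = sum(best_char[1][0].values()) + sum(best_char[1][1].values())
print(best == stored_score)  # True: the snapshot still matches its score
```

With snapshot=False the returned list holds the global dictionaries themselves, so it shows whatever the most recent roll wrote, while the number in the first slot stays at the best value. That is the same mismatch as in the question's output.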

Related

Python sudoku backtracking

I was trying out the backtracking algorithm with an easy example (sudoku). I first tried another approach where more possibilities are canceled, but after I got the same error I switched to an easier solution.
look for the first unsolved spot
fill in every number between 1 and 9 and backtrack the new field if it is valid
When I run it and output the non-valid fields I can see that when the algorithm goes out of a recursion call the spot that was in that recursion call is still a 9 (so the algorithm couldn't find anything for that spot)
e.g. the first two lines look something like this (it's trying to solve an empty field):
[1, 2, 3, 4, 6, 9, 9, 9, 9]
[9, 9, 9, 9, 9, 9, 9, 0, 0]
I thought it was a reference error and inserted
[e for e in field]
in the backtracking call so that the old field doesn't get altered although that didn't seem to help.
Here is my code:
def isValid(field):
    for i in range(9):
        a = [field[i][j] for j in range(9) if field[i][j] != 0]
        if len(a) != len(set(a)):
            return False
    for i in range(9):
        a = [field[j][i] for j in range(9) if field[j][i] != 0]
        if len(a) != len(set(a)):
            return False
    for x in range(3):
        for y in range(3):
            a = []
            for addX in range(3):
                for addY in range(3):
                    spot = field[x * 3 + addX][y * 3 + addY]
                    if spot != 0:
                        a.append(spot)
            if len(a) != len(set(a)):
                return False
    return True
def findEmpty(field):
    for i in range(9):
        for j in range(9):
            if field[i][j] == 0:
                return i, j
def backtracking(field):
    find = findEmpty(field)
    if not find:
        return True, field
    else:
        x, y = find
        for i in range(1, 10):
            print(f"Trying {i} at {x} {y}")
            field[x][y] = i
            if isValid(field):
                s = backtracking([e for e in field])
                if s[0]:
                    return s
            else:
                print("Not valid")
                for row in field:
                    print(row)
        return False, None
field = [[0, 0, 0, 0, 1, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0, 0, 0, 0]]
solution = backtracking(field)
if solution[0]:
    print("There was a solution. The field is:")
    for row in solution[1]:
        print(row)
else:
    print("No solution was found")
Okay, so based on what I can see in the logs, when the code gets to 9 for a cell and still does not find an answer, it backtracks but leaves that cell set to 9.
So every time the program backtracks, it leaves a 9 behind and goes back to the previous cell, which may in turn run up to 9; that is invalid, because the cell we backtracked from already holds a 9. This causes a cascade where the program backtracks straight to the start and leaves most cells at 9, as you can see in your example.
So the solution would be to add two lines to backtracking() as below. In short, those two lines check whether the cell still holds a 9 after a failed attempt; if it does, the cell is reset to 0 before backtracking to the previous cell, so the search can continue until it finds a valid answer.
def backtracking(field):
    find = findEmpty(field)
    if not find:
        return True, field
    else:
        x, y = find
        for i in range(1, 10):
            print(f"Trying {i} at {x} {y}")
            field[x][y] = i
            if isValid(field):
                s = backtracking(field)
                if s[0]:
                    return s
            else:
                print("Not valid")
            # Once 9 has failed too, mark the cell as empty again before backtracking
            if field[x][y] == 9:
                field[x][y] = 0
            for row in field:
                print(row)
        return False, None
Solution it gave:
[2, 3, 4, 5, 1, 6, 7, 8, 9]
[1, 5, 6, 7, 8, 9, 2, 3, 4]
[7, 8, 9, 2, 3, 4, 1, 5, 6]
[3, 1, 2, 4, 5, 7, 6, 9, 8]
[4, 6, 5, 1, 9, 8, 3, 2, 7]
[8, 9, 7, 3, 6, 2, 4, 1, 5]
[5, 2, 8, 6, 4, 1, 9, 7, 3]
[6, 7, 3, 9, 2, 5, 8, 4, 1]
[9, 4, 1, 8, 7, 3, 5, 6, 2]
I did some research and apparently it really is a reference error. For me, importing Python's copy module and assigning each new field with
f = copy.deepcopy(field)
fixes the issue (this also works for the complex example).
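A small sketch of why the [e for e in field] attempt from the question did not help while copy.deepcopy does. The 2x2 field is a hypothetical stand-in for the 9x9 board:

```python
import copy

field = [[0, 0], [0, 0]]

shallow = [row for row in field]   # new outer list, same inner row objects
deep = copy.deepcopy(field)        # new outer list and new rows

field[0][0] = 9

print(shallow[0][0])  # 9: the row is shared with field
print(deep[0][0])     # 0: the deep copy is independent
print(shallow[0] is field[0], deep[0] is field[0])  # True False
```

The list comprehension only copies the outer list, so writing to field[x][y] still changes the rows that every recursion level shares.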
1- Create rows, cols, and boxes (3x3 units) array of dictionaries to store which indices of each row, col, and boxes have numbers.
2- Take a snapshot of the board's initial state. Run a for-loop and mark which cells already contain numbers.
3- Call the recursive backtrack function.
4- Always in recursive functions define the base case to exit out of the recursion.
5- Start a for-loop to see which coordinate is ".". If you see ".", apply steps:[6,7,8,9]
6- Now we need to insert a valid number here. A valid number is a number that does not exist in the current row, col, and box.
7- If you find a valid number, insert it into the board and update rows, cols, and boxes.
8- After we inserted the valid point, we call backtrack function for the next ".".
9- When calling the backtrack, decide at which point you are. If you are in the last column, your next backtrack function will start from the next row and column 0. But if you are in the middle of row, you just increase the column parameter for next backtrack function.
10- If in step 5 your value is not ".", that means there is already a valid number here so call next backtracking depending on where position is. If you are in the last column, your next backtrack function will start from the next row and column 0. But if you are in the middle of row, you just increase the column parameter for next backtrack function.
from typing import List
class Solution:
    def solveSudoku(self, board: List[List[str]]) -> None:
        """
        Do not return anything, modify board in-place instead.
        """
        n=len(board)
        # create state variables, keep track of rows, cols and boxes
        rows=[{} for _ in range(n)]
        cols=[{} for _ in range(n)]
        boxes=[{} for _ in range(n)]
        # get the initial state of the grid
        for r in range(n):
            for c in range(n):
                if board[r][c]!='.':
                    val=board[r][c]
                    box_id=self.get_box_id(r,c)
                    boxes[box_id][val]=True
                    rows[r][val]=True
                    cols[c][val]=True
        # this backtracking just tells if we should move to the next cell or not
        self.backtrack(board,boxes,rows,cols,0,0)
    def backtrack(self,board,boxes,rows,cols,r,c):
        # base case. If I hit the last row or col, means all digits were correct so far
        if r>=len(board) or c>=len(board[0]):
            return True
        # situation when cell is empty, fill it with correct value
        if board[r][c]==".":
            for num in range(1,10):
                box_id=self.get_box_id(r,c)
                box=boxes[box_id]
                row=rows[r]
                col=cols[c]
                str_num=str(num)
                # check rows, cols and boxes make sure str_num is not used before
                if self.is_valid(box,col,row,str_num):
                    board[r][c]=str_num
                    boxes[box_id][str_num]=True
                    cols[c][str_num]=True
                    rows[r][str_num]=True
                    # if I am the last col and I placed the correct val, move to next row. So first col of the next row
                    if c==len(board)-1:
                        if self.backtrack(board,boxes,rows,cols,r+1,0):
                            return True
                    # if I am in col between 0-8, move to the col+1, in the same row
                    else:
                        if self.backtrack(board,boxes,rows,cols,r,c+1):
                            return True
                    # If I got a wrong value, then backtrack. So clear the state that you mutated
                    del box[str_num]
                    del row[str_num]
                    del col[str_num]
                    board[r][c]="."
        # if cell is not empty just call the next backtracking
        else:
            if c==len(board)-1:
                if self.backtrack(board,boxes,rows,cols,r+1,0):
                    return True
            else:
                if self.backtrack(board,boxes,rows,cols,r,c+1):
                    return True
        return False
    def is_valid(self,box,row,col,num):
        if num in box or num in row or num in col:
            return False
        else:
            return True
    # a helper to get the id of the 3x3 sub grid, given row and column
    def get_box_id(self,r,c):
        row=(r//3)*3
        col=c//3
        return row+col

Finding indices of first non-zero items in a list

I have the following list :
list_test = [0,0,0,1,0,2,5,4,0,0,5,5,3,0,0]
I would like to find the indices of all the first numbers in the list that are not equal to zero.
In this case the output should be:
output = [3,5,10]
Is there a Pythonic way to do this?
According to the output, I think you want the first index of continuous non-zero sequences.
As for Pythonic, I take that to mean a list comprehension, although this one is not very readable.
# works with starting with non-zero element.
# list_test = [1, 0, 0, 1, 0, 2, 5, 4, 0, 0, 5, 5, 3, 0, 0]
list_test = [0, 0, 0, 1, 0, 2, 5, 4, 0, 0, 5, 5, 3, 0, 0]
output = [i for i in range(len(list_test)) if list_test[i] != 0 and (i == 0 or list_test[i - 1] == 0)]
print(output)
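The same idea (an index starts a run when its value is non-zero and its predecessor is zero or missing) can also be written by pairing each element with the one before it. This is just a sketch; run_starts is a hypothetical name, not something from the question:

```python
def run_starts(values):
    """Return the index of the first element of each run of non-zero values."""
    # Pair each element with its predecessor; a leading 0 stands in for
    # "before the list" so a non-zero first element counts as a run start.
    previous = [0] + list(values[:-1])
    return [i for i, (prev, cur) in enumerate(zip(previous, values))
            if cur != 0 and prev == 0]

print(run_starts([0, 0, 0, 1, 0, 2, 5, 4, 0, 0, 5, 5, 3, 0, 0]))  # [3, 5, 10]
print(run_starts([1, 0, 0, 1]))  # [0, 3]
```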
There is also a numpy based solution:
import numpy as np
l = np.array([0,0,0,1,0,2,5,4,0,0,5,5,3,0,0])
non_zeros = np.where(l != 0)[0]
diff = np.diff(non_zeros)
np.append(non_zeros [0], non_zeros [1 + np.where(diff>=2)[0]]) # array([ 3, 5, 10], dtype=int64)
Explanation:
First, we find the positions of the non-zero elements, then we compute the differences between consecutive positions with np.diff. Because out[i] = a[i+1] - a[i], a gap found at index i of the diff means a new run starts at non_zeros[i + 1], hence the + 1 (read more about np.diff). Finally, we combine the first non-zero position with all the positions where the difference was greater than 1.
Note:
It will also work for the case where the array start with non-zero element or all non-zeros.
From the Link.
l = [0,0,0,1,0,2,5,4,0,0,5,5,3,0,0]
v = {}
for i, x in enumerate(l):
    if x != 0 and x not in v:
        v[x] = i
list_test = [0,0,0,1,0,2,5,4,0,0,5,5,3,0,0]
res = {}
for index, item in enumerate(list_test):
    if item > 0:
        res.setdefault(index, None)
print(res.keys())
I don't know what you mean by a Pythonic way, but this is an answer using a simple loop:
list_test = [0,0,0,1,0,2,5,4,0,0,5,5,3,0,0]
out = []
# A non-zero first element starts a run at index 0
if list_test[0] != 0:
    out.append(0)
for i in range(1, len(list_test)):
    if (list_test[i-1] == 0) and (list_test[i] != 0):
        out.append(i)
Don't hesitate to clarify what you mean by "Pythonic"!

Partition a number into a given set of numbers

Here is what I am trying to do. Given a number and a set of numbers, I want to partition that number into the numbers given in the set (with repetitions).
For example :
take the number 9, and the set of numbers = {1, 4, 9}.
It will yield the following partitions :
{ (1, 1, 1, 1, 1, 1, 1, 1, 1), (1, 1, 1, 1, 1, 4), (1, 4, 4), (9,)}
No other partitions of the number 9 can be formed using the set {1, 4, 9}.
I wrote a function in Python that does the task:
S = [ 1, 4, 9, 16 ]
def partition_nr_into_given_set_of_nrs(nr , S):
    lst = set()
    # Build the base case :
    M = [1]*(nr%S[0]) + [S[0]] * (nr //S[0])
    if set(M).difference(S) == 0 :
        lst.add(M)
    else :
        for x in S :
            for j in range(1, len(M)+1):
                for k in range(1, nr//x +1 ) :
                    if k*x == sum(M[:j]) :
                        lst.add( tuple(sorted([x]*k + M[j:])) )
    return lst
It works correctly but I want to see some opinions about it. I'm not satisfied about the fact that it uses 3 loops and I guess that it can be improved in a more elegant way. Maybe recursion is more suited in this case. Any suggestions or corrections would be appreciated. Thanks in advance.
I would solve this using a recursive function, starting with the largest number and recursively finding solutions for the remaining value, using smaller and smaller numbers.
def partition_nr_into_given_set_of_nrs(nr, S):
    nrs = sorted(S, reverse=True)
    def inner(n, i):
        if n == 0:
            yield []
        for k in range(i, len(nrs)):
            if nrs[k] <= n:
                for rest in inner(n - nrs[k], k):
                    yield [nrs[k]] + rest
    return list(inner(nr, 0))
S = [ 1, 4, 9, 16 ]
print(partition_nr_into_given_set_of_nrs(9, S))
# [[9], [4, 4, 1], [4, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]
Of course you could also do without the inner function by changing the parameters of the function and assuming that the list is already sorted in reverse order.
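A possible sketch of that variant, with the index passed as a parameter instead of being captured by an inner function. The name partitions_sketch is hypothetical, and the code assumes nrs holds positive numbers already sorted in descending order:

```python
def partitions_sketch(n, nrs, i=0):
    """Yield partitions of n using nrs, which must be sorted in descending order.

    i is the index of the largest number still allowed, so parts never increase.
    """
    if n == 0:
        yield []
        return
    for k in range(i, len(nrs)):
        if nrs[k] <= n:
            for rest in partitions_sketch(n - nrs[k], nrs, k):
                yield [nrs[k]] + rest

print(list(partitions_sketch(9, [16, 9, 4, 1])))
# [[9], [4, 4, 1], [4, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]
```

Because it is a generator, the caller can stop early or iterate lazily instead of building the full list.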
If you want to limit the number of parts for large numbers, you can add an additional parameter giving the number of elements still allowed, and only yield results while that number is greater than zero.
def partition_nr_into_given_set_of_nrs(nr, S, m=10):
    nrs = sorted(S, reverse=True)
    def inner(n, i, m):
        if m > 0:
            if n == 0:
                yield []
            for k in range(i, len(nrs)):
                if nrs[k] <= n:
                    for rest in inner(n - nrs[k], k, m - 1):
                        yield [nrs[k]] + rest
    return list(inner(nr, 0, m))
Here is a solution using itertools. It has only two for loops, but itertools.product(x, repeat=i) enumerates len(x)**i tuples for each length i, so the running time grows exponentially with n rather than as O(n*n).
A little pruning is applied first: any element greater than the required sum is removed from the list.
This assumes the required sum is 9, as in your example.
import itertools
x = [ 1, 4, 9, 16 ]
s = []
n = 9
# Remove elements > 9
x = [ i for i in x if i <= n]
for i in range(1, n + 1):
    for j in itertools.product(x, repeat = i):
        if sum(j) == n:
            s.append(list(j))
# Sort each combo
s = [sorted(i) for i in s]
# Group by unique combo
print(list(k for k, _ in itertools.groupby(s)))
Result
>>>
>>>
[[9], [1, 4, 4], [1, 1, 1, 1, 1, 4], [1, 1, 1, 1, 1, 1, 1, 1, 1]]
EDIT
You can further optimize speed (if needed) by stopping finding combo's after sum of product is > 9
e.g.
if sum(j) > n + 2:
break

Pythonic way to manipulate same dictionary

A very naive question.. I have the following function:
def vectorize(pos, neg):
    vec = {item_id:1 for item_id in pos}
    for item_id in neg:
        vec[item_id] = 0
    return vec
Example:
>>> print(vectorize([1, 2], [3, 200, 201, 202]))
{1: 1, 2: 1, 3: 0, 200: 0, 201: 0, 202: 0}
I feel, this is too verbose in python.. Is there a more pythonic way to do this...
Basically, I am returning a dictionary whose values are 1 if its in pos (list) and 0 otherwise?
I'm not particularly sure if this is more pythonic... Maybe a little bit more efficient? Dunno, really
pos = [1, 2, 3, 4]
neg = [5, 6, 7, 8]
def vectorize(pos, neg):
    vec = dict.fromkeys(pos, 1)
    vec.update(dict.fromkeys(neg, 0))
    return vec
print(vectorize(pos, neg))
Outputs:
{1: 1, 2: 1, 3: 1, 4: 1, 5: 0, 6: 0, 7: 0, 8: 0}
But I like your way too... Just giving an idea here.
I'd probably just do:
def vectorize(pos, neg):
    vec = {}
    vec.update((item, 1) for item in pos)
    vec.update((item, 0) for item in neg)
    return vec
But your code is fine as well.
You could use
vec = {item_id : 0 if item_id in neg else 1 for item_id in pos}
Note however that the lookup item_id in neg won't be efficient if neg is a list (as opposed to a set).
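For instance, the lookup cost can be avoided by building a set once before the comprehension. This is a sketch only; vectorize_sketch and neg_set are hypothetical names:

```python
def vectorize_sketch(pos, neg):
    # Build the set once: membership tests are O(1) on average,
    # instead of scanning the neg list for every item in pos.
    neg_set = set(neg)
    return {item_id: 0 if item_id in neg_set else 1 for item_id in pos}

print(vectorize_sketch([1, 2, 3], [3, 200]))  # {1: 1, 2: 1, 3: 0}
```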
Update: After seeing your expected output.
Note that the above does not insert 0s for items that are only in neg. If you want that too, the following one-liner could be used.
vec = dict([(item_id, 1) for item_id in pos] + [(item_id, 0) for item_id in neg])
If you want to avoid creating the two temporary lists, itertools.chain could help.
from itertools import chain
vec = dict(chain(((item_id, 1) for item_id in pos), ((item_id, 0) for item_id in neg)))
This would be Pythonic, in the sense of being relatively short and making maximum use of the language's features:
def vectorize(pos, neg):
    pos_set = set(pos)
    return {item_id: int(item_id in pos_set) for item_id in set(pos+neg)}
print(vectorize([1, 2], [3, 200, 201, 202]))

Algorithm to offset a list of data

Given a list of data as follows:
input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
I would like to create an algorithm that is able to offset the list of certain number of steps. For example, if the offset = -1:
def offsetFunc(inputList, offsetList):
    #make something
    return output
where:
output = [0,0,0,0,1,1,5,5,5,5,5,5,3,3,3,2,2]
Important Note: The elements of the list are float numbers and they are not in any progression. So I actually need to shift them, I cannot use any work-around for getting the result.
So basically, the algorithm should replace the first set of values (the 4 "1", basically) with the 0 and then it should:
Detect the length of the next range of values
Create a parallel output vectors with the values delayed by one set
The way I have roughly described the algorithm above is how I would do it. However I'm a newbie to Python (and even beginner in general programming) and I have figured out time by time that Python has a lot of built-in functions that could make the algorithm less heavy and iterating. Does anyone have any suggestion to better develop a script to make this kind of job? This is the code I have written so far (assuming a static offset at -1):
input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
output = []
PrevVal = 0
NextVal = input[0]
i = 0
while input[i] == NextVal:
    output.append(PrevVal)
    i += 1
while i < len(input):
    PrevVal = NextVal
    NextVal = input[i]
    while input[i] == NextVal:
        output.append(PrevVal)
        i += 1
        if i >= len(input):
            break
print(output)
Thanks in advance for any help!
BETTER DESCRIPTION
My list will always be composed of "sets" of values. They are usually float numbers, and they take values such as this short example below:
Sample = [1.236,1.236,1.236,1.236,1.863,1.863,1.863,1.863,1.863,1.863]
In this example, the first set (the one with value "1.236") has length 4 while the second one has length 6. What I would like to get as an output, when the offset = -1, is:
The value "0.000" in the first 4 elements;
The value "1.236" in the second 6 elements.
So basically, this "offset" function is creating the list with the same "structure" (ranges of lengths) but with the values delayed by "offset" times.
I hope it's clear now, unfortunately the problem itself is still a bit silly to me (plus I don't even speak good English :) )
Please don't hesitate to ask any additional info to complete the question and make it clearer.
How about this:
def generateOutput(input, value=0, offset=-1):
    values = []
    for i in range(len(input)):
        if i < 1 or input[i] == input[i-1]:
            yield value
        else:  # value change in input detected
            values.append(input[i-1])
            if len(values) >= -offset:
                value = values.pop(0)
            yield value
input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
print(list(generateOutput(input)))
It will print this:
[0, 0, 0, 0, 1, 1, 5, 5, 5, 5, 5, 5, 3, 3, 3, 2, 2]
And in case you just want to iterate, you do not even need to build the list. Just use for i in generateOutput(input): … then.
For other offsets, use this:
print list(generateOutput(input, 0, -2))
prints:
[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 5, 5, 5, 3, 3]
This uses a deque as the queue, with maxlen defining the shift length. The queue only holds the distinct run values: once the shift length has been reached, pushing a new value in at the end pushes the oldest value out at the start.
from collections import deque
def shift(it, shift=1):
    q = deque(maxlen=shift+1)
    q.append(0)
    for i in it:
        if q[-1] != i:
            q.append(i)
        yield q[0]
Sample = [1.236,1.236,1.236,1.236,1.863,1.863,1.863,1.863,1.863,1.863]
print(list(shift(Sample)))
#[0, 0, 0, 0, 1.236, 1.236, 1.236, 1.236, 1.236, 1.236]
My try:
#Input
input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
shift = -1
#Build service structures: for each 'set of data' store its length and its value
set_lengths = []
set_values = []
prev_value = None
set_length = 0
for value in input:
    if prev_value is not None and value != prev_value:
        set_lengths.append(set_length)
        set_values.append(prev_value)
        set_length = 0
    set_length += 1
    prev_value = value
else:
    set_lengths.append(set_length)
    set_values.append(prev_value)
#Output the result, shifting the values
output = []
for i, l in enumerate(set_lengths):
    j = i + shift
    if j < 0:
        output += [0] * l
    else:
        output += [set_values[j]] * l
print(input)
print(output)
gives:
[1, 1, 1, 1, 5, 5, 3, 3, 3, 3, 3, 3, 2, 2, 2, 5, 5]
[0, 0, 0, 0, 1, 1, 5, 5, 5, 5, 5, 5, 3, 3, 3, 2, 2]
def x(list, offset):
    return [el + offset for el in list]
A completely different approach than my first answer is this:
import itertools
First analyze the input:
values, amounts = zip(*((n, len(list(g))) for n, g in itertools.groupby(input)))
We now have (1, 5, 3, 2, 5) and (4, 2, 6, 3, 2). Now apply the offset:
values = (0,) * (-offset) + values # nevermind that it is longer now.
And synthesize it again:
output = sum([ [v] * a for v, a in zip(values, amounts) ], [])
This is way more elegant, way less understandable and probably way more expensive than my other answer, but I didn't want to hide it from you.
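Put together, the three steps above might look like the following sketch. The function name offset_runs and the fill parameter are hypothetical, itertools.chain is swapped in for sum(..., []) to avoid repeated list concatenation, and only zero or negative offsets are handled:

```python
from itertools import chain, groupby

def offset_runs(data, offset=-1, fill=0):
    """Delay run values by -offset runs, keeping each run's original length."""
    if not data:
        return []
    # Run-length encode: one (value, length) pair per run of equal values.
    values, amounts = zip(*((v, len(list(g))) for v, g in groupby(data)))
    # Prepend fill values; zip below drops the surplus values at the end.
    shifted = (fill,) * (-offset) + values
    return list(chain.from_iterable([v] * a for v, a in zip(shifted, amounts)))

data = [1, 1, 1, 1, 5, 5, 3, 3, 3, 3, 3, 3, 2, 2, 2, 5, 5]
print(offset_runs(data))
# [0, 0, 0, 0, 1, 1, 5, 5, 5, 5, 5, 5, 3, 3, 3, 2, 2]
print(offset_runs(data, offset=-2))
# [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 5, 5, 5, 3, 3]
```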
