I am trying to write a simple function to find if 0,0,1 occurs in a list, in that order.
It should return True or False.
The list can contain any number of numbers.
For the function ZeroZeroOne examples would be as follows:
>> ZeroZeroOne( [0,0,1] )
>> True
>> ZeroZeroOne( [1,0,0] )
>> False
# there are 2s in between but the following does have 0,0,1 occurring and in correct order
>> ZeroZeroOne( [0,2,2,2,2,0,1] )
>> True
I have this function:
def ZeroZeroOne(nums):
    FoundIt = False
    #quick return if defo not possible
    if (nums.count(0) < 2) and (nums.count(1) == 0):
        return FoundIt
    n = len(nums)
    for x in range(n-2):
        if nums[x] == 0:
            for i,z in enumerate(nums[(x+1):]):
                if z==0 and z!=1:
                    for j,q in enumerate(nums[(i+1):]):
                        if q==1 and q!=0:
                            FoundIt=True
    return FoundIt
Why does the function return True for this list [0, 1, 0, 2, 1]?
Moreover....
This function seems overly-complex for a seemingly simple problem.
Is there a correct approach to this problem in Python - a canonical or Pythonic approach?
Or is one's approach simply opinion-based?
You can trivially modify the ordered subsequence test from this answer for an elegant solution:
def ZeroZeroOne(arr):
    test = iter(a for a in arr if a in (0, 1))
    return all(z in test for z in (0, 0, 1))
I realize now that you don't want to accept 0, 1, 0, 1.
You can use itertools.tee to check for a match:
import itertools
def ZeroZeroOne(arr):
    e = itertools.tee((a for a in arr if a in (0, 1)), 3)
    # move second iterator forward one
    # (the None default avoids StopIteration on very short inputs)
    next(e[1], None)
    # move third iterator forward two
    next(e[2], None)
    next(e[2], None)
    return (0, 0, 1) in zip(*e)
The nice thing about using tee in this case is that it effectively maintains a rolling buffer of the last three elements for you. You don't need to make a new slice, loop over indices, or anything like that.
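To make the rolling-buffer idea concrete, here is a small sketch of what zip(*e) produces once the iterators have been staggered. The helper name windows is made up for this illustration and is not part of the answer above.

```python
import itertools

def windows(iterable, size=3):
    """Return overlapping tuples of `size` consecutive items (hypothetical helper)."""
    iterators = itertools.tee(iterable, size)
    # Stagger the copies: the n-th iterator skips its first n items.
    for n, it in enumerate(iterators):
        for _ in range(n):
            next(it, None)  # the default avoids StopIteration on short inputs
    return zip(*iterators)

filtered = [a for a in [0, 2, 0, 0, 3, 1] if a in (0, 1)]
print(list(windows(filtered)))  # [(0, 0, 0), (0, 0, 1)]
```

Each tuple is one position of the sliding window, so checking for (0, 0, 1) is a plain membership test.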
Just for fun, here's a more general solution in pure python. It accepts any iterable for arr and template:
import itertools
def contains_template(arr, template):
    template = tuple(template)
    unique = set(template)
    filtered = (a for a in arr if a in unique)
    e = itertools.tee(filtered, len(template))
    for n, it in enumerate(e):
        for _ in range(n):
            # the None default avoids StopIteration when the filtered input is short
            next(it, None)
    return template in zip(*e)
While itertools.tee is a nice way to maintain a rolling buffer, you can implement the same thing using a list (or more efficiently, collections.deque):
def contains_template(arr, template):
    template = list(template)
    unique = set(template)
    filtered = (a for a in arr if a in unique)
    buffer = []
    for e in filtered:
        # keep only the last len(template) relevant items
        buffer.append(e)
        if len(buffer) > len(template):
            buffer.pop(0)
        if template == buffer:
            return True
    return False
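Since collections.deque is mentioned above as the more efficient choice, here is a minimal sketch of that variant. The name contains_template_deque and the treatment of an empty template are assumptions made for this illustration.

```python
from collections import deque

def contains_template_deque(arr, template):
    template = tuple(template)
    if not template:
        return True  # assumption: an empty template matches trivially
    unique = set(template)
    # maxlen makes the deque drop its oldest item automatically on append
    window = deque(maxlen=len(template))
    for item in arr:
        if item not in unique:
            continue  # ignore values that are not part of the template
        window.append(item)
        if tuple(window) == template:
            return True
    return False

print(contains_template_deque([0, 2, 2, 2, 2, 0, 1], (0, 0, 1)))  # True
print(contains_template_deque([0, 1, 0, 2, 1], (0, 0, 1)))        # False
```

With maxlen set, appending is O(1) and no explicit pop(0) is needed, which is the main advantage over the list version.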
Finally, here is the really simple solution, without a rolling buffer:
def contains_template(arr, template):
    template = list(template)
    n = len(template)
    unique = set(template)
    filtered = [a for a in arr if a in unique]
    # + 1 so that the last possible window is checked as well
    return any(filtered[i:i + n] == template for i in range(len(filtered) - n + 1))
You can also do it with a recursive function :
def check(seq, liste, i=0, j=0):
    if i >= len(seq):
        return True
    if j >= len(liste):
        return False
    if seq[i] == liste[j]:
        return check(seq, liste, i + 1, j + 1)
    elif liste[j] in seq:
        # look for the last index you can restart from
        for k in range(i - 1, -1, -1):
            if seq[k] == liste[j]:
                if seq[:k] == seq[i - k:i]:
                    ind = k
                    break
        else:
            ind = 0
        return check(seq, liste, ind, j + (not i))
    else:
        return check(seq, liste, i, j + 1)
# seq = [0,0,1] for ZeroZeroOne
print(check([0, 0, 1], [0, 0, 0, 0, 1])) # True
print(check([0, 0, 1], [0, 200, 0, 0, 101, 1])) # True
print(check([0, 2, 2, 0, 1], [0, 2, 0, 4, 2, 5, 2, 0, 3, 1])) # True
print(check([0, 2, 2, 0, 1], [0, 2, 4, 2, 5, 2, 0, 3, 1])) # False
You can do this with a single loop, in O(n) time, because only this specific pattern has to be matched. Try the code below.
def ZeroZeroOne(nums):
    found_pattern = []
    for num in nums:
        if num == 1:
            found_pattern.append(1)
            if len(found_pattern) == 3:
                return True
            else:
                found_pattern = []
        elif num == 0 and len(found_pattern) < 2:
            found_pattern.append(0)
    return False
print(ZeroZeroOne([0, 0, 1]))
print(ZeroZeroOne([0, 1, 0, 2, 1]))
print(ZeroZeroOne([0, 2, 0, 1]))
print(ZeroZeroOne([0, 0, 0, 1]))
print(ZeroZeroOne([0, 2, 2, 2, 2, 0, 1]))
You can generalize this if required. For a generic approach, it is probably worth looking into how grep matches patterns and adapting that to your use case.
I think this does what you want :)
def ZeroZeroOne(arr):
    dropped = [x for x in arr if x == 0 or x == 1]
    slices = [dropped[i:i+3] for i in range(len(dropped) - 2)]
    return [0, 0, 1] in slices
def ZeroZeroOne(nums):
    filtered_nums = [x for x in nums if x in [0, 1]]
    return '*'.join([str(x) for x in [0, 0, 1]]) in '*'.join([str(x) for x in filtered_nums])
I have the following list :
list_test = [0,0,0,1,0,2,5,4,0,0,5,5,3,0,0]
I would like to find the indices of all the first numbers in the list that are not equal to zero.
In this case the output should be:
output = [3,5,10]
Is there a Pythonic way to do this?
According to the output, I think you want the first index of continuous non-zero sequences.
As for Pythonic, I take that to mean a list comprehension, although it is not very readable.
# works with starting with non-zero element.
# list_test = [1, 0, 0, 1, 0, 2, 5, 4, 0, 0, 5, 5, 3, 0, 0]
list_test = [0, 0, 0, 1, 0, 2, 5, 4, 0, 0, 5, 5, 3, 0, 0]
output = [i for i in range(len(list_test)) if list_test[i] != 0 and (i == 0 or list_test[i - 1] == 0)]
print(output)
There is also a numpy based solution:
import numpy as np
l = np.array([0,0,0,1,0,2,5,4,0,0,5,5,3,0,0])
non_zeros = np.where(l != 0)[0]
diff = np.diff(non_zeros)
np.append(non_zeros [0], non_zeros [1 + np.where(diff>=2)[0]]) # array([ 3, 5, 10], dtype=int64)
Explanation:
First, we find the positions of the non-zero elements. Then we compute the differences between consecutive positions with np.diff. Because out[i] = a[i+1] - a[i], we have to add 1 to the indices where the difference is at least 2 in order to get the position that starts each new run. Finally, we prepend the first non-zero position.
Note:
It also works when the array starts with a non-zero element or contains only non-zero elements.
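As a rough illustration of the steps described above, the same computation can be traced in plain Python without NumPy. The variable names are chosen for this sketch only.

```python
values = [0, 0, 0, 1, 0, 2, 5, 4, 0, 0, 5, 5, 3, 0, 0]

# Step 1: positions of all non-zero elements (the role of np.where(l != 0)[0]).
non_zeros = [i for i, v in enumerate(values) if v != 0]

# Step 2: gaps between consecutive positions (the role of np.diff).
gaps = [b - a for a, b in zip(non_zeros, non_zeros[1:])]

# Step 3: a gap of 2 or more means the later position starts a new run.
starts = non_zeros[:1] + [pos for pos, gap in zip(non_zeros[1:], gaps) if gap >= 2]

print(non_zeros)  # [3, 5, 6, 7, 10, 11, 12]
print(gaps)       # [2, 1, 1, 3, 1, 1]
print(starts)     # [3, 5, 10]
```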
From the Link.
l = [0,0,0,1,0,2,5,4,0,0,5,5,3,0,0]
v = {}
for i, x in enumerate(l):
    if x != 0 and x not in v:
        v[x] = i
list_test = [0,0,0,1,0,2,5,4,0,0,5,5,3,0,0]
res = {}
for index, item in enumerate(list_test):
    if item > 0:
        res.setdefault(index, None)
print(res.keys())
I don't know what you mean by a Pythonic way, but this is an answer using a simple loop:
list_test = [0,0,0,1,0,2,5,4,0,0,5,5,3,0,0]
out = []
# a non-zero first element starts a run at index 0
if list_test[0] != 0:
    out.append(0)
for i in range(1, len(list_test)):
    if (list_test[i-1] == 0) and (list_test[i] != 0):
        out.append(i)
Don't hesitate to clarify what you mean by "Pythonic"!
I have a series of functions that end up giving a list, with the first item containing a number, derived from the dictionaries, and the second and third items are dictionaries.
These dictionaries have been previously randomly generated.
The function I am using generates a given number of these dictionaries, trying to get the highest number possible as the first item. (It's designed to optimise dice rolls).
This all works fine, and I can print the value of the highest first item from all iterations. However, when I try and print the two dictionaries associated with this first number (bearing in mind they're all in a list together), it just seemingly randomly generates the two other dictionaries.
def repeat(type, times):
    best = 0
    for i in range(0, times):
        x = rollForCharacter(type)
        if x[0] > best:
            print("BEST:", x)
            best = x[0]
    print("The highest average success is", best)
    return best
This works great. The last thing shown is:
BEST: (3.58, [{'strength': 4, 'intelligence': 1, 'charisma': 1, 'stamina': 4, 'willpower': 2, 'dexterity': 2, 'wits': 5, 'luck': 2}, {'agility': 1, 'brawl': 2, 'investigation': 3, 'larceny': 0, 'melee': 1, 'survival': 0, 'alchemy': 3, 'archery': 0, 'crafting': 0, 'drive': 1, 'magic': 0, 'medicine': 0, 'commercial': 0, 'esteem': 5, 'instruction': 2, 'intimidation': 2, 'persuasion': 0, 'seduction': 0}])
The highest average success is 3.58
But if I try something to store the list which gave this number:
def repeat(type, times):
    best = 0
    bestChar = []
    for i in range(0, times):
        x = rollForCharacter(type)
        if x[0] > best:
            print("BEST:", x)
            best = x[0]
            bestChar = x
    print("The highest average success is", best)
    print("Therefore the best character is", bestChar)
    return best, bestChar
I get this as the last result, which is fine:
BEST: (4.15, [{'strength': 2, 'intelligence': 3, 'charisma': 4, 'stamina': 4, 'willpower': 1, 'dexterity': 2, 'wits': 4, 'luck': 1}, {'agility': 1, 'brawl': 0, 'investigation': 5, 'larceny': 0, 'melee': 0, 'survival': 0, 'alchemy': 7, 'archery': 0, 'crafting': 0, 'drive': 0, 'magic': 0, 'medicine': 0, 'commercial': 1, 'esteem': 0, 'instruction': 3, 'intimidation': 0, 'persuasion': 0, 'seduction': 0}])
The highest average success is 4.15
but the last line is
Therefore the best character is (4.15, [{'strength': 1, 'intelligence': 3, 'charisma': 4, 'stamina': 4, 'willpower': 1, 'dexterity': 2, 'wits': 2, 'luck': 3}, {'agility': 1, 'brawl': 0, 'investigation': 1, 'larceny': 4, 'melee': 2, 'survival': 0, 'alchemy': 2, 'archery': 4, 'crafting': 0, 'drive': 0, 'magic': 0, 'medicine': 0, 'commercial': 1, 'esteem': 0, 'instruction': 0, 'intimidation': 2, 'persuasion': 1, 'seduction': 0}])
As you can see this doesn't match with what I want, and what is printed literally right above it.
Through a little bit of checking, I realised what it gives out as the "Best Character" is just the last one generated, which is not the best, just the most recent. However, it isn't that simple, because the first element IS the highest result that was recorded, just not from the character in the rest of the list. This is really confusing because it means the list is somehow being edited but at no point can I see where that would happen.
Am I doing something stupid whereby the character is randomly generated every time? I wouldn't think so since x[0] gives the correct result and is stored fine, so what changes when it's the whole list?
The function rollForCharacter() returns rollResult, character, which is just the number followed by the two dictionaries.
I would greatly appreciate it if anyone could figure out and explain where I'm going wrong and why it can print the correct answer to the console yet not store it correctly a line below!
EDIT:
Dictionary 1 Code:
attributes = {}
def assignRow(row, p): # p is the number of points you have to assign to each row
    rowValues = {}
    for i in range(0, len(row)-1):
        val = randint(0, p)
        rowValues[row[i]] = val + 1
        p -= val
    rowValues[row[-1]] = p + 1
    return attributes.update(rowValues)
def getPoints():
    points = [7, 5, 3]
    shuffle(points)
    row1 = ['strength', 'intelligence', 'charisma']
    row2 = ['stamina', 'willpower']
    row3 = ['dexterity', 'wits', 'luck']
    for i in range(0, len(points)):
        row = eval("row" + str(i+1))
        assignRow(row, points[i])
Dictionary 2 Code:
skills = {}
def assignRow(row, p): # p is the number of points you have to assign to each row
    rowValues = {}
    for i in range(0, len(row) - 1):
        val = randint(0, p)
        rowValues[row[i]] = val
        p -= val
    rowValues[row[-1]] = p
    return skills.update(rowValues)
def getPoints():
    points = [11, 7, 4]
    shuffle(points)
    row1 = ['agility', 'brawl', 'investigation', 'larceny', 'melee', 'survival']
    row2 = ['alchemy', 'archery', 'crafting', 'drive', 'magic', 'medicine']
    row3 = ['commercial', 'esteem', 'instruction', 'intimidation', 'persuasion', 'seduction']
    for i in range(0, len(points)):
        row = eval("row" + str(i + 1))
        assignRow(row, points[i])
It does look like the dictionary is being regenerated. That can easily happen if rollForCharacter returns a generator, or if it writes to a global variable that a later iteration of the loop then overwrites. The code in your EDIT points to the latter: attributes and skills are module-level dictionaries that are updated in place on every roll.
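The effect can be reproduced in a few lines. This is only a sketch: roll_shared is a hypothetical stand-in for rollForCharacter, and it takes the rolled value as an argument so that the outcome is deterministic.

```python
shared = {}  # module-level dict, reused by every call (like `attributes`)

def roll_shared(value):
    """Hypothetical stand-in: mutates the global dict and returns it."""
    shared.update({"strength": value})
    return shared

first = roll_shared(5)   # pretend this was the best roll
second = roll_shared(2)  # a later, worse roll

print(first is second)    # True: both names refer to the same dict object
print(first["strength"])  # 2: the "saved" result was overwritten in place
```

The stored number stays correct because a float is immutable, while the stored dictionaries are just references to objects that keep changing.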
A simple-but-hacky way to solve the problem would be to take a deep copy of the dictionary at the time of storing, so that you're sure you're keeping the values at that point:
import copy
def repeat(type, times):
    best = 0
    bestChar = []
    for i in range(0, times):
        x = rollForCharacter(type)
        if x[0] > best:
            print("BEST:", x)
            best = x[0]
            # Create a brand new tuple, containing a deep copy of the current dicts
            # (x[1].copy() would only copy the outer list, not the dicts inside it)
            bestChar = (x[0], copy.deepcopy(x[1]))
    return best, bestChar
The correct answer would be however to pass a unique dictionary variable that is not affected by later code.
See this SO answer with a bit more context about how passing a reference to a dictionary can be risky as it's still a mutable object.
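One way to get a unique dictionary per roll is to build it inside the function and return it, instead of updating a module-level one. The following is a minimal sketch under that assumption; the names assign_row and make_skills, the shortened rows, and the fixed point values are illustrative only and not taken from the question.

```python
from random import randint

def assign_row(row, points):
    """Distribute `points` randomly across the names in `row`; return a new dict."""
    row_values = {}
    for name in row[:-1]:
        val = randint(0, points)
        row_values[name] = val
        points -= val
    row_values[row[-1]] = points  # the last name gets whatever is left
    return row_values

def make_skills():
    skills = {}  # created fresh on every call, so earlier results stay untouched
    skills.update(assign_row(['agility', 'brawl'], 4))
    skills.update(assign_row(['alchemy', 'archery'], 7))
    return skills

a = make_skills()
b = make_skills()
print(a is b)  # False: each call returns its own dictionary
```

With this structure, storing a result in bestChar is safe without any copying, because no later call can reach that dictionary.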
This is the code I have so far.
def find_duplicate_integers(arg):
    stats = {}
    for i in arg:
        if i > 1:
            if i in stats:
                stats[i] += 1
            else:
                stats[i] = 1
    return stats
This is the result I want
>>> find_duplicate_integers([1, 1, 3, 2, 3, 1, 0])
{1: 3, 3: 2}
But this is the result I get
>>> find_duplicate_integers([1, 1, 3, 2, 3, 1, 0])
{2: 1, 3: 2}
I apologize if this is due to a basic mistake, but I cannot figure out how to make this work. Any help would be greatly appreciated!
You can actually do that in one line:
def find_duplicate_integers(arg):
    return {i: arg.count(i) for i in set(arg) if arg.count(i) > 1}
If you care about runtime, there might be faster ways to do it, since arg.count scans the whole list for every distinct value.
EDIT:
If you need it to be really fast, you can do it like this:
from collections import defaultdict
from random import SystemRandom
from timeit import Timer
def find_duplicate_integers3(arg):
    d = defaultdict(int)
    for i in arg:
        d[i] += 1
    return {k: v for k, v in d.items() if v > 1}
rdev = SystemRandom()
numberList = [rdev.randint(0, 10 ** 3) for _ in range(1000)]
t1 = Timer(lambda: find_duplicate_integers3(numberList)) # Mine
t2 = Timer(lambda: find_duplicate_integers1(numberList)) # Goodies's
print(t1.timeit(number=1000)) # => 0.42611347176268827
print(t2.timeit(number=1000)) # => 1.0357027557108174
EDIT2:
As donkopotamus pointed out, there's an even better (and faster) way to do it: collections.Counter
If you want to know what is wrong with your code, look at the following change that I've made to your code.
def find_duplicate_integers(arg):
    stats = {}
    res = {}
    for i in arg:
        if i in stats:
            stats[i] += 1
            res[i] = stats[i]
        else:
            stats[i] = 1
    return res
print(find_duplicate_integers([1, 1, 3, 2, 3, 1, 0]))
First of all, you were not counting 1 (or 0) at all, because your condition only lets through integers greater than 1:
if i > 1
This condition is not required (as far as I understand your requirements).
Next, I created a separate dictionary, res, to store only those values that occur more than once.
I stress, this is not the best way to solve this problem. I'm just trying to point out what was wrong in your code that gave you the results you were getting.
You can do this very easily using collections.Counter in Python's standard library:
import collections
def find_duplicate_integers(arg):
    return {k: v for k, v in collections.Counter(arg).items() if v > 1}
Then
>>> find_duplicate_integers([1, 1, 3, 2, 3, 1, 0])
{1: 3, 3: 2}
Use groupby from itertools.
from itertools import groupby
def find_duplicate_integers(numberlist, minimum=1):
    repeats = dict([(a, sum(1 for _ in b)) for a, b in groupby(sorted(numberlist))])
    return dict((a, b) for a, b in repeats.items() if b > minimum)
print(find_duplicate_integers([1, 1, 3, 2, 3, 1, 0])) # => {1: 3, 3: 2}
print(find_duplicate_integers([1, 1, 3, 2, 3, 1, 0], minimum=3)) # => {1: 3}
Comparing with @caenyon's solution: my function is #1, his is #2.
from itertools import groupby
from random import SystemRandom
from timeit import Timer
rdev = SystemRandom()
numberList = [rdev.randint(0, 10**3) for _ in range(1000)]
t1 = Timer(lambda: find_duplicate_integers1(numberList)) # Mine
t2 = Timer(lambda: find_duplicate_integers2(numberList)) # caenyon's
print(t1.timeit(number=1000)) # => 0.7377041084807044
print(t2.timeit(number=1000)) # => 16.82846828367938
The count-based version (#2) gets progressively slower as the input size increases.
Given a list of data as follows:
input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
I would like to create an algorithm that is able to offset the list of certain number of steps. For example, if the offset = -1:
def offsetFunc(inputList, offsetList):
    #make something
    return output
where:
output = [0,0,0,0,1,1,5,5,5,5,5,5,3,3,3,2,2]
Important Note: The elements of the list are float numbers and they are not in any progression. So I actually need to shift them, I cannot use any work-around for getting the result.
So basically, the algorithm should replace the first set of values (the 4 "1", basically) with the 0 and then it should:
Detect the length of the next range of values
Create a parallel output vectors with the values delayed by one set
The way I have roughly described the algorithm above is how I would do it. However I'm a newbie to Python (and even beginner in general programming) and I have figured out time by time that Python has a lot of built-in functions that could make the algorithm less heavy and iterating. Does anyone have any suggestion to better develop a script to make this kind of job? This is the code I have written so far (assuming a static offset at -1):
input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
output = []
PrevVal = 0
NextVal = input[0]
i = 0
while input[i] == NextVal:
    output.append(PrevVal)
    i += 1
while i < len(input):
    PrevVal = NextVal
    NextVal = input[i]
    while input[i] == NextVal:
        output.append(PrevVal)
        i += 1
        if i >= len(input):
            break
print(output)
Thanks in advance for any help!
BETTER DESCRIPTION
My list will always be composed of "sets" of values. They are usually float numbers, and they take values such as this short example below:
Sample = [1.236,1.236,1.236,1.236,1.863,1.863,1.863,1.863,1.863,1.863]
In this example, the first set (the one with value "1.236") has length 4 while the second one has length 6. What I would like to get as an output, when the offset = -1, is:
The value "0.000" in the first 4 elements;
The value "1.236" in the second 6 elements.
So basically, this "offset" function is creating the list with the same "structure" (ranges of lengths) but with the values delayed by "offset" times.
I hope it's clear now, unfortunately the problem itself is still a bit silly to me (plus I don't even speak good English :) )
Please don't hesitate to ask any additional info to complete the question and make it clearer.
How about this:
def generateOutput(input, value=0, offset=-1):
    values = []
    for i in range(len(input)):
        if i < 1 or input[i] == input[i-1]:
            yield value
        else: # value change in input detected
            values.append(input[i-1])
            if len(values) >= -offset:
                value = values.pop(0)
            yield value
input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
print(list(generateOutput(input)))
It will print this:
[0, 0, 0, 0, 1, 1, 5, 5, 5, 5, 5, 5, 3, 3, 3, 2, 2]
And in case you just want to iterate, you do not even need to build the list. Just use for i in generateOutput(input): … then.
For other offsets, use this:
print(list(generateOutput(input, 0, -2)))
prints:
[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 5, 5, 5, 3, 3]
This uses a deque as the queue, with maxlen defining the shift length. It only holds the distinct run values: once the shift length has been reached, appending a new value at the end pushes the oldest value out at the start of the queue.
from collections import deque
def shift(it, shift=1):
    q = deque(maxlen=shift+1)
    q.append(0)
    for i in it:
        if q[-1] != i:
            q.append(i)
        yield q[0]
Sample = [1.236,1.236,1.236,1.236,1.863,1.863,1.863,1.863,1.863,1.863]
print(list(shift(Sample)))
#[0, 0, 0, 0, 1.236, 1.236, 1.236, 1.236, 1.236, 1.236]
My try:
#Input
input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
shift = -1
#Build service structures: for each 'set of data' store its length and its value
set_lengths = []
set_values = []
prev_value = None
set_length = 0
for value in input:
    if prev_value is not None and value != prev_value:
        set_lengths.append(set_length)
        set_values.append(prev_value)
        set_length = 0
    set_length += 1
    prev_value = value
else:
    # the loop has finished: store the last set as well
    set_lengths.append(set_length)
    set_values.append(prev_value)
#Output the result, shifting the values
output = []
for i, l in enumerate(set_lengths):
    j = i + shift
    if j < 0:
        output += [0] * l
    else:
        output += [set_values[j]] * l
print(input)
print(output)
gives:
[1, 1, 1, 1, 5, 5, 3, 3, 3, 3, 3, 3, 2, 2, 2, 5, 5]
[0, 0, 0, 0, 1, 1, 5, 5, 5, 5, 5, 5, 3, 3, 3, 2, 2]
def x(list, offset):
    return [el + offset for el in list]
A completely different approach than my first answer is this:
import itertools
First analyze the input:
values, amounts = zip(*((n, len(list(g))) for n, g in itertools.groupby(input)))
We now have (1, 5, 3, 2, 5) and (4, 2, 6, 3, 2). Now apply the offset:
values = (0,) * (-offset) + values # nevermind that it is longer now.
And synthesize it again:
output = sum([ [v] * a for v, a in zip(values, amounts) ], [])
This is way more elegant, way less understandable and probably way more expensive than my other answer, but I didn't want to hide it from you.
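Put together, the three steps above might look like the sketch below. The function name offset_runs and the fill parameter are assumptions for this illustration, and itertools.chain is swapped in for sum(..., []) because repeatedly adding lists is quadratic. It assumes offset is zero or negative, as in the question.

```python
import itertools

def offset_runs(data, offset=-1, fill=0):
    """Shift run values by `offset` runs (offset <= 0), keeping the run lengths."""
    # Analyze: one (value, length) pair per run of equal values.
    runs = [(value, len(list(group))) for value, group in itertools.groupby(data)]
    values = [value for value, _ in runs]
    lengths = [length for _, length in runs]
    # Apply the offset: zip below silently drops the surplus values at the end.
    shifted = [fill] * (-offset) + values
    # Synthesize: repeat each shifted value for the original run length.
    return list(itertools.chain.from_iterable(
        [v] * n for v, n in zip(shifted, lengths)))

data = [1, 1, 1, 1, 5, 5, 3, 3, 3, 3, 3, 3, 2, 2, 2, 5, 5]
print(offset_runs(data))
# [0, 0, 0, 0, 1, 1, 5, 5, 5, 5, 5, 5, 3, 3, 3, 2, 2]
```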