I have a list of parameters such as this:
import numpy as np
param1 = np.arange(0., 1., 0.01)
param2 = np.arange(10., 8000., 100.)
...
I also have a function foo defined with a list of keyword arguments arg1, arg2, ... and their default values:
def foo(arg1=default1, arg2=default2, ...)
What I need to do is call this function, changing one of those default values (one by one) with the arguments and values from my list of parameters like so:
foo(arg1=param1[0])
foo(arg1=param1[1])
...
foo(arg2=param2[0])
foo(arg2=param2[1])
The best way that I thought of was to create a dictionary of all parameters, and then iterate over keys and values and create a new temporary dictionary out of it and then call the function:
all_params = {'arg1': param1, 'arg2': param2, ...}
for key, value_list in all_params.items():
    for value in value_list:
        tmp_dict = {key: value}
        foo(**tmp_dict)
But I have a feeling that 1) I'm iterating in a non-Pythonic way, and 2) there is obviously a much better way to solve this problem.
EDIT: streamlined the nested loops a bit according to @Sebastian's suggestion.
This is relatively simple in my opinion.
def foo(a=0, b=0, c=0):
    return a * b + c
args1 = [1, 2]
args2 = [3, 4, 5]
args3 = [6, 7]
args = [args1, args2, args3]
d = {}
for n, values in enumerate(args): # Enumerate through all of the parameters.
    for val in values: # For each parameter, iterate through all of the desired arguments.
        call_args = [0, 0, 0] # Start from the default arguments.
        call_args[n] = val # Insert the relevant argument into the correct parameter location.
        d[tuple(call_args)] = foo(*call_args) # Call the function and unpack all of the arguments.
# This dictionary holds the function arguments as keys and the returned values for those arguments as values.
>>> d
{(0, 0, 6): 6,
(0, 0, 7): 7,
(0, 3, 0): 0,
(0, 4, 0): 0,
(0, 5, 0): 0,
(1, 0, 0): 0,
(2, 0, 0): 0}
1) I'm iterating in a non-Pythonic way
"Pythonic" is subjective.
2) That there is obviously a much better way to solve this problem.
Not so. Given that you have to pass the arguments by keyword, and one at a time, what you're currently doing is essentially the only option.
As an improvement, you might consider passing all your arguments at the same time.
MCVE:
First, define your function and dictionary:
In [687]: def foo(a, b, c):
     ...:     print(a, b, c)
     ...:
In [688]: dict_ = {'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]}
Convert to dict of iters:
In [689]: dict_ = {k: iter(v) for k, v in dict_.items()}
Run your loop:
In [690]: while True:
     ...:     try:
     ...:         foo(**{k: next(v) for k, v in dict_.items()})
     ...:     except StopIteration:
     ...:         break
     ...:
1 4 7
2 5 8
3 6 9
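For comparison, the same lockstep calls can probably be written with zip instead of explicit iterators and StopIteration handling. This is only a sketch: foo here returns its arguments instead of printing them so the result can be inspected, and the names params and calls are made up for the example. It assumes Python 3.7+, where dictionaries keep insertion order.

```python
def foo(a, b, c):
    # Stand-in for the real function; returns its arguments for inspection.
    return (a, b, c)

params = {'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]}

calls = []
# zip(*params.values()) walks all value lists in lockstep and stops
# as soon as the shortest list is exhausted.
for values in zip(*params.values()):
    kwargs = dict(zip(params, values))  # pair each key with its current value
    calls.append(foo(**kwargs))

print(calls)  # [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
```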
You can simplify the iterating slightly, so that you don't need to access all_params[key] again, like this:
for key, param in all_params.items():
    for value in param:
        foo(**{key: value})
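If the return values are needed afterwards, one possible variation is to collect them in a dictionary keyed by (argument name, value). The foo below and its defaults are hypothetical stand-ins, not the function from the question.

```python
def foo(arg1=1.0, arg2=2.0):
    # Hypothetical stand-in for the real function.
    return arg1 + arg2

all_params = {
    'arg1': [0.0, 0.5],
    'arg2': [10.0, 20.0, 30.0],
}

# Vary one keyword at a time; every other argument keeps its default.
results = {
    (name, value): foo(**{name: value})
    for name, values in all_params.items()
    for value in values
}

print(results[('arg1', 0.5)])   # 2.5  (0.5 + default arg2 of 2.0)
print(results[('arg2', 10.0)])  # 11.0 (default arg1 of 1.0 + 10.0)
```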
I have a series of functions that end up giving a list: the first item is a number derived from the dictionaries, and the second and third items are the dictionaries themselves.
These dictionaries have been previously randomly generated.
The function I am using generates a given number of these dictionaries, trying to get the highest number possible as the first item. (It's designed to optimise dice rolls.)
This all works fine, and I can print the value of the highest first item from all iterations. However, when I try to print the two dictionaries associated with this first number (bearing in mind they're all in a list together), I seemingly get two other, randomly generated dictionaries.
def repeat(type, times):
    best = 0
    for i in range(0, times):
        x = rollForCharacter(type)
        if x[0] > best:
            print("BEST:", x)
            best = x[0]
    print("The highest average success is", best)
    return best
This works great. The last thing shown is:
BEST: (3.58, [{'strength': 4, 'intelligence': 1, 'charisma': 1, 'stamina': 4, 'willpower': 2, 'dexterity': 2, 'wits': 5, 'luck': 2}, {'agility': 1, 'brawl': 2, 'investigation': 3, 'larceny': 0, 'melee': 1, 'survival': 0, 'alchemy': 3, 'archery': 0, 'crafting': 0, 'drive': 1, 'magic': 0, 'medicine': 0, 'commercial': 0, 'esteem': 5, 'instruction': 2, 'intimidation': 2, 'persuasion': 0, 'seduction': 0}])
The highest average success is 3.58
But if I try something to store the list which gave this number:
def repeat(type, times):
    best = 0
    bestChar = []
    for i in range(0, times):
        x = rollForCharacter(type)
        if x[0] > best:
            print("BEST:", x)
            best = x[0]
            bestChar = x
    print("The highest average success is", best)
    print("Therefore the best character is", bestChar)
    return best, bestChar
I get this as the last result, which is fine:
BEST: (4.15, [{'strength': 2, 'intelligence': 3, 'charisma': 4, 'stamina': 4, 'willpower': 1, 'dexterity': 2, 'wits': 4, 'luck': 1}, {'agility': 1, 'brawl': 0, 'investigation': 5, 'larceny': 0, 'melee': 0, 'survival': 0, 'alchemy': 7, 'archery': 0, 'crafting': 0, 'drive': 0, 'magic': 0, 'medicine': 0, 'commercial': 1, 'esteem': 0, 'instruction': 3, 'intimidation': 0, 'persuasion': 0, 'seduction': 0}])
The highest average success is 4.15
but the last line is
Therefore the best character is (4.15, [{'strength': 1, 'intelligence': 3, 'charisma': 4, 'stamina': 4, 'willpower': 1, 'dexterity': 2, 'wits': 2, 'luck': 3}, {'agility': 1, 'brawl': 0, 'investigation': 1, 'larceny': 4, 'melee': 2, 'survival': 0, 'alchemy': 2, 'archery': 4, 'crafting': 0, 'drive': 0, 'magic': 0, 'medicine': 0, 'commercial': 1, 'esteem': 0, 'instruction': 0, 'intimidation': 2, 'persuasion': 1, 'seduction': 0}])
As you can see this doesn't match with what I want, and what is printed literally right above it.
Through a little bit of checking, I realised what it gives out as the "Best Character" is just the last one generated, which is not the best, just the most recent. However, it isn't that simple, because the first element IS the highest result that was recorded, just not from the character in the rest of the list. This is really confusing because it means the list is somehow being edited but at no point can I see where that would happen.
Am I doing something stupid whereby the character is randomly generated every time? I wouldn't think so since x[0] gives the correct result and is stored fine, so what changes when it's the whole list?
The function rollForCharacter() returns rollResult, character, which is just the number and then the two dictionaries.
I would greatly appreciate it if anyone could figure out and explain where I'm going wrong and why it can print the correct answer to the console yet not store it correctly a line below!
EDIT:
Dictionary 1 Code:
from random import randint, shuffle
attributes = {}
def assignRow(row, p): # p is the number of points you have to assign to each row
    rowValues = {}
    for i in range(0, len(row)-1):
        val = randint(0, p)
        rowValues[row[i]] = val + 1
        p -= val
    rowValues[row[-1]] = p + 1
    return attributes.update(rowValues)
def getPoints():
    points = [7, 5, 3]
    shuffle(points)
    row1 = ['strength', 'intelligence', 'charisma']
    row2 = ['stamina', 'willpower']
    row3 = ['dexterity', 'wits', 'luck']
    for i in range(0, len(points)):
        row = eval("row" + str(i+1))
        assignRow(row, points[i])
Dictionary 2 Code:
from random import randint, shuffle
skills = {}
def assignRow(row, p): # p is the number of points you have to assign to each row
    rowValues = {}
    for i in range(0, len(row) - 1):
        val = randint(0, p)
        rowValues[row[i]] = val
        p -= val
    rowValues[row[-1]] = p
    return skills.update(rowValues)
def getPoints():
    points = [11, 7, 4]
    shuffle(points)
    row1 = ['agility', 'brawl', 'investigation', 'larceny', 'melee', 'survival']
    row2 = ['alchemy', 'archery', 'crafting', 'drive', 'magic', 'medicine']
    row3 = ['commercial', 'esteem', 'instruction', 'intimidation', 'persuasion', 'seduction']
    for i in range(0, len(points)):
        row = eval("row" + str(i + 1))
        assignRow(row, points[i])
It does look like the dictionary is being regenerated. That could easily happen if rollForCharacter returns a generator, or if it updates a global variable that a subsequent cycle of the loop then overwrites.
A simple but hacky way to solve the problem is to take a deep copy of the result at the time of storing it, so that you're sure you keep the values as they were at that point:
import copy
def repeat(type, times):
    best = 0
    bestChar = []
    for i in range(0, times):
        x = rollForCharacter(type)
        if x[0] > best:
            print("BEST:", x)
            best = x[0]
            # Snapshot the whole result; deepcopy also copies the dicts nested inside the list
            bestChar = copy.deepcopy(x)
    return best, bestChar
The correct fix, however, would be for rollForCharacter to hand back a fresh dictionary each time, one that is not affected by later code.
See this SO answer with a bit more context about how passing a reference to a dictionary can be risky as it's still a mutable object.
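To make the aliasing concrete, here is a small sketch of what is probably going on. roll_for_character is a hypothetical stand-in that, like the posted assignRow, updates a module-level dictionary and returns a reference to it rather than a new object.

```python
import copy

attributes = {}  # module-level dict, shared by every call

def roll_for_character(score):
    # Hypothetical stand-in: mutates and returns the same global dict each time.
    attributes.update({'strength': score})
    return score, [attributes]

first = roll_for_character(4)
kept_alias = first                 # still points at the shared dict
kept_copy = copy.deepcopy(first)   # independent snapshot of the nested dict

roll_for_character(1)              # a later roll overwrites the shared dict

print(kept_alias)  # (4, [{'strength': 1}])  number is right, dict is from the last roll
print(kept_copy)   # (4, [{'strength': 4}])
```

The number survives because it is an immutable value stored in the tuple, while the dictionary inside the list is the same object that every later roll updates.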
Given a list of data as follows:
input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
I would like to create an algorithm that offsets the list by a certain number of steps. For example, if the offset = -1:
def offsetFunc(inputList, offsetList):
    #make something
    return output
where:
output = [0,0,0,0,1,1,5,5,5,5,5,5,3,3,3,2,2]
Important Note: The elements of the list are float numbers and they are not in any progression. So I actually need to shift them; I cannot use a workaround to get the result.
So basically, the algorithm should replace the first set of values (the four 1s) with 0, and then it should:
Detect the length of the next range of values
Create a parallel output vector with the values delayed by one set
The way I have roughly described the algorithm above is how I would do it. However, I'm a newbie to Python (and a beginner in programming in general), and I have gradually figured out that Python has a lot of built-in functions that could make the algorithm lighter and less iterative. Does anyone have a suggestion for a better way to write a script for this kind of job? This is the code I have written so far (assuming a static offset of -1):
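input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
output = []
PrevVal = 0
NextVal = input[0]
i = 0
# First run: emit the fill value (bounds check avoids an IndexError when all values are equal)
while i < len(input) and input[i] == NextVal:
    output.append(PrevVal)
    i += 1
# Remaining runs: emit the value of the previous run
while i < len(input):
    PrevVal = NextVal
    NextVal = input[i]
    while i < len(input) and input[i] == NextVal:
        output.append(PrevVal)
        i += 1
print(output)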
Thanks in advance for any help!
BETTER DESCRIPTION
My list will always be composed of "sets" of values. They are usually float numbers, and they take values such as this short example below:
Sample = [1.236,1.236,1.236,1.236,1.863,1.863,1.863,1.863,1.863,1.863]
In this example, the first set (the one with value "1.236") has length 4, while the second one has length 6. What I would like to get as output, when the offset = -1, is:
The value "0.000" in the first 4 elements;
The value "1.236" in the next 6 elements.
So basically, this "offset" function creates a list with the same "structure" (runs of the same lengths) but with the values delayed by "offset" runs.
I hope it's clear now. Unfortunately, the problem itself is still a bit fuzzy to me (plus I don't even speak good English :) )
Please don't hesitate to ask for any additional info to complete the question and make it clearer.
How about this:
def generateOutput(input, value=0, offset=-1):
    values = []
    for i in range(len(input)):
        if i < 1 or input[i] == input[i-1]:
            yield value
        else: # value change in input detected
            values.append(input[i-1])
            if len(values) >= -offset:
                value = values.pop(0)
            yield value
input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
print(list(generateOutput(input)))
It will print this:
[0, 0, 0, 0, 1, 1, 5, 5, 5, 5, 5, 5, 3, 3, 3, 2, 2]
And in case you just want to iterate, you do not even need to build the list. Just use for i in generateOutput(input): … then.
For other offsets, use this:
print(list(generateOutput(input, 0, -2)))
prints:
[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 5, 5, 5, 3, 3]
This uses a deque as the queue, with maxlen defining the shift length. It only holds distinct consecutive values: once the shift length has been reached, pushing a new value in at the end pushes the oldest value out at the start of the queue.
from collections import deque
def shift(it, shift=1):
    q = deque(maxlen=shift+1)
    q.append(0)
    for i in it:
        if q[-1] != i:
            q.append(i)
        yield q[0]
Sample = [1.236,1.236,1.236,1.236,1.863,1.863,1.863,1.863,1.863,1.863]
print(list(shift(Sample)))
#[0, 0, 0, 0, 1.236, 1.236, 1.236, 1.236, 1.236, 1.236]
My try:
#Input
input = [1,1,1,1,5,5,3,3,3,3,3,3,2,2,2,5,5]
shift = -1
#Build service structures: for each 'set of data' store its length and its value
set_lengths = []
set_values = []
prev_value = None
set_length = 0
for value in input:
    if prev_value is not None and value != prev_value:
        set_lengths.append(set_length)
        set_values.append(prev_value)
        set_length = 0
    set_length += 1
    prev_value = value
else:
    set_lengths.append(set_length)
    set_values.append(prev_value)
#Output the result, shifting the values
output = []
for i, l in enumerate(set_lengths):
    j = i + shift
    if j < 0:
        output += [0] * l
    else:
        output += [set_values[j]] * l
print(input)
print(output)
gives:
[1, 1, 1, 1, 5, 5, 3, 3, 3, 3, 3, 3, 2, 2, 2, 5, 5]
[0, 0, 0, 0, 1, 1, 5, 5, 5, 5, 5, 5, 3, 3, 3, 2, 2]
def x(list, offset):
    return [el + offset for el in list]
A completely different approach than my first answer is this:
import itertools
First analyze the input:
values, amounts = zip(*((n, len(list(g))) for n, g in itertools.groupby(input)))
We now have (1, 5, 3, 2, 5) and (4, 2, 6, 3, 2). Now apply the offset:
values = (0,) * (-offset) + values # never mind that it is longer now; zip() below stops at the shorter sequence.
And synthesize it again:
output = sum([ [v] * a for v, a in zip(values, amounts) ], [])
This is way more elegant, way less understandable and probably way more expensive than my other answer, but I didn't want to hide it from you.
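Put together as one function, the groupby approach might look like the sketch below. The name offset_runs and the fill parameter are made up for this example, it only handles offsets that are zero or negative, and it uses list.extend rather than sum(..., []) to avoid repeatedly copying the growing list.

```python
from itertools import groupby

def offset_runs(data, offset=-1, fill=0):
    """Delay run values by -offset runs (offset <= 0), keeping run lengths."""
    # Analyze: one (value, run_length) pair per run of equal consecutive values.
    runs = [(value, len(list(group))) for value, group in groupby(data)]
    # Shift: prepend -offset fill values in front of the run values.
    values = [fill] * (-offset) + [value for value, _ in runs]
    # Synthesize: zip stops at the shorter sequence, so surplus values are dropped.
    output = []
    for shifted, (_, length) in zip(values, runs):
        output.extend([shifted] * length)
    return output

data = [1, 1, 1, 1, 5, 5, 3, 3, 3, 3, 3, 3, 2, 2, 2, 5, 5]
print(offset_runs(data))
# [0, 0, 0, 0, 1, 1, 5, 5, 5, 5, 5, 5, 3, 3, 3, 2, 2]
```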
I'd like to clump a list of data based on a list of ranges, the idea being that I'd like to make a histogram of the end result. I know about collections.Counter but have not seen anyone use it, or another built-in, to generate clumps. I have written out the long form but am hoping someone can offer something more efficient.
def min_to_sec(val):
    ret_val = 60 * int(val)
    return ret_val
def hr_to_sec(val):
    ret_val = 3600 * int(val)
    return ret_val
def histogram(y_lst):
    x_lst = [ 10,
              20,
              30,
              40,
              50,
              60,
              90,
              min_to_sec(2),
              min_to_sec(3),
              min_to_sec(4),
              min_to_sec(5),
              min_to_sec(10),
              min_to_sec(15),
              min_to_sec(20),
            ]
    results = {}
    for y_val in y_lst:
        for x_val in x_lst:
            if y_val < x_val:
                results[ str(x_val) ] = results.get( str(x_val), 0) + 1
                break
        else:
            results['greater'] = results.get('greater', 0) + 1
    return results
Updated to include an example of desired sample output:
So if my x_lst and y_list are:
x_lst = [10,20,30,40]
y_lst = [1,2,3,15,22,27,40]
I'd like a return value similar to Counter, of:
{
10:3,
20:1,
30:2,
}
So while my code above works, it's a nested for loop and therefore quite slow, and I'm hoping there's a way to use something like collections.Counter to do this 'clumping' operation.
You could use collections.Counter to do this kind of counting of elements in a list:
In [1]: from collections import Counter
In [2]: Counter([1, 2, 10, 1, 2, 100])
Out[2]: Counter({1: 2, 2: 2, 100: 1, 10: 1})
You can increment a Counter more simply using:
results['foo'] += 1
In order to count only those before the inequality, you could use itertools.takewhile:
In [3]: from itertools import takewhile
In [4]: Counter(takewhile(lambda x: x < 10, [1, 2, 10, 1, 2, 100]))
Out[4]: Counter({1: 1, 2: 1})
However this won't keep track of those which have broken out of the takewhile.
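One way to keep track of every value, including those past the last boundary, might be to combine Counter with the standard bisect module, which finds each value's bucket by binary search instead of an inner loop. This is a sketch that assumes x_lst is sorted in ascending order; the function name clump is made up, and the keys are the boundary values themselves rather than strings.

```python
from bisect import bisect_right
from collections import Counter

def clump(values, edges):
    """Count each value under the first edge that is strictly greater than it."""
    counts = Counter()
    for value in values:
        # Index of the first edge greater than value (edges must be sorted).
        index = bisect_right(edges, value)
        key = edges[index] if index < len(edges) else 'greater'
        counts[key] += 1
    return counts

x_lst = [10, 20, 30, 40]
y_lst = [1, 2, 3, 15, 22, 27, 40]
counts = clump(y_lst, x_lst)
print(dict(counts))  # {10: 3, 20: 1, 30: 2, 'greater': 1}
```

Note that 40 lands in 'greater' because the comparison is strict (40 < 40 is false), which matches the nested-loop version in the question.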
Have you considered using pandas? You could put y_lst into a DataFrame and pretty easily make a histogram.
Assuming you have matplotlib and pylab imported...
import pandas as pd
data = pd.DataFrame([1, 2, 3, 15, 22, 27, 40])
data[0].hist(bins = 4)
That would give you the histogram you describe above. However, once the data is in a pandas DataFrame it's not too challenging to slice it up however you'd like.
All,
As you know, with a Python iterator we can call iter.next() to get the next item of data.
Take a list, for example:
l = [x for x in range(100)]
itl = iter(l)
itl.next() # 0
itl.next() # 1
Now I want a buffer that can store fixed-size slices of the data that a general iterator points to. I'll use the list iterator above to demonstrate my question.
class IterPage(iter, size):
# class code here
itp = IterPage(itl, 5)
What I want is:
print itp.first() # [0,1,2,3,4]
print itp.next() # [5,6,7,8,9]
print itp.prev() # [0,1,2,3,4]
len(itp) # 20 # 100 items / fixed size of 5 = 20
print itp.last() # [95,96,97,98,99]
for y in itp: # an iterator may not support "for" and len(), so code for those is needed here too
    print y
[0,1,2,3,4]
[5,6,7,8,9]
...
[95,96,97,98,99]
This is not homework, but as a Python beginner I know little about how to design an iterator class. Could someone show me how to code the class "IterPage" here?
Also, from the answers below I realised that if the raw data I want to slice is very big, for example an 8 GB text file or a database table with 10^100 records, it may not be possible to read all of it into a list, because I don't have that much physical memory. Take the snippet in the Python documentation, for example:
http://docs.python.org/library/sqlite3.html#
>>> c = conn.cursor()
>>> c.execute('select * from stocks order by price')
>>> for row in c:
... print row
...
(u'2006-01-05', u'BUY', u'RHAT', 100, 35.14)
(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
(u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
(u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
If we have about 10^100 records here, is it possible to store only the lines/records I want with this class, using itp = IterPage(c, 5)? If I invoke itp.next(), can itp fetch just the next 5 records from the database?
Thanks!
PS: I found an approach at the link below:
http://code.activestate.com/recipes/577196-windowing-an-iterable-with-itertools/
and I also found that someone proposed an itertools.iwindow() function, but it was rejected:
http://mail.python.org/pipermail/python-dev/2006-May/065304.html
Since you asked about design, I'll write a bit about what you want: it's not an iterator.
The defining property of an iterator is that it only supports iteration, not random access. But methods like .first and .last do random access, so what you ask for is not an iterator.
There are of course containers that allow this. They are called sequences, and the simplest of them is the list. Its .first method is written as [0] and its .last is [-1].
So here is such an object, one that slices a given sequence. It stores a list of slice objects, which is what Python uses to slice out parts of a list. The methods that a class must implement to be a sequence are given by the abstract base class Sequence. It's nice to inherit from it because it throws errors if you forget to implement a required method.
from collections import Sequence # Python 3: from collections.abc import Sequence
class SlicedList(Sequence):
    def __init__(self, iterable, size):
        self.seq = list(iterable)
        self.slices = [slice(i,i+size) for i in range(0,len(self.seq), size)]
    def __contains__(self, item):
        # checks if a item is in this sequence
        return item in self.seq
    def __iter__(self):
        """ iterates over all slices """
        return (self.seq[slice] for slice in self.slices)
    def __len__(self):
        """ implements len( .. ) """
        return len(self.slices)
    def __getitem__(self, n):
        # two forms of getitem ..
        if isinstance(n, slice):
            # implements sliced[a:b]
            return [self.seq[x] for x in self.slices[n]]
        else:
            # implements sliced[a]
            return self.seq[self.slices[n]]
s = SlicedList(range(100), 5)
# length
print len(s) # 20
#iteration
print list(s) # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], ... , [95, 96, 97, 98, 99]]
# explicit iteration:
it = iter(s)
print next(it) # [0, 1, 2, 3, 4]
# we can slice it too
print s[0], s[-1] # [0, 1, 2, 3, 4] [95, 96, 97, 98, 99]
# get the first two
print s[0:2] # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
# every other item
print s[::2] # [[0, 1, 2, 3, 4], [10, 11, 12, 13, 14], [20, 21, 22, 23, 24], ... ]
Now if you really want methods like .start (what for anyways, just a verbose way for [0] ) you can write a class like this:
class Navigator(object):
    def __init__(self, seq):
        self.c = 0
        self.seq = seq
    def next(self):
        self.c +=1
        return self.seq[self.c]
    def prev(self):
        self.c -=1
        return self.seq[self.c]
    def start(self):
        self.c = 0
        return self.seq[self.c]
    def end(self):
        self.c = len(self.seq)-1
        return self.seq[self.c]
n = Navigator(SlicedList(range(100), 5))
print n.start(), n.next(), n.prev(), n.end()
The raw data that I want to slice is
very big, for example a 8Giga text
file... I may not be able to read all of
them into a list - I do not have so much
physical memory. In that case, is it
possible only get line/records I want
by this class?
No. As it stands, the class originally proposed below converts the iterator into a
list, which makes it 100% useless for your situation.
Just use the grouper idiom (also mentioned below).
You'll have to be smart about remembering previous groups.
To save on memory, only store those previous groups that you need.
For example, if you only need the most recent previous group, you could store that in
a single variable, previous_group.
If you need the 5 most recent previous groups, you could use a collections.deque with a maximum size of 5.
Or, you could use the window idiom to get a sliding window of n groups of groups...
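As a rough Python 3 sketch of that idea (the session further down is Python 2), pages can be read lazily with itertools.islice while a bounded deque remembers only the most recent ones. The names pages and history are made up for this example. Unlike the izip-based grouper, this version keeps a final partial page instead of dropping it.

```python
from collections import deque
from itertools import islice

def pages(iterable, size):
    """Yield successive lists of up to `size` items, reading lazily."""
    iterator = iter(iterable)
    while True:
        page = list(islice(iterator, size))  # pulls at most `size` items
        if not page:
            return
        yield page

history = deque(maxlen=2)  # remember only the two most recent pages
for page in pages(range(12), 5):
    history.append(page)   # the oldest page is discarded automatically

print(list(history))  # [[5, 6, 7, 8, 9], [10, 11]]
```

Because pages only ever holds one page at a time, the same loop should work over a file object or a database cursor without loading everything into memory.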
Given what you've told us so far, I would not define a class for this, because I don't see many reusable elements to the solution.
Mainly, what you want can be done with the grouper idiom:
In [22]: l = xrange(100)
In [23]: itl=iter(l)
In [24]: import itertools
In [25]: for y in itertools.izip(*[itl]*5):
   ....:     print(y)
(0, 1, 2, 3, 4)
(5, 6, 7, 8, 9)
(10, 11, 12, 13, 14)
...
(95, 96, 97, 98, 99)
Calling next is no problem:
In [28]: l = xrange(100)
In [29]: itl=itertools.izip(*[iter(l)]*5)
In [30]: next(itl)
Out[30]: (0, 1, 2, 3, 4)
In [31]: next(itl)
Out[31]: (5, 6, 7, 8, 9)
But making a previous method is a big problem, because iterators don't work this way. Iterators are meant to produce values without remembering past values.
If you need all past values, then you need a list, not an iterator:
In [32]: l = xrange(100)
In [33]: ll=list(itertools.izip(*[iter(l)]*5))
In [34]: ll[0]
Out[34]: (0, 1, 2, 3, 4)
In [35]: ll[1]
Out[35]: (5, 6, 7, 8, 9)
# Get the last group
In [36]: ll[-1]
Out[36]: (95, 96, 97, 98, 99)
Now getting the previous group is just a matter of keeping track of the list index.