Array multiplication with an integer - Python

I have a 2D array "pop" in Python. I want to multiply each element of column one by an integer. I used the following code:
temp[i] = b*pop[i,0]+a*pop[i,1]
But it raises the error "list indices must be integers or slices, not tuple".

import numpy as np
from random import sample

rows = 20
a, b = 0.5, 1
pop = list(zip(sample(range(1, 100000), rows), sample(range(1, 100000), rows)))
profit = sample(range(1, 100000), rows)
#print(pop, profit)
mycombined = list(zip(pop, profit))
combined_array = np.asarray(mycombined, dtype=object)
print(combined_array)
m = len(combined_array)
it = 1500
#alpha = 0.01
J = 0
for i in range(1, m, 1):
    bpop = combined_array[i][0][0]
    apop = combined_array[i][0][1]
    aprofit = combined_array[i][1]
    temp = (bpop + apop - aprofit)**2
    J = J + temp
Now you have a list of lists and can use a list comprehension to change the values: pop contains two-element tuples, profit is a list of random numbers, and the result is a list of lists or tuples.
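To make the fix concrete: the error comes from indexing a plain list with a tuple (pop[i,0]). A minimal sketch of both workarounds, reusing the question's variable names:

```python
import numpy as np
from random import sample

rows = 20
a, b = 0.5, 1
pop = list(zip(sample(range(1, 100000), rows), sample(range(1, 100000), rows)))

# pop is a plain list, so pop[i, 0] raises TypeError; index the tuples instead
temp_list = [b * p[0] + a * p[1] for p in pop]

# or convert to a NumPy array, which does accept tuple indices and vectorizes the math
pop_arr = np.asarray(pop)
temp_arr = b * pop_arr[:, 0] + a * pop_arr[:, 1]

print(np.allclose(temp_list, temp_arr))  # True
```

The NumPy version avoids the Python-level loop entirely, which also matters once rows gets large.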

Iterate over a list with floats, and use them in math equations

I am trying to iterate over a list with floats and calculate with the items in the lists, but I always get this:
list indices must be integers or slices, not float
As you can see below, there are two lists, t and sdt, which have the same length and both contain floats:
for i in t:
    if t[i] == t[0] or t[1] or t[2] or t[3]:
        for i in t[0:4]:
            rp1x = r+h
            rp1y = sdt[i] - .5*(l-w)
            print(rp1x, rp1y)
You are trying to use the items from your list as indices; loop over index positions instead:
my_items = t[0:4]
for item in t:
    if item in my_items:
        for i in range(len(my_items)):
            rp1x = r + h
            rp1y = sdt[i] - .5*(l - w)
            print(rp1x, rp1y)

Efficient way for generating N arrays of random numbers between different ranges

I want to generate N arrays of fixed length n of random numbers with numpy, but arrays must have numbers varying between different ranges.
So for example, I want to generate N=100 arrays of size n=5 and each array must have its numbers between:
First number between 0 and 10
Second number between 20 and 100
and so on...
First idea that comes to my mind is doing something like:
first=np.random.randint(0,11, 100)
second=np.random.randint(20,101, 100)
...
And then I should nest them. Is there a more efficient way?
I would just put them inside another list and iterate over them by index:
from numpy.random import randint

array_holder = [[] for i in range(N)]  # Get N arrays in a holder
ab_holder = [[a1, b1], [a2, b2]]
for i in range(len(array_holder)):  # You iterate over each array
    a, b = ab_holder[i]
    for j in range(size):  # size is the amount of elements you want in each array
        array_holder[i].append(randint(a, b))  # Where a is your base and b ends the range
Another possibility. Setting ranges indicates both what the ranges of the individual parts of each arrays must be and how many there are. size is the number of values to sample in each individual part of an array. N is the size of the Monte-Carlo sample. arrays is the result.
import numpy as np

ranges = [(0, 10), (20, 100)]
size = 5
N = 100
arrays = []
for n in range(N):
    one_array = []
    for r in ranges:
        chunk = np.random.randint(*r, size=size)
        one_array.append(chunk)
    arrays.append(one_array)
It might make an appreciable difference to use numpy's append in place of Python's, but I've written it this way to make it easier to read (and to write :)).
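For completeness, newer NumPy can do this without any Python-level loop: Generator.integers broadcasts per-column low/high bounds against the requested shape. The bounds below are the ones from the question; the rest is a sketch of mine, not from the answers above:

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded for reproducibility
lows = [0, 20]                   # inclusive lower bound per column
highs = [11, 101]                # exclusive upper bound per column
N = 100

# one call: each column of the (N, 2) result is drawn from its own range
samples = rng.integers(low=lows, high=highs, size=(N, len(lows)))
print(samples.shape)  # (100, 2)
```

Extending to more ranges is just a matter of lengthening the lows/highs lists.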

Python - Random frequency for changing values in a list

I have a list that contains two lists, and I am looking to randomly select one of the values within each of these lists and then multiply it by 0.5.
For example, I receive a list like this:
[[-0.03680804604507722, 0.022112919584121357], [0.05806232738548797, -0.004015137642131433]]
What it sounds like you want to do is iterate through your list of lists, and at each list, randomly select an index, multiply the value at that index by 0.5 and place it back in the list.
import random

l = [[-0.03680804604507722, 0.022112919584121357], [0.05806232738548797, -0.004015137642131433]]
# for each sub list in the list
for sub_l in l:
    # select a random index between 0 and the number of elements in the sub list
    rand_index = random.randrange(len(sub_l))
    # multiply the value at that index by 0.5 and store it back in the sub list
    sub_l[rand_index] = sub_l[rand_index] * 0.5
You can use randint and the length of the list.
from random import randint

lst = [[-0.03680804604507722, 0.022112919584121357], [0.05806232738548797, -0.004015137642131433]]
for L in lst:
    L[randint(0, len(L) - 1)] *= 0.5

Randomly grow values in a NumPy Array

I have a program that takes some large NumPy arrays and, based on some outside data, grows them by adding one to randomly selected cells until the array's sum is equal to the outside data. A simplified and smaller version looks like:
import numpy as np
my_array = np.random.randint(0, 101, (100, 100))
## Just creating a sample version of the array, then getting its sum:
print(np.sum(my_array))
# 499097
So, supposing I want to grow the array until its sum is 1,000,000, and that I want to do so by repeatedly selecting a random cell and adding 1 to it until we hit that sum, I'm doing something like:
import random

diff = 1000000 - np.sum(my_array)
counter = 0
while counter < diff:
    row = random.randrange(0, 100)
    col = random.randrange(0, 100)
    coord = [row, col]
    my_array[coord] += 1
    counter += 1
Where row/col combine to return a random cell in the array, and then that cell is grown by 1. It repeats until the number of times by which it has added 1 to a random cell == the difference between the original array's sum and the target sum (1,000,000).
However, when I check the result after running this - the sum is always off. In this case after running it with the same numbers as above:
np.sum(my_array)
99667203
I can't figure out what is accounting for this massive difference. And is there a more pythonic way to go about this?
my_array[coord] does not do what you expect. It is selecting two whole rows and adding 1 to all of those entries. You could simply use my_array[row, col] instead.
You could simply write something like:
for _ in range(1000000 - np.sum(my_array)):
    my_array[random.randrange(0, 100), random.randrange(0, 100)] += 1
(or xrange instead of range if using Python 2.x)
Replace my_array[coord] with my_array[row][col]. Your method chose two random integers and added 1 to every entry in the rows corresponding to both integers.
Basically you had a minor misunderstanding of how numpy indexes arrays.
Edit: To make this clearer.
The code posted chose two numbers, say 30 and 45, and added 1 to all 100 entries of row 30 and all 100 entries of row 45.
From this you would expect the total sum to be 100,679,697 = 200*(1,000,000 - 499,097) + 499,097
However, when the random integers are identical (say, 45 and 45), only 1 is added to every entry of row 45, not 2, so in that case the sum only jumps by 100.
The problem with your original approach is that you are indexing your array with a list, which is interpreted as a sequence of indices into the row dimension, rather than as separate indices into the row/column dimensions (see here).
Try passing a tuple instead of a list:
coord = row, col
my_array[coord] += 1
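A tiny demo of the list-versus-tuple indexing difference, using hypothetical 3x3 arrays:

```python
import numpy as np

a = np.zeros((3, 3), dtype=int)
a[[1, 2]] += 1   # list index: fancy-indexes ROWS 1 and 2, touching 6 cells

b = np.zeros((3, 3), dtype=int)
b[(1, 2)] += 1   # tuple index: the single cell at row 1, column 2

print(a.sum(), b.sum())  # 6 1
```

This is exactly why the loop overshoots: each iteration adds roughly 200 to the sum instead of 1.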
A much faster approach would be to find the difference between the sum over the input array and the target value, then generate an array containing the same number of random indices into the array and increment them all in one go, thus avoiding looping in Python:
import numpy as np

def grow_to_target(A, target=1000000, inplace=False):
    if not inplace:
        A = A.copy()
    # how many times do we need to increment A?
    n = target - A.sum()
    # pick n random indices into the flattened array
    idx = np.random.randint(0, A.size, n)
    # how many times did we sample each unique index?
    uidx, counts = np.unique(idx, return_counts=True)
    # increment the array counts times at each unique index
    A.flat[uidx] += counts
    return A
For example:
a = np.zeros((100, 100), dtype=int)
b = grow_to_target(a)
print(b.sum())
# 1000000
%timeit grow_to_target(a)
# 10 loops, best of 3: 91.5 ms per loop
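The same unique/counts trick can also be written with np.add.at, which, unlike a fancy-indexed +=, accumulates repeated indices. This is a sketch of mine under the same setup, not part of the original answer:

```python
import numpy as np

A = np.zeros((100, 100), dtype=int)
n = 1000000 - A.sum()
idx = np.random.randint(0, A.size, n)

flat = A.ravel()           # a view of A (the array is contiguous), so edits propagate
np.add.at(flat, idx, 1)    # repeated indices each add 1, unlike flat[idx] += 1

print(A.sum())  # 1000000
```

np.add.at is slower than the unique/counts version for large n, but it states the intent directly.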

A faster way of comparing two lists of point-tuples?

I have two lists (which may or may not be the same length). In each list, are a series of tuples of two points (basically X, Y values).
I am comparing the two lists against each other to find two points with similar point values. I have tried list comprehension techniques, but it got really confusing with the nested tuples inside of the lists and I couldn't get it to work.
Is this the best (fastest) approach? I feel like there might be a more Pythonic way of doing it.
Say I have two lists:
pointPairA = [(2,1), (4,8)]
pointPairB = [(3,2), (10,2), (4,2)]
And then an empty list for storing the matched pairs, plus a tolerance value so that only pairs within tolerance are kept:
matchedPairs = []
tolerance = 2
And then this loop that unpacks the tuples, compares the difference, and adds them to the matchedPairs list to indicate a match.
for pairA in pointPairA:
    for pairB in pointPairB:
        ## Assign the current X,Y values for each pair
        pairA_x, pairA_y = pairA
        pairB_x, pairB_y = pairB
        ## Get the difference of each set of points
        xDiff = abs(pairA_x - pairB_x)
        yDiff = abs(pairA_y - pairB_y)
        if xDiff <= tolerance and yDiff <= tolerance:
            matchedPairs.append((pairA, pairB))
That would result in matchedPairs looking like this, with tuples of both point tuples inside:
[( (2,1), (3,2) ), ( (2,1), (4,2) )]
Here pointPairA is the single list and pointPairB would be one of the 20k lists.
from collections import defaultdict
from itertools import product

pointPairA = [(2,1), (4,8)]
pointPairB = [(3,2), (10,2), (4,2)]
tolerance = 2
dA = defaultdict(list)
tolrange = range(-tolerance, tolerance+1)
for pA, dx, dy in product(pointPairA, tolrange, tolrange):
    dA[pA[0]+dx, pA[1]+dy].append(pA)
# you would have a loop here through the 20k lists
matchedPairs = [(pA, pB) for pB in pointPairB for pA in dA[pB]]
print(matchedPairs)
If these lists are large, I would suggest finding a faster algorithm...
I would start by sorting both lists of pairs by the sum of the (x,y) in the pair. (Because two points can be close only if their sums are close.)
For any point in the first list, that will severely limit the range you need to search in the second list. Keep track of a "sliding window" on the second list, corresponding to the elements whose sums are within 2*tolerance of the sum of the current element of the first list. (Actually, you only need to keep track of the start of the sliding window...)
Assuming tolerance is reasonably small, this should convert your O(n^2) operation into O(n log n).
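A sketch of that sliding-window idea; the function name and structure are mine, not from the answer:

```python
def close_pairs(listA, listB, tolerance=2):
    # sort both lists by coordinate sum; two points can only be close
    # if their sums differ by at most 2*tolerance
    A = sorted(listA, key=lambda p: p[0] + p[1])
    B = sorted(listB, key=lambda p: p[0] + p[1])
    matched = []
    start = 0  # left edge of the sliding window into B
    for pa in A:
        s = pa[0] + pa[1]
        # drop candidates whose sum is already too small to ever match
        while start < len(B) and B[start][0] + B[start][1] < s - 2 * tolerance:
            start += 1
        # scan only while the sums stay within range
        i = start
        while i < len(B) and B[i][0] + B[i][1] <= s + 2 * tolerance:
            pb = B[i]
            if abs(pa[0] - pb[0]) <= tolerance and abs(pa[1] - pb[1]) <= tolerance:
                matched.append((pa, pb))
            i += 1
    return matched

print(close_pairs([(2, 1), (4, 8)], [(3, 2), (10, 2), (4, 2)]))
# [((2, 1), (3, 2)), ((2, 1), (4, 2))]
```

With small tolerances each window stays short, so the total work is dominated by the two sorts.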
With list comprehension:
[(pa, pb) for pa in pointPairA for pb in pointPairB
 if abs(pa[0]-pb[0]) <= tolerance and abs(pa[1]-pb[1]) <= tolerance]
Slightly faster than your loop (timed over 1 million executions):
>>> (list comprehension).timeit()
2.1963138580322266 s
>>> (your method).timeit()
2.454944133758545 s
