R's rep() feature in Python

I am relatively new to Python and still figuring things out. Is there an equivalent of R's rep command in Python that replicates an entire vector rather than each element? I used numpy.repeat, but it only repeats each element the given number of times. Is there a way to make it repeat the entire vector?
Example:
import numpy as np
y = np.repeat(np.arange(0, 2), 3)
print(y)
[0 0 0 1 1 1]
Expected output, using R's rep:
a<-c(0,1)
rep(a,3)
0 1 0 1 0 1

I'm no expert in R by any means but as far as I can tell, this is what you are looking for:
>>> np.tile([0, 1], 3)
array([0, 1, 0, 1, 0, 1])
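To make the difference concrete, here is a minimal sketch contrasting the two NumPy calls. The mapping to R's rep(a, 3) and rep(a, each = 3) is my reading of the question, and the variable names whole and each are just for illustration.

```python
import numpy as np

a = np.arange(0, 2)       # array([0, 1])

# np.tile repeats the whole vector, like R's rep(a, 3)
whole = np.tile(a, 3)

# np.repeat repeats each element in place, like R's rep(a, each = 3)
each = np.repeat(a, 3)

print(whole)  # [0 1 0 1 0 1]
print(each)   # [0 0 0 1 1 1]
```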

Your expected output is written in R, not Python, but if I translate it, you basically want something that transforms, say, [0,1,2]
into [0,1,2,0,1,2,0,1,2, ...] with any number of repetitions.
In Python you can simply multiply a list by a number to get that:
lst = [0,1]
lst2 = lst*3
print(lst2)
this will print [0, 1, 0, 1, 0, 1]

Straight from the docs: np.repeat repeats each element of the array the number of times specified in the argument.
Besides what has already been posted, you can use repeat and chain from itertools:
from itertools import repeat, chain
list(chain(*(repeat((1,2),3)))) # [1, 2, 1, 2, 1, 2]

Related

filtering numpy 2d array by dynamic idx list and dynamic val list (answer given in Q)

Seems like an easy problem, and a couple Q's out there close to this ... but I just can't seem to find the answer.
Say I have a numpy 2d array:
> arr = np.asarray(([1, 2, 1, 0, 3],[1,1,1,1,1],[2,2,2,2,2],[1,0,1,0,1]))
> arr
array([[1, 2, 1, 0, 3],
[1, 1, 1, 1, 1],
[2, 2, 2, 2, 2],
[1, 0, 1, 0, 1]])
And I want to use two criteria to get me that first and fourth row. The criteria being:
The 3rd column's value needs to be 1. The 4th needs to be 0.
When the filter is met -- it gets me the idx of rows that meet that criteria in response.
So I should be getting something like this when it's done right...
Response: [0,3]
...
[more info about my question]
I instinctively feel the syntax should be a one liner, something like this:
colIdx = [2,3]
vals = [1,0]
np.argwhere(arr[:,colIdx]==vals)[:,0] #<--- ie: but doesn't work
Being able to accommodate two variables like colIdx and vals would really work for me -- because I'll have dynamically created lists for both the columns being checked (ie: 2,3) and the values (ie: 1,0) I'm looking for.
I could have multiple columns being checked beyond just two as well. And the values aren't set either. Thus the need for a dynamic approach.
The closest Q+A I've seen so far on stack overflow for this sort of Q, is here: Numpy: Filtering rows by multiple conditions? -- but can't seem to work out the syntax for my problem.
ANSWERING MY OWN Q:
colIdx = [2,3]
vals = [1,0]
np.argwhere(np.all(arr[:,colIdx] == vals,axis=1)==True)[:,0]
Response:
array([0, 3])
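For what it's worth, the first attempt (without np.all) returns one index per matching element, so rows that match only one of the columns show up too. Below is a small sketch of the same idea wrapped in a function. The name rows_matching is made up for illustration, and np.flatnonzero is swapped in for argwhere(...)[:,0] purely as a matter of taste.

```python
import numpy as np

def rows_matching(arr, col_idx, vals):
    """Hypothetical helper: indices of rows where arr[row, col_idx] equals vals."""
    # Compare the selected columns against vals (broadcast across rows),
    # then require every selected column in a row to match.
    mask = np.all(arr[:, col_idx] == np.asarray(vals), axis=1)
    return np.flatnonzero(mask)

arr = np.asarray([[1, 2, 1, 0, 3],
                  [1, 1, 1, 1, 1],
                  [2, 2, 2, 2, 2],
                  [1, 0, 1, 0, 1]])

result = rows_matching(arr, [2, 3], [1, 0])
print(result)  # [0 3]
```

col_idx and vals can be any pair of equal-length lists, so this should cover the dynamic case described in the question.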
Normally you should give the code you tried and then we try to correct it, but I'm in a good mood, so here you go:
import numpy as np
arr = np.asarray(([1, 2, 1, 0, 3],[1,1,1,1,1],[2,2,2,2,2],[1,0,1,0,1]))
idx = [i for i in range(len(arr)) if arr[i,2]==1 and arr[i,3]==0]
print(idx)
In Python you can use enumerate() to iterate over pairs of index and element:
import numpy as np
arr = np.asarray(([1, 2, 1, 0, 3],[1,1,1,1,1],[2,2,2,2,2],[1,0,1,0,1]))
# avoid naming the result "list", which would shadow the built-in
idx_list = []
for idx, element in enumerate(arr):
    if element[2] == 1 and element[3] == 0:
        idx_list.append(idx)
print(idx_list)

How to get permutations in Python with no repeated cases

I'm doing some work with the Ising model. I've written a code to help me count the multiplicity of a lattice but I can't get up to any big numbers without getting a MemoryError.
The basic idea is, you have a list of zeros and ones, say [0,0,1,1]. I want to generate a set of all possible orderings of the ones and zeros. So in this example I want a set like this:
[(1,1,0,0),(1,0,1,0),(1,0,0,1),(0,1,1,0),(0,1,0,1),(0,0,1,1)]
At the moment I have done it like this:
import itertools
set_initial = [0,0,1,1]
set_intermediate = []
for subset in itertools.permutations(set_initial, 4):
    set_intermediate.append(subset)
set_final = list(set(set_intermediate))
The issue is that set_intermediate holds 4! = 24 elements for this example, only six of which are unique. To take another example, [0,0,0,0,0,0,0,0,1] gives 9! = 362880 elements, only 9 of which are unique.
Is there another way of doing this so that set_intermediate isn't such a bottleneck?
Instead of permutations, you can think in terms of selecting the positions of the 1s as combinations. (I knew I'd done something similar before..)
from itertools import combinations
def binary_perm(seq):
    # number of 1s to place
    n_on = sum(seq)
    # each combination is one choice of positions for the 1s
    for comb in combinations(range(len(seq)), n_on):
        out = [0]*len(seq)
        for loc in comb:
            out[loc] = 1
        yield out
Not super-speedy, but generates exactly the right number of outputs, and so can handle longer sequences:
>>> list(binary_perm([0,0,1,1]))
[[1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 1], [0, 1, 1, 0], [0, 1, 0, 1], [0, 0, 1, 1]]
>>> %timeit sum(1 for x in binary_perm([1]+[0]*10**4))
1 loops, best of 3: 409 ms per loop
Of course, usually you'd want to avoid looping over these in the first place, but depending on what you're doing with the permutations you might not be able to get away with simply calculating the number of unique permutations directly.
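If only the count is needed, it is just a binomial coefficient. A minimal sketch, assuming Python 3.8+ for math.comb; the function name multiplicity is hypothetical:

```python
from math import comb

def multiplicity(seq):
    """Number of distinct orderings of a 0/1 sequence: choose positions for the 1s."""
    n_on = sum(seq)
    return comb(len(seq), n_on)

print(multiplicity([0, 0, 1, 1]))   # 6
print(multiplicity([0] * 8 + [1]))  # 9
```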
Try the built-in function itertools.permutations(iterable, r).

Why count doesn't work for this list?(in python)

In the following code the result in "countOf1" is 0 instead of 12. What is the reason, and how can I solve it?
import numpy as np
import pandas as pd
x = np.matrix(np.arange(12).reshape((1, 12)))
x[:,:]=1
countOf1=(x.tolist()).count(1)
It's because when you convert the matrix with tolist() you get a nested list (a list inside a list). This is your x:
x.tolist()
Out[221]: [[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]
To get your countOf1 to work you'll need to do it for x.tolist()[0]. This will give you:
x.tolist()[0].count(1)
Out[223]: 12
Remember that a numpy matrix is always 2-D, like a list of lists. Even though you only created one row vector, numpy writes it with two brackets ([[0,1,2,3,4,...,11]]). So when you converted it with tolist(), you got a list within a list. The outer list's only element is the inner list, which is not equal to 1, so the count is 0.
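As an alternative, the count can be done in NumPy without going through lists at all. This is only a sketch: it swaps np.matrix for a plain 2-D array (the NumPy docs no longer recommend np.matrix), and count_of_1 is just a renamed countOf1.

```python
import numpy as np

# A plain 1x12 array of ones, standing in for the matrix in the question
x = np.ones((1, 12), dtype=int)

# Count the elements equal to 1, whatever the shape
count_of_1 = np.count_nonzero(x == 1)
print(count_of_1)  # 12

# List-based equivalent: flatten to 1-D first so count() sees the numbers
flat_count = x.ravel().tolist().count(1)
print(flat_count)  # 12
```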

Generating binary lists that sum to a given number

I am attempting Project Euler #15, which essentially reduces to computing the number of binary lists of length 2*size such that their entries sum to size, for the particular case size = 20. For example, if size = 2 there are 6 such lists: [1,1,0,0], [1,0,1,0], [1,0,0,1], [0,1,1,0], [0,1,0,1], [0,0,1,1]. Of course the number of such sequences is trivial to compute for any value of size and is equal to a binomial coefficient, but I am interested in explicitly generating the correct sequences in Python. I have tried the following:
import itertools
size = 20
binary_lists = itertools.product(range(2), repeat = 2*size)
lattice_paths = {lists for lists in binary_lists if sum(lists) == size}
but the last line makes me run into memory errors. What would be a neat way to accomplish this?
There are far too many for the case of size=20 to iterate over (even if we don't materialize them, 137846528820 is not a number we can loop over in a reasonable time), so it's not particularly useful.
But you can still do it using built-in tools by thinking of the positions of the 1s:
from itertools import combinations
def bsum(size):
    # choose which of the 2*size positions hold a 1
    for locs in combinations(range(2*size), size):
        vec = [0]*(2*size)
        for loc in locs:
            vec[loc] = 1
        yield vec
which gives
>>> list(bsum(1))
[[1, 0], [0, 1]]
>>> list(bsum(2))
[[1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 1], [0, 1, 1, 0], [0, 1, 0, 1], [0, 0, 1, 1]]
>>> sum(1 for x in bsum(12))
2704156
>>> factorial(24)//factorial(12)**2
2704156
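Because bsum is a generator, you can still peek at a few sequences for size = 20 without building them all, and get the total from the binomial coefficient. A small sketch, assuming Python 3.8+ for math.comb; bsum is repeated so the snippet runs on its own, and first_three and total are names I made up:

```python
from itertools import combinations, islice
from math import comb

def bsum(size):
    # Same generator as above: pick the positions of the 1s
    for locs in combinations(range(2 * size), size):
        vec = [0] * (2 * size)
        for loc in locs:
            vec[loc] = 1
        yield vec

# Take only the first three sequences; nothing else is generated
first_three = list(islice(bsum(20), 3))
print(len(first_three), len(first_three[0]), sum(first_three[0]))  # 3 40 20

# The total count, without generating anything
total = comb(40, 20)
print(total)  # 137846528820
```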
I'm not 100% sure of the math on this problem, but your last line consumes a generator and collects every match into a set, and with a size of 20 that set is massive. If you only want the count, just iterate and tally instead of storing the results; I don't think you can get a nice view of every combination.

Python - how to find numbers in a list which are not the minimum

I have a list S = [a[n],b[n],c[n]] and for n=0 the minimum of list S is the value 'a'. How do I select the values b and c given that I know the minimum? The code I'm writing runs through many iterations of n, and I want to examine the elements which are not the minimum for a given iteration in the loop.
Python 2.7.3, 32-bit. Numpy 1.6.2. Scipy 0.11.0b1
If you can turn the whole list into a numpy array, then use argsort; the first row of the argsort result tells you which list contains the minimum value at each position:
import numpy as np
a = [1,2,3,4]
b = [3,-4,5,8]
c = [6,1,-7,12]
S = [a,b,c]
S2 = np.array(S)
S2.argsort(axis=0)
array([[0, 1, 2, 0],
[1, 2, 0, 1],
[2, 0, 1, 2]])
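Building on that, the rows of the argsort result after the first one point at the non-minimum entries. A minimal sketch of pulling those values out; the names order, non_min_rows and non_min_vals are mine, and if the minimum is tied, argsort simply treats one of the tied entries as the minimum:

```python
import numpy as np

S2 = np.array([[1, 2, 3, 4],
               [3, -4, 5, 8],
               [6, 1, -7, 12]])

order = S2.argsort(axis=0)

# Row 0 of order indexes the minimum for each n; the remaining rows
# index the entries that are not the minimum.
non_min_rows = order[1:]

# Pair each row index with its column index to fetch the values
cols = np.arange(S2.shape[1])
non_min_vals = S2[non_min_rows, cols]

print(non_min_vals)
# [[ 3  1  3  8]
#  [ 6  2  5 12]]
```

Column n of non_min_vals holds the two values that are not the minimum at iteration n.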
Maybe you can do something like
S.sort()
S[1:3]
Is this what you want?
