Split integer into equal chunks - python

What is the most efficient and reliable way in Python to split sectors up like this:
number: 101 (may vary of course)
chunk1: 1 to 30
chunk2: 31 to 61
chunk3: 62 to 92
chunk4: 93 to 101
Flow:
copy sectors 1 to 30
skip sectors in chunk 1 and copy 30 sectors starting from sector 31.
and so on...
I have this solved in a "manual" way using modules and basic math but there's got to be a function for this?
Thank you.

I assume that you will have number in a list format. So, in this case if you want very specific format of cluster of number sequence and you know where it should separate then using indexing is the best way as it will have less time complexity. So,you can always create a small code and make it a function to use repeatedly. Something like below:
def sectors(num_seq,chunk_size=30):
...: import numpy as np
...: sectors = int(np.ceil(len(num_seq)/float(chunk_size))) #create number of sectors
...: for i in range(sectors):
...: if i < (sectors - 1):
...: print num_seq[(chunk_size*i):(chunk_size*(i+1))] #All will chunk equal size except the last one.
...: else:
...: print num_seq[(chunk_size*i):] #Takes rest at the end.
Now, every time you want similar thing you can reuse it and it is efficient as you are defining list index value instead of searching through it.
Here is the output:
x = range(1,101)
print sectors(x)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
[31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60]
[61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90]
[91, 92, 93, 94, 95, 96, 97, 98, 99, 100]
Please let me know if this meets your requirement.

Easy and fast(single iteration):
>>> input = range(1, 102)
>>> n = 30
>>> output = [input[i:i+n] for i in range(0, len(input), n)]
>>> output
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30], [31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60], [61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90], [91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101]]
Another very simple and comprehensive way:
>>> f = lambda x,y: [ x[i:i+y] for i in range(0,len(x),y)]
>>> f(range(1, 102), 30)
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30], [31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60], [61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90], [91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101]]

You can try using numpy.histogram if you're looking to spit a number into equal sized bins (sectors).
This will create an array of numbers, demarcating each bin boundary:
import numpy as np
number = 101
values = np.arange(number, dtype=int)
bins = np.histogram(values, bins='auto')
print(bins)

Related

Adding each value in an RDD to its partition number

Quite new to PySpark so this might be simple. I have an RDD that ranges from 1 to 100 and has 4 partitions.
A = sc.parallelize(range(100), 4)
And I have to find a way to return another RDD where each value in the RDD is added to its partition number. The ideal example would be:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20, 21, 22, 23, 24, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 52, 53, 54, 55, 56, 57, 58, 59,
60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98,
99, 100, 101, 102]
Would like to know how I could amend the following code to get the desired results.
A = sc.parallelize(range(100), 4)
B =
print(B.collect())

Generate random subsample of fixed size of numpy array

I have searched for this and couldn't find it. Imagine I have a numpy array of size N. Now I want to generate it's subsample of size M. Basically I want M randomly chosen elements from this array. N >= M. How can I do it ?
np.random.choice():
>>> N = 100; M = 10
>>> a = np.arange(0, N)
>>> np.random.choice(a, M, replace=False)
array([22, 81, 63, 7, 10, 52, 30, 33, 18, 41])
With replace=False you get no repetitions, and in that case M must be <= N.
Edit: 2d case:
>>> a = np.arange(0,120).reshape(10,12)
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[ 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
[ 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35],
[ 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47],
[ 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
[ 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71],
[ 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83],
[ 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95],
[ 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107],
[108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119]])
>>> idx = np.arange(0, 10)
>>> rand_idx = np.random.choice(idx, 5, replace=False)
>>> a[rand_idx]
array([[24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47],
[84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95],
[12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
[48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59]])

Set community index as a vertex attribute with IGraph Python

When I detect communities on a graph with Igraph in Python, I get a result like this:
print g.community_multilevel(return_levels=False)
Clustering with 100 elements and 4 clusters
[0] 16, 17, 18, 20, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
36, 37, 39, 40, 44
[1] 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 19, 38, 92, 94, 96,
97, 98, 99
[2] 42, 43, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 66, 67, 69
[3] 21, 41, 65, 68, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83,
84, 85, 86, 87, 88, 89, 90, 91, 93, 95
I'm adding the corresponding community number as an attribute to each vertex like this:
for v in g.vs():
c = 0
for i in g.community_multilevel(return_levels=False):
if v.index in i:
print v.index,i,c
v["group"] = c
c += 1
Is there a more elegant way to achieve this?
What you are doing is terribly inefficient because you are running the community detection algorithm for every single iteration of the outer loop even though its result should be the same no matter how many times you run it. A much simpler way to do it would be:
cl = g.community_multilevel(return_levels=False)
g.vs["group"] = cl.membership

Removing Elements from a range

i am a beginner and i would like to know how to remove the multiples of 11 and 4 from this the range. I would like to include all other numbers excluding 4, 11 and their variable. Is there a way of doing this without individual writing each code snippet?
for i in range(1,101):
print (2**i)-1
>>> [i for i in range(1,101) if i%4!=0 and i%11!=0]
[1, 2, 3, 5, 6, 7, 9,
10, 13, 14, 15, 17, 18, 19,
21, 23, 25, 26, 27, 29,
30, 31, 34, 35, 37, 38, 39,
41, 42, 43, 45, 46, 47, 49,
50, 51, 53, 54, 57, 58, 59,
61, 62, 63, 65, 67, 69,
70, 71, 73, 74, 75, 78, 79,
81, 82, 83, 85, 86, 87, 89,
90, 91, 93, 94, 95, 97, 98]

Percent list slicing

I'm using python 3.2.3 IDLE and this is my code:
originalList = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]
newList = orginalList[0.05:0.95] #<<<<I have no idea what I'm doing here
print (newList)
I have an original list of numbers, they are 1 - 100 and i want to make a new list from the original list however the new list must only have data that belongs to the sub-range 5%- 95% of the original list
so the new list must be like [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18....95]. How do i do that? i know my newList code is wrong
originalList.sort()
newList = originalList[int(len(originalList) * .05) : int(len(originalList) * .95)]
sl = slice(4, 95)
print(originalList[sl])
Also see http://docs.python.org/2/library/functions.html#slice
size = len(originalList)
newList = originalList[0.05*size - 1:0.95*size + 1]
If you want to get part of a list, the syntax is
List = [1,2,3,4,5,6,7,8,9,10]
newList = [*start index*:*Index to end AT*]
so, the first number is the index where the sub-list starts, while the second number is the index at which the sublist stops (that index is not included).
hope this helps!
I'd also use a list comprehension for creating the original list... less mistake prone.
originalList = range(1,101)
newList = originalList[(len(originalList)*.05)-1:len(originalList)*.95]
print newList
Gives the desired result...
Edit: Changed range to be more concise per comment below.
For lists of arbitrary length, you could do:
>>> l = range(200)
>>> percentage = 5
>>> skip = int(len(l) * (float(percentage) / 100) / 2)
>>> len(l[skip:-skip])
190
You could use the fidx module, which allows percentages as indexes:
import fidx
originalList = fidx([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100])
# or better: originalList = fidx.list(range(1,101))
newList = originalList[0.05:0.95]
print (newList)
which returns
[6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95]

Categories

Resources