Set community index as a vertex attribute with IGraph Python - python

When I detect communities on a graph with Igraph in Python, I get a result like this:
print g.community_multilevel(return_levels=False)
Clustering with 100 elements and 4 clusters
[0] 16, 17, 18, 20, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
36, 37, 39, 40, 44
[1] 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 19, 38, 92, 94, 96,
97, 98, 99
[2] 42, 43, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 66, 67, 69
[3] 21, 41, 65, 68, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83,
84, 85, 86, 87, 88, 89, 90, 91, 93, 95
I'm adding the corresponding community number as an attribute to each vertex like this:
for v in g.vs():
    c = 0
    for i in g.community_multilevel(return_levels=False):
        if v.index in i:
            print v.index,i,c
            v["group"] = c
        c += 1
Is there a more elegant way to achieve this?

What you are doing is terribly inefficient because you are running the community detection algorithm for every single iteration of the outer loop even though its result should be the same no matter how many times you run it. A much simpler way to do it would be:
cl = g.community_multilevel(return_levels=False)
g.vs["group"] = cl.membership
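To make it clearer what the membership list contains, here is a minimal pure-Python sketch that builds the same kind of list (one cluster index per vertex) from a list of clusters in a single pass. The helper name membership_from_clusters and the toy clusters are assumptions for illustration only; with igraph you would simply read cl.membership.

```python
def membership_from_clusters(clusters, n_vertices):
    """Build a per-vertex list of cluster indices from a list of clusters."""
    membership = [None] * n_vertices
    for cluster_index, members in enumerate(clusters):
        for vertex in members:
            # Each vertex is visited once, so no nested search is needed.
            membership[vertex] = cluster_index
    return membership


# Toy stand-in for a clustering: cluster 0 holds vertices 2 and 3, cluster 1 the rest.
clusters = [[2, 3], [0, 1, 4]]
print(membership_from_clusters(clusters, 5))  # [1, 1, 0, 0, 1]
```

Assigning such a list to g.vs["group"] sets the attribute on every vertex at once, which is why the loop over vertices is not needed.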

Related

Project Euler problem 11 in Python

nums = [8, 2, 22, 97, 38, 15, 00, 40, 00, 75, 4, 5, 7, 78, 52, 12, 50, 77, 91, 8,
49, 49, 99, 40, 17, 81, 18, 57, 60, 87, 17, 40, 98, 43, 69, 48, 4, 56, 62, 00,
81, 49, 31, 73, 55, 79, 14, 29, 93, 71, 40, 67, 53, 88, 30, 3, 49, 13, 36, 65,
52, 70, 95, 23, 4, 60, 11, 42, 69, 24, 68, 56, 1, 32, 56, 71, 37, 2, 36, 91,
22, 31, 16, 71, 51, 67, 63, 89, 41, 92, 36, 54, 22, 40, 40, 28, 66, 33, 13, 80,
24, 47, 32, 60, 99, 3, 45, 2, 44, 75, 33, 53, 78, 36, 84, 20, 35, 17, 12, 50,
32, 98, 81, 28, 64, 23, 67, 10, 26, 38, 40, 67, 59, 54, 70, 66, 18, 38, 64, 70,
67, 26, 20, 68, 2, 62, 12, 20, 95, 63, 94, 39, 63, 8, 40, 91, 66, 49, 94, 21,
24, 55, 58, 5, 66, 73, 99, 26, 97, 17, 78, 78, 96, 83, 14, 88, 34, 89, 63, 72,
21, 36, 23, 9, 75, 00, 76, 44, 20, 45, 35, 14, 00, 61, 33, 97, 34, 31, 33, 95,
78, 17, 53, 28, 22, 75, 31, 67, 15, 94, 3, 80, 4, 62, 16, 14, 9, 53, 56, 92,
16, 39, 5, 42, 96, 35, 31, 47, 55, 58, 88, 24, 00, 17, 54, 24, 36, 29, 85, 57,
86, 56, 00, 48, 35, 71, 89, 7, 5, 44, 44, 37, 44, 60, 21, 58, 51, 54, 17, 58,
19, 80, 81, 68, 5, 94, 47, 69, 28, 73, 92, 13, 86, 52, 17, 77, 4, 89, 55, 40,
4, 52, 8, 83, 97, 35, 99, 16, 7, 97, 57, 32, 16, 26, 26, 79, 33, 27, 98, 66,
88, 36, 68, 87, 57, 62, 20, 72, 3, 46, 33, 67, 46, 55, 12, 32, 63, 93, 53, 69,
4, 42, 16, 73, 38, 25, 39, 11, 24, 94, 72, 18, 8, 46, 29, 32, 40, 62, 76, 36,
20, 69, 36, 41, 72, 30, 23, 88, 34, 62, 99, 69, 82, 67, 59, 85, 74, 4, 36, 16,
20, 73, 35, 29, 78, 31, 90, 1, 74, 31, 49, 71, 48, 86, 81, 16, 23, 57, 5, 54,
1, 70, 54, 71, 83, 51, 54, 69, 16, 92, 33, 48, 61, 43, 52, 1, 89, 19, 67, 48]
def row(n): #finds the row of the numbers
    number_row = n//20
    return number_row
def hor_mult(n):
    hor_final = 1
    num = 1
    for i in range(4):
        if n < 17+20*row(n): #finds if the number is 4 digit away from the end of the row
            num *= nums[n+i]
            if num > hor_final: #if the number is higher than final prints number
                print(num)
                hor_final = num
            else:
                hor_final = hor_final #else the final num stays the same
        else:
            return hor_final #
for n in range(400):
    print(hor_mult(n))
I am trying to find the largest product of 4 adjacent (back-to-back) numbers, but my code prints every product of 4 back-to-back numbers.
The first part of the code (def row) finds the row of the 4 numbers, because all four numbers must be on the same row.
In the second part (hor_mult) I tried to find the largest product of 4 back-to-back numbers in nums.
There are these issues:
if n < 17+20*row(n): is a condition that does not depend on the loop, so it should not appear inside the loop. It is also a quite complex way to say that the column index should be less than 17, so why not write a function col instead of row? You can use the % operator for that.
The check if num > hor_final should not be made while the product of four isn't completed yet, as there might still be a 0 to be included, making the product less than what it currently is. So this check should not be in the loop, but appear after it. Moreover, you want to compare products of multiple calls of hor_mult, so this check shouldn't be inside that function, nor should hor_final be a local name inside that function. The result is only final when the loop in the main code (over 400) has finished, so hor_final should be defined there.
print(num): the function shouldn't print anything: it cannot know by itself whether the maximum was achieved as that depends on other calls of hor_mult. Printing is not a job for this function. This function's job should be just to return a product of four. It is for the caller to decide whether it is great enough and to print. That printing can only happen when all products have been calculated -- not before.
else: return hor_final: no, you shouldn't return a partial product, not even 1. As indicated earlier, the corresponding if condition should be outside the loop, and its else case should return 0 (the least possible product when input is non-negative), not hor_final.
In the main program loop, there are 400 calls of print. It should be clear that this is wrong. You want to execute print only once. The loop should serve to find out which returned value is the greatest. That's the purpose of the loop. After the loop you should print, and only then.
Here is how the code could be fixed:
def col(n):
    return n % 20
def hor_mult(n):
    if col(n) < 17: # Only bother looping when there is room for 4 values
        num = 1
        for value in nums[n: n+4]: # pythonic way to get those 4 values
            num *= value
        return num
    else:
        return 0 # When not enough values to make the product of 4.
# pythonic way to make those 400 calls and get the maximum
hor_final = max(map(hor_mult, range(400)))
print(hor_final) # only print when you have full information
Note that the Project Euler challenge asks for more than just this. You also need to check the products in the other directions, which will be more challenging. To be really honest with you, seeing the problems in your attempt, I think Project Euler challenges are going to be too difficult at this stage, and I would advise you to first practice on simpler challenges.
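As a rough sketch of how the other directions could be handled on the same kind of flat list, each direction can be expressed as a stride: 1 for horizontal, width for vertical, width + 1 and width - 1 for the two diagonals. The function name max_product, its parameters, and the small 4x4 test grids are assumptions made for illustration, and starting from 0 assumes all values are non-negative, as in the answer above.

```python
from math import prod  # available from Python 3.8


def max_product(nums, width, run=4):
    """Largest product of `run` adjacent values in a flat, row-major grid."""
    height = len(nums) // width
    best = 0  # assumes non-negative values, so 0 is a safe lower bound
    for n in range(len(nums)):
        r, c = divmod(n, width)
        room_right = c <= width - run
        room_down = r <= height - run
        room_left = c >= run - 1
        # (stride in the flat list, whether a full run fits from this cell)
        candidates = [
            (1, room_right),                        # horizontal
            (width, room_down),                     # vertical
            (width + 1, room_down and room_right),  # diagonal down-right
            (width - 1, room_down and room_left),   # diagonal down-left
        ]
        for stride, fits in candidates:
            if fits:  # only index once we know the whole run stays in the grid
                best = max(best, prod(nums[n + k * stride] for k in range(run)))
    return best


diagonal = [9, 1, 1, 1,
            1, 9, 1, 1,
            1, 1, 9, 1,
            1, 1, 1, 9]
print(max_product(diagonal, width=4))  # 6561
```

For the 20x20 grid in the question the call would be max_product(nums, width=20). The opposite directions (left, up, and the two upward diagonals) cover the same sets of four values, so they need no separate check.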

Python Project Euler Question 11 - if statement won't execute if index is greater than 16

I have been attempting Problem 11 from Project Euler (https://projecteuler.net/problem=11) in Python, and have come across an error that confuses me. Basically, the 20x20 grid of numbers is a list that contains one sublist per line (that is, one for every 20 numbers):
number_grid = [
[8, 2, 22, 97, 38, 15, 00, 40, 00, 75, 4, 5, 7, 78, 52, 12, 50, 77, 91, 8],
[49, 49, 99, 40, 17, 81, 18, 57, 60, 87, 17, 40, 98, 43, 69, 48, 4, 56, 62, 00],
[81, 49, 31, 73, 55, 79, 14, 29, 93, 71, 40, 67, 53, 88, 30, 3, 49, 13, 36, 65],
[52, 70, 95, 23, 4, 60, 11, 42, 69, 24, 68, 56, 1, 32, 56, 71, 37, 2, 36, 91],
[22, 31, 16, 71, 51, 67, 63, 89, 41, 92, 36, 54, 22, 40, 40, 28, 66, 33, 13, 80],
[24, 47, 32, 60, 99, 3, 45, 2, 44, 75, 33, 53, 78, 36, 84, 20, 35, 17, 12, 50],
[32, 98, 81, 28, 64, 23, 67, 10, 26, 38, 40, 67, 59, 54, 70, 66, 18, 38, 64, 70],
[67, 26, 20, 68, 2, 62, 12, 20, 95, 63, 94, 39, 63, 8, 40, 91, 66, 49, 94, 21],
[24, 55, 58, 5, 66, 73, 99, 26, 97, 17, 78, 78, 96, 83, 14, 88, 34, 89, 63, 72],
[21, 36, 23, 9, 75, 00, 76, 44, 20, 45, 35, 14, 00, 61, 33, 97, 34, 31, 33, 95],
[78, 17, 53, 28, 22, 75, 31, 67, 15, 94, 3, 80, 4, 62, 16, 14, 9, 53, 56, 92],
[16, 39, 5, 42, 96, 35, 31, 47, 55, 58, 88, 24, 00, 17, 54, 24, 36, 29, 85, 57],
[86, 56, 00, 48, 35, 71, 89, 7, 5, 44, 44, 37, 44, 60, 21, 58, 51, 54, 17, 58],
[19, 80, 81, 68, 5, 94, 47, 69, 28, 73, 92, 13, 86, 52, 17, 77, 4, 89, 55, 40],
[4, 52, 8, 83, 97, 35, 99, 16, 7, 97, 57, 32, 16, 26, 26, 79, 33, 27, 98, 66],
[88, 36, 68, 87, 57, 62, 20, 72, 3, 46, 33, 67, 46, 55, 12, 32, 63, 93, 53, 69],
[4, 42, 16, 73, 38, 25, 39, 11, 24, 94, 72, 18, 8, 46, 29, 32, 40, 62, 76, 36],
[20, 69, 36, 41, 72, 30, 23, 88, 34, 62, 99, 69, 82, 67, 59, 85, 74, 4, 36, 16],
[20, 73, 35, 29, 78, 31, 90, 1, 74, 31, 49, 71, 48, 86, 81, 16, 23, 57, 5, 54],
[1, 70, 54, 71, 83, 51, 54, 69, 16, 92, 33, 48, 61, 43, 52, 1, 89, 19, 67, 48]
]
The problem asks me to find "the greatest product of four adjacent numbers in the same direction (up, down, left, right, or diagonally)," meaning I have to loop through every number and find those combinations.
To loop through each number I create nested for loops to first loop through the multiple lists, and then to loop through the numbers in those lists.
My for loops look like this:
for list in range(0,20):
    for num in range(0,20):
        print(list, num, number_grid[list][num])
I also make pre-set combinations in the form of a list based on the variables list and num:
left_side = [number_grid[list][num], number_grid[list][num-1], number_grid[list][num-2], number_grid[list][num-3]]
right_side = [number_grid[list][num], number_grid[list][num+1], number_grid[list][num+2], number_grid[list][num+3]]
up_side = [number_grid[list][num], number_grid[list-1][num], number_grid[list-2][num], number_grid[list-3][num]]
down_side = [number_grid[list][num], number_grid[list+1][num], number_grid[list+2][num], number_grid[list+3][num]]
diag_down_left = [[number_grid[list][num]], number_grid[list+1][num-1], number_grid[list+2][num-2], number_grid[list+3][num-3]]
diag_down_right = [[number_grid[list][num]], number_grid[list+1][num+1], number_grid[list+2][num+2], number_grid[list+3][num+3]]
diag_up_left = [[number_grid[list][num]], number_grid[list-1][num-1], number_grid[list-2][num-2], number_grid[list-3][num-3]]
diag_up_right = [[number_grid[list][num]], number_grid[list-1][num+1], number_grid[list-2][num+2], number_grid[list-3][num+3]]
The hard part of this problem (the way I see it) is that I can't have all 8 combinations (listed above) for every number. For example, numbers that are fewer than 4 spaces from the left edge and fewer than 4 rows below the top cannot have a horizontal-left, diagonal-down-left, diagonal-up-left, or vertical-up combination, as there simply isn't enough space. This is expressed with an if statement like:
if num < 3 and list < 3:
    print("Top Left")
    combinations.append(right_side)
    combinations.append(down_side)
    combinations.append(diag_down_right)
This piece of code works, with the output in my console being:
0 0 8
Top Left
0 1 2
Top Left
0 2 22
Top Left
The numbers, such as "0 0 8" are from me printing which list it is, and which number place it is. So, "0 0" means the first number of the first list, and "1 1" would be the 2nd number of the 2nd list.
However, a problem occurs when the code loops to the top right part of the grid. The code I implement is:
elif list < 3 and num > 16:
    print("Top Right")
    combinations.append(left_side)
    combinations.append(down_side)
    combinations.append(diag_down_left)
Yet this doesn't seem to register. In fact, the full console log looks like this:
0 0 8
Top Left
0 1 2
Top Left
0 2 22
Top Left
0 3 97
0 4 38
0 5 15
0 6 0
0 7 40
0 8 0
0 9 75
0 10 4
0 11 5
0 12 7
0 13 78
0 14 52
0 15 12
0 16 50
0 17 77
Traceback (most recent call last):
File "11.py", line 30, in <module>
right_side = [number_grid[list][num], number_grid[list][num+1], number_grid[list][num+2], number_grid[list][num+3]]
IndexError: list index out of range
I'm confused, because the list value is under 3 and the number index is above 16. I've done some troubleshooting, and the if statement works without the "and num > 16" part for some reason, but I still need that part.
My full code looks like this, if there appear to be any errors in the structuring and such:
number_grid = [
[8, 2, 22, 97, 38, 15, 00, 40, 00, 75, 4, 5, 7, 78, 52, 12, 50, 77, 91, 8],
[49, 49, 99, 40, 17, 81, 18, 57, 60, 87, 17, 40, 98, 43, 69, 48, 4, 56, 62, 00],
[81, 49, 31, 73, 55, 79, 14, 29, 93, 71, 40, 67, 53, 88, 30, 3, 49, 13, 36, 65],
[52, 70, 95, 23, 4, 60, 11, 42, 69, 24, 68, 56, 1, 32, 56, 71, 37, 2, 36, 91],
[22, 31, 16, 71, 51, 67, 63, 89, 41, 92, 36, 54, 22, 40, 40, 28, 66, 33, 13, 80],
[24, 47, 32, 60, 99, 3, 45, 2, 44, 75, 33, 53, 78, 36, 84, 20, 35, 17, 12, 50],
[32, 98, 81, 28, 64, 23, 67, 10, 26, 38, 40, 67, 59, 54, 70, 66, 18, 38, 64, 70],
[67, 26, 20, 68, 2, 62, 12, 20, 95, 63, 94, 39, 63, 8, 40, 91, 66, 49, 94, 21],
[24, 55, 58, 5, 66, 73, 99, 26, 97, 17, 78, 78, 96, 83, 14, 88, 34, 89, 63, 72],
[21, 36, 23, 9, 75, 00, 76, 44, 20, 45, 35, 14, 00, 61, 33, 97, 34, 31, 33, 95],
[78, 17, 53, 28, 22, 75, 31, 67, 15, 94, 3, 80, 4, 62, 16, 14, 9, 53, 56, 92],
[16, 39, 5, 42, 96, 35, 31, 47, 55, 58, 88, 24, 00, 17, 54, 24, 36, 29, 85, 57],
[86, 56, 00, 48, 35, 71, 89, 7, 5, 44, 44, 37, 44, 60, 21, 58, 51, 54, 17, 58],
[19, 80, 81, 68, 5, 94, 47, 69, 28, 73, 92, 13, 86, 52, 17, 77, 4, 89, 55, 40],
[4, 52, 8, 83, 97, 35, 99, 16, 7, 97, 57, 32, 16, 26, 26, 79, 33, 27, 98, 66],
[88, 36, 68, 87, 57, 62, 20, 72, 3, 46, 33, 67, 46, 55, 12, 32, 63, 93, 53, 69],
[4, 42, 16, 73, 38, 25, 39, 11, 24, 94, 72, 18, 8, 46, 29, 32, 40, 62, 76, 36],
[20, 69, 36, 41, 72, 30, 23, 88, 34, 62, 99, 69, 82, 67, 59, 85, 74, 4, 36, 16],
[20, 73, 35, 29, 78, 31, 90, 1, 74, 31, 49, 71, 48, 86, 81, 16, 23, 57, 5, 54],
[1, 70, 54, 71, 83, 51, 54, 69, 16, 92, 33, 48, 61, 43, 52, 1, 89, 19, 67, 48]
]
combinations = []
for list in range(0,20):
    for num in range(0,20):
        print(list, num, number_grid[list][num])
        left_side = [number_grid[list][num], number_grid[list][num-1], number_grid[list][num-2], number_grid[list][num-3]]
        right_side = [number_grid[list][num], number_grid[list][num+1], number_grid[list][num+2], number_grid[list][num+3]]
        up_side = [number_grid[list][num], number_grid[list-1][num], number_grid[list-2][num], number_grid[list-3][num]]
        down_side = [number_grid[list][num], number_grid[list+1][num], number_grid[list+2][num], number_grid[list+3][num]]
        diag_down_left = [[number_grid[list][num]], number_grid[list+1][num-1], number_grid[list+2][num-2], number_grid[list+3][num-3]]
        diag_down_right = [[number_grid[list][num]], number_grid[list+1][num+1], number_grid[list+2][num+2], number_grid[list+3][num+3]]
        diag_up_left = [[number_grid[list][num]], number_grid[list-1][num-1], number_grid[list-2][num-2], number_grid[list-3][num-3]]
        diag_up_right = [[number_grid[list][num]], number_grid[list-1][num+1], number_grid[list-2][num+2], number_grid[list-3][num+3]]
        if num < 3 and list < 3:
            print("Top Left")
            combinations.append(right_side)
            combinations.append(down_side)
            combinations.append(diag_down_right)
        elif num < 3 and list > 16:
            print("Bottom Left")
            combinations.append(right_side)
            combinations.append(up_side)
            combinations.append(diag_up_right)
        elif list < 3 and num > 16:
            print("Top Right")
            combinations.append(left_side)
            combinations.append(down_side)
            combinations.append(diag_down_left)
        elif num > 16 and list > 16:
            print("Bottom Right")
            combinations.append(left_side)
            combinations.append(up_side)
            combinations.append(diag_up_left)
        elif num < 3:
            print("Left")
            combinations.append(right_side)
            combinations.append(up_side)
            combinations.append(down_side)
            combinations.append(diag_up_right)
            combinations.append(diag_down_right)
        elif num > 16:
            print("Right")
            combinations.append(left_side)
            combinations.append(up_side)
            combinations.append(down_side)
            combinations.append(diag_up_left)
            combinations.append(diag_down_left)
        if list < 3:
            print("Top")
            combinations.append(right_side)
            combinations.append(left_side)
            combinations.append(down_side)
            combinations.append(diag_down_left)
            combinations.append(diag_down_right)
        elif list > 16:
            print("Bottom")
            combinations.append(right_side)
            combinations.append(left_side)
            combinations.append(up_side)
            combinations.append(diag_up_left)
            combinations.append(diag_up_right)
print(combinations)
In your inner loop (for num in range(0, 20):), you begin by making dangerous accesses and only afterwards start checking for indices that will end up going out of range. You want the pattern to be
check for and handle bad indices
access using the indices
increment index
as opposed to
access using the indices
check for and handle bad indices
increment index
because in the latter, the index is incremented when the loop iterates and is then used for access immediately, before any check has run. What we really want is to check the indices after they change and before they are used.
In your particular case, you will crash when num becomes 17, because the first thing you do with num after it becomes 17 is use it as an index: num+3 is 20, so number_grid[list][num+3] raises an IndexError. Only after making that access do you check whether num is greater than 16.
Although Python lets you get away with looking at index -1 (negative indices count from the end of the list, so you silently get a value from the wrong side of the grid), it doesn't let you get away with looking at indices past the end of the list. When list and num are large, you look at indices list + 3 and num + 3, even though they are past the end.
You can only look at number_grid[list][num + 3] after you have determined that num + 3 is a valid index.
By the way, don't name a variable list in Python. This is bad practice: list is the name of a built-in type, and your variable shadows it. For this code, I'd use row and col or something like that.
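The check-then-access pattern could be sketched like this. The helper cells_in_line, its parameters, and the generated 20x20 test grid are hypothetical names and data introduced for illustration; the helper assumes the starting cell itself is inside the grid.

```python
def cells_in_line(grid, row, col, d_row, d_col, length=4):
    """Return `length` cells starting at (row, col) and stepping by (d_row, d_col),
    or None if the line would leave the grid."""
    last_row = row + (length - 1) * d_row
    last_col = col + (length - 1) * d_col
    # Validate first: a negative index would silently wrap around to the other
    # side of the grid, and a too-large index would raise IndexError.
    if not (0 <= last_row < len(grid) and 0 <= last_col < len(grid[0])):
        return None
    return [grid[row + k * d_row][col + k * d_col] for k in range(length)]


# Stand-in 20x20 grid whose cell values encode their own position.
grid = [[r * 20 + c for c in range(20)] for r in range(20)]
print(cells_in_line(grid, 0, 16, 0, 1))   # [16, 17, 18, 19]
print(cells_in_line(grid, 0, 17, 0, 1))   # None, num + 3 would be 20
print(cells_in_line(grid, 0, 2, 0, -1))   # None, num - 3 would be -1
```

With a helper like this, the eight direction lists never need to be built up front, so the corner and edge cases do not have to be enumerated by hand.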

Split integer into equal chunks

What is the most efficient and reliable way in Python to split sectors up like this:
number: 101 (may vary of course)
chunk1: 1 to 30
chunk2: 31 to 61
chunk3: 62 to 92
chunk4: 93 to 101
Flow:
copy sectors 1 to 30
skip sectors in chunk 1 and copy 30 sectors starting from sector 31.
and so on...
I have this solved in a "manual" way using modulo and basic math but there's got to be a function for this?
Thank you.
I assume your numbers are in a list. If you want the sequence split into chunks of a known size, slicing by index is the best way, since each chunk is taken directly rather than searched for. You can wrap this in a small function and reuse it. Something like below:
import numpy as np
def sectors(num_seq, chunk_size=30):
    n_sectors = int(np.ceil(len(num_seq) / float(chunk_size)))  # number of sectors
    for i in range(n_sectors):
        if i < (n_sectors - 1):
            print(num_seq[(chunk_size*i):(chunk_size*(i+1))])  # every chunk except the last has chunk_size items
        else:
            print(num_seq[(chunk_size*i):])  # the last chunk takes whatever is left
Now you can reuse it whenever you need something similar, and it is efficient because it computes the slice indices directly instead of searching through the list.
Here is the output:
x = list(range(1, 101))
sectors(x)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
[31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60]
[61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90]
[91, 92, 93, 94, 95, 96, 97, 98, 99, 100]
Please let me know if this meets your requirement.
Easy and fast (single iteration):
>>> data = list(range(1, 102))
>>> n = 30
>>> output = [data[i:i+n] for i in range(0, len(data), n)]
>>> output
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30], [31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60], [61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90], [91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101]]
Another very simple and comprehensive way:
>>> f = lambda x,y: [ x[i:i+y] for i in range(0,len(x),y)]
>>> f(list(range(1, 102)), 30)
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30], [31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60], [61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90], [91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101]]
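If only the sector boundaries are needed (as in the "copy sectors 1 to 30, then 31 onwards" flow from the question) rather than the full lists, a small sketch along the same lines could return inclusive (start, end) pairs. The function name sector_ranges and its parameters are assumptions for illustration.

```python
def sector_ranges(total, chunk_size=30, first=1):
    """Return inclusive (start, end) pairs covering first..total in chunks."""
    return [
        (start, min(start + chunk_size - 1, total))  # clamp the last chunk
        for start in range(first, total + 1, chunk_size)
    ]


print(sector_ranges(101))  # [(1, 30), (31, 60), (61, 90), (91, 101)]
```

Every chunk holds exactly chunk_size sectors except possibly the last, which takes the remainder.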
You can try using numpy.histogram if you're looking to split a number into equal-sized bins (sectors).
It returns a tuple of the per-bin counts and an array of bin edges, which demarcate each bin boundary:
import numpy as np
number = 101
values = np.arange(number, dtype=int)
bins = np.histogram(values, bins='auto')
print(bins)

Removing Elements from a range

I am a beginner and would like to know how to remove the multiples of 11 and 4 from this range. I would like to include all other numbers, excluding 4, 11 and their multiples. Is there a way of doing this without writing each case individually?
for i in range(1,101):
    print((2**i)-1)
>>> [i for i in range(1,101) if i%4!=0 and i%11!=0]
[1, 2, 3, 5, 6, 7, 9,
10, 13, 14, 15, 17, 18, 19,
21, 23, 25, 26, 27, 29,
30, 31, 34, 35, 37, 38, 39,
41, 42, 43, 45, 46, 47, 49,
50, 51, 53, 54, 57, 58, 59,
61, 62, 63, 65, 67, 69,
70, 71, 73, 74, 75, 78, 79,
81, 82, 83, 85, 86, 87, 89,
90, 91, 93, 94, 95, 97, 98]
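To connect the filter back to the loop in the question, here is a minimal sketch that applies the same condition while computing 2**i - 1. The generator name filtered_values and its parameters are assumptions for illustration.

```python
def filtered_values(start=1, stop=101, excluded_factors=(4, 11)):
    """Yield 2**i - 1 for each i in [start, stop) that is not a multiple
    of any of the excluded factors."""
    for i in range(start, stop):
        if all(i % factor != 0 for factor in excluded_factors):
            yield 2**i - 1


values = list(filtered_values())
print(values[:5])   # [1, 3, 7, 31, 63]
print(len(values))  # 68
```

The first five exponents that survive the filter are 1, 2, 3, 5 and 6, which is why 15 (from i = 4) is missing from the output.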

Percent list slicing

I'm using python 3.2.3 IDLE and this is my code:
originalList = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]
newList = orginalList[0.05:0.95] #<<<<I have no idea what I'm doing here
print (newList)
I have an original list of numbers from 1 to 100, and I want to make a new list from it that contains only the data in the 5%-95% sub-range of the original list,
so the new list must be like [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18....95]. How do I do that? I know my newList code is wrong.
originalList.sort()
newList = originalList[int(len(originalList) * .05) : int(len(originalList) * .95)]
sl = slice(4, 95)
print(originalList[sl])
Also see http://docs.python.org/2/library/functions.html#slice
size = len(originalList)
newList = originalList[int(0.05*size) - 1:int(0.95*size)]
If you want to get part of a list, the syntax is
List = [1,2,3,4,5,6,7,8,9,10]
newList = List[start_index:end_index]
so the first number is the index where the sub-list starts, while the second number is the index at which the sub-list stops (that index is not included).
hope this helps!
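Applied to the list in the question, the start-inclusive, end-exclusive rule could be sketched as follows. The variable names are assumptions, and round is used here only so that the percentages become integer indices, since slice indices cannot be floats.

```python
original_list = list(range(1, 101))
size = len(original_list)

start = round(0.05 * size) - 1  # index 4 holds the value 5
stop = round(0.95 * size)       # index 95 is excluded, so the last value kept is 95

new_list = original_list[start:stop]
print(new_list[0], new_list[-1], len(new_list))  # 5 95 91
```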
I'd also use range for creating the original list... less mistake prone.
originalList = list(range(1, 101))
newList = originalList[int(len(originalList) * .05) - 1:int(len(originalList) * .95)]
print(newList)
Gives the desired result...
Edit: Changed range to be more concise per comment below.
For lists of arbitrary length, you could do:
>>> l = range(200)
>>> percentage = 5
>>> skip = int(len(l) * (float(percentage) / 100) / 2)
>>> len(l[skip:-skip])
190
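One edge case worth noting with the negative end index: when skip works out to 0, l[skip:-skip] becomes l[0:0], which is empty. A small sketch that avoids this by using an explicit end index (the function name trim_both_ends is an assumption for illustration):

```python
def trim_both_ends(items, percentage):
    """Drop percentage/2 percent of the items from each end of the sequence."""
    skip = int(len(items) * (percentage / 100) / 2)
    # An explicit end index keeps everything when skip == 0,
    # whereas items[skip:-skip] would return an empty sequence.
    return items[skip:len(items) - skip]


print(len(trim_both_ends(list(range(200)), 5)))  # 190
print(len(trim_both_ends(list(range(10)), 5)))   # 10
```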
You could use the fidx module, which allows percentages as indexes:
import fidx
originalList = fidx([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100])
# or better: originalList = fidx.list(range(1,101))
newList = originalList[0.05:0.95]
print (newList)
which returns
[6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95]
