I was wondering if there's a good way to find the next available gap to create a network block given a list of existing ones?
For example, I have these networks in my list:
[
'10.0.0.0/24',
'10.0.0.0/20',
'10.10.0.0/20',
]
and then someone comes along and asks: "Do you have enough space for one /22 for me?"
I'd like to be able to suggest something along the lines of:
"Here's a space: x.x.x.x/22" (x.x.x.x is something that comes before 10.0.0.0)
or
"Here's a space: x.x.x.x/22" (x.x.x.x is something in between 10.0.0.255 and 10.10.0.0)
or
"Here's a space: x.x.x.x/22" (x.x.x.x is something that comes after 10.10.15.255)
I'd really appreciate any suggestions.
The ipaddress library is good for this sort of use case. You can use the IPv4Network class to define subnet ranges, and the IPv4Address objects it can return can be converted into integers for comparison.
What I do below:
Establish your given list as a list of IPv4Networks
Determine the size of the block we're looking for
Iterate through the list, computing the amount of space between consecutive blocks, and checking if our wanted block fits.
You could also return an IPv4Network with the subnet built into it, instead of an IPv4Address, but I'll leave that as an exercise to the reader.
from ipaddress import IPv4Network, IPv4Address
networks = [
    IPv4Network('10.0.0.0/24'),
    IPv4Network('10.0.0.0/20'),
    IPv4Network('10.10.0.0/20'),
]
wanted = 22
wanted_size = 2 ** (32 - wanted) # number of addresses in a /22
space_found = None
for i in range(1, len(networks)):
    # last address of the previous block
    previous_network_end = int(networks[i-1].network_address + int(networks[i-1].hostmask))
    next_network_start = int(networks[i].network_address)
    # number of unused addresses strictly between the two blocks
    free_space_size = next_network_start - previous_network_end - 1
    if free_space_size >= wanted_size:
        # first available address (not necessarily aligned to a /22 boundary)
        space_found = IPv4Address(previous_network_end + 1)
        break
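The gap check above yields the first address after a block, which is not guaranteed to sit on a /22 boundary. As a minimal sketch of one way to return an aligned IPv4Network instead, the function below walks candidate subnets of an enclosing range and returns the first one that overlaps nothing in the list. The name find_free_block and the supernet parameter (assumed here to be 10.0.0.0/8) are illustrative assumptions, not part of the original answer.

```python
from ipaddress import IPv4Network, collapse_addresses


def find_free_block(existing, wanted_prefix, supernet="10.0.0.0/8"):
    """Return the first free /wanted_prefix block inside supernet, or None.

    `supernet` is an assumed enclosing range; adjust it to your address plan.
    """
    space = IPv4Network(supernet)
    # Merge overlapping or nested entries (e.g. a /24 inside a /20) first.
    used = list(collapse_addresses(IPv4Network(n) for n in existing))
    # subnets() yields candidates that are already aligned to the prefix.
    for candidate in space.subnets(new_prefix=wanted_prefix):
        if not any(candidate.overlaps(u) for u in used):
            return candidate
    return None


existing = ['10.0.0.0/24', '10.0.0.0/20', '10.10.0.0/20']
print(find_free_block(existing, 22))
```

This is a brute-force scan, so it is only reasonable when the number of candidate blocks is modest; for the list in the question it settles on the first /22 after the 10.0.0.0/20 block.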
From the ipaddress module, I know the address_exclude method. Below is an example from the documentation:
>>> n1 = ip_network('192.0.2.0/28')
>>> n2 = ip_network('192.0.2.1/32')
>>> list(n1.address_exclude(n2))
[IPv4Network('192.0.2.8/29'), IPv4Network('192.0.2.4/30'),
IPv4Network('192.0.2.2/31'), IPv4Network('192.0.2.0/32')]
But what if I want to remove two or more subnets from a network? For example, how can I remove the subnets 192.168.10.24/29 and 192.168.10.48/28 from 192.168.10.0/26? The result should be 192.168.10.0/28, 192.168.10.16/29 and 192.168.10.32/28.
I'm trying to write the algorithm I have in my head using the address_exclude method, but I can't. Is there a simple way to implement what I just described?
When you exclude one network from another, the result can be multiple networks (original one got split) - so, for the rest of the networks to exclude, you need to first find which part they would fit into before excluding them as well.
Here's one possible solution:
from ipaddress import ip_network, collapse_addresses
complete = ip_network('192.168.10.0/26')
# I chose the larger subnet for exclusion first, can be automated with network comparison
subnets = list(complete.address_exclude(ip_network('192.168.10.48/28')))
# other network to exclude
other_exclude = ip_network('192.168.10.24/29')
result = []
# Find which subnet the other exclusion will happen in
for sub in subnets:
    # If found, exclude & add the result
    if other_exclude.subnet_of(sub):
        result.extend(list(sub.address_exclude(other_exclude)))
    else:
        # Other subnets can be added directly
        result.append(sub)
# Collapse in case of overlaps
print(list(collapse_addresses(result)))
Output:
[IPv4Network('192.168.10.0/28'), IPv4Network('192.168.10.16/29'), IPv4Network('192.168.10.32/28')]
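The same idea can be extended to any number of exclusions by repeating the "find the piece it fits into, then split that piece" step for each subnet in turn. The following is a minimal sketch under that assumption; exclude_all is a hypothetical helper name, and subnet_of requires Python 3.7 or later.

```python
from ipaddress import ip_network, collapse_addresses


def exclude_all(network, exclusions):
    """Remove every network in `exclusions` from `network` (hypothetical helper)."""
    remaining = [ip_network(network)]
    for excl in map(ip_network, exclusions):
        next_remaining = []
        for net in remaining:
            if excl.subnet_of(net):
                # The exclusion lies inside this piece: split the piece around it.
                next_remaining.extend(net.address_exclude(excl))
            elif net.subnet_of(excl):
                # This piece is entirely covered by the exclusion: drop it.
                continue
            else:
                # Unrelated piece: keep it as is.
                next_remaining.append(net)
        remaining = next_remaining
    return list(collapse_addresses(remaining))


print(exclude_all('192.168.10.0/26', ['192.168.10.24/29', '192.168.10.48/28']))
```

Because each exclusion is located in whichever remaining piece contains it, the order of the exclusions should not matter here.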
Expanding on the idea I posted as a comment on @rdas's response, here is my solution.
It seems better to split the initial network into chunks of the smallest size among the ranges to be removed, and to do the same for every range to be removed. Then exclude those chunks from the list and return the result.
from ipaddress import ip_network, collapse_addresses
def remove_ranges(mynetwork,l_of_ranges):
    # find smallest chunk, i.e. the longest prefix length (compared as int, not as str)
    l_chunk = sorted(set(int(x.split('/')[1]) for x in l_of_ranges))
    l_mynetwork = list(ip_network(mynetwork).subnets(new_prefix=l_chunk[-1]))
    l_chunked_ranges = [ ]
    for nw in l_of_ranges:
        l_chunked_ranges.extend(list(ip_network(nw).subnets(new_prefix=l_chunk[-1])))
    #l_removed_networks = [ ]
    #for mynw in l_mynetwork:
    #    if not mynw in l_chunked_ranges:
    #        l_removed_networks.append(mynw)
    #result = list(collapse_addresses(l_removed_networks))
    result = list(collapse_addresses(set(l_mynetwork) - set(l_chunked_ranges)))
    return [str(r) for r in result]
if __name__ == '__main__':
    mynetwork = "10.110.0.0/16"
    l_of_ranges = ["10.110.0.0/18","10.110.72.0/21","10.110.80.0/21","10.110.96.0/21"]
    print(f"My network: {mynetwork}, Existing: {l_of_ranges} ")
    a = remove_ranges(mynetwork,l_of_ranges)
    print(f"Remaining: {a}")
With the result:
My network: 10.110.0.0/16, Existing: ['10.110.0.0/18', '10.110.72.0/21', '10.110.80.0/21', '10.110.96.0/21']
Remaining: ['10.110.64.0/21', '10.110.88.0/21', '10.110.104.0/21', '10.110.112.0/20', '10.110.128.0/17']
Which seems to be valid.
In my Python application I have an array of IP address strings which looks something like this:
[
"50.28.85.81-140", // Matches any IP address that matches the first 3 octets, and has its final octet somewhere between 81 and 140
"26.83.152.12-194" // Same idea: 26.83.152.12 would match, 26.83.152.120 would match, 26.83.152.195 would not match
]
I installed netaddr and although the documentation seems great, I can't wrap my head around it. This must be really simple - how do I check if a given IP address matches one of these ranges? Don't need to use netaddr in particular - any simple Python solution will do.
The idea is to split the IP and check every component separately.
mask = "26.83.152.12-192"
IP = "26.83.152.19"
def match(mask, IP):
    splitted_IP = IP.split('.')
    for index, current_range in enumerate(mask.split('.')):
        if '-' in current_range:
            mini, maxi = map(int,current_range.split('-'))
        else:
            mini = maxi = int(current_range)
        if not (mini <= int(splitted_IP[index]) <= maxi):
            return False
    return True
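As an alternative sketch to the component-wise check above, the range string can be expanded into a start and an end address and compared with the standard ipaddress module, which also rejects malformed addresses. The helper name in_last_octet_range is an assumption, and the sketch only handles a range in the final octet, as in the question.

```python
from ipaddress import IPv4Address


def in_last_octet_range(pattern, ip):
    """Check ip against a pattern like '26.83.152.12-194' (range in last octet only)."""
    prefix, _, last = pattern.rpartition('.')
    low, _, high = last.partition('-')
    high = high or low  # a plain octet such as '12' means a single address
    start = IPv4Address(f"{prefix}.{low}")
    end = IPv4Address(f"{prefix}.{high}")
    # IPv4Address objects compare by numeric value.
    return start <= IPv4Address(ip) <= end


patterns = ["50.28.85.81-140", "26.83.152.12-194"]
print(any(in_last_octet_range(p, "26.83.152.120") for p in patterns))
```

Note that IPv4Address raises a ValueError for input such as "50.284.85.200", so callers may want to catch that if the addresses are untrusted.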
Not sure this is the most optimal, but this is base python, no need for extra packages.
Parse the ip_range, creating a one-element list for a simple value and a range object for a range. This gives a list of 4 list/range objects.
Then zip it with a split version of your address and test that each value is in the corresponding range.
Note: Using range ensures super-fast in test (in Python 3) (Why is "1000000000000000 in range(1000000000000001)" so fast in Python 3?)
ip_range = "50.28.85.81-140"
toks = [[int(d)] if d.isdigit() else range(int(d.split("-")[0]),int(d.split("-")[1])+1) for d in ip_range.split(".")]
print(toks) # debug
for test_ip in ("50.28.85.86","50.284.85.200","1.2.3.4"):
    print (all(int(a) in b for a,b in zip(test_ip.split("."),toks)))
result (as expected):
[[50], [28], [85], range(81, 141)]
True
False
False
I would like to go through a gene and get a list of 10 bp long sequences containing the exon/intron borders from each feature with feature.type == 'mRNA'. It seems like I need to use CompoundLocation and the locations used in 'join', but I cannot figure out how to do it or find a tutorial.
Could anyone please give me an example or point me to a tutorial?
Assuming all the info is in the exact format you show in the comment, and that you're looking for 20 bp on either side of each intron/exon boundary, something like this might be a start:
Edit: If you're actually starting from a GenBank record, then it's not much harder. Assuming that the full junction string you're looking for is in the CDS feature info, then:
for f in record.features:
    if f.type == 'CDS':
        jct_info = str(f.location)
converts the "location" information into a string and you can continue as below.
(There are ways to work directly with the location information without converting to a string - in particular you can use "extract" to pull the spliced sequence directly out of the parent sequence -- but the steps involved in what you want to do are faster and more easily done by converting to str and then int.)
import re
jct_info = "join{[0:229](+), [11680:11768](+), [11871:12135](+), [15277:15339](+), [16136:16416](+), [17220:17471](+), [17547:17671](+)}"
jctP = re.compile(r"\[\d+:\d+\]")  # raw string avoids invalid-escape warnings
jcts = jctP.findall(jct_info)
jcts
['[0:229]', '[11680:11768]', '[11871:12135]', '[15277:15339]', '[16136:16416]', '[17220:17471]', '[17547:17671]']
Now you can loop through the list of start:end values, pull them out of the text and convert them to ints so that you can use them as sequence indexes. Something like this:
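# seq is assumed to be the parent sequence, e.g. record.seq
for jct in jcts:
    start, end = (int(x) for x in jct.replace('[', '').replace(']', '').split(':'))
    # Slicing never raises IndexError; a negative start would silently wrap
    # around to the end of the sequence, so clamp it at 0 (e.g. where start = 0).
    start_20_20 = seq[max(0, start - 20):start + 20]
    end_20_20 = seq[max(0, end - 20):end + 20]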
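Putting the two steps together, the sketch below parses the location string and collects the flanking window around every start and end coordinate. It works on a plain Python string so it can be run without Biopython; the function name junction_windows, the flank parameter and the toy sequence are assumptions made for illustration.

```python
import re


def junction_windows(seq, jct_info, flank=20):
    """Return the sequence windows of +/- flank around each start and end coordinate."""
    windows = []
    # Capture the start and end of every "[start:end]" part of the location string.
    for start, end in re.findall(r"\[(\d+):(\d+)\]", jct_info):
        for pos in (int(start), int(end)):
            lo = max(0, pos - flank)  # clamp so a small pos does not wrap around
            windows.append(seq[lo:pos + flank])
    return windows


toy_seq = "ACGT" * 10  # 40 bp stand-in for the real parent sequence
for window in junction_windows(toy_seq, "join{[0:10](+), [20:30](+)}", flank=5):
    print(window)
```

The first start and the last end are the ends of the feature rather than exon/intron borders, so you may want to drop those two windows from the result.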
I want to take an IP address string, e.g. 192.168.1.23, keep only the first three bytes of the address, and then append 0-255. In other words, I want to transform that IP address into a range of IP addresses I can pass to NMAP to conduct a sweep scan.
The easiest solution of course is to simply trim off the last two characters of the string, but of course this won't work if the IP is 192.168.1.1 or 192.168.1.123
Here is the solution I came up with:
lhost = "192.168.1.23"
# Split the lhost on each '.' then re-assemble the first three parts
lip = lhost.split('.')
trange = ""
for i, val in enumerate(lip):
    if (i < len(lip) - 1):
        trange += val + "."
# Append "0-255" at the end, we now have target range trange = "XX.XX.XX.0-255"
trange += "0-255"
It works fine but feels ugly and not efficient to me. What is a better way to do this?
You could use the rfind method of the string object.
>>> lhost = "192.168.1.23"
>>> lhost[:lhost.rfind(".")] + ".0-255"
'192.168.1.0-255'
The rfind method is similar to find(), but it searches from the end.
rfind(...)
S.rfind(sub [,start [,end]]) -> int
Return the highest index in S where substring sub is found,
such that sub is contained within S[start:end]. Optional
arguments start and end are interpreted as in slice notation.
Return -1 on failure.
A more complicated solution could use a regular expression:
>>> import re
>>> re.sub(r"\d{1,3}$","0-255",lhost)
'192.168.1.0-255'
Hope this helps!
You could split and get the first three values, join by a '.', and then add ".0-255"
>>> lhost = "192.168.1.23"
>>> '.'.join(lhost.split('.')[0:-1]) + ".0-255"
'192.168.1.0-255'
>>>
Not all IPs belong to class C. I think the code must be flexible enough to accommodate various IP ranges and their masks.
I had previously written a tiny Python module to calculate the network ID and broadcast ID for a given IP address with any network mask.
The code can be found here: https://github.com/brownbytes/tamepython/blob/master/subnet_calculator.py
I think networkSubnet() and hostRange() are the functions that can be of some help to you.
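For masks other than /24, the standard ipaddress module can derive the enclosing network directly, and nmap accepts CIDR notation as a target. The following is a minimal sketch; sweep_target is a hypothetical helper name and the default prefix of 24 is an assumption matching the question.

```python
from ipaddress import ip_interface


def sweep_target(host, prefix=24):
    """Return the network containing host, in CIDR form (hypothetical helper)."""
    # ip_interface keeps the host bits; .network masks them off.
    return str(ip_interface(f"{host}/{prefix}").network)


print(sweep_target("192.168.1.23"))
print(sweep_target("10.1.2.3", prefix=16))
```

The returned string covers the same addresses as the "0-255" form when the prefix is 24, and it extends naturally to larger or smaller networks.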
I like this:
#!/usr/bin/python3
ip_address = '128.200.34.1'
list_ = ip_address.split('.')
assert len(list_) == 4
list_[3] = '0-255'
print('.'.join(list_))
I'm rewriting some code from Ruby to Python. The code is for a Perceptron, listed in section 8.2.6 of Clever Algorithms: Nature-Inspired Programming Recipes. I've never used Ruby before and I don't understand this part:
def test_weights(weights, domain, num_inputs)
  correct = 0
  domain.each do |pattern|
    input_vector = Array.new(num_inputs) {|k| pattern[k].to_f}
    output = get_output(weights, input_vector)
    correct += 1 if output.round == pattern.last
  end
  return correct
end
Some explanation: num_inputs is an integer (2 in my case), and domain is a list of arrays: [[1,0,1], [0,0,0], etc.]
I don't understand this line:
input_vector = Array.new(num_inputs) {|k| pattern[k].to_f}
It creates an array with 2 values, where the element at index k stores pattern[k].to_f, but what is pattern[k].to_f?
Try this:
input_vector = [float(pattern[i]) for i in range(num_inputs)]
pattern[k].to_f
converts pattern[k] to a float.
I'm not a Ruby expert, but I think it would be something like this in Python:
def test_weights(weights, domain, num_inputs):
    correct = 0
    for pattern in domain:
        output = get_output(weights, pattern[:num_inputs])
        if round(output) == pattern[-1]:
            correct += 1
    return correct
There is plenty of scope for optimising this: if num_inputs is always one less than the length of the lists in domain, then you may not need that parameter at all.
Be careful about doing line by line translations from one language to another: that tends not to give good results no matter what languages are involved.
Edit: since you said you don't think you need to convert to float you can just slice the required number of elements from the domain value. I've updated my code accordingly.
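One concrete case where a line-by-line translation can differ: Ruby's Float#round rounds halves away from zero, while Python 3's round() rounds halves to the nearest even integer, so 0.5 becomes 1 in Ruby but 0 in Python. The sketch below shows one way to keep the Ruby behaviour. The names ruby_round and count_correct are assumptions (count_correct mirrors test_weights), and get_output here is only a hypothetical stand-in that returns a raw weighted sum, not the function from the book.

```python
from decimal import Decimal, ROUND_HALF_UP


def ruby_round(x):
    """Round halves away from zero, as Ruby's Float#round does."""
    return int(Decimal(x).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def get_output(weights, input_vector):
    # Hypothetical stand-in: weighted sum with the bias stored as the last weight.
    return weights[-1] + sum(w * x for w, x in zip(weights, input_vector))


def count_correct(weights, domain, num_inputs):
    correct = 0
    for pattern in domain:
        # Same as Array.new(num_inputs) {|k| pattern[k].to_f}
        input_vector = [float(pattern[k]) for k in range(num_inputs)]
        output = get_output(weights, input_vector)
        if ruby_round(output) == pattern[-1]:
            correct += 1
    return correct


or_domain = [[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 1]]
print(count_correct([0.5, 0.5, 0.0], or_domain, 2))
```

With these toy weights two of the outputs are exactly 0.5, which is where the two rounding rules disagree; if your get_output already returns exactly 0.0 or 1.0, the built-in round() behaves the same as the Ruby version.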