I have been teaching myself Python for the past two weeks. Today I came across a problem, and my solution to it is very unwieldy (I feel bad for whoever has to read it). First, I will introduce the problem and my solution.
Problem: Complete the getHost() function, which takes a single string argument representing a URL and returns a string that corresponds to the next-to-last section of the hostname. For example, given the URL "http://www.example.com/", the function would return the string "example". Given the URL "ftp://this.is.a.long.name.net/path/to/some/file.php", the function would return the string "name". While the path and filename sections of the URL are optional, you may assume that the full hostname is always followed by a single forward slash ("/").
My solution:
def getHost(x):
    newstring = ""
    listofx = []
    for i in range(len(x)):
        listofx.append(x[i])
    for j in range(2):
        a = listofx.index("/")
        listofx.reverse()
        for k in range(a+1):
            listofx.pop()
        listofx.reverse()
    b = listofx.index("/")
    for g in range(len(listofx)-b):
        listofx.pop()
    for t in range(listofx.count(".")-1):
        for o in range(listofx.index(".")+1):
            listofx.reverse()
            listofx.pop()
            listofx.reverse()
    for f in range(len(listofx)-listofx.index(".")):
        listofx.pop()
    for h in range(len(listofx)):
        newstring = newstring + listofx[h]
    print(newstring)
I HATE my solution because of how many for loops I used. I felt like I had no choice, since strings are immutable. I would appreciate it if someone could show me a solution using while loops and the find()/rfind() methods. I do not want to keep converting strings to lists to solve these types of problems.
Using find and rfind:
def getHost(x):
    index1 = x.find('//')
    index2 = x.find('/', index1+2)
    index3 = x.rfind('.', index1+2, index2)
    return x[:index3].split('.')[-1]
Yes, there is a better (more Pythonic) way:
def extract(data):
    print(data.split('/')[2].split('.')[-2])
extract("http://www.example.com/")
extract("ftp://this.is.a.long.name.net/path/to/some/file.php")
Output (obviously)
example
name
Assuming that your URL always contains the double forward slash, you could use something like the following:
url = "http://www.example.com/"
url = url.split("/")
url = url[2].split(".")
getHost = url[-2]
print(getHost)
Actually, a simpler version that doesn't need rfind:
def getHost(x):
    index1 = x.find('//')
    index2 = x.find('/', index1+2)
    return x[:index2].split('.')[-2]
print(getHost("ftp://this.is.a.long.name.net/path/to/some/file.php"))
print(getHost("http://www.example.com/"))
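If hand-written index arithmetic is not a requirement, the standard library's urllib.parse.urlparse can isolate the hostname before it is split. This is only a minimal sketch: it assumes the URL always includes a scheme followed by "//" and that the hostname has at least two dot-separated labels. The name get_host is illustrative and not part of the exercise.

```python
from urllib.parse import urlparse


def get_host(url):
    """Return the next-to-last dot-separated label of the URL's hostname."""
    # hostname is None when the URL has no "//" part; assumed not to happen here.
    hostname = urlparse(url).hostname
    return hostname.split(".")[-2]


print(get_host("http://www.example.com/"))  # example
print(get_host("ftp://this.is.a.long.name.net/path/to/some/file.php"))  # name
```

Note that urlparse lowercases the hostname, which makes no difference for the two sample URLs.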
If I have a list of strings such as the following:
"apple.test.banana", "test.example","example.example.test".
Is there a way to return only "test.banana" and "example.test"?
I need to check and see if there are two dots, and if there are, return only the value described above.
I attempted to use:
string = "apple.test.banana"
dot_count = 0
for i in string:
    if i == ".":
        dot_count = dot_count + 1
if dot_count > 1:
    string.split(".")[1]
But this appears to only return the string "test".
Any advice would be greatly appreciated. Thank you.
You are completely right, except for the last line, which should read '.'.join(string.split(".")[1:]).
Also, instead of the for loop, you can just use .count(): dot_count = string.count('.') (this doesn't affect anything, just makes your code easier to read)
So the program becomes:
string = "apple.test.banana"
dot_count = string.count('.')
if dot_count > 1:
    print('.'.join(string.split(".")[1:]))
Which outputs: test.banana
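To apply the same idea to the whole list from the question, the check can be wrapped in a small helper. This is a sketch under one assumption the question does not settle: strings with fewer than two dots are returned unchanged. The helper name drop_first_label is made up for illustration.

```python
def drop_first_label(s):
    """Drop the first dot-separated part when the string has two or more dots."""
    if s.count(".") > 1:
        # maxsplit=1 keeps everything after the first dot in one piece.
        return s.split(".", 1)[1]
    # Assumption: strings with fewer than two dots pass through unchanged.
    return s


names = ["apple.test.banana", "test.example", "example.example.test"]
print([drop_first_label(n) for n in names])
# ['test.banana', 'test.example', 'example.test']
```

Using split(".", 1)[1] gives the same result as joining the tail with '.'.join(...), without the intermediate list of parts.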
I am still learning Python and I am struggling to find a way to get this to work the way I want.
#First Method
staticPaths = [
    "[N4-NFS-NETAPP-8040-C01]/[N4-NHT-LEAF-VPC-FAS-C02-N2][vlan-480]",
    "[N4-OLDCLOUD-IPSTORAGE/epg-N4-NFS-NETAPP-8040-C01]/[vlan-481]]",
    "['N4-NFS-NETAPP-8040-C01]/[N4-NHT-LEAF-VPC-FAS-C02-N2][vlan-484]",
    "['N4-NFS-NETAPP-8040-C01]/[N4-NHT-LEAF-VPC-FAS-C02-N2][vlan-485]",
    "['N4-NFS-NETAPP-8040-C01]/[N4-NHT-LEAF-VPC-FAS-C01-N2][vlan-480]"
]
for path in staticPaths:
    filter_object = filter(lambda a: 'vlan-480' in a, staticPaths)
print(list(filter_object))
What I am trying to do here is select every entry that contains 'vlan-480' and return the entire line. If I run that code, I receive the correct output, which is:
[N4-NFS-NETAPP-8040-C01]/[N4-NHT-LEAF-VPC-FAS-C01-N2][vlan-480]
['N4-NFS-NETAPP-8040-C01]/[N4-NHT-LEAF-VPC-FAS-C02-N2][vlan-480]
However, where the lambda function has 'vlan-480', I actually want to pass it a list, but because I am using the "in" operator, it only lets me test a single string.
I want to check for multiple values: for example, give me the output for 'vlan-480' and 'vlan-484', and it should return the matching lines from staticPaths.
I cannot think of a way of getting this done. It might just be me being stupid, but for some reason I cannot solve it.
I also tried an if statement, but I have the same problem with the single-string option.
#Second Method
path_matches = []
for path_match in staticPaths:
    if 'vlan-480' in path_match:
        path_matches.append(path_match)
print(path_matches)
Can anyone think of a way of doing this? It's probably really easy, but for some reason I cannot think of it. I did try to use a list comprehension but struggled to get the output I needed.
Much appreciated.
Try this:
substrings = ['vlan-480', 'vlan-484']
filter_object = filter(lambda a: any(x in a for x in substrings), staticPaths)
print(list(filter_object))
The list substrings contains substrings to search for
The output I get for your dataset is:
['[N4-NFS-NETAPP-8040-C01]/[N4-NHT-LEAF-VPC-FAS-C02-N2][vlan-480]', "['N4-NFS-NETAPP-8040-C01]/[N4-NHT-LEAF-VPC-FAS-C02-N2][vlan-484]", "['N4-NFS-NETAPP-8040-C01]/[N4-NHT-LEAF-VPC-FAS-C01-N2][vlan-480]"]
You can also use a list comprehension:
substrings = ['vlan-480', 'vlan-484']
path_matches = [x for x in staticPaths if substrings[0] in x or substrings[1] in x]
That results in:
path_matches
['[N4-NFS-NETAPP-8040-C01]/[N4-NHT-LEAF-VPC-FAS-C02-N2][vlan-480]',
"['N4-NFS-NETAPP-8040-C01]/[N4-NHT-LEAF-VPC-FAS-C02-N2][vlan-484]",
"['N4-NFS-NETAPP-8040-C01]/[N4-NHT-LEAF-VPC-FAS-C01-N2][vlan-480]"]
I guess your problem is with indentation. Check this:
path_matches = []
for path_match in staticPaths:
    if 'vlan-480' in path_match:
        path_matches.append(path_match)
Try to use this:
filter_object = filter(lambda a: 'vlan-480' in a or 'vlan-484' in a, staticPaths)
print(list(filter_object))
To select on multiple values, you can combine conditions with the or/and operators, for example: 'vlan-480' in a or 'vlan-484' in a
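Chaining conditions by hand gets unwieldy as the list of values grows. A list comprehension combined with any() handles an arbitrary number of substrings. The sketch below uses a shortened copy of the question's data, and the names static_paths and wanted are illustrative.

```python
static_paths = [
    "[N4-NFS-NETAPP-8040-C01]/[N4-NHT-LEAF-VPC-FAS-C02-N2][vlan-480]",
    "[N4-OLDCLOUD-IPSTORAGE/epg-N4-NFS-NETAPP-8040-C01]/[vlan-481]]",
    "['N4-NFS-NETAPP-8040-C01]/[N4-NHT-LEAF-VPC-FAS-C02-N2][vlan-484]",
]
wanted = ["vlan-480", "vlan-484"]  # add or remove entries freely

# Keep a path if at least one wanted substring occurs in it.
path_matches = [p for p in static_paths if any(w in p for w in wanted)]
print(len(path_matches))  # 2
```

Because this is plain substring matching, a shorter value such as 'vlan-48' would match all three sample paths, so the search strings should be as specific as the data requires.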
I'm learning lists and in order to get a better understanding I thought I'd apply some basic concepts I've learned so far.
What I'm attempting to do through my code is to add a new name to my list_of_Names and have it add a last name automatically. This is where I'm stuck.
My solution was using " Washington".join(newPerson) but that clearly doesn't work.
And please don't mind the efficiency of the code, I'm creating a new list just so I can apply the pop() command in a new scenario.
Also I've looked up similar issues, please don't tell me to use the map() command if it is somehow possible.
list_of_Names = ["Wallace Washington"]
def addNewMemeber(name):
    newPerson = []
    newPerson.append(name)
    " Washington".join(newPerson)
    list_of_Names.append(newPerson.pop())
addNewMemeber("William")
print(list_of_Names, end=", ")
1) I have rewritten your code to get your desired result:
list_of_Names = ["Wallace Washington"]
def addNewMemeber(name):
    name = name + " Washington"
    # This assumes the surname is fixed. If it is not, store the surnames in a list and choose one with if-else logic.
    list_of_Names.append(name)
addNewMemeber("William")
print(list_of_Names, end=", ")
2) The same solution with the join() method:
list_of_Names = ["Wallace Washington"]
def addNewMemeber(name):
    l = []
    l.append(name)
    l.append("Washington")
    name = " ".join(l)
    # join() combines the list elements into one string, separated by " ".
    list_of_Names.append(name)
addNewMemeber("William")
print(list_of_Names)
I hope this will help you.
You just need to add a string to the list list_of_Names. There is no point in the newPerson list.
def addNewMemeber(name):
    list_of_Names.append(f'{name} Washington')
If you want to use pop just for the sake of it, there are two problems:
Your call to join doesn't save the result to a variable.
join doesn't make sense here: it places the separator between the elements of a list, and newPerson has only one element.
So if you really want to add to the newPerson list, don't use join at all.
def addNewMemeber(name):
    newPerson = [f'{name} Washington']
    list_of_Names.append(newPerson.pop())
You can read about join function here.
I think the code you are looking for is as follows:
list_of_Names = ["Wallace Washington"]
def addNewMemeber(name):
    name += ' Washington'
    list_of_Names.append(name)
addNewMemeber("x")
How can I change the values of multiple parameters in this URL: https://google.com/?test=sadsad&again=tesss&dadasd=asdaas
Here is my code. It only changes two values.
This is the result: https://google.com/?test=aaaaa&dadasd=howwww
The again parameter is missing from the result. How can I change its value and include it in the URL?
def between(value, a, b):
    pos_a = value.find(a)
    if pos_a == -1: return ""
    pos_b = value.rfind(b)
    if pos_b == -1: return ""
    adjusted_pos_a = pos_a + len(a)
    if adjusted_pos_a >= pos_b: return ""
    return value[adjusted_pos_a:pos_b]
def before(value, a):
    pos_a = value.find(a)
    if pos_a == -1: return ""
    return value[0:pos_a]
def after(value, a):
    pos_a = value.rfind(a)
    if pos_a == -1: return ""
    adjusted_pos_a = pos_a + len(a)
    if adjusted_pos_a >= len(value): return ""
    return value[adjusted_pos_a:]
test = "https://google.com/?test=sadsad&again=tesss&dadasd=asdaas"
if "&" in test:
    print(test.replace(between(test, "=", "&"), 'aaaaa').replace(after(test, "="), 'howwww'))
else:
    print(test.replace(after(test, "="), 'test'))
Thanks!
From your code it seems like you are probably fairly new to programming, so first of all congratulations on having attempted to solve your problem.
As you might expect, there are language features you may not know about yet that can help with problems like this. (There are also libraries specifically for parsing URLs, but pointing you to those wouldn't help your progress in Python quite as much. If you are just trying to get a job done, they might be a godsend.)
Since the question lacks a little clarity (don't worry - I can only speak and write English, so you are ahead of me there), I'll try to explain a simpler approach to your problem. From the last block of your code I understand your intent to be:
"If there are multiple parameters, replace the value of the first with 'aaaaa' and the others with 'howwww'. If there is only one, replace its value with 'test'."
Your code is a fair attempt (at what I think you want to do). I hope the following discussion will help you. First, set url to your example initially.
>>> url = "https://google.com/?test=sadsad&again=tesss&dadasd=asdaas"
While the code deals with multiple arguments or one, it doesn't deal with no arguments at all. This may or may not matter, but I like to program defensively, having made too many silly mistakes in the past. Further, detecting that case early simplifies the remaining logic by eliminating an "edge case" (something the general flow of your code does not handle). If I were writing a function (good when you want to repeat actions) I'd start it with something like
if "?" not in url:
    return url
I skipped this here because I know what the sample string is and I'm not writing a function. Once you know there are arguments, you can split them out quite easily with
>>> stuff, args = url.split("?", 1)
The second argument to split is another defensive measure, telling it to ignore all but the first question mark. Since we know there is at least one, this guarantees there will always be two elements in the result, and Python won't complain about a mismatch between the number of names and the number of values in that assignment. Let's confirm their values:
>>> stuff, args
('https://google.com/', 'test=sadsad&again=tesss&dadasd=asdaas')
Now we have the arguments alone, we can split them out into a list:
>>> key_vals = args.split("&")
>>> key_vals
['test=sadsad', 'again=tesss', 'dadasd=asdaas']
Now you can create a list of key,value pairs:
>>> kv_pairs = [kv.split("=", 1) for kv in key_vals]
>>> kv_pairs
[['test', 'sadsad'], ['again', 'tesss'], ['dadasd', 'asdaas']]
At this point you can do whatever is appropriate to the keys and values - deleting elements, changing values, changing keys, and so on. You could create a dictionary from them, but beware repeated keys. I assume you can change kv_pairs to reflect the final URL you want.
Once you have made the necessary changes, putting the return value back together is relatively simple: we have to put an "=" between each key and value, then a "&" between each resulting string, then join the stuff back up with a "?". One step at a time:
>>> [f"{k}={v}" for (k, v) in kv_pairs]
['test=sadsad', 'again=tesss', 'dadasd=asdaas']
>>> "&".join(f"{k}={v}" for (k, v) in kv_pairs)
'test=sadsad&again=tesss&dadasd=asdaas'
>>> stuff + "?" + "&".join(f"{k}={v}" for (k, v) in kv_pairs)
'https://google.com/?test=sadsad&again=tesss&dadasd=asdaas'
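The interactive steps above can be gathered into one function. This is only a sketch of that flow: the name set_params and its new_values dictionary argument are illustrative, and no percent-encoding is applied to the replacement values.

```python
def set_params(url, new_values):
    """Return url with the query values named in new_values replaced."""
    if "?" not in url:
        return url  # no query string, nothing to change
    stuff, args = url.split("?", 1)
    kv_pairs = [kv.split("=", 1) for kv in args.split("&")]
    for pair in kv_pairs:
        # Items without "=" split into a single element and are left alone.
        if len(pair) == 2 and pair[0] in new_values:
            pair[1] = new_values[pair[0]]
    return stuff + "?" + "&".join("=".join(pair) for pair in kv_pairs)


url = "https://google.com/?test=sadsad&again=tesss&dadasd=asdaas"
print(set_params(url, {"test": "aaaaa", "again": "bbbbb", "dadasd": "howwww"}))
# https://google.com/?test=aaaaa&again=bbbbb&dadasd=howwww
```

Keys that are absent from new_values keep their original values, and the order of the parameters is preserved.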
I would use urllib since it handles this for you.
First let's break down the URL.
import urllib.parse
u = urllib.parse.urlparse('https://google.com/?test=sadsad&again=tesss&dadasd=asdaas')
ParseResult(scheme='https', netloc='google.com', path='/', params='', query='test=sadsad&again=tesss&dadasd=asdaas', fragment='')
Then let's isolate the query element.
data = dict(urllib.parse.parse_qsl(u.query))
{'test': 'sadsad', 'again': 'tesss', 'dadasd': 'asdaas'}
Now let's update some elements.
data.update({
    'test': 'foo',
    'again': 'fizz',
    'dadasd': 'bar'})
Now we should encode it back to the proper format.
encoded = urllib.parse.urlencode(data)
'test=foo&again=fizz&dadasd=bar'
And finally let us assemble the whole URL back together.
new_parts = (u.scheme, u.netloc, u.path, u.params, encoded, u.fragment)
final_url = urllib.parse.urlunparse(new_parts)
'https://google.com/?test=foo&again=fizz&dadasd=bar'
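One caveat with dict(parse_qsl(...)) is that repeated keys such as a=1&a=2 collapse into a single entry. A variation that keeps the (key, value) pairs as a list avoids that. The sketch below is an assumption-laden alternative rather than a correction: update_query is a made-up name, and parse_qsl drops parameters with blank values unless keep_blank_values=True is passed.

```python
from urllib.parse import parse_qsl, urlencode, urlparse


def update_query(url, changes):
    """Return url with the values of the keys in changes replaced."""
    parts = urlparse(url)
    # parse_qsl returns (key, value) pairs, so repeated keys survive.
    pairs = [(k, changes.get(k, v)) for k, v in parse_qsl(parts.query)]
    # ParseResult is a named tuple; _replace builds a modified copy.
    return parts._replace(query=urlencode(pairs)).geturl()


url = "https://google.com/?test=sadsad&again=tesss&dadasd=asdaas"
print(update_query(url, {"test": "foo", "again": "fizz"}))
# https://google.com/?test=foo&again=fizz&dadasd=asdaas
```

urlencode also percent-encodes the new values, which hand-built string concatenation does not do.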
Is it necessary to do it from scratch? If not, use the urllib package that is already included in vanilla Python.
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse
url = "https://google.com/?test=sadsad&again=tesss&dadasd=asdaas"
parsed_url = urlparse(url)
qs = dict(parse_qsl(parsed_url.query))
# {'test': 'sadsad', 'again': 'tesss', 'dadasd': 'asdaas'}
if 'again' in qs:
    del qs['again']
# {'test': 'sadsad', 'dadasd': 'asdaas'}
parts = list(parsed_url)
parts[4] = urlencode(qs)
# ['https', 'google.com', '/', '', 'test=sadsad&dadasd=asdaas', '']
new_url = urlunparse(parts)
# https://google.com/?test=sadsad&dadasd=asdaas
How can I make a unique URL in Python, à la http://imgur.com/gM19g or http://tumblr.com/xzh3bi25y?
When I use uuid from Python I get a very long one. I want something shorter for URLs.
Edit: Here, I wrote a module for you. Use it. http://code.activestate.com/recipes/576918/
Counting up from 1 will guarantee short, unique URLS. /1, /2, /3 ... etc.
Adding uppercase and lowercase letters to your alphabet will give URLs like those in your question. And you're just counting in base-62 instead of base-10.
Now the only problem is that the URLs come consecutively. To fix that, read my answer to this question here:
Map incrementing integer range to six-digit base 26 max, but unpredictably
Basically the approach is to simply swap bits around in the incrementing value to give the appearance of randomness while maintaining determinism and guaranteeing that you don't have any collisions.
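As a rough illustration of the bit-swapping idea, the sketch below reverses the bits of a fixed-width counter. This is not the scheme from the linked answer, which is not reproduced here. It is just one simple bijection, so distinct inputs always give distinct outputs, and the name scramble and the width parameter are assumptions. It is trivially reversible and offers no security.

```python
def scramble(n, width=32):
    """Reverse the low `width` bits of n; applying it twice gives n back."""
    if not 0 <= n < 2 ** width:
        raise ValueError("n does not fit in the given width")
    bits = format(n, "0{}b".format(width))
    return int(bits[::-1], 2)


print([scramble(i, width=8) for i in range(1, 5)])  # [128, 64, 192, 32]
```

The scrambled integer would then be encoded in base 62 (or similar) to form the short URL.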
I'm not sure most URL shorteners use a random string. My impression is they write the URL to a database, then use the integer ID of the new record as the short URL, encoded base 36 or 62 (letters+digits).
Python code to convert an int to a string in arbitrary bases is here.
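For reference, a conversion between integers and an arbitrary alphabet can be sketched in a few lines. This is not the linked code. The names ALPHABET, encode and decode are illustrative, and the alphabet order (digits, then lowercase, then uppercase) is an arbitrary choice.

```python
import string

# 62 symbols: 0-9, a-z, A-Z. Slice to ALPHABET[:36] for base 36.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode(n, alphabet=ALPHABET):
    """Convert a non-negative integer to a string in base len(alphabet)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return alphabet[0]
    base = len(alphabet)
    digits = []
    while n:
        n, r = divmod(n, base)
        digits.append(alphabet[r])
    return "".join(reversed(digits))


def decode(s, alphabet=ALPHABET):
    """Inverse of encode: convert the string back to an integer."""
    base = len(alphabet)
    n = 0
    for ch in s:
        n = n * base + alphabet.index(ch)
    return n


print(encode(61), encode(62), decode("10"))  # Z 10 62
```

With a database row id as input, encode(row_id) gives the short path and decode recovers the id when the URL is requested.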
Python's short_url is awesome.
Here is an example:
import short_url
id = 20 # your object id
domain = 'mytiny.domain'
shortened_url = "http://{}/{}".format(
    domain,
    short_url.encode_url(id)
)
And to decode the code:
decoded_id = short_url.decode_url(param)
That's it :)
Hope this will help.
Hashids is an awesome tool for this.
Edit:
Here's how to use Hashids to generate a unique short URL with Python:
from hashids import Hashids
pk = 123 # Your object's id
domain = 'imgur.com' # Your domain
hashids = Hashids(salt='this is my salt', min_length=6)
link_id = hashids.encode(pk)
url = 'http://{domain}/{link_id}'.format(domain=domain, link_id=link_id)
This module will do what you want, guaranteeing that the string is globally unique (it is a UUID):
http://pypi.python.org/pypi/shortuuid/0.1
If you need something shorter, you should be able to truncate it to the desired length and still get something that is reasonably likely to avoid clashes.
This answer comes pretty late, but I stumbled upon this question when I was planning to create a URL shortener project. Now that I have implemented a fully functional URL shortener (source code at amitt001/pygmy), I am adding an answer here for others.
The basic principle behind any URL shortener is to get an int from the long URL and then use base62 (or base32, etc.) encoding to convert this int to a more readable short URL.
How is this int generated?
Most URL shorteners add the URL to a datastore with an auto-incrementing id and then base62-encode that id.
A sample base62 encoding program:
# Base-62 hash
import string
_BASE = 62
class HashDigest:
    """Base 62 hash library."""
    def __init__(self):
        self.base = string.ascii_letters + string.digits
        self.short_str = ''
    def encode(self, j):
        """Returns the repeated div mod of the number.
        :param j: int
        :return: list
        """
        if j == 0:
            return [j]
        r = []
        dividend = j
        while dividend > 0:
            dividend, remainder = divmod(dividend, _BASE)
            r.append(remainder)
        r = list(reversed(r))
        return r
    def shorten(self, i):
        """
        :param i: int
        :return: str
        """
        self.short_str = ""
        encoded_list = self.encode(i)
        for val in encoded_list:
            self.short_str += self.base[val]
        return self.short_str
This is just the partial code showing base62 encoding. Check out the complete base62 encoding/decoding code at core/hashdigest.py
All the links in this answer are shortened using the project I created.
The reason UUIDs are long is because they contain lots of information so that they can be guaranteed to be globally unique.
If you want something shorter, then you'll need to do something like generate a random string, check whether it is in the universe of already generated strings, and repeat until you get an unused string. You'll also need to watch out for concurrency here (what if the same string gets generated by a separate process before you insert it into the set of strings?).
If you need some help generating random strings in Python, this other question might help.
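The generate-check-repeat loop can be sketched as follows. This is a single-process illustration only: the in-memory set named used stands in for whatever storage holds the existing ids, and new_unique_id is a hypothetical helper. With several processes the check and the insert would have to be atomic, for example through a unique constraint in a database.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def new_unique_id(used, length=6):
    """Return a random id not present in `used`, and record it there."""
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if candidate not in used:
            used.add(candidate)
            return candidate


used_ids = set()
ids = [new_unique_id(used_ids) for _ in range(1000)]
print(len(ids) == len(set(ids)))  # True
```

The retry loop rarely repeats while the id space (62**length) is much larger than the number of ids issued, and it slows down as the space fills up.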
It doesn't really matter that this is Python, but you just need a hash function that maps to the length you want. For example, maybe use MD5 and then take just the first n characters. You'll have to watch out for collisions in that case, though, so you might want to pick something a little more robust in terms of collision detection (like using primes to cycle through the space of hash strings).
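A truncated hash might look like the sketch below. The function name short_hash and the default length are illustrative. Truncation makes collisions more likely, so the result still has to be checked against the codes already issued.

```python
import hashlib


def short_hash(url, n=7):
    """Return the first n hex characters of the MD5 digest of url."""
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return digest[:n]


code = short_hash("http://www.example.com/some/long/path")
print(len(code))  # 7
```

Unlike a random string, the same URL always maps to the same code, which avoids storing duplicates for a URL that has already been shortened.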
I don't know if you can use this, but we generate content objects in Zope that get unique numeric ids based on current time strings, in millis (eg, 1254298969501)
Maybe you can guess the rest. Using the recipe described here:
How to convert an integer to the shortest url-safe string in Python?, we encode and decode the real id on the fly, with no need for storage. A 13-digit integer is reduced to 7 alphanumeric chars in base 62, for example.
To complete the implementation, we registered a short (xxx.yy) domain name that decodes and does a 301 redirect for "not found" URLs.
If I were starting over, I would subtract the "starting-over" time (in millis) from the numeric id prior to encoding, then re-add it when decoding, or else do so when generating the objects. That would make the result much shorter.
You can generate a random string of length N:
import string
import random
def short_random_string(N: int) -> str:
    return ''.join(
        random.SystemRandom().choice(string.ascii_letters + string.digits)
        for _ in range(N)
    )
So:
print(short_random_string(10))
#'G1ZRbouk2U'
All lowercase:
print(short_random_string(10).lower())
#'pljh6kp328'
Try this http://code.google.com/p/tiny4py/ ... It's still under development, but very useful!!
My Goal: Generate a unique identifier of a specified fixed length consisting of the characters 0-9 and a-z. For example:
zcgst5od
9x2zgn0l
qa44sp0z
61vv1nl5
umpprkbt
ylg4lmcy
dec0lu1t
38mhd8i5
rx00yf0e
kc2qdc07
Here's my solution. (Adapted from this answer by kmkaplan.)
import random
class IDGenerator(object):
    ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"
    def __init__(self, length=8):
        self._alphabet_length = len(self.ALPHABET)
        self._id_length = length
    def _encode_int(self, n):
        # Adapted from:
        # Source: https://stackoverflow.com/a/561809/1497596
        # Author: https://stackoverflow.com/users/50902/kmkaplan
        encoded = ''
        while n > 0:
            n, r = divmod(n, self._alphabet_length)
            encoded = self.ALPHABET[r] + encoded
        return encoded
    def generate_id(self):
        """Generate an ID without leading zeros.
        For example, for an ID that is eight characters in length, the
        returned values will range from '10000000' to 'zzzzzzzz'.
        """
        start = self._alphabet_length**(self._id_length - 1)
        end = self._alphabet_length**self._id_length - 1
        return self._encode_int(random.randint(start, end))
if __name__ == "__main__":
    # Sample usage: Generate ten IDs each eight characters in length.
    idgen = IDGenerator(8)
    for i in range(10):
        print(idgen.generate_id())