Strip characters from list in Python - python

So I'm trying to print out a list of VMware templates that are sitting in our lab. I want the output to look like this:
vagrant-ubuntu12.04-small
vagrant-centos6.6-small
vagrant-ubuntu12.04
vagrant-centos6.6
Whereas the current output looks more like this:
['[datastore2] vagrant-ubuntu12.04-small']
['[datastore2] vagrant-centos6.6-small']
['[datastore1] vagrant-centos6.6']
['[datastore1] vagrant-ubuntu12.04']
Here's my code:
from pysphere import VIServer
from pprint import pprint
VSPHERE = VIServer()
VSPHERE.connect('helike.labs.sdelements.com',
                'xxxxxxxxx',
                'xxxxxxxxx')
VMLIST = VSPHERE.get_registered_vms()
def is_template(string):
    """ Is it a template? """
    if string.find(".vmtx") == -1:
        return False
    else:
        return True
def is_vagrant_template(string):
    """ Is it a Vagrant Template? """
    if string.find("vagrant") == -1:
        return False
    else:
        return True
def is_proper_template(string):
    """ filter out extraneous templates """
    if string.find("sde") == -1:
        return True
    else:
        return False
temp1 = filter(is_template, VMLIST)
temp2 = filter(is_vagrant_template, temp1)
temp3 = filter(is_proper_template, temp2)
for item in temp3:
    relist = item.split('/')[:1]
    pprint(relist)
I know this is probably really amateurish code but I'm not really a python guy. Is there some kind of regex or something I could use to clean this up a bit?

If it is always the same format just split once on whitespace and extract the second element:
data = [['[datastore2] vagrant-ubuntu12.04-small'],
        ['[datastore2] vagrant-centos6.6-small'],
        ['[datastore1] vagrant-centos6.6'],
        ['[datastore1] vagrant-ubuntu12.04']]
for sub in data:
    print(sub[0].split(None,1)[1])
vagrant-ubuntu12.04-small
vagrant-centos6.6-small
vagrant-centos6.6
vagrant-ubuntu12.04
You can probably also do the split before you put the data in a list but without seeing the actual input it is impossible to say for sure.

A simple regex can do it and gives some flexibility.
You can either grab capture group 1 into an array,
or do a global find and replace with capture group 1.
If you don't know all possible characters, just replace
[a-z\d.-]+ with \S+
(?mi)^\['\[[^\]]*\]\h+([a-z\d.-]+)\h*'\]
(?mi) # Modes: Multi-line, No-Case
^ # BOL
\[' \[ [^\]]* \]
\h+
( [a-z\d.-]+ ) # (1)
\h*
'\]
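Python's re module does not support the \h (horizontal whitespace) escape used above, so a Python version needs a substitute. The following is a minimal sketch, assuming the input is the printed form of each one-item list; the sample text is made up for illustration, and [ \t] stands in for \h.

```python
import re

# Hypothetical sample: the printed representation of each one-item list.
lines = """\
['[datastore2] vagrant-ubuntu12.04-small']
['[datastore1] vagrant-centos6.6']
"""

# Same shape as the pattern above, with [ \t] replacing \h and the
# inline (?mi) modes expressed as flags.
pattern = re.compile(
    r"^\['\[[^\]]*\][ \t]+([a-z\d.-]+)[ \t]*'\]",
    re.MULTILINE | re.IGNORECASE,
)

# findall returns the contents of capture group 1 for every match.
names = pattern.findall(lines)
print(names)
```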

The function you're looking for is map:
https://docs.python.org/2/library/functions.html#map
What you'd want to do is call map after filter, like so:
def is_proper_vagrant_template(string):
    """ Is it a proper Vagrant template? """
    return ".vmtx" in string and "vagrant" in string and "sde" not in string
def clean_template(string):
    """ Return the second half of the string, assuming whitespace as a separator """
    return string.split()[1]
temp1 = filter(is_proper_vagrant_template, VMLIST)
clean = map(clean_template, temp1)
In the snippet above, filter works the same way as what you had before, only I rewrote the call to combine your three functions into one. The map function takes the filtered list and calls clean_template on each element, returning the results as a list.
clean_template returns the second half of the string (the part that you're interested in), assuming there is no whitespace in the string other than what you identified.
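In Python 3, filter and map return lazy iterators rather than lists, so a list comprehension may be the simpler way to get a list back. The sketch below assumes each registered path looks like "[datastore] folder/file.vmtx"; the sample VMLIST is a hypothetical stand-in for VSPHERE.get_registered_vms(), and the real paths may differ.

```python
# Hypothetical stand-in for VSPHERE.get_registered_vms(); real paths may differ.
VMLIST = [
    '[datastore2] vagrant-ubuntu12.04-small/vagrant-ubuntu12.04-small.vmtx',
    '[datastore1] sde-vagrant-centos6.6/sde-vagrant-centos6.6.vmtx',
    '[datastore1] vagrant-centos6.6/vagrant-centos6.6.vmx',
]

def is_proper_vagrant_template(path):
    """Keep .vmtx templates that mention vagrant but not sde."""
    return ".vmtx" in path and "vagrant" in path and "sde" not in path

def clean_template(path):
    """Drop the "[datastore] " prefix, then keep the folder name before the first "/"."""
    return path.split(None, 1)[1].split('/')[0]

# Filter and transform in one pass; the result is a real list in Python 2 and 3.
clean = [clean_template(p) for p in VMLIST if is_proper_vagrant_template(p)]
for name in clean:
    print(name)
```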

Related

I have a problem with the task of reversing words and removing parentheses

Task
Write a program that will decode the secret message by reversing text
between square brackets. The message may contain nested brackets (that
is, brackets within brackets, such as One[owT[Three[ruoF]]]). In
this case, innermost brackets take precedence, similar to parentheses
in mathematical expressions, e.g. you could decode the aforementioned
example like this:
One[owT[Three[ruoF]]]
One[owT[ThreeFour]]
One[owTruoFeerhT]
OneThreeFourTwo
In order to make your own task slightly easier and less tricky, you
have already replaced all whitespaces in the original text with
underscores (“_”) while copying it from the paper version.
Input description
The first and only line of the standard input
consists of a non-empty string of up to 2 · 10^6 characters which may
be letters, digits, basic punctuation (“,.?!’-;:”), underscores (“_”)
and square brackets (“[]”). You can safely assume that all square
brackets are paired correctly, i.e. every opening bracket has exactly
one closing bracket matching it and vice versa.
Output description
The standard output should contain one line – the
decoded secret message without any square brackets.
Example
For sample input:
A[W_[y,[]]oh]o[dlr][!]
the correct output is:
Ahoy,_World!
Explanation
This example contains empty brackets. Of course, an empty string, when
reversed, remains empty, so we can simply ignore them. Then, as
previously, we can decode this example in stages, first reversing the
innermost brackets to obtain A[W_,yoh]o[dlr][!]. Afterwards, there
are no longer any nested brackets, so the remainder of the task is
trivial.
Below is my program that doesn't quite work
word = input("print something: ")
word_reverse = word[::-1]
while("[" in word and "]" in word):
    open_brackets_index = word.index("[")
    close_brackets_index = word_reverse.index("]")*(-1)-1
    # print(word)
    # print(open_brackets_index)
    # print(close_brackets_index)
    reverse_word_into_quotes = word[open_brackets_index+1:close_brackets_index:][::-1]
    word = word[:close_brackets_index]
    word = word[:open_brackets_index]
    word = word+reverse_word_into_quotes
    word = word.replace("[","]").replace("]","[")
    print(word)
print(word)
Unfortunately my code only works with one pair of parentheses and I don't know how to fix it.
Thank you in advance for your help
Assuming the re module can be used, this code does the job:
import re
text = 'A[W_[y,[]]oh]o[dlr][!]'
# This scary regular expression does all the work:
# it finds a sequence that starts with [ and ends with ] and
# contains anything BUT [ and ]
pattern = re.compile(r'\[([^\[\]]*)\]')
while True:
    m = re.search(pattern, text)
    if m:
        # Here a single pattern like [String], if any, is replaced with gnirtS
        text = re.sub(pattern, m[1][::-1], text, count=1)
    else:
        break
print(text)
Which prints this line:
Ahoy,_World!
I realize that my previous answer has been accepted but, for completeness, I'm submitting a second solution that does NOT use the re module:
text = 'A[W_[y,[]]oh]o[dlr][!]'
def find_pattern(text):
    # Find [...] and return the locations of [ (start) ] (end)
    # and the in-between str (content)
    content = ''
    for i,c in enumerate(text):
        if c == '[':
            content = ''
            start = i
        elif c == ']':
            end = i
            return start, end, content
        else:
            content += c
    return None, None, None
while True:
    start, end, content = find_pattern(text)
    if start is None:
        break
    # Replace the content between [] with its reverse
    text = "".join((text[:start], content[::-1], text[end+1:]))
print(text)
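Both versions above rescan the text from the start after every replacement, which could become slow near the upper input limit. As an alternative, here is a minimal sketch of a single left-to-right pass that keeps a stack of partially built segments; the function name decode is made up for illustration, and the sketch relies on the task's guarantee that brackets are correctly paired.

```python
def decode(message):
    """Reverse the text inside every bracket pair, innermost first."""
    # Each stack entry collects the characters of one open bracket level;
    # the bottom entry is the top-level output.
    stack = [[]]
    for ch in message:
        if ch == '[':
            stack.append([])           # start a new nested segment
        elif ch == ']':
            segment = stack.pop()      # finished segment, already decoded inside
            segment.reverse()
            stack[-1].extend(segment)  # splice it into the enclosing level
        else:
            stack[-1].append(ch)
    return ''.join(stack[0])

print(decode('A[W_[y,[]]oh]o[dlr][!]'))
print(decode('One[owT[Three[ruoF]]]'))
```

Because an inner segment is reversed before it is spliced into its parent, reversing the parent later restores the inner text to the order the task expects.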

how to delete char after -> without using a regular expression

Given a string s representing characters typed into an editor,
with "->" representing a delete, return the current state of the editor.
Each "->" should delete one character. If there are two in a row, i.e. "->->", it should delete the 2 characters after them.
Example 1
Input
s = "a->bcz"
Output
"acz"
Explanation
The "b" got deleted by the delete.
Example 2
Input
s = "->x->z"
Output
empty string
Explanation
All characters are deleted. Also note you can type delete when the editor
is empty as well.
I have tried the following function, but it didn't work:
def delete_forward(text):
    """
    return the current state of the editor after deletion of characters
    """
    f = "->"
    for i in text:
        if (i==f):
            del(text[i+1])
How can I complete this without using regular expressions?
Strings do not support item deletion. You have to create a new string.
>>> astring = 'abc->def'
>>> astring.index('->') # Look at the index of the target string
3
>>> x=3
>>> astring[x:x+3] # Here is the slice you want to remove
'->d'
>>> astring[0:x] + astring[x+3:] # Here is a copy of the string before and after, but not including the slice
'abcef'
This only handles one '->' per string, but you can iterate on it.
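One way to iterate on this idea without repeated slicing is a single left-to-right scan that counts how many deletions are still owed. This is only a sketch, under the assumption (taken from the problem statement) that k consecutive "->" markers delete the next k ordinary characters; the variable names are illustrative.

```python
def delete_forward(s):
    """Return the editor state after applying every "->" forward delete."""
    kept = []
    pending = 0  # deletions still owed by arrows seen so far
    i = 0
    while i < len(s):
        if s.startswith('->', i):
            pending += 1      # an arrow: owe one more deletion
            i += 2
        elif pending:
            pending -= 1      # an ordinary character consumed by a delete
            i += 1
        else:
            kept.append(s[i])
            i += 1
    # Deletions left over at the end of the text are simply ignored.
    return ''.join(kept)

print(delete_forward('a->bcz'))
print(repr(delete_forward('->x->z')))
```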
Here's a simple recursive solution:
# Constant storing the length of the arrow
ARROW_LEN = len('->')
def delete_forward(s: str):
    try:
        first_occurence = s.index('->')
    except ValueError:
        # No more arrows in string
        return s
    if s[first_occurence + ARROW_LEN:first_occurence + ARROW_LEN + ARROW_LEN] == '->':
        # Don't delete part of the next arrow
        next_s = s[first_occurence + ARROW_LEN:]
    else:
        # Delete the character immediately following the arrow
        next_s = s[first_occurence + ARROW_LEN + 1:]
    # Keep everything before the arrow and recurse on the remainder
    return s[:first_occurence] + delete_forward(next_s)
Remember, python strings are immutable so you should instead rely on string slicing to create new strings as you go.
In each recursion step, the first index of -> is located and everything before this is extracted out. Then, check if there's another -> immediately following the current location - if there is, don't delete the next character and call delete_forward with everything after the first occurrence. If what is immediately followed is not an arrow, delete the immediately next character after the current arrow, and feed it into delete_forward.
This will turn x->zb into xb.
The base case for the recursion is when .index finds no matches, in which case the result string is returned.
Output
>>> delete_forward('ab->cz')
'abz'
>>> delete_forward('abcz')
'abcz'
>>> delete_forward('->abc->z')
'bc'
>>> delete_forward('abc->z->')
'abc'
>>> delete_forward('a-->b>x-->c>de->f->->g->->->->->')
'a->x->de'
There are several ways to achieve this in Python, e.g.:
Using split and a list comprehension (if you want to delete a single character every time one or more delete markers are encountered):
def delete_forward(s):
    return ''.join([s.split('->')[0]] + [i[1:] if len(i)>1 else "" for i in s.split('->')[1:]])
Now delete_forward("a->bcz") returns 'acz' and delete_forward("->x->z") returns ''. Note that this works whether there are many delete markers, one, or none at all, and it will not raise an exception as long as the input is a str. It does, however, assume you want to delete a single character every time one or more consecutive delete markers are encountered.
If you want to delete as many characters as the number of times delete characters occur:
def delete_forward(s):
    new_str =''
    start = 0
    for end in [i for i in range(len(s)) if s.startswith('->', i)] +[len(s)+1]:
        new_str += s[start:end]
        count = 0
        start = max(start, end)
        while s[start:start+2] =='->':
            count+=1
            start+=2
        start += count
    return new_str
This produces the same output for the two cases above; however, for 'a->->bc' it produces 'a' instead of the 'ac' produced by the first function.

I want to split a string by a character on its first occurence, which belongs to a list of characters. How to do this in python?

Basically, I have a list of special characters. I need to split a string by a character if it belongs to this list and exists in the string. Something on the lines of:
def find_char(string):
    if string.find("some_char"):
        #do xyz with some_char
    elif string.find("another_char"):
        #do xyz with another_char
    else:
        return False
and so on. The way I think of doing it is:
def find_char_split(string):
    char_list = [",","*",";","/"]
    for my_char in char_list:
        if string.find(my_char) != -1:
            my_strings = string.split(my_char)
            break
        else:
            my_strings = False
    return my_strings
Is there a more pythonic way of doing this? Or would the above procedure be fine? Please help, I'm not very proficient in Python.
(EDIT): I want it to split on the first occurrence of the character, which is encountered first. That is to say, if the string contains multiple commas, and multiple stars, then I want it to split by the first occurrence of the comma. Please note, if the star comes first, then it will be broken by the star.
I would favor using the re module for this because the expression for splitting on multiple arbitrary characters is very simple:
r'[,*;/]'
The brackets create a character class that matches anything inside of them. The code is like this:
import re
results = re.split(r'[,*;/]', my_string, maxsplit=1)
The maxsplit argument makes it so that the split only occurs once.
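To make the behaviour concrete, here is a small sketch that wraps the call in a helper. The name split_on_first is made up for illustration, re.escape is added so that characters such as "*" stay literal inside the character class, and returning False when nothing matches simply mirrors the question's code.

```python
import re

def split_on_first(string, chars=",*;/"):
    """Split once at whichever of the listed characters appears first."""
    # re.escape keeps every listed character literal inside the [...] class.
    pattern = '[{}]'.format(re.escape(chars))
    parts = re.split(pattern, string, maxsplit=1)
    # Mirror the question's behaviour: False when no delimiter is present.
    return parts if len(parts) == 2 else False

print(split_on_first('a*b,c,d'))   # the star comes first, so it is the split point
print(split_on_first('a,b*c'))     # here the comma comes first
print(split_on_first('abc'))       # no delimiter at all
```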
If you are doing the same split many times, you can compile the regex and search on that same expression a little bit faster (but see Jon Clements' comment below):
c = re.compile(r'[,*;/]')
results = c.split(my_string, maxsplit=1)
If this speed-up is important (it probably isn't), you can use the compiled version in a function instead of recompiling it every time. Make a separate function that stores the compiled expression:
def split_chars(chars, maxsplit=0, flags=0, string=None):
    # see note about the + symbol below
    c = re.compile('[{}]+'.format(''.join(chars)), flags=flags)
    def f(string, maxsplit=maxsplit):
        return c.split(string, maxsplit=maxsplit)
    return f if string is None else f(string)
Then:
special_split = split_chars(',*;/', maxsplit=1)
result = special_split(my_string)
But also:
result = split_chars(',*;/', string=my_string, maxsplit=1)
The purpose of the + character is to treat multiple delimiters as one if that is desired (thank you Jon Clements). If this is not desired, you can just use re.compile('[{}]'.format(''.join(chars))) above. Note that with maxsplit=1, this will not have any effect.
Finally: have a look at this talk for a quick introduction to regular expressions in Python, and this one for a much more information packed journey.

Python test if string matches a template value

I am trying to iterate through a list of strings, keeping only those that match a naming template I have specified. I want to accept any list entry that matches the template exactly, other than having an integer in a variable <SCENARIO> field.
The check needs to be general. Specifically, the string structure could change such that there is no guarantee <SCENARIO> always shows up at character X (to use list comprehensions, for example).
The code below shows an approach that works using split, but there must be a better way to make this string comparison. Could I use regular expressions here?
template = 'name_is_here_<SCENARIO>_20131204.txt'
testList = ['name_is_here_100_20131204.txt', # should accept
            'name_is_here_100_20131204.txt.NEW', # should reject
            'other_name.txt'] # should reject
acceptList = []
for name in testList:
    print name
    acceptFlag = True
    splitTemplate = template.split('_')
    splitName = name.split('_')
    # if lengths do not match, name cannot possibly match template
    if len(splitTemplate) == len(splitName):
        print zip(splitTemplate, splitName)
        # compare records in the split
        for t, n in zip(splitTemplate, splitName):
            if t!=n and not t=='<SCENARIO>':
                #reject if any of the "other" fields are not identical
                #(would also check that '<SCENARIO>' field is numeric - not shown here)
                print 'reject: ' + name
                acceptFlag = False
    else:
        acceptFlag = False
    # keep name if it passed checks
    if acceptFlag == True:
        acceptList.append(name)
print acceptList
# correctly prints --> ['name_is_here_100_20131204.txt']
Try with the re module for regular expressions in Python:
import re
# The dot is escaped so that it only matches a literal "."
template = re.compile(r'^name_is_here_(\d+)_20131204\.txt$')
testList = ['name_is_here_100_20131204.txt', #accepted
            'name_is_here_100_20131204.txt.NEW', #rejected!
            'name_is_here_aabs2352_20131204.txt', #rejected!
            'other_name.txt'] #rejected!
acceptList = [item for item in testList if template.match(item)]
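Since the question asks for a check that stays general when the template changes, the regex could also be built from the template string instead of being written by hand. This is a minimal Python 3 sketch; template_to_regex is a hypothetical helper, and it assumes the <SCENARIO> field should accept one or more digits.

```python
import re

def template_to_regex(template, placeholder='<SCENARIO>'):
    """Compile a template into a regex with digits in place of the placeholder."""
    # Escape the literal pieces around the placeholder so '.' and similar
    # characters match themselves, then join them with a digits-only group.
    literal_parts = [re.escape(part) for part in template.split(placeholder)]
    return re.compile(r'(\d+)'.join(literal_parts))

template = 'name_is_here_<SCENARIO>_20131204.txt'
test_list = ['name_is_here_100_20131204.txt',
             'name_is_here_100_20131204.txt.NEW',
             'name_is_here_abc_20131204.txt',
             'other_name.txt']

pattern = template_to_regex(template)
# fullmatch requires the whole name to match, so a trailing ".NEW" is rejected.
accept_list = [name for name in test_list if pattern.fullmatch(name)]
print(accept_list)
```

Splitting on the placeholder before escaping avoids depending on how re.escape treats "<" and ">" in a particular Python version.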
This should do. I understand that name_is_here is just a placeholder for alphanumeric characters?
import re
testList = ['name_is_here_100_20131204.txt', # should accept
            'name_is_here_100_20131204.txt.NEW', # should reject
            'other_name.txt',
            'name_is_44ere_100_20131204.txt',
            'name_is_here_100_2013120499.txt',
            'name_is_here_100_something_2013120499.txt',
            'name_is_here_100_something_20131204.txt']
def find(scenario):
    begin = '[a-z_]+100_' # any combination of letters and underscores followed by 100
    end = r'_[0-9]{8}\.txt$' # exactly eight digits followed by .txt at the end
    pattern = re.compile("".join([begin,scenario,end]))
    result = []
    for word in testList:
        if pattern.match(word):
            result.append(word)
    return result
find('something') # returns ['name_is_here_100_something_20131204.txt']
EDIT: the scenario is now in a separate variable; the regex only matches letters and underscores followed by 100, then the scenario, then eight digits followed by .txt.

Replacing items in string, Python

I'm trying to define a function in python to replace some items in a string. My string is a string that contains degrees minutes seconds (i.e. 216-56-12.02)
I want to replace the dashes so I can get the proper symbols, so my string will look like 216° 56' 12.02"
I tried this:
def FindLabel ([Direction]):
    s = [Direction]
    s = s.replace("-","° ",1) #replace first instance of the dash in the original string
    s = s.replace("-","' ") # replace the remaining dash from the last string
    s = s + """ #add in the minute sign at the end
    return s
This doesn't seem to work. I'm not sure what's going wrong. Any suggestions are welcome.
Cheers,
Mike
Honestly, I wouldn't bother with replacement. Just .split() it:
def find_label(direction):
    degrees, minutes, seconds = direction.split('-')
    return u'{}° {}\' {}"'.format(degrees, minutes, seconds)
You could condense it even more if you want:
def find_label(direction):
    return u'{}° {}\' {}"'.format(*direction.split('-'))
If you want to fix your current code, see my comments:
def FindLabel(Direction): # Not sure why you put square brackets here
    s = Direction # Or here
    s = s.replace("-",u"° ",1)
    s = s.replace("-","' ")
    s += '"' # You have to use single quotes or escape the double quote: `"\""`
    return s
You might have to specify the utf-8 encoding at the top of your Python file as well using a comment:
# This Python file uses the following encoding: utf-8
This is how I would do it, by splitting into a list and then formatting the pieces back together:
s = "{}° {}' {}\"".format(*s.split("-"))
