Unclear on how to frame the following function correctly:
Creating a function that will take in a string and return the string in camel case without spaces (or pascal case if the first letter was already capital), removing special characters
text = "This-is_my_test_string,to-capitalize"
def to_camel_case(text):
# Return 1st letter of text + all letters after
return text[:1] + text.title()[1:].replace(i" ") if not i.isdigit()
# Output should be "ThisIsMyTestStringToCapitalize"
the "if" statement at the end isn't working out, and I wrote this somewhat experimentally, but with a syntax fix, could the logic work?
Providing the input string does not contain any spaces then you could do this:
from re import sub
def to_camel_case(text, pascal=False):
r = sub(r'[^a-zA-Z0-9]', ' ', text).title().replace(' ', '')
return r if pascal else r[0].lower() + r[1:]
ts = 'This-is_my_test_string,to-capitalize'
print(to_camel_case(ts, pascal=True))
print(to_camel_case(ts))
Output:
ThisIsMyTestStringToCapitalize
thisIsMyTestStringToCapitalize
Here is a short solution using regex. First it uses title() as you did, then the regex finds non-alphanumeric-characters and removes them, and finally we take the first character to handle pascal / camel case.
import re
def to_camel_case(s):
s1 = re.sub('[^a-zA-Z0-9]+', '', s.title())
return s[0] + s1[1:]
text = "this-is2_my_test_string,to-capitalize"
print(to_camel_case(text)) # ThisIsMyTestStringToCapitalize
The below should work for your example.
Splitting apart your example by anything that isn's alphanumeric or a space. Then capitalizing each word. Finally, returning the re-joined string.
import re
def to_camel_case(text):
words = re.split(r'[^a-zA-Z0-9\s]', text)
return "".join([word.capitalize() for word in words])
text_to_camelcase = "This-is_my_test_string,to-capitalize"
print(to_camel_case(text_to_camelcase))
use the split function to split between anything that is not a letter or a whitespace and the function .capitalize() to capitalize single words
import re
text_to_camelcase = "This-is_my_test_string,to-capitalize"
def to_camel_case(text):
split_text = re.split(r'[^a-zA-Z0-9\s]', text)
cap_string = ''
for word in split_text:
cap_word = word.capitalize()
cap_string += cap_word
return cap_string
print(to_camel_case(text_to_camelcase))
Related
I am trying to write a function to remove all punctuation characters from a string. I've tried several permutations on translate, replace, strip, etc. My latest attempt uses a brute force approach:
def clean_lower(sample):
punct = list(string.punctuation)
for c in punct:
sample.replace(c, ' ')
return sample.split()
That gets rid of almost all of the punctuation but I'm left with // in front of one of the words. I can't seem to find any way to remove it. I've even tried explicitly replacing it with sample.replace('//', ' ').
What do I need to do?
using translate is the fastest way to remove punctuations, this will remove // too:
import string
s = "This is! a string, with. punctuations? //"
def clean_lower(s):
return s.translate(str.maketrans('', '', string.punctuation))
s = clean_lower(s)
print(s)
Use regular expressions
import re
def clean_lower(s):
return(re.sub(r'\W','',s))
Above function erases any symbols except underscore
Perhaps you should approach it from the perspective of what you want to keep:
For example:
import string
toKeep = set(string.ascii_letters + string.digits + " ")
toRemove = set(string.printable) - toKeep
cleanUp = str.maketrans('', '', "".join(toRemove))
usage:
s = "Hello! world of / and dice".translate(cleanUp)
# s will be 'Hello world of and dice'
as suggested by #jasonharper you need to redefine "sample" and it should work:
import string
sample='// Hello?) // World!'
print(sample)
punct=list(string.punctuation)
for c in punct:
sample=sample.replace(c,'')
print(sample.split())
I'm creating a palindrome checker and it works however I need to find a way to replace/remove punctuation from the given input. I'm trying to do for chr(i) i in range 32,47 then substitute those in with ''. The characters I need excluded are 32 - 47. I've tried using the String module but I can only get it to either exclude spaces or punctuation it can't be both for whatever reason.
I've already tried the string module but can't get that to remove spaces and punctuation at the same time.
def is_palindrome_stack(string):
s = ArrayStack()
for character in string:
s.push(character)
reversed_string = ''
while not s.is_empty():
reversed_string = reversed_string + s.pop()
if string == reversed_string:
return True
else:
return False
def remove_punctuation(text):
return text.replace(" ",'')
exclude = set(string.punctuation)
return ''.join(ch for ch in text if ch not in exclude)
That is because you are returning from your method in the very first line, in return text.replace(" ",''). Change it to text = text.replace(" ", "") and it should work fine.
Also, the indentation is probably messed up in your post, maybe during copy pasting.
Full method snippet:
def remove_punctuation(text):
text = text.replace(" ",'')
exclude = set(string.punctuation)
return ''.join(ch for ch in text if ch not in exclude)
You might use str methods to get rid of unwanted characters following way:
import string
tr = ''.maketrans('','',' '+string.punctuation)
def remove_punctuation(text):
return text.translate(tr)
txt = 'Point.Space Question?'
output = remove_punctuation(txt)
print(output)
Output:
PointSpaceQuestion
maketrans create replacement table, it accepts 3 str-s: first and second must be equal length, n-th character from first will be replaced with n-th character from second, third str is characters to remove. You need only to remove (not replace) characters so first two arguments are empty strs.
I am massaging strings so that the 1st letter of the string and the first letter following either a dash or a slash needs to be capitalized.
So the following string:
test/string - this is a test string
Should look look like so:
Test/String - This is a test string
So in trying to solve this problem my 1st idea seems like a bad idea - iterate the string and check every character and using indexing etc. determine if a character follows a dash or slash, if it does set it to upper and write out to my new string.
def correct_sentence_case(test_phrase):
corrected_test_phrase = ''
firstLetter = True
for char in test_phrase:
if firstLetter:
corrected_test_phrase += char.upper()
firstLetter = False
#elif char == '/':
else:
corrected_test_phrase += char
This just seems VERY un-pythonic. What is a pythonic way to handle this?
Something along the lines of the following would be awesome but I can't pass in both a dash and a slash to the split:
corrected_test_phrase = ' - '.join(i.capitalize() for i in test_phrase.split(' - '))
Which I got from this SO:
Convert UPPERCASE string to sentence case in Python
Any help will be appreciated :)
I was able to accomplish the desired transformation with a regular expression:
import re
capitalized = re.sub(
'(^|[-/])\s*([A-Za-z])', lambda match: match[0].upper(), phrase)
The expression says "anywhere you match either the start of the string, ^, or a dash or slash followed by maybe some space and a word character, replace the word character with its uppercase."
demo
If you don't want to go with a messy splitting-joining logic, go with a regex:
import re
string = 'test/string - this is a test string'
print(re.sub(r'(^([a-z])|(?<=[-/])\s?([a-z]))',
lambda match: match.group(1).upper(), string))
# Test/String - This is a test string
Using double split
import re
' - '.join([i.strip().capitalize() for i in re.split(' - ','/'.join([i.capitalize() for i in re.split('/',test_phrase)]))])
I'm using that:
import string
last = 'pierre-GARCIA'
if last not in [None, '']:
last = last.strip()
if '-' in last:
last = string.capwords(last, sep='-')
else:
last = string.capwords(last, sep=None)
I have to write a code to do 2 things:
Compress more than one occurrence of the space character into one.
Add a space after a period, if there isn't one.
For example:
input> This is weird.Indeed
output>This is weird. Indeed.
This is the code I wrote:
def correction(string):
list=[]
for i in string:
if i!=" ":
list.append(i)
elif i==" ":
k=i+1
if k==" ":
k=""
list.append(i)
s=' '.join(list)
return s
strn=input("Enter the string: ").split()
print (correction(strn))
This code takes any input by the user and removes all the extra spaces,but it's not adding the space after the period(I know why not,because of the split function it's taking the period and the next word with it as one word, I just can't figure how to fix it)
This is a code I found online:
import re
def correction2(string):
corstr = re.sub('\ +',' ',string)
final = re.sub('\.','. ',corstr)
return final
strn= ("This is as .Indeed")
print (correction2(strn))
The problem with this code is I can't take any input from the user. It is predefined in the program.
So can anyone suggest how to improve any of the two codes to do both the functions on ANY input by the user?
Is this what you desire?
import re
def corr(s):
return re.sub(r'\.(?! )', '. ', re.sub(r' +', ' ', s))
s = input("> ")
print(corr(s))
I've changed the regex to a lookahead pattern, take a look here.
Edit: explain Regex as requested in comment
re.sub() takes (at least) three arguments: The Regex search pattern, the replacement the matched pattern should be replaced with, and the string in which the replacement should be done.
What I'm doing here is two steps at once, I've been using the output of one function as input of another.
First, the inner re.sub(r' +', ' ', s) searches for multiple spaces (r' +') in s to replace them with single spaces. Then the outer re.sub(r'\.(?! )', '. ', ...) looks for periods without following space character to replace them with '. '. I'm using a negative lookahead pattern to match only sections, that don't match the specified lookahead pattern (a normal space character in this case). You may want to play around with this pattern, this may help understanding it better.
The r string prefix changes the string to a raw string where backslash-escaping is disabled. Unnecessary in this case, but it's a habit of mine to use raw strings with regular expressions.
For a more basic answer, without regex:
>>> def remove_doublespace(string):
... if ' ' not in string:
... return string
... return remove_doublespace(string.replace(' ',' '))
...
>>> remove_doublespace('hi there how are you.i am fine. '.replace('.', '. '))
'hi there how are you. i am fine. '
You try the following code:
>>> s = 'This is weird.Indeed'
>>> def correction(s):
res = re.sub('\s+$', '', re.sub('\s+', ' ', re.sub('\.', '. ', s)))
if res[-1] != '.':
res += '.'
return res
>>> print correction(s)
This is weird. Indeed.
>>> s=raw_input()
hee ss.dk
>>> s
'hee ss.dk'
>>> correction(s)
'hee ss. dk.'
I can only use a string in my program if it contains no special characters except underscore _. How can I check this?
I tried using unicodedata library. But the special characters just got replaced by standard characters.
You can use string.punctuation and any function like this
import string
invalidChars = set(string.punctuation.replace("_", ""))
if any(char in invalidChars for char in word):
print "Invalid"
else:
print "Valid"
With this line
invalidChars = set(string.punctuation.replace("_", ""))
we are preparing a list of punctuation characters which are not allowed. As you want _ to be allowed, we are removing _ from the list and preparing new set as invalidChars. Because lookups are faster in sets.
any function will return True if atleast one of the characters is in invalidChars.
Edit: As asked in the comments, this is the regular expression solution. Regular expression taken from https://stackoverflow.com/a/336220/1903116
word = "Welcome"
import re
print "Valid" if re.match("^[a-zA-Z0-9_]*$", word) else "Invalid"
You will need to define "special characters", but it's likely that for some string s you mean:
import re
if re.match(r'^\w+$', s):
# s is good-to-go
Everyone else's method doesn't account for whitespaces. Obviously nobody really considers a whitespace a special character.
Use this method to detect special characters not including whitespaces:
import re
def detect_special_characer(pass_string):
regex= re.compile('[#_!#$%^&*()<>?/\|}{~:]')
if(regex.search(pass_string) == None):
res = False
else:
res = True
return(res)
Like the method from Cybernetic, to get those extra characters missed, modify the 2nd line of the function from
regex= re.compile('[#_!#$%^&*()<>?/\|}{~:]')
to
regex= re.compile('[#_!#$%^&*()<>?/\\|}{~:\[\]]')
where the \ and ] characters are escaped with \
So in full:
import re
def detect_special_characer(pass_string):
regex= re.compile('[#_!#$%^&*()<>?/\\\|}{~:[\]]')
if(regex.search(pass_string) == None):
res = False
else:
res = True
return(res)
If a character is not numeric, a space, or is A-Z, then it is special
for character in my_string
if not (character.isnumeric() and character.isspace() and character.isalpha() and character != "_")
print(" \(character)is special"