Python behavior of string in loop - python

In trying to capitalize a string at separators I encountered behavior I do not understand. Can someone please explain why the string s is reverted during the loop? Thanks.
s = 'these-three_words'
seperators = ('-','_')
for sep in seperators:
    s = sep.join([i.capitalize() for i in s.split(sep)])
    print s
print s
stdout:
These-Three_words
These-three_Words
These-three_Words

capitalize turns the first character uppercase and the rest of the string lowercase.
In the first iteration, it looks like this:
>>> [i.capitalize() for i in s.split('-')]
['These', 'Three_words']
In the second iteration, the string is then separated into:
>>> [i for i in s.split('_')]
['These-Three', 'words']
So running capitalize on both will then turn the T in Three lowercase.
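A minimal sketch of one way around this, assuming the goal is to uppercase only the first letter of each piece and leave every other character untouched. The helper name capitalize_at_separators is made up for illustration; it is not from the answers here:

```python
import re

def capitalize_at_separators(text, separators="-_"):
    """Uppercase the first character and every character right after a separator."""
    # Pattern: start of string or one separator (group 1), then one character (group 2).
    pattern = r"(^|[{}])(.)".format(re.escape(separators))
    # Keep group 1 as it is and uppercase only group 2; nothing is lowercased.
    return re.sub(pattern, lambda m: m.group(1) + m.group(2).upper(), text)

print(capitalize_at_separators("these-three_words"))  # These-Three_Words
```

Because nothing is ever lowercased, a later separator cannot undo the work done for an earlier one.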

You could use title():
>>> s = 'these-three_words'
>>> print s.title()
These-Three_Words

str.capitalize capitalizes the first character and lowercases the remaining characters.

Capitalize() will return a copy of the string with only its first character capitalized. You could use this:
def cap(s):
    return s[0].upper() + s[1:]

Related

Does the split method in Python return something containing \u for some characters, and how to get rid of it?

I have a unicode string:
s = "ᠤᠷᠢᠳᠤ ᠲᠠᠯ᠎ᠠ ᠶᠢᠨ ᠬᠠᠪᠲᠠᠭᠠᠢ ᠬᠡᠪᠲᠡᠭᠡ"
The list that the split method returns is somewhat changed, with a \u180e in the second word.
>>> print(s.split())
['ᠤᠷᠢᠳᠤ', 'ᠲᠠᠯ\u180eᠠ', 'ᠶᠢᠨ', 'ᠬᠠᠪᠲᠠᠭᠠᠢ', 'ᠬᠡᠪᠲᠡᠭᠡ']
What I want to get is:
['ᠤᠷᠢᠳᠤ', 'ᠲᠠᠯ᠎ᠠ ᠶᠢᠨ', 'ᠶᠢᠨ', 'ᠬᠠᠪᠲᠠᠭᠠᠢ', 'ᠬᠡᠪᠲᠡᠭᠡ']
What is the reason causing this, and how to solve it?
I don't think the problem is with the split function, but with how the list itself is displayed.
>>> s = ["ᠤᠷᠢᠳᠤ ᠲᠠᠯ᠎ᠠ ᠶᠢᠨ ᠬᠠᠪᠲᠠᠭᠠᠢ ᠬᠡᠪᠲᠡᠭᠡ"]
>>> print(s)
['ᠤᠷᠢᠳᠤ ᠲᠠᠯ\u180eᠠ ᠶᠢᠨ ᠬᠠᠪᠲᠠᠭᠠᠢ ᠬᠡᠪᠲᠡᠭᠡ']
You should still be able to use the list normally, because the escape only shows up when the whole list is printed; each element prints as the original text when it is used on its own.
>>> s = "ᠤᠷᠢᠳᠤ ᠲᠠᠯ᠎ᠠ ᠶᠢᠨ ᠬᠠᠪᠲᠠᠭᠠᠢ ᠬᠡᠪᠲᠡᠭᠡ"
>>> s = s.split()
>>> [print(e) for e in s]
ᠤᠷᠢᠳᠤ
ᠲᠠᠯ᠎ᠠ
ᠶᠢᠨ
ᠬᠠᠪᠲᠠᠭᠠᠢ
ᠬᠡᠪᠲᠡᠭᠡ
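To see where the \u180e comes from, here is a small sketch. It uses a made-up ASCII sample with U+180E embedded, so it does not depend on the Mongolian text in the question. Printing a list shows the repr() of each element, and repr() escapes non-printable characters; the stored string is not modified:

```python
text = "ab\u180ecd ef"  # hypothetical sample containing U+180E

parts = text.split(" ")           # split only on the plain space character
print(parts)                      # the list display uses repr() for each element
print(repr(parts[0]))             # the same escape appears in repr() of one element
print(parts[0] == "ab\u180ecd")   # True: the string itself is unchanged
```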
According to Wikipedia: https://en.wikipedia.org/wiki/Whitespace_character#Unicode
U+180E was a space character until Unicode 6.3.0, so if Python implements an earlier Unicode spec then I guess split() would break on it, as it does on all space characters. You could work around this by giving split an argument if you want to split only on certain characters (s.split(" ")); that would give you:
>>> s.split(" ")
['ᠤᠷᠢᠳᠤ', 'ᠲᠠᠯ\u180eᠠ\u202fᠶᠢᠨ', 'ᠬᠠᠪᠲᠠᠭᠠᠢ', 'ᠬᠡᠪᠲᠡᠭᠡ']

How to remove all characters before a specific character in Python?

I'd like to remove all characters before a designated character or set of characters (for example):
intro = "<>I'm Tom."
Now I'd like to remove the <> before I'm (or more specifically, I). Any suggestions?
Use re.sub. Just match all the chars up to I, then replace the matched chars with I.
re.sub(r'^.*?I', 'I', intro)
str.find returns the index of the first occurrence of a substring:
intro[intro.find('I'):]
Since index(char) gets you the first index of the character, you can simply do string[string.index(char):].
For example, in this case intro.index("I") == 2, and intro[2:] == "I'm Tom."
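One thing to watch with both approaches: find returns -1 when the character is missing (so the slice would keep only the last character), while index raises ValueError. A minimal sketch that guards against this; the helper name drop_before is hypothetical:

```python
def drop_before(text, marker):
    """Return text starting at the first occurrence of marker, or text unchanged."""
    idx = text.find(marker)
    if idx == -1:          # marker not present: leave the string as it is
        return text
    return text[idx:]

print(drop_before("<>I'm Tom.", "I"))      # I'm Tom.
print(drop_before("no marker here", "I"))  # no marker here
```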
If you know the character position of where to start deleting, you can use slice notation:
intro = intro[2:]
Instead of knowing where to start, if you know the characters to remove then you could use the lstrip() function:
intro = intro.lstrip("<>")
# named text rather than str, to avoid shadowing the built-in type
text = "<>I'm Tom."
temp = text.split("I", 1)
temp[0] = temp[0].replace("<>", "")
text = "I".join(temp)
I looped through the string and passed the index.
intro_list = []
intro = "<>I'm Tom."
for i in range(len(intro)):
    if intro[i] == '<' or intro[i] == '>':
        pass
    else:
        intro_list.append(intro[i])
intro = ''.join(intro_list)
print(intro)
import re
date_div = "Blah blah\nblah, Updated: Aug. 23, 2012 Blah blah Updated: Feb. 13, 2019"
up_to_word = ":"
rx_to_first = r'^.*?{}'.format(re.escape(up_to_word))
rx_to_last = r'^.*{}'.format(re.escape(up_to_word))
# (Dot.) In the default mode, this matches any character except a newline.
# If the DOTALL flag has been specified, this matches any character including a newline.
print("Remove all up to the first occurrence of the word including it:")
print(re.sub(rx_to_first, '', date_div, flags=re.DOTALL).strip())
print("Remove all up to the last occurrence of the word including it:")
print(re.sub(rx_to_last, '', date_div, flags=re.DOTALL).strip())
>>> intro = "<>I'm Tom."
#Just split the string at the special symbol
>>> intro.split("<>")
Output = ['', "I'm Tom."]
>>> new = intro.split("<>")
>>> new[1]
"I'm Tom."
This solution also works if the character is not in the string, but it needs an explicit if check.
if 'I' in intro:
    # maxsplit=1 keeps everything after the first 'I', even if 'I' occurs again later
    print('I' + intro.split('I', 1)[1])
else:
    print(intro)
You can use itertools.dropwhile to drop all the characters before reaching a character to stop at. Then, you can use ''.join() to turn the resulting iterable back into a string:
from itertools import dropwhile
stop = 'I'  # character (or set of characters) to stop at
''.join(dropwhile(lambda x: x not in stop, intro))
This outputs:
I'm Tom.
Based on @AvinashRaj's answer, you can use re.sub to substitute a substring with a string or a character, using a regex:
import re
output_str = re.sub(r'^.*?I', 'I', input_str)
import re
intro = "<>I'm Tom."
re.sub(r'<>I', 'I', intro)

Converting specific letters to uppercase or lowercase in python

So to return a copy of a string converted to lowercase or uppercase one obviously uses the lower() or upper().
But how does one go about making a copy of a string with specific letters converted to upper or lowercase?
For example, how would I convert 'test' into 'TesT'?
this is honestly baffling me so help is greatly appreciated
got it, thanks for the help Cyber and Matt!
If you're just looking to replace specific letters:
>>> s = "test"
>>> s.replace("t", "T")
'TesT'
There is one obvious solution, slice the string and upper the parts you want:
test = 'test'
test = test[0].upper() + test[1:-1] + test[-1].upper()
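If the positions are known rather than the letters, the slicing idea can be generalized. This is only a sketch, assuming zero-based, non-negative indexes; the helper name upper_at is made up for illustration:

```python
def upper_at(text, positions):
    """Return a copy of text with the characters at the given indexes uppercased."""
    wanted = set(positions)  # set membership keeps the lookup cheap
    return "".join(ch.upper() if i in wanted else ch for i, ch in enumerate(text))

print(upper_at("test", [0, 3]))  # TesT
```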
import re
input = 'test'
change_to_upper = 't'
input = re.sub(change_to_upper, change_to_upper.upper(), input)
This uses the regular expression engine to find anything that matches change_to_upper and replace it with the upper case version.
You could use the str.translate() method (Python 2 syntax shown here):
import string
# Letters that should be upper-cased
letters = "tzqryp"
table = string.maketrans(letters, letters.upper())
word = "test"
print word.translate(table)
As a general way to replace all of a letter with something else
>>> swaps = {'t':'T', 'd':'D'}
>>> ''.join(swaps.get(i,i) for i in 'dictionary')
'DicTionary'
I would use translate().
For python2:
>>> from string import maketrans
>>> "test".translate(maketrans("bfty", "BFTY"))
'TesT'
And for python3:
>>> "test".translate(str.maketrans("bfty", "BFTY"))
'TesT'
Python3 can do:
def myfunc(word):
    if len(word) > 3:
        return word[:3].capitalize() + word[3:].capitalize()
    else:
        return 'Word is too short!!'
The simplest solution:
>>> letters = "abcdefghijklmnop"
>>> trantab = str.maketrans(letters, letters.upper())
>>> print("test string".translate(trantab))
tEst strING
Simply
chars_to_lower = "MTW"
"".join([char.lower() if char in chars_to_lower else char for char in item])

Remove the first character of a string

I would like to remove the first character of a string.
For example, my string starts with a : and I want to remove that only. There are several occurrences of : in the string that shouldn't be removed.
I am writing my code in Python.
python 2.x
s = ":dfa:sif:e"
print s[1:]
python 3.x
s = ":dfa:sif:e"
print(s[1:])
both print
dfa:sif:e
Your problem seems unclear. You say you want to remove "a character from a certain position" then go on to say you want to remove a particular character.
If you only need to remove the first character you would do:
s = ":dfa:sif:e"
fixed = s[1:]
If you want to remove a character at a particular position, you would do:
s = ":dfa:sif:e"
fixed = s[0:pos]+s[pos+1:]
If you need to remove a particular character, say ':', the first time it is encountered in a string then you would do:
s = ":dfa:sif:e"
fixed = ''.join(s.split(':', 1))
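Another way to drop only the first occurrence is str.replace with its count argument. A minimal sketch; the wrapper name remove_first is hypothetical:

```python
def remove_first(text, char):
    """Remove only the first occurrence of char from text."""
    # The third argument limits the number of replacements to one.
    return text.replace(char, "", 1)

print(remove_first(":dfa:sif:e", ":"))  # dfa:sif:e
print(remove_first("a:b:c", ":"))       # ab:c
```

Note that this removes the first occurrence wherever it is, not only at the start of the string.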
Depending on the structure of the string, you can use lstrip:
s = s.lstrip(':')
But this would remove all colons at the beginning, i.e. if you have ::foo, the result would be foo. But this function is helpful if you also have strings that do not start with a colon and you don't want to remove the first character then.
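If you want to remove at most one leading colon and leave other strings alone, a startswith check is one option. This is a sketch; the helper name strip_one_leading is made up for illustration:

```python
def strip_one_leading(text, char=":"):
    """Remove a single leading char if present; otherwise return text unchanged."""
    if text.startswith(char):
        return text[len(char):]
    return text

for sample in (":dfa:sif:e", "::foo", "foo"):
    print(strip_one_leading(sample))
# dfa:sif:e
# :foo
# foo
```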
Just do this:
r = "hello"
r = r[1:]
print(r) # ello
Deleting a char:
def del_char(string, indexes):
    'deletes all the indexes from the string and returns the new one'
    return ''.join(char for idx, char in enumerate(string) if idx not in indexes)
It deletes all the chars that are in indexes; you can use it in your case with del_char(your_string, [0])

Capitalize a string

Does anyone know of a really simple way of capitalizing just the first letter of a string, regardless of the capitalization of the rest of the string?
For example:
asimpletest -> Asimpletest
aSimpleTest -> ASimpleTest
I would like to be able to do all string lengths as well.
>>> b = "my name"
>>> b.capitalize()
'My name'
>>> b.title()
'My Name'
@saua is right, and
s = s[:1].upper() + s[1:]
will work for any string.
What about your_string.title()?
e.g. "banana".title() -> Banana
s = s[0].upper() + s[1:]
This should work with every string, except for the empty string (when s="").
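A short comparison of the slicing approach with capitalize(), as a sketch. The helper name cap_first is made up for illustration. Slicing with s[:1] instead of s[0] avoids the IndexError on the empty string:

```python
def cap_first(s):
    """Uppercase the first character only; safe for the empty string."""
    return s[:1].upper() + s[1:]

for sample in ("asimpletest", "aSimpleTest", ""):
    print(repr(cap_first(sample)), repr(sample.capitalize()))
# 'Asimpletest' 'Asimpletest'
# 'ASimpleTest' 'Asimpletest'
# '' ''
```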
The following actually gives you a capitalized word, instead of just capitalizing the first letter:
cApItAlIzE -> Capitalize
def capitalize(s):
    return s[:1].upper() + s[1:].lower()
To capitalize the first word:
a="asimpletest"
print a.capitalize()
To make the whole string uppercase, use the following:
print a.upper()
This is the easy one, I think.
You can use the str.capitalize() function to do that
In [1]: x = "hello"
In [2]: x.capitalize()
Out[2]: 'Hello'
Hope it helps.
Docs can be found here for string functions https://docs.python.org/2.6/library/string.html#string-functions
The code below capitalizes the first letter of each word, with a space as the separator:
import string
s="gf12 23sadasd"
print( string.capwords(s, ' ') )
Gf12 23sadasd
s = s[:1].upper() + s[1:]
This is the easiest way to do it, in my opinion.
