Handling spaces differently when converting between lists and strings - python

I want to be able to convert some strings to lists of characters, and vice versa. However, all spaces within the strings should be represented by an empty string element in the corresponding list. For example:
typed_words = ['T', "y", "p", "e", "", "t", "h", "i", "s"]
target_text = "Type this"
I've tried using the join method to convert the list into a string, but since there is an empty element in the list, it creates a string with no spaces.
How do I allow for the special case of ''/' ' amidst the rest of the characters?

First of all, if one ignores the space-to-empty-string issue, converting a list of characters to a string and back again is as simple as:
# Converting the list to a string:
total_string = ''.join(list_of_characters)
# Converting the string to a list:
list_of_characters = list(total_string)
In your case, you need the extra step of converting between spaces and empty strings, which you can accomplish with a list comprehension.
For instance, here's a list comprehension (split onto multiple lines for extra clarity) that reproduces a list faithfully, except with empty string elements replaced with spaces:
[
    ' ' if char == '' else char
    for char in list_of_characters
]
So your final conversions would look like this:
# Converting the list to a string:
total_string = ''.join([' ' if char == '' else char for char in list_of_characters])
# Converting the string to a list:
list_of_characters = ['' if char == ' ' else char for char in total_string]
Note that one can iterate over a string just like iterating over a list, which is why that final comprehension can simply iterate over total_string rather than having to do list(total_string).
P.S. An empty string ('') evaluates to False in boolean contexts, so you could make use of the or operator's short-circuiting behavior to use this shorter (though arguably less immediately legible) version:
# Converting the list to a string:
total_string = ''.join([char or ' ' for char in list_of_characters])
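As a quick check, the two conversions can be wrapped in functions and round-tripped. This is only a minimal sketch: the function names chars_to_text and text_to_chars are made up for illustration and do not come from the question.

```python
def chars_to_text(chars):
    """Join a list of characters, treating '' as a space."""
    return ''.join(char or ' ' for char in chars)


def text_to_chars(text):
    """Split a string into characters, storing each space as ''."""
    return ['' if char == ' ' else char for char in text]


typed_words = ['T', 'y', 'p', 'e', '', 't', 'h', 'i', 's']
target_text = chars_to_text(typed_words)
print(target_text)  # Type this
print(text_to_chars(target_text) == typed_words)  # True
```

Keeping both directions next to each other makes it easy to confirm that one undoes the other.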

To make a list, you can iterate over every character and append it to a list.
def strToList(s):
    l = []
    for c in s:
        l.append(c)
    return l
For the inverse operation, Python allows using the += operator on strings:
def listToStr(l):
    s = ""
    for c in l:
        s += str(c)
    return s

We can do something as simple as the following for the general case:
from typing import List  # part of the standard library, no need to install
def list_to_string(lst: List[str]) -> str:
    return "".join(lst)
def string_to_list(str_: str) -> List[str]:
    return list(str_)
If we want to cater to your requirement that an empty string in the list is replaced with a space, we can do the following:
def list_to_string(lst: List[str]) -> str:
    '''
    Loop over the elements of the list: keep each non-empty
    element as it is, and replace each empty string with a space.
    @param lst: (List) expects a list of strings
    @returns String object of the characters joined together
    '''
    return "".join([c if c else " " for c in lst])
Why am I following a one-liner approach? In my opinion, a comprehension like this is more readable, compact, and neat, i.e. Pythonic.
We could also implement the solution with ordinary for loops, which does the same thing, but I prefer one-liners. We could use string concatenation too, but strings are immutable, so every append to the main string creates another string, once per element of the list. That is not optimal because of the extra intermediate objects and memory consumption (which in your case might not be much, but is worth keeping an eye on).

Related

How to replace multiple matches in Regex

I'm trying to replace '=' with '==' in the following string:
log="[x] = '1' and [y] <> '7' or [z]='51'".
Unfortunately, only the second '=' is getting replaced. Why is the first one not being replaced and how do I replace the first one as well?
def subs_equal_sign(logic):
    y = re.compile(r'\]\s?\=\s?')
    iterator = y.finditer(logic)
    for match in iterator:
        j = str(match.group())
    return logic.replace(j, ']==')
The output should be:
log="[x] == '1' and [y] <> '7' or [z]=='51'".
This is what I get instead:
log="[x] = '1' and [y] <> '7' or [z]=='51'".
for match in iterator:
    j = str(match.group())
return logic.replace(j, ']==')
This part goes through the matches and doesn't do any replacing.
Only when you leave the loop do you do the replacing, which is why it changes only the last one. ;)
Also, you do the replacing without using the regex: plain str.replace finds every occurrence of the substring and replaces it. So if your first = didn't have a space before it, it would get changed anyway!
Looking at your regex, at most one space is possible between ] and =, so why not do the replacing for those two cases instead of using regexes? ;)
def subs_equal_sign(logic):
    return logic.replace(']=', ']==').replace('] =', ']==')
Maybe the replace() function is what you are looking for:
log="[x] = '1' and [y] <> '7' or [z]='51'"
log = log.replace("=", "==")
Change your function to
import re
def subs_equal_sign(logic):
    y = re.compile(r'\]\s?\=\s?')
    return y.sub("]==", logic)
and the output will now be
>>> subs_equal_sign('''log="[x] = '1' and [y] <> '7' or [z]='51'".''')
'log="[x]==\'1\' and [y] <> \'7\' or [z]==\'51\'".'
as expected.
@h4z3 correctly pointed out that your key problem is iterating through the matches without doing anything to them. You can make it work by simply using re.sub() to replace all occurrences at once.
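If the surrounding whitespace should be preserved, as in the expected output in the question, one possible variation is to match only a lone = and leave everything around it untouched. This is a sketch rather than part of the answers above: the function name double_equals and the lookaround pattern are assumptions chosen for illustration, and the pattern doubles every lone =, not only those that follow a ].

```python
import re

# A lone '=' is one that is not preceded by '<', '>', '!' or '=',
# and not followed by another '='.
LONE_EQUALS = re.compile(r'(?<![<>!=])=(?!=)')


def double_equals(logic):
    """Replace each lone '=' with '==', keeping all whitespace as it is."""
    return LONE_EQUALS.sub('==', logic)


log = "[x] = '1' and [y] <> '7' or [z]='51'"
print(double_equals(log))  # [x] == '1' and [y] <> '7' or [z]=='51'
```

Because of the lookarounds, operators such as <=, >= and == pass through unchanged.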
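A quick way to deal with this is to remove the whitespace first, so that every match is the same substring (note that this also strips the spaces around and/or):
def subs_equal_sign(logic):
    # Strip every space so that each match has the same form, ']='
    logic = logic.replace(' ', '')
    y = re.compile(r'\]\s?\=\s?')
    iterator = y.finditer(logic)
    for match in iterator:
        j = str(match.group())
    return logic.replace(j, ']==')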
Does the string represent the branching logic for a REDCap variable? If so, I wrote a function a while back that should convert REDCap's SQL-like syntax to a pythonic form. Here it is:
def make_pythonic(logic):
    """
    Takes the branching logic string of a field name
    and converts the syntax to that of Python.
    """
    # make list of all checkbox vars in branching_logic string
    # NOTE: items in list have the same serialization (ordering)
    # as in the string.
    checkbox_snoop = re.findall(r'[a-z0-9_]*\([0-9]*\)', logic)
    # if there are entries in checkbox_snoop
    if len(checkbox_snoop) > 0:
        # serially replace "[mycheckboxvar(888)]" syntax of each
        # checkbox var in the logic string with the appropriate
        # "record['mycheckboxvar___888']" syntax
        for item in checkbox_snoop:
            item = re.sub(r'\)', '', item)
            item = re.sub(r'\(', '___', item)
            # count=1 so each pass rewrites only the next remaining match
            logic = re.sub(r'[a-z0-9_]*\([0-9]*\)', item, logic, count=1)
    # mask and substitute
    logic = re.sub('<=', 'Z11Z', logic)
    logic = re.sub('>=', 'X11X', logic)
    logic = re.sub('=', '==', logic)
    logic = re.sub('Z11Z', '<=', logic)
    logic = re.sub('X11X', '>=', logic)
    logic = re.sub('<>', '!=', logic)
    logic = re.sub(r'\[', "record['", logic)
    logic = re.sub(r'\]', "']", logic)
    # return the string
    return logic
This replaces every occurrence of the given substring with the new one throughout the string.
log = log.replace("=", "==")  # Replaces the given substring with the new string
print(log)  # Display

How does the loop help iterate in this code

The problem at hand is that given a string S, we can transform every letter individually to be lowercase or uppercase to create another string.
Desired result is a list of all possible strings we could create.
Eg:
Input:
S = "a1b2"
Output:
["a1b2", "a1B2", "A1b2", "A1B2"]
I see that the code below generates the correct result, but I'm a beginner in Python. Can you help me understand how the loops on lines 5 and 7, which assign a value to res, work?
def letterCasePermutation(self, S):
    res = ['']
    for ch in S:
        if ch.isalpha():
            res = [i+j for i in res for j in [ch.upper(), ch.lower()]]
        else:
            res = [i+ch for i in res]
    return res
res is the list of all possible strings built up to this point. Each iteration of the loop handles the next character.
If the character is a non-letter (line 7), the comprehension simply adds that character to each string in the list.
If the character is a letter, then the new list contains two strings for each one in the input: one with the upper-case version added, one for the lower-case version.
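To make the two comprehensions more concrete, they can be unrolled into ordinary nested loops. The sketch below is an equivalent rewrite for illustration only; the name letter_case_permutation and the temporary list new_res are not from the original code.

```python
def letter_case_permutation(S):
    res = ['']
    for ch in S:
        new_res = []
        if ch.isalpha():
            # Same as: [i+j for i in res for j in [ch.upper(), ch.lower()]]
            for i in res:                            # first "for" in the comprehension
                for j in [ch.upper(), ch.lower()]:   # second "for" in the comprehension
                    new_res.append(i + j)
        else:
            # Same as: [i+ch for i in res]
            for i in res:
                new_res.append(i + ch)
        res = new_res
    return res


print(letter_case_permutation("a1b2"))  # ['A1B2', 'A1b2', 'a1B2', 'a1b2']
```

The first for in a comprehension is the outer loop and the second is the inner loop, which is why each existing string gets its upper-case variant before its lower-case one.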
If you're still confused, then I strongly recommend that you make an attempt to understand this with standard debugging techniques. Insert a couple of useful print statements to display the values that confuse you.
def letterCasePermutation(self, S):
    res = ['']
    for ch in S:
        print("char =", ch)
        if ch.isalpha():
            res = [i+j for i in res for j in [ch.upper(), ch.lower()]]
        else:
            res = [i+ch for i in res]
        print(res)
    return res
letterCasePermutation(None, "a1b2")
Output:
char = a
['A', 'a']
char = 1
['A1', 'a1']
char = b
['A1B', 'A1b', 'a1B', 'a1b']
char = 2
['A1B2', 'A1b2', 'a1B2', 'a1b2']
The best way to analyze this code is to include the line:
print(res)
at the end of the outer for loop, as the first answer suggests.
Then run it with the string '123' and the string 'abc', which isolates the two conditionals. This gives the following output:
['1']
['12']
['123']
and
['A', 'a']
['AB', 'Ab', 'aB', 'ab']
['ABC', 'ABc', 'AbC', 'Abc', 'aBC', 'aBc', 'abC', 'abc']
Here we can see the loop is just taking the previously generated list as its input, and if the next string char is not a letter, is simply tagging the number/symbol onto the end of each string in the list, via string concatenation. If the next char in the initial input string is a letter, however, then the list is doubled in length by creating two copies for each item in the list, while simultaneously appending an upper version of the new char to the first copy, and a lower version of the new char to the second copy.
For an interesting result, see how the code fails if this change is made at line 2:
res = []
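The reason is that a comprehension over an empty list has nothing to iterate over, so no string is ever produced. A tiny sketch of that effect follows; the helper name extend_all is hypothetical and only wraps the non-letter comprehension from the code above.

```python
def extend_all(res, ch):
    # Same comprehension as the non-letter branch: [i+ch for i in res]
    return [i + ch for i in res]


print(extend_all([''], '1'))  # ['1']
print(extend_all([], '1'))    # []
```

Starting from [''] gives the loop one empty string to extend, whereas starting from [] leaves the result empty no matter what the input is.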

Alphabet position in python

Newbie here...Trying to write a function that takes a string and replaces all the characters with their respective dictionary values.
Here is what I have:
def alphabet_position(text):
    dict = {'a':'1','b':'2','c':'3','d':'4','e':'5','f':'6','g':'7','h':'8','i':'9','j':'10','k':'11','l':'12','m':'13','n':'14','o':'15','p':'16','q':'17','r':'18','s':'19','t':'20','u':'21','v':'22','w':'23','x':'24','y':'25','z':'26'}
    text = text.lower()
    for i in text:
        if i in dict:
            new_text = text.replace(i, dict[i])
    print (new_text)
But when I run:
alphabet_position("The sunset sets at twelve o' clock.")
I get:
the sunset sets at twelve o' cloc11.
meaning it only changes the last character in the string. Any ideas? Any input is greatly appreciated.
Following your logic, you need to create a new_text string and then iteratively replace its letters. With your code, each replacement starts again from the original string, so only the last one survives:
def alphabet_position(text):
    dict = {'a':'1','b':'2','c':'3','d':'4','e':'5','f':'6','g':'7','h':'8','i':'9','j':'10','k':'11','l':'12','m':'13','n':'14','o':'15','p':'16','q':'17','r':'18','s':'19','t':'20','u':'21','v':'22','w':'23','x':'24','y':'25','z':'26'}
    new_text = text.lower()
    for i in new_text:
        if i in dict:
            new_text = new_text.replace(i, dict[i])
    print (new_text)
And as suggested by Kevin, you can optimize a bit using set (adding his comment here since he deleted it: for i in set(new_text):). Note that this might be beneficial only for large inputs, though.
As your question is generally asking about "Alphabet position in python", I thought I could complement the already accepted answer with a different approach. You can take advantage of Python's string lib, char to int conversion and list comprehension to do the following:
import string
def alphabet_position(text):
    alphabet = string.ascii_lowercase
    return ''.join([str(ord(char)-96) if char in alphabet else char for char in text.lower()])
Your approach is not very efficient. You are recreating the string for every character.
There are 5 e characters in your string. This means replace is called 5 times, even though it only actually needs to do anything the first time.
There is another approach that might be more efficient. str.translate would also work here, since a translation table can map a character to a string of any length, but the explicit version is easy to follow.
We just iterate over the input and produce a new string character by character.
def alphabet_position2(text):
    d = {L: str(i) for i, L in enumerate('abcdefghijklmnopqrstuvwxyz', 1)}
    result = ''
    for t in text.lower():
        result += d.get(t, t)
    return result
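For what it's worth, a table built with str.maketrans from a dict may map a single character to a string of any length, so str.translate can do the same job. The following is a minimal sketch under that assumption; the names _TABLE and alphabet_position_translate are invented for illustration.

```python
import string

# Build the table once: 'a' -> '1', 'b' -> '2', ..., 'z' -> '26'.
# Dict keys must be single characters; values may be longer strings.
_TABLE = str.maketrans(
    {letter: str(i) for i, letter in enumerate(string.ascii_lowercase, 1)}
)


def alphabet_position_translate(text):
    """Replace each letter with its position; leave other characters alone."""
    return text.lower().translate(_TABLE)


print(alphabet_position_translate("abc z!"))  # 123 26!
```

Characters that are missing from the table, such as spaces and punctuation, are passed through unchanged.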
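This is a pretty simple approach using a dict comprehension and a generator expression.
enumerate yields (1, 'a') pairs, so the comprehension swaps each one to build a letter-to-number mapping ('a': 1 rather than 1: 'a'). Note that this version drops characters that are not letters and separates the numbers with spaces.
import string
def alphabet_position(text):
    alphabet = {v: k for k, v in enumerate(string.ascii_lowercase, start=1)}
    return " ".join(str(alphabet[char]) for char in text.lower() if char in alphabet)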

What do these single quotes do at the beginning of line 2?

I found the following code on some random website explaining concatenation:
data_numb = input("Input Data, then press enter: ")
numb = ''.join(list(filter(str.isdigit, data_numb)))
print('(' + numb[:3] + ') ' + numb[3:6] + '-' + numb[6:])
and I was wondering what the single quotes do in the
numb = ''.join(
Any help is appreciated!
join(iterable) is a method from the str class.
Return a string which is the concatenation of the strings in iterable.
A TypeError will be raised if there are any non-string values in
iterable, including bytes objects. The separator between elements is
the string providing this method.
''.join(("Hello", "World")) will return 'HelloWorld'.
';'.join(("Hello", "World", "how", "are", "you")) will return 'Hello;World;how;are;you'.
join is very helpful if you need to add a delimiter between each element from a list (or any iterable) of strings.
It may not look like much, but without join this kind of operation is often ugly to implement because of edge cases:
For a list or tuple of strings:
def join(list_strings, delimiter):
    str_result = ''
    for e in list_strings[:-1]:
        str_result += e + delimiter
    if list_strings:
        str_result += list_strings[-1]
    return str_result
For any iterable:
def join(iterable, delimiter):
    iterator = iter(iterable)
    str_result = ''
    try:
        str_result += next(iterator)
        while True:
            str_result += delimiter + next(iterator)
    except StopIteration:
        return str_result
Because join works on any iterable, you don't need to create a list from the filter result.
numb = ''.join(filter(str.isdigit, data_numb))
works as well
The join method concatenates the elements of an iterable, placing the string it is called on between them. In this example that string is the empty string, written as two single quotes, '' (don't confuse the two single quotes with one double quote).
So, when the separator is an empty string, the result is simply the elements of the iterable concatenated with nothing in between.
What is its use:
It can be used to concatenate a list of strings. For example:
a = ['foo', 'bar']
b = ''.join(a)
print(b) # foobar
It can be used to concatenate strings. (Since a string is an iterable, as well)
a = "foobar"
b = ''.join(a)
print(b) # foobar
You can think of more use cases, but this is the gist of it. You can also refer to the documentation for str.join.
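To tie the points above together, here is a short sketch showing how the separator affects the result and what happens when the iterable contains non-string values. The variable names are arbitrary and chosen only for this example.

```python
words = ["Hello", "World"]
print(''.join(words))    # HelloWorld
print(', '.join(words))  # Hello, World

numbers = [1, 2, 3]
try:
    '-'.join(numbers)    # the elements are ints, not strings
except TypeError as error:
    print("TypeError:", error)

# Converting each element to a string first avoids the error.
print('-'.join(map(str, numbers)))  # 1-2-3
```

The string before .join is only ever the separator; the pieces being glued together always come from the argument.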

Python: Iterate over string with while loop

I'm trying to delete all the characters at the end of a string following the last occurrence of a '+'. So for instance, if my string is 'Mother+why+is+the+river+laughing' I want to reduce this to 'Mother+why+is+the+river'. I don't know what the string will be in advance though.
I thought of iterating backwards over the string. Something like:
while letter in my_string[::-1] != '+':
    my_string = my_string[:-1]
This won't work because letter is not predefined.
Any thoughts?
Just use str.rsplit():
my_string = my_string.rsplit('+', 1)[0]
.rsplit() splits from the end of a string; with a limit of 1 it'll only split on the very last + in the string and [0] gives you everything before that last +.
Demo:
>>> 'Mother+why+is+the+river+laughing'.rsplit('+', 1)[0]
'Mother+why+is+the+river'
If there is no + in the string, the original string is returned:
>>> 'Mother'.rsplit('+', 1)[0]
'Mother'
As for your loop; you are testing against a reversed string and the condition returns True until the last + has been removed; you'd have to test in the loop for what character you just removed:
while True:
    last = my_string[-1]
    my_string = my_string[:-1]
    if last == '+':
        break
but this is rather inefficient compared to using str.rsplit(); creating a new string for each character removed is costly.
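For comparison, the same result can be obtained with str.rfind and a slice, which makes the case of a string without any + explicit. This is a minimal sketch; the function name strip_after_last_plus is made up for illustration.

```python
def strip_after_last_plus(text):
    """Drop the last '+' and everything after it, if there is one."""
    index = text.rfind('+')  # -1 when there is no '+'
    if index == -1:
        return text
    return text[:index]


print(strip_after_last_plus('Mother+why+is+the+river+laughing'))  # Mother+why+is+the+river
print(strip_after_last_plus('Mother'))  # Mother
```

Like rsplit, this builds the result with a single slice instead of one new string per removed character.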
