I'd like to know if one can write the following statement in one line:
new = ''
for char in text:
    if char in blacklist:
        new += ' '
    else:
        new += char
I tried this, but I get a syntax error:
new = ''.join(c for c in text if c not in blacklist else ' ')
I know it's not better or prettier; I just want to know if it's possible.
Iterating over it seems like an overly complicated way to do it. Why not use a regex?
import re
blacklist = re.compile(r'[xyz]') # Blacklist the characters 'x', 'y', 'z'
new = re.sub(blacklist, ' ', text)
You're using your in-line conditional in the wrong place (it'd work if you didn't have the else ' ' there, as then it'd just be a filter on the iterable). As it is, you'll want to do it this way:
new = ''.join(c if c not in blacklist else ' ' for c in text)
You could also do it like this if you wanted:
new = ''.join(' ' if c in blacklist else c for c in text)
You almost had it:
''.join(c if c not in blacklist else ' ' for c in text)
The X if Y else Z is an expression in itself, so you can't split it up by putting the for c in text part in the middle.
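To make the distinction concrete, here is a small sketch. The sample values for text and blacklist are made up for illustration and are not from the question. A trailing if acts as a filter, while X if Y else Z placed before the for maps every item to a value:

```python
# Hypothetical sample data, not from the question.
text = 'hello world'
blacklist = 'lo'

# Trailing `if` is a filter: blacklisted characters are dropped entirely.
filtered = ''.join(c for c in text if c not in blacklist)

# Conditional expression before `for`: every character is kept or replaced,
# so the result has the same length as the input.
replaced = ''.join(' ' if c in blacklist else c for c in text)

print(repr(filtered))
print(repr(replaced))
```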
Use the translate method of str. Build a string of your whitelist characters, with ' ' in place of the blacklist ones:
>>> table = ''.join(c if c not in 'axy' else ' ' for c in map(chr,range(256)))
Then call translate with this table:
>>> 'xyzzy'.translate(table)
'  zz '
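On Python 3, the table can also be built with str.maketrans instead of enumerating 256 characters by hand. This is only a sketch assuming Python 3, and the blacklist value simply mirrors the 'axy' example above:

```python
# Blacklist mirroring the 'axy' example above.
blacklist = 'axy'

# str.maketrans(x, y) maps each character in x to the character at the
# same position in y, so both strings must have the same length.
table = str.maketrans(blacklist, ' ' * len(blacklist))

result = 'xyzzy'.translate(table)
print(repr(result))
```

Characters that are not in the table are left unchanged, so this also works for text outside the first 256 code points.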
Related
When I execute the following code, I expect every ' a ' to be replaced by ' b ', yet only non-overlapping matches are replaced.
>>> " a a a a a a a a ".replace(' a ', ' b ')
' b a b a b a b a '
So I use the following:
>>> " a a a a a a a a ".replace(' a ', ' b ').replace(' a ', ' b ')
' b b b b b b b b '
Is this a bug or a feature of replace?
According to the docs, ALL OCCURRENCES are replaced:
str.replace(old, new[, count])
Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
Most likely your best bet is using regex. Lookbehind/lookahead expressions let you match part of a string surrounded by a specific expression.
import re
s = " a a a a a a a a "
pattern = r'(?<= )a(?= )'
print(re.sub(pattern, "b", s))
Spaces don't actually become part of the match, so they don't get replaced.
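A short sketch of why the two approaches differ, using the same string as in the question. str.replace consumes the characters it matches, so the space that ends one ' a ' cannot also start the next one; the lookarounds only check for the spaces without consuming them:

```python
import re

s = " a a a a a a a a "

# str.replace scans left to right and never reuses a matched character,
# so every second ' a ' is skipped.
once = s.replace(' a ', ' b ')

# Each lookaround match covers only the single character 'a'.
spans = [m.span() for m in re.finditer(r'(?<= )a(?= )', s)]
fixed = re.sub(r'(?<= )a(?= )', 'b', s)

print(repr(once))
print(len(spans))
print(repr(fixed))
```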
Why not replace only the thing you actually want to change, that is 'a' rather than ' a ', like this:
" a a a a a a a a ".replace('a', 'b')
which gives the output
' b b b b b b b b '
I have the following list of sentences:
list_of_sentense = ['Hi how are you?', 'I am good', 'Great!', 'I am doing good,', 'Good.']
I want to convert it into:
['Hi how are you?', 'I am good.', 'Great!', 'I am doing good.', 'Good.']
So I need to insert a period only if a sentence doesn't end with '?', '!' or '.'. Also if a sentence ends with a comma I need to change it into a period.
My code is here:
list_of_sentense_fixed = []
for i in range(len(list_of_sentense)):
    b = list_of_sentense[i]
    b = b + '.' if (not b.endswith('.')) or (not b.endswith('!')) or (not b.endswith('?')) else b
    list_of_sentense_fixed.append(b)
But it doesn't work properly.
Just define a function to fix one sentence, then use a list comprehension to construct a new list from the old one:
def fix_sentence(str):
    if str == "":                       # Don't change empty strings.
        return str
    if str[-1] in ["?", ".", "!"]:      # Don't change if already okay.
        return str
    if str[-1] == ",":                  # Change trailing ',' to '.'.
        return str[:-1] + "."
    return str + "."                    # Otherwise, add '.'.
orig_sentences = ['Hi how are you?', 'I am good', 'Great!', 'I am doing good,', 'Good.']
fixed_sentences = [fix_sentence(item) for item in orig_sentences]
print(fixed_sentences)
This outputs, as requested:
['Hi how are you?', 'I am good.', 'Great!', 'I am doing good.', 'Good.']
With a separate function, you can just improve fix_sentence() if/when new rules need to be added.
For example, it already handles empty strings, so you don't get an exception when trying to extract the last character from them, as per the first two lines of the function.
According to De Morgan's laws, you should change to:
b = b + '.' if (not b.endswith('.')) and (not b.endswith('!')) and (not b.endswith('?')) else b
You can simplify to:
b = b + '.' if b and b[-1] not in ('.', '!', '?') else b
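Note that the one-liner above appends a period after a trailing comma ('I am doing good,' becomes 'I am doing good,.') instead of replacing it. A possible variant, sketched here with str.endswith taking a tuple of suffixes, covers that case too. It assumes that any run of trailing commas should be dropped and that empty strings should be left alone:

```python
list_of_sentense = ['Hi how are you?', 'I am good', 'Great!', 'I am doing good,', 'Good.']

# str.endswith accepts a tuple, so one call checks all three terminators.
# Assumption: trailing commas are stripped before the period is added,
# and empty strings are passed through unchanged.
list_of_sentense_fixed = [
    b if not b or b.endswith(('.', '!', '?')) else b.rstrip(',') + '.'
    for b in list_of_sentense
]

print(list_of_sentense_fixed)
```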
I'm trying to make a directory that is okay to appear in a URL. I want to ensure that it doesn't contain any special characters and replace any spaces with hyphens.
from os.path import join as osjoin
def image_dir(self, filename):
    categorydir = ''.join(e for e in str(self.title.lower()) if e.isalnum())
    return "category/" + osjoin(categorydir, filename)
It removes special characters; however, I'd like to use .replace(" ", "-") to swap out spaces with hyphens.
The best way is probably to use the slugify function, which takes any string as input and returns a URL-compatible one. Yours is not a URL, but it will do the trick, e.g.:
>>> from django.utils.text import slugify
>>> slugify(' Joel is a slug ')
'joel-is-a-slug'
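If Django is not available, a rough standard-library approximation might look like the sketch below. simple_slug is a hypothetical name; it is not Django's implementation and, as an assumption, it only keeps ASCII letters and digits:

```python
import re

def simple_slug(title):
    """Hypothetical ASCII-only stand-in for a slug function."""
    # Lowercase, then drop everything except ASCII letters, digits,
    # whitespace and hyphens.
    cleaned = re.sub(r'[^a-z0-9\s-]', '', title.lower())
    # Collapse runs of whitespace/hyphens into one hyphen and trim the ends.
    return re.sub(r'[\s-]+', '-', cleaned).strip('-')

print(simple_slug(' Joel is a slug '))
print(simple_slug('Cats & Dogs: 2024!'))
```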
Why don't you use the quote function?
import urllib.parse
urllib.parse.quote(filename.replace(" ", "-"), safe="")
You can create these functions and call remove_special_chars(s) to do it:
import re
import string
def __is_ascii__(c):
    return ord(c) < 128
def remove_special_chars(s):
    output = ''
    for c in s:
        if (c.isalpha() and __is_ascii__(c)) or c == ' ':
            output = output + c            # keep ASCII letters and spaces
        else:
            if c in string.punctuation:
                output = output + ' '      # punctuation becomes a space
    output = re.sub(' +', ' ', output)     # collapse runs of spaces
    output = output.replace(' ', '-')
    return output
It removes every character that is not an ASCII letter or a space, and turns each element of string.punctuation into a space before the spaces are collapsed and replaced with hyphens.
EDIT:
This function substitutes each element in string.punctuation with a '-'. If you want, you can replace ' ' with '' in the else branch to merge the two parts of the string before and after the punctuation element.
So basically I have this string __int64 __fastcall(IOService *__hidden this);, and I need to insert a word in between __fastcall (this could be anything) and (IOService... such as __int64 __fastcall LmaoThisWorks(IOService *__hidden this);.
I've thought about splitting the string but this seems a bit overkill. I'm hoping there's a simpler and shorter way of doing this:
type_declaration_fun = GetType(fun_addr)  # Sample: '__int64 __fastcall(IOService *__hidden this)'
if type_declaration_fun:
    print(type_declaration_fun)
    type_declaration_fun = type_declaration_fun.split(' ')
    first_bit = ''
    others = ''
    funky_list = type_declaration_fun[1].split('(')
    for x in range(0, (len(funky_list))):
        if x == 0:
            first_bit = funky_list[0]
        else:
            others = others + funky_list[x]
    type_declaration_fun = type_declaration_fun[0] + ' ' + funky_list[0] + ' ' + final_addr_name + others
    type_declaration_fun = type_declaration_fun + ";"
    print(type_declaration_fun)
The code is not only crap, but it doesn't quite work. Here's a sample output:
void *__fastcall(void *objToFree)
void *__fastcall IOFree_stub_IONetworkingFamilyvoid;
How could I make this work and cleaner?
Notice that there could be nested parentheses and other weird stuff, so you need to make sure that the name is added just before the first parenthesis.
You can use the method replace():
s = 'ABCDEF'
ins = '$'
before = 'DE'
new_s = s.replace(before, ins + before, 1)
print(new_s)
# ABC$DEF
Once you find the index of the character you need to insert before, you can use slicing to create your new string.
string = 'abcdefg'
string_to_insert = '123'
insert_before_char = 'c'
for i in range(len(string)):
    if string[i] == insert_before_char:
        string = string[:i] + string_to_insert + string[i:]
        break
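The same idea can be written without an explicit loop by using str.partition, which splits at the first occurrence only, so later or nested parentheses are left alone. insert_before_first is a hypothetical helper name, and the leading space in the inserted name is an assumption about the desired output:

```python
def insert_before_first(text, marker, insertion):
    """Hypothetical helper: insert `insertion` before the first `marker`."""
    head, sep, tail = text.partition(marker)
    if not sep:
        return text  # marker not found; leave the string unchanged
    return head + insertion + sep + tail

decl = '__int64 __fastcall(IOService *__hidden this);'
print(insert_before_first(decl, '(', ' LmaoThisWorks'))
```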
What about this:
s = "__int64__fastcall(IOService *__hidden this);"
t = s.split("__fastcall",1)[0]+"anystring"+s.split("__fastcall",1)[1]
I get:
__int64__fastcallanystring(IOService *__hidden this);
I hope this is what you want. If not, please comment.
Use regex.
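In [1]: import re
pattern = r'(?=\()'
string = '__int64 __fastcall(IOService *__hidden this);'
re.sub(pattern, 'pizza', string, count=1)
Out[1]: '__int64 __fastcallpizza(IOService *__hidden this);'
The pattern is a positive lookahead that matches the empty position just before a (. Passing count=1 limits the substitution to the first occurrence; without it, the text would be inserted before every (.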
x='high speed'
z='new text'
y = x.index('speed')
x = x[:y] + z + x[y:]
print(x)
>>> high new textspeed
This is a quick example. Be aware that the character at index y is included after the inserted string.
Also be aware that this rebinds x, so the original string is replaced; assign the result to a new variable if you need to keep the original.
How do you fix the spaces in the strings of the list m?
m = ['m, a \n', 'l, n \n', 'c, l\n']
for i in m:
    if (' ') in i:
        i.strip(' ')
I got:
'm, a \n'
'l, n \n'
'c, l\n'
and I want it to return:
['m, a\n', 'l, n\n', 'c, l\n']
The strip() method removes the given characters from both ends of the string and stops at the first character that is not in that set. In your case, strip(' ') starts at the end of your string, encounters a '\n' character, and stops there.
It seems a little unclear what you are trying to do, but I will assume that you are looking to clear out any white space between the last non-whitespace character of your string and the newline. Correct me if I'm wrong.
There are many ways to do this, and this may not be the best, but here is what I came up with:
m = ['This, is a string. \n', 'another string! \n', 'final example\n ']
m = list(map(lambda x: x.rstrip() + '\n' if x[-1] == '\n' else x.rstrip(' '), m))
print(m)
['This, is a string.\n', 'another string!\n', 'final example\n']
Here I use the built-in map function to iterate over each list element, remove all whitespace from the end of the string (rstrip() instead of strip(), which strips both the start and the end), and add a newline back if there was one in the original string.
Your code wouldn't be useful in a script; you are just seeing the REPL displaying the result of the expression i.strip(' '). In a script, that value would just be ignored.
To create a list, use a list comprehension:
result = [i.strip(' ') for i in m if ' ' in i]
Note, however, strip only removes the requested character from either end; in your data, the space precedes the newline. You'll need to do something like removing the newline as well, then put it back:
result = ["%s\n" % i.strip() for i in m if ' ' in i]
You can use regex:
import re
m = ['m, a \n', 'l, n \n', 'c, l\n']
final_m = [re.sub(r'(?<=[a-zA-Z])\s+(?=\n)', '', i) for i in m]
Output:
['m, a\n', 'l, n\n', 'c, l\n']
Quick and dirty:
m = [x.replace(' \n', '\n') for x in m]
This works if you know that only one space goes before the '\n'.
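If the number of spaces before the newline can vary, a small variation handles any run of them. This is only a sketch; the second element is a made-up example with three spaces, and treating tabs like spaces is an assumption:

```python
import re

# Second element is a made-up example with several trailing spaces.
m = ['m, a \n', 'l, n   \n', 'c, l\n']

# Remove any run of spaces or tabs that sits directly before a newline.
cleaned = [re.sub(r'[ \t]+(?=\n)', '', x) for x in m]

print(cleaned)
```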