I'm looking to replace a list of characters (including escaped characters) with another.
for example:
l=['\n','<br>','while:','<while>','for:','<for>']
s='line1\nline2'
s.replace(l[0],l[1])
but passing the list indices through the method produces no effect.
I've also tried using
s=l[1].join(s.split(l[0]))
How can I replace a list of characters with another without expressing the pairs each time in the function?
As I said in the comments, the problem with your code is that you assumed that the replace works in-place. It does not, you have to assign the value it returns to the variable.
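A minimal sketch of that point, reusing the string from the question: str.replace returns a new string and leaves the original alone, so the result only "sticks" if it is assigned back. The name unchanged is introduced here purely for illustration.

```python
s = 'line1\nline2'

# Calling replace and discarding the return value changes nothing.
s.replace('\n', '<br>')
unchanged = s  # still 'line1\nline2'

# Assigning the returned string back to the name applies the change.
s = s.replace('\n', '<br>')
print(s)  # line1<br>line2
```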
But, there is a better way of doing it that involves dictionaries. Take a look:
d = {'\n': '<br>', 'while:': '<while>', 'for:': '<for>'}
s = 'line1\nline2\nwhile: nothing and for: something\n\nnothing special'
for k, v in d.items():
    s = s.replace(k, v)
print(s) # line1<br>line2<br><while> nothing and <for> something<br><br>nothing special
The advantage of using dictionaries in this case is that you make it very straightforward what you want to replace and what with. Playing with the indexes is not something you want to do if you can avoid it.
Finally, if you are wondering how to convert your list to a dict you can use the following:
d = {k: v for k, v in zip(l[::2], l[1::2])}
which does not break even if your list has an odd number of elements: zip stops at the shorter slice, so a trailing unpaired element is simply ignored.
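To see what happens with an odd-length list, here is a small sketch. The shortened list is made up for illustration; it is the question's list with the final '<for>' removed.

```python
l = ['\n', '<br>', 'while:', '<while>', 'for:']  # odd length: 'for:' has no partner

# l[::2] has three items and l[1::2] has two, so zip yields only two pairs.
d = dict(zip(l[::2], l[1::2]))
print(d)  # {'\n': '<br>', 'while:': '<while>'}
```

Note that the unpaired 'for:' is dropped without any warning, which may or may not be what you want.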
l=['\n','<br>','while:','<while>','for:','<for>']
s='line1\nline2'
for i in range(0, len(l), 2):
    s = s.replace(l[i], l[i+1])
You simply have to iterate over the list containing your desired pairs, stepping over 2 values each time, and then assign the result of the replacement back to the variable (replace doesn't work in place because strings are immutable in Python).
I have a string like
a = "X1+X2*X3*X1"
b = {"X1":"XX0","X2":"XX1","X0":"XX2"}
I want to replace the substring 'X1,X2,X3' using dict b.
However, when I replace using the below code,
for x in b:
    a = a.replace(x,b[x])
print(a)
'XXX2+XX1*X3*XXX2'
Expected result is XX0 + XX1*X3*XX0
I know it is because the substring is replaced in a loop, but I don't know how to solve it.
You can build a regex pattern by joining the keys with '|', then look up each match in the dictionary.
Try this:
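import re
a = "X1+X2*X3*X1"
b = {"X1":"XX0","X2":"XX1","X0":"XX2"}
# Escape the keys when building the pattern, in case they contain regex metacharacters.
pattern = re.compile("|".join(map(re.escape, b)))
# Each match is one of the original keys, so it can be looked up directly.
out = pattern.sub(lambda x: b[x.group(0)], a)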
Output:
>>> out
'XX0+XX1*X3*XX0'
You can use the repl parameter of re.sub:
import re
re.sub(r'X\d', lambda x: b.get(x.group(), x.group()), a)
output:
'XX0+XX1*X3*XX0'
The reason for this is that you are replacing within the same string multiple times, so between the iterations the output of one replacement becomes the input of the next: once X1 has been turned into XX0, the later X0 replacement matches inside it and produces XXX2. You probably don't see these intermediate states unless you debug the code.
Please note that dictionary keys were not ordered in older versions of Python, so there you cannot assume what's replaced when.
I suggest you use a template.
Edit: newer versions of Python do preserve insertion order - Are dictionaries ordered in Python 3.6+?
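A short trace of the loop from the question makes the intermediate strings visible. This is only a sketch: it assumes Python 3.7+, where dicts iterate in insertion order, and steps is a name introduced here for illustration.

```python
a = "X1+X2*X3*X1"
b = {"X1": "XX0", "X2": "XX1", "X0": "XX2"}

steps = []
for key, value in b.items():
    a = a.replace(key, value)
    steps.append(a)  # record the string after each pass

for step in steps:
    print(step)
# XX0+X2*X3*XX0
# XX0+XX1*X3*XX0
# XXX2+XX1*X3*XXX2
```

The last pass is the one that does the damage: the X0 it replaces did not exist in the original string, it was created by the first pass.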
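With just built-in functions. The keys and values of the given b dictionary overlap (the value XX0 contains the key X0), so a plain chain of replace calls interferes with itself. Instead, this version first replaces each key with a unique str.format placeholder and then fills in all placeholders at once. To get a literal { or } out of an f-string you double it ({{ and }}), which is how the placeholders are built. The enumerate index is appended so that every placeholder name is unique and no replacement can feed into another.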
a = "X1+X2*X3*X1"
b = {"X1":"XX0","X2":"XX1","X0":"XX2"}
new_dict = {}
for i, k in enumerate(b):
    sub_format = f'{k}' + f'{i}'
    new_dict[sub_format] = b[k]
    a = a.replace(k, f'{{{sub_format}}}')
print(a.format(**new_dict))
Output
XX0+XX1*X3*XX0
I was going through some Python challenges and this particular one has been bugging my mind and thought it would be worth getting some explaining. It reads:
Have the function LetterChanges(str) take the str parameter being passed and modify it using the following algorithm. Replace every letter in the string with the letter following it in the alphabet (ie. c becomes d, z becomes a). Then capitalize every vowel in this new string (a, e, i, o, u) and finally return this modified string.
Example:
Input: "fun times!"
Output: gvO Ujnft!
The code:
def LetterChanges(str):
    letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVW"
    changes = "bcdEfghIjklmnOpqrstUvwxyzABCDEFGHIJKLMNOPQRSTUVWZ"
    mapping = { k:v for (k,v) in zip(str+letters,str+changes) }
    return "".join([ mapping[c] for c in str ])
I understand that it takes two strings, letters and changes. It uses the zip() function, which takes iterables and 'zips' them into an iterator of pairs. The k:v for (k,v) part is a dict comprehension that turns those pairs into a dictionary.
My doubts are:
What exactly is happening with str+letters,str+changes and why it had to be done?
[ mapping[c] for c in str ] Why is it that by doing this, we accomplish the replacement of every key with its value or, as it says in the challenge description: "Replace every letter in the string with the letter following it in the alphabet"
This line:
mapping = { k:v for (k,v) in zip(str+letters,str+changes) }
As you already observed, creates a dictionary using dictionary comprehension syntax. The resulting dictionary will associate each letter with the "new" letter to be used when translating the string. Usually, it would be done like this:
mapping = {k: v for k, v in zip(source, destination)}
Or even shorter:
mapping = dict(zip(source, destination))
However, the next line does the following:
"".join([ mapping[c] for c in str ])
It blindly transforms every single character in str doing a lookup in the dictionary that was just created. If the string contains any character that is not in the mapping, this fails.
To get around this issue, whoever wrote the above code used the silly trick of first adding every single character of the string to the map, associating it with itself, and then adding the corresponding mapping for characters to be replaced.
So here:
mapping = { k:v for (k,v) in zip(str+letters,str+changes) }
The str+ before letters and before changes prepends the whole content of the string to both the originals and the replacements, creating a mapping for each character of the string that is not in letters.
This is the same as:
mapping = {k: k for k in str}
mapping.update({k: v for k, v in zip(letters, changes)})
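A quick check of that equivalence, as a sketch: text stands in for the parameter the original code calls str, and one_step / two_step are names made up for this comparison.

```python
text = "fun times!"
letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVW"
changes = "bcdEfghIjklmnOpqrstUvwxyzABCDEFGHIJKLMNOPQRSTUVWZ"

# One-step version, as written in the question.
one_step = {k: v for k, v in zip(text + letters, text + changes)}

# Two-step version: identity entries first, real replacements second.
two_step = {k: k for k in text}
two_step.update(zip(letters, changes))

print(one_step == two_step)  # True
print(one_step['!'])         # ! (not a letter, so it maps to itself)
print(one_step['f'])         # g (the identity entry was overwritten)
```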
Which is anyway both awful and slow, so to answer your question:
why it had to be done?
Because whoever wrote the code decided to. There's no need for it: it takes O(len(str)) extra time to build the mapping, going through the whole string, when there really is no need to. No Python programmer would have written it that way.
The 'good' way of doing it would have been:
mapping = dict(zip(source, destination))
return ''.join(mapping.get(c, c) for c in str)
All in all, the above code is pretty awkward and IMHO accomplishes the task in a very messy way.
Easy to spot problems are:
The mapping iterates over the whole string, which is totally unneeded.
A mapping is created to replace characters, but does not take advantage of the already existing str.maketrans() and str.translate() built-in methods available in Python.
The letters X, Y, Z are missing from the letters string, and therefore not transformed.
The list comprehension inside join is totally unneeded, it could be done without the square brackets [].
The variable name str overrides the global type name str, which is bad and should not be done.
A better solution would be:
def LetterChanges(s):
    old = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    # Every letter moves one step forward (z -> a, Z -> A); resulting vowels are upper-case.
    new = 'bcdEfghIjklmnOpqrstUvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZA'
    table = str.maketrans(old, new)
    return s.translate(table)
Even better would be to pre-calculate the table only one time and then use the already created one on successive calls:
def LetterChanges(s, table={}):
    # The mutable default argument is used on purpose as a cache that persists across calls.
    if not table:
        old = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
        new = 'bcdEfghIjklmnOpqrstUvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZA'
        table.update(str.maketrans(old, new))
    return s.translate(table)
Performance:
Original: 1.081s for 100k translations of Hello World!.
Updated: 0.400s for 100k translations of Hello World! (4.5x speedup).
Updated with caching: 0.082s for 100k translations of Hello World! (22.5x speedup).
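If you want to run a comparison like this yourself, a rough timeit sketch could look as follows. The three function names are made up here, with_dict is a dictionary-based variant rather than the exact original code, and NEW assumes every letter (upper-case included) moves one step forward. Absolute timings depend on the machine, so none are shown.

```python
import timeit

OLD = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
NEW = 'bcdEfghIjklmnOpqrstUvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZA'
TABLE = str.maketrans(OLD, NEW)  # built once, reused by the cached variant

def with_dict(s):
    # Rebuilds a dict on every call and joins character by character.
    mapping = dict(zip(OLD, NEW))
    return ''.join(mapping.get(c, c) for c in s)

def with_table(s):
    # Rebuilds the translation table on every call.
    return s.translate(str.maketrans(OLD, NEW))

def with_cached_table(s):
    # Reuses the module-level table.
    return s.translate(TABLE)

for fn in (with_dict, with_table, with_cached_table):
    seconds = timeit.timeit(lambda: fn('Hello World!'), number=100_000)
    print(f'{fn.__name__}: {seconds:.3f}s')
```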
What exactly is happening with str+letters,str+changes and why it had to be done?
Because the input string "fun times!" doesn't just contain letters from the alphabet; it also contains a space ' ' and an exclamation mark '!'. If these aren't keys in the dictionary mapping, then mapping[c] will raise a KeyError when c is one of those characters.
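Here is a small sketch of that failure, building the mapping from letters and changes alone (without the str + ... prefix). The names result and missing are introduced just for this demonstration.

```python
letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVW"
changes = "bcdEfghIjklmnOpqrstUvwxyzABCDEFGHIJKLMNOPQRSTUVWZ"
mapping = dict(zip(letters, changes))  # only letters are keys

try:
    result = "".join(mapping[c] for c in "fun times!")
except KeyError as err:
    result = None
    missing = err.args[0]  # the first character that was not a key
    print("KeyError for", repr(missing))  # KeyError for ' '
```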
So the purpose of zip(str + letters, str + changes) is to ensure that every character present in the string is mapped to itself in the dictionary, before adding the actually-required transformations into the dictionary. Note that because it's str + ... with str first, any letters of the alphabet in str will map to themselves first, and then be overwritten by the mapping from letters to changes.
That said, it would be simpler to use mapping.get instead of mapping[...], since the get method allows a default to be returned in case the key is not present. In that case, we don't have to make sure every character in the input string is present as a key in the dictionary.
def letter_changes(string):
    letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVW"
    changes = "bcdEfghIjklmnOpqrstUvwxyzABCDEFGHIJKLMNOPQRSTUVWZ"
    mapping = { k: v for (k, v) in zip(letters, changes) }
    return "".join(mapping.get(c, c) for c in string)
Here mapping.get(c, c) means, "get the mapping associated with the key c, or if c is not a key in the dictionary, just use c itself". This means a symbol like ' ' or '!' which is not in the dictionary will be left unchanged.
I am initializing my list object using following code.
list = [
func1(centroids[0],value),
func1(centroids[1],value),
....,
func1(centroids[n],value)]
I am trying to do it a more elegant way using some inline iteration. Following is the pseudo code of one possible way.
list = [value for value in func1(centroids[n],value)]
I am not clear how to call func1 in an iterative way. Can you suggest a possible implementation?
For a list of objects, Python knows how to iterate over it directly, so you can eliminate the index shown in most of the other answers:
res = [func1(c, value) for c in centroids]
That's all there is to it.
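A runnable sketch of the same idea. The question does not say what func1, centroids or value actually are, so the distance function and the sample points below are hypothetical stand-ins.

```python
import math

def func1(centroid, value):
    """Illustrative only: Euclidean distance between two 2-D points."""
    return math.hypot(centroid[0] - value[0], centroid[1] - value[1])

centroids = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
value = (0.0, 0.0)

# Index-based version versus iterating over the list directly.
by_index = [func1(centroids[i], value) for i in range(len(centroids))]
direct = [func1(c, value) for c in centroids]

print(direct)              # [0.0, 5.0, 10.0]
print(direct == by_index)  # True
```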
A simple list comprehension consists of the "template" list element, followed by the iterator needed to step through the desired values.
my_list = [func1(centroids[i], value)
           for i in range(n + 1)]
Use this code:
list = [func1(centroids[x], value) for x in range(n + 1)]
This is called a list comprehension. Put the expression that you want the list to contain up front, followed by the for loop. You can use the iterating variable of the for loop inside that expression. In this code, the list gets one element per value of x, each produced by the call func1(centroids[x], value). If the variable n equals, let's say, 3, list = [func1(centroids[0], value), func1(centroids[1], value), func1(centroids[2], value), func1(centroids[3], value)] would be equivalent to the code above.
I have a dictionary that has keys of different word lengths, for example:
d={'longggg':'a', 'short':'b', 'medium':'c', 'shor':'d'}
and I want to end up with a dictionary that only has keys that are greater than a certain length. For example, I want to only keep entries that are 6 letters long or more. So I want
new_d={'longggg':'a', 'medium':'c'}.
I tried
new_d=dict(k,v) for k,v in d.items() if len[k]>=6
and
new_d={}
for k, v in d.items():
    if len[k]>=6:
        new_d.update({k:v})
along with many other variations of that code, but the problem ends up being in taking the length of a key.
Use a dictionary comprehension. There is no need to write for k in d.keys(); just use for k in d, since iterating over a dictionary already yields its keys (and in Python 2 d.keys() builds a whole list that isn't needed at all). (A lesson I learnt from Stack Overflow itself!!)
Also, as @roganjosh pointed out, use len() instead of len[] (len is a function). Square brackets are used for indexing in, say, lists and strings.
d={'longggg':'a', 'short':'b', 'medium':'c', 'shor':'d'}
a = {k:d[k] for k in d if len(k)>=6}
print(a)
Output:
{'medium': 'c', 'longggg': 'a'}
You can try this:
d={'longggg':'a', 'short':'b', 'medium':'c', 'shor':'d'}
final_d = {a:b for a, b in d.items() if len(a) >= 6}
Output:
{'medium': 'c', 'longggg': 'a'}
len is a built-in function in Python, and therefore uses parentheses (not the square brackets operator).
The big issue (among other things) with your first solution is that you are creating a separate dictionary for each k, v.
Your second solution should work if you fix the len function call, but I would rewrite new_d.update({k:v}) as new_d[k] = v, since that's the standard way to use a Python dictionary.
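For reference, a sketch of the second attempt with both changes applied (the printed order assumes Python 3.7+, where dicts keep insertion order):

```python
d = {'longggg': 'a', 'short': 'b', 'medium': 'c', 'shor': 'd'}

new_d = {}
for k, v in d.items():
    if len(k) >= 6:   # len(...) is a function call, so parentheses
        new_d[k] = v  # plain item assignment instead of update({k: v})

print(new_d)  # {'longggg': 'a', 'medium': 'c'}
```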
I can tell you're new, and the best resource for beginner's questions like these will be the Python documentation rather than Stack Overflow. You should also try copy and pasting your error output into Google. You'll probably be able to solve your problems quicker and you'll get more meaningful answers.
I need to slice the leading character off the values of a dictionary, but only if the length of the value is greater than 1. Currently I'm doing this with a dictionary comprehension:
new_dict = {item[0]:item[1][1:] for item in old_dict if item.startswith('1')}
but I don't know how to modify this so that keys of length one are left alone.
The keys are the codewords of a Huffman code, and so start with '0' or '1'.
An example code is:
code = {'a':'0', 'b':'10', 'c':'110', 'd':'111'}
The above code works fine for 'b','c','d' but fails for 'a' (this is intentional - it's a unit test).
How do I correctly modify the above example to pass the test?
The nature of a comprehension is that it builds a new object iteratively, so if you want every key in the original object old_dict to have a corresponding key in new_dict, you simply have to process every key.
Also, you say "I need to slice the leading character off the keys a dictionary", but the code you give slices the leading characters off the values. I assume you mean values. I suggest the following:
new_dict = {key:(value[1:] if len(value) > 1 else value) for key,value in old_dict.items()}
Apart from using sequence assignment to make the iteration a bit clearer, I've used the if expression (equivalent to ternary operator in c-like languages) to incorporate the condition.
I've also dropped your original if clause, because I don't understand you to want to skip values starting with '1'.
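Applied to the example code from the question, a conditional expression of this kind behaves as follows. This is a sketch using value[1:] to drop the first character and items() for Python 3:

```python
old_dict = {'a': '0', 'b': '10', 'c': '110', 'd': '111'}

# Drop the first character only when the value has more than one character.
new_dict = {key: (value[1:] if len(value) > 1 else value)
            for key, value in old_dict.items()}

print(new_dict)  # {'a': '0', 'b': '0', 'c': '10', 'd': '11'}
```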
I'm not sure which variable is where but you could do something along these lines.
new_dict = {item[0]: (item[1][1:] if len(item[1]) > 1 else item[1]) for item in old_dict.items()}
If I understand your question correctly, you can accomplish it with this:
new_dict = {k:v[len(v)>1:] for k,v in old_dict.items()}
v[len(v)>1:] returns the value unchanged if it is only 1 character long (the slice start is False, i.e. 0), and strips off the leading character if it is longer than that (the slice start is True, i.e. 1).
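The trick relies on bool being a subclass of int, so a comparison result can be used directly as a slice start. A small sketch (values and results are names chosen for this example):

```python
values = ['0', '10', '110']

# len(v) > 1 is False (0) for '0' and True (1) for the longer strings.
results = [v[(len(v) > 1):] for v in values]

for v, r in zip(values, results):
    print(repr(v), '->', repr(r))
# '0' -> '0'
# '10' -> '0'
# '110' -> '10'
```

It is compact, but a conditional expression is arguably easier to read for anyone who has not seen the idiom before.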
I'm not sure what you are trying to accomplish with if item.startswith('1') as a qualifier for your dict comprehension, but if you need it you can add it back on. You may need to make it v.startswith('1'), though.