fix length re.sub with dictionary - python

I have a dictionary whose keys are all three letters long: threeLetterDict={'abc': 'foo', 'def': 'bar', 'ghi': 'ha' ...}
Now I need to translate the string abcdefghi into foobarha. I'm trying the approach below with re.sub, but I don't know how to plug the dictionary into it:
p = re.compile('.{3}') # match every three letters
re.sub(p,'how to put dictionary here?', "abcdefghi")
Thanks! (There is no need to check whether the input length is a multiple of three.)

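You can pass a callable as the replacement argument of re.sub; it is called with each match object and must return the replacement string, so: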
p.sub(lambda m: threeLetterDict[m.group(0)], "abcdefghi")
It works!
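For illustration, here is a self-contained sketch of the callable approach. The function name translate and the fallback for unknown chunks are my own assumptions, not part of the answer above; the answer's version raises KeyError when a chunk is missing from the dictionary.

```python
import re

# Sample mapping taken from the question.
three_letter_dict = {'abc': 'foo', 'def': 'bar', 'ghi': 'ha'}
pattern = re.compile('.{3}')  # match every run of three characters

def translate(text, table=three_letter_dict):
    # The lambda receives a match object; m.group(0) is the matched chunk.
    # Assumption: chunks that are not in the table are left unchanged.
    return pattern.sub(lambda m: table.get(m.group(0), m.group(0)), text)

print(translate("abcdefghi"))  # foobarha
print(translate("abcxyzghi"))  # fooxyzha
```

If a missing key should be an error instead, replace table.get(...) with table[m.group(0)].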

A solution that avoids re entirely:
threeLetterDict={'abc': 'foo', 'def': 'bar', 'ghi': 'ha'}
threes = map("".join, zip(*[iter('abcdefghi')]*3))
"".join(threeLetterDict[three] for three in threes)
#>>> 'foobarha'
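The zip(*[iter(s)]*3) idiom works because the same iterator object is passed to zip three times, so each output tuple consumes three consecutive characters. If that reads as too clever, a slicing version may be easier to follow. This is only a sketch; translate_chunks and its size parameter are names I made up.

```python
three_letter_dict = {'abc': 'foo', 'def': 'bar', 'ghi': 'ha'}

def translate_chunks(text, table, size=3):
    # Cut the text into consecutive slices of `size` characters.
    chunks = (text[i:i + size] for i in range(0, len(text), size))
    # Look up each slice; a slice missing from the table raises KeyError.
    return "".join(table[chunk] for chunk in chunks)

print(translate_chunks('abcdefghi', three_letter_dict))  # foobarha
```

One difference to be aware of: zip silently drops a trailing incomplete group, while slicing keeps it, so here a leftover chunk of one or two characters would fail the lookup.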

You might not need to use sub here:
>>> p = re.compile('.{3}')
>>> ''.join([threeLetterDict.get(i, i) for i in p.findall('abcdefghi')])
'foobarha'
Just an alternate solution :).

Related

I am able to parse the log file but not getting output in correct format in python [duplicate]

How do I concatenate a list of strings into a single string?
For example, given ['this', 'is', 'a', 'sentence'], how do I get "this-is-a-sentence"?
For handling a few strings in separate variables, see How do I append one string to another in Python?.
For the opposite process - creating a list from a string - see How do I split a string into a list of characters? or How do I split a string into a list of words? as appropriate.

Use str.join:
>>> words = ['this', 'is', 'a', 'sentence']
>>> '-'.join(words)
'this-is-a-sentence'
>>> ' '.join(words)
'this is a sentence'

A more generic way (covering also lists of numbers) to convert a list to a string would be:
>>> my_lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> my_lst_str = ''.join(map(str, my_lst))
>>> print(my_lst_str)
12345678910
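Both answers above can be folded into one small helper. This is just a sketch; join_items and its sep parameter are hypothetical names. It converts every item with str() first, because str.join raises TypeError if any item is not a string.

```python
def join_items(items, sep="-"):
    """Join any iterable into one string, converting each item with str()."""
    return sep.join(str(item) for item in items)

print(join_items(['this', 'is', 'a', 'sentence']))  # this-is-a-sentence
print(join_items(range(1, 4), sep=", "))            # 1, 2, 3
```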

It is useful for beginners to understand why join is a string method. It looks strange at first, but it makes sense once you see the reasons:
The result of join is always a string, but the object being joined can be any iterable of strings (generator, list, tuple, etc.).
.join is also faster than repeated concatenation with +, because it allocates the memory for the result only once.
Once you are used to it, you can chain calls, for example to add parentheses:
>>> ",".join("12345").join(("(",")"))
'(1,2,3,4,5)'
>>> parens = ["(",")"]
>>> ",".join("12345").join(parens)
'(1,2,3,4,5)'

Edit from the future: Please don't use the answer below. This function was removed in Python 3 and Python 2 is dead. Even if you are still using Python 2 you should write Python 3 ready code to make the inevitable upgrade easier.
Although @Burhan Khalid's answer is good, I think it's more understandable like this:
from string import join  # Python 2 only; this function is not available in Python 3
sentence = ['this','is','a','sentence']
join(sentence, "-")
The second argument to join() is optional and defaults to " ".

list_abc = ['aaa', 'bbb', 'ccc']
string = ''.join(list_abc)
print(string)
# aaabbbccc
string = ','.join(list_abc)
print(string)
# aaa,bbb,ccc
string = '-'.join(list_abc)
print(string)
# aaa-bbb-ccc
string = '\n'.join(list_abc)
print(string)
# aaa
# bbb
# ccc

We can also use Python's reduce function:
from functools import reduce
sentence = ['this','is','a','sentence']
out_str = str(reduce(lambda x,y: x+"-"+y, sentence))
print(out_str)

We can specify how we join the string. Instead of '-', we can use ' ':
sentence = ['this','is','a','sentence']
s=(" ".join(sentence))
print(s)

If you have a mixed content list and want to stringify it, here is one way:
Consider this list:
>>> aa = [None, 10, 'hello']
Convert it to a string:
>>> st = ', '.join(map(str, map(lambda x: f'"{x}"' if isinstance(x, str) else x, aa)))
>>> st = '[' + st + ']'
>>> st
'[None, 10, "hello"]'
If required, convert it back to a list:
>>> import ast
>>> ast.literal_eval(st)
[None, 10, 'hello']
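If the goal is only a string that ast.literal_eval can turn back into the list, repr() may be a simpler route, since it quotes and escapes the strings for you. This is a sketch with made-up sample data; note that the manual quoting above would produce an invalid literal for a string that itself contains a double quote.

```python
import ast

# Hypothetical sample data, including a string with embedded double quotes.
aa = [None, 10, 'hello', 'say "hi"']

st = repr(aa)                    # Python literal syntax for the whole list
restored = ast.literal_eval(st)  # safely parse the literal back into a list
print(restored == aa)            # True
```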

If you want the final result to be a single string of quoted strings separated by commas, you can use something like this:
sentence = ['this','is','a','sentence']
sentences_strings = "'" + "','".join(sentence) + "'"
print(sentences_strings) # you will get "'this','is','a','sentence'"

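def eggs(someParameter):
    # Replace the last item in place; the caller's list is modified too.
    del someParameter[3]
    someParameter.insert(3, ' and cats.')

spam = ['apples', 'bananas', 'tofu', 'cats']
eggs(spam)
spam = ','.join(spam)
print(spam)  # apples,bananas,tofu, and cats.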

Without the .join() method, you can use a loop instead:
my_list = ["this", "is", "a", "sentence"]
concatenated_string = ""
for i in range(len(my_list)):
    if i == len(my_list) - 1:
        concatenated_string += my_list[i]
    else:
        concatenated_string += f'{my_list[i]}-'
print([concatenated_string])
# ['this-is-a-sentence']
This range-based for loop appends "-" after every word except the last one: when the loop reaches the last word of the list, it adds only the word itself to concatenated_string.

How to read user command input and store parts in variables

So let's say that user types !give_money user#5435 33000
Now I want to take that user#5435 and 33000 and store them in variables.
How do I do that? Maybe it is very simple but I don't know.
If you need any more info please comment.
Thanks!

your_string = "!give_money user#5435 33000"  # the command typed by the user
list_of_sub_string = your_string.split()
print(list_of_sub_string[-1])  # 33000
print(list_of_sub_string[-2])  # user#5435

Split the input on spaces and extract the second and third elements:
parts = input().split()
user = parts[1]
numb = parts[2]
Although it would be more Pythonic to unpack into variables (discarding the first with a conventional underscore):
_, user, numb = input().split()
Just to elaborate further, input().split() returns a list of the substrings obtained by splitting at the delimiter passed to split. When no argument is passed, the string is split on runs of whitespace.
To get a feel, observe:
>>> 'hello there bob'.split()
['hello', 'there', 'bob']
>>> 'split,on,commas'.split(',')
['split', 'on', 'commas']
and then unpacking just assigns each element of a list to its own variable:
>>> a, b, c = [1, 2, 3]
>>> a
1
>>> b
2
>>> c
3
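Putting the pieces together, a command parser could look like the sketch below. The function name parse_give_money, the validation, and the conversion of the amount to int are my own assumptions about what the asker needs.

```python
def parse_give_money(command):
    """Split '!give_money <user> <amount>' into (user, amount)."""
    parts = command.split()  # split on runs of whitespace
    # Assumption: anything other than exactly three parts is an error.
    if len(parts) != 3 or parts[0] != "!give_money":
        raise ValueError(f"unexpected command: {command!r}")
    _, user, amount = parts  # discard the command name
    return user, int(amount)  # int() raises ValueError for non-numeric input

user, amount = parse_give_money("!give_money user#5435 33000")
print(user, amount)  # user#5435 33000
```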

How to find all element from str

I want to find every element of a dict that occurs in a string.
I tried to write code for this, but it doesn't work well.
I am considering a recursive function.
str = "xx111xxx200x222x"
nums = {"one hundreds": ["100","111"], "two hundreds": ["200", "222"]}
result = []
def allfind(data):
    for key in nums.keys():
        for num in nums[key]:
            index = data.find(num)
            if index > -1:
                result.append(key)
                return allfind(data[index+len(num):])
allfind("xx111xxx200x222x")
print(result) # prints ["two hundreds", "two hundreds"]
# I want to get ["one hundreds", "two hundreds", "two hundreds"]

I would invert the dictionary so that every number becomes a key mapped to its label. Combined with the regex split suggested by Grijesh Chauhan, looking up the labels becomes easy:
import re
nums, my_str = {num: key for key in nums for num in nums[key]}, "xx111xxx200x222x"
print(nums)
# {'200': 'two hundreds', '100': 'one hundreds', '111': 'one hundreds', '222': 'two hundreds'} (key order may vary)
print([nums[item] for item in re.split('x+', my_str) if item in nums])
# ['one hundreds', 'two hundreds', 'two hundreds']

You can do something like this (read the comments):
>>> import re
>>> r = [] # return list
>>> for i in re.split('x+', "xx111xxx200x222x"): # outer loop
...     for k in nums: # iterate over each key
...         if i in nums[k]: # check whether i is in the list at key k
...             r.append(k) # if so, add the key to the return list
...
>>> r
['one hundreds', 'two hundreds', 'two hundreds']
Note that the outer loop iterates over the following:
>>> re.split('x+', "xx111xxx200x222x")
['', '111', '200', '222', '']
# The empty strings at both ends do not appear in any of the dict's value lists, so they are skipped.

You are getting the wrong answer because nums is a dictionary, and dictionaries were unordered before Python 3.7, so
nums.keys() can come out as ['two hundreds', 'one hundreds']
Hence "two hundreds" (matching 200) is your first result, and then
return allfind(data[index+len(num):])
recurses on the remaining string x222x, which of course contains only a "two hundreds" value (222), so the final result is
['two hundreds', 'two hundreds']
Now that you know the cause, the fix is to iterate over the keys of nums in a well-defined order (think list).
Also, add simple print statements for easier debugging whenever possible.
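One way to avoid depending on key order at all is to scan the string from left to right and map each number found back to its label. The sketch below is not taken from any of the answers; label_for, pattern and all_find are hypothetical names, and it assumes that none of the numbers overlap inside the string.

```python
import re

nums = {"one hundreds": ["100", "111"], "two hundreds": ["200", "222"]}

# Invert the mapping: each number points back to its label.
label_for = {num: key for key, values in nums.items() for num in values}

# One alternation pattern that matches any of the known numbers.
pattern = re.compile("|".join(map(re.escape, label_for)))

def all_find(data):
    # finditer yields matches in the order they appear in the string.
    return [label_for[m.group(0)] for m in pattern.finditer(data)]

print(all_find("xx111xxx200x222x"))
# ['one hundreds', 'two hundreds', 'two hundreds']
```

Unlike the re.split('x+', ...) approach, this does not assume that the separator is always the letter x.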

Splitting a string by capital letters

I currently have the following code, which finds capital letters in a string 'formula': http://pastebin.com/syRQnqCP
Now, my question is, how can I alter that code (disregard the bit within the "if choice = 1:" block) so that each part of the newly broken-up string is put into its own variable?
For example, entering NaBr would result in the string being broken into "Na" and "Br". I need to put those in separate variables so I can look them up in my CSV file.
Ideally the variables would be generated automatically, so that with three elements, as in MgSO4, O would get its own variable just as Mg and S do.
If this is unclear, let me know and I'll try to make it a bit more comprehensible... No way of doing so comes to mind currently, though. :(
EDIT: Relevant pieces of code:
Function:
def split_uppercase(string):
    x=''
    for i in string:
        if i.isupper(): x+=' %s' %i
        else: x+=i
    return x.strip()
String entry and lookup:
formula = raw_input("Enter formula: ")
upper = split_uppercase(formula)
#Pull in data from form.csv
weight1 = float(formul_data.get(element1.lower()))
weight2 = float(formul_data.get(element2.lower()))
weight3 = float(formul_data.get(element3.lower()))
weightSum = weight1 + weight2 + weight3
print "Total weight =", weightSum

I think there is a far easier way to do what you're trying to do: use regular expressions. For instance:
>>> import re
>>> txt = 'MgSO4'
>>> [a for a in re.split(r'([A-Z][a-z]*)', txt) if a]
['Mg', 'S', 'O', '4']
If you want the number attached to the right element, just add a digit specifier to the regex:
>>> [a for a in re.split(r'([A-Z][a-z]*\d*)', txt) if a]
['Mg', 'S', 'O4']
You don't really want to "put each part in its own variable". That doesn't make sense in general, because you don't know how many parts there are, so you can't know how many variables to create ahead of time. Instead, you want to make a list, like in the example above. Then you can iterate over this list and do what you need to do with each piece.
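To make "iterate over this list" concrete, here is a rough sketch of the weight lookup from the question. The weights dict stands in for the asker's CSV data and its values are only illustrative; formula_weight is a hypothetical name, and multiplying by the trailing count is my own assumption (the question's code simply adds one weight per element).

```python
import re

# Hypothetical atomic weights standing in for the data loaded from form.csv.
weights = {"mg": 24.305, "s": 32.06, "o": 15.999}

def formula_weight(formula):
    total = 0.0
    # findall with two groups returns (symbol, count) pairs, e.g. ('O', '4').
    for symbol, count in re.findall(r"([A-Z][a-z]*)(\d*)", formula):
        # An empty count means the element occurs once.
        total += weights[symbol.lower()] * int(count or 1)
    return total

print(round(formula_weight("MgSO4"), 3))  # 120.361
```

This handles any number of elements without creating a separate variable for each one.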

You can use re.split to perform complex splitting on strings.
import re
def split_upper(s):
    # Drop the empty strings that re.split leaves between adjacent matches.
    return [part for part in re.split("([A-Z][^A-Z]*)", s) if part]
>>> split_upper("fooBarBaz")
['foo', 'Bar', 'Baz']
>>> split_upper("fooBarBazBB")
['foo', 'Bar', 'Baz', 'B', 'B']
>>> split_upper("fooBarBazBB4")
['foo', 'Bar', 'Baz', 'B', 'B4']
