Splitting a string by capital letters

I currently have the following code, which finds capital letters in a string 'formula': http://pastebin.com/syRQnqCP
Now, my question is: how can I alter that code (disregard the bit within the "if choice = 1:" block) so that each part of the newly broken-up string is put into its own variable?
For example, putting in NaBr would result in the string being broken into "Na" and "Br". I need to put those in separate variables so I can look them up in my CSV file.
Preferably it'd be a kind of generated thing, so if there are 3 elements, like MgSO4, O would be put into a separate variable like Mg and S would be.
If this is unclear, let me know and I'll try and make it a bit more comprehensible... No way of doing so comes to mind currently, though. :(
EDIT: Relevant pieces of code:
Function:
def split_uppercase(string):
    x = ''
    for i in string:
        if i.isupper(): x += ' %s' % i
        else: x += i
    return x.strip()
String entry and lookup:
formula = raw_input("Enter formula: ")
upper = split_uppercase(formula)
#Pull in data from form.csv
weight1 = float(formul_data.get(element1.lower()))
weight2 = float(formul_data.get(element2.lower()))
weight3 = float(formul_data.get(element3.lower()))
weightSum = weight1 + weight2 + weight3
print "Total weight =", weightSum

I think there is a far easier way to do what you're trying to do. Use regular expressions. For instance:
>>> import re
>>> [a for a in re.split(r'([A-Z][a-z]*)', 'MgSO4') if a]
['Mg', 'S', 'O', '4']
If you want the number attached to the right element, just add a digit specifier in the regex:
>>> [a for a in re.split(r'([A-Z][a-z]*\d*)', 'MgSO4') if a]
['Mg', 'S', 'O4']
You don't really want to "put each part in its own variable". That doesn't make sense in general, because you don't know how many parts there are, so you can't know how many variables to create ahead of time. Instead, you want to make a list, like in the example above. Then you can iterate over this list and do what you need to do with each piece.
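As a rough sketch of the "make a list and iterate over it" idea applied to the original problem: the snippet below splits a formula into (symbol, count) pairs and sums the weights. The `ATOMIC_WEIGHTS` dictionary and the function names are hypothetical stand-ins for the data the question loads from its CSV file, and the pattern assumes simple formulas with no parentheses.

```python
import re

# Hypothetical atomic weights; the question reads these from a CSV file instead.
ATOMIC_WEIGHTS = {"na": 22.990, "br": 79.904, "mg": 24.305, "s": 32.06, "o": 15.999}

def split_elements(formula):
    """Return (symbol, count) pairs; a missing count is treated as 1."""
    pairs = re.findall(r"([A-Z][a-z]*)(\d*)", formula)
    return [(symbol, int(count) if count else 1) for symbol, count in pairs]

def formula_weight(formula, weights=ATOMIC_WEIGHTS):
    """Sum weight * count over every element found in the formula."""
    return sum(weights[symbol.lower()] * count
               for symbol, count in split_elements(formula))

print(split_elements("MgSO4"))
print(round(formula_weight("NaBr"), 3))
```

Because the parts live in a list, the same loop handles two, three, or any number of elements without needing element1, element2, element3.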

You can use re.split to perform complex splitting on strings.
import re
def split_upper(s):
    # list() is needed on Python 3, where filter() returns an iterator
    return list(filter(None, re.split("([A-Z][^A-Z]*)", s)))
>>> split_upper("fooBarBaz")
['foo', 'Bar', 'Baz']
>>> split_upper("fooBarBazBB")
['foo', 'Bar', 'Baz', 'B', 'B']
>>> split_upper("fooBarBazBB4")
['foo', 'Bar', 'Baz', 'B', 'B4']

Related

I am able to parse the log file but not getting output in correct format in python [duplicate]

How do I concatenate a list of strings into a single string?
For example, given ['this', 'is', 'a', 'sentence'], how do I get "this-is-a-sentence"?
For handling a few strings in separate variables, see How do I append one string to another in Python?.
For the opposite process - creating a list from a string - see How do I split a string into a list of characters? or How do I split a string into a list of words? as appropriate.

Use str.join:
>>> words = ['this', 'is', 'a', 'sentence']
>>> '-'.join(words)
'this-is-a-sentence'
>>> ' '.join(words)
'this is a sentence'

A more generic way (covering also lists of numbers) to convert a list to a string would be:
>>> my_lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> my_lst_str = ''.join(map(str, my_lst))
>>> print(my_lst_str)
12345678910

It's very useful for beginners to know
why join is a string method.
It's very strange at the beginning, but very useful after this.
The result of join is always a string, but the object to be joined can be of many types (generators, list, tuples, etc).
.join is faster because it allocates memory only once, which makes it better than classical concatenation (see the extended explanation).
Once you learn it, it's very comfortable and you can do tricks like this to add parentheses.
>>> ",".join("12345").join(("(", ")"))
'(1,2,3,4,5)'
>>> parens = ["(", ")"]
>>> ",".join("12345").join(parens)
'(1,2,3,4,5)'
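To illustrate the point that the thing being joined can be many types, here is a small sketch. The variable names are made up for the example. It also shows the one restriction: every item must already be a string, otherwise join raises TypeError.

```python
words = ("this", "is", "a", "sentence")

# A tuple works just like a list.
from_tuple = "-".join(words)

# So does a generator expression.
from_generator = "-".join(w.upper() for w in words)

# Non-string items are rejected, so convert them first.
try:
    "-".join([1, 2, 3])
except TypeError as exc:
    error_message = str(exc)

from_numbers = "-".join(str(n) for n in [1, 2, 3])

print(from_tuple, from_generator, from_numbers)
```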

Edit from the future: Please don't use the answer below. This function was removed in Python 3 and Python 2 is dead. Even if you are still using Python 2 you should write Python 3 ready code to make the inevitable upgrade easier.
Although @Burhan Khalid's answer is good, I think it's more understandable like this:
from string import join
sentence = ['this','is','a','sentence']
join(sentence, "-")
The second argument to join() is optional and defaults to " ".

list_abc = ['aaa', 'bbb', 'ccc']
string = ''.join(list_abc)
print(string)
# aaabbbccc
string = ','.join(list_abc)
print(string)
# aaa,bbb,ccc
string = '-'.join(list_abc)
print(string)
# aaa-bbb-ccc
string = '\n'.join(list_abc)
print(string)
# aaa
# bbb
# ccc

We can also use Python's reduce function:
from functools import reduce
sentence = ['this','is','a','sentence']
out_str = str(reduce(lambda x,y: x+"-"+y, sentence))
print(out_str)

We can specify how we join the string. Instead of '-', we can use ' ':
sentence = ['this','is','a','sentence']
s=(" ".join(sentence))
print(s)

If you have a mixed content list and want to stringify it, here is one way:
Consider this list:
>>> aa
[None, 10, 'hello']
Convert it to string:
>>> st = ', '.join(map(str, map(lambda x: f'"{x}"' if isinstance(x, str) else x, aa)))
>>> st = '[' + st + ']'
>>> st
'[None, 10, "hello"]'
If required, convert back to the list:
>>> import ast
>>> ast.literal_eval(st)
[None, 10, 'hello']

If you want to generate a string of strings separated by commas in final result, you can use something like this:
sentence = ['this','is','a','sentence']
sentences_strings = "'" + "','".join(sentence) + "'"
print (sentences_strings) # you will get "'this','is','a','sentence'"

def eggs(someParameter):
    del spam[3]
    someParameter.insert(3, ' and cats.')
spam = ['apples', 'bananas', 'tofu', 'cats']
eggs(spam)
spam = ','.join(spam)
print(spam)

Without the .join() method you can use this approach:
my_list = ["this", "is", "a", "sentence"]
concatenated_string = ""
for index in range(len(my_list)):
    if index == len(my_list) - 1:
        concatenated_string += my_list[index]
    else:
        concatenated_string += f'{my_list[index]}-'
print([concatenated_string])
# ['this-is-a-sentence']
The loop iterates over the indices of the list. When it reaches the last word, it shouldn't add "-" to concatenated_string. For every other word, it appends the word followed by "-".

How to read user command input and store parts in variables

So let's say that user types !give_money user#5435 33000
Now I want to take that user#5435 and 33000 and store them in variables.
How do I do that? Maybe it is very simple but I don't know.
If you need any more info please comment.
Thanks!

list_of_sub_string=YourString.split()
print(list_of_sub_string[-1]) #33000
print(list_of_sub_string[-2]) #user#5435

Split the input on spaces and extract the second and third elements:
parts = input().split()
user = parts[1]
numb = parts[2]
Although it would be more Pythonic to unpack into variables (discarding the first with a conventional underscore):
_, user, numb = input().split()
Just to elaborate further, input().split() returns a list of the substrings split at the delimiter passed to split(). When no argument is passed, the string is split on runs of whitespace.
To get a feel, observe:
>>> 'hello there bob'.split()
['hello', 'there', 'bob']
>>> 'split,on,commas'.split(',')
['split', 'on', 'commas']
and then unpacking just assigns variables to each element in a list:
>>> a, b, c = [1, 2, 3]
>>> a
1
>>> b
2
>>> c
3
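Putting the pieces together for the command in the question, a minimal sketch might look like this. The function name `parse_give_money` and the validation rules are assumptions made for illustration; the question does not say how malformed commands should be handled.

```python
def parse_give_money(message):
    """Split '!give_money <user> <amount>' into a (user, amount) pair."""
    parts = message.split()
    # Assumption: anything other than exactly three parts is an error.
    if len(parts) != 3 or parts[0] != "!give_money":
        raise ValueError("expected: !give_money <user> <amount>")
    _, user, amount_text = parts
    # int() raises ValueError itself if the amount is not a whole number.
    return user, int(amount_text)

user, amount = parse_give_money("!give_money user#5435 33000")
print(user, amount)
```

Converting the amount with int() right away means the rest of the program can do arithmetic on it instead of carrying a string around.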

Program to make an acronym with a period in between each letter

So I'm trying to make a program in Python PyScripter 3.3 that takes input and converts the input into an acronym. This is what I'm looking for:
your input: center of earth
programs output: C.O.E.
I don't really know how to go about doing this, I am looking for not just the right answer, but an explanation of why certain code is used, thanks..
What I have tried so far:
def first_letters(lst):
    return [s[:1] for s in converted]
def main():
    lst = input("What is the phrase you wish to convert into an acronym?")
    converted = lst.split().upper()
Beyond here I am not really sure where to go. So far I know I need to capitalize the input and split it into separate words, and beyond that I'm not sure where to go...

I like Python 3.
>>> s = 'center of earth'
>>> print(*(word[0] for word in s.upper().split()), sep='.', end='.\n')
C.O.E.
s = 'center of earth' - Assign the string.
s.upper() - Make the string uppercase. This goes before split() because split() returns a list and upper() doesn't work on lists.
.split() - Split the uppercased string into a list.
for word in - Iterate through each element of the created list.
word[0] - The first letter of each word.
* - Unpack this generator and pass each element as an argument to the print function.
sep='.' - Specify a period to separate each printed argument.
end='.\n' - Specify a period and a newline to print after all the arguments.
print - Print it.
As an alternative:
>>> s = 'center of earth'
>>> '.'.join(filter(lambda x: x.isupper(), s.title())) + '.'
'C.O.E.'
s = 'center of earth' - Assign the string.
s.title() - Change the string to Title Case.
filter - Filter the string, retaining only those elements that are approved by a predicate (the lambda below).
lambda x: x.isupper() - Define an anonymous inline function that takes an argument x and returns whether x is uppercase.
'.'.join - Join all the filtered elements with a '.'.
+ '.' - Add a period to the end.
Note that this one returns a string instead of simply printing it to the console.

>>> import re
>>> s = "center of earth"
>>> re.sub('[a-z ]+', '.', s.title())
'C.O.E.'
>>> "".join(i[0].upper() + "." for i in s.split())
'C.O.E.'

Since you want an explanation and not just an answer:
>>> s = 'center of earth'
>>> s = s.split() # split it into words
>>> s
['center', 'of', 'earth']
>>> s = [i[0] for i in s] # get only the first letter of each word
>>> s
['c', 'o', 'e']
>>> s = [i.upper() for i in s] # convert the letters to uppercase
>>> s
['C', 'O', 'E']
>>> s = '.'.join(s) # join the letters into a string
>>> s
'C.O.E'
>>> s = s + '.' # add the dot at the end
>>> s
'C.O.E.'
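For comparison, here is one way the attempt from the question could be restructured using the same steps. This is only a sketch: `make_acronym` is a hypothetical name, and it returns the result rather than reading input so it can be tried directly.

```python
def first_letters(words):
    # word[:1] (rather than word[0]) is safe even for an empty string.
    return [word[:1] for word in words]

def make_acronym(phrase):
    # upper() must run on the string, before split() turns it into a list.
    words = phrase.upper().split()
    # Put a period after every letter, including the last one.
    return "".join(letter + "." for letter in first_letters(words))

print(make_acronym("center of earth"))
```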

Output string.split() into array elements in one line

I have a string myString:
myString = "alpha beta gamma"
I want to split myString into its three words:
myWords = myString.split()
Then, I can access each word individually:
firstWord = myWords[0]
secondWord = myWords[1]
thirdWord = myWords[2]
My question: How can I assign these three words in just one line, as an output from the split() function? For example, something like:
[firstWord secondWord thirdWord] = myString.split()
What's the syntax in Python 2.7?

Almost exactly what you tried works:
firstWord, secondWord, thirdWord = myString.split()
Demo:
>>> first, second, third = "alpha beta gamma".split()
>>> first
'alpha'
>>> second
'beta'
>>> third
'gamma'

To expand on Mr. Pieter's answer and the comment raised by TheSoundDefense,
Should myString.split() return more values than you allow, it will break with ValueError: too many values to unpack. In Python 3 you can do
first, second, third, *extraWords = myString.split()
And this will dump everything extra into a list called extraWords. However, for Python 2 it gets a little more complicated and less convenient as described in this question and answer.
Alternatively, you could change how you're storing the variables and put them in a dictionary using a comprehension.
>>> words = {i:word for i,word in enumerate(myString.split())}
>>> words
{0: 'alpha', 1: 'beta', 2: 'gamma'}
This has the advantage of avoiding the value unpacking altogether (which I think is less than ideal in Python 2). However, this can obfuscate the variable names, as they are now referred to as words[0] instead of a more specific name.
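A short sketch of the options for handling extra words, with made-up variable names. The first form needs Python 3. The slicing and maxsplit forms rely only on list slicing and the optional second argument of split(), so they should also work on Python 2.

```python
my_string = "alpha beta gamma delta epsilon"

# Python 3 only: the starred name collects whatever is left over.
first, second, third, *extra_words = my_string.split()

# Slicing works on any version: take three, keep the rest separately.
words = my_string.split()
first2, second2, third2 = words[:3]
extra2 = words[3:]

# maxsplit limits the number of splits; the remainder stays one string.
head, tail = my_string.split(None, 1)

print(first, extra_words, tail)
```

Note that slicing hides the "too few values" case only partly: words[:3] on a two-word string still fails to unpack into three names.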

The above answer is correct. If you also want to keep the words in a list, chain the assignment:
my_list = [first, second, third] = "alpha beta gamma".split()

Alternatively, using a list comprehension ...
mystring = "alpha beta gamma"
myWords = [x for x in mystring.split()]
first = myWords[0]
second = myWords[1]
third = myWords[2]
print(first, second, third)
alpha beta gamma

fix length re.sub with dictionary

I have a dictionary with all keys three letter long: threeLetterDict={'abc': 'foo', 'def': 'bar', 'ghi': 'ha' ...}
Now I need to translate a sentence abcdefghi into foobarha. I'm trying the method below with re.sub, but I don't know how to plug the dictionary into it:
p = re.compile('.{3}') # match every three letters
re.sub(p,'how to put dictionary here?', "abcdefghi")
Thanks! (no need to check if input length is multiple of three)

You can pass any callable to re.sub, so:
p.sub(lambda m: threeLetterDict[m.group(0)], "abcdefghi")
It works!
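A self-contained sketch of the callable-replacement approach follows. The `translate` helper is a hypothetical wrapper, and falling back to the original text for unknown chunks via dict.get is an assumption: the question does not say what should happen to keys missing from the dictionary.

```python
import re

three_letter_dict = {"abc": "foo", "def": "bar", "ghi": "ha"}
pattern = re.compile(r".{3}")  # match every run of three characters

def translate(text, table=three_letter_dict):
    # The function passed to sub() receives each match object and
    # returns the replacement string for that match.
    return pattern.sub(lambda m: table.get(m.group(0), m.group(0)), text)

print(translate("abcdefghi"))
print(translate("abcxyzghi"))
```

With plain indexing (table[m.group(0)]) an unknown chunk would raise KeyError instead, which may be preferable if missing keys indicate bad input.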

A solution that avoids re entirely:
threeLetterDict={'abc': 'foo', 'def': 'bar', 'ghi': 'ha'}
threes = map("".join, zip(*[iter('abcdefghi')]*3))
"".join(threeLetterDict[three] for three in threes)
#>>> 'foobarha'

You might not need to use sub here:
>>> p = re.compile('.{3}')
>>> ''.join([threeLetterDict.get(i, i) for i in p.findall('abcdefghi')])
'foobarha'
Just an alternate solution :).
