Python String Concatenation - concatenating '\n' - python

I am new to Python and need help understanding two problems I am having with string concatenation. I am aware that strings can be concatenated using the + symbol, like so:
>>> 'a' + 'b'
'ab'
However, I recently found out (by accident, while fiddling around) that you do not even need the + symbol to concatenate strings, which leads to my first problem: how and why is this possible?
>>> print 'a' 'b'
ab
Furthermore, I understand that the '\n' string produces a newline. But when I use it in conjunction with my first problem, I get the following:
>>> print '\n' 'a'*7
a
a
a
a
a
a
a
So my second problem arises: why do I get 7 new lines of the letter 'a'? In other words, shouldn't the repetition operator, *, repeat the letter 'a' 7 times, as follows?
>>> print 'a'*7
aaaaaaa
Please help me clarify what is going on.

When "a" "b" is turned into "ab", this isn't the same as concatenating the strings with +. When the Python source code is being read, adjacent string literals are automatically joined for convenience.
This isn't a normal operation, which is why it isn't following the order of operations you expect for + and *.
print '\n' 'a'*7
is actually interpreted the same as
print '\na'*7
and not as
print '\n' + 'a'*7

Python concatenates adjacent string literals when you do not separate them with a comma:
>>> print 'a' 'b'
ab
>>> print 'a', 'b'
a b
So you are actually printing '\na' 7 times.

I'm not sure what you mean by "how is it possible". You write a rule: two string literals next to each other get concatenated. Then you implement it in the parser. Why? Because it allows you to conveniently do things like this:
re.findall('(<?=(foo))' # The first part of a complicated regexp
'>asdas s' # The next part
'[^asd]' # The last part
)
That way, you can describe just what you're doing.
When you do A * B + C, the computer always does A times B first, then adds C, because multiplication comes before addition.
When you do string concatenation by putting the string literals next to each other, and multiplication, the special string concatenation comes first. This means '\n' 'a' * 7 is the same as ('\n' 'a') * 7, so the string you're repeating is '\na'.
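The precedence rule described above can be checked directly. This is a minimal sketch, written for Python 3 (so it uses assertions instead of the print statement from the question); the variable names are only illustrative.

```python
# Adjacent literals are merged first, so the repeated string is '\na'.
implicit = '\n' 'a' * 7

# With an explicit +, the * operator binds tighter, so only 'a' is repeated.
explicit = '\n' + 'a' * 7

assert implicit == '\na' * 7
assert explicit == '\naaaaaaa'

# Seven newlines in the first case, one in the second.
print(implicit.count('\n'), explicit.count('\n'))
```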

You've probably already realised that relying on the implicit concatenation of adjacent strings is sometimes problematic. Also, repeatedly concatenating with the + operator is not efficient. It's not noticeable when joining only a few small strings, but it can be very noticeable at scale.
Be explicit about it; use ''.join():
print ''.join(['\n', 'a'*7])

Related

How to print multiple letters in a string when combining two prompts [duplicate]

I want to get a new string from the third character to the end of the string, e.g. myString[2:end]. If omitting the second part means 'to the end', does omitting the first part mean it starts from the start?
>>> x = "Hello World!"
>>> x[2:]
'llo World!'
>>> x[:2]
'He'
>>> x[:-2]
'Hello Worl'
>>> x[-2:]
'd!'
>>> x[2:-2]
'llo Worl'
Python calls this concept "slicing" and it works on more than just strings. Take a look here for a comprehensive introduction.
Just for completeness, as nobody else has mentioned it: the third parameter to a slice is a step. So reversing a string is as simple as:
some_string[::-1]
Or selecting alternate characters would be:
"H-e-l-l-o- -W-o-r-l-d"[::2] # outputs "Hello World"
The ability to step forwards and backwards through the string maintains consistency with being able to array slice from the start or end.
Substr() normally (e.g. in PHP and Perl) works this way:
s = Substr(s, beginning, LENGTH)
So the parameters are beginning and LENGTH.
But Python's behaviour is different; it expects beginning and one after END (!). This is easy for beginners to miss. So the correct replacement for Substr(s, beginning, LENGTH) is
s = s[ beginning : beginning + LENGTH]
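That translation could be wrapped in a small helper. The function name `substr` below is hypothetical (Python has no such built-in), and the sketch assumes `beginning` is non-negative; negative offsets would need extra handling because `beginning + length` could cross zero.

```python
def substr(s, beginning, length):
    """Hypothetical helper mimicking substr(s, beginning, LENGTH).

    Assumes beginning >= 0 and length >= 0.
    """
    return s[beginning:beginning + length]


print(substr("Hello World!", 6, 5))  # World
```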
A common way to achieve this is by string slicing.
MyString[a:b] gives you a substring from index a to (b - 1).
One example seems to be missing here: full (shallow) copy.
>>> x = "Hello World!"
>>> x
'Hello World!'
>>> x[:]
'Hello World!'
>>> x==x[:]
True
[:] is a common idiom for creating a shallow copy of mutable sequence types such as lists. For an immutable string it gains nothing (CPython simply returns the same object). See "Python list slice syntax used for no obvious reason".
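To see what "shallow" means for the [:] idiom, here is a small sketch using lists; the variable names are made up for illustration. The outer list is copied, but nested objects are still shared.

```python
original = [1, 2, 3]
copied = original[:]          # new list object with the same elements
copied.append(4)              # does not affect the original

nested = [[1], [2]]
shallow = nested[:]           # new outer list, same inner lists
shallow[0].append(99)         # visible through both outer lists

print(original, copied)       # [1, 2, 3] [1, 2, 3, 4]
print(nested)                 # [[1, 99], [2]]
```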
Is there a way to substring a string in Python, to get a new string from the 3rd character to the end of the string?
Maybe like myString[2:end]?
Yes, this actually works if you assign, or bind, the name end to the constant singleton None:
>>> end = None
>>> myString = '1234567890'
>>> myString[2:end]
'34567890'
Slice notation has 3 important arguments:
start
stop
step
Their defaults when not given are None - but we can pass them explicitly:
>>> stop = step = None
>>> start = 2
>>> myString[start:stop:step]
'34567890'
If leaving the second part means 'till the end', if you leave the first part, does it start from the start?
Yes, for example:
>>> start = None
>>> stop = 2
>>> myString[start:stop:step]
'12'
Note that we include start in the slice, but we only go up to, and not including, stop.
When step is None, by default the slice uses 1 for the step. If you step with a negative integer, Python is smart enough to go from the end to the beginning.
>>> myString[::-1]
'0987654321'
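Because start is included and stop is excluded, a few handy invariants follow. A short sketch to check them (the names `my_string` and `n` are only illustrative):

```python
my_string = '1234567890'
n = 4

# Splitting at any index and re-joining gives back the original string.
assert my_string[:n] + my_string[n:] == my_string

# The length of s[a:b] is b - a when both indexes are within range.
assert len(my_string[2:7]) == 7 - 2
assert my_string[2:7] == '34567'

# A negative step walks from the end towards the beginning.
assert my_string[::-1] == '0987654321'
```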
I explain slice notation in great detail in my answer to Explain slice notation Question.
I would like to add two points to the discussion:
You can use None instead of an empty space to specify "from the start" or "to the end":
'abcde'[2:None] == 'abcde'[2:] == 'cde'
This is particularly helpful in functions, where you can't provide an empty space as an argument:
def substring(s, start, end):
    """Return the part of string `s` from index `start` up to,
    but not including, index `end`.
    Examples
    --------
    >>> substring('abcde', 0, 3)
    'abc'
    >>> substring('abcde', 1, None)
    'bcde'
    """
    return s[start:end]
Python has slice objects:
idx = slice(2, None)
'abcde'[idx] == 'abcde'[2:] == 'cde'
You've got it right there except for "end". It's called slice notation. Your example should read:
new_sub_string = myString[2:]
If you leave out the second parameter it is implicitly the end of the string.
text = "StackOverflow"
# using Python slicing, you can get different subsets of the above string
# reverse of the string
text[::-1] # 'wolfrevOkcatS'
# first five characters
text[:5] # 'Stack'
# last five characters
text[-5:] # 'rflow'
# 3rd character to the fifth character
text[2:5] # 'ack'
# characters at even positions (counting from 1)
text[1::2] # 'tcOefo'
If myString contains an account number that begins at offset 6 and has length 9, then you can extract the account number this way: acct = myString[6:][:9].
If the OP accepts that, they might want to try, in an experimental fashion,
myString[2:][:999999]
It works - no error is raised, and no default 'string padding' occurs.
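The reason this works is that slice bounds are clamped to the length of the string, whereas plain indexing is not. A quick sketch contrasting the two (the sample string is arbitrary):

```python
my_string = 'abcdef'

# Out-of-range slice bounds are silently clamped.
assert my_string[2:][:999999] == 'cdef'
assert my_string[2:999999] == 'cdef'
assert my_string[100:] == ''

# Indexing a single out-of-range position raises IndexError instead.
try:
    my_string[100]
except IndexError:
    index_failed = True
else:
    index_failed = False

print(index_failed)  # True
```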
Well, I got a situation where I needed to translate a PHP script to Python, and it had many usages of substr(string, beginning, LENGTH).
If I chose Python's string[beginning:end] I'd have to calculate a lot of end indexes, so the easier way was to use string[beginning:][:length], it saved me a lot of trouble.
>>> str1 = 'There you are'
>>> str1[:]
'There you are'
>>> str1[1:]
'here you are'
# To print alternate characters, skipping one element in between
>>> str1[::2]
'Teeyuae'
# To print the last element, stepping backwards from the end
>>> str1[:-2:-1]
'e'
#Using slice datatype
>>> str1='There you are'
>>> s1=slice(2,6)
>>> str1[s1]
'ere '
Maybe I missed it, but I couldn't find a complete answer on this page to the original question(s) because variables are not further discussed here. So I had to go on searching.
Since I'm not yet allowed to comment, let me add my conclusion here. I'm sure I was not the only one interested in it when accessing this page:
>>> myString = 'Hello World'
>>> end = 5
>>> myString[2:end]
'llo'
If you leave out the first part, you get
>>> myString[:end]
'Hello'
And if you leave out the : in the middle as well, you get the simplest substring, a single character: the one at index 5 (counting starts at 0, so it's the blank in this case):
>>> myString[end]
' '
Using hardcoded indexes can itself be a mess.
To avoid that, Python offers a built-in object, slice().
string = "my company has 1000$ on profit, but I lost 500$ gambling."
Suppose we want to know how much money I have left.
Normal solution:
final = int(string[15:19]) - int(string[43:46])
print(final)
>>>500
Using slices:
EARNINGS = slice(15, 19)
LOSSES = slice(43, 46)
final = int(string[EARNINGS]) - int(string[LOSSES])
print(final)
>>>500
Using slice you gain readability.
a="Helloo"
print(a[:-1])
In the above code, [:-1] prints everything from the start up to, but not including, the last character.
OUTPUT:
>>> Hello
Note: here a[:-1] is the same as a[0:-1] and a[0:len(a)-1].
a="I Am Siva"
print(a[2:])
OUTPUT:
>>> Am Siva
In the above code, a[2:] prints a from index 2 to the last element.
Remember that if you set the upper limit of a slice to x, the result runs up to index x-1; also remember that the index of a list or string always starts from 0.
I have a simpler solution using a for loop to find a given substring in a string.
Let's say we have two string variables,
main_string = "lullaby"
match_string = "ll"
If you want to check whether the given match string exists in the main string, you can do this,
match_string_len = len(match_string)
for index, value in enumerate(main_string):
    sub_string = main_string[index:match_string_len+index]
    if sub_string == match_string:
        print("match string found in main string")
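For comparison, the same check can be done without a manual loop using the built-in `in` operator and `str.find`. This is a sketch of that alternative, not the loop above; `found`, `position`, and `missing` are illustrative names.

```python
main_string = "lullaby"
match_string = "ll"

# Membership test: True if match_string occurs anywhere in main_string.
found = match_string in main_string

# str.find returns the index of the first occurrence, or -1 if absent.
position = main_string.find(match_string)
missing = main_string.find("zz")

print(found, position, missing)  # True 2 -1
```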

Variants of string concatenation?

Out of the following two variants (with or without plus-sign between) of string literal concatenation:
What's the preferred way?
What's the difference?
When should one or the other be used?
Should none of them ever be used? If so, why?
Is join preferred?
Code:
>>> # variant 1. Plus
>>> 'A'+'B'
'AB'
>>> # variant 2. Just a blank space
>>> 'A' 'B'
'AB'
>>> # They seem to be equal
>>> 'A'+'B' == 'A' 'B'
True
Juxtaposing works only for string literals:
>>> 'A' 'B'
'AB'
If you work with string objects:
>>> a = 'A'
>>> b = 'B'
you need to use a different method:
>>> a b
a b
^
SyntaxError: invalid syntax
>>> a + b
'AB'
The + is a bit more obvious than just putting literals next to each other.
One use of juxtaposition is to split long texts over several lines, keeping
indentation in the source code:
>>> a = 5
>>> if a == 5:
...     text = ('This is a long string'
...             ' that I can continue on the next line.')
...
>>> text
'This is a long string that I can continue on the next line.'
''.join() is the preferred way to concatenate many strings, for example those in a list:
>>> ''.join(['A', 'B', 'C', 'D'])
'ABCD'
The variant without + is done during the syntax parsing of the code. I guess it was done to let you write multiple line strings nicer in your code, so you can do:
test = "This is a line that is " \
"too long to fit nicely on the screen."
I guess that when it's possible, you should use the non-+ version, because in the byte code there will be only the resulting string, no sign of concatenation left.
When you use +, you have two strings in your code and you execute the concatenation at runtime (unless interpreters are smart and optimize it, but I don't know if they do).
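One way to observe the parse-time difference is to inspect the syntax tree with the standard `ast` module. This sketch assumes Python 3.8 or later, where string literals appear as `ast.Constant` nodes. It only shows what the parser produces; it says nothing about optimizations a compiler may apply afterwards.

```python
import ast

# Juxtaposed literals reach the syntax tree as one string constant.
juxtaposed = ast.parse("'A' 'B'", mode="eval").body

# With +, the tree contains a binary operation on two separate constants.
added = ast.parse("'A' + 'B'", mode="eval").body

print(type(juxtaposed).__name__, repr(juxtaposed.value))  # Constant 'AB'
print(type(added).__name__)                               # BinOp
```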
Obviously, you cannot do:
a = 'A'
ba = 'B' a
Which one is faster? The no-+ version, because it is done before the script even executes.
+ vs join: if you have a lot of elements, join is preferred because it is optimised to handle many elements. Using + to concatenate multiple strings creates a lot of partial results in memory, while using join doesn't.
If you're going to concatenate just a couple of elements, I guess + is better as it's more readable.
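As a rough illustration of the two styles, the sketch below builds the same string with a `+=` loop and with `''.join`. The results are identical; the difference the answers describe is in how many intermediate strings may be created. The list `parts` is made-up sample data.

```python
parts = [str(n) for n in range(5)]   # ['0', '1', '2', '3', '4']

# Repeated concatenation: each += may build a new intermediate string.
with_plus = ''
for part in parts:
    with_plus += part

# join builds the result from the whole sequence in one call.
with_join = ''.join(parts)

print(with_plus, with_join)  # 01234 01234
```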

Python, breaking up Strings

I need to make a program in which the user inputs a word and I need to do something to each individual letter in that word. They cannot enter it one letter at a time, only as one word.
For example, if someone enters "test", how can I make my program know that it is a four-letter word, and how do I break it up, e.g. make four variables, each set to a different letter? It should also work with longer and shorter words.
Could I use a for statement? Something like: for each letter, set that letter to a variable. But what if it were a 20-character word? How would the program get all the variable names and such?
Do you mean something like this?
>>> s = 'four'
>>> l = list(s)
>>> l
['f', 'o', 'u', 'r']
>>>
Addendum:
Even though that's (apparently) what you think you wanted, it's probably not necessary because a string can hold a word of virtually any size, so a single string variable like s above should be good enough for your program, versus trying to create a bunch of separately named variables for each character. For one thing, it would be difficult to write the rest of the program because you wouldn't know what valid variable names to use.
The reason it's OK not to have a separate variable for each character is that a single string can have any number of characters in it, as well as be empty. Python's built-in len() function will return a count of the number of characters in a string if applied to one, so the result of len(s) in the above would be 4.
Any character in a string can be randomly accessed by indexing it with an integer between 0 and len(s)-1 inside square brackets, so to reference the third character you would use s[2]. It's useful to think of the index as the offset of the character from the beginning of the string.
Even so, in Python indexing is often not needed, because you can also iteratively process each character in a string in a for loop without using indexes, as shown in this simple example:
num_vowels = 0
for ch in s:
    if ch in 'aeiou':
        num_vowels += 1
print 'there are', num_vowels, 'vowel(s) in the string', s
Python also has many other facilities and built-ins that further help when processing strings (and in fact could simplify the above example), which you'll eventually learn as you become more familiar with the language and its many libraries.
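As one possible simplification of the vowel count, the loop can be collapsed into a generator expression passed to the built-in `sum`. This sketch uses Python 3's print function and assumes `s` holds the word 'four' from the earlier example.

```python
s = 'four'

# Count one for every character of s that is a lowercase vowel.
num_vowels = sum(1 for ch in s if ch in 'aeiou')

print('there are', num_vowels, 'vowel(s) in the string', s)
```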
When you iterate over a string, it yields the individual characters, like this:
for c in thestring:
    print(c)
You can use this to put the letters into a list if you really need to, which will retain their order, but list(thestring) is a better choice for that (be aware that unordered types like dict or set do not guarantee any order).
You don't have to do any of those; in Python, you can access characters of a string using square brackets:
>>> word = "word"
>>> print(word[0])
w
>>> print(word[3])
d
>>> print(len(word))
4
You don't want to assign each letter to a separate variable. Then you'd be writing the rest of your program without even being able to know how many variables you have defined! That's an even worse problem than dealing with the whole string at once.
What you instead want to do is have just one variable holding the string, but you can refer to individual characters in it with indexing. Say the string is in s, then s[0] is the first character, s[1] is the second character, etc. And you can find out how far up the numbers go by checking len(s) - 1 (because the indexes start at 0, a length 1 string has maximum index 0, a length 2 string has maximum index 1, etc).
That's much more manageable than figuring out how to generate len(s) variable names, assign them all to a piece of the string, and then know which variables you need to reference.
Strings are immutable though, so you can't assign to s[1] to change the 2nd character. If you need to do that you can instead create a list with e.g. l = list(s). Then l[1] is the second character, and you can assign l[1] = something to change the element in the list. Then when you're done you can get a new string out with s_new = ''.join(l) (join builds a string by joining together a sequence of strings passed as its argument, using the string it was invoked on to the left as a separator between each of the elements in the sequence; in this case we're joining a list of single-character strings using the empty string as a separator, so we just get all the single-character strings joined into a single string).
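The list round trip described above might look like this in practice. It is a small sketch; the replacement character 'a' is arbitrary.

```python
s = 'test'

letters = list(s)          # ['t', 'e', 's', 't']
letters[1] = 'a'           # lists are mutable, so item assignment works
s_new = ''.join(letters)   # join the characters back with an empty separator

print(s, s_new)            # test tast
```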
x = 'test'
counter = 0
while counter < len(x):
    print x[counter] # you can change this to do whatever you want to with x[counter]
    counter += 1

Replacing reoccurring characters in strings in Python 3.1

Is it possible to replace a single character inside a string that occurs many times?
Input:
Sentence=("This is an Example. Thxs code is not what I'm having problems with.") #Example input
^
Sentence=("This is an Example. This code is not what I'm having problems with.") #Desired output
Replace the 'x' in "Thxs" with an i, without replacing the x in "Example".
You can do it by including some context:
s = s.replace("Thxs", "This")
Alternatively you can keep a list of words that you don't wish to replace:
import re
whitelist = ['example', 'explanation']
def replace_except_whitelist(m):
    # m is the match object for a single word
    s = m.group()
    if s in whitelist:
        return s
    return s.replace('x', 'i')
s = 'Thxs example'
result = re.sub(r"\w+", replace_except_whitelist, s)
print(result)
Output:
This example
Sure, but you essentially have to build up a new string out of the parts you want:
>>> s = "This is an Example. Thxs code is not what I'm having problems with."
>>> s[22]
'x'
>>> s[:22] + "i" + s[23:]
"This is an Example. This code is not what I'm having problems with."
For information about the notation used here, see good primer for python slice notation.
If you know whether you want to replace the first occurrence of x, or the second, or the third, or the last, you can combine str.find (or str.rfind if you wish to start from the end of the string) with slicing and str.replace, feeding the character you wish to replace to the first method, as many times as it is needed to get a position just before the character you want to replace (for the specific sentence you suggest, just one), then slice the string in two and replace only one occurrence in the second slice.
An example is worth a thousand words, or so they say. In the following, I assume you want to substitute the (n+1)th occurrence of the character.
>>> s = "This is an Example. Thxs code is not what I'm having problems with."
>>> n = 1
>>> pos = 0
>>> for i in range(n):
...     pos = s.find('x', pos) + 1
...
>>> s[:pos] + s[pos:].replace('x', 'i', 1)
"This is an Example. This code is not what I'm having problems with."
Note that you need to add an offset to pos, otherwise you will replace the occurrence of x you have just found.
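The same idea can be packaged as a function. The name `replace_nth` and its signature are hypothetical, and this variant locates the target occurrence directly rather than calling `str.replace` on the tail; it is one way to sketch the technique, not the only one.

```python
def replace_nth(s, old, new, n):
    """Replace the (n+1)th occurrence of `old` in `s` (n counts from 0).

    Returns `s` unchanged if there are fewer than n+1 occurrences.
    """
    pos = -1
    for _ in range(n + 1):
        # Search strictly after the previous hit so it is not found again.
        pos = s.find(old, pos + 1)
        if pos == -1:
            return s
    return s[:pos] + new + s[pos + len(old):]


sentence = "This is an Example. Thxs code is not what I'm having problems with."
print(replace_nth(sentence, 'x', 'i', 1))
```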

Ways to slice a string?

I have a string, example:
s = "this is a string, a"
Where a ',' (comma) will always be the 3rd-to-last character, i.e. s[-3].
I am thinking of ways to remove the ',' but can only think of converting the string into a list, deleting it, and converting it back to a string. This, however, seems a bit too much for such a simple task.
How can I accomplish this in a simpler way?
Normally, you would just do:
s = s[:-3] + s[-2:]
The s[:-3] gives you a string up to, but not including, the comma you want removed ("this is a string") and the s[-2:] gives you another string starting one character beyond that comma (" a").
Then, joining the two strings together gives you what you were after ("this is a string a").
A couple of variants, using "delete the last comma" rather than "delete the third-last character", are:
s[::-1].replace(",","",1)[::-1]
or
''.join(s.rsplit(",", 1))
But these are pretty ugly. Slightly better is:
a, _, b = s.rpartition(",")
s = a + b
This may be the best approach if you don't know the comma's position (except for last comma in string) and effectively need a "replace from right". However Anurag's answer is more pythonic for the "delete third last character".
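For the example string, all of the approaches above give the same result, which a short check can confirm. The variable names `by_slice`, `by_reverse`, `by_rsplit`, and `by_rpartition` are made up for this sketch.

```python
s = "this is a string, a"

by_slice = s[:-3] + s[-2:]                        # drop the 3rd-to-last character
by_reverse = s[::-1].replace(",", "", 1)[::-1]    # replace first comma of the reversed string
by_rsplit = ''.join(s.rsplit(",", 1))             # split once from the right, then re-join
head, _, tail = s.rpartition(",")                 # split around the last comma
by_rpartition = head + tail

print(by_slice)  # this is a string a
```

Note that the last three remove the last comma wherever it is, while the slice version removes whatever character sits at s[-3].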
Python strings are immutable. This means that you must create at least 1 new string in order to remove the comma, as opposed to editing the string in place in a language like C.
For deleting every ',' character in the text, you can try
s = s.split(',')
>> ["this is a string", " a"]
s = "".join(s)
>> "this is a string a"
Or in one line:
s0 = "".join(s.split(','))
The simplest way is to use the replace function:
>>> s = 'this is a string, a'
>>> s = s.replace(',','')
>>> s
'this is a string a'
Here, the replace() function searches for the character ',' and replaces it with '', i.e. the empty string.
Note that replace() replaces every ',' by default; if you want to replace only the first few occurrences, pass a count: s.replace(',', '', 1)
To slice a string of arbitrary length into multiple equal-length slices, you could do
def slicer(string, slice_length):
    return [string[i:i + slice_length]
            for i in range(0, len(string), slice_length)]
If slice_length does not divide exactly into len(string) then there will be a single slice at the end of the list that holds the remainder.
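A runnable Python 3 sketch of this chunking function, with `range` as the step iterator and a usage example showing the shorter remainder slice at the end:

```python
def slicer(string, slice_length):
    """Split `string` into consecutive pieces of `slice_length` characters."""
    return [string[i:i + slice_length]
            for i in range(0, len(string), slice_length)]


print(slicer("abcdefgh", 3))  # ['abc', 'def', 'gh']
```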
