String manipulation in Python

I have a randomly generated string built from six letters in this form, for example:
A' B F2 E' B2 A2 C' D2 C D' E2 F
Some letters have an apostrophe (') appended and some have the number 2. What I want is to append the letter "x" to every letter that stands on its own.
So it would look like this:
A' Bx F2 E' B2 A2 C' D2 Cx D' E2 Fx
The trick is that the "x" should be added only to letters that are on their own; for example, B2 must not become Bx2.
Any ideas?

Transform your string into a list with split():
s = """A' B F2 E' B2 A2 C' D2 C D' E2 F"""
L = s.split(' ')
for i in range(len(L)):  # range replaces Python 2's xrange
    if len(L[i]) == 1:
        L[i] += 'x'
str_out = ' '.join(L)

The split-comprehend-join version:
' '.join(n+'x' if len(n)==1 else n for n in inputstr.split(' '))
The regex version:
>>> import re
>>> inputstr = "A' F B2 C"
>>> re.sub(r'([A-Z])(?=\s|$)', r'\1x', inputstr)
"A' Fx B2 Cx"
In essence, find any uppercase letter that is followed by either whitespace or the end of the string, and replace it with that letter followed by an x.
I ran a few tests with timeit; the former (list comprehension) appears to run slightly faster than the latter (about 15-20% faster on average). This does not appear to change no matter the number of replacements that need to be done (a string 10 times as long still has about the same ratio of processing time as the original).
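A rough way to reproduce that comparison is sketched below. The function names with_join and with_regex and the repetition count are illustrative choices, not part of the answer above, and the measured ratio will vary with the machine and the Python version.

```python
import re
import timeit

SAMPLE = "A' B F2 E' B2 A2 C' D2 C D' E2 F"

def with_join(s):
    # Append 'x' to every token that is exactly one character long.
    return ' '.join(n + 'x' if len(n) == 1 else n for n in s.split(' '))

def with_regex(s):
    # Append 'x' to an uppercase letter followed by whitespace or end of string.
    return re.sub(r'([A-Z])(?=\s|$)', r'\1x', s)

if __name__ == "__main__":
    # Both approaches should agree on the sample before timing them.
    assert with_join(SAMPLE) == with_regex(SAMPLE)
    for func in (with_join, with_regex):
        seconds = timeit.timeit(lambda: func(SAMPLE), number=10_000)
        print(f"{func.__name__}: {seconds:.4f}s")
```

Only relative timings are meaningful here; absolute numbers depend on the environment.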

Ugly or Pythonic?
items = "A' B F2 E' B2 A2 C' D2 C D' E2 F".split()
itemsx = ((a+'x' if len(a)==1 else a) for a in items)
out = ' '.join(itemsx)

With a regular expression,
import re
newstring = re.sub(r"\b(\w)(?![2'])", r'\1x', oldstring)
should be fine. If you're allergic to regular expressions,
news = ' '.join(x + 'x' if len(x)==1 else x for x in olds.split())
is a concise way of expressing a similar transformation (if length-one is really the only thing you need to check before appending 'x' to an item).
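The caveat in parentheses matters: the two versions agree on the question's input but not on every input. The sketch below wraps both in functions (add_x_regex and add_x_length are made-up names) and feeds them a hypothetical token "F3", which does not occur in the original data, to show where they diverge.

```python
import re

def add_x_regex(s):
    # Regex version: a word character at a word boundary that is
    # not followed by 2 or an apostrophe.
    return re.sub(r"\b(\w)(?![2'])", r"\1x", s)

def add_x_length(s):
    # Length-based version: only tokens that are exactly one character long.
    return ' '.join(t + 'x' if len(t) == 1 else t for t in s.split())

same = "A' B F2 C"
differs = "A' B F3"  # "F3" is a hypothetical token used only for illustration

print(add_x_regex(same), '|', add_x_length(same))
print(add_x_regex(differs), '|', add_x_length(differs))
```

On the second input the regex inserts the x before the 3, while the length check leaves the token alone, so which one is right depends on what tokens can actually occur.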

' '.join(n if len(n) == 2 else n + 'x' for n in s.split(' '))

>>> s = "A' B F2 E' B2 A2 C' D2 C D' E2 F".split()
>>> import string
>>> letters = list(string.ascii_letters)  # string.letters in Python 2
>>> for n, i in enumerate(s):
...     if i in letters:
...         s[n] = i + "x"
...
>>> ' '.join(s)
"A' Bx F2 E' B2 A2 C' D2 Cx D' E2 Fx"

>>> ' '.join((i+'x')[:2] for i in items.split())
"A' Bx F2 E' B2 A2 C' D2 Cx D' E2 Fx"

Related

Python - Counting Letter Frequency in a String

I want to write out the letter frequencies of each string. My inputs and expected outputs look like this:
"aaaa" -> "a4"
"abb" -> "a1b2"
"abbb cc a" -> "a1b3 c2 a1"
"bbbaaacddddee" -> "b3a3c1d4e2"
"a b" -> "a1 b1"
I found this solution but it gives the frequencies in random order. How can I do this?

Does this satisfy your needs?
from itertools import groupby
s = "bbbaaac ddddee aa"
groups = groupby(s)
result = [(label, sum(1 for _ in group)) for label, group in groups]
res1 = "".join("{}{}".format(label, count) for label, count in result)
# res1 == 'b3a3c1 1d4e2 1a2'
# Spaces should stay plain spaces, so strip the count that follows each one
import re
re.sub(' [0-9]+', ' ', res1)
# 'b3a3c1 d4e2 a2'
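A variant that avoids the regex clean-up step is to split on spaces first and run groupby on each word. This is only a sketch, and the name run_lengths is made up for illustration. Keep in mind that groupby counts consecutive runs, so a letter that reappears later in a word is counted again rather than added to its earlier total.

```python
from itertools import groupby

def run_lengths(s):
    # Handle each space-separated word on its own so spaces never get a count.
    return ' '.join(
        ''.join(f"{ch}{sum(1 for _ in run)}" for ch, run in groupby(word))
        for word in s.split(' ')
    )

print(run_lengths("bbbaaac ddddee aa"))
print(run_lengths("aba"))  # runs, not totals: the second 'a' starts a new run
```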

To me, it is a little trickier than it looks at first. For example, "bbbaaacddddee" -> "b3a3c1d4e2" suggests that the counts must be output in the order in which the letters appear in the input string:
import re
def unique_elements(t):
    # Collect the distinct characters of t in order of first appearance
    l = []
    for w in t:
        if w not in l:
            l.append(w)
    return l
def splitter(s):
    res = []
    tokens = re.split("[ ]+", s)
    for token in tokens:
        s1 = unique_elements(token)  # or s1 = sorted(set(token))
        this_count = "".join([k + str(v) for k, v in list(zip(s1, [token.count(x) for x in s1]))])
        res.append(this_count)
    return " ".join(res)
print(splitter("aaaa"))
print(splitter("abb"))
print(splitter("abbb cc a"))
print(splitter("bbbaaacddddee"))
print(splitter("a b"))
OUTPUT
a4
a1b2
a1b3 c2 a1
b3a3c1d4e2
a1 b1
If the order of appearance does not matter, you can drop the unique_elements function and simply use something like s1 = sorted(set(token)) within splitter, as indicated in the comment.
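For comparison, a shorter sketch of the same order-preserving idea uses collections.Counter. It assumes Python 3.7 or later, where Counter (a dict subclass) keeps keys in first-insertion order; letter_counts is a name chosen here for illustration.

```python
from collections import Counter

def letter_counts(s):
    # Count letters word by word; Counter keeps first-appearance order
    # on Python 3.7+ because it is a dict subclass.
    return ' '.join(
        ''.join(f"{ch}{n}" for ch, n in Counter(word).items())
        for word in s.split()
    )

for text in ("aaaa", "abb", "abbb cc a", "bbbaaacddddee", "a b"):
    print(letter_counts(text))
```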

Here is your answer:
test_str = "here is your answer"
words = test_str.split()  # named words so the built-in list is not shadowed
for a in words:
    res = {}
    for keys in a:
        res[keys] = res.get(keys, 0) + 1
    for key, value in res.items():
        print(f"{key}{value}", end="")
    print(end=" ")

There is no need to iterate over every character in every word.
This is an alternative solution, in case you don't want to use itertools (although that looked pretty tidy).
def word_stats(data: str=""):
    stats = []
    for word in data.split(" "):
        res = []
        while len(word)>0:
            # Count the first remaining letter, then remove all of its occurrences
            res.append(word[:1] + str(word.count(word[:1])))
            word = word.replace(word[:1],"")
        res.sort()
        stats.append("".join(res))
    return " ".join(stats)
print(word_stats("asjssjbjbbhsiaiic ifiaficjxzjooro qoprlllkskrmsnm mmvvllvlxjxj jfnnfcncnnccnncsllsdfi"))
print(word_stats("abbb cc a"))
print(word_stats("bbbaaacddddee"))
This would output (note that the letters of each word are sorted alphabetically rather than kept in order of appearance):
a2b3c1h1i3j3s4 a1c1f2i3j2o3r1x1z1 k2l3m2n1o1p1q1r2s2 j2l3m2v3x2 c5d1f3i1j1l2n7s2
a1b3 c2 a1
a3b3c1d4e2

Python: Get each element from a list based on position, brackets, and parentheses

So I have the following list:
a = ["[Test](Link)", "[Test2](link2)", "[test3](link3)"]
And I want to get it to show in this way:
b1 = Test
b2 = Link
b3 = Test2
b4 = link2
b5 = test3
b6 = link3
How could I do something like this?
I've tried to join the list and use re to get what I want, but I failed.

import re
a = ["[Test](Link)", "[Test2](link2)", "[test3](link3)"]
for s in a:
    # Raw string so the backslash escapes reach the regex engine unchanged
    m = re.match(r'(\[.*\])(\(.*\))$', s)
    print(m.group(1))
    print(m.group(2))
Results:
[Test]
(Link)
[Test2]
(link2)
[test3]
(link3)

You can use re.findall after using join on your list:
>>> re.findall(r'(\[.*?\]|\(.*?\))', ''.join(a))
['[Test]', '(Link)', '[Test2]', '(link2)', '[test3]', '(link3)']
Regex Explanation:
(            # Matching group 1
  \[.*?\]    # Matches non-greedily between brackets
  |          # OR
  \(.*?\)    # Matches non-greedily between parentheses
)            # End of matching group
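If the brackets and parentheses themselves are not wanted in the result, the capture groups can be moved inside the delimiters. The following is a minimal sketch under the assumption that every list item has exactly the form [text](link); the names pattern, pairs and flat are illustrative.

```python
import re

a = ["[Test](Link)", "[Test2](link2)", "[test3](link3)"]

# Capture the text between [ ] and between ( ) without the delimiters.
pattern = re.compile(r"\[(.*?)\]\((.*?)\)")

# Assumes every item matches; match() would return None otherwise.
pairs = [pattern.match(item).groups() for item in a]
flat = [part for pair in pairs for part in pair]
print(flat)
```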

>>> a = ["[Test](Link)", "[Test2](link2)", "[test3](link3)"]
>>> b1,b2,b3,b4,b5,b6 = (y.strip('[]()') for x in a for y in x.split(']'))
>>> print (b1,b2,b3,b4,b5,b6)
Test Link Test2 link2 test3 link3

Your original question looks like you want each value to be represented by a new variable. You can use globals() to dynamically create new variables.
count = 0
g = globals()
for i in a:
    # Drop the leading "[" and trailing ")", then split on the "](" in the middle
    f = i.strip('[)').split('](')
    count += 1
    g['b' + str(count)] = f[0]
    print ('b' + str(count) + ' = ' + f[0])
    count += 1
    g['b' + str(count)] = f[1]
    print ('b' + str(count) + ' = ' + f[1])
This prints:
b1 = Test
b2 = Link
b3 = Test2
b4 = link2
b5 = test3
b6 = link3
Below is the output of the variables that were dynamically created.
In [5]: b1
Out[5]: 'Test'
In [6]: b2
Out[6]: 'Link'
In [7]: b3
Out[7]: 'Test2'
In [8]: b4
Out[8]: 'link2'
In [9]: b5
Out[9]: 'test3'
In [10]: b6
Out[10]: 'link3'
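A commonly used alternative to writing into globals() is to keep the generated names as keys of an ordinary dictionary. This is a different technique from the one in the answer, sketched here with the same parsing; the dictionary name values is an assumption made for the example.

```python
a = ["[Test](Link)", "[Test2](link2)", "[test3](link3)"]

values = {}
for item in a:
    # Same parsing as the globals() version: strip "[" and ")" and split on "](".
    for part in item.strip('[)').split(']('):
        # The next key is b1, b2, ... based on how many entries exist so far.
        values['b' + str(len(values) + 1)] = part

for name, value in values.items():
    print(name, '=', value)
```

Lookups then read values['b3'] instead of b3, and the module namespace stays untouched.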

make python utilize any symbol's input

I'm writing code that translates numbers to piano keys.
***Sorry for the confusion, I meant the ideal output for "3.14159ABC265" is "E1 _ C1 F1 C1 G1 D2 _ _ _ D1 A2 G1"; however, Python gives an error when the input contains #, \, or similar characters.
The code:
numbers = str(input('This code will convert numbers to piano keys, \nnow input any numbers here'))
keys = str('')
while len(numbers) != 0:
    G = str('_ ')
    if numbers[0] == str(0): G='B1 '
    if numbers[0] == str(1): G='C1 '
    if numbers[0] == str(2): G='D1 '
    if numbers[0] == str(3): G='E1 '
    if numbers[0] == str(4): G='F1 '
    if numbers[0] == str(5): G='G1 '
    if numbers[0] == str(6): G='A2 '
    if numbers[0] == str(7): G='B2 '
    if numbers[0] == str(8): G='C2 '
    if numbers[0] == str(9): G='D2 '
    keys += G
    numbers = numbers[1:len(numbers)]
print(keys)
This code already works, but not when the input contains \, # or similar characters. I've searched for a while but didn't find an answer.
By the way, I think Python should have an option to disable the difference between numbers and strings in short code like this XD

You can use ord to turn any character into a number (based on its ASCII value), then use division and remainder to map that number to a piano key number and a scale number, and then use chr to turn the key number into a letter. Here's a one-line example:
>>> ' '.join(map(lambda c: chr(ord('A') + int((ord(c) - ord(' ')) % 7)) + str(int((ord(c) - ord(' ')) / 7)), input()))
3.14159ABC265
'F2 A2 D2 G2 D2 A3 E3 F4 G4 A5 E2 B3 A3'
>>>
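The one-liner is easier to follow when split into steps. The sketch below is the same arithmetic with named intermediate values; char_to_key and to_keys are illustrative names, and it assumes printable ASCII input (characters at or above the space character), since lower code points would yield a negative number.

```python
def char_to_key(c):
    # Offset from the space character (ASCII 32), the first printable character.
    offset = ord(c) - ord(' ')
    # The quotient becomes the trailing number, the remainder picks a letter A-G.
    number, note = divmod(offset, 7)
    return chr(ord('A') + note) + str(number)

def to_keys(text):
    # Convert every character and separate the results with spaces.
    return ' '.join(char_to_key(c) for c in text)

print(to_keys("3.14159ABC265"))
print(to_keys("#\\"))  # symbols such as # and \ are handled like any other character
```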

Regex in Python Equation Replacement

I'm somewhat new to regex and Python and am in the following situation. I'd like to take an equation string, like "A + B + C + 4D", and place the number 1 in front of all variables that have no number in front of them. So something like:
>>> foo_eqn = "A + B + C + 4D"
>>> bar_eqn = fill_in_ones(foo_eqn)
>>> bar_eqn
"1A + 1B + 1C + 4D"
After some research and asking, I came up with
def fill_in_ones(in_eqn):
    out_eqn = re.sub(r"(\b[A-Z]\b)", "1"+ r"\1", in_eqn, re.I)
    return(out_eqn)
However, it looks like this only works for the first two variables:
>>> fill_in_ones("A + B")
1A + 1B
>>> fill_in_ones("A + B + E")
1A + 1B + E
>>> fill_in_ones("2A + B + C + D")
2A + 1B + 1C + D
Anything really obvious I'm missing? Thanks!

Looks like the re.I (ignore-case) flag is the culprit. Without it, every variable is handled:
>>> def fill_in_ones(in_eqn):
...     out_eqn = re.sub(r"(\b[A-Z]\b)", "1"+ r"\1", in_eqn)
...     return(out_eqn)
...
>>> fill_in_ones("A + 3B + C + 2D + E")
'1A + 3B + 1C + 2D + 1E'
This is because the fourth positional argument to re.sub is count, not flags, so the flag has to be passed by keyword:
def fill_in_ones(in_eqn):
    out_eqn = re.sub(r"(\b[A-Z]\b)", "1"+ r"\1", in_eqn, flags=re.I)
    return(out_eqn)
Unfortunately, the re.I flag happens to have the integer value 2, so passing it positionally was read as count=2 and only the first two matches were replaced:
>>> import re
>>> int(re.I)
2
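To make the effect concrete, the sketch below puts the two behaviours side by side. The buggy variant passes count=int(re.I) by keyword, which is assumed here to be equivalent to the original positional call; the function names are illustrative.

```python
import re

PATTERN = r"\b([A-Z])\b"

def fill_in_ones_buggy(eqn):
    # Mimics passing re.I positionally: its integer value ends up in count.
    return re.sub(PATTERN, r"1\1", eqn, count=int(re.I))

def fill_in_ones(eqn):
    # The flag is passed by keyword, so every match is replaced.
    return re.sub(PATTERN, r"1\1", eqn, flags=re.I)

print(fill_in_ones_buggy("A + B + E"))
print(fill_in_ones("A + B + E"))
print(fill_in_ones("a + 4d + e"))  # lowercase works because re.I is applied
```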

count the matching characters between two inputs given by user

How do I get this Python output, counting matches and mismatches?
String1: aaabbbccc # aaabbbccc is user input
String2: aabbbcccc # aabbbcccc is user input
Matches: ?
MisMatches: ?
String1: aaAbbBccc # mismatches are capitalized
String2: aaBbbCccc

# Python 3: the built-in zip replaces Python 2's itertools.izip
s1 = 'aaabbbccc'
s2 = 'aabbbcccc'
print("Matches:", sum(c1 == c2 for c1, c2 in zip(s1, s2)))
print("Mismatches:", sum(c1 != c2 for c1, c2 in zip(s1, s2)))
print("String 1:", ''.join(c1 if c1 == c2 else c1.upper() for c1, c2 in zip(s1, s2)))
print("String 2:", ''.join(c2 if c1 == c2 else c2.upper() for c1, c2 in zip(s1, s2)))
This produces:
Matches: 7
Mismatches: 2
String 1: aaAbbBccc
String 2: aaBbbCccc

Assuming you have gotten the strings from a file or user input, what about:
s1 = 'aaabbbccc'
s2 = 'aabbbcccc'
# This will only consider n characters, where n = min(len(s1), len(s2))
match_indices = [i for (i, (c1, c2)) in enumerate(zip(s1, s2)) if c1 == c2]
num_matches = len(match_indices)
num_misses = min(len(s1), len(s2)) - num_matches
print("Matches: %d" % num_matches)
print("Mismatches: %d" % num_misses)
print("String 1: %s" % ''.join(c if i in match_indices else c.upper() for (i, c) in enumerate(s1)))
print("String 2: %s" % ''.join(c if i in match_indices else c.upper() for (i, c) in enumerate(s2)))
Output:
Matches: 7
Mismatches: 2
String 1: aaAbbBccc
String 2: aaBbbCccc
If you wanted to count strings of uneven length (where extra characters counted as misses), you could change:
num_misses = min(len(s1), len(s2)) - num_matches
# to
num_misses = max(len(s1), len(s2)) - num_matches
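One way to package the uneven-length variant is sketched below with itertools.zip_longest, padding the shorter string with empty strings so that extra characters count as mismatches. The function name compare and its return shape are assumptions made for this example.

```python
from itertools import zip_longest

def compare(s1, s2):
    # Pair characters position by position; the shorter string is padded with ''.
    pairs = list(zip_longest(s1, s2, fillvalue=''))
    matches = sum(c1 == c2 for c1, c2 in pairs)
    mismatches = len(pairs) - matches
    # Upper-case every character that differs from its counterpart.
    marked1 = ''.join(c1 if c1 == c2 else c1.upper() for c1, c2 in pairs)
    marked2 = ''.join(c2 if c1 == c2 else c2.upper() for c1, c2 in pairs)
    return matches, mismatches, marked1, marked2

print(compare('aaabbbccc', 'aabbbcccc'))
print(compare('abc', 'abcd'))
```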

You can try:
mismatches = 0
index = 0
for letter in String1:
    if String1[index] != String2[index]:
        mismatches += 1
    index += 1
# Convert the counts to str before concatenating them
print("Matches: " + str(len(String1) - mismatches))
print("Mismatches: " + str(mismatches))

You could try the below.
>>> import itertools
>>> s1 = 'aaabbbccc'
>>> s2 = 'aabbbcccc'
>>> match = 0
>>> mismatch = 0
>>> for i, j in itertools.zip_longest(s1, s2):
...     if i == j:
...         match += 1
...     else:
...         mismatch += 1
...
In Python 2, use itertools.izip_longest instead of itertools.zip_longest.
If you want to consider a and A as a match, then change the if condition to
if i.lower() == j.lower():
and pass fillvalue='' to zip_longest, so that the padding used for the shorter string is a string rather than None.
Finally, get the match and mismatch counts from the variables match and mismatch.

>>> s = list('aaabbbccc')
>>> s1 = list('aabbbcccc')
>>> match = 0
>>> mismatch = 0
>>> for i in range(len(s)):
...     if s[i] == s1[i]:
...         match += 1
...     else:
...         mismatch += 1
...         s[i] = s[i].upper()
...         s1[i] = s1[i].upper()
...
>>> print('Matches:' + str(match))
>>> print('MisMatches:' + str(mismatch))
>>> print('String 1:' + ''.join(s))
>>> print('String 2:' + ''.join(s1))
