Split string > list of sublists of words and characters


No imports allowed (it's a school assignment).
I want to split an arbitrary string into a list of sublists. Each word goes into its own sublist, one letter per item; every other character (including whitespace) goes into a sublist containing only that one item. Does anyone have advice on how to do this?
part = "Hi! Goodmorning, I'm fine."
list = [[H,i],[!],[_],[G,o,o,d,m,o,r,n,i,n,g],[,],[_],[I],['],[m],[_],[f,i,n,e],[.]]

This does the trick:
globalList = []
letters = "abcdefghijklmnopqrstuvwxyz"
message = "Hi! Goodmorning, I'm fine."
sublist = []
for char in message:
    # if the character is a letter, append it to the current sublist
    if char.lower() in letters:
        sublist.append(char)
    else:
        # add the previous sublist (aka word) to globalList, if it is not empty
        if sublist:
            globalList.append(sublist)
        # add the single non-letter character to globalList
        globalList.append([char])
        # start a fresh sublist
        sublist = []
# add the last word if the message ends with a letter
if sublist:
    globalList.append(sublist)
print(globalList)
#output is [['H', 'i'], ['!'], [' '], ['G', 'o', 'o', 'd', 'm', 'o', 'r', 'n', 'i', 'n', 'g'], [','], [' '], ['I'], ["'"], ['m'], [' '], ['f', 'i', 'n', 'e'], ['.']]
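The same loop can be wrapped in a function so it is easy to reuse and test. This is only a sketch: the name split_words_and_chars is made up for illustration, and it assumes that only the ASCII letters a-z (in either case) count as word characters. It flushes the last word after the loop, which matters when the string does not end in punctuation.

```python
def split_words_and_chars(message):
    """Group runs of letters into sublists; every other character gets its own sublist."""
    letters = "abcdefghijklmnopqrstuvwxyz"  # assumption: ASCII letters only
    result = []
    word = []
    for char in message:
        if char.lower() in letters:
            # still inside a word: keep collecting letters
            word.append(char)
        else:
            # a non-letter ends the current word (if any) ...
            if word:
                result.append(word)
                word = []
            # ... and is stored as a one-item sublist
            result.append([char])
    # flush the final word when the message ends with a letter
    if word:
        result.append(word)
    return result


print(split_words_and_chars("I'm fine"))
```

No imports are needed, so it should fit the constraint of the assignment.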

Try this out :
part = "Hi! Goodmorning, I'm fine."
n = part.count(" ")
part = part.split()
k = 0
# Add spaces to the list
for i in range(1, n + 1):
    part.insert(i + k, "_")
    k += 1
new = []  # list to return
for s in part:
    new.append([letter for letter in s])

part = "Hi! Goodmorning, I'm fine."
a = []
b = []
c = 0
for i in part:
    if i.isalpha():
        if c == 1:
            a.append(b)
            b = []
            b.append(i)
            c = 0
        else:
            b.append(i)
    else:
        a.append(b)
        b = []
        b.append(i)
        c = 1
a.append(b)
print(a)

Related

how to separate alternating digits and characters in string into dict or list?

'L134e2t1C1o1d1e1'
The original string is "LeetCode".
I need to separate the letters from the digits. A number is not always a single digit; it can also be 3 or 4 digits long, like 345.
My code needs to build a dict of key-value pairs, where each key is a character and each value is the number right after that character. It should also create two separate lists: one with the numbers only and one with the letters only.
Expected output:
letters = ['L', 'e', 't', 'C', 'o', 'd', 'e']
digits = [134,2,1,1,1,1,1]
This code is not properly processing this.
def f(s):
    d = dict()
    letters = list()
    # letters = list(filter(lambda x: not x.isdigit(), s))
    i = 0
    while i < len(s):
        print('----------------------')
        if not s[i].isdigit():
            letters.append(s[i])
        else:
            j = i
            temp = ''
            while j < len(s) and s[j].isdigit():
                j += 1
            substr = s[i:j]
            print(substr)
        i += 1
        print('----END -')
    print(letters)

With the following modification your function separates letters from digits in s:
def f(s):
    letters = list()
    digits = list()
    i = 0
    while i < len(s):
        if not s[i].isdigit():
            letters.append(s[i])
            i += 1
        else:
            j = i
            temp = ''
            while j < len(s) and s[j].isdigit():
                j += 1
            substr = s[i:j]
            # jump past the digits that were just consumed
            i = j
            digits.append(substr)
    print(letters)
    print(digits)
f('L134e2t1C1o1d1e1')
As I said in my comments, you didn't move i past the digits after the inner loop finished, so i went back to indices that had already been processed.

If I were not allowed to use regex, I would do it the following way:
text = 'L134e2t1C1o1d1e1'
letters = [i for i in text if i.isalpha()]
digits = ''.join(i if i.isdigit() else ' ' for i in text).split()
print(letters)
print(digits)
output
['L', 'e', 't', 'C', 'o', 'd', 'e']
['134', '2', '1', '1', '1', '1', '1']
Explanation: for letters I use a simple list comprehension with a condition; .isalpha() is a str method that checks whether a string (here consisting of one character) is alphabetic. For digits (which would be better called numbers) I replace every non-digit with a single space, join the result into one string with ''.join, then call .split(), which splits on runs of one or more whitespace characters. Note that digits is now a list of strs rather than ints; if ints are desired, add the following line:
digits = list(map(int,digits))
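The question also asked for a dict that maps each character to the number that follows it. A minimal sketch building on the two lists above, assuming the string strictly alternates one letter and one number; the function name letters_to_numbers is hypothetical. Because 'e' occurs more than once, each key holds a list of numbers rather than a single value.

```python
def letters_to_numbers(text):
    # letters in order of appearance
    letters = [ch for ch in text if ch.isalpha()]
    # replace non-digits with spaces so multi-digit runs stay together
    spaced = "".join(ch if ch.isdigit() else " " for ch in text)
    numbers = [int(n) for n in spaced.split()]
    # pair every letter with the number right after it
    mapping = {}
    for letter, number in zip(letters, numbers):
        mapping.setdefault(letter, []).append(number)
    return letters, numbers, mapping


letters, numbers, mapping = letters_to_numbers('L134e2t1C1o1d1e1')
print(letters)
print(numbers)
print(mapping)
```

If the input does not alternate cleanly (for example, two letters in a row), zip would pair letters with the wrong numbers, so this only holds under the stated assumption.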

Your string only had two e's, so I've added one more to complete the example. This is one way you could do it:
import re
t = 'L1e34e2t1C1o1d1e1'
print(re.sub('[^a-zA-Z]', '', t))
Result:
LeetCode
I know you cannot use regex, but to complete this answer, I'll just add a solution:
def f(s):
    d = re.findall('[0-9]+', s)
    l = re.findall('[a-zA-Z]', s)
    print(d)
    print(l)
f(t)
Result:
['1', '34', '2', '1', '1', '1', '1', '1']
['L', 'e', 'e', 't', 'C', 'o', 'd', 'e']

You edited your question and I got a bit confused, so here is a more exhaustive version that gives you the list of letters, the list of numbers, a dict that maps each letter to the numbers that follow it, and finally the sentence with each letter repeated the corresponding number of times.
def f(s):
    letters = [c for c in s if c.isalpha()]
    # group consecutive digits so that '134' stays one number
    numbers = ''.join(c if c.isdigit() else ' ' for c in s).split()
    mydict = {}
    sentence = ""
    # assumes every letter is followed by exactly one number
    for letter, number in zip(letters, numbers):
        mydict.setdefault(letter, []).append(number)
        sentence += letter * int(number)
    print(letters)
    print(numbers)
    print(mydict)
    print(sentence)

letters = []
digits = []
dig = ""
for letter in 'L134e2t1C1o1d1e1':
    if letter.isalpha():
        # do not add an empty string to the list
        if dig:
            # append dig to the list of digits
            digits.append(dig)
            dig = ""
        letters.append(letter)
        # it is an actual letter, so continue with the next character
        continue
    # add digits to `dig`
    dig = dig + letter
# append the number that follows the last letter
if dig:
    digits.append(dig)
Try this. The idea is to collect the letters as they come and accumulate the digits between them in dig.

I know there's an accepted answer but I'll throw this one in anyway:
letters = []
digits = []
dct = {}
lc = 'L134e2t1C1o1d1e1'
n = None
for c in lc:
    if c.isalpha():
        if n is not None:
            digits.append(n)
            n = None
        letters.append(c)
    else:
        if n is None:
            n = int(c)
        else:
            n *= 10
            n += int(c)
if n is not None:
    digits.append(n)
for k, v in zip(letters, digits):
    dct.setdefault(k, []).append(v)
print(letters)
print(digits)
print(dct)
Output:
['L', 'e', 't', 'C', 'o', 'd', 'e']
[134, 2, 1, 1, 1, 1, 1]
{'L': [134], 'e': [2, 1], 't': [1], 'C': [1], 'o': [1], 'd': [1]}

Can the index number of a list itself be used as an integer in Python?

I am doing this code wars kata https://www.codewars.com/kata/57eb8fcdf670e99d9b000272/train/python
You have to return the highest-scoring word within a string. Letters are scored based on their position in the alphabet:
a = 1, z = 26
I've created a list:
alphabet = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
I want to iterate through the words in the string (x) passed as a parameter. If the letter being checked is in the alphabet list, which it of course will be, I want to increment a separate variable, score, by the index at which that letter sits in the alphabet list.
Is it possible to use list indexes in this way?
Here's my code so far:
def high(x):
    alphabet = []
    scores = [] # ignore
    score = 0
    for letter in range(97,123):
        alphabet.append(chr(letter))
    word_list = x.split()
    for word in word_list:
        for letter in word:
            if letter in alphabet:
                score += # find way to use alphabet list index number as integer here
Thanks.

From what I can see the list of letters isn't needed at all :
import string
def high(x):
    score = 0
    for word in x.split():
        for letter in word:
            if letter in string.ascii_lowercase:
                score += ord(letter)-96
    return score
or even simpler:
import string
def high(x):
    # Sum expression on multiple lines for clarity
    return sum( ord(letter)-96
                for word in x.split()
                for letter in word
                if letter in string.ascii_lowercase)

Use list comprehensions and dictionary score to keep track of each letter and its score. Notice that the input string is lowercased - I assume that upper- and lowercase letters are scored the same.
alphabet = 'abcdefghijklmnopqrstuvwxyz'
score = dict(zip(list(alphabet), [i + 1 for i in range(len(alphabet))]))
x = 'aB, c'
score = sum([score[c] for c in list(x.lower()) if c in score])
print(score)
# 6

@AlanJP, would you like to try this program:
# simple word scoring program
import string
characters = string.ascii_lowercase
ranking = {c: i for i, c in enumerate(characters, 1)}
#print(ranking)
word_list = 'abba is great'.split()
for word in word_list:
    score = 0 # reset the score for each incoming word
    for char in word:
        score += ranking[char]
    print(word, score)
Output:
abba 6
is 28
great 51

Yes. alphabet.index(letter) + 1 will give you what you want.
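Put into the function from the question, that could look like the sketch below. It assumes the input contains at least one word made of lowercase letters, and that on a tie the word appearing first should win; max returns the first maximal item it sees, which matches that assumption. The helper name word_score is made up for illustration.

```python
def high(x):
    alphabet = [chr(code) for code in range(97, 123)]  # 'a' .. 'z'

    def word_score(word):
        # index() is 0-based, so add 1 to get a=1 ... z=26
        return sum(alphabet.index(letter) + 1 for letter in word if letter in alphabet)

    # max() keeps the earliest word when several share the top score
    return max(x.split(), key=word_score)


print(high("abba is great"))  # great
```

alphabet.index(letter) scans the list on every call, which is fine for 26 entries; a dict lookup would avoid the scan for larger tables.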

Use index(); you will also want to add 1, because list indexes start at 0.
# vowels list
vowels = ['a', 'e', 'i', 'o', 'u']
# index of 'e' in vowels
index = vowels.index('e')
print('The index of e:', index)

Python: How to check if element of string is in a string and print the not included string elements?

I want to check each list element of c against alphabet and print the letters of the alphabet that do not occur in that element.
For example, the first list element of c, "aa", should print all letters of the alphabet as a string, excluding the letter a.
alphabet = "abcdefghijklmnopqrstuvwxyz"
c = ['aa', 'bb', 'zz']
for x in c:
    if x in alphabet:
        print(alphabet)
    else:
        print('not an element of alphabet')

Something like this:
alphabet = "abcdefghijklmnopqrstuvwxyz"
cases = ['aa', 'bb', 'zz']
for case in cases:
    missing_letters = []
    for letter in alphabet:
        if letter not in case:
            missing_letters.append(letter)
    print(f"Case {case} misses following alphabet letters {missing_letters}")
Output (first case):
Case aa misses following alphabet letters ['b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

If you are sure that the elements in c are all in the format of 'xx' like in your sample, then the following is a solution:
alphabet = "abcdefghijklmnopqrstuvwxyz"
c = ['ad', 'bb', 'zz','ad', 'bt', 'uz']
for x in c:
    new_alph = alphabet
    for char in x:
        new_alph = new_alph.replace(char,'')
    if new_alph == alphabet:
        print('not an element of alphabet')
    else:
        print(new_alph)
Output:
bcefghijklmnopqrstuvwxyz
acdefghijklmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxy
bcefghijklmnopqrstuvwxyz
acdefghijklmnopqrsuvwxyz
abcdefghijklmnopqrstvwxy
Another way is to use translate to make the code more compact:
alphabet = "abcdefghijklmnopqrstuvwxyz"
c = ['ad', 'bb', 'zz','ad', 'bt', 'uz']
for x in c:
    new_alph = alphabet.translate({ord(char): '' for char in x})
    if new_alph == alphabet:
        print('not an element of alphabet')
    else:
        print(new_alph)
Output:
bcefghijklmnopqrstuvwxyz
acdefghijklmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxy
bcefghijklmnopqrstuvwxyz
acdefghijklmnopqrsuvwxyz
abcdefghijklmnopqrstvwxy
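If the strings in c can have any length, a comprehension over the alphabet gives the same result without chained replace calls. This is a sketch with a hypothetical helper name, remaining_letters; it returns None when the input shares no letter with the alphabet, mirroring the 'not an element of alphabet' branch above.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def remaining_letters(chars, alphabet=ALPHABET):
    # a set makes the membership test independent of the input length
    used = set(chars)
    remaining = "".join(letter for letter in alphabet if letter not in used)
    # nothing was removed, so no character of the input is in the alphabet
    return None if remaining == alphabet else remaining


for item in ['aa', 'bt', '12']:
    print(item, remaining_letters(item))
```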

As long as the strings in c are only 2 chars long, this will work:
alphabet = "abcdefghijklmnopqrstuvwxyz"
c = ['aa', 'bb', 'zz']
for x in c:
    if x[0] in alphabet or x[1] in alphabet:
        # str.replace returns a new string, so print the result
        print(alphabet.replace(x[0], '').replace(x[1], ''))
    else:
        print('not an element of alphabet')

Search list in string characters

I have to find the words that don't have a vowel and put them in another list.
I can't make it work, and I don't understand why.
for i in range(len(Cadena)):
    if all Vocales[] not in Cadena:
        Lista.append(Cadena[i])

Try this.
word = 'apple'
vowels = ['a','e','i','o','u','A','E','I','O','U']
has_vowel = False
for letter in word:
    if letter in vowels:
        # one vowel is enough, stop looking
        has_vowel = True
        break
if has_vowel:
    print('this word has a vowel')
else:
    print("this word doesn't have a vowel")

I'm not 100% sure what your variables mean, but your problem is simple.
Here is the solution:
# list of words that you want to sort into two groups
words = ["abc", "def", "ggg"]
vowels = ["a", "e", "i", "o", "u"] # possibly include "y"
withVowels = [] # words with vowels
withoutVowels = [] # words without vowels
# categorize words
for w in words:
    for v in vowels:
        if v in w:
            withVowels.append(w)
            break # one vowel is enough; avoids adding the word twice
    else:
        # the inner loop finished without finding a vowel
        withoutVowels.append(w)
At the end of this for loop, withVowels will contain ["abc", "def"] and withoutVowels will contain ["ggg"].

import re
def vocales(text):
    # with a regular expression
    print(re.findall("[aeiouÁÉÍÓÚ]", text.lower(), re.IGNORECASE))
    # or checking letter by letter
    print([e for e in text if e in "aeiouÁÉÍÓÚ"])
    # shows the characters that are not vowels
    print([e for e in text if e not in "aeiouÁÉÍÓÚ"])
    # returns False if the text has vowels
    return not len([e for e in text if e in "aeiouÁÉÍÓÚ"])
print(vocales("Hola mundo. Hello world"))
Output:
['o', 'a', 'u', 'o', 'e', 'o', 'o']
['o', 'a', 'u', 'o', 'e', 'o', 'o']
['H', 'l', ' ', 'm', 'n', 'd', '.', ' ', 'H', 'l', 'l', ' ', 'w', 'r', 'l', 'd']
False

You can set up a string with all your vowels, then you can use list comprehension to check your words against this string
vow = 'aeiou'
words = ['apple', 'banana', 'vsh', 'stmpd']
w_vowels = [i for i in words if any(k in vow for k in i)]
wo_vowels = [i for i in words if not any(k in vow for k in i)]
print(w_vowels) # => ['apple', 'banana']
print(wo_vowels) # => ['vsh', 'stmpd']
Expanded Loops without any:
w_vowels = []
for i in words:
    for k in i:
        if k in vow:
            w_vowels.append(i)
            break
wo_vowels = []
for i in words:
    for k in i:
        if k in vow:
            break
    else:
        wo_vowels.append(i)
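Both versions above only recognise lowercase vowels and walk the word list twice. A small sketch that lowercases each word before testing it and fills both lists in a single pass; split_by_vowels is a made-up name, and treating only a, e, i, o, u as vowels is an assumption.

```python
def split_by_vowels(words, vowels="aeiou"):
    with_vowels, without_vowels = [], []
    for word in words:
        # lower() lets 'A' and 'E' count as vowels too
        if any(ch in vowels for ch in word.lower()):
            with_vowels.append(word)
        else:
            without_vowels.append(word)
    return with_vowels, without_vowels


print(split_by_vowels(['Apple', 'vsh', 'ECHO', 'stmpd']))
```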

What is the best way in python to assign each of the same element of one list to the same indices on another list?

I thought about having a word as a string, making it into a "regularList" of strings, generating a "dummyList" which contains a string '-'for each letter in the word, then looping through the "regularList", simultaneously removing each instance of my guessed letter from "regularList" and reassigning it to the same index of "dummyList". Basically, I need to make:
regularList = [['a', 'a', 'r', 'd', 'v', 'a', 'r', 'k']]
dummyList = ['_','_','_','_','_','_','_','_']
Into:
regularList = [['r', 'd', 'v', 'r', 'k']]
dummyList = ['a','a','_','_','_','a','_','_']
Here is my attempt:
word = 'aardvark'
def changeLetter(word):
    guess = input('Guess a letter:') # When called, guess:a
    print(word)
    dummyList = []
    for i in word:
        dummyList.append('_ ')
    print(dummyList)
    regularList = [list(i) for i in word.split('\n')]
    print(regularList)
    numIters = 0
    while guess in regularList[0]:
        numIters += 1
        index = regularList[0].index(guess)
        dummyList[index] = guess
        del regularList[0][index]
    print(regularList)
    print(dummyList)
    print(numIters)
changeLetter(word)
This code produces:
Samuels-MacBook:python amonette$ python gametest.py
Guess a letter:a
aardvark
['_ ', '_ ', '_ ', '_ ', '_ ', '_ ', '_ ', '_ ']
[['a', 'a', 'r', 'd', 'v', 'a', 'r', 'k']]
[['r', 'd', 'v', 'r', 'k']]
['a', '_ ', '_ ', 'a', '_ ', '_ ', '_ ', '_ ']
3
As you can see, the proper indices aren't being reassigned.

word = 'aardvark'
def changeLetter(word):
    guess = input('Guess a letter:') # When called, guess:a
    print(word)
    dummyList = []
    for i in word:
        dummyList.append('_ ')
    print(dummyList)
    regularList = [list(i) for i in word.split('\n')]
    print(regularList)
    numIters = 0  # index into the shrinking regularList[0]
    position = 0  # index into the original word / dummyList
    while numIters < len(regularList[0]):
        if regularList[0][numIters] == guess:
            dummyList[position] = guess
            del regularList[0][numIters]
            # stay on the same index, the next element has moved into it
            numIters -= 1
        position += 1
        numIters += 1
    print(regularList)
    print(dummyList)
    print(numIters)
changeLetter(word)
Your program has one mistake: when you delete an element, the list gets shorter and every element after it moves one position to the left.
regularList[0] = ['a', 'a', 'r', 'd', 'v', 'a', 'r', 'k']
while guess in regularList[0]:
In this loop, when you remove the first a, the list becomes ['a', 'r', 'd', 'v', 'a', 'r', 'k'].
When the loop continues, index(guess) returns positions in this shortened list rather than in the original word. The a that was at position 1 is now found at position 0 again, and the a that was at position 5 is found at position 3 (0-based indexing), so the wrong slots of dummyList are filled.
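One way to avoid the shifting indices altogether is to never delete while scanning and instead build both lists from the original word. The sketch below is not the code from the question; reveal is a hypothetical helper, and the placeholder character is an assumption.

```python
def reveal(word, guess, placeholder="_"):
    # positions are taken from the untouched word, so they never shift
    mask = [ch if ch == guess else placeholder for ch in word]
    # the letters that are still to be guessed
    remaining = [ch for ch in word if ch != guess]
    return remaining, mask


remaining, mask = reveal("aardvark", "a")
print(remaining)  # ['r', 'd', 'v', 'r', 'k']
print(mask)       # ['a', 'a', '_', '_', '_', 'a', '_', '_']
```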

Maybe with enumerate over a copy of the list, and then remove the elements:
for index, element in enumerate(list(regularList[0])):
    if element == "a":
        # index refers to the untouched copy, so it matches dummyList
        dummyList[index] = element
        regularList[0].remove("a")

The accepted answer is longer than it needs to be; a single list comprehension is enough:
currentWord = [letter if letter in guessedLetters else "_" for letter in solution]
A fully functioning guess-the-word program without win logic would be 6 lines:
solution = "aardvark"
guessedLetters = []
while True:
    guessedLetters.append(input("Guess a letter "))
    currentWord = [letter if letter in guessedLetters else "_" for letter in solution]
    print(" ".join(currentWord))
This outputs:
Guess a letter a
a a _ _ _ a _ _
Guess a letter k
a a _ _ _ a _ k
Guess a letter v
a a _ _ v a _ k
Guess a letter r
a a r _ v a r k
Guess a letter d
a a r d v a r k
Guess a letter
