Join parts of lists before and after & character - python

I am trying to get a list, check if it contains the '&' character, then join the data before and after that character depending on where it is. The '&' position will not always be the same.
Let's say I have a list:
_list = ['John', 'Adams', '&', 'George', 'Washington']
I want to get the values before and after the ampersand and store them as a string to a variable.
name_one = "John Adams"
name_two = "George Washington"
Keep in mind this has to be dynamic: I need to get all of the data before and after the '&', no matter how many elements there are.
_list = ['John', 'Adams', 'Jr.', '&', 'George', 'Washington']
Would return
name_one = "John Adams Jr."
name_two = "George Washington"

You can use list.index to find the index of the first occurrence of '&' and then slice before and after that index.
def get_names(lst):
    try:
        index = lst.index('&')
    except ValueError:
        # placeholder default when '&' is not in your list; adjust as needed
        return None, None
    return ' '.join(lst[:index]), ' '.join(lst[index + 1:])
lst = ['John', 'Adams', 'Jr.', '&', 'George', 'Washington']
name_one, name_two = get_names(lst)
name_one # 'John Adams Jr.'
name_two # 'George Washington'

Since you are going to combine the partial lists into strings, anyway, why not first join all the pieces and then split?
_list = ['John', 'Adams', 'Jr.', '&', 'George', 'Washington']
name_one, name_two = " ".join(_list).split(" & ")
print(name_one, name_two, sep=", ")
#John Adams Jr., George Washington
You can even process more than two parts using the same expression:
_list = ['John', 'Adams', '&', 'George', 'Washington', '&', 'Ben', 'Franklin']
name_one, name_two, *more_names = " ".join(_list).split(" & ")
print(name_one, name_two, more_names, sep=", ")
#John Adams, George Washington, ['Ben Franklin']

Here is a simple solution:
l = ['John', 'Adams', '&', 'George', 'Washington']
ind = l.index('&')
name_one = ' '.join(l[:ind])
name_two = ' '.join(l[ind+1:])
Needless to say, you should be careful about the list not containing the '&' character or containing multiple instances of it.
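To make that caveat concrete, here is a minimal sketch of a helper that tolerates a missing '&' as well as several of them. The name split_names and its sep parameter are made up for illustration and do not come from any of the answers above.

```python
def split_names(items, sep="&"):
    """Split a list of words on every `sep` and join each group with spaces."""
    groups = [[]]
    for item in items:
        if item == sep:
            groups.append([])        # start a new group at each separator
        else:
            groups[-1].append(item)  # add the word to the current group
    return [" ".join(group) for group in groups]


print(split_names(['John', 'Adams', 'Jr.', '&', 'George', 'Washington']))
# ['John Adams Jr.', 'George Washington']
print(split_names(['John', 'Adams']))
# ['John Adams']
```

With no separator the result is a one-element list, so the caller can check its length before unpacking into name_one and name_two.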

You can use itertools.groupby:
import itertools
l = [['John', 'Adams', '&', 'George', 'Washington'], ['John', 'Adams', 'Jr.', '&', 'George', 'Washington']]
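for i in l:
    # groupby yields three groups here: words before '&', the '&' itself, words after it
    a, _, b = [' '.join(group) for _key, group in itertools.groupby(i, key=lambda x: x == '&')]
    print((a, b))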
Output:
('John Adams', 'George Washington')
('John Adams Jr.', 'George Washington')

You can try this method:
data22=['John', 'Adams', '&', 'George', 'Washington','john','&','paul','&','and','hi']
track=[0]+[j+1 for j,i in enumerate(data22) if i=='&']
for i in range(0,len(track)):
    if len(track[i:i+2])==2:
        real_data=data22[track[i:i+2][0]:track[i:i+2][1]]
        print(" ".join(real_data[:-1]))
    else:
        print(" ".join(data22[track[i:i+2][0]:]))
Output:
John Adams
George Washington john
paul
and hi

Related

Removing specific character from the end of strings in list in python

Let's say I have a list that is something like this:
lst = ['Joe C', 'Jill', 'Chad', 'Cassie C']
I want to remove the last character from each string if that character is a 'C'. At the moment I'm stuck at this impasse:
no_c_list = [i[:-1] for i in lst if i[-1:] == 'C']
However, this returns only the strings that ended in 'C' (with the trailing space left behind):
['Joe ', 'Cassie ']
Use rstrip:
lst = ['Joe C', 'Jill', 'Chad', 'Cassie C']
result = [e.rstrip('C') for e in lst]
print(result)
Output
['Joe ', 'Jill', 'Chad', 'Cassie ']
From the documentation:
Return a copy of the string with trailing characters removed. The
chars argument is a string specifying the set of characters to be
removed.
Also, as mentioned by @dawg:
result = [e.rstrip(' C') for e in lst]
If you want to remove the trailing whitespace also.
Try this:
lst = ['Joe C', 'Jill', 'Chad', 'Cassie C']
new_list = [ i[:-2] if i[-2:] == " C" else i for i in lst ]
print(new_list)
You could use a regex:
>>> lst = ['Joe C', 'Jill', 'Chad', 'Cassie C']
>>> import re
>>> [re.sub(r' C$', '', s) for s in lst]
['Joe', 'Jill', 'Chad', 'Cassie']

How to find lowercase first letter in word list, and change them into uppercase

I want to find out if any of those names start with lowercase and, if they do, change it to uppercase.
unknown_list = ('toby', 'James', 'kate', 'George', 'rick', 'Alex', 'Jein', 'medelin')
Tuples are immutable so you cannot change them but if you change unknown_list into a list then you are able to do that. You should use the .capitalize() function!
Here is the short version.
x = ['toby', 'James', 'kate', 'George', 'rick', 'Alex', 'Jein', 'medelin']
x = [name.capitalize() for name in x]
And the long version.
x = ['toby', 'James', 'kate', 'George', 'rick', 'Alex', 'Jein', 'medelin']
for index, name in enumerate(x):
    x[index] = name.capitalize()
The main idea in both is that you capitalize every name in order to achieve your goal.
The capitalize() method can do this easily:
>>> unknown_list = ('toby', 'James', 'kate', 'George', 'rick', 'Alex', 'Jein', 'medelin')
>>> new_list = [x.capitalize() for x in unknown_list]
>>> new_list
['Toby', 'James', 'Kate', 'George', 'Rick', 'Alex', 'Jein', 'Medelin']
Note that's creating a new list but you could just as easily assign back to the original variable if you want to overwrite it.
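Be aware that capitalize() also lowercases the rest of the string, which may matter for names with inner capitals. As a small sketch, assuming you only want to touch the first letter (the helper name upper_first and the sample name 'mcDonald' are made up for illustration):

```python
def upper_first(name):
    """Uppercase only the first character and leave the rest untouched."""
    return name[:1].upper() + name[1:]


names = ('toby', 'James', 'mcDonald', '')

print([n.capitalize() for n in names])
# ['Toby', 'James', 'Mcdonald', '']
print([upper_first(n) for n in names])
# ['Toby', 'James', 'McDonald', '']
```

Slicing with name[:1] instead of indexing with name[0] keeps the helper safe for empty strings.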
Maybe you can do it like this:
x = ['toby', 'James', 'kate', 'George', 'rick', 'Alex', 'Jein', 'medelin']
x = [name.title() for name in x]
You can use chr() and ord():
chr(97) is 'a', chr(65) is 'A'
and
ord('a') is 97, ord('A') is 65 (as int).
Test case:
for name in unknown_list:
    # 97..122 are the code points of 'a'..'z'; subtracting 32 gives the uppercase letter
    if ord(name[0]) >= 97 and ord(name[0]) <= 122:
        tmp = ord(name[0]) - 32
        print(chr(tmp))
But there is an easier way:
name = 'james'
print(name.capitalize())
This prints 'James'.
One small reminder: the parentheses you're using make your unknown_list a tuple. Tuples are immutable.
If you just want to capitalize everything in the list you can do this
unknown_list = ['toby', 'James', 'kate', 'George', 'rick', 'Alex', 'Jein', 'medelin']
capLs = []
for i in unknown_list:
    capLs.append(i.capitalize())
print(capLs)

Python algorithm to find the full names in a text

I am trying to put a simple algorithm to return first name and last name from a list containing a mix of first name and last names. For instance from the list
l = ['John May', ' May', 'John', 'John Smith','Jack', 'John','May Smith', 'Sandra', 'Tim John','Simon', 'Tim Sandra', 'Sandra Smith']
I would like to do the following:
If there is a single first name or last name in the text return only the full name containing that single first name or last name
If there is no full name return that single first name and last name
I wrote the following code to achieve that but I have only tested on a few test cases. I was wondering if someone can tell how to make this code more efficient and notice any issues with it.
l = ['John May', ' May', 'John', 'John Smith', 'John','May Smith', 'Sandra', 'Tim John', 'Tim Sandra', 'Sandra Smith', 'Simon', 'jack']
def get_unique_full_name(l):
    t = list(set(l))
    print(t)
    person = []
    for i in range(0,len(t)):
        for j in range(0,len(t)):
            if t[i] in t[j].split() or t[j] in t[i].split():
                if (t[i] != t[j]):
                    temp = t[i] if len(t[i]) == max(len(t[i]),len(t[j])) else t[j]
                    person.append(temp)
print(get_unique_full_name(l))
returns:(Expected output)
['John', 'May Smith', ' May', 'Sandra', 'Tim Sandra', 'John May', 'Sandra Smith', 'Simon', 'Tim John', 'jack', 'John Smith']

Split list for punctuation

I have a given title
I want to start splitting on whitespace and punctuation of a list, so that no word in the resulting list contains any whitespace or punctuation character.
Ex: the word "Joe's" gets split into "Joe" and "s"
'ad sf' gets split into 'ad' and 'sf'
Starting:
['Toms', 'ad sf', "Joe's"]
Ending:
['Toms', 'ad', 'sf' , 'Joe', 's']
I have tried regex, split, but there's not an easy and concise way. Can anyone think of a better way?
Split each item and join the pieces:
import re
from itertools import chain
mylist = ['Toms', 'ad sf', "Joe's"]
list(chain(*[re.split(r"\W+", item) for item in mylist]))
#['Toms', 'ad', 'sf', 'Joe', 's']
Here's a "clean" functional solution:
list(chain(*map(lambda item: re.split(r"\W+", item), mylist)))
#['Toms', 'ad', 'sf', 'Joe', 's']
You can use re.split:
import re
s = ['Toms', 'ad sf', "Joe's"]
final_result = [j for i in s for j in re.split(r'\W+', i)]
Output:
['Toms', 'ad', 'sf', 'Joe', 's']
There is no builtin way to achieve what you want, but here is the most concise way I could think of using map.
import re
words = ['Toms', 'ad sf', "Joe's"]
sum(map(re.compile(r'\W+').split, words), [])
# Output: ['Toms', 'ad', 'sf', 'Joe', 's']
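One caveat with the re.split variants: when an item starts or ends with punctuation, the split leaves an empty string in the result. A possible alternative, sketched below, is to collect the word characters with re.findall instead. The helper name split_words and the extra sample 'end.' are assumptions added for illustration. Note that \w also matches the underscore, so an underscore is not treated as punctuation in either approach.

```python
import re


def split_words(items):
    """Return every run of word characters found in the given strings."""
    return [word for item in items for word in re.findall(r"\w+", item)]


words = ['Toms', 'ad sf', "Joe's", 'end.']

print([w for item in words for w in re.split(r"\W+", item)])
# ['Toms', 'ad', 'sf', 'Joe', 's', 'end', '']   <- note the trailing ''
print(split_words(words))
# ['Toms', 'ad', 'sf', 'Joe', 's', 'end']
```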

Change a list to a string with specific formatting

Let's say that I have a list of names:
names = ['john', 'george', 'ringo', 'paul']
And need to get a string output like:
john', 'george', 'ringo', 'paul
(Note that the missing quote at the beginning and at the end is on purpose)
Is there an easier way to do this than
new_string=''
for x in names:
    new_string = new_string + x + "', '"
I know something like that will work, however the real names list will be very very (very) big and was wondering if there is a nicer way to do this.
You can simply use str.join:
>>> names = ['john', 'george', 'ringo', 'paul']
>>> print("', '".join(names))
john', 'george', 'ringo', 'paul
>>>
This may be a bad way to do it, but I just want to share it:
>>> names = ['john', 'george', 'ringo', 'paul']
>>> print(str(names)[2:-2])
john', 'george', 'ringo', 'paul
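The two approaches only agree as long as no name contains a quote character, because str(names) relies on the list's repr, which switches to double quotes for such strings. A quick sketch of the difference (the sample name "o'neil" is made up for illustration):

```python
names = ['john', "o'neil", 'ringo']

joined = "', '".join(names)
sliced = str(names)[2:-2]

print(joined)
# john', 'o'neil', 'ringo
print(sliced)
# john', "o'neil", 'ringo
```

str.join gives the same separator every time, so it is probably the safer choice for a large list of arbitrary names.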
