Append length of string for each string in list - python

Input
strlist = ['test', 'string']
Desired output:
strlist = [('test', 4), ('string', 6)]
Attempt 1:
def add_len(strlist):
    for i, c in enumerate(strlist):
        strlist += str(len(strlist))
    return strlist
Attempt 2:
def add_len(strlist):
    for c in strlist:
        c += " " + str(len(c))
    return strlist
I realise I have the following issues:
Attempt 1: This results in an infinite loop, as the code keeps adding onto the list.
Attempt 2: This does not add the value to the string; when I change it so that it does, I get the infinite loop issue from Attempt 1.
I believe I need to evaluate the number of elements in the list first and implement a while statement, but I'm not quite sure how to do this.

Use a list comprehension:
strlist = ['test', 'string']
def add_len(strlist):
    return [(s, len(s)) for s in strlist]

You can use a list comprehension like this.
def add_len(strlist):
    return [(s, len(s)) for s in strlist]
Or expanded,
def add_len(strlist):
    new_list = []
    for s in strlist:
        new_list.append((s, len(s)))
    return new_list
In the expanded form we can see the steps it's going through a bit more clearly.
Create list new_list to put your strings and lengths into.
Iterate over each string in strlist.
For each string, append the string and its length to new_list.
Return new_list.
Of course, both of these give you your desired output:
[('test', 4), ('string', 6)]

Using map():
>>> strlist
['test', 'string']
>>> list(map(lambda x: (x, len(x)), strlist))
[('test', 4), ('string', 6)]

for i, c in enumerate(strlist)
Here i is the index of each element in strlist and c is its value. In the first attempt, strlist += str(len(strlist)) extends strlist in place with the characters of the length string, so the list keeps growing while you iterate over it and the loop never ends.
The second attempt won't produce your desired result either. Strings are immutable, so c += " " + str(len(c)) only rebinds the loop variable c to a new string such as 'test 4'; strlist itself is left unchanged, and no tuple is ever created.
Modifying strlist in place, as the first attempt does, can also cause problems elsewhere, because any other name that refers to the same list sees the change.
A better solution is to build a new list with a list comprehension.
strlist = ['test', 'string']
new_strlist = [(s, len(s)) for s in strlist]
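To see why the second attempt leaves strlist untouched, here is a minimal sketch. The function names add_len_rebind and add_len_by_index are hypothetical and used only for illustration; the second one shows one possible way to update the list in place without growing it.

```python
def add_len_rebind(strlist):
    # Mirrors Attempt 2: `c += ...` builds a new string and rebinds c;
    # the element stored in the list is never replaced.
    for c in strlist:
        c += " " + str(len(c))
    return strlist


def add_len_by_index(strlist):
    # In-place variant: assigning through the index replaces each element.
    # The list length never changes, so the loop terminates.
    for i, s in enumerate(strlist):
        strlist[i] = (s, len(s))
    return strlist


print(add_len_rebind(['test', 'string']))    # ['test', 'string']
print(add_len_by_index(['test', 'string']))  # [('test', 4), ('string', 6)]
```

If other code holds a reference to the same list, the list comprehension is usually the safer choice because it leaves the original untouched.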

Related

Python string to list per character

Good day. I just want to understand the logic behind this code:
lst = []
word = "ABCD"
lst[:0] = word
print(lst)
Output: ['A', 'B', 'C', 'D']. Why is it not ['ABCD']?
for i in word:  # this loop I understand: it iterates over the string
    lst.append(i)  # and appends each character to the list
But I don't get the logic of the first snippet above.
lst[:0] = ... is implemented by lst.__setitem__(slice(None, 0, None), ...), where ... is an arbitrary iterable.
The slice [:0] selects the empty range at the beginning of lst (though in this case it doesn't really matter, since lst is empty), so each element of the iterable is inserted into lst, starting at the beginning. A string is an iterable of its characters, which is why you get one element per character.
You can see this starting with a non-empty list.
>>> lst = [1,2,3]
>>> word = "ABCD"
>>> lst[:0] = word
>>> lst
['A', 'B', 'C', 'D', 1, 2, 3]
To get lst == ['ABCD'], you need to make the right-hand side an iterable containing the string:
lst[:0] = ('ABCD', ) # ['ABCD'] would also work.
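As a small sketch of the equivalence described above, the following calls __setitem__ directly with the slice object that [:0] produces. The variable names are only illustrative.

```python
lst = [1, 2, 3]
print(lst[:0])  # [] (the empty slice at the front of the list)

# Equivalent to: lst[:0] = "ABCD"
lst.__setitem__(slice(None, 0, None), "ABCD")
print(lst)  # ['A', 'B', 'C', 'D', 1, 2, 3]

# Wrapping the string in a list inserts it as a single element.
other = [1, 2, 3]
other[:0] = ["ABCD"]
print(other)  # ['ABCD', 1, 2, 3]
```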
This is actually a well-known way to convert a string to a list, character by character.
You can find it here: https://www.geeksforgeeks.org/python-program-convert-string-list/
If you want the whole string 'ABCD' as a single list element, try
lst[:0] = [word,]
This specifies that the whole word should be inserted as one element.

Join characters from list of strings by index

For example, I have the following list.
list=['abc', 'def','ghi','jkl','mn']
I want to make a new list as:
newList=['adgjm','behkn','cfil']
That is, the first character of each element is joined into a new string and appended to the new list, then the second character of each element, and so on.
Thanks for the help.
One way is zipping the strings in the list, which will interleave the characters from each string in the specified fashion, and join them back with str.join:
l = ['abc', 'def','ghi','jkl']
list(map(''.join, zip(*l)))
# ['adgj', 'behk', 'cfil']
For strings with different lengths, use zip_longest and fill with an empty string:
from itertools import zip_longest
l = ['abcZ', 'def','ghi','jkl']
list(map(''.join, zip_longest(*l, fillvalue='')))
# ['adgj', 'behk', 'cfil', 'Z']
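Applied to the list from the question, where the last string is shorter than the others, the same zip_longest idea can be sketched as follows (the names words and new_list are only illustrative):

```python
from itertools import zip_longest

words = ['abc', 'def', 'ghi', 'jkl', 'mn']

# Missing characters are replaced by '', so they vanish when joined.
new_list = [''.join(chars) for chars in zip_longest(*words, fillvalue='')]
print(new_list)  # ['adgjm', 'behkn', 'cfil']
```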
You can try this way:
>>> list1 =['abc', 'def','ghi','jkl']
>>> newlist = []
>>> for args in zip(*list1):
...     newlist.append(''.join(args))
...
>>> newlist
['adgj', 'behk', 'cfil']
Or using list comprehension:
>>> newlist = [''.join(args) for args in zip(*list1)]
>>> newlist
['adgj', 'behk', 'cfil']
You can try this:
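words = ['abc', 'def', 'ghi', 'jkl']  # renamed from `list` to avoid shadowing the built-in
n = len(words[0])  # assumes every string is as long as the first one
newList = []
for i in range(n):
    newword = ''
    for word in words:
        newword += word[i]
    newList.append(newword)
print(newList)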
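The nested loop above indexes every string up to the length of the first one, so it would raise an IndexError for the question's input, where 'mn' is shorter. A possible variant that tolerates unequal lengths is sketched below; join_by_index is a hypothetical helper name, not something from the answers.

```python
def join_by_index(words):
    # Loop up to the length of the longest string.
    longest = max((len(w) for w in words), default=0)
    result = []
    for i in range(longest):
        # w[i:i + 1] is '' when w is too short, so no IndexError is raised.
        result.append(''.join(w[i:i + 1] for w in words))
    return result


print(join_by_index(['abc', 'def', 'ghi', 'jkl', 'mn']))
# ['adgjm', 'behkn', 'cfil']
```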

Splitting lists by empty element

I have a single list that could have any number of elements.
['jeff','ham','boat','','my','name','hello']
How do I split this one list into two lists or any amount of lists depending on blank string elements?
All these lists can then be put into one list of lists.
If you are certain that there is only one blank string in the list, you can use list.index to find the index of the blank string, and then slice the list accordingly:
index = lst.index('')
[lst[:index], lst[index + 1:]]
If there could be more than one blank string in the list, you can use itertools.groupby like this:
lst = ['jeff','ham','boat','','my','name','hello','','hello','world']
from itertools import groupby
print([list(g) for k, g in groupby(lst, key=bool) if k])
This outputs:
[['jeff', 'ham', 'boat'], ['my', 'name', 'hello'], ['hello', 'world']]
Using itertools.groupby, you can do:
from itertools import groupby
lst = ['jeff','ham','boat','','my','name','hello']
[list(g) for k, g in groupby(lst, key=bool) if k]
# [['jeff', 'ham', 'boat'], ['my', 'name', 'hello']]
Using bool as the grouping key function makes use of the fact that the empty string is the only falsy string.
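To make the grouping visible, this small sketch prints the key next to each group. The input list is made up for illustration and includes consecutive and trailing blanks.

```python
from itertools import groupby

lst = ['jeff', '', '', 'ham', 'boat', '']

# Consecutive items with the same bool() value end up in one group.
groups = [(k, list(g)) for k, g in groupby(lst, key=bool)]
print(groups)
# [(True, ['jeff']), (False, ['', '']), (True, ['ham', 'boat']), (False, [''])]

# Keeping only the truthy groups drops every run of blank strings.
print([items for k, items in groups if k])
# [['jeff'], ['ham', 'boat']]
```

Because a whole run of blanks forms a single False group, consecutive or trailing blank strings never produce empty sublists with this approach.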
This is one approach using a simple iteration.
Ex:
myList = ['jeff','ham','boat','','my','name','hello']
result = [[]]
for i in myList:
    if not i:
        result.append([])
    else:
        result[-1].append(i)
print(result)
Output:
[['jeff', 'ham', 'boat'], ['my', 'name', 'hello']]
Let list_string be your list. This should do the trick:
list_of_list=[[]]
for i in list_string:
    if len(i)>0:
        list_of_list[-1].append(i)
    else:
        list_of_list.append([])
Basically, you create a list of lists and go through your original list of strings. Each time you encounter a word, you put it in the last list of your list of lists, and each time you encounter '', you start a new list in your list of lists. The output for your example would be:
[['jeff','ham','boat'],['my','name','hello']]
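One detail of the loop-based approach is worth seeing on an example: every blank string starts a new sublist, so consecutive or trailing blanks leave empty sublists behind. The sketch below wraps the loop in a hypothetical function, split_on_blank, to show this.

```python
def split_on_blank(items):
    result = [[]]
    for item in items:
        if item:
            result[-1].append(item)   # word: add to the current sublist
        else:
            result.append([])         # blank: start a new sublist
    return result


print(split_on_blank(['a', '', '', 'b']))  # [['a'], [], ['b']]
print(split_on_blank(['a', 'b', '']))      # [['a', 'b'], []]

# If empty sublists are not wanted, filter them out afterwards.
non_empty = [group for group in split_on_blank(['a', '', '', 'b']) if group]
print(non_empty)  # [['a'], ['b']]
```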
I'm not sure that this is what you're trying to do, but try:
my_list = ['jeff','ham','boat','','my','name','','hello']
list_tmp = list(my_list)
final_list = []
while '' in list_tmp:
    idx = list_tmp.index('')
    final_list.append(list_tmp[:idx])
    list_tmp = list_tmp[idx + 1:]
final_list.append(list_tmp)  # keep the items after the last blank string

Given a list of string, determine if one string is a prefix of another string

I want to write a Python function which checks whether one string in a list is a prefix of another string in the list. It must be a prefix, not an arbitrary substring. If so, return True. For instance,
list = ['abc', 'abcd', 'xyx', 'mno']
Return True because 'abc' is a prefix of 'abcd'.
list = ['abc', 'xyzabc', 'mno']
Return False
I tried startswith() and a list comprehension, but it didn't quite work.
I'd appreciate any help or pointers.
First sort lst by string length. A prefix is never longer than the string it prefixes, so after sorting, the shorter strings come first. Then iterate over the sorted list, comparing each element only with the elements after it. This small optimization means we don't have to compare each element with every other element in both directions.
lst1 = ['abc', 'abcd', 'xyx', 'mno']
lst2 = ['abc', 'xyzabc', 'mno']
lst3 = ["abc", "abc"]
def check_list(lst):
    lst = list(set(lst))  # drop duplicates if equal strings should not count as prefixes
    lst.sort(key=len)
    n = len(lst)
    for i in range(n):
        for j in range(i + 1, n):
            if lst[j].startswith(lst[i]):
                return True
    return False
print(check_list(lst1))  # True
print(check_list(lst2))  # False
print(check_list(lst3))  # False, because lst = list(set(lst)) removed the duplicate
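A different technique from the length-based sort above, offered only as a sketch: sort the strings lexicographically instead. If any string is a prefix of another, then in lexicographic order it is also a prefix of its immediate successor, so checking adjacent pairs is enough. The name has_prefix_pair is hypothetical, and duplicates are removed first on the assumption that equal strings should not count as prefixes.

```python
def has_prefix_pair(words):
    # Assumption: identical strings do not count, so deduplicate first.
    ordered = sorted(set(words))
    # Compare each string only with its lexicographic successor.
    return any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))


print(has_prefix_pair(['abc', 'abcd', 'xyx', 'mno']))  # True
print(has_prefix_pair(['abc', 'xyzabc', 'mno']))       # False
print(has_prefix_pair(['abc', 'abc']))                 # False
```

This replaces the nested loop with a single pass after sorting, which may matter for long lists.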
Using itertools
import itertools
list1 = ["abc", "xyz", "abc123"]
products = itertools.product(list1, list1)
is_substringy = any(x.startswith(y) for x, y in products if x != y)
This isn't very optimised, but depending on the amount of data you've got to deal with, the code is fairly elegant (and short); that might trump speed in your use case.
This assumes that you don't have pure repeats in the list however (but you don't have that in your example).
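If repeated strings should count as prefixes of each other, one possible adjustment is to pair elements by position rather than by value. The sketch below uses itertools.permutations for that; any_prefix is a hypothetical name.

```python
from itertools import permutations


def any_prefix(words):
    # permutations(words, 2) yields ordered pairs taken from two different
    # positions, so equal strings at different positions are still compared.
    return any(x.startswith(y) for x, y in permutations(words, 2))


print(any_prefix(["abc", "xyz", "abc123"]))  # True
print(any_prefix(["abc", "xyzabc", "mno"]))  # False
print(any_prefix(["abc", "abc"]))            # True
```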
import itertools
mlist = ['abc', 'abcd', 'xyx', 'mno']
#combinations of the list elements, taken two at a time, without repetition
In [638]: for i,j in itertools.combinations(mlist,2):
   .....:     print((i,j))
   .....:
('abc', 'abcd')
('abc', 'xyx')
('abc', 'mno')
('abcd', 'xyx')
('abcd', 'mno')
('xyx', 'mno')
#r holds the final result: True if there is any pair where one string is a prefix of the other
r=False
In [639]: for i,j in itertools.combinations(mlist,2):
   .....:     r = r or i.startswith(j) # True if j is a prefix of i
   .....:     r = r or j.startswith(i) # True if i is a prefix of j
   .....:
In [640]: r
Out[640]: True

How to check float string?

I have a list which consists of irregular words and strings that represent float numbers. I'd like to delete all these float strings from the list, but first I need to find a way to detect them. I know str.isdigit() can detect strings made up only of digits, but it doesn't work for floats. How can I do it?
My code is like this:
my_list = ['fun','3.25','4.222','cool','82.356','go','foo','255.224']
for i in my_list:
    if i.isdigit() == True:
        my_list.pop(i)
# Can't work, i.isdigit returns False
Use exception handling and a list comprehension. Don't modify the list while iterating over it.
>>> def is_float(x):
...     try:
...         float(x)
...         return True
...     except ValueError:
...         return False
...
>>> lis = ['fun','3.25','4.222','cool','82.356','go','foo','255.224']
>>> [x for x in lis if not is_float(x)]
['fun', 'cool', 'go', 'foo']
To modify the same list object use slice assignment:
>>> lis[:] = [x for x in lis if not is_float(x)]
>>> lis
['fun', 'cool', 'go', 'foo']
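Note that float() accepts more than plain decimal strings, so the try/except check also treats integers, scientific notation, and special values as floats. A short sketch with made-up sample strings:

```python
def is_float(x):
    try:
        float(x)
        return True
    except ValueError:
        return False


samples = ['42', '1e5', ' 3.5 ', 'nan', 'inf', '3.2.1', '']
results = {s: is_float(s) for s in samples}
print(results)
# {'42': True, '1e5': True, ' 3.5 ': True, 'nan': True, 'inf': True,
#  '3.2.1': False, '': False}
```

Whether words such as 'nan' or 'inf' should be removed depends on the data, so an extra check may be needed if they can occur as ordinary words.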
Easy way:
new_list = []
for item in my_list:
    try:
        float(item)
    except ValueError:
        new_list.append(item)
Using regular expressions:
import re
expr = re.compile(r'\d+(?:\.\d*)')
new_list = [item for item in my_list if not expr.match(item)]
A point about using list.pop():
When you use list.pop() to alter an existing list, you are shortening the list, which shifts the indices of the remaining elements. This will lead to unexpected results if you are simultaneously iterating over the list. Also, pop() takes an index as its argument, not an element, whereas your loop iterates over the elements of my_list. It is better to create a new list as I have done above.
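Both problems can be reproduced with a few lines. This is only an illustrative sketch with made-up data; it uses list.remove() to show the skipping effect, since pop() with a string argument fails outright.

```python
items = ['1', '2', 'a', '3']
for s in items:
    if s.isdigit():
        # Removing shifts later elements left while the loop position
        # still advances, so the element after a removed one is skipped.
        items.remove(s)
print(items)  # ['2', 'a'] ('2' was never examined)

pop_failed = False
try:
    ['a', 'b'].pop('a')  # pop() expects an integer index
except TypeError as exc:
    pop_failed = True
    print('pop needs an integer index:', exc)
```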
A dead simple list comprehension, adding only slightly to isdigit:
my_list = [s for s in my_list if not all(c.isdigit() or c == "." for c in s)]
This will remove string representations of both int and float values (i.e. any string s where all characters c are numbers or a full stop).
As I understand the OP, only floats should be removed. If integers should stay, consider this solution:
def is_float(x):
    try:
        value = float(x)
        return int(value) != value  # True only when there is a fractional part
    except (ValueError, OverflowError):  # not a number, or 'nan' / 'inf'
        return False
my_list = ['fun', '3.25', 'cool', '82.356', 'go', 'foo', '255.224']
list_int = ['fun', '3.25', 'cool', '82.356', 'go', 'foo', '255.224', '42']
print([item for item in my_list if not is_float(item)])
print([item for item in list_int if not is_float(item)])
Output
['fun', 'cool', 'go', 'foo']
['fun', 'cool', 'go', 'foo', '42']
Regular expressions would do the trick. This code checks whether each whole string has the format of a float (including floats starting with or ending with a decimal point), and if the string is not a float, adds it to the new list.
import re
my_list = ['fun','3.25','4.222','cool','82.356','go','foo','255.224']
# digits with a decimal point, e.g. '3.25', '3.' or '.25'
float_pattern = re.compile(r'[0-9]+[.][0-9]*|[.][0-9]+')
new_list = []
for st in my_list:
    if not float_pattern.fullmatch(st):
        new_list.append(st)
print(new_list)
Creating a new list avoids working on the same list you are iterating on.
Ewan's answer is cleaner and quicker, I think.
