I have a list that contains a mix of arbitrary words and float numbers stored as strings. I'd like to delete all the float numbers from the list, but first I need a way to detect them. I know str.isdigit() can identify integers, but it doesn't work for floats. How can I do this?
My code is like this:
my_list = ['fun','3.25','4.222','cool','82.356','go','foo','255.224']
for i in my_list:
    if i.isdigit() == True:
        my_list.pop(i)
# Can't work, i.isdigit returns False
Use exception handling and a list comprehension. Don't modify the list while iterating over it.
>>> def is_float(x):
...     try:
...         float(x)
...         return True
...     except ValueError:
...         return False
>>> lis = ['fun','3.25','4.222','cool','82.356','go','foo','255.224']
>>> [x for x in lis if not is_float(x)]
['fun', 'cool', 'go', 'foo']
To modify the same list object use slice assignment:
>>> lis[:] = [x for x in lis if not is_float(x)]
>>> lis
['fun', 'cool', 'go', 'foo']
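One caveat worth checking before relying on is_float: float() accepts more than plain decimals. The sketch below repeats the helper and runs it on a few strings that are made up for this illustration (they are not from the question), so you can see which ones would be filtered out.

```python
def is_float(x):
    """Return True if float() can parse x."""
    try:
        float(x)
        return True
    except ValueError:
        return False

# Illustrative inputs, not from the original question.
samples = ['3.25', '42', '1e5', 'nan', 'inf', ' 7.5 ', 'cool', '1.2.3', '']

# Integers, exponent notation, 'nan', 'inf' and padded numbers all parse.
parsed = [s for s in samples if is_float(s)]
print(parsed)
```

If strings such as '42' or 'nan' should stay in the list, the check needs to be stricter than a bare float() call.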
Easy way:
new_list = []
for item in my_list:
    try:
        float(item)
    except ValueError:
        new_list.append(item)
Using regular expressions:
import re
expr = re.compile(r'\d+\.\d*\Z')  # anchored at the end so that strings like '3.25abc' are kept
new_list = [item for item in my_list if not expr.match(item)]
A point about using list.pop():
When you use list.pop() to alter an existing list, you shorten the list, which shifts the indices of the remaining elements. This leads to unexpected results if you are iterating over the list at the same time. Also, pop() takes an index as its argument, not an element, whereas your loop iterates over the elements of my_list. It is better to create a new list, as I have done above.
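To make the "unexpected results" concrete, here is a small sketch with a made-up list (the names items and is_float are only for this illustration). It removes elements while iterating and then shows the safe alternative of building a new list.

```python
def is_float(x):
    """Return True if float() can parse x."""
    try:
        float(x)
        return True
    except ValueError:
        return False

# Unsafe: removing from the list that is being iterated.
items = ['1.5', '2.5', 'a', '3.5']
for item in items:
    if is_float(item):
        items.remove(item)  # shifts later elements left, so the next one is skipped
print(items)  # '2.5' survives because the loop never visited it

# Safe: build a new list instead of mutating the one being iterated.
original = ['1.5', '2.5', 'a', '3.5']
filtered = [item for item in original if not is_float(item)]
print(filtered)
```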
A dead simple list comprehension, adding only slightly to isdigit:
my_list = [s for s in my_list if not all(c.isdigit() or c == "." for c in s)]
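This will remove string representations of both int and float values (i.e. any string s where all characters c are digits or a full stop). Note that it also removes strings such as '1.2.3', '.' and the empty string, which are not valid numbers.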
As I understand the OP, only floats should be removed. If integers should stay, consider this solution:
def is_float(x):
    try:
        # True only when the value has a fractional part, so '42' (and '3.0') are kept.
        # This also handles negative values such as '-3.25'.
        return not float(x).is_integer()
    except ValueError:
        return False
my_list = ['fun', '3.25', 'cool', '82.356', 'go', 'foo', '255.224']
list_int = ['fun', '3.25', 'cool', '82.356', 'go', 'foo', '255.224', '42']
print([item for item in my_list if not is_float(item)])
print([item for item in list_int if not is_float(item)])
Output
['fun', 'cool', 'go', 'foo']
['fun', 'cool', 'go', 'foo', '42']
Regular expressions would do the trick - this code searches each string for the format of a float (including floats starting with or ending with a decimal point), and if the string is not a float, adds it to the new list.
import re
my_list = ['fun','3.25','4.222','cool','82.356','go','foo','255.224']
new_list = []
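# Matches '3.25', '3.' and '.25', but not words that merely contain a dot.
float_pattern = re.compile(r'(?:[0-9]+\.[0-9]*|\.[0-9]+)\Z')
for st in my_list:
    if not float_pattern.match(st):
        new_list.append(st)
print(new_list)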
Creating a new list avoids working on the same list you are iterating on.
Ewan's answer is cleaner and quicker, I think.
I have a single list that can contain any number of elements.
['jeff','ham','boat','','my','name','hello']
How do I split this list into two or more lists at the blank string elements?
All these lists can then be put into one list of lists.
If you are certain that there is only one blank string in the list, you can use list.index to find the index of the blank string, and then slice the list accordingly:
index = lst.index('')
[lst[:index], lst[index + 1:]]
If there could be more than one blank string in the list, you can use itertools.groupby like this:
lst = ['jeff','ham','boat','','my','name','hello','','hello','world']
from itertools import groupby
print([list(g) for k, g in groupby(lst, key=bool) if k])
This outputs:
[['jeff', 'ham', 'boat'], ['my', 'name', 'hello'], ['hello', 'world']]
Using itertools.groupby, you can do:
from itertools import groupby
lst = ['jeff','ham','boat','','my','name','hello']
[list(g) for k, g in groupby(lst, key=bool) if k]
# [['jeff', 'ham', 'boat'], ['my', 'name', 'hello']]
Using bool as grouping key function makes use of the fact that the empty string is the only non-truthy string.
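To see why filtering on k works, it can help to look at the raw (key, group) pairs that groupby produces. The sketch below uses a made-up list with consecutive and trailing blank strings; the variable names runs and chunks are only for illustration.

```python
from itertools import groupby

# Illustrative input with two consecutive blanks and a trailing blank.
lst = ['jeff', 'ham', 'boat', '', '', 'my', 'name', 'hello', '']

# Each run of consecutive items with the same truthiness becomes one group.
runs = [(key, list(group)) for key, group in groupby(lst, key=bool)]
for key, group in runs:
    print(key, group)

# Keeping only the truthy runs drops every run of blank strings.
chunks = [group for key, group in runs if key]
print(chunks)
```

Because consecutive blanks fall into a single falsy run, they never produce empty sublists in the result.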
This is one approach using a simple iteration.
Ex:
myList = ['jeff','ham','boat','','my','name','hello']
result = [[]]
for i in myList:
    if not i:
        result.append([])
    else:
        result[-1].append(i)
print(result)
Output:
[['jeff', 'ham', 'boat'], ['my', 'name', 'hello']]
Let list_string be your list. This should do the trick:
list_of_list=[[]]
for i in list_string:
    if len(i)>0:
        list_of_list[-1].append(i)
    else:
        list_of_list.append([])
Basically, you create a list of lists and go through your original list of strings. Each time you encounter a word, you append it to the last list in your list of lists. Each time you encounter '', you start a new list in your list of lists. The output for your example would be:
[['jeff','ham','boat'],['my','name','hello']]
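One thing to be aware of with this loop: consecutive or trailing blank strings each start a new list, so the result can contain empty sublists. A minimal sketch of a wrapper that makes this explicit follows; the function name split_on_blank and the keep_empty flag are hypothetical, introduced only for this example.

```python
def split_on_blank(items, keep_empty=False):
    """Split items into sublists at each blank string (hypothetical helper)."""
    groups = [[]]
    for item in items:
        if item:
            groups[-1].append(item)
        else:
            groups.append([])  # a blank string starts a new sublist
    if keep_empty:
        return groups
    # Drop sublists created by consecutive or trailing blanks.
    return [group for group in groups if group]


sample = ['a', '', '', 'b', '']
with_empty = split_on_blank(sample, keep_empty=True)
without_empty = split_on_blank(sample)
print(with_empty)
print(without_empty)
```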
I'm not sure that this is what you're trying to do, but try:
my_list = ['jeff','ham','boat','','my','name','','hello']
list_tmp = list(my_list)
final_list = []
while '' in list_tmp:
    idx = list_tmp.index('')
    final_list.append(list_tmp[:idx])
    list_tmp = list_tmp[idx + 1:]
final_list.append(list_tmp)  # keep the elements after the last blank string
I have the following list:
my_list = ['a', 'b', 'c']
I have the following list of strings:
my_strings = ['azz', 'bzz', 'czz']
I'm doing the following to determine whether any item of my_list is contained within an item of my_strings:
for my_string in my_strings:
    if any(x in my_string for x in my_list):
        # Do Stuff
What's the best practice for retaining the x as found in my_list so that I might be able to then do the following:
#Do Stuff
new_var = my_string.split('x')[1]
The desired result is to assign zz to new_var after determining that a from my_list is in azz from my_strings.
It is simple indeed. You can even do it in a one-liner using a list comprehension:
new_var = [my_string.split(x)[1] for my_string in my_strings for x in my_list if x in my_string]
For every string in my_strings that contains an element x of my_list, this returns the second piece of that string split on x.
You should not use any because you actually care about which one matches:
for my_string in my_strings:
    for x in my_list:
        if x in my_string:
            # Do Stuff
            new_var = my_string.split(x)[1]
            break
You should not use any() because you need to know specifically which element matches a certain string. Simply use a regular for loop instead:
>>> def split(strings, lst):
...     for string in strings:
...         for el in lst:
...             if el in string:
...                 yield string.split(el)[1]
...
>>>
>>> for string in split(['azz', 'bzz', 'czz'], ['a', 'b', 'c']):
...     print(string)
...
zz
zz
zz
>>>
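If only the first matching element per string is needed, another option is next() with a generator expression, which stops at the first hit just as any() does but hands back the element itself. This is a sketch rather than a recommendation; the function name remainder_after_match is made up for the example, and str.partition is used instead of split so that only the first occurrence is cut.

```python
def remainder_after_match(text, candidates):
    """Return (match, rest) for the first candidate found in text, else None."""
    match = next((c for c in candidates if c in text), None)
    if match is None:
        return None
    # partition splits on the first occurrence only: (before, separator, after)
    _, _, rest = text.partition(match)
    return match, rest


my_list = ['a', 'b', 'c']
my_strings = ['azz', 'bzz', 'czz']
results = [remainder_after_match(s, my_list) for s in my_strings]
print(results)
```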
If you want to split the first value of my_strings on the first value of my_list:
my_strings[0].split(my_list[0])
I want to write a Python function that checks whether any string in a list is a prefix of another string in the list. It must be a prefix, not an arbitrary substring. If so, return True. For instance,
list = ['abc', 'abcd', 'xyx', 'mno']
Return True because 'abc' is a prefix of 'abcd'.
list = ['abc', 'xyzabc', 'mno']
Return False
I tried startswith() and a list comprehension, but it didn't quite work.
I'd appreciate any help or pointers.
First sort the list by string length. A prefix is never longer than the string it prefixes, so after sorting, a string can only be a prefix of the strings that come after it. We then iterate over the sorted list and compare each element only with the elements that follow it. This roughly halves the number of comparisons, because we no longer compare each element with every other element, although the worst case is still quadratic.
lst1 = ['abc', 'abcd', 'xyx', 'mno']
lst2 = ['abc', 'xyzabc', 'mno']
lst3 = ["abc", "abc"]
def check_list(lst):
    lst = list(set(lst))  # if you want to avoid redundant strings
    lst.sort(key=len)
    n = len(lst)
    for i in range(n):
        for j in range(i + 1, n):
            if lst[j].startswith(lst[i]):
                return True
    return False
print(check_list(lst1))
print(check_list(lst2))
print(check_list(lst3))
>>> True
>>> False
>>> False #incase you use lst = list(set(lst))
Using itertools
import itertools
list1 = ["abc", "xyz", "abc123"]
products = itertools.product(list1, list1)
is_substringy = any(x.startswith(y) for x, y in products if x != y)
This isn't very optimised, but depending on the amount of data you've got to deal with, the code is fairly elegant (and short); that might trump speed in your use case.
This assumes that you don't have pure repeats in the list however (but you don't have that in your example).
import itertools
mlist = ['abc', 'abcd', 'xyx', 'mno']
#combination of list elements, 2-by-2. without repetition
In [638]: for i,j in itertools.combinations(mlist,2):
              print (i,j)
   .....:
('abc', 'abcd')
('abc', 'xyx')
('abc', 'mno')
('abcd', 'xyx')
('abcd', 'mno')
('xyx', 'mno')
# r holds the final result: True if any pair has one string that is a prefix of the other
r=False
In [639]: for i,j in itertools.combinations(mlist,2):
              r = r or i.startswith(j)  # True if j is a prefix of i (logical or)
              r = r or j.startswith(i)  # True if i is a prefix of j
   .....:
In [640]: r
Out[640]: True
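The approaches above all compare pairs of strings. For larger lists, a different technique (not used in any of the answers here) is to sort the strings lexicographically: if a is a prefix of b, every string that sorts between them also starts with a, so it is enough to compare each string with its immediate neighbour. The sketch below assumes duplicates should not count as prefixes, which is why it deduplicates first; the function name has_prefix_pair is made up for the example.

```python
def has_prefix_pair(strings):
    """Return True if some string is a proper prefix of another (sketch)."""
    # Deduplicate so identical strings do not count, then sort lexicographically.
    ordered = sorted(set(strings))
    # After sorting, a prefix is always directly followed by a string it prefixes.
    return any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))


print(has_prefix_pair(['abc', 'abcd', 'xyx', 'mno']))
print(has_prefix_pair(['abc', 'xyzabc', 'mno']))
print(has_prefix_pair(['abc', 'abc']))
```

The cost is dominated by the sort, so this scales better than comparing every pair.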
Let's say that you have a string array 'x', containing very long strings, and you want to search for the following substring: "string.str", within each string in array x.
In the vast majority of the elements of x, the substring in question will be in the array element. However, maybe once or twice, it won't be. If it's not, then...
1) is there a way to just ignore the case and then move onto the next element of x, by using an if statement?
2) is there a way to do it without an if statement, in the case where you have many different substrings that you're looking for in any particular element of x, where you might potentially end up writing tons of if statements?
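You want a try/except block. Here is a simplified example:
a = 'hello'
try:
    print(a[6])  # indexing past the end raises IndexError (a slice like a[6:] would not)
except IndexError:
    pass
Expanded example:
a = ['hello', 'hi', 'hey', 'nice']
for i in a:
    try:
        print(i[3])  # 'hi' and 'hey' are too short and are skipped
    except IndexError:
        pass
l
e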
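For the substring search described in the question, str.index is one call that does raise (ValueError) when the substring is absent, so it fits the try/except pattern directly. The following is a minimal sketch; the helper name text_after and the sample strings are made up for illustration.

```python
def text_after(element, substring):
    """Return the text following substring in element, or None if it is absent."""
    try:
        start = element.index(substring)  # raises ValueError when not found
    except ValueError:
        return None
    return element[start + len(substring):]


# Illustrative data, not from the question.
x = ['abc string.str tail', 'no match here', 'string.str!']
found = [t for t in (text_after(e, 'string.str') for e in x) if t is not None]
print(found)
```

Catching only ValueError, rather than using a bare except, keeps unrelated errors visible.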
You can use list comprehension to filter the list concisely:
Filter by length:
a_list = ["1234", "12345", "123456", "123"]
print([elem[3:] for elem in a_list if len(elem) > 3])
>>> ['4', '45', '456']
Filter by substring:
a_list = ["1234", "12345", "123456", "123"]
a_substring = "456"
print([elem for elem in a_list if a_substring in elem])
>>> ['123456']
Filter by multiple substrings (keeps an element only if every substring occurs in it):
a_list = ["1234", "12345", "123456", "123", "56", "23"]
substrings = ["56","23"]
print([elem for elem in a_list
       if all(sub in elem for sub in substrings)])
>>> ['123456']
If I understand your question correctly, you can use the continue keyword to jump to the next element in the array.
elements = ["Victor", "Victor123", "Abcdefgh", "123456", "1234"]
astring = "Victor"
for element in elements:
    if astring in element:
        pass  # do stuff
    else:
        continue  # redundant here: without it the loop moves on to the next element anyway
Sorry for my English.
Use any() to see if any of the substrings are in an item of x. any() consumes a generator expression and exhibits short-circuit behavior: it returns True at the first expression that evaluates to True and stops consuming the generator.
>>> substrings = ['list', 'of', 'sub', 'strings']
>>> x = ['list one', 'twofer', 'foo sub', 'two dollar pints', 'yard of hoppy poppy']
>>> for item in x:
...     if any(sub in item.split() for sub in substrings):
...         print(item)
...
list one
foo sub
yard of hoppy poppy
>>>
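Note that the example above matches whole words, because item.split() turns each item into a list of words before the in test. If a true substring match is wanted, drop the split(). The sketch below runs both variants on the same data so the difference is visible; the names word_hits and substring_hits are only for illustration.

```python
substrings = ['list', 'of', 'sub', 'strings']
x = ['list one', 'twofer', 'foo sub', 'two dollar pints', 'yard of hoppy poppy']

# Whole-word match: 'of' must be one of the space-separated words.
word_hits = [item for item in x
             if any(sub in item.split() for sub in substrings)]

# Substring match: 'of' may occur anywhere, e.g. inside 'twofer'.
substring_hits = [item for item in x
                  if any(sub in item for sub in substrings)]

print(word_hits)
print(substring_hits)
```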
These three functions are part of my study guide, and I would greatly appreciate some assistance.
In each case, the function returns a value (so use the return statement): it does not print the value (no print statement) or mutate (change the value of) any of its arguments.
1) The repl function takes three arguments:
◦old is any value;
◦new is any value;
◦xs is a list.
Example:
>>> repl('zebra', 'donkey', ['mule', 'horse', 'zebra', 'sheep', 'zebra'])
['mule', 'horse', 'donkey', 'sheep', 'donkey']
It returns a new list formed by replacing every occurrence of old in xs with new.
It must not mutate the list xs; i.e., after return from the function, the actual argument given for xs must be what it was before.
>>> friends = ['jules', 'james', 'janet', 'jerry']
>>> repl('james', 'henry', friends)
['jules', 'henry', 'janet', 'jerry']
>>> friends
['jules', 'james', 'janet', 'jerry']
2) The search function looks for a value in a list. It takes two arguments:
◦y is the value being searched for.
◦xs is the list being searched in.
It returns the index of the first occurrence of y in xs, if it occurs; -1 otherwise.
Examples:
>>> words = ['four', 'very', 'black', 'sheep']
>>> search('four', words)
0
>>> search('sheep', words)
3
>>> search('horse', words)
-1
3) The doubles function is given a list of numbers and returns a new list containing the doubles of every number in the given list.
Example:
>>> doubles([1, 3, 7, 10])
[2, 6, 14, 20]
It must not mutate the given list:
>>> salaries = [5000, 7500, 15000]
>>> doubles(salaries)
[10000, 15000, 30000]
>>> salaries
[5000, 7500, 15000]
This is to be done without using any list methods except append. (In particular, you may not use index or count for the search function.)
You can, however, use the len function and the list operations +, *, indexing, slicing, and == for comparing lists or elements. You will need some of these, but not all.
Any help is greatly appreciated, as I mentioned in the introduction.
So far all I have is:
def repl (find, replacement, s):
    newString = ''
    for c in s:
        if c != find:
            newString = newString + c
        else:
            newString = newString + replacement
    return newString
def search(y, xs):
    n = len(xs)
    for i in range(n):
        if xs[i] == y:
            return i
    return -1
and....
def search(key,my_list):
    if key in my_list:
        return my_list.index(key)
    else:
        return
I'm not sure what needs to be returned after the else statement.
def repl(old,new,my_list):
    final = []
    for x in my_list:
        if x == old:  # compare by value; "is" tests object identity
            final.append(new)
        else:
            final.append(x)
    return final
def search(key,my_list):
    if key in my_list:
        return my_list.index(key)  # note: the assignment disallows index()
    else:
        return -1
def doubles(my_list):
    return [x*2 for x in my_list]
I suspect this lesson is about list comprehensions:
doubles = lambda my_list: [x*2 for x in my_list]
repl = lambda old_t,new_t,my_list: [x if x != old_t else new_t for x in my_list]
print(repl("cow","mouse",["cow","rat","monkey","elephant","cow"]))
print(doubles([1,2,3,4,'d']))
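For reference, here is a sketch of all three functions that stays within the stated constraints: append is the only list method used, and search relies on len and indexing rather than index or count. It is one possible reading of the assignment, not the only valid solution.

```python
def repl(old, new, xs):
    """Return a new list with every occurrence of old replaced by new."""
    result = []
    for x in xs:
        if x == old:
            result.append(new)
        else:
            result.append(x)
    return result


def search(y, xs):
    """Return the index of the first occurrence of y in xs, or -1."""
    for i in range(len(xs)):
        if xs[i] == y:
            return i
    return -1


def doubles(xs):
    """Return a new list containing twice each number in xs."""
    result = []
    for x in xs:
        result.append(x * 2)
    return result


friends = ['jules', 'james', 'janet', 'jerry']
print(repl('james', 'henry', friends))
print(friends)  # unchanged, because repl builds a new list
print(search('sheep', ['four', 'very', 'black', 'sheep']))
print(doubles([1, 3, 7, 10]))
```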