Converting strings to floats in a nested list - python

I have a list of lists that contain strings of numbers and words.
I want to convert only those strings that are numbers to floats:
aList= [ ["hi", "1.33"], ["bye", " 1.555"] ]

First, you need a function that does the "convert a string to float if possible, otherwise leave it as a string":
def floatify(s):
    try:
        return float(s)
    except ValueError:
        return s
Now, you can just call that on each value, either generating a new list, or modifying the old one in place.
Since you have a nested list, this means a nested iteration. You might want to start by doing it explicitly in two steps:
def floatify_list(lst):
    return [floatify(s) for s in lst]
def floatify_list_of_lists(nested_list):
    return [floatify_list(lst) for lst in nested_list]
You can of course combine it into one function just by making floatify_list a local function:
def floatify_list_of_lists(nested_list):
    def floatify_list(lst):
        return [floatify(s) for s in lst]
    return [floatify_list(lst) for lst in nested_list]
You could also do it by substituting the inner expression in place of the function call. If you can't figure out how to do that yourself, I would recommend not doing it, because you're unlikely to understand it (complex nested list comprehensions are hard enough for experts to understand), but if you must:
def floatify_list_of_lists(nested_list):
    return [[floatify(s) for s in lst] for lst in nested_list]
Or, if you prefer your Python to look like badly-disguised Haskell:
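from functools import partial
def floatify_list_of_lists(nested_list):
    # In Python 3, map() is lazy: this returns an iterator of iterators, not a list of lists
    return map(partial(map, floatify), nested_list)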
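As a quick check of the approach above, here is a minimal sketch applied to the list from the question. It repeats floatify so the snippet is self-contained. floatify_in_place is a hypothetical name (not from the answer) that illustrates the "modifying the old one in place" option.

```python
def floatify(s):
    """Return float(s) if s parses as a number, otherwise s unchanged."""
    try:
        return float(s)
    except ValueError:
        return s


def floatify_in_place(nested_list):
    """Hypothetical helper: overwrite each element of each sublist."""
    for lst in nested_list:
        for index, s in enumerate(lst):
            lst[index] = floatify(s)


a_list = [["hi", "1.33"], ["bye", " 1.555"]]

# New list of lists; the original is left untouched.
converted = [[floatify(s) for s in lst] for lst in a_list]
print(converted)  # [['hi', 1.33], ['bye', 1.555]]

# Same conversion, but mutating the existing sublists.
floatify_in_place(a_list)
print(a_list)  # [['hi', 1.33], ['bye', 1.555]]
```

Note that float() ignores surrounding whitespace, which is why " 1.555" converts. It also accepts strings such as "nan" and "inf", so those would become floats too.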

Related

Why does this function to convert all items in a list to strings not work?

I have a two dimensional array and I try to convert all items within each array to strings.
First I tried to use a function to_str and this approach didn't work. I do not understand why it doesn't work (it returns the input unchanged):
lst = [['test1', 555], ['test2', 3333]]
def to_str(item):
    for i in item:
        if not isinstance(i, str):
            i = str(i)
    return item
output = list(map(lambda item:to_str(item), lst))
output: [['test1', 555], ['test2', 3333]]
Then I used a list comprehension instead, and it worked:
output = list(map(lambda item: [str(i) for i in item], lst))
output: [['test1', '555'], ['test2', '3333']]
Why does the first approach using to_str not work?
You're trying to modify the iteration variable i. This has no effect on the list: assigning to i only rebinds a local name that referred to a list element; it doesn't change the list itself. For this to work, you have to assign to the list elements by index, something like this:
def to_str(item):
    # iterate over the indexes in the item
    for i in range(len(item)):
        # we can remove this check, and simply convert everything to str
        if not isinstance(item[i], str):
            item[i] = str(item[i])
    return item
Or we can create a new list with the results instead of overwriting the original (but this is equivalent to a list comprehension, so prefer the list comprehension):
def to_str(item):
    result = []
    for element in item:
        # we can remove this check, and simply convert everything to str
        if not isinstance(element, str):
            result.append(str(element))
        else:
            result.append(element)
    return result
Also, regarding your second approach: it's better to avoid list, map and lambda here. If what you want is a new list as the result, use a list comprehension directly. This is a more idiomatic way to solve the problem, and it also removes the unnecessary string check:
[[str(i) for i in item] for item in lst]
=> [['test1', '555'], ['test2', '3333']]
In approach #1, the converted value i is never used, so the function just returns its input. Collect the converted values in a new list instead:
def to_str(item):
    result = []
    for i in item:
        if not isinstance(i, str):
            i = str(i)
        result.append(i)
    return result
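To make the difference concrete, here is a small sketch contrasting the two loops on the same data. The variable names unchanged and changed are only for illustration.

```python
unchanged = ['test1', 555]
changed = ['test1', 555]

# Rebinding the loop variable: `i` is just a name for the current element,
# so assigning to it leaves the list alone.
for i in unchanged:
    i = str(i)
print(unchanged)  # ['test1', 555]

# Assigning through the index replaces the element stored in the list.
for index, value in enumerate(changed):
    changed[index] = str(value)
print(changed)  # ['test1', '555']
```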

Python, encode strings within a list of lists

I am using Python 2.7.
I have a list of lists like:
testList2 = [[u'462', u'San Germ\xe1n, PR'],[u'461', u'40341']]
I want to encode the strings in the list of lists like:
encodedList = [['462', 'San Germ\xc3\xa1n, PR'],['461', '40341']]
Tried to write a function to do this (did not work):
def testEncode(a):
    for list in a:
        return [x.encode('utf-8') for x in list]
I think that for the function to work, it needs to append each encoded list to the prior encoded list to generate an encoded list of lists. Not sure how to do this. If someone could explain how the function could be edited to do this, that would be awesome.
I tried the following which did not work either
def testEncode(a):
    b = []
    for list in a:
        b.append([x.encode('utf-8') for x in list])
        return b
Having realized that your first code is not actually a typographical error but a logical mistake, let me summarize my comments here. There are two problems (both related) in your approaches:
Problem with the first code: You are currently returning only the first sublist because you put the return inside your for loop. Your input list contains sublists, so you need to loop over them in a nested manner. One way is to do it as in your second approach. Another way is to use a list comprehension. In the version below, i iterates over the sublists and x iterates over the elements of sublist i.
def testEncode(a):
    return [[x.encode('utf-8') for x in i] for i in a]
Problem with the second code: In this attempt, you have solved the problem of ignoring the sublists, but you forgot to put your return statement outside the for loop. So the function returns prematurely, before the for loop has iterated through all the sublists. Therefore, you only see the first sublist modified.
def testEncode(a):
    b = []
    for list in a:
        b.append([x.encode('utf-8') for x in list])
    return b # <-- Moved outside the for loop now
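The question targets Python 2.7, where encoding a unicode object gives a byte str. As a hedged sketch of the same nested comprehension under Python 3 (an assumption, not part of the original question), str.encode returns bytes objects instead, so the result prints with a b prefix. The name encode_nested is hypothetical.

```python
def encode_nested(rows, encoding="utf-8"):
    """Encode every string in a list of lists; returns a new list of lists."""
    return [[text.encode(encoding) for text in row] for row in rows]


test_list = [["462", "San Germ\xe1n, PR"], ["461", "40341"]]
encoded = encode_nested(test_list)
print(encoded)
# [[b'462', b'San Germ\xc3\xa1n, PR'], [b'461', b'40341']]
```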

Python: Get first element of list potentially containing sublists

I'm looking for the first element of a Python list potentially containing either numbers (integer or float), or many levels of nested sublists containing the same. In these examples, let's suppose I am always looking for the number '1'. If the list contains no sublists, we have:
>>> foo = [1,2,3]
>>> foo[0]
1
If the list contains one sublist, and I know this information, I can again obtain 1 with
>>> foo = [[1,2],[3,4]]
>>> foo[0][0]
1
Similarly if the first element of my list is a list containing a list:
>>> foo = [[[1,2],[3,4]],[[5,6],[7,8]]]
>>> foo[0][0][0]
1
Is there a general way to get the first integer or float in foo, without resorting to calling a function recursively until drilling down to a value of foo[0] that is no longer a list?
There shouldn't be any need for recursion. Assuming that you are always working with lists and ints, this should work perfectly well for you.
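foo = [[[1,2],[3,4]],[[5,6],[7,8]]]
result = None
current = foo
while True:
    try:
        current = current[0]  # descend one level
    except TypeError:
        break  # current is a number, so it cannot be indexed
    result = current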
Unlike the other answers, this asks for forgiveness rather than for permission, which is a bit more Pythonic.
If you really want to be Pythonic, you could define a function as follows. However, this would admittedly be overkill given your specification.
def first_scalar(foo):
    result = None
    while True:
        try:
            foo = next(iter(foo))  # descend into the first element
        except TypeError:
            return result
        result = foo
Note that it returns None if the argument is not an iterable. The same applies for the first segment of code.
Note that this doesn't work if the deepest "left-most" child list is empty. To account for this, you'll need to totally flatten the list.
def _flatten(foo):
    try:
        for item in foo:
            yield from _flatten(item)
    except TypeError:
        yield foo
def flatten(foo):
    for item in foo:
        yield from _flatten(item)
def first_scalar(foo):
    return next(flatten(foo))
Note that the above requires Python 3.3 or later, because it uses yield from.
The following code is for earlier versions of Python.
def _flatten(foo):
    try:
        for item in foo:
            for subitem in _flatten(item):
                yield subitem
    except TypeError:
        yield foo
def flatten(foo):
    for item in foo:
        for subitem in _flatten(item):
            yield subitem
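For illustration, here is a hedged variant of the flattening approach for Python 3.3+. It departs from the code above in two ways, both assumptions of this sketch: it descends only into list instances (so strings are treated as scalars rather than iterated character by character), and first_scalar takes a hypothetical default argument that is returned when the structure contains no scalar at all.

```python
def flatten(nested):
    """Yield the non-list leaves of a nested list, left to right."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item


def first_scalar(nested, default=None):
    """Return the first leaf, skipping empty sublists; default if none exists."""
    return next(flatten(nested), default)


print(first_scalar([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]))  # 1
print(first_scalar([[[], [1, 2]], [3]]))                   # 1
print(first_scalar([[], [[]]]))                            # None
```

Because flatten is a generator, next() stops as soon as the first leaf is produced, so the rest of the structure is never traversed.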
The general-case answer for this is "Fix your data structure." Lists are supposed to be homogeneous, i.e. every element of the list should have the same type (be that int, or list of ints, or list of lists of ints, etc.).
The special case here would be to recurse until you find a number and return it.
def foo(lst):
    first_el = lst[0]
    if isinstance(first_el, (float, int)):
        return first_el
    else:
        return foo(first_el)
Create a simple recursive function:
>>> def getFirst(l):
...     return l[0] if not isinstance(l[0],list) else getFirst(l[0])
>>> getFirst([1,2,3,4])
1
>>> getFirst([[1,2,3],[4,5]])
1
>>> getFirst([[[4,2],12,[1,3]],1])
4
This returns l[0] if l[0] is anything but a list; otherwise, it recursively returns the first item of l[0].
You can just "dive in", without any recursion:
lst = [[1, 2], [3, 4]]
first = lst
while isinstance(first, list):
    first = first[0]
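The loop above can be wrapped in a function. This is a minimal sketch; the name first_leaf and the decision to raise ValueError on an empty leftmost list are assumptions, since the answer does not say what should happen in that case.

```python
def first_leaf(lst):
    """Follow index 0 down through nested lists until a non-list is reached."""
    first = lst
    while isinstance(first, list):
        if not first:
            # Assumption of this sketch: fail loudly rather than with IndexError.
            raise ValueError("leftmost sublist is empty")
        first = first[0]
    return first


print(first_leaf([1, 2, 3]))                             # 1
print(first_leaf([[1, 2], [3, 4]]))                      # 1
print(first_leaf([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]))  # 1
```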
If you really want to avoid any loops or recursion, there is an ugly workaround. Transform the list to a string and then remove the list-specific chars:
','.join(map(str,foo)).replace('[','').replace(']','').replace(' ','').split(',')
Of course, it only works if the list is composed of strings or integers. If the objects in the list are custom, you would have to transform them to strings. But, since there is an unknown number of sublists, you would have to use recursion, so using this workaround wouldn't make sense.
Another problem: the elements of the list and sublists may themselves contain list-specific characters such as '[' or ',', which would break this approach.
In short, this is a bad workaround that only works for sure if the list and sublists are composed of numbers. Otherwise, using some kind of recursion is most probably necessary.

Accessing elements of a list

I have a list of strings, and I am calling a function on each string, which returns a string. I want to update the string in the list. How can I do that?
for i in list:
    func(i)
The function func() returns a string. I want to update the list with this string. How can it be done?
If you need to update your list in place (not create a new list to replace it), you'll need to get the index that corresponds to each item you get from your loop. The easiest way to do that is to use the built-in enumerate function:
for index, item in enumerate(lst):
    lst[index] = func(item)
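Whether "in place" matters depends on whether anything else refers to the same list. The sketch below illustrates this with a second reference; func here is a stand-in that upper-cases its argument, and alias is a hypothetical name.

```python
def func(s):
    # Stand-in for the question's func(); here it just upper-cases the string.
    return s.upper()


lst = ["a", "b"]
alias = lst  # a second reference to the same list object

# In-place update: every reference sees the change.
for index, item in enumerate(lst):
    lst[index] = func(item)
print(alias)  # ['A', 'B']

# Rebinding: `lst` now names a new list; `alias` still names the old one.
lst = [item.lower() for item in lst]
print(lst)    # ['a', 'b']
print(alias)  # ['A', 'B']
```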
You can reconstruct the list with list comprehension like this
list_of_strings = [func(str_obj) for str_obj in list_of_strings]
Or, you can use the builtin map function like this
list_of_strings = map(func, list_of_strings)
Note: If you are using Python 3.x, you need to convert the map object to a list explicitly, like this
list_of_strings = list(map(func, list_of_strings))
Note: You don't have to worry about the old list and its memory. When you make the variable list_of_strings refer to a new list by assigning to it, the reference count of the old list drops by 1, and when the reference count reaches 0, the list is automatically garbage collected.
First, don't call your lists list (that's the built-in list constructor).
The most Pythonic way of doing what you want is a list comprehension:
lst = [func(i) for i in lst]
or you can create a new list:
lst2 = []
for i in lst:
    lst2.append(func(i))
and you can even mutate the list in place
for n, i in enumerate(lst):
    lst[n] = func(i)
Note: most programmers will be confused by calling the list item i in the loop above, since i is normally used as a loop index counter; I'm just using it here for consistency.
You should get used to the first version though; it's much easier to understand when you come back to the code six months from now.
Later you might also want to use a generator...
g = (func(i) for i in lst)
lst = list(g)
You can use map() to do that.
map(func, list)

Python: check if value is in a list no matter the CaSE

I want to check if a value is in a list, no matter what the case of the letters are, and I need to do it efficiently.
This is what I have:
if val in list:
But I want it to ignore case
check = "asdf"
checkLower = check.lower()
print(any(checkLower == val.lower() for val in ["qwert", "AsDf"]))
# prints True
This uses the any() function. It is nice because you aren't building a lowercased copy of the list; any() iterates over the list and, as soon as it finds a true value, stops and returns.
Demo : http://codepad.org/dH5DSGLP
If you know that your values are all of type str or unicode, you can try this:
if val.lower() in map(str.lower, list):
...Or:
if val.lower() in map(unicode.lower, list):
If you really have just a list of the values, the best you can do is something like
if val.lower() in [x.lower() for x in list]: ...
but it would probably be better to maintain, say, a set or dict whose keys are lowercase versions of the values in the list; that way you won't need to keep iterating over (potentially) the whole list.
Incidentally, using list as a variable name is poor style, because list is also the name of one of Python's built-in types. You're liable to find yourself trying to call the list builtin function (which turns things into lists) and getting confused because your list variable isn't callable. Or, conversely, trying to use your list variable somewhere where it happens to be out of scope and getting confused because you can't index into the list builtin.
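A minimal sketch of the dict idea mentioned above, assuming Python 3 and string values. The names items and by_lower are hypothetical. Mapping each lowercased value to its original spelling gives average O(1) membership tests and also lets you recover how the value was written in the list.

```python
items = ["qwert", "AsDf"]

# Build once: lowercased value -> original spelling.
# If two items differ only by case, the later one wins.
by_lower = {item.lower(): item for item in items}

val = "ASDF"
print(val.lower() in by_lower)    # True
print(by_lower.get(val.lower()))  # AsDf
print(by_lower.get("missing"))    # None
```

For non-ASCII text, str.casefold() (Python 3.3+) is a more aggressive alternative to lower() for caseless matching.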
You can lower the values and check them:
>>> val
'CaSe'
>>> l
['caSe', 'bar']
>>> val in l
False
>>> val.lower() in (i.lower() for i in l)
True
items = ['asdf', 'Asdf', 'asdF', 'asjdflk', 'asjdklflf']
itemset = set(i.lower() for i in items)
val = 'ASDF'
if val.lower() in itemset: # O(1)
    print('wherever you go, there you are')
