Convert a list of strings to ints where possible - python

I have seen a variety of answers on here, but none that quite answered my question. I am trying to convert the following list
list = ['A', '2', '8', 'B', '3']
to the following:
list = ['A', 2, 8, 'B', 3]
I want to keep the strings as strings but convert the strings to ints where possible.
I know I could do something like:
list = [int(i) for i in list]
if it were just numbers, but I am unsure how to do it when it is mixed.

There's always try/except:
oldlist = ['A', '2', '8', 'B', '3']
newlist = []
for x in oldlist:
    try:
        newlist.append(int(x))
    except ValueError:
        newlist.append(x)
newlist
# ['A', 2, 8, 'B', 3]

You can use str.isdigit():
>>> l = ['A', '2', '8', 'B', '3']
>>> [int(x) if x.isdigit() else x for x in l]
['A', 2, 8, 'B', 3]
Taking negative numbers into account:
>>> l = ['A', '2', '8', 'B', '-3']
>>> [int(x) if x.isdigit() or (x.startswith('-') and x[1:].isdigit()) else x for x in l]
['A', 2, 8, 'B', -3]

I would just extract the conversion into a function.
def int_if_possible(value):
    try:
        return int(value)
    except (ValueError, TypeError):
        return value
int_list = [int_if_possible(i) for i in int_list]
Also I renamed your list to int_list, so that we can still use the list constructor if required.
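To illustrate how the try/except helper behaves on less tidy input, here is a minimal sketch. The function is the one from the answer above; the sample values are made up for illustration and are not from the question.

```python
def int_if_possible(value):
    """Return int(value) when the conversion succeeds, else value unchanged."""
    try:
        return int(value)
    except (ValueError, TypeError):
        return value

# Hypothetical sample data: signs, padding, a float string, an empty string, None.
samples = ['A', '2', '-3', ' 42 ', '3.5', '', None]
converted = [int_if_possible(s) for s in samples]
print(converted)
# ['A', 2, -3, 42, '3.5', '', None]
```

Because the decision is left to int() itself, signed and whitespace-padded numbers are converted without extra checks, while strings such as '3.5' stay strings.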

You can use a try/except block:
lst1 = ['A', '2', '8', 'B', '3']
lst2 = []
for i in lst1:
    try:
        lst2.append(int(i))
    except ValueError:
        lst2.append(i)
print(lst2)

Related

Subwords of a string in Python

I am trying to create a list of every possible version of a string in a fast way. I don't really mean specifically subwords - for example from a string "ABC", I want to get:
['C', 'B', 'BC', 'A', 'AB', 'ABC']
(without "AC" which is a subword)
Same goes for "123":
I want to get ['3', '2', '23', '1', '12', '123'] instead of ['3', '2', '23', '1', '13', '12', '123']
Here is a simple loop and slice based generator function:
def subs(s):
    for i in range(len(s)):
        for j in range(i+1, len(s)+1):
            yield s[i:j]
>>> list(subs("ABC"))
['A', 'AB', 'ABC', 'B', 'BC', 'C']
Might be faster to extend the substrings instead of freshly slicing each:
def subs(s):
    while s:
        t = ''
        for c in s:
            t += c
            yield t
        s = s[1:]
Benchmark results for s = "z" * 5000:
8.4 seconds subs_slice
1.5 seconds subs_extend
Benchmark code (Try it online!):
from timeit import timeit
from collections import deque
def subs_slice(s):
    for i in range(len(s)):
        for j in range(i+1, len(s)+1):
            yield s[i:j]
def subs_extend(s):
    while s:
        t = ''
        for c in s:
            t += c
            yield t
        s = s[1:]
funcs = subs_slice, subs_extend
for func in funcs:
    print(list(func('ABCD')))
s = "z" * 5000
for _ in range(3):
    for func in funcs:
        t = timeit(lambda: deque(func(s), 0), number=1)
        print(t, func.__name__)
    print()
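The generators above yield the substrings starting from the first character, whereas the question lists them starting from the last one. If that order matters, a small variation of the slicing approach could look like this. It is only a sketch, and the name subs_reversed is made up here.

```python
def subs_reversed(s):
    """Yield contiguous substrings, beginning with those that start at the last index."""
    for i in reversed(range(len(s))):
        for j in range(i + 1, len(s) + 1):
            yield s[i:j]

print(list(subs_reversed("ABC")))
# ['C', 'B', 'BC', 'A', 'AB', 'ABC']
print(list(subs_reversed("123")))
# ['3', '2', '23', '1', '12', '123']
```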
For ABC you can just get ['C', 'B', 'BC', 'A', 'AB', 'ABC', 'AC'] and then use remove() to remove the subword from your list. For example:
abc_list = ['C', 'B', 'BC', 'A', 'AB', 'ABC', 'AC']
abc_list.remove('AC')
Output: ['C', 'B', 'BC', 'A', 'AB', 'ABC']
The question lacks the context needed for a full answer. Do all of your strings have 3 characters or more? How do you define what you don't need?
If all the strings are 3 characters in length, then you can use this:
def subwording(word: str):
    subword = word[0]+word[2]
    return subword
Then you can remove subword from your list.

How to take the 3rd element of each nested list but if 3rd doesnt exist put null?

My list looks like the following;
list = [['a', 'b', 'x'], ['a', 'd', 'r'], ['a', 'c']]
What I want to do is extract the 3rd element of each sub-list, but for the last one, which has only 2 elements, put in "null" instead.
This is what I have tried already.
try:
    lst1 = [item[2] for item in list]
except IndexError:
    lst1 = ['' for item in list]
print(lst1)
expected output would be the following;
lst1 = ['x', 'r', '']
You may go with simple list comprehension:
lst = [['a', 'b', 'x'], ['a', 'd', 'r'], ['a', 'c']]
res = [i[2] if len(i) > 2 else '' for i in lst]
print(res) # ['x', 'r', '']
The conditional expression i[2] if len(i) > 2 else '' takes the 3rd item i[2] (Python uses zero-based indexing) only when the sublist has more than 2 items (len(i) > 2), and falls back to '' otherwise.
You can use a ternary operator here, like:
result = [item[2] if len(item) > 2 else '' for item in mylist]
for example:
>>> [item[2] if len(item) > 2 else '' for item in mylist]
['x', 'r', '']
Note: please do not give your variables names of builtins, since that will override the references to these builtins. For example use mylist, instead of list.
You can use next and iter like in the following
>>> lst = [['a', 'b', 'x'], ['a', 'd', 'r'], ['a', 'c']]
>>> n = 2
>>> [next(iter(l[n:]), '') for l in lst]
['x', 'r', '']
You can do this with a loop and try-except as well, similar to how you attempted:
lst1 = []
for item in list:
    try:
        lst1.append(item[2])
    except IndexError:
        lst1.append("")
If you want to have the index where you want to cut as a variable, you can choose to have this as a nice function:
def nth_element(n):
    lst1 = []
    for item in list:  # `list` is the nested list from the question
        try:
            lst1.append(item[n])  # use n, not a hard-coded 2
        except IndexError:
            lst1.append("")
    return lst1
Now you can enter any n and just call this function: nth_element(0) gives you the first elements of each list, nth_element(1) second elements, nth_element(2) third elements, etc.
Per your comments, you can collect all lists of nth elements, up to m, inside another list as follows: [nth_element(i) for i in range(m)].
Note: Don't name your variables with built-ins. So instead of list, name it lst or my_list or list0 etc.
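As a sketch of the same idea without relying on a global variable, the nested list can be passed in as an argument. The names nth_elements, rows and default below are illustrative and assume n is non-negative. For collecting every column at once, itertools.zip_longest from the standard library pads the shorter sublists with a fill value.

```python
from itertools import zip_longest

def nth_elements(rows, n, default=''):
    """Return the n-th item of each sublist, or default when it is missing."""
    return [row[n] if len(row) > n else default for row in rows]

rows = [['a', 'b', 'x'], ['a', 'd', 'r'], ['a', 'c']]
print(nth_elements(rows, 2))
# ['x', 'r', '']

# All columns in one pass; short sublists are padded with ''.
columns = [list(col) for col in zip_longest(*rows, fillvalue='')]
print(columns)
# [['a', 'a', 'a'], ['b', 'd', 'c'], ['x', 'r', '']]
```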

Create a list using the letters in x

Using list comprehension, create a list of all the letters used in x.
x = ‘December 11, 2018’
I tried writing each letter out but I am receiving a syntax error!
In Python a string is iterable, much like a list of characters; it is easier and quicker to convert the string into a set (only unique values, in no guaranteed order) and then back to a list:
unique_x = list(set(x))
Or if you must use list comprehension:
used = set()
all_x = "December 11, 2018"
unique_x = [x for x in all_x if x not in used and (used.add(x) or True)]
x = "December 11, 2018"
lst = [letter for letter in x]
print(lst) # test
Output:
['D', 'e', 'c', 'e', 'm', 'b', 'e', 'r', ' ', '1', '1', ',', ' ', '2', '0', '1', '8']
You can make a list comprehension like:
x = 'December 11, 2018'
new_list = [letter for letter in x]
print(new_list)
# Output
# ['D', 'e', 'c', 'e', 'm', 'b', 'e', 'r', ' ', '1', '1', ',', ' ', '2', '0', '1', '8']
Alternatively, you could skip the list comprehension and just use new_list = list(x) to get the same result.
If you want only the letters and no spaces, you can use .replace on x, like x.replace(' ', ''), or add an if clause to your list comprehension:
new_list = [letter for letter in x if letter != ' ']
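The filter above only drops spaces, so digits and the comma remain. If "letters" is meant literally, one possible sketch is to filter with str.isalpha(), and to use dict.fromkeys() when each letter should appear once in its original order (this relies on dicts preserving insertion order, which is guaranteed from Python 3.7). The variable names are illustrative.

```python
x = "December 11, 2018"

# Keep alphabetic characters only; digits, spaces and punctuation are dropped.
letters = [ch for ch in x if ch.isalpha()]
print(letters)
# ['D', 'e', 'c', 'e', 'm', 'b', 'e', 'r']

# Remove duplicates while keeping first-seen order.
unique_letters = list(dict.fromkeys(letters))
print(unique_letters)
# ['D', 'e', 'c', 'm', 'b', 'r']
```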
This should work
x = list('December 11, 2018')
print(x)
result = []
for item in x:
    try:
        int(item)
    except ValueError:
        if item == "," or item == " ":
            pass
        else:
            result.append(item)
print(result)
"""
Output:
['D', 'e', 'c', 'e', 'm', 'b', 'e', 'r']
"""
If you are using only dates with that format, you could do this
x = "December 11, 2018".split()
print(x[0])
"""
Output:
December
"""

How to see if during int casting its not an int

I have this Matrix=[['1', '2', '3'], ['4', 'a', 'n']]
I'm doing this:
Matrix=[arr.split() for arr in Matrix]
Matrix=[list(map(int, arr)) for arr in Matrix]
As you can see, I have 'a' and 'n' in there. I want to stop the process and raise a flag like con=false every time I get a char inside the Matrix.
How do I do that?
One solution is to declare a "better" casting function and call it instead of int in map:
matrix = [['1', '2', '3'], ['4', 'a', 'n']]
def int_with_default(value, default="NaN"):
    try:
        return int(value)
    except ValueError:
        return default
matrix = [list(map(int_with_default, arr)) for arr in matrix]
The output matrix will be [[1, 2, 3], [4, 'NaN', 'NaN']]. Note that you could also use math.nan instead of this arbitrary string I used as an example.
If you have only positive integers you can use the following listcomp:
m = [['1', '2', '3'], ['4', 'a', 'n']]
[list(map(lambda x: int(x) if x.isdigit() else None, row)) for row in m]
# [[1, 2, 3], [4, None, None]]
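Both answers above substitute a placeholder for bad cells. If the goal is instead to stop at the first non-integer and report a flag, as the question asks, a minimal sketch could look like the following. The function name to_int_matrix is made up, and con is the flag name borrowed from the question.

```python
def to_int_matrix(rows):
    """Convert every cell to int; stop at the first failure and report it via a flag."""
    try:
        return [[int(cell) for cell in row] for row in rows], True
    except ValueError:
        # int() raised on a non-numeric cell, so the conversion is abandoned.
        return None, False

matrix = [['1', '2', '3'], ['4', 'a', 'n']]
result, con = to_int_matrix(matrix)
print(result, con)
# None False

good, ok = to_int_matrix([['1', '2'], ['3', '4']])
print(good, ok)
# [[1, 2], [3, 4]] True
```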

How to reorganize sublists and exclude specific indexes in those sublists?

How can I reorganize sublists and exclude certain items from sublists to create a new list of sublists?
By reorganize I mean that I want to change the order of the items within each sublists across each sublist. For example moving every element at index 0 to index 1, and moving every element in index 2 to index 0 across every sublist. At the same time, I don't want to include index 1 in the original list of sublists.
Original_List = [['a','b','c'],['a','b','c'],['a','b','c']]
Desired_List = [['c','a'],['c','a'],['c','a']]
I currently have this function, which rearranges and pulls out different indexes from a sublist.
def Function(li):
    return map(lambda x: (x[2] + "|" + x[0]).split("|"),li)
However, there are situations in which the sublists are much longer and there are more indexes that I want to pull out.
Rather than making this same function for 3 or 4 indexes like this for example:
def Function(li):
    return map(lambda x: (x[2] + "|" + x[1] + "|" + x[0]).split("|"),li)
I'd like to use the *args, so that I can specify different amounts of indexes of the sublists to pull out. This is what I have so far, but I get a TypeError.
def Function(self,li,*args):
    return map(lambda x: ([int(arg) + "|" for arg in args]).split("|"))
I get a TypeError, which I can understand but can't get around:
TypeError: string indices must be integers, not str
Perhaps there is a better and faster method entirely to rearrange sublists and exclude certain items within those sublists?
Also, it would be amazing if the function could deal with sub-sub-lists like this.
Original_List = [['a','b','c',['1','2','3']],['a','b','c',['1','2','3']],['a','b','c',['1','2','3']]]
Inputs that I'd like to achieve this:
[2] for c
[0] for a
[3][1] for '2'
Desired_List = [['c','a','2'],['c','a','2'],['c','a','2']]
I think what you are describing is this:
def sublist_indices(lst, *args):
    return [[l[i] for i in args] for l in lst]
>>> sublist_indices([[1, 2, 3], [4, 5, 6]], 2, 0)
[[3, 1], [6, 4]]
If your sublists and sub-sublists contain all iterable items (e.g. strings, lists), you can use itertools.chain.from_iterable to flatten the sub-sublists, and then index in:
from itertools import chain
def sublists(lst, *args):
    return [[list(chain.from_iterable(l))[i] for i in args] for l in lst]
e.g.
>>> lst = [['a', 'b', 'c', ['1', '2', '3']],
...        ['a', 'b', 'c', ['1', '2', '3']],
...        ['a', 'b', 'c', ['1', '2', '3']]]
>>> sublists(lst, 2, 0, 4)
[['c', 'a', '2'], ['c', 'a', '2'], ['c', 'a', '2']]
original = [['a','b','c'],['a','b','c'],['a','b','c']]
desired = [['c','a'],['c','a'],['c','a']]
def filter_indices(xs, indices):
    return [[x[i] for i in indices if i < len(x)] for x in xs]
filter_indices(original, [2, 0])
# [['c', 'a'], ['c', 'a'], ['c', 'a']]
filter_indices(original, [2, 1, 0])
# [['c', 'b', 'a'], ['c', 'b', 'a'], ['c', 'b', 'a']]
I'm not sure what exactly you mean by "reorganize", but this nested list comprehension will take in a list of lists li and return a new list which contains the lists in li, but with the indices in args excluded.
def exclude_indices(li, *args):
    return [[subli[i] for i in range(len(subli)) if i not in args] for subli in li]
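For the sub-sublist case in the question, where the inputs are given as [2], [0] and [3][1], one possible approach is to treat each input as a path of indices and follow it with functools.reduce and operator.getitem. This is only a sketch: the name pick_paths and the tuple representation of a path are assumptions made here, not something from the question or the answers above.

```python
from functools import reduce
from operator import getitem

def pick_paths(rows, *paths):
    """For each row, follow every index path (a tuple of indices) and collect the results."""
    return [[reduce(getitem, path, row) for path in paths] for row in rows]

original = [['a', 'b', 'c', ['1', '2', '3']],
            ['a', 'b', 'c', ['1', '2', '3']],
            ['a', 'b', 'c', ['1', '2', '3']]]

# (2,) selects 'c', (0,) selects 'a', and (3, 1) selects '2' from the nested list.
print(pick_paths(original, (2,), (0,), (3, 1)))
# [['c', 'a', '2'], ['c', 'a', '2'], ['c', 'a', '2']]
```

Unlike flattening with chain.from_iterable, this leaves the nesting intact, so the indices match the ones written in the question.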
