Using list comprehension, create a list of all the letters used in x.
x = 'December 11, 2018'
I tried writing each letter out but I am receiving a syntax error!
A Python string is iterable, so the quickest way to get its distinct characters is to convert it into a set (which keeps only unique values but does not preserve order) and then back to a list:
unique_x = list(set(x))
Or if you must use list comprehension:
used = set()
all_x = "December 11, 2018"
unique_x = [x for x in all_x if x not in used and (used.add(x) or True)]
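If the original order matters and only alphabetic characters should count as letters, one possible alternative is sketched below. It assumes Python 3.7+, where dicts preserve insertion order; the name unique_letters is only illustrative.

```python
x = "December 11, 2018"

# dict.fromkeys keeps the first occurrence of each key, in insertion order
# (guaranteed from Python 3.7 on); str.isalpha drops digits, spaces and commas.
unique_letters = list(dict.fromkeys(ch for ch in x if ch.isalpha()))

print(unique_letters)  # ['D', 'e', 'c', 'm', 'b', 'r']
```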
x = "December 11, 2018"
lst = [letter for letter in x]
print(lst) # test
Output:
['D', 'e', 'c', 'e', 'm', 'b', 'e', 'r', ' ', '1', '1', ',', ' ', '2', '0', '1', '8']
You can make a list comprehension like:
x = 'December 11, 2018'
new_list = [letter for letter in x]
print(new_list)
# Output
# ['D', 'e', 'c', 'e', 'm', 'b', 'e', 'r', ' ', '1', '1', ',', ' ', '2', '0', '1', '8']
Alternatively, you could skip the list comprehension and just use new_list = list(x) to get the same result.
If you want only the characters and no spaces, you can call .replace on x, as in x.replace(' ', ''), or add an if clause to your list comprehension:
new_list = [letter for letter in x if letter != ' ']
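Filtering on a space still keeps the digits and the comma. As a sketch, assuming "letters" means alphabetic characters only, str.isalpha can serve as the condition instead; letters_only is an illustrative name.

```python
x = "December 11, 2018"

# Keep a character only if it is alphabetic; duplicates are preserved.
letters_only = [letter for letter in x if letter.isalpha()]

print(letters_only)  # ['D', 'e', 'c', 'e', 'm', 'b', 'e', 'r']
```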
This should work
x = list('December 11, 2018')
print(x)
result = []
for item in x:
    try:
        int(item)
    except ValueError:
        if item == "," or item == " ":
            pass
        else:
            result.append(item)
print(result)
"""
Output:
['D', 'e', 'c', 'e', 'm', 'b', 'e', 'r']
"""
If you are using only dates with that format, you could do this
x = "December 11, 2018".split()
print(x[0])
"""
Output:
December
"""
Related
Can someone help me split this list into a list of lists?
For example, given this input:
['Na', '2', ' ', 'C', ' ', 'O', '3']
I want this output:
[['Na', '2'], ['C'], ['O','3']]
You can use itertools.groupby() to generate the desired sublists:
from itertools import groupby
data = ['Na', '2', ' ', 'C', ' ', 'O', '3']
[list(group) for key, group in groupby(data, key=lambda x: x == ' ') if not key]
This outputs:
[['Na', '2'], ['C'], ['O', '3']]
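The groupby expression above can be wrapped in a small function. This is only a sketch: split_on is a hypothetical helper name, and it assumes that consecutive separators should not produce empty sublists.

```python
from itertools import groupby


def split_on(items, sep=' '):
    """Split items into sublists at each run of sep (hypothetical helper)."""
    # groupby yields (key, group) pairs for consecutive items sharing a key;
    # here the key is True for separators and False for everything else.
    return [
        list(group)
        for is_sep, group in groupby(items, key=lambda item: item == sep)
        if not is_sep
    ]


data = ['Na', '2', ' ', 'C', ' ', 'O', '3']
print(split_on(data))  # [['Na', '2'], ['C'], ['O', '3']]
```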
lst = ['Na', '2', ' ', 'C', ' ', 'O', '3']
lst_of_lsts = []
sublist = []
for item in lst:
    if item != " ":
        sublist.append(item)
    else:
        lst_of_lsts.append(sublist)
        sublist = []
if sublist != []:
    lst_of_lsts.append(sublist)
For example, from the 'tokens' list below, I want to extract the pair_list:
tokens = ['0', '#', 'a', 'b', '#', '#', 'c', '#', '#', 'g', 'h', 'g', '#']
pair_list = [['a', 'b'], ['c'], ['g', 'h', 'g']]
I tried something like the code below, but I haven't succeeded:
hashToken_begin_found = True
hashToken_end_found = False
previous_token = None
pair_list = []
for token in tokens:
    if hashToken_begin_found and not hashToken_end_found and previous_token and previous_token == '#':
        hashToken_begin_found = False
    elif not hashToken_begin_found:
        if token == '#':
            hashToken_begin_found = True
            hashToken_end_found = True
        else:
            ...
ADDITION:
My actual problem is more complicated. What's inside each pair of # symbols are words from social media, like hashed phrases on Twitter, but they are not English. I simplified the problem for illustration. The logic would be something like what I wrote: find the 'start' and 'end' of each # pair and extract what is between them. In my data, anything inside a pair of hash signs is a phrase, e.g. I live in #United States# and #New York#!. I need to get United States and New York. No regex. These words are already in a list.
I think you're overcomplicating the issue here. Think of the parser as a very simple state machine. You're either in a sublist or not. Every time you hit a hash, you toggle the state.
When entering a sublist, make a new list. When inside a sublist, append to the current list. That's about it. Here's a sample:
pair_list = []
in_pair = False
for token in tokens:
    if in_pair:
        if token == '#':
            in_pair = False
        else:
            pair_list[-1].append(token)
    elif token == '#':
        pair_list.append([])
        in_pair = True
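The same state machine can be packaged as a function and tried on the word-level example from the question. This is a minimal sketch: extract_pairs and its marker parameter are hypothetical names, and an unclosed final pair is simply kept as a partial sublist.

```python
def extract_pairs(tokens, marker='#'):
    """Collect the tokens found between each opening/closing marker pair."""
    pairs = []
    in_pair = False
    for token in tokens:
        if token == marker:
            if not in_pair:
                pairs.append([])   # entering a pair: start a new sublist
            in_pair = not in_pair  # every marker toggles the state
        elif in_pair:
            pairs[-1].append(token)
    return pairs


tokens = ['0', '#', 'a', 'b', '#', '#', 'c', '#', '#', 'g', 'h', 'g', '#']
print(extract_pairs(tokens))  # [['a', 'b'], ['c'], ['g', 'h', 'g']]

words = ['I', 'live', 'in', '#', 'United', 'States', '#',
         'and', '#', 'New', 'York', '#', '!']
print(extract_pairs(words))   # [['United', 'States'], ['New', 'York']]
```

Tokens outside a pair (such as 'and' above) are ignored, which is what distinguishes this from approaches that group purely on whether a token is alphabetic.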
You could try itertools.groupby in one single line:
from itertools import groupby
tokens = ['0', '#', 'a', 'b', '#', '#', 'c', '#', '#', 'g', 'h', 'g', '#']
print([list(y) for x, y in groupby(tokens, key=lambda x: x.isalpha()) if x])
Output:
[['a', 'b'], ['c'], ['g', 'h', 'g']]
I group by the consecutive groups where the value is alphabetic.
If you want to use a for loop you could try:
l = [[]]
for i in tokens:
    if i.isalpha():
        l[-1].append(i)
    else:
        if l[-1]:
            l.append([])
print(l[:-1])
Output:
[['a', 'b'], ['c'], ['g', 'h', 'g']]
Another way:
it = iter(tokens)
pair_list = []
while '#' in it:
    pair_list.append(list(iter(it.__next__, '#')))
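The iterator version relies on two behaviours that are easy to miss: a membership test on an iterator consumes it up to the match, and the two-argument form iter(callable, sentinel) keeps calling the callable until it returns the sentinel. A sketch that spells out the first step with an explicit loop (same variable names as above) might look like this:

```python
tokens = ['0', '#', 'a', 'b', '#', '#', 'c', '#', '#', 'g', 'h', 'g', '#']

it = iter(tokens)
pair_list = []
for token in it:          # advance the shared iterator to the next opening '#'
    if token == '#':
        # iter(callable, sentinel) calls it.__next__ repeatedly and stops
        # as soon as the closing '#' is returned (the '#' itself is dropped).
        pair_list.append(list(iter(it.__next__, '#')))

print(pair_list)  # [['a', 'b'], ['c'], ['g', 'h', 'g']]
```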
Yet another:
pair_list = []
try:
    i = 0
    while True:
        i = tokens.index('#', i)
        j = tokens.index('#', i + 1)
        pair_list.append(tokens[i+1 : j])
        i = j + 1
except ValueError:
    pass
I need to append some repeated values from a list into a sublist, let me explain with an example:
I have a variable called array that contains strings of uppercase letters and $ symbols.
array = ['F', '$', '$', '$', 'D', '$', 'C']
My end goal is to have this array:
final_array = ['F', ['$', '$', '$'], 'D', ['$'], 'C']
As in the example, I need to group all adjacent $ symbols into a sublist within the original array. I thought about iterating over the array, finding all symbols next to the current $, and then building a second array, but maybe there is something more Pythonic I can do. Any ideas?
You can use groupby from itertools
array = ['F', '$', '$', '$', 'D', '$', 'C']
from itertools import groupby
result = []
for key, group in groupby(array):
    if key == '$':
        result.append(list(group))
    else:
        result.append(key)
print(result)
You can of course shorten the for-loop to a comprehension:
result = [list(group) if key == '$' else key for key, group in groupby(array)]
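To see why this works, it can help to look at what groupby yields for the sample input. The sketch below only materialises the (key, group) pairs; runs is an illustrative name.

```python
from itertools import groupby

array = ['F', '$', '$', '$', 'D', '$', 'C']

# groupby starts a new group every time the value changes,
# so each run of consecutive equal items becomes one (key, group) pair.
runs = [(key, list(group)) for key, group in groupby(array)]

print(runs)
# [('F', ['F']), ('$', ['$', '$', '$']), ('D', ['D']), ('$', ['$']), ('C', ['C'])]
```

Appending only the key for non-'$' runs collapses repeated letters such as 'F', 'F' into a single 'F', so this shortcut is safe only when the other values never repeat consecutively.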
A general approach, that would work in every case (not just '$'):
array = ['F', '$', '$', '$', 'D', '$', 'C']
different_values = []
final_array = []
aux_array = []
old_value = None
for value in array:
    if value not in different_values:
        different_values.append(value)
        final_array.append(value)
        aux_array = []
    else:
        if value == old_value:
            aux_array = list(final_array[-1])
            del final_array[-1]
            aux_array.append(value)
            final_array.append(aux_array)
        else:
            aux_array = [value]
            final_array.append(aux_array)
    old_value = value
print(final_array)
I made a small change to @rdas's answer, in case the array contains repeated values that we want to keep:
from itertools import groupby
array = ['F', 'F', '$', '$', '$', 'D', '$', 'C']
result = []
for key, group in groupby(array):
    if key == '$':
        result.append(list(group))
    else:
        # Keep every repeated non-'$' value instead of only the key
        for g in list(group):
            result.append(g)
print(result)
# ['F', 'F', ['$', '$', '$'], 'D', ['$'], 'C']
Is it possible to merge the numbers in a list of chars?
I have a list with some characters:
my_list = ['a', 'f', '£', '3', '2', 'L', 'k', '3']
I want to concatenate the adjacent numbers as follows:
my_list = ['a', 'f', '£', '32', 'L', 'k', '3']
I have this, and it works fine, but I don't really like how it came out.
def number_concat(my_list):
    new_list = []
    number = ""
    for ch in my_list:
        if not ch.isnumeric():
            if number != "":
                new_list.append(number)
                number = ""
            new_list.append(ch)
        else:
            number = ''.join([number, ch])
    if number != "":
        new_list.append(number)
    return new_list
What's the best way to do this?
You can use itertools.groupby:
from itertools import groupby
my_list = ['a', 'f', '£', '3', '2', 'L', 'k', '3']
out = []
for _, g in groupby(enumerate(my_list, 2), lambda k: True if k[1].isdigit() else k[0]):
    out.append(''.join(val for _, val in g))
print(out)
Prints:
['a', 'f', '£', '32', 'L', 'k', '3']
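The key function in this answer is compact, so here is a sketch that prints the keys it produces. Every digit gets the same key (True), while every non-digit gets its own unique index, so only digits can form runs longer than one. The enumeration presumably starts at 2 because True == 1 in Python, so an index of 1 would compare equal to the digit key.

```python
my_list = ['a', 'f', '£', '3', '2', 'L', 'k', '3']

# Same key logic as the groupby call above, evaluated eagerly for inspection.
keys = [True if ch.isdigit() else i for i, ch in enumerate(my_list, 2)]

print(keys)  # [2, 3, 4, True, True, 7, 8, True]
```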
You can use a variable to track the index position in the list and compare each element with the one before it. If both are digits, concatenate them by popping the current index and adding it to the previous element. After a pop, all later elements shift left by one, so the index stays where it is and the character that moved into that position is checked next. If the character is not a digit, move the index on to the next character.
# coding: latin-1
my_list = ['a', 'f', '£', '3', '2', 'L', 'k', '3']
index = 1
while index < len(my_list):
    if my_list[index].isdigit() and my_list[index - 1].isdigit():
        my_list[index - 1] += my_list.pop(index)
    else:
        index += 1
print(my_list)
OUTPUT
['a', 'f', '£', '32', 'L', 'k', '3']
Regex:
>>> import re
>>> re.findall(r'\d+|.', ''.join(my_list))
['a', 'f', '£', '32', 'L', 'k', '3']
itertools:
>>> [x for d, g in groupby(my_list, str.isdigit) for x in ([''.join(g)] if d else g)]
['a', 'f', '£', '32', 'L', 'k', '3']
Another:
>>> [''.join(g) for _, g in groupby(my_list, lambda c: c.isdigit() or float('nan'))]
['a', 'f', '£', '32', 'L', 'k', '3']
You are just trying to reduce your numbers together.
One way to accomplish this is to loop through the list, and check if it's a number using str.isnumeric().
my_list = ['a', 'f', '£', '3', '2', 'L', 'k', '3']
new_list = ['']
for c in my_list:
    if c.isnumeric() and new_list[-1].isnumeric(): # Check if current and previous character is a number
        new_list[-1] += c # Mash characters together.
    else:
        new_list.append(c)
else:
    new_list[:] = new_list[1:] # Remove '' placeholder to avoid new_list[-1] IndexError
print(new_list) # ['a', 'f', '£', '32', 'L', 'k', '3']
This has also been tested with a list whose first character is numeric.
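Sure! This will combine all consecutive digits:
i = 0
while i < len(my_list):
    if my_list[i].isdigit():
        # After each pop the next element shifts into position i+1,
        # so keep checking that same position until it is not a digit.
        while i + 1 < len(my_list) and my_list[i + 1].isdigit():
            my_list[i] += my_list.pop(i + 1)
    i += 1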
You can also do this recursively, which is arguably more elegant (it is easier to extend correctly as the task becomes more complicated) but possibly more confusing:
def group_digits(items, accumulator=None):
    # Base case: nothing left to process
    if items == []:
        return accumulator or []
    # First call: seed the accumulator with the first item (works on a copy)
    if not accumulator:
        return group_digits(items[1:], items[:1])
    x = items.pop(0)
    if accumulator[-1].isdigit() and x.isdigit():
        accumulator[-1] += x
    else:
        accumulator.append(x)
    return group_digits(items, accumulator)
A quick and dirty way under the assumption that the non-numeric characters are not white-space:
''.join(c if c.isdigit() else ' '+ c + ' ' for c in my_list).split()
The idea is to pad with spaces the characters that you don't want merged, smush the resulting characters together so that the non-padded ones become adjacent, and then split the result on white-space, the net result leaving the padded characters unchanged and the non-padded characters joined.
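To make the padding idea concrete, the sketch below keeps the intermediate string in a variable (padded and merged are illustrative names) so it can be inspected before splitting. It carries the same assumption as the one-liner: no element is itself white-space.

```python
my_list = ['a', 'f', '£', '3', '2', 'L', 'k', '3']

# Non-digits are wrapped in spaces; digits are left bare so neighbours touch.
padded = ''.join(c if c.isdigit() else ' ' + c + ' ' for c in my_list)
print(repr(padded))

# Splitting on white-space leaves padded characters alone and digit runs joined.
merged = padded.split()
print(merged)  # ['a', 'f', '£', '32', 'L', 'k', '3']
```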
I have written a beginner-friendly solution using an index and two lists:
my_list = ['a', 'f', '£', '3', '2', 'L', 'k', '3']
result = []
for index, item in enumerate(my_list):
    if item.isdigit() and index > 0 and my_list[index - 1].isdigit():
        # Current item and previous item are both numbers:
        # extend the number already stored at the end of result
        result[-1] += item
    else:
        result.append(item)
print(my_list)
print(result)
Output
['a', 'f', '£', '3', '2', 'L', 'k', '3']
['a', 'f', '£', '32', 'L', 'k', '3']
I have a problem trying to transform a list.
The original list is like this:
[['a','b','c',''],['c','e','f'],['c','g','h']]
Now I want to make the output like this:
[['a','b','c','e','f'],['a','b','c','g','h']]
When a blank ( '' ) is found, merge the three lists into two lists.
I need to write a function to do this for me.
Here is what I tried:
for x in mylist:
    if x[len(x) - 1] == '':
        m = x[len(x) - 2]
        for y in mylist:
            if y[0] == m:
                combine(x, y)
def combine(x, y):
    for m in y:
        if not m in x:
            x.append(m)
    return(x)
but it's not working the way I want.
Try this:
mylist = [['a','b','c',''],['c','e','f'],['c','g','h']]
def combine(x, y):
    for m in y:
        if not m in x:
            x.append(m)
    return(x)
result = []
for x in mylist:
    if x[len(x) - 1] == '':
        m = x[len(x) - 2]
        for y in mylist:
            if y[0] == m:
                result.append(combine(x[0:len(x)-2], y))
print(result)
Your problem was the call to combine: pass a slice of x without its last two elements (rather than x itself, which still contains '') and collect the returned list:
combine(x[0:len(x)-2], y)
output :
[['a', 'b', 'c', 'e', 'f'], ['a', 'b', 'c', 'g', 'h']]
So you basically want to merge two lists? If so, there are two ways: either use the + operator or use the extend() method.
And then you put it into a function.
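As a rough sketch of the difference between the two options (merge_lists is a hypothetical helper name): + builds a new list and leaves both operands alone, while extend() modifies the list it is called on.

```python
def merge_lists(first, second):
    """Return a new list; neither argument is modified."""
    return first + second


a = ['a', 'b', 'c']
b = ['e', 'f']

merged = merge_lists(a, b)
print(merged)  # ['a', 'b', 'c', 'e', 'f']
print(a)       # ['a', 'b', 'c'] (unchanged)

in_place = list(a)   # copy first, because extend() mutates its receiver
in_place.extend(b)
print(in_place)      # ['a', 'b', 'c', 'e', 'f']
```

Note that neither option handles the '' placeholder or the matching on the first element that the question asks for; those still need separate logic.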
I did it using only the standard library, with comments. Please take a look.
mylist = [['a','b','c',''],['c','e','f'],['c','g','h']]
# I can't be sure whether there is only one list ending in '',
# so I collect all of them.
# Note that x[-1] gives the last value of a list.
xlist = [x for x in mylist if x[-1] == '']
ylist = [x for x in mylist if x[-1] != '']
result = []
# combine every x with every y
for x in xlist:
    for y in ylist:
        c = x + y                # merge
        c = [i for i in c if i]  # drop ''
        c = list(set(c))         # drop duplicates
        c.sort()                 # sort
        result.append(c)         # add to result
print(result)
The result is
[['a', 'b', 'c', 'e', 'f'], ['a', 'b', 'c', 'g', 'h']]
Your code almost works, except you never do anything with the result of combine (print it, or add it to some result list), and you do not remove the '' element. However, for a longer list, this might be a bit slow, as it has quadratic complexity O(n²).
Instead, you can use a dictionary to map first elements to the remaining elements of the lists. Then you can use a loop or list comprehension to combine the lists with the right suffixes:
lst = [['a','b','c',''],['c','e','f'],['c','g','h']]
import collections
replacements = collections.defaultdict(list)
for first, *rest in lst:
    replacements[first].append(rest)
# l[:-1] drops only the '' placeholder; rest already excludes the shared first element
result = [l[:-1] + c for l in lst if l[-1] == "" for c in replacements[l[-2]]]
# [['a', 'b', 'c', 'e', 'f'], ['a', 'b', 'c', 'g', 'h']]
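To see what the dictionary approach builds, here is a sketch that prints the mapping for the sample input. Each key is the first element of a list and each value holds the remaining elements of every list that starts with that key.

```python
import collections

lst = [['a', 'b', 'c', ''], ['c', 'e', 'f'], ['c', 'g', 'h']]

replacements = collections.defaultdict(list)
for first, *rest in lst:
    # Star-unpacking splits each list into its head and its tail.
    replacements[first].append(rest)

print(dict(replacements))
# {'a': [['b', 'c', '']], 'c': [['e', 'f'], ['g', 'h']]}
```

Because the lookup by first element is a dictionary access, each placeholder is resolved without scanning the whole input again.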
If the list can have more than one placeholder '', and if those can appear in the middle of the list, then things get a bit more complicated. You could make this a recursive function. (This could be made more efficient by using an index instead of repeatedly slicing the list.)
def replace(lst, last=None):
    if lst:
        first, *rest = lst
        if first == "":
            for repl in replacements[last]:
                yield from replace(repl + rest)
        else:
            for res in replace(rest, first):
                yield [first] + res
    else:
        yield []
for l in lst:
    for x in replace(l):
        print(x)
Output for lst = [['a','b','c','','b',''],['c','b','','e','f'],['c','g','b',''],['b','x','y']]:
['a', 'b', 'c', 'b', 'x', 'y', 'e', 'f', 'b', 'x', 'y']
['a', 'b', 'c', 'g', 'b', 'x', 'y', 'b', 'x', 'y']
['c', 'b', 'x', 'y', 'e', 'f']
['c', 'g', 'b', 'x', 'y']
['b', 'x', 'y']
Try my solution.
It changes the order of the elements (sets are unordered), but the code is quite simple:
lst = [['a', 'b', 'c', ''], ['c', 'e', 'f'], ['c', 'g', 'h']]
lst[0].pop(-1)
print([list(set(lst[0]+lst[1])), list(set(lst[0]+lst[2]))])