I have the following data in a list; it is a hex number:
['aaaaa955554e']
I would like to split this into ['aaaaa9,55554e'], with a comma in the middle.
I know how to split a string when there are delimiters between the parts, but how should I do it in this case?
Thanks
This will do what I think you are looking for:
yourlist = ['aaaaa955554e']
new_list = [','.join([x[i:i+6] for i in range(0, len(x), 6)]) for x in yourlist]
It will put a comma after every sixth character in each item in your list. (I am assuming you will have more than one item in the list, and that the items are of unknown length. Not that it matters.)
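To see how the slicing behaves on items of different lengths, here is a minimal sketch that wraps the same comprehension in a function. The name insert_commas, the width parameter, and the extra sample items are assumptions made for illustration, not part of the question.

```python
def insert_commas(items, width=6):
    """Join fixed-width chunks of each string with commas."""
    return [
        ','.join(s[i:i + width] for i in range(0, len(s), width))
        for s in items
    ]

print(insert_commas(['aaaaa955554e', 'abc', '0123456789']))
# ['aaaaa9,55554e', 'abc', '012345,6789']
```

A last chunk shorter than the width is kept as it is, and a string shorter than the width is returned unchanged.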
I assume you want to split after every 6th character.
Using regex:
import re
lst = ['aaaaa955554e']
newlst = re.findall(r'\w{6}', lst[0])
# ['aaaaa9', '55554e']
Using a list comprehension, which also works for multiple items in lst:
lst = ['aaaaa955554e']
newlst = [item[i:i+6] for item in lst for i in range(0, len(item), 6)]
# ['aaaaa9', '55554e']
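One difference between the two approaches above is worth checking: `\w{6}` only matches complete groups of six, so a shorter tail is dropped, while slicing keeps it. The sketch below uses a made-up 14-character input to show this, and `\w{1,6}` as one possible way to keep the tail.

```python
import re

s = 'aaaaa955554e12'  # hypothetical input, 14 characters

# Exactly six word characters per match: the trailing '12' is lost.
print(re.findall(r'\w{6}', s))
# ['aaaaa9', '55554e']

# One to six word characters per match: the tail is kept.
print(re.findall(r'\w{1,6}', s))
# ['aaaaa9', '55554e', '12']
```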
This could be done using a regular expression substitution as follows:
import re
print(re.sub(r'([a-zA-Z]+\d)(.*?)', r'\1,\2', 'aaaaa955554e', count=1))
Giving you:
aaaaa9,55554e
This inserts the comma after the first digit that follows a run of letters.
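If the split should happen at a fixed width rather than at the first digit, a substitution can do that too. This is a sketch under that assumption; the helper name comma_every and its width parameter are hypothetical.

```python
import re

def comma_every(s, width=6):
    # The lookahead (?=.) requires at least one more character after the
    # chunk, so no trailing comma is added after the final chunk.
    return re.sub(r'(.{%d})(?=.)' % width, r'\1,', s)

print(comma_every('aaaaa955554e'))
# aaaaa9,55554e
```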
How to check if a nested list contains a substring?
strings = [[],["one", "two", "three"]]
substring = "wo"
strings_with_substring = [string for string in strings if substring in string]
print(strings_with_substring)
This script just prints:
[]
How can I fix it? The output should be:
two
==
Sayse, the solution you provided doesn't work for me. I am new to Python, so I am sure I am missing something here. Any thoughts?
import re
s = [[],["one", "two", "three"]]
substring = "wo"
# strings_with_substring = [string for string in strings if substring in string]
strings_with_substring = next(s for sl in strings for s in sl if substring in s)
print(strings_with_substring)
You are missing another level of iteration. Here is the looping logic without using a comprehension:
for sublist in strings:
    for item in sublist:
        if substring in item:
            print(item)
Roll that up to a comprehension:
[item for sublist in strings for item in sublist if substring in item]
You're looking for
next(s for sl in strings for s in sl if substring in s)
This outputs "two". If you want a list of all matching elements, use your list comprehension with the same two for clauses instead of next; likewise, change next to any if you just want a boolean result.
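The three variants mentioned here can be put side by side as follows. This is a sketch using the data from the question; the None default passed to next is an addition, so that an empty result does not raise StopIteration.

```python
strings = [[], ["one", "two", "three"]]
substring = "wo"

# All matches, as a list.
matches = [s for sl in strings for s in sl if substring in s]
print(matches)   # ['two']

# First match only; None is returned when nothing matches.
first = next((s for sl in strings for s in sl if substring in s), None)
print(first)     # two

# Just a yes/no answer.
found = any(substring in s for sl in strings for s in sl)
print(found)     # True
```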
Since you said it should just print the string, you could use itertools to flatten your list and loop over a filter applied to the result.
from itertools import chain
strings = [[], ['one', 'two', 'three']]
substring = 'wo'
for found in filter(lambda s: substring in s, chain.from_iterable(strings)):
    print(found)
I'm pretty new to Python.
I have a list like this:
['SACOL1123', "('SA1123', 'AAW38003.1')"]
['SACOL1124', "('SA1124', 'AAW38004.1')"]
And I want to remove the extra double quotes and parentheses, so it looks like this:
['SACOL1123', 'SA1123', 'AAW38003.1']
['SACOL1124', 'SA1124', 'AAW38004.1']
This is what I managed to do:
newList = [s.replace('"(', '') for s in list]
newList = [s.replace(')"', '') for s in newList]
But the output is exactly like the input list. How can I do it?
This is possible using ast.literal_eval. The second element of each list is the string representation of a valid Python tuple, which you can safely evaluate.
[[x[0]] + list(ast.literal_eval(x[1])) for x in lst]
Code:
import ast
lst = [['SACOL1123', "('SA1123', 'AAW38003.1')"],
       ['SACOL1124', "('SA1124', 'AAW38004.1')"]]
output = [[x[0]] + list(ast.literal_eval(x[1])) for x in lst]
# [['SACOL1123', 'SA1123', 'AAW38003.1'],
# ['SACOL1124', 'SA1124', 'AAW38004.1']]
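ast.literal_eval raises ValueError or SyntaxError when the string is not a valid Python literal, so real data may need a fallback. The following is a minimal sketch of one way to handle that; the function name flatten_row and the malformed second row are assumptions for illustration.

```python
import ast

def flatten_row(row):
    """Return [key, *tuple_items]; leave the row unchanged if parsing fails."""
    key, packed = row
    try:
        values = ast.literal_eval(packed)
    except (ValueError, SyntaxError):
        return list(row)
    if not isinstance(values, tuple):
        return list(row)
    return [key, *values]

rows = [
    ['SACOL1123', "('SA1123', 'AAW38003.1')"],
    ['SACOL1125', "not a tuple"],   # hypothetical malformed row
]
print([flatten_row(r) for r in rows])
# [['SACOL1123', 'SA1123', 'AAW38003.1'], ['SACOL1125', 'not a tuple']]
```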
This can be done by converting each item in the list to a string and then replacing the punctuation with an empty string. Hope this helps:
import re
List = [['SACOL1123', "('SA1123', 'AAW38003.1')"],
        ['SACOL1124', "('SA1124', 'AAW38004.1')"]]
New_List = []
for Item in List:
    # Strip brackets, quotes and commas, then split on whitespace.
    New_List.append(re.sub(r'[()"\'\[\],]', '', str(Item)).split())
New_List
Output: [['SACOL1123', 'SA1123', 'AAW38003.1'],
['SACOL1124', 'SA1124', 'AAW38004.1']]
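Because the approach above splits on whitespace after stripping punctuation, it would also split any value that contains a space. As an alternative sketch using only string methods (the helper name unpack is hypothetical, and it assumes the values themselves contain no commas):

```python
rows = [
    ['SACOL1123', "('SA1123', 'AAW38003.1')"],
    ['SACOL1124', "('SA1124', 'AAW38004.1')"],
]

def unpack(packed):
    # "('SA1123', 'AAW38003.1')" -> ['SA1123', 'AAW38003.1']
    inner = packed.strip('()')
    return [part.strip(" '") for part in inner.split(',')]

flat = [[key] + unpack(packed) for key, packed in rows]
print(flat)
# [['SACOL1123', 'SA1123', 'AAW38003.1'], ['SACOL1124', 'SA1124', 'AAW38004.1']]
```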
I have these strings, for example:
['2300LO/LCE','2302KO/KCE']
I want to have output like this:
['2300LO','2300LCE','2302KO','2302KCE']
How can I do it with Regex in Python?
Thanks!
You can make a simple generator that yields the pairs for each string. Then you can flatten them into a single list with itertools.chain()
import re
from itertools import product, chain
def getCombos(s):
    # Split into the leading digits and the slash-separated codes.
    nums, code = re.match(r'(\d+)(.*)', s).groups()
    for pair in product([nums], code.split("/")):
        yield ''.join(pair)
a = ['2300LO/LCE','2302KO/KCE']
list(chain.from_iterable(map(getCombos, a)))
# ['2300LO', '2300LCE', '2302KO', '2302KCE']
This has the added side benefit of working with strings like '2300LO/LCE/XX/CC' which will give you ['2300LO', '2300LCE', '2300XX', '2300CC',...]
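The claim about longer inputs can be checked with a self-contained variant of the generator. In this sketch the name expand_codes and the sample data are assumptions, and the single-item product is replaced by plain concatenation.

```python
import re
from itertools import chain

def expand_codes(s):
    """Yield the numeric prefix joined with each '/'-separated suffix."""
    prefix, suffixes = re.match(r'(\d+)(.*)', s).groups()
    for suffix in suffixes.split('/'):
        yield prefix + suffix

data = ['2300LO/LCE/XX/CC', '2302KO']
print(list(chain.from_iterable(map(expand_codes, data))))
# ['2300LO', '2300LCE', '2300XX', '2300CC', '2302KO']
```

A string with no slash at all simply yields itself, since split returns a single suffix.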
You can try something like this:
import re
list1 = ['2300LO/LCE','2302KO/KCE']
list2 = []
for x in list1:
    a = x.split('/')
    tmp = re.findall(r'\d+', a[0]) # extract the leading digits
    list2.append(a[0])
    list2.append(tmp[0] + a[1])
print(list2)
This can be implemented with simple string splits.
But since you asked for a regex solution, here is one.
list1 = ['2300LO/LCE','2302KO/KCE']
import re
r = re.compile(r"([0-9]{1,4})([a-zA-Z].*)/([a-zA-Z].*)")
out = []
for s in list1:
    items = r.findall(s)[0]
    out.append(items[0]+items[1])
    out.append(items[0]+items[2])  # prepend the number to the second code as well
print(out)
Explanation of the regex: (a number of 1 to 4 digits), followed by (a letter and any further characters), followed by a / and (a letter and the rest of the characters).
The parts are grouped with (), so findall returns them as separate elements of a tuple.
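For comparison, here is a sketch of the plain string-splitting route mentioned at the start of this answer. The helper name expand is hypothetical, and it assumes the numeric prefix is whatever run of digits starts the first part.

```python
def expand(s):
    head, *rest = s.split('/')
    # Everything before the first non-digit character is the numeric prefix.
    digits = head[:len(head) - len(head.lstrip('0123456789'))]
    return [head] + [digits + code for code in rest]

out = []
for s in ['2300LO/LCE', '2302KO/KCE']:
    out.extend(expand(s))
print(out)
# ['2300LO', '2300LCE', '2302KO', '2302KCE']
```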
I want to check a string to see if it contains any of the words I have in my list.
The list has somewhere around 100 individual words.
I have tried using regex but can't get it to work...
string = '<div class="header_links">$$ - $$$, Dansk, Veganske retter, Glutenfri retter</div>'
list = ['Café','Afrikansk','............','Sushi','Svensk','Sydamerikansk','Syditaliensk','Szechuan','Taiwansk','Thai','Tibetansk','Østeuropæisk','Dansk']
In this case the string has 'Dansk' in it. The string could contain more than one of the words in the list.
I want to write a piece of code that prints the words in the list which are also in the string.
In this case the output should be: Dansk
If there was more than one word in the string it should be: Dansk, ...., ....
I hope someone can help.
>>> list = ['Café','Afrikansk','............','Sushi','Svensk','Sydamerikansk','Syditaliensk','Szechuan','Taiwansk','Thai','Tibetansk','Østeuropæisk','Dansk']
>>> string = """<div class="header_links">$$ - $$$, Dansk, Veganske retter, Glutenfri retter</div>"""
>>> [x for x in list if x in string]
['Dansk']
I recommend not using list as a variable name, as it usually refers to the built-in type list (like str or int).
Use a list comprehension with a membership check:
[x for x in lst if x in string]
Note that I have renamed your list to lst, as list is the name of a built-in.
Example:
string = '<div class="header_links">$$ - $$$, Dansk, Veganske retter, Glutenfri retter</div>'
lst = ['Café','Afrikansk','Sushi','Svensk','Sydamerikansk','Syditaliensk','Szechuan','Taiwansk','Thai','Tibetansk','Østeuropæisk','Dansk']
print([x for x in lst if x in string])
# ['Dansk']
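Note that `x in string` is a substring test, so a short word also matches inside a longer one. If whole-word matches are wanted, one option is a regex built from the list with word boundaries. This is a sketch; the short word list and the input text below are made up for illustration.

```python
import re

words = ['Thai', 'Dansk', 'Sushi']    # hypothetical short word list
text = 'Dansk, Thailandsk, Sushi'     # hypothetical input

# Plain substring test: 'Thai' matches inside 'Thailandsk'.
print([w for w in words if w in text])
# ['Thai', 'Dansk', 'Sushi']

# Whole-word test: \b requires a word boundary on both sides.
# re.escape protects any regex metacharacters in the words.
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, words)) + r')\b')
print(pattern.findall(text))
# ['Dansk', 'Sushi']
```

The regex version returns matches in the order they appear in the text, not in list order.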
In your case you can also use a set intersection (my_list is your list of words):
string_intersection = set(string.replace(',', '').split()).intersection(my_list)
print(*string_intersection, sep =',')
output:
Dansk
I would like to extract the first column of a given list.
I have already tried to use Python's "replace" to remove the second column.
But the problem is that the replace function does not seem to work at all.
Here is the regular expression I used: replace(r'^ //.*$','')
Here is the list:
//SA/... //short_message/Saint/...
//SS-SA/... //long_message/wonder-girl/...
Here is the output I am expecting:
//SA/...
//SS-SA/...
l = ["1 12","3 12","2 12"] # space separated
n = [x.split()[0] for x in l]
print(sorted(n))
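Applied to the data from the question, the same split idea looks like the sketch below. It assumes the two fields are separated by whitespace and that the input is a list of strings named lines (a name chosen here). A regex variant is included as well: str.replace only replaces literal text, so a pattern needs re.sub, and the `^` in the original pattern anchors it to the start of the line, where there is no space to match.

```python
import re

lines = [
    "//SA/... //short_message/Saint/...",
    "//SS-SA/... //long_message/wonder-girl/...",
]

# Option 1: keep the first whitespace-separated field.
print([line.split()[0] for line in lines])
# ['//SA/...', '//SS-SA/...']

# Option 2: delete everything from the whitespace before the second field on.
print([re.sub(r'\s+//.*$', '', line) for line in lines])
# ['//SA/...', '//SS-SA/...']
```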