Adding letters to each element of a list - python

I am trying to add "G:" at the beginning of each element in a list and a backslash before every dot. I created this example, list1:
list1 = ['AEX.EN', 'AXAL.OQ', 'AAPIOE.NW']
And I need something like list2:
list2 = ['G:AEX\.EN', 'G:AXAL\.OQ', 'G:AAPIOE\.NW']
Thank you very much for the help!

Use:
>>> ['G:' + i.replace('.', '\\.') for i in list1]
['G:AEX\\.EN', 'G:AXAL\\.OQ', 'G:AAPIOE\\.NW']
>>>
In this case I prefer re.escape:
>>> import re
>>> ['G:' + re.escape(i) for i in list1]
['G:AEX\\.EN', 'G:AXAL\\.OQ', 'G:AAPIOE\\.NW']
>>>
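The two approaches above agree on this input, but not in general: replace() touches only dots, while re.escape escapes every regex metacharacter. A minimal sketch of the difference, using a made-up ticker ('A+B.C' is not from the question):

```python
import re

# Hypothetical ticker containing a second regex metacharacter ('+').
ticker = 'A+B.C'

only_dots = 'G:' + ticker.replace('.', '\\.')   # escapes the dot only
all_meta = 'G:' + re.escape(ticker)             # escapes '+' as well as '.'

print(only_dots)
print(all_meta)
```

Which one fits depends on whether the result is meant to be a regular expression (re.escape) or whether only dots need a backslash (replace).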

You can use + to concatenate the strings and then call replace(), like below:
>>> list1 = ['AEX.EN', 'AXAL.OQ', 'AAPIOE.NW']
>>> [('G:'+l).replace('.', '\\.') for l in list1]
['G:AEX\\.EN', 'G:AXAL\\.OQ', 'G:AAPIOE\\.NW']
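The doubled backslashes in the outputs above can look like the wrong result, but they are only how Python displays a backslash inside a list. A short sketch that shows the difference between the displayed form and the actual string contents:

```python
list1 = ['AEX.EN', 'AXAL.OQ', 'AAPIOE.NW']
list2 = ['G:' + s.replace('.', '\\.') for s in list1]

print(list2)        # the list display shows each backslash doubled
print(list2[0])     # the string itself contains a single backslash
print(len('\\.'))   # two characters: one backslash and one dot
```

Printing a single element shows the string as it would be written to a file or passed to another function.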

Related

How to extract strings between two markers for each object of a list in python

I have a list of strings, each of which contains the same two markers. I would like to extract the substring between those two markers for every string in the list.
Example:
markers 'XXX' and 'YYY' --> therefore I want to extract 78665786 and 6866
['XXX78665786YYYjajk', 'XXX6866YYYz6767'....]
You can just loop over your list and grab the substring. You can do something like:
import re
my_list = ['XXX78665786YYYjajk', 'XXX6866YYYz6767']
output = []
for item in my_list:
    output.append(re.search('XXX(.*)YYY', item).group(1))
print(output)
Output:
['78665786', '6866']
import re
l = ['XXX78665786YYYjajk', 'XXX6866YYYz6767']
l = [re.search(r'XXX(.*)YYY', i).group(1) for i in l]
This should work
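Two details of the regex answers above are worth checking against real data: `.*` is greedy, so it runs to the last 'YYY' in a string, and re.search returns None when the markers are missing, which makes .group(1) raise an AttributeError. A hedged sketch of a more defensive variant (the second and third input strings are invented for illustration):

```python
import re

# Hypothetical inputs: a repeated closing marker and a string without markers.
items = ['XXX78665786YYYjajk', 'XXX12YYYabcYYY', 'no markers here']

pattern = re.compile(r'XXX(.*?)YYY')  # non-greedy: stop at the first 'YYY'

extracted = []
for item in items:
    match = pattern.search(item)
    if match is not None:             # skip strings where the markers are missing
        extracted.append(match.group(1))

print(extracted)
```

For the two strings in the question, the greedy and non-greedy patterns give the same result, since each contains 'YYY' only once.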
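Another solution, if the value you want is the first run of digits in each string, is to match the digits directly (this also converts them to int):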
import re
test_string=['XXX78665786YYYjajk','XXX78665783336YYYjajk']
int_val=[int(re.search(r'\d+', x).group()) for x in test_string]
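The split() method splits a string into parts around a separator, so you can split on each marker in turn:
list1 = ['XXX78665786YYYjajk', 'XXX6866YYYz6767']
list2 = []
for i in list1:
    after_start = i.split("XXX")[1]         # text after the first marker
    between = after_start.split("YYY")[0]   # text before the second marker
    list2.append(between)
print(list2)
The extracted values are saved in list2.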

Remove double quotes and special characters from string list python

I'm pretty new in python.
I have a list like this:
['SACOL1123', "('SA1123', 'AAW38003.1')"]
['SACOL1124', "('SA1124', 'AAW38004.1')"]
And I want to remove the extra double quotes and parentheses, so it looks like this:
['SACOL1123', 'SA1123', 'AAW38003.1']
['SACOL1124', 'SA1124', 'AAW38004.1']
This is what I managed to do:
newList = [s.replace('"(', '') for s in list]
newList = [s.replace(')"', '') for s in newList]
But the output is exactly like the input list. How can I do it?
This is possible using ast.literal_eval. The second element of each list is the string representation of a valid Python tuple, which you can safely evaluate.
[[x[0]] + list(ast.literal_eval(x[1])) for x in lst]
Code:
import ast
lst = [['SACOL1123', "('SA1123', 'AAW38003.1')"],
       ['SACOL1124', "('SA1124', 'AAW38004.1')"]]
output = [[x[0]] + list(ast.literal_eval(x[1])) for x in lst]
# [['SACOL1123', 'SA1123', 'AAW38003.1'],
# ['SACOL1124', 'SA1124', 'AAW38004.1']]
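As for why the replace() attempt in the question changed nothing: a likely explanation is that the double quotes are not characters in the string at all. They are just how Python displays a string that itself contains single quotes. A small sketch, assuming one row shaped like those in the question:

```python
row = ['SACOL1123', "('SA1123', 'AAW38003.1')"]

second = row[1]
print(second[0])        # the string starts with a parenthesis, not a quote
print('"' in second)    # there is no double quote anywhere in the string

# replace() therefore finds nothing to remove and returns the string unchanged.
unchanged = second.replace('"(', '')
print(unchanged == second)
```

This is why parsing the string as a tuple, as above, works where removing quote characters does not.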
This can be done by converting each inner list to a string, removing the punctuation with a regular expression, and splitting the result on whitespace. Note that this assumes the values themselves contain no spaces or any of the removed characters. Hope this helps:
import re
List = [['SACOL1123', "('SA1123', 'AAW38003.1')"],
        ['SACOL1124', "('SA1124', 'AAW38004.1')"]]
New_List = []
for Item in List:
    New_List.append(re.sub(r"[()\"'\[\],]", '', str(Item)).split())
print(New_List)
Output: [['SACOL1123', 'SA1123', 'AAW38003.1'],
['SACOL1124', 'SA1124', 'AAW38004.1']]

Join characters from list of strings by index

For example, I have the following list.
list=['abc', 'def','ghi','jkl','mn']
I want to make a new list as:
newList=['adgjm','behkn','cfil']
That is, take the first character of every element to form a new string and append it to the new list, then do the same with the second character of every element, and so on.
Thanks for the help.
One way is zipping the strings in the list, which will interleave the characters from each string in the specified fashion, and join them back with str.join:
l = ['abc', 'def','ghi','jkl']
list(map(''.join, zip(*l)))
# ['adgj', 'behk', 'cfil']
For strings with different lengths, use zip_longest, and fill with an empty string:
from itertools import zip_longest
l = ['abcZ', 'def','ghi','jkl']
list(map(''.join, zip_longest(*l, fillvalue='')))
# ['adgj', 'behk', 'cfil', 'Z']
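The examples above use a four-element list, whereas the list in the question also contains the shorter string 'mn'. A sketch applying both variants to the question's own list shows why zip_longest is the one that produces the requested output:

```python
from itertools import zip_longest

words = ['abc', 'def', 'ghi', 'jkl', 'mn']   # the list from the question

# zip stops at the shortest string ('mn'), so the third column is lost
truncated = [''.join(chars) for chars in zip(*words)]

# zip_longest pads the short string with '' and keeps every column
complete = [''.join(chars) for chars in zip_longest(*words, fillvalue='')]

print(truncated)
print(complete)
```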
You can try this way:
>>> list1 =['abc', 'def','ghi','jkl']
>>> newlist = []
>>> for args in zip(*list1):
...     newlist.append(''.join(args))
...
>>> newlist
['adgj', 'behk', 'cfil']
Or using list comprehension:
>>> newlist = [''.join(args) for args in zip(*list1)]
>>> newlist
['adgj', 'behk', 'cfil']
You can try this (it assumes every string is at least as long as the first one):
list=['abc', 'def','ghi','jkl']
n = len(list[0])
newList = []
i = 0
for i in range(n):
    newword = ''
    for word in list:
        newword += word[i]
    newList.append(newword)
print(newList)

Create a list of sequential alphanumeric elements

I have
char=str('DOTR')
and
a=range(0,18)
How could I combine them to create a list with:
mylist=['DOTR00','DOTR01',...,'DOTR17']
If I combine them in a for loop then I lose the leading zero.
Use zfill:
>>> string = "DOTR"
>>> for i in range(0, 18):
...     print("DOTR{}".format(str(i).zfill(2)))
...
DOTR00
DOTR01
DOTR02
DOTR03
DOTR04
DOTR05
DOTR06
DOTR07
DOTR08
DOTR09
DOTR10
DOTR11
DOTR12
DOTR13
DOTR14
DOTR15
DOTR16
DOTR17
>>>
And if you want a list:
>>> my_list = ["DOTR{}".format(str(i).zfill(2)) for i in range(18)]
>>> my_list
['DOTR00', 'DOTR01', 'DOTR02', 'DOTR03', 'DOTR04', 'DOTR05', 'DOTR06', 'DOTR07', 'DOTR08', 'DOTR09', 'DOTR10', 'DOTR11', 'DOTR12', 'DOTR13', 'DOTR14', 'DOTR15', 'DOTR16', 'DOTR17']
>>>
You can do it using a list comprehension like so:
>>> mylist = [char+'{0:02}'.format(i) for i in a]
>>> mylist
['DOTR00', 'DOTR01', 'DOTR02', 'DOTR03', 'DOTR04', 'DOTR05', 'DOTR06', 'DOTR07', 'DOTR08', 'DOTR09', 'DOTR10', 'DOTR11', 'DOTR12', 'DOTR13', 'DOTR14', 'DOTR15', 'DOTR16', 'DOTR17']
Simply use list comprehension and format:
mylist = ['DOTR%02d'%i for i in range(18)]
Or given that char and a are variable:
mylist = ['%s%02d'%(char,i) for i in a]
As @juanpa.arrivillaga suggests, you can also write it as:
mylist = ['{}{:02d}'.format(char,i) for i in a]
List comprehension is a concept where you write an expression:
[<expr> for <var> in <iterable>]
Python iterates over the <iterable>, binds each item in turn to <var> (here i), evaluates <expr>, and appends the result to the list, until the <iterable> is exhausted.
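To make that description concrete, here is a small sketch that builds the same list twice, once with the comprehension and once with the explicit loop it stands for. The name `expanded` is only for illustration:

```python
char = 'DOTR'
a = range(0, 18)

# The comprehension from the answer above
mylist = ['{}{:02d}'.format(char, i) for i in a]

# The same steps written out as an explicit loop
expanded = []
for i in a:                                       # bind each item to i
    expanded.append('{}{:02d}'.format(char, i))   # evaluate the expression, append it

print(mylist == expanded)
```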
You can also do it like this:
char = str('DOTR')
a = range(0, 18)
b = []
for i in a:
    b.append(char + str(i).zfill(2))
print(b)
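On Python 3.6 or later, an f-string is another way to write the same formatting. This is a sketch along the lines of the answers above, not something taken from them:

```python
char = 'DOTR'
a = range(0, 18)

# :02d pads the integer with leading zeros to a width of two
mylist = [f'{char}{i:02d}' for i in a]

print(mylist[:3], mylist[-1])
```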

How to replace each string in a list with another string (python)

What is the best way to replace every string in the list?
For example if I have a list:
a = ['123.txt', '1234.txt', '654.txt']
and I would like to have:
a = ['123', '1234', '654']
Assuming that sample input is similar to what you actually have, use os.path.splitext() to remove file extensions:
>>> import os
>>> a = ['123.txt', '1234.txt', '654.txt']
>>> [os.path.splitext(item)[0] for item in a]
['123', '1234', '654']
Use a list comprehension as follows:
a = ['123.txt', '1234.txt', '654.txt']
answer = [item.replace('.txt', '') for item in a]
print(answer)
Output
['123', '1234', '654']
Assuming that all your strings end with '.txt', just slice the last four characters off.
>>> a = ['123.txt', '1234.txt', '654.txt']
>>> a = [x[:-4] for x in a]
>>> a
['123', '1234', '654']
This will also work if you have some weird names like 'some.txtfile.txt'
You could split each string on the . separator and take the first item (this keeps only the text before the first dot):
In [486]: [x.split('.')[0] for x in a]
Out[486]: ['123', '1234', '654']
Another way to do this:
a = [x[: -len("txt")-1] for x in a]
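The approaches above all give the same result for the sample list, but they differ on less regular names. A sketch comparing them side by side, using hypothetical file names chosen to expose the differences:

```python
import os

# Hypothetical file names chosen to expose the differences
names = ['123.txt', 'some.txtfile.txt', 'notes.md']

by_splitext = [os.path.splitext(n)[0] for n in names]   # drops the last extension, whatever it is
by_replace = [n.replace('.txt', '') for n in names]     # removes every '.txt', not just the final one
by_slice = [n[:-4] for n in names]                      # always cuts four characters
by_split = [n.split('.')[0] for n in names]             # keeps only the text before the first dot

for label, result in [('splitext', by_splitext), ('replace', by_replace),
                      ('slice', by_slice), ('split', by_split)]:
    print(label, result)
```

If every name is known to end in '.txt', all four are fine; otherwise os.path.splitext is probably the least surprising choice.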
What is the best way to replace every string in the list?
That completely depends on how you define 'best'. I, for example, like regular expressions:
import re
a = ['123.txt', '1234.txt', '654.txt']
answer = [re.sub(r'^(\w+)\..*', r'\g<1>', item) for item in a]
print(answer)
# ['123', '1234', '654']
Depending on the content of the strings, you could adjust the pattern:
use [0-9]+ instead of \w+ to match only digits
use \.txt instead of \..* if all strings end with .txt
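As an illustration of those adjustments, here is a sketch with both applied, plus a simpler pattern that only strips a trailing '.txt'. The last file name in the list is hypothetical and is there to show what happens when a name does not fit the stricter pattern:

```python
import re

a = ['123.txt', '1234.txt', '654.txt', 'some.txtfile.txt']  # last name is hypothetical

# Digits only, followed by a literal '.txt' suffix; other names are left unchanged.
digits_only = [re.sub(r'^([0-9]+)\.txt$', r'\g<1>', item) for item in a]

# Alternative: strip just a trailing '.txt' and leave the rest of the name alone.
suffix_only = [re.sub(r'\.txt$', '', item) for item in a]

print(digits_only)
print(suffix_only)
```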
data.colname = [item.replace('anythingtoreplace', 'desiredoutput') for item in data.colname]
Please note that here 'data' is the dataframe and 'colname' is the name of a column in that dataframe. The same approach works if you want to remove spaces from the values. This was quite useful for me. Also, this does not change the datatype of the column, so you might have to do that separately if required.
