python: keep char only if it is within this list - python

I have a list:
a = ['a','b','c'.........'A','B','C'.........'Z']
and I have a string:
string1= 's#$%ERGdfhliisgdfjkskjdfW$JWLI3590823r'
I want to keep ONLY those characters in string1 that exist in a.
What is the most efficient way to do this? Perhaps instead of having a be a list, I should just make it a string, like this: a='abcdefg..........ABC..Z'?

This should be faster.
>>> import re
>>> string1 = 's#$%ERGdfhliisgdfjkskjdfW$JWLI3590823r'
>>> a = ['E', 'i', 'W']
>>> r = re.compile('[^%s]+' % re.escape(''.join(a)))  # escape so characters like ']' or '^' in a are taken literally
>>> print r.sub('', string1)
EiiWW
This is even faster than that.
>>> all_else = ''.join( chr(i) for i in range(256) if chr(i) not in set(a) )
>>> string1.translate(None, all_else)
'EiiWW'
The regex version takes 44 microseconds vs. 13 microseconds for translate on my laptop.
How about that?
(Edit: it turned out that translate yields the best performance.)
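The translate call above uses the Python 2 signature (a table plus a string of characters to delete). In Python 3, str.translate takes a single mapping, so a rough equivalent might look like the sketch below. The KeepOnly class is a hypothetical helper, not something from the answers here; it relies on the fact that mapping a code point to None deletes that character.

```python
class KeepOnly(dict):
    """Hypothetical translation table: keeps `allowed`, deletes everything else."""

    def __init__(self, allowed):
        # Map each allowed code point to itself so it survives translation.
        super().__init__((ord(ch), ch) for ch in allowed)

    def __missing__(self, key):
        # Any code point not in the table maps to None, which deletes it.
        return None


string1 = 's#$%ERGdfhliisgdfjkskjdfW$JWLI3590823r'
a = ['E', 'i', 'W']

result = string1.translate(KeepOnly(a))
print(result)
```

Unlike the range(256) table, this does not need to enumerate every possible character up front, so it should also cope with non-ASCII input. Whether it is still the fastest option in Python 3 is something to measure rather than assume.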

''.join([s for s in string1 if s in a])
Explanation:
[s for s in string1 if s in a]
creates a list of all characters in string1, but only if they are also in the list a.
''.join([...])
turns it back into a string by joining it with nothing ('') in between the elements of the given list.

List comprehension to the rescue!
wanted = ''.join(letter for letter in string1 if letter in a)
(Note that when passing a list comprehension to a function you can omit the brackets so that the full list isn't generated prior to being evaluated. While semantically the same as a list comprehension, this is called a generator expression.)
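As a small variation on the generator expression above, here is a sketch that turns the allowed characters into a set first, so each membership test does not have to scan a list. Using string.ascii_letters as the allowed characters is an assumption about what the list a in the question contains.

```python
import string

string1 = 's#$%ERGdfhliisgdfjkskjdfW$JWLI3590823r'

# Assumption: the question's list `a` holds the ASCII letters a-z and A-Z.
allowed = set(string.ascii_letters)

# Set membership is a hash lookup rather than a linear scan of a list.
wanted = ''.join(ch for ch in string1 if ch in allowed)
print(wanted)
```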

If you are going to do this with large strings, there is a faster solution using translate; see this answer.

@katrielalex: To spell it out:
import string
string1= 's#$%ERGdfhliisgdfjkskjdfW$JWLI3590823r'
non_letters= ''.join(chr(i) for i in range(256) if chr(i) not in string.letters)
print string1.translate(None,non_letters)
print 'Simpler, but possibly less correct'
print string1.translate(None, string.punctuation+string.digits+string.whitespace)

Related

I am able to parse the log file but not getting output in correct format in python [duplicate]

How do I concatenate a list of strings into a single string?
For example, given ['this', 'is', 'a', 'sentence'], how do I get "this-is-a-sentence"?
For handling a few strings in separate variables, see How do I append one string to another in Python?.
For the opposite process - creating a list from a string - see How do I split a string into a list of characters? or How do I split a string into a list of words? as appropriate.
Use str.join:
>>> words = ['this', 'is', 'a', 'sentence']
>>> '-'.join(words)
'this-is-a-sentence'
>>> ' '.join(words)
'this is a sentence'
A more generic way (covering also lists of numbers) to convert a list to a string would be:
>>> my_lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> my_lst_str = ''.join(map(str, my_lst))
>>> print(my_lst_str)
12345678910
It's very useful for beginners to know why join is a string method.
It seems strange at first, but it turns out to be very useful.
The result of join is always a string, but the object to be joined can be of many types (generators, lists, tuples, etc.).
.join is faster than repeated concatenation because it allocates memory only once (see the extended explanation).
Once you learn it, it's very comfortable, and you can do tricks like this to add parentheses.
>>> ",".join("12345").join(("(",")"))
Out:
'(1,2,3,4,5)'
>>> parens = ["(", ")"]  # avoid naming a variable "list", which shadows the built-in
>>> ",".join("12345").join(parens)
Out:
'(1,2,3,4,5)'
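To illustrate the point about join accepting many kinds of iterables, here is a minimal sketch. The variable names are made up for the example; the only requirement join places on its argument is that every item is already a string.

```python
words = ('this', 'is', 'a', 'sentence')  # a tuple works as well as a list

joined_tuple = '-'.join(words)
joined_gen = '-'.join(w.upper() for w in words)  # so does a generator expression

# Non-string items are rejected, so convert them first.
try:
    '-'.join([1, 2, 3])
    join_failed = False
except TypeError:
    join_failed = True

joined_numbers = '-'.join(str(n) for n in [1, 2, 3])

print(joined_tuple)
print(joined_gen)
print(join_failed)
print(joined_numbers)
```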
Edit from the future: Please don't use the answer below. This function was removed in Python 3 and Python 2 is dead. Even if you are still using Python 2 you should write Python 3 ready code to make the inevitable upgrade easier.
Although @Burhan Khalid's answer is good, I think it's more understandable like this:
from string import join
sentence = ['this','is','a','sentence']
join(sentence, "-")
The second argument to join() is optional and defaults to " ".
list_abc = ['aaa', 'bbb', 'ccc']
string = ''.join(list_abc)
print(string)  # aaabbbccc
string = ','.join(list_abc)
print(string)  # aaa,bbb,ccc
string = '-'.join(list_abc)
print(string)  # aaa-bbb-ccc
string = '\n'.join(list_abc)
print(string)
# aaa
# bbb
# ccc
We can also use Python's reduce function:
from functools import reduce
sentence = ['this','is','a','sentence']
out_str = str(reduce(lambda x,y: x+"-"+y, sentence))
print(out_str)
We can specify how we join the string. Instead of '-', we can use ' ':
sentence = ['this','is','a','sentence']
s=(" ".join(sentence))
print(s)
If you have a mixed content list and want to stringify it, here is one way:
Consider this list:
>>> aa
[None, 10, 'hello']
Convert it to string:
>>> st = ', '.join(map(str, map(lambda x: f'"{x}"' if isinstance(x, str) else x, aa)))
>>> st = '[' + st + ']'
>>> st
'[None, 10, "hello"]'
If required, convert back to the list:
>>> import ast
>>> ast.literal_eval(st)
[None, 10, 'hello']
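If the goal is only a string that ast.literal_eval can turn back into the same list, a shorter route may be to let repr do the quoting. This is a sketch of that alternative, not the answer's own method; note that repr uses single quotes around 'hello' rather than the double quotes shown above.

```python
import ast

aa = [None, 10, 'hello']

# repr() produces a Python literal, with strings already quoted.
st = repr(aa)
print(st)

# literal_eval safely parses literals (no arbitrary code execution).
restored = ast.literal_eval(st)
print(restored)
```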
If you want to generate a string of strings separated by commas in final result, you can use something like this:
sentence = ['this','is','a','sentence']
sentences_strings = "'" + "','".join(sentence) + "'"
print (sentences_strings) # you will get "'this','is','a','sentence'"
def eggs(someParameter):
    # Operate on the list that was passed in rather than on the global spam.
    del someParameter[3]
    someParameter.insert(3, ' and cats.')
spam = ['apples', 'bananas', 'tofu', 'cats']
eggs(spam)
spam = ','.join(spam)
print(spam)
Without the .join() method, you can use a loop:
my_list = ["this", "is", "a", "sentence"]
concatenated_string = ""
for index in range(len(my_list)):
    if index == len(my_list) - 1:
        concatenated_string += my_list[index]
    else:
        concatenated_string += f'{my_list[index]}-'
print([concatenated_string])  # ['this-is-a-sentence']
This range-based for loop appends "-" after every word except the last: when it reaches the last word of the list, it adds the word to concatenated_string without a trailing "-".

python regular expression, pulling all letters out

Is there a better way to pull A and F from this: A13:F20
a="A13:F20"
import re
pattern = re.compile(r'\D+\d+\D+')
matches = re.search(pattern, a)
num = matches.group(0)
print num[0]
print num[len(num)-1]
output
A
F
note: the digits are of unknown length
You don't have to use regular expressions, or re at all. Assuming you want just letters to remain, you could do something like this:
a = "A13:F20"
a = filter(lambda x: x.isalpha(), a)
I'd do it like this:
>>> re.findall(r'[a-z]', a, re.IGNORECASE)
['A', 'F']
Use a simple list comprehension as a filter to keep only the letters from the string.
print [char for char in input_string if char.isalpha()]
# ['A', 'F']
You could use re.sub:
>>> a="A13.F20"
>>> re.sub(r'[^A-Z]', '', a) # Remove everything apart from A-Z
'AF'
>>> re.sub(r'[A-Z]', '', a) # Remove A-Z
'13.20'
>>>
If you're working with strings that all have the same format, you can just cut out substrings:
a="A13:F20"
print a[0], a[4]
More on python slicing in this answer:
Is there a way to substring a string in Python?
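Since the digits are of unknown length, the letter parts might be too. Here is a sketch in Python 3 that pulls out each run of letters with re.findall; the function name column_letters and the second input 'AA1:BC200' are made up for illustration.

```python
import re


def column_letters(cell_range):
    """Return each run of ASCII letters in the string, in order."""
    return re.findall(r'[A-Za-z]+', cell_range)


first = column_letters('A13:F20')
second = column_letters('AA1:BC200')  # hypothetical multi-letter input
print(first)
print(second)
```

Matching runs with + instead of single characters keeps 'AA' together as one item rather than splitting it into two.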

Put a symbol between every letter in a list of strings

How would I put ! after every character in a list
listOne = ["hello","world"]
How do I turn that into:
["h!e!l!l!o","w!o!r!l!d"]
Attempt:
def turn(List):
    return [i for i in (map(lambda x: "%s!" % x,listOne))]
turn(listOne)
Returns:
['hello!',"world!"]
Is there another way to do this besides:
def turn(List):
    x = ""
    for word in List:
        for letter in word:
            x += "%s!" % letter
    return x
turn(listOne)
I'm not a big fan of doing things like that. However, I realize it may be more Pythonic than what I'm trying to do, which is to use as few lines as possible. Is this possible?
You can easily achieve this with the str.join() method, and list comprehension:
>>> listOne = ['!'.join(i) for i in listOne]
>>> listOne
Output
['h!e!l!l!o', 'w!o!r!l!d']
Alternatively, as abarnert suggested, you can use the built-in map function.
>>> listOne = list(map('!'.join, listOne))
>>> listOne
['h!e!l!l!o', 'w!o!r!l!d']
Hope this helps!
listOne = ["hello","world"]
listTwo = ['!'.join([x for x in word]) for word in listOne]
How about this?
["!".join(s) for s in ["hello", "world"]]
Or more specific:
def turn(l):
    return ["!".join(s) for s in l]
Edit: Removed the wrapping of the string in list(), as str.join accepts any iterable object (one that implements __iter__()), and thus strings as well. Courtesy of @alKid.
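Putting the answers above together, a self-contained sketch might look like this. The sep parameter and the name list_one are additions for illustration and are not part of the original answers.

```python
def turn(words, sep='!'):
    """Insert `sep` between the characters of every string in `words`."""
    # str.join iterates over each string character by character.
    return [sep.join(word) for word in words]


list_one = ['hello', 'world']
result = turn(list_one)
print(result)
```

Because join only places the separator between items, one-character and empty strings come back unchanged, with no trailing "!".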

How to get integer values from a string in Python?

Suppose I had a string
string1 = "498results should get"
Now I need to get only integer values from the string like 498. Here I don't want to use list slicing because the integer values may increase like these examples:
string2 = "49867results should get"
string3 = "497543results should get"
So I want to get only integer values out from the string exactly in the same order. I mean like 498,49867,497543 from string1,string2,string3 respectively.
Can anyone let me know how to do this in one or two lines?
>>> import re
>>> string1 = "498results should get"
>>> int(re.search(r'\d+', string1).group())
498
If there are multiple integers in the string:
>>> map(int, re.findall(r'\d+', string1))
[498]
An answer taken from ChristopheD here: https://stackoverflow.com/a/2500023/1225603
r = "456results string789"
s = ''.join(x for x in r if x.isdigit())
print int(s)
456789
Here's your one-liner, without using any regular expressions, which can get expensive at times:
>>> ''.join(filter(str.isdigit, "1234GAgade5312djdl0"))
returns:
'123453120'
If you have multiple sets of numbers, then this is another option:
>>> import re
>>> print(re.findall(r'\d+', 'xyz123abc456def789'))
['123', '456', '789']
It's no good for floating-point number strings, though.
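For strings that may also contain floating-point numbers, one possible extension is a slightly wider pattern. This is only a sketch: the pattern, the NUMBER name and the extract_numbers helper are assumptions made for illustration. It does not handle exponents such as 1e5, and it treats a hyphen directly before a digit as a minus sign.

```python
import re

# Optional sign, then "12.5", ".5" or "12" (alternatives tried in that order).
NUMBER = re.compile(r'[-+]?(?:\d+\.\d*|\.\d+|\d+)')


def extract_numbers(text):
    """Return ints and floats found in `text`, in order of appearance."""
    return [float(tok) if '.' in tok else int(tok)
            for tok in NUMBER.findall(text)]


numbers = extract_numbers('xyz123abc4.56def-7')
print(numbers)
```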
Iterator version
>>> import re
>>> string1 = "498results should get"
>>> [int(x.group()) for x in re.finditer(r'\d+', string1)]
[498]
>>> import itertools
>>> int(''.join(itertools.takewhile(lambda s: s.isdigit(), string1)))
With Python 3.6, these two lines return a list (which may be empty):
>>> [int(x) for x in re.findall(r'\d+', your_string)]
Similar to:
>>> list(map(int, re.findall(r'\d+', your_string)))
This approach uses a list comprehension: just pass the string as an argument to the function and it will return a list of the integers in that string.
def getIntegers(string):
    numbers = [int(x) for x in string.split() if x.isnumeric()]
    return numbers
Like this
print(getIntegers('this text contains some numbers like 3 5 and 7'))
Output
[3, 5, 7]
def function(string):
    final = ''
    for i in string:
        try:
            final += str(int(i))
        except ValueError:
            # Stop at the first non-digit character.
            return int(final)
    # The whole string consisted of digits.
    return int(final)
print(function("4983results should get"))
Another option is to remove the trailing letters using rstrip and string.ascii_lowercase (to get the letters):
import string
strings = ["498results should get", "49867results should get", "497543results should get"]
out = [int(s.replace(' ','').rstrip(string.ascii_lowercase)) for s in strings]
Output:
[498, 49867, 497543]
integerstring = ""
string1 = "498results should get"
for i in string1:
    if i.isdigit():
        integerstring = integerstring + i
print(integerstring)
