How to replace multiple substrings in a list? - python

I need to turn input_string into the list shown in the comment below using a for loop. First I split it using the split() function, but now I need to somehow turn the pieces into ['result1', 'result2', 'result3', 'result5']. I tried replacing the .xls and the dash with nothing (''), but the output is unchanged. Please don't import anything; I'm trying to do this with functions and loops only.
input_string = "01-result.xls,2-result.xls,03-result.xls,05-result.xls"
# Must be turned into ['result1','result2', 'result3', 'result5']
splitted = input_string.split(',')
for c in ['.xls', '-', '0']:
    if c in splitted:
        splitted = splitted.replace(splitted, 'c', '')
When I type splitted, the output is still ['01-result.xls', '2-result.xls', '03-result.xls', '05-result.xls'], so nothing is happening.

Use the re module's sub function and split.
>>> input_string = "01-result.xls,2-result.xls,03-result.xls,05-result.xls"
>>> import re
>>> re.sub(r'(\d+)-(\w+)\.xls',r'\2\1',input_string)
'result01,result2,result03,result05'
>>> re.sub(r'(\d+)-(\w+)\.xls',r'\2\1',input_string).split(',')
['result01', 'result2', 'result03', 'result05']
Using no imports, you can use a list comprehension
>>> [''.join(x.split('.')[0].split('-')[::-1]) for x in input_string.split(',')]
['result01', 'result2', 'result03', 'result05']
The algorithm: split the string on , and loop over the pieces. Split each piece on . and keep the first element, then split that on -. This gives the number and the word, which can be reversed and joined.
Complete explanation of the list comprehension answer:
To understand what a list comprehension is, read What does "list comprehension" mean? How does it work and how can I use it?
Coming to the answer,
Splitting the input string on , gives us the list of individual file names
>>> input_string.split(',')
['01-result.xls', '2-result.xls', '03-result.xls', '05-result.xls']
Now using the list comprehension construct, we can iterate through this,
>>> [i for i in input_string.split(',')]
['01-result.xls', '2-result.xls', '03-result.xls', '05-result.xls']
As we need only the file name and not the extension, we split by using . and take the first value.
>>> [i.split('.')[0] for i in input_string.split(',')]
['01-result', '2-result', '03-result', '05-result']
Now again, what we need is the number and the name as two parts. So we again split by -
>>> [i.split('.')[0].split('-') for i in input_string.split(',')]
[['01', 'result'], ['2', 'result'], ['03', 'result'], ['05', 'result']]
Now we have [number, name] in a list. However, the format we need is "namenumber", so we have two options:
Concatenate them, as in i.split('.')[0].split('-')[1]+i.split('.')[0].split('-')[0]. This is unnecessarily long.
Reverse them and join. We can use a slice to reverse a list (see How can I reverse a list in python?) and str.join to join, as in ''.join(x.split('.')[0].split('-')[::-1]).
So we get our final list comprehension
>>> [''.join(x.split('.')[0].split('-')[::-1]) for x in input_string.split(',')]
['result01', 'result2', 'result03', 'result05']

Here's a solution using list comprehension and string manipulation if you don't want to use re.
input_string = "01-result.xls,2-result.xls,03-result.xls,05-result.xls"
# Must be turned into ['result1','result2', 'result3', 'result5']
splitted = input_string.split(',')
# Remove the extension, split on the hyphen, swap the two values,
# and join them into the result string
print(["".join(i.split(".")[0].split("-")[::-1]) for i in splitted])
# Output:
# ['result01', 'result2', 'result03', 'result05']
The way this list comprehension works, for each item in the list:
Remove the ".xls" extension: i.split(".")[0]
Split on the - and swap the positions of the number and "result": .split("-")[::-1]
Join the resulting pair back into a single string: "".join()
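Note that the outputs above keep the leading zero ('result01'), while the question asks for 'result1'. The following is a minimal sketch of one way to get there with a plain for loop and no imports. It assumes every file name has the form number-name.xls, and the variable names (results, stem, number, name) are only illustrative.

```python
input_string = "01-result.xls,2-result.xls,03-result.xls,05-result.xls"

results = []
for filename in input_string.split(','):
    stem = filename.split('.')[0]      # drop the extension, e.g. '01-result'
    number, name = stem.split('-')     # e.g. '01' and 'result'
    # lstrip('0') removes leading zeros only; a number such as '0' would
    # become an empty string, which is not handled in this sketch.
    results.append(name + number.lstrip('0'))

print(results)  # ['result1', 'result2', 'result3', 'result5']
```

The original loop left the list unchanged because `c in splitted` tests whether the list contains an element equal to `c`, not whether any string contains it, and replace() is a method of strings, not of lists.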


How to remove characters from a string after a certain point within a list?

I have a list of strings and I want to remove everything in each string after the tenth character.
Example:
['0.04112243,0.04112243,right,4.11%', '0.12733313,0.05733313,right,12.73%', '0.09203131,0.02203131,right,9.2%']
I want just the first ten characters of each string; everything else should be stripped.
Output
['0.04112243', '0.12733313', '0.09203131']
You can use a list comprehension:
original = ['0.04112243,0.04112243,right,4.11%', '0.12733313,0.05733313,right,12.73%', '0.09203131,0.02203131,right,9.2%']
new = [s[:10] for s in original]
Output:
['0.04112243', '0.12733313', '0.09203131']
You can also be a bit more flexible if you want to keep everything before the first comma:
new = [s.partition(',')[0] for s in original]
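To illustrate the difference between the two approaches, here is a small sketch with made-up data (the rows list is an assumption, not from the question) in which the first field is not always ten characters long. Slicing cuts at a fixed position, while partition(',') cuts at the first comma.

```python
# Hypothetical rows whose first field has a varying width.
rows = ['0.5,0.04112243,right', '0.12733313,0.05733313,right']

by_slice = [s[:10] for s in rows]                   # fixed width
by_partition = [s.partition(',')[0] for s in rows]  # up to the first comma

print(by_slice)      # ['0.5,0.0411', '0.12733313']
print(by_partition)  # ['0.5', '0.12733313']
```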
You can index and slice a string much like an array.
Code:
example = ['0.04112243,0.04112243,right,4.11%', '0.12733313,0.05733313,right,12.73%', '0.09203131,0.02203131,right,9.2%']
for s in example:
    print(s[:10])
Output:
0.04112243
0.12733313
0.09203131
Using a list comprehension and string slicing:
dirty = ['0.04112243,0.04112243,right,4.11%', '0.12733313,0.05733313,right,12.73%', '0.09203131,0.02203131,right,9.2%']
clean = [num[:10] for num in dirty]
split() creates a list of strings delimited by the specified character. With this in mind, I would split each string on the comma (,) and then append the first element to a list.
lst = ['0.04112243,0.04112243,right,4.11%', '0.12733313,0.05733313,right,12.73%', '0.09203131,0.02203131,right,9.2%']
result = []
for i in lst:
    result.append(i.split(",")[0])
#Test output
print(result)
This should return the values you need, in the format you want!
Hope this helps.

Python 3 split()

When I'm splitting a string "abac" I'm getting undesired results.
Example
print("abac".split("a"))
Why does it print:
['', 'b', 'c']
instead of
['b', 'c']
Can anyone explain this behavior and guide me on how to get my desired output?
Thanks in advance.
As @DeepSpace pointed out (referring to the docs):
If sep is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings (for example, '1,,2'.split(',') returns ['1', '', '2']).
Therefore I'd suggest using a better delimiter, such as a comma. If this is the format you're stuck with, you can use the built-in filter() function, as suggested in this answer: when passed None as the function, it removes any "empty" strings. In Python 3, filter() returns an iterator, so wrap it in list() to see the result.
sample = 'abac'
filtered_sample = list(filter(None, sample.split('a')))
print(filtered_sample)
# ['b', 'c']
When you split a string in Python, you keep everything between your delimiters (even when it's an empty string!).
For example, if you had a list of letters separated by commas:
>>> "a,b,c,d".split(',')
['a', 'b', 'c', 'd']
If your list had some missing values you might leave the space in between the commas blank:
>>> "a,b,,d".split(',')
['a', 'b', '', 'd']
The start and end of the string act as delimiters themselves, so if you have a leading or trailing delimiter you will also get this "empty string" sliced out of your main string:
>>> "a,b,c,d,,".split(',')
['a', 'b', 'c', 'd', '', '']
>>> ",a,b,c,d".split(',')
['', 'a', 'b', 'c', 'd']
If you want to get rid of any empty strings in your output, you can use the filter function.
If instead you just want to get rid of this behavior near the edges of your main string, you can strip the delimiters off first:
>>> ",,a,b,c,d".strip(',')
'a,b,c,d'
>>> ",,a,b,c,d".strip(',').split(',')
['a', 'b', 'c', 'd']
In your example, "a" is what's called a delimiter. It acts as a boundary between the characters before it and after it. So, when you call split, it takes the characters before and after each "a" and inserts them into the list. Since there's nothing in front of the first "a" in the string "abac", it produces an empty string and inserts that into the list.
split will return the characters between the delimiters you specify (or between an end of the string and a delimiter), even if there aren't any, in which case it will return an empty string. (See the documentation for more information.)
In this case, if you don't want any empty strings in the output, you can use filter to remove them:
list(filter(lambda s: len(s) > 0, "abac".split("a")))
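As a small sketch of the options discussed in these answers, the snippet below compares dropping empty strings with a list comprehension against stripping the delimiter from the ends first. The "abaac" string is an added illustration, not from the question; it shows that strip() only affects the ends of the string.

```python
parts = "abac".split("a")
print(parts)                          # ['', 'b', 'c']

# Keep only the non-empty pieces (an empty string is falsy).
non_empty = [p for p in parts if p]
print(non_empty)                      # ['b', 'c']

# Stripping the delimiter first removes empty strings at the ends only.
stripped = "abac".strip("a").split("a")
print(stripped)                       # ['b', 'c']

# With a doubled delimiter in the middle, strip() is not enough.
interior = "abaac".strip("a").split("a")
print(interior)                       # ['b', '', 'c']
```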

Remove blank string value from a list of strings

I am reading string information as input from a text file and placing them into lists, and one of the lines is like this:
30121,long,Mehtab,10,20,,30
I want to remove the empty value between the two commas (,,) from this list, but have had no success. I've tried .remove() and filter(). Python reads it as a 'str' value.
>>> import re
>>> re.sub(',,+', ',', '30121,long,Mehtab,10,20,,30')
'30121,long,Mehtab,10,20,30'
Use split() and remove()
In [11]: s = '30121,long,Mehtab,10,20,,30'
In [14]: l = s.split(',')
In [15]: l.remove('')
In [16]: l
Out[16]: ['30121', 'long', 'Mehtab', '10', '20', '30']
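One caveat worth illustrating: list.remove() deletes only the first matching element, and raises ValueError if there is none. The sketch below uses a made-up line with two empty fields (an assumption, not the asker's data) to show the difference from filtering every empty field out.

```python
# Hypothetical line with two empty fields.
s = '30121,long,,10,20,,30'

fields = s.split(',')
fields.remove('')          # removes only the first ''
print(fields)              # ['30121', 'long', '10', '20', '', '30']

# A list comprehension drops every empty field.
all_removed = [f for f in s.split(',') if f]
print(all_removed)         # ['30121', 'long', '10', '20', '30']
```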
filter() should work. First put the data in a list, then use filter() to keep only the items that are not empty.
data = ["30121", "long", "Mehtab", 10, 20, "", 30]
filtered_data = list(filter(lambda item: item != '', data))
print(filtered_data)
You can split the string based on your separator ("," for this) and then use list comprehension to consolidate the elements after making sure they are not blank.
",".join([element for element in string.split(",") if element])
We can also use element.strip() as the if condition if we want to filter out strings that contain only whitespace.
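A minimal sketch of the strip() variant, assuming a line in which one field contains only a space (the sample line is made up for illustration):

```python
# Hypothetical input: one field is a single space, another is empty.
line = "30121,long, ,10,20,,30"

# element.strip() is an empty (falsy) string for blank or whitespace-only fields.
kept = [element for element in line.split(",") if element.strip()]
cleaned = ",".join(kept)

print(cleaned)  # 30121,long,10,20,30
```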

Search list of string elements that match another list of string elements

I have a list of strings called names, and I need to check each element of names against each element of the pattern list. I found several guides that do this for an individual string, but not for a list of strings:
a = [x for x in names if 'st' in x]
Thank you in advance!
names = ['chris', 'christopher', 'bob', 'bobby', 'kristina']
pattern = ['st', 'bb']
Desired output:
a = ['christopher', 'bobby', 'kristina']
Use the any() function with a generator expression:
a = [x for x in names if any(pat in x for pat in pattern)]
any() short-circuits: it returns True the first time it comes across a pattern that matches. Because this uses a generator expression rather than a list comprehension, no patterns after the first match are even checked, so no unnecessary work is done for each name.
You can do something like this:
[name for name in names if any([p in name for p in pattern])]
The code is self-explanatory, just read it out loud: we're creating a list of all names that have one of the patterns in them.
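The difference between passing any() a generator expression and passing it a list can be made visible with a small sketch. The contains helper and the checked list are hypothetical, added only to record which patterns get tested.

```python
pattern = ['st', 'bb']
checked = []

def contains(pat, name):
    """Hypothetical helper: record each pattern tested, then test it."""
    checked.append(pat)
    return pat in name

# Generator expression: any() stops at the first match.
any(contains(p, 'christopher') for p in pattern)
generator_checks = list(checked)

checked.clear()

# List comprehension: every pattern is tested before any() runs.
any([contains(p, 'christopher') for p in pattern])
list_checks = list(checked)

print(generator_checks)  # ['st']
print(list_checks)       # ['st', 'bb']
```

Both forms give the same result; with only a couple of patterns the difference is unlikely to matter in practice.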
Using two loops:
result = []
for name in names:
    for pat in pattern:
        if pat in name:
            result.append(name)
            break  # stop after the first match so a name is added only once

How to remove a string from a list that startswith prefix in python

I have this list of strings and some prefixes. I want to remove all the strings from the list that start with any of these prefixes. I tried:
prefixes = ('hello', 'bye')
list = ['hi', 'helloyou', 'holla', 'byeyou', 'hellooooo']
for word in list:
    list.remove(word.startswith(prefixes))
So I want my new list to be:
list = ['hi', 'holla']
but I get this error:
ValueError: list.remove(x): x not in list
What's going wrong?
You can create a new list that contains all the words that do not start with one of your prefixes:
newlist = [x for x in list if not x.startswith(prefixes)]
The reason your code does not work is that the startswith method returns a boolean, and you're asking to remove that boolean from your list (but your list contains strings, not booleans).
Note that it is usually not a good idea to name a variable list, since this is already the name of the predefined list type.
Greg's solution is definitely more Pythonic, but in your original code, you perhaps meant something like this. Observe that we make a copy (using list[:] syntax) and iterate over the copy, because you should not modify a list while iterating over it.
prefixes = ('hello', 'bye')
list = ['hi', 'helloyou', 'holla', 'byeyou', 'hellooooo']
for word in list[:]:
    if word.startswith(prefixes):
        list.remove(word)
print(list)
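To see why iterating over a copy matters, here is a small sketch with a made-up word list (not the asker's data) in which two matching words sit next to each other. Removing an item during iteration shifts the remaining items left, so the loop skips the element that follows each removal.

```python
prefixes = ('hello', 'bye')

# Hypothetical data with adjacent matches.
words = ['helloyou', 'hellooooo', 'hi', 'byeyou', 'byebye']
for word in words:            # iterating over the list being modified
    if word.startswith(prefixes):
        words.remove(word)
print(words)                  # ['hellooooo', 'hi', 'byebye']

words_fixed = ['helloyou', 'hellooooo', 'hi', 'byeyou', 'byebye']
for word in words_fixed[:]:   # iterating over a copy
    if word.startswith(prefixes):
        words_fixed.remove(word)
print(words_fixed)            # ['hi']
```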
import os
print(len([i for i in os.listdir('/path/to/files') if not i.startswith(('.','~','#'))]))
