How to Extract words from list in Python

I have a string str1 with the following format, and I need to extract all the words starting with "SF-", without duplication.
I tried this:
newlist=[]
# Driver code
str1 = '"SF-9632":"schema,names","startAt":0,"maxResults":50,"total":58,"issues","SF-6349","total":70,"SF-6533'
for x in str1:
    if "SF-" in str1:
        newlist.append(x)
print(newlist)
The output inside newlist is equal to str1.

You missed the split operation, which breaks your main string str1 into a list of pieces. Splitting on the double-quote character isolates each quoted token:
newlist = []
# Driver code
str1 = '"SF-9632":"schema,names","startAt":0,"maxResults":50,"total":58,"issues","SF-6349","total":70,"SF-6533'
lstr1 = str1.split('"')
for x in lstr1:
    if "SF-" in x:
        newlist.append(x)
# dict.fromkeys drops duplicates and keeps first-seen order; list(set(...)) does not guarantee order
newlist = list(dict.fromkeys(newlist))
print(newlist)
Which then outputs:
['SF-9632', 'SF-6349', 'SF-6533']
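As an alternative to splitting on quotes, the IDs can be pulled out directly with a regular expression. This is only a sketch: it assumes every ID has the form "SF-" followed by digits, and the variable names sample and unique_ids are made up for illustration. A repeated ID is appended to the sample string to show the de-duplication.

```python
import re

# The string from the question, with one repeated ID appended (an assumption,
# added only to demonstrate that duplicates are dropped).
sample = (
    '"SF-9632":"schema,names","startAt":0,"maxResults":50,"total":58,'
    '"issues","SF-6349","total":70,"SF-6533","SF-9632"'
)

# Assumed ID shape: the literal prefix "SF-" followed by one or more digits.
found = re.findall(r"SF-\d+", sample)

# dict.fromkeys keeps the first occurrence of each ID, in order of appearance.
unique_ids = list(dict.fromkeys(found))
print(unique_ids)
```

If the real IDs can contain letters or other characters after the prefix, the pattern would need to be adjusted accordingly.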

Related

Take out text in pairs of brackets out of string

Is it possible to take out text in pairs of brackets out of a string into a list, with the non-bracketed text in another list, with both lists being nested in the same list? This is what I mean:
"hello{ok}why{uhh}so" --> [["hello","why","so"],["ok","uhh"]]

For the sample you shared, this is quite easy with the re module. However, if your text is huge, you will have to think about improving this solution. Using re, you can do something like this:
import re
raw_text = "hello{ok}why{uhh}so"
result = [re.split(r"\{[A-Za-z]*\}", raw_text), re.findall(r"\{([A-Za-z]*)\}", raw_text)]
print(result)
This produces:
[['hello', 'why', 'so'], ['ok', 'uhh']]
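The same idea can be wrapped in a small helper that accepts any characters inside the braces, not only letters. This is a sketch under a few assumptions: braces are never nested, the function name split_bracketed is hypothetical, and empty pieces of outside text (for example when the string starts with a bracket) are dropped.

```python
import re

# Matches one non-nested {...} group; the capture group is the text inside.
BRACKETED = re.compile(r"\{([^{}]*)\}")

def split_bracketed(text):
    """Return [outside_parts, inside_parts] for a string with {...} groups."""
    inside = BRACKETED.findall(text)
    # Use a pattern without a capture group for the split, so the bracketed
    # text is not mixed into the outside parts.
    outside = [part for part in re.split(r"\{[^{}]*\}", text) if part]
    return [outside, inside]

print(split_bracketed("hello{ok}why{uhh}so"))
print(split_bracketed("{x 1}a{2}b"))
```

Whether empty outside parts should be kept or dropped depends on how the result is used; keeping them preserves the positional pairing between the two lists.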

The following piece of code may help:
input_str = "hello{ok}why{uhh}so"
list1, parsed_parentheses = [], []
for index in range(len(input_str)):
    if input_str[index] == "{":
        # opening brace: start collecting a new bracketed substring
        parsed_parentheses.append(input_str[index])
        substr = ""
        continue
    else:
        if parsed_parentheses == []:
            continue
        if input_str[index] == "}":
            # closing brace: store the collected substring
            parsed_parentheses.append(input_str[index])
            list1.append(substr)
        if "{" == parsed_parentheses[-1]:
            substr += input_str[index]
input_str = input_str.replace("{", "-").replace('}', "-").split('-')
# keep the original order; a set difference would not guarantee it
list2 = [part for part in input_str if part not in list1]
result = [list2, list1]
print(result)
It produces the following result:
[['hello', 'why', 'so'], ['ok', 'uhh']]

Not finding a good regex pattern to substitute the strings in a correct order(python)

I have a list of column names that are in string format like below:
lst = ["plug", "[plug+wallet]", "(wallet-phone)"]
Now I want to wrap each column name in df['...'] using regex. My attempt works for plain names, but for a string like (wallet-phone) it gives df['(wallet']-df['phone)'] instead of (df['wallet']-df['phone']). Is my pattern wrong? Please see it below:
import re
lst = ["plug", "[plug+wallet]", "(wallet-phone)"]
x=[]
y=[]
for l in lst:
    x.append(re.sub(r"([^+\-*\/'\d]+)", r"'\1'", l))
for f in x:
    y.append(re.sub(r"('[^+\-*\/'\d]+')", r'df[\1]',f))
print(x)
print(y)
gives:
x:["'plug'", "'[plug'+'wallet]'", "'(wallet'-'phone)'"]
y:["df['plug']", "df['[plug']+df['wallet]']", "df['(wallet']-df['phone)']"]
Is the pattern wrong?
Expected output:
x:["'plug'", "['plug'+'wallet']", "('wallet'-'phone')"]
y:["df['plug']", "[df['plug']+df['wallet']]", "(df['wallet']-df['phone'])"]
I also tried the pattern ([^+\-*\/()[]'\d]+), but it doesn't exclude () or [].

It might be easier to locate the words themselves and wrap each one in a df[...] lookup:
import re
lst = ["plug", "[plug+wallet]", "(wallet-phone)"]
z = [re.sub(r"(\w+)",r"df['\1']",w) for w in lst]
print(z)
This prints:
["df['plug']", "[df['plug']+df['wallet']]", "(df['wallet']-df['phone'])"]
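One thing to keep in mind is that \w+ also matches digits, so a numeric literal such as the 2 in plug*2 would be wrapped as well. A possible variation, sketched below, matches only identifier-like words. The helper name wrap_columns, the frame_name parameter, and the extra "plug*2" input are assumptions made for illustration.

```python
import re

# Identifier-like words only: a letter or underscore, then word characters.
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")

def wrap_columns(expression, frame_name="df"):
    """Wrap every identifier in the expression as frame_name['identifier']."""
    return IDENTIFIER.sub(
        lambda match: f"{frame_name}['{match.group(0)}']", expression
    )

expressions = ["plug", "[plug+wallet]", "(wallet-phone)", "plug*2"]
wrapped = [wrap_columns(expression) for expression in expressions]
print(wrapped)
```

This still assumes column names look like Python identifiers; names containing spaces or punctuation would need a different pattern.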

Sorting a list based on upper and lower case

I have a list:
List1 = ['name','is','JOHN','My']
I want to append the pronoun as the first item in a new list and append the names at last. Other items should be in the middle and their positions can change.
So far I have written:
my_list = ['name','is','JOHN','My']
new_list = []
for i in my_list:
    if i.isupper():
        my_list.remove(i)
        new_list.append(i)
print(new_list)
Here, I can't check if an item is completely upper case or only its first letter is upper case.
Output I get:
['name','is','JOHN','My']
Output I want:
['My','name','is','JOHN']
or:
['My','is','name','JOHN']
EDIT: I have seen this post and it doesn’t have answers to my question.

i.isupper() will tell you if it's all uppercase.
To test if just the first character is uppercase and the rest lowercase, you can use i.istitle().
To make your final result, you can append to different lists based on the conditions.
my_list = ['name', 'is', 'JOHN', 'My']
all_cap = []
init_cap = []
non_cap = []
for i in my_list:
    if i.isupper():
        all_cap.append(i)
    elif i.istitle():
        init_cap.append(i)
    else:
        non_cap.append(i)
new_list = init_cap + non_cap + all_cap
print(new_list)

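How about this:
s = ['name', 'is', 'JOHN', 'My']
pronoun = ''
name = ''
for i in s:
    if i.isupper():
        name = i
    elif i.istitle():
        pronoun = i
# everything that is neither the pronoun nor the name stays in the middle
middle = [w for w in s if w not in (pronoun, name)]
result = [pronoun] + middle + [name]
print(result)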

Try this:
my_list = ['name','is','JOHN','My']
new_list = ['']
for word in my_list:
    # istitle() avoids the IndexError that indexing [1] would raise on one-letter words
    if word.istitle():
        new_list[0] = word
    elif word.islower():
        new_list.append(word)
    elif word.isupper():
        new_list.append(word)
print(new_list)
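The three-bucket idea from the answers above can also be expressed as a sort key. This is a minimal sketch rather than anyone's original solution: the function name case_rank is hypothetical, and it relies on Python's sort being stable, so words with the same rank keep their original relative order.

```python
def case_rank(word):
    """Rank words: title case first, all caps last, everything else between."""
    if word.istitle():   # e.g. 'My' (checked first so a lone 'I' counts as a pronoun)
        return 0
    if word.isupper():   # e.g. 'JOHN'
        return 2
    return 1             # e.g. 'name', 'is'

my_list = ['name', 'is', 'JOHN', 'My']
new_list = sorted(my_list, key=case_rank)
print(new_list)
```

Because sorted returns a new list, my_list itself is left unchanged, which avoids the problem of removing items from a list while iterating over it.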

how to remove "\n" from a list of strings

I have a list that is read from a text file that outputs:
['/Users/myname/Documents/test1.txt\n', '/Users/myname/Documents/test2.txt\n', '/Users/myname/Documents/test3.txt\n']
I want to remove the \n from each element, but using .split() does not work on lists only strings (which is annoying as this is a list of strings).
How do I remove the \n from each element so I can get the following output:
['/Users/myname/Documents/test1.txt', '/Users/myname/Documents/test2.txt', '/Users/myname/Documents/test3.txt']

old_list = [x.strip() for x in old_list]
old_list refers to the list you want to remove the \n from.
Or, if you prefer an explicit loop:
for x in range(len(old_list)):
    old_list[x] = old_list[x].strip()
This does the same thing without a list comprehension.
The strip() method removes all leading and trailing whitespace, including \n.
If you don't want to remove other whitespace from the start and end, you can do:
old_list = [x.replace("\n", "") for x in old_list]
or
for x in range(len(old_list)):
    old_list[x] = old_list[x].replace("\n", "")

Use strip, but keep in mind that it does not modify the original list, so you will need to reassign the result if required:
a = ['/Users/myname/Documents/test1.txt\n', '/Users/myname/Documents/test2.txt\n', '/Users/myname/Documents/test3.txt\n']
a = [path.strip() for path in a]
print(a)

Give this code a try:
lst = ['/Users/myname/Documents/test1.txt\n', '/Users/myname/Documents/test2.txt\n', '/Users/myname/Documents/test3.txt\n']
for n, element in enumerate(lst):
    element = element.replace('\n', '')
    lst[n] = element
print(lst)

Use:
[i.strip() for i in lines]
if you don't mind losing the spaces and tabs at the beginning and end of each line.

You can read the whole file and split it into lines using str.splitlines, which drops the line endings:
temp = file.read().splitlines()
If you still have problems, see the question this answer comes from:
How to read a file without newlines?
answered Sep 8 '12 at 11:57 Bakuriu
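A self-contained sketch of the splitlines approach is shown below. The helper name read_lines is hypothetical, and the temporary file only stands in for the real text file from the question so that the example can run on its own.

```python
import os
import tempfile

def read_lines(path):
    """Read a text file and return its lines without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().splitlines()

# Create a throwaway file for the demonstration (an assumption; in practice
# you would pass the path of your own file).
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("/Users/myname/Documents/test1.txt\n")
    tmp.write("/Users/myname/Documents/test2.txt\n")
    demo_path = tmp.name

paths = read_lines(demo_path)
os.remove(demo_path)
print(paths)
```

Using a with block closes the file even if reading fails, and reading the lines this way means there is no \n to strip afterwards.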

There are many ways to achieve your result.
Method 1: using split() method
l = ['/Users/myname/Documents/test1.txt\n', '/Users/myname/Documents/test2.txt\n', '/Users/myname/Documents/test3.txt\n']
result = [i.split('\n')[0] for i in l]
print(result) # ['/Users/myname/Documents/test1.txt', '/Users/myname/Documents/test2.txt', '/Users/myname/Documents/test3.txt']
Method 2: using strip() method that removes leading and trailing whitespace
l = ['/Users/myname/Documents/test1.txt\n', '/Users/myname/Documents/test2.txt\n', '/Users/myname/Documents/test3.txt\n']
result = [i.strip() for i in l]
print(result) # ['/Users/myname/Documents/test1.txt', '/Users/myname/Documents/test2.txt', '/Users/myname/Documents/test3.txt']
Method 3: using rstrip() method that removes trailing whitespace
l = ['/Users/myname/Documents/test1.txt\n', '/Users/myname/Documents/test2.txt\n', '/Users/myname/Documents/test3.txt\n']
result = [i.rstrip() for i in l]
print(result) # ['/Users/myname/Documents/test1.txt', '/Users/myname/Documents/test2.txt', '/Users/myname/Documents/test3.txt']
Method 4: using the replace() method, which removes every \n in the string
l = ['/Users/myname/Documents/test1.txt\n', '/Users/myname/Documents/test2.txt\n', '/Users/myname/Documents/test3.txt\n']
result = [i.replace('\n', '') for i in l]
print(result) # ['/Users/myname/Documents/test1.txt', '/Users/myname/Documents/test2.txt', '/Users/myname/Documents/test3.txt']
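The four methods give the same result for the paths in the question, but they differ once a string has other whitespace around it. The sketch below uses a made-up sample string with leading and trailing spaces to show the difference.

```python
# Hypothetical sample: two leading spaces, one trailing space, then a newline.
sample = "  /Users/myname/Documents/test1.txt \n"

stripped = sample.strip()            # removes all leading and trailing whitespace
right_stripped = sample.rstrip()     # removes trailing whitespace only
newline_only = sample.rstrip("\n")   # removes only trailing newline characters
replaced = sample.replace("\n", "")  # removes every newline, wherever it is
first_piece = sample.split("\n")[0]  # keeps the text before the first newline

for label, value in [
    ("strip()", stripped),
    ("rstrip()", right_stripped),
    ('rstrip("\\n")', newline_only),
    ('replace("\\n", "")', replaced),
    ('split("\\n")[0]', first_piece),
]:
    print(f"{label:20} {value!r}")
```

If spaces can be a meaningful part of a path, rstrip("\n") is probably the safest choice, since it touches nothing except the trailing newline.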

Here is another way to do it with lambda (in Python 3, map returns an iterator, so it is wrapped in list to get a list back):
cleannewline = lambda somelist: list(map(lambda element: element.strip(), somelist))
Then you can just call it as:
cleannewline(yourlist)

How to compare reverse strings in list of strings with the original list of strings in python?

Given an input string, check whether any word in it matches the reverse of another word in the same string. If so, print that word; otherwise print $.
I split the string into a list of words and reversed each word, but I wasn't able to compare the two lists.
str = input()
x = str.split()
for i in x:  # printing i shows the words in the list
    str1 = i[::-1]  # printing str1 shows the reverse of words in a new list
    # now how to check if any word of the new list matches to any word of the old list
    if(i==str):
        print(i)
        break
    else:
        print('$')
Input: suman is a si boy.
Output: is ( since reverse of 'is' is present in the same string)

You almost have it; you just need another loop to compare each word against each reversed word. Try the following:
text = input()
x = text.split()
for i in x:
    str1 = i[::-1]
    for j in x:  # <-- this is the new nested loop you are missing
        if j == str1:  # compare each reversed word against each regular word
            if len(str1) > 1:  # optional condition to exclude single-letter words
                print(i)
Update
To only print the first occurrence of a match, you could, in the second loop, only check the elements that come after. We can do this by keeping track of the index:
text = input()
x = text.split()
for index, i in enumerate(x):
    str1 = i[::-1]
    for j in x[index+1:]:  # <-- only consider words that are ahead
        if j == str1:
            if len(str1) > 1:
                print(i)
Note that I used index+1 so that a single-word palindrome is not considered a match with itself.

a = 'suman is a si boy'
# Construct the list of words
words = a.split(' ')
# Construct the list of reversed words
reversed_words = [word[::-1] for word in words]
# Get an intersection of these lists converted to sets
print(set(words) & set(reversed_words))
will print (the order of elements in a set may vary):
{'si', 'is', 'a'}

Another way to do this is just in a list comprehension:
string = 'suman is a si boy'
output = [x for x in string.split() if x[::-1] in string.split()]
print(output)
The split on string creates a list split on spaces. Then the word is included only if the reverse is in the string.
Output is:
['is', 'a', 'si']
One note: you have a variable named str. It is best not to do that, because str is a Python built-in and shadowing it could cause other issues in your code later on.
If you want only words more than one letter long, you can do:
string = 'suman is a si boy'
output = [x for x in string.split() if x[::-1] in string.split() and len(x) > 1]
print(output)
this gives:
['is', 'si']
Finally, in order to get just the 'is':
string = 'suman is a si boy'
seen = []
output = [x for x in string.split() if x[::-1] not in seen and not seen.append(x) and x[::-1] in string.split() and len(x) > 1]
print(output)
output is:
['is']
However, I don't believe this is necessarily a good way to do it. You are storing information in seen during the list comprehension and referencing that same list at the same time. :)

This approach won't report 'a', and it reports 'is' only once rather than both 'is' and 'si'.
text = input()  # get input string
x = text.split()  # list of words
y = []  # matched words
while len(x) > 0:
    a = x.pop(0)  # remove the first word from the list and keep it in a
    if a[::-1] in x:  # check whether the reversed word is among the remaining words
        # a has already been popped, so a single-letter word such as 'a' cannot match itself,
        # and a pair such as 'is'/'si' is reported only once
        y.append(a)
print(y)  # ['is'] for the input 'suman is a si boy'
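None of the snippets above prints $ when there is no match, which the question asks for. One way to cover that case is sketched below. The function name first_reversed_match and the placeholder parameter are assumptions; the function returns the first word whose reverse appears later in the sentence, which also keeps a lone single-letter word such as 'a' from matching itself.

```python
def first_reversed_match(sentence, placeholder="$"):
    """Return the first word whose reverse occurs later in the sentence."""
    words = sentence.split()
    for position, word in enumerate(words):
        # Only look at the words after the current one, so a word is never
        # compared with itself and each pair is reported once.
        if word[::-1] in words[position + 1:]:
            return word
    return placeholder

print(first_reversed_match("suman is a si boy"))
print(first_reversed_match("no match here"))
```

Note that punctuation attached to a word (for example 'boy.') is treated as part of the word here, just as it is by str.split in the answers above.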
