Get matching and non-matching items from two lists - python

I have to compare two lists and print their matching and non-matching elements. I have tried the code below:
list1 = ["prefencia","santro ne prefence"]
I'm fetching all the text from a webpage using Selenium's getText() method. The text is stored in a string variable, which is then assigned to list2:
str = "Centro de prefencia de lilly ac"
list2 = []
list2 = str
for item in list1:
    if item in list2:
        print("match:", item)
    else:
        print("no_match:", item)
Result of the above code:
match: prefencia
It seems the in keyword is working like "contains". I want to search for an exact match between the elements in list1 and the elements in list2.

At least here you have a problem:
list1 = ["prefencia","santro ne prefence"]
str = "Centro de prefencia de lilly ac"
list2 = []
list2 = str
Do you want the list2 variable to be a list? Right now you set it to an empty list and, on the next line, rebind it to the string str, so item in list2 performs a substring test.
How about something like this (I am guessing at what you want to do):
list1 = ["prefencia","santro ne prefence"]
mystr = "Centro de prefencia de lilly ac"
list2 = mystr.split(' ')  # splits your string into a list of words
for item in list1:
    if item in list2:
        print("match:", item)
    else:
        print("no_match:", item)
But if you split your string into a list of words, you'll never get an exact match for multi-word items such as "santro ne prefence".
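One possible way around the multi-word limitation is to search the original string with a regular expression that only accepts the phrase when it is not glued to other word characters. This is a minimal sketch, not the answerer's method; the helper name contains_phrase, the variable page_text, and the extra test item "pref" are assumptions added for illustration.

```python
import re

def contains_phrase(phrase, text):
    """Return True if phrase occurs in text as whole words, not inside a longer word."""
    # (?<!\w) and (?!\w) reject matches that touch another word character.
    pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
    return re.search(pattern, text) is not None

list1 = ["prefencia", "santro ne prefence", "pref"]  # "pref" is an illustrative extra item
page_text = "Centro de prefencia de lilly ac"

for item in list1:
    label = "match:" if contains_phrase(item, page_text) else "no_match:"
    print(label, item)
```

With this check, "pref" is reported as no_match even though it is a substring of "prefencia", while a multi-word phrase would still match if it appeared in the text as a whole.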

Related

How to print each element of a list on a new line in a nested list

I want to print each individual element of the lists inside a nested list. It should also print symbols to show where the different lists end. There will only ever be 3 lists in the big list.
For example,
list1 = [['assign1', 'assign2'], ['final', 'assign4'], ['exam', 'study']]
Output should be:
######################
assign1
assign2
######################
----------------------
final
assign4
----------------------
*************************
exam
study
*************************
I know that to print the elements of a normal list it is:
for element in list1:
    print(element)
I am unaware of what steps to take next.
You can create another for loop around the one you know how to create. So:
for list2 in list1:
    # print some stuff here
    for word in list2:
        print(word)
    # print some more stuff
Assuming that there is a single nesting level.
symbols = "#-*"
list1 = [['assign1', 'assign2'], ['final', 'assign4'], ['exam','study']]
for i, element in enumerate(list1):
    symbol = symbols[i % len(symbols)]
    print(symbol * 20)
    print('\n'.join(element))
    print(symbol * 20)

Using two lists of indices to search through a third list

Suppose I have two lists of indices
letters = ['a', 'c']
numbers = ['1','2','6']
These lists are generated from an interactive web interface element and are a necessary part of this scenario.
What is the most computationally efficient way in python I can use these two lists to search through the third list below for items?
list3 = ['pa1','pa2','pa3','pa4','pa5','pa6',
'pb1','pb2','pb3','pb4','pb5','pb6',
'pc1','pc2','pc3','pc4','pc5','pc6',
'pd1','pd2','pd3','pd4','pd5','pd6']
Using the letters and numbers lists, I want to search through list3 and return this list
sublist = ['pa1', 'pa2', 'pa6', 'pc1', 'pc2', 'pc6']
I could do something like this:
sublist = []
for tag in list3:
    for l in letters:
        for n in numbers:
            if l in tag and n in tag:
                sublist.append(tag)
But I'm wondering if there's a better or more recommended way?
Most of all, do not iterate through the character lists. Instead, use simple in operations; use any or all operations where necessary. In this case, since your tags are all of the form p[a-d][0-9], you can directly check the appropriate character:
for tag in list3:
    if tag[1] in "ac" and tag[2] in "126":
        sublist.append(tag)
For many uses or a generalized case, replace the strings with sets for O(1) membership tests:
letter = {'a', 'c'}
number = {'1', '2', '6'}
for tag in list3:
    if tag[1] in letter and tag[2] in number:
        sublist.append(tag)
Next, replace the explicit append loop with a list comprehension. Appending to a list is amortized O(1), so the loop is not quadratic, but the comprehension is more concise and usually somewhat faster.
sublist = [tag for tag in list3
           if tag[1] in letter and tag[2] in number]
If the letters and numbers can appear anywhere in the tag, then you need a general search for each: look for an overlap in character sets:
sublist = [tag for tag in list3
           if any(char in letter for char in tag) and
              any(char in number for char in tag)
          ]
or with sets:
sublist = [tag for tag in list3
           if set(tag).intersection(letter) and
              set(tag).intersection(number)
          ]
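Put together, the positional variant can be sketched as a small self-contained script. The function name filter_tags is an assumption for illustration, and list3 is generated here rather than typed out; the approach relies on every tag having the form p, letter, digit, as in the question.

```python
def filter_tags(tags, letters, numbers):
    """Keep tags whose second character is a wanted letter and third a wanted digit."""
    letter_set = set(letters)  # sets give O(1) membership tests
    number_set = set(numbers)
    return [tag for tag in tags
            if tag[1] in letter_set and tag[2] in number_set]

letters = ['a', 'c']
numbers = ['1', '2', '6']
# Same contents and order as list3 in the question: pa1..pa6, pb1..pb6, ...
list3 = [f"p{l}{n}" for l in "abcd" for n in "123456"]

sublist = filter_tags(list3, letters, numbers)
print(sublist)  # ['pa1', 'pa2', 'pa6', 'pc1', 'pc2', 'pc6']
```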
Try this simple approach:
letters = ['a', 'c']
numbers = ['1','2','6']
sublist = []
result = []
list3 = ['pa1','pa2','pa3','pa4','pa5','pa6',
'pb1','pb2','pb3','pb4','pb5','pb6',
'pc1','pc2','pc3','pc4','pc5','pc6',
'pd1','pd2','pd3','pd4','pd5','pd6']
for letter in letters:
    sublist.extend(list(filter(lambda tag: letter in tag, list3)))
for number in numbers:
    result.extend(list(filter(lambda tag: number in tag, sublist)))
print(result)

Python - filter list from another other list with condition

list1 = ['/mnt/1m/a_pre.geojson','/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
I have multiple lists, and I want to find all the elements of list1 that do not have a corresponding entry in list2, subject to a filtering condition.
The condition is that both the 'm' directory (1m, 2m, ...) and the name of the geojson file, excluding the 'pre' or 'post' substring, must match.
For example, in list1 '/mnt/1m/a_pre.geojson' has been processed but '/mnt/2m/b_pre.geojson' has not, so the output should be the list ['/mnt/2m/b_pre.geojson'].
I am using two for loops and then splitting the strings. I am sure this is not the only way, and there might be an easier way to do this.
for i in list1:
    for j in list2:
        pre_tile = i.split("/")[-1].split('_pre', 1)[0]
        post_tile = j.split("/")[-1].split('_post', 1)[0]
        if pre_tile == post_tile:
            ...
I assume the first part of every file path has the same format (7 characters, such as '/mnt/1m'). If so, you can try this:
list1 = ['/mnt/1m/a_pre.geojson','/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
res = [x for x in list1 if x[:7] not in [y[:7] for y in list2]]
res:
['/mnt/2m/b_pre.geojson']
If I understand you correctly, using a regular expression to do this kind of string manipulation can be fast and easy.
Additionally, to do multiple member-tests in list2, it's more efficient to convert the list to a set.
import re
list1 = ['/mnt/1m/a_pre.geojson', '/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
pattern = re.compile(r'(.*?/[0-9]m/.*?)_pre\.geojson')  # dot escaped so it matches a literal '.'
set2 = set(list2)
result = [
    m.string
    for m in map(pattern.fullmatch, list1)
    if m and f"{m[1]}_post.geojson" not in set2
]
print(result)
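A variant that avoids both fixed-width slicing and regular expressions is to reduce each path to a comparison key. This is a minimal sketch under the assumption that the key is the pair (parent directory name, file name without the _pre or _post suffix); the helper name tile_key is hypothetical.

```python
from pathlib import PurePosixPath

def tile_key(path):
    """Return (directory name, base name without _pre/_post), e.g. ('1m', 'a')."""
    p = PurePosixPath(path)
    name = p.stem  # file name without the .geojson extension
    for suffix in ("_pre", "_post"):
        if name.endswith(suffix):
            name = name[:-len(suffix)]
            break
    return (p.parent.name, name)

list1 = ['/mnt/1m/a_pre.geojson', '/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']

processed = {tile_key(p) for p in list2}  # set for fast membership tests
unprocessed = [p for p in list1 if tile_key(p) not in processed]
print(unprocessed)  # ['/mnt/2m/b_pre.geojson']
```

Because the key is computed once per path, this replaces the nested loops with a single pass over each list.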

How to create a list of lists of tokens extracted in different sentences?

I have a function that returns a list of elements (lemmatized sentences) and the length of each element. I use that function to extract, from each element of that list, the words that are present in a lexicon.
The problem is that the script below returns a single flat list of all the matching words, but I want a list of the matching words for each element of my list. That is, I want a list of lists, where each inner list contains only the lexicon words that appear in that particular element, not one big collection covering all the elements.
My script is below. I tried two things, a list comprehension and a loop, but both solutions always give me one list of all the words and not a list of lists of words:
def polarity_word(texte, listpos, listneg):
    lemme_sent, len_sent = lemmatisation(texte)  # list of elements (lemmatized sentences)
    list_pos = []
    list_neg = []
    intersection = [w for w in listpos for elt in lemme_sent if w in elt]
    # other way
    for elt in lemme_sent:
        for w in elt.split():
            if w in listpos:
                list_pos.append([w])
# test data:
lemme_sent = ['je vie manger et boire', 'je être bel et lui très beau']
len_sent = [5, 7]
list_pos = ['luire', 'manger', 'vie', 'soleil', 'boire', 'demain', 'soir', 'bel', 'temps', 'beau']
print(intersection)
expected answer
[['vie', 'manger', 'boire'], ['bel', 'beau']]
instead I have
['vie', 'manger', 'boire', 'bel', 'beau']
def polarity_word(texte, listpos, listneg):
    lemme_sent, len_sent = lemmatisation(texte)
    intersection = []
    for elt in lemme_sent:
        intersection.append([word for word in listpos if word in elt])
    return intersection
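Note that word in elt above is a substring test on the whole sentence, and the words come out in lexicon order. If whole-word matching in sentence order is wanted, one possible variant splits each sentence first. This is a sketch that skips the lemmatisation step and works directly on the test data from the question; the function name words_in_lexicon is an assumption.

```python
def words_in_lexicon(sentences, lexicon):
    """Return one list per sentence with the words that appear in the lexicon."""
    lexicon_set = set(lexicon)  # set for fast membership tests
    return [[w for w in sentence.split() if w in lexicon_set]
            for sentence in sentences]

lemme_sent = ['je vie manger et boire', 'je être bel et lui très beau']
list_pos = ['luire', 'manger', 'vie', 'soleil', 'boire',
            'demain', 'soir', 'bel', 'temps', 'beau']

result = words_in_lexicon(lemme_sent, list_pos)
print(result)  # [['vie', 'manger', 'boire'], ['bel', 'beau']]
```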

Splitline Python String

I have a list of elements whose text is like the following:
aSampleElementText = "Vraj Shroff\nIndia"
I want to have two lists, where the first list's element would be "Vraj Shroff" and the second list's element would be "India".
I looked at other posts about split and splitlines. However, my code below is not giving me expected results.
Output:
"V",
"r"
Desired output:
"Vraj Shroff",
"India"
My code:
personName = "Something"  # first list
personTitle = "Something"  # second list
for i in range(len(names) - 1):
    # names is a list of elements (example above)
    # it is len - 1 because I don't want to do this to the first element of the list
    i += 1
    temp = names[i].text
    temp.splitlines()
    personName.append(temp[0])
    personTitle.append(temp[1])
temp is a string, and splitlines() returns a new list rather than modifying temp, so temp[0] and temp[1] are the first two characters of the string. Hence you are getting this kind of output.
Do something like,
x = temp.splitlines()
x will be the list with the elements.
names = []
locations = []
a = ["Vraj Shroff\nIndia", "Vraj\nIndia", "Shroff\nxyz", "abd cvd\nUS"]
for i in a:
    b = i.splitlines()
    names.append(b[0])
    locations.append(b[1])
print(names)
print(locations)
output:
['Vraj Shroff', 'Vraj', 'Shroff', 'abd cvd']
['India', 'India', 'xyz', 'US']
Is this what you were looking for?
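To also skip the first element, as the question requires, and to tolerate entries with no second line, a sketch along the following lines may help. It assumes the element texts have already been extracted into plain strings; the list texts and its sample values are made up for illustration.

```python
# Assumed sample data: the first entry stands in for the element to skip.
texts = ["Header row", "Vraj Shroff\nIndia", "Abd Cvd\nUS", "NoTitle"]

person_names = []
person_titles = []
for text in texts[1:]:  # slicing skips the first element
    # partition splits at the first newline; title is '' if there is none
    name, _, title = text.partition("\n")
    person_names.append(name)
    person_titles.append(title)

print(person_names)   # ['Vraj Shroff', 'Abd Cvd', 'NoTitle']
print(person_titles)  # ['India', 'US', '']
```

Using partition instead of indexing into the result of splitlines() avoids an IndexError when an entry has only one line.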
