Write a code with collection library in python - python

I have a list of words in which some words contain the same letters as others, only in a different order (for example, the first and second words below). I want to write code that uses the collections library to build a main list whose elements are sublists, each holding the words that are made of the same letters.
Input list example:
['abc','acb','hds','sdh','nm','mn']
Output example:
[['abc', 'acb'], ['hds', 'sdh'], ['nm', 'mn']]
No matter how hard I tried, I could not complete it

lst = ['abc', 'acb', 'hds', 'sdh', 'nm', 'mn']
def match(lst: list[str]) -> list[list[str]]:
    new_lst = []
    for i in range(0, len(lst)):
        for j in range(i + 1, len(lst)):
            if sorted(lst[i]) == sorted(lst[j]):
                create_new_pair = True
                # check if it already has a pair to join
                for pair in new_lst:
                    if lst[j] in pair:
                        create_new_pair = False
                        break
                    elif sorted(pair[0]) == sorted(lst[j]):
                        pair.append(lst[j])
                        create_new_pair = False
                        break
                # create new pair if it is a new one
                if create_new_pair:
                    new_lst.append([lst[i], lst[j]])
    return new_lst
new_lst = match(lst)
print(new_lst)
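Since the question asks specifically for the collections library, here is a minimal sketch of one way that could look, using collections.defaultdict and the sorted letters of each word as the grouping key. The function name group_anagrams is made up for illustration and is not from the question or the answer above.

```python
from collections import defaultdict


def group_anagrams(words):
    """Group words that are made of exactly the same letters."""
    groups = defaultdict(list)
    for word in words:
        # Words with the same letters share the same sorted-letter key.
        key = "".join(sorted(word))
        groups[key].append(word)
    # Dicts keep insertion order (Python 3.7+), so the groups come out
    # in order of first appearance.
    return list(groups.values())


lst = ['abc', 'acb', 'hds', 'sdh', 'nm', 'mn']
print(group_anagrams(lst))
# [['abc', 'acb'], ['hds', 'sdh'], ['nm', 'mn']]
```

Unlike the nested-loop version, this makes a single pass over the list, and a word with no partner ends up in a group of its own rather than being dropped.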

Related

How do I compare two letters or values in two different lists?

I am making a program that has two lists (in Python), and each list contains 5 different letters. How do I compare the items at any index I choose in both lists and uppercase a letter if the condition is true? If the two values at that index are the same (in my case a lowercase letter), then I want the letter in the second list to become uppercase.
example/attempt (I don't know what I'm doing):
if list1[0] = list2[0]:
    upper(list2[0])
Without an example of your input and output, it's difficult to understand what your goal is. If your goal is to use .upper() on any string in list2 where list1[i] and list2[i] are equal, you can use a combination of zip and enumerate to compare, and then assign the uppercase string back to list2[i], like so:
list1 = ['a', 'b', 'c']
list2 = ['a', 'p', 'q']
for i, (x, y) in enumerate(zip(list1, list2)):
    if x == y:
        list2[i] = y.upper()
print(list2)
Output:
['A', 'p', 'q']
I think you could use something like this:
def compare_and_upper(lst1, lst2):
    for i in range(len(lst1)):
        if lst1[i].upper() == lst2[i].upper():
            return lst1[i].upper()
    return None
This is not a full solution to your problem, but rather a demonstration of how to do the comparisons, which you can then reuse or modify to build the solution you want.
import string
from random import choices
def create_random_string(str_len=10):
    # k = the number of letters that we want it to return.
    return "".join(choices(string.ascii_lowercase, k=str_len))
def compare(str_len=10):
    # Create the two strings that we want to compare
    first_string = create_random_string(str_len)
    second_string = create_random_string(str_len)
    # comp_string will hold the final string that we want to return.
    comp_string = ""
    # Because the length of the strings is based on the variable str_len,
    # we can use the range of that number to iterate over our comparisons.
    for i in range(str_len):
        # Compares the i'th position of the strings
        # If they match, then add the uppercase version to comp_string
        if first_string[i] == second_string[i]:
            comp_string += first_string[i].upper()
        else:
            comp_string += "-"
    return comp_string
for _ in range(10):
    print(compare(20))
Sample output:
--------------------
---AS---D---------D-
----W--Q--------E---
--------------------
-----------------E--
------T-------------
--------------------
-------------P------
-----S--------------
--B-----------------

How to do slicing in strings in python?

I am trying to slice the string "abcdeeefghij" so that, whatever input I use, the output is a list of substrings in which no letter repeats within a single list element.
In this case: [abcde,e,efghij].
Another example: if the input is "aaabcdefghiii", the expected output is [a,a,abcdefghi,i,i].
Also, to find the length of the longest string in the list, I tried the logic below:
max_str = max(len(sub_strings[0]),len(sub_strings[1]),len(sub_strings[2]))
print(max_str) #output - 6
This yields 6 as the output, but I presume this logic is not generic. Can someone suggest a generic way to print the length of the longest string?
Here is how:
s = "abcdeeefghij"
l = ['']
for c in s: # For each character in s
    if c in l[-1]: # If the character is already in the last string in l
        l.append('') # Add a new string to l
    l[-1] += c # Add the character to the last string, whether it is new or old
print(l)
Output:
['abcde', 'e', 'efghij']
Use a regular expression:
import re
rx = re.compile(r'(\w)\1+')
strings = ['abcdeeefghij', 'aaabcdefghiii']
lst = [[part for part in rx.split(item) if part] for item in strings]
print(lst)
Which yields
[['abcd', 'e', 'fghij'], ['a', 'bcdefgh', 'i']]
You would loop over the characters in the input and start a new string if there is an existing match, otherwise join them onto the last string in the output list.
input_ = "aaabcdefghiii"
output = []
for char in input_:
    if not output or char in output[-1]:
        output.append("")
    output[-1] += char
print(output)
To avoid repeating a letter within a list element, you can greedily track which letters are already in the current element. Append the current element to your answer as soon as you detect a repeated letter.
from collections import defaultdict
s = input()
ans = []
d = defaultdict(int)
cur = ""
for i in s:
    if d[i]:
        ans.append(cur)
        cur = i # start again since there is a repetition
        d = defaultdict(int)
        d[i] = 1
    else:
        cur += i # append to cur since no repetition yet
        d[i] = 1
if cur: # handling the last part
    ans.append(cur)
print(ans)
An input of aaabcdefghiii produces ['a', 'a', 'abcdefghi', 'i', 'i'] as expected.
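The second part of the question (a generic way to get the length of the longest element) is not covered by the answers above. A minimal sketch, assuming the substrings are held in a list called sub_strings as in the question: the built-in max accepts any iterable, so the elements do not have to be listed one by one. The helper name split_on_repeat is made up here and simply wraps the loop shown earlier.

```python
def split_on_repeat(text):
    """Split text so that no character repeats within one element."""
    parts = []
    for char in text:
        # Start a new element when the character is already in the last one.
        if not parts or char in parts[-1]:
            parts.append("")
        parts[-1] += char
    return parts


sub_strings = split_on_repeat("abcdeeefghij")
print(sub_strings)  # ['abcde', 'e', 'efghij']

# Works for any number of elements; default=0 covers an empty list.
max_len = max((len(part) for part in sub_strings), default=0)
print(max_len)  # 6

# The longest element itself, if that is needed instead of its length.
print(max(sub_strings, key=len))  # efghij
```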

How to append items in list which starts with alphabet python

Find a sublist which starts with alphabet in python 3?
import re
code_result = [['1', 'abc_123', '0.40','7.55'], ['paragraph', '100', 'ML MY'],
               ['2','abc_456', '0.99'], ['letters and words','end','99']]
index_list = []
sub_list = []
for i in range(0,len(code_result)):
    if code_result[i][0].isalpha():
        index_list.append([i,i-1])
for item in range(0,len(index_list)):
    temp = re.sub('[^0-9a-zA-Z]','',str(code_result[index_list[item][0]]))
    sub_list.append([code_result[index_list[item][1]][1]+" "+temp])
print(sub_list)
My code works for only one alphabetic sublist, not for more than that.
Expected Output:
[['abc_123 paragraph 100 MLMY'],['abc_456 letters and words end 99']]
This will do what you need with minimal changes
import re
code_result = [['1', 'abc_123', '0.40','7.55'], ['paragraph', '100', 'ML MY'], ['2','abc_456', '0.99'], ['letters and words','end','99']]
index_list = []
sub_list = []
for i in range(0,len(code_result)):
    if code_result[i][0][0].isalpha():
        index_list.append([i,i-1])
for item in range(0,len(index_list)):
    temp = re.sub('[^0-9a-zA-Z ]','',str(code_result[index_list[item][0]]))
    sub_list.append([code_result[index_list[item][1]][1]+" "+temp])
print(sub_list)
However, I am still unclear about what you are trying to do, and I think that, whatever it is, it could be done better.
Since only letters have an uppercase and lowercase variation, you could use that as a condition. The whole thing could fit in a single list comprehension:
sub_list = [[s for s in a if s[0].lower()!=s[0].upper()] for a in code_result]
# [['abc_123'], ['paragraph', 'ML MY'], ['abc_456'], ['letters and words', 'end']]
Note that your problem statement and expected output are ambiguous. They could also mean:
sublists that start with an item that contains letters (based on the question title):
[ a for a in code_result if a[0].lower()!=a[0].upper()]
# [['paragraph', '100', 'ML MY'], ['letters and words', 'end', '99']]
OR, based on the expected output, sub list elements that start with a letter, sometimes taken individually and other times using the whole sublist and arbitrarily concatenated into a single string within a sub list.
Here is another solution that ends up with your desired output, using re.search to check whether an item starts with a letter.
import re
code_result = [['1', 'abc_123', '0.40','7.55'], ['paragraph', '100', 'ML MY'], ['2','abc_456', '0.99'], ['letters and words','end','99']]
l1 = []
l2 = []
last = False
for x in code_result:
    if last:
        for y in range(len(x)):
            l1.append(x[y])
            if y == len(x)-1:
                l2.append([' '.join(l1)])
                l1 = []
                last = False
    else:
        for y in x:
            a = re.search('^[a-zA-Z]', y)
            if a:
                l1.append(y)
                last = True
                break
print(l2)
This code iterates over your list of lists and checks whether an item in a sublist starts with a letter; if one does, it stores that item and breaks out of the inner loop. When last is True, it appends all items from the subsequent sublist and joins everything into a single string.
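If the data always alternates between a numeric row and a text row, as it does in the sample (this is an assumption, the question does not say so), the pairing can be sketched with zip over the even and odd positions. The function name merge_pairs is made up for illustration.

```python
code_result = [
    ['1', 'abc_123', '0.40', '7.55'],
    ['paragraph', '100', 'ML MY'],
    ['2', 'abc_456', '0.99'],
    ['letters and words', 'end', '99'],
]


def merge_pairs(rows):
    """Join the label of each even-indexed row with the full row after it."""
    merged = []
    # rows[::2] are the numeric rows, rows[1::2] the text rows that follow.
    for numeric_row, text_row in zip(rows[::2], rows[1::2]):
        label = numeric_row[1]  # e.g. 'abc_123'
        merged.append([" ".join([label, *text_row])])
    return merged


print(merge_pairs(code_result))
# [['abc_123 paragraph 100 ML MY'], ['abc_456 letters and words end 99']]
```

Note that this keeps the space in 'ML MY', whereas the expected output in the question shows 'MLMY'; the question does not explain that difference, so it is left alone here.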

most pythonic way to compare substrings l in list L to string S & edit S according to l in L?

The list ['a','a #2','a(Old)'] should become {'a'} because '#' and '(Old)' are to be excised and a list of duplicates isn't needed. I struggled to develop a list comprehension with a generator and settled on this since I knew it'd work and valued time more than looking good:
l = []
groups = ['a','a #2','a(Old)']
for i in groups:
    if ('#') in i: l.append(i[:i.index('#')].strip())
    elif ('(Old)') in i: l.append(i[:i.index('(Old)')].strip())
    else: l.append(i)
groups = set(l)
What's the slick way to get this result?
Here is a general solution, if you want to clean the elements of the list lst of the parts listed in wastes:
lst = ['a','a #2','a(Old)']
wastes = ['#', '(Old)']
cleaned_set = {
    min([element.split(waste)[0].strip() for waste in wastes])
    for element in lst
}
You could write this whole expression in a single set comprehension
>>> groups = ['a','a #2','a(Old)']
>>> {i.split('#')[0].split('(Old)')[0].strip() for i in groups}
{'a'}
This will get everything preceding a # and everything preceding '(Old)', then trim off whitespace. The remainder is placed into a set, which only keeps unique values.
You could define a helper function to apply all of the splits and then use a set comprehension.
For example:
lst = ['a','a #2','a(Old)', 'b', 'b #', 'b(New)']
splits = {'#', '(Old)', '(New)'}
def split_all(a):
    for s in splits:
        a = a.split(s)[0]
    return a.strip()
groups = {split_all(a) for a in lst}
#{'a', 'b'}
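Another option, sketched here under the assumption that everything from the first marker onward should be discarded: build one regular expression from the markers and split once. The names markers and strip_markers are made up for illustration. re.escape is needed because '(Old)' contains characters that are special in a regular expression.

```python
import re

groups = ['a', 'a #2', 'a(Old)']
markers = ['#', '(Old)']  # hypothetical list of substrings to cut at

# Escape each marker so that '(' and ')' are matched literally.
marker_pattern = re.compile("|".join(re.escape(m) for m in markers))


def strip_markers(text):
    """Keep only the part of text before the first marker."""
    return marker_pattern.split(text, maxsplit=1)[0].strip()


cleaned = {strip_markers(item) for item in groups}
print(cleaned)  # {'a'}
```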

removing duplicates from a bool list

I am trying to get the word in a list that follows a word with a "." in it. For example, if this is the list
test_list = ["hello", "how", "are.", "you"]
it would select the word "you". I have managed to pull this off, but I am trying to ensure that I do not get duplicate words.
Here is what I have so far
list = []
i = 0
bool = False
words = sent.split()
for word in words:
    if bool:
        list.append(word)
        bool = False
    # the if statement below seems to make everything worse instead of fixing the duplicate problem
    if "." in word and word not in list:
        bool = True
return list
Your whole code can be reduced to this example using zip() and list comprehension:
a = ['hello', 'how', 'are.', 'you']
def get_new_list(a):
    return [v for k,v in zip(a, a[1:]) if k.endswith('.')]
Then, to remove the duplicates, if there are any, you can use set(), as in this example:
final = set(get_new_list(a))
output:
{'you'}
This isn't based on the code you posted; however, it should do exactly what you're asking.
def get_word_after_dot(words):
    for index, word in enumerate(words):
        if word.endswith('.') and len(words) - index > 1:
            yield words[index + 1]
Iterating over this generator will yield the words that follow a word ending in a period.
Here is a different approach to the same problem.
import itertools
from collections import deque
t = deque(map(lambda x: '.' in x, test_list)) # create a deque of bools
# deque([False, False, True, False])
t.rotate(1) # shift it by one since we want the word after the '.'
# deque([False, False, False, True])
set(itertools.compress(test_list, t)) # and then grab everywhere it is True
# {'you'}
The itertools recipes include a definition of pairwise, which is useful for iterating over a list two items at a time:
import itertools as it
def pairwise(iterable):
    a, b = it.tee(iterable)
    next(b, None)
    return a, b
You can use this to create a list of words that follow a word ending in '.' (here l is the list of words):
words = [n for m, n in zip(*pairwise(l)) if m[-1] == '.']
Remove duplicates:
seen = set()
results = [x for x in words if not (x in seen or seen.add(x))]
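As a possible alternative for the de-duplication step, dict.fromkeys keeps the first occurrence of each key in order, so it can replace the seen-set idiom. This is only a sketch; the function name words_after_period is made up, and it assumes that "a word with a period" means a word ending in '.'.

```python
def words_after_period(words):
    """Return each word that follows a word ending in '.', without duplicates."""
    followers = [nxt for cur, nxt in zip(words, words[1:]) if cur.endswith(".")]
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(followers))


test_list = ["hello", "how", "are.", "you", "fine.", "you", "ok.", "then"]
print(words_after_period(test_list))  # ['you', 'then']
```

Unlike set(), this preserves the order in which the words first appear.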
