Recursive Selection Sort python - python

I have to implement a recursive selection sort for an upcoming question:
def selsort(l):
    """
    sorts l in-place.
    PRE: l is a list.
    POST: l is a sorted list with the same elements; no return value.
    """
l1 = list("sloppy joe's hamburger place")
vl1 = l1
print l1 # should be: """['s', 'l', 'o', 'p', 'p', 'y', ' ', 'j', 'o', 'e', "'", 's', ' ', 'h', 'a', 'm', 'b', 'u', 'r', 'g', 'e', 'r', ' ', 'p', 'l', 'a', 'c', 'e']"""
ret = selsort(l1)
print l1 # should be """[' ', ' ', ' ', "'", 'a', 'a', 'b', 'c', 'e', 'e', 'e', 'g', 'h', 'j', 'l', 'l', 'm', 'o', 'o', 'p', 'p', 'p', 'r', 'r', 's', 's', 'u', 'y']"""
print vl1 # should be """[' ', ' ', ' ', "'", 'a', 'a', 'b', 'c', 'e', 'e', 'e', 'g', 'h', 'j', 'l', 'l', 'm', 'o', 'o', 'p', 'p', 'p', 'r', 'r', 's', 's', 'u', 'y']"""
print ret # should be "None"
I know how to get this by using key → l.sort(key=str.lower). But the question wants me to extract the maximum element, instead of the minimum, only to .append(...) it on to a recursively sorted sublist.
If I could get any help I would greatly appreciate it.

So. Do you understand the problem?
Let's look at what you were asked to do:
extract the maximum element, instead of the minimum, only to .append(...) it on to a recursively sorted sublist.
So, we do the following things:
1) Extract the maximum element. Do you understand what "extract" means here? Do you know how to find the maximum element?
2) Recursively sort the sublist. Here, "the sublist" consists of everything else after we extract the maximum element. Do you know how recursion works? You just call your sort function again with the sublist, relying on it to do the sorting. After all, the purpose of your function is to sort lists, so this is supposed to work, right? :)
3) .append() the maximum element onto the result of sorting the sublist. This should not require any explanation.
Of course, we need a base case for the recursion. When do we have a base case? When we can't follow the steps exactly as written. When does that happen? Well, why would it happen? Answer: we can't extract the maximum element if there are no elements, because then there is no maximum element to extract.
Thus, at the beginning of the function we check if we were passed an empty list. If we were, we just return an empty list, because sorting an empty list results in an empty list. (Do you see why?) Otherwise, we go through the other steps.
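The three steps and the base case can be sketched as follows. This is only one possible reading of the task: it assumes that "extract" may be done with the built-ins max() and list.remove(), and it follows the in-place contract from the question's docstring, so the base case simply returns instead of handing back an empty list.

```python
def selsort(l):
    """Sort l in place by recursive max-extraction; returns None."""
    if not l:            # base case: an empty list has no maximum to extract
        return
    biggest = max(l)     # step 1: find the maximum element...
    l.remove(biggest)    # ...and extract it (removes its first occurrence)
    selsort(l)           # step 2: recursively sort everything that is left
    l.append(biggest)    # step 3: the maximum belongs at the end


l1 = list("sloppy joe's hamburger place")
vl1 = l1                 # second name for the same list object
ret = selsort(l1)
print(l1)
print(ret)
```

Because the list is modified in place, vl1 sees the sorted result as well, and ret is None.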

The sort method should do what you want. If you want the reverse order, just call list.reverse() afterwards.
If your job is to make your own sort method, that can be done.
Maybe try something like this:
def sort(l):
    li = l[:]  # make a new copy so the caller's list is untouched
    newlist = []  # sorted list will be stored here
    while len(li) != 0:  # while there is stuff to be sorted
        bestindex = -1  # index of the lowest element found so far
        bestchar = -1  # ord value of the lowest character (-1 means none yet)
        bestcharrep = -1  # the lowest character itself
        i = 0
        for v in li:
            if ord(v) < bestchar or bestchar == -1:  # check if character is lower than old best
                bestindex = i  # update best records
                bestchar = ord(v)
                bestcharrep = v
            i += 1
        del li[bestindex]  # delete retrieved element from list
        newlist.append(bestcharrep)  # add element to new list
    return newlist  # return the sorted list

Related

Unique elements of sublists depending on specific value in sublist

I am trying to select unique datasets from a very large, quite inconsistent list.
My dataset RawData consists of lists of strings of different lengths.
Some items occur many times, for example: ['a','b','x','15/30']
The key used to compare the items is always the last string, for example '15/30'.
The goal is to get a list UniqueData in which each key occurs only once. (I want to keep the order.)
Dataset:
RawData = [['a','b','x','15/30'],['d','e','f','g','h','20/30'],['w','x','y','z','10/10'],['a','x','c','15/30'],['i','j','k','l','m','n','o','p','20/60'],['x','b','c','15/30']]
My desired solution Dataset:
UniqueData = [['a','b','x','15/30'],['d','e','f','g','h','20/30'],['w','x','y','z','10/10'],['i','j','k','l','m','n','o','p','20/60']]
I tried many possible solutions for instance:
for index, elem in enumerate(RawData): and appending to a new list if.....
for element in list does not work, because the items are not exactly the same.
Can you help me find a solution to my problem?
Thanks!
The best way to remove duplicates is to keep track of the keys you have already seen in a set. For each list, look at its last element: if it is already in the set unique, do nothing; otherwise add it to unique and append the list to the result list (called new here).
Try this.
new = []
unique = set()
for lst in RawData:
    if lst[-1] not in unique:
        unique.add(lst[-1])
        new.append(lst)
print(new)
# [['a', 'b', 'x', '15/30'],
#  ['d', 'e', 'f', 'g', 'h', '20/30'],
#  ['w', 'x', 'y', 'z', '10/10'],
#  ['i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', '20/60']]
You could set up a new list for the unique data and another list to track the keys you have seen so far. Then, as you loop through the data, if you have not seen the last element of a list before, append that list to the unique data and add its last element to the seen list.
RawData = [['a', 'b', 'x', '15/30'], ['d', 'e', 'f', 'g', 'h', '20/30'], ['w', 'x', 'y', 'z', '10/10'],
           ['a', 'x', 'c', '15/30'], ['i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', '20/60'], ['x', 'b', 'c', '15/30']]
seen = []
UniqueData = []
for data in RawData:
    if data[-1] not in seen:
        UniqueData.append(data)
        seen.append(data[-1])
print(UniqueData)
OUTPUT
[['a', 'b', 'x', '15/30'], ['d', 'e', 'f', 'g', 'h', '20/30'], ['w', 'x', 'y', 'z', '10/10'], ['i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', '20/60']]
RawData = [['a','b','x','15/30'],['d','e','f','g','h','20/30'],['w','x','y','z','10/10'],['a','x','c','15/30'],['i','j','k','l','m','n','o','p','20/60'],['x','b','c','15/30']]
seen = []
seen_indices = []
for idx, sublist in enumerate(RawData):
    # idx -> index
    # sublist -> individual lists
    if sublist[-1] not in seen:
        seen.append(sublist[-1])
    else:
        seen_indices.append(idx)
# delete duplicates from the end so the earlier indices stay valid
for index in sorted(seen_indices, reverse=True):
    del RawData[index]
print(RawData)
Using a set to filter out entries for which the key has already been seen is the most efficient way to go.
Here's a one liner example using a list comprehension with internal side effects:
UniqueData = [rd for seen in [set()] for rd in RawData if not(rd[-1] in seen or seen.add(rd[-1])) ]
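If the side effect inside the comprehension feels too clever, a dict keyed on the last element is another option. The sketch below is an alternative, not one of the approaches posted above; it assumes Python 3.7 or later, where dicts preserve insertion order, and the name first_by_key is made up for illustration.

```python
RawData = [['a', 'b', 'x', '15/30'], ['d', 'e', 'f', 'g', 'h', '20/30'],
           ['w', 'x', 'y', 'z', '10/10'], ['a', 'x', 'c', '15/30'],
           ['i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', '20/60'],
           ['x', 'b', 'c', '15/30']]

first_by_key = {}  # hypothetical name: maps each key to the first list that had it
for rd in RawData:
    # setdefault only stores rd when the key has not been seen before
    first_by_key.setdefault(rd[-1], rd)

UniqueData = list(first_by_key.values())
print(UniqueData)
```

Like the set-based versions, this does one hash lookup per list and keeps the first list seen for each key.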

How to compare element in list with the next element and separate if same?

I have a string, that contains random letters and numbers, but if there are two letters or numbers that are same and next to each other, then you have to separate them with "/". So the input string is "uBBjkko", and the result should be "uB/Bjk/ko".
Right now I have converted my string to a list so I can compare every element to the next:
mylist = ['u', 'B', 'B', 'j', 'k', 'k', 'o']
for i in range(len(mylist)):
    if mylist[i] == mylist[i + 1]:
        mylist.insert(i + 1, "/")
print("".join(mylist))
But the code doesn't work if the list gets longer and ends with two identical letters or numbers, such as
['u', 'B', 'B', 'j', 'k', 'k', 'o', '2', '2']
then the output will be "uB/Bjk/ko22" but it needs to be "uB/Bjk/ko2/2".
As I said in the comment, the problem is that you insert while iterating. Iterating the other way around, from the end to the beginning, fixes it, because each insertion then only shifts elements you have already visited:
mylist = ['u', 'B', 'B', 'j', 'k', 'k', 'o', '2', '2']
for i in range(len(mylist)-1, 0, -1):  # i runs from len(mylist)-1 down to 1
    if mylist[i] == mylist[i-1]:
        mylist.insert(i, '/')
print("".join(mylist))
from itertools import zip_longest
mylist = ['u', 'B', 'B', 'j', 'k', 'k', 'o', '2', '2']
print("".join([a + ('/' if a == b else '') for a,b in zip_longest(mylist, mylist[1:], fillvalue='')]))
This may be a bit much, but it is worth learning about the itertools module (zip_longest in this case, or the built-in zip) and about list comprehensions.
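Another way to avoid inserting while iterating is to build a new list instead of modifying the old one. A minimal sketch, where the function name separate_repeats and the sep parameter are made up for illustration:

```python
def separate_repeats(text, sep="/"):
    """Return text with sep placed between equal neighbouring characters."""
    out = []
    for ch in text:
        # out[-1] is always the previous character of the input here
        if out and out[-1] == ch:
            out.append(sep)
        out.append(ch)
    return "".join(out)


print(separate_repeats("uBBjkko"))
print(separate_repeats("uBBjkko22"))
```

Since the input is never modified, the loop index problem from the question cannot occur.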

How do I alphabetically sort an array of strings without any sort functions? Python

When solving the following problem:
"Assuming you have a random list of strings (for example: a, b, c, d, e, f, g), write a program that will sort the strings in alphabetical order.
You may not use the sort command."
I ran into a problem with the following code, which sometimes gives me duplicated strings in the final list.
I am fairly new to Python and our class just started to look at numpy and the functions in that module, and I'm not sure whether any of them may be used in the code (except any sort function).
import numpy as np
list=[]
list=str(input("Enter list of string(s): "))
list=list.split()
print() # for format space purposes
listPop=list
min=listPop[0]
newFinalList=[]
if(len(list)!=1):
    while(len(listPop)>=1):
        for i in range(len(listPop)):
            #setting min=first element of list
            min=listPop[0]
            if(listPop[i]<=min):
                min=listPop[i]
                print(min)
        listPop.pop(i)
        newFinalList.append(min)
    print(newFinalList)
else:
    print("Only one string inputted, so already alphabetized:",list)
Expected result of ["a","y","z"]
["a","y","z"]
Actual result...
Enter list of string(s): a y z
a
a
a
['a', 'a', 'a']
Enter list of string(s): d e c
d
c
d
d
['c', 'd', 'd']
Selection sort: for each index i of the list, select the smallest item at or after i and swap it into the ith position. Here's an implementation in three lines:
# For each index i...
for i in range(len(list)):
    # Find the position of the smallest item after (or including) i.
    j = list[i:].index(min(list[i:])) + i
    # Swap it into the i-th place (this is a no-op if i == j).
    list[i], list[j] = list[j], list[i]
list[i:] is a slice (subset) of list starting at the ith element.
min(list) gives you the smallest element in list.
list.index(element) gives you the (first) index of element in list.
a, b = b, a swaps the values of a and b in a single statement, without a temporary variable.
The trickiest part of this implementation is that when you're using index to find the index of the smallest element, you need to find the index within the same list[i:] slice that you found the element in, otherwise you might select a duplicate element in an earlier part of the list. Since you're finding the index relative to list[i:], you then need to add i back to it to get the index within the entire list.
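For comparison, here is a sketch of the same selection sort written without min() and index(), tracking the position of the smallest item directly. The names selection_sort, items and smallest are made up for illustration, and the function returns a sorted copy rather than sorting in place, which is a choice of this sketch rather than a requirement of the exercise.

```python
def selection_sort(items):
    """Return a new list with the elements of items in ascending order."""
    result = list(items)  # copy, so the input is left unchanged
    for i in range(len(result)):
        smallest = i  # position of the smallest item seen at or after i
        for j in range(i + 1, len(result)):
            if result[j] < result[smallest]:
                smallest = j
        # Swap the smallest remaining item into position i.
        result[i], result[smallest] = result[smallest], result[i]
    return result


print(selection_sort(["d", "e", "c"]))
print(selection_sort(["a", "y", "z"]))
```

Tracking the index instead of the value sidesteps the duplicate problem, because the swap always uses the exact position that was found.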
You can also implement quicksort for this:
def partition(arr, low, high):
    i = low - 1  # index of the last element known to be <= pivot
    pivot = arr[high]  # use the last element as the pivot
    for j in range(low, high):
        if arr[j] <= pivot:
            i = i + 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i+1], arr[high] = arr[high], arr[i+1]  # move the pivot into its final place
    return i + 1
def quickSort(arr, low, high):
    if low < high:
        pi = partition(arr, low, high)
        quickSort(arr, low, pi-1)
        quickSort(arr, pi+1, high)
arr = ['a', 'x', 'p', 'o', 'm', 'w']
n = len(arr)
quickSort(arr, 0, n-1)
print("Sorted list is:")
for i in range(n):
    print(arr[i], end=" ")
output:
Sorted list is:
a m o p w x
Mergesort:
from heapq import merge
from itertools import islice
def _ms(a, n):
    return islice(a,n) if n<2 else merge(_ms(a,n//2),_ms(a,n-n//2))
def mergesort(a):
    return type(a)(_ms(iter(a),len(a)))
# example
import string
import random
L = list(string.ascii_lowercase)
random.shuffle(L)
print(L)
print(mergesort(L))
Sample run:
['h', 'g', 's', 'l', 'a', 'f', 'b', 'z', 'x', 'c', 'r', 'j', 'q', 'p', 'm', 'd', 'k', 'w', 'u', 'v', 'y', 'o', 'i', 'n', 't', 'e']
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

How can a method that evaluates a list to determine whether it contains specific consecutive items be improved?

I have a nested list of tens of millions of lists (I can use tuples also). Each list is 2-7 items long. Each item in a list is a string of 1-5 characters and occurs no more than once per list. (I use single char items in my example below for simplicity)
#Example nestedList:
nestedList = [
['a', 'e', 'O', 'I', 'g', 's'],
['w', 'I', 'u', 'O', 's', 'g'],
['e', 'z', 's', 'I', 'O', 'g']
]
I need to find which lists in my nested list contain a pair of items so I can do stuff to these lists while ignoring the rest. This needs to be as efficient as possible.
I am using the following function but it seems pretty slow and I just know there has to be a smarter way to do this.
def isBadInList(bad, checkThisList):
    numChecks = len(checkThisList) - 1
    for x in range(numChecks):
        if checkThisList[x] == bad[0] and checkThisList[x + 1] == bad[1]:
            return True
        elif checkThisList[x] == bad[1] and checkThisList[x + 1] == bad[0]:
            return True
    return False
I call it like this:
bad = ['O', 'I']
for checkThisList in nestedList:
    result = isBadInList(bad, checkThisList)
    if result:
        doStuffToList(checkThisList)
# The function isBadInList() returns True only for the first and third lists in nestedList and False for everything else.
I need a way to do this faster if possible. I can use tuples instead of lists, or whatever it takes.
nestedList = [
['a', 'e', 'O', 'I', 'g', 's'],
['w', 'I', 'u', 'O', 's', 'g'],
['e', 'z', 's', 'I', 'O', 'g']
]
# first create a map from each adjacent pair (in both orders) to the lists that contain it
pairdict = dict()
for i in range(len(nestedList)):
    for j in range(len(nestedList[i])-1):
        pair1 = (nestedList[i][j],nestedList[i][j+1])
        if pair1 in pairdict:
            pairdict[pair1].append(i+1)
        else:
            pairdict[pair1] = [i+1]
        pair2 = (nestedList[i][j+1],nestedList[i][j])
        if pair2 in pairdict:
            pairdict[pair2].append(i+1)
        else:
            pairdict[pair2] = [i+1]
del nestedList
print(pairdict.get(('e','z'),None))
Create every pair of adjacent values and store it in a map: the key is the pair and the value is the list of (1-based) positions of the sublists that contain it. Then delete the original list (keeping both may take too much memory).
After that you can take advantage of the dict for lookups and print the positions where a pair appears.
I think you could use a regex here to speed this up, although it is still a sequential operation: you have to iterate through every list and, within each list, over every item, so the running time grows with the total number of items across all sublists, not just with the number of lists.
import re
p = re.compile('OI|IO') # match only OI or IO
def is_bad(pattern, to_check):
    for item in to_check:
        # joining is only safe here because every item is a single character
        maybe_found = pattern.search(''.join(item))
        if maybe_found:
            yield True
        else:
            yield False
l = list(is_bad(p, nestedList))
print(l)
# [True, False, True]
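Since the question states that each item occurs at most once per list, another option is to locate one item of the pair with list.index() and then inspect only its two neighbours. This is a sketch under that uniqueness assumption, not a measured speed-up; the function name has_adjacent_pair and its parameters are made up for illustration.

```python
def has_adjacent_pair(items, first, second):
    """True if first and second are direct neighbours in items (either order).

    Assumes each value occurs at most once in items, as the question states.
    """
    try:
        i = items.index(first)
    except ValueError:
        return False  # first is not in the list at all
    left_match = i > 0 and items[i - 1] == second
    right_match = i + 1 < len(items) and items[i + 1] == second
    return left_match or right_match


nestedList = [
    ['a', 'e', 'O', 'I', 'g', 's'],
    ['w', 'I', 'u', 'O', 's', 'g'],
    ['e', 'z', 's', 'I', 'O', 'g'],
]
flags = [has_adjacent_pair(sub, 'O', 'I') for sub in nestedList]
print(flags)
```

Unlike the regex approach, this also works when the items are strings of several characters, because nothing is joined.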

Python list recursion

I want to build a nested list in Python that looks like
[[['a'],'a'],'a']
So I wrote a recursive function to build it.
def recursion(x,i):
    x.append(list('spam'))
    x=[x]
    i-=1
    print('i value is %d'%i)
    print(x)
    if i>0:
        print('start new recursion!')
        recursion(x,i)
    print('callback x"s value:',x)
    #return x
But if I call this function like
x=[]
recursion(x,4)
the resulting value of x is
[['s', 'p', 'a', 'm']]
I don't understand this, because the printed output suggests that the function did compute the value of x I wanted:
i value is 3
[[['s', 'p', 'a', 'm']]]
start new recursion!
i value is 2
[[[['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']]]
start new recursion!
i value is 1
[[[[['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']]]
start new recursion!
i value is 0
[[[[[['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']]]
callback x"s value: [[[[[['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']]]
callback x"s value: [[[[['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']]
callback x"s value: [[[['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']]
callback x"s value: [[['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']]
Please tell me what happens to x and why the function doesn't give me the value of x that I wanted. Thanks so much, and apologies for my poor English.
#
Thanks for all your attention. The value of x I want to get is
[[[[[['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']], ['s', 'p', 'a', 'm']]]
I'm sorry that I left it out of the first post.
I'm also not sure what you want to happen, but if you want to split a string into its characters and put them into nested lists, this function works.
def recursion(l,s):
    if s == "":
        return l
    elif len(l) == 0:
        nL = [s[0]]
        return recursion(nL,s[1:])
    else:
        nL = [l,s[0]]
        return recursion(nL,s[1:])
So for example
print(recursion([],"spam"))
would output
[[[['s'], 'p'], 'a'], 'm']
I'm not sure what you want to happen, but here's what is happening:
def recursion(x,i):
x.append(list('spam'))
x=[x]
Here, the list that was passed in becomes [['s','p','a','m']], and x=[x] then rebinds only the local name x to a new outer list. When you call recursion(x,i) a few lines later, the work happens on that new list, so it does not affect the caller's original x.
If you uncomment return x and write x = recursion(x,i) instead, it may give you what you want, because x will then actually be changed at the top level.
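A minimal sketch of that change, with the debugging prints removed. The name recursion_fixed is made up for illustration; the logic is otherwise the function from the question, plus the return and the reassignment.

```python
def recursion_fixed(x, i):
    x.append(list('spam'))
    x = [x]            # rebinds the local name only, so the result must be returned
    i -= 1
    if i > 0:
        x = recursion_fixed(x, i)  # keep the value built by the deeper call
    return x


result = recursion_fixed([], 4)
print(result)
```

The caller has to use the return value (result here), because the outermost wrapper list is created inside the function and is never visible through the argument that was passed in.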
I think you might be confused by the fact that a Python string ("spam" is a string) is in many ways similar to a Python list. You can index them, get their len(), etc. In particular, pretty much any read-only operation you do in square brackets works for both string and list types.
Your first example uses single-character strings, [[['a'], 'a'], 'a'], but you don't give us some key details: what was the input that you expected to produce this output?
For example, the input might have been:
func('a', 3)
Or might have been:
func('aaa')
Or might have been:
func(['a'], 3) # or ,2)
Or might have been:
func(['a', 'a', 'a'])
Any of those would be a reasonable starting point for a return value of [[['a'], 'a'], 'a'].
Solution
Because your example function takes a second parameter, i, I'm going to assume there is a real need for it. So let's go with the simplest possible case:
def func(lst, times):
    """
    Given a list and a multiplier > 0, return a "nested" list with
    the contents of the original list repeated (nested) that many
    times:
    Input: [a,b],2
    Output: [[a,b],a,b]
    """
    assert times > 0
    if times == 1:
        return lst[:]
    else:
        result = lst[:]
        result.insert(0, func(lst, times-1))
        return result
for times in range(1,4):
    print(func(['a', 'b'], times))
Alternate
Here's a simpler function, that doesn't assume a list:
def func(content, times):
    """
    Given content and a multiplier > 0, return a set of nested lists,
    with the requested depth, each containing optionally any further
    nested lists and the given content.
    Input: 'content', 2
    Output: [['content'], 'content']
    """
    assert(times > 0)
    if times == 1:
        result = [content]
    else:
        result = [func(content, times-1)]
        result.append(content)
    return result
for times in range(1,4):
    print(func('a', times))
