Appending list using .capitalize() - compiler freezes with high RAM usage - python

I'm trying to replace the str elements in a list with the same elements, but with their first letter capitalized.
I'm working through this step by step, and when I use a for loop to simply append the capitalized elements to the list, my compiler freezes and RAM usage gradually climbs to 90%.
I guess it has something to do with the built-in functions I use (probably incorrectly). Can anyone help me understand what is happening and how I should approach it?
Here is the code:
title = 'a clash of KINGS'
out = title.split()
for i in out:
    out.append(i.capitalize())

Don't change a list while iterating over it. Each pass of the loop appends another element to out, so the loop never reaches the end of the list. You can print out inside the loop and see for yourself. Even if it didn't turn into an infinite loop, you still wouldn't have replaced the initial values; you would just have added more elements.
You can use a list comprehension instead:
title = 'a clash of KINGS'
out = title.split()
out = [word.capitalize() for word in out]
You can combine the last two lines into one:
title = 'a clash of KINGS'
out = [word.capitalize() for word in title.split()]
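To see the runaway growth without freezing anything, the loop can be capped artificially. This is only an illustrative sketch: the steps counter and the cap of 10 passes are assumptions added for the demonstration and are not part of the original code.

```python
title = 'a clash of KINGS'

# Appending while iterating: the loop never runs out of elements,
# so we stop it by hand after 10 passes (hypothetical safety cap).
out = title.split()
steps = 0
for word in out:
    out.append(word.capitalize())
    steps += 1
    if steps == 10:
        break
print(len(out))  # 14: the 4 original words plus 10 appended ones

# Building a new list leaves the sequence being iterated untouched.
capitalized = [word.capitalize() for word in title.split()]
print(capitalized)  # ['A', 'Clash', 'Of', 'Kings']
```

Without the cap, the first loop would keep appending until memory ran out, which matches the rising RAM usage described in the question.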

I think you are in an infinite loop. You're not modifying the existing elements of out; you keep appending new elements to the list, so the loop never finishes. I think what you're trying to do is:
title = 'a clash of KINGS'
out = title.split()
for i in range(len(out)):
    out[i] = out[i].capitalize()

Related

Maya Python - Loop list elements for splitting

Hi guys, I would like to loop over multiple elements in a list with a for loop, so that this script works with a multi-selection. At the moment I can't find an appropriate way to process each element contained in myStr. Any idea? (For now I've used myStr[0] to pick the first element.)
myStr = cmds.ls(sl=1)
for i in myStr:
    splits = myStr[0].split('_')
    ver_up = int(splits[-2]) + 1
    splits[-2] = '%04d'%ver_up
    newStr = '_'.join(splits)
    print(newStr)
    cmds.duplicate(n=newStr)
If I understand you correctly, then you have several objects with a name based on <objectName>_<version>_<otherIdentifiers>. And you want to duplicate them and count up the version number. Correct? If so, you just have to add the object you want to duplicate to the cmds.duplicate() command like this:
cmds.duplicate(i, n=newStr)
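The renaming step can be separated from the Maya call and tried outside Maya. The following is a minimal sketch, assuming names of the form <objectName>_<version>_<otherIdentifiers>; the helper name bump_version, its width parameter, and the sample name chair_0001_geo are hypothetical and only serve the illustration.

```python
def bump_version(name, width=4):
    """Return name with its second-to-last '_'-separated field incremented."""
    parts = name.split('_')
    # Increment the version field and pad it with zeros to the given width.
    parts[-2] = str(int(parts[-2]) + 1).zfill(width)
    return '_'.join(parts)

print(bump_version('chair_0001_geo'))  # chair_0002_geo

# Inside Maya, the loop from the question might then read
# (left as comments because maya.cmds only exists inside Maya):
# import maya.cmds as cmds
# for obj in cmds.ls(sl=1):
#     cmds.duplicate(obj, n=bump_version(obj))
```

Note that the loop uses the loop variable for each object rather than myStr[0], so every selected object gets its own new name.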

Split string and take only part of it (python)

QUESTION
I have a list of strings, let's call it input_list, and every string in this list is made of five words divided only by a "%" character, like
"<word1>%<word2>%<word3>%<word4>%<word5>"
My goal is, for every element of input_list to make a string made only by <word3> and <word4> divided by the "%" sign, like this "<word3>%<word4>", and create a new list made by these strings.
So for example, if:
input_list = ['the%quick%brown%fox%jumps', 'over%the%lazy%dog%and']
then the new list will look like this
new_list = ['brown%fox', 'lazy%dog']
IMPORTANT NOTES AND POSSIBLE ANSWERS
The length of each word is random, so I can't just use string slicing or guess in any way how <word3> and <word4> start.
A possible way to answer this would be the following, but I want to know if there is a better and maybe (computationally) faster way, without having to create a new variable (current_list) and/or without having to consider/split the whole string (maybe using regex?)
input_list = ['the%quick%brown%fox%jumps', 'over%the%lazy%dog%and']
new_list = []
for element in input_list:
    current_list = element.split('%')
    final_element = [current_list[2], current_list[3]]
    new_list.append(final_element)
EDIT:
I compared the running time of #Pac0's answer with that of #bb1's answer. With an input list of 100 strings, #Pac0's answer had a running time of 92.28286 seconds and #bb1's answer had a running time of 42.6106374 seconds. So I will consider #bb1's answer as the accepted one.
new_list = ['%'.join(w.split('%')[2:4]) for w in input_list]
You can use a regular expression (regex) with a capture group:
import re
pattern = re.compile('[^%]*%[^%]*%([^%]*%[^%]*)%[^%]*')
input_list = ['the%quick%brown%fox%jumps', 'over%the%lazy%dog%and']
result = [pattern.search(s).group(1) for s in input_list]
print(result)
Note: the "compile" part is not strictly needed, but can help performance if you have a lot of strings to process.
How about this?
input_list = ['the%quick%brown%fox%jumps', 'over%the%lazy%dog%and']
new_list = ['%'.join(x.split('%')[2:4]) for x in input_list]
print(new_list)
Output
['brown%fox', 'lazy%dog']
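The two approaches can be compared with the standard timeit module. This is only a rough sketch: the function names with_split and with_regex, the repetition count, and the way the 100-string list is built are assumptions made for the illustration, and the absolute timings depend on the machine.

```python
import re
import timeit

# 100 strings built by repeating the two sample strings (an assumption).
input_list = ['the%quick%brown%fox%jumps', 'over%the%lazy%dog%and'] * 50

pattern = re.compile('[^%]*%[^%]*%([^%]*%[^%]*)%[^%]*')

def with_split(strings):
    # Split on '%', keep words 3 and 4, join them back together.
    return ['%'.join(s.split('%')[2:4]) for s in strings]

def with_regex(strings):
    # Capture "<word3>%<word4>" with a precompiled pattern.
    return [pattern.search(s).group(1) for s in strings]

# Both approaches should agree before their speed is compared.
assert with_split(input_list) == with_regex(input_list)

split_time = timeit.timeit(lambda: with_split(input_list), number=200)
regex_time = timeit.timeit(lambda: with_regex(input_list), number=200)
print('split: %.4f s, regex: %.4f s' % (split_time, regex_time))
```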

Python: increment index range by 1

I have this code:
f = open('story.txt', 'r')
story = f.read()
a = 0
b = 2
active_words = story[a:b]
I'm trying to make it so later on in the program I can increase the range of this index by (+1), so that instead of taking one word out of the story as active_words, it takes 2,3,4... etc. Ideally this would be able to happen inside part of a while loop.
My apologies for my lack of formatting, this is my first post.
I'm also just starting to learn Python, so there's probably a dead easy solution I've overlooked... I've tried defining a function like extend(active_words), but to no avail.
Thanks in advance!
A basic nested for-loop seems to be what you are looking for:
n = len(story)
for k in range(2,n+1):
    for j in range(n-k+1):
        active_words = story[j:j+k]
The outer loop controls the length of the slice and the inner loop actually generates the slices.
Note that these are just string slices, which will contain fragments of words. You might want something like story = story.split() prior to the loop.
Are you trying to split a sentence into words?
If so, try this instead:
story = f.read()
words = story.split()
for word in words:
    print(word)
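Putting the two ideas together, a while loop can widen the slice by one word per pass once the story has been split into words. This is a minimal sketch: the sample sentence stands in for the contents of story.txt, and the slices list is an assumption added only to make the result easy to inspect.

```python
story = 'once upon a time there was a fox'  # stand-in for f.read()
words = story.split()

a = 0
b = 1
slices = []  # hypothetical: collects each window so it can be inspected
while b <= len(words):
    active_words = words[a:b]  # the first b words
    slices.append(active_words)
    b += 1  # widen the range by one word for the next pass

print(slices[0])  # ['once']
print(slices[2])  # ['once', 'upon', 'a']
```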

How to search for strings within nested lists

One of the questions for an assignment I'm doing consists of looking within a nested list consisting of "an ultrashort story and its author" to find a string that was entered by a user. I'm not too sure how to go about this; here is the assignment brief below if anyone would like more clarification. There are also more questions I'm not too sure about, e.g. "find all stories by a certain author". Some explanations, or a pointer in the right direction, would be greatly appreciated :)
list = []
mylist = [['a','b','c'],['d','e','f']]
string = input("String?")
if string in [elem for sublist in mylist for elem in sublist] == True:
    list.append(elem)
This is just an example of something I've tried; the list above is similar enough to the one I'm actually using for the question. I've been going through different methods of iterating over a nested list and adding matching items to another list. The above code is just one example of an attempt I've made at this process.
""" the image above states that the data is in the
form of a list of sublists, with each sublist containing
two strings
"""
stories = [
    ['story string 1', 'author string 1'],
    ['story string 2', 'author string 2']
]
""" find stories that contain a given string
"""
stories_with_substring = []
substring = 'some string' # search string
for story, author in stories:
    # if the substring is not in the story, a ValueError is raised
    try:
        story.index(substring)
        stories_with_substring.append((story, author))
    except ValueError:
        continue
""" find stories by a given author
"""
stories_by_author = []
target_author = 'first last'
for story, author in stories:
    if author == target_author:
        stories_by_author.append((story, author))
This line here
for story, author in stories:
'Unpacks' the array. It's equivalent to
for pair in stories:
    story = pair[0]
    author = pair[1]
Or to go even further:
i = 0
while i < len(stories):
    pair = stories[i]
    story = pair[0]
    author = pair[1]
    i += 1  # move on to the next pair, otherwise the loop never ends
I'm sure you can see how useful this is when dealing with lists that contain lists/tuples.
You may need to call .lower() on some of the strings if you want the search to be case insensitive
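A case-insensitive variant of both searches could look like the sketch below. The sample stories, the author names, and the function names find_stories and find_by_author are made up for illustration; the sketch also uses the in operator instead of story.index(), which is a swapped-in alternative rather than the approach shown above.

```python
stories = [
    ['The Last Dragon slept.', 'Ann Lee'],
    ['No dragons here.', 'Bob Ray'],
    ['A quiet morning.', 'Ann Lee'],
]

def find_stories(stories, substring):
    """Return (story, author) pairs whose story contains substring, ignoring case."""
    needle = substring.lower()
    return [(story, author) for story, author in stories
            if needle in story.lower()]

def find_by_author(stories, target_author):
    """Return (story, author) pairs written by target_author, ignoring case."""
    target = target_author.lower()
    return [(story, author) for story, author in stories
            if author.lower() == target]

print(find_stories(stories, 'DRAGON'))
print(find_by_author(stories, 'ann lee'))
```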
You can do a few things here. Your example showed the use of a list comprehension, so let's focus on some other aspects of this problem.
Recursion
You can define a function that iterates through all the items in the top-level list. Assuming you know for sure that all items are either strings or more lists, you can use type() to check whether each item is another list or a string. If it's a string, do your search; if it's a list, have your function call itself. Let's look at an example. Please note that we should never use variables named list or string: list is a built-in type and string is the name of a standard library module, and we don't want to accidentally shadow them!
mylist = [['a','b','c'],['d','e','f']]
def find_nested_items(my_list, my_input):
    results = []
    # iterate over the argument my_list, not the global mylist
    for i in my_list:
        if type(i) == list:
            items = find_nested_items(i, my_input)
            results += items
        elif my_input in i:
            results.append(i)
    return results
We're doing a few things here:
Creating an empty list named results
Iterating through the top level items of my_list
If one of those items is another list, we have our function call itself - at some point this will trigger the condition where an item is not a list, and will eventually return the results from that. For now, we assume the results we're getting back are going to be correct, so we concatenate those results to our top level results list
If the item is not a list, we simply check for the existence of our input and if so, add it to our results list
This kind of recursion is typically very safe, because it's inherently limited by our data structure. It can't run forever unless the data structure itself is infinitely deep.
Generators
Next, let's look at a much cooler feature of Python 3: generators. Right now, we're doing all the work of collecting the results in one go. If we later want to iterate through those results, we need to iterate over them separately.
Instead of doing that, we can define a generator. This works almost the same, practically speaking, but instead of collecting the results in one loop and then using them in a second, we can collect and use each result all within a single loop. A generator "yields" a value, then stops until it is called the next time. Let's modify our example to make it a generator:
mylist = [['a','b','c'],['d','e','f']]
def find_nested_items(my_list, my_input):
    # iterate over the argument my_list, not the global mylist
    for i in my_list:
        if type(i) == list:
            yield from find_nested_items(i, my_input)
        elif my_input in i:
            yield i
You'll notice this version is a fair bit shorter. There's no need to hold items in a temporary list - each item is "yielded", which means it's passed directly to the caller to use immediately, and the caller will stop our generator until it needs the next value.
yield from basically does the same recursion, it simply sets up a generator within a generator to return those nested items back up the chain to the caller.
These are some good techniques to try - please give them a go!
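For reference, here is how the generator version might be consumed. This is a sketch under a few assumptions: the sample data in nested is invented, and isinstance() is used in place of type(i) == list, which behaves the same for plain lists.

```python
def find_nested_items(my_list, my_input):
    for item in my_list:
        if isinstance(item, list):
            # Recurse into the sublist and pass its matches up the chain.
            yield from find_nested_items(item, my_input)
        elif my_input in item:
            yield item

nested = [['apple', 'banana'], ['grape', ['pineapple', 'kiwi']]]

# Use each match as soon as it is produced.
for match in find_nested_items(nested, 'apple'):
    print(match)  # prints apple, then pineapple

# Or collect everything at once when a list is needed.
print(list(find_nested_items(nested, 'an')))  # ['banana']
```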

Compare items in list with nested for-loop

I have a list of URLs in an open CSV which I have ordered alphabetically, and now I would like to iterate through the list and check for duplicate URLs. In a second step, the duplicate should then be removed from the list, but I am currently stuck on the checking part which I have tried to solve with a nested for-loop as follows:
for i in short_urls:
    first_url = i
    for s in short_urls:
        second_url = s
    if i == s:
        print("duplicate")
    else:
        print("all good")
The print statements will obviously be replaced once the nested for-loop is working. Currently, the list contains a few duplicates, but my nested loop does not seem to work correctly as it does not recognise any of the duplicates.
My question is: are there better ways to do perform this exercise, and what is the problem with the current nested for-loop?
Many thanks :)
By construction, your method is faulty, even if you indent the if/else block correctly. For the sake of argument, imagine short_urls were [1, 2, 3]. The outer for loop picks out 1 to compare against the list. In the inner for loop it then encounters the first element, which is also 1, and reports a duplicate. Essentially, every element will be tagged as a duplicate, and if you plan on removing duplicates, you'll end up with an empty list.
The better solution is to call set(short_urls) to get a set of your urls with the duplicates removed. If you want a list (as opposed to a set) of urls with the duplicates removed, you can convert the set back into a list with list(set(short_urls)).
In other words:
short_urls = ['google.com', 'twitter.com', 'google.com']
duplicates_removed_list = list(set(short_urls))
print(duplicates_removed_list) # e.g. ['google.com', 'twitter.com'] (the order of a set is not guaranteed)
The line if i == s: is not inside the second for loop. You missed a level of indentation:
for i in short_urls:
    first_url = i
    for s in short_urls:
        second_url = s
        if i == s:
            print("duplicate")
        else:
            print("all good")
EDIT: Also, you are comparing every element of the list with every element of the same list. This means you compare the element at position 0 with the element at position 0, which is obviously the same.
What you need to do is start the second for loop at the position after the one reached in the first for loop.
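Starting the inner loop one position after the outer one could be sketched as follows. The sample URLs are made up, and the last step, dict.fromkeys(), is a swapped-in alternative to set() that keeps the original order; it assumes Python 3.7 or later, where dictionaries preserve insertion order.

```python
short_urls = sorted(['google.com', 'twitter.com', 'google.com', 'bing.com'])

# Compare each URL only with the URLs that come after it,
# so an element is never compared with itself.
duplicates = []
for i in range(len(short_urls)):
    for j in range(i + 1, len(short_urls)):
        if short_urls[i] == short_urls[j]:
            duplicates.append(short_urls[i])
print(duplicates)  # ['google.com']

# Remove duplicates while keeping the (alphabetical) order.
unique_urls = list(dict.fromkeys(short_urls))
print(unique_urls)  # ['bing.com', 'google.com', 'twitter.com']
```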
