Python: repeat loop iteration on exception, supply new values

In my script, I have to hit an endpoint multiple times because there is a cap on the number of values that can be passed in one request. So I iterate through sublists, which works. The problem is that the endpoint sometimes rejects some of the values in a sublist, and there is no way of knowing which ones until I get the list of rejected values back in the response. In these scenarios I can raise an exception, remove the invalid values, and hit the endpoint again (i.e. repeat the code from the try block inside the except block).
However, I was curious to know if there is a better approach. Something along the lines of: when an iteration fails, I restart the same iteration, but this time with the new values supplied. What is best practice? See examples below:
# Repeat the code in the exception handler with the newly created sublist. This works, but it repeats code.
for sublist in sublists:
    try:
        ...  # supply sublist to endpoint
    except Exception:
        ...  # create new sublist with erroneous values removed
        ...  # copy code in try here but supply new list
# Restart the same iteration after a rejection, but with new values. This doesn't work, but is there a way to make it work?
i = 0
while i < len(sublists):
    try:
        ...  # supply sublists[i] to endpoint
    except Exception:
        ...  # create new sublists[i] with erroneous values removed
        # restart current iteration but this time with the newly created sublists[i]
        i = i
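One way to avoid duplicating the request code is to move the retry into a small helper, so the outer loop stays a plain for loop. The following is only a minimal sketch under stated assumptions: `RejectedValues`, `send_with_retry`, and `fake_send` are hypothetical names, and it assumes the endpoint call can be wrapped so that it raises an exception carrying the rejected values.

```python
class RejectedValues(Exception):
    """Hypothetical error carrying the values the endpoint refused."""

    def __init__(self, rejected):
        super().__init__(f"rejected: {rejected}")
        self.rejected = set(rejected)


def send_with_retry(send, sublist, max_attempts=5):
    """Call send(values); on rejection drop the bad values and try again.

    Returns (accepted, dropped). Raises RuntimeError if values are still
    being rejected after max_attempts calls.
    """
    values = list(sublist)
    dropped = []
    for _ in range(max_attempts):
        if not values:
            break  # everything was rejected, nothing left to send
        try:
            send(values)
        except RejectedValues as err:
            dropped.extend(v for v in values if v in err.rejected)
            values = [v for v in values if v not in err.rejected]
        else:
            return values, dropped
    if values:
        raise RuntimeError(f"gave up after {max_attempts} attempts")
    return [], dropped


def fake_send(values):
    # Stand-in for the real endpoint: it refuses negative numbers.
    bad = [v for v in values if v < 0]
    if bad:
        raise RejectedValues(bad)


sublists = [[1, 2, -3], [4, 5], [-6, -7]]
results = [send_with_retry(fake_send, sublist) for sublist in sublists]
```

The attempt limit is a design choice: it keeps the loop from spinning forever if the endpoint keeps rejecting values that were never in the request.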

Related

Python Pandas. Endless cycle

Why does this part of the code have an infinite loop? It can't be one, because when I stop this part of the code (in Jupyter Notebook), all 99999 values have already been changed to oil_mean_by_year[data.loc[i]['year']]
for i in data.index:
    if data.loc[i]['dcoilwtico'] == 99999:
        data.loc[i, 'dcoilwtico'] = oil_mean_by_year[data.loc[i]['year']]
Use merge to align the oil mean of a year with the given row:
# Merge on data['year'] vs oil_mean_by_year's index
data_with_oil_mean = pd.merge(data, oil_mean_by_year.rename("oil_mean"),
                              left_on="year", right_index=True, how="left")
data_with_oil_mean['dcoilwtico'] = data_with_oil_mean['dcoilwtico'].mask(lambda xs: xs.eq(99999), data_with_oil_mean['oil_mean'])
A likely explanation is that the loop is not infinite, just very slow. It does not operate on a Python list; it walks a DataFrame row by row, and each data.loc[i]['dcoilwtico'] first builds the whole row as a Series and then picks one value out of it. On a large frame that takes long enough to look like a hang, which matches what you see: by the time you interrupt it, every 99999 has already been replaced.
Note also that .loc is indexed with square brackets, not called with parentheses: data.loc(i, 'dcoilwtico') is not a lookup and cannot be assigned to. If you want to keep the loop, index with a (row, column) pair so that only one cell is read or written, and use .get if a year may be missing from oil_mean_by_year:
for i in data.index:
    if data.loc[i, 'dcoilwtico'] == 99999:
        data.loc[i, 'dcoilwtico'] = oil_mean_by_year.get(data.loc[i, 'year'], 0)
This does the same work as the original loop with fewer intermediate objects, but a vectorized replacement such as the merge shown earlier is still much faster.
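For comparison, the replacement can also be written without a merge, using a boolean mask and Series.map. This is a minimal sketch on made-up data; it assumes, as in the question, that oil_mean_by_year is a Series indexed by year, and the sample values are invented for illustration.

```python
import pandas as pd

# Made-up sample: 99999 marks a missing oil price.
data = pd.DataFrame({
    "year": [2015, 2015, 2016, 2016],
    "dcoilwtico": [50.0, 99999.0, 99999.0, 40.0],
})
oil_mean_by_year = pd.Series({2015: 50.0, 2016: 40.0})

# Select the placeholder rows once, then look up each row's yearly mean.
is_missing = data["dcoilwtico"].eq(99999)
data.loc[is_missing, "dcoilwtico"] = data.loc[is_missing, "year"].map(oil_mean_by_year)
```

Years that are absent from oil_mean_by_year would come back as NaN here rather than raising an error.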

I have a list with 7 elements in Python, but the len operator returns a length of 1

New to Python, this is my first application. I've been staring at this a while, and I'm sure I have some fundamental misunderstanding about what's going on.
In this example I have a list of 7 str (entries), and an assignment statement:
listLen = len(entries)
Followed by a breakpoint, and below is a screen capture showing the debugger where listLen is assigned a value of 1, and entries is a {list: 7}
I'd expect len(entries) to return a value of 7, but I can't seem to get the expected behavior. What am I missing?
UPDATE: I thought the answer was in the for loop modifying the list but apparently not.
If I set a breakpoint before assigning entries and single step through with the debugger including the for loop everything looks good and works.
If I set a breakpoint ON the for loop and single step once, entries again appears to be a {list: 7} but the len(entries) appears to be 1. The for loop executes one loop and exits.
The deep copy entriesCopy I made for debug is used nowhere else, and gets changed to [''], but I assume that since it's not used it gets optimized out or garbage collected, though it doesn't when single-stepping from an earlier breakpoint.
After breaking on the 'for' loop and single stepping once to the beginning of the 'while' loop:
Why would single stepping through the code work fine, but breaking at the for loop cause len(entries) to be wrong?
Single stepping from earlier breakpoint works fine, and the program returns the correct result:
I'm still struggling to get a minimum reproducible sample of code.
Here's more of the code:
entries = self.userQuery.getEntries()
entriesCopy = copy.deepcopy(self.userQuery.getEntries())
entryList = list()
listLen = len(entries)
for ii in range(0, listLen):
    while "\n\n" in entries[ii]: entries[ii] = entries[ii].replace("\n\n", "\n")  # strip double newlines
    while "\t" in entries[ii]: entries[ii] = entries[ii].replace("\t", "")  # strip tabs
    entryList = entries[ii].split("\n")
    while "" in entryList: entryList.remove('')
    self.SCPIDictionary[self.instructions[ii][1].replace("\n", "")] = entryList
Look a little higher in your debug output: on line 42 you can see entries: ['']
I can't read the code in your for loop, so I don't know what's happening, but you seem to be modifying the list in there. If you use the "hover" to look at the value, you get the current value of that variable. You set the breakpoint on the "for" part of the loop; try setting it on the first line of the loop and on the line before the loop, and watch for that entries list to get mutated.
--- edit ---
You provided more code. It's... kind of insane. Why are you modifying the "entries" object repeatedly in while loops? Then you copy the entry into another object, and then replace a value in some dictionary with the entry you just copied (with the key determined after running string transformations on a matrix dictionary?)
Two things:
To debug this, I am concerned about the types. Does "getEntries" actually return a list of strings, or is it a result proxy or something similar? SQLAlchemy, for example, does not actually return a list. The Python debugger is great, but you're doing so much mutation here that print statements will serve you better: do print(entries) after every line. That will let you see when things are changing, and at least how many times your loop is executing. If it is something like a result proxy, then after you have finished iterating over it there may just not be anything left in there when you look at it in the debugger.
Consider this: instead of modifying all these mutable objects, pull out the values and modify those. As a rough draft:
for entry in entries:
    values = []
    for val in entry.replace("\n\n", "\n").replace("\t", "").split("\n"):
        if val:
            values.append(val)
    self.SCPIDictionary[key] = values  # 'key' is a placeholder: derive it the way you did before
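The rough draft above can be reduced to a small pure function that never touches the original list. This is a minimal sketch; `split_entry` is a hypothetical name, and it assumes each entry is a plain str. Because empty strings are filtered out after the split, collapsing repeated newlines beforehand is not needed.

```python
def split_entry(entry):
    """Return the non-empty lines of entry, with tabs removed.

    The input string is not modified; a new list is built instead.
    """
    return [line for line in entry.replace("\t", "").split("\n") if line]


sample = "MEAS:VOLT?\n\n\tMEAS:CURR?\n\n\nSYST:ERR?\n"
lines = split_entry(sample)
```

Keeping the transformation free of side effects also makes it easier to inspect in a debugger, since entries looks the same before and after the loop.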

How can I modify a pandas dataframe I'm iterating over?

I know - this is verboten.
But when optimize.curve_fit hits a row of (maybe 5) identical values, it quits and returns a straight line.
I don't want to remove ALL duplicates, but I thought I might remove the middle member of any identical triplets, without doing too much damage to the fit.
So I wanted to use iterrows, and drop rows as I go, but I understand I may be working on a copy, not the original.
Or, I could just do an old-fashioned loop with an index.
How can I do this safely, and in such a way that the end parameter of the loop is updated each time I do a deletion?
Here's an example:
i = 1
while i < len(oneDate.index) - 1:
    print("triple=", oneDate.at[i-1, "Nprem"], oneDate.at[i, "Nprem"], oneDate.at[i+1, "Nprem"])
    if oneDate.at[i, "Nprem"] == oneDate.at[i-1, "Nprem"] and oneDate.at[i, "Nprem"] == oneDate.at[i+1, "Nprem"]:
        print("dropping i=", i, oneDate.at[i, "Nprem"])
        oneDate.drop([i])
        oneDate = oneDate.reset_index(drop=True)
        pause()
    else:
        i = i + 1
I assumed that when I dropped and reset, the next item would move into the deleted slot, so I wouldn't have to increment the index. But it didn't, so I got an infinite loop.
OK, I found the inplace=True option of drop, and it now works fine.
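The original loop spun forever because DataFrame.drop returns a new frame by default and leaves oneDate untouched unless the result is assigned back or inplace=True is passed. As an alternative to dropping rows inside a loop, the middle members can be flagged in one pass with shift. This is a minimal sketch on made-up data, assuming the frame has a default integer index and a column named Nprem as in the question.

```python
import pandas as pd

# Made-up sample with a run of four identical values.
oneDate = pd.DataFrame({"Nprem": [1, 2, 2, 2, 2, 3, 3, 4]})

s = oneDate["Nprem"]
# True where a row equals both the row before it and the row after it.
is_middle = s.eq(s.shift(1)) & s.eq(s.shift(-1))
thinned = oneDate[~is_middle].reset_index(drop=True)
```

On this sample, the run of four 2s is cut down to its two end points, which is also where the index loop ends up once drop takes effect.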

sorting without sort

Basically, I want to sort numbers without using 'sort'.
What I plan to do is create a new list and move the current minimum into it each time, such as:
for item in List:
    if item < (Min):
        Min = item
        nList.append(Min)
        List.remove(Min)
where List is the input list, Min = List[0] and nList = [].
How can I use a double loop to keep it running?
Your first problem is that it only runs through the list once because... you wrote a for loop that explicitly runs through the list once, and no other loops.
If you want it to run through the list repeatedly, put another loop around it.
For example, since you're removing values from the original list each time through the loop, you could just keep going until you've removed them all, by adding while List: as an outer loop:
while List:
    for item in List:
        if item < (Min):
            Min = item
            nList.append(Min)
            List.remove(Min)
This will not actually work as-is, but that's because of other flaws in your original logic, not anything new to the while loop.
The first obvious problems are:
You're removing elements from List as you iterate over it. Python will not raise an error for this on a list, but it is not something to rely on: what actually happens is that your iteration skips over some of the elements.
You start Min with List[0], despite the fact that this is generally not the minimum. This means at least your first pass will add elements in the wrong order.
Eventually you will reach a point where item >= Min for every item left in List. What happens then? You never move anything over, and just loop forever doing nothing.
What you are doing (apart from the logic errors) is still sorting: repeatedly pulling out the minimum is known as selection sort. With a plain list, finding the minimum is O(n) each time, so the whole sort is O(n^2), asymptotically as bad as bubble sort.
If you keep the remaining values in a heap instead, extracting the minimum drops to O(log n), and the same idea becomes heap sort, which takes O(n log n) time.
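The double loop the question asks for can be sketched as follows: the inner loop only finds the minimum, and the removal happens after it has finished, so the list is never changed while it is being iterated. The function names are hypothetical, and the heap variant uses the standard library's heapq module to show the O(n log n) alternative.

```python
import heapq


def selection_sort(values):
    """Repeatedly move the smallest remaining value to the result. O(n^2)."""
    remaining = list(values)  # work on a copy, leave the input alone
    result = []
    while remaining:
        smallest = remaining[0]  # reset the minimum on every pass
        for item in remaining:
            if item < smallest:
                smallest = item
        result.append(smallest)
        remaining.remove(smallest)  # safe: the inner loop has finished
    return result


def heap_sort(values):
    """Same idea, but the minimum comes from a heap. O(n log n)."""
    heap = list(values)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


numbers = [3, 1, 2, 1]
by_selection = selection_sort(numbers)
by_heap = heap_sort(numbers)
```

Resetting the minimum at the start of every pass also fixes the stalled-loop problem: each pass is guaranteed to move exactly one value.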

How to stop infinite recursion when Python objects trigger each other's updates?

I'm using PyGTK and the gtk.Assistant widget. On one page I have six comboboxes, which initially have the same contents (six numbers). When the user selects a number in one of those comboboxes, this number should no longer be available in the other five boxes (unless it's present as a duplicate in the original list). Hence I would like to always update the contents.
I have tried the following approach (just a few code snippets here), but (of course...) it just jumps into infinite recursion once the process has been triggered:
# 'combo_list' is a list containing the six comboboxes
def changed_single_score(self, source_combo, all_scores, combo_list, indx_in_combo_list):
    scores = all_scores.split(', ')
    for i in range(6):
        selected = self.get_active_text(combo_list[i])
        if selected in scores:
            scores.remove(selected)
    # 'scores' only contains the items that are still available
    for indx in range(6):
        # don't change the box which triggered the update
        if not indx == indx_in_combo_list:
            # the idea is to clear each list and then repopulate it with the
            # remaining available items
            combo_list[indx].get_model().clear()
            for item in scores:
                combo_list[indx].append_text(item)
            # '0' is appended so that swapping values is still possible
            combo_list[indx].append_text('0')
The above function is called when a change occurs in one of the comboboxes:
for indx in range(6):
    for score in self.selected['scores'].split(', '):
        combo_list[indx].append_text(score)
    combo_list[indx].connect('changed', self.changed_single_score, self.selected['scores'], combo_list, indx)
Perhaps I ought to mention that I'm new to Python, OOP, and also rather new to GUI-programming. I'm probably being really stupid here, and/or overlooking the obvious solution, but I have so far been unable to figure out how to stop each box from triggering updating of all other boxes once it itself has been updated.
Thanks in advance for your replies - any help would be greatly appreciated.
The simplest fix for this sort of problem is generally to figure out if you're going to need to change the contents of the object (the combobox, in your case) and then only apply changes if you're actually changing something. This way you'll only propagate update events as far as they do something.
This should look something like:
# '0' is appended so that swapping values is still possible
items = [item for item in scores] + ['0']
for indx in range(6):
    # don't change the box which triggered the update
    if not indx == indx_in_combo_list:
        # I'm not 100% sure that this next line is correct, but it should be close
        existing_values = [model_item[0] for model_item in combo_list[indx].get_model()]
        if existing_values != items:
            # the idea is to clear each list and then repopulate it with the
            # remaining available items
            combo_list[indx].get_model().clear()
            for item in items:
                combo_list[indx].append_text(item)
This is a pretty general approach (even some build systems use it). The main requirement is that things actually do settle. In your case it should settle immediately.
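The settling behaviour can be seen in a small simulation that does not depend on GTK. This is only a sketch: `Box` and `sync_all` are hypothetical stand-ins for the comboboxes and the 'changed' handler, and the real widgets may emit events in situations this toy model does not cover.

```python
class Box:
    """Hypothetical stand-in for a widget that fires a callback on change."""

    def __init__(self, name):
        self.name = name
        self.items = []
        self.on_changed = None
        self.updates = 0

    def set_items(self, items):
        if items == self.items:
            return  # nothing would change, so no event is emitted
        self.items = list(items)
        self.updates += 1
        if self.on_changed is not None:
            self.on_changed(self)


def sync_all(source, boxes, items):
    # Push the new contents to every box except the one that triggered us.
    for box in boxes:
        if box is not source:
            box.set_items(items)


boxes = [Box(f"combo{i}") for i in range(3)]
for box in boxes:
    box.on_changed = lambda src: sync_all(src, boxes, src.items)

# One user action: every box ends up updated exactly once, then it settles.
boxes[0].set_items(["1", "2", "0"])
```

Without the equality check in set_items, each update would trigger the other boxes again and the calls would recurse until Python hit its recursion limit.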
