Error when using dataset.loc in pandas in kaggle notebook - python

This error, "ValueError: Must have equal len keys and value when setting with an iterable", appears when I run this code:
for dataset in train_test_data:
    dataset.loc[dataset['Age'] <= 16, 'Age'] = 0,
    dataset.loc[(dataset['Age'] > 16) & (dataset['Age'] <= 26), 'Age'] = 1,
    dataset.loc[(dataset['Age'] > 26) & (dataset['Age'] <= 36), 'Age'] = 2,
    dataset.loc[(dataset['Age'] > 36) & (dataset['Age'] <= 62), 'Age'] = 3,
    dataset.loc[dataset['Age'] > 62, 'Age'] = 4
Please help me solve the error, or give me a hint on how to solve it.
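A likely cause, assuming the code was run as shown: each assignment ends with a trailing comma, so Python builds a one-element tuple such as (0,) on the right-hand side instead of the scalar 0, and pandas then tries to match the tuple's length against the selected rows. A minimal sketch with the commas removed follows; the two small frames standing in for train_test_data are hypothetical sample data.

```python
import pandas as pd

# Hypothetical stand-ins for the train and test frames in the question
train = pd.DataFrame({'Age': [10.0, 20.0, 30.0, 50.0, 70.0]})
test = pd.DataFrame({'Age': [16.0, 26.0, 36.0, 62.0, 63.0]})
train_test_data = [train, test]

for dataset in train_test_data:
    # No trailing commas: the right-hand side is a scalar, not a tuple
    dataset.loc[dataset['Age'] <= 16, 'Age'] = 0
    dataset.loc[(dataset['Age'] > 16) & (dataset['Age'] <= 26), 'Age'] = 1
    dataset.loc[(dataset['Age'] > 26) & (dataset['Age'] <= 36), 'Age'] = 2
    dataset.loc[(dataset['Age'] > 36) & (dataset['Age'] <= 62), 'Age'] = 3
    dataset.loc[dataset['Age'] > 62, 'Age'] = 4

print(train['Age'].tolist())
print(test['Age'].tolist())
```

The order of the assignments matters here: the bins are written back into the same column, so each later condition must not match a value that an earlier line has already replaced.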

Related

Compare dictionary keys (datetime) with list of tuples[1] and if matches return tuple[0]

I am a beginner(ish) with Python and having trouble with getting the correct syntax for this. Any help is greatly appreciated!
I have a dictionary and a list of tuples. I would like to compare each key of my dictionary to a value in each tuple and, if it meets the criteria, return the other value from that tuple. Here's the illustration:
dictionary = {datetime.datetime(2022, 4, 12, 9, 30): 30, datetime.datetime(2022, 4, 12, 11, 0): 60, datetime.datetime(2022, 4, 12, 13, 0): 30}
tuplelist = [(1, datetime.time(6, 45, 21)), (2, datetime.time(7, 15, 21)), (3, datetime.time(7, 45, 21)...etc)
The goal is to see which increment of 30 minutes my dictionary key falls into, and update it with the increment number stored in tuple list. What I tried:
for k,y in dictionary:
    for i, t in tuplelist:
        if t <= k <= (t+ datetime.timedelta(minutes = 30)):
            dictionary[k] = t
The error I got is unable to unpack non iterable type datetime.
Any help and/or explanation is welcome! I am really enjoying learning to code, but I am not from a CS background, so I am always looking for how it works in addition to just the correct syntax.
Thank you!
Update for working solution:
import datetime
newdic = {}
for k, v in dictionary.items():
    for item in tuplelist:
        i, t = item
        # end of the 30-minute window that starts at t
        end = (datetime.datetime.combine(datetime.date.today(), t) + datetime.timedelta(minutes=30)).time()
        if t <= k.time() <= end:
            newdic.update({i : v})
        else:
            continue
This is not the complete answer. See if it helps resolve early issues up to the comment.
import datetime
dictionary = {datetime.datetime(2022, 4, 12, 9, 30): 30,
              datetime.datetime(2022, 4, 12, 11, 0): 60,
              datetime.datetime(2022, 4, 12, 13, 0): 30}
tuplelist = [(1, datetime.time(6, 45, 21)), (2, datetime.time(7, 15, 21)),
             (3, datetime.time(7, 45, 21),)]
for k, y in dictionary.items():
    for item in tuplelist:  # get each tuple from the list
        i, t = item  # unpack the tuple into i and t
        print(f'{i=} {t=}')  # Check i and t values
        # if t <= k <= (t + datetime.timedelta(minutes=30)):
        #     dictionary[k] = t
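One way to finish the commented-out comparison, as a sketch: a datetime.time cannot be added to a timedelta or compared with a datetime.datetime, so each start time is first combined with the key's date. The sample keys below are hypothetical values chosen to fall inside the windows, and increments and window are illustrative names, not part of the question.

```python
import datetime

# Hypothetical sample data; keys chosen so that they fall inside a window
dictionary = {
    datetime.datetime(2022, 4, 12, 7, 0): 30,
    datetime.datetime(2022, 4, 12, 7, 30): 60,
}
tuplelist = [
    (1, datetime.time(6, 45, 21)),
    (2, datetime.time(7, 15, 21)),
    (3, datetime.time(7, 45, 21)),
]
window = datetime.timedelta(minutes=30)

increments = {}
for k, v in dictionary.items():      # .items() yields (key, value) pairs
    for i, t in tuplelist:           # each tuple unpacks into number and start time
        # Put the start time on the same date as the key so both are datetimes
        start = datetime.datetime.combine(k.date(), t)
        if start <= k < start + window:
            increments[k] = i        # remember which increment the key falls into
            break

print(increments)
```

Writing into a separate dictionary avoids modifying dictionary while looping over it. Keys that fall outside every window are simply left out of increments.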

How to write a function that accepts two tuples, and returns the merged tuple in which all integers appears in ascending order?

Write a function merge(tup1, tup2) that accepts two sorted tuples as parameters, and returns the merged tuple in which all integers appear in ascending order.
You may assume that:
tup1 and tup2 each contain distinct integers sorted in ascending order.
Integers in tup1 are different from those in tup2.
Length of tuples may also vary.
I can't use Python's sorting function.
I've tried something like this, but failed public test cases such as:
merge((-1, 1, 3, 5), (-2, 4, 6, 7)) → (-2, -1, 1, 3, 4, 5, 6, 7)
merge((-3, 8, 67, 100, 207), (-10, 20, 30, 40, 65, 80, 90)) → (-10, -3, 8, 20, 30, 40, 65, 67, 80, 90, 100, 207)
merge((-1, 1, 3, 5, 7, 9, 11), (-2, 0, 2, 4, 6)) → (-2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 9, 11)
def merge(tup1, tup2):
    size_1 = len(tup1)
    size_2 = len(tup2)
    res = ()
    i, j = 0, 0
    while i < size_1 and j < size_2:
        if tup1(i) < tup2(j):
            res.append(tup1(i))
            i += 1
        else:
            res.append(tup2(j))
            j += 1
    return res = res + tup1(i:) + tup2(j:)
Your code (algorithm) is fine, you just have a few syntax errors:
Indexing is done with [..], not (...).
You can either return or assign - not both.
And a logical (attribute) error:
Tuples don't have append - you can concatenate them by using addition (or just by using lists and converting in the end to a tuple).
Your code with the syntax errors fixed, using a list and converting it to a tuple at the end, seems to work fine:
def merge(tup1, tup2):
    size_1 = len(tup1)
    size_2 = len(tup2)
    res = []
    i, j = 0, 0
    while i < size_1 and j < size_2:
        if tup1[i] < tup2[j]:
            res.append(tup1[i])
            i += 1
        else:
            res.append(tup2[j])
            j += 1
    # one of the tuples is exhausted; append whatever is left of the other
    res.extend(tup1[i:])
    res.extend(tup2[j:])
    return tuple(res)
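For comparison, the same two-pointer merge can be written with tuples only, using concatenation instead of append as suggested above. This is a sketch rather than a tuned implementation: each += builds a new tuple, so the list-based version is likely the better choice for long inputs.

```python
def merge(tup1, tup2):
    """Merge two ascending tuples into one ascending tuple without sorting."""
    res = ()
    i, j = 0, 0
    while i < len(tup1) and j < len(tup2):
        if tup1[i] < tup2[j]:
            res += (tup1[i],)   # one-element tuple; the comma is required
            i += 1
        else:
            res += (tup2[j],)
            j += 1
    # Slicing past the end yields an empty tuple, so both tails can be added
    return res + tup1[i:] + tup2[j:]


print(merge((-1, 1, 3, 5), (-2, 4, 6, 7)))
```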
Unpack both tuples into a list using the * operator, sort it, and convert the result to a tuple.
merge = lambda t1, t2: tuple(sorted([*t1, *t2]))
Since tuples are immutable in Python, I would loop over them to copy the items one by one into a list, sort that list with .sort(), and then convert it into a tuple.

Return value at the front of a for loop [duplicate]

This question already has answers here:
What do [] brackets in a for loop in python mean?
(3 answers)
Closed 3 years ago.
I just saw some code written by someone else.
labels = ["{0}-{1}".format(i, i + 9) for i in range(0, 100, 10)]
print(labels)
The output is
['0-9', '10-19', '20-29', '30-39', '40-49', '50-59', '60-69', '70-79', '80-89', '90-99']
How to understand this? Are values returned at the front of the for loop?
This line,
labels = ["{0}-{1}".format(i, i + 9) for i in range(0, 100, 10)]
is equivalent to this code:
labels = []
for i in range(0, 100, 10):
    labels.append("{0}-{1}".format(i, i + 9))
Let's test it out:
labels = ["{0}-{1}".format(i, i + 9) for i in range(0, 100, 10)]
another_list = []
for i in range(0, 100, 10):
    another_list.append("{0}-{1}".format(i, i + 9))
print(labels == another_list)
# True
This is called a list comprehension.
Also, you have range(0, 100, 10): range is "an immutable sequence of numbers."
You can see the numbers like this:
In [1]: list(range(0, 100, 10))
Out[1]: [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
First you need to understand the for loop.
for i in range(0, 100, 10) starts with i=0 and goes up in steps of 10, stopping before 100 (the end of a range is exclusive). So i takes the values 0, 10, 20, 30, 40, 50, ..., 90.
And then .format(i, i+9) puts the values of i and i+9, separated by -, into each string in labels.
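The general shape is [expression for item in iterable], optionally followed by an if filter. A small sketch of both forms; the names start and evens are only illustrative, and the f-string is a swapped-in alternative to the .format call in the question.

```python
# Expression first, loop second: one label is built per value of start
labels = [f"{start}-{start + 9}" for start in range(0, 100, 10)]

# An optional trailing condition keeps only the items that pass the test
evens = [n for n in range(10) if n % 2 == 0]

print(labels[0], labels[-1])   # first and last label
print(evens)
```

The expression at the front is not a return value of the loop; it is evaluated once per iteration and the results are collected into the new list.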

create n item long playlist-like list from multiple other lists

I'm trying to create a list with values drawn from three other lists (high, medium and low). For each position, probability_array determines which list a random value is picked from. The new list should behave like a playlist: before any value is picked a second time from high, medium or low, all other values of that list must already be in the newly generated list.
Any ideas how I can achieve this? My code so far:
import numpy as np
array_length = 30
high = [(1, 2), 3, 4, 5, 6, (7, 8, 9)]
medium = [10, 11, (12, 13), 14]
low = [100, 101, 102, (103, 104)]
probability_array = np.random.choice(
    ['High', 'Medium', 'Low',],
    array_length,
    p=[4/7, 2/7, 1/7]
)
# i. e.
"""
['Low' 'High' 'High' 'High' 'High' 'High' 'Medium' 'Medium' 'High' 'Medium'
'High' 'High' 'Medium' 'Medium' 'High' 'High' 'Medium' 'Medium' 'High'
'High' 'High' 'High' 'High' 'Medium' 'Medium' 'High' 'High' 'High' 'Low'
'High']
"""
# new list should look like:
"""
[102, (1, 2), 4, (7, 8, 9), 3, 6, 14, ...]
"""
Thanks
With np.random.choice there is an option to specify the probability of each item being chosen. You can use it to build a list of high, medium and low picks and feed that into another loop to construct your playlist correctly.
At roganjosh's suggestion, I also removed the ability for the same item to occur back to back.
import numpy as np
import random
import collections
playlist_length = 30
# key is the match strength (i.e. high, medium, low)
matches = {
    'high' : {
        'items' : [(1, 2), 3, 4, 5, 6, (7, 8, 9)],
        'prob' : 4/7
    },
    'medium' : {
        'items' : [10, 11, (12, 13), 14],
        'prob' : 2/7
    },
    'low' : {
        'items' : [100, 101, 102, (103, 104)],
        'prob' : 1/7
    }
}
# create two tuples:
# a holds the match strengths
# p holds the desired probability of an item from that match strength occurring
a, p = zip(*[(match, matches[match]['prob']) for match in matches])
# build a list of match strengths, with our chosen size and probability
results = np.random.choice(a=a, p=p, size=playlist_length)
# build our playlist
playlist = []
last_item = None
for match_strength in results:
    # count all the items currently in playlist (a bit inefficient, probably don't have to recreate the Counter obj every time)
    count_playlist = collections.Counter(playlist)
    # filter items of the given match strength, leaving out those that are at the current max
    items = matches[match_strength]['items']
    max_count = max([count_playlist[item] for item in items])
    filtered = list(filter(lambda item: count_playlist[item] < max_count, items))
    # if all items have the same count, reset the filtered list to be any item
    # (copy the list so that the remove() below does not alter matches)
    if not len(filtered):
        filtered = list(items)
    # drop last item so that it does not repeat
    if last_item is not None and last_item in filtered and len(filtered) > 1:
        filtered.remove(last_item)
    # add one from filtered items to playlist
    new_item = random.choice(filtered)
    playlist.append(new_item)
    last_item = new_item
print(collections.Counter(results))
print(playlist)
output:
the counter shows that the different match strength occur at acceptable frequencies Counter({'high': 19, 'medium': 10, 'low': 1})
and the playlist was
[(1, 2), 14, 4, 5, 102, 3, (7, 8, 9), 6, (1, 2), 11, 10, (12, 13), 4, 3, 10, (12, 13), 5, 11, (7, 8, 9), 14, 6, 4, (7, 8, 9), 5, 10, (1, 2), 6, 3, 11, 4]
I make no claims to the efficiency of this approach (it illustrates one potential method rather than final code) but it will work by ensuring that each list (high, medium, low) is consumed entirely before it gets repeated for a second time etc. It will work for a final sequence of any length but it can easily result in the same "track" appearing back-to-back. It was not clarified in comments whether this was an issue.
import numpy as np
array_length = 30
high = [(1, 2), 3, 4, 5, 6, (7, 8, 9)]
medium = [10, 11, (12, 13), 14]
low = [100, 101, 102, (103, 104)]
# Initialise extended lists
new_high = []
new_medium = []
new_low = []
# Create extended lists as repeating units; use enough repeats that even
# the shortest list cannot run out if every pick lands on it
repeats = array_length // min(len(high), len(medium), len(low)) + 1
for x in range(repeats):
    np.random.shuffle(high)
    np.random.shuffle(medium)
    np.random.shuffle(low)
    new_high.extend(high)
    new_medium.extend(medium)
    new_low.extend(low)
# Probability distribution for consuming the extended lists
probability_array = np.random.choice(
    ['High', 'Medium', 'Low',],
    array_length,
    p=[4.0/7, 2.0/7, 1.0/7]
)
# Our final sequence
playlist = []
# Keep track of how far we got through each of the extended lists
high_counter, medium_counter, low_counter = 0, 0, 0
for pick in probability_array:
    if pick == 'High':
        playlist.append(new_high[high_counter])
        high_counter += 1
    elif pick == 'Medium':
        playlist.append(new_medium[medium_counter])
        medium_counter += 1
    else:
        playlist.append(new_low[low_counter])
        low_counter += 1
print(playlist)
Try this:
import numpy as np
def make_list(high, medium, low, parr):
    # np.unique returns the labels sorted, i.e. 'High', 'Low', 'Medium';
    # this unpacking assumes all three labels occur in parr
    _, (nhigh, nlow, nmed) = np.unique(parr, return_counts=True)
    # A check can be added to make sure labels are in the set ['High', 'Low', 'Medium']
    dl = {}
    # dtype=object is used because the np.object alias was removed from recent NumPy releases
    dl['High'] = (nhigh // len(high) + 1) * np.random.permutation(np.asanyarray(high, dtype=object)).tolist()
    dl['Medium'] = (nmed // len(medium) + 1) * np.random.permutation(np.asanyarray(medium, dtype=object)).tolist()
    dl['Low'] = (nlow // len(low) + 1) * np.random.permutation(np.asanyarray(low, dtype=object)).tolist()
    play_list = []
    for p in parr:
        play_list.append(dl[p].pop())
    return play_list
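The requirement from the question (use up a list before any of its items repeats) can also be sketched with one shuffled "bag" per list that is refilled only when it runs empty. This version uses only the standard library; make_playlist, pools, weights and seed are hypothetical names introduced for illustration, and it does not prevent the same item from appearing back to back across a refill.

```python
import random


def make_playlist(pools, weights, length, seed=None):
    """Draw `length` items; each pool is exhausted before it is reshuffled."""
    rng = random.Random(seed)
    names = list(pools)
    bags = {name: [] for name in names}   # items still unused in this round
    playlist = []
    # Choose which pool each position comes from, weighted by `weights`
    for name in rng.choices(names, weights=weights, k=length):
        if not bags[name]:
            # Bag is empty: start a new round with a fresh shuffled copy
            bags[name] = list(pools[name])
            rng.shuffle(bags[name])
        playlist.append(bags[name].pop())
    return playlist


pools = {
    'High': [(1, 2), 3, 4, 5, 6, (7, 8, 9)],
    'Medium': [10, 11, (12, 13), 14],
    'Low': [100, 101, 102, (103, 104)],
}
playlist = make_playlist(pools, weights=[4, 2, 1], length=30, seed=0)
print(playlist)
```

Because each bag is a copy, the original lists are never modified, and the playlist can be of any length.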

Using the for loop to pull data from two lists [duplicate]

This question already has answers here:
Is there a better way to iterate over two lists, getting one element from each list for each iteration? [duplicate]
(7 answers)
Closed 9 years ago.
In Python I basically have two lists:
xvales = [2, 4, 6, 8, 10, 12]
yvales = [100, 95, 90, 85, 80, 75]
sumation = 0
How can I use a for loop to pull corresponding values from each list and use them in a formula? On the first iteration i=2 and j=100, on the second i=4 and j=95, on the third i=6 and j=90, and so on.
This is what I tried:
For i in xvales and j in yvales:
    v = i **2 / ( j+1 )
    sumation += v
total = sum(i**2 / (j+1) for i,j in zip(xvales, yvales))
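Use zip
>>> list(zip(xvales, yvales))  # in Python 3, zip returns an iterator, so wrap it in list() to see the pairs
[(2, 100), (4, 95), (6, 90), (8, 85), (10, 80), (12, 75)]
Then, loop on it, and sum it:
sumation = sum(i **2 / ( j+1 ) for i, j in zip(xvales, yvales))
Edit
However, in Python 2 you probably want float division, because / on two integers truncates and the sum comes out as 2. In Python 3, / is already true division, so the float() call is optional:
sumation = sum(i **2 / float( j+1 ) for i, j in zip(xvales, yvales))
# 4.475365812518561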
Use the zip function to iterate over both at once.
Maintaining the rest of your structure:
for i, j in zip(xvales, yvales):
    v = i **2 / ( j+1 )
    sumation += v
I think this does it
xvales = [2, 4, 6, 8, 10, 12]
yvales = [100, 95, 90, 85, 80, 75]
summation = sum([i ** 2 / (j+1) for i,j in zip(xvales, yvales)])
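Put together as a small Python 3 script, the loop form and the one-line form give the same total. This is only a sketch that reuses the names from the question; loop_total and one_line_total are illustrative.

```python
xvales = [2, 4, 6, 8, 10, 12]
yvales = [100, 95, 90, 85, 80, 75]

# Explicit loop: zip pairs the first items, then the second items, and so on
loop_total = 0
for i, j in zip(xvales, yvales):
    loop_total += i ** 2 / (j + 1)   # / is true division in Python 3

# Same calculation as a generator expression passed to sum()
one_line_total = sum(i ** 2 / (j + 1) for i, j in zip(xvales, yvales))

print(loop_total, one_line_total)
```

zip stops at the end of the shorter input, so lists of unequal length are silently truncated to the common length.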
