Best way to split a dict based on key

I have a dict that looks like this (shortened)
{'18/09/2022-morning': [5.4, 6.0, 6.5, 6.7, 6.9, 7.9, 8.5, 7.5, 7.9, 7.8, 7.6, 6.8],
'18/09/2022-night': [6.4, 5.7, 4.8, 5.4, 4.7, 4.3],
'19/09/2022-morning': [3.8],
'19/09/2022-night': [4.1, 4.4, 4.3, 3.8, 3.5, 2.8]}
What is the best way to split it into two dictionaries based on morning/night?
I can't think of an easy way to do this! Example of desired output:
dic1 = {'18/09/2022-morning': [5.4, 6.0, 6.5, 6.7, 6.9, 7.9, 8.5, 7.5, 7.9, 7.8, 7.6, 6.8], '19/09/2022-morning': [3.8]}
dic2 = {'18/09/2022-night': [6.4, 5.7, 4.8, 5.4, 4.7, 4.3], '19/09/2022-night': [4.1, 4.4, 4.3, 3.8, 3.5, 2.8]}

mornings = {}
nights = {}
for k, v in d.items():
    if k.endswith("morning"):
        mornings[k] = v
    else:
        nights[k] = v
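If more suffixes than morning/night might appear later, the same loop can be generalized by grouping on whatever follows the last hyphen. This is only a sketch: `by_period` is a name introduced here for illustration, and it assumes every key has the form `date-period`.

```python
from collections import defaultdict

d = {
    '18/09/2022-morning': [5.4, 6.0, 6.5],
    '18/09/2022-night': [6.4, 5.7, 4.8],
    '19/09/2022-morning': [3.8],
    '19/09/2022-night': [4.1, 4.4, 4.3],
}

# by_period is a hypothetical name: one inner dict per suffix
by_period = defaultdict(dict)
for key, values in d.items():
    # rpartition splits on the LAST hyphen, so the date part is left intact
    _, _, period = key.rpartition("-")
    by_period[period][key] = values

mornings = by_period["morning"]
nights = by_period["night"]
```

The original key is kept unchanged in each inner dict, so `mornings` and `nights` have the same shape as the desired `dic1` and `dic2`.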

Related

I would like to create a new column based on conditions using .loc

I have the code:
to_test['averageRating'].unique()
array([5.8, 5.2, 5. , 6.5, 5.5, 7.3, 7.2, 4.2, 6.4, 7.1, 6.6, 5.4, 6.9,
6. , 6.1, 8.1, 6.3, 7.8, 3.9, 6.8, 6.2, 7.9, 7. , 4.9, 5.9, 7.5,
6.7, 8. , 5.7, 3.2, 4.8, 5.6, 7.4, 4.5, 3.6, 4.3, 3.4, 5.1, 4.4,
4.7, 7.7, 5.3, 4. , 8.4, 7.6, 3.3, 2.2, 3.7, 8.2, 4.1, 8.3, 1.7,
9. , 4.6, 8.5, 3.1, 3.8, 3.5, 1.9, 2.9, 2.8, 2.7, 9.2, 1.2, 2.1,
3. , 1.3, 1.1, 8.6, 2.5, 1. , 9.8, 8.7, 1.5, 9.3])
# create a list of our conditions
conditions = [(to_test.loc[(to_test['averageRating']>=0.0) & (to_test['averageRating'] <= 3.3)]),
(to_test.loc[(to_test['averageRating']>=3.4) & (to_test['averageRating'] <=6.6)]),
(to_test.loc[(to_test['averageRating']>=6.7) & (to_test['averageRating'] <=10)])]
# create a list of the values we want to assign for each condition
values = ['group1', 'group2', 'group3']
# create a new column and use np.select to assign values to it using our lists as arguments
to_test['group'] = np.select(conditions, values)
# display updated DataFrame
to_test.head()
but it's not working

This is a classic use case for pd.cut. Sample code:
df = pd.DataFrame({'averageRating' : np.random.uniform(0,10,100)})
df['group_using_cut'] = pd.cut(df['averageRating'],
[0,3.3,6.6,10],
labels=['group1','group2','group3'])
If you want to use np.select, pass the boolean conditions directly, without .loc.
Sample code:
conds = [
(df['averageRating']>=0.0) & (df['averageRating'] <= 3.3),
(df['averageRating']>=3.4) & (df['averageRating'] <= 6.6),
(df['averageRating']>=6.7) & (df['averageRating'] <= 10),
]
df['group_using_selec'] = np.select(conds,['group1','group2','group3'])
Output df.head()
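The attempt in the question most likely fails because each `to_test.loc[...]` returns a filtered DataFrame, while `np.select` expects boolean arrays of the same length as the column. A small sketch of both approaches follows. The six ratings are made-up sample data, and the `"unknown"` default is an assumption added here so that the fallback value is a string like the labels.

```python
import numpy as np
import pandas as pd

# Made-up sample data standing in for the asker's to_test frame
to_test = pd.DataFrame({"averageRating": [1.0, 3.3, 3.4, 6.6, 6.7, 9.8]})
rating = to_test["averageRating"]

# Boolean masks, not .loc subsets; np.select uses the first mask that is True
conditions = [rating <= 3.3, rating <= 6.6, rating <= 10]
values = ["group1", "group2", "group3"]
to_test["group"] = np.select(conditions, values, default="unknown")

# pd.cut bins are right-closed by default, so (0, 3.3], (3.3, 6.6], (6.6, 10];
# include_lowest=True also places a rating of exactly 0 in the first bin
to_test["group_using_cut"] = pd.cut(
    rating, [0, 3.3, 6.6, 10], labels=values, include_lowest=True
)
```

Because the masks are checked in order, each one only needs an upper bound, which also avoids leaving gaps such as the ratings between 3.3 and 3.4.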

Use itertools.groupby (or a neat, pythonic way) to group a list by the difference between the consecutive numbers

I've read this question
But my question is a little different:
For example:
[0.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 5.0, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9, 10.0]
should give me:
[[0.0], [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9], [5.0], [9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9, 10.0]]
All of the elements are floats. The difference between consecutive elements of a sublist should be no larger than 0.1.
I am trying to solve it without using a third-party module. (Not homework, just practice for Python.)
What I have tried: too much code with itertools.groupby (couldn't solve it). One of my attempts:
import itertools
lst = [0.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 5.0, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9, 10]
res = []
for _, item in itertools.groupby(enumerate(lst), key=lambda index_num: index_num[0]-index_num[1]):
    print(list(item)) # Not expected. The solution I mentioned didn't work.
I want a neat, pythonic way to solve it. Any tricks are welcome, I just want to learn more skills. :)
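One possible approach, sketched here with a plain loop rather than itertools.groupby: start a new sublist whenever the gap to the previous element exceeds the allowed step. The function name `group_by_gap` and the `tol` parameter are introduced for illustration only. The small tolerance is an assumption that guards against float rounding, since for example 1.3 - 1.2 is slightly more than 0.1 in binary floating point.

```python
def group_by_gap(values, max_gap=0.1, tol=1e-9):
    """Split a sorted list into runs whose consecutive gaps are <= max_gap."""
    groups = []
    for value in values:
        # Extend the current run if the gap to its last element is small enough
        if groups and value - groups[-1][-1] <= max_gap + tol:
            groups[-1].append(value)
        else:
            groups.append([value])
    return groups


lst = [0.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 5.0,
       9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9, 10.0]
res = group_by_gap(lst)
```

The index-minus-value key from the attempt above only groups runs of consecutive integers, which is why it does not carry over to floats with a 0.1 step.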

Sum combination of lists by element

I have a nested list, which can be of varying length (each sublist will always contain the same number of elements as the others):
list1=[[4.1,2.9,1.2,4.5,7.9,1.2],[0.7,1.1,2.0,0.4,1.8,2.2],[5.1,4.1,6.5,7.1,2.3,3.6]]
I can find every possible combination of sublists of length n using itertools:
n=2
itertools.combinations(list1,n)
[([4.1, 2.9, 1.2, 4.5, 7.9, 1.2], [0.7, 1.1, 2.0, 0.4, 1.8, 2.2]),
([4.1, 2.9, 1.2, 4.5, 7.9, 1.2], [5.1, 4.1, 6.5, 7.1, 2.3, 3.6]),
([0.7, 1.1, 2.0, 0.4, 1.8, 2.2], [5.1, 4.1, 6.5, 7.1, 2.3, 3.6])]
I would like to sum all lists in each tuple by index. In this example, I would end up with:
[([4.8, 4.0, 3.2, 4.9, 9.7, 3.4],
[9.2, 7.0, 7.7, 11.6, 10.2, 4.8],
[5.8, 5.2, 8.5, 7.5, 4.1, 5.8])]
I have tried:
[sum(x) for x in itertools.combinations(list1,n)]
[sum(x) for x in zip(*itertools.combinations(list1,n))]
Both raise errors.

You can use zip for this:
>>> [tuple(map(sum, zip(*x))) for x in itertools.combinations(list1, n)]
[(4.8, 4.0, 3.2, 4.9, 9.700000000000001, 3.4000000000000004),
(9.2, 7.0, 7.7, 11.6, 10.2, 4.8),
(5.8, 5.199999999999999, 8.5, 7.5, 4.1, 5.800000000000001)]

Try this:
>>> list1=[[4.1,2.9,1.2,4.5,7.9,1.2],[0.7,1.1,2.0,0.4,1.8,2.2],[5.1,4.1,6.5,7.1,2.3,3.6]]
>>> from itertools import combinations as c
>>> list(list(map(sum, zip(*k))) for k in c(list1, 2))
[[4.8, 4.0, 3.2, 4.9, 9.700000000000001, 3.4000000000000004], [9.2, 7.0, 7.7, 11.6, 10.2, 4.8], [5.8, 5.199999999999999, 8.5, 7.5, 4.1, 5.800000000000001]]
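Both answers rely on the same idea: `zip(*x)` transposes the tuple of sublists so that elements at the same index are grouped together, and `sum` then adds each group. The sketch below spells that out and, as an optional extra that is not part of either answer, rounds each sum to one decimal place to hide float artifacts such as 9.700000000000001.

```python
import itertools

list1 = [[4.1, 2.9, 1.2, 4.5, 7.9, 1.2],
         [0.7, 1.1, 2.0, 0.4, 1.8, 2.2],
         [5.1, 4.1, 6.5, 7.1, 2.3, 3.6]]
n = 2

sums = []
for combo in itertools.combinations(list1, n):
    # zip(*combo) yields one tuple per index: (4.1, 0.7), (2.9, 1.1), ...
    columns = zip(*combo)
    # Rounding to one decimal matches the precision of the input data
    sums.append(tuple(round(sum(column), 1) for column in columns))
```

The attempts in the question fail because `sum(x)` tries to add whole lists to the integer 0; summing the transposed columns avoids that.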

How can I convert a dict_keys list to integers

I am trying to find a way of converting a list within dict_keys() to an integer so I can use it as a trigger to send to another system. My code (below) imports a list of 100 words (a txt file with words each on a new line) which belong to 10 categories (e.g. the first 10 words belong to category 1, second 10 words belong to category 2 etc...).
Code:
from numpy.random import choice
from collections import defaultdict
number_of_elements = 10
words = open('file_location').read().split()
categories = defaultdict(list)
for i in range(len(words)):
    categories[i/number_of_elements].append(words[i])
category_labels = categories.keys()
category_labels
Output
dict_keys([0.0, 1.1, 2.0, 3.0, 4.9, 5.0, 0.5, 1.9, 8.0, 9.0, 1.3, 2.7, 3.9, 9.2, 9.4, 7.2, 4.2, 8.6, 5.1, 5.4, 3.3, 1.0, 6.6, 7.4, 7.7, 8.4, 5.8, 9.8, 0.7, 8.8, 2.1, 7.0, 6.4, 4.3, 0.1, 2.5, 3.8, 1.2, 6.9, 7.1, 5.6, 0.4, 5.3, 2.9, 7.3, 3.5, 9.5, 8.2, 2.8, 3.1, 0.9, 2.3, 8.1, 4.0, 6.3, 6.7, 4.5, 0.2, 1.7, 2.2, 8.9, 1.4, 7.6, 9.1, 7.8, 5.5, 4.8, 0.6, 3.2, 2.4, 6.5, 9.9, 9.6, 1.5, 6.0, 3.7, 4.7, 3.4, 5.9, 4.1, 1.6, 6.8, 9.3, 3.6, 8.5, 8.7, 0.3, 0.8, 7.5, 5.2, 2.6, 4.6, 5.7, 7.9, 6.1, 1.8, 8.3, 6.2, 9.7, 4.4])
What I need:
I would like the first number before the point (e.g. if it was 6.7, I just want the 6 as an int).
Thank you in advance for any help and/or advice!

Just convert your keys to integers using a list comprehension; note that there is no need to call .keys() here as iteration over the dictionary directly suffices:
[int(k) for k in categories]
You may want to bucket your values directly into integer categories rather than by floating point values:
categories = defaultdict(list)
for i, word in enumerate(words):
    categories[int(i / number_of_elements)].append(word)
I used enumerate() to pair words up with their index, rather than use range() plus indexing back into words.
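As a variation on the bucketing above, floor division (`//`) yields the integer category directly, with no float step in between. In this sketch the `words` list is a generated stand-in for the contents of the asker's file, which is not shown in the question.

```python
from collections import defaultdict

number_of_elements = 10
# Stand-in for the 100 words read from the file (assumption for illustration)
words = [f"word{i}" for i in range(100)]

categories = defaultdict(list)
for i, word in enumerate(words):
    # i // 10 is 0 for indices 0-9, 1 for 10-19, and so on
    categories[i // number_of_elements].append(word)

category_labels = list(categories)
```

With integer keys from the start there are only 10 categories, whereas applying `int()` to the 100 float keys afterwards gives a list in which each category number appears 10 times.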

ARIMA exogenous variable out of sample

fit = statsmodels.api.tsa.ARIMA(efRates[0], (1,1,1), exog=ueRate).fit(transparams=False)
predicts = fit.predict(start=len(efRates[0]), end = len(efRates[0])+11, exog=ueRate, typ = 'levels')
This raises:
File "C:\Users\Saul Ramirez\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\arima_model.py", line 720, in predict
if self.k_exog == 1 and exog.ndim == 1:
AttributeError: 'bool' object has no attribute 'ndim'
Some info: efRates[0] and ueRate are lists of the same length.
efRates[0]
[0.030052056971642007,
0.03917330288542586,
0.02828475062426216,
0.03644101079605235,
0.03378605359919436,
0.02743587918046455,
0.03342745492501596,
0.026205917483282503,
0.030503758568976337,
0.024550760529053202,
0.03261189266424876,
0.03506521240864593,
0.027338276601998696,
0.053725765854704746,
0.02676967429100413,
0.03442977438269886,
0.033314687425925964,
0.027406120117972988,
0.037085495711527916,
0.021131004053371122,
0.03342530957311805,
0.02011467948214261,
0.03674645825546184,
0.030766279328527657,
0.022010347634637235,
0.048441932020847935,
0.055182794314502556,
0.037653187998947804,
0.054329400023020905,
0.030487014172364307,
0.04828703019272537,
0.029364609341652963,
0.04420916320116292,
0.0245732204143899,
0.04007219462688283,
0.030088483595491378,
0.04503547974992547,
0.050414257448672777,
0.03650945820093438,
0.0271939590858418,
0.043825558271225154,
0.02887263694287208,
0.034395655516300985,
0.033476222069816444,
0.02364138126589003,
0.034956784469719566,
0.025488157761323762,
0.03284135171594629,
0.0352266773873871,
0.02578522887525815,
0.030801158226067212,
0.017836011389627614,
0.03237266466197845,
0.020781381627205192,
0.03507981277516531,
0.030619701683938114,
0.0200645972051283,
0.02340543468851082,
0.022232375406303732,
0.031450255120488005,
0.030807264010862326,
0.02520300632649576,
0.02683432106844716,
0.01719544921035768,
0.022245308176032028,
0.015787396423808154,
0.02236691164709978,
0.022948859956318242,
0.018302596298743336,
0.02356268219722402,
0.020514907102090335,
0.029322000183361653,
0.030253386469667742,
0.02389996663574461,
0.026350732450672106,
0.018634569853141162,
0.02993859530565429,
0.01762489169698181,
0.028369112029450066,
0.024207088908217232,
0.019513438046869554,
0.02149236584384482,
0.020792834468107983,
0.0252767276304043,
0.025754940371044845,
0.01633653635317383,
0.02562719118582408,
0.01718720874173012,
0.02915438356543398,
0.017238835380189263,
0.028044663751279383,
0.027504015027686957,
0.020989801458819447,
0.025215885766374995,
0.02422123160263125,
0.03253702270430853,
0.02095284431753602,
0.03241141468118923,
0.018667854534336364,
0.03997670839216877,
0.022116655885610726,
0.030336876645878957,
0.03418820217137176,
0.018663800522544426,
0.02623414798030232,
0.020524065760586897]
ueRate
[4.9,
5,
5,
5,
4.9,
4.7,
4.8,
4.7,
4.7,
4.6,
4.6,
4.7,
4.7,
4.5,
4.4,
4.5,
4.4,
4.6,
4.5,
4.4,
4.5,
4.4,
4.6,
4.7,
4.6,
4.7,
4.7,
4.7,
5,
5,
4.9,
5.1,
5,
5.4,
5.6,
5.8,
6.1,
6.1,
6.5,
6.8,
7.3,
7.8,
8.3,
8.7,
9,
9.4,
9.5,
9.5,
9.6,
9.8,
10,
9.9,
9.9,
9.7,
9.8,
9.9,
9.9,
9.6,
9.4,
9.5,
9.5,
9.5,
9.5,
9.8,
9.4,
9.1,
9,
9,
9.1,
9,
9.1,
9,
9,
9,
8.8,
8.6,
8.5,
8.2,
8.3,
8.2,
8.2,
8.2,
8.2,
8.2,
8.1,
7.8,
7.8,
7.8,
7.9,
7.9,
7.7,
7.5,
7.5,
7.5,
7.5,
7.3,
7.2,
7.2,
7.2,
7,
6.7,
6.6,
6.7,
6.7,
6.3,
6.3]
