How to yield all possibilities using efficient nesting in Python?

Forex triangular arbitrage problem:
I'm trying to find an efficient way to yield every matching combination of the keys from a dictionary's items(). Suppose the dictionary has N entries and I need all possible combinations of the form [[A,B],[A,C],[C,B]...]
My current approach is inefficient because of the three nested loops:
def Arb(tickers: dict) -> list:
    for first_pair in tickers.items():
        pair1: list = first_pair[0].split("/")
        for second_pair in tickers.items():
            pair2: list = second_pair[0].split("/")
            if pair2[0] == pair1[0] and pair2[1] != pair1[1]:
                for third_pair in tickers.items():
                    pair3: list = third_pair[0].split("/")
                    if pair3[0] == pair2[1] and pair3[1] == pair1[1]:
                        id1 = first_pair[1]["id"]
                        id2 = second_pair[1]["id"]
                        id3 = third_pair[1]["id"]
                        yield [pair1, id1, pair2, id2, pair3, id3]
What would be an efficient, Pythonic way to return a list with all possible items?
Here is an example:
tickers = {"EZ/TC": {
"id": 1
},
"LM/TH": {
"id": 2
},
"CD/EH": {
"id": 3
},
"EH/TC": {
"id":4
},
"LM/TC": {
"id": 5
},
"CD/TC":{
"id": 6
},
"BT/TH": {
"id": 7,
},
"BT/TX": {
"id": 8,
},
"TX/TH":{
"id": 9
}
}
print(list(Arb(tickers)))
[(['CD', 'TC'], 6, ['CD', 'EH'], 3, ['EH', 'TC'], 4), (['BT', 'TH'], 7, ['BT', 'TX'], 8, ['TX', 'TH'], 9)]
The output is a single list containing one entry per possibility.

You don't need to iterate over items() if you only need the keys and not the values. Use itertools.permutations to get every ordered triple of pairs, then keep the triples whose currencies match:
from itertools import permutations
from typing import List

def verify(v1, v2, v3):
    # v1 = A/B, v2 = A/C, v3 = C/B
    return v1[0] == v2[0] and v1[1] == v3[1] and v2[1] == v3[0]
def arb(tickers) -> List:
    c = permutations([x.split("/") for x in tickers], r=3)
    return list(filter(lambda x: verify(*x), c))

itertools.permutations and itertools.combinations could be helpful for this type of problem: https://docs.python.org/3/library/itertools.html
Here's a link with an example using itertools.permutations:
https://www.geeksforgeeks.org/python-itertools-permutations/
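Since the question is about efficiency and also needs the ids, here is a minimal sketch of an alternative that avoids scanning every ticker three times. It assumes every key has the form BASE/QUOTE and every value carries an "id", as in the question's example. The name find_triangles and the two lookup tables (by_pair, by_base) are illustrative and do not come from either answer.

```python
from collections import defaultdict
from typing import Dict, Iterator


def find_triangles(tickers: Dict[str, dict]) -> Iterator[tuple]:
    """Yield (pair1, id1, pair2, id2, pair3, id3) for every A/B, A/C, C/B triple."""
    # Exact-match lookup: (base, quote) -> id
    by_pair = {tuple(symbol.split("/")): info["id"] for symbol, info in tickers.items()}
    # Group quote currencies by base currency: base -> [quote, ...]
    by_base = defaultdict(list)
    for base, quote in by_pair:
        by_base[base].append(quote)

    for (a, b), id1 in by_pair.items():
        # Only look at pairs that share the base currency A
        for c in by_base[a]:
            if c == b:
                continue
            # The closing leg C/B is a single dictionary lookup
            id3 = by_pair.get((c, b))
            if id3 is not None:
                yield [a, b], id1, [a, c], by_pair[(a, c)], [c, b], id3


tickers = {
    "EZ/TC": {"id": 1}, "LM/TH": {"id": 2}, "CD/EH": {"id": 3},
    "EH/TC": {"id": 4}, "LM/TC": {"id": 5}, "CD/TC": {"id": 6},
    "BT/TH": {"id": 7}, "BT/TX": {"id": 8}, "TX/TH": {"id": 9},
}
triangles = list(find_triangles(tickers))
print(triangles)
```

With this layout the work per ticker depends on how many pairs share its base currency rather than on the total number of tickers, which should help when most pairs are unrelated.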

Related

Group list of dicts by two params and count grouped values

I have a list of dicts with id numbers. I need to group it by main_id and second_id and count the values in each group. What is the best way to do this in Python?
I tried with Pandas, but I don't get a dict with the groups and counts:
df = pd.DataFrame(data_list)
df2 = df.groupby('main_id').apply(lambda x: x.set_index('main_id')['second_id']).to_dict()
print(df2)
List looks like:
[
{
"main_id":34,
"second_id":"2149"
},
{
"main_id":82,
"second_id":"174"
},
{
"main_id":24,
"second_id":"4QCp"
},
{
"main_id":34,
"second_id":"2149"
},
{
"main_id":29,
"second_id":"126905"
},
{
"main_id":34,
"second_id":"2764"
},
{
"main_id":43,
"second_id":"16110"
}
]
I need a result like:
[
{
"main_id":43,
"second_id":"16110",
"count": 1
},
{
"main_id":34,
"second_id":"2149",
"count": 2
}
]
You could use collections (from the standard library) instead of pandas. I assigned the list of dicts to xs:
import collections
# create a list of tuples; each is (main_id, secondary_id)
ids = [ (x['main_id'], x['second_id']) for x in xs ]
# count occurrences of each tuple
result = collections.Counter(ids)
Finally, result is a Counter (a dict subclass), which can be readily converted to the final form (not shown).
Counter({(34, '2149'): 2,
(82, '174'): 1,
(24, '4QCp'): 1,
(29, '126905'): 1,
(34, '2764'): 1,
(43, '16110'): 1})
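The conversion that the answer leaves out could look like the sketch below. It reuses the answer's xs name for the input list; the name grouped is only illustrative.

```python
from collections import Counter

xs = [
    {"main_id": 34, "second_id": "2149"},
    {"main_id": 82, "second_id": "174"},
    {"main_id": 24, "second_id": "4QCp"},
    {"main_id": 34, "second_id": "2149"},
    {"main_id": 29, "second_id": "126905"},
    {"main_id": 34, "second_id": "2764"},
    {"main_id": 43, "second_id": "16110"},
]

# Count each (main_id, second_id) combination
counts = Counter((x["main_id"], x["second_id"]) for x in xs)

# Unpack every key tuple back into the dict layout the question asks for
grouped = [
    {"main_id": main_id, "second_id": second_id, "count": n}
    for (main_id, second_id), n in counts.items()
]
print(grouped)
```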

Faster way to convert Pandas dataframe into nested json

I have data that looks like this:
player, goals, matches
ronaldo, 10, 5
messi, 7, 9
I want to convert this dataframe into a nested json, such as this one:
{
"content":[
{
"player": "ronaldo",
"events": {
"goals": 10,
"matches": 5
}
},
{
"player": "messi",
"events": {
"goals": 7,
"matches": 9
}
}
]
}
This is my code, using list comprehension:
df = pd.DataFrame([['ronaldo', 10, 5], ['messi', 7, 9]], columns=['player', 'goals', 'matches'])
d = [{'events': df.loc[ix, ['goals', 'matches']].to_dict(), 'player': df.loc[ix, 'player']} for ix in range(df.shape[0])]
j = {}
j['content'] = d
This works, but the performance is really slow when I have a lot of data. Is there a faster way to do it?
Use pandas.DataFrame.to_json. Fast and easy: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_json.html
df.T.to_json()
Try:
df.to_json(orient="records")
The problem is that it doesn't nest goals and matches under an events key. I'm not sure whether you can do that without looping.
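One way to get the nested shape is to let pandas produce the flat records in a single call and then regroup the plain dicts. This is only a sketch: it still loops in Python, but over ordinary dicts instead of one df.loc lookup per row, which is likely where most of the time goes. The names records and content are illustrative.

```python
import json

import pandas as pd

df = pd.DataFrame(
    [["ronaldo", 10, 5], ["messi", 7, 9]],
    columns=["player", "goals", "matches"],
)

# One pandas call produces all rows as flat records with plain Python values
records = json.loads(df.to_json(orient="records"))

# Regroup each flat record: every column except "player" goes under "events"
content = {
    "content": [
        {
            "player": r["player"],
            "events": {k: v for k, v in r.items() if k != "player"},
        }
        for r in records
    ]
}
print(json.dumps(content, indent=2))
```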

Merging 2 dictionaries in python

I have two dictionaries:
a=
{
"2001935072": {
"WR": "48.9",
"nickname": "rogs30541",
"team": 2
},
....
}
and
b=
{
"2001935072": {
"wtr": 816
},
....
}
I've tried to merge them with both a.update(b) and a={**a, **b}, but both give this output when I print(a):
{
"2001935072": {
"wtr": 816
},
....
}
which is basically a=b. How can I merge a and b so that the output is
{
"2001935072": {
"WR": "48.9",
"nickname": "rogs30541",
"team": 2,
"wtr": 816
},
....
}
I would compute the union of the keys, then rebuild the dictionary, merging the inner dictionaries together with a helper function (inline dict merging with {**x, **y} is possible in Python 3.5+ but not before; see "How to merge two dictionaries in a single expression?")
a={
"2001935072": {
"WR": "48.9",
"nickname": "rogs30541",
"team": 2
}
}
b= {
"2001935072": {
"wtr": 816
},
}
def merge_two_dicts(x, y):
    """Given two dicts, merge them into a new dict as a shallow copy."""
    z = x.copy()
    z.update(y)
    return z
result = {k:merge_two_dicts(a.get(k,{}),b.get(k,{})) for k in set(a)|set(b)}
print(result)
result:
{'2001935072': {'WR': '48.9', 'nickname': 'rogs30541', 'team': 2, 'wtr': 816}}
notes:
a.get(k,{}) returns the value for k, or an empty dict if k is missing, so the merge still works and simply retains the values from the other dict.
merge_two_dicts is just a helper function. It should not be used on a and b directly, or it will give the wrong result, since the last merged dict "wins" and overwrites the other dict's values.
With Python 3.5+ you can do this without any helper function:
result = {k:{**a.get(k,{}),**b.get(k,{})} for k in set(a)|set(b)}
Try this:
for i, j in a.items():
    for x, y in b.items():
        if i == x:
            j.update(y)
print(a)  # your updated output
You can try a list comprehension plus a dict comprehension to achieve your result:
>>> a = {"2001935072":{"WR":"48.9","nickname":"rogs30541","team":2}}
>>> b = {"2001935072":{"wtr":816}}
>>> l = [(k, a.get(k, {}), b.get(k, {})) for k in set(a) | set(b)]
This will output:
>>> l
[('2001935072', {'WR': '48.9', 'nickname': 'rogs30541', 'team': 2}, {'wtr': 816})]
Finally, to achieve your desired output:
>>> dict((k,{**va,**vb}) for k,va,vb in l)
{'2001935072': {'WR': '48.9', 'nickname': 'rogs30541', 'team': 2, 'wtr': 816}}

Remove duplicate of a dictionary from list

How can I remove duplicates of the key "name"?
[
{
'items':[
{
'$oid':'5a192d0590866ecc5c1f1683'
}
],
'image':'image12',
'_id':{
'$oid':'5a106f7490866e25ddf70cef'
},
'name':'Amala',
'store':{
'$oid':'5a0a10ad90866e5abae59470'
}
},
{
'items':[
{
'$oid':'5a192d2890866ecc5c1f1684'
}
],
'image':'fourth shit',
'_id':{
'$oid':'5a106fa190866e25ddf70cf0'
},
'name':'Amala',
'store':{
'$oid':'5a0a10ad90866e5abae59470'
}
}
]
I want to merge together the dictionaries with the same "name" value.
Here is what I have tried:
b = []
for q in data:
    if len(data) == 0:
        b.append(q)
    else:
        for y in b:
            if q['name'] != y['name']:
                b.append(q)
But after trying this, the list b doesn't contain the unique dictionaries that I wanted.
You loop through the assembled list and if you find a dict with a different name, you add the current dict. The logic should be different: only add it if you don't find one with the same name!
That being said, you should maintain a set of seen names. That will make the check more performant:
b, seen = [], set()
for q in data:
    if q['name'] not in seen:
        b.append(q)
        seen.add(q['name'])

python nested dict in array group sum

I have this data structure in Python:
result = {
"data": [
{
"2015-08-27": {
"clicks": 10,
"views":20
}
},
{
"2015-08-28": {
"clicks": 6,
}
}
]
}
How can I add the elements of each dictionary? The output should be :
{
"clicks":16, # 10 + 6
"views":20
}
I am looking for a Pythonic solution for this. Any solutions using Counter are welcome but I am not able to implement it.
I have tried this but I get an error:
counters = []
for i in result:
    for k,v in i.items():
        counters.append(Counter(v))
sum(counters)
Your code was quite close to a workable solution, and we can make it work with a few important changes. The most important change is that we need to iterate over the "data" item in result.
from collections import Counter
result = {
"data": [
{
"2015-08-27": {
"clicks": 10,
"views":20
}
},
{
"2015-08-28": {
"clicks": 6,
}
}
]
}
counts = Counter()
for d in result['data']:
    for k, v in d.items():
        counts.update(v)
print(counts)
output
Counter({'views': 20, 'clicks': 16})
We can simplify that a little because we don't need the keys.
counts = Counter()
for d in result['data']:
    for v in d.values():
        counts.update(v)
The code you posted makes a list of Counters and then tries to sum them. I guess that's also a valid strategy, but unfortunately the sum built-in doesn't know how to add Counters together. But we can do it using functools.reduce.
from functools import reduce
counters = []
for d in result['data']:
    for v in d.values():
        counters.append(Counter(v))
print(reduce(Counter.__add__, counters))
However, I suspect that the first version will be faster, especially if there are lots of dicts to add together. Also, this version consumes more RAM, since it keeps a list of all the Counters.
Actually we can use sum to add the Counters together, we just have to give it an empty Counter as the start value.
print(sum(counters, Counter()))
We can combine this into a one-liner, eliminating the list by using a generator expression instead:
from collections import Counter
result = {
"data": [
{
"2015-08-27": {
"clicks": 10,
"views":20
}
},
{
"2015-08-28": {
"clicks": 6,
}
}
]
}
totals = sum((Counter(v) for i in result['data'] for v in i.values()), Counter())
print(totals)
output
Counter({'views': 20, 'clicks': 16})
This is not the best solution, as I am sure there are libraries that can get you there in a less verbose way, but it is one you can easily read.
res = {}
for x in result['data']:
    for y in x:
        for t in x[y]:
            res.setdefault(t, 0)
            res[t] += x[y][t]
print(res)  # {'clicks': 16, 'views': 20}
