Sort dictionary by key numeric with alphanumeric data

I have a Python list of dictionaries that looks like this:
[
{
"data": "somedata1",
"name": "prefix1.7.9"
},
{
"data": "somedata2",
"name": "prefix1.7.90"
},
{
"data": "somedata3",
"name": "prefix1.1.1"
},
{
"data": "somedata4",
"name": "prefix4.1.1"
},
{
"data": "somedata5",
"name": "prefix4.1.2"
},
{
"data": "somedata5",
"name": "other 123"
},
{
"data": "somedata6",
"name": "different"
},
{
"data": "somedata7",
"name": "prefix1.7.11"
},
{
"data": "somedata7",
"name": "prefix1.11.9"
},
{
"data": "somedata7",
"name": "prefix1.17.9"
}
]
Now I want to sort it by the "name" key.
If the postfix consists of numbers (separated by two dots), I want those parts to be sorted numerically.
For example, the resulting order should be:
different
other 123
prefix1.1.1
prefix1.7.9
prefix1.7.11
prefix1.7.90
prefix1.11.9
prefix1.17.9
prefix4.1.1
prefix4.1.2
Do you have an idea how to do this concisely and efficiently?
The only idea I had was to build a completely new list, but could this also be done with a lambda function?

You can use re.findall with a regex that extracts either non-numerical words or digits from each name, and convert those that are digits to integers for numeric comparisons. To avoid comparisons between strings and integers, make the key a tuple where the first item is a Boolean of whether the token is numeric and the second item is the actual key for comparison:
import re
# initialize your input list as the lst variable
lst.sort(
    key=lambda d: [
        (s.isdigit(), int(s) if s.isdigit() else s)
        for s in re.findall(r'[^\W\d]+|\d+', d['name'])
    ]
)
Demo: https://replit.com/#blhsing/ToughWholeInformationtechnology
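To see why the tuple key works, the following sketch prints the key produced for one name and then sorts a few names from the question. The helper name natural_key is an assumption made for illustration; the regex and the key logic are the ones from the answer above.

```python
import re

def natural_key(name):
    # Hypothetical helper name; same logic as the lambda above.
    # Each token becomes (is_number, value) so that strings are never
    # compared directly with integers.
    return [
        (s.isdigit(), int(s) if s.isdigit() else s)
        for s in re.findall(r'[^\W\d]+|\d+', name)
    ]

names = ["prefix1.7.9", "prefix1.7.90", "prefix1.1.1", "other 123",
         "different", "prefix1.7.11", "prefix1.11.9"]

example_key = natural_key("prefix1.7.90")
print(example_key)
# [(False, 'prefix'), (True, 1), (True, 7), (True, 90)]

ordered = sorted(names, key=natural_key)
print(ordered)
```

Because the Boolean comes first in each pair, a text token always sorts before a numeric token at the same position, and two tokens of the same kind are compared by value.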

You need a way to extract the prefix and the postfix from the 'name' values. This can be achieved with something like:
def extract_prefix(s: str) -> str:
    return s.split('.')[0]
def extract_postfix(s: str) -> tuple:
    try:
        # compare the dotted parts as integers, so that 7.9 < 7.11 < 7.90
        return tuple(int(part) for part in s.split('.')[1:])
    except ValueError:
        # a non-numeric postfix sorts before any numeric postfix with the same prefix
        # (a missing postfix gives an empty tuple as well)
        return ()
arr = [{'data': 'somedata1', 'name': 'prefix1.7.9'},
       {'data': 'somedata2', 'name': 'prefix1.7.90'},
       {'data': 'somedata3', 'name': 'prefix1.1.1'},
       {'data': 'somedata4', 'name': 'prefix4.1.1'},
       {'data': 'somedata5', 'name': 'prefix4.1.2'},
       {'data': 'somedata5', 'name': 'other 123'},
       {'data': 'somedata6', 'name': 'different'},
       {'data': 'somedata7', 'name': 'prefix1.7.11'},
       {'data': 'somedata7', 'name': 'prefix1.11.9'},
       {'data': 'somedata7', 'name': 'prefix1.17.9'}]
# sort by postfix first, then (stably) by prefix
result = sorted(sorted(arr, key=lambda d: extract_postfix(d['name'])), key=lambda d: extract_prefix(d['name']))
result:
[{'data': 'somedata6', 'name': 'different'},
 {'data': 'somedata5', 'name': 'other 123'},
 {'data': 'somedata3', 'name': 'prefix1.1.1'},
 {'data': 'somedata1', 'name': 'prefix1.7.9'},
 {'data': 'somedata7', 'name': 'prefix1.7.11'},
 {'data': 'somedata2', 'name': 'prefix1.7.90'},
 {'data': 'somedata7', 'name': 'prefix1.11.9'},
 {'data': 'somedata7', 'name': 'prefix1.17.9'},
 {'data': 'somedata4', 'name': 'prefix4.1.1'},
 {'data': 'somedata5', 'name': 'prefix4.1.2'}]

Since you want to sort numerically, you will need a helper function:
def split_name(s):
    nameparts = s.split('.')
    for i, p in enumerate(nameparts):
        if p.isdigit():
            nameparts[i] = int(p)
    return nameparts
# list.sort() sorts in place and returns None, so its result must not be assigned back
obj.sort(key=lambda x: split_name(x['name']))
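Since list.sort() reorders the list in place and returns None, while sorted() builds a new list, the two are easy to mix up. The sketch below contrasts them on a shortened version of the question's data; the variable names items, ordered and returned are assumptions for illustration.

```python
def split_name(s):
    # Same helper as above: numeric components become ints.
    parts = s.split('.')
    for i, p in enumerate(parts):
        if p.isdigit():
            parts[i] = int(p)
    return parts

items = [
    {"data": "somedata1", "name": "prefix1.7.9"},
    {"data": "somedata2", "name": "prefix1.7.90"},
    {"data": "somedata7", "name": "prefix1.7.11"},
    {"data": "somedata6", "name": "different"},
]

# sorted() returns a new list and leaves `items` untouched.
ordered = sorted(items, key=lambda x: split_name(x["name"]))

# list.sort() reorders `items` itself and returns None.
returned = items.sort(key=lambda x: split_name(x["name"]))

print([d["name"] for d in ordered])
print(returned)
```

Note that this helper only splits on dots, so it relies on every name having a text component in the same positions; a name such as 'prefix1.hello' compared with 'prefix1.1' would mix str and int and raise a TypeError.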

Here I first sort the names by version and store the result in a second list called rank. That list records the position of each name and is then used as the sort key for the original data (Mydata is the input list).
Code using pkg_resources:
from pkg_resources import parse_version
rank = sorted([v['name'] for v in Mydata], key=parse_version)
or
# place names that start with 'pre' after all other names
rank = sorted(sorted([v['name'] for v in Mydata], key=parse_version), key=lambda s: s[:3] == 'pre')
sorted(Mydata, key=lambda x: rank.index(x['name']))
Output:
[{'data': 'somedata6', 'name': 'different'},
{'data': 'somedata5', 'name': 'other 123'},
{'data': 'somedata3', 'name': 'prefix1.1.1'},
{'data': 'somedata1', 'name': 'prefix1.7.9'},
{'data': 'somedata7', 'name': 'prefix1.7.11'},
{'data': 'somedata2', 'name': 'prefix1.7.90'},
{'data': 'somedata7', 'name': 'prefix1.11.9'},
{'data': 'somedata7', 'name': 'prefix1.17.9'},
{'data': 'somedata4', 'name': 'prefix4.1.1'},
{'data': 'somedata5', 'name': 'prefix4.1.2'}]
With another input:
[{'data': 'somedata6', 'name': 'Aop'},
{'data': 'somedata6', 'name': 'different'},
{'data': 'somedata5', 'name': 'other 123'},
{'data': 'somedata7', 'name': 'pop'},
{'data': 'somedata3', 'name': 'prefix1.hello'},
{'data': 'somedata3', 'name': 'prefix1.1.1'},
{'data': 'somedata4', 'name': 'prefix1.2.hello'},
{'data': 'somedata1', 'name': 'prefix1.7.9'},
{'data': 'somedata7', 'name': 'prefix1.7.11'},
{'data': 'somedata2', 'name': 'prefix1.7.90'},
{'data': 'somedata7', 'name': 'prefix1.17.9'},
{'data': 'somedata7', 'name': 'prefix1.17.9'},
{'data': 'somedata5', 'name': 'prefix4.1.2'},
{'data': 'somedata7', 'name': 'prefix9.1.1'},
{'data': 'somedata7', 'name': 'prefix10.11.9'}]

Related

JSON viewers don't accept my pattern even after dict going through json.dumps() + json.loads()

The result when printing after a = json.dumps(dicter) and print(json.loads(a)) is this:
{
'10432981': {
'tournament': {
'name': 'Club Friendly Games',
'slug': 'club-friendly-games',
'category': {
'name': 'World',
'slug': 'world',
'sport': {
'name': 'Football',
'slug': 'football',
'id': 1
},
'id': 1468,
'flag': 'international'
},
'uniqueTournament': {
'name': 'Club Friendly Games',
'slug': 'club-friendly-games',
'category': {
'name': 'World',
'slug': 'world',
'sport': {
'name': 'Football',
'slug': 'football',
'id': 1
},
'id': 1468,
'flag': 'international'
},
'userCount': 0,
'hasPositionGraph': False,
'id': 853,
'hasEventPlayerStatistics': False,
'displayInverseHomeAwayTeams': False
},
'priority': 0,
'id': 86
}
}
}
But when I paste this into any JSON viewer, it warns that the format is incorrect without saying where the problem is.
If no error is raised when converting the dict to JSON, or when reading it back, why do the viewers report a failure?

JSON requires strings to be enclosed in double quotes ("). json.loads returns a Python dictionary, and printing it shows Python's own representation (single quotes, False instead of false), which is not valid JSON. If you want valid JSON, use the string that json.dumps returns.
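The difference is easy to see by printing both forms side by side. This is a small sketch with a made-up dictionary (the variable sample is not from the question):

```python
import json

sample = {"10432981": {"hasPositionGraph": False, "id": 853}}

as_json = json.dumps(sample)      # a str containing valid JSON
round_trip = json.loads(as_json)  # back to a Python dict

print(as_json)     # double quotes and lowercase false: valid JSON
print(round_trip)  # Python repr: single quotes and False, not JSON
```

Only the first printed line can be pasted into a JSON viewer; json.dumps(sample, indent=2) gives the same data in a more readable layout.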

Python Streamlit select box menu returns string, but I need dict or list

I'm stuck on this case: the Streamlit select box returns a string, but I need a dict or list to use further in my code.
I want to see company1, company2, company3 in the dropdown menu, and if the user chooses, for example, 'company2', get {'ID': 'zxc222', 'NAME': 'company2', 'DESC': 'comp2'}.
BaseObject = [{
'ID': 'zxc123',
'NAME': 'company1',
'DESC': 'comp1'
}, {
'ID': 'zxc222',
'NAME': 'company2',
'DESC': 'comp2'
}, {
'ID': 'zxc345',
'NAME': 'company3',
'DESC': 'comp3'
}]
lenbo = len(BaseObject)
options = []
for i in range(0, lenbo):
    options.append((BaseObject[i])['NAME'])
st.selectbox('Subdivision:', options)

You can do the conversion to a dict after the selectbox:
import streamlit as st
BaseObject = [{
'ID': 'zxc123',
'NAME': 'company1',
'DESC': 'comp1'
}, {
'ID': 'zxc222',
'NAME': 'company2',
'DESC': 'comp2'
}, {
'ID': 'zxc345',
'NAME': 'company3',
'DESC': 'comp3'
}]
lenbo = len(BaseObject)
options = []
for i in range(0, lenbo):
    options.append((BaseObject[i])['NAME'])
choice = st.selectbox('Subdivision:', options)
chosen_base_object = None
for base_object in BaseObject:
    if base_object["NAME"] == choice:
        chosen_base_object = dict(base_object)
print(chosen_base_object)  # e.g. {'ID': 'zxc345', 'NAME': 'company3', 'DESC': 'comp3'} when company3 is selected
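The search loop can also be replaced by a dictionary keyed on NAME. The sketch below leaves Streamlit out so that it runs on its own: choice stands in for the value st.selectbox would return, and by_name is a name chosen for illustration. It assumes the NAME values are unique.

```python
BaseObject = [
    {'ID': 'zxc123', 'NAME': 'company1', 'DESC': 'comp1'},
    {'ID': 'zxc222', 'NAME': 'company2', 'DESC': 'comp2'},
    {'ID': 'zxc345', 'NAME': 'company3', 'DESC': 'comp3'},
]

# Map each NAME to its full record (assumes names are unique).
by_name = {item['NAME']: item for item in BaseObject}
options = list(by_name)  # the strings shown in the dropdown

choice = 'company2'  # stand-in for st.selectbox('Subdivision:', options)
chosen_base_object = by_name[choice]
print(chosen_base_object)
```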

Assigning attributes to array of objects based on object array lookup

I have this code block that assigns an account type based on lists of account names. But it fails if an account name is not in any of the 'accts' arrays.
How do I add a "default" type if not found? Or is there a more elegant way to do this?
accts = [
{'id': 1425396, 'name': 'Banana'},
{'id': 1425399, 'name': 'Schwab Brokerage'},
{'id': 1425400, 'name': 'Schwab'},
{'id': 1425411, 'name': 'CapitalOne'},
{'id': 1425428, 'name': '401K'},
{'id': 1425424, 'name': 'Venmo'},
{'id': 1425428, 'name': 'Geico'},
{'id': 1425428, 'name': 'PayPal'},
{'id': 1426349, 'name': 'Coinbase'},
{'id': 1426349, 'name': 'XXX'}
]
for acct in accts:
    acct['acct_type'] = next(acct_type for acct_type in
        [
            {'acct_type':'checking', 'accts':['Schwab','Venmo']},
            {'acct_type':'credit', 'accts':['Banana','CapitalOne']},
            {'acct_type':'other', 'accts':['Geico','PayPal']},
            {'acct_type':'invest', 'accts':['Schwab Brokerage','401K','Coinbase']}
        ]
        if acct['name'] in acct_type['accts'])['acct_type']
The last (XXX) account causes this:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration

You could simply replace the list-of-dicts lookup with a dictionary. For example:
acct_types = {
"Schwab": "checking", "Venmo": "checking",
"Banana": "credit", "CapitalOne": "credit",
"Geico": "other", "PayPal": "other",
"Schwab Brokerage": "invest", "401K": "invest", "Coinbase": "invest"
}
Then all you need to do is look up each name in this dictionary with the .get() method. The second argument provides the default value.
for acct in accts:
    acct['acct_type'] = acct_types.get(acct['name'], "other")
Which gives the expected result:
[{'id': 1425396, 'name': 'Banana', 'acct_type': 'credit'},
{'id': 1425399, 'name': 'Schwab Brokerage', 'acct_type': 'invest'},
{'id': 1425400, 'name': 'Schwab', 'acct_type': 'checking'},
{'id': 1425411, 'name': 'CapitalOne', 'acct_type': 'credit'},
{'id': 1425428, 'name': '401K', 'acct_type': 'invest'},
{'id': 1425424, 'name': 'Venmo', 'acct_type': 'checking'},
{'id': 1425428, 'name': 'Geico', 'acct_type': 'other'},
{'id': 1425428, 'name': 'PayPal', 'acct_type': 'other'},
{'id': 1426349, 'name': 'Coinbase', 'acct_type': 'invest'},
{'id': 1426349, 'name': 'XXX', 'acct_type': 'other'}]
If you don't want to manually create the acct_types dictionary, you can easily convert the list-of-dicts that you already have:
lookup = [
{'acct_type':'checking', 'accts':['Schwab','Venmo']},
{'acct_type':'credit', 'accts':['Banana','CapitalOne']},
{'acct_type':'other', 'accts':['Geico','PayPal']},
{'acct_type':'invest', 'accts':['Schwab Brokerage','401K','Coinbase']}
]
acct_types = dict()
for item in lookup:
    for acct_name in item['accts']:
        acct_types[acct_name] = item['acct_type']

I figured it out, using the default parameter of next():
for acct in accts:
    acct['acct_type'] = next(
        (
            acct_type for acct_type in
            [
                {'acct_type':'checking', 'accts':['Schwab','Venmo']},
                {'acct_type':'credit', 'accts':['Banana','CapitalOne']},
                {'acct_type':'other', 'accts':['Geico','PayPal']},
                {'acct_type':'invest', 'accts':['Schwab Brokerage','401K','Coinbase']}
            ]
            if acct['name'] in acct_type['accts']
        ),
        {'acct_type':'other'}
    )['acct_type']
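The same idea in isolation: next() accepts an optional second argument that is returned when the iterator is exhausted, instead of raising StopIteration. A minimal sketch with a shortened lookup table follows; find_type is a hypothetical helper name, not part of the original code.

```python
lookup = [
    {'acct_type': 'checking', 'accts': ['Schwab', 'Venmo']},
    {'acct_type': 'credit', 'accts': ['Banana', 'CapitalOne']},
]

def find_type(name, default='other'):
    # Hypothetical helper: first matching acct_type, or the default.
    return next(
        (entry['acct_type'] for entry in lookup if name in entry['accts']),
        default,
    )

known = find_type('Venmo')
unknown = find_type('XXX')
print(known)    # checking
print(unknown)  # other
```

The generator expression has to be wrapped in its own parentheses here, because next() receives a second argument.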

Chinese character encoding with Django and Highcharts (unicode, utf8)

We must provide a series to Highcharts. In views.py, the data is:
series_1 = [
{'type': 'column', 'data': [4056], 'name': '二手家具'},
{'type': 'column', 'data': [3016], 'name': '家居百货'},
{'type': 'column', 'data': [3765], 'name': '虚拟物品'},
{'type': 'column', 'data': [4056], 'name': '服饰箱包'},
{'type': 'column', 'data': [3756], 'name': '闲置礼品'},
{'type': 'column', 'data': [4056], 'name': '图书/音乐/运动'},
{'type': 'column', 'data': [3765], 'name': '农用品'},
{'type': 'column', 'data': [4052], 'name': '母婴/儿童用品'},
{'type': 'column', 'data': [4056], 'name': '二手手机'},
{'type': 'column', 'data': [4055], 'name': '美容护肤/化妆品'},
{'type': 'column', 'data': [4055], 'name': '二手笔记本'},
{'type': 'column', 'data': [4055], 'name': '电子数码'},
{'type': 'column', 'data': [3765], 'name': '设备/办公用品'},
{'type': 'column', 'data': [4055], 'name': '台式电脑/网络'},
{'type': 'column', 'data': [4055], 'name': '老年用品'},
{'type': 'column', 'data': [4054], 'name': '家用电器'}]
The view function is:
def chart2(request):
    context = {
        'series': series_1
    }
    return render(request, 'chart2.html', context)
The Highcharts code in chart2.html is:
<script>
$(function () {
    $('#container').highcharts({
        title: {
            text: 'Tongji',
            x: -20 //center
        },
        yAxis: {
            title: {
                text: 'number'
            },
            plotLines: [{
                value: 0,
                width: 1,
                color: '#808080'
            }]
        },
        series: {{series|safe}}
    });
});
</script>
When I run it, I can only see garbled text.
If I use unicode strings like u'二手家具', Highcharts also reports an error.
I really don't know how to deal with this problem. Please help me, thank you!

Getting a python dict.keys trail for each item

I have a python dict that looks like this
{'data': [{'data': [{'data': 'gen1', 'name': 'objectID'},
{'data': 'familyX', 'name': 'family'}],
'name': 'An-instance-of-A'},
{'data': [{'data': 'gen2', 'name': 'objectID'},
{'data': 'familyY', 'name': 'family'},
{'data': [{'data': [{'data': '21',
'name': 'objectID'},
{'data': 'name-for-21',
'name': 'name'},
{'data': 'no-name', 'name': None}],
'name': 'An-instance-of-X:'},
{'data': [{'data': '22',
'name': 'objectID'}],
'name': 'An-instance-of-X:'}],
'name': 'List-of-2-X-elements:'}],
'name': 'An-instance-of-A'}],
'name': 'main'}
The structure is repeating and its rule is like:
A dict contains 'name' and 'data'
'data' can contain a list of dicts
If 'data' is not a list, it is a value I need.
'name' is just a name
The problem is that for each value, I need to know every info for each parent.
So at the end, I need to print a list with items that looks something like:
objectID=gen2 family=familyY An-instance-of-X_objectID=21 An-instance-of-X_name=name-for-21
Edit: This is only one of several lines I want as the output. I need one line like this for each item that doesn’t have a dict as 'data'.
So, for each data that is not a dict, traverse up, find info and print it..
I don't know every function in modules like itertools and collections. But is there something in there I can use? What is this called (when I am trying to do research on my own)?
I can find many "flatten dict" methods, but not like this, not when I have 'data', 'name' like this..

This is a wonderful example of what recursion is good for:
input_ = {'data': [{'data': [{'data': 'gen1', 'name': 'objectID'},
{'data': 'familyX', 'name': 'family'}],
'name': 'An-instance-of-A'},
{'data': [{'data': 'gen2', 'name': 'objectID'},
{'data': 'familyY', 'name': 'family'},
{'data': [{'data': [{'data': '21',
'name': 'objectID'},
{'data': 'name-for-21',
'name': 'name'},
{'data': 'no-name', 'name': None}],
'name': 'An-instance-of-X:'},
{'data': [{'data': '22',
'name': 'objectID'}],
'name': 'An-instance-of-X:'}],
'name': 'List-of-2-X-elements:'}],
'name': 'An-instance-of-A'}],
'name': 'main'}
def parse_dict(d, predecessors, output):
    """Recurse into dict and fill list of path-value-pairs"""
    data = d["data"]
    name = d["name"]
    # drop the trailing colon that some names carry
    name = name.strip(":") if type(name) is str else name
    if type(data) is list:
        for d_ in data:
            parse_dict(d_, predecessors + [name], output)
    else:
        # leaf value: record the full path of names leading to it
        output.append(("_".join(map(str, predecessors + [name])), data))
result = []
parse_dict(input_, [], result)
print("\n".join("%s=%s" % (path, value) for path, value in result))
Output:
main_An-instance-of-A_objectID=gen1
main_An-instance-of-A_family=familyX
main_An-instance-of-A_objectID=gen2
main_An-instance-of-A_family=familyY
main_An-instance-of-A_List-of-2-X-elements_An-instance-of-X_objectID=21
main_An-instance-of-A_List-of-2-X-elements_An-instance-of-X_name=name-for-21
main_An-instance-of-A_List-of-2-X-elements_An-instance-of-X_None=no-name
main_An-instance-of-A_List-of-2-X-elements_An-instance-of-X_objectID=22
I hope I understood your requirements correctly. If you don't want to join the paths into strings, you can keep the list of predecessors instead.
Greetings,
Thorsten
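As a sketch of the variant mentioned at the end of the answer, the version below keeps the predecessors as a tuple of names instead of joining them into a string. It is written as a generator; the name iter_leaves and the shortened sample input are assumptions for illustration, not part of the original answer.

```python
def iter_leaves(d, predecessors=()):
    """Yield (path, value) pairs, where path is a tuple of names."""
    name = d["name"]
    if isinstance(name, str):
        name = name.strip(":")
    path = predecessors + (name,)
    if isinstance(d["data"], list):
        # 'data' holds child dicts: recurse with the extended path
        for child in d["data"]:
            yield from iter_leaves(child, path)
    else:
        # 'data' is a plain value: this is a leaf
        yield path, d["data"]

sample = {
    "name": "main",
    "data": [
        {"name": "An-instance-of-X:", "data": [
            {"name": "objectID", "data": "21"},
            {"name": None, "data": "no-name"},
        ]},
    ],
}

leaves = list(iter_leaves(sample))
for path, value in leaves:
    print(path, value)
```

Keeping the path as a tuple makes it possible to decide later how much of it to print, for example only the last two names.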
