Good ways to sort a queryset? - Django - python

What I'm trying to do is this:
get the 30 Authors with the highest score ( Author.objects.order_by('-score')[:30] )
order those authors by last_name
Any suggestions?

What about
import operator
auths = Author.objects.order_by('-score')[:30]
ordered = sorted(auths, key=operator.attrgetter('last_name'))
In Django 1.4 and newer you can order by providing multiple fields.
Reference: https://docs.djangoproject.com/en/dev/ref/models/querysets/#order-by
order_by(*fields)
By default, results returned by a QuerySet are ordered by the ordering tuple given by the ordering option in the model’s Meta. You can override this on a per-QuerySet basis by using the order_by method.
Example:
ordered_authors = Author.objects.order_by('-score', 'last_name')[:30]
The result above will be ordered by score descending, then by last_name ascending. The negative sign in front of "-score" indicates descending order. Ascending order is implied.

I just wanted to illustrate that the built-in (SQL-only) solutions are not always the best ones. At first I thought that because Django's QuerySet.order_by method accepts multiple arguments, you could simply pass both fields:
ordered_authors = Author.objects.order_by('-score', 'last_name')[:30]
But it does not work as you might expect. Case in point: first, here is a list of presidents sorted by score (selecting the top 5 for easier reading):
>>> auths = Author.objects.order_by('-score')[:5]
>>> for x in auths: print x
...
James Monroe (487)
Ulysses Simpson (474)
Harry Truman (471)
Benjamin Harrison (467)
Gerald Rudolph (464)
Using Alex Martelli's solution, which correctly gives the top 5 people sorted by last_name:
>>> for x in sorted(auths, key=operator.attrgetter('last_name')): print x
...
Benjamin Harrison (467)
James Monroe (487)
Gerald Rudolph (464)
Ulysses Simpson (474)
Harry Truman (471)
And now the combined order_by call:
>>> myauths = Author.objects.order_by('-score', 'last_name')[:5]
>>> for x in myauths: print x
...
James Monroe (487)
Ulysses Simpson (474)
Harry Truman (471)
Benjamin Harrison (467)
Gerald Rudolph (464)
As you can see, it is the same result as the first one: last_name is only used to break ties between equal scores, so the top 5 are not re-sorted by name.
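A plain-Python sketch of the two approaches, using the five rows above as tuples (no Django involved; the tuple layout is an assumption made for illustration). The comments spell out how each sort key treats last_name:

```python
# (first_name, last_name, score) rows copied from the listings above.
rows = [
    ("Benjamin", "Harrison", 467),
    ("James", "Monroe", 487),
    ("Gerald", "Rudolph", 464),
    ("Ulysses", "Simpson", 474),
    ("Harry", "Truman", 471),
]

# Rough equivalent of order_by('-score', 'last_name')[:5]:
# last_name only matters when two scores are equal.
combined = sorted(rows, key=lambda r: (-r[2], r[1]))[:5]

# Rough equivalent of taking the top 5 by score, then re-sorting by last_name.
top = sorted(rows, key=lambda r: -r[2])[:5]
by_name = sorted(top, key=lambda r: r[1])

print([r[1] for r in combined])
print([r[1] for r in by_name])
```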

Here's a way that allows for ties at the cut-off score.
author_count = Author.objects.count()
# Score of the 30th-ranked author (or of the last one if there are fewer than 30).
cut_off_score = Author.objects.order_by('-score').values_list('score', flat=True)[min(30, author_count) - 1]
top_authors = Author.objects.filter(score__gte=cut_off_score).order_by('last_name')
You may get more than 30 authors in top_authors this way, and the min(30, author_count) is there in case you have fewer than 30 authors (this assumes there is at least one author).
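The same cut-off idea in plain Python, as a sketch on (last_name, score) tuples rather than a QuerySet. The function name top_with_ties and the sample data are made up for illustration:

```python
def top_with_ties(authors, n=30):
    """Return authors whose score is >= the n-th highest score, sorted by last name."""
    if not authors:
        return []
    scores = sorted((score for _, score in authors), reverse=True)
    # Index n - 1 is the n-th highest score; fall back to the lowest score
    # when there are fewer than n authors.
    cut_off = scores[min(n, len(scores)) - 1]
    return sorted((a for a in authors if a[1] >= cut_off), key=lambda a: a[0])


sample = [("Monroe", 487), ("Simpson", 474), ("Truman", 474), ("Harrison", 467)]
# Simpson and Truman tie at the cut-off, so asking for 2 returns 3 authors.
print(top_with_ties(sample, n=2))
```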

Related

I have the following algorithm question that I have solved, but my solution does not work for all test cases

I am preparing for a technical round, and while preparing I encountered this problem in LeetCode's interview questions section.
My solution only works when the input dict has all 3 items; with anything less than that, it throws an error.
I would also like to know how you would rank this question in terms of LC easy, medium and hard if it were actually in the problems section of LC.
PROBLEM:
Juan Hernandez is a Shopify merchant that owns a Pepper sauce shop
with five locations: Toronto, Vancouver, Montreal, Calgary and Halifax.
He also sells online and ships his sauces across the country from one
of his brick-and-mortar locations.
The pepper sauces he sells are:
Jalapeño (J)
Habanero (H)
Serrano (S)
The inventory count for each location looks like this:
City        J   H   S
Toronto     5   0   0
Vancouver  10   2   6
Montreal    3   5   5
Calgary     1  18   2
Halifax    28   2  12
Every time he gets an online order, he needs to figure out
which locations can fulfill that order. Write a function that
takes an order as input and outputs a list of locations which
have all the items in stock.
Example
Input: J:3 H:2 S:4
Output: Van, Mon, Hali
Input: H:7 S:1
Output: Cal
My Solution:
inven = {
    'tor': {'j':5,'h':0,'s':0},
    'van': {'j':10,'h':2,'s':6},
    'mon': {'j':3,'h':5,'s':5},
    'cal': {'j':1,'h':18,'s':2},
    'hal': {'j':28,'h':2,'s':12},
}
order = {
    'j':3,
    'h':2,
    's':4
}
def find_order(order):
    output = []
    for city in inven:
        if order['j'] <= inven[city]['j'] and order['h'] <= inven[city]['h'] and order['s'] <= inven[city]['s']:
            output.append(city)
    return output
print(find_order(order))
Sorry if the answer is something super easy. I am still kind of new to coding and it's my first technical round.
I only know Python as of now. If it's not your language, a hint in the right direction would be very helpful.
Here's a way to do it:
inven = {
    'tor': {'j':5,'h':0,'s':0},
    'van': {'j':10,'h':2,'s':6},
    'mon': {'j':3,'h':5,'s':5},
    'cal': {'j':1,'h':18,'s':2},
    'hal': {'j':28,'h':2,'s':12},
}
order = {
    'j':3,
    'h':2,
    's':4
}
order2 = {
    'h':7,
    's':1
}
def find_order(order):
    return [city for city, amts in inven.items() if all(amt >= order[sauce] for sauce, amt in amts.items() if sauce in order)]
print(find_order(order))
print(find_order(order2))
Output:
['van', 'mon', 'hal']
['cal']
Explanation:
in the list comprehension, we build a list containing each city that satisfies a condition
the condition is that all sauces found in the order are available in a given city in sufficient quantity to fill the order.
Some help from the docs:
all()
list comprehensions
dict.items()
Your solution looks very close to OK. I'm guessing that by fewer than 3 items you mean that not all types of sauces are present in the order. To fix the error you get in that case, you can check whether the dict contains all the expected keys ('j', 'h' and 's'), and if some of them are missing, insert them with a value of 0.
def find_order(order):
    if 'j' not in order:
        order['j'] = 0
    if 'h' not in order:
        order['h'] = 0
    if 's' not in order:
        order['s'] = 0
    output = []
    for city in inven:
        if order['j'] <= inven[city]['j'] and order['h'] <= inven[city]['h'] and order['s'] <= inven[city]['s']:
            output.append(city)
    return output
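A variant that avoids modifying the caller's order dict is to loop over only the sauces that were actually ordered. This is a sketch along the same lines, reusing the inventory from the question; the name find_locations is just for illustration:

```python
inven = {
    'tor': {'j': 5, 'h': 0, 's': 0},
    'van': {'j': 10, 'h': 2, 's': 6},
    'mon': {'j': 3, 'h': 5, 's': 5},
    'cal': {'j': 1, 'h': 18, 's': 2},
    'hal': {'j': 28, 'h': 2, 's': 12},
}

def find_locations(order, inventory=inven):
    # A city qualifies when it stocks at least the ordered amount of
    # every sauce in the order; sauces a city does not list count as 0.
    return [
        city
        for city, stock in inventory.items()
        if all(stock.get(sauce, 0) >= qty for sauce, qty in order.items())
    ]

print(find_locations({'j': 3, 'h': 2, 's': 4}))
print(find_locations({'h': 7, 's': 1}))
```

Because the order is never written to, the same dict can be reused for later calls without picking up extra zero-valued keys.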

how to filter django model object if contains in python django

I am trying to filter a model which contains sender, receiver and user. The problem is that when I use return Private_messages.objects.filter(sender__contains=sender,receiver__contains=receiver,user=user)
it is strict in checking and will only return rows where all the conditions are met.
Model example:
id:1
sender:"dan"
receiver:"harry"
id:2
sender:"harry"
receiver:"dan"
id:3
sender:"dan"
receiver:"harry"
Here dan is sometimes the sender and sometimes not. Private_messages.objects.filter(sender="dan",receiver="dan") only matches rows where both are dan.
I want to get all objects where the sender is dan or the receiver is dan, not only those where both are dan.
How can I get something like this in Django?
If you want to get the items where either sender is dan OR receiver is dan, you can do this:
Using Q objects (see the Django docs):
from django.db.models import Q
ans = Private_messages.objects.filter(Q(sender='dan') | Q(receiver='dan'))
Using the | operator on two querysets:
ans = Private_messages.objects.filter(sender='dan') | Private_messages.objects.filter(receiver='dan')
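For intuition about why the original call returns nothing: keyword arguments passed to a single filter() are combined with AND, while Q(...) | Q(...) combines with OR. A plain-Python sketch of that difference on the three example rows, with dicts standing in for model instances (no Django involved):

```python
messages = [
    {"id": 1, "sender": "dan", "receiver": "harry"},
    {"id": 2, "sender": "harry", "receiver": "dan"},
    {"id": 3, "sender": "dan", "receiver": "harry"},
]

# AND: what filter(sender='dan', receiver='dan') asks for.
both = [m["id"] for m in messages
        if m["sender"] == "dan" and m["receiver"] == "dan"]

# OR: what Q(sender='dan') | Q(receiver='dan') asks for.
either = [m["id"] for m in messages
          if m["sender"] == "dan" or m["receiver"] == "dan"]

print(both)    # []
print(either)  # [1, 2, 3]
```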
If I understood correctly, you want a union of the following querysets:
q1 = Private_messages.objects.filter(sender__contains=sender)
q2 = Private_messages.objects.filter(receiver__contains=receiver)
q3 = Private_messages.objects.filter(user=user)
return q1.union(q2, q3)
You can check out more info on the union operation here: https://docs.djangoproject.com/en/3.2/ref/models/querysets/#union
You could also use Q queries: https://docs.djangoproject.com/en/3.2/topics/db/queries/#complex-lookups-with-q-objects

How to extract URL from a redirect URL using regex in Python?

I have the following test_string from which I need to obtain the actual URL.
Test string (partly shown):
An experimental and modeling study of autoignition characteristics of
butanol/diesel blends over wide temperature ranges
<http://scholar.google.com/scholar_url?url=3Dhttps://www.sciencedirect.com/=
science/article/pii/S0010218020301346&hl=3Den&sa=3DX&d=3D448628313728630325=
1&scisig=3DAAGBfm26Wh2koXdeGZkQxzZbenQYFPytLQ&nossl=3D1&oi=3Dscholaralrt&hi=
st=3Dv2Y_3P0AAAAJ:17949955323429043383:AAGBfm1nUe-t2q_4mKFiHSHFEAo0A4rRSA>
Y Qiu, W Zhou, Y Feng, S Wang, L Yu, Z Wu, Y Mao=E2=80=A6 - Combustion and =
Flame,
2020
Desired output for part of test_string
https://www.sciencedirect.com/science/article/pii/S0010218020301346
I have been trying to obtain this with the MWE given below applied to many strings, but it gives only one URL.
MWE
from urlparse import urlparse, parse_qs
import re
from re import search
test_string = '''
Production, Properties, and Applications of ALPHA-Terpineol
<http://scholar.google.com/scholar_url?url=https://link.springer.com/content/pdf/10.1007/s11947-020-02461-6.pdf&hl=en&sa=X&d=12771069332921982368&scisig=AAGBfm1tFjLUm7GV1DRnuYCzvR4uGWq9Cg&nossl=1&oi=scholaralrt&hist=v2Y_3P0AAAAJ:17949955323429043383:AAGBfm1nUe-t2q_4mKFiHSHFEAo0A4rRSA>
A Sales, L de Oliveira Felipe, JL Bicas
Abstract ALPHA-Terpineol (CAS No. 98-55-5) is a tertiary monoterpenoid
alcohol widely
and commonly used in the flavors and fragrances industry for its sensory
properties.
It is present in different natural sources, but its production is mostly
based on ...
Save
<http://scholar.google.com/citations?update_op=email_library_add&info=oB2z7uTzO7EJ&citsig=AMD79ooAAAAAYLfmix3sQyUWnFrHeKYZxuK31qlqlbCh&hl=en>
Twitter
<http://scholar.google.com/scholar_share?hl=en&oi=scholaralrt&ss=tw&url=https://link.springer.com/content/pdf/10.1007/s11947-020-02461-6.pdf&rt=Production,+Properties,+and+Applications+of+%CE%B1-Terpineol&scisig=AAGBfm0yXFStqItd97MUyPT5nRKLjPIK6g>
Facebook
<http://scholar.google.com/scholar_share?hl=en&oi=scholaralrt&ss=fb&url=https://link.springer.com/content/pdf/10.1007/s11947-020-02461-6.pdf&rt=Production,+Properties,+and+Applications+of+%CE%B1-Terpineol&scisig=AAGBfm0yXFStqItd97MUyPT5nRKLjPIK6g>
An experimental and modeling study of autoignition characteristics of
butanol/diesel blends over wide temperature ranges
<http://scholar.google.com/scholar_url?url=3Dhttps://www.sciencedirect.com/=
science/article/pii/S0010218020301346&hl=3Den&sa=3DX&d=3D448628313728630325=
1&scisig=3DAAGBfm26Wh2koXdeGZkQxzZbenQYFPytLQ&nossl=3D1&oi=3Dscholaralrt&hi=
st=3Dv2Y_3P0AAAAJ:17949955323429043383:AAGBfm1nUe-t2q_4mKFiHSHFEAo0A4rRSA>
Y Qiu, W Zhou, Y Feng, S Wang, L Yu, Z Wu, Y Mao=E2=80=A6 - Combustion and =
Flame,
2020
Butanol/diesel blend is considered as a very promising alternative fuel
with
agreeable combustion and emission performance in engines. This paper
intends to
further investigate its autoignition characteristics with the combination
of a heated =E2=80=A6
[image: Save]
<http://scholar.google.com/citations?update_op=3Demail_library_add&info=3DE=
27Gd756Qj4J&citsig=3DAMD79ooAAAAAYImDxwWCwd5S5xIogWp9RTavFRMtTDgS&hl=3Den>
[image:
Twitter]
<http://scholar.google.com/scholar_share?hl=3Den&oi=3Dscholaralrt&ss=3Dtw&u=
rl=3Dhttps://www.sciencedirect.com/science/article/pii/S0010218020301346&rt=
=3DAn+experimental+and+modeling+study+of+autoignition+characteristics+of+bu=
tanol/diesel+blends+over+wide+temperature+ranges&scisig=3DAAGBfm19DOLNm3-Fl=
WaO0trAxZkeidxYWg>
[image:
Facebook]
<http://scholar.google.com/scholar_share?hl=3Den&oi=3Dscholaralrt&ss=3Dfb&u=
rl=3Dhttps://www.sciencedirect.com/science/article/pii/S0010218020301346&rt=
=3DAn+experimental+and+modeling+study+of+autoignition+characteristics+of+bu=
tanol/diesel+blends+over+wide+temperature+ranges&scisig=3DAAGBfm19DOLNm3-Fl=
WaO0trAxZkeidxYWg>
Using NMR spectroscopy to investigate the role played by copper in prion
diseases.
<http://scholar.google.com/scholar_url?url=3Dhttps://europepmc.org/article/=
med/32328835&hl=3Den&sa=3DX&d=3D16122276072657817806&scisig=3DAAGBfm1AE6Kyl=
jWO1k0f7oBnKFClEzhTMg&nossl=3D1&oi=3Dscholaralrt&hist=3Dv2Y_3P0AAAAJ:179499=
55323429043383:AAGBfm1nUe-t2q_4mKFiHSHFEAo0A4rRSA>
RA Alsiary, M Alghrably, A Saoudi, S Al-Ghamdi=E2=80=A6 - =E2=80=A6 and of =
the Italian
Society of =E2=80=A6, 2020
Prion diseases are a group of rare neurodegenerative disorders that develop
as a
result of the conformational conversion of normal prion protein (PrPC) to
the disease-
associated isoform (PrPSc). The mechanism that actually causes disease
remains =E2=80=A6
[image: Save]
<http://scholar.google.com/citations?update_op=3Demail_library_add&info=3Dz=
pCMKavUvd8J&citsig=3DAMD79ooAAAAAYImDx3r4gltEWBAkhl0g2POsXB9Qn4Lk&hl=3Den>
[image:
Twitter]
<http://scholar.google.com/scholar_share?hl=3Den&oi=3Dscholaralrt&ss=3Dtw&u=
rl=3Dhttps://europepmc.org/article/med/32328835&rt=3DUsing+NMR+spectroscopy=
+to+investigate+the+role+played+by+copper+in+prion+diseases.&scisig=3DAAGBf=
m1RidyRD-x2FOemP6iqCsr-6GAVKA>
[image:
Facebook]
<http://scholar.google.com/scholar_share?hl=3Den&oi=3Dscholaralrt&ss=3Dfb&u=
rl=3Dhttps://europepmc.org/article/med/32328835&rt=3DUsing+NMR+spectroscopy=
+to+investigate+the+role+played+by+copper+in+prion+diseases.&scisig=3DAAGBf=
m1RidyRD-x2FOemP6iqCsr-6GAVKA>
'''
regex = re.compile('(http://scholar.*?)&')
url_all = regex.findall(test_string)
citation_url = []
for i in url_all:
    if search('scholar.google.com',i):
        qs = parse_qs(urlparse(i).query).values()
        if search('http',str(qs[0])):
            citation_url.append(qs[0])
print citation_url
Present output
https://link.springer.com/content/pdf/10.1007/s11947-020-02461-6.pdf
Desired output
https://link.springer.com/content/pdf/10.1007/s11947-020-02461-6.pdf
https://www.sciencedirect.com/science/article/pii/S0010218020301346
https://europepmc.org/article/med/32328835
How can I handle URL text that is wrapped with an equals sign and extract the redirect URL in Python?
You could match either a question mark or an ampersand using the character class [&?]. Looking at the example data, for the url= part you can allow optional newlines and an optional equals sign between the letters and adjust accordingly.
Some URLs start with 3D; you can make that part optional using a non-capturing group (?:3D)?
Then capture in group 1 a match of http followed by any characters except &
\bhttp://scholar\.google\.com.*?[&?]\n?u=?\n?r\n?l\n?=(?:3D)?(http[^&]+)
See this regex pattern; I think it might help to extract the redirect URL:
(http:\/\/scholar[\w.\/=&?]*)[?]?u[=]?rl=([\w\:.\/\-=]+)
also see this example here https://regex101.com/r/dmkF3h/3
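An alternative sketch that keeps the line wrapping out of the regex: the =3D sequences and the trailing = at line ends look like quoted-printable encoding from an email body. Assuming that is what they are, the text can be decoded first with the standard-library quopri module, and the url parameter then read with parse_qs. This is written for Python 3 (where urlparse lives in urllib.parse), and extract_targets is a name made up here. Note that the test_string in the question mixes one already-decoded entry with encoded ones; decoding should only be applied to text that is really still encoded, ideally the raw message body.

```python
import quopri
import re
from urllib.parse import urlparse, parse_qs

def extract_targets(text):
    # Undo quoted-printable: "=3D" becomes "=", and "=" at end of line joins lines.
    raw = quopri.decodestring(text.encode("utf-8"))
    decoded = raw.decode("utf-8", errors="replace")
    targets = []
    # Each link in the alert is wrapped in <...>; only scholar_url links are kept.
    pattern = r"<(http://scholar\.google\.com/scholar_url\?[^>]+)>"
    for link in re.findall(pattern, decoded):
        url_values = parse_qs(urlparse(link).query).get("url")
        if url_values:
            targets.append(url_values[0])
    return targets

sample = """butanol/diesel blends over wide temperature ranges
<http://scholar.google.com/scholar_url?url=3Dhttps://www.sciencedirect.com/=
science/article/pii/S0010218020301346&hl=3Den&sa=3DX&d=3D448628313728630325=
1&scisig=3DAAGBfm26Wh2koXdeGZkQxzZbenQYFPytLQ&nossl=3D1&oi=3Dscholaralrt&hi=
st=3Dv2Y_3P0AAAAJ:17949955323429043383:AAGBfm1nUe-t2q_4mKFiHSHFEAo0A4rRSA>
"""
print(extract_targets(sample))
```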

Get the members from a Twitter list with python

I am trying to create a DataFrame with some data about the European Parliament members. However, I am struggling with the data returned by the tweepy package.
api = tweepy.API(auth)
# Iterate through all members of the owner's list
for member in tweepy.Cursor(api.list_members, 'Europarl_EN', 'all-meps-on-twitter').items():
    m = member
    print(member)
The problem is that I do not know how to get a readable table after this. I also tried this, just to get the names:
lel = api.list_members('Europarl_EN', 'all-meps-on-twitter', -10)
for i in lel:
    print(i.name)
And the output is:
Jaromír Kohlíček
István Ujhelyi
Deli Andor
Maria Grapini
Winkler Gyula
LefterisChristoforou
Mircea Diaconu
Maria Heubuch
Daniel Buda
Marijana Petir
Maite Pagazaurtundúa
Janice Atkinson
Andrew Lewer
Martina Michels
Joachim Starbatty
Peter Jahr
Emil Radev
József Nagy
Quisthoudt-Rowohl
Dominique Bilde
All in all, my intention is to transform lel into a dataframe or, in the worst case, to get the usernames.
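One possible direction, sketched without calling Twitter: the user objects tweepy returns expose attributes such as name and screen_name, so they can be flattened into a list of dicts, which pandas.DataFrame(rows) accepts directly. The SimpleNamespace objects and the screen names below are stand-ins invented for the example, not real API output, and members_to_rows is a hypothetical helper:

```python
from types import SimpleNamespace

# Stand-ins for the User objects in `lel`; only the attributes used are set.
lel = [
    SimpleNamespace(name="Maria Grapini", screen_name="example_handle_1"),
    SimpleNamespace(name="Peter Jahr", screen_name="example_handle_2"),
]

def members_to_rows(members, fields=("name", "screen_name")):
    # One dict per member; attributes a member lacks become None.
    return [{f: getattr(m, f, None) for f in fields} for m in members]

rows = members_to_rows(lel)
usernames = [row["screen_name"] for row in rows]
print(rows)
print(usernames)
# With pandas installed, the table would be: df = pandas.DataFrame(rows)
```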

How can you parse a document stored in the MARC21 format with Python

Yesterday Harvard released open access to all its library metadata (some 12 million records).
I was looking to parse the data and play with it, as the goal of the release was to "support innovation".
I downloaded the 12GB tarball and unpacked it to find 13 .mrc files of about 800MB each.
MARC21 format
When I looked at the head and tail of the first few files, they looked very unstructured, even after reading a bit about MARC21.
Here's what the first 4k of the first file look like:
$ head -c 4000 ab.bib.00.20120331.full.mrc
00857nam a2200253 a 4500001001200000005001700012008004100029010001700070020001200087035001600099040001800115043001200133050002500145245011100170260004900281300002100330504004100351610006400392650005300456650003500509700003800544988001300582906000800595002000001-420020606093309.7880822s1985 unr b 000 0 ruso a 86231326 c0.45rub0 aocm18463285 aDLCcDLCdHLS ae-ur-un0 aJN6639.A8bK665 198500aInformat︠s︡ii︠a︡ v rabote partiĭnykh komitetov /c[sostavitelʹ Stepan Ivanovich I︠A︡lovega]. aKiev :bIzd-vo polit. lit-ry Ukrainy,c1985. a206 p. ;c20 cm. aIncludes bibliographical references.20aKomunistychna partii︠a︡ UkraïnyxInformation services. 0aParty committeeszUkrainexInformation services. 0aInformation serviceszUkraine.1 aI︠A︡lovega, Stepan Ivanovich. a20020608 0DLC00418nam a22001335u 4500001001200000005001700012008004100029110003000070245004600100260006000146500005800206988001300264906000700277002000002-220020606093309.7900925|1944 mx |||||||spa|d1 aCampeche (Mexico : State)10aLey del notariado del estado de Campeche.0 a[Campeche]bDepartamento de prensa y publicidad,c1944. aAt head of title: Gobierno constitucional del estado. a20020608 0MH00647nam a2200229M 4500001001200000005001700012008004100029010001700070035001600087040001300103041001100116050003600127100004200163245004100205246005600246260001600302300001900318500001500337650004400352988001300396906000800409002000003-020051201172535.0890331s1902 xx d 000 0 ota a 73960310 0 aocm23499219 aDLCcEYM0 aotaara0 aPJ6636.T8bU5 1973 (Orien Arab)1 aUnsī, Muḥammad ʻAlī ibn Ḥasan.10aQāmūs al-lughah al-ʻUthmānīyah.3 aDarārī al-lāmiʻāt fī muntakhabāt al-lughāt. c[1902 1973] a564 p.c22 cm. aRomanized. 0aTurkish languagevDictionariesxArabic. 
a20020608 0DLC00878nam a2200253 a 4500001001200000005001700012008004100029010001700070035001600087040001800103043001200121050002300133245012800156246004600284260006300330300004800393500003300441610003200474650005000506700002400556710002300580988001300603906000800616002000004-920020606093309.7880404s1980 yu fa 000 0 scco a 82167322 0 aocm17880048 aDLCcDLCdHLS ae-yu---0 aL53.P783bT75 198000aTrideset pet godina Prosvetnog pregleda, 1945-1980 /c[glavni i odgovorni urednik i urednik publikacije Ružica Petrović].3 a35 godina Prosvetnog pregleda, 1945-1980. aBeograd :bNovinska organizacija Prosvetni pregled,c1980. a146 p., [21] p. of plates :bill. ;c29 cm. aIn Serbo-Croatian (Cyrillic)20aProsvetni pregledxHistory. 0aEducationzYugoslaviaxHistoryy20th century.1 aPetrović, Ružica.2 aProsvetni pregled. a20020608 0DLC00449nam a22001455u 4500001001200000005001700012008004100029245008200070260002800152300001100180440006600191700002600257988001300283906000700296002000005-720020606093309.7900925|1981 pl |||||||pol|d10aZ zagadnień dialektyki i świadomości społecznej /cpod red. K. Ślęczka.0 aKatowice :bUŚ,c1981. a135 p. 0aPrace naukowe Uniwersytetu Śląskiego w Katowicach ;vnr 4621 aŚlęczka, Kazimierz. a20020608 0MH00331nam a22001455u 4500001001200000005001700012008004100029100002200070245002200092250001200114260002800126300001100154988001300165906000700178002000006-520020606093309.7900925|1980 pl |||||||pol|d1 aMencwel, Andrzej.10aWidziane z dołu. aWyd. 1.0 aWarszawa :bPIW,c1980. a166 p. a20020608 0MH00746cam a2200241 a 4500001001200000005001700012008004100029010001700070020001500087035001600102040001800118050002400136082001600160100001600176245008000192260007100272300002500343504004100368650003400409650004000443988001300483906000800496002000007-300000000000000.0900123s1990 enk b 001 0 eng a 90031350 a03910368230 aocm21081069 aDLCcDLCdHBS00aHF5439.8b.O35 199
Has anyone ever had to work with MARC21 before? Does it typically look like this, or do I need to parse it differently?
pymarc is the best option to parse MARC21 records using Python (full disclosure: I'm one of its maintainers). If you're unfamiliar with working with MARC21, it's worth reading through some of the specification you linked to on the Library of Congress website. I'd also read through the Working with MARC page on the Code4lib wiki.
You may want to check this out - pymarc
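On the "does it typically look like this" part: yes, that is what binary MARC21 (ISO 2709) looks like. Each record starts with a 24-character leader whose first five digits are the record length (00857 in the dump above) and whose characters 12 to 16 give the base address of the data (00253). After the leader comes a directory of 12-character entries (3-digit tag, 4-digit field length, 5-digit start offset), for example 001 0012 00000 in the dump. Fields end with the byte 0x1E, subfields start with 0x1F, and the record ends with 0x1D, which is why the file looks unstructured in a terminal. Below is a rough sketch of reading one record by hand, for illustration only: pymarc handles character encodings and malformed records that this does not, and parse_record, build_record and the tiny sample record are made up here.

```python
FIELD_END, SUBFIELD, RECORD_END = b"\x1e", b"\x1f", b"\x1d"

def parse_record(raw):
    """Split one binary MARC record into (tag, field_bytes) pairs."""
    base = int(raw[12:17])            # base address of data, from the leader
    directory = raw[24:base - 1]      # the directory ends with a field terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])
        start = int(entry[7:12])
        data = raw[base + start: base + start + length]
        fields.append((tag, data.rstrip(FIELD_END)))
    return fields

def build_record(fields):
    """Assemble a minimal record so the example is self-contained."""
    body, directory, offset = b"", b"", 0
    for tag, data in fields:
        data += FIELD_END
        directory += b"%s%04d%05d" % (tag.encode("ascii"), len(data), offset)
        body += data
        offset += len(data)
    directory += FIELD_END
    base = 24 + len(directory)
    total = base + len(body) + 1      # +1 for the record terminator
    leader = b"%05dnam a22%05d a 4500" % (total, base)
    return leader + directory + body + RECORD_END

raw = build_record([
    ("001", b"000000001-4"),
    ("245", b"00" + SUBFIELD + b"aWidziane z do\xc5\x82u."),
])
for tag, data in parse_record(raw):
    print(tag, data.split(SUBFIELD))
```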
Disclaimer: I'm the author of marcx.
pymarc is a great library. A few operations that I missed in pymarc I implemented as a thin layer on top of it: marcx.
marcx.FatRecord is a small extension to pymarc.Record that adds a few shortcuts. The gist is the twins add and remove, a (subfield) value generator itervalues and a generic test function.
The main benefit is an easier way to iterate over field (or subfield) values. Example:
>>> from marcx import FatRecord; from urllib import urlopen
>>> record = FatRecord(data=urlopen("http://goo.gl/lfJnw9").read())
>>> for val in record.itervalues('100.a', '700.a'):
...     print(val)
Hunt, Andrew,
Thomas, David,
