How to search specific fields of a document in Marqo (Python)

I am looking for a way to search specific fields of a document using Marqo. Whenever I use the .search() method, _highlights returns a seemingly random field: sometimes Title, sometimes Description, sometimes another field.
This is an example of what I mean:
{
'hits': [
{
'Title': 'document_title',
'Description': 'document_description',
'_highlights': {
'Description': 'document_description'
},
'_id': 'document_id',
'_score': document_score
},
{
'Title': 'document_title',
'Description': "document_description",
'_highlights': {'Title': 'document_title'},
'_id': 'document_id',
'_score': document_score
}
],
'limit': 10,
'processingTimeMs': 49,
'query': 'search_query'
}
As you can see, the first document's _highlights is Description while the second's is Title. I want a way to make it uniform.
Thanks!

I think the best way to restrict a search to specific fields in Marqo is to pass the searchable_attributes keyword argument to the search method, giving the field names as a list of strings.
For example:
result = mq.index("your_index").search('query', searchable_attributes=['Title', 'Description'])
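If the goal is simply to read the same field from every hit, regardless of which field ended up in _highlights, the value can also be taken from the hit itself. The following is a minimal sketch that works on a result dictionary shaped like the one in the question; the helper name `field_from_hits` and the sample values are hypothetical and not part of Marqo.

```python
def field_from_hits(result, field):
    """Return (document id, field value) pairs for one chosen field.

    Ignores _highlights entirely, so the output is uniform across hits.
    """
    return [(hit["_id"], hit.get(field)) for hit in result.get("hits", [])]


# Sample result shaped like the one in the question (values are made up).
sample_result = {
    "hits": [
        {"Title": "t1", "Description": "d1",
         "_highlights": {"Description": "d1"}, "_id": "a", "_score": 0.9},
        {"Title": "t2", "Description": "d2",
         "_highlights": {"Title": "t2"}, "_id": "b", "_score": 0.8},
    ],
    "limit": 10,
    "query": "search_query",
}

descriptions = field_from_hits(sample_result, "Description")
print(descriptions)
```

Passing a single name in searchable_attributes, for example only 'Description', should likewise keep the highlights confined to that field, since only the listed fields are searched.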

Related

Nesting oneOf inside allOf not working as expected (python jsonschema)

OK, I'm sure there's something wrong with this jsonschema, but I just can't seem to wrap my head around the problem.
I'm not going to post the actual code, but a minimal example that reproduces the issue.
Here's what I had before, which worked fine:
person = {
'type': 'object',
'properties': {
'name': {
'type': 'string'
},
'auth_token': {
'type': 'string',
},
'username': {
'type': 'string'
},
'password': {
'type': 'string'
}
},
'oneOf': [
{
'required': ['auth_token']
},
{
'required': ['username', 'password']
}
],
'required': ['name']
}
The idea here was that you always need to provide the name of the person, and then either an auth token or a username and password pair. As I said above, this validation worked fine: we have parametrized tests that send all possible combinations of invalid JSON and evaluate the resulting error message, and those tests pass.
But then a new requirement came in and I needed to add a second mutually exclusive required pair of fields, which I did in this way:
person = {
'type': 'object',
'properties': {
'name': {
'type': 'string'
},
'auth_token': {
'type': 'string',
},
'username': {
'type': 'string'
},
'password': {
'type': 'string'
},
'project_id': {
'type': 'number'
},
'contract_date_from': {
'type': 'string'
},
'contract_date_to': {
'type': 'string'
}
},
'allOf': [
{
'oneOf': [
{
'required': ['auth_token']
},
{
'required': ['username', 'password']
}
]
},
{
'oneOf': [
{
'required': ['project_id']
},
{
'required': ['contract_date_from', 'contract_date_to']
}
]
}
],
'required': ['name']
}
But now the second validation always fails, whether the json provided is valid or invalid. The error message I get is:
{'name': 'John Doe', 'auth_token': '9d9a324b-26de-4ac3-85eb-05566e4a7204', 'username': None, 'password': None, 'project_id': 2785, 'contract_date_from': None, 'contract_date_to': None} is valid under each of {'required': ['contract_date_from', 'contract_date_to']}, {'required': ['project_id']}
No matter what values I send in those three fields (ie. project id, contract date from and contract date to), it fails with the same error. I've tried leaving all three empty, completing all three, and all permutations in between, but the error stays the same.
I've been reading the documentation for json schema but I can't seem to grasp what's going on with this example. I'm considering trying different approaches for this, but I'd really like to understand why this is not working. Any help is appreciated!
{'name': 'John Doe', 'auth_token': '9d9a324b-26de-4ac3-85eb-05566e4a7204', 'username': None, 'password': None, 'project_id': 2785, 'contract_date_from': None, 'contract_date_to': None} is valid under each of {'required': ['contract_date_from', 'contract_date_to']}, {'required': ['project_id']}
Read the error message more carefully: you requested that project_id be provided, OR that contract_date_from and contract_date_to be provided, but you are providing all three of these. Providing a null value in a property is still providing a property. The error message is confusing, but you'd be failing validation anyway because null is not a string. Your evaluator is simply running the allOf -> oneOf branches first, so that's the error that comes back first. You should still get the type violation errors as well, though (if you don't, that's a bug: evaluators are required to provide ALL errors, not just the first).
You can make the errors better at the expense of brevity by adding the "type" checks to live next to the "required" keywords. That will ensure the oneOf keywords produce failures rather than successes and maybe make the error messages more obvious.
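The presence-versus-null point can be illustrated without a validator. The sketch below only imitates how `required` counts keys (by presence, not by value); `drop_nulls` and `matching_branches` are hypothetical helper names, and removing None-valued keys before validation is one possible workaround, not something the answer above prescribes.

```python
def drop_nulls(payload):
    """Return a copy of the payload without keys whose value is None."""
    return {key: value for key, value in payload.items() if value is not None}


def matching_branches(payload, branches):
    """Count branches whose required keys are all present.

    Mirrors the `required` keyword: only key presence matters, so a key
    mapped to None still counts as provided.
    """
    return sum(all(key in payload for key in required) for required in branches)


payload = {
    "name": "John Doe",
    "auth_token": "9d9a324b-26de-4ac3-85eb-05566e4a7204",
    "username": None,
    "password": None,
    "project_id": 2785,
    "contract_date_from": None,
    "contract_date_to": None,
}
branches = [["project_id"], ["contract_date_from", "contract_date_to"]]

before = matching_branches(payload, branches)             # both branches match
after = matching_branches(drop_nulls(payload), branches)  # exactly one matches
print(before, after)
```

A oneOf passes only when exactly one branch matches, which is why the payload with explicit nulls is rejected while the cleaned payload would satisfy that part of the schema.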

Building Highcharts options in views.py

AVOID EVAL
My question has been answered and I ended up using eval, but after some searching on what eval does and can do I ended up not using it and instead used an alternative found here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/eval#Do_not_ever_use_eval!
In my application I'm building the whole chart options in the backend and returning them as a JSON response:
def get_chart_data(request):
    chart = {
        'title': {
            'text': ''
        },
        'xAxis': {
            'categories': [],
            'title': {
                'text': ''
            },
            'type': 'category',
            'crosshair': True
        },
        'yAxis': [{
            'allowDecimals': False,
            'min': 0,
            'title': {
                'text': ''
            }
        }, {
            'allowDecimals': False,
            'min': 0,
            'title': {
                'text': ''
            },
            'opposite': True
        }],
        'series': [{
            'type': 'column',
            'yAxis': 1,
            'name': '',
            'data': []
        }, {
            'type': 'line',
            'name': '',
            'data': []
        }, {
            'type': 'line',
            'name': '',
            'data': []
        }]
    }
    return JsonResponse(chart)
I then fetch the options using Ajax and pass the response to Highcharts:
Highcharts.chart('dashboard1', data);
I'm OK with this so far, but I've run into problems when I want to use Highcharts functions as part of the options, for example setting the color of text using Highcharts.getOptions().colors[0]:
'title': {
'text': 'Rainfall',
'style': {
'color': Highcharts.getOptions().colors[0]
}
},
If I don't put quotes around this when building the options in views.py, it is treated as Python code and results in an error. If I do add quotes, it is treated as a string in JavaScript, which does not work.
Is this possible? Or should I build the options in JavaScript and fetch only the data part from the backend rather than the whole thing?
You could return the JS code in Django as a string, and then you can run eval() on it, but executing code like that opens the possibility of an XSS attack, especially if the information is user-submittable.
Your best bet otherwise would be to create the styling on the JS end if possible, and manipulate the incoming data.
document.querySelector('a').addEventListener('click', function (e) {
e.preventDefault();
var complexJson = {"parent": {"child": "alert('Here is a nested alert!')"}}
var alertString = "alert('Here is a simple alert!')";
eval(complexJson["parent"]["child"])
eval(alertString)
})
Click me!
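One way to keep the options in the backend without sending JavaScript expressions is to send a plain color string instead of a call to Highcharts.getOptions(). The sketch below assumes a server-side palette; `PALETTE` and `build_title` are hypothetical names, and the hex values are placeholders rather than the colors Highcharts itself would return.

```python
import json

# Hypothetical palette kept on the server. These hex values are placeholders,
# not necessarily what Highcharts.getOptions().colors contains.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c"]


def build_title(text, color_index=0):
    """Build a title option whose color is a plain string, so it survives JSON."""
    return {"text": text, "style": {"color": PALETTE[color_index]}}


chart = {
    "title": build_title("Rainfall"),
    "series": [{"type": "line", "name": "", "data": []}],
}

# The dictionary serializes cleanly because it contains no JavaScript code.
payload = json.dumps(chart)
print(payload)
```

Because the color arrives as an ordinary string, the response can be passed to Highcharts.chart() as-is and nothing has to be evaluated on the client.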

What is the most efficient way to iterate through an SQL relationship when formatting a JSON response?

I noticed a bottleneck in an API I worked on recently. It relates to how the JSON response is formatted, specifically having to iterate through each relationship to build the response. The database query itself takes ~1s, while the formatting takes in excess of 20s for a sample size of 500 entries.
query
data = Item.query.limit(500).all()
formatting
data = [{
'attributes': {
'id': item.id,
[...]
'approved': item.approved,
},
'sizes': [{
'size': size.size,
'size_fr': size.size_fr,
'quantity': size.quantity,
'available': size.available,
'out_of_order': size.out_of_order
} for size in item.sizes],
'tags': [{
'tag': tag.tag,
'tag_fr': tag.tag_fr,
'id': tag.id
} for tag in item.tags],
'images': [{
'url': image.url,
'primary': image.primary,
'background_removed': image.background_removed,
'id': image.id
} for image in item.images],
} for item in data]
The issue is obviously due to iterating over the relationships on each iteration of the data collection, but I'm not sure how else to process the data.
I'm assuming there has to be a more efficient way to process and format the data.
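The pattern described here, one extra query per item for each relationship, is usually avoided by loading the related rows in bulk and grouping them in memory. The sketch below shows that idea with the standard-library sqlite3 module rather than the ORM used in the question; the table layout, column names and `fetch_items_with_sizes` are hypothetical. SQLAlchemy offers eager-loading options such as selectinload and joinedload that apply the same principle to relationships.

```python
import sqlite3
from collections import defaultdict


def fetch_items_with_sizes(conn, limit=500):
    """Load items and their sizes with two queries instead of one per item."""
    items = conn.execute(
        "SELECT id, approved FROM item ORDER BY id LIMIT ?", (limit,)
    ).fetchall()
    item_ids = [row[0] for row in items]
    if not item_ids:
        return []

    # One query fetches the sizes of every selected item.
    placeholders = ",".join("?" * len(item_ids))
    sizes_by_item = defaultdict(list)
    query = (
        "SELECT item_id, size, quantity FROM size "
        f"WHERE item_id IN ({placeholders}) ORDER BY id"
    )
    for item_id, size, quantity in conn.execute(query, item_ids):
        sizes_by_item[item_id].append({"size": size, "quantity": quantity})

    # Formatting now only touches in-memory data.
    return [
        {
            "attributes": {"id": item_id, "approved": bool(approved)},
            "sizes": sizes_by_item[item_id],
        }
        for item_id, approved in items
    ]


# Tiny in-memory database with made-up rows, just to exercise the function.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY, approved INTEGER);
CREATE TABLE size (id INTEGER PRIMARY KEY, item_id INTEGER, size TEXT, quantity INTEGER);
INSERT INTO item VALUES (1, 1), (2, 0);
INSERT INTO size VALUES (1, 1, 'M', 3), (2, 1, 'L', 0), (3, 2, 'S', 5);
""")
result = fetch_items_with_sizes(conn)
print(result)
```

The number of queries stays fixed (one per table) no matter how many items are returned, which is the property that matters for the 500-entry case.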

$addToSet, along with updating other fields?

I am having trouble updating a document in MongoDB, using PyMongo, when the update involves both adding to a list and updating some fields.
To summarize, I would like to:
Add a value to a list.
Update some fields.
Using a single update statement.
I have tried two methods, but neither works:
key = {'username':'user1'}
user_detail = {
'name':{'first':'Marie', 'last':'Bender'},
'items':{'$addToSet':{'cars':'BMW'}}
}
user_detail2 = {
'name':{'first':'Marie', 'last':'Bender'},
'$addToSet':{'items.cars':'BMW'}
}
mongo_collection.update(key, user_detail, upsert=True)
mongo_collection.update(key, user_detail2, upsert=True)
error message: dollar ($) prefixed field '$addToSet' in '$addToSet' is not valid for storage.
My intended outcome:
Before:
{
'username':'user1',
'item': {'cars':['Merc','Ferrari'],'house':1}
}
Intended After:
{
'username':'user1',
'name': {'first':'Marie', 'last':'Bender'},
'item': {'cars':['Merc','Ferrari','BMW'],'house':1}
}
Your second attempt is closer, but you need to use the $set operator to set the value of name:
user_detail2 = {
'$set': {'name': {'first': 'Marie', 'last': 'Bender'}},
'$addToSet': {'items.cars': 'BMW'}
}
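Put together, the update might look like the sketch below. It uses the dotted path 'item.cars' because the sample document in the question names the field `item`, whereas the attempts wrote `items`; that choice is an assumption about the intended schema. `build_user_update` and `upsert_user` are hypothetical helper names, and `collection` is expected to be a PyMongo collection, whose update_one method accepts a filter, an update document and upsert.

```python
def build_user_update(first, last, car):
    """Build an update document that sets the name and adds a car to the set."""
    return {
        "$set": {"name": {"first": first, "last": last}},
        # Assumes the list lives under `item.cars`, as in the sample document.
        "$addToSet": {"item.cars": car},
    }


def upsert_user(collection, username, update):
    """Apply the update to one user, creating the document if it is missing."""
    return collection.update_one({"username": username}, update, upsert=True)


update = build_user_update("Marie", "Bender", "BMW")
print(update)

# With a real collection this would be called as:
# upsert_user(mongo_collection, "user1", update)
```

Every top-level key in the update document is an operator, which avoids mixing plain fields with `$`-prefixed keys as the second attempt did.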

google adwords python api - how to get ad group bid

I am using the adwords python api. I need to get the bid amount and type. E.g. bid=4 ad type = cpc.
I am given the adgroup id.
Below is an example of how to create an ad group. Once it is created, how do I retrieve the settings? For example, how do I get the bid I set?
ad_group_service = client.GetService('AdGroupService', version='v201402')
operations = [{
'operator': 'ADD',
'operand': {
'campaignId': campaign_id,
'name': 'Earth to Mars Cruises #%s' % uuid.uuid4(),
'status': 'ENABLED',
'biddingStrategyConfiguration': {
'bids': [
{
'xsi_type': 'CpcBid',
'bid': {
'microAmount': '1000000'
},
}
]
}
}
}]
ad_groups = ad_group_service.mutate(operations)
Have a look at the corresponding example on googleads's GitHub page.
Basically you'll use the AdGroupService's get method with a selector containing the right fields and predicates to retrieve an AdGroupPage containing the AdGroup objects you're interested in:
selector = {
'fields': ['Id', 'Name', 'CpcBid'],
'predicates': [
{
'field': 'Id',
'operator': 'EQUALS',
'values': [given_adgroup_id]
}
]
}
page = ad_group_service.get(selector)
adgroup = page.entries[0]
print('Adgroup "%s" (%s) has CPC %s' % (adgroup.name, adgroup.id,
adgroup.biddingStrategyConfiguration.bids.bid))
The available fields' names and the attributes they populate in the returned objects can be found at the selector reference page.
The AdGroupService's reference page might also be of interest.
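Bid amounts are expressed in micros, so the microAmount of '1000000' in the question corresponds to 1 unit of the account currency. The sketch below converts a bidding configuration shaped like the dictionary sent in the question into (bid type, amount) pairs; `summarize_bids` is a hypothetical helper, and the objects returned by the live API may expose these values under different attribute names.

```python
MICROS_PER_UNIT = 1_000_000


def summarize_bids(bidding_config):
    """Return (bid type, amount in currency units) for each bid in the config.

    Expects a plain dictionary shaped like the `biddingStrategyConfiguration`
    operand in the question; real API responses are objects, not dicts.
    """
    summary = []
    for bid in bidding_config.get("bids", []):
        micro_amount = int(bid["bid"]["microAmount"])
        summary.append((bid["xsi_type"], micro_amount / MICROS_PER_UNIT))
    return summary


config = {
    "bids": [
        {"xsi_type": "CpcBid", "bid": {"microAmount": "1000000"}},
    ]
}
bids = summarize_bids(config)
print(bids)
```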
