pymongo db.command exclude fields / projection - python

I am trying to write a call to db.command() using PyMongo that performs a geoNear search, and I would like to exclude fields from the results. Neither the documentation for db.runCommand on the Mongo site nor the PyMongo documentation explains how to accomplish this.
I understand how to do this using db.collection.find():
response = collection.find_one(
    filter={"PostalCode": postal_code},
    projection={"_id": False},
)
However, I cannot find any example anywhere of how to accomplish this when performing a geoNear search utilizing db.command():
params = {
    "near": {
        "type": "Point",
        "coordinates": [longitude, latitude]
    },
    "spherical": True,
    "limit": 1,
}
response = self.db.command("geoNear", value=self._collection_name, **params)
Can anyone provide insight into how one excludes fields when using db.command?

The geoNear command does not have a "projection" feature. It always returns entire documents. See the geoNear command reference for its options:
https://docs.mongodb.com/manual/reference/command/geoNear/
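One alternative that the answer above does not mention is the $geoNear aggregation stage, which can be followed by a $project stage to drop fields. The sketch below only builds the pipeline; the helper name build_geo_near_pipeline and the "distance" output field name are assumptions made for illustration, and the collection is assumed to have a geospatial index.

```python
def build_geo_near_pipeline(longitude, latitude, excluded_fields=("_id",), limit=1):
    """Build a pipeline: nearest documents first, then drop the unwanted fields.

    Hypothetical helper, not part of PyMongo. excluded_fields must not be empty,
    because an empty $project specification is rejected by the server.
    """
    return [
        {
            # $geoNear has to be the first stage of the pipeline.
            "$geoNear": {
                "near": {"type": "Point", "coordinates": [longitude, latitude]},
                "distanceField": "distance",  # assumed name; the stage requires one
                "spherical": True,
            }
        },
        {"$limit": limit},
        # A value of 0 excludes the field, like projection={"_id": False} in find().
        {"$project": {field: 0 for field in excluded_fields}},
    ]


pipeline = build_geo_near_pipeline(-73.99, 40.73)
# With a live connection the pipeline would be run as:
#     results = list(collection.aggregate(pipeline))
```

The computed distance is written into each result under the chosen distanceField name, so it can be excluded in the same $project stage if it is not wanted.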

Related

Find all unique values for field in Elasticsearch through python

I've been scouring the web for some good python documentation for Elasticsearch. I've got a query term that I know returns the information I need, but I'm struggling to convert the raw string into something Python can interpret.
This will return a list of all unique 'VALUE's in the dataset.
{"find": "terms", "field": "hierarchy1.hierarchy2.VALUE"}
Which I have taken from a dashboarding tool which accesses this data.
But I don't seem to be able to convert this into correct python.
I've tried this:
body_test = {"find": "terms", "field": "hierarchy1.hierarchy2.VALUE"}
es = Elasticsearch(SETUP CONNECTION)
es.search(
    index="INDEX_NAME",
    body=body_test
)
but it doesn't like the find value. I can't find anything in the documentation about find.
RequestError: RequestError(400, 'parsing_exception', 'Unknown key for a VALUE_STRING in [find].')
The only way I've got it to slightly work is with
es_search = (
    Search(
        using=es,
        index=db_index
    ).source(['hierarchy1.hierarchy2.VALUE'])
)
But I think this is pulling the entire dataset and then filtering (which I obviously don't want to be doing each time I run this code). This needs to be done through python and so I cannot simply POST the query I know works.
I am completely new to ES and so this is all a little confusing. Thanks in advance!
So it turns out that the find in this case was specific to Grafana (the dashboarding tool I took the query from).
In the end I used this site and used the code from there. It's a LOT more complicated than I thought it was going to be. But it works very quickly and doesn't put a strain on the database (which my alternative method was doing).
In case the link dies in future years, here's the code I used:
from elasticsearch import Elasticsearch
es = Elasticsearch()
def iterate_distinct_field(es, fieldname, pagesize=250, **kwargs):
    """
    Helper to get all distinct values from ElasticSearch
    (composite buckets come back sorted by key, one page at a time)
    """
    compositeQuery = {
        "size": pagesize,
        "sources": [{
            fieldname: {
                "terms": {
                    "field": fieldname
                }
            }
        }]
    }
    # Iterate over pages
    while True:
        result = es.search(**kwargs, body={
            "aggs": {
                "values": {
                    "composite": compositeQuery
                }
            }
        })
        # Yield each bucket
        for aggregation in result["aggregations"]["values"]["buckets"]:
            yield aggregation
        # Set "after" field so the next request continues from the last bucket
        if "after_key" in result["aggregations"]["values"]:
            compositeQuery["after"] = \
                result["aggregations"]["values"]["after_key"]
        else:  # Finished!
            break
# Usage example
for result in iterate_distinct_field(es, fieldname="pattern.keyword", index="strings"):
    print(result)  # e.g. {'key': {'pattern': 'mypattern'}, 'doc_count': 315}
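When the number of distinct values is known to be small, a plain terms aggregation may be a simpler alternative to paging through a composite aggregation. This is a sketch under that assumption rather than the approach from the linked site; the helper names terms_aggregation_body and bucket_keys and the aggregation name "unique_values" are made up for illustration.

```python
def terms_aggregation_body(fieldname, max_values=1000):
    """Build a search body that asks only for the distinct values of one field."""
    return {
        "size": 0,  # return no hits, only the aggregation result
        "aggs": {
            "unique_values": {
                # "size" caps how many distinct values come back in one response
                "terms": {"field": fieldname, "size": max_values}
            }
        },
    }


def bucket_keys(response, agg_name="unique_values"):
    """Pull the distinct values out of a terms aggregation response."""
    return [bucket["key"] for bucket in response["aggregations"][agg_name]["buckets"]]


body = terms_aggregation_body("hierarchy1.hierarchy2.VALUE")
# With a live connection this would be used as:
#     values = bucket_keys(es.search(index="INDEX_NAME", body=body))
```

If there can be more distinct values than max_values, the result is truncated, which is the case the paged composite aggregation above is meant to handle.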

Elasticsearch dsl what does q('match', path = ) do?

In Python, elasticsearch_dsl.query has a helper function Q that builds a DSL query. However, I do not understand what this query is trying to say in some code I found:
ES_dsl.Q('match', path=path_to_file)
What exactly is Q('match', path = path_to_file) doing?
Where path_to_file is a valid path to a file in the system in the index.
Isn't path only in nested queries? There is no path in 'match' queries? I'm guessing it is to detokenize the path_to_file to find an exact match? An explanation to what is happening would be appreciated.
The approach it takes is that the query type comes first, followed by what you want to query. So that is saying:
run a match query - https://www.elastic.co/guide/en/elasticsearch/reference/7.15/query-dsl-match-query.html
use the path field and search for the value path_to_file
So, matching that back to the docs page above, it would look like this in direct DSL:
GET /_search
{
    "query": {
        "match": {
            "path": {
                "query": "path_to_file"
            }
        }
    }
}
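The mapping from the Q call to the request body can be sketched with plain dictionaries. The function q_as_dict below is a hypothetical stand-in written for this illustration, not part of elasticsearch_dsl, and the file path is a made-up value. Here path is simply the name of a field in the index, unrelated to the path option of nested queries.

```python
def q_as_dict(query_type, **fields):
    """Hypothetical stand-in: the first argument becomes the query type,
    and each keyword argument becomes a field name with the value to search for."""
    return {query_type: fields}


path_to_file = "/data/reports/summary.txt"  # made-up example value
query_body = {"query": q_as_dict("match", path=path_to_file)}

# With elasticsearch_dsl installed, Q("match", path=path_to_file).to_dict()
# should produce the same inner dictionary as q_as_dict does here.
```

This is the short form of the match query, equivalent to the longer form with a nested "query" key shown above. A match query analyzes the search text, so on an analyzed text field it matches on tokens rather than requiring the whole path to be identical.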

Change the font of an entire document without affecting formatting using Google Docs API

I am trying to change the font of an entire Google Doc using the API. The purpose is to let users of our application export documents with their company’s font.
This is what I am currently doing:
from googleapiclient.discovery import build
doc_service = build("docs", "v1")
document = self.doc_service.documents().get(documentId="[Document ID]").execute()
requests = []
for element in document["body"]["content"]:
    if "sectionBreak" in element:
        continue  # Trying to change the font of a section break causes an error
    requests.append(
        {
            "updateTextStyle": {
                "range": {
                    "startIndex": element["startIndex"],
                    "endIndex": element["endIndex"],
                },
                "textStyle": {
                    "weightedFontFamily": {
                        "fontFamily": "[Font name]"
                    },
                },
                "fields": "weightedFontFamily",
            }
        }
    )
doc_service.documents().batchUpdate(
    documentId=self.copy_id, body={"requests": requests}
).execute()
The code above changes the font, but it also removes any bold text formatting because it overrides the entire style of an element. Some options I have looked into:
DocumentStyle
Documents have a DocumentStyle property, but it does not contain any font information.
NamedStyles
Documents also have a NamedStyles property. It contains styles like NORMAL_TEXT and HEADING_1. I could loop through all these and change their textStyle.weightedFontFamily. This would be the ideal solution, because it would keep style information where it belongs. But I have not found a way to change NamedStyles using the API.
Deeper loop
I could continue with my current approach, looping through the elements list on each element, keeping everything but the font from textStyle (which contains things like bold: true). However, our current approach already takes too long to execute, and such an approach would be both slower and more brittle, so I would like to avoid this.
Answer:
Extract the textStyle out of the current element and only change/add the weightedFontFamily/fontFamily object.
Code Example:
for element in document["body"]["content"]:
    if "paragraph" not in element:
        continue  # Section breaks, tables, etc. have no paragraph text style to copy
    # Start from the style of the paragraph's first text run
    textStyle = element["paragraph"]["elements"][0].get("textStyle", {})
    # setdefault avoids a KeyError when the run has no explicit font yet
    textStyle.setdefault("weightedFontFamily", {})["fontFamily"] = "[Font name]"
    requests.append(
        {
            "updateTextStyle": {
                "range": {
                    "startIndex": element["startIndex"],
                    "endIndex": element["endIndex"],
                },
                "textStyle": textStyle,
                "fields": "weightedFontFamily",
            }
        }
    )
doc_service.documents().batchUpdate(
    documentId=self.copy_id, body={"requests": requests}
).execute()
This seems to work for me, even with section breaks in between and at the end of the document. You might want to explore more corner cases.
This basically tries to mimic the Select All option:
document = service.documents().get(documentId=doc_id).execute()
# The largest endIndex in the body marks the end of the document
endIndex = max(element["endIndex"] for element in document["body"]["content"])
service.documents().batchUpdate(
    documentId=doc_id,
    body={
        "requests": [  # batchUpdate documents "requests" as a list of request objects
            {
                "updateTextStyle": {
                    "range": {
                        "endIndex": endIndex,
                        "startIndex": 1,
                    },
                    "fields": "fontSize",
                    "textStyle": {"fontSize": {"magnitude": 100, "unit": "PT"}},
                }
            }
        ]
    },
).execute()
Same should work for other fields too.
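Applied to the font family from the question, the same whole-document range could look like the sketch below. It only builds the request; the helper name whole_document_font_request is made up for illustration, and it assumes, as in the snippet above, that a range from index 1 to the largest endIndex in the body is accepted.

```python
def whole_document_font_request(document, font_family):
    """Build one updateTextStyle request covering the whole document body.

    Only weightedFontFamily is named in "fields", so other text style
    properties such as bold are not part of the update.
    """
    end_index = max(element["endIndex"] for element in document["body"]["content"])
    return {
        "updateTextStyle": {
            "range": {"startIndex": 1, "endIndex": end_index},
            "textStyle": {"weightedFontFamily": {"fontFamily": font_family}},
            "fields": "weightedFontFamily",
        }
    }


# A trimmed-down stand-in for the result of documents().get(...).execute()
sample_document = {
    "body": {
        "content": [
            {"endIndex": 1, "sectionBreak": {}},
            {"startIndex": 1, "endIndex": 42, "paragraph": {"elements": []}},
        ]
    }
}
request = whole_document_font_request(sample_document, "Roboto")
# With a live service this would be sent as:
#     service.documents().batchUpdate(
#         documentId=doc_id, body={"requests": [request]}
#     ).execute()
```

A single request avoids the per-element loop, which may also help with the execution time mentioned in the question.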
However, if you are just going to share a DOCX file with all the clients, you could keep a local copy of the PDF / DOCX and modify that instead. It is fairly easy to work with the styles in DOCX (it is a bunch of XML files).
Use this to explore and update DOCX files: OOXML Tools Chrome Extension
Similarly, PDFs are key-value pairs stored as records. Check this: ReportLab

How do you test your ncurses app in Python?

We've built a CLI app with Python. Some parts need ncurses, so we use npyscreen. We've successfully tested most of the app using pytest (with the help of mock and other things), but we are stuck on how to test the ncurses part of the code.
Take this part of our ncurses code, which prompts the user for answers:
"""
Generate text user interface:
example :
    fields = [
        {"type": "TitleText", "name": "Name", "key": "name"},
        {"type": "TitlePassword", "name": "Password", "key": "password"},
        {"type": "TitleSelectOne", "name": "Role",
         "key": "role", "values": ["admin", "user"]},
    ]
    form = form_generator("Form Foo", fields)
    print(form["role"].value[0])
    print(form["name"].value)
"""
def form_generator(form_title, fields):
    def myFunction(*args):
        form = npyscreen.Form(name=form_title)
        result = {}
        for field in fields:
            t = field["type"]
            k = field["key"]
            del field["type"]
            del field["key"]
            result[k] = form.add(getattr(npyscreen, t), **field)
        form.edit()
        return result
    return npyscreen.wrapper_basic(myFunction)
We have tried many ways, but they all failed:
StringIO to capture the output: failed
redirecting the output to a file: failed
hecate: failed (I think it only works if we run the whole program)
pyautogui: I think it only works if we run the whole program
These are the complete steps of what I have tried.
The last thing I tried was patch: I patch those functions. The downside is that the statements inside those functions remain untested, because the test just asserts the hard-coded return value.
I found the npyscreen docs on writing tests, but I don't completely understand them. There is just one example.
Thank you in advance.
I don't see it mentioned in the python docs, but you can use the screen-dump feature of the curses library to capture information for analysis.
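Separately from screen dumps, one way to bring more of form_generator under pytest without a terminal might be to move the field handling into plain functions and pass the widget library in as a parameter. The sketch below is an assumption about how that could look, not something from the npyscreen docs; split_field, add_fields and FakeForm are hypothetical names. Unlike the original loop, it does not delete keys from the caller's field dictionaries.

```python
from types import SimpleNamespace


def split_field(field):
    """Return (widget type name, result key, remaining widget kwargs)
    without modifying the field dictionary passed in."""
    kwargs = {k: v for k, v in field.items() if k not in ("type", "key")}
    return field["type"], field["key"], kwargs


def add_fields(form, widget_module, fields):
    """Add one widget per field to the form and return them keyed by "key"."""
    result = {}
    for field in fields:
        type_name, key, kwargs = split_field(field)
        result[key] = form.add(getattr(widget_module, type_name), **kwargs)
    return result


class FakeForm:
    """Test double that records add() calls in place of a real npyscreen form."""

    def __init__(self):
        self.added = []

    def add(self, widget_class, **kwargs):
        self.added.append((widget_class, kwargs))
        return SimpleNamespace(value=None)


# In a test, a stand-in namespace plays the role of the npyscreen module.
fake_widgets = SimpleNamespace(TitleText=object(), TitlePassword=object())
form = FakeForm()
fields = [
    {"type": "TitleText", "name": "Name", "key": "name"},
    {"type": "TitlePassword", "name": "Password", "key": "password"},
]
widgets = add_fields(form, fake_widgets, fields)
```

In the real program, add_fields(form, npyscreen, fields) would be called inside the function passed to npyscreen.wrapper_basic, so only form.edit() and the wrapper itself would still need a terminal.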

How do I generate python class source code from JSON? [duplicate]

Is there a python library for converting a JSON schema to a python class definition, similar to jsonschema2pojo -- https://github.com/joelittlejohn/jsonschema2pojo -- for Java?
So far the closest thing I've been able to find is warlock, which advertises this workflow:
Build your schema
>>> schema = {
    'name': 'Country',
    'properties': {
        'name': {'type': 'string'},
        'abbreviation': {'type': 'string'},
    },
    'additionalProperties': False,
}
Create a model
>>> import warlock
>>> Country = warlock.model_factory(schema)
Create an object using your model
>>> sweden = Country(name='Sweden', abbreviation='SE')
However, it's not quite that easy. The objects that Warlock produces lack much in the way of introspectable goodies. And if it supports nested dicts at initialization, I was unable to figure out how to make them work.
To give a little background, the problem that I was working on was how to take Chrome's JSONSchema API and produce a tree of request generators and response handlers. Warlock doesn't seem too far off the mark; the only downside is that metaclasses in Python can't really be turned into 'code'.
Other useful modules to look for:
jsonschema - (which Warlock is built on top of)
valideer - similar to jsonschema but with a worse name.
bunch - An interesting structure builder that's halfway between a dotdict and construct
If you end up finding a good one-stop solution for this, please follow up on your question - I'd love to find one. I pored through GitHub, PyPI, Google Code, SourceForge, etc., and just couldn't find anything really sexy.
For lack of any pre-made solutions, I'll probably cobble together something with Warlock myself. So if I beat you to it, I'll update my answer. :p
python-jsonschema-objects is an alternative to warlock, built on top of jsonschema.
python-jsonschema-objects provides an automatic class-based binding to JSON schemas for use in Python.
Usage:
Sample JSON schema:
schema = '''{
    "title": "Example Schema",
    "type": "object",
    "properties": {
        "firstName": {
            "type": "string"
        },
        "lastName": {
            "type": "string"
        },
        "age": {
            "description": "Age in years",
            "type": "integer",
            "minimum": 0
        },
        "dogs": {
            "type": "array",
            "items": {"type": "string"},
            "maxItems": 4
        },
        "gender": {
            "type": "string",
            "enum": ["male", "female"]
        },
        "deceased": {
            "enum": ["yes", "no", 1, 0, "true", "false"]
        }
    },
    "required": ["firstName", "lastName"]
} '''
Converting the schema object to a class:
import python_jsonschema_objects as pjs
import json
schema = json.loads(schema)
builder = pjs.ObjectBuilder(schema)
ns = builder.build_classes()
Person = ns.ExampleSchema
james = Person(firstName="James", lastName="Bond")
>>> james.lastName
u'Bond'
>>> james
<example_schema lastName=Bond age=None firstName=James>
Validation:
james.age = -2
python_jsonschema_objects.validators.ValidationError: -2 was less or equal to than 0
The problem is that it still uses Draft 4 validation, while jsonschema has moved beyond Draft 4; I filed an issue on the repo regarding this.
Unless you are using an old version of jsonschema, the above package will work as shown.
I just created this small project to generate code classes from a JSON schema. Even though it deals with Python, I think it can be useful when working on business projects:
pip install jsonschema2popo
Running the following command will generate a Python module containing the classes defined by the JSON schema (it uses Jinja2 templating):
jsonschema2popo -o /path/to/output_file.py /path/to/json_schema.json
More info at: https://github.com/frx08/jsonschema2popo
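To make the idea of generating class source from a schema concrete, here is a minimal standard-library sketch. It is not how jsonschema2popo or any of the packages above work internally; schema_to_class_source is a hypothetical helper that handles only flat objects with primitive property types and performs no validation.

```python
import keyword

# Illustrative subset of JSON Schema types mapped to Python annotations.
JSON_TO_PY = {
    "string": "str",
    "integer": "int",
    "number": "float",
    "boolean": "bool",
    "array": "list",
    "object": "dict",
}


def schema_to_class_source(schema):
    """Return dataclass source code for a flat JSON Schema object."""
    class_name = "".join(part.capitalize() for part in schema.get("title", "Model").split())
    required = set(schema.get("required", []))
    properties = schema.get("properties", {})
    lines = [
        "from dataclasses import dataclass",
        "from typing import Optional",
        "",
        "",
        "@dataclass",
        f"class {class_name}:",
    ]
    # Required fields go first: dataclass fields without defaults
    # cannot follow fields that have defaults.
    for name in sorted(properties, key=lambda n: n not in required):
        if not name.isidentifier() or keyword.iskeyword(name):
            raise ValueError(f"property {name!r} is not a valid Python attribute name")
        py_type = JSON_TO_PY.get(properties[name].get("type"), "object")
        if name in required:
            lines.append(f"    {name}: {py_type}")
        else:
            lines.append(f"    {name}: Optional[{py_type}] = None")
    if not properties:
        lines.append("    pass")
    return "\n".join(lines) + "\n"


example_schema = {
    "title": "Example Schema",
    "type": "object",
    "properties": {
        "firstName": {"type": "string"},
        "lastName": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["firstName", "lastName"],
}
source = schema_to_class_source(example_schema)
print(source)  # the text could be written to a .py file instead
```

Because the result is ordinary source text rather than a class built at runtime, it can be committed, reviewed, and read by IDEs, which is the main difference from the metaclass-based approach of Warlock.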
