Use an Elasticsearch search URI in the Python PyEs client

I have the following working query, using CURL:
curl -X GET 'http://myhost/myindex/_search?q=text:%7B1933%20TO%201949%7D'
which identifies all documents in the index myindex whose text field (string type) contains mentions of dates between 1933 and 1949. I would like to run this query programmatically from Python, and to that end I have the Python Elasticsearch client, PyEs, installed:
from elasticsearch import ElasticSearch
from pyes import *
and then I would like to call
es = ElasticSearch('myhost')
totalDocs = es.search('myindex', body={'query':{'query_string':{'query': 'text:%7B1933%20TO%201949%7D'}},"size":0})['hits']['total']
but this syntax is not working. It is also important that I look only in the text field for these date mentions. Is there any way to use the initial query in Python? Many thanks!
Later edit: I have also tried it like this:
totalDocs = es.search('myindex', 'text:%7B1933%20TO%201949%7D')['hits']['total']
It works, but it returns no documents at all.
Actually my question boils down to this question: ElasticSearch - specify range for a string field
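For what it's worth, %7B, %7D and %20 are just URL-encodings of "{", "}" and a space; they belong only in the URL form of the query, not inside a JSON body. A minimal sketch of the same request with the official elasticsearch-py client (untested against your index; the older search(body=...) API is assumed):
from elasticsearch import Elasticsearch

es = Elasticsearch('http://myhost')
body = {
    "query": {
        "query_string": {
            # literal braces: an exclusive range query on the text field
            "query": "text:{1933 TO 1949}"
        }
    },
    "size": 0,
}
totalDocs = es.search(index='myindex', body=body)['hits']['total']
print(totalDocs)
Depending on the server version, ['hits']['total'] is either a plain number or an object with a value key.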

How to get data which have a specific child key using Pyrebase

I'm using Pyrebase to access my Firebase database. My database is currently structured like so:
- users
    - 12345
        name: "Kevin"
        company: "Nike"
Where 12345 is the user's id, and the company is the company that the user belongs to. I'm currently trying to get all the users that belong to Nike. According to the Pyrebase docs, doing something like this should work:
db.child("users").order_by_child("company").equal_to("Nike").get().val()
but I'm getting the error "error" : "orderBy must be a valid JSON encoded path". Does anyone know why this might be the case?
There is something wrong with the Pyrebase library. Here's a link to the problem.
The solution is to add these lines of code to your app.
import pyrebase

# Temporarily replace Pyrebase's quote function with a no-op
def noquote(s):
    return s

pyrebase.pyrebase.quote = noquote
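A sketch of where the patch might sit in an app, assuming a standard Pyrebase setup (the config values below are placeholders, not from the original post):
import pyrebase

def noquote(s):
    # the patch from above: return the value unmodified
    return s

pyrebase.pyrebase.quote = noquote  # apply before building any queries

config = {
    "apiKey": "...",                                  # placeholder
    "authDomain": "example.firebaseapp.com",          # placeholder
    "databaseURL": "https://example.firebaseio.com",  # placeholder
    "storageBucket": "example.appspot.com",           # placeholder
}
firebase = pyrebase.initialize_app(config)
db = firebase.database()
nike_users = db.child("users").order_by_child("company").equal_to("Nike").get().val()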
I managed to fix this problem. I'm also using the REST API to connect to my Firebase Realtime Database, so I'll demonstrate where the error lies with examples:
When I don't wrap the orderBy value (child, key, etc.) and the other query parameters in quotes, Retrofit (which I'm using) gives me an error/bad request.
Here's the error/bad request url:
https://yourfirebaseprojecturl.com/Users.json?orderBy=username&startAt=lifeofkevin
See, both the orderBy value and the startAt value, in this case username and lifeofkevin, are not wrapped in quotes, as in "username" and "lifeofkevin", so it will return orderBy must be a valid JSON encoded path.
To make it work, I need to wrap my orderBy and other query parameter values in quotes, so that Firebase returns the data you want to work with.
Here's the second example, the correct one:
https://yourfirebaseprojecturl.com/Users.json?orderBy="username"&startAt="gang"
Notice the difference? Both values of orderBy and startAt are wrapped in quotes, so now they return the data you want to work with.
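The same request from Python, as a minimal sketch with the requests library (the project URL is the placeholder from the examples above; note that the parameter values themselves carry the double quotes):
import requests

base = "https://yourfirebaseprojecturl.com/Users.json"
params = {
    "orderBy": '"username"',  # JSON-encoded path, hence the embedded quotes
    "startAt": '"gang"',
}
resp = requests.get(base, params=params)  # requests percent-encodes the quotes safely
print(resp.json())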

How to select all data in PyMongo?

I want to select all data, or select with a condition, from the table random, but I can't find any guide for doing this with MongoDB in Python.
I also can't display the data that was selected.
Here my code:
from pymongo import MongoClient

def mongoSelectStatement(result_queue):
    client = MongoClient('mongodb://localhost:27017')
    db = client.random
    cursor = db.random.find({"gia_tri": "0.5748676522161966"})
    # cursor = db.random.find()
    inserted_documents_count = cursor.count()
    for document in cursor:
        result_queue.put(document)
There is quite comprehensive documentation for MongoDB. For Python (PyMongo), here is the URL: https://api.mongodb.org/python/current/
Note: consider the version you are running, since the latest version has new features and functions.
To verify the PyMongo version you are using, execute the following:
import pymongo
pymongo.version
Now, regarding the select query you asked about: as far as I can tell, the code you presented is fine. Here is the select structure in MongoDB.
First off, it is called find().
In PyMongo, if you want to select specific documents from a table (they are not really rows and tables in MongoDB; the terms are documents and collections, but I'll say rows and tables to make it easy, assuming you are comparing MongoDB to SQL), use the following structure. I will use random as the collection name, and assume the random collection has documents with the attributes age: 10, type: ninja, class: black, level: 1903:
db.random.find({ "age":"10" }) This will return all documents that have age 10 in them.
You can add more conditions simply by separating them with commas:
db.random.find({ "age":"10", "type":"ninja" }) This will select all data with age 10 and type ninja.
If you want to get all the data, just leave the filter empty:
db.random.find({})
The previous examples display everything (age, type, class, level and _id). If you want to display specific attributes, say only the age, you have to add another argument to find, called a projection (1 means show, 0 means do not show):
{'age':1}
Note that this returns age as well as _id; _id is always returned by default. You have to explicitly tell it not to return it:
db.random.find({ "age":"10", "name":"ninja" }, {"age":1, "_id":0} )
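Putting the pieces together, a minimal runnable sketch (assuming a local mongod and, as in the question, a database and collection both named random):
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017')
db = client.random

# All documents in the collection:
for doc in db.random.find({}):
    print(doc)

# Filter plus projection: only "age" is returned, "_id" is suppressed.
for doc in db.random.find({"age": "10", "type": "ninja"}, {"age": 1, "_id": 0}):
    print(doc)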
I hope that gets you started.
Take a look at the documentation; it is very thorough.

Converting JSON into Python Dict with Postgresql data imported with SQLAlchemy

I've got a little bit of a tricky question here regarding converting JSON strings into Python data dictionaries for analysis in Pandas. I've read a bunch of other questions on this but none seem to work for my case.
Previously, I was simply using CSVs (and Pandas' read_csv function) to perform my analysis, but now I've moved to pulling data directly from PostgreSQL.
I have no problem using SQLAlchemy to connect to my engine and run my queries. My whole script runs the same as it did when I was pulling the data from CSVs. That is, until it gets to the part where I'm trying to convert one of the columns (namely, the 'config' column in the sample text below) from JSON into a Python dictionary. The ultimate goal of converting it into a dict is to be able to count the number of responses under the "options" field within the "config" column.
df = pd.read_sql_query('SELECT questions.id, config from questions ', engine)
df = df['config'].apply(json.loads)
df = pd.DataFrame(df.tolist())
df['num_options'] = np.array([len(row) for row in df.options])
When I run this, I get the error "TypeError: expected string or buffer". I tried converting the data in the 'config' column to string from object, but that didn't do the trick (I get another error, something like "ValueError: Expecting property name...").
If it helps, here's a snippet of data from one cell in the 'config' column (the code should return the result '6' for this snippet, since there are 6 options):
{"graph_by":"series","options":["Strongbow Case Card/Price Card","Strongbow Case Stacker","Strongbow Pole Topper","Strongbow Base wrap","Other Strongbow POS","None"]}
My guess is that SQLAlchemy does something weird to JSON strings when it pulls them from the database? Something that doesn't happen when I'm just pulling CSVs from the database?
In recent Psycopg versions the PostgreSQL json(b) adaptation to Python is transparent, and Psycopg is the default SQLAlchemy driver for PostgreSQL. The values in the config column therefore already arrive as Python dicts, and calling json.loads on a dict is exactly what raises the TypeError. You can count the options directly:
df['num_options'] = df['config'].apply(lambda cfg: len(cfg['options']))
From the Psycopg manual:
Psycopg can adapt Python objects to and from the PostgreSQL json and jsonb types. With PostgreSQL 9.2 and following versions adaptation is available out-of-the-box. To use JSON data with previous database versions (either with the 9.1 json extension, but even if you want to convert text fields to JSON) you can use the register_json() function.
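For the older-version route mentioned in that paragraph, a sketch (assuming a raw psycopg2 connection conn; see the psycopg2.extras docs for the exact variant your server needs):
from psycopg2.extras import register_json

register_json(conn)  # adapt json columns to Python dicts on this connection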
Just an SQLAlchemy query:
q = session.query(
    Question.id,
    func.jsonb_array_length(Question.config["options"]).label("len")
)
Pure SQL and pandas' read_sql_query:
sql = """\
SELECT questions.id,
jsonb_array_length(questions.config -> 'options') as len
FROM questions
"""
df = pd.read_sql_query(sql, engine)
Combine both (my favourite):
# take `q` from the above
df = pd.read_sql(q.statement, q.session.bind)

Python Database update error

Usually I use the Django ORM for making database-related queries in Python, but now I am using plain Python.
I am trying to update a row of my MySQL database:
query ='UPDATE callerdetail SET upload="{0}" WHERE agent="{1}" AND custid="{2}"AND screenname="{3}" AND status="1"'.format(get.uploaded,get.agent,get.custid,get.screenname)
But I am getting the error:
query ='UPDATE callerdetail SET upload="{0}" WHERE agent="{1}" AND custid="{2}"AND screenname="{3}" AND status="1"'.format(get.uploaded,get.agent,get.custid,get.screenname)
AttributeError: 'C' object has no attribute 'uploaded'
Please help me: what is wrong with my query?
get is probably bound to a C object. Try renaming your get object to something else.
Here is a list of reserved words. I don't see get in there, but it sounds like it could be part of a C library that's being included. If you're importing something with from x import *, you could be importing it without knowing.
In short: get probably isn't what you think it is.
However, before you go much further building SQL queries with string formatting, I strongly advise you not to! Search for "SQL injection" and you'll see why. Python DB API compliant libraries utilise "placeholders" which the library can use to insert the variables into a query for you providing any necessary escaping/quoting.
So instead of:
query ='UPDATE callerdetail SET upload="{0}" WHERE agent="{1}" AND custid="{2}"AND screenname="{3}" AND status="1"'.format(get.uploaded,get.agent,get.custid,get.screenname)
An example using SQLite3 (using ? as a placeholder - others use %s or :1 or %(name)s - or any/all of the above - but that'll be detailed in the docs of your library):
query = "update callerdetail set upload=? where agent=? and custid=? and screename=? and status=?"
Then when it comes to execute the query, you provide the values to be substituted as a separate argument:
cursor.execute(query, (get.uploaded, get.agent, get.custid, get.screenname))
If you really wanted, you could have a convenience function, and reduce this to:
from operator import attrgetter
get_fields = attrgetter('uploaded', 'agent', 'custid', 'screenname')
cursor.execute(query, get_fields(get))
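Since the question is about MySQL rather than SQLite, here is the same idea as a sketch with PyMySQL, where the placeholder style is %s (the connection parameters are placeholders, and get is the object from the question):
import pymysql

conn = pymysql.connect(host="localhost", user="user",
                       password="secret", database="mydb")
query = ("UPDATE callerdetail SET upload=%s "
         "WHERE agent=%s AND custid=%s AND screenname=%s AND status='1'")
with conn.cursor() as cursor:
    cursor.execute(query, (get.uploaded, get.agent, get.custid, get.screenname))
conn.commit()  # the UPDATE takes effect only after commit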

python SPARQL query RESULTS BINDINGS: IF statement on BINDINGs value?

Using Python with SPARQLWrapper, JSON, urllib2 & cgi. I had trouble passing a working SPARQL query with some NULL values to Python, so I populated the blanks with a literal and will try to filter at the output. I have this example results section:
for result in results["results"]["bindings"]:
    project = result["project"]["value"].encode('utf-8')
    filename = result["filename"]["value"].encode('utf-8')
    url = result["url"]["value"].encode('utf-8')
...and I print the %s values. Is there a way to filter a value, i.e., IF VALUE NE "string" THEN PRINT? Or is there another workaround? I'm at the tail end of a small project; I know I need a better wrapper, I just need to get these results filtered before I can move on. Thanks very much in advance...
I'm one of the developers of the SPARQLWrapper library, and this question has already been answered on the mailing list.
Regarding optional values in the original query: the result set simply comes with no values for those variables. The problem is that we would need to parse the query to populate such missing entries, and we want to avoid such parsing; therefore you need to check for the key yourself to avoid runtime KeyError problems.
Usually I use code like:
for result in results["results"]["bindings"]:
    party = result["party"]["value"] if ("party" in result) else None
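Combining that with the filtering asked about above, a small sketch (the placeholder literal "N/A" is an assumption; use whatever literal you filled the blanks with):
PLACEHOLDER = "N/A"  # assumed stand-in literal for the blank values

for result in results["results"]["bindings"]:
    url = result["url"]["value"] if "url" in result else None
    if url is not None and url != PLACEHOLDER:
        print("%s" % url)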
