Error in query while inserting data using RDFlib to GraphDB - python

I parse a database into an RDFlib graph. I now want to INSERT the triples from this graph into the GraphDB triple store. The code works fine when I execute it on an older version of GraphDB-Lite hosted on Sesame. However, I get an error when executing the same query on the now standalone GraphDB 7.0.0. The graph is partially uploaded before the error is raised, and the triples inserted up to that point do show up in the triple store.
This is part of the code:
graphdb_url = 'http://my.ip.address.here:7200/repositories/Test3/statements'
## Insert into Sesame
for s, p, o in graph1:
    pprint.pprint((s, p, o))
    queryStringUpload = 'INSERT DATA {%s %s %s}' % (s, p, o)
    # queryStringUpload = 'DELETE WHERE {?s ?p ?o .}'
    # print queryStringUpload
    sparql = SPARQLWrapper(graphdb_url)
    sparql.method = 'POST'
    sparql.setQuery(queryStringUpload)
    sparql.query()
Following is the error:
SPARQLWrapper.SPARQLExceptions.QueryBadFormed: QueryBadFormed: a bad request has been sent to the endpoint, probably the sparql query is bad formed.
Response:
MALFORMED QUERY: Lexical error at line 1, column 93. Encountered: "/" (47), after : "purl.org"
What is causing the error and how do I resolve it?

It was a syntax error. I had URIs starting with http:/ instead of http:// in some places.
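A follow-up note: interpolating raw RDFlib terms into a query string is fragile, because a URIRef formats as a bare URI without angle brackets and a Literal without quoting. RDFlib terms provide an .n3() method that serializes each term in valid SPARQL/Turtle syntax, which avoids this whole class of malformed-query errors. A minimal sketch along those lines, reusing the question's endpoint and graph:

from SPARQLWrapper import SPARQLWrapper

graphdb_url = 'http://my.ip.address.here:7200/repositories/Test3/statements'
sparql = SPARQLWrapper(graphdb_url)
sparql.method = 'POST'

for s, p, o in graph1:
    # .n3() wraps URIs in <...> and quotes/escapes literals correctly
    queryStringUpload = 'INSERT DATA {%s %s %s .}' % (s.n3(), p.n3(), o.n3())
    sparql.setQuery(queryStringUpload)
    sparql.query()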

Related

Error ('Invalid parameter type. param-type=collections.OrderedDict', 'HY105') when using the simple_salesforce package in Python to ingest Account data

Hello and thank you for taking the time to read this. For days I have been trying to figure out why I get this error when I try to load Account data into an MSSQL database. The connection is fine.
But I keep on getting these errors:
(pyodbc.ProgrammingError) ('Invalid parameter type. param-index=17 param-type=collections.OrderedDict', 'HY105')
Exception: (102, b"Incorrect syntax near 'Id'.DB-Lib error message 20018, severity 15:\nGeneral SQL Server error: Check messages from the SQL Server\n")
Exception: One or more values in the dataframe have more characters than possible in the database table. The maximum number of characters in each column are:
How can I circumvent these errors and load the data cleanly? For instance, I use this:
engine = sal.create_engine('mssql+pyodbc:///?odbc_connect={}'.format(params))
conn = engine.connect()
for entity in ['Account']:
    df = get_salesforce_data(sf=sf, sf_object=entity, method=method)
    df.to_sql(entity, con=engine, if_exists='append', index=False, chunksize=1000)
There are 94 columns in this Account table. Thank you for thinking along with me.
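A hedged observation on the first error: simple_salesforce query results carry a nested attributes OrderedDict in each record, and pyodbc cannot bind a dict as a SQL parameter, which produces exactly this HY105 error. A minimal sketch that serializes any dict-valued cells to JSON strings before to_sql (get_salesforce_data, engine and entity are the question's own names; JSON-encoding rather than dropping the column is an assumption about what you want to keep):

import json

def flatten_dict_columns(df):
    # pyodbc can only bind scalars, so convert dict-valued cells
    # (e.g. the Salesforce 'attributes' metadata) into JSON strings.
    for col in df.columns:
        if df[col].map(lambda v: isinstance(v, dict)).any():
            df[col] = df[col].map(
                lambda v: json.dumps(v) if isinstance(v, dict) else v)
    return df

df = flatten_dict_columns(get_salesforce_data(sf=sf, sf_object=entity, method=method))
df.to_sql(entity, con=engine, if_exists='append', index=False, chunksize=1000)

The third error (values longer than the table columns allow) is separate: either widen the VARCHAR columns or truncate the offending strings before loading.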

Sparql query on Arabic DBpedia Ontology

I'm trying to write a query to get the predicate (the relation) between two entities that already exist in Arabic DBpedia. I'm doing this in Python using the SPARQL Endpoint interface to Python (SPARQLWrapper), so I set the dataset name and the query like this:
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://ar.dbpedia.org/sparql")
sparql.setReturnFormat(JSON)
property = []
query = "SELECT ?property WHERE {{ <{}> ?property <{}> }}".format(
    'http://ar.dbpedia.org/resource/فرنسا', 'http://ar.dbpedia.org/resource/باريس')
sparql.setQuery(query)
string_s = sparql.query().convert()
if len(string_s['results']['bindings']) != 0:
    bindings = string_s['results']['bindings']
    for b in bindings:
        property.append(b['property']['value'])
print(property)
The problem is that when I specify the dataset name as http://ar.dbpedia.org/sparql, it gives me a connection error like this:
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
And when I change it to the default one (http://dbpedia.org/sparql), it finds no relation between the two entities (I guess because that dataset does not contain them). I tested the previous code on English resources by changing the dataset name to the default one and changing the query to this:
query = "SELECT ?property WHERE {{ <{}> ?property <{}> }}".format('http://dbpedia.org/resource/France', 'http://dbpedia.org/resource/Paris')
it worked and gave me the ?property values like this:
['http://dbpedia.org/ontology/wikiPageWikiLink',
'http://dbpedia.org/ontology/capital']
So my question is: how can I make the same request and get the same answer from Arabic DBpedia? How can I query for the property that links two Arabic resources (Arabic entities) that already exist in Arabic DBpedia?
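One hedged suggestion: a WinError 10054 reset from a public SPARQL endpoint is often transient, or caused by the client's default User-Agent being rejected, rather than by the query itself. SPARQLWrapper lets you pass a custom agent string, and a small retry loop costs little. A sketch under those assumptions, with query being the Arabic query from above:

import time
from SPARQLWrapper import SPARQLWrapper, JSON

# The custom agent string is an assumption: some endpoints drop
# connections from default or blank user agents.
sparql = SPARQLWrapper("http://ar.dbpedia.org/sparql",
                       agent="Mozilla/5.0 (compatible; example-script)")
sparql.setReturnFormat(JSON)
sparql.setQuery(query)

results = None
for attempt in range(3):
    try:
        results = sparql.query().convert()
        break
    except ConnectionResetError:
        time.sleep(2 ** attempt)  # simple exponential backoff before retrying

Note also that the Arabic chapter's endpoint is separate infrastructure from dbpedia.org and is sometimes simply down, in which case no client-side change will help.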

How to use the BigQuery API using a Python script calling a UDF

Against a BigQuery table, I'm trying to run a SQL statement that calls a UDF. The statement is executed within a Python script, and the call is made via the BigQuery API.
When I execute a simple SQL statement without a UDF, it works fine. However, I keep getting the same error when I try to use a UDF script (stored either locally or in a GCS bucket).
This is what I get in my local Terminal (I run the script via Python Launcher):
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/googleapiclient/http.py", line 840, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://www.googleapis.com/bigquery/v2/projects/[projectId]/queries?alt=json returned "Required parameter is missing">
And this is my Python script:
credentials = SignedJwtAssertionCredentials(
    SERVICE_ACCOUNT_EMAIL,
    key,
    scope='https://www.googleapis.com/auth/bigquery')
aservice = build('bigquery', 'v2', credentials=credentials)
query_requestb = aservice.jobs()
query_data = {
    'configuration': {
        'query': {
            'userDefinedFunctionResources': [
                {
                    'resourceUri': 'gs://[bucketName]/[fileName].js'
                }
            ],
            'query': sql
        }
    },
    'timeoutMs': 100000
}
query_response = query_requestb.query(
    projectId=PROJECT_NUMBER, body=query_data).execute(num_retries=0)
Any idea which parameter is missing, or how I can get this to run?
Instead of specifying userDefinedFunctionResources, use CREATE TEMP FUNCTION in the body of your 'query' with the library referenced as part of the OPTIONS clause. You will need to use standard SQL for this, and you can also refer to the documentation on user-defined functions. Your query would look something like this:
#standardSQL
CREATE TEMP FUNCTION MyJsFunction(x FLOAT64) RETURNS FLOAT64 LANGUAGE js AS """
return my_js_function(x);
"""
OPTIONS (library='gs://[bucketName]/[fileName].js');
SELECT MyJsFunction(x)
FROM MyTable;
The query I wanted to run categorises traffic and sales by marketing channel, which I usually use a UDF for. This is the query I ran using standard SQL; it is stored in a file which I read into the variable sql:
CREATE TEMPORARY FUNCTION
  mktchannels(source STRING,
    medium STRING,
    campaign STRING)
  RETURNS STRING
  LANGUAGE js AS """
  return channelGrouping(source, medium, campaign); // channelGrouping is the function in my channelgrouping.js file which contains the attribution rules
  """ OPTIONS ( library=["gs://[bucket]/[path]/regex.js",
    "gs://[bucket]/[path]/channelgrouping.js"] );
WITH
  traffic AS ( -- select fields from the BigQuery table
    SELECT
      device.deviceCategory AS device,
      trafficSource.source AS source,
      trafficSource.medium AS medium,
      trafficSource.campaign AS campaign,
      SUM(totals.visits) AS sessions,
      SUM(totals.transactionRevenue)/1e6 AS revenue,
      SUM(totals.transactions) AS transactions
    FROM
      `[datasetId].[table]`
    GROUP BY
      device,
      source,
      medium,
      campaign)
SELECT
  mktchannels(source,
    medium,
    campaign) AS channel, -- call the temp function defined above
  device,
  SUM(sessions) AS sessions,
  SUM(transactions) AS transactions,
  ROUND(SUM(revenue), 2) AS revenue
FROM
  traffic
GROUP BY
  device,
  channel
ORDER BY
  channel,
  device;
And then in the Python script:
with open('myquery.sql', 'r') as fd:
    sql = fd.read()
query_data = {
    'query': sql,
    'maximumBillingTier': 10,
    'useLegacySql': False,
    'timeoutMs': 300000
}
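The body is then sent the same way as in the question; a sketch reusing the question's aservice and PROJECT_NUMBER names:

# Send the standard-SQL request built above (variable names reuse
# those from the question's script).
query_response = aservice.jobs().query(
    projectId=PROJECT_NUMBER, body=query_data).execute(num_retries=0)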
Hope this helps anyone in the future!

Raw sql to json in Django, with Datetime and Decimal MySql columns

I am using Ajax to make some requests from client to server. I am using Django and have used some raw SQL queries before, but all of my fields were int, varchar and a decimal; for the decimal I had an encoding problem, but I overrode the "default" parameter of json.dumps and everything worked.
That was before. Now I have a query which gives me Decimal and DateTime fields, and both of them gave me encoding errors; the overridden "default" doesn't work any more, which is why for this new query I used DjangoJSONEncoder. But now I have another problem, and it's not an encoding one. I am using the dictfetchall(cursor) method recommended in the Django docs to return a dictionary from the SQL query, because cursor.fetchall() gives me this error: 'tuple' object has no attribute '_meta'.
Before, I just sent that dictionary to json.dumps(response_data, default=default) and everything was fine, but now for the encoding I have to use json.dumps(response_data, cls=DjangoJSONEncoder), and if I send the dictionary that way, I get this error:
SyntaxError: JSON.parse: unexpected character at line 1 column 1 of the JSON data
And if I try to use the serializers, like this:
response_data2 = serializers.serialize('json', list(response_data))
And later send response_data2 to dumps, I get this error:
'dict' object has no attribute '_meta'
This is the code for the MySql query:
def consulta_sql_personalizada(nombres, apellidos, puesto):
    from django.db import connection, transaction
    cursor = connection.cursor()
    cursor.execute("""select E.idEmpleado as id, CONCAT(Per.nombres_persona, ' ', Per.apellidos_persona) as nombre,
        P.nombre_puesto as puesto, E.motivo_baja_empleado as motivo_baja,
        E.fecha_contratacion_empleado as fecha_contratacion, E.fecha_baja_empleado as fecha_baja,
        SUM(V.total_venta) as ventas_mes, E.fotografia_empleado as ruta_fotografia
        from Empleado as E
        inner join Puesto as P on E.Puesto_idPuesto = P.idPuesto
        inner join Venta as V on V.vendedor_venta = E.idEmpleado
        inner join Persona as Per on E.Persona_idPersona = Per.idPersona
        where (Per.nombres_persona like %s OR Per.apellidos_persona like %s OR E.Puesto_idPuesto = %s)
        AND E.estado_empleado = 1 AND V.estado_venta = 1
        AND (YEAR(V.fecha_venta) = YEAR(Now())
        AND MONTH(V.fecha_venta) = MONTH(Now()))""", [nombres, apellidos, puesto])
    row = dictfetchall(cursor)
    return row
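For reference, the dictfetchall helper used above, as given in the Django docs, looks like this:

def dictfetchall(cursor):
    # Return all rows from a cursor as a list of dicts keyed by column name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]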
And this is the last part of the view that makes the query and sends the result to Ajax as JSON:
    response_data = consulta_sql_personalizada(rec_nombres, rec_apellidos, rec_puesto)
    return HttpResponse(
        json.dumps(response_data, cls=DjangoJSONEncoder),
        content_type="application/json"
    )
else:
    return HttpResponse(
        json.dumps({"nothing to see": "this isn't happening"}),
        content_type="application/json"
    )
What I want to know is: how can I serialize the raw SQL result to JSON with that encoder?
Sorry, it was my bad. I'm using the jQuery ajax method, and in the "success" callback I forgot to stop using JSON.parse to print the data in the console; the data was already JSON, which is why I got that line 1 column 1 error. My code worked exactly as posted here. If anyone wants to know how to make asynchronous requests, I followed this tutorial: Django form submissions using ajax
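As an aside, newer Django versions (1.7+) ship JsonResponse, which bundles json.dumps, the encoder and the content type into one call; a sketch, assuming response_data is the list of dicts returned by dictfetchall:

from django.http import JsonResponse

# safe=False is required because the payload is a list, not a dict;
# DjangoJSONEncoder handles Decimal and DateTime values.
return JsonResponse(response_data, encoder=DjangoJSONEncoder, safe=False)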

Syntax Error with Validated Query when sent into BigQuery via Python API Client

Here is my query:
SELECT hits.page.pagePath
FROM [(project_id):(dataset_id).ga_sessions_20151019]
GROUP BY hits.page.pagePath LIMIT 1
It runs in the web UI.
Here is my code:
from oauth2client.service_account import ServiceAccountCredentials
from httplib2 import Http
from apiclient.discovery import build
import json
query = "SELECT hits.page.pagePath FROM [(project_id):(dataset_id).ga_sessions_20151019] GROUP BY hits.page.pagePath LIMIT 1",
path = (filepath of credentials json file)
scopes = ['https://www.googleapis.com/auth/bigquery']
credentials = ServiceAccountCredentials.from_json_keyfile_name(path,scopes)
http_auth = credentials.authorize(Http())
bigquery = build('bigquery','v2',http=http_auth)
req_body = {
"timeoutMs": 60000,
"kind": "bigquery#queryRequest",
"dryRun": False,
"useQueryCache": True,
"useLegacySql": False,
"maxResults": 100,
"query": query,
"preserveNulls": True,
}
bigquery.jobs().query(projectId=(project_id),body=req_body).execute()
When I run this, I get the following error:
HttpError: <HttpError 400 when requesting https://www.googleapis.com/bigquery/v2/projects/cardinal-path/queries?alt=json returned "Syntax error: Unexpected "["">
It doesn't seem to like the brackets in my query string, but I don't know how to escape them (if that is the issue). Does anyone see what I'm doing wrong? I don't think it's an issue with my connection to the API, because I am able to see all the jobs I've started (which have all failed due to the above HttpError / syntax error) by calling the service object's ('bigquery' above) jobs().list() function. Thanks!
I see you are setting useLegacySql to False in your query request.
Bracket-quoting identifiers like [projectid:datasetid.tableid] is part of the legacy BigQuery SQL dialect.
The new SQL dialect uses backticks to quote identifiers. So try:
SELECT hits.page.pagePath FROM `project_id:dataset_id.ga_sessions_20151019` GROUP BY hits.page.pagePath LIMIT 1
Alternatively, since you are passing project_id as the project you are running the job in, all dataset lookups will resolve to that project by default, so you can drop the project_id: prefix and just use dataset_id.table_id, like:
SELECT hits.page.pagePath FROM dataset_id.ga_sessions_20151019 GROUP BY hits.page.pagePath LIMIT 1
While this is convenient for hand-typed queries, if all your queries are code-generated it is probably safest to always use quoted, fully-qualified references.
Update: another alternative is to use SQL's standard dot separator with the non-legacy SQL dialect, i.e.
SELECT hits.page.pagePath
FROM project_id.dataset_id.ga_sessions_20151019
GROUP BY hits.page.pagePath LIMIT 1
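Putting the answer together with the question's code, the fixed request might look like this (a sketch; project_id and dataset_id remain placeholders, and the body keeps only parameters the question already used):

query = ("SELECT hits.page.pagePath "
         "FROM `project_id.dataset_id.ga_sessions_20151019` "
         "GROUP BY hits.page.pagePath LIMIT 1")

req_body = {
    "timeoutMs": 60000,
    "useQueryCache": True,
    "useLegacySql": False,
    "maxResults": 100,
    "query": query,
}

# bigquery is the authorized service object built earlier in the question.
bigquery.jobs().query(projectId=(project_id), body=req_body).execute()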
