How to load a JSON API into Cayley graph DB - Python

I have data located in various sources such as JSON files and several APIs. I now need to collate all of this data and push it into a Cayley graph database, which will eventually act as the input for a chatbot framework. I am currently not sure how to collate the existing data, push it into Cayley, and retrieve it from the Cayley graph database.
Help needed …
Thanks in advance

Unfortunately, Cayley cannot import JSON data directly by design.
The main reason is that it has no way of knowing which values in JSON are node IDs and which are regular string values.
However, it supports the JSON-LD format, which is regular JSON with some additional annotations. These annotations resolve the ambiguity I mentioned.
I suggest checking the JSON-LD Playground examples first and then schema.org for a list of well-known object types. Note that it's also possible to define your own types; see the JSON-LD documentation for details.
The last step would be to use Cayley's HTTP API v2 to import the data. Make sure to pass the correct Content-Type header, or use a Cayley client that supports JSON-LD.
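As a rough sketch of that last step (assuming a local Cayley instance on the default port 64210 and a v2 write endpoint at /api/v2/write; check the HTTP API docs of your Cayley version for the exact path), the JSON-LD document and the POST could look like this:

import json
import requests  # any HTTP client works; requests is used here for brevity

# A minimal JSON-LD document: @id marks node identifiers, while @type and
# @context map the plain keys to schema.org terms.
doc = {
    "@context": "http://schema.org/",
    "@id": "http://example.com/person/alice",
    "@type": "Person",
    "name": "Alice",
    "knows": {"@id": "http://example.com/person/bob"},
}

resp = requests.post(
    "http://localhost:64210/api/v2/write",  # assumed endpoint, see note above
    data=json.dumps(doc),
    headers={"Content-Type": "application/ld+json"},
)
print(resp.status_code, resp.text)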

Related

Convert GraphQLResponse dictionary to python object

I am running a graphql query using aiographql-client and getting back a GraphQLResponse object, which contains a raw dict as part of the response json data.
This dictionary conforms to a schema, which I am able to parse into a graphql.type.schema.GraphQLSchema type using graphql-core's build_schema method.
I can also correctly get the GraphQLObjectType of the object being returned; however, I am not sure how to properly deserialize the dictionary into a Python object with all the appropriate fields, using the GraphQLObjectType as a reference.
Any help would be greatly appreciated!
I'd recommend using Pydantic to do the heavy lifting in the parsing.
You can then either generate the models beforehand and select the ones you need based on GraphQLObjectType or generate them at runtime based on the definition returned by build_schema.
If you really must define your models at runtime you can do that with pydantic's create_model function, described here: https://pydantic-docs.helpmanual.io/usage/models/#dynamic-model-creation
For the static model generation you can probably leverage something like https://jsontopydantic.com/
If you share some code samples, I'd be happy to give some more insight into the actual implementation.
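For the dynamic route, a minimal sketch of pydantic's create_model (the field names and types here are illustrative; in practice you would derive them from the fields of the GraphQLObjectType returned by build_schema):

from typing import Optional
from pydantic import create_model

# Illustrative fields; real name/type pairs would come from the schema.
UserModel = create_model(
    "UserModel",
    id=(int, ...),                # required
    name=(str, ...),              # required
    email=(Optional[str], None),  # optional, defaults to None
)

user = UserModel(**{"id": 1, "name": "Ada"})
print(user.id, user.name, user.email)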
I faced the same tedious problem while developing a personal project.
Because of that, I just published a library whose purpose is to manage the mappings between Python objects and GraphQL objects (Python objects -> GraphQL query and GraphQL response -> Python objects):
https://github.com/dapalex/py-graphql-mapper
So far it manages only basic queries and responses; if it turns out to be useful, I will keep implementing more features.
Have a look and see if it can help you.
Coming back to this, there are some projects out there trying to achieve this functionality:
https://github.com/enra-GmbH/graphql-codegen-ariadne
https://github.com/sauldom102/gql_schema_codegen

How to set parameters for an API POST query using Python and JSON

I'm querying a real estate API using Python (requests), with POST data submitted in JSON format.
I'm getting responses as expected; however, each time I want to make a query I'm editing the fields of a hardcoded JSON object in the .py file.
I'd like to do something a bit more robust, e.g. using a user prompt to populate the JSON object to be submitted, based on the API search schema (see JSON file (pastebin)); I'm open to alternative Python-based solutions to this.
The linked schema includes the full list of parameters available to query. I'll likely trim this down to the ones most relevant to the queries I'm building/POSTing, so that there are fewer parameters to deal with. What is a Pythonic way to cycle through the parameters in the schema and add the ones I wish to submit for a query to the JSON object?
TIA.
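One possible sketch, assuming the schema is a JSON file with a top-level "parameters" object mapping parameter names to definitions; the file name and keys are placeholders and would need to match the real schema:

import json

# Placeholder file name; the real schema is the pastebin file mentioned above.
with open("search_schema.json") as f:
    schema = json.load(f)

query = {}
for name, definition in schema.get("parameters", {}).items():
    value = input(f"{name} ({definition.get('type', 'string')}) [blank to skip]: ")
    if value:
        query[name] = value

# The resulting dict can then be POSTed, e.g. with requests.post(url, json=query).
print(json.dumps(query, indent=2))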

Python jsonschema: how to query schema to determine the type of a property?

Consider the following use case:
I have the configuration of a network router (OpenWRT) in a text format that I'm converting to JSON (NetJSON, to be specific). The text format used by the router only uses strings, so I have to convert many configuration attributes from strings to booleans and integers.
I would like to query the JSON Schema to automatically determine the expected type of each attribute and perform the right conversion.
The JSON schema we are using is quite complex and contains many definitions that are merged using allOf, anyOf, etc., therefore just looping over a specific part of the schema is not good enough.
Is there a way to do this using the Python jsonschema library, or are there alternative ways of doing it?
PS: the implementation of this feature is open source; you can find out more about the OpenWISP netjsonconfig library and the pull request to add the backward conversion feature into the library.
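As a rough illustration of one way to approach it without a dedicated library, this sketch walks the schema dict and descends into allOf/anyOf/oneOf branches; it does not resolve $ref or cover every keyword:

def find_property_types(schema, prop):
    # Collect every declared "type" for prop across allOf/anyOf/oneOf branches.
    types = set()
    if not isinstance(schema, dict):
        return types
    definition = schema.get("properties", {}).get(prop, {})
    declared = definition.get("type")
    if declared:
        types.update(declared if isinstance(declared, list) else [declared])
    for keyword in ("allOf", "anyOf", "oneOf"):
        for sub in schema.get(keyword, []):
            types.update(find_property_types(sub, prop))
    return types

example = {
    "allOf": [
        {"properties": {"enabled": {"type": "boolean"}}},
        {"properties": {"mtu": {"type": "integer"}}},
    ]
}
print(find_property_types(example, "mtu"))  # {'integer'}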

Serializing data and unpacking safely from untrusted source

I am using Pyramid as a basis for transfer of data for a turn-based video game. The clients use POST data to present their actions, and GET to retrieve serialized game board data. The game data can sometimes involve strings, but is almost always two integers and two tuples:
gamedata = (userid, gamenumber, (sourcex, sourcey), (destx, desty))
My general client-side approach was to pickle the data, convert it to base64, urlencode it, and submit the POST. The server then receives the POST, unpacks the single-item dictionary, decodes the base64, and unpickles the data object.
I want to use Pickle because I can use classes and values. Submitting game data as POST fields can only give me strings.
However, pickle is regarded as unsafe, so I turned to PyYAML, which serves the same purpose. Using yaml.safe_load(data), I can load data without exposing security flaws. However, safe_load is VERY safe: I cannot even deserialize harmless tuples or lists, even if they only contain integers.
Is there some middle ground here? Is there a way to serialize python structures without at the same time allowing execution of arbitrary code?
My first thought was to write a wrapper for my send and receive functions that uses underscores in value names to recreate tuples, e.g. sending would convert the dictionary value source: (x, y) to source_0: x, source_1: y. My second thought was that it wasn't a very wise way to develop.
edit: Here's my implementation using JSON... it doesn't seem as powerful as YAML or Pickle, but I'm still concerned there may be security holes.
The client side was constructed a bit more explicitly while I experimented:
import urllib, json, base64
arbitrarydata = { 'id':14, 'gn':25, 'sourcecoord':(10,12), 'destcoord':(8,14)}
jsondata = json.dumps(arbitrarydata)
b64data = base64.urlsafe_b64encode(jsondata)
transmitstring = urllib.urlencode( [ ('data', b64data) ] )
urllib.urlopen('http://127.0.0.1:9000/post', transmitstring).read()
The Pyramid server can retrieve the data objects:
json.loads(base64.urlsafe_b64decode(request.POST['data'].encode('ascii')))
On an unrelated note, I'd love to hear some other opinions about the acceptability of using POST data in this way; my game client is in no way browser-based at this time.
Why not use colander for your serialization and deserialization? Colander turns an object schema into a simple data structure and vice versa, and you can use JSON to send and receive this information.
For example:
import colander
class Item(colander.MappingSchema):
    thing = colander.SchemaNode(colander.String(),
                                validator=colander.OneOf(['foo', 'bar']))
    flag = colander.SchemaNode(colander.Boolean())
    # supported_languages is assumed to be a predefined list of language codes
    language = colander.SchemaNode(colander.String(),
                                   validator=colander.OneOf(supported_languages))

class Items(colander.SequenceSchema):
    item = Item()
The above setup defines a list of item objects, but you can easily define game-specific objects too.
Deserialization becomes:
items = Items().deserialize(json.loads(jsondata))
and serialization is:
json.dumps(Items().serialize(items))
Apart from letting you round-trip python objects, it also validates the serialized data to ensure it fits your schema and hasn't been mucked about with.
How about json? The module is part of the Python standard library, and it allows serialization of most generic data without arbitrary code execution.
I don't see raw JSON providing the answer here, as I believe the question specifically mentioned pickling classes and values. I don't believe straight JSON can serialize and deserialize Python classes, while pickle can.
I use a pickle-based serialization method for almost all server-to-server communication, but always include very serious authentication mechanisms (e.g. RSA key-pair matching). However, that means I only deal with trusted sources.
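A sketch of that trusted-source pattern, using HMAC signing from the standard library rather than RSA for brevity (the shared secret is an assumption, and this only helps when both ends are trusted peers holding the key):

import hashlib
import hmac
import pickle

SECRET_KEY = b"shared-secret"  # assumption: both peers hold this key

def sign_and_dump(obj):
    payload = pickle.dumps(obj)
    digest = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest().encode()
    return digest + b":" + payload

def verify_and_load(blob):
    digest, payload = blob.split(b":", 1)
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(digest, expected):
        raise ValueError("signature mismatch - refusing to unpickle")
    return pickle.loads(payload)

blob = sign_and_dump((14, 25, (10, 12), (8, 14)))
print(verify_and_load(blob))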
If you absolutely need to work with untrusted sources, I would, at the very least, try to add (much like @MartijnPieters suggests) a schema to validate your transactions. I don't think there is a good way to work with arbitrary pickled data from an untrusted source. You'd have to do something like parse the byte-string with a disassembler and then only allow trusted patterns (or block untrusted patterns). I don't know of anything that can do this for pickle.
However, if your class is "simple enough"… you might be able to use the JSONEncoder, which essentially converts your python class to something JSON can serialize… and thus validate…
How to make a class JSON serializable
The impact is, however, that you have to write a JSONEncoder subclass that knows how to convert your classes.
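A minimal sketch of that JSONEncoder pattern (the Move class is illustrative, modelled on the gamedata tuple above):

import json

class Move:
    def __init__(self, userid, gamenumber, source, dest):
        self.userid = userid
        self.gamenumber = gamenumber
        self.source = source
        self.dest = dest

class MoveEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Move):
            # Represent the object as a plain dict that json can handle.
            return {"userid": obj.userid, "gamenumber": obj.gamenumber,
                    "source": list(obj.source), "dest": list(obj.dest)}
        return super().default(obj)

move = Move(14, 25, (10, 12), (8, 14))
print(json.dumps(move, cls=MoveEncoder))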

Parsing an XML file and storing it in a database

Is there a generic/automatic way in R or in Python to parse XML files with their nodes and attributes, automatically generate MySQL tables for storing that information, and then populate those tables?
Regarding
"Is there a generic/automatic way in R to parse xml files with its nodes and attributes, automatically generate mysql tables for storing that information and then populate those tables."
the answer is a good old yes you can, at least in R.
The XML package for R can read XML documents and return R data.frame types in a single call using the xmlToDataFrame() function.
And the RMySQL package can transfer data.frame objects to the database in a single command---including table creation if need be---using the dbWriteTable() function defined in the common DBI backend for R and provided for MySQL by RMySQL.
So in short: two lines can do it, so you can easily write yourself a new helper function that does it along with a commensurate amount of error checking.
These are three separate operations: parsing, table creation, and data population. You can do all three with Python, but there's nothing "automatic" about it, and I don't think it's that easy.
For example, XML is hierarchical while SQL is relational and set-based; I don't think it's always easy to derive a good relational schema for every XML stream you might encounter.
There's the XML package for reading XML into R, and the RMySQL package for writing data from R into MySQL.
Between the two there's a lot of work. XML surpasses the scope of an RDBMS like MySQL, so something that could handle any XML thrown at it would be either ridiculously complex or trivially useless.
We do something like this at work sometimes, but not in Python. In that case, each usage requires a custom program to be written. We only have a SAX parser available; using an XML decoder to get a dictionary/hash in a single step would help a lot.
At the very least you'd have to tell it which tags map to which tables and fields; no pre-existing lib can know that...
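To make the amount of hand-written mapping concrete, here is a deliberately simplified sketch that flattens one known tag into one table using only the standard library and sqlite3 (substitute a MySQL driver and its parameter style for a real MySQL setup; the element and column names are made up):

import sqlite3
import xml.etree.ElementTree as ET

xml_doc = """<library>
  <book id="1" title="Dune" year="1965"/>
  <book id="2" title="Hyperion" year="1989"/>
</library>"""

root = ET.fromstring(xml_doc)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER, title TEXT, year INTEGER)")

# The tag/attribute -> table/column mapping is written by hand; this is
# exactly the part that cannot be fully automated in the general case.
for book in root.findall("book"):
    conn.execute(
        "INSERT INTO book (id, title, year) VALUES (?, ?, ?)",
        (int(book.get("id")), book.get("title"), int(book.get("year"))),
    )

print(conn.execute("SELECT * FROM book").fetchall())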
