I went through the pydruid documentation and could not find a single example of using an existing JSON file and converting it into a format which pydruid can use to POST (i.e. start the indexing job). We are using an older version of Druid, so we may not have all the latest functionality. I'm sure I could achieve the same behavior with pycurl, but somehow I feel pydruid is a better approach here. Can someone post an example of how I can use the existing JSON file with pydruid?
I have a huge collection of .json files, each containing hundreds or thousands of documents, that I want to import into ArangoDB collections. Can I do it using Python, and if the answer is yes, can anyone share an example of how to do it from a list of files? i.e.:
for i in filelist:
import i to collection
I have read the documentation but I couldn't find anything even resembling that.
After a lot of trial and error I found out that I had the answer in front of me all along. I didn't need to import the .json file itself; I just needed to read it and then do a bulk import of the documents. The code is like this:
import json

a = db.collection('collection_name')
for x in list_of_json_files:
    with open(x, 'r') as json_file:
        data = json.load(json_file)  # each file holds a list of documents
    a.import_bulk(data)
So actually it was quite simple. In my implementation I am collecting the .json files from multiple folders and importing them into multiple collections. I am using the python-arango 5.4.0 driver.
I had this same problem. Though your implementation will be slightly different, the answer you need (maybe not the one you're looking for) is to use the "bulk import" functionality.
Since ArangoDB doesn't have an "official" Python driver (that I know of), you will have to peruse other sources to give you a good idea on how to solve this.
The HTTP bulk import/export docs provide curl commands, which can be neatly translated to Python web requests. Also see the section on headers and values.
ArangoJS has a bulk import function, which works with an array of objects, so there's no special processing or preparation required.
I have also used the arangoimport tool to great effect. It's command-line, so it could be controlled from Python, or used stand-alone in a script. For me, the key here was making sure my data was in JSONL or "JSON Lines" format (each line of the file is a self-contained JSON object, no bounding array or comma separators).
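To make the HTTP route concrete, here is a minimal sketch of calling the bulk import endpoint with the requests library. The host, credentials, collection name, and documents are placeholders for illustration:

import json
import requests

ARANGO_URL = "http://localhost:8529"   # placeholder host
COLLECTION = "my_collection"           # placeholder collection name

docs = [{"_key": "a", "value": 1}, {"_key": "b", "value": 2}]

# With type=documents the endpoint expects JSONL: one self-contained
# JSON object per line, no bounding array or comma separators.
payload = "\n".join(json.dumps(d) for d in docs)

resp = requests.post(
    f"{ARANGO_URL}/_api/import",
    params={"type": "documents", "collection": COLLECTION},
    data=payload,
    auth=("root", ""),  # placeholder credentials
)
resp.raise_for_status()
print(resp.json())  # reports how many documents were created, errors, etc.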
I have a large number of Graphviz files that I need to convert to Neo4j. At first blush, it looks like it should be easy enough to read it as a text file and convert to cypher but I am hoping that one of the python graphviz libraries would make it easier to "parse" the input, or that someone is aware of a prebuilt library.
Is anyone aware of, or has anyone already built, a parser for this conversion? Partial examples are fine. Thanks
You can probably hack this together pretty easily using NetworkX. It implements a read_dot function to read in the Graphviz format, and then I'm sure you can use one of its graph exporters to dump that back into a format that Neo4j can use. For example, here's a package that attempts to simplify that export process (disclaimer: I've never tried this package, it just showed up in Google).
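If that route works for your files, a rough sketch might look like the following. The Node label and LINKS_TO relationship type are made-up names, and reading .dot files via nx_pydot requires the pydot package:

import networkx as nx

# read_dot returns a MultiGraph/MultiDiGraph keyed by node name
G = nx.nx_pydot.read_dot("graph.dot")  # placeholder filename

# Emit naive Cypher CREATE statements for nodes and edges.
for node in G.nodes():
    print(f"CREATE (:Node {{id: {node!r}}})")

for u, v in G.edges():
    print(
        f"MATCH (a:Node {{id: {u!r}}}), (b:Node {{id: {v!r}}}) "
        f"CREATE (a)-[:LINKS_TO]->(b)"
    )

From there you could pipe the statements into cypher-shell or run them through a Neo4j driver.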
I'm a beginner in Python, currently working on a project scheduler. I get the data entered by the user, then put it in a table in a file so I can print it later.
Unfortunately I have no idea how to do it; I searched a lot on the internet without success.
So if someone can help me it would be really cool.
Your request seems rather broad and not overly specific, so this answer may not be what you're looking for, but I will try to help anyway.
Saving Files in Python
If you want to learn about saving files, look up a tutorial on Pickle. Pickle lets you save data while maintaining its data type; for example, you can save a list in a file using Pickle, then load the file using Pickle to get the list back. To use Pickle, make sure to have the line import pickle at the top of your code and use pickle. before every Pickle function, e.g. pickle.dump().
Here's a useful tutorial I found on Pickle https://pythonprogramming.net/python-pickle-module-save-objects-serialization/
You will also want to ensure you know about file handling in Python. Here's a cheat sheet with all the basic file handling functions https://www.pythonforbeginners.com/cheatsheet/python-file-handling
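As a minimal sketch of the dump/load round trip (the filename and data here are just placeholders):

import pickle

tasks = [("write report", "2024-05-01"), ("send invoice", "2024-05-03")]

# Save the list to a file, preserving its data type.
with open("tasks.pickle", "wb") as f:
    pickle.dump(tasks, f)

# Load it back later and you get the same list object.
with open("tasks.pickle", "rb") as f:
    loaded = pickle.load(f)

print(loaded)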
Dates and Times in Python
A helpful module called Datetime will enable you to check your system's current date and/or time in Python. There are many functions in Datetime, but you will probably only need to use the basic aspects of it. Again, make sure you have the line import datetime at the top of your code and use datetime. before every Datetime function.
Here's a useful tutorial I found on Datetime https://www.tutorialspoint.com/python/python_date_time.htm
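For example, a quick sketch of the basics:

import datetime

now = datetime.datetime.now()           # current local date and time
print(now.strftime("%Y-%m-%d %H:%M"))   # formatted, e.g. 2024-05-01 14:30

today = datetime.date.today()           # just the date
print(today)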
If you find yourself stuck or you're not sure what to do feel free to ask more questions. Hope this has been somewhat helpful for you.
Does anyone know of a python library to convert JSON to JSON in an XSLT/Velocity template style?
JSON + transformation template = JSON (New)
Thanks!
Sorry if it's old, but you can use this module https://github.com/Onyo/jsonbender
Basically it transforms one dict into another dict object using a mapping. What you can do is load the JSON into a dict, transform it into another dict, and then dump it back to JSON.
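Here is a minimal sketch in the style of the jsonbender README; the field names are made up for illustration:

import json
from jsonbender import bend, K, S

MAPPING = {
    "full_name": S("customer", "name"),  # S() selects a (nested) field
    "source": K("import"),               # K() injects a constant
}

source = json.loads('{"customer": {"name": "Ada"}}')
result = bend(MAPPING, source)
print(json.dumps(result))  # {"full_name": "Ada", "source": "import"}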
I did not find a transformation library suitable for my needs and spent a couple of days trying to create my own. And then I realized that creating a transformation schema is more difficult than writing native Python code that transforms one JSON-like Python object into another.
I understand that this is not the answer to the original question. And I also understand that my approach has certain limitations. For example, if you need to generate documentation it wouldn't work.
But if you just need to transform JSON-like objects, consider the possibility of just writing Python code that does it. Chances are that the code will be cleaner and easier to understand than a transformation schema description.
I wish I had considered this approach more seriously a couple of days ago.
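For example, a hand-written transform can stay very readable; the input and output shapes here are made up for illustration:

def transform(order):
    # Plain Python instead of a mapping DSL: rename, aggregate, drill in.
    return {
        "id": order["order_id"],
        "total": sum(item["price"] * item["qty"] for item in order["items"]),
        "customer_name": order.get("customer", {}).get("name"),
    }

source = {
    "order_id": 42,
    "items": [{"price": 9.99, "qty": 2}],
    "customer": {"name": "Ada"},
}
print(transform(source))  # {'id': 42, 'total': 19.98, 'customer_name': 'Ada'}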
I found the pyjq library very magical: you can feed it a template and a JSON file and it will map it out for you.
https://pypi.org/project/pyjq/
The only annoying thing about it was the requirements I had to install for it. It worked perfectly on my local machine, but it failed to build its dependencies when I tried to package it for AWS Lambda.
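For what it's worth, a minimal sketch of the usage (the data and jq program are just examples):

import pyjq

data = {"users": [{"name": "Ada", "age": 36}, {"name": "Grace", "age": 45}]}

# pyjq.all() runs a jq program against a Python value and returns
# every result as a list.
names = pyjq.all(".users[] | {n: .name}", data)
print(names)  # [{'n': 'Ada'}, {'n': 'Grace'}]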
I notice that Sphinx has the ability to generate documentation in JSON. What are these files used for?
As the docs say, it's "for use of a web application (or custom postprocessing tool) that doesn't use the standard HTML templates."
json's a good simple way for language-agnostic data interchange, so, why not?-)
I assume you're talking about the SerializingHTMLBuilder, in which case I think the answer might be that there isn't necessarily a specific purpose in mind. Rather, many things provide conversion routines of various kinds with a "loads/dumps" API convention, and the json module (known as simplejson before it was brought into the standard library in 2.6) is but one of many such packages.
Presumably some people would prefer to work with data in JSON format for their own purposes. If I were trying to build some sort of dynamic Javascripty documentation system, I could well imagine choosing to use JSON as the way to get documentation from the backend out to the client in a manageable format, if for some reason HTML or XML didn't seem like the better option.
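For instance, assuming output from Sphinx's JSON builder (which, as far as I know, writes one .fjson file per page with keys like title and body), a backend could serve pages with something like:

import json

with open("_build/json/index.fjson") as f:   # placeholder path
    page = json.load(f)

print(page["title"])        # the page title
print(page["body"][:200])   # rendered HTML fragment for the page body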