I am searching for something like SQLite but non-relational. In other words, I would like to work with a triple store (a set of subject-predicate-object triples) instead of tables. That means I want to use SPARQL queries instead of SQL.
The first idea that comes to mind is RDFLib. However, I see two problems with this option:
RDFLib is not a database and, as a consequence, it is not designed to work with parallel processes (for example, concurrent requests from many web users). It might lead to inconsistencies if two users try to add to or delete from the triple store at the same time.
RDFLib is designed to work with RDF, which is a particular implementation (syntax) of the triple store. For example, each subject, predicate and object has to have a URI, and I do not have them. In my triple store I would like to have triples like this: ("Washington", "is capital of", "USA") (so, no URIs).
SPARQL is explicitly for RDF; if you want to use it you'll need to create your own ontology or utilise existing ones.
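For instance, one way to keep plain-string triples like ("Washington", "is capital of", "USA") while still satisfying RDF is to mint URIs from the strings under a made-up namespace. A minimal sketch with RDFLib, where the http://example.org/ base and the naive space-to-underscore quoting are my own assumptions:

```python
from rdflib import Graph, Namespace

# Hypothetical base namespace; any URI prefix you control will do.
EX = Namespace("http://example.org/")

def term(label):
    # Turn a plain string into a URI so it can live in an RDF graph
    # (naive quoting: spaces become underscores).
    return EX[label.replace(" ", "_")]

g = Graph()
g.add((term("Washington"), term("is capital of"), term("USA")))

# Once everything is a URI, SPARQL queries work against the graph.
for s, p, o in g:
    print((s, p, o))
```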
I recommend taking a look at ORDF with 4store as the back-end.
As the commenters already said, there is a wrapper interface for SPARQL - the SPARQL Endpoint interface to Python (currently in version 1.6.0):
http://rdflib.github.io/sparqlwrapper/
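A minimal sketch of querying a public endpoint with SPARQLWrapper (the DBpedia endpoint and the capital-of query are just illustrative):

```python
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    SELECT ?capital WHERE { dbr:United_States dbo:capital ?capital . }
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["capital"]["value"])
```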
I also came across another thread discussing non-relational databases with Python, although it doesn't specifically mention SPARQL: portable non-relational database
While it doesn't have a whole lot to it, this guide for Python SPARQL developers has some pointers: http://www.openlinksw.com/blog/~kidehen/?id=1651
I need a key-value database, like Redis or Memcached, but on disk rather than in memory. After filling the database (which we do regularly and from scratch), I'd actually only need the get operation, but from many different processes (so Kyoto Cabinet and LevelDB do not work for me).
I need around 5 million keys and ~10-30 GB of data, so some other simple databases won't work either.
I can't find any information on whether RocksDB can handle multiple read-only clients; it's not straightforward to build on my OS, so I wanted to ask before doing that. If it can't, is there any database that would work? Preferably with an Ubuntu package and Python bindings ;-).
We're just using many, many small files now, but it really sucks, as we want easy backups, copying, etc. I also suspect this may cause slowdowns, but it doesn't really matter that much.
Yes, you should be able to run multiple read-only clients on a single RocksDB database. Just open the database with the DB::OpenForReadOnly() call: https://github.com/facebook/rocksdb/blob/master/include/rocksdb/db.h#L108
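From Python, the same thing can be done through the third-party python-rocksdb bindings; a minimal sketch, assuming those bindings and their read_only flag (which wraps DB::OpenForReadOnly()), with a placeholder path and key:

```python
import rocksdb

# Each reader process opens the same on-disk database in read-only mode,
# so many processes can serve gets concurrently.
db = rocksdb.DB("/path/to/db", rocksdb.Options(), read_only=True)

value = db.get(b"some-key")  # keys and values are bytes
print(value)
```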
The simplest answer is probably Berkeley DB, and bindings are a part of the stdlib: https://docs.python.org/2/library/anydbm.html
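A minimal sketch using that Python 2 interface (anydbm picks whichever dbm flavour is available on the system, which may or may not be Berkeley DB); the filename is a placeholder:

```python
import anydbm

db = anydbm.open("store.db", "c")  # "c" creates the file if it doesn't exist
db["key"] = "value"                # keys and values must be strings
print(db["key"])
db.close()
```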
I am confused as to how APIs work with Django in general. I am looking to get started but am not sure where to begin. I am fairly new to Django but have mastered most of the basics.
I am looking to understand how to communicate with other REST APIs: how to send and receive JSON data, and what is needed for this data (where does it live? are models required? do I create views to access JSON data?).
I am looking for a comprehensive tutorial or book/article that I can follow that will teach me the ins and outs of this. Any help on where to get started would be much appreciated.
Django is for web development. If what you want to do is get JSON from some remote RESTful service, no part of that requires Django. Instead, try urllib or httplib2, and check out simple examples elsewhere.
Likewise, sending JSON data is as simple as using Python's json library together with the same urllib tricks you use to consume JSON from other people. So no Django is needed there either.
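A minimal standard-library sketch of both directions, with a placeholder URL (this is urllib2 from Python 2; Python 3 moved it to urllib.request):

```python
import json
import urllib2

# Consume JSON from a remote REST service.
response = urllib2.urlopen("http://example.com/api/items/")
items = json.loads(response.read())

# Send JSON to an endpoint with a POST request.
payload = json.dumps({"name": "widget", "price": 9.99})
request = urllib2.Request(
    "http://example.com/api/items/",
    data=payload,
    headers={"Content-Type": "application/json"},
)
urllib2.urlopen(request)
```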
As for whether models are required, that depends entirely on what you're trying to do. I think your question about needing "views" for JSON data conflates several different issues.
I'd recommend you read up on RESTful services in general, and where JSON fits before you start implementation.
I'm reading around and see that it is a bad idea to have a remote application talk directly to my MongoDB, e.g. installing a MongoDB driver in a phone app. The best way is to have a REST interface on a server sit between the database and the end user. But what about the aggregation framework?
I see Sleepy.mongoose and Eve but I cannot see anything about aggregation.
Is there any way, or any REST interface, which allows you to make aggregation calls (I'm interested in subdocuments)?
E.g. requesting $ curl 'http://localhost:27080/customFunction/Restaurant' and getting back all the subdocuments whose shop.kind matches Restaurant.
I'm familiar with Python and Java; is there any API framework that allows you to do that?
Before you get flagged as off-topic, as you likely will be for asking for opinions rather than a specific programming question, I'll just say one bit. Hopefully it's on-topic.
I highly doubt that most such projects will go beyond being a basic CRUD adaptor giving you access to collection objects and sometimes (badly) database objects. As with their various ORM-backed counterparts, they will no doubt allow a similar query syntax to be executed from the client, so queries could be composed and sent through as JSON, which will, not surprisingly, look much like (if not identical to) the standard query syntax for MongoDB.
For myself, I prefer to roll my own, largely because you may want to implement a lot of custom behaviour and actions, and in some way abstract away from having a lot of CRUD code in the client. Let's face it: you're probably just passing JSON through into the native structures you're using anyway, so it's not really hard. Anyhow, to each his own, I suppose.
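As an illustration of rolling your own, here is a minimal sketch of a server-side endpoint exposing one fixed aggregation over embedded shop subdocuments; Flask, the database and collection names, and the pipeline shape are all assumptions on my part:

```python
from flask import Flask, jsonify
from pymongo import MongoClient

app = Flask(__name__)
shops = MongoClient()["mydb"]["shops"]

@app.route("/customFunction/<kind>")
def subdocs_by_kind(kind):
    # Unwind the embedded subdocuments, keep only those whose shop.kind
    # matches the URL parameter, and drop _id so the result is JSON-safe.
    pipeline = [
        {"$unwind": "$shop"},
        {"$match": {"shop.kind": kind}},
        {"$project": {"_id": 0, "shop": 1}},
    ]
    return jsonify(results=list(shops.aggregate(pipeline)))

if __name__ == "__main__":
    app.run(port=27080)
```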
There is a listing of other implementations available here:
http://docs.mongodb.org/ecosystem/tools/http-interfaces/
I would like to design an algorithm in Python that scrapes thousands of pages like this one and this one, gathers all the data, and inserts it into a MySQL database. The script will be run on a weekly or bi-weekly basis to update the database with any new information added to each individual page.
Ideally I would like a scraper that is easy to work with for table-structured data, but also for data that does not have unique identifiers (i.e. id and class attributes).
Which scraping library should I use? BeautifulSoup, Scrapy or Mechanize?
Are there any particular tutorials/books I should be looking at for this desired result?
In the long run, I will be implementing a mobile app that works with all this data by querying the database.
First thought:
(In order to save some time) Have you seen the Wayback Machine? http://archive.org/web/
Second thought:
If you are going to develop a mobile app, then the layout of this site doesn't lend itself to being put on handheld devices easily. I would suggest not bothering with the webpage portion of this. You are just going to have to dig all the information out eventually, and change your scrapers each time they change some little thing on their website.
You can get the data from their developer API in JSON or CSV format.
From the raw data you can turn it into whatever format you want (for personal use only, according to their site).
Caveats:
Pay attention to the robots.txt file on the site.
http://www.robotstxt.org/robotstxt.html
If they don't want to be scraped, they will tell you so. You can do this for personal use, but if you try making money from their content you will find yourself sued.
You could use lxml, which can take XPath specifiers. It takes a while to get used to the XPath syntax, but it's useful in cases like this.
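A minimal lxml sketch along those lines; the URL and the XPath expressions are placeholders for the actual pages:

```python
import urllib2

from lxml import html

page = html.fromstring(urllib2.urlopen("http://example.com/stats").read())

# Walk every row of the first table and pull the text of each cell,
# without relying on id or class attributes.
for row in page.xpath("//table[1]//tr"):
    cells = [cell.text_content().strip() for cell in row.xpath("./td")]
    if cells:
        print(cells)
```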
I'm looking to get into web development. My goal is to create some interactive webpages that interact with MS SQL databases (reads/inserts/updates), and possibly also sites that interact with XML files.
I've got some basic understanding of Python and Perl scripting. Can someone point me in the right direction in either of those languages to accomplish what I'm looking to do? Or, if it's easier to accomplish in another language, what would that be?
Apologies if my stated goal is too broad.
I'd strongly suggest you look into some of the web development frameworks. They take care of many of the low-level tasks that are needed to build a solid web page. I'm not very familiar with Perl, so I can only suggest Python frameworks, especially one of my favourites, Django. It has very good documentation, which is essential for a first-timer. I believe you should be fine as long as you follow the official documentation.
Good luck
You can use SQLAlchemy in Python, and lxml or the default ElementTree XML module for simple cases.
I have done both for a web service I maintain, and they work nicely.
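A minimal SQLAlchemy sketch against MS SQL Server; the connection string, the pyodbc driver, and the table name are all assumptions:

```python
from sqlalchemy import create_engine, text

# mssql+pyodbc is SQLAlchemy's SQL Server dialect; credentials are placeholders.
engine = create_engine(
    "mssql+pyodbc://user:password@myserver/mydb"
    "?driver=ODBC+Driver+17+for+SQL+Server"
)

with engine.connect() as conn:
    for row in conn.execute(text("SELECT TOP 5 * FROM customers")):
        print(row)
```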
You can also use a web development framework. I personally suggest Flask, on the grounds that it is a lightweight framework, as opposed to Django, for instance. However, depending on your exact use case, the latter might be better.
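To give a feel for how little Flask needs, here is a minimal sketch of one route that reads an XML file with ElementTree and returns JSON; the file name and its <item name="..."> structure are made up:

```python
import xml.etree.ElementTree as ET

from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/items")
def items():
    # Parse a local XML file and return the name attribute of each <item>.
    tree = ET.parse("items.xml")
    return jsonify(names=[el.get("name") for el in tree.findall("item")])

if __name__ == "__main__":
    app.run(debug=True)
```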