Extract data from Notion using Python

I'm using Notion to store my data (client name, subscription, ...) in tables.
I want to extract some of this data using Python, but I can't figure out how.
For example, I'd like to count the total number of clients, get the total amount of subscriptions, and so on.
Could you please suggest a way to do this?

If you only need to do this once, you can export the Notion page (database) to HTML, which will probably be easier to extract from.
If you want to do this on a daily, weekly, or monthly basis, I can't help with doing it in Python, but Zapier and Automate.io would be a good fit.

For fetching the data you can use notion-client. The great thing about it is that it supports both sync and async interfaces. What it lacks, though, is an easy way to navigate the structure of the data (which is quite complicated in Notion).
For that you can use basic-notion. It allows you to use model classes to easily access all the properties and attributes of your Notion objects - kind of like you would with an ORM.
In your case the code might look something like this:
    from notion_client import Client
    from basic_notion.query import Query
    from basic_notion.page import NotionPage, NotionPageList
    from basic_notion.field import SelectField, TitleField, NumberField

    # First define models
    class MyRow(NotionPage):
        name = TitleField(property_name='Name')
        subscription = SelectField(property_name='Subscription')
        some_number = NumberField(property_name='Some Number')
        # ... your other fields go here
        # See your database's schema and the available field classes
        # in basic_notion.field to define this correctly.

    class MyData(NotionPageList[MyRow]):
        ITEM_CLS = MyRow

    # You need to create an integration and get an API token from Notion:
    NOTION_TOKEN = '<your-notion-api-token>'
    DATABASE_ID = '<your-database-ID>'

    # Now you can fetch the data
    def get_data(database_id: str) -> MyData:
        client = Client(auth=NOTION_TOKEN)
        data = client.databases.query(
            **Query(database_id=database_id).filter(
                # Some filter here
                MyRow.name.filter.starts_with('John')
            ).sorts(
                # You can sort it here
                MyRow.name.sort.ascending
            ).serialize()
        )
        return MyData(data=data)

    # Pass the database ID defined above
    my_data = get_data(DATABASE_ID)
    for row in my_data.items():
        print(f'{row.name.get_text()} - {row.some_number.number}')
        # Do whatever else you may need to do
For more info, examples and docs see:
notion-client: https://github.com/ramnes/notion-sdk-py
basic-notion: https://github.com/altvod/basic-notion
Notion API Reference: https://developers.notion.com/reference/intro
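To get at the totals asked about in the question (number of clients, total subscription amount), you could also aggregate the raw page objects yourself. The following is only a minimal sketch: it assumes you already hold the list of pages from the "results" key of a database query response, and that the amount lives in a number property. The property name "Amount", the helper name summarize_rows and the sample data are hypothetical.

```python
from typing import Any, Dict, List


def summarize_rows(pages: List[Dict[str, Any]], amount_property: str) -> Dict[str, float]:
    """Count rows and sum one number property across Notion page objects."""
    total = 0.0
    for page in pages:
        prop = page.get("properties", {}).get(amount_property, {})
        value = prop.get("number")  # None when the cell is empty
        if value is not None:
            total += value
    return {"count": len(pages), "total": total}


# Hand-written sample in the shape the API uses for a number property.
sample_pages = [
    {"properties": {"Amount": {"type": "number", "number": 20}}},
    {"properties": {"Amount": {"type": "number", "number": 35.5}}},
    {"properties": {"Amount": {"type": "number", "number": None}}},
]

summary = summarize_rows(sample_pages, "Amount")
print(summary)
```

Note that a single query returns one page of results at a time, so for larger databases you would need to collect all result pages before aggregating.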

Related

How to enrich the Django request object with extended functions?

In my Django project, I have a situation where different views communicate via a request session data as follows:
    def view_one(request):
        ...
        request.session['my_session_key'] = data
        ...

    def view_two(request):
        ...
        data = request.session['my_session_key']
        ...
However, this has the following problems:
The key string my_session_key is not a constant, so it will be hard to keep consistent once I start reading from and/or writing to it in other parts of my code.
As this system grows, it will become harder to identify which keys are being used and written in the session.
In Kotlin (which I'm more familiar with), one way to solve this would be using an extension function, like so:
    var Request.my_session_key: Int
        get() = this.session["my_session_key"]
        set(value) { this.session["my_session_key"] = value }
This way, I could now write my views as follows:
    def view_one(request):
        ...
        request.my_session_key = data
        ...

    def view_two(request):
        ...
        data = request.my_session_key
        ...
Is there any way to accomplish something similar in Python or with Django? Or, alternatively, taking a step back, what would be the best way to organize the data stored in a Django session across multiple views?
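One possible direction, offered as a sketch rather than an established Django idiom, is a small wrapper class that defines each session key once and exposes it as a property. The class name SessionData is hypothetical, and a plain dict stands in for request.session here so the example runs without Django; it only relies on the session behaving like a dictionary.

```python
class SessionData:
    """Attribute-style access to a dict-like session, with keys defined once."""

    MY_SESSION_KEY = "my_session_key"

    def __init__(self, session):
        self._session = session

    @property
    def my_session_key(self):
        # Returns None if the key has not been written yet.
        return self._session.get(self.MY_SESSION_KEY)

    @my_session_key.setter
    def my_session_key(self, value):
        self._session[self.MY_SESSION_KEY] = value


# A plain dict stands in for request.session in this sketch.
fake_session = {}
SessionData(fake_session).my_session_key = 42
value_read_back = SessionData(fake_session).my_session_key
print(value_read_back)
```

In a view this would presumably be used as SessionData(request.session).my_session_key = data, which keeps every session key listed in one class.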

Python - Tweepy - How to use lookup_friendships?

I'm trying to figure out whether I already follow the user whose tweet the streaming API just received. If I don't, I want to follow them.
I've got something like:
    def checkFollow(status):
        relationship = api.lookup_friendships("Privacy_Watch_",status.user.id_str)
From there, how do I check if I follow this user already?

The lookup_friendships method is meant for checking your relationship to a batch of up to 100 users in one call, so it is more than you need when you only care about a single user.
You can instead use the show_friendship method, which returns information about the relationship between you and the given user.
I cannot test it right now, but the following code should do what you want:
    def checkFollow(status):
        # As far as I know, show_friendship returns a (source, target) pair of Friendship
        # objects, and source.following is True when the source account follows the target.
        source, target = api.show_friendship(source_screen_name=your_user_name, target_screen_name=status.user.screen_name)
        return source.following

How send lists of objects with GAE endpoints?

I'm working on an API in GAE using Endpoints (with Python). The data I want to send in the response consists of a few objects that I build at request time. To send these objects, I defined a ProtoRPC message class. Since I want to send a list of them, I also defined a class that represents a collection of them.
This is the basic code:
    class Greeting(messages.Message):
        """Greeting that stores a message."""
        message = messages.StringField(1)

    class GreetingCollection(messages.Message):
        """Collection of Greetings."""
        items = messages.MessageField(Greeting, 1, repeated=True)
But I can't find a way to build such a collection dynamically. From the documentation, I see that I can build a static collection to send, for example:
    STORED_GREETINGS = GreetingCollection(items=[
        Greeting(message='hello world!'),
        Greeting(message='goodbye world!'),
    ])
But what if I want to build this dynamically?
In my case I have a process that returns a list of Greetings, and I can't find a way to convert it into a collection of Greetings to send with Endpoints.
    return STORED_GREETINGS
Maybe I'm looking for something like this (for illustration only):
    for greeting in greetings:
        STORED_GREETINGS.add(greeting)
but I can't find how to do it.
Any help will be welcome.
Thank you so much.

Just build a normal list containing the Greeting objects and assign it to the GreetingCollection:
    greetingItems = []
    greetingItems.append(Greeting(message='hello world!'))
    greetingItems.append(Greeting(message='goodbye world!'))
    ...
    STORED_GREETINGS = GreetingCollection(items=greetingItems)

You should be able to just do:
    greeting_collection = GreetingCollection()
    greeting_collection.items = list_of_greetings
Or, alternatively:
    greeting_collection = GreetingCollection()
    greeting_collection.items.extend(iterable_of_greetings)

Is it OK to send the whole POST as a JSON object?

I am using GAE with python, and I am using many forms. Usually, my code looks something like this:
    class Handler(BaseHandler):
        #...
        def post(self):
            name = self.request.get("name")
            last_name = self.request.get("last_name")
            # More variables...
            n = self.request.get("n")
            #Do something with the variables, validations, etc.
            #Add them to a dictionary
            data = dict(name=name, last_name=last_name, n=n)
            info = testdb.Test(**data)
            info.put()
I have noticed lately that it gets too long when there are many inputs in the form (variables), so I thought maybe I could send a stringified JSON object (which can be treated as a python dictionary using json.loads). Right now it looks like this:
    class Handler(BaseHandler):
        #...
        def post(self):
            data = validate_dict(json.loads(self.request.body))
            #Use a variable like this: data['last_name']
            test = testdb.Test(**data)
            test.put()
Which is a lot shorter. I am inclined to do things this way (and stop using self.request.get("something")), but I am worried I may be missing some disadvantage of doing this apart from the client needing javascript for it to even work. Is it OK to do this or is there something I should consider before rearranging my code?

There is absolutely nothing wrong with your short JSON-focused code variant (few web apps today bother supporting clients w/o Javascript anyway:-).
You'll just need to adapt the client-side code preparing that POST, from being just a traditional HTML form, to a JS-richer approach, of course. But, I'm pretty sure you're aware of that -- just spelling it out!-)
BTW, there is nothing here that's App Engine - specific: the same considerations would apply no matter how you chose to deploy your server.
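The question's short variant relies on a validate_dict helper that is not shown. As a rough sketch of what such a helper might do before the dictionary is unpacked into a model, the version below accepts only a fixed set of keys and checks their types. The ALLOWED_FIELDS schema and the sample body are assumptions made for illustration, not the asker's actual code.

```python
import json

# Hypothetical schema: the only keys the handler accepts, with expected types.
ALLOWED_FIELDS = {"name": str, "last_name": str, "n": int}


def validate_dict(raw):
    """Return a cleaned copy of raw, or raise ValueError if it is not acceptable."""
    if not isinstance(raw, dict):
        raise ValueError("expected a JSON object")
    unknown = set(raw) - set(ALLOWED_FIELDS)
    if unknown:
        raise ValueError("unexpected fields: %s" % sorted(unknown))
    clean = {}
    for field, expected_type in ALLOWED_FIELDS.items():
        if field not in raw:
            raise ValueError("missing field: %s" % field)
        if not isinstance(raw[field], expected_type):
            raise ValueError("wrong type for field: %s" % field)
        clean[field] = raw[field]
    return clean


# Stand-in for self.request.body in the handler.
body = '{"name": "Ada", "last_name": "Lovelace", "n": 3}'
data = validate_dict(json.loads(body))
print(data)
```

Rejecting unknown keys matters here because the dictionary is passed as keyword arguments to the model constructor, so whatever the client sends would otherwise end up as model fields.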

Making google analytics ID a variable

My app serves multiple domains, which I understand should be handled with namespaces, something I'm still researching. Since each domain should have its own analytics ID, I currently set the analytics ID in code, but I want to make it more configurable:
    if os.environ.get('HTTP_HOST').endswith('.br') \
            or os.environ['SERVER_NAME'].endswith('.br'):
        data[u'analytics'] = 'UA-637933-12'
    else:
        data[u'analytics'] = 'UA-637933-18'
    self.response.out.write(template.render(os.path.join(os.path.dirname(__file__),
        'templates', name + '.html'), data))
The above sets the analytics ID to ...-12 if it's my Brazilian domain and to the other ID, ...-18, if it's my .com domain. But this only covers 2 domains and is not easily generalizable. How can I achieve this in a more systematic and scalable way, so that it becomes easy to add my application to a domain without manually adding the domain to my application?
I suppose namespaces are the way to go here, since the domains are Google Apps domains, but I don't understand how to use namespaces:
    def namespace_manager_default_namespace_for_request():
        """Determine which namespace is to be used for a request.

        The value of _NAMESPACE_PICKER has the following effects:

        If _USE_SERVER_NAME, we read server name
        foo.guestbook-isv.appspot.com and set the namespace.

        If _USE_GOOGLE_APPS_DOMAIN, we allow the namespace manager to infer
        the namespace from the request.

        If _USE_COOKIE, then the ISV might have a gateway page that sets a
        cookie called 'namespace', and we set the namespace to the cookie's value
        """
        name = None
        if _NAMESPACE_PICKER == _USE_SERVER_NAME:
            name = os.environ['SERVER_NAME']
        elif _NAMESPACE_PICKER == _USE_GOOGLE_APPS_DOMAIN:
            name = namespace_manager.google_apps_namespace()
        elif _NAMESPACE_PICKER == _USE_COOKIE:
            cookies = os.environ.get('HTTP_COOKIE', None)
            if cookies:
                name = Cookie.BaseCookie(cookies).get('namespace')
        return name
I suppose I should use the namespace manager, get the namespace and set the analytics ID according to the namespace but how?
Thank you

The simplest way to do this is with a Python dict:
    analytics_ids = {
        'mydomain.br': 'UA-637933-12',
        'mydomain.com': 'UA-637933-18',
    }
    data['analytics'] = analytics_ids[self.request.host]
If you have other per-domain stats, you may want to make each dictionary entry a tuple, a nested dict, or a configuration object of some sort, then fetch and store it against the current request for easy reference.
If you want to be able to reconfigure this at runtime, you could use a datastore model, but that will impose extra latency on requests that need to fetch it; it seems to me that redeploying each time you add a domain is unlikely to be a problem in your case.
Namespaces are tangential to what you're doing. They're a good way to divide up the rest of your data between different domains, but they're not useful for dividing up configuration data.
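As a sketch of the nested-dict variant mentioned above, the lookup below keeps one configuration dict per domain and falls back to a default for unknown hosts. The names DOMAIN_CONFIG, DEFAULT_CONFIG and config_for_host are hypothetical, and using the .com ID as the default is an assumption that mirrors the else branch in the question.

```python
# Assumed default: the ID used by the question's "else" branch.
DEFAULT_CONFIG = {"analytics": "UA-637933-18"}

# One configuration dict per domain; more per-domain settings can be added here.
DOMAIN_CONFIG = {
    "mydomain.br": {"analytics": "UA-637933-12"},
    "mydomain.com": {"analytics": "UA-637933-18"},
}


def config_for_host(host):
    """Return the configuration for a request host, ignoring port and case."""
    hostname = host.split(":")[0].lower()
    return DOMAIN_CONFIG.get(hostname, DEFAULT_CONFIG)


print(config_for_host("mydomain.br:8080")["analytics"])
print(config_for_host("unknown.example")["analytics"])
```

In the handler this would replace the direct dictionary lookup, for example data['analytics'] = config_for_host(self.request.host)['analytics'], so that an unlisted host does not raise a KeyError.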

I presume you have two instances of the same application running.
Instead of fiddling with namespaces, I suggest you turn the Analytics ID into a configuration variable.
That is, either store it in a config file or in a database your web app is using. Then set one ID for each deployment (in each place your web app is running from) and fetch it at runtime.
For example:
Config file:
    analyticsId="UA-637933-12"
Code:
    data[u'analytics'] = getValueFromConfig("analyticsId")
where getValueFromConfig is a function you define to read the appropriate value. (To use configuration files effortlessly, you may use the ConfigParser module.)
Now you've gained a lot more flexibility: you don't have to do any checking and switching at runtime. You only have to define the value once per website and be done with it.
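A minimal sketch of such a helper, using the standard library's configparser module (named ConfigParser in Python 2), could look like this. It makes a few assumptions that are not in the original answer: the file needs a section header, here called [analytics]; the value is written without quotes because the parser does not strip them; and the function name get_value_from_config and the inline sample text are illustrative only.

```python
import configparser

# Stand-in for the contents of a config file on disk.
SAMPLE_CONFIG = """
[analytics]
analyticsId = UA-637933-12
"""


def get_value_from_config(config_text, option, section="analytics"):
    """Parse INI-style text and return one option from the given section."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Option names are matched case-insensitively by default.
    return parser.get(section, option)


analytics_id = get_value_from_config(SAMPLE_CONFIG, "analyticsId")
print(analytics_id)
```

With a real file you would call parser.read with the file path instead of read_string, and each deployment would ship its own copy of the file.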
