can't get page title from notion using api - python

I'm using notion.py and I'm new to Python. I want to get a page's title from one page and post it to another page, but when I try I get this error:
Traceback (most recent call last):
File "auto_notion_read.py", line 16, in <module>
page_read = client.get_block(list_url_read)
File "/home/lotfi/.local/lib/python3.6/site-packages/notion/client.py", line 169, in get_block
block = self.get_record_data("block", block_id, force_refresh=force_refresh)
File "/home/lotfi/.local/lib/python3.6/site-packages/notion/client.py", line 162, in get_record_data
return self._store.get(table, id, force_refresh=force_refresh)
File "/home/lotfi/.local/lib/python3.6/site-packages/notion/store.py", line 184, in get
self.call_load_page_chunk(id)
File "/home/lotfi/.local/lib/python3.6/site-packages/notion/store.py", line 286, in call_load_page_chunk
recordmap = self._client.post("loadPageChunk", data).json()["recordMap"]
File "/home/lotfi/.local/lib/python3.6/site-packages/notion/client.py", line 262, in post
"message", "There was an error (400) submitting the request."
requests.exceptions.HTTPError: Invalid input.
The code I'm using is:
from notion.client import NotionClient
import time

token_v2 = "my page token"
client = NotionClient(token_v2=token_v2)

list_url_read = 'the url of the page to read'
page_read = client.get_block(list_url_read)

list_url_post = 'the url of the page to post to'
page_post = client.get_block(list_url_post)

print(page_read.title)
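For context, once get_block works, the posting step I have in mind would be something like this (the TextBlock usage is my assumption based on notion-py's README, not code I have working yet):
from notion.block import TextBlock
# append the title of the read page as a new text block on the other page
page_post.children.add_new(TextBlock, title=page_read.title)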

It isn't recommended to edit the source code of your dependencies, as you will almost certainly cause conflicts when updating them in the future.
Fix PR 294 has been open since the 6th of March 2021 and has not been merged.
To fix this issue with the currently open PR (pull request) on GitHub, do the following:
pip uninstall notion
Then either:
pip install git+https://github.com/jamalex/notion-py.git#refs/pull/294/merge
OR in your requirements.txt add
git+https://github.com/jamalex/notion-py.git#refs/pull/294/merge
Source for PR 294 fix

You can find the fix here
In a nutshell you need to modify two files in the library itself:
store.py
client.py
Find the "limit" value and change it to 100 in both.
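For reference, the change amounts to something like the following sketch of the payload built in store.py's call_load_page_chunk (field names may differ slightly between library versions, and client.py contains a similar payload):
# inside the library, the loadPageChunk request body looks roughly like this
data = {
    "pageId": id,
    "limit": 100,  # previously a much larger value that the API now rejects
    "cursor": {"stack": []},
    "chunkNumber": 0,
    "verticalColumns": False,
}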

Related

Snscrape sntwitter Error while using query and TwitterSearchScraper: "4 requests to ... failed giving up"

Up until this morning I was successfully using snscrape to scrape Twitter tweets via Python.
The code looks like this:
import snscrape.modules.twitter as sntwitter
query = "from:annewilltalk"
for i, tweet in enumerate(sntwitter.TwitterSearchScraper(query).get_items()):
    # do stuff
But this morning without any changes I got the error:
Traceback (most recent call last):
File "twitter.py", line 568, in <module>
Scraper = TwitterScraper()
File "twitter.py", line 66, in __init__
self.get_tweets_talkshow(username = "annewilltalk")
File "twitter.py", line 271, in get_tweets_talkshow
for i,tweet in enumerate(sntwitter.TwitterSearchScraper(query).get_items()):
File "/home/pi/.local/lib/python3.8/site-packages/snscrape/modules/twitter.py", line 1455, in get_items
for obj in self._iter_api_data('https://api.twitter.com/2/search/adaptive.json', _TwitterAPIType.V2, params, paginationParams, cursor = self._cursor):
File "/home/pi/.local/lib/python3.8/site-packages/snscrape/modules/twitter.py", line 721, in _iter_api_data
obj = self._get_api_data(endpoint, apiType, reqParams)
File "/home/pi/.local/lib/python3.8/site-packages/snscrape/modules/twitter.py", line 691, in _get_api_data
r = self._get(endpoint, params = params, headers = self._apiHeaders, responseOkCallback = self._check_api_response)
File "/home/pi/.local/lib/python3.8/site-packages/snscrape/base.py", line 221, in _get
return self._request('GET', *args, **kwargs)
File "/home/pi/.local/lib/python3.8/site-packages/snscrape/base.py", line 217, in _request
raise ScraperException(msg)
snscrape.base.ScraperException: 4 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=from%3Aannewilltalk&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo failed, giving up.
I found online that the encoded URL must not exceed a length of 500 characters, and the one from the error message is about 800 characters long. Could that be the problem? Why did it change overnight?
How can I fix it?
Same problem here. It was working just fine and suddenly stopped. I get the same error.
This is the code I used:
import snscrape.modules.twitter as sntwitter
import pandas as pd
# Creating list to append tweet data to
tweets_list2 = []
# Using TwitterSearchScraper to scrape data and append tweets to list
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('xxxx since:2022-06-01 until:2022-06-30').get_items()):
    if i > 100000:
        break
    tweets_list2.append([tweet.date, tweet.id, tweet.content, tweet.user.username])
# Creating a dataframe from the tweets list above
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Datetime', 'Tweet Id', 'Text', 'Username'])
print(tweets_df2)
tweets_df2.to_csv('xxxxx.csv')
Twitter changed something that broke the snscrape requests several days back. It looks like they just reverted back to the configuration that worked about 15 minutes ago, though. If you are using the latest snscrape version, everything should be working for you now!
Here is a thread with more info:
https://github.com/JustAnotherArchivist/snscrape/issues/671
There was a lot of discussion in the snscrape GitHub issues, but none of the answers worked for me.
I now have a workaround using the shell client, since executing the queries via the terminal works for some reason.
So I execute the shell command via subprocess, save the result as a JSON file and load that JSON back into Python.
Example shell command: snscrape --jsonl --max-results 10 twitter-search "from:maybritillner" > test_twitter.json
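A minimal sketch of that workaround, assuming the snscrape CLI is on PATH (the query and file name are just the ones from the example command):
import json
import subprocess

query = "from:maybritillner"
outfile = "test_twitter.json"

# run the snscrape CLI and redirect its JSON-lines output to a file
subprocess.run(
    f'snscrape --jsonl --max-results 10 twitter-search "{query}" > {outfile}',
    shell=True,
    check=True,
)

# load the JSON-lines file back into Python (one JSON object per line)
with open(outfile, encoding="utf-8") as f:
    tweets = [json.loads(line) for line in f]
print(len(tweets), "tweets loaded")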

Error while updating Confluence page using Python

status = confluence.update_page(
    parent_id=None,
    page_id={con_pageid},
    title={con_title},
    body='Updated Page. You can use <strong>HTML tags</strong>!')
Using this code gives me the following error:
Traceback (most recent call last):
File "update2.py", line 24, in
status = confluence.update_page(
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/atlassian/confluence.py", line 1513, in update_page
if not always_update and body is not None and self.is_page_content_is_already_updated(page_id, body, title):
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/atlassian/confluence.py", line 1433, in is_page_content_is_already_updated
current_title = confluence_content.get("title", None)
AttributeError: 'str' object has no attribute 'get'
Does anyone have an idea how to update a Confluence page using Python? I've tried various solutions, including ones provided here, but none of them works for me.
I finally tweaked the code into something that works. Providing the whole snippet in case anyone needs it later.
import requests
from atlassian import Confluence

confluence = Confluence(
    url='https://confluence.<your domain>.com',
    token='CyberPunk EdgeRunners')

status = confluence.get_page_by_title(space='ABC', title='testPage')
print(status)

status = confluence.update_page(
    parent_id=<a number sequence>,
    page_id=<a number sequence>,
    title='testpage',
    body='<h1 id="WindowsSignatureSet2.4.131.3-2-HandoffinstructionstoOperations">A</h1>'
)
print(status)

How to download a dataset from The Humanitarian Data Exchange (hdx api python)

I don't quite understand how I can download the data from a dataset. I only get one file downloaded, but there are several of them. How can I solve this problem?
I am using the hdx api library. There is a small example in the documentation: a list of resources is returned to me and I use the download method, but only the first file from the list is downloaded, not all of them.
My code
from hdx.hdx_configuration import Configuration
from hdx.data.dataset import Dataset
Configuration.create(hdx_site='prod', user_agent='A_Quick_Example', hdx_read_only=True)
dataset = Dataset.read_from_hdx('novel-coronavirus-2019-ncov-cases')
resources = dataset.get_resources()
print(resources)
url, path = resources[0].download()
print('Resource URL %s downloaded to %s' % (url, path))
I tried different methods, but only this one turned out to work. It seems to be some kind of error in the loop, but I do not understand how to solve it.
Result
Resource URL https://data.humdata.org/hxlproxy/api/data-preview.csv?url=https%3A%2F%2Fraw.githubusercontent.com%2FCSSEGISandData%2FCOVID-19%2Fmaster%2Fcsse_covid_19_data%2Fcsse_covid_19_time_series%2Ftime_series_covid19_confirmed_global.csv&filename=time_series_covid19_confirmed_global.csv downloaded to C:\Users\tred1\AppData\Local\Temp\time_series_covid19_confirmed_global.csv.CSV
Forgot to add that I get a list of strings in which there is a download url value. The problem is probably in the loop.
When I use a for loop I get this:
for res in resources:
    print(res)
    res[0].download()
Traceback (most recent call last):
File "C:/Users/tred1/PycharmProjects/pythonProject2/HDXapi.py", line 31, in <module>
main()
File "C:/Users/tred1/PycharmProjects/pythonProject2/HDXapi.py", line 21, in main
res[0].download()
File "C:\Users\tred1\AppData\Local\Programs\Python\Python38\lib\collections\__init__.py", line 1010, in __getitem__
raise KeyError(key)
KeyError: 0
You can get the download link as follows:
dataset = Dataset.read_from_hdx('acled-conflict-data-for-africa-1997-lastyear')
lista_resources = dataset.get_resources()
dictio = lista_resources[1]
url = dictio['download_url']
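To download every file rather than just the first, the download() call from the question can be run over each resource in turn; a sketch based on the snippets above:
dataset = Dataset.read_from_hdx('novel-coronavirus-2019-ncov-cases')
for resource in dataset.get_resources():
    # each resource downloads to a temporary path and returns (url, path), as in the question
    url, path = resource.download()
    print('Resource URL %s downloaded to %s' % (url, path))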

running a simple pdblp code to extract BBG data

I am currently logged in to my BBG Anywhere (web login) on my Mac. So the first question is: would I still be able to extract data using tia (as I am not actually on my terminal)?
import pdblp
con = pdblp.BCon(debug=True, port=8194, timeout=5000)
con.start()
I got this error
pdblp.pdblp:WARNING:Message Received:
SessionStartupFailure = {
reason = {
source = "Session"
category = "IO_ERROR"
errorCode = 9
description = "Connection failed"
}
}
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "/Users/prasadkamath/anaconda2/envs/Pk36/lib/python3.6/site-packages/pdblp/pdblp.py", line 147, in start
raise ConnectionError('Could not start blpapi.Session')
ConnectionError: Could not start blpapi.Session
I am assuming that I need to be on the terminal to be able to extract data, but wanted to confirm that.
This is a duplicate of this issue here on SO. It is not an issue with pdblp per se, but with blpapi not finding a connection. You mention that you are logged in via the web, which only allows you to use the terminal (or Excel add-in) within the browser, but not outside of it, since this way of accessing Bloomberg lacks a data feed and an API. More details and alternatives can be found here.

django test file download - "ValueError: I/O operation on closed file"

I have code for a view which serves a file download, and it works fine in the browser. Now I am trying to write a test for it, using the internal django Client.get:
response = self.client.get("/compile-book/", {'id': book.id})
self.assertEqual(response.status_code, 200)
self.assertEquals(response.get('Content-Disposition'),
                  "attachment; filename=book.zip")
So far so good. Now I would like to test whether the downloaded file is the one I expect it to be. So I start by saying:
f = cStringIO.StringIO(response.content)
Now my test runner responds with:
Traceback (most recent call last):
File ".../tests.py", line 154, in test_download
f = cStringIO.StringIO(response.content)
File "/home/epub/projects/epub-env/lib/python2.7/site-packages/django/http/response.py", line 282, in content
self._consume_content()
File "/home/epub/projects/epub-env/lib/python2.7/site-packages/django/http/response.py", line 278, in _consume_content
self.content = b''.join(self.make_bytes(e) for e in self._container)
File "/home/epub/projects/epub-env/lib/python2.7/site-packages/django/http/response.py", line 278, in <genexpr>
self.content = b''.join(self.make_bytes(e) for e in self._container)
File "/usr/lib/python2.7/wsgiref/util.py", line 30, in next
data = self.filelike.read(self.blksize)
ValueError: I/O operation on closed file
Even when I do simply: self.assertIsNotNone(response.content) I get the same ValueError
The only topic on the entire internet (including the django docs) I could find about testing downloads was this Stack Overflow topic: Django Unit Test for testing a file download. Trying that solution led to these results. It is old and rare enough for me to open a new question.
Does anybody know how testing downloads is supposed to be handled in Django? (BTW, running Django 1.5 on Python 2.7.)
This works for us. We return rest_framework.response.Response but it should work with regular Django responses, as well.
import io
response = self.client.get(download_url, {'id': archive_id})
downloaded_file = io.BytesIO(b"".join(response.streaming_content))
Note:
streaming_content is only available on StreamingHttpResponse (the link below is for Django 1.10):
https://docs.djangoproject.com/en/1.10/ref/request-response/#django.http.StreamingHttpResponse.streaming_content
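From there, the test can compare the bytes against a known-good file (the fixture path below is just an assumption):
# downloaded_file comes from the snippet above; 'fixtures/book.zip' is a
# hypothetical known-good copy of the file the view should serve
with open('fixtures/book.zip', 'rb') as expected:
    self.assertEqual(downloaded_file.getvalue(), expected.read())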
I had some file download code and a corresponding test that worked with Django 1.4. The test failed when I upgraded to Django 1.5 (with the same ValueError: I/O operation on closed file error that you encountered).
I fixed it by changing my non-test code to use a StreamingHttpResponse instead of a standard HttpResponse. My test code used response.content, so I first migrated to CompatibleStreamingHttpResponse, then changed my test code to use response.streaming_content instead, which allowed me to drop CompatibleStreamingHttpResponse in favour of StreamingHttpResponse.
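A minimal sketch of that view-side change, assuming a helper that produces the zip's path (build_book_zip is hypothetical):
from django.http import StreamingHttpResponse

def compile_book(request):
    # build_book_zip is a hypothetical helper that assembles book.zip and returns its path
    zip_path = build_book_zip(request.GET['id'])
    response = StreamingHttpResponse(open(zip_path, 'rb'))
    response['Content-Disposition'] = 'attachment; filename=book.zip'
    return response

The test then reads response.streaming_content, as shown in the other answer, instead of response.content.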
