Upload file to CKAN with ckanapi (Python)

import ckanapi

try:
    ckan = ckanapi.RemoteCKAN(serverurl,
                              apikey='myapikeyhere',
                              user_agent='useragenthere')
    res = ckan.action.resource_create(
        package_id='2ad3c9de-502c-403a-8b03-bfc619697ff2',
        #url='url',
        #revision_id='revid',
        description='my first upload with CKANAPI',
        upload=open('./upload.csv', 'rb')
    )
except ckanapi.ValidationError as e:
    # ValidationError carries the field errors in error_dict
    raise Exception(str(e.error_dict))
It fails with:
Field errors: {u'url': [u'Missing value'], u'__type': u'Validation Error'}
They made url a required attribute in this discussion on GitHub:
https://github.com/ckan/ckan/pull/1641
So what is the expected value of the url attribute?
If it expects a URL to the local file, there isn't one: the file is not hosted anywhere.
And I cannot supply the URL of the file on CKAN, because the resource ID has not been created yet.
PS: When passing an arbitrary value for the url attribute, the upload succeeds.
It makes no sense to require the url attribute. Can anybody explain?

That's, in my opinion, a bug in CKAN. I've created an issue to track it at https://github.com/ckan/ckan/issues/2769. I've also written a pull request on ckanapi to abstract this bug away at https://github.com/ckan/ckanapi/pull/74.
As a workaround in the meantime, you can set the url to an empty string.
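A minimal sketch of that workaround, assuming `ckan` is a `ckanapi.RemoteCKAN` instance like the one in the question. The helper name `upload_resource` and its signature are made up for illustration; only the `url=""` argument is the workaround itself.

```python
def upload_resource(ckan, package_id, path, description=""):
    """Create a resource from a local file, passing an empty url placeholder.

    `ckan` is assumed to be a ckanapi.RemoteCKAN instance; this helper
    and its signature are hypothetical.
    """
    # Open in binary mode so the file bytes are sent unchanged.
    with open(path, "rb") as upload:
        return ckan.action.resource_create(
            package_id=package_id,
            url="",  # workaround: avoids the 'Missing value' validation error
            description=description,
            upload=upload,
        )
```

It would be called as `upload_resource(ckan, '2ad3c9de-502c-403a-8b03-bfc619697ff2', './upload.csv')`.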

Related

Office365-REST-Python-Client 401 on File Update

I finally got over the hurdle of uploading files into SharePoint which enabled me to answer my own question here:
Office365-REST-Python-Client Access Token issue
However, the whole point of my project was to add metadata to the files being uploaded so that they can be filtered. For the avoidance of doubt, I am talking about column information in SharePoint's Document Libraries.
Ideally, I would like to do this when I upload the files in the first place but my understanding of the rest API is that you have to upload first and then use a PUT request to update its metadata.
The link to the GitHub repository for Office365-REST-Python-Client:
https://github.com/vgrem/Office365-REST-Python-Client
This library seems to be the answer, but the closest I can find to documentation is the examples folder. Sadly, there is no example for updating file metadata. I think part of the reason is that the only option is to use a PUT request on a list item.
According to the REST API documentation, which this library is built on, an item's metadata must be operated on as part of a list.
REST API Documentation for file upload:
https://learn.microsoft.com/en-us/sharepoint/dev/sp-add-ins/working-with-folders-and-files-with-rest#working-with-files-by-using-rest
REST API Documentation for updating list metadata:
https://learn.microsoft.com/en-us/sharepoint/dev/sp-add-ins/working-with-lists-and-list-items-with-rest#update-list-item
There is an example for updating a list item:
https://github.com/vgrem/Office365-REST-Python-Client/blob/master/examples/sharepoint/listitems_operations_alt.py
but it returns a 401. If you look at my answer to my own question in the link at the top, you will see that I granted this app full control. So an unauthorized response has stopped me dead in my tracks, wondering what to do next.
So after all that, my question is:
How do I upload a file to a SharePoint Document Library and add metadata to its column information using Office365-REST-Python-Client?
Kind Regards
Rich

Upload endpoint request
url: http://site url/_api/web/GetFolderByServerRelativeUrl('/Shared Documents')/Files/Add(url='file name', overwrite=true)
method: POST
body: contents of binary file
headers:
    Authorization: "Bearer " + accessToken
    X-RequestDigest: form digest value
    content-type: "application/json;odata=verbose"
    content-length: length of post body
could be converted to the following Python example:
ctx = ClientContext(url, ctx_auth)
file_info = FileCreationInformation()
file_info.content = file_content
file_info.url = os.path.basename(path)
file_info.overwrite = True
target_file = ctx.web.get_folder_by_server_relative_url("Shared Documents").files.add(file_info)
ctx.execute_query()
Once the file is uploaded, its metadata could be set like this:
list_item = target_file.listitem_allfields # get associated list item
list_item.set_property("Title", "New title")
list_item.update()
ctx.execute_query()
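For reference, Microsoft's REST documentation describes the list item update as a POST carrying an `X-HTTP-Method: MERGE` header rather than a literal PUT. The sketch below only assembles the pieces of such a request with the standard library and sends nothing. The function name, the `entity_type` value and every argument are placeholders, not part of Office365-REST-Python-Client.

```python
import json


def build_item_update_request(site_url, list_title, item_id, fields,
                              entity_type, access_token, form_digest):
    """Assemble (but do not send) a list item MERGE request.

    All arguments are hypothetical placeholders supplied by the caller.
    """
    url = "{0}/_api/web/lists/GetByTitle('{1}')/items({2})".format(
        site_url.rstrip("/"), list_title, item_id)
    headers = {
        "Authorization": "Bearer " + access_token,
        "X-RequestDigest": form_digest,
        "X-HTTP-Method": "MERGE",  # update only the fields that are sent
        "IF-MATCH": "*",           # overwrite regardless of the item's etag
        "Accept": "application/json;odata=verbose",
        "Content-Type": "application/json;odata=verbose",
    }
    # The verbose OData format expects the list item entity type in __metadata.
    body = {"__metadata": {"type": entity_type}}
    body.update(fields)
    return {"method": "POST", "url": url,
            "headers": headers, "body": json.dumps(body)}
```

The returned dictionary can be handed to whatever HTTP client is in use; the entity type name depends on the list and has to be looked up for the target library.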

I'm glad I stumbled upon this post and Office365-REST-Python-Client in general. However, I'm currently stuck trying to update a file's metadata; I keep receiving:
'File' object has no attribute 'listitem_allfields'
Any help is greatly appreciated. Note, I also updated this module to v 2.3.1
Here's my code:
list_title = "Documents"
target_folder = ctx.web.lists.get_by_title(list_title).root_folder
target_file = target_folder.upload_file(filename, filecontents)
ctx.execute_query()
list_item = target_file.listitem_allfields
I've also tried:
library_root = ctx.web.get_folder_by_server_relative_url('Shared Documents')
file_info = FileCreationInformation()
file_info.overwrite = True
file_info.content = filecontent
file_info.url = filename
upload_file = library_root.files.add(file_info)
ctx.load(upload_file)
ctx.execute_query()
list_item = upload_file.listitem_allfields
I've also tried to get the uploaded file item directly with the same result:
target_folder = ctx.web.lists.get_by_title(list_title).root_folder
target_file = target_folder.upload_file(filename, filecontent)
ctx.execute_query()
uploaded_file = ctx.web.get_file_by_server_relative_url(target_file.serverRelativeUrl)
print(uploaded_file.__dict__)
list_item = uploaded_file.listitem_allfields
All variations return:
'File' object has no attribute 'listitem_allfields'
What am I missing? How do I add metadata to a new SPO file/list item uploaded via Python/Office365-REST-Python-Client?
Update:
The problem was I was looking for the wrong property of the uploaded file. The correct attribute is:
uploaded_file.listItemAllFields
Note the casing. Hopefully my question and answer will help someone else who is as unaware of attribute casing as I was.
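Since the property name evidently differs between the earlier answer and version 2.3.1, a small defensive lookup can try both spellings. This is only a sketch: `get_list_item` is a hypothetical helper, and the two names are the ones quoted in this thread, not an exhaustive list of what the library has used.

```python
def get_list_item(uploaded_file):
    """Return the list item associated with a file object.

    Tries the attribute spellings mentioned in this thread, newest first.
    """
    for name in ("listItemAllFields", "listitem_allfields"):
        item = getattr(uploaded_file, name, None)
        if item is not None:
            return item
    raise AttributeError(
        "file object has neither listItemAllFields nor listitem_allfields")
```

With that in place, `list_item = get_list_item(target_file)` works with either spelling.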

PRAW Errors updating to new Python Distro

So I am trying to create a bot that cross posts from a sub (r/pics) to (r/polpics) using a bit of code from u/GoldenSights. After upgrading to a new Python distribution I get a ton of errors and don't even know where to begin. Here is the traceback (formatting off, error lines bold):
Traceback (most recent call last):
  File "C:\Users\tonyc\AppData\Local\Programs\Python\Python36-32\Lib\site-packages\praw\subdump.py", line 84, in <module>
    r = praw.Reddit(USERAGENT)
  File "C:\Users\tonyc\AppData\Local\Programs\Python\Python36-32\lib\site-packages\praw\reddit.py", line 150, in __init__
    raise ClientException(required_message.format(attribute))
praw.exceptions.ClientException: Required configuration setting 'client_id' missing.
This setting can be provided in a praw.ini file, as a keyword argument to the `Reddit` class constructor, or as an environment variable.
This seems to be related to the USERAGENT setting. I don't think I have it configured correctly.
USERAGENT = ""
# This is a short description of what the bot does. For example
# "/u/GoldenSights' Newsletter bot"
SUBREDDIT = "pics"
# This is the sub or list of subs to scan for new posts.
# For a single sub, use "sub1".
# For multiple subs, use "sub1+sub2+sub3+...".
# For all use "all"
KEYWORDS = ["It looks like this post is about US Politics."]
# Any comment containing these words will be saved.
KEYDOMAINS = []
# If non-empty, linkposts must have these strings in their URL
This is the error line:
print('Logging in')
r = praw.Reddit(USERAGENT)  # <-- here, this is error line 84
r.set_oauth_app_info(APP_ID, APP_SECRET, APP_URI)
r.refresh_access_information(APP_REFRESH)
Also in reddit.py:
    raise ClientException(required_message.format(attribute))  # <-- error
praw.exceptions.ClientException: Required configuration setting 'client_id' missing.
This setting can be provided in a praw.ini file, as a keyword argument to the `Reddit` class constructor, or as an environment variable.

Firstly, you're going to want to have your API credentials stored externally in your praw.ini file. This makes things a lot more secure, and looks like it might go some way to fixing your issue. Here's what a completed praw.ini file looks like, including the useragent, so try to replicate this.
[DEFAULT]
# A boolean to indicate whether or not to check for package updates.
check_for_updates=True
# Object to kind mappings
comment_kind=t1
message_kind=t4
redditor_kind=t2
submission_kind=t3
subreddit_kind=t5
# The URL prefix for OAuth-related requests.
oauth_url=https://oauth.reddit.com
# The URL prefix for regular requests.
reddit_url=https://www.reddit.com
# The URL prefix for short URLs.
short_url=https://redd.it
[appname]
client_id=IE*******T14_w
client_secret=SW***********************CLY
password=******************
username=appname
user_agent=web:appname:1.0.0 (by /u/username)
Let me know how things go after you sort this out.
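With a file like this in place, the site section is selected by name, for example `praw.Reddit('appname')`. In PRAW 4 and later the first positional argument is a site name rather than a user agent, which would explain the error in the question. As a quick sanity check before involving PRAW at all, the sketch below uses only the standard library to report which of the settings are missing from a section. The function name and the choice of required keys are assumptions for illustration, not part of PRAW.

```python
import configparser

# Settings this sketch treats as required; the error message in the
# question names client_id, the other two are assumptions.
REQUIRED = ("client_id", "client_secret", "user_agent")


def missing_settings(ini_text, site_name):
    """Return the required keys that are absent or empty for site_name."""
    # Disable interpolation so values containing '%' are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section(site_name):
        return list(REQUIRED)
    section = parser[site_name]
    return [key for key in REQUIRED if not section.get(key)]
```

Calling `missing_settings(open('praw.ini').read(), 'appname')` should give an empty list once the file is complete.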

Default file in Tornado's StaticFileHandler

I have the following Application configuration:
settings = {
    'default_handler_class': BaseHandler
}
app = web.Application([
    (r'/', IndexHandler),
    (r'/ws', SocketHandler),
    (r'/js/(.*)', web.StaticFileHandler, {'path': 'assets/js', 'default_filename': 'templates/error.html'}),
    (r'/css/(.*)', web.StaticFileHandler, {'path': 'assets/css'}),
    (r'/images/(.*)', web.StaticFileHandler, {'path': 'assets/images'})
], **settings)
When I type in http://localhost:8888/js/d3.min.js the file is served, but when I misspell the file name and request http://localhost:8888/js/d3.mi.js, for example, I would like to get my default error page, which is located at templates/error.html. For a URL like http://localhost:8888/not/existing it works fine, but http://localhost:8888/js/d3.mi.js gives me just a plain 404: Not Found.
I found the following part in the documentation:
To serve a file like index.html automatically when a directory is
requested, set static_handler_args=dict(default_filename="index.html")
in your application settings, or add default_filename as an
initializer argument for your StaticFileHandler.
However, I can't understand where I should specify the mentioned code. The 'default_filename': 'templates/error.html' in my code doesn't work.

default_filename
The file specified in default_filename must be inside the given static path. So if you move error.html to the assets/js directory and then navigate to /js/, you will see the content of error.html.
Basically, this functionality is a helper with a limited use case (imho). More at https://stackoverflow.com/a/27891339/681044.
Custom error pages
Every request handler renders errors in its write_error method. Overriding it is the recommended way to create custom error pages:
import tornado.web

class MyStaticFileHandler(tornado.web.StaticFileHandler):
    def write_error(self, status_code, *args, **kwargs):
        # custom 404 page
        if status_code in [404]:
            self.render('templates/error.html')
        else:
            super().write_error(status_code, *args, **kwargs)

In fact, 'default_filename' works as intended in your code.
What does default_filename mean?
"default_filename" means that if you request a directory such as "http://localhost:1234/js/", the server will return a default file to you.
So be aware that the "default file" is not an error file; "default_filename" isn't what you need.
What do you need?
Writing a subclass of "StaticFileHandler" will solve it. The method "validate_absolute_path" of "StaticFileHandler" contains:
if not os.path.exists(absolute_path):
    raise HTTPError(404)
Don't raise the 404 there; just return the path of your error file (such as js/error.js).
Good luck!
My English is poor, so I don't know if you can follow this ^_^.
It's my pleasure to exchange experience with you.
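The suggestion above can be sketched without Tornado as a plain helper. In a `StaticFileHandler` subclass it would be called from an overridden `validate_absolute_path(self, root, absolute_path)` in place of raising `HTTPError(404)`. The helper name and the fallback path are hypothetical. Note also that a file returned this way is presumably served like any other static file, so the response status would no longer be 404.

```python
import os


def path_or_fallback(absolute_path, fallback_path):
    """Return absolute_path if it is an existing file, else fallback_path.

    Hypothetical helper; the caller decides what the fallback file is.
    """
    if os.path.isfile(absolute_path):
        return absolute_path
    return fallback_path
```

If keeping the 404 status matters, overriding write_error as shown in the other answer is the safer route.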

django-user-accounts package does not load the account/signup.html template

I am trying to use the django-user-accounts package from pinax. I am completely new to Django and have been struggling for hours, but I still cannot display the account/signup page.
So, I have the following line in my urls.py:
url(r"^account/", include("account.urls")),
Then I checked the urls.py of the account package, and it contains this line:
url(r"^signup/$", SignupView.as_view(), name="account_signup"),
So, when I enter the address 127.0.0.1:8000/account/signup/ in my browser, I think that Django should give back the SignupView. But I don't really know what the as_view() function does. Usually the second argument of url() should be a function that returns an HttpResponse. So I looked in the views.py of the account package: the class SignupView has an attribute
template_name = "account/signup.html"
I would expect the HttpResponse returned by SignupView.as_view() to use this template, but it doesn't. Instead, I got this error:
TypeError at /
'str' object is not callable
Request Method: GET
Request URL: http://127.0.0.1:8000/
Django Version: 1.6.1
Exception Type: TypeError
Exception Value: 'str' object is not callable
Exception Location: /usr/lib/python2.7/dist-packages/django/core/handlers/base.py in get_response, line 112
Python Executable: /usr/bin/python
Python Version: 2.7.6
Note that this is just the default behaviour of django-user-accounts; I have not modified anything. So I guess that I am missing a dependency or something, but I cannot interpret the error message. By the way, the returned error is exactly the same as when I enter the address 127.0.0.1:8000/ in the browser. There I expect to receive an error because I have no home page yet, but still, why does the SignupView try to get the HTML of the root page?
I am stuck here and I have no idea how I can try to debug this. Any hints would be more than welcome.

So, the problem has been solved thanks to a comment from Bilou06. It was actually caused by another line in the project's urls.py: the root of the project was pointing to an indexView that did not exist.

Pylons: response renaming? Is there a better way?

I've got a Pylons controller with an action called serialize that returns content_type=text/csv. I'd like the response to be named after the input parameter, i.e. for the following route the produced CSV file should be named {id}.csv: /app/PROD/serialize => PROD.csv (so a user can open the file in Excel with a proper name directly from a web browser)
map.connect('/app/{id}/serialize',controller = 'csvproducer',action='serialize')
I've tried to set different HTTP headers and properties of the webob's response object with no luck. However, I figured out a workaround by simply adding a new action to the controller and dynamically redirecting the original action to that new action, i.e.:
map.connect('/app/{id}/serialize',controller = 'csvproducer',action='serialize')
map.connect('/app/csv/{foo}',controller = 'csvproducer', action='tocsv')
The controller's snippet:
def serialize(self, id):
    try:
        session['key'] = self.service.serialize(id)  # produces csv content
        session.save()
        redirect_to(str("/app/csv/%s.csv" % id))
    except Exception as e:
        log.error(e)
        abort(503)

def tocsv(self):
    try:
        csv = session.pop('key')  # same key that serialize() stored
    except Exception as e:
        log.error(e)
        abort(503)
    if csv:
        response.content_type = 'text/csv'
        response.status_int = 200
        response.write(csv)
    else:
        abort(404)
The above setup works perfectly fine, however, is there a better/slicker/neater way of doing it? Ideally I wouldn't like to redirect the request; instead I'd like to either rename location or set content-disposition: attachment; filename='XXX.csv' [ unsuccessfully tried both :( ]
Am I missing something obvious here?
Cheers
UPDATE:
Thanks to ebo I've managed to fix the content-disposition header. I should read the W3C specs more carefully next time ;)

You should be able to set the content-disposition header on the response object.
If you have already tried that, it may not have worked because the HTTP standard says the filename must be quoted with double-quote marks, not single quotes.
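A minimal sketch of building the header value with double quotes, using only string formatting. The helper name is hypothetical. In the controller from the question the result would be assigned to something like `response.headers['Content-Disposition']`, which is an assumption about the setup rather than something shown in the thread.

```python
def content_disposition(filename):
    """Build an attachment Content-Disposition value for filename."""
    # Escape backslashes and double quotes so the quoted string stays valid.
    safe = filename.replace("\\", "\\\\").replace('"', '\\"')
    return 'attachment; filename="{0}"'.format(safe)
```

For the route in the question, `content_disposition('%s.csv' % id)` would produce the value for /app/PROD/serialize without any redirect.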
