I'm part of a project team that created PPTX presentations to present to clients. After creating all of the files, we need to add additional slides to each presentation. All of the new slides will be the same across each presentation.
What is the best way to accomplish this programmatically?
I don't want to use VBA because (as far as I understand) I would have to open each presentation to run the script.
I've tried using the python-pptx library, but the documentation states:
"Copying a slide from one presentation to another turns out to be pretty hard to get right in the general case, so that probably won’t come until more of the backlog is burned down."
I was hoping something like the following would work -
from pptx import Presentation
main = Presentation('Universal.pptx')
abc = Presentation('Test1.pptx')
main_slides = main.slides.get(1)
abc_slides = abc.slides.get(1)
full = main.slides.add_slide(abc_slides[1])
full.save('Full.pptx')
Has anyone had success doing anything like that?
I was able to achieve this using Python and win32com.client. However, this doesn't run silently: it launches Microsoft PowerPoint, opens the input files one by one, then copies all slides from each input file and pastes them into the output file in a loop.
import win32com.client
from os import walk

def mergePresentations(inputFileNames, outputFileName):
    Application = win32com.client.Dispatch("PowerPoint.Application")
    outputPresentation = Application.Presentations.Add()
    outputPresentation.SaveAs(outputFileName)
    for file in inputFileNames:
        currentPresentation = Application.Presentations.Open(file)
        # Slides.Range expects an array of slide indexes, so pass a list rather than a range object
        currentPresentation.Slides.Range(list(range(1, currentPresentation.Slides.Count + 1))).Copy()
        # Activate the output presentation's window and paste, keeping the source formatting
        Application.Presentations(outputFileName).Windows(1).Activate()
        outputPresentation.Application.CommandBars.ExecuteMso("PasteSourceFormatting")
        currentPresentation.Close()
    outputPresentation.Save()
    outputPresentation.Close()
    Application.Quit()
# Example; let's say you have a folder of presentations that need to be merged
# into a new file named "allSlidesMerged.pptx" in the same folder
path, _, files = next(walk('C:\\Users\\..\\..\\myFolder'))
outputFileName = path + '\\' + 'allSlidesMerged.pptx'
inputFiles = []
for file in files:
    inputFiles.append(path + '\\' + file)
mergePresentations(inputFiles, outputFileName)
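If the visible PowerPoint window is the main annoyance, the COM Presentations.Open call also takes ReadOnly, Untitled and WithWindow arguments. A minimal, untested variation on the loop above (the paste step still relies on an active window for the output presentation, so this only suppresses windows for the input files):

# ReadOnly, Untitled, WithWindow - the final False opens the input file without a window
currentPresentation = Application.Presentations.Open(file, False, False, False)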
The GroupDocs.Merger REST API is another option for merging multiple PowerPoint presentations into a single document. It is a paid API but provides 150 free API calls per month.
Currently, it supports working with cloud providers: Amazon S3, Dropbox, Google Drive, Google Cloud Storage, Windows Azure Storage, and FTP storage, along with GroupDocs' internal cloud storage. However, in the near future it plans to support merging files from the request body (stream).
P.S: I'm developer evangelist at GroupDocs.
# For complete examples and data files, please go to https://github.com/groupdocs-merger-cloud/groupdocs-merger-cloud-python-samples
# Get Client ID and Client Secret from https://dashboard.groupdocs.cloud
import groupdocs_merger_cloud

client_id = "XXXX-XXXX-XXXX-XXXX"
client_secret = "XXXXXXXXXXXXXXXX"

documentApi = groupdocs_merger_cloud.DocumentApi.from_keys(client_id, client_secret)

item1 = groupdocs_merger_cloud.JoinItem()
item1.file_info = groupdocs_merger_cloud.FileInfo("four-slides.pptx")
item2 = groupdocs_merger_cloud.JoinItem()
item2.file_info = groupdocs_merger_cloud.FileInfo("one-slide.docx")

options = groupdocs_merger_cloud.JoinOptions()
options.join_items = [item1, item2]
options.output_path = "Output/joined.pptx"

result = documentApi.join(groupdocs_merger_cloud.JoinRequest(options))
A free tool called "powerpoint join" can help you.
Is there a way to refresh a Tableau embedded datasource using Python? I am currently using the Tableau Server Client library to refresh published datasources, which is working fine. Can someone help me figure out a way?
The way you reach them is kind of annoying, from my perspective.
You need to use the populate_connections() function to load embedded datasources. It is easier if you know the name of the workbook.
import tableauserverclient as TSC

# sign in using a personal access token
server = TSC.Server(server_address='server_name', use_server_version=True)
server.auth.sign_in_with_personal_access_token(auth_req=TSC.PersonalAccessTokenAuth(token_name='tokenName', personal_access_token='tokenValue', site_id='site_name'))

# use RequestOptions() with a filter to pull a specific workbook
def get_workbook(name):
    req_opt = TSC.RequestOptions()
    req_opt.filter.add(TSC.Filter(req_opt.Field.Name, req_opt.Operator.Equals, name))
    return server.workbooks.get(req_opt)[0][0]  # workbooks.get() returns a list you can iterate; here we assume it finds only one result

workbook = get_workbook(name='workbook_name')  # gets the workbook

server.workbooks.populate_connections(workbook)  # loads all the embedded datasources in the workbook

for datasource in workbook.connections:  # iterate over the datasource list
    # Note: each element of this list is not a TSC.DatasourceItem, so you need to load a valid one
    # using the "datasource_id" attribute of the element.
    # If you try server.datasources.refresh(datasource) directly, it will fail.
    ds = server.datasources.get_by_id(datasource.datasource_id)  # loads a valid TSC.DatasourceItem
    server.datasources.refresh(ds)  # finally, you will be able to refresh it
...
Best practice is not to embed datasources, but to publish them independently.
Update:
There is an easy way to achieve this. There are two types of extract tasks, Workbook and Data source. So, for embedded data sources, you need to perform a workbook refresh.
workbook = get_workbook(name='workbook_name')
server.workbooks.refresh(workbook.id)
You can use "tableauserverclient" Python package. You can pip install it from PyPy.
After installing it, you can consult the docs.
I will attach an example I used some time ago:
import tableauserverclient as TSC

tableau_auth = TSC.TableauAuth('user', 'pass', 'homepage')
server = TSC.Server('server')

with server.auth.sign_in(tableau_auth):
    all_datasources, pagination_item = server.datasources.get()
    print("\nThere are {} datasources on site:".format(pagination_item.total_available))
    print([datasource.name for datasource in all_datasources])
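If you also want to trigger the refresh rather than just list the datasources, the same package exposes a refresh call. A minimal sketch, assuming a reasonably recent tableauserverclient version, placed inside the same sign-in block:

    for datasource in all_datasources:
        server.datasources.refresh(datasource)  # queues an extract refresh job on the server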
TL;DR version - I need to programmatically add a password to .docx/.xlsx/.pptx files using LibreOffice, and it doesn't work. No errors are reported back either; my request to add a password is simply ignored, and a password-less version of the same file is saved.
In-depth:
I'm trying to script the ability to password-protect existing .docx/.xlsx/.pptx files using LibreOffice.
I'm using 64-bit LibreOffice 6.2.5.2 which is the latest version at the time of writing, on Windows 8.1 64-bit Professional.
Whilst I can do this manually via the UI - specifically, I open the "plain" document, do "Save As", tick "Save with Password", and enter the password there - I cannot get this to work via any kind of automation. I've been trying via Python/Uno, but to no avail. Although the code below correctly opens and saves the document, my attempt to add a password is completely ignored. Curiously, the file size shrinks from 12kb to 9kb when I do this.
Here is my code:
import socket
import uno
import sys
localContext = uno.getComponentContext()
resolver = localContext.ServiceManager.createInstanceWithContext("com.sun.star.bridge.UnoUrlResolver", localContext)
ctx = resolver.resolve( "uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext" )
smgr = ctx.ServiceManager
desktop = smgr.createInstanceWithContext( "com.sun.star.frame.Desktop",ctx)
from com.sun.star.beans import PropertyValue
properties=[]
oDocB = desktop.loadComponentFromURL ("file:///C:/Docs/PlainDoc.docx","_blank",0, tuple(properties) )
sp=[]
sp1=PropertyValue()
sp1.Name='FilterName'
sp1.Value='MS Word 2007 XML'
sp.append(sp1)
sp2=PropertyValue()
sp2.Name='Password'
sp2.Value='secret'
sp.append(sp2)
oDocB.storeToURL("file:///C:/Docs/PasswordDoc.docx",sp)
oDocB.dispose()
I've had great results using Python/Uno to open password-protected files, but I cannot get it to protect a previously unprotected document. I've tried enabling the macro recorder and recording my actions - it recorded the following LibreOffice BASIC code:
sub SaveDoc
rem ----------------------------------------------------------------------
rem define variables
dim document as object
dim dispatcher as object
rem ----------------------------------------------------------------------
rem get access to the document
document = ThisComponent.CurrentController.Frame
dispatcher = createUnoService("com.sun.star.frame.DispatchHelper")
rem ----------------------------------------------------------------------
dim args1(2) as new com.sun.star.beans.PropertyValue
args1(0).Name = "URL"
args1(0).Value = "file:///C:/Docs/PasswordDoc.docx"
args1(1).Name = "FilterName"
args1(1).Value = "MS Word 2007 XML"
args1(2).Name = "EncryptionData"
args1(2).Value = Array(Array("OOXPassword","secret"))
dispatcher.executeDispatch(document, ".uno:SaveAs", "", 0, args1())
end sub
Even when I try to run that, it...saves an unprotected document, with no password encryption. I've even tried converting the macro above into the equivalent Python code, but to no avail either. I don't get any errors, it simply doesn't protect the document.
Finally, out of desperation, I've even tried other approaches that don't include LibreOffice, for example, using the Apache POI library as per the following existing StackOverflow question:
Python or LibreOffice Save xlsx file encrypted with password
...but I just get an error saying "Error: Could not find or load main class org.python.util.jython". I've tried upgrading my JDK and tweaking the paths used in the example, i.e. I've had an "intelligent" go, but still no joy. I suspect the error above is trivial to fix, but I'm not a Java developer and lack experience in this area.
Does anyone have any solution? Do you have some LibreOffice code that can do this (password-protect .docx/.xlsx/.pptx files)? Or OpenOffice for that matter, I'm not precious about which package I use. Or something else entirely!
NOTE: I appreciate this is trivial using full-fat Microsoft Office, but thanks to Microsoft's licensing restrictions, that's a complete no-go for this project - I have to use an alternative.
The following example is from page 40 (file page 56) of Useful Macro Information For OpenOffice.org by Andrew Pitonyak (http://www.pitonyak.org/AndrewMacro.odt). The document is directed at OpenOffice.org Basic but is generally applicable to LibreOffice as well. The example differs from the macro recorder version primarily in its use of the documented API rather than dispatch calls.
5.8.3. Save a document with a password
To save a document with a password, you must set the “Password” attribute.
Listing 5.19: Save a document using a password.
Sub SaveDocumentWithPassword
    Dim args(0) As New com.sun.star.beans.PropertyValue
    Dim sURL$
    args(0).Name = "Password"
    args(0).Value = "test"
    sURL = ConvertToURL("/andrew0/home/andy/test.odt")
    ThisComponent.storeToURL(sURL, args())
End Sub
The argument name is case sensitive, so “password” will not work.
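For reference, a rough Python/Uno equivalent of Listing 5.19 (an untested sketch that assumes a document object oDocB already loaded over a UNO bridge, as in the question's code):

from com.sun.star.beans import PropertyValue

arg = PropertyValue()
arg.Name = "Password"   # case-sensitive: "password" will not work
arg.Value = "test"
oDocB.storeToURL("file:///C:/Docs/PasswordDoc.odt", (arg,))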
I am trying to do a quick proof of concept for building a data processing pipeline in Python. To do this, I want to build a Google Cloud Function that will be triggered when certain .csv files are dropped into Cloud Storage.
I followed along with this Google Cloud Functions Python tutorial, and while the sample code does trigger the function to create some simple logs when a file is dropped, I am really stuck on what call I have to make to actually read the contents of the data. I tried to search for an SDK/API guidance document but have not been able to find it.
In case this is relevant, once I process the .csv, I want to be able to add some data that I extract from it into GCP's Pub/Sub.
The function does not actually receive the contents of the file, just some metadata about it.
You'll want to use the google-cloud-storage client. See the "Downloading Objects" guide for more details.
Putting that together with the tutorial you're using, you get a function like:
from google.cloud import storage

storage_client = storage.Client()

def hello_gcs_generic(data, context):
    # The event payload only carries metadata, so fetch the object itself
    bucket = storage_client.get_bucket(data['bucket'])
    blob = bucket.blob(data['name'])
    contents = blob.download_as_string()
    # Process the file contents, etc...
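Since the question also mentions pushing the extracted data into Pub/Sub, here is a minimal sketch of that step using the google-cloud-pubsub client (the project and topic names are placeholders, not taken from the question):

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path('my-project', 'my-topic')  # hypothetical names

def publish_row(row):
    # Pub/Sub message payloads are bytes, so encode the extracted data first
    publisher.publish(topic_path, data=row.encode('utf-8'))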
This is an alternative solution using pandas:
Cloud Function Code:
import pandas as pd

def GCSDataRead(event, context):
    bucketName = event['bucket']
    blobName = event['name']
    fileName = "gs://" + bucketName + "/" + blobName
    dataFrame = pd.read_csv(fileName, sep=",")
    print(dataFrame)
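Note that reading a gs:// path directly with pd.read_csv relies on the gcsfs package being available in the function's environment, so it needs to be listed in requirements.txt.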
I'm trying to create a Rails app that is a CMS for a client. The app currently has a Document model that uploads documents with Paperclip.
Separately, we're running a Python script that accesses the database, gets a bunch of information for a given event, creates a proposal Word document, and uploads it to the database under the correct event.
This all works, but the app does not recognize the document. How do I make a Python script that will correctly upload the document so that Paperclip knows what's going on?
Here is my paperclip controller:
def new
  @event = Event.find(params[:event_id])
  @document = Document.new
end

def create
  @event = Event.find(params[:event_id])
  @document = @event.documents.new(document_params)
  if @document.save
    redirect_to event_path(@event)
  end
end

private

def document_params
  params.require(:document).permit(:event_id, :data, :title)
end
Model
validates :title, presence: true
has_attached_file :data
validates_attachment_content_type :data, :content_type => ["application/pdf", "application/msword"]
Here is the python code.
f = open(propStr, 'r')
binary = psycopg2.Binary(f.read())
self.cur.execute("INSERT INTO documents (event_id, title, data_file_name, data_content_type) VALUES (%d,'Proposal.doc',%s,'application/msword');" % (self.eventData[0], binary))
self.con.commit()
You should probably use Ruby to script this since it can load in any model information or other classes you need.
But assuming your requirements dictate the use of python, be aware that Paperclip does not store the documents in your database tables, only the files' metadata. The actual file is stored in your file system in the /public dir by default (could also be s3, etc depending on your configuration). I would make sure you were actually saving the file to the correct anticipated directory. The default path according to the docs is:
:rails_root/public/system/:class/:attachment/:id_partition/:style/:filename
so you will have to make another SQL query to retrieve the id of your new record. I don't believe PDFs have a :style attribute, since you don't use ImageMagick to resize them, so build a path that looks something like this:
/public/system/documents/data/000/000/123/my_file.pdf
and save it from your python script.
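For illustration, a rough sketch of building that default path from Python. The directory layout follows the interpolation above, where :id_partition zero-pads the record id to nine digits and splits it into three directory levels; all names here are illustrative:

import os

def paperclip_path(rails_root, record_id, filename):
    # :id_partition, e.g. id 123 -> "000/000/123"
    padded = "%09d" % record_id
    partition = os.path.join(padded[0:3], padded[3:6], padded[6:9])
    return os.path.join(rails_root, "public", "system",
                        "documents", "data", partition, filename)

# paperclip_path("/var/www/app", 123, "my_file.pdf")
#   -> "/var/www/app/public/system/documents/data/000/000/123/my_file.pdf"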
I am converting code away from the deprecated files api.
I have the following code that works fine in the SDK server but fails in production. Is what I am doing even correct? If yes what could be wrong, any ideas how to troubleshoot it?
# Code earlier writes the file bs_file_name. This works fine because I can see the file
# in the Cloud Console.
bk = blobstore.create_gs_key("/gs" + bs_file_name)
assert(bk)
if not isinstance(bk, blobstore.BlobKey):
    bk = blobstore.BlobKey(bk)
assert isinstance(bk, blobstore.BlobKey)
# next line fails here in production only
assert(blobstore.get(bk)) # <----------- blobstore.get(bk) returns None
Unfortunately, as per the documentation, you can't get a BlobInfo object for GCS files.
https://developers.google.com/appengine/docs/python/blobstore/#Python_Using_the_Blobstore_API_with_Google_Cloud_Storage
Note: Once you obtain a blobKey for the GCS object, you can pass it around, serialize it, and otherwise use it interchangeably anywhere you can use a blobKey for objects stored in Blobstore. This allows for usage where an app stores some data in blobstore and some in GCS, but treats the data otherwise identically by the rest of the app. (However, BlobInfo objects are currently not available for GCS objects.)
I encountered this exact same issue today, and it feels very much like a bug in the Blobstore API when using Google Cloud Storage.
Rather than leveraging the Blobstore API, I made use of the Google Cloud Storage client library. The library can be downloaded here: https://developers.google.com/appengine/docs/python/googlecloudstorageclient/download
To access a file on GCS:
import cloudstorage as gcs

with gcs.open(GCSFileName) as f:
    blob_content = f.read()
    print blob_content
It sucks that GAE has different behaviours when using BlobInfo in local mode and in the production environment; it took me a while to find that out, but an easy solution is this:
You can use a BlobReader to access the data when you have the blob_key.
import logging

def getBlob(blob_key):
    logging.info('getting blob(' + blob_key + ')')
    with blobstore.BlobReader(blob_key) as f:
        # Read the blob in 1000-byte chunks and join them at the end
        data_list = []
        chunk = f.read(1000)
        while chunk != "":
            data_list.append(chunk)
            chunk = f.read(1000)
        data = "".join(data_list)
    return data
https://developers.google.com/appengine/docs/python/blobstore/blobreaderclass
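Tying this back to the question's code, the key from create_gs_key can be passed straight in (a sketch, reusing the question's bs_file_name):

data = getBlob(blobstore.create_gs_key("/gs" + bs_file_name))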