2009-10-16

Converting Word documents to PDF in CruiseControl.NET

Today I needed to automate the creation of PDF files from our client documentation, which is written in Microsoft Word format. Until now we have done it manually by opening each document and printing it to an installed PDF printer, which is quite tedious work.
After some searching I started to realize that there are many PDF creators around, and most of them use MS Word and print through a PDF printer (just as we used to do, but automated).
The problem is that I don't want to install printers or the MS Office package on our build server if I can avoid it. I also want the solution to be free and open source.
So finally I stumbled upon this article by Graham J. Williams, which uses OpenOffice.org to do the conversion and a custom macro so that the conversion can be called from the command line.
REM  *****  BASIC  *****

Sub ConvertWordToPDF(cFile)
    cURL = ConvertToURL(cFile)

    ' Open the document.
    ' Just blindly assume that the document is of a type that OOo will
    ' correctly recognize and open -- without specifying an import filter.
    oDoc = StarDesktop.loadComponentFromURL(cURL, "_blank", 0, Array(MakePropertyValue("Hidden", True), ))

    ' Replace the extension: keep everything up to and including the last period.
    cFile = Left(cFile, LastIndexOf(cFile, ".")) + "pdf"
    cURL = ConvertToURL(cFile)

    ' Save the document using a filter.
    oDoc.storeToURL(cURL, Array(MakePropertyValue("FilterName", "writer_pdf_Export"), ))

    oDoc.close(True)

End Sub

' Returns the 1-based position of the last occurrence of cMatch in cText,
' or 0 if cMatch does not occur at all.
Function LastIndexOf(cText As String, cMatch As String) As Integer
    lastIndex = 0
    pos = 0
    Do
        pos = InStr(pos + 1, cText, cMatch)
        If pos > 0 Then
            lastIndex = pos
        End If
    Loop While pos > 0
    LastIndexOf() = lastIndex
End Function

Function MakePropertyValue(Optional cName As String, Optional uValue) As com.sun.star.beans.PropertyValue
    Dim oPropertyValue As New com.sun.star.beans.PropertyValue
    If Not IsMissing(cName) Then
        oPropertyValue.Name = cName
    EndIf
    If Not IsMissing(uValue) Then
        oPropertyValue.Value = uValue
    EndIf
    MakePropertyValue() = oPropertyValue
End Function
I modified it a bit to allow correct handling of extensions.
To call it from a Windows command prompt, just enter
"c:\program files\OpenOffice.Org 3\Program\swriter.exe" -invisible "macro:///Standard.Module1.ConvertWordToPDF(c:\temp\My word document.doc)"
I incorporated it into our CruiseControl.NET NAnt scripts, so every solution that needs PDF conversion in its build chain can have it.
Great!
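For anyone wanting to do the same, here is a rough sketch of what such a NAnt target could look like. It is not our actual build script: the project, target and property names are made up for illustration, and the paths are simply the ones from the command line above. It also assumes the macro is stored in Standard.Module1 of the OpenOffice.org profile belonging to the account that runs the build.

```xml
<?xml version="1.0"?>
<project name="docs" default="convert-doc-to-pdf">
  <!-- Hypothetical property names; the paths are taken from the example command line. -->
  <property name="swriter.exe" value="c:\program files\OpenOffice.Org 3\Program\swriter.exe" />
  <property name="doc.file" value="c:\temp\My word document.doc" />

  <target name="convert-doc-to-pdf" description="Converts a Word document to PDF using the OpenOffice.org macro">
    <exec program="${swriter.exe}" failonerror="true">
      <!-- Each arg is passed as a single argument, so values containing spaces need no manual quoting. -->
      <arg value="-invisible" />
      <arg value="macro:///Standard.Module1.ConvertWordToPDF(${doc.file})" />
    </exec>
  </target>
</project>
```

The PDF ends up next to the source document, since the macro only swaps the extension.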

2010-05-11 Update: Fixed a bug with filenames that contained more than one period (.) and added the LastIndexOf function.
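
To illustrate the fix, here is a small sketch that can be run from the OpenOffice.org Basic IDE. The Sub name and the file name are made up for the example, and it assumes the LastIndexOf function from the macro above lives in the same module.

```basic
Sub DemoExtensionHandling
    Dim cFile As String
    ' Hypothetical file name with more than one period.
    cFile = "c:\temp\My.word.document.doc"

    ' Keep everything up to and including the last period, then append "pdf".
    ' Only the final extension is replaced; the earlier periods are left alone.
    MsgBox Left(cFile, LastIndexOf(cFile, ".")) + "pdf"
End Sub
```

Note that a file name without any period makes LastIndexOf return 0, so the result would be just "pdf"; the macro assumes the input always has an extension.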