Documentation Automation
The Goal
The goal of this project is to create a process that automates the documentation of ETL processes.
Starting Point
The content of PDI files (jobs and transformations) and Mondrian files is in XML format. These files can be parsed to extract the text between defined tags, and that text can then be used for documentation. This should make it possible, for example, to get the names of all transformations included in a job, all the dimensions referred to in a cube, etc. Note that a similar project has been specified to also look at documenting and analysing files.
Implementation Ideas
To parse XML files, we can try using XSLT. Useful links:
- http://www.w3schools.com/xsl/
- http://www.w3schools.com/xsl/tryxslt.asp?xmlfile=cdcatalog&xsltfile=cdcatalog
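The parsing idea described under Starting Point can also be sketched outside XSLT. The following is a minimal illustration using Python's standard xml.etree.ElementTree module rather than the XSLT approach proposed here. The sample documents are hypothetical and heavily trimmed, and the tag names (entry, type, Cube, Dimension, DimensionUsage) are assumptions about how the files are laid out, so check them against your own files.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed job file; real .kjb files contain many more tags.
SAMPLE_JOB = """<job>
  <name>load_warehouse</name>
  <entries>
    <entry><name>Start</name><type>SPECIAL</type></entry>
    <entry><name>Load customers</name><type>TRANS</type></entry>
    <entry><name>Load orders</name><type>TRANS</type></entry>
  </entries>
</job>"""

# Hypothetical, heavily trimmed Mondrian schema.
SAMPLE_SCHEMA = """<Schema name="Demo">
  <Dimension name="Time"/>
  <Cube name="Sales">
    <DimensionUsage name="Time" source="Time"/>
    <Dimension name="Customer"/>
  </Cube>
</Schema>"""


def transformation_names(job_xml):
    """Return the names of all job entries whose type is TRANS (assumed tag layout)."""
    root = ET.fromstring(job_xml)
    return [
        entry.findtext("name")
        for entry in root.iter("entry")
        if entry.findtext("type") == "TRANS"
    ]


def cube_dimensions(schema_xml):
    """Map each cube name to the dimensions declared directly inside that cube."""
    root = ET.fromstring(schema_xml)
    result = {}
    for cube in root.iter("Cube"):
        result[cube.get("name")] = [
            child.get("name")
            for child in cube
            if child.tag in ("Dimension", "DimensionUsage")
        ]
    return result


if __name__ == "__main__":
    print(transformation_names(SAMPLE_JOB))
    print(cube_dimensions(SAMPLE_SCHEMA))
```

An XSLT template would express the same selections as XPath expressions and render them as HTML; the sketch only shows which pieces of the XML would feed the documentation.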
First Results
The XSL templates are attached to this page. The following steps will allow you to create the documentation:
- Download the templates and save them in the same folder as the files that you want to document.
- Create a copy of the files that should be documented.
- Insert the following code after <?xml version="1.0" encoding="UTF-8"?> into the copied files:
  - .kjb files: <?xml-stylesheet type="text/xsl" href="documentationjob.xsl"?>
  - .ktr files: <?xml-stylesheet type="text/xsl" href="documentationtrans.xsl"?>
  - .mondrian.xml files: <?xml-stylesheet type="text/xsl" href="documentationcube.xsl"?>
- Open the copied and edited files in a web browser to see a formatted description of those files.
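The copy-and-insert steps above could be scripted instead of done by hand. The following is a minimal sketch in Python, not part of the attached templates. The stylesheet names come from the steps above; the helper names (stylesheet_for, add_stylesheet, make_documented_copy) and the choice to write the copies into a separate folder are assumptions made for illustration.

```python
import re
from pathlib import Path

# Stylesheet names as listed in the steps; matching by file suffix is an assumption.
STYLESHEETS = {
    ".kjb": "documentationjob.xsl",
    ".ktr": "documentationtrans.xsl",
    ".mondrian.xml": "documentationcube.xsl",
}

# Matches an XML declaration such as <?xml version="1.0" encoding="UTF-8"?>
XML_DECLARATION = re.compile(r"^\s*<\?xml\s[^>]*\?>")


def stylesheet_for(filename):
    """Return the stylesheet for a file name, or None if the suffix is unknown."""
    for suffix, stylesheet in STYLESHEETS.items():
        if filename.endswith(suffix):
            return stylesheet
    return None


def add_stylesheet(xml_text, stylesheet):
    """Insert the xml-stylesheet instruction right after the XML declaration."""
    instruction = '<?xml-stylesheet type="text/xsl" href="%s"?>' % stylesheet
    match = XML_DECLARATION.match(xml_text)
    if match is None:
        # No declaration found: put the instruction at the very top instead.
        return instruction + "\n" + xml_text
    return xml_text[:match.end()] + "\n" + instruction + xml_text[match.end():]


def make_documented_copy(source, target_dir):
    """Write an edited copy of source into target_dir; the original is untouched."""
    source = Path(source)
    stylesheet = stylesheet_for(source.name)
    if stylesheet is None:
        raise ValueError("no stylesheet known for %s" % source.name)
    target = Path(target_dir) / source.name
    target.parent.mkdir(parents=True, exist_ok=True)
    text = source.read_text(encoding="utf-8")
    target.write_text(add_stylesheet(text, stylesheet), encoding="utf-8")
    return target
```

Because the href values are relative, the templates would need to sit in the same folder as the generated copies for the browser to find them.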
on 21/01/2010 at 19:55