Friday, March 2, 2012

Exploring SSIS Architecture and Execution History Through Scripting

My previous article on SSIS was focused on the architecture and the functioning of the product. I had also provided a few essential T-SQL scripts which provide certain ways for documenting the SSIS environment.
In this article I will focus more on the T-SQL scripting and the ways to reveal configuration, performance and architectural information through scripting.

Exploring SSIS’s Metadata Objects

SSIS Metadata Objects in sys.objects

Let’s start simple by exploring the metadata objects that are related to the SSIS installation.
If we look at the Integration Services metadata objects in SQL 2005 we will notice that the objects contain the ‘DTS’ phrase in their names. By executing the following script in SQL 2005 we will get all objects related to the SSIS metadata (notice that the script is executed in the MSDB context):
USE msdb ;SELECT  *FROM    sys.objectsWHERE   name LIKE '%dts%'
Later on, in the SQL 2008 and later we have objects containing the phrase ‘DTS’ as well as ‘SSIS’ in the names. Execute the following script to view the objects (again, in the context of the MSDB database):
USE msdb ;SELECT  *FROM    sys.objectsWHERE   name LIKE '%dts%'
        
OR name LIKE '%ssis%'
Why is this? In SQL Server 2005 you will find dbo.sysdtspackages and dbo. sysdtspackages90, which help SQL Server distinguish between Integration Services packages created in BIDS and legacy packages inherited and transferred from the old SQL Server 2000 DTS (Data Transformation Services).
In SQL Server 2008 and up we find dbo. sysdtspackages and dbo. sysssispackages, where the first table contains legacies, and the second – the BIDS packages with versions from 2005 and 2008.

SSIS Metadata Objects in other system tables

In SQL 2008 and up we have:
name - 2008
sysdtscategoriesOne row for each category description
sysdtspackagelogLegacy
sysdtspackagesLegacy
sysdtssteplogLegacy
sysdtstasklogLegacy
sysssislogOne row per entry generated by SSIS package at runtime (when the SQL Server log provider is used)
sysssispackagefoldersOne row for each folder in the SSIS structure
sysssispackagesOne row for each SSIS package
… and in SQL 2005 there is …
name - 2005
sysdtscategoriesOne row for each category description
sysdtslog90One row per entry generated by SSIS package at runtime
sysdtspackagefolders90One row for each folder in the SSIS structure
sysdtspackagelogLegacy
sysdtspackagesLegacy
sysdtspackages90One row for each SSIS package
sysdtssteplogLegacy
sysdtstasklogLegacy

Structure and contents of the SSIS packages

As we know, the SSIS packages are just structured XML files that contain all information needed for the package to carry out its tasks. In other words, the SSIS package itself contains the objects in the flows, the precedence, the connections and their configurations.
SSIS Packages may be saved on the file system, or in MSDB repository. In the case of the package being saved in MSDB, the package definition is saved in the packagedata column of the dbo.sysssispackages table (or in dbo.sysdtspackages90 in SQL Server 2005).
The column itself is of the image datatype, hence in order for us to retrieve the contents, we need to cast it as a VARBINARY(MAX) first, and then as a XML data type.
Depending on the security level of the package, however, it might not be very easy to explore the contents of the package definitions in MSDB; in case the package is encrypted, the package definition will begin with theEncryptedData tag.

Retrieving the definitions of the SSIS Packages

So, here is how to retrieve the definitions of the SSIS packages in MSDB:

In 2005:

SELECT     p.[name] AS [PackageName]
           
,[description] AS [PackageDescription]
           
,CASE [packagetype]
               
WHEN THEN 'Undefined'
               
WHEN THEN 'SQL Server Import and Export Wizard'
               
WHEN THEN 'DTS Designer in SQL Server 2000'
               
WHEN THEN 'SQL Server Replication'
               
WHEN THEN 'SSIS Designer'
               
WHEN THEN 'Maintenance Plan Designer or Wizard'
           
END     AS [PackageType]
           
,CASE [packageformat]
               
WHEN THEN 'SSIS 2005 version'
               
WHEN THEN 'SSIS 2008 version'
           
END AS [PackageFormat]
           
,p.[createdate]
           
,CAST(CAST(packagedata AS VARBINARY(MAX)) AS XMLPackageXML
          
FROM      [msdb].[dbo].[sysdtspackages90]     p

In 2008 and up:

SELECT     p.[name] AS [PackageName]
           
,[description] AS [PackageDescription]
           
,CASE [packagetype]
               
WHEN THEN 'Undefined'
               
WHEN THEN 'SQL Server Import and Export Wizard'
               
WHEN THEN 'DTS Designer in SQL Server 2000'
               
WHEN THEN 'SQL Server Replication'
               
WHEN THEN 'SSIS Designer'
               
WHEN THEN 'Maintenance Plan Designer or Wizard'
           
END     AS [PackageType]
           
,CASE [packageformat]
               
WHEN THEN 'SSIS 2005 version'
               
WHEN THEN 'SSIS 2008 version'
           
END AS [PackageFormat]
           
,p.[createdate]
           
,CAST(CAST(packagedata AS VARBINARY(MAX)) AS XMLPackageXML
          
FROM      [msdb].[dbo].[sysssispackages]     p
Now that we have the definition, what can we do with it? We can parse it and extract some useful data.

Extracting connection strings from an SSIS definition

Here is how to retrieve the data connection strings:

In SQL 2005:

;WITH XMLNAMESPACES ('www.microsoft.com/SqlServer/Dts' AS pNS1,
  
'www.microsoft.com/SqlServer/Dts' AS DTS-- declare XML namespacesSELECT c.name,
  
SSIS_XML.value('./pNS1:Property [@pNS1:Name="DelayValidation"][1]',
    
'varchar(100)'AS DelayValidation,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ObjectName"][1]',
    
'varchar(100)'AS ObjectName,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="Description"][1]',
    
'varchar(100)'AS Description,
   
SSIS_XML.value('pNS1:ObjectData[1]/pNS1:ConnectionManager[1]
/pNS1:Property[@pNS1:Name="Retain"][1]'
'varchar(MAX)'Retain,
   
SSIS_XML.value('pNS1:ObjectData[1]/pNS1:ConnectionManager[1]
/pNS1:Property[@pNS1:Name="ConnectionString"][1]'
'varchar(MAX)'ConnectionStringFROM  --SELECT    id ,
                    
CAST(CAST(packagedata AS VARBINARY(MAX)) AS XMLPackageXML
          
FROM      [msdb].[dbo].[sysdtspackages90]
        
PackageXML
        
CROSS APPLY PackageXML.nodes('/DTS:Executable/DTS:ConnectionManager'SSIS_XML (SSIS_XML )
        
INNER JOIN [msdb].[dbo].[sysdtspackages90] c ON PackageXML.id c.id

In SQL 2008 and up:

;WITH XMLNAMESPACES ('www.microsoft.com/SqlServer/Dts' AS pNS1,
  
'www.microsoft.com/SqlServer/Dts' AS DTS-- declare XML namespacesSELECT c.name,
  
SSIS_XML.value('./pNS1:Property [@pNS1:Name="DelayValidation"][1]',
    
'varchar(100)'AS DelayValidation,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ObjectName"][1]',
    
'varchar(100)'AS ObjectName,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="Description"][1]',
    
'varchar(100)'AS Description,
   
SSIS_XML.value('pNS1:ObjectData[1]/pNS1:ConnectionManager[1]
/pNS1:Property[@pNS1:Name="Retain"][1]'
'varchar(MAX)'Retain,
   
SSIS_XML.value('pNS1:ObjectData[1]/pNS1:ConnectionManager[1]
/pNS1:Property[@pNS1:Name="ConnectionString"][1]'
'varchar(MAX)'ConnectionStringFROM  --SELECT    id ,
                    
CAST(CAST(packagedata AS VARBINARY(MAX)) AS XMLPackageXML
          
FROM      [msdb].[dbo].[sysssispackages]
        
PackageXML
        
CROSS APPLY PackageXML.nodes('/DTS:Executable/DTS:ConnectionManager'SSIS_XML (SSIS_XML )
        
INNER JOIN [msdb].[dbo].[sysssispackages] c ON PackageXML.id c.id

Extracting connection strings from an SSIS definition

Here is how to retrieve the package configurations:

In SQL 2005:

;WITH XMLNAMESPACES ('www.microsoft.com/SqlServer/Dts' AS pNS1,
  
'www.microsoft.com/SqlServer/Dts' AS DTS-- declare XML namespacesSELECT c.name,
  
SSIS_XML.value('./pNS1:Property [@pNS1:Name="ConfigurationType"][1]',
    
'varchar(100)'AS ConfigurationType,
  
CASE CAST(SSIS_XML.value('./pNS1:Property[@pNS1:Name="ConfigurationType"][1]',
    
'varchar(100)'AS INT)
    
WHEN THEN 'Parent Package'
    
WHEN THEN 'XML File'
    
WHEN THEN 'Environmental Variable'
    
WHEN THEN 'Registry Entry'
    
WHEN THEN 'Parent Package via Environmental Variable'
    
WHEN THEN 'XML File via Environmental Variable'
    
WHEN THEN 'Registry Entry via Environmental Variable'
    
WHEN THEN 'SQL Server'
  
END AS ConfigurationTypeDesc,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ConfigurationVariable"][1]',
    
'varchar(100)'AS ConfigurationVariable,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ObjectName"][1]',
    
'varchar(100)'AS ConfigurationName,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ConfigurationString"][1]',
    
'varchar(100)'AS ConfigurationStringFROM  SELECT    id ,
                    
CAST(CAST(packagedata AS VARBINARY(MAX)) AS XMLPackageXML
          
FROM      [msdb].[dbo].[sysdtspackages90]
        
PackageXML
        
CROSS APPLY PackageXML.nodes('/DTS:Executable/DTS:Configuration'SSIS_XML (SSIS_XML )
        
INNER JOIN [msdb].[dbo].[sysdtspackages90] c ON PackageXML.id c.id

In SQL 2008 and up:

;WITH XMLNAMESPACES ('www.microsoft.com/SqlServer/Dts' AS pNS1,
  
'www.microsoft.com/SqlServer/Dts' AS DTS-- declare XML namespacesSELECT c.name,
  
SSIS_XML.value('./pNS1:Property [@pNS1:Name="ConfigurationType"][1]',
    
'varchar(100)'AS ConfigurationType,
  
CASE CAST(SSIS_XML.value('./pNS1:Property[@pNS1:Name="ConfigurationType"][1]',
    
'varchar(100)'AS INT)
    
WHEN THEN 'Parent Package'
    
WHEN THEN 'XML File'
    
WHEN THEN 'Environmental Variable'
    
WHEN THEN 'Registry Entry'
    
WHEN THEN 'Parent Package via Environmental Variable'
    
WHEN THEN 'XML File via Environmental Variable'
    
WHEN THEN 'Registry Entry via Environmental Variable'
    
WHEN THEN 'SQL Server'
  
END AS ConfigurationTypeDesc,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ConfigurationVariable"][1]',
    
'varchar(100)'AS ConfigurationVariable,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ObjectName"][1]',
    
'varchar(100)'AS ConfigurationName,
  
SSIS_XML.value('./pNS1:Property[@pNS1:Name="ConfigurationString"][1]',
    
'varchar(100)'AS ConfigurationStringFROM  SELECT    id ,
                    
CAST(CAST(packagedata AS VARBINARY(MAX)) AS XMLPackageXML
          
FROM      [msdb].[dbo].[sysssispackages]
        
PackageXML
        
CROSS APPLY PackageXML.nodes('/DTS:Executable/DTS:Configuration'SSIS_XML (SSIS_XML )
        
INNER JOIN [msdb].[dbo].[sysssispackages] c ON PackageXML.id c.id
There are many other aspects to be explored in the definitions of the SSIS packages, and it is all matter of finding the node names and parsing them.
In the remaining part of this article, I would like to shift our attention to these areas: the interaction between SQL Agent and the SSIS packages and some scripts to gather performance statistics.

Overriding the package internal configurations

The SSIS packages can be executed in several ways: as a scheduled job from the SQL Server Agent or from the command line (or even from a batch file).
Regardless of which method is used for the execution, it is always the DTExec.exe who carries the task.
Before executing the SSIS package, the SQL Server Agent or the command line script have to form an execution string and pass parameters to the DTExec, and thus control the execution of the package.
Here is a script which shows all SQL Agent jobs steps which execute SSIS packages and the custom configurations provided through the SQL Agent job:
USE [msdb]
GO
SELECT j.job_id,
       
s.srvname,
       
j.name,
       
js.subsystem,
       
js.step_id,
       
js.command,
       
j.enabled,
       
js.output_file_name,
       
js.last_run_outcome,
       
js.last_run_duration,
       
js.last_run_retries,
       
js.last_run_date,
       
js.last_run_time,
        
js.proxy_id FROM   dbo.sysjobs jJOIN   dbo.sysjobsteps js
   
ON  js.job_id j.job_id JOIN   MASTER.dbo.sysservers s
   
ON  s.srvid j.originating_server_id--filter only the job steps which are executing SSIS packages WHERE  subsystem 'SSIS'--use the line below to enter some search criteria
--AND js.command LIKE N'%ENTER_SEARCH%'
GO
As you noticed, you can even use the script above to filter and search through the configurations of the SQL Agent Jobs. For example, you can search for all jobs which are executing encrypted SSIS packages by using …
AND    js.command LIKE N'%/DECRYPT%'
…as a search criteria in the above script. You may also want to search for a server name, for example.

Exploring execution history

Finally, let’s look into some execution history of the SSIS packages which are scheduled as SQL Server Agent jobs.
The following script will return all SQL Server Agent Jobs, which are currently (as of the moment of the execution of the script) executing SSIS packages and also the last execution time and duration, as well as the execution command.
SET NOCOUNT ON-- Check if the SQL Server Agent is runningIF EXISTS ( SELECT  1
            
FROM    MASTER.dbo.sysprocesses
            
WHERE   program_name N'SQLAgent - Generic Refresher' )
    
BEGIN
        SELECT  
@@SERVERNAME AS 'InstanceName' ,
                
AS 'SQLServerAgentRunning'
    
END
ELSE
    BEGIN
        SELECT  
@@SERVERNAME AS 'InstanceName' ,
                
AS 'SQLServerAgentRunning'             
        
RAISERROR('The SQL Server Agent is not running.'161WITH SETERROR;             
    
END-- Execute the scriptIF EXISTS ( SELECT  *
            
FROM    tempdb.dbo.sysobjects
            
WHERE   id OBJECT_ID(N'[tempdb].[dbo].[Temp1]') )
    
DROP TABLE [tempdb].[dbo].[Temp1]
GO
CREATE TABLE [tempdb].[dbo].[Temp1]
    
(
      
job_id UNIQUEIDENTIFIER NOT NULL ,
      
last_run_date NVARCHAR(20) NOT NULL ,
      
last_run_time NVARCHAR(20) NOT NULL ,
      
next_run_date NVARCHAR(20) NOT NULL ,
      
next_run_time NVARCHAR(20) NOT NULL ,
      
next_run_schedule_id INT NOT NULL ,
      
requested_to_run INT NOT NULL ,
      
request_source INT NOT NULL ,
      
request_source_id SYSNAME COLLATE database_default
                                
NULL ,
      
running INT NOT NULL ,
      
current_step INT NOT NULL ,
      
current_retry_attempt INT NOT NULL ,
      
job_state INT NOT NULL
    )
DECLARE @job_owner SYSNAME
DECLARE 
@is_sysadmin INT
SET 
@is_sysadmin ISNULL(IS_SRVROLEMEMBER('sysadmin'), 0)SET @job_owner SUSER_SNAME()INSERT  INTO [tempdb].[dbo].[Temp1]
        
EXECUTE MASTER.dbo.xp_sqlagent_enum_jobs @is_sysadmin@job_owner
        
UPDATE  [tempdb].[dbo].[Temp1]SET     last_run_time RIGHT('000000' last_run_time6) ,
        
next_run_time RIGHT('000000' next_run_time6) ;-----SELECT  j.name AS JobName ,
        
j.enabled AS Enabled ,
        
CASE x.running
          
WHEN THEN 'Running'
          
ELSE CASE h.run_status
                 
WHEN THEN 'Inactive'
                 
WHEN THEN 'Inactive'
                 
ELSE 'Completed'
               
END
        END AS 
CurrentStatus ,
        
COALESCE(x.current_step0AS CurrentStepNbr ,
        
CASE x.running
          
WHEN THEN js.step_name
          
ELSE NULL
        
END AS CurrentStepName ,
        
CASE WHEN x.last_run_date 0
             
THEN CONVERT (DATETIMESUBSTRING(x.last_run_date14) + '-'
                  
SUBSTRING(x.last_run_date52) + '-'
                  
SUBSTRING(x.last_run_date72) + ' '
                  
SUBSTRING(x.last_run_time12) + ':'
                  
SUBSTRING(x.last_run_time32) + ':'
                  
SUBSTRING(x.last_run_time52) + '.000'121)
             
ELSE NULL
        
END AS LastRunTime ,
        
CASE h.run_status
          
WHEN THEN 'Fail'
          
WHEN THEN 'Success'
          
WHEN THEN 'Retry'
          
WHEN THEN 'Cancel'
          
WHEN THEN 'In progress'
        
END AS LastRunOutcome ,
        
CASE WHEN h.run_duration 0
             
THEN h.run_duration 1000000 ) * ( 3600 24 )
                  + ( 
h.run_duration 10000 100 ) * 3600 + ( h.run_duration
                                                              
100 100 )
                  * 
60 + ( h.run_duration 100 )
             
ELSE NULL
        
END AS LastRunDuration ,
        
js.command AS SSISPackageExecutionCommandFROM    [tempdb].[dbo].[Temp1] x
        
LEFT JOIN msdb.dbo.sysjobs j ON x.job_id j.job_id
        
JOIN msdb.dbo.sysjobsteps js ON js.job_id j.job_id
        
LEFT OUTER JOIN msdb.dbo.syscategories c ON j.category_id c.category_id
        
LEFT OUTER JOIN msdb.dbo.sysjobhistory h ON x.job_id h.job_id
                                                    
AND x.last_run_date h.run_date
                                                    
AND x.last_run_time h.run_time
                                                    
AND h.step_id 0WHERE   x.running 1
        
AND js.subsystem 'SSIS'DROP TABLE [tempdb].[dbo].[Temp1]
In conclusion, SSIS is a vast product which provides significant amount of metadata available to the SQL Server administrator. In this article I have shown the way to explore the SSIS metadata through some scripts and hopefully they will make the daily administration of your SSIS environments much easier.

Source: 

3 comments:

  1. This comment has been removed by the author.

    ReplyDelete
  2. This comment has been removed by the author.

    ReplyDelete
  3. Thank you so much for sharing this informative and interesting content. SSIS operations are always very crucial for DBAs and excelling in them can always give them and edge.

    ReplyDelete