
Talend Calling Stored Procedures on SQL Server

Calling stored procedures on SQL Server with parameters

In order to call stored procedures in Talend we need to configure a source for the various parameters, e.g. a tFixedFlowInput as shown below.

tFixedFlowInput details

In the tFixedFlowInput we need to create a schema matching the parameters of the stored procedure and then tick “Use Single Table”.

Create a column for each input parameter and set the value you wish to use (note that the values can actually be sourced from e.g. an iteration loop instead of a fixed flow, to get the details from the database).

Connect the input source to a tDBSP component.

tDBSP Connection Details

Drag a component from the Db Connections metadata for the database / server on which you wish to execute the stored procedure and configure it as follows:

SP Name = name of the stored procedure to be executed, enclosed in quotes

Add parameters using the + button under the parameters section of the tDBSP component and map them to the schema column names from the input (tFixedFlowInput in this example)

Define the direction of each parameter in the Type column, e.g. IN / OUT / RECORD SET

The example above just dumps the parameters and results into the log output
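
To make the mapping concrete, here is a minimal sketch of the kind of procedure the job above would call – the procedure, table and parameter names are hypothetical. The IN parameter is fed from the tFixedFlowInput column and the OUT parameter flows back into the job for logging:

[code language="sql"]
-- Hypothetical procedure: one IN parameter (mapped from the tFixedFlowInput
-- column) and one OUT parameter (direction OUT in the tDBSP Parameters table)
CREATE PROCEDURE dbo.usp_GetClientCount
    @OfficeCode  VARCHAR(10),
    @ClientCount INT OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    SELECT @ClientCount = COUNT(*)
    FROM dbo.Client
    WHERE OfficeCode = @OfficeCode;
END
[/code]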

Integrated security connections

How to connect Talend to SQL Server using Integrated Security

The default setup of Talend doesn’t allow integrated security connections to SQL Server. In order to connect to SQL Server with integrated security you must have ntlmauth.dll installed – it should be in your bin and jre\bin folders, e.g. c:\program files\Zulu\zulu-8\bin and c:\program files\Zulu\zulu-8\jre\bin. I also prefer to have a copy in windows\system32 just in case.

If that doesn’t work, I also use the -vm argument to ensure that the correct jre\bin is loaded, e.g. -vm "c:\program files\Zulu\zulu-8\jre\bin".
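
Talend Studio is Eclipse-based, so the -vm switch belongs in the Studio .ini file, with the path on the following line and before -vmargs – a sketch using the Zulu path from above:

[code language="text"]
-vm
c:\program files\Zulu\zulu-8\jre\bin
-vmargs
[/code]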

Then it’s just a matter of adding IntegratedSecurity=SSPI to your connection in the Additional parameters section of your database connection.

Oh, and don’t forget to add the port if SQL Server is running as a named instance.
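
For reference, Talend appends the additional parameters to the JDBC URL, so the final connection URL ends up looking roughly like the sketch below – the server, port, instance and database names are placeholders, and the exact property syntax depends on the JDBC driver your connection uses:

[code language="text"]
jdbc:jtds:sqlserver://myserver:1433/MyDatabase;instance=MYINSTANCE;IntegratedSecurity=SSPI
[/code]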

Data transfer using Azure Data Factory

Azure Data Factory can be a useful alternative to SSIS in the data migration life cycle; however, there are some areas where the “current” implementation lets it down slightly.

Mapping Data Flows

One major difference if you’re used to SSIS is the requirement to pre-create the output tables – there’s no easy “click New” on a data flow destination here.
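
So, unlike an SSIS destination, the sink table has to exist before the data flow runs – a trivial but necessary step, e.g. (table and columns are illustrative):

[code language="sql"]
-- Destination table pre-created by hand – there is no
-- "click New" equivalent in a mapping data flow sink
CREATE TABLE dbo.StagingClient (
    ClientId   INT          NOT NULL,
    ClientName VARCHAR(100) NULL,
    LoadedAt   DATETIME     NOT NULL
);
[/code]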

Templates

Standard templates can be used for copying multiple sources to their relevant destinations. This is easy to set up but has the downside of being a nightmare to maintain, as you then have to edit the JSON that the template creates.

Triggers

Again, triggers to run the data flow can be set up – either timed, tumbling window or event based, e.g. when a file arrives in blob storage, start the flow. However, changes to the data flow require the trigger to be set up again – the trigger doesn’t automatically pick up the changes.

Processing text files using Azure Data Factory

Data loaded from source files

Data Factory makes life easier when processing multiple source files, e.g. where the source system is not accessible and reports have to be used to extract the data, or in one case where all the different offices had their own system and the data ended up being loaded from manual spreadsheets.

Excel format files

Excel files are not supported in Data Factory, but it’s fairly easy to save them as text files, e.g. CSV or tab delimited. Watch out for saving Excel as tab delimited if there are Unicode characters or non-standard language sets in the source system – Excel’s tab-delimited export is not Unicode.

Blob Storage

Push the files up into blob storage, add shared access signature keys, and then process all the files in one go.
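
For example, with AzCopy the whole upload is one command – the local path, storage account and container here are placeholders, and the SAS token is the shared access signature mentioned above:

[code language="text"]
azcopy copy "C:\extracts\*.csv" "https://myaccount.blob.core.windows.net/source-files?<SAS token>"
[/code]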

Fetching data from Oracle with SSIS

The Attunity connector in SSIS is probably one of the fastest methods of getting data out of Oracle – it knocks spots off using linked servers. However, make sure you use the VS2017 version of SSDT, otherwise you will waste a lot of time thinking the Oracle configuration is wrong.

BlockSize is important – the default fetches 100 rows at a time, and on a 90 GB table in Oracle changing this to a fetch size of 2500 can knock a lot of time off the extract.

Make sure that Oracle connectivity works before doing any tests, and make sure you have the 32-bit drivers for development.

And finally, don’t forget to keep a copy of the package in Git or your preferred source control.

ContrOCC Transfer Data Script

Be aware that this task has built-in de-duplication logic, so it can affect timetable delivery where multiple timetables are required – especially if quantity based rather than time based.

For example, if two “quantities” are delivered on a Tuesday and there are no “notes” to differentiate the two times, then one of them will effectively be deleted, causing underpayment to the provider and a possible impact on charges.

Filtered sp_who2 – one of many ways

[code language="sql"]
-- Capture the output of sp_who2 into a table variable so it can be filtered
DECLARE @Table TABLE(
    SPID INT,
    [Status] VARCHAR(MAX),
    [Login] VARCHAR(MAX),
    HostName VARCHAR(MAX),
    BlkBy VARCHAR(MAX),
    DBName VARCHAR(MAX),
    Command VARCHAR(MAX),
    CPUTime INT,
    DiskIO INT,
    LastBatch VARCHAR(MAX),
    ProgramName VARCHAR(MAX),
    SPID_1 INT,
    REQUESTID INT
)

INSERT INTO @Table EXEC sp_who2

-- Filter on any column, e.g. the login name
SELECT *
FROM @Table
WHERE [Login] LIKE '%xxx%'
[/code]

SSRS Automation

First of all, create a proxy class in C# using wsdl.exe – see this TechNet article for more details:

wsdl /out:C:/ReportService.cs http://[servername]/ReportServer/ReportExecution2005.asmx?wsdl

Create a new SSIS package using BIDS and add a data source for the report parameters (e.g. an OLE DB source).
Add a Script component, pre-configure the component as a transformation, and connect it to the OLE DB source.
Confirm that the ScriptLanguage property is set to Microsoft Visual C# 2008. On the Input Columns page of the Script Transformation Editor, select all the available input columns.
Edit the script to add references to the System.Web.Services and System.Xml assemblies, and to add the C# proxy class.
Build the code.

If you don’t build the code, the references will disappear – I don’t know why, but a single build is all you need!

Data Dictionary

The need for a Data Dictionary in Data Migration

At a bare minimum the data dictionary needs to contain the following attributes:

Source Table
Source Column
Source Data Type
Validation Rules
Typical Values
Transformation Rules
Target Table
Target Column
Target Data Type

Optionally, add fields to the data dictionary for:
Date Changed
Revision Number
Comments
Author
Reviewer
This can be implemented in many ways, depending partly on how public the data needs to be – for example, it can be stored as an Excel workbook, an Access database, or in SQL Server itself (possibly with an Access frontend to allow simple updates to the underlying data).
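
As a minimal sketch of the SQL Server option – the table and column names here are illustrative, not prescriptive – the attributes above map onto something like:

[code language="sql"]
-- Illustrative data dictionary table covering the core and optional attributes
CREATE TABLE dbo.DataDictionary (
    SourceTable         SYSNAME      NOT NULL,
    SourceColumn        SYSNAME      NOT NULL,
    SourceDataType      VARCHAR(128) NOT NULL,
    ValidationRules     VARCHAR(MAX) NULL,
    TypicalValues       VARCHAR(MAX) NULL,
    TransformationRules VARCHAR(MAX) NULL,
    TargetTable         SYSNAME      NULL,
    TargetColumn        SYSNAME      NULL,
    TargetDataType      VARCHAR(128) NULL,
    -- optional audit fields
    DateChanged         DATETIME     NULL,
    RevisionNumber      INT          NULL,
    Comments            VARCHAR(MAX) NULL,
    Author              VARCHAR(128) NULL,
    Reviewer            VARCHAR(128) NULL,
    CONSTRAINT PK_DataDictionary PRIMARY KEY (SourceTable, SourceColumn)
);
[/code]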

It’s advisable to make the data dictionary publicly available – for example by publishing a report on the data dictionary or by making the Access / Excel data available.
In all cases, I would recommend change tracking, which lends itself more to the SQL Server stored data, even if it’s then made visible via Excel as a linked source to all relevant SMEs.

Common transformation rules could include:
Lookup
Cast as datatype
Format datetime 112 (see the example below)
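
For instance, style 112 converts a datetime to an ISO yyyymmdd string:

[code language="sql"]
-- Style 112 yields an ISO yyyymmdd string, handy as a locale-proof date key
SELECT CONVERT(VARCHAR(8), GETDATE(), 112); -- e.g. 20190415
[/code]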

Scripting Triggers in Current Database

Start PowerShell from SSMS within the database you want to script – note that you will have to change “servername” to the actual database server, e.g. “svektestserver”.

[code language="powershell"]
# Connect to the server (change "servername" to the actual database server)
$srv = new-object "Microsoft.SqlServer.Management.SMO.Server" "servername"

# Walk every table in the current database context
foreach ($tbl in Get-ChildItem Tables)
{
    if ($tbl.Triggers -ne $null)
    {
        foreach ($trig in $tbl.Triggers)
        {
            $SavePath = "h:\temp\databases\" + $trig.Name + ".sql"
            $scriptr = new-object "Microsoft.SqlServer.Management.Smo.Scripter" $srv
            $scriptr.Options.AllowSystemObjects = $False
            $scriptr.Options.ToFileOnly = $true
            $scriptr.Options.FileName = $SavePath
            # Write the DROP statement first...
            $scriptr.Options.ScriptDrops = $true
            $SavePath
            $trig.Name
            $scriptr.Script($trig)
            # ...then append the CREATE statement to the same file
            $scriptr.Options.ScriptDrops = $False
            $scriptr.Options.AppendToFile = $True
            $scriptr.Script($trig)
        }
    }
}
[/code]