Excel files with mixed data

Ever find that when you import data from excel files with SSIS that you sometimes lose data? Well one of the issues is that SSIS only samples a small number of rows – well it’s fully documented in http://technet.microsoft.com/en-us/library/ms141683.aspx.  In summary – adding IMEX=1 to the source input will resolve a lot of these issues automatically, otherwise you have a few options

  1. Because of the sample size issue you can reorder the source spreadsheet to get a mixture of the correct data types in the first 8 rows (or use the tip in the technet article to change the sample size).  
  2. Create a new column for the offending column in the source data and use =text(cell) in this column – this will force SSIS to import the data as text (doesnt always work but sometimes does)
  3. Convert the spreadsheet to tab delimited (which also gives you the option of doing a straight bulk insert)
  4. Change connection string to add imex=1 this will force the connection to be reset but resolves most issues ** EXCEPT ** when there is text data in a row that is mostly numeric!

The biggest issue is that when SSIS determines the type for the data thats it – you can’t amend it.