Today I want to show off a quick trick I learned many years ago that is useful to this day.
The Scenario: you have a spreadsheet, CSV file, or some other delimited flat file and need to load that into a database. There are many options to load data into SQL Server; however, a quick method I am fond of is to use Microsoft Excel and manipulate the data to craft insert statements then paste into SSMS and execute.
Read on to see how to parse and load a CSV flat file, character delimited flat file, and columnar text.
Write SQL in Excel
Many beginning and aspiring database people start work not as a developer or admin but as an analyst. It is common to work with large data sets and receive an Excel workbook that needs to be imported into your SQL Server database.
In the below examples I will be using the AdventureWorks2016 database to demonstrate. Let’s use the Person.Address table. I will stage a small sample of data to demo by creating a smaller subset of just NC residents.
1 2 3 4 |
select * from Person.[Address] where City = 'Raleigh'; --let's pick a good city... select * from Person.[Address] where StateProvinceID = 42; --8 expanded to a nice state (NC) select * into Address_NC from Person.[Address] where StateProvinceID = 42; select * from Address_NC; --8 |
Here are the contents of the table:
Let us also capture the table definition. This can be important for dealing with constraints, indexes, defaults, and identity columns.
Protip: alt + F1 is a default keyboard shortcut in SSMS to calling sp_help on the table.
There is an identity column so we will take that into account when crafting our insert statements
CSV Flat File
Assume: you are given a CSV flat file and need to quickly load it into a database – no 3rd party tools. Let us load the contents of the Address_NC table into a comma separated values (CSV) flat file.
NOTE: before loading a flat file into Excel please make sure the formatting in the sheet is set to text. Otherwise you may end up with some truncated or otherwise changed data.
Highlight all data –> right click format –> text
Here is the data loaded into a CSV file.
Now we are going to craft a SQL insert statement using the Excel concatenation function. Open MS Excel and choose a blank workbook. Here is one way to import the CSV file we just created:
Choose your file and click Import. Excel O365 will automatically format it as a table. It does not matter if you just pasted the text of the CSV file into the spreadsheet.
Highlight the column after the last as General. This is important as the statement will not work right as text – it will literally just be the text you have in the cell instead of the results of a formula.
Here is the command to type into Excel:
1 |
=CONCATENATE("INSERT INTO [dbo].[Address_NC] ([AddressLine1],[AddressLine2],[City],[StateProvinceID],[PostalCode],[rowguid],[ModifiedDate]) VALUES ('",[@Column2],"','",[@Column3],"','",[@Column4],"',",[@Column5],",'",[@Column6],"', NEWID(), getdate())") |
NOTE: for this demo I am skipping the SpatialLocation column because the strange data type takes away from the example. Since it is a nullable column I can safely do this.
This is just a simple demo to understand the technique. You could mock up dummy data or load other data sets into SQL Server this way. Let’s run these newly crafted INSERT statements.
1 2 3 4 5 6 7 8 |
INSERT INTO [dbo].[Address_NC] ([AddressLine1],[AddressLine2],[City],[StateProvinceID],[PostalCode],[rowguid],[ModifiedDate]) VALUES ('Po Box 2257','NULL','Greensboro',42,'27412', NEWID(), getdate()) INSERT INTO [dbo].[Address_NC] ([AddressLine1],[AddressLine2],[City],[StateProvinceID],[PostalCode],[rowguid],[ModifiedDate]) VALUES ('3990 Silas Creek Parkway','NULL','Winston-Salem',42,'27104', NEWID(), getdate()) INSERT INTO [dbo].[Address_NC] ([AddressLine1],[AddressLine2],[City],[StateProvinceID],[PostalCode],[rowguid],[ModifiedDate]) VALUES ('956k North Wesleyan Blvd.','NULL','Rocky Mount',42,'27803', NEWID(), getdate()) INSERT INTO [dbo].[Address_NC] ([AddressLine1],[AddressLine2],[City],[StateProvinceID],[PostalCode],[rowguid],[ModifiedDate]) VALUES ('Factory Stores Of America','NULL','Smithfield',42,'27577', NEWID(), getdate()) INSERT INTO [dbo].[Address_NC] ([AddressLine1],[AddressLine2],[City],[StateProvinceID],[PostalCode],[rowguid],[ModifiedDate]) VALUES ('7700 Green Road','NULL','Raleigh',42,'27603', NEWID(), getdate()) INSERT INTO [dbo].[Address_NC] ([AddressLine1],[AddressLine2],[City],[StateProvinceID],[PostalCode],[rowguid],[ModifiedDate]) VALUES ('6333 Cloverleaf Parkway','NULL','Kannapolis',42,'28081', NEWID(), getdate()) INSERT INTO [dbo].[Address_NC] ([AddressLine1],[AddressLine2],[City],[StateProvinceID],[PostalCode],[rowguid],[ModifiedDate]) VALUES ('8520 University City Blvd','NULL','Charlotte',42,'28202', NEWID(), getdate()) INSERT INTO [dbo].[Address_NC] ([AddressLine1],[AddressLine2],[City],[StateProvinceID],[PostalCode],[rowguid],[ModifiedDate]) VALUES ('3605 Sinclair Ave.','NULL','Charlotte',42,'28202', NEWID(), getdate()) |
Here are the newly inserted rows:
Tab Delimited Flat File
The same technique applied above works here too. The only difference is the creating and loading of the file.
Here we save the output in SSMS as a tab delimited file flat file:
This is what the file looks like when opened in Notepad++ and showing tabs / white space symbols:
It loads into Excel the same way and the concatenation is the same way too.
Let’s load it into Excel just pasting it. First, select all the fields in an empty sheet and format them as text.
Here it is pasted into a sheet.
NOTE: if this gives you trouble you can use the TEXT TO COLUMNS feature in Excel to parse a character or space delimited file into their own columns.
The same technique with the CONCAT function can be used here.
Working with Columnar Text
This assumes you receive data in an Excel workbook that you need to load into a database. You might have to load it into another sheet already formatted as text and mess with the date formats. Excel does some strange things and we need to closely inspect the data before writing to SQL Server.
if you liked this then you might like T-SQL Tuesday #104 – Code You Would Hate to Live Without