Category Archives: MS SQL Server

How to: Change the length of an attribute in MS MDS

You would think that it would be easy to change the length of an attribute in MDS, but a look at the web interface might make you think otherwise.

The length parameter is greyed out. But fear not.

Instead, open the table with the Excel add-in.


Click on the column you want to change, click Attribute Properties, and change the length to the desired value.

Publish your changes, and go back to the web interface.


Tada 🙂 The length has been changed 🙂

A quick look at System-Versioned Tables in SQL server 2016

SQL Server 2016 has been available for preview for some time now. One of the more exciting features I am looking forward to is system-versioned tables.

As a DWH developer, one of the things you spend a lot of time on is developing dimension tables with columns that store history. The history is typically handled as Type1 or Type2, where Type1 keeps no history, and Type2 stores the history by adding another row when a value changes for a given natural key in the source system.
Type2 history is normally implemented by using

  • A DWH framework
  • 3rd Party component
  • Code developed specifically for the dimension that needs the Type2 history

In version 2016, SQL Server can help you with this, using system-versioned tables.

Right-click Tables -> New, and you will see a new option called “System-versioned Table”.

There is no designer for it yet, so we will do it in SQL, which looks like this:

CREATE TABLE dbo.Product
(
    SK_Product      INT IDENTITY(1,1) NOT NULL,        -- surrogate key
    BK_productid    INT NOT NULL                       -- natural key
        CONSTRAINT PK_product PRIMARY KEY NONCLUSTERED,
    ProductName     NVARCHAR(50) NOT NULL,
    ProductCategory NVARCHAR(50) NOT NULL,

    -- Period columns maintained by SQL Server
    sysstart DATETIME2(0) GENERATED ALWAYS AS ROW START NOT NULL,
    sysend   DATETIME2(0) GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (sysstart, sysend)
)
WITH ( SYSTEM_VERSIONING = ON ( HISTORY_TABLE = dbo.ProductHistory ) );

Where

  • SK_Product is the surrogate key
  • BK_productid is the natural key
  • ProductName and ProductCategory are the columns we want to track history on (well, preferably only ProductCategory, but this cannot be set at column level yet)
  • sysstart is the valid-from timestamp
  • sysend is the valid-to timestamp
  • PERIOD indicates which columns are used for valid from and valid to.

The required elements are a primary key, the sysstart and sysend columns, and the PERIOD definition.
Below is a screenshot of what it looks like in Management Studio.

As you can see, the ProductHistory table is placed under the Product table to show the relationship.
Let us try to insert some values:

INSERT INTO [dbo].[Product]
([BK_productid]
,[ProductName]
,[ProductCategory])
VALUES
(1,'Product1','Category1')

So, the insert only inserts a row in the current table, and no rows in the history table.
Let us try with an update:

UPDATE Product
SET ProductCategory = 'CategoryUpdated'
WHERE BK_productid=1

So, the update changes the row in the current table and inserts the previous version of the row in the history table. Hence, we will always have the current value in the “current table”, and old values in the history table.
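To look at the current and the old versions of a row together, the table can be queried with the FOR SYSTEM_TIME clause. This is only a minimal sketch, assuming the dbo.Product table created above and the temporal query syntax as it is documented for the released SQL Server 2016 (the CTP syntax may differ):

```sql
-- Current row plus all historical versions for natural key 1
SELECT BK_productid, ProductName, ProductCategory, sysstart, sysend
FROM dbo.Product
FOR SYSTEM_TIME ALL
WHERE BK_productid = 1
ORDER BY sysstart;

-- The history table can also be queried directly
SELECT BK_productid, ProductCategory, sysstart, sysend
FROM dbo.ProductHistory;
```

After the insert and update above, the first query should return two rows: the old version with Category1 and the current version with CategoryUpdated.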

Conclusion

When would you use this feature?
It could be helpful in an OLTP system where you know you will need the history for reporting in the future, but do not have a DWH yet to store the history.

There will be an overhead in performance and storage.

Would we use it in a DWH, for a dimension with type2 history?

Probably not. This is one specific implementation of it, and we can't change anything in the implementation, so we would have to combine it with logic for our specific needs, and then it would be easier not to use it at all.

This could be useful in an archive layer in a DWH, where you want to store all changes of the extracted values over time.

This blog post is based upon a CTP version of SQL Server, so much could still change.
Read more about it here: https://msdn.microsoft.com/en-us/library/dn935015.aspx

Error code: 0x80040E21 Reading from SSAS with SSIS

So, I upgraded a big project for a customer from 2008 R2 to 2012. Everything worked fine, except the last part, where some data marts were loaded with SSIS from SSAS. They had worked without problems before, but suddenly I received the above error code. It made no sense to me, and it was hard to find out precisely what had happened. The worst part was that I was able to see the preview in the data source, but not run the package.

A workaround was to create a linked server to the cube, and then just select from that, like so:

SELECT * FROM OPENQUERY(<ServerName>, '<MDX>')

And that is probably the way I would have implemented it myself. But with hundreds of packages, rewriting them was not the way to go.
Instead, I found an old post on Google, where a person had the same problem in an older SQL Server version. The solution was to open the connection, click the All button, and type in “(Format=Tabular)” in the extended properties field. See the image below.

So, yeah…, YIKES. But hey, if you find yourself having the same problem, you now know what the solution is 🙂
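For completeness, the linked-server workaround mentioned above could be set up roughly like this. It is a sketch only: SSAS_CUBE, MyOlapServer, MyOlapDatabase and [My Cube] are hypothetical names, and the MSOLAP provider is assumed to be installed on the SQL Server machine.

```sql
-- Hypothetical names: SSAS_CUBE, MyOlapServer, MyOlapDatabase, [My Cube]
EXEC master.dbo.sp_addlinkedserver
    @server     = N'SSAS_CUBE',       -- name of the linked server to create
    @srvproduct = N'',
    @provider   = N'MSOLAP',          -- Analysis Services OLE DB provider
    @datasrc    = N'MyOlapServer',    -- SSAS instance
    @catalog    = N'MyOlapDatabase';  -- SSAS database

-- The MDX is passed through as a string and comes back as a rowset
SELECT *
FROM OPENQUERY(SSAS_CUBE,
    'SELECT [Measures].MEMBERS ON COLUMNS FROM [My Cube]');
```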

Partitioning revisited

When building big data warehouses (DWH), partitioning tables is something you want to consider.

There are some good examples and explanations in BOL, but with this blog post I just want to give some easy hands-on examples, so you can get started right away.

What
Partitioning a table means splitting up its data, so the parts are stored separately and can be managed separately. The SQL user can use the table without knowing that it is partitioned.

Why
Why do we want to partition big tables in, for instance, an enterprise DWH?
Well, some benefits are listed on the SQL CAT page (see Resources), and they say it all:

  • Partitioning can provide manageability and often performance benefits.
  • Faster, more granular index maintenance.
  • More flexible backup / restore options.
  • Faster data loading and deleting

How
Partitions can be managed both via Management Studio and with raw SQL. It is actually much easier to understand when using SQL, so I will only use SQL in this blog post.

Let's start completely from scratch, by creating the partition function. It simply defines the boundary values between the partitions.

CREATE PARTITION FUNCTION myRangePF1 (datetime)
AS RANGE LEFT FOR VALUES ('2013.01.01', '2013.02.01', '2013.04.01') ;
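To check how the boundaries behave, the $PARTITION function can be used to ask which partition a given value maps to. A small sketch, assuming the myRangePF1 function above has been created in the current database (the column aliases are just illustrative):

```sql
-- With RANGE LEFT, each boundary value belongs to the partition on its left
SELECT $PARTITION.myRangePF1('20130101') AS OnFirstBoundary,    -- partition 1
       $PARTITION.myRangePF1('20130102') AS AfterFirstBoundary, -- partition 2
       $PARTITION.myRangePF1('20140101') AS AfterLastBoundary;  -- partition 4
```

Three boundary values give four partitions, which is why the partition scheme has to map four filegroups.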

After that, we want to create the partition scheme, which maps the partitions to filegroups. For the sake of simplicity we just use the primary filegroup in this example, but you don't want to do that in production.

CREATE PARTITION SCHEME myRangePS1
AS PARTITION myRangePF1
TO ([PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY]) ;

Next, let's create some partitioned tables:

CREATE TABLE FactSomething (DW_TS DATETIME, factValue int)
ON myRangePS1 (DW_TS) ;

--Create an archive table, where we want to store old values
CREATE TABLE FactSomethingArchive (DW_TS DATETIME, factValue int)
ON myRangePS1 (DW_TS) ;

Then let us insert some test data, so we have some data to work with.

INSERT INTO [dbo].[FactSomething] ([DW_TS] ,[factValue])
VALUES ('2010.01.01',1)

INSERT INTO [dbo].[FactSomething] ([DW_TS] ,[factValue])
VALUES ('2010.01.01',1)

INSERT INTO [dbo].[FactSomething] ([DW_TS] ,[factValue])
VALUES ('2010.01.01',1)

INSERT INTO [dbo].[FactSomething] ([DW_TS] ,[factValue])
VALUES ('2013.01.01',1)

INSERT INTO [dbo].[FactSomething] ([DW_TS] ,[factValue])
VALUES ('2013.02.02',1)

INSERT INTO [dbo].[FactSomething] ([DW_TS] ,[factValue])
VALUES ('2014.01.01',1)

You can use this SELECT statement to get all the necessary information about the partitions:

--Long SQL with all the information regarding the partition table
SELECT t.name AS TableName,p.rows,
p.partition_number,
p.partition_id,
f.function_id,
f.type_desc,
r.boundary_id,
r.value AS BoundaryValue
FROM sys.tables AS t
JOIN sys.indexes AS i
ON t.object_id = i.object_id
JOIN sys.partitions AS p
ON i.object_id = p.object_id AND i.index_id = p.index_id
JOIN sys.partition_schemes AS s
ON i.data_space_id = s.data_space_id
JOIN sys.partition_functions AS f
ON s.function_id = f.function_id
LEFT JOIN sys.partition_range_values AS r
ON f.function_id = r.function_id and r.boundary_id = p.partition_number
WHERE t.name = 'FactSomething' -- AND i.type <= 1
ORDER BY p.partition_number

Which in our case will show the following result.

Here, the rows column shows the number of rows in each partition. Let's say we want to archive the oldest partition. Then we just switch the partition like so:

ALTER TABLE dbo.FactSomething
SWITCH PARTITION 1
TO dbo.FactSomethingArchive PARTITION 1

If we run the partition info SQL again, we now get the following result

Here you can see that we now have 0 rows in the first partition. If you do a SELECT * from the archive table, you will now find the 4 rows there.

Let's say that we don't want to have the old partition in our fact table anymore. Then we just run this merge statement:
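One way to verify the switch without the long metadata query is to count the rows per partition directly. A minimal sketch, assuming the tables and the partition function from above (the column aliases are illustrative):

```sql
-- Rows per partition in the archive table after the switch
SELECT $PARTITION.myRangePF1(DW_TS) AS PartitionNumber,
       COUNT(*)                     AS RowsInPartition
FROM dbo.FactSomethingArchive
GROUP BY $PARTITION.myRangePF1(DW_TS)
ORDER BY PartitionNumber;
```

With the test data above, this should show the 4 switched rows in partition 1.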

ALTER PARTITION FUNCTION myRangePF1()
MERGE RANGE ('2013-01-01')

The partition info SQL will now return only 3 rows, as shown below.

To sum up 🙂

I have shown how to create partitioned tables, and how to use them. The next step will be managing their indexes, and how to do sliding window partitioning. Maybe there will be a post about those subjects here later 🙂

Resources

http://sqlcat.com/sqlcat/b/top10lists/archive/2008/02/06/top-10-best-practices-for-building-a-large-scale-relational-data-warehouse.aspx

http://msdn.microsoft.com/en-us/library/ms190787.aspx

SQL TIP-> Display seconds as HH:MM:SS

In many BI applications, a duration is stored as a number of seconds. To display it in the format HH:MM:SS, just use the following SQL (replace 100 with your column name):

SELECT
convert(char(8),dateadd(second,100,0),108) as [Duration]

Which yields the result

00:01:40

As in my previous blog post, where I did the same in Analysis Services, this only works if the duration is below 24 hours. Otherwise you have to add some additional logic.
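One possible form of that additional logic is to calculate the hours with integer division and pad the minutes and seconds yourself. This is a sketch, with @seconds as a stand-in for your column:

```sql
DECLARE @seconds INT = 100000;  -- more than 24 hours

SELECT CAST(@seconds / 3600 AS VARCHAR(10)) + ':'                       -- whole hours
     + RIGHT('0' + CAST((@seconds % 3600) / 60 AS VARCHAR(2)), 2) + ':' -- minutes, zero padded
     + RIGHT('0' + CAST(@seconds % 60 AS VARCHAR(2)), 2) AS [Duration]; -- seconds, zero padded
-- Returns 27:46:40
```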

Fun with SQL: Joining on column name and row field value

Ok, it is that time of the day where I’m starting to get high on caffeine 🙂

So this is the scenario. I have an application that is automatically generating a table in my database. It can add columns on the fly (but thankfully not delete them).

It could look like this:

CREATE TABLE Cube.AppGenereatedTable
(
x1value float,
x2value float,
ID_Region INT
)

We can now query information about that table like so:

select column_name,ordinal_position,DATA_TYPE , *  from information_schema.columns
where table_name = 'AppGenereatedTable'
AND TABLE_SCHEMA='cube'

The problem is, that right now, I do not know what to do with that data, and when I’m not at the customer anymore, more fields can be added by the application. What I would like, is a place where the customer can add more information about that column, and decide if that column should be included in some logic. To do that, I add a “metadata” table.

CREATE TABLE Dim.SomeMetadata
(
ID_SomeMetadata INT,
ColName nvarchar(100),
friendlyName nvarchar(100),
includeInETLLogig bit
)

Now I want to put it all in a fact table, where I have one row per column value. It could look like this:

CREATE TABLE Fact.SomeFactTable
(
Value float,
ID_Metadata INT
)

Ehhh, OK, so now I will do what the title says. To load the data, I need to generate the SQL automatically, like so:

--Use the colnr to reference all columns
DECLARE @colnr AS INT
DECLARE @sql AS NVARCHAR(MAX)
SET @colnr = 1
--Do this for as many columns as you expect there will ever be. Do some automation here if you like
WHILE (@colnr < 50)
BEGIN
SET  @sql =
'INSERT INTO Fact.SomeFactTable
(Value, ID_Metadata)

--And now we do the SQL with the join. Pay attention 

SELECT '+(select column_name from information_schema.columns
where table_name = 'AppGenereatedTable'
AND TABLE_SCHEMA='cube'
AND ordinal_position=@colnr AND DATA_TYPE = 'float')+' as value
,someMetaData.ID_SomeMetadata
FROM Cube.AppGenereatedTable appGenereatedTable
INNER JOIN Dim.SomeMetadata someMetaData on someMetaData.ColName='''+(select
column_name from information_schema.columns
where table_name = 'AppGenereatedTable'
AND TABLE_SCHEMA='cube'
AND ordinal_position=@colnr AND DATA_TYPE = 'float')+''''

print @colnr
print @sql
BEGIN TRY
exec( @sql )
END TRY
BEGIN CATCH
--Catch error. Do nothing
END CATCH
SET @colnr = @colnr + 1
END
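To make the loop easier to follow: for @colnr = 1 the column lookup returns x1value, so the statement that is printed and executed would look roughly like this (assuming the example tables above):

```sql
INSERT INTO Fact.SomeFactTable
(Value, ID_Metadata)

--And now we do the SQL with the join. Pay attention

SELECT x1value as value
,someMetaData.ID_SomeMetadata
FROM Cube.AppGenereatedTable appGenereatedTable
INNER JOIN Dim.SomeMetadata someMetaData on someMetaData.ColName='x1value'
```

For a column that is not a float, such as ID_Region, or for a position with no column at all, the lookup returns NULL and the concatenated string becomes NULL, so no rows are inserted for that position.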

Well, I don't know if there is anyone other than me who can use this for something. The next post will be about precisely what I'm using this for 🙂

SSAS Tip -> Format seconds as HHHH:MM:SS

A common way of showing durations in SSAS is to format them as HH:MM:SS. A common scenario is that the durations are stored in seconds. To display them in the HH:MM:SS format, first convert the duration to a fraction of a day by dividing by 86400 (the number of seconds in a day), and then apply the format, like so:

CREATE MEMBER CURRENTCUBE.[MEASURES].[someMeasure] AS

[Measures].[someMesureInSeconds]/86400,FORMAT_STRING = "HH:MM:SS",…

However, this only works if the duration is less than 24 hours.

If it is more, there is no out-of-the-box way to show it in the correct format. To get around that problem, you have to create a measure that returns a string. First calculate the hours, and then add the minutes and seconds formatted as above, like so:

CREATE MEMBER CURRENTCUBE.[Measures].[MesureX]
AS
CSTR(INT([Measures].[LongDurationInSec]/(60*60))) + ":"
+ FORMAT(CDATE([Measures].[LongDurationInSec]/86400
- INT([Measures].[LongDurationInSec]/86400)), "mm:ss"),
FORMAT_STRING = "[HH]:MM:SS", …

SQL TIP ->Other uses of the SQL GO CMD

As most developers know, the SQL GO CMD signals the end of a batch of SQL statements.

However, it can also be used to execute a batch multiple times. Just add a number after it. For instance:

SELECT GETDATE() as now
GO 5

Which will execute the Select statement 5 times.
Or, something maybe more useful, create a table with integers from 1-5:

CREATE TABLE #numbers(number int)
GO --> Execute the first batch
INSERT INTO #numbers
VALUES((SELECT COALESCE(MAX(number),0) FROM #numbers) +1)
GO 5-->Do 5 times
SELECT * FROM #numbers

Which will give you the result:

Batch execution completed 5 times.
number
-----------
1
2
3
4
5
(5 row(s) affected)
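Keep in mind that GO is interpreted by client tools such as Management Studio and sqlcmd, not by the server, so it cannot be used inside a stored procedure. There, a plain WHILE loop does the same job. A minimal sketch, where #numbersLoop is just an illustrative table name:

```sql
CREATE TABLE #numbersLoop (number INT);

DECLARE @i INT = 1;
WHILE @i <= 5
BEGIN
    INSERT INTO #numbersLoop (number) VALUES (@i);
    SET @i = @i + 1;
END;

SELECT number FROM #numbersLoop ORDER BY number;
```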

Well, that was it. Another use of the go cmd. Have fun 🙂

SQL TIP -> Recursive CTEs

This morning I had to write some recursive common table expressions. It is actually relatively simple to do.

A common scenario:

We have a parent-child relation in a self-referencing table. That is common when we have an employee-manager relationship or an organisation hierarchy.

Let's just look at the employee-manager relationship.

Let's create an example temp table like so:

CREATE TABLE #Employee (
EmployeeID  INT,
EmployeeName NVARCHAR(50),
ManagerID INT
)

And let’s insert some test data into the table

INSERT INTO #Employee(EmployeeID, EmployeeName,ManagerID) VALUES (1,'Mr. Manager',NULL)
INSERT INTO #Employee(EmployeeID, EmployeeName,ManagerID) VALUES (2,'Jensen',1)
INSERT INTO #Employee(EmployeeID, EmployeeName,ManagerID) VALUES (3,'Hansen',1)
INSERT INTO #Employee(EmployeeID, EmployeeName,ManagerID) VALUES (4,'Clausen',2)

As you can see, Mr. Manager is at the top, and Clausen is at the bottom.

We now want to use a recursive common table expression (CTE) to show what level a person is on in the employee hierarchy.

To do that, we start with the highest level (where ManagerID is NULL). Then we do a UNION ALL with a SELECT that references the CTE itself, like so:

WITH EmployeeCTE (EmployeeID,EmployeeName, ManagerID , Level) AS
(
SELECT EmployeeID, EmployeeName,ManagerID, 0 AS Level
FROM #Employee WHERE
ManagerID IS NULL
UNION ALL
SELECT e.EmployeeID, e.EmployeeName,e.ManagerID, Level+1
FROM #Employee   e
INNER JOIN EmployeeCTE  cte ON cte.EmployeeID=e.ManagerID
)
SELECT * FROM EmployeeCTE

And the result can be seen below. This SQL shows what level an employee is on.

EmployeeID  EmployeeName ManagerID   Level
1           Mr. Manager NULL        0
2           Jensen      1           1
3           Hansen      1           1
4           Clausen     2           2

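If we want to see an employee's path to the top, we can join the CTE to itself. Note that a CTE only exists for the statement that directly follows its definition, so this SELECT has to replace the SELECT * at the end of the WITH statement above: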

SELECT  lvl0.EmployeeName as lvl0Name,  lvl1.EmployeeName AS lvl1Name,      lvl2.EmployeeName AS lvl2Name FROM EmployeeCTE lvl2
INNER JOIN EmployeeCTE lvl1 ON lvl1.EmployeeID=lvl2.ManagerID
INNER JOIN EmployeeCTE lvl0 ON lvl0.EmployeeID=lvl1.ManagerID
WHERE lvl2.Level=2

And the result:

lvl0Name     lvl1Name     lvl2Name
Mr. Manager Jensen        Clausen
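The self-join above is hard-coded for three levels. As a sketch of an alternative that works for any depth, the path can be built inside the recursive member itself. The PathToTop column and the ' > ' separator are my own choices for illustration, and the #Employee table from above is assumed:

```sql
WITH EmployeeCTE (EmployeeID, EmployeeName, ManagerID, Level, PathToTop) AS
(
    -- Anchor member: the top of the hierarchy
    SELECT EmployeeID, EmployeeName, ManagerID, 0,
           CAST(EmployeeName AS NVARCHAR(4000))
    FROM #Employee
    WHERE ManagerID IS NULL
    UNION ALL
    -- Recursive member: append each employee to the path of the manager
    SELECT e.EmployeeID, e.EmployeeName, e.ManagerID, cte.Level + 1,
           CAST(cte.PathToTop + N' > ' + e.EmployeeName AS NVARCHAR(4000))
    FROM #Employee AS e
    INNER JOIN EmployeeCTE AS cte ON cte.EmployeeID = e.ManagerID
)
SELECT EmployeeName, Level, PathToTop
FROM EmployeeCTE
ORDER BY Level, EmployeeID;
```

The explicit CAST in both members is there because the anchor and the recursive member must return exactly the same data type for each column. For Clausen, the path should come out as Mr. Manager > Jensen > Clausen.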

Now you know how to write a recursive CTE 🙂

Have fun