Author Archives: Claus

How to: Change the length of an attribute in MS MDS

You would think that it would be easy to change the length of an attribute in MDS, but going to the webinterface might make you think otherwise.  

The length parameter is greyed out. But fear not.

Instead open the table with your excel plugin


Click on the column you want to change, click attribute properties, and change the length to the desired length.

Publish your changes, and go back to the webinterface.


Tada šŸ™‚ The length has been changed šŸ™‚

A quick look at System-Versioned Tables in SQL server 2016

SQL server 2016 has been available for preview for some time now. One of the more exiting features I am looking forward to is system versioned tables.

As a DWH developer, one of the things, you spent a lot of time with, is developing dimension tables with columns that stores history. The history is typically in the form of Type1 or Type2, where Type1 is no history, and Type2 is where you store the history, by adding another row, when a value changes for a given natural key in the source system.
Type2 history is normally developed by using

  • A DWH framework
  • 3rd Party component
  • Developed specific for the dimension that needs the Type2 history

In version 2016, SQL server can help you with this, using system-versioned tables.

Right click tables->new-> And then you see a new option called ā€œSystem-versioned Tableā€

There is no designer for it yet, so we will do it in SQL, which look like this.

CREATE TABLE dbo.Product
(

SK_Product int IDENTITY(1,1) Not Null,
BK_productid INT NOT NULL
CONSTRAINT PK_product PRIMARY KEY NONCLUSTERED,
ProductName NVARCHAR(50) NOT NULL,
ProductCategory NVARCHAR(50) NOT NULL,

sysstart DATETIME2(0) GENERATED ALWAYS AS ROW START NOT NULL,
sysend DATETIME2(0) GENERATED ALWAYS AS ROW END NOT NULL,
PERIOD FOR SYSTEM_TIME (sysstart, sysend)
)
WITH ( SYSTEM_VERSIONING = ON ( HISTORY_TABLE = dbo. ProductHistory ) );

Where

  • SK_Product is the surrogate key
  • BK_productid is the natural key
  • ProductName and ProductCategory are collumns we want to track history on(Well, preferably only ProductCategory, but this is not on column level yet)
  • Sysstart is the valid from
  • Sysend is the valid to
  • And PERIOD indicates which collumns to be used for valid from and to.

Required columns are of course a primary key, Sysstart, Sysend and PERIOD.
Below a screenshot of what it looks like in management studio.

As you can see, the producthistory table is placed under the product table, to show the relationship.
Let us try to insert some values:

INSERT INTO [dbo].[Product]
([BK_productid]
,[ProductName]
,[ProductCategory])
VALUES
(1,'Product1','Category1')

So, the insert, only insert a row in the current table, and no rows in the history table.
Let us try with an update:

UPDATE Product
SET ProductCategory = 'CategoryUpdated'
WHERE BK_productid=1

So, the update, updates the current table, and insert a row in the history table, hence, we will always have the current value in the ā€œcurrent tableā€, and old values in the history table.

Conclusion

When would you use this feature?
It could be helpful in an OLTP system, where they know they will need the history for reporting in the future, but does not have a DWH yet, to store the history.

There will be an overhead in performance and storage.

Would we use it in a DWH, for a dimension with type2 history?

Properly not. This is one specific implementation of it, and we cant change anything to the implementation, so we would have to use it in combination with our specific needs, and then it would be easier not to use it at all.

This could be useful in an archive layer in a DWH, where you want to store all changes of the extracted values over time.

This blog post is based upon a CTP version of SQL server, so much could still change.
Read more about it here: https://msdn.microsoft.com/en-us/library/dn935015.aspx

Doing Windows development on my android tablet with Azure

The scenario: So, I’m on the run, to a family party, a boring meeting, or something similar šŸ™‚ Suddenly, I get this great idea for the BI project I’m working on. I didn’t bring my monster laptop, because it is to heavy & ugly. Instead I’m sitting with my android tablet. What to do now??? šŸ™‚ A solution I have started to use, is the Azure virtual machines. You can create your own on the azure portal, and you can even get them with the software you need preinstalled. I won’t describe the procedure here, because it is simple and out the box. One of the new good things, is that you only pay for it, when it is online, so remember to shut it down afterwards. So how to do it? It’s described in the pics below:

My setup, an android tablet, with bluetooth keyboard

Navigate to the azure portal, and start your VM

From the portal, you can download the connection file, so you don’t have to remember the connection information. I keep it saved in dropbox. When I click it, the MS remote desktop connect to the VM.

And now I’m connected. Windows 2012 R2 on my android tablet with Metro style.

Searching the apps

Running datatools. I could connect to TFS in the cloud, and get alle the sourcecode, and start doing some work.

Querying a SQL database, potentially fixing a bug šŸ™‚

All done! Remember to stop the VM, when you are done.

Conclusion: For me, the best way to develop, is using a powerfull laptop, connected to a good keyboard, mouse, and some big screens. However, if you need to do something fast, it is possible to do some work, just using your android tablet and an Azure VM.

GOTO day 3 – Keynote: “How to become accomplished”

Seriousely, the keynotes at GOTO are really good this year. Today was by Chad Fowler. It was about, as the title says, how to become accomplished. To me, Chad Fowler is like a motivational speaker for programmers, and he is really really good.Ā 

In short, you are going to spend a lot of your life working, so why not do it good, love the process, accomplish some stuff on the way and be successful. Chad Fowler knows how to do that, he has an incredibly life story.

I really cant recap what the talk was about. Just go buy his book. You will not regret it.

http://www.amazon.com/The-Passionate-Programmer-Remarkable-Development/dp/1934356344/ref=sr_1_1?ie=UTF8&qid=1380707932&sr=8-1&keywords=the+passionate+programmer

If you are in to motivationel stuff, you may know that life is the answer to the question we give our self. Chad Fowler had some question, you should ask yourself. I cant remember them precisely, but here is what I kind of remember šŸ™‚

Are You Better Than Yesterday?

-> If not, then just be better šŸ™‚

It is ā€œHell yeahā€ or no.

-> If it is not hell yeah, then it is no.

Ask Why you are doing what you are doing once a day.

There were some other good take aways, but I can’t remember them now. Just buy the book. But last take away.

Surround yourself with people who are better than yourself

Funday Monday: Day 1 at GOTO

So today was day 1 at the GOTO conference. Today none of the tracks related directly to what I work with daily, so I think I was around almost all the tracks for different sessions.

Solving Real and Big (Data) Problems Using Hadoop
So the first session actually was the only one, that related a litle to my work. I think approximately, once a year, a new technology is born in business intelligence land, which promise that it can replace the old enterprise data warehouse. This year it is Hadoop šŸ™‚

I can understand why salespeople want to get that message out, because it is a good business case, since the development of an enterprise data warehouse can be expensive, and if you can replace it with something that is almost free, hey, why not.

But it is not going to happen, for many obvious reasons I don’t want elaborate on in this post.
However, Hadoop is really cool, and it can enrich many companies BI & DWH solutions(Not replace them).

I have seen a lot of presentations at different conferences and user groups about Hadoop, but the presentation by Eva Andreasson from Cloudera was the best so far, and she almost convinced me to buy the book ā€œDefinitive Guide to Hadoopā€ in the FactumĀ bookstore.

Microsoft Devices and Services
As an old .Net developer, it is allways exciting to see what MS is up to. I attended 2 session from MS. ā€œWindows Phone app developmentā€ and ā€œWindows Azure Mobile Servicesā€. One thing that amazes me, is how fast it is to develop to windows phone and all the azure things.

Reliving the history of multiplayer games
I also went to the session ā€œReliving the history of multiplayer gamesā€. I don’t do game development myself, but hey, I do a lot of gamming. It was fun to see the presenters be so excited about games. My key take away: I want to work on something that relate to computer games in the near future šŸ™‚

Taking The Hippie Bus To The Enterprise
My last session of the day. Again, not something I work with directly, but it is always cool to catch up with the trends in .Net land. A lot of code examples and overall a great presentation by Microsoft MVP Mogens from d60.

All in all, an exciting day.

My GOTO program

I have not completely decided on my GOTO program yet, but here is a list of some of the sessions I’m looking forward to šŸ™‚ Ā Some of them are at the same time, so I really hope that they will be available on VOD afterwards.

Don’t be religious about the Internet of Things!
As an owner of a BI company, I have noticed that all of my competitors are talking about the internet of things, and yet, I still do not know what it is šŸ™‚ So yes, I’m looking very much forward to this session, where I’m hoping to get at good introduction to the subject.

I have a feeling that it is somehow, someones view on big data, but from another perspective. I have tried to take a look at the comic book by The Alexandra Institute, but I’m still not completely sure about what it is.

The ā€œMaking sense of dataā€ track
Well, the job description of every business intelligence developer, is properly to ā€œmake sense of dataā€, so I’m guessing this is a must šŸ™‚

The track ā€œOpen Data / eGovā€œ
A part of big data (In my line of work, it is important to say big data a lot these days), is to enrich a companies data warehouses with all kinds of market data. So all information about how to do that is welcome.

The NoSQL, MySQL and Big data track(There is that word again)
There is a trend at the current time, that make developers use no SQL databases. As a consumer of the data the developers applications create, I have to know there mindset behind this, so sessions about this, is also of big interest to me.

Well I guess that is it. Else I’ll just go with the flow, and get inspired by the people I talk to at the conference.

Attending GOTO as a Business intelligence developer

Back in my .NET developer days, like 100 years or so ago, I went to my first GOTO conference(Or JAOO as it was called back then). Back then, I loved .NET and anything that related to it, but that wasn’t the reason I went to GOTO.

If I wanted to know more about the changes from 1.0 to 1.1, it was proberly a MS conference I should have went to.

The reason for attending to GOTO, was to get inspired. Get inspired by all the different tracks, learning about what was going on with all the other languages and trends.

So, fast forward approx. a 100 year and a technology lane shift to a BI developer. Why should I go now??? Well, the same reason. If I want to know more about how the new indexes works in MS SQL server, or for instance the new in memory technologies from MS, other conferences offer that.

But my world changes fast, and a lot of new technologies are emerging, and even as a BI developer, GOTO still offers a lot of inspiration. All the next big things in my line of work, like Big data, the internet of things, no sql, open data, and a lot of other things, all has a track at GOTO.

BTW, If I haven’t mentioned it, I’m attending GOTO this year, and if you are like me, I guess I’ll see you there šŸ™‚

Error code: 0x80040E21 Reading from SSAS with SSIS

So, I upgraded a big project for a customer to 2012 from 2008R2. Everything worked fine, except the last part, where some datamarts was loaded with SSIS from SSAS. They had worked with no problem before, but suddenly, I received the above error code. It made no sense to me, and it was hard to find out, what precisely had happen. The worst, was, that I was able to see the preview in the datasource, but not run the package.

A workaround was to create a linked server to the cube, and then just select from that like so

SELECT * FROM openquery(<ServerName>, ā€˜<MDX>’)

And that was, properly they way I would have implemented it myself. But having hundreds of packages, rewriting them, was not the way to go.
Instead, I found an old post on google, where a person had the same problem, in an older SQL server. The solution was, to open the connection, click the all botton, and type in ā€œ(Format=Tabular)ā€ in the extended properties field. Se the image bellow.

So, yeah…., YIEAKS. But hey, if you find yourself having the same problem, you now know what the solution is šŸ™‚

Partitioning revisited

When building big data warehouses(DWH), partitioning tables is something you want to consider.

There is some good examples/explanations in BOL, but with this blog post I just want to give som easy hands on examples, so you can get started right away.

What
Partitioning tables is splitting up the data in a table, so the data is stored sepearatly, and can be managed seperatly. The SQL user can use the table without knowing the table is partitioned.

Why
Why do we want to partition big tables in for instance an enterprise DWH?
Well some benefits are listed on the SQL CAT(See resources) page, and they say it all

  • Partitioning can provide manageability and often performance benefits.
  • Faster, more granular index maintenance.
  • More flexible backup / restore options.
  • Faster data loading and deleting

How
Partitions can be managed both via management studio and with raw SQL. It is actually much easier to understand when using SQL, so I will only use SQL in this blog post.

Lets just start completely from scratch, by creating the partition function. It is simply used for defining the ranges in the partitions.

CREATE PARTITION FUNCTION myRangePF1 (datetime)
AS RANGE LEFT FOR VALUES ('2013.01.01', '2013.02.01', '2013.04.01') ;

After that, we want to create the partition scheme, which is used for mapping the partitions to a filegroup. For the case of simplicity we just use the primary filegroup In this example , but you don’t want to do that in production.

CREATE PARTITION SCHEME myRangePS1
AS PARTITION myRangePF1
TO ([PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY]) ;

Next, let’s create som partitioned tables

CREATE TABLE FactSomething (DW_TS DATETIME, factValue int)
ON myRangePS1 (DW_TS) ;

--Create an archieve table, where we want to store old values
CREATE TABLE FactSomethingArchive (DW_TS DATETIME, factValue int)
ON myRangePS1 (DW_TS) ;

Then let us insert some test data, so we have some data to work with.

INSERT INTO [dbo].[FactSomething] ([DW_TS] ,[factValue])
VALUES ('2010.01.01',1)

INSERT INTO [dbo].[FactSomething] ([DW_TS] ,[factValue])
VALUES ('2010.01.01',1)

INSERT INTO [dbo].[FactSomething] ([DW_TS] ,[factValue])
VALUES ('2010.01.01',1)

INSERT INTO [dbo].[FactSomething] ([DW_TS] ,[factValue])
VALUES ('2013.01.01',1)

INSERT INTO [dbo].[FactSomething] ([DW_TS] ,[factValue])
VALUES ('2013.02.02',1)

INSERT INTO [dbo].[FactSomething] ([DW_TS] ,[factValue])
VALUES ('2014.01.01',1)

You can use this select stm. to get all the necessary information about the partitions:

--Long SQL with all the information regarding the partition table
SELECT t.name AS TableName,p.rows,
p.partition_number,
p.partition_id,
f.function_id,
f.type_desc,
r.boundary_id,
r.value AS BoundaryValue
FROM sys.tables AS t
JOIN sys.indexes AS i
ON t.object_id = i.object_id
JOIN sys.partitions AS p
ON i.object_id = p.object_id AND i.index_id = p.index_id
JOIN sys.partition_schemes AS s
ON i.data_space_id = s.data_space_id
JOIN sys.partition_functions AS f
ON s.function_id = f.function_id
LEFT JOIN sys.partition_range_values AS r
ON f.function_id = r.function_id and r.boundary_id = p.partition_number
WHERE t.name = 'FactSomething' -- AND i.type <= 1
ORDER BY p.partition_number

Which in our case will show the following result.

Where the rows column show the number of rows in the partitions. Lets say we want to archive the oldest partition. Then we just switch the partition like so

ALTER TABLE dbo.FactSomething
SWITCH PARTITION 1
TO dbo.FactSomethingArchive PARTITION 1

If we run the partition info SQL again, we now get the following result

Where you can see, that we now have 0 rows in the first partition. If you do a select * from the archieve table, we will now have 4 rows there.

Lets say, that we don’t want to have the old partitions in our fact table anymore, we just run this merge stm.

ALTER PARTITION FUNCTION myRangePF1()
MERGE RANGE ('2013-01-01')

partition info SQL will now return only 3 rows as shown below.

To sum up šŸ™‚

I have shown how to create partitioned tables, and how to use them. Next step will be managing there indexes, and how to make sliding window partitioning. Maybe there will be a post about those subjects here later šŸ™‚

Resource

http://sqlcat.com/sqlcat/b/top10lists/archive/2008/02/06/top-10-best-practices-for-building-a-large-scale-relational-data-warehouse.aspx

http://msdn.microsoft.com/en-us/library/ms190787.aspx

SQL TIP-> Display seconds as HH:MM:SS

In many BI application you have stored a duration as seconds. To display that on the format HH:MM:SS just use the following SQL(Replace 100 with your column name):

SELECT
convert(char(8),dateadd(second,100,0),108) as [Duration]

Wich yield the result

00:01:40

As in my previous blogpost where I did the same in analysis services, this only works if the duration is below 24 hours, else you have to do some additional logic.