dbi Blog

Introduction

A database is very often created without thinking of its growing and in the most cases that is not really an issue until the data volume is not increasing a lot. And most of the time when a high data volume is expected, some actions are taken to organize the database in a more complex way, creating several data files or several filegroups with several files attached to them.
But when the design has not been setup correctly since the beginning, or business requirements changed drastically, increasing considerably the data volume estimated at the beginning of a project, you need probably to change and adapt your database design to ensure that the performances remains stable and that you still can administrate it in a professional way without disturbing your business and your users.

We will see how to change your design and adapt your database structure for big data volume.

Some notions to consider

First of all, you need to know the different possibilities you have to optimize the way the data is stored in a database and in tables.
The way a SQL database is organized consist of
– Files: physical location of the data
– Filegroups: logical grouping of files

The way a data is stored in tables consist of:
– Heap: when the table is defined without clustered index
– Clustered index: being the table data reorganized by the clustered index key you defined
– Non-clustered index being a copy of part of your table data stored beside your table containing as well a pointer to your table data (Row ID)

When you expect to have big data volume, it is recommended to use the following storage notion on table level, enabling you to organize your data in smaller subset.
– Partitions

Using partitions is greatly recommended when you data volume is important. You can as well linked your partitions with filegroups in order to finally physically store your partition data in different files. Working with smaller subset of data, stored in smaller files enables optimal usages of multi-threading used by you SQL Server optimizer. It will enable you to tune the maintenance and operational tasks on your database in a more flexible way.

When redesigning your database structure and storage strategy, you may also consider the compression possibilities that SQL Server offer, that by experience bring a lot for process performance optimization.

Creating a database

When you create a database you will have the following structure
– 1 Filegroup PRIMARY
– 1 Data file in the PRIMARY filegroup
– 1 Transaction log file

For database with middle size (between 500GB and 1TB) I definitely recommend and it is a common practice, to have several data files. A Best practice is to have a file per CPU (with a max a 8 files).

But for bigger database, I recommend to design your database in order to store smaller subset of data in the same physical entity. The less data has to be manipulated by your SQL Server instance the better it is. And it can avoid you the nightmare of the too short night for running the business and maintenance processes.

I advice than not to wait for changing the way your data is organized. The more data volume you have the more difficult it will be to change your structure.

Starting point

Let’s consider a database having a very simple structure

USE [master]
CREATE DATABASE [SampleDB]
ON  PRIMARY
( NAME = N'SampleDB', FILENAME = N'C:SQLServerInst01DataSampleDB.mdf' , SIZE = 1GB , MAXSIZE = UNLIMITED, FILEGROWTH = 1GB )
LOG ON
( NAME = N'SampleDB_log', FILENAME = N'C:SQLServerInst01TxLogSampleDB_log.ldf' , SIZE = 256MB , MAXSIZE = UNLIMITED, FILEGROWTH = 256MB)
GO

In this database I have a table that was created like that:

use [SampleDB]
GO
CREATE TABLE [dbo].[f_SalesRevenue](
[ProductID] [int] NULL,
[Date] [date] NULL,
[SalesCountry] [varchar](50) NULL,
[Zip] [varchar](10) NULL,
[Units] [int] NULL,
[Revenue] [numeric](18, 6) NULL,
[DateID] [int] NULL,
[CountryZipId] [varchar](60) NULL
) ON [PRIMARY]
GO

SET ANSI_PADDING ON
GO

CREATE CLUSTERED COLUMNSTORE INDEX [CCI_SalesRevenue] ON [dbo].[f_SalesRevenue] WITH (DROP_EXISTING = OFF, COMPRESSION_DELAY = 0, DATA_COMPRESSION = COLUMNSTORE) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [idx_f_SalesRevenue_CountryZipId] ON [dbo].[f_SalesRevenue]
(
[CountryZipId] ASC,
[DateID] ASC
)
INCLUDE ( [ProductID],
[Revenue],
[Units]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

SET ANSI_PADDING ON
GO

CREATE NONCLUSTERED INDEX [idx_f_SalesRevenue_Date_Product_Geo_id] ON [dbo].[f_SalesRevenue]
(
[DateID] ASC,
[CountryZipId] ASC,
[ProductID] ASC
)
INCLUDE ( [Revenue],
[Units]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [idx_f_SalesRevenue_DateId] ON [dbo].[f_SalesRevenue]
(
[DateID] ASC
)
INCLUDE ( [Revenue],
[Units]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

SET ANSI_PADDING ON
GO

CREATE NONCLUSTERED INDEX [idx_f_SalesRevenue_ProductID] ON [dbo].[f_SalesRevenue]
(
[ProductID] ASC,
[DateID] ASC
)
INCLUDE ( [CountryZipId],
[Revenue],
[Units]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

With the following script you can see the location of my data in my actual filegroup structure:

Use [SampleDB]
SELECT
t.name AS TableName,
i.name as IdxName,
CASE WHEN ds.type='FG' THEN ds.name ELSE '(Partitioned)' END AS FilegroupName,
p.rows
FROM sys.tables t
inner join sys.indexes i on i.object_id = t.object_id
inner join sys.partitions p on p.object_id = i.object_id and p.index_id = i.index_id
left join sys.data_spaces ds on ds.data_space_id = i.data_space_id
ORDER BY  FilegroupName, TableName, IdxName
GO

TableName	IdxName	FilegroupName	rows
f_SalesRevenue	CCI_SalesRevenue	PRIMARY	7235490
f_SalesRevenue	idx_f_SalesRevenue_CountryZipId	PRIMARY	7235490
f_SalesRevenue	idx_f_SalesRevenue_Date_Product_Geo_id	PRIMARY	7235490
f_SalesRevenue	idx_f_SalesRevenue_DateId	PRIMARY	7235490
f_SalesRevenue	idx_f_SalesRevenue_ProductID	PRIMARY	7235490

You can see that all my table data is on the PRIMARY filegroup, of course because I defined only this one

Create new filegroups

As I said when volume is increasing you can split your database storage in several files.
The new structure is of course depending on your situation. But a good start could be to create new filegroups and attached physical files to them.
Form my demo, I will create a new filegroup to store the data and another to store the non-cluster indexes. As I have an SQL Server Instance with 4 CPU I will attached 4 files to my new filegroups.
Of course you are free to create as many filegroups you wish for your purpose, for example a one for storing the sales info, another to store the stock, etc…

Caution:
The max number of filegroups is 32,767 per database and the same number of files per database

So I start to change my database file structure:

USE [master]
GO
ALTER DATABASE [SampleDB] ADD FILEGROUP [FG_DATA]
ALTER DATABASE [SampleDB] ADD FILE ( NAME = N'SampleDB_Data01', FILENAME = N'C:SQLServerInst01DataSampleDB_Data01.ndf' , SIZE = 256MB , FILEGROWTH = 256MB) TO FILEGROUP [FG_DATA]
ALTER DATABASE [SampleDB] ADD FILE ( NAME = N'SampleDB_Data02', FILENAME = N'C:SQLServerInst01DataSampleDB_Data02.ndf' , SIZE = 256MB , FILEGROWTH = 256MB) TO FILEGROUP [FG_DATA]
ALTER DATABASE [SampleDB] ADD FILE ( NAME = N'SampleDB_Data03', FILENAME = N'C:SQLServerInst01DataSampleDB_Data03.ndf' , SIZE = 256MB , FILEGROWTH = 256MB) TO FILEGROUP [FG_DATA]
ALTER DATABASE [SampleDB] ADD FILE ( NAME = N'SampleDB_Data04', FILENAME = N'C:SQLServerInst01DataSampleDB_Data04.ndf' , SIZE = 256MB , FILEGROWTH = 256MB) TO FILEGROUP [FG_DATA]
GO
ALTER DATABASE [SampleDB] ADD FILEGROUP [FG_INDEX]
ALTER DATABASE [SampleDB] ADD FILE ( NAME = N'SampleDB_Index01', FILENAME = N'C:SQLServerInst01DataSampleDB_Index01.ndf' , SIZE = 256MB , FILEGROWTH = 256MB) TO FILEGROUP [FG_INDEX]
ALTER DATABASE [SampleDB] ADD FILE ( NAME = N'SampleDB_Index02', FILENAME = N'C:SQLServerInst01DataSampleDB_Index02.ndf' , SIZE = 256MB , FILEGROWTH = 256MB) TO FILEGROUP [FG_INDEX]
ALTER DATABASE [SampleDB] ADD FILE ( NAME = N'SampleDB_Index03', FILENAME = N'C:SQLServerInst01DataSampleDB_Index03.ndf' , SIZE = 256MB , FILEGROWTH = 256MB) TO FILEGROUP [FG_INDEX]
ALTER DATABASE [SampleDB] ADD FILE ( NAME = N'SampleDB_Index04', FILENAME = N'C:SQLServerInst01DataSampleDB_Index04.ndf' , SIZE = 256MB , FILEGROWTH = 256MB) TO FILEGROUP [FG_INDEX]
GO
IF NOT EXISTS (SELECT name FROM [SampleDB].sys.filegroups WHERE is_default=1 AND name = N'FG_DATA') ALTER DATABASE [SampleDB] MODIFY FILEGROUP [FG_DATA] DEFAULT
IF NOT EXISTS (SELECT name FROM [SampleDB].sys.filegroups WHERE is_autogrow_all_files=1 AND name = N'FG_DATA') ALTER DATABASE [SampleDB] MODIFY FILEGROUP [FG_DATA] AUTOGROW_ALL_FILES
IF NOT EXISTS (SELECT name FROM [SampleDB].sys.filegroups WHERE is_autogrow_all_files=1 AND name = N'FG_INDEX') ALTER DATABASE [SampleDB] MODIFY FILEGROUP [FG_INDEX] AUTOGROW_ALL_FILES
GO

Now that my structure has been changed I can move my data to my new filegroups.
I move my cluster index to my FG_DATA filegroup and the non-clustered indexes to the FG_INDEX filegroup.
You will notice that I recreate the index again with the hint DROP_EXISTING = ON and specifying on which filegroup I want to store it.
Depending on your SQL Server Version, you will be able to move this index online. But of course depending of your data volume this can take time.
If your table has no clustered index, one solution can be to create one, move the index and drop it after the move is over.

use [SampleDB]
GO 
CREATE CLUSTERED COLUMNSTORE INDEX [CCI_SalesRevenue] ON [dbo].[f_SalesRevenue] WITH (DROP_EXISTING = ON, COMPRESSION_DELAY = 0, DATA_COMPRESSION = COLUMNSTORE) ON [FG_DATA]
GO

CREATE NONCLUSTERED INDEX [idx_f_SalesRevenue_CountryZipId] ON [dbo].[f_SalesRevenue]
(
[CountryZipId] ASC,
[DateID] ASC
)
INCLUDE ( [ProductID],
[Revenue],
[Units]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = ON, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [FG_INDEX]
GO

SET ANSI_PADDING ON
GO

CREATE NONCLUSTERED INDEX [idx_f_SalesRevenue_Date_Product_Geo_id] ON [dbo].[f_SalesRevenue]
(
[DateID] ASC,
[CountryZipId] ASC,
[ProductID] ASC
)
INCLUDE ( [Revenue],
[Units]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = ON, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [FG_INDEX]
GO

CREATE NONCLUSTERED INDEX [idx_f_SalesRevenue_DateId] ON [dbo].[f_SalesRevenue]
(
[DateID] ASC
)
INCLUDE ( [Revenue],
[Units]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = ON, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [FG_INDEX]
GO

SET ANSI_PADDING ON
GO

CREATE NONCLUSTERED INDEX [idx_f_SalesRevenue_ProductID] ON [dbo].[f_SalesRevenue]
(
[ProductID] ASC,
[DateID] ASC
)
INCLUDE ( [CountryZipId],
[Revenue],
[Units]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = ON, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [FG_INDEX]
GO

Checking the data location of my table you will get:

TableName	IdxName	FilegroupName	rows
f_SalesRevenue	CCI_SalesRevenue	FG_DATA	7235490
f_SalesRevenue	idx_f_SalesRevenue_CountryZipId	FG_INDEX	7235490
f_SalesRevenue	idx_f_SalesRevenue_Date_Product_Geo_id	FG_INDEX	7235490
f_SalesRevenue	idx_f_SalesRevenue_DateId	FG_INDEX	7235490
f_SalesRevenue	idx_f_SalesRevenue_ProductID	FG_INDEX	7235490

Let’s consider partitioning

When you data volume is very big, you have to consider partitioning. I recommend building your database file structure closely to your partitioning strategy. For instance, if you decide you make yearly partitions to store your data in some tables, why not to store these partitions in different filegroups. It will give you the flexibility to adapt easier your operational and maintenance tasks. You will be able to set filegroups as read-only when no data is altered anymore and make partial backups, it will also be possible to optimize your index maintenance on very big tables or considering DBCC CHECKFILEGROUP instead of DBCC CHECKDB that could last too long on big databases.

Creation of the partition and the storage structure

As already said, you have first to defined your partition strategy. Changing it later because it is not well fitting your scenario can be cumbersome. Here are some tips to define your strategy the right way:
– Decide what data can be partitioned and on which info (i.e. date, customer, Product,…).
– Decide your partition boundaries. If your partition column is a date, decide to set the boundaries on year or month even day depending on your data volume and your scenario.
– Decide where your partition will be stored and design your filegroup structure accordingly. It can make sense to separate the storage of the sales information and the one of the stock for instance and therefore you have to adapt your partition design accordingly.
– But in any case do not over complexify your design otherwise it will be difficult to maintained.
– Remember not to exceed the limit of 32,767 filegroups or files per database. Implementing it, you will have to process in the right order: To illustrate the implementation, in my case I will create yearly partition from 2011 to 2025 and a I will provision the filegroup and belonging files accordingly. I chose to have 4 files per filegroup and have a different filegroup for each year. I created a script to be able to generate the filegroups and belonging files in order to avoid an typo errors. It can be very helpful if you have an important number of partition schemes and multitude of data storage sets.

1- create your filegroup structures

declare
@DBName sysname = 'PartitionDB',
@PartitionSuffix sysname =  'PDataSales',
@PartitionRange nvarchar(max) = '2010,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020,2021,2022,2023,2024,2025',
@DataFilePath sysname = 'C:SQLServerInst01Data',
@FileSize nvarchar(16) = '256MB',
@FileGrowth nvarchar(16) = '256MB',
@NbFilesPerFG int = 4; --###recommanded max 8
declare @TPartitionValues table (PartVal nvarchar(32));
declare @PartVal nvarchar(32)
declare @TSqlCmd nvarchar(max) = '';

declare @FgName sysname = '';
declare @FileName sysname = '';
declare @FgFiles nvarchar(max) = '';

declare @iFile int;

insert into @TPartitionValues (PartVal)
select value from string_split(@PartitionRange,',');

declare pv_cur cursor for
select PartVal
from @TPartitionValues
order by PartVal;

set @TSqlCmd = 'use ' + QUOTENAME(@DBName) + char(13) + char(10);

open pv_cur;
fetch next from pv_cur into @PartVal;

while @@FETCH_STATUS = 0
begin
--print @TSqlCmd

set @FgName = '[fg_' + @DBName + '_' + @PartitionSuffix + @PartVal + ']'
set @TSqlCmd = @TSqlCmd
+ 'ALTER DATABASE ' + QUOTENAME(@DBName) + ' ADD FILEGROUP ' + @FgName
+ char(13) + char(10);

set @iFile = 0;
set @FgFiles = '';
while @iFile < @NbFilesPerFG
begin
set @iFile = @iFile + 1;
set @FileName = @DBName + '_' + @PartitionSuffix + @PartVal + '_' + right('00' + convert(nvarchar,@iFile),2)
set @FgFiles = @FgFiles
+ 'ALTER DATABASE ['+ @DBName +'] '
+ 'ADD FILE ( NAME = N'''+ @FileName + ''', '
+ 'FILENAME = N''' + @DataFilePath + @FileName + '.ndf'', '
+ 'SIZE =' +@FileSize + ' , FILEGROWTH =' + @FileGrowth + ') '
+ 'TO FILEGROUP ' + @FgName
+ char(13) + char(10);

end

set @TSqlCmd = @TSqlCmd
+ @FgFiles
+ char(13) + char(10);

fetch next from pv_cur into @PartVal;
end

close pv_cur;
deallocate pv_cur;

select @TSqlCmd

2- Create your partition functions

CREATE PARTITION FUNCTION [pf_bySalesYear](date) AS RANGE RIGHT FOR VALUES (
N'2011-01-01',
N'2012-01-01',
N'2013-01-01',
N'2014-01-01',
N'2015-01-01',
N'2016-01-01',
N'2017-01-01',
N'2018-01-01',
N'2019-01-01',
N'2020-01-01',
N'2021-01-01',
N'2022-01-01',
N'2023-01-01',
N'2024-01-01',
N'2025-01-01'
)
go

Create your partition schemes

CREATE PARTITION SCHEME [ps_bySalesYear] AS PARTITION [pf_bySalesYear] TO (
[fg_PartitionDB_PDataSales2010],
[fg_PartitionDB_PDataSales2011],
[fg_PartitionDB_PDataSales2012],
[fg_PartitionDB_PDataSales2013],
[fg_PartitionDB_PDataSales2014],
[fg_PartitionDB_PDataSales2015],
[fg_PartitionDB_PDataSales2016],
[fg_PartitionDB_PDataSales2017],
[fg_PartitionDB_PDataSales2018],
[fg_PartitionDB_PDataSales2019],
[fg_PartitionDB_PDataSales2020],
[fg_PartitionDB_PDataSales2021],
[fg_PartitionDB_PDataSales2022],
[fg_PartitionDB_PDataSales2023],
[fg_PartitionDB_PDataSales2024],
[fg_PartitionDB_PDataSales2025]
)
go

USE [PartitionDB]

ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2010] AUTOGROW_ALL_FILES
ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2011] AUTOGROW_ALL_FILES
ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2012] AUTOGROW_ALL_FILES
ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2013] AUTOGROW_ALL_FILES
ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2014] AUTOGROW_ALL_FILES
ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2015] AUTOGROW_ALL_FILES
ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2016] AUTOGROW_ALL_FILES
ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2017] AUTOGROW_ALL_FILES
ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2018] AUTOGROW_ALL_FILES
ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2019] AUTOGROW_ALL_FILES
ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2020] AUTOGROW_ALL_FILES
ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2021] AUTOGROW_ALL_FILES
ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2022] AUTOGROW_ALL_FILES
ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2023] AUTOGROW_ALL_FILES
ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2024] AUTOGROW_ALL_FILES
ALTER DATABASE [PartitionDB] MODIFY FILEGROUP [fg_PartitionDB_PDataSales2025] AUTOGROW_ALL_FILES
GO

Moving the data to the new partition structure

Now that we have the new partitions setup, we have to move the data to them.

For very big tables the more efficient way is probably to create a new table that will enable to insert the data from the source table.
So, you will be able to control the way the data is transferred to the destination table, splitting dataset and avoiding to run too big transactions.
Creating the table or indexes you will need to specify which partition scheme is used and the associated column (in my case ON [ps_bySalesYear]([Date]))

CREATE TABLE [dbo].[f_SalesRevenue_pf](
        [ProductID] [int] NULL,
        [Date] [date] NULL,
        [SalesCountry] [varchar](50) NULL,
        [Zip] [varchar](10) NULL,
        [Units] [int] NULL,
        [Revenue] [numeric](18, 6) NULL,
        [DateID] [int] NULL,
        [CountryZipId] [varchar](60) NULL
) ON [ps_bySalesYear]([Date])
GO

I insert now some records:

insert into dbo.f_SalesRevenue_pf
select * from dbo.f_SalesRevenue sr
where sr.Date >=  '2011-01-01'
        and sr.Date < '2012-01-01'

With the following script let’s have a look at the Filgroup situation

Use [PartitionDB]
SELECT 
        t.name AS TableName, 
        CASE WHEN ds.type='FG' THEN ds.name ELSE fg.name END AS FilegroupName,
        i.name as IdxName,
        i.type_desc,
        p.rows
FROM sys.tables t 
        inner join sys.indexes i on i.object_id = t.object_id 
        inner join sys.partitions p on p.object_id = i.object_id and p.index_id = i.index_id
        left join sys.data_spaces ds on ds.data_space_id = i.data_space_id 
        left join sys.partition_schemes ps on ps.data_space_id  = i.data_space_id
        left join sys.destination_data_spaces AS dds2 on dds2.partition_scheme_id = ps.data_space_id AND dds2.destination_id = p.partition_number
        left join sys.filegroups AS fg (NOLOCK) ON fg.data_space_id = dds2.data_space_id
ORDER BY  TableName,FilegroupName, IdxName 
GO

You can see that the fg_PartitionDB_PDataSales2011 filegroup has now data.

TableName	FilegroupName	IdxName	type_desc	rows
f_SalesRevenue	Data	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	10417146
f_SalesRevenue	Index	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	10417146
f_SalesRevenue	Index	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	10417146
f_SalesRevenue	Index	idx_f_SalesRevenue_DateId	NONCLUSTERED	10417146
f_SalesRevenue	Index	idx_f_SalesRevenue_ProductID	NONCLUSTERED	10417146
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2010	NULL	HEAP	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2011	NULL	HEAP	528063
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2012	NULL	HEAP	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2013	NULL	HEAP	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2014	NULL	HEAP	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2015	NULL	HEAP	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2016	NULL	HEAP	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2017	NULL	HEAP	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2018	NULL	HEAP	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2019	NULL	HEAP	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2020	NULL	HEAP	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2021	NULL	HEAP	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2022	NULL	HEAP	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2023	NULL	HEAP	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2024	NULL	HEAP	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2025	NULL	HEAP	0

We can create the indexes and load the other months in the new table, than you can rename the source table and apply the former table name to the new one.
When you sure that everything is working fine you can drop the old table.

Index creation:

CREATE NONCLUSTERED INDEX [idx_f_SalesRevenue_CountryZipId] ON [dbo].[f_SalesRevenue_pf]
(
        [CountryZipId] ASC,
        [DateID] ASC
)
INCLUDE ( 	[ProductID],
        [Revenue],
        [Units]
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [ps_bySalesYear]([Date])
GO

SET ANSI_PADDING ON
GO

CREATE NONCLUSTERED INDEX [idx_f_SalesRevenue_Date_Product_Geo_id] ON [dbo].[f_SalesRevenue_pf]
(
        [Date] ASC,
        [CountryZipId] ASC,
        [ProductID] ASC
        
)
INCLUDE ( 	[Revenue],
        [Units]
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [ps_bySalesYear]([Date])
GO

CREATE NONCLUSTERED INDEX [idx_f_SalesRevenue_DateId] ON [dbo].[f_SalesRevenue_pf]
(
        [Date] ASC
)
INCLUDE ( 	[Revenue],
        [Units]
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [ps_bySalesYear]([Date])
GO

SET ANSI_PADDING ON
GO

CREATE NONCLUSTERED INDEX [idx_f_SalesRevenue_ProductID] ON [dbo].[f_SalesRevenue_pf]
(
        
        [ProductID] ASC,
        [Date] ASC
        
)
INCLUDE ( 	[CountryZipId],
        [Revenue],
        [Units]
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [ps_bySalesYear]([Date])
GO

CREATE CLUSTERED COLUMNSTORE INDEX [CCI_SalesRevenue] ON [dbo].[f_SalesRevenue_pf] WITH (DROP_EXISTING = OFF, COMPRESSION_DELAY = 0, DATA_COMPRESSION = COLUMNSTORE) ON [ps_bySalesYear]([Date])
GO

Transfer the data:

insert into dbo.f_SalesRevenue_pf
select * from dbo.f_SalesRevenue sr
where sr.Date >=  '2012-01-01'
        and sr.Date < '2013-01-01' go insert into dbo.f_SalesRevenue_pf select * from dbo.f_SalesRevenue sr where sr.Date >=  '2013-01-01'
        and sr.Date < '2014-01-01' go insert into dbo.f_SalesRevenue_pf select * from dbo.f_SalesRevenue sr where sr.Date >=  '2014-01-01'
        and sr.Date < '2015-01-01' go insert into dbo.f_SalesRevenue_pf select * from dbo.f_SalesRevenue sr where sr.Date >=  '2015-01-01'
        and sr.Date < '2016-01-01' go insert into dbo.f_SalesRevenue_pf select * from dbo.f_SalesRevenue sr where sr.Date >=  '2016-01-01'
        and sr.Date < '2017-01-01' go insert into dbo.f_SalesRevenue_pf select * from dbo.f_SalesRevenue sr where sr.Date >=  '2017-01-01'
        and sr.Date < '2018-01-01' go insert into dbo.f_SalesRevenue_pf select * from dbo.f_SalesRevenue sr where sr.Date >=  '2018-01-01'
        and sr.Date < '2019-01-01' go insert into dbo.f_SalesRevenue_pf select * from dbo.f_SalesRevenue sr where sr.Date >=  '2019-01-01'
        and sr.Date < '2020-01-01' go insert into dbo.f_SalesRevenue_pf select * from dbo.f_SalesRevenue sr where sr.Date >=  '2020-01-01'
        and sr.Date < '2021-01-01' go insert into dbo.f_SalesRevenue_pf select * from dbo.f_SalesRevenue sr where sr.Date >=  '2021-01-01'
        and sr.Date < '2022-01-01'
go

Of course you can parallelized the data transfer.
If you do not have a sufficient long slot to transfer all the data, you also create a view on the source table with an union all with the new table, but do not forget to delete the data in the source table to avoid duplicating it.

Now that the data is completely transferred the data storage looks like this:

TableName	FilegroupName	IdxName	type_desc	rows
f_SalesRevenue	Data	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	10417146
f_SalesRevenue	Index	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	10417146
f_SalesRevenue	Index	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	10417146
f_SalesRevenue	Index	idx_f_SalesRevenue_DateId	NONCLUSTERED	10417146
f_SalesRevenue	Index	idx_f_SalesRevenue_ProductID	NONCLUSTERED	10417146
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2010	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2010	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2010	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2010	idx_f_SalesRevenue_DateId	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2010	idx_f_SalesRevenue_ProductID	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2011	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	528063
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2011	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	528063
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2011	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	528063
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2011	idx_f_SalesRevenue_DateId	NONCLUSTERED	528063
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2011	idx_f_SalesRevenue_ProductID	NONCLUSTERED	528063
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2012	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	704573
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2012	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	704573
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2012	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	704573
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2012	idx_f_SalesRevenue_DateId	NONCLUSTERED	704573
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2012	idx_f_SalesRevenue_ProductID	NONCLUSTERED	704573
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2013	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	867642
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2013	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	867642
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2013	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	867642
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2013	idx_f_SalesRevenue_DateId	NONCLUSTERED	867642
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2013	idx_f_SalesRevenue_ProductID	NONCLUSTERED	867642
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2014	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	946044
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2014	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	946044
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2014	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	946044
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2014	idx_f_SalesRevenue_DateId	NONCLUSTERED	946044
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2014	idx_f_SalesRevenue_ProductID	NONCLUSTERED	946044
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2015	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	1007612
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2015	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	1007612
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2015	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	1007612
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2015	idx_f_SalesRevenue_DateId	NONCLUSTERED	1007612
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2015	idx_f_SalesRevenue_ProductID	NONCLUSTERED	1007612
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2016	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	1048864
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2016	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	1048864
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2016	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	1048864
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2016	idx_f_SalesRevenue_DateId	NONCLUSTERED	1048864
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2016	idx_f_SalesRevenue_ProductID	NONCLUSTERED	1048864
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2017	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	1060797
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2017	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	1060797
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2017	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	1060797
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2017	idx_f_SalesRevenue_DateId	NONCLUSTERED	1060797
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2017	idx_f_SalesRevenue_ProductID	NONCLUSTERED	1060797
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2018	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	1071895
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2018	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	1071895
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2018	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	1071895
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2018	idx_f_SalesRevenue_DateId	NONCLUSTERED	1071895
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2018	idx_f_SalesRevenue_ProductID	NONCLUSTERED	1071895
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2019	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	1048879
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2019	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	1048879
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2019	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	1048879
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2019	idx_f_SalesRevenue_DateId	NONCLUSTERED	1048879
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2019	idx_f_SalesRevenue_ProductID	NONCLUSTERED	1048879
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2020	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	1060882
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2020	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	1060882
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2020	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	1060882
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2020	idx_f_SalesRevenue_DateId	NONCLUSTERED	1060882
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2020	idx_f_SalesRevenue_ProductID	NONCLUSTERED	1060882
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2021	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	1071895
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2021	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	1071895
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2021	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	1071895
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2021	idx_f_SalesRevenue_DateId	NONCLUSTERED	1071895
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2021	idx_f_SalesRevenue_ProductID	NONCLUSTERED	1071895
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2022	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2022	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2022	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2022	idx_f_SalesRevenue_DateId	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2022	idx_f_SalesRevenue_ProductID	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2023	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2023	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2023	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2023	idx_f_SalesRevenue_DateId	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2023	idx_f_SalesRevenue_ProductID	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2024	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2024	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2024	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2024	idx_f_SalesRevenue_DateId	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2024	idx_f_SalesRevenue_ProductID	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2025	CCI_SalesRevenue	CLUSTERED COLUMNSTORE	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2025	idx_f_SalesRevenue_CountryZipId	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2025	idx_f_SalesRevenue_Date_Product_Geo_id	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2025	idx_f_SalesRevenue_DateId	NONCLUSTERED	0
f_SalesRevenue_pf	fg_PartitionDB_PDataSales2025	idx_f_SalesRevenue_ProductID	NONCLUSTERED	0

Conclusion

We have seen that even if your database is growing and its design is no more adequate, you have the possibility to adapt it.
Depending on the data volume it can be a long process until your can change the structure completely.
Therefore do not wait too long to decide to take action. The longer you wait the more difficult it will be and the more problems you will face with performance.

Post Views: 1,872

SQL Server database design – Moving table to new filegroups

Introduction

Some notions to consider

Creating a database

Starting point

Create new filegroups

Let’s consider partitioning

Creation of the partition and the storage structure

1- create your filegroup structures

2- Create your partition functions

Create your partition schemes

Moving the data to the new partition structure

Conclusion

Christophe Cosme

Leave a Reply:

Related blog articles

Introduction

Some notions to consider

Creating a database

Starting point

Create new filegroups

Let’s consider partitioning

Creation of the partition and the storage structure

1- create your filegroup structures

2- Create your partition functions

Create your partition schemes

Moving the data to the new partition structure

Conclusion

Christophe Cosme

Leave a Reply:

Related blog articles

Another file system for Linux: bcachefs (2) – multi device file systems

Another file system for Linux: bcachefs (1) – basics

Build SQL Server audit reports with Powershell

SQL-Server: What are VLF’s and why should I care about them?