Infrastructure at your Service

MySQL 5.6 offers improved partition maintenance through “partition exchange”. Even if not yet available, dbi services performed some tests on the MySQL Community Edition 5.6.2 m5 release (still under development).

Starting with release 5.1, MySQL offers table partitioning. As for any new feature, partitioning has been improved over the last years and now comes with quite a complete scope of functionalities:

  • Range, list, hash, composite partition types are supported
  • Advanced partitioning maintenance operations are also supported: merge, split (called reorganize), repair, optimize, etc.

Among several missing features, one of them was quite “critical”. Frequently used in many data warehouse environments, “partition exchange” allows to exchange an empty partition with an existing table. This was a pity since we know how easy it is to quickly load data from a external/legacy system on a MySQL server. This new feature has been introduced in MySQL 5.6 (still under development and not yet “GA”: generally available) and tested by dbi services.

Let’s illustrate the feature through an example. At the end of the year, we receive a csv file with the sales information to be loaded on the data warehouse. This sales information contains the salesman_id, the product_id, the amount sold and the date of the sale.

To load this file in a temporary (staging) table is very easy with MySQL. First of all let’s have a look at the file:

head -5 /tmp/sales_range_2012.lst
2,43,193,2012-09-09
5,23,167,2012-11-05
3,29,182,2012-11-25
3,23,185,2012-11-03
1,18,180,2012-11-24

To load this data, there is nothing easier. Loading this data is a straightforward process, we first create the staging table, then load the data with one MySQL statement:

mysql> CREATE TABLE temp_sales ( salesman_id INT, product_id  INT, sales_amount INT,  sales_date date) ENGINE=MyISAM;
 Query OK, 0 rows affected (0.00 sec)
mysql> LOAD DATA INFILE '/tmp/sales_range_2012.lst' INTO TABLE temp_sales FIELDS TERMINATED BY ',' LINES TERMINATED BY 'n';
 Query OK, 37787 rows affected (0.16 sec)
 Records: 37787  Deleted: 0  Skipped: 0  Warnings: 0

Our sales table (partitioned by range) looks like this:

mysql> show create table sales_range G
 *************************** 1. row ***************************
 Table: sales_range
 Create Table: CREATE TABLE `sales_range` (
 `salesman_id` int(11) DEFAULT NULL,
 `product_id` int(11) DEFAULT NULL,
 `sales_amount` int(11) DEFAULT NULL,
 `sales_date` date DEFAULT NULL
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1
 /*!50100 PARTITION BY RANGE (YEAR(sales_date))
 (PARTITION sales_year_2008 VALUES LESS THAN (2009) ENGINE = MyISAM,
 PARTITION sales_year_2009 VALUES LESS THAN (2010) ENGINE = MyISAM,
 PARTITION sales_year_2010 VALUES LESS THAN (2011) ENGINE = MyISAM,
 PARTITION sales_year_2011 VALUES LESS THAN (2012) ENGINE = MyISAM) */
 1 row in set (0.00 sec)

At this stage some data “transformation” might occur based on MySQL PL-SQL in the sales_temp table (if needed).

Once the data is ready, MySQL 5.6 “partition exchange” might be used to load the sales data from 2012 in the 2012 partition.

First of all, the partition “sales_year_2012” should be created within the sales_range table:

mysql> ALTER TABLE sales_range ADD PARTITION (PARTITION sales_year_2012 VALUES LESS THAN (2013) ENGINE = MyISAM);
 Query OK, 0 rows affected (0.01 sec)
 Records: 0  Duplicates: 0  Warnings: 0

Since the data is already loaded and transformed in sales_temp, we just have to exchange sales_year_2012 and temp_sales:

mysql> ALTER TABLE sales_range EXCHANGE PARTITION sales_year_2012 WITH TABLE temp_sales;
 Query OK, 0 rows affected (0.11 sec)

That’s it! All data contained in the sales_temp table has been anchored to the sales_range partitioned table. Let’s have a look at this with the “explain” feature, selecting the 2012 sales, only the sales_year_2012 partition is scanned:

mysql> explain partitions select count(*) from sales_range where sales_date>= '2012-01-01';
 +----+-------------+-------------+-----------------+------+---------------+------+---------+------+
 | id | select_type | table       | partitions      | type | possible_keys | key  | key_len | ref  +----+-------------+-------------+-----------------+------+---------------+------+---------+------+
 |  1 | SIMPLE      | sales_range | sales_year_2012 | ALL  | NULL          | NULL | NULL    | NULL +----+-------------+-------------+-----------------+------+---------------+------+---------+------+
 1 row in set (0.00 sec)

What’s next?

As mentioned in the MySQL documentation the next important feature for partitioning usage will be the possibility to perform parallel queries while computing SQL aggregations – i.e sum() – over several partitions. Several threads in parallel should read the several partitions to provide the whole result faster. MySQL is working on this topic.

One Comment

  • Wow, very comprehensive review. I’m thinking about learning HTML5. I’m seeing more and more job/freelance listings asking for html5 media playerknowledge and so along with reading some of the online references you listed, I’m going to have to pick up the book as well. Really appreciated the chapter breakdowns.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Yann Neuhaus
Yann Neuhaus

Chairman of the Board, Chief Sales Officer (CSO), Region Manager