MySQL and a number of it's variants can be used as a time-series database. Using the MySQL example employees database we are going to provide a list of time-series analysis that we want performed, giving you a chance to try writing the SQL yourself. We will then provide the SQL we would have used.

Importing Example MySQL Database

Download employees_db-full-1.0.6.tar.bz2 and unzip the contents. Once unzipped run the following command:

We now have a database containing some time-series data.

Time Series SQL Queries Exercise

We want to answer the following queries:

  1. Querying only the employees table, we want to form the following Group By Time SQL queries:
    1. Find the number of employees hired each year.
    2. Find the number of employees hired each month.
    3. Find the number of employees hired each week. This table should have the format:
      +------+------+-----------+
      | Year | Week | Employees |
      +------+------+-----------+
      | 1999 |   48 |        10 |
      | 1999 |   49 |         6 |
      | 1999 |   50 |         6 |
  2. Querying only the titles table, pivot title rows by year to give:
    Year Engineer Manager SeniorEngineer SeniorStaff Staff TechniqueLeader AssistantEngineer
    1985-01-01 6472 9 1835 1665 6467 897 0
    1986-01-01 6914 0 2017 1733 7169 1056 0
    1987-01-01 7125 0 2051 1738 7136 957 0
    1988-01-01 7023 2 2073 1825 7195 1013 0
    1989-01-01 7058 2 2050 1758 7363 1036 0
    ... ... ... ... ... ... ... ...
  3. Find the 3 most recently hired employees and what department they work in:
    emp_no birth_date first_name last_name gender hire_date
    463807 1964-06-12 Bikash Covnot M 2000-01-28
    428377 1957-05-09 Yucai Gerlach M 2000-01-23
    499553 1954-05-06 Hideyuki Delgrande F 2000-01-22
  4. Find the Running Daily Sum of number of employees hired ever:
    +------------+-----------+------------+
    | hire_date  | Employees | TotalHired |
    +------------+-----------+------------+
    | 2000-01-02 |         2 |          2 |
    | 2000-01-03 |         1 |          3 |
    | 2000-01-04 |         1 |          4 |
    | 2000-01-06 |         1 |          5 |
    | 2000-01-08 |         1 |          6 |
    | 2000-01-11 |         1 |          7 |
    | 2000-01-12 |         1 |          8 |
    | 2000-01-13 |         1 |          9 |
    | 2000-01-22 |         1 |         10 |
    | 2000-01-23 |         1 |         11 |
    | 2000-01-28 |         1 |         12 |
    +------------+-----------+------------+
    

1. Group By Year/Month/Day

Group By Year

To find the number of employees hired each year we could count the groupings by year. The easiest way to do this is to use the YEAR function:

MySql Group By Month/Week

Group By Month/Week is easily accomplished using the built-in functions: MONTH

Group By Week after selected date

Alternative methods and a discussion of their trade-offs can be found on SO

2. Find the Titles by Year - Pivot Rows to Columns

The secret to generating this table is pivoting rows to become columns. Forming a simple query to get the count of each year/title is simple:

The difficult part is the pivoting. MySQL does not support the pivot that other databases do, instead we either have to hand code a limited set of values or turn to dynamic SQL. For our limited number I hardcoded the values:

Year Engineer Manager SeniorEngineer SeniorStaff Staff TechniqueLeader AssistantEngineer
1985-01-01 6472 9 1835 1665 6467 897 0
1986-01-01 6914 0 2017 1733 7169 1056 0
1987-01-01 7125 0 2051 1738 7136 957 0
1988-01-01 7023 2 2073 1825 7195 1013 0
1989-01-01 7058 2 2050 1758 7363 1036 0
... ... ... ... ... ... ... ...

3. Find the N most recently hired employees

emp_no birth_date first_name last_name gender hire_date
463807 1964-06-12 Bikash Covnot M 2000-01-28
428377 1957-05-09 Yucai Gerlach M 2000-01-23
499553 1954-05-06 Hideyuki Delgrande F 2000-01-22

For this simple case, we can use limit together with an order by. There are however a few edge cases to be careful with, the case where there are tied entries, e.g. If there was another employee hired on 2000-01-22 or when there may be less than N/3 matching rows available.

4. Running Daily Sum of number of employees hired ever

To get a running daily sum, one of the quickest ways is to use a variable together with a subquery:

Summary - MySQL Time-Series SQL Support

MySQL provides a large range of functions for manipulating and pulling apart dates and times, however compared to other databases they are inconsistant and slow. We are missing functionality required for elegantly pivoting or taking advantage of ordered data. So for these simple SQL queries we can get the solutions we want, they may be a little slower or verbose, the real test will be when we want to write more advanced time-series analytics on larger datasets.