
KX closes down commercial 32 bit kdb, open alternatives?

November 5th, 2015 by John Dempster

Previously on our blog we had a lively debate about a possible open-sourced kdb+. Unfortunately kx now seems to be moving in the opposite direction: in a recent announcement they are restricting “32-bit kdb+ for non-commercial use only”. The timing is particularly unfortunate as:

More column-oriented databases are becoming available.
One of those databases, Greenplum, has just announced that it is going open source.

Alternative (far less enterprise-proven) solutions are available:

Man AHL have released Arctic, an open source market data platform based on Python and MongoDB.
Kerf Database – a DB aimed at the same market as kdb, which has now partnered with Briarcliff-Hall and is making greater sales inroads.

This renewed interest in kdb alternatives hasn’t so far delivered a kdb+ killer but I fear in time it will.


kdb code highlighting in intellij

May 2nd, 2015 by admin

An IntelliJ keyword file is now available to provide syntax highlighting of kdb code in IntelliJ:

q Code Intellij Highlighting

To install it, copy this XML file to the following directory:
C:\Users\USERNAME\.IdeaIC14\config\filetypes
where USERNAME is your Windows username. Then restart IntelliJ and open a .q file.

We’ve also updated our Notepad++ qlang.xml to provide code folding and highlighting of the .Q/.z namespaces.


kdb qunit testing now open source on github

May 2nd, 2015 by admin

We’ve now posted all source code from this website on our github kdb page.

Additionally we are open sourcing qunit, our kdb testing framework.
We look forward to receiving pull requests to fix our (hopefully few) bugs.


qStudio adds Nested Server Folder Support

April 23rd, 2015 by admin

Since our last qStudio kdb+ IDE announcement we have added a lot of new features:

Bulk importing kdb server lists

There are a lot of new features that allow qStudio to support a huge number of servers efficiently:

Support importing a HUGE number of servers:
- 5000+ server connections are now supported
- To prevent massive memory use, the object tree for a server is no longer refreshed at startup, only on connection.
- Allow specifying default username/password once for all servers
- Allow nested connection folders
- Add critical color option – servers with prod in their name get highlighted in red
Sort File Tree Alphabetically
Numerous bugfixes including:
- Fix critical Mac bug that prevented launching in some instances
- Fix query cancelling


Open Sourced kdb+

January 19th, 2015 by Ryan Hamilton

In a world overrun with open source big data solutions, is kdb+ going to be left behind? I hope not…

Every few weeks someone comes to me with a big data problem, often with a half-done mongoDB/NoSQL solution that they “want a little help with”. Usually, after looking at it, I think to myself:

“I could solve this in 20 minutes with 5 lines of q code”

But how do I tell someone they should use a system which may end up costing them £1,000s per core in licensing costs? Yes, it’s great that there’s a free 32-bit trial version, but the risk that you may end up needing the 64-bit version is too great.

kdb+ vs mongoDB database popularity

Given the ever-increasing number of NoSQL solutions, and in particular the rising popularity of Hadoop, R, Python and MongoDB, it’s not hard to see that open source is taking over the world. Right now kdb+ still has the edge in that it’s faster, sleeker, sexier, but I don’t think that will save it in the long run. The power of open source is that it lets everyone contribute: witness the 100s of libraries available for R and the 1000s of JavaScript frameworks. The truly sad thing is that it’s not always the best underlying technology that wins. A thousand amateurs creating a vibrant ecosystem of plug-ins, add-ons and tutorials can beat other technologies through sheer force of numbers.

APL was a great language yet it remains relegated to history while PHP flourishes.
PostgreSQL was technically superior to MySQL yet MySQL is deployed everywhere

I believe kdb+ is the best solution to a large number of “big data” problems (small data to kdb+). When you stop and think, time-series data is everywhere. Open sourcing kdb+ would open up entirely new sectors to kdb+, and I hope it’s a step kx take before it’s too late.

What do you think? Leave your comments below.


qStudio kdb+ GUI adds Dark Theme and Chinese Language

October 20th, 2014 by admin

Based on user requests we have released a number of new features with qStudio 1.36:

Download the latest qStudio now.

Dark Code Editor Themes

These can be set under settings->preferences.

qstudio-settings-preferences

Open Results and Charts in New Window

To expand a panel into a new window click the “pop-out” icon.
pop-out

This will bring up the result in a new window:

UTF-8 Chinese Language Support

qstudio-utf8

Tags: kdb+, qstudio.

Developer Salary by Location

October 8th, 2014 by admin

kdb+ London Contract – £800 p/day
kdb+ Belfast Citigroup Contract – £400 p/day
Java Poland – £150 p/day


Command Line Kdb+ Charts

September 8th, 2014 by admin

sqlDashboards is bundled with qStudio. Part of that package is a command line utility called sqlChart that generates customized SQL charts from the command line.

Check out the video to see how you can create a chart based on data from a kdb+ database in 2 minutes:

The sqlChart page has all the documentation you need. Download the qstudio.zip to try it now.

The q Code

C:\temp\ch\qstudio>sqlchart -s kdb -P 5000 -e "([] dt:2013.01.01+til 21; cosineWave:cos a; sineWave:sin a:0.6*til 21)" -c timeseries -W 600 -t  dark
C:\temp\ch\qstudio\out.png

Help Screen

C:\temp\ch\qstudio>sqlchart
Option (* = required)             Description
---------------------             -----------
-?, --help                        Display a help message and exit.
-D, --database <db_name>          The database to use.
-H, --height <output_height>      Set the height of the chart output
                                    (default: 300)
-P, --port <port_num>             The TCP/IP port number to use for the
                                    SQL Server connection.
-W, --width <output_width>        Set the width of the chart output
                                    (default: 400)
-c, --chart <chart_type>          Set the selected chart type. Options
                                    available: timeseries, areachart,
                                    barchart, bubblechart, candlestick,
                                    datatable, heatmap, histogram,
                                    linechart, noredraw, piechart,
                                    scatterplot (default: barchart)
-e, --execute <sql_statement>     Execute the selected sql statement.
-h, --host <host_name>            SQL server host that will be queried.
                                    (default: localhost)
-o, --out <file_name>             The name of the destination image
                                    file. (default: out.png)
-p, --password <password>         Password used to connect to SQL server.
* -s, --servertype <server_type>  The type of sql server being queried.
                                    Valid values include:kdb,mysql,
                                    postgres,mssql,h2.
-t, --theme <color_theme>         Set the color theme for the chart.
                                    Options available: light,dark,pastel
                                    (default: light)
-u, --user <user_name>            Username used to connect to SQL server.
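
If you want to script chart generation, the command shown under The q Code can be assembled programmatically. The following is only a sketch in Python: the helper name build_sqlchart_args is made up for this example, it uses only flags listed in the help screen above, and it builds the argument list without running anything.

```python
from typing import List


def build_sqlchart_args(query: str, port: int, server_type: str = "kdb",
                        chart: str = "timeseries", width: int = 600,
                        theme: str = "dark", out: str = "out.png") -> List[str]:
    """Build an argument list for sqlchart (hypothetical helper, not part of qStudio)."""
    return [
        "sqlchart",
        "-s", server_type,   # required: kdb, mysql, postgres, mssql or h2
        "-P", str(port),     # TCP/IP port of the database server
        "-e", query,         # statement whose result is charted
        "-c", chart,         # chart type, e.g. timeseries or barchart
        "-W", str(width),    # output width
        "-t", theme,         # light, dark or pastel
        "-o", out,           # destination image file
    ]


args = build_sqlchart_args("([] dt:2013.01.01+til 21; sineWave:sin 0.6*til 21)", port=5000)
print(args)
```

Passing the list to subprocess.run(args, check=True) would run it, assuming sqlchart is on your PATH; passing a list rather than one string avoids having to quote the q expression for the shell.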

Tags: kdb chart, kdb+, qstudio, timeseries.

Bitwise Operators for Kdb+ Database

April 10th, 2014 by admin

Kdb does not have built-in functions for bitwise and, or and xor operations, so we are going to create a C DLL extension that provides bitwise operators.

To download the source and definitions for and/or/xor bitwise operations click here
If you’re new to extending Kdb+, see this tutorial on writing a DLL for Kdb+.

I also thought this would be a good opportunity to look at using q code to generate C code, to generate Kdb DLLs. When converting an existing library to work with Kdb or writing similar functions for different types, there’s a lot of repetitive coding. Either you can use macros or you can use one language to generate another to save a lot of typing. Let’s look at bitwise operations as an example of what I mean:

If we wanted to write a bitwise and function band that takes two lists and performs the corresponding element-wise operation, we could write the C functions like so:

Notice the similarity between the bandJJ and bandII functions. Plus, if we want to handle the case where the second argument is an atom, we need an entire other set of C functions similar to bandJ. This quickly spirals beyond an easy copy-paste job. Instead I used the following 20 lines of q code to generate the 130 lines of C code:

 / x-op, y-type letter, z-func name
DEF:" long c=0; X* a = kX(f); ";
ft2a:{ssr["\r\nK ",z,"XX(K f, K g) { ",DEF,"K s = ktn(KX, f->n);  X* r = kX(s);  X* b = kX(g);\r\n if(g->n != f->n) { return krr(\"length\"); }; for(c=0; c<f->n; c++) { r[c] = a[c]",x,"b[c]; }; \r\nreturn s; }\r\n";"X";y]};
ft2b:{ssr["\r\nK ",z,"X(K f, K g) { ",DEF,"K s = ktn(KX, f->n);  X* r = kX(s); \r\nfor(c=0; c<f->n; c++) { r[c] = a[c]",x,"g->",lower[y],"; }; \r\nreturn s; }\r\n";"X";y]};
ft1:{ssr["K  ",z,"X(K f) { X r; ",DEF," if(f->n>0) { r = a[0]; }; for(c=1; c<f->n; c++) { r = r",x,"a[c]; }; return k",lower[y],"(r); }";"X";y]};
 / x - func name, y - uppercase type string, z - args
callt1:{ r:{"\r\nif(a->t == K",y,") { return ",x,y,"(",z,"); };" }[x;;z] each enlist each y; raze r,"return krr(\"type\");"};
callt2:{ r:{"\r\nif(a->t == K",y,") { if(b->t == K",y,") { return ",x,y,y,"(",z,"); } else if(b->t == -K",y,") { return ",x,y,"(",z,"); } }" }[x;;z] each enlist each y; raze r,"return krr(\"type\");"};
f2:{enlist "\r\nK ",x,"(K a, K b) {	\r\n ",callt2[x;y;"a,b"],"}"};
f1:{enlist "\r\nK ",x,"(K a) { ",callt1[x;y;"a "],"}"};
 / x - single char operation, y - name
genCFunc:{ t:"HIJ"; 
    r: (ft1[(),x;;y,"1"] each t),f1[y,"1";t];
    r,:(ft2a[(),x;;y] each t),(ft2b[(),x;;y] each t),f2[y;t]; 
    "\r\n" sv r};
 / set file x to text y
sett:{ @[hdel;x;`]; h:hopen x; neg[h] each y; hclose h };

 / create .c,.q,.def files
nm:("band";"bor";"bxor");
sett[`:bitops.c; enlist["#include \"k.h\""],genCFunc .' flip ("&|^";nm)];
sett[`:bitops.q; {f:{"{x set `bitops 2:(x;",y,")} each `",("`" sv x),";"}; (f[nm;"2"];f[nm,\:"1";"1"])}[]];
sett[`:bitops.def; enlist["EXPORTS"],nm,nm,\:"1"];

This generates my C code, a .def file exporting the functions that I want to provide, and a .q file that loads those functions into kdb. This means that any time I add a function or want to support a new type, everything is done for me. Let’s look at our functions in action:

q)bor[1 2 3 4 5 6; 3] / bitwise or array against 3
3 3 3 7 7 7
q)band[1 2 3 4 5 6; 3] / bitwise and array against 3
1 2 3 0 1 2
q)bor1[1 2 3 4 5 6] / bitwise or array with every element of itself
7
q)band1[1 2 3 4 5 6] / bitwise and array with every element of itself
0
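
For readers who want to check the expected values without building the DLL, here is a rough reference model in Python of what the calls above compute. It is only a sketch: the helper names bitwise and fold are made up for this example, and it says nothing about how the generated C behaves on edge cases such as empty lists.

```python
from functools import reduce
from operator import and_, or_


def bitwise(op, left, right):
    """Apply op element-wise; right is a list of equal length or a single int (an atom)."""
    if isinstance(right, int):
        return [op(value, right) for value in left]
    if len(left) != len(right):
        raise ValueError("length")  # same condition the generated C reports via krr("length")
    return [op(a, b) for a, b in zip(left, right)]


def fold(op, values):
    """Combine every element of values with op, like the one-argument band1/bor1."""
    return reduce(op, values)


data = [1, 2, 3, 4, 5, 6]
print(bitwise(or_, data, 3))   # [3, 3, 3, 7, 7, 7]
print(bitwise(and_, data, 3))  # [1, 2, 3, 0, 1, 2]
print(fold(or_, data))         # 7
print(fold(and_, data))        # 0
```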

Yes the q code is messy (it could be improved) but the concept of dynamically generating your imports can be a real time saver.
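
If q string manipulation is hard to follow, the same idea can be sketched with an ordinary template in Python. This is an illustration only, not the generator used above: the template roughly mirrors the shape of the vector-vector case (ft2a), and the names TEMPLATE and render_vector_vector are made up for this example.

```python
# Template for one vector-vector C function; {t} is the kdb type letter (H, I or J).
TEMPLATE = (
    "K {name}{t}{t}(K f, K g) {{\n"
    "  long c = 0;\n"
    "  {t}* a = k{t}(f);\n"
    "  {t}* b = k{t}(g);\n"
    "  K s = ktn(K{t}, f->n);\n"
    "  {t}* r = k{t}(s);\n"
    "  if (g->n != f->n) {{ return krr(\"length\"); }}\n"
    "  for (c = 0; c < f->n; c++) {{ r[c] = a[c] {op} b[c]; }}\n"
    "  return s;\n"
    "}}\n"
)


def render_vector_vector(op, name, type_letters="HIJ"):
    """Return one C function definition (as text) per type letter."""
    return [TEMPLATE.format(name=name, t=t, op=op) for t in type_letters]


functions = render_vector_vector("&", "band")
print(functions[2])  # the text of bandJJ
```

Adding a new operator or type is then a one-line change to the call rather than another hand-written C function, which is the whole point of generating the code.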

This code is not intended for reliability or performance; use at your own risk! If, however, bitwise operations are a topic that interests you, I recommend this article: Writing a fast vectorised OR function for Kdb.


Kdb qSQL vs standard SQL queries

March 30th, 2014 by Ryan Hamilton

Often at the start of one of our training courses I’m asked why banks use the Kdb Database for their tick data.

One well-known reason is that kdb is really fast at typical financial time-series queries (due to Kdb’s column-oriented architecture).
Another reason is that qSQL is extremely expressive and well suited for time-series queries.

To demonstrate this I’d like to look at three example queries, comparing qSQL to standard SQL.

SQL Queries dependent on order

From the table below we would like to find the price change between consecutive rows.

time	price
07:00	0.9
08:30	1.5
09:59	1.9
10:00	2
12:00	9

a:([] time:07:00 08:30 09:59 10:00 12:00; 
      price:0.9 1.5 1.9 2 9.)

q Code

In kdb qSQL this would be the extremely simple and readable code:

q)update change:price-prev price from a
time  price change
------------------
07:00 0.9
08:30 1.5   0.6
09:59 1.9   0.4
10:00 2     0.1
12:00 9     7
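
For readers who don't use q, the effect of price-prev price can be sketched in Python. This is only an illustration of the semantics under the assumption that the rows are already in time order; the function name deltas_from_prev is made up for this example.

```python
from typing import List, Optional


def deltas_from_prev(prices: List[float]) -> List[Optional[float]]:
    """Return each price minus the previous row's price; the first row has no previous row."""
    if not prices:
        return []
    changes: List[Optional[float]] = [None]  # stands in for the null q shows on the first row
    for previous, current in zip(prices, prices[1:]):
        changes.append(current - previous)
    return changes


prices = [0.9, 1.5, 1.9, 2.0, 9.0]
changes = deltas_from_prev(prices)
# Round for display only, floating point differences are not exact.
print([None if c is None else round(c, 1) for c in changes])  # [None, 0.6, 0.4, 0.1, 7.0]
```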

Standard SQL

In standard SQL there are a few methods we can use. The simplest is if we already have a sequential id column present:

select a.time, a.price - (select b.price from tab b where b.id = a.id - 1)
  as diff from tab a

Even for this simple query our code is much longer and not as clear to read. If we hadn’t had the id column we would have needed much more code to create a temporary table with row numbers. As our queries get more complex the situation gets worse.

Select top N by category

Given a table of stock trade prices at various times today, find the top two trade prices for each ticker.

trade table
time	sym	price
09:00	a	80
09:03	b	10
09:05	c	30
09:10	a	85
09:20	a	75
09:30	b	13
09:40	b	14

trade:([] time:09:00+0 3 5 10 20 30 40; 
    sym:`a`b`c`a`a`b`b; 
    price:80 10 30 85 75 13 14);

qSQL Code

select 2 sublist desc price by sym from trade

Anyone who has used kdb for a few days could write this query, and it almost reads like English: select the prices in descending order by sym and from those lists take the first 2 items (sublist).
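
To make the semantics concrete for readers who don't use q, here is a rough equivalent in plain Python. It is a sketch only, and the function name top_n_by_group is made up for this example: group the prices by sym, sort each group in descending order, and keep the first two.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def top_n_by_group(rows: List[Tuple[str, int]], n: int) -> Dict[str, List[int]]:
    """Return the n largest prices for each sym, with syms in sorted order."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for sym, price in rows:
        grouped[sym].append(price)
    return {sym: sorted(prices, reverse=True)[:n]
            for sym, prices in sorted(grouped.items())}


trade = [("a", 80), ("b", 10), ("c", 30), ("a", 85), ("a", 75), ("b", 13), ("b", 14)]
top2 = top_n_by_group(trade, 2)
print(top2)  # {'a': [85, 80], 'b': [14, 13], 'c': [30]}
```

Like sublist, the slice simply returns fewer items when a group has fewer than n rows, as with sym c here.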

SQL Code

In standard SQL a query that depends on order is much more difficult; witness the numerous online posts from people having problems: stackoverflow top 10 by category, mysql first n rows by group, MS-SQL top N by group. The nicest solution, if your database supports it, is:

SELECT sym, price FROM (
         SELECT 
             ROW_NUMBER() OVER ( PARTITION BY sym ORDER BY price DESC ) AS RowNumber, 
             sym, price FROM trade
      ) dt WHERE RowNumber <= 2

The SQL version is much harder to read and will require someone with more experience to be able to write it.

Joining Records on Nearest Time

Lastly we want to consider performing a time-based join. A common finance query is to find the prevailing quote for a given set of trades, i.e. given the trade table t and quote table q shown below, we want to find the prevailing quote before or at the exact time of each trade.

trades t
time	sym	price	size
07:00	a	0.9	100
08:30	a	1.5	700
09:59	a	1.9	200
10:00	a	2	400
12:00	b	9	500
16:00	a	10	800

quotes q
time	sym	bid
08:00	a	1
09:00	b	9
10:00	a	2
11:00	b	8
12:00	b	8.5
13:00	a	3
14:00	b	7
15:00	a	4

t:([] time:07:00 08:30 09:59 10:00 12:00 16:00; 
      sym:`a`a`a`a`b`a; 
      price:0.9 1.5 1.9 2 9. 10.; 
      size:100*1 7 2 4 5 8);

q:([] time:08:00+60*til 8; 
      sym:`a`b`a`b`b`a`b`a;
      bid:1 9 2 8 8.5 3 7 4.);

In qSQL this is: aj[`sym`time; t; q], which means perform an asof-join on t, looking up for each trade the most recent row in table q with the same sym and a time at or before the trade time.
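
To spell out what "prevailing quote" means, here is a small Python sketch of the lookup using a binary search per sym. It illustrates the semantics only, not how kdb implements aj; the function name asof_join is made up for this example, and times are kept as "HH:MM" strings, which compare correctly as text.

```python
from bisect import bisect_right
from collections import defaultdict


def asof_join(trades, quotes):
    """Attach to each trade the bid of the last quote for the same sym at or before its time."""
    times = defaultdict(list)
    bids = defaultdict(list)
    for time, sym, bid in sorted(quotes):  # sort by time so each sym's list can be bisected
        times[sym].append(time)
        bids[sym].append(bid)
    joined = []
    for time, sym, price, size in trades:
        pos = bisect_right(times[sym], time)       # number of quotes with quote time <= trade time
        bid = bids[sym][pos - 1] if pos else None  # None when no quote exists yet
        joined.append((time, sym, price, size, bid))
    return joined


t = [("07:00", "a", 0.9, 100), ("08:30", "a", 1.5, 700), ("09:59", "a", 1.9, 200),
     ("10:00", "a", 2.0, 400), ("12:00", "b", 9.0, 500), ("16:00", "a", 10.0, 800)]
q = [("08:00", "a", 1.0), ("09:00", "b", 9.0), ("10:00", "a", 2.0), ("11:00", "b", 8.0),
     ("12:00", "b", 8.5), ("13:00", "a", 3.0), ("14:00", "b", 7.0), ("15:00", "a", 4.0)]

result = asof_join(t, q)
print([row[-1] for row in result])  # [None, 1.0, 1.0, 2.0, 8.5, 4.0]
```

The 07:00 trade has no bid because no quote for sym a exists that early, and the 10:00 trade picks up the quote stamped at exactly 10:00.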

In standard SQL, again you’ll have difficulty: sql nearest date, sql closest date; even just the closest lesser date isn’t elegant. One solution would be:

WITH cte AS
(SELECT t.sym, t.time, q.bid,
	ROW_NUMBER() OVER (PARTITION BY t.sym, t.time 
           ORDER BY q.time DESC) AS rowNum
	FROM t LEFT JOIN q ON t.sym = q.sym AND q.time <= t.time)
SELECT	sym,time,bid FROM cte WHERE rowNum = 1

It’s worth pointing out this is one of the queries that is typically extremely slow (minutes) on row-oriented databases compared to column-oriented databases (at most a few seconds).

qSQL vs SQL

Looking at the simplicity of the qSQL code compared to the standard SQL code, we can see how basing our database on ordered lists rather than set theory is much better suited to time-series data analysis. By being built from the ground up for ordered data and by providing special time-series joins, kdb lets us form these example queries using very simple expressions. Once we need to create more complex queries and nested selects, attempting to use standard SQL can quickly spiral into a verbose, unmaintainable mess.

I won’t say qSQL can’t be cryptic 🙂 but for time-series queries qSQL will mostly be shorter and simpler than trying to use standard SQL.

If you think you have shorter SQL code that solves one of these examples or you are interested in one of our kdb training courses please get in touch.

Tags: kdb+, qslq, query, sql, timeseries database.

TimeStored Blog

Archive for the 'kdb+' Category
