Archive for May, 2013

q kdb math library 0.4 released

Andrey Zholos announced a new version of q Math Library:

qml 0.4 is out:
http://althenia.net/qml

Now compatible with kdb+ 3.0 and whatever implementation of LAPACK your
distro provides.

Thanks to everyone who prompted me to continue working on this, and
especially to Jim Schmitz for his invaluable ideas, Kim Kuen Tang
for providing an interim version, and Kx for making a binary-
incompatible change to the interface

If there’s a permissively-licensed numerical library you’d like to see
included in the next version, let Andrey know. Or just submit a patch.

Reconcile tickerplant feeds using longest common subsequence matching

Sanket Agrawal just posted some very cool code to the k4 mailing list for finding the longest common sub-sequences. This actually solves a problem that some colleagues had where they wanted to combine two tickerplant feeds that never seemed to match up. Here’s an example:


q)/ assuming t is our perfect knowledge table

q)t
time sym size price
------------------------
09:00 A 8 7.343062
09:01 A 0 5.482385
09:02 B 5 8.847715
09:03 A 1 6.78881
09:04 B 5 3.432312
09:05 A 0 0.2801381
09:06 A 2 3.775222
09:07 B 3 1.676582
09:08 B 7 7.163578
09:09 B 4 3.300548

Let us now create two tables, u and v, neither of which contain all the data.

q)u:t except t 7 8
q)v:t except t 1 2 3 4
q)u
time sym size price
------------------------
09:00 A 8 7.343062
09:01 A 0 5.482385
09:02 B 5 8.847715
09:03 A 1 6.78881
09:04 B 5 3.432312
09:05 A 0 0.2801381
09:06 A 2 3.775222
09:09 B 4 3.300548
q)v
time sym size price
------------------------
09:00 A 8 7.343062
09:05 A 0 0.2801381
09:06 A 2 3.775222
09:07 B 3 1.676582
09:08 B 7 7.163578
09:09 B 4 3.300548

We can find the indices that differ using Sankets difftables function

q)p:diffTables[u;v;t `sym;`price`size]
q)show p 0; / rows in u that are not in v
1 2 3 4
q)show p 1; / rows in v that are not in u
3 4

/ combine together again

q)`time xasc (update src:`u from u),update src:`v from v p 1
time sym size price src
----------------------------
09:00 A 8 7.343062 u
09:01 A 0 5.482385 u
09:02 B 5 8.847715 u
09:03 A 1 6.78881 u
09:04 B 5 3.432312 u
09:05 A 0 0.2801381 u
09:06 A 2 3.775222 u
09:07 B 3 1.676582 v
09:08 B 7 7.163578 v
09:09 B 4 3.300548 u
q)t~`time xasc u,v p 1
1b

The code can be downloaded at:
http://code.kx.com/wsvn/code/contrib/sagrawal/lcs/miller.q

http://code.kx.com/wsvn/code/contrib/sagrawal/lcs/myers.q

Thanks Sanket.

Algorithm details:

Myers O(ND): http://www.xmailserver.org/diff2.pdf

Miller O(NP):  http://www.bookoff.co.jp/files/ir_pr/6c/np_diff.pdf