Memory Management in Kdb

Buddy Memory System - reference counting
Version Differences
Finding the memory used
Garbage Collection
Reference counting in detail
Memory Mapped files
Compression and Memory

Buddy Memory System - reference counting

Kdb uses a variant of the buddy memory allocation system, with reference counting to track live objects.

Objects are allocated memory in blocks whose sizes are powers of 2.
Memory for objects < 32MB comes from an internal heap which can only ever grow. When such an object is no longer referenced, its memory is given back to the heap and can be reused for further allocations < 32MB.
Memory for objects > 32MB is given back to the OS when the object is no longer referenced.
Symbols are stored in an interned pool.
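
As a rough worked example of the power-of-2 rounding, the sketch below computes the block size that a vector's payload would round up to. The helper name nextPow2 is hypothetical, the 4-byte element size is an assumption matching the int vectors that til appears to produce in the sessions on this page, and the small per-object header is ignored.

```q
/ hypothetical helper: smallest power of 2 that is >= n
nextPow2:{[n] p:1; while[p<n; p*:2]; p}

/ payload in bytes of til 40100200, assuming 4-byte ints
nextPow2 4*40100200    / 268435456
/ payload in bytes of til 8100200, assuming 4-byte ints
nextPow2 4*8100200     / 33554432
```

These figures are in line with the changes in used reported by .Q.w[] in the sessions later on this page.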

Version Differences

Version	Behaviour
2.4	Memory never returned
2.5/2.6	Unreferenced memory blocks over 32MB/64MB are returned immediately
2.7	Unreferenced memory blocks returned when memory full or .Q.gc[] called

Finding the memory used

The following commands can be used to get memory usage:

q).Q.w[]
used| 180560
heap| 201326592
peak| 201326592
wmax| 0
mmap| 12010254138
mphy| 8565985280
syms| 1181
symw| 59133
q)\w
180416 201326592 201326592 0 12010254138 8565985280j
q)\w 0
1181 59133j
q)

All values are in bytes

used - subset of heap in actual use.
heap - physical memory allocated to this process.
peak - largest heap size that q process has yet had.
wmax - the memory limit as set using the -w command line argument.
mmap - memory used for memory mapping files on disk.
mphy - physical memory available on the machine.
syms - number of distinct syms in this q process.
symw - memory footprint of interned string pool.
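
Since the raw byte counts are hard to read at a glance, a minimal sketch of converting the heap-related fields to megabytes is given here. It assumes only the field names listed above.

```q
/ keep just the heap-related fields, then convert bytes to megabytes
/ 1024*1024 is evaluated first because q evaluates right to left
(`used`heap`peak#.Q.w[])%1024*1024
```

syms is left out deliberately, as it is a count rather than a size in bytes.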

In older versions of q, .Q.w[] was not present. The older, less user-friendly way of obtaining the above statistics is:

\w - used heap peak wmax mmap mphy
\w 0 - syms symw
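
The two commands can be stitched back into the same labelled form that .Q.w[] gives. This is only a sketch: it assumes a q version where \w reports six values, as in the session above, and \w 0 reports two. With any other count the dictionary construction would signal a 'length error.

```q
/ label the values of \w and \w 0 with the field names used by .Q.w[]
`used`heap`peak`wmax`mmap`mphy`syms`symw!(system"w"),system"w 0"
```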

Garbage Collection

.Q.gc[]

(since 2.7) invokes the garbage collector. Returns the amount of memory that was returned to the OS.

Command line -g parameter

Switch garbage collection between immediate (1) and deferred (0) modes.

// in deferred mode it takes a call to .Q.gc[] to actually return memory
q)\g 0
q)a:til 40100200
q)withA:.Q.w[]
q)delete a from `.
`.
q).Q.w[]-withA
used| -268435424
heap| 0
peak| 0
wmax| 0
mmap| 0
mphy| 0
syms| 0
symw| 0
q).Q.gc[]
268435456j
q).Q.w[]-withA
used| -268435408
heap| -268435456
peak| 0
wmax| 0
mmap| 0
mphy| 0
syms| 0
symw| 0

// in immediate mode memory is returned straight away
q)\g
1
q)a:til 40100200
q)withA:.Q.w[]
q)delete a from `.
`.
q).Q.w[]-withA
used| -268435424
heap| -268435456
peak| 0
wmax| 0
mmap| 0
mphy| 0
syms| 0
symw| 0
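
In deferred mode it can be convenient to collect garbage straight after a memory-hungry step. A minimal sketch follows; runThenGc is a hypothetical helper name, not part of q, and it assumes a version with .Q.gc[] (2.7 or later, per the table above).

```q
/ hypothetical helper: run a niladic function, then collect garbage
/ returns the function's result together with the bytes .Q.gc[] handed back to the OS
runThenGc:{[f] r:f[]; freed:.Q.gc[]; `result`freed!(r;freed)}

/ the large temporary vector is unreferenced once sum has finished
runThenGc[{sum til 40100200}]
```

In immediate mode the freed figure would be expected to be small or zero, since the memory has already been returned by the time .Q.gc[] runs.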

Reference counting in detail

The C API documentation covers reference counting as encountered when extending kdb using C.

From within kdb we can use -16!, which returns the number of references to an object:

q)a:til 200
q)-16!a
1
q)b:a
q)-16!a
2
q)-16!b
2
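
Continuing the same idea, the sketch below shows what happens when one of the two names is amended. It assumes the copy-on-write behaviour described on this page: the amended name gets its own vector, so each vector would be expected to report a single reference afterwards.

```q
a:til 200
b:a            / b shares a's vector, so no copy is made yet
b[0]:99        / amending b forces a private copy of the vector
a 0            / a is untouched and still starts with 0
(-16!a;-16!b)  / reference counts of the two, now separate, vectors
```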

Vectors are copied by reference when possible, but editing just one value causes another entire vector to be allocated. Note that columns in a table are just vectors, as shown below:

q)pre:.Q.w[]`used
q)t1:([] c:til 8100200)
q)show (.Q.w[]-pre)`used
33554688j
q)t2:update c1:c from t1
q)show (.Q.w[]-pre)`used
33554816j
q) // very little memory usage increase
q)-16!t1 `c
2
q)delete t2 from `.
`.
q)-16!t1 `c // delete reference and ref count goes back to 1
1
// note the column was shared by reference, so memory usage barely increased

q)t2:update c1:c+1 from t1 where i=4 // change one value
q)show (.Q.w[]-pre)`used
67109536j
q)-16! t1 `c
2
q)-16! t2 `c1
1
// massive memory use as entire new vector created

Memory Mapped files

There are two modes of memory mapping - immediate and deferred:

deferred mode - column is memory mapped on demand as needed for the duration of the query.
immediate mode - the column's memory map is maintained; whether or not the memory is actually used is down to OS details.

q)`:t/ set ([] a:til 8000000)
`:t/
q)\l .
q).Q.w[]
used| 136832
heap| 67108864
peak| 67108864
wmax| 0
mmap| 0
mphy| 8565985280
syms| 557
symw| 24571
// created splayed table and mmap is zero as no files currently memory mapped

// during a select the files are mapped in
q)select {show .Q.w[];x} a from t
used| 137872
heap| 67108864
peak| 67108864
wmax| 0
mmap| 32000016
mphy| 8565985280
syms| 560
symw| 24659

q)// if we assign the value of a select to a variable.
q)// the memory mapping is immediate
q)b:select a from t
q).Q.w[]
used| 137456
heap| 67108864
peak| 67108864
wmax| 0
mmap| 32000016
mphy| 8565985280
syms| 561
symw| 24686
q)delete b from `.
`.
q).Q.w[]
used| 137328
heap| 67108864
peak| 67108864
wmax| 0
mmap| 0
mphy| 8565985280
syms| 561
symw| 24686

Compression and Memory

Data from compressed columns is decompressed and held in memory for the duration of a query; this can significantly increase memory requirements.
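
To get a feel for how much memory a compressed column may need once read, the on-disk and decompressed sizes can be compared. The sketch below is illustrative only: it assumes a q version that supports compressed writes through set and the -21! statistics call, `:tz/ is a hypothetical directory, and the parameters 17 2 6 (logical block size, gzip algorithm, compression level) are arbitrary choices.

```q
/ write a splayed table with compression parameters (assumed: block size 17, gzip, level 6)
(`:tz/;17;2;6) set ([] a:til 8000000)
/ dictionary of compression statistics for the column file
stats:-21!`:tz/a
/ bytes needed to hold the column once decompressed, versus bytes on disk
stats`uncompressedLength`compressedLength
```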
