So in the `Function Execution Control` section, exercise 3 asks to time the factorial functions `factw` that uses a `while` and `factp` that uses `prds`.
For reference:
`factw:{[x] f:1; while[x>1; f*:x; x-:1]; f}`
`factp:{[x] last prds 1 _ til x + 1}`
Using `\ts` to time the functions running for 1000 times, we see that `factp` using `prds` is more that 2x faster.
Why is that?
I know that `q` is a vector-based language, but what does that actually mean in this context? Does the `prds` arguments get paralellized? Whatever optimization `prds`, why can’t we apply it in the `while`?
Thanks.
If we look at the definition of prds, we can see:
`q) prds
*\`
The `*` is the multiplication operator, and the `\` is the scanning adverb, which simply gets the cumulative results when applying the preceding operator. See http://code.kx.com/wiki/Reference/BackSlash#scan and http://code.kx.com/wiki/Reference/scan for more info.
Now, you have to remember that q is an interpreted language and not a compiled language*, and so factw will have to decode the commands several times, which is a big overhead in loops. On the other hand, factp’s looping logic (the condition checking and incrementing of the counter) is already in the scan adverb which is defined natively (in compiled C?). This allows it to avoid the overhead of the interpreter despite doing essentially the same loop**.
Additionally, like you said, kdb may indeed perform some easy optimizations that allow parallel processing especially when dealing with pure functions. However, I do not think that this is the case (at least as of now).
By the way, it might seem odd that kdb is known for being fast despite q being interpreted. That’s because most of the time, as was mentioned in the video, good or idiomatic q code would avoid writing explicit loops and thus avoid most of the overhead introduced by the interpreter. Additionally, when dealing with lots of data, it is usually IO that becomes the bottleneck, and kdb has been optimized for analyzing tick data or sorted data by being an in-memory column-store database.
* Technically, languages are neither compiled nor interpreted. It’s the implementation that decides. In this case, I’m talking about the official kdb implementation of course.
** Actually, scan does a little more than the loop since it also stores all of the partial results.
hm thats cool. didnt know about `\`. so its like `inject` in ruby.
yea the extra overhead you talk about makes sense, and the `\` function is just like a for with a calculation and not a function decoding and calculator.
thanks 🙂
Good answer. Thanks Matt.