matt - matt
Forum Replies Created
matt (Participant)
Hm.. I’m not an expert but I don’t think there is. I don’t think the need comes up often enough in actual q code for there to be a built-in function for it. I think kdb was designed to be more minimal and a bit less general purpose than APL.
In fact, before v3.4, reshaping was limited to 2 dimensions http://code.kx.com/wiki/Reference/NumberSign. Also, you have to remember that kdb’s lists are very flexible and can hold items of different lengths and types.
The closest thing you could probably do is to use the count function. You could use the each adverb to easily get the lengths at the different dimensions.
`
q) l: ((1 2; 3); (1; 2 3); (1 2 3))
q) l
(1 2;3)
(1;2 3)
1 2 3
q) count l / # of items (1st dimension)
3
q) count each l / # of items per item (2nd dimension), notice that it may vary
2 2 3
q) (count each) each l / # of items per item per item (3rd dimension), non-lists have a count of 1
2 1
1 2
1 1 1
`
Notice how you can derive all 3 dimensions just from the last one (based on the number of rows or columns returned).
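If you do want a single APL-style shape vector, one possible sketch is a small recursive helper. The name `shape` is made up for this example and is not a built-in, and it assumes the list is rectangular, since it only looks at the first item at each depth.

```q
/ hypothetical helper, not part of q: lengths at each depth,
/ taken from the first item only (assumes a rectangular list)
shape:{$[0>type x; 0#0; (count x),.z.s first x]}

m:2 3#til 6
shape m         / 2 3
shape 1 2 3     / ,3
```

On a ragged list such as `l` above it would report 3 2 2, which hides the variation between items, so the `count each` approach is the safer one there.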
matt (Participant)
If we look at the definition of prds, we can see:
`q) prds
*\`
The `*` is the multiplication operator, and the `\` is the scan adverb, which returns the cumulative results of applying the preceding operator. See http://code.kx.com/wiki/Reference/BackSlash#scan and http://code.kx.com/wiki/Reference/scan for more info.
Now, you have to remember that q is an interpreted language and not a compiled language*, so the interpreter has to process the statements in factw’s loop again on every iteration, which is a big overhead in loops. On the other hand, factp’s looping logic (the condition checking and incrementing of the counter) is already inside the scan adverb, which is implemented natively (in compiled C?). This allows it to avoid the overhead of the interpreter despite doing essentially the same loop**.
Additionally, as you said, kdb could in principle apply some easy optimizations, such as parallel processing, especially when dealing with pure functions. However, I do not think that is what is happening here (at least as of now).
By the way, it might seem odd that kdb is known for being fast despite q being interpreted. That’s because most of the time, as was mentioned in the video, good or idiomatic q code would avoid writing explicit loops and thus avoid most of the overhead introduced by the interpreter. Additionally, when dealing with lots of data, it is usually IO that becomes the bottleneck, and kdb has been optimized for analyzing tick data or sorted data by being an in-memory column-store database.
* Technically, languages are neither compiled nor interpreted. It’s the implementation that decides. In this case, I’m talking about the official kdb implementation of course.
** Actually, scan does a little more than the loop since it also stores all of the partial results.
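To make the comparison concrete, here is a rough sketch of what the two versions might look like. These bodies for factw and factp are my own guesses for illustration, not the definitions from the video, and the timings will depend on your machine and kdb+ version.

```q
/ assumed loop version: the condition and body are evaluated
/ by the interpreter on every iteration
factw:{[n] r:1; i:1; while[i<=n; r*:i; i+:1]; r}

/ assumed prds version: the loop runs inside the native scan
factp:{[n] last prds 1+til n}

factw 10          / 3628800
factp 10          / 3628800
(*\)1 2 3 4 5     / 1 2 6 24 120, the partial results scan keeps

/ run each 100000 times and print the elapsed milliseconds
\t:100000 factw 20
\t:100000 factp 20
```

Note that this factp returns a null for 0, since `til 0` is empty, so it is only meant for positive n.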
matt (Participant)
Hey, thanks for the link. Although that file is indeed different, it does share the same columns as what was shown in the videos. I think it’d be fine to use that file for the exercises, since it is the solution (i.e. the queries) that matters, not the actual result itself.
June 11, 2016 at 2:02 pm in reply to: Cannot download files in importing data into kdb module #111285
matt (Participant)
Yeah, I managed to find them using the same URL used for the exercises:
http://www.timestored.com/kdb-training/online/dfile.php?f=nyse_20110621.csv
http://www.timestored.com/kdb-training/online/dfile.php?f=nyse_20110621_fixed.txt
The links incorrectly report a PDF MIME type though, so your browser might try to treat them as PDF files even though they really are text files. Unfortunately, the same trick didn’t work for the trades.q file needed for the exercises.
matt (Participant)
Short answer: Assuming that it is a positive number, it would overflow (and most likely end up as a negative value).
Long answer: kdb+ represents infinity as the largest positive integer (in 2’s complement) of that datatype. So for long (64 bits) that would be 2^63-1. This can easily be verified:
`
0W - 1 = 0111 1111 1111 … 1110 = 2^63-2 = 9223372036854775806
0W = 0111 1111 1111 … 1111 = 2^63-1 = 0W
0W + 1 = 1000 0000 0000 … 0000 = -2^63 = 0N
0W + 2 = 1000 0000 0000 … 0001 = -2^63+1 = -0W
0W + 3 = 1000 0000 0000 … 0010 = -2^63+2 = -9223372036854775806
`
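You can check the same values directly in a q session. This is only a sketch, and it assumes kdb+ 3.0 or later, where a bare `0W` is a long.

```q
0W-1          / 9223372036854775806
0W+1          / 0N, the bit pattern for -2^63 is the long null
0W+2          / -0W
0W+3          / -9223372036854775806
null 0W+1     / 1b
```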
matt (Participant)
Hey Jim,
Make sure you have enabled QuickEdit mode in the command prompt. This was shown in one of the first videos (if not the first) of the training. You can also check https://www.tekrevue.com/tip/boost-productivity-quickedit-mode-windows-command-prompt/ for instructions. Once it is enabled, simply select the text you want to copy with your left mouse button (left click) and then press the right mouse button (right click) to copy. To paste, just press the right mouse button (right click) with nothing selected. There is no need to select a menu item in QuickEdit mode.
matt (Participant)
Hi Jim!
I’m by no means an expert in kdb+, but I believe it’s because \P sets the number of significant digits to display, not the number of decimal places. You might want to use .Q.fmt and .Q.f instead if you want to follow a more specific format for output (note: these return strings). See http://code.kx.com/wiki/Reference/DisplayPrecision for more info.
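A quick sketch of the difference, using made-up sample values:

```q
\P 4
3.14159                / 3.142  (4 significant digits)
123.456                / 123.5  (still 4 digits in total)

/ .Q.f[decimals; number] returns a string
.Q.f[2; 123.456]       / "123.46"
/ .Q.fmt[width; decimals; number] also pads to the given width
.Q.fmt[8; 2; 123.456]  / "  123.46"

\P 7
```

The last line puts the display precision back to the default of 7.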