QStudio
Query kdb+ Servers and Chart Results.
This article is part of a wider tutorial, that outlines the major components that could be used to make a bitcoin trading system based upon kdb+ tick. It's intended as an exercise for those that have completed the TimeStored online kdb+ training course, if you have completed one of our courses contact us to download the source code.
Historical data within kdb+ is stored in a process called the HDB. However before we can load up that process we must first import our data from our raw data files then agree on and implement a database structure optimised for our system and the queries we would like to consider.
Recommended Tutorials: Casting and Parsing, Importing Data into kdb+
Bitcoin historical data is available as CSV files here. We want to create a function that will:
Starting from a CSV file such as mtgoxUSD.csv that has three columns (timestamp, price, size):
Once you have
Our bitcoin data is tiny (1.5GB in kdb+ format) on a good system we could easily store it fully in RAM. However for the purposes of learning and to allow efficient access from a slow laptop with a single drive and limited RAM, we want to consider partitioning our data. Partioning is logically dividing a database into distinct independent parts to increase manageability and performance. We cover partitioning in our Partitioned Database Tables tutorial, you should remember the diagram:
Partitioning by date as is common with typical stock data is too fine grained for our data, it would result in thousands of files less than 1kb in size.
I suggest partitioning on currency, then sorting the data by time. This will allow fast access for queries such as:
What was the USD Bitcoin rate between 17th and 23rd of June
The functions you will want to create are:
Function | Description |
---|---|
getCsvFileList | return table of csv zip file names at bitcoincharts (http://api.bitcoincharts.com/v1/csv/)
Columns are: file, d-date, t-time, size, link |
downToFolder:[urls; folder] | Delete folder and then fill with files from urls. |
unzipAndGet[zdir; csvZipFile] | Unzips file to csv directory, parses csv files and returns table
return table with columns ts,price,size,cur,ex,tp where each row is a trade |
These then give a main loop of events:
Query kdb+ Servers and Chart Results.