Data Replay and Backtest in TorQ

31 Mar 2017

Data Intellect

TorQ has a new utility for replaying historical data into real-time data processes, datareplay.q.  Replaying data is usually the first step towards allowing you to backtest. 

datareplay.q builds a table of upd function calls like those generated by a tickerplant, but uses a historical database as the data source. This makes it simple to test new or existing real-time subscribers against existing data, or to run backtests on historical data.

It can bucket the data by a set time interval or produce one message per timestamp, and it can also generate calls to a function of your choice at a set time interval.

In this post, we will go over how to use the utility and show an example of using it for backtesting.

How do I use it?

The utility lives in the .datareplay namespace included with TorQ and is used mainly through the function tablesToDataStream, which takes a dictionary of parameters with the following keys.

Key        Example Value              Description                                                         Required  Default
-----------------------------------------------------------------------------------------------------------------------------
tabs       `trade`quote or `trade     List of tables to include                                           Yes       N/A
sts        2014.04.04D07:00:00.000    Start timestamp for data                                            Yes       N/A
ets        2014.04.04D16:30:00.000    End timestamp for data                                              Yes       N/A
syms       `AAPL`IBM                  List of symbols to include                                          No        All syms
where      enlist (=;`src;enlist`L)   Custom where clause in functional form                              No        None
timer      1b                         Generate timer function flag                                        No        0b
h          5i                         Handle to hdb process                                               No        0i (self)
interval   0D00:00:01.00              Time interval used to chunk data; bucketed by timestamp if not set  No        None
tc         `data_time                 Name of time column to cut on                                       No        `time
timerfunc  .z.ts                      Timer function to use if `timer parameter is set                    No        .z.ts

Using these parameters, you can generate tickerplant messages within a set time range, from selected tables, from the current process or a remote hdb, bucketed by an interval of your choice or not at all. You can also generate function calls to be made at the end of every interval.

The utility also has support for custom where clauses that allow flexibility when generating data.  More information on the utility is available on the TorQ utility documentation page.
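As a rough illustration of the parameters above, the following sketch builds a parameter dictionary and passes it to tablesToDataStream. It assumes TorQ is loaded (so that .datareplay exists), that the function takes a single dictionary keyed as in the table, and that the current process has an hdb loaded whose trade table has a src column. The names params and msgs are illustrative only, and the values are taken from the example column of the table.

```q
// Sketch only: assumes TorQ is loaded so that .datareplay exists, and that
// this process has an hdb loaded containing a trade table with a src column.
params:`tabs`sts`ets`syms`where`interval!(
  `trade;
  2014.04.04D07:00:00.000;
  2014.04.04D16:30:00.000;
  `AAPL`IBM;
  enlist(=;`src;enlist`L);
  0D00:00:01.00);

// Build the table of upd calls described above
msgs:.datareplay.tablesToDataStream params;

// Assumption: if the result exposes the upd calls in a column named msg,
// replaying could be as simple as evaluating each call in order:
//   value each msgs`msg
```

Optional keys that are left out (here timer, h, tc and timerfunc) should fall back to the defaults listed in the table.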

Backtest Example

One use of this utility is backtesting. Because the historical data is presented to the process as if it were real-time data, you can test and tune code against historical data as if it were running in real time. To demonstrate how the utility can be used for backtesting, we have added the process vwapsub.q to the TorQ Finance Starter Pack (the full code for this can be viewed here).

The most important functions are upd and calcvwap. These represent business logic that requires backtesting using the datareplay utility. Currently, the upd function is set up to cache the values required to calculate the vwap by sym from the trade table.

// upd function gets sum of price*size and sum of size by sym
// and adds it to the running total inside the vwap table 
// This can be used to calculate current vwap quickly. 
upd:{[t;d]
  if[t~`trade;
      `vwap set (`.[`vwap]) + select spts:sum price*size,ssize:sum size by sym from d;
     ];
 };

calcvwap is set up to retrieve the vwap by sym at the current time.

// Calculates vwap by sym at the current time
calcvwap:{
    select vwap:spts%ssize by sym from `.[`vwap]
};
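To see why caching only the two running sums is enough, here is a small self-contained check on made-up data. It accumulates sum price*size and sum size batch by batch, then compares the ratio with a direct size-weighted average over all rows. The table and variable names are illustrative and are not part of vwapsub.q.

```q
// Made-up trades for a single sym, arriving in two batches
b1:([]price:100 101f;size:10 30);
b2:([]price:102 103f;size:50 10);

// Running totals accumulated batch by batch, in the spirit of upd
spts:sum{sum x[`price]*x`size}each(b1;b2);
ssize:sum{sum x`size}each(b1;b2);
cached:spts%ssize;

// Direct size-weighted average price over all rows at once
both:b1,b2;
direct:both[`size]wavg both`price;
```

Both cached and direct come out at 101.6 (10160 divided by 100), so the vwap can be read off the cached totals at any time without revisiting the underlying trades.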

When .vwapsub.realtime is set to 1b, the process will subscribe to the tickerplant for the trade table, calculate vwap by sym based on the time interval set at .vwapsub.replayinterval and log the result to the vwaptimes table.

Otherwise, when .vwapsub.realtime is set to 0b, the process will retrieve trade table data from the hdb within the time period defined by .vwapsub.replaysts and .vwapsub.replayets, calculate vwap by sym based on the time interval set at .vwapsub.replayinterval, and log the result to vwaptimes.

In both of these modes, data is being sent to the upd function, and the calcvwap function is being called to do the vwap calculation, but the real-time mode is using real-time data while the back-testing mode is using historical data.
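The pattern common to both modes can be sketched without TorQ at all. The toy example below is not the vwapsub.q or datareplay.q implementation: it uses made-up in-memory trades in place of an hdb, a simplified running-totals table called totals, and hypothetical names (msgs, replay, interval). It groups rows into time buckets, wraps each bucket in a upd call, evaluates the calls in order, and takes a vwap snapshot after each one, which stands in for the timer function. Stamping each message with the start of its bucket is a choice made here for simplicity.

```q
// Toy trades standing in for an hdb query result (values are made up)
trade:([]time:2015.01.07D01:00:00+0D00:01*til 6;sym:6#`AAPL`IBM;price:100 150 101 151 102 152f;size:10 20 30 40 50 60);

// Running totals per sym; a simplified stand-in for the vwap table above
totals:([sym:`symbol$()]spts:`float$();ssize:`long$());
upd:{[t;d]if[t~`trade;`totals set totals+select spts:sum price*size,ssize:sum size by sym from d];};
calcvwap:{select vwap:spts%ssize by sym from totals};

// Group rows into 2-minute buckets and build one upd message per bucket
interval:0D00:02;
g:trade group interval xbar trade`time;
msgs:([]time:key g;msg:{(`upd;`trade;x)}each value g);

// Replay: evaluate each message as if a tickerplant had sent it, then
// snapshot the vwap, playing the role of the timer function
vwaptimes:([]time:`timestamp$();sym:`symbol$();vwap:`float$());
replay:{[m]ts:m`time;value m`msg;`vwaptimes upsert `time xcols update time:ts from 0!calcvwap[];};
replay each msgs;
```

Because upd and calcvwap never see where the data came from, the same two functions can be driven by a live subscription or by a replay like this one.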

Try For Yourself

To start the process in real-time mode, while the TorQ Finance Starter Pack is set up and running (instructions can be found here), enter the following command from the TorQ home directory:

q torq.q -load code/processes/vwapsub.q -procname vwapsub1 -proctype vwapsub -debug -.vwapsub.realtime 1

To start the process in backtesting mode, enter this command:

q torq.q -load code/processes/vwapsub.q -procname vwapsub1 -proctype vwapsub -debug -.vwapsub.realtime 0

In both of these modes the vwaptimes and vwap tables can be accessed as shown below:

q)vwap
sym | spts     ssize
----| --------------
AAPL| 13172.82 181
AIG | 757.51   95
DELL| 2586.3   30
DOW | 6954.34  342
GOOG| 28887.94 245
HPQ | 15139.89 595
IBM | 29537.49 290
INTC| 7930.24  284
MSFT| 6399.26  436
AMD | 10385.49 233

q)vwaptimes
time                          vwap                                           ..
-----------------------------------------------------------------------------..
2015.01.07D01:00:00.000000000 (`s#+(,`sym)!,`symbol$())!+(,`vwap)!,()        ..
2015.01.07D01:10:00.000000000 (`s#+(,`sym)!,`s#`AAPL`AIG`AMD`DELL`DOW`GOOG`HP..
2015.01.07D01:20:00.000000000 (`s#+(,`sym)!,`s#`AAPL`AIG`AMD`DELL`DOW`GOOG`HP..
2015.01.07D01:30:00.000000000 (`s#+(,`sym)!,`s#`AAPL`AIG`AMD`DELL`DOW`GOOG`HP..
2015.01.07D01:40:00.000000000 (`s#+(,`sym)!,`s#`AAPL`AIG`AMD`DELL`DOW`GOOG`HP..

If you would like more information on TorQ and data capture, please email us directly at info@aquaq.co.uk
