How the dataframe looks after merging informative data

Let’s assume I’m using a 1m timeframe with a 5m informative timeframe. Before I start: the timestamps that exchanges give are the open times of the candles. So a candle with timestamp 00:00 opens at 00:00 UTC; when it closes depends on the timeframe used. Here is my 1m candle data:

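| Timestamp | Close value |
|-----------|-------------|
| 23:59     | 4           |
| 00:00     | 5           |
| 00:01     | 4           |
| 00:02     | 6           |
| 00:03     | 7           |
| 00:04     | 6           |
| 00:05     | 3           |
| 00:06     | 4           |
| 00:07     | 3           |
| 00:08     | 5           |
| 00:09     | 2           |
| 00:10     | 4           |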
table 1

And this is my 5m informative candle data:

| Timestamp | Close value |
|-----------|-------------|
| 23:55     | 4           |
| 00:00     | 6           |
| 00:05     | 2           |
| 00:10     | 4           |
table 2

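And this is how it looks in your dataframe after you merge them. Each 5m close only appears once that 5m candle has closed: the 00:00 5m candle closes at 00:05, so its close value of 6 first shows up on the 00:04 row, the 1m candle that also closes at 00:05.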

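| Timestamp | close | close_5m |
|-----------|-------|----------|
| 00:00     | 5     | 4        |
| 00:01     | 4     | 4        |
| 00:02     | 6     | 4        |
| 00:03     | 7     | 4        |
| 00:04     | 6     | 6        |
| 00:05     | 3     | 6        |
| 00:06     | 4     | 6        |
| 00:07     | 3     | 6        |
| 00:08     | 5     | 6        |
| 00:09     | 2     | 2        |
| 00:10     | 4     | 2        |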
table 3
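
The merged table can be reproduced with plain pandas. This is only a rough sketch of the idea, not freqtrade's own merge code: the helper name merge_closed_candles and the calendar date are made up for illustration. It shifts each 5m timestamp to the open time of the 1m candle that closes at the same moment, then attaches the most recent shifted 5m row to every 1m row.

```python
import pandas as pd

# 1m candles from table 1 (timestamps are candle open times).
# The calendar date is arbitrary; only the clock times matter here.
df_1m = pd.DataFrame({
    "date": pd.date_range("2024-01-01 23:59", periods=12, freq="1min"),
    "close": [4, 5, 4, 6, 7, 6, 3, 4, 3, 5, 2, 4],
})

# 5m informative candles from table 2.
df_5m = pd.DataFrame({
    "date": pd.date_range("2024-01-01 23:55", periods=4, freq="5min"),
    "close": [4, 6, 2, 4],
})


def merge_closed_candles(base, informative, base_minutes, inf_minutes):
    """Hypothetical helper: attach to each base row the latest informative
    candle that has already closed when the base candle closes."""
    inf = informative.rename(columns={"close": f"close_{inf_minutes}m"})
    # A 5m candle opened at 00:00 closes at 00:05, the same moment as the
    # 1m candle opened at 00:04, so shift its timestamp by 5m - 1m = 4m.
    inf["date_merge"] = inf["date"] + pd.Timedelta(minutes=inf_minutes - base_minutes)
    inf = inf.drop(columns="date")
    # For every base row, pick the last informative row at or before it.
    merged = pd.merge_asof(
        base, inf, left_on="date", right_on="date_merge", direction="backward"
    )
    return merged.drop(columns="date_merge")


merged = merge_closed_candles(df_1m, df_5m, base_minutes=1, inf_minutes=5)
print(merged)
```

The output also contains the 23:59 row, because table 1 starts there; from 00:00 onward the close_5m column matches table 3.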

Now, what if at 00:10 UTC I want to know the average close value of the latest three 5m candles? The correct way is to do it inside the informative function, like this:

@informative('5m')
def populate_indicators_5m(self, dataframe: DataFrame, metadata: dict) -> DataFrame:
    # Here the dataframe holds the real 5m candles (table 2), one row per 5m candle
    dataframe['avg_close_3'] = dataframe['close'].rolling(3).mean()
    return dataframe

This correctly uses the data from table 2 above, which gives you a latest avg_close_3_5m value of 4 (the mean of 4, 6 and 2, the three 5m candles that have closed by then). But suppose you do the calculation inside one of the main populate functions, for example inside populate_indicators, like this:

def populate_indicators(self, dataframe: DataFrame, metadata: dict) -> DataFrame:
    # Here the dataframe is the merged 1m data (table 3), one row per 1m candle
    dataframe['avg_close_3_5m'] = dataframe['close_5m'].rolling(3).mean()
    return dataframe

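This uses the data from table 3, where each 5m close is repeated across several 1m rows. The rolling window then covers the last three 1m rows rather than three 5m candles, so the avg_close_3_5m value calculated at 00:10 would be 3.333333 (the mean of 6, 2 and 2), which is wrong.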
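
To check the difference yourself, here is a small pandas sketch that runs both calculations on the numbers from table 2 and table 3. It works on plain Series with the timestamps as string labels, which is a simplification for illustration and not how a freqtrade dataframe is indexed.

```python
import pandas as pd

# Real 5m candles (table 2), labelled by their open time.
close_5m = pd.Series([4, 6, 2, 4], index=["23:55", "00:00", "00:05", "00:10"])

# Rolling over three real 5m candles, as done inside the informative function.
avg_on_5m = close_5m.rolling(3).mean()

# The close_5m column after merging (table 3), labelled by 1m open time.
labels_1m = [f"00:{minute:02d}" for minute in range(11)]
merged_close_5m = pd.Series([4, 4, 4, 4, 6, 6, 6, 6, 6, 2, 2], index=labels_1m)

# Rolling over three 1m rows of the merged column, as done in populate_indicators.
avg_on_merged = merged_close_5m.rolling(3).mean()

# At the 00:10 1m row, the merged 5m data comes from the 00:05 5m candle,
# so the value to compare against is the 5m average at "00:05".
print("computed on 5m candles:   ", avg_on_5m["00:05"])
print("computed on merged column:", avg_on_merged["00:10"])
```

The window of 3 on the merged column spans three minutes of repeated values instead of three distinct 5m candles, which is why the two results differ.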

That’s why, if you want to do any calculation on informative data, it’s better to do it inside the respective informative function. Hopefully this article helps.
