
freqai transformer continual_learning save model error #10034

Open
piaofeifengxinzi opened this issue Apr 1, 2024 · 5 comments
piaofeifengxinzi commented Apr 1, 2024

Describe your environment

  • Operating system: Ubuntu 22.04.3 LTS
  • Python Version: Python 3.10.12
  • CCXT version: ccxt==4.2.78
  • Freqtrade Version: freqtrade 2024.3-dev-7d6d3d38f

Describe the problem:

I'm using freqai backtesting with "continual_learning": true set in the configuration, but I get the error TypeError: cannot pickle '_thread.lock' object. When I set it to false, the error disappears.

When continual_learning is turned on and an old model exists, freqai loads it and continues training from it. However, once training completes, saving the model raises the error above. If "pytrainer": self is commented out, saving no longer errors, but loading then fails at model = zip["pytrainer"].

Steps to reproduce:

  1. freqtrade backtesting --freqaimodel PyTorchTransformerRegressor --strategy strategy4 --config ./user_data/config4temp1.json --timerange 20231002-20240325



Relevant code exceptions or logs


File "/home/xxxx/freqtrade/freqtrade/freqai/freqai_interface.py", line 161, in start
    dk = self.start_backtesting(dataframe, metadata, self.dk, strategy)
  File "/home/xxxx/freqtrade/freqtrade/freqai/freqai_interface.py", line 365, in start_backtesting
    self.dd.save_data(self.model, pair, dk)
  File "/home/xxxx/freqtrade/freqtrade/freqai/data_drawer.py", line 484, in save_data
    model.save(save_path / f"{dk.model_filename}_model.zip")
  File "/home/xxxx/freqtrade/freqtrade/freqai/torch/PyTorchModelTrainer.py", line 190, in save
    torch.save({
  File "/home/xxxx/freqtrade/.venv/lib/python3.10/site-packages/torch/serialization.py", line 629, in save
    _save(obj, opened_zipfile, pickle_module, pickle_protocol, _disable_byteorder_record)
  File "/home/xxxx/freqtrade/.venv/lib/python3.10/site-packages/torch/serialization.py", line 841, in _save
    pickler.dump(obj)
TypeError: cannot pickle '_thread.lock' object
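
For context, the failure itself is a general pickling limitation rather than anything specific to the strategy: torch.save pickles the whole object it is given, and a _thread.lock cannot be pickled. A minimal standalone sketch (not freqtrade code; DummyTrainer and the file name are made up) that reproduces the same TypeError:

import threading
import torch

class DummyTrainer:
    def __init__(self):
        # stand-in for any attribute that wraps a lock (e.g. a tensorboard writer)
        self.lock = threading.Lock()

torch.save({"pytrainer": DummyTrainer()}, "model.zip")
# -> TypeError: cannot pickle '_thread.lock' object
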
piaofeifengxinzi added the Triage Needed (Issues yet to verify) label on Apr 1, 2024
xmatthias added the freqAI (Issues and PR's related to freqAI) label on Apr 1, 2024

piaofeifengxinzi commented Apr 1, 2024

I also set continual_learning: true, and when I set it to false the error disappeared. Can't continual_learning be used in backtesting? Or, in backtest mode, will the previous model be loaded automatically when training a new one?

piaofeifengxinzi commented

I found the reason: the save method of PyTorchModelTrainer pickles self

torch.save({
    "model_state_dict": self.model.state_dict(),
    "optimizer_state_dict": self.optimizer.state_dict(),
    "model_meta_data": self.model_meta_data,
    "pytrainer": self
}, path)

But I don't know how to fix it.
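
One generic pattern for this kind of pickling error (shown only as a sketch of the idea, not a tested freqtrade patch; the traceback does not say which attribute actually holds the lock, tb_logger is just the most likely suspect, and the class names below are made up) is to drop the unpicklable attribute from the pickled state with __getstate__/__setstate__ on the trainer:

import threading
import torch

class LockFreePickleMixin:
    """Sketch: exclude an unpicklable attribute (here assumed to be 'tb_logger') from pickling."""
    def __getstate__(self):
        state = self.__dict__.copy()
        state.pop("tb_logger", None)  # drop the attribute suspected of holding the lock
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.tb_logger = None  # a real logger would have to be re-attached after loading

class DemoTrainer(LockFreePickleMixin):
    def __init__(self):
        self.tb_logger = threading.Lock()  # stand-in for the unpicklable logger
        self.model_meta_data = {}

torch.save({"pytrainer": DemoTrainer()}, "demo.zip")  # no longer raises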

piaofeifengxinzi changed the title from "freqai backtest save model error" to "freqai Transformer continual_learning save model error" on Apr 3, 2024
piaofeifengxinzi changed the title from "freqai Transformer continual_learning save model error" to "freqai transformer continual_learning save model error" on Apr 3, 2024

robcaulk commented Apr 4, 2024

Hello,

continual_learning for PyTorch models may only be supported in Dry/Live modes.

If you need continual_learning in backtesting as well, you can use one of the other 18 supported models, such as XGBoost/Catboost.

You may have noticed in the documentation of continual_learning that this is a highly experimental feature, so if you would like to use it with PyTorch in backtesting, you may need to wait until someone has time to try to debug it for you. I will try my best, but it is not a top priority at the moment since it is an edge case on an experimental feature.

https://www.freqtrade.io/en/stable/freqai-running/#continual-learning

cheers,

rob

piaofeifengxinzi commented

Thank you for your work. Using continual_learning in Dry mode does not work either. This problem may only exist with PyTorchTransformerRegressor; I have not tried continual_learning on the other PyTorch-based models. I think enabling continual_learning on the Transformer model may give better results, so I hope this issue can be resolved.

xmatthias removed the Triage Needed (Issues yet to verify) label on Apr 7, 2024
piaofeifengxinzi commented

I solved this problem by modifying the fit method of PyTorchTransformerRegressor.py as follows:

def fit(self, data_dictionary: Dict, dk: FreqaiDataKitchen, **kwargs) -> Any:
    """
    User sets up the training and test data to fit their desired model here
    :param data_dictionary: the dictionary holding all data for train, test,
        labels, weights
    :param dk: The datakitchen object for the current coin/model
    """

    n_features = data_dictionary["train_features"].shape[-1]
    n_labels = data_dictionary["train_labels"].shape[-1]
    model = PyTorchTransformerModel(
        input_dim=n_features,
        output_dim=n_labels,
        time_window=self.window_size,
        **self.model_kwargs
    )
    model.to(self.device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=self.learning_rate)
    criterion = torch.nn.MSELoss()
    # check if continual_learning is activated, and retrieve the model to continue training
    temp_trainer = self.get_init_model(dk.pair)
    if temp_trainer is None:
        # no previous model for this pair: build a completely fresh trainer
        trainer = PyTorchTransformerTrainer(
            model=model,
            optimizer=optimizer,
            criterion=criterion,
            device=self.device,
            data_convertor=self.data_convertor,
            window_size=self.window_size,
            tb_logger=self.tb_logger,
            **self.trainer_kwargs,
        )
    else:
        # previous model found: reuse its model, optimizer and meta data,
        # but wrap them in a newly constructed trainer
        trainer = PyTorchTransformerTrainer(
            model=temp_trainer.model,
            optimizer=temp_trainer.optimizer,
            model_meta_data=temp_trainer.model_meta_data,
            criterion=criterion,
            device=self.device,
            data_convertor=self.data_convertor,
            window_size=self.window_size,
            tb_logger=self.tb_logger,
            **self.trainer_kwargs,
        )
    trainer.fit(data_dictionary, self.splits)
    return trainer

I don't know if this modification is correct, but it makes it run.
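
For what it is worth, the visible difference from the stock implementation (which, as far as I can tell, reuses the trainer object returned by get_init_model directly) is that only the reloaded model, optimizer and model_meta_data are carried over into a freshly constructed PyTorchTransformerTrainer, so the object that save() later pickles is built entirely from the current run's tb_logger, data_convertor and trainer_kwargs. Whether that is the actual root cause of the '_thread.lock' error still needs confirmation from a maintainer.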
