FinRL_DeepSeek

Basic Information

FinRL-DeepSeek implements LLM-infused, risk-sensitive reinforcement learning methods for automated trading agents. The repository packages code and assets for adding language model signals to financial reinforcement learning workflows and for training and backtesting trading policies. It provides data preprocessing pipelines that combine news sentiment and risk assessments with market data, example environment implementations compatible with FinRL-style stock trading gyms, and multiple training entry points for PPO, CPPO and their LLM-enhanced variants. The project is presented as research code accompanying a paper and has been integrated into the upstream FinRL project. The repository also includes a backtesting notebook, an installation script, references to prepared Hugging Face datasets, and guidance for running distributed training on a recommended Ubuntu server setup.
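
For illustration, a minimal sketch of the kind of preprocessing step such a pipeline performs: merging daily LLM-derived sentiment and risk scores into an OHLCV price table keyed by date and ticker. The file names and column names (llm_sentiment, llm_risk) are hypothetical, not the repository's actual schema.

```python
import pandas as pd

# Hypothetical inputs: a FinRL-style price table and per-article LLM scores.
prices = pd.read_csv("prices.csv")             # columns: date, tic, open, high, low, close, volume
news_scores = pd.read_csv("news_scores.csv")   # columns: date, tic, llm_sentiment, llm_risk

# Aggregate article-level scores to one value per (date, ticker).
daily_scores = (
    news_scores.groupby(["date", "tic"], as_index=False)[["llm_sentiment", "llm_risk"]]
    .mean()
)

# Left-join onto prices; days without news fall back to a neutral default.
merged = prices.merge(daily_scores, on=["date", "tic"], how="left")
merged[["llm_sentiment", "llm_risk"]] = merged[["llm_sentiment", "llm_risk"]].fillna(0.0)

merged.to_csv("train_trade_data_with_llm_signals.csv", index=False)
```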

App Details

Features
The repository contains scripts to generate LLM sentiment signals and risk labels (sentiment_deepseek_deepinfra.py and risk_deepseek_deepinfra.py) and data preparation scripts (train_trade_data_deepseek_sentiment.py and train_trade_data_deepseek_risk.py). Training entry points include train_ppo.py, train_cppo.py, train_ppo_llm.py and train_cppo_llm_risk.py, with example mpirun usage for parallel training. Multiple environment implementations are provided: env_stocktrading.py, env_stocktrading_llm.py and env_stocktrading_llm_risk.py. There is an installation_script.sh for dependencies, a Colab backtesting notebook, prepared Hugging Face datasets for prices and news, and evaluation metrics such as the Information Ratio, CVaR and the Rachev Ratio. Logging guidance highlights monitoring AverageEpRet, KL and ClipFrac.
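
The metrics named above follow standard definitions; below is a hedged sketch of how they might be computed from a daily return series. The function names and 5% tail levels are illustrative and not the repository's exact implementation.

```python
import numpy as np

def information_ratio(returns, benchmark_returns, periods_per_year=252):
    """Annualized mean active return divided by tracking error."""
    active = np.asarray(returns) - np.asarray(benchmark_returns)
    return np.sqrt(periods_per_year) * active.mean() / active.std(ddof=1)

def cvar(returns, alpha=0.05):
    """Conditional Value at Risk: mean return over the worst alpha tail."""
    r = np.sort(np.asarray(returns))
    cutoff = max(1, int(np.floor(alpha * len(r))))
    return r[:cutoff].mean()

def rachev_ratio(returns, alpha=0.05, beta=0.05):
    """Expected gain in the best beta tail over expected loss in the worst alpha tail."""
    r = np.sort(np.asarray(returns))
    n = len(r)
    lower = max(1, int(np.floor(alpha * n)))
    upper = max(1, int(np.floor(beta * n)))
    expected_tail_gain = r[-upper:].mean()
    expected_tail_loss = -r[:lower].mean()
    return expected_tail_gain / expected_tail_loss

# Example on synthetic daily returns, for illustration only.
rng = np.random.default_rng(0)
strategy = rng.normal(0.0005, 0.01, 252)
benchmark = rng.normal(0.0003, 0.01, 252)
print(information_ratio(strategy, benchmark), cvar(strategy), rachev_ratio(strategy))
```
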
Use Cases
This repository helps researchers and practitioners build and evaluate trading agents that merge LLM-derived signals with reinforcement learning and risk-aware objectives. It supplies end-to-end components from LLM signal extraction and dataset assembly to environment definitions, distributed training commands and backtesting examples, enabling reproducible experiments and comparison across PPO, CPPO and their LLM-augmented variants. The included metrics and log pointers support quantitative evaluation of risk-adjusted performance. Practical notes include an installation script and a recommended server profile for training. The README records a preliminary regime-level finding that PPO performed better in a bull market while CPPO-DeepSeek was preferred in a bear market, which can guide experiment design.
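
To reproduce that kind of bull/bear comparison, one could split the backtest period into regimes and score each agent per regime. A minimal sketch, assuming a per-day backtest output file; the column names and date ranges are hypothetical and not taken from the repository.

```python
import pandas as pd

# Hypothetical backtest output: one row per day with each agent's daily return.
results = pd.read_csv("backtest_returns.csv", parse_dates=["date"])  # columns: date, ppo, cppo_deepseek

# Assumed regime boundaries, for illustration only.
regimes = {
    "bull": ("2021-01-01", "2021-12-31"),
    "bear": ("2022-01-01", "2022-12-31"),
}

for name, (start, end) in regimes.items():
    window = results[(results["date"] >= start) & (results["date"] <= end)]
    for agent in ["ppo", "cppo_deepseek"]:
        r = window[agent]
        cumulative = (1 + r).prod() - 1
        worst_tail = r.nsmallest(max(1, int(0.05 * len(r)))).mean()  # 5% CVaR
        print(f"{name:>4} {agent:>14}  cum return {cumulative:+.2%}  CVaR(5%) {worst_tail:+.2%}")
```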
