Third-year undergraduate dissertation project, supervised by https://www.nottingham.ac.uk/computerscience/people/thomas.gaertner, focusing on using Natural Language Processing, AI Portfolio Optimisation and Machine Learning to produce an Automated Trading Agent. Received 89% overall.
Your random forest is using a randomized train/test split. Your technical indicators embed prior-day information in their calculations, so with a randomized split a test row shares information with neighbouring rows that ended up in the training set, which leaks data between the two. The paper in your reference [10] has the same issue (https://www.reddit.com/r/algotrading/comments/cv83yh/overfitting).
For time series analysis, you should not use randomized splits, as that is not how data would be received in a real environment.
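
To make the alternative concrete, here is a minimal sketch of a chronological split and an expanding-window (walk-forward) split in plain Python. The function names `chronological_split` and `walk_forward_splits`, the 80/20 cut and the toy `days` list are my own illustrative assumptions, not anything from your repository. It assumes your rows are already sorted oldest to newest.

```python
from typing import Iterator, List, Sequence, Tuple


def chronological_split(rows: Sequence, test_fraction: float = 0.2) -> Tuple[Sequence, Sequence]:
    """Split time-ordered rows so every test row is later than every training row.

    Hypothetical helper for illustration; assumes rows are sorted oldest first.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    cut = int(len(rows) * (1.0 - test_fraction))
    return rows[:cut], rows[cut:]


def walk_forward_splits(
    n_rows: int, n_folds: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield expanding-window (train_idx, test_idx) pairs, oldest fold first.

    Each fold trains on everything before its test window, so no future
    row is ever visible at training time.
    """
    first_test_start = n_rows - n_folds * test_size
    if first_test_start <= 0:
        raise ValueError("not enough rows for the requested folds")
    for k in range(n_folds):
        start = first_test_start + k * test_size
        yield list(range(start)), list(range(start, start + test_size))


# Toy data: ten trading days, indexed 0 (oldest) to 9 (newest).
days = list(range(10))

# Single hold-out: train on the first 8 days, test on the last 2.
train, test = chronological_split(days, test_fraction=0.2)

# Three walk-forward folds, each tested on the 2 days after its training window.
folds = list(walk_forward_splits(n_rows=len(days), n_folds=3, test_size=2))
```

If you would rather stay within scikit-learn, `train_test_split(..., shuffle=False)` gives the single ordered hold-out and `TimeSeriesSplit` gives expanding-window folds along the same lines.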