Improving a Language Model's Inductive Bias with Q*
Q*, a hybridisation of Q-learning and the pathfinding algorithm A*, has the potential to enhance the inductive bias of a language model in tasks that demand certain types of reasoning. Q* is described at https://lnkd.in/giMTvSQR, and an implementation built on a watsonx language model is available at https://github.com/jamesdhope/q--deliberate-planning-watsonx, with the following parameters and adaptations:
Jul 10, 2024
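
To make the idea of combining Q-learning with A* more concrete, here is a minimal sketch of a best-first search that ranks partial paths by f = g + lam * Q, where g scores the path so far and Q estimates the value still to come. This is an illustration under stated assumptions, not the code in the linked repository: the function names (`q_star_search`, `expand`, `g_score`, `q_value`), the weight `lam`, and the toy number puzzle are all hypothetical, and the Q estimate is a hand-written stand-in rather than a learned Q-function or a language model call.

```python
import heapq
from typing import Callable, Iterable, List, Optional

Path = List[int]


def q_star_search(
    start: int,
    expand: Callable[[int], Iterable[int]],
    is_terminal: Callable[[int], bool],
    g_score: Callable[[Path], float],
    q_value: Callable[[Path], float],
    lam: float = 1.0,
    max_expansions: int = 1000,
) -> Optional[Path]:
    """A*-style best-first search scored by f = g + lam * Q (higher is better)."""
    counter = 0  # tie-breaker so the heap never has to compare paths
    first = [start]
    # heapq is a min-heap, so store the negated score to pop the best path first.
    frontier = [(-(g_score(first) + lam * q_value(first)), counter, first)]
    visited = set()
    expansions = 0
    while frontier and expansions < max_expansions:
        _, _, path = heapq.heappop(frontier)
        state = path[-1]
        if is_terminal(state):
            return path
        if state in visited:
            continue
        visited.add(state)
        expansions += 1
        for nxt in expand(state):
            new_path = path + [nxt]
            f = g_score(new_path) + lam * q_value(new_path)
            counter += 1
            heapq.heappush(frontier, (-f, counter, new_path))
    return None


# Toy problem (hypothetical): reach TARGET from 1 using "+1" or "*2" steps.
TARGET = 13


def expand(n: int) -> List[int]:
    # Candidate next states; in an LLM setting these would be candidate next steps.
    return [m for m in (n + 1, n * 2) if m <= TARGET]


def g_score(path: Path) -> float:
    # Accumulated utility so far: each step taken costs 1.
    return -float(len(path) - 1)


def q_value(path: Path) -> float:
    # Stand-in for a learned Q estimate: closer to the target scores higher.
    return -float(abs(TARGET - path[-1]))


path = q_star_search(1, expand, lambda n: n == TARGET, g_score, q_value)
print(path)
```

Because the stand-in Q estimate is not guaranteed to be admissible in the A* sense, the path returned is valid but not necessarily the shortest; raising or lowering `lam` shifts the balance between the cost already paid and the estimated future value.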