Fine-tune-CoT is proposed, a method that leverages the capabilities of very large LMs to generate reasoning samples and teach smaller models via fine-tuning; this enables substantial reasoning capability in small models, whereas previous prompt-based baselines exhibit near-random performance.
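A minimal sketch of this distillation recipe, under stated assumptions: `teacher_generate` is a stand-in for a call to the very large teacher LM, and the answer-matching filter is a crude illustration rather than the paper's exact filtering rule.

```python
def teacher_generate(prompt: str) -> str:
    """Placeholder for sampling a step-by-step rationale from a large teacher LM."""
    return "Step 1: ... Step 2: ... The answer is 42"  # dummy output for the sketch

def build_distillation_set(problems: list[tuple[str, str]]) -> list[dict]:
    """Collect (prompt, rationale) pairs whose rationale reaches the gold answer."""
    samples = []
    for question, gold in problems:
        rationale = teacher_generate(f"Q: {question}\nA: Let's think step by step.")
        if gold in rationale:  # keep only rationales consistent with the gold label
            samples.append({"prompt": f"Q: {question}\nA:", "completion": rationale})
    return samples

# The retained pairs are then used for ordinary supervised fine-tuning of a
# much smaller student model.
```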
Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen
Under both few-shot and zero-shot settings, PoT shows an average performance gain of around 12% over CoT across all evaluated datasets, and combining PoT with self-consistency decoding achieves SoTA performance on all math problem datasets and near-SoTA performance on financial datasets.
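A minimal sketch of Program-of-Thoughts decoding with self-consistency, assuming `sample_program` stands in for sampling a Python program from the LM and that generated programs store their result in a variable named `ans` (an illustrative convention, not a confirmed detail of the paper's setup).

```python
from collections import Counter

def run_program(code: str):
    """Execute a generated program; PoT delegates the computation to the interpreter."""
    env: dict = {}
    try:
        exec(code, env)
        return env.get("ans")  # read off the program's answer variable (assumption)
    except Exception:
        return None            # discard programs that fail to run

def pot_self_consistency(sample_program, question: str, k: int = 20):
    """Sample k programs, execute each, and majority-vote over the answers."""
    answers = [run_program(sample_program(question)) for _ in range(k)]
    votes = Counter(a for a in answers if a is not None)
    return votes.most_common(1)[0][0] if votes else None
```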
This paper provides a comprehensive survey of cutting-edge research on reasoning with language model prompting, introducing research works with comparisons and summaries and offering systematic resources to help beginners.
Zhuosheng Zhang, Aston Zhang, Mu Li, Hai Zhao, G. Karypis, Alexander J. Smola
This work proposes Multimodal-CoT, which incorporates language (text) and vision (images) modalities into a two-stage framework that separates rationale generation from answer inference, so that answer inference can leverage better rationales generated from multimodal information.
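A sketch of the two-stage inference flow; the model interfaces and the way the rationale is fed back into the answer stage are placeholder assumptions, and only the stage separation mirrors the described framework.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class MultimodalCoT:
    rationale_model: Callable[[str, Any], str]  # (text, image features) -> rationale
    answer_model: Callable[[str, Any], str]     # (text + rationale, image features) -> answer

    def infer(self, text: str, image_feats: Any) -> str:
        # Stage 1: generate a rationale grounded in both modalities.
        rationale = self.rationale_model(text, image_feats)
        # Stage 2: infer the answer conditioned on the generated rationale.
        return self.answer_model(f"{text}\nRationale: {rationale}", image_feats)
```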
Thomas Carta, Clément Romac, Thomas Wolf, S. Lamprier, Olivier Sigaud, P. Oudeyer
This paper considers an agent that uses an LLM as a policy which is progressively updated as the agent interacts with the environment, leveraging online reinforcement learning to improve its goal-solving performance.
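A sketch of the interaction loop this implies; `llm_policy`, `update_policy`, and the environment interface are all assumptions made for illustration (the described approach uses online RL in an interactive environment, and a PPO-style update is one plausible instantiation).

```python
def run_episode(env, llm_policy, update_policy, max_steps: int = 50):
    """Collect one episode with the LLM acting as the policy, then update it."""
    obs = env.reset()                        # textual observation of the goal/state
    trajectory = []
    for _ in range(max_steps):
        action = llm_policy(obs)             # LLM selects a textual action
        next_obs, reward, done = env.step(action)
        trajectory.append((obs, action, reward))
        obs = next_obs
        if done:
            break
    update_policy(trajectory)                # e.g., a PPO-style update on the collected steps
    return trajectory
```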