Modular LLM Architecture with Pluggable Reasoning Heads: A Scalable Approach to Multi-Modal AI Reasoning

dc.contributor.author	Anurag Pathak
dc.contributor.author	Dilip Kumar Sharma
dc.contributor.author	Harshada Agrawal
dc.contributor.author	Rathachai Chawuthai
dc.contributor.author	Jirayu Petchhan
dc.date.accessioned	2026-05-08T19:25:55Z
dc.date.issued	2025-08-05
dc.description.abstract	This paper introduces a modular architecture for Large Language Models (LLMs) that incorporates pluggable, domain-specific reasoning heads to augment the model's capabilities beyond conventional text generation. Central to our approach is an attention-routing controller that classifies user prompts and dispatches them to the appropriate reasoning module (a symbolic, logical, or graph-based head) based on the prompt's structure and intent. This design enables hybrid, multi-paradigm reasoning without requiring retraining or fine-tuning of the base LLM. By decoupling reasoning tasks from general language understanding, our system improves both computational efficiency and interpretability. We demonstrate the architecture using Groq as the base LLM and integrate lightweight engines such as SymPy for symbolic mathematics and custom modules for logical and graph-based reasoning. Experiments across a suite of structured and unstructured prompts show a significant reduction in inference latency and token usage, along with higher accuracy and better explainability. The proposed framework offers a scalable foundation for embedding modular reasoning capabilities into modern LLM-driven applications.
dc.identifier.doi	10.1109/etncc66224.2025.11299615
dc.identifier.uri	https://dspace.kmitl.ac.th/handle/123456789/20338
dc.subject	Topic Modeling
dc.subject	Advanced Graph Neural Networks
dc.subject	Multimodal Machine Learning Applications
dc.title	Modular LLM Architecture with Pluggable Reasoning Heads: A Scalable Approach to Multi-Modal AI Reasoning
dc.type	Article

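The abstract describes an attention-routing controller that dispatches prompts to pluggable reasoning heads, with SymPy backing the symbolic head. Below is a minimal, illustrative sketch of that routing idea, assuming a rule-based stand-in for the controller and hypothetical names (route, symbolic_head, base_llm_head); it is not the authors' implementation, whose controller classifies prompts from attention features rather than pattern matching.

# Illustrative sketch only: a toy routing controller dispatching prompts to a
# pluggable SymPy-backed symbolic head or a stubbed base-LLM fallback.
# All names are hypothetical; this is not the paper's implementation.
import re

from sympy import SympifyError, simplify, sympify


def symbolic_head(prompt: str) -> str:
    """Pluggable symbolic-math head backed by SymPy."""
    # Pull out the first expression-like substring (digits and arithmetic
    # operators); a naive stand-in for the prompt parsing a real head would do.
    match = re.search(r"\d[\d\s+\-*/().]*", prompt)
    if match is None:
        return "symbolic head: no expression found"
    try:
        return str(simplify(sympify(match.group().strip())))
    except SympifyError:
        return "symbolic head: could not parse the expression"


def base_llm_head(prompt: str) -> str:
    """Fallback path to the base LLM for unstructured prompts (stubbed here)."""
    return f"[base LLM response to: {prompt!r}]"


def route(prompt: str) -> str:
    """Toy routing controller: classify the prompt and dispatch it to a head."""
    # Arithmetic-looking prompts go to the symbolic head; everything else
    # falls through to the base model.
    if re.search(r"\d\s*[-+*/]\s*\d", prompt):
        return symbolic_head(prompt)
    return base_llm_head(prompt)


if __name__ == "__main__":
    print(route("What is 12*(3 + 4)/6?"))         # routed to the symbolic head -> 14
    print(route("Summarise the plot of Hamlet"))  # routed to the base LLM stub

Because the dispatch decision sits outside the base model, a new head can be registered without retraining the LLM, which is the decoupling the abstract emphasizes.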