Extractive Text Summarization for Malayalam News Articles

by Devi B S, Dr. Rani Koshy

Published: April 18, 2026 • DOI: 10.51244/IJRSI.2026.1303000223

Abstract

The rapid increase in textual data across digital environments has made automatic text processing an essential component of Natural Language Processing (NLP). Extractive approaches evaluate, identify, and select the most relevant sentences, and are considered efficient, interpretable, and systematic alternatives to abstractive methods. Previous methods based on statistical or rule-based techniques have struggled to capture meaningful semantic relationships and contextual relevance. To address these limitations, this study proposes a headline-guided extractive model that combines multilingual transformer embeddings with linguistic cues to improve relevance and information retention. The system selects sentences based on semantic similarity and syntactic importance so that the generated summaries are coherent and concise. It also reduces redundancy, which improves its applicability to real-world tasks.
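
The headline-guided selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions. The function names (`bow_embed`, `cosine`, `summarize`) and the parameters `k` and `lam` are hypothetical. A bag-of-words vector stands in for the multilingual transformer embeddings so that the example runs without external models. Redundancy is reduced with a greedy Maximal Marginal Relevance (MMR) style score, which is a technique swapped in for illustration since the abstract does not name one. The syntactic-importance and linguistic cues are not modelled here.

```python
import math
from collections import Counter
from typing import Callable, Dict, List

Vector = Dict[str, float]


def bow_embed(text: str) -> Vector:
    """Stand-in embedding: lowercase whitespace-token counts.

    Assumption: a real system would call a multilingual transformer
    encoder here; this placeholder only keeps the sketch self-contained.
    """
    return dict(Counter(text.lower().split()))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(v * b.get(key, 0.0) for key, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(
    headline: str,
    sentences: List[str],
    k: int = 2,
    lam: float = 0.7,
    embed: Callable[[str], Vector] = bow_embed,
) -> List[str]:
    """Pick up to k sentences that are close to the headline and not redundant.

    score = lam * sim(sentence, headline)
            - (1 - lam) * max sim(sentence, already chosen)
    (an MMR-style rule, assumed here for illustration).
    """
    headline_vec = embed(headline)
    vecs = [embed(s) for s in sentences]
    relevance = [cosine(v, headline_vec) for v in vecs]

    chosen: List[int] = []
    while len(chosen) < min(k, len(sentences)):
        best_index, best_score = -1, -math.inf
        for i in range(len(sentences)):
            if i in chosen:
                continue
            # Penalise overlap with sentences already in the summary.
            redundancy = max((cosine(vecs[i], vecs[j]) for j in chosen), default=0.0)
            score = lam * relevance[i] - (1.0 - lam) * redundancy
            if score > best_score:
                best_index, best_score = i, score
        chosen.append(best_index)

    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    # Toy English data, used only to keep the example readable.
    article = [
        "flood relief work began in kerala",
        "flood relief work began in kerala today",
        "the weather was pleasant elsewhere",
        "kerala government announced relief funds",
    ]
    for line in summarize("flood relief kerala", article, k=2):
        print(line)
```

In this toy run the second sentence is nearly a duplicate of the first, so the redundancy penalty steers the second pick toward the sentence about relief funds instead. Replacing `bow_embed` with a function that wraps a multilingual sentence encoder would bring the sketch closer to the approach the abstract describes, under the same selection rule.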