Evaluating the Impact of Prompt Engineering on Factual Accuracy and Hallucination in Large Language Models

by Diya Jain, Dr. Deepti Sharma, Pallak Anand

Published: April 30, 2026 • DOI: 10.51244/IJRSI.2026.1304000068

Abstract

The propensity of large language models (LLMs) to generate factually unsupported yet linguistically convincing text, commonly referred to as hallucination, poses a fundamental obstacle to their adoption in accuracy-critical settings. This paper investigates whether prompt engineering techniques can meaningfully reduce hallucination and strengthen user-perceived factual reliability. A sequential mixed-methods design was employed. A systematic review of fourteen peer-reviewed sources spanning 2017–2026 was combined with an original empirical survey of 96 participants [15], who evaluated AI-generated responses under three prompting conditions: basic (A), structured (B), and detailed/context-rich (C). Perceived accuracy rates were calculated per question and condition, and a weighted completeness metric was derived to quantify informational depth across conditions. Results indicate that 56.3% of respondents maintain only partial trust in AI-generated facts and that users systematically prefer brief responses irrespective of their informational completeness, a behavioural pattern termed the brevity-trust bias. Step-by-step instruction was the most endorsed prompting strategy (55.2%), independently corroborating the scholarly literature's findings on chain-of-thought prompting. Objective analysis further shows that basic prompts yielded the lowest weighted completeness scores across all five questions despite dominating user preference. The study concludes with a five-component integrated mitigation framework combining user-side prompting, retrieval-augmented generation (RAG), reinforcement learning from human feedback (RLHF), automated fact-checking, and structured user education.