Efficient Context Filtering for Extractive Question Answering: A Hybrid Approach with Semantic Validation

DOI:

https://doi.org/10.32996/jcsts.2026.8.2.1

Keywords:

Extractive Question Answering, Large Language Models, Context Filtering, Hybrid Similarity Metrics, Efficiency Optimization

Abstract

Extractive question answering on lengthy documents remains computationally expensive due to quadratic attention complexity and context truncation requirements in modern language models. This work proposes a hybrid context filtering framework that combines classical similarity metrics, including cosine similarity and Word Mover’s Distance, with the Bitap algorithm, and utilizes selective LLM-based validation to reduce inference cost while maintaining competitive accuracy. The method filters irrelevant sentences before passage encoding, thereby reducing computational overhead without requiring learned retrieval components. Evaluation on SQuAD 2.0 across four open-source models (Llama 2 8B, T5-3B, Flan-T5-XL, mT5-Base) using 5-shot learning and fine-tuning demonstrates a 2.3× inference speedup and 58% latency reduction with a modest accuracy trade-off of 5.7% relative F1 degradation compared to full-context baselines. Component ablation confirms the synergistic contribution of each similarity metric, while robustness evaluation across various context lengths and out-of-distribution settings validates the method’s generalization capabilities. These results indicate that intelligent, parameter-free context filtering can achieve meaningful computational efficiency without necessitating complex learned retrievers.
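The sentence-filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a bag-of-words cosine score and a substitution-tolerant Bitap match on question keywords, mixed with a hypothetical weight `alpha`, and it omits Word Mover's Distance (which needs word embeddings) and the LLM-based validation step. The function names, the stopword list, and all thresholds are illustrative assumptions.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "is", "was", "were", "of", "in", "on", "to",
             "when", "what", "who", "where", "which", "how", "did", "does"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words term-count vectors."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def bitap_fuzzy(text, pattern, max_errors=0):
    """Bit-parallel Bitap search: True if `pattern` occurs in `text`
    with at most `max_errors` character substitutions."""
    m = len(pattern)
    if m == 0:
        return True
    # masks[ch] has bit i set when pattern[i] == ch.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    full = (1 << m) - 1
    accept = 1 << (m - 1)
    # states[d] bit i: pattern[:i+1] matches here with <= d mismatches.
    states = [0] * (max_errors + 1)
    for ch in text:
        mask = masks.get(ch, 0)
        prev_old = 0
        for d in range(max_errors + 1):
            old = states[d]
            new = ((old << 1) | 1) & mask          # extend with a matching char
            if d > 0:
                new |= (prev_old << 1) | 1         # extend with one substitution
            states[d] = new & full
            prev_old = old
        if states[max_errors] & accept:
            return True
    return False


def filter_context(question, sentences, keep=2, alpha=0.5, max_errors=1):
    """Keep the `keep` highest-scoring sentences, in their original order.
    `alpha` (assumed) weights cosine similarity against fuzzy keyword hits."""
    keywords = [t for t in tokenize(question) if t not in STOPWORDS]
    scores = []
    for sentence in sentences:
        cos = cosine(keywords, tokenize(sentence))
        lowered = sentence.lower()
        # Allow substitutions only for longer keywords to limit false hits.
        hits = sum(
            bitap_fuzzy(lowered, kw, max_errors if len(kw) > 4 else 0)
            for kw in keywords
        )
        lexical = hits / len(keywords) if keywords else 0.0
        scores.append(alpha * cos + (1 - alpha) * lexical)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:keep]
    return [sentences[i] for i in sorted(top)]
```

Under these assumptions, only the retained sentences would be passed to the QA model, which is where a shorter input could reduce attention cost. A real pipeline would also have to handle sentence splitting and tune `keep` and `alpha` on held-out data.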

Author Biographies

  • Vahid Ghanbarizadeh, Florida Atlantic University (FAU), Boca Raton, USA

    Vahid Ghanbarizadeh holds a master’s degree in Artificial Intelligence, Department of Electrical Engineering and Computer Science

  • Amin Moeinian, HEC Montreal, Montreal, Canada

    Amin Moeinian holds a master’s degree in Applied Financial Economics, Department of Applied Economics

  • Zahra Younes Pour Langaroudi, University of Trieste, Trieste, Italy

    Master of Science, Trieste Department of Mathematics, Informatics and Geosciences

  • Mohsen Mohammadagha, University of Texas at Arlington, Texas, USA

    Ph.D. Candidate at the Department of Civil Engineering

  • Athar Sharifi, Padua University, Padua, Italy

    Master in medical biotechnology

Published

2026-01-25

Section

Research Article

How to Cite

Ghanbarizadeh, V., Moeinian, A., Younes Pour Langaroudi, Z., Mohammadagha, M., & Sharifi, A. (2026). Efficient Context Filtering for Extractive Question Answering: A Hybrid Approach with Semantic Validation. Journal of Computer Science and Technology Studies, 8(2), 01-09. https://doi.org/10.32996/jcsts.2026.8.2.1