Contextual Retrieval-Augmented Generation: A Serverless Architecture Using AWS Kendra and Claude
DOI:
https://doi.org/10.32996/jcsts.2025.7.9.56Keywords:
Retrieval-Augmented Generation, Serverless Architecture, Enterprise Search, Language Model Integration, Cloud ComputingAbstract
Retrieval-Augmented Generation frameworks represent a transformative paradigm in enterprise information systems, combining intelligent document retrieval with advanced language model capabilities to deliver contextually relevant responses to complex queries. This technical contribution presents a comprehensive serverless architecture that integrates cloud-based intelligent search services with external language model APIs through managed compute orchestration and gateway services. The implementation leverages microservices-oriented design principles to create a fully managed, scalable solution for enterprise knowledge management that automatically adjusts to varying workloads without traditional infrastructure management overhead. The architecture comprises four primary layers: API gateway management for request handling, elastic compute orchestration, intelligent document retrieval with semantic understanding capabilities, and external language model integration for contextual generation. Performance evaluation demonstrates robust scalability characteristics supporting enterprise-scale deployments with consistent response quality across various document types including technical documentation, policy documents, and knowledge base interactions. The system demonstrates improved contextual relevance scoring through sophisticated semantic understanding algorithms and comprehensive uptime reliability. The implementation includes robust security systems, industry standard encryption and fine grain access controls, as well as audit logging for enterprise compliant obligations.