AI & LLM Systems Engineer

Building production-ready AI platforms with a focus on LLM systems, AI agents, and machine learning engineering. London, UK.

About

My journey into AI systems began with a foundation in software development and cloud computing architecture. After completing my MSc in Cloud Computing, I recognized that the next frontier in building scalable, intelligent platforms lay in operationalizing machine learning models and large language systems.

Today, I focus on designing and deploying production AI systems that solve real business problems. My work centers on LLM-based agents, document intelligence platforms, and automated workflows that leverage both pretrained and fine-tuned models. I prioritize system design thinking—ensuring that AI components integrate cleanly with existing infrastructure, handle scale, and maintain reliability.

Beyond implementation, I care deeply about architecture decisions: when to use pretrained models versus fine-tuning, how to structure agent-based systems for maintainability, and how to build AI platforms that are both secure and compliant with regulatory requirements.

Production code I've built remains private under organization accounts, as is standard for proprietary platforms. I share technical insights through architecture documentation, system design case studies, and technical decision records that focus on patterns and principles rather than implementation details.

AI & LLM Systems

Production systems built with LLMs, focusing on architecture decisions, model selection, and operational concerns.

VoxyDocs: Agent-Based Document Intelligence

Problem Statement

Organizations process thousands of documents daily with varying formats and structures. Traditional OCR and template-based extraction fail when document layouts change, creating maintenance overhead and yielding poor accuracy on unstructured content.

Why LLMs

LLMs excel at understanding context, extracting structured information from unstructured text, and handling document variations without rigid templates. Agent-based architectures allow for multi-step reasoning, validation, and error recovery that traditional pipelines cannot provide.

Architecture Overview

Multi-agent system with specialized roles: document parser agent (handles format detection and text extraction), extraction agent (uses LLM to identify key fields), validation agent (cross-checks extracted data against business rules), and enrichment agent (adds metadata and classifications). Agents communicate via a message queue, with a coordinator orchestrating workflow state.
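
A minimal sketch of the coordination pattern follows; the message shape, agent names, and in-process pipeline are illustrative stand-ins for the production broker-based system:

```python
# Sketch of the coordinator pattern: a job advances through agent stages
# in order, and any recorded error halts the pipeline for human review.
# The production system routes these messages through Redis/RabbitMQ.
from dataclasses import dataclass, field


@dataclass
class DocumentJob:
    doc_id: str
    raw_text: str = ""
    fields: dict = field(default_factory=dict)
    errors: list = field(default_factory=list)


def parser_agent(job: DocumentJob) -> DocumentJob:
    job.raw_text = f"<text extracted from {job.doc_id}>"  # stand-in for OCR/parsing
    return job


def extraction_agent(job: DocumentJob) -> DocumentJob:
    job.fields = {"invoice_number": "INV-001"}  # stand-in for an LLM call
    return job


def validation_agent(job: DocumentJob) -> DocumentJob:
    if not job.fields.get("invoice_number"):
        job.errors.append("missing invoice_number")
    return job


def enrichment_agent(job: DocumentJob) -> DocumentJob:
    job.fields["doc_class"] = "invoice"
    return job


PIPELINE = [parser_agent, extraction_agent, validation_agent, enrichment_agent]


def coordinator(job: DocumentJob) -> DocumentJob:
    # Advance the job through each stage, stopping as soon as a stage
    # records an error so it can be routed to review instead.
    for stage in PIPELINE:
        job = stage(job)
        if job.errors:
            break
    return job


print(coordinator(DocumentJob(doc_id="doc-42")))
```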

Data Flow

Documents enter via API or batch upload → storage layer (S3/GCS) → parser agent extracts raw text → extraction agent processes text through LLM API with structured output format → validation agent reviews extractions → enrichment agent adds tags/classifications → results stored in database with confidence scores → webhook/callback notifies client system.

Model Choices

Base extraction uses GPT-4-turbo for structured outputs (JSON mode), chosen for reliability. Fine-tuned Llama 2 13B deployed on GPU instances for high-volume, cost-sensitive workflows. Sentence transformers (all-MiniLM-L6-v2) for semantic search and document similarity. Embeddings stored in a vector database (Pinecone) for retrieval-augmented workflows.
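
As a rough illustration of the JSON-mode extraction path, the sketch below shows the shape of the call; the prompt and target fields are hypothetical:

```python
# Sketch of a structured-output extraction call using the OpenAI SDK's
# JSON mode. The system prompt and the target fields are illustrative.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def extract_fields(document_text: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        response_format={"type": "json_object"},  # forces valid JSON output
        messages=[
            {"role": "system",
             "content": "Extract invoice_number, total_amount, and issue_date "
                        "from the document. Respond in JSON only."},
            {"role": "user", "content": document_text},
        ],
        temperature=0,
    )
    return json.loads(response.choices[0].message.content)
```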

Deployment Considerations

API deployed on Kubernetes with auto-scaling based on queue depth. LLM calls rate-limited and cached where possible. Async processing pipeline (Celery/RQ) for batch jobs. Model endpoints abstracted behind a service layer for easy swapping. Monitoring includes latency percentiles, token usage, cost tracking, and accuracy metrics via human-in-the-loop validation samples.
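
One of those operational pieces, response caching, can be sketched as follows; the key scheme and TTL are illustrative choices rather than the production values:

```python
# Sketch of LLM response caching: identical (model, prompt) pairs are
# served from Redis instead of re-calling the API.
import hashlib
import redis

r = redis.Redis(host="localhost", port=6379)
CACHE_TTL_SECONDS = 24 * 3600  # illustrative TTL


def cached_llm_call(model: str, prompt: str, call_llm) -> str:
    # Deterministic cache key derived from model + prompt.
    key = "llm:" + hashlib.sha256(f"{model}:{prompt}".encode()).hexdigest()
    hit = r.get(key)
    if hit is not None:
        return hit.decode()
    result = call_llm(model, prompt)  # the real API call, injected by the caller
    r.setex(key, CACHE_TTL_SECONDS, result)
    return result
```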

VoxyDocs: Multi-Agent Architecture
[Architecture diagram: client API / batch upload → S3/GCS storage → message queue (Redis/RabbitMQ) → coordinator/orchestrator → parser agent (format detection, text extraction) → extraction agent (LLM processing, structured output) → validation agent (business rules, cross-check) → enrichment agent (metadata, classification); LLM services (GPT-4-turbo, Llama 2 13B), embeddings in Pinecone vector DB, PostgreSQL results storage, webhook/callback to client]

Job Application Automation AI Agent

Problem Statement

Applying to multiple positions requires customizing cover letters, extracting relevant experience, and matching qualifications to job descriptions. The manual process is time-intensive and often leads to lower-quality applications due to repetition fatigue.

Why LLMs

LLMs can analyze job descriptions, extract key requirements, match them against candidate profiles, and generate personalized application materials. Agent frameworks enable multi-step workflows: research, drafting, review, and submission with human oversight at critical steps.

Architecture Overview

Single autonomous agent with access to multiple tools: resume parser, job board API client, document generator, and email client. Agent maintains conversation state and can iterate on outputs. Human-in-the-loop checkpoints before final submission. Tool-use pattern (ReAct) for deterministic actions, LLM for reasoning and generation.
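
A stripped-down version of that ReAct loop might look like the following; the tool names and JSON decision format are assumptions for illustration:

```python
# Sketch of a ReAct-style loop: the LLM chooses a tool, the runtime
# executes it deterministically, and the observation is fed back until
# the model produces a final answer. Tool names are illustrative.
import json

TOOLS = {
    "parse_resume": lambda arg: {"skills": ["python", "kubernetes"]},
    "fetch_job": lambda arg: {"requirements": ["python", "llm experience"]},
}


def react_loop(llm_step, goal: str, max_steps: int = 8) -> str:
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        # llm_step is the injected LLM call; it is assumed to return JSON
        # like {"action": "fetch_job", "input": "...", "final": null}.
        decision = json.loads(llm_step("\n".join(history)))
        if decision.get("final"):
            return decision["final"]
        tool = TOOLS[decision["action"]]  # deterministic tool execution
        observation = tool(decision.get("input"))
        history.append(f"Action: {decision['action']}")
        history.append(f"Observation: {json.dumps(observation)}")
    return "max steps reached; escalating to human review"
```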

Data Flow

User provides resume and job URLs → agent fetches job descriptions → extracts requirements using LLM → parses resume into structured format → agent generates tailored cover letter with LLM → user reviews → agent revises if needed → final documents generated → submission via email/API (if approved). State is persisted in a database so interrupted runs can be resumed and every step leaves an audit trail.

Model Choices

GPT-4 for reasoning and generation tasks (best quality for personalized content). Claude 3 Opus for complex analysis tasks. OpenAI embeddings (text-embedding-3-small) for semantic matching between resume sections and job requirements. No fine-tuning needed—prompt engineering and RAG techniques provide sufficient personalization.
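
The semantic-matching step can be sketched as cosine similarity over text-embedding-3-small vectors; the helper names below are illustrative:

```python
# Sketch of semantic matching between resume sections and a job
# requirement using OpenAI embeddings and cosine similarity.
import math
from openai import OpenAI

client = OpenAI()


def embed(texts: list[str]) -> list[list[float]]:
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [item.embedding for item in response.data]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def best_match(requirement: str, resume_sections: list[str]) -> tuple[str, float]:
    # Embed the requirement and all sections in one batched call,
    # then return the section with the highest similarity score.
    vectors = embed([requirement] + resume_sections)
    req_vec, section_vecs = vectors[0], vectors[1:]
    scored = [(sec, cosine(req_vec, vec))
              for sec, vec in zip(resume_sections, section_vecs)]
    return max(scored, key=lambda pair: pair[1])
```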

Deployment Considerations

Agent runs as a long-lived process (or serverless function with state management) with API endpoints for job initiation and status checks. Rate limiting per user to prevent abuse. Output caching for repeated job types. Human approval workflow integrated via webhooks. All LLM interactions logged for compliance and improvement.
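
As a sketch of the per-user rate limiting, a fixed-window counter in Redis is one simple approach; the limit and window below are placeholder values:

```python
# Sketch of per-user rate limiting with a fixed window in Redis:
# each user gets N agent runs per hour.
import redis

r = redis.Redis()
MAX_RUNS_PER_HOUR = 10  # illustrative limit


def allow_run(user_id: str) -> bool:
    key = f"ratelimit:{user_id}"
    count = r.incr(key)        # atomic increment
    if count == 1:
        r.expire(key, 3600)    # start the one-hour window on first use
    return count <= MAX_RUNS_PER_HOUR
```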

Job Application Automation: Agent Architecture
[Architecture diagram: user input (resume, job URLs, preferences) → autonomous agent (GPT-4 / Claude 3 Opus, ReAct pattern, state management, tool use) → tools: resume parser (structured data extraction), job board API (fetch descriptions, requirements), document generator (cover letters, applications), email client (submission, tracking); embeddings (text-embedding-3) for resume ↔ job semantic matching; human-in-the-loop review and approval checkpoint; state storage for conversation history and audit trail]

Secure Document Verification Workflows

Problem Statement

Financial and legal institutions need to verify document authenticity, extract sensitive information securely, and ensure compliance with regulations (GDPR, KYC). Traditional manual review is slow, expensive, and inconsistent. Automated systems must balance accuracy with privacy and auditability.

Why LLMs

LLMs can extract structured data from identity documents, contracts, and certificates while maintaining high accuracy. Combined with cryptographic verification and zero-knowledge techniques, they enable automated processing while preserving privacy. LLMs can also detect anomalies and flag documents requiring human review.

Architecture Overview

Hybrid system: LLM extracts structured data, cryptographic verification validates document integrity, rule-based engine applies compliance checks. Documents encrypted at rest and in transit. Extraction happens in isolated, audited environment. PII redaction layer before any external API calls. Audit log captures all operations for compliance reporting.
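
The PII redaction layer can be sketched as placeholder substitution before text crosses the trust boundary; production detection uses NER models, so the regex patterns below are purely illustrative:

```python
# Sketch of the redaction idea: detected PII spans are replaced with
# typed placeholders before any text is sent to an external API.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


def redact(text: str) -> tuple[str, dict]:
    # Returns redacted text plus a mapping so originals can be restored
    # inside the trusted environment after the external call returns.
    mapping = {}
    for label, pattern in PII_PATTERNS.items():
        # dict.fromkeys deduplicates matches while preserving order
        for i, match in enumerate(dict.fromkeys(pattern.findall(text))):
            placeholder = f"[{label}_{i}]"
            mapping[placeholder] = match
            text = text.replace(match, placeholder)
    return text, mapping
```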

Data Flow

Document uploaded with encryption → stored in secure vault → decryption in isolated processing environment → LLM extraction with PII detection → structured data stored separately from source documents → compliance rules engine evaluates data → results encrypted and returned → audit log updated. Original documents deleted after verification window unless retention required.

Model Choices

GPT-4 with structured outputs for reliable extraction. Local model (Llama 2 fine-tuned) for sensitive PII fields to avoid sending data externally. Named entity recognition models for PII detection. Custom classification model for document type detection. No data sent to external APIs for highest-security workflows; all processing on-premises when required.
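
A sketch of the sensitivity-based routing: the field list and function names are assumptions, but the shape of the decision is the point:

```python
# Sketch of routing by field sensitivity: PII fields go to the local
# model; everything else may use the external API (behind redaction).
SENSITIVE_FIELDS = {"passport_number", "national_id", "bank_account"}


def route_extraction(field: str, text: str, local_model, external_model) -> str:
    if field in SENSITIVE_FIELDS:
        return local_model(field, text)    # never leaves the environment
    return external_model(field, text)     # redacted text only (see above)
```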

Deployment Considerations

Deployed in private cloud or on-premises for sensitive workloads. API endpoints behind authentication and rate limiting. Processing pipeline uses secure enclaves where available. Database encryption at rest. Regular security audits and penetration testing. Compliance reports generated automatically. Fail-closed design: any uncertainty triggers human review rather than auto-approval.
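
The fail-closed rule reduces to a small decision function; the threshold below is an illustrative value, not the production setting:

```python
# Sketch of the fail-closed rule: verification auto-approves only when
# every check passes with high confidence; any doubt routes to a human.
CONFIDENCE_THRESHOLD = 0.95  # illustrative threshold


def verification_decision(checks: dict[str, float]) -> str:
    # checks maps check name -> confidence score in [0, 1]
    if all(score >= CONFIDENCE_THRESHOLD for score in checks.values()):
        return "auto_approved"
    return "human_review"  # fail closed: uncertainty never auto-approves


print(verification_decision({"integrity": 0.99, "identity_match": 0.90}))
# -> human_review
```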

Secure Document Verification: Hybrid Architecture
[Architecture diagram: encrypted document upload (TLS) → secure vault (encrypted storage, KMS) → isolated processing environment → LLM extraction with PII detection and structured output; sensitive PII handled by the local fine-tuned Llama 2 model, GPT-4 API called only behind a PII redaction layer → compliance rules engine (GDPR/KYC, business rules) and cryptographic verification (document integrity, authenticity) → structured data stored in an encrypted DB separate from source documents → immutable audit log of all operations for compliance reporting → encrypted results returned to client (verification status, extracted data); flagged cases go to human review under the fail-closed design]

Case Studies

Production systems built and deployed, with emphasis on architecture, system design, and operational considerations. Detailed case studies are available in the architecture documentation repository.

VoxyDocs

AI document intelligence platform for automated data extraction and processing


Business Problem

Organizations struggle with extracting structured data from diverse document formats. Manual data entry is slow and error-prone, while template-based automation breaks when document layouts change. This creates bottlenecks in processes like invoice processing, contract analysis, and form handling.

Technical Solution

Built an agent-based system where specialized AI agents handle different stages: parsing, extraction, validation, and enrichment. The system uses LLMs for context-aware extraction that adapts to document variations. Implemented a microservices architecture with async processing pipelines, allowing horizontal scaling. Added confidence scoring and human-in-the-loop workflows for high-stakes extractions. The platform exposes REST APIs and webhooks for integration with existing systems.

Tech Stack

Python · FastAPI · OpenAI GPT-4 · LangChain · PostgreSQL · Redis · Celery · Docker · Kubernetes · AWS S3 · Pinecone

Key Learnings

Agent-based architectures require careful state management and error handling. Learned to balance between autonomous agents and deterministic validation steps. Cost optimization became critical—implemented caching, batch processing, and model routing (using cheaper models for simple extractions, expensive ones only when needed). Found that structured outputs (JSON mode) dramatically improved reliability over free-form extraction.
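
That model-routing heuristic can be sketched as a simple decision function; the signals, thresholds, and model names are illustrative:

```python
# Sketch of cost-based model routing: simple, short documents go to a
# cheaper model, and only long or ambiguous ones escalate.
def choose_model(document_text: str, doc_type_confidence: float) -> str:
    if doc_type_confidence > 0.9 and len(document_text) < 4000:
        return "gpt-3.5-turbo"   # cheap path for well-understood layouts
    return "gpt-4-turbo"         # expensive path for long or ambiguous docs
```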


Verilett

Secure rental and payments platform with automation workflows


Business Problem

Rental property management involves complex workflows: tenant verification, contract generation, payment processing, and compliance tracking. Manual processes lead to delays, errors, and security risks. Property managers need automation while maintaining trust and legal compliance.

Technical Solution

Developed a full-stack platform combining document verification, automated contract generation using LLMs, payment processing with Stripe, and workflow automation. Built document verification system that extracts and validates tenant information from IDs and financial documents. Automated contract generation personalizes standard templates based on property and tenant data. Integrated payment scheduling and reminders. All sensitive operations are logged for audit compliance.
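
A sketch of the contract-generation split between deterministic templating and LLM drafting; the template text and field names are hypothetical:

```python
# Sketch of template-driven contract generation: legally significant
# fields are filled programmatically, and the LLM only drafts clearly
# scoped clauses so the model never paraphrases critical terms.
from string import Template

CONTRACT_TEMPLATE = Template(
    "Tenancy agreement between $landlord and $tenant for $address.\n"
    "Monthly rent: $rent. Term: $term_months months.\n"
    "Special conditions:\n$special_conditions"
)


def generate_contract(data: dict, draft_clause) -> str:
    # draft_clause wraps the LLM call; everything else is deterministic.
    conditions = draft_clause(data.get("notes", "none"))
    return CONTRACT_TEMPLATE.substitute(
        landlord=data["landlord"],
        tenant=data["tenant"],
        address=data["address"],
        rent=data["rent"],
        term_months=data["term_months"],
        special_conditions=conditions,
    )
```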

Tech Stack

Next.js · TypeScript · Node.js · PostgreSQL · Stripe API · OpenAI API · AWS SES · Docker · Vercel · Prisma

Key Learnings

Security and compliance are non-negotiable when handling financial and personal data. Learned to implement proper encryption, access controls, and audit logging. Document verification accuracy required careful prompt engineering and validation rules. Payment automation needed robust error handling and reconciliation workflows. User experience matters even for B2B tools—automation should feel helpful, not opaque.

Verilett: Rental Platform Architecture
[Architecture diagram: Next.js/TypeScript frontend on Vercel → Node.js REST API with authentication and authorization → document verification (ID extraction, LLM processing, validation) and LLM-based contract generation (OpenAI API, template personalization) → PostgreSQL via Prisma ORM with encrypted data storage; Stripe API for payment processing, scheduling, and reminders; AWS SES for email notifications and alerts → workflow engine for automation, state management, and orchestration → audit log for compliance tracking]

RoomTo.Live

Real-world product platform connecting property seekers with verified listings


Business Problem

Property search platforms suffer from low-quality listings, fake listings, and poor matching between seekers and properties. Users waste time on irrelevant results or unreliable landlords. The platform needs to ensure listing quality and provide intelligent matching without being intrusive.

Technical Solution

Built a marketplace platform with AI-powered listing verification and recommendation engine. Implemented automated listing quality checks using LLMs to analyze descriptions and flag inconsistencies. Developed matching algorithm combining traditional filters with semantic search (embeddings) for finding properties that match user preferences beyond keywords. Built verification workflows for landlords and automated communication tools. The system uses event-driven architecture to handle high concurrency during peak search times.
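
The hybrid matching can be sketched as hard filters followed by a weighted blend of lexical and semantic scores; the weights and field names are illustrative:

```python
# Sketch of hybrid ranking: hard filters narrow candidates, then a
# weighted blend of keyword and embedding scores orders them.
def rank_listings(listings, filters, keyword_score, semantic_score,
                  alpha: float = 0.4):
    # 1. Hard filters (price, location) are non-negotiable.
    candidates = [listing for listing in listings
                  if listing["price"] <= filters["max_price"]
                  and listing["city"] == filters["city"]]
    # 2. Blend lexical relevance with embedding similarity to the
    #    user's free-text preference description.
    scored = [
        (alpha * keyword_score(listing) + (1 - alpha) * semantic_score(listing),
         listing)
        for listing in candidates
    ]
    return [listing for _, listing in
            sorted(scored, key=lambda pair: pair[0], reverse=True)]
```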

Tech Stack

React · Next.js · Node.js · PostgreSQL · Elasticsearch · OpenAI Embeddings · Redis · WebSockets · AWS Lambda · S3

Key Learnings

Recommendation systems benefit from hybrid approaches—combining collaborative filtering, content-based filtering, and LLM-powered semantic understanding. Learned that user feedback loops are essential for improving recommendations. Handling real-time updates (new listings, availability changes) required careful event sourcing patterns. Marketplace platforms need to balance trust (verification) with friction (too much verification reduces listings).

RoomTo.Live: Marketplace Architecture
[Architecture diagram: React/Next.js frontend with real-time WebSocket UI → Node.js API gateway (REST API, WebSocket server) → listing verification (LLM analysis, quality checks, inconsistency detection) and hybrid recommendation engine (semantic search with OpenAI text-embedding-3 plus collaborative filtering) → PostgreSQL for user data, listings, and metadata; Elasticsearch for full-text search, filtering, and indexing; Redis cache for query results and session data → event stream for real-time updates (new listings, availability) → AWS Lambda for async processing and background jobs; S3 storage for images and documents]

Architecture Documentation

Production systems and proprietary platforms I've built are private under organization accounts. I maintain a public repository of architecture notes, system design documentation, and technical decision records.

System Architecture Notes

A collection of architecture documentation from production AI/LLM systems, including design patterns, decision frameworks, and system design case studies.

Architecture Patterns · System Design · Decision Records · Case Studies

Contents Include

  • Multi-agent architecture patterns
  • LLM system scaling strategies
  • Model selection frameworks
  • Data pipeline architectures
  • Infrastructure deployment patterns
  • Security & compliance approaches

Format

  • Architecture diagrams and schemas
  • Technical decision records (ADRs)
  • System design case studies
  • Trade-off analysis and constraints
  • Production considerations
  • Failure cases and lessons learned

Note: All documentation is sanitized and contains no proprietary code, customer data, or sensitive implementation details. Focus is on architectural patterns, decision-making frameworks, and system design principles applicable across production ML/AI systems.

Resume & Links

Download Resume

PDF resume available for download.


Contact

Interested in discussing AI systems, LLM engineering, or potential opportunities? Get in touch.

Or reach out directly at: pradeepannagarasu@gmail.com