Systems

Project Archetypes

Most names are under NDA. The architecture, scale, and outcomes are not.

Document Intelligence

Turns contracts, regulations, records, and reports into instant, sourced answers - so your team stops hunting through documents manually.

Preprocessing Script with OCR / VL Models

Self-hosted OCR + Vision-Language Model pipeline running on GPU with CUDA optimization. Handles scanned documents, complex layouts, tables, and embedded images. Benchmarked multiple OCR engines (PaddleOCR, Nanonets, MistralOCR, Deepseek OCR, OlmOCR) against a representative corpus to select the optimal engine per document type. Includes intelligent chunking with semantic embedding for downstream RAG.

ORS Health Portal RAG Chatbot

RAG chatbot for ORS.rs, the Serbian national health insurance portal. Answers questions about health coverage, procedures, and documentation requirements by retrieving from official ORS documents - PPTX presentations, SQL databases, and regulatory text. Self-hosted infrastructure: Supabase for data and authentication, Qdrant for vector storage, OpenRouter for multi-model LLM access. Hugging Face embeddings with reranking pipeline. Arize Phoenix observability for tracing retrieval quality and latency. Chunked document retrieval with relevance scoring, source attribution on every answer, and memory for multi-turn conversations about complex healthcare procedures.

Academy Assistant

Multilingual RAG chatbot for a Serbian higher-education institution, fielding student and staff questions about exam schedules, deadlines, and general academic information. Hybrid retrieval splits factual exam queries to a live MySQL table so answers always reflect the current schedule, and routes everything else to semantic search over scraped institutional content (Pinecone, paraphrase-multilingual-MiniLM-L12-v2 embeddings, Gemini synthesis). Custom ingestion pipeline transliterates Cyrillic to Latin, strips HTML boilerplate, and chunks with RecursiveCharacterTextSplitter before embedding. Conversation history and authentication persisted in MySQL; Streamlit frontend for student-facing access.

Agentic RAG for Laws

Multi-agent RAG pipeline for complex legal document corpora. Custom AI chunker segments non-standardized legal text while preserving cross-references. Hybrid retrieval combines BM25 keyword search with vector similarity search, fused via Reciprocal Rank Fusion (RRF) for optimal ranking. LLM-as-Judge validation loop scores retrieved passages for relevance and completeness before answer generation. Metadata enrichment adds jurisdiction, document type, and date filters.

AI Chunking Script

AI agent-driven document chunking pipeline that analyzes document structure and contextualizes chunks with logical boundaries. The agent determines optimal split points based on semantic meaning, section hierarchy, and content type. Falls back to LangChain splitters (RecursiveCharacterTextSplitter, MarkdownHeaderTextSplitter) when agent segmentation is not cost-effective. Cleans headers and footers, normalizes metadata for vector storage, detects tables, and generates image descriptions for embedded visuals. Built to handle real ingestion scale - million-page document collections, heterogeneous sources, and schema normalization across formats that break standard pipelines.

LLM Evaluation System

Automated quality assessment pipeline for LLM question-answering outputs. Built custom metrics alongside DeepEval framework and NLP algorithm metrics for comprehensive benchmarking of retrieval and generation quality. Golden datasets enable rapid comparison of prompting strategies, chunking approaches, and model selections across different providers. Continuous improvement pipeline that feeds evaluation results back into system tuning.

AI Operations & Automation

Replaces repetitive human interaction loops - calls, tickets, lookups, scheduling - with autonomous agents that run 24/7 inside your existing workflow.

Product Support & Installation Chatbot

Website chatbot built for iso.de that answers questions about their product catalog, compares multiple products side-by-side, and delivers detailed installation instructions - all in German. Collects contact details through guided flows with suggested questions to reduce friction. Analytics dashboard tracks conversations, user feedback, and conversion paths. FAQ system with confidence scoring to speed up self-service resolution.

Legal Building Regulations AI

Specialized RAG system for building code and construction regulation documents. Parses legal text with structural awareness - chapters, sections, subsections, and amendments - to maintain hierarchical context. Retrieves exact passages with citation metadata (document, chapter, section, paragraph) so every answer points to the specific regulatory source. Source verification step cross-checks retrieved citations against the original document to prevent hallucinated references.

Text2SQL AI Chatbot

Hybrid analytics chatbot combining structured SQL querying with semantic vector search. Routes queries between OpenAI and Google Gemini based on complexity. Supabase integration for data management, authentication, and real-time sync. Automated retrieval with explanation generation and analytics tracking dashboard.

Knowledge Generator AI

Autonomous crawling agent that structures platform documentation into a queryable knowledge base. Crawls unstructured sources, extracts meaningful content, and organizes it for team and customer access. Generates step-by-step instructions on how to use the web app based on documented features and workflows. Self-updating architecture that detects documentation changes and regenerates affected sections.

Knowledge-Based Chatbot

Query interface over generated knowledge bases. Delivers instant, specific answers from organizational data - not web search. Retrieves from collected company knowledge with source attribution. Designed for teams that need fast answers without hunting through documentation.

Visual AI Assistant

Multimodal extension to the text-based chatbot. Users ask questions in natural language and the system retrieves answers directly from a vector store of images - product manuals, installation diagrams, technical sheets - without OCR or caption generation. ColPali and ColQwen encode each image into late-interaction embeddings that capture fine-grained visual and textual features. At query time, the user's question is embedded and matched against this image vector store via similarity search, returning the most visually and semantically relevant pages. Answers are grounded in the retrieved image content combined with text documentation.

AI Interface for Factory Machine Data

Natural language interface for deeply nested tree/folder structured data. Parses hierarchy of machine specifications, maintenance schedules, and operational status stored in folder structures. Plain-language queries return precise answers without manual navigation through nested directories.

Phone AI Receptionist

AI that answers your phone calls, takes bookings, and handles the full reservation flow through natural conversation. Built on SIP with real-time speech processing, it integrates directly into your dispatch system - no hold times, no extra operators, no missed calls.

Company Information AI Assistant

RAG chatbot over company documentation - product specs, service descriptions, policies, and FAQs. Two response modes: concise answers for quick lookup, and expanded mode that retrieves additional context from related documents for deep dives. Auto-generated suggested questions based on retrieved context guide users to relevant follow-ups. Analytics track which questions are asked most and where the knowledge base has gaps.

AI-Powered Finance Tracker

Full-stack AI finance platform with WhatsApp as the primary user channel. Built and integrated the WhatsApp bot end-to-end: onboarding flow guides users through account setup, income categorization, and goal definition via natural conversation, and weekly AI-generated summaries are delivered directly to WhatsApp with spending breakdowns, trend analysis, and goal progress. AI layer analyzes financial profiles to generate personalized advice and runs calculations on income, expenses, and goals. AI-generated Pro Plans built by asking users detailed questions about their financial situation, risk tolerance, and objectives - producing structured, actionable financial roadmaps. Dynamic UI updates reflect personalized insights per user.

Data Infrastructure & Product

The interfaces, pipelines, and infrastructure that make AI systems actually usable at scale - from web-scale scraping to production mobile apps and enterprise frontends.

Next.js AI Chat Frontend

Production Next.js 14 frontend for AI chat interfaces. JWT silent refresh keeps users authenticated without interruptions. Conversation history with persistent threads. Docker deployment for reproducible builds. Enterprise-ready authentication with role-based access control.

Venue Finder

Conversational AI using Google Places API for venue discovery. Follow-up questioning narrows requirements until search criteria are clear. Web scraping and reranking filter results. Generates personalized outreach emails for selected venues. End-to-end flow from discovery to contact in a single conversation.

Agnostic Website Scraper - 2,000 Sites

Crawl4AI as the primary scraping engine for high-performance, JavaScript-aware crawling at scale. Proxy rotation and stealth mode handle anti-bot evasion. Vision AI fallback (Gemini) processes sites that block even headless browsers. Pydantic schema enforcement ensures structured, consistent output regardless of source heterogeneity. Extracted pricing tiers, feature lists, and company metadata from 2,000+ sites with precision.

AI Voice Mobile App

React Native mobile application with AI voice conversations. Vapi.ai handles real-time speech processing. Supabase provides authentication and data persistence. RevenueCat manages In-App Purchase subscriptions and monetization. Production-ready iOS build for App Store deployment with full subscription lifecycle handling.

Advanced Capabilities

The architectural layer that makes everything above accurate, fast, and deployable on real hardware - not just promising in a demo environment.

AI-Powered Drug Discovery Pipeline

End-to-end molecular simulation pipeline for precision medicine research. Automated protein-ligand docking with AutoDock Vina and GNINA. RAG system built on research papers and clinical datasets, integrated into CellType CLI for interactive research analysis. Proposed AxialBridge patient app for EULAR - ML/CV-based Axial Spondylitis detection from MRI and ultrasound imaging.

Hybrid Graph + Vector RAG System

Neo4j knowledge graph combined with vector embeddings for hybrid retrieval. Outperforms vector-only approaches on multi-hop queries by following relationship paths. Outperforms graph-only approaches on semantic similarity matching. Best of both architectures for complex reasoning tasks.

Multi-Agent Voice Chat (LangGraph)

LangGraph state machine with conditional edges for contextual routing between conversation personas. Real-time speech classification determines which agent handles each user turn. ElevenLabs TTS for natural voice output.

LLM Fine-Tuning Automation

Automated SFT pipeline via API with pre-flight validation. Training file format checking prevents costly failed jobs. Cost estimation before submission. Real-time monitoring of training progress and evaluation metrics. Reusable system that protects against common fine-tuning mistakes.

Local LLM Training Pipeline

MLX-based pipeline for Apple Silicon: LoRA fine-tuning, embeddings generation, reranking, and contrastive learning. All four tasks benchmarked with MTEB for quality validation. 4,000+ tokens per second on M4 with BF16 precision. Self-hosted training without cloud dependency.