Project Detail

AI Learning Platform - Backend

A backend platform composed of independently deployable microservices that provide authentication, user profile management, file storage and streaming, real-time chat, and an AI Retrieval-Augmented-Generation (RAG) pipeline. The system exp...

Backend Engineer / Architect | Duration: 6 months | Type: platform

Key Achievement Metrics

microservices: 7

proto_files: 6

compose_services: 11

Decision Log

Contract-first internal communication with gRPC (protobuf)

Strong typing, generated client/server stubs, efficient binary serialization, natural fit for Java/Python interop.

Trade-off: Browsers cannot call the gRPC services directly, so the gateway has to bridge to REST/HTTP; schema versioning and protobuf tooling add operational complexity.

Per-service Postgres persistence (one DB/schema per service)

Encapsulates schema evolution; reduces blast radius of schema changes.

Trade-off: Requires event-driven patterns for derived state and complicates multi-DB transactions (eventual consistency). Querying across services requires materialized views or read-side replication.
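The read-side replication mentioned in this trade-off can be sketched as a small projection that folds events from several services into one join-free view. This is a minimal illustration under stated assumptions, not the project's code: the event types (`ProfileUpdated`, `FileUploaded`), the field names, and the in-memory dictionary standing in for a read-model table are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class UserSummary:
    """Denormalized read model combining data owned by two services."""
    display_name: str = ""
    file_count: int = 0


@dataclass
class UserSummaryProjection:
    """Builds a view from events instead of querying other services' databases."""
    rows: dict = field(default_factory=dict)

    def apply(self, event: dict) -> None:
        row = self.rows.setdefault(event["user_id"], UserSummary())
        if event["type"] == "ProfileUpdated":    # hypothetical event from a profile service
            row.display_name = event["display_name"]
        elif event["type"] == "FileUploaded":    # hypothetical event from a file service
            row.file_count += 1
        # Unknown event types are ignored so new events do not break the projection.


projection = UserSummaryProjection()
for evt in [
    {"type": "ProfileUpdated", "user_id": "u1", "display_name": "Ada"},
    {"type": "FileUploaded", "user_id": "u1", "file_id": "f1"},
    {"type": "FileUploaded", "user_id": "u1", "file_id": "f2"},
]:
    projection.apply(evt)

print(projection.rows["u1"])
```

The view is only as fresh as the last event applied, which is the eventual consistency the trade-off describes. With at-least-once delivery, a real projection would also need to deduplicate by event id, since the counter here is not idempotent.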

Kafka as asynchronous backbone with dead-letter topics (DLTs)

Enables decoupled scaling, replayability and backpressure handling.

Trade-off: Operational overhead of broker management, up-front design of partitioning and ordering, and added complexity wherever exactly-once semantics are needed.
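The dead-letter pattern can be sketched without a broker. The example below is an assumption-laden illustration: plain Python lists stand in for the main topic and the DLT, the retry budget of 3 is arbitrary, and `index_document` is a hypothetical handler rather than anything from the project.

```python
MAX_ATTEMPTS = 3  # assumed retry budget, not taken from the project


def consume(messages, handler, max_attempts=MAX_ATTEMPTS):
    """Process messages in order; route messages that keep failing to a dead-letter list."""
    processed, dead_letters = [], []
    for msg in messages:
        last_error = None
        for _ in range(max_attempts):
            try:
                processed.append(handler(msg))
                last_error = None
                break
            except ValueError as exc:
                last_error = exc
        if last_error is not None:
            # Keep the payload and failure reason so the message can be inspected or replayed.
            dead_letters.append(
                {"payload": msg, "error": str(last_error), "attempts": max_attempts}
            )
    return processed, dead_letters


def index_document(msg):
    """Hypothetical handler: rejects documents with no text."""
    if not msg.get("text"):
        raise ValueError("empty document")
    return msg["id"]


processed, dead_letters = consume(
    [{"id": "a", "text": "hello"}, {"id": "b", "text": ""}, {"id": "c", "text": "world"}],
    index_document,
)
print(processed)
print(dead_letters)
```

The failing message is set aside after its retries are exhausted, so it does not block the messages behind it; that is the main reason to pair a consumer with a DLT.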

Architecture Narrative

Challenge

Provide a scalable backend for interactive learning experiences that require secure identity, persistent chat, searchable user content, and low-latency AI-assisted retrieval and generation over user-owned files.

Solution

Microservices plus event-driven pipelines. Traffic at the public edge (REST/WebSocket/SSE) passes through an API Gateway to internal gRPC services; services integrate asynchronously via Kafka topics; the RAG pipeline retrieves vectors from Qdrant.
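The retrieval step of the RAG pipeline can be illustrated with a brute-force search. This sketch does not use the Qdrant client: a Python list stands in for a collection, cosine similarity is computed by hand, and the payload fields (`owner_id`, `text`) and the prompt wording are assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, points, owner_id, k=2):
    """Return the k most similar chunks that belong to owner_id."""
    candidates = [p for p in points if p["owner_id"] == owner_id]
    ranked = sorted(candidates, key=lambda p: cosine(query_vec, p["vector"]), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks):
    """Assemble the retrieved chunks and the question into a single prompt string."""
    context = "\n".join(f"- {c['text']}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


points = [
    {"owner_id": "u1", "vector": [1.0, 0.0], "text": "Kafka topics are partitioned."},
    {"owner_id": "u1", "vector": [0.0, 1.0], "text": "Postgres stores profiles."},
    {"owner_id": "u2", "vector": [1.0, 0.0], "text": "Another user's note."},
]
hits = top_k([0.9, 0.1], points, owner_id="u1", k=1)
prompt = build_prompt("How are topics organized?", hits)
print(prompt)
```

Filtering by owner before ranking keeps retrieval scoped to user-owned files; a vector database would typically apply such a filter as part of the search request instead of in application code.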

Result

Key measurable signals: microservices (7), proto_files (6), compose_services (11).

Trade-off Matrix

Dimension	Selected Option	Impact	Compromise
Internal API protocol	gRPC / Protobuf internally, REST at edge	Compact binary protocol, strong typing, fast client/server codegen	Browser compatibility requires gateway bridging; adds schema/version management overhead
Data ownership	Per-service Postgres instances (bounded contexts)	Service autonomy, safer schema evolution, easier deployments per team	Cross-service queries require asynchronous syncing or join-less designs; eventual consistency

What I'd Do Differently

Add distributed tracing (OpenTelemetry) across gateway, gRPC, and Kafka producers/consumers to track and troubleshoot AI pipeline latency end-to-end.
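End-to-end tracing depends on carrying one trace id across every hop. The sketch below shows only that idea, using the W3C `traceparent` header layout (version, trace id, span id, flags); it does not use the OpenTelemetry SDK, whose propagators would normally do this work, and the dictionaries standing in for gRPC metadata and Kafka headers are assumptions.

```python
import secrets


def new_traceparent():
    """Start a trace: version 00, random 16-byte trace id, random 8-byte span id, sampled flag."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def child_traceparent(parent):
    """Keep the trace id and flags, mint a new span id for the next hop."""
    version, trace_id, _parent_span, flags = parent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


# The gateway receives a request and starts the trace.
gateway_ctx = new_traceparent()

# Gateway -> gRPC service: context travels as call metadata (modelled here as a dict).
grpc_metadata = {"traceparent": child_traceparent(gateway_ctx)}

# gRPC service -> Kafka: context travels as a message header (also a dict here).
kafka_headers = {"traceparent": child_traceparent(grpc_metadata["traceparent"])}

hops = [gateway_ctx, grpc_metadata["traceparent"], kafka_headers["traceparent"]]
trace_ids = {ctx.split("-")[1] for ctx in hops}
print(trace_ids)
```

Because all three hops share a single trace id, a tracing backend can stitch the gateway call, the gRPC call, and the asynchronous consumer work into one timeline.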

Introduce a schema registry for Kafka events (for example, Confluent Schema Registry with Avro or JSON Schema) with automated compatibility checks for event/topic evolution, and add CI enforcement for .proto changes.
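A CI compatibility check could start as small as the sketch below. It is an illustration under assumptions: schemas are represented as plain `{field_name: (field_number, type)}` dictionaries rather than parsed .proto files, and the three rules shown (no removed fields, no renumbering, no type changes) are a simplified policy, not the full protobuf or schema-registry compatibility rules.

```python
def breaking_changes(old_fields, new_fields):
    """List changes between two schema versions that would break existing readers."""
    problems = []
    for name, (number, ftype) in old_fields.items():
        if name not in new_fields:
            problems.append(f"field removed: {name}")
            continue
        new_number, new_type = new_fields[name]
        if new_number != number:
            problems.append(f"field number changed: {name} {number} -> {new_number}")
        if new_type != ftype:
            problems.append(f"type changed: {name} {ftype} -> {new_type}")
    # Fields that exist only in new_fields are additions and are treated as compatible.
    return problems


# Hypothetical message schema before and after a change.
old = {"id": (1, "string"), "text": (2, "string")}
new = {"id": (1, "string"), "text": (2, "bytes"), "lang": (3, "string")}

problems = breaking_changes(old, new)
print(problems)
```

A CI job would fail the build when the returned list is non-empty, which catches incompatible changes before any producer or consumer is deployed.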