
Procer Cloud-Native Modernization

Complete Infrastructure Transformation: Windows → Kubernetes + GitOps

DevOps Engineer (Projectiva) | 2023 | GitOps & Kubernetes

Overview

Led complete infrastructure modernization for Procer, transforming a legacy Windows-based deployment into a cloud-native Kubernetes platform with GitOps workflows, Infrastructure as Code, and distributed retail unit deployment. Architected the entire solution from scratch, implementing ArgoCD for continuous delivery, Terraform for infrastructure provisioning, and Kubernetes for container orchestration across multiple retail locations.

The Challenge: Legacy Architecture at Breaking Point

Procer's retail management platform was struggling with a monolithic, Windows-based architecture:

  • Manual Deployments: Windows VMs requiring manual updates across 50+ retail units
  • No Version Control: Application deployments not tracked in Git, making rollbacks impossible
  • Scaling Issues: Peak traffic (end-of-month, Black Friday) required manual VM provisioning days in advance
  • Configuration Drift: Each retail unit was configured slightly differently, causing "works in production but not here" issues
  • Database Bottlenecks: PostgreSQL queries taking 15+ seconds during peak periods
  • Jenkins Chaos: 20+ Jenkins pipelines with duplicated logic, no standardization

Business Impact: Deployments took 2-3 hours per retail unit, peak traffic caused system outages, and the development team spent 60% of its time on deployment issues instead of features.

Architectural Transformation

Decision: Kubernetes vs Traditional VMs

Choice: Ubuntu 22.04 + MicroK8s for lightweight Kubernetes at retail edge locations

Why not stay with Windows VMs?

  • Windows licensing costs per retail unit were unsustainable
  • No container orchestration = manual scaling
  • Configuration drift would persist

Why MicroK8s over full Kubernetes?

  • Retail units run on commodity hardware (limited RAM/CPU)
  • MicroK8s: 540MB memory footprint vs 2GB+ for a full Kubernetes install
  • Single-node capable (no cluster required per retail unit)
  • Snap-based installation = easy updates

Trade-off: Less feature-rich than full Kubernetes, but perfect for edge deployments where resources are constrained.

Decision: GitOps with ArgoCD

Choice: ArgoCD for continuous delivery with GitLab as single source of truth

What is GitOps?

Declarative infrastructure: Git repository describes desired state, ArgoCD ensures Kubernetes matches it automatically. Every infrastructure change goes through Git PR/MR review.

Why ArgoCD over traditional CI/CD push?

  • Pull vs Push: ArgoCD pulls from Git (more secure than Jenkins pushing to prod)
  • Self-healing: With automated sync and self-heal enabled, ArgoCD reverts manual edits to Kubernetes back to the Git state
  • Audit Trail: Every deployment is a Git commit with author, timestamp, review
  • Rollback: Git revert = instant infrastructure rollback

Result: Deployment time reduced from 2-3 hours to 5 minutes. Zero-touch deployments across 50+ retail units.
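
The pull-and-reconcile idea behind GitOps can be illustrated with a small sketch. This is a toy model under stated assumptions, not ArgoCD's actual implementation: cluster and Git state are represented as plain dictionaries, and the names `SyncPlan`, `plan_sync`, and `apply_plan` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncPlan:
    """Changes needed to make the live state match the desired (Git) state."""
    create: dict
    update: dict
    delete: list


def plan_sync(desired: dict, live: dict, prune: bool = True) -> SyncPlan:
    """Diff desired state against live state, keyed by resource name."""
    create = {k: v for k, v in desired.items() if k not in live}
    update = {k: v for k, v in desired.items() if k in live and live[k] != v}
    # Resources that exist in the cluster but not in Git are removed only when pruning.
    delete = sorted(k for k in live if k not in desired) if prune else []
    return SyncPlan(create=create, update=update, delete=delete)


def apply_plan(live: dict, plan: SyncPlan) -> dict:
    """Return the new live state after applying the plan (input is not mutated)."""
    result = dict(live)
    result.update(plan.create)
    result.update(plan.update)
    for name in plan.delete:
        result.pop(name, None)
    return result


if __name__ == "__main__":
    # Hypothetical resources: someone scaled the API down by hand and added a ConfigMap.
    git_state = {"deployment/api": {"image": "api:1.4.0", "replicas": 3}}
    cluster_state = {
        "deployment/api": {"image": "api:1.4.0", "replicas": 1},
        "configmap/debug": {"verbose": "true"},
    }
    plan = plan_sync(git_state, cluster_state)
    print(plan)
    print(apply_plan(cluster_state, plan) == git_state)
```

Running the same diff on every reconciliation cycle is what makes manual drift short-lived: any hand edit shows up as an `update` or `delete` entry on the next pass.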

Technical Implementation

1. Infrastructure as Code with Terraform

Challenge: 50+ retail units, each needing identical infrastructure but with location-specific configs (IP addresses, store IDs, local databases).

Solution:

  • Terraform modules for repeatable infrastructure (compute, networking, storage)
  • Workspace per retail unit (terraform workspace select loja-42)
  • Variables file per location (loja-42.tfvars with store-specific config)
  • Remote state in GitLab backend (prevents concurrent edits)

Key Benefit: Onboarding new retail unit went from 3 days (manual setup) to 30 minutes (terraform apply -var-file=loja-new.tfvars).
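
A rollout across many stores can be scripted around the two commands mentioned above. The sketch below only builds the command lines and does not run them. It assumes that the workspace name and the tfvars file name are both derived from the store ID; the `loja-<id>` validation pattern and the function name `terraform_commands` are hypothetical.

```python
import re
import shlex

# Assumption: store IDs look like "loja-42" and name both the workspace and the tfvars file.
STORE_ID_PATTERN = re.compile(r"loja-[a-z0-9]+")


def terraform_commands(store_id: str) -> list[list[str]]:
    """Build the Terraform invocations for one retail unit as argv lists."""
    if not STORE_ID_PATTERN.fullmatch(store_id):
        raise ValueError(f"unexpected store id: {store_id!r}")
    return [
        ["terraform", "workspace", "select", store_id],
        ["terraform", "apply", f"-var-file={store_id}.tfvars"],
    ]


if __name__ == "__main__":
    for argv in terraform_commands("loja-42"):
        # Print instead of executing; pass argv to subprocess.run to actually apply.
        print(shlex.join(argv))
```

Keeping the commands as argv lists rather than shell strings avoids quoting problems if the list is later handed to `subprocess.run`.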

2. Kubernetes Architecture

Application Stack:

  • Backend API: Python FastAPI containerized, deployed as Kubernetes Deployment with 3 replicas
  • Frontend: Next.js static export served by NGINX behind the ingress controller
  • PostgreSQL: StatefulSet with persistent volumes for database state
  • Ceres API Gateway: NGINX-based reverse proxy for microservices routing

Kubernetes Resources:

  • Deployments: Application pods with rolling update strategy
  • StatefulSets: PostgreSQL with stable network identity and persistent storage
  • Services: Internal ClusterIP for service-to-service communication
  • Ingress: External traffic routing with TLS termination
  • ConfigMaps: Application configuration (environment-specific)
  • Secrets: Database credentials, API keys (stored with encryption at rest enabled; Kubernetes only base64-encodes Secrets by default)
  • HorizontalPodAutoscaler: Auto-scaling based on CPU/memory thresholds
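
For the HorizontalPodAutoscaler, the scaling decision follows the formula documented by Kubernetes: desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance (0.1 by default). The sketch below reproduces only that core calculation; the replica bounds and metric values are hypothetical, and real HPA behavior such as stabilization windows and missing-metric handling is not modeled.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Core HPA calculation: scale proportionally to metric / target, then clamp."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Within tolerance of the target: leave the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Hypothetical numbers: 3 replicas at 90% average CPU against a 60% target.
    print(desired_replicas(3, current_metric=90, target_metric=60))
```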

3. Change Data Capture (CDC) Pipeline

Problem: Retail units need to sync data to central warehouse for reporting/analytics. Previous solution: nightly batch jobs taking 4+ hours, blocking morning operations.

Solution: Kafka Connect + Debezium for real-time CDC

  • Debezium PostgreSQL Connector: Captures PostgreSQL WAL (Write-Ahead Log) changes in real-time
  • Kafka: Message broker streaming database changes to central data warehouse
  • No Application Changes: CDC operates at database level, zero code changes required
  • Exactly-Once Semantics: Kafka transactional guarantees prevent duplicate records

Result: Data replication latency reduced from 4+ hours (batch) to < 5 seconds (streaming). Morning operations no longer blocked by overnight sync jobs.
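
On the consuming side, each Debezium change event carries an `op` code (`c` create, `u` update, `d` delete, `r` snapshot read) plus `before` and `after` row images. A minimal sketch of applying such events to a replica follows. It assumes the event envelope has already been deserialized to a dict, that rows have a primary key column named `id`, and that the replica is an in-memory dict standing in for the warehouse table; the function name `apply_change_event` is hypothetical.

```python
def apply_change_event(replica: dict, event: dict, key: str = "id") -> None:
    """Apply one Debezium-style change event to an in-memory replica, keyed by primary key."""
    op = event["op"]
    if op in ("c", "r", "u"):
        # Creates, snapshot reads, and updates are all upserts of the "after" image.
        row = event["after"]
        replica[row[key]] = row
    elif op == "d":
        # Deletes carry the removed row in "before"; removing a missing key is a no-op.
        row = event["before"]
        replica.pop(row[key], None)
    else:
        # Other op codes (for example truncate) are out of scope for this sketch.
        raise ValueError(f"unsupported op: {op!r}")


if __name__ == "__main__":
    # Hypothetical stock rows for one store.
    events = [
        {"op": "c", "before": None, "after": {"id": 1, "sku": "A", "qty": 5}},
        {"op": "u", "before": {"id": 1, "sku": "A", "qty": 5},
         "after": {"id": 1, "sku": "A", "qty": 3}},
        {"op": "d", "before": {"id": 1, "sku": "A", "qty": 3}, "after": None},
    ]
    replica = {}
    for event in events:
        apply_change_event(replica, event)
    print(replica)
```

Because every operation is an upsert or a tolerant delete, replaying the same ordered batch of events leaves the replica in the same final state, which keeps the consumer safe even if a batch is redelivered.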

4. ArgoCD GitOps Workflow

Deployment Flow:

  1. Developer pushes code to GitLab feature branch
  2. GitLab CI builds Docker image, pushes to container registry
  3. Developer updates Kubernetes manifest (deployment.yaml) with new image tag
  4. Merge request reviewed and merged to main branch
  5. ArgoCD detects Git change, syncs to Kubernetes cluster automatically
  6. Kubernetes performs rolling update (zero downtime)
  7. ArgoCD health check verifies deployment success

Rollback: Git revert + ArgoCD sync = instant rollback to previous version. No manual kubectl commands needed.
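
Step 3 of the flow above (updating the manifest with a new image tag) is easy to automate. The sketch below works on a Deployment manifest already loaded into a dict, so it needs no YAML library. The registry path, container name, and the helper name `set_image_tag` are hypothetical, and committing the result to Git is left out.

```python
import copy


def set_image_tag(manifest: dict, container_name: str, new_tag: str) -> dict:
    """Return a copy of a Deployment manifest with one container's image tag replaced."""
    updated = copy.deepcopy(manifest)
    containers = updated["spec"]["template"]["spec"]["containers"]
    for container in containers:
        if container["name"] == container_name:
            # Drop any digest, then strip the tag only if the last ":" follows the last "/"
            # (otherwise the colon belongs to a registry port such as host:5050).
            image = container["image"].split("@", 1)[0]
            if image.rfind(":") > image.rfind("/"):
                image = image[: image.rfind(":")]
            container["image"] = f"{image}:{new_tag}"
            return updated
    raise KeyError(f"container {container_name!r} not found")


if __name__ == "__main__":
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "backend-api"},
        "spec": {
            "replicas": 3,
            "template": {
                "spec": {
                    "containers": [
                        {"name": "api", "image": "registry.example.com:5050/procer/api:1.4.0"}
                    ]
                }
            },
        },
    }
    bumped = set_image_tag(deployment, "api", "1.5.0")
    print(bumped["spec"]["template"]["spec"]["containers"][0]["image"])
```

Since the function returns a new manifest, a rollback is simply the previous file version in Git, which matches the Git revert workflow described above.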

5. Database Query Optimization

Problem: Dashboard queries taking 15+ seconds during peak traffic, causing timeout errors and poor user experience.

Analysis:

  • Used EXPLAIN ANALYZE to identify slow queries
  • Found missing indexes on foreign keys
  • Sequential scans on 500K+ row tables
  • N+1 query problem in ORM (SQLAlchemy)

Optimizations:

  • Added composite indexes on frequently queried columns
  • Implemented eager loading (joinedload) to eliminate N+1 queries
  • Created materialized views for complex aggregations
  • Added PostgreSQL connection pooling (PgBouncer)

Result: Query time reduced from 15+ seconds to < 500ms. The dashboard is now usable during peak traffic.
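
The N+1 pattern and its fix can be shown in a few lines. To stay runnable anywhere, this sketch swaps in the standard-library `sqlite3` module for PostgreSQL and SQLAlchemy; the single JOIN query stands in for what eager loading achieves in the ORM. The `stores` and `sales` tables, their columns, and the function names are hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with 5 stores and 3 sales per store."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE stores (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE sales (
            id INTEGER PRIMARY KEY,
            store_id INTEGER NOT NULL REFERENCES stores(id),
            amount REAL NOT NULL
        );
        -- Index on the foreign key so per-store lookups avoid a full table scan.
        CREATE INDEX idx_sales_store_id ON sales (store_id);
        """
    )
    conn.executemany(
        "INSERT INTO stores (id, name) VALUES (?, ?)",
        [(i, f"loja-{i}") for i in range(1, 6)],
    )
    conn.executemany(
        "INSERT INTO sales (store_id, amount) VALUES (?, ?)",
        [(s, 10.0 * s) for s in range(1, 6) for _ in range(3)],
    )
    conn.commit()
    return conn


def totals_n_plus_one(conn: sqlite3.Connection):
    """One query for the stores, then one more query per store (the N+1 pattern)."""
    stores = conn.execute("SELECT id, name FROM stores ORDER BY id").fetchall()
    queries = 1
    totals = {}
    for store_id, name in stores:
        row = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE store_id = ?",
            (store_id,),
        ).fetchone()
        queries += 1
        totals[name] = row[0]
    return totals, queries


def totals_single_join(conn: sqlite3.Connection):
    """Same result from a single JOIN with aggregation."""
    rows = conn.execute(
        """
        SELECT s.name, COALESCE(SUM(x.amount), 0)
        FROM stores AS s
        LEFT JOIN sales AS x ON x.store_id = s.id
        GROUP BY s.id, s.name
        ORDER BY s.id
        """
    ).fetchall()
    return dict(rows), 1


if __name__ == "__main__":
    conn = build_demo_db()
    print(totals_n_plus_one(conn)[1], "queries vs", totals_single_join(conn)[1])
```

With 50+ stores and a network round trip per query, the per-store loop grows linearly while the JOIN stays at one round trip, which is the effect the eager-loading change targets.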

6. Jenkins Pipeline Consolidation

Before: 20+ separate Jenkins pipelines with duplicated logic (build, test, deploy steps copy-pasted).

After: Single unified pipeline with branch-based deployment strategy:

  • feature/* branches: Build + run tests only
  • develop branch: Deploy to development environment
  • staging branch: Deploy to staging environment
  • main branch: Deploy to production (requires manual approval)

Benefit: Single Jenkinsfile to maintain. Changes propagate to all environments automatically.
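
The branch rules above amount to a small lookup. The sketch below expresses them in Python rather than Jenkinsfile syntax, purely to make the decision logic explicit. The names `PipelinePlan` and `plan_for_branch` are hypothetical, and it assumes that deployment branches also run the build and test stages before deploying.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PipelinePlan:
    stages: tuple
    environment: Optional[str]
    manual_approval: bool


# Branch name to target environment, as listed in the strategy above.
DEPLOY_BRANCHES = {
    "develop": "development",
    "staging": "staging",
    "main": "production",
}


def plan_for_branch(branch: str) -> PipelinePlan:
    """Decide which stages run for a branch and where (if anywhere) it deploys."""
    if branch.startswith("feature/"):
        return PipelinePlan(stages=("build", "test"), environment=None, manual_approval=False)
    if branch in DEPLOY_BRANCHES:
        return PipelinePlan(
            stages=("build", "test", "deploy"),
            environment=DEPLOY_BRANCHES[branch],
            # Only production deployments wait for a human approval step.
            manual_approval=(branch == "main"),
        )
    raise ValueError(f"no pipeline rule for branch {branch!r}")


if __name__ == "__main__":
    for name in ("feature/login", "develop", "staging", "main"):
        print(name, plan_for_branch(name))
```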

Technical Stack

Kubernetes (MicroK8s), ArgoCD, Terraform, Ansible, GitLab CI/CD, Docker, PostgreSQL, Kafka, Debezium, Python (FastAPI), Next.js, NGINX, Ubuntu 22.04, Grafana, Prometheus

Impact & Results

  • 40%: Infrastructure cost reduction through resource optimization
  • 5 min: Deployment time (down from 2-3 hours per retail unit)
  • < 5 sec: Data replication latency with CDC (was 4+ hours)
  • 50+: Retail units running identical infrastructure via GitOps
  • Zero: Downtime during rolling Kubernetes updates
  • 97%: Reduction in developer time spent on deployment issues

Key Lessons Learned

  • GitOps Trust Takes Time: The development team was initially skeptical of "Git as source of truth." After the first successful rollback via Git revert, they were sold. The cultural shift was harder than the technical implementation.
  • Edge Kubernetes Needs to Be Lightweight: Initially tried full K8s at retail units and hit resource limits. MicroK8s was the right choice for constrained hardware.
  • CDC Beats Batch Every Time: Real-time data replication eliminated the morning operational bottleneck. The Kafka investment paid off immediately.
  • ArgoCD Self-Healing Saves Sanity: There were multiple incidents where operators manually edited Kubernetes configs. ArgoCD reverted the changes automatically, maintaining consistency.
  • Query Optimization > Hardware: The original plan was to upgrade the database servers (expensive). Index optimization and N+1 query fixes eliminated the need. Always optimize before scaling up.

Why This Matters for Senior DevOps Roles

This project demonstrates capabilities critical for modern DevOps positions:

  • Cloud-Native Expertise: Kubernetes, containerization, microservices architecture
  • GitOps Mastery: ArgoCD, declarative infrastructure, Git-based workflows
  • Infrastructure as Code: Terraform modules, repeatable deployments
  • Real-Time Data Pipelines: Kafka, Debezium, CDC patterns
  • Performance Optimization: Database tuning, query optimization, scaling strategies
  • Edge Computing: Distributed deployments across 50+ locations
  • Business Impact: 40% cost reduction, 97% reduction in deployment friction