terminal

shivang@cloud:~$ whoami

DevOps Engineer • SRE mindset • Platform Reliability

I build systems that don’t break.

Cloud-native platforms, reliability automation, observability-first operations, and incident readiness — built with a clean, product-like mindset.

Current focus: AWS EKS

System Status: All services operational

Last incident: resolved • postmortem completed • learnings shipped

Interactive Ops Console

Explore how I think during production operations.

ops-console
Try:
- whoami
- kubectl get projects
- slo status
- terraform plan
- incident last

$4B+ daily trades

fintech platform scale

100+ microservices

EKS modernization

MTTD ↓ 40%

observability improvements

RTO < 30m

DR readiness automation

About

A bit of story + a bit of philosophy.

I’m Shivang — a DevOps/SRE-minded engineer who enjoys building cloud-native platforms that are reliable, observable, and easy to operate. I work across AWS & Azure, Kubernetes, CI/CD, Terraform, and incident readiness.

My creative side shows up in how I design systems: clean dashboards, clear alerts, and interfaces that help teams move fast without breaking production.

Reliability • Automation • Observability • Security • Fintech scale

Now

  • Building: reliability-first delivery workflows
  • Exploring: SLOs, error budgets, on-call maturity
  • Polishing: this portfolio like a product
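
A minimal sketch of the error-budget arithmetic behind the SLO bullet above, assuming a request-based SLI. The 99.9% target, the request counts, and the function name are illustrative assumptions, not figures from any real platform.

```python
def error_budget_remaining(slo_target: float, total: int, failed: int) -> float:
    """Return the unspent fraction of the error budget for one SLO window.

    1.0 means nothing has been spent, 0.0 means the budget is exhausted,
    and a negative value means the SLO has been breached.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total <= 0:
        raise ValueError("total must be positive")

    # The budget is the number of failures the SLO tolerates in this window.
    allowed_failures = (1.0 - slo_target) * total
    return 1.0 - failed / allowed_failures


# Hypothetical window: 1,000,000 requests, 250 failures, 99.9% availability SLO.
remaining = error_budget_remaining(0.999, 1_000_000, 250)
print(f"error budget remaining: {remaining:.0%}")
```

A negative result is a common trigger for slowing releases until reliability recovers; where that threshold sits is a team policy choice, not something the arithmetic decides.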

How I think (and build)

Flip the switches. That’s basically my mindset in production.

Automation > Manual Ops

I turn repeatable work into pipelines, scripts, and IaC—so humans focus on decisions, not clicks.

Observability First

Metrics + logs + alerting before scale. If we can’t see it, we can’t own it.

SLO-driven Reliability

Secure by Default

My platform blueprint

Hover cards = the stack I actually work with.

Ingress / LB

ALB/NLB • Ingress Controller • TLS

Kubernetes

EKS • Helm • Autoscaling • RBAC

CI/CD

Jenkins • GitHub Actions • ArgoCD • GitOps

IaC

Terraform • modular stacks • environment provisioning

Observability

Prometheus • Grafana • Loki • Alertmanager

Reliability

SLIs/SLOs • incident response • postmortems

Security

IAM • SIEM • audit trails • WORM retention

Projects (the real work)

Short, outcome-focused, and production-relevant.

  • Modernized a fintech platform handling $4B+ in daily trades and improved its runtime stability.
  • Migrated a monolith to 100+ microservices on AWS EKS for scalability and faster releases.
  • Standardized deployments using ArgoCD + Jenkins across environments.

Toolbox

Tools I’ve used in production (not just “familiar with”).

Cloud

  • AWS (EKS, VPC, IAM, ALB/NLB, EC2, S3, CloudWatch)
  • Azure (DevOps, Backup, DR)

Platform

  • Kubernetes (EKS)
  • Docker
  • Helm
  • Ingress Controllers
  • Autoscaling

Delivery

  • Jenkins
  • ArgoCD
  • GitHub Actions
  • GitOps workflows

IaC

  • Terraform
  • Modular stacks
  • State management

Observability

  • Prometheus
  • Grafana
  • Loki
  • Alertmanager
  • SLI/SLOs

Security

  • RBAC/IAM
  • Audit trails
  • SIEM monitoring
  • WORM retention

Let’s build something reliable

If you’re hiring for DevOps/SRE, I’m happy to share deeper architecture + incident learnings.