shivang@cloud:~$ whoami
DevOps Engineer • SRE mindset • Platform Reliability
I build systems that don’t break.
I build scalable, reliable systems on modern cloud infrastructure, with a focus on smooth operations and incident readiness.
Platforms Handling $4B+ daily trades
100+ microservices deployed
MTTD (Mean Time to Detect) ↓ 40%
RTO (Recovery Time Objective) < 30m
About
A bit of story + a bit of philosophy.
I’m Shivang, a DevOps/Site Reliability engineer who thrives on building cloud-native platforms that are both reliable and efficient. From designing systems on AWS and Azure to implementing Kubernetes and Terraform solutions, I specialize in environments that keep running smoothly under pressure.
My approach blends technical expertise with design thinking: I focus on creating systems that not only perform well but also give teams the tools they need to operate with confidence. Whether it’s through clear, actionable alerts or responsive dashboards, my goal is to make systems both powerful and simple to use.
What I’m working on now:
- Multi-region infrastructure
- Advanced SLOs and error budget strategies
- Serverless Architectures
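For the error budget item above, here is a minimal sketch of the arithmetic involved, assuming a request-based SLO measured over a single window. The names `ErrorBudget` and `burn_rate` are illustrative, and the numbers are made up; none of this is taken from a production system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorBudget:
    """Error budget for a request-based SLO over one window (hypothetical helper)."""

    slo_target: float      # e.g. 0.999 means 99.9% of requests must succeed
    total_requests: int    # requests served in the window
    failed_requests: int   # requests that violated the SLI in the window

    @property
    def allowed_failures(self) -> float:
        # The budget is the share of requests the SLO permits to fail.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining_fraction(self) -> float:
        # 1.0 means the budget is untouched; 0.0 or below means it is spent.
        if self.allowed_failures == 0:
            return 0.0
        return 1.0 - self.failed_requests / self.allowed_failures


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Budget consumption speed; 1.0 spends the budget exactly over the window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return error_ratio / (1.0 - slo_target)


# Made-up example: 99.9% target, 1M requests, 250 failures so far.
budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
print(f"allowed failures: {budget.allowed_failures:.0f}")
print(f"budget remaining: {budget.remaining_fraction:.0%}")
print(f"burn rate at 0.5% errors: {burn_rate(0.005, 0.999):.1f}x")
```

A burn rate well above 1.0 sustained over a short window is the kind of signal that is usually worth paging on; the threshold itself depends on the service.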
How I think (and build)
Flip the switches. That’s basically my mindset in production.
Automation > Manual Ops
I turn repeatable work into pipelines, scripts, and IaC—so humans focus on decisions, not clicks.
Observability First
Metrics + logs + alerting before scale. If we can’t see it, we can’t own it.
SLO-driven Reliability
Secure by Default
Projects (the real work)
Short, outcome-focused, and production-relevant.
- Modernized a $4B+ daily trade platform, enhancing runtime stability and performance.
- Migrated a monolith to 100+ microservices on AWS EKS for scalability and faster releases.
- Standardized deployments using ArgoCD + Jenkins across environments.
Toolbox
Tools I’ve used in production (not just “familiar with”).
Cloud
- AWS (EKS, VPC, IAM, ALB/NLB, EC2, S3, CloudWatch)
- Azure (DevOps, Backup, DR)
Platform
- Kubernetes (EKS)
- Docker
- Helm
- Ingress Controllers
- Autoscaling
Delivery
- Jenkins
- ArgoCD
- GitHub Actions
- GitOps workflows
IaC
- Terraform
- Modular stacks
- State management
Observability
- Prometheus
- Grafana
- Loki
- Alertmanager
- SLI/SLOs
Security
- RBAC/IAM
- Audit trails
- SIEM monitoring
- WORM retention