Senior DevOps / Infrastructure Engineer
Causa Prima
2 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Job location
Tech stack
API
Cloud Computing
Code Review
Databases
Continuous Integration
Data Stores
DevOps
Fault Tolerance
GitHub
Graph Database
Identity and Access Management
Python
Network Security
Neo4j
OAuth
Ansible
TypeScript
Data Logging
Pulumi
Cloud Platform System
Cloud Monitoring
Large Language Models
Amazon Web Services (AWS)
Kubernetes
Sentry
Kafka
Event Store
Terraform
Docker
PagerDuty
Job description
- CI/CD - GitHub Actions + Cloud Build, security-aware pipeline design, production approval gates, container image scanning, secret isolation, signed commits.
- Observability - OpenTelemetry distributed tracing across TypeScript and Python services, Cloud Monitoring, Sentry with PII-stripping hooks, structured logging with sanitization, per-agent behavioural monitoring, tiered alerting.
- Secret management & rotation - Credential lifecycle for LLM API keys, database credentials, OAuth tokens, and agent signing keys in GCP Secret Manager.
- Container orchestration - Docker builds, registry management, GKE cluster configuration. Design the path toward Kubernetes-native deployment as we scale.
- Incident response infrastructure - Per-agent circuit breakers, graceful degradation, tiered alerting (logged → Slack → PagerDuty), forensic tooling via event store replay and traces.
- Network security - VPC firewall rules, private ingress for all data stores, egress controls, PII Vault on restricted-access infrastructure.
- Neo4j Aura operations - Monitoring, scaling decisions, and backup verification for the managed graph database.
Requirements
Do you have experience in Terraform?
Do you have a Master's degree?
- 5+ years in DevOps, infrastructure, or SRE roles for production systems.
- Strong systems design skills - you think in deployment topologies, failure domains, blast radius, and operational security.
- Production experience with GCP (Cloud Run, GKE, Cloud SQL, IAM, Secret Manager) or equivalent cloud platform with willingness to go deep on GCP.
- Hands-on experience with Kubernetes in production - cluster management, networking, scaling, security policies.
- Experience with infrastructure-as-code: Terraform, Pulumi, Ansible, or similar. Ideally more than one.
- Experience designing CI/CD pipelines with security in mind - secret isolation, approval gates, image scanning, deployment strategies.
- Experience with observability systems - distributed tracing, structured logging, alerting hierarchies, dashboarding.
- Security awareness at the infrastructure level - you think about network isolation, least-privilege IAM, and credential hygiene as defaults.
- Strong code review skills for infrastructure-as-code and deployment configuration.
- Nice to have:
  - Event streaming infrastructure (Kurrent, Redpanda, Kafka).
  - SOC 2 or GDPR compliance from an infrastructure perspective.
  - Fintech or regulated-environment background.