Kubernetes MLOps Engineer
OpenKyber LLC
6 days ago
Role details
Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Job location
Remote
Tech stack
.NET
Azure
Cloud Computing
Databases
Continuous Delivery
Continuous Integration
GitHub
Identity and Access Management
IP Addressing
Virtual Private Networks (VPN)
Microsoft SQL Server
SQL Azure
OpenID
OpenShift
Data Logging
Network Switches
Google Cloud Platform
Delivery Pipeline
Grafana
Multi-Cloud
Infrastructure as Code (IaC)
Amazon Web Services (AWS)
Kubernetes
Kafka
Database Replication
Machine Learning Operations
Cloud Integration
API Gateway
Terraform
API Management
Confluent
Job description
Google Cloud Engineer (Kubernetes, Terraform, GitOps, multi-cloud networking)
Remote role
1-2 rounds of interviews
Key Responsibilities:
- Multi-Cloud Network Engineering: Design and implement high-bandwidth, low-latency connectivity between Azure and Google Cloud Platform using Direct Cloud-to-Cloud Interconnect and Site-to-Site VPN meshes to support synchronous data replication.
- Kubernetes Platform Management: Provision and manage Google Kubernetes Engine (GKE) and OpenShift clusters, optimizing IP address allocation with VPC-native clusters and Class E (240.0.0.0/4) secondary ranges to prevent IP exhaustion.
- Infrastructure as Code (IaC) Development: Build and maintain modular Terraform and Terragrunt repositories to automate the deployment of Google Cloud Platform landing zones, networking, and datastores while enforcing corporate standards for logging and security.
- API Gateway Ingress & Governance: Configure and secure the API management layer (APIM, including APIM self-hosted) to handle cross-cloud traffic, ensuring sub-20-second global deployment cycles and consistent rate-limiting across environments.
- Multi-Cloud Observability: Set up OpenTelemetry (OTEL) collectors to aggregate traces, metrics, and logs from Google Cloud Platform services and GKE clusters, exporting them to central backends like Honeycomb or Grafana Cloud.
- Security & Identity Federation: Implement Workload Identity Federation (OIDC) to establish trust between GitHub Actions and Google Cloud Platform IAM, replacing long-lived service account keys with short-lived, scoped tokens.
- Automated GitOps Pipelines: Manage the deployment of containerized .NET applications using ArgoCD to ensure the Google Cloud Platform environment remains in sync with the declarative source of truth in GitHub.
Requirements
- Google Cloud Platform Core Infrastructure Expertise: Deep experience with Google Cloud networking (VPC, Cloud Interconnect, HA VPN), IAM, and project hierarchy management.
- IaC Mastery: Advanced skills in Terraform and Terragrunt for managing complex multi-environment state and cross-module dependencies.
- CI/CD & GitOps Experience: Hands-on experience with GitHub Actions for build pipelines and ArgoCD for continuous deployment to Kubernetes.
- Database & Messaging Knowledge: Familiarity with Cloud SQL (SQL Server engine) and managed Kafka platforms like Confluent Cloud, including Cluster Linking and Schema Linking.
- Observability Implementation: Proven ability to instrument applications and infrastructure with OpenTelemetry and configure collectors for multi-cloud visibility.
- OpenShift Proficiency: Experience with Red Hat OpenShift (OCP/OSD), particularly in managing OVN-Kubernetes and EgressIP for hybrid connectivity.
Nice to Haves:
- Azure Integration Knowledge: Familiarity with Azure core services (ExpressRoute, Entra ID, Azure SQL) to better facilitate "Primary-Peer" management models and cross-cloud troubleshooting.
- High-Cardinality Analysis Tools: Familiarity with Honeycomb or Observe for advanced ad-hoc exploratory querying and root cause analysis.