Drive impactful DevOps initiatives at a leading investment firm. Build scalable infrastructure supporting cutting-edge research and trading systems. Collaborate with experts to integrate AI tooling and modern practices.
DevOps Engineer
Information Technology, Permanent
Job Description
Overview
- The Platform Engineer will focus on building and maintaining infrastructure supporting research, development, and trading systems.
- This role involves deploying and managing Kubernetes clusters and Helm-based orchestration for GPU workloads.
- Contribute to modern DevOps practices such as CI/CD, observability, and infrastructure as code.
- Collaborate with cross-functional teams to extend platform capabilities and integrate AI tooling.
- Maintain and improve data workflow platforms like Airflow and troubleshoot DAGs.
- Support deployment processes and container infrastructure using tools like Docker and Podman.
- Participate in incident response and monitoring to ensure system reliability and efficiency.
- Document processes and share knowledge effectively across teams.
Key Responsibilities & Duties
- Deploy and manage Kubernetes clusters, including GPU workloads and Helm-based orchestration.
- Implement CI/CD pipelines and container infrastructure to streamline deployment processes.
- Maintain automation practices using Ansible and Terraform for infrastructure management.
- Monitor systems using Grafana and Prometheus, ensuring reliable alerting and observability.
- Collaborate with development, trading, and security teams to enhance platform capabilities.
- Integrate AI tooling into workflows to improve efficiency and productivity.
- Support Airflow infrastructure, including deployment, maintenance, and DAG troubleshooting.
- Participate in after-hours on-call rotations for incident management and response.
Job Requirements
- Bachelor's degree in Computer Science, Engineering, or related technical field.
- Minimum 5 years of experience in SRE, Platform Engineering, or Systems Administration roles.
- Proficiency in Kubernetes deployment, management, and troubleshooting.
- Experience with CI/CD workflows and tools like ArgoCD or Flux for GitOps.
- Strong Python scripting skills and familiarity with Docker/Podman.
- Hands-on experience with Ansible and Terraform for infrastructure automation.
- Knowledge of observability tools such as Grafana and Prometheus.
- Security-first mindset with exposure to DevSecOps practices and tools.