Join a dynamic team driving innovation in DevOps engineering. Enhance workflows with cutting-edge AI tooling and modern practices. Collaborate cross-functionally to support research and trading systems.

DevOps Engineer
Information Technology, Permanent
Job Description
Overview
- Join a dynamic team as a DevOps Engineer, focusing on infrastructure and tooling for research, development, and trading systems.
- Collaborate with cross-functional teams to enhance platform capabilities and integrate emerging AI tooling into workflows.
- Contribute to the deployment and management of Kubernetes clusters, supporting GPU workloads and Helm-based orchestration.
- Drive adoption of modern DevOps practices, including CI/CD, observability, and infrastructure as code.
- Maintain and improve data workflow platforms like Airflow, ensuring reliability and efficiency.
- Participate in incident response and on-call rotations to ensure system reliability and performance.
- Support both on-premises and cloud infrastructure environments, managing them with tools like Terraform.
- Apply a security-first mindset to implement DevSecOps practices and ensure system integrity.
Key Responsibilities & Duties
- Deploy and manage Kubernetes clusters, including Helm-based orchestration and GPU workloads.
- Maintain automation and infrastructure as code practices using Ansible and Terraform.
- Implement and support CI/CD pipelines, container infrastructure, and deployment processes.
- Participate in observability and incident response, ensuring reliable monitoring and alerting systems.
- Collaborate with development, trading, and security teams to extend platform capabilities.
- Integrate AI tooling into workflows to improve efficiency and platform capabilities.
- Create and maintain documentation for IT processes and end-user support.
- Support the deployment and maintenance of Airflow infrastructure and assist with DAG troubleshooting.
Job Requirements
- Bachelor's degree in Computer Science, Engineering, or related technical field.
- Minimum of 5 years of experience in SRE, Platform Engineering, or Systems Administration roles.
- Proficiency in Python scripting and hands-on experience with Kubernetes deployment and management.
- Experience with Docker/Podman, Ansible, Terraform, and CI/CD workflows.
- Familiarity with Grafana, Prometheus, and tools like Opsgenie or PagerDuty for incident management.
- Exposure to both on-premises and cloud infrastructure environments managed via Terraform.
- Security-first mindset with experience in DevSecOps practices and tools like SonarQube.
- Excellent communication skills for cross-functional collaboration and documentation.