Lead impactful projects at a dynamic investment firm with cutting-edge technology. Enhance your skills in platform engineering and DevOps practices. Collaborate with talented professionals in a fast-paced environment.
Systems Engineer
in Information Technology, Permanent
Job Description
Overview
- Contribute to the deployment and management of Kubernetes clusters supporting GPU workloads and Helm-based orchestration.
- Maintain and improve data workflow platforms such as Airflow, including troubleshooting DAGs and applying best practices.
- Implement modern DevOps practices, including CI/CD pipelines, observability, and infrastructure as code.
- Collaborate with cross-functional teams to extend platform capabilities and integrate emerging AI tooling.
- Participate in incident response and on-call rotations to ensure reliable system operations.
- Support the rollout and maintenance of container infrastructure and deployment processes.
- Create and maintain documentation for IT and end-users, ensuring clear communication and knowledge sharing.
- Drive the adoption of security-first DevSecOps practices, including SAST and CVE scanning.
- Contribute to the management of both on-premise and cloud infrastructure environments.
Key Responsibilities & Duties
- Deploy and manage Kubernetes clusters, supporting GPU workloads and Helm-based orchestration.
- Maintain automation and infrastructure as code practices using Ansible and Terraform.
- Implement CI/CD pipelines and container infrastructure for streamlined deployment processes.
- Participate in observability work and incident response, keeping monitoring and alerting systems reliable.
- Collaborate with development, trading, and security teams to enhance platform capabilities.
- Integrate AI tooling into workflows to improve efficiency and platform functionality.
- Support the deployment and maintenance of Airflow infrastructure, including DAG troubleshooting.
- Document processes and share knowledge effectively across teams and stakeholders.
- Contribute to the management of hybrid infrastructure environments, including on-premise and cloud setups.
Job Requirements
- Bachelor's degree in Computer Science, Engineering, or related technical field.
- 5–7 years of experience in SRE, Platform Engineering, or Systems Administration roles.
- Proficiency in Python scripting and hands-on experience with Kubernetes deployment and management.
- Experience with Docker/Podman, Ansible, and Terraform for infrastructure automation.
- Familiarity with CI/CD workflows and tools like ArgoCD or Flux for GitOps.
- Proficiency with Grafana and Prometheus for monitoring and observability.
- Exposure to DevSecOps practices, including SonarQube and CVE scanning.
- Experience with hybrid infrastructure environments, including AWS and GCP managed via Terraform.
- Excellent communication skills for cross-functional collaboration and documentation.