Senior DevOps Engineer

in Information Technology
  • New York, NY
  • Salary: $200,000.00 - $250,000.00
Permanent

Job Detail

  • Experience Level: Senior Level
  • Degree Type: Bachelor of Science (BS)
  • Employment: Full Time
  • Working Type: Hybrid
  • Job Reference: 0000016711
  • Salary Type: Annually
  • Industry: Financial Services
  • Selling Points:

    Lead impactful DevOps projects for AI/ML systems at a top-tier organization. Collaborate on cutting-edge technologies to drive innovation and scalability. Enhance your expertise in advanced AI infrastructure and operations.

Job Description

Overview

  • Lead the development and operation of advanced DevOps pipelines supporting AI/ML lifecycle processes.
  • Collaborate with multidisciplinary teams to deliver scalable, secure, and efficient AI platforms.
  • Develop and maintain CI/CD pipelines for AI/ML services and cloud-based infrastructure.
  • Automate infrastructure provisioning using Infrastructure as Code tools such as Terraform, alongside Kubernetes for container orchestration.
  • Ensure reliability, scalability, and observability of AI platforms and workloads.
  • Implement security, compliance, and governance requirements for AI systems and processes.
  • Support production workloads for Generative AI systems and LLM-based services.
  • Document standards, best practices, and processes for DevOps and AI infrastructure.

Key Responsibilities & Duties

  • Design and operate scalable DevOps pipelines for model lifecycle automation and deployment.
  • Develop cloud-native infrastructure on AWS using Kubernetes and containerized workloads.
  • Implement model versioning, artifact management, and experiment tracking systems.
  • Ensure system health monitoring, model performance tracking, and drift detection.
  • Collaborate with AI/ML engineers to standardize deployment patterns and practices.
  • Participate in incident response, troubleshooting, and continuous improvement initiatives.
  • Optimize costs for compute-intensive workloads while ensuring scalability and efficiency.
  • Document reference architectures and best practices for AI infrastructure and operations.

Job Requirements

  • Bachelor of Science degree in a relevant field is required.
  • 10+ years of experience in DevOps, SRE, or Platform Engineering roles.
  • Proficiency in AWS cloud services, Kubernetes, and CI/CD pipeline development.
  • Hands-on experience with Terraform and scripting/programming languages like Python.
  • Experience with MLOps platforms, model registries, and experiment tracking systems.
  • Exposure to Generative AI workloads and LLM-based services in production environments.
  • Strong communication skills and ability to work effectively across teams.
  • AWS certifications are preferred but not mandatory.
