Job Description

We have an open role for an ML Ops Engineer with a leading group in Bahrain.

Job Title: ML Ops Engineer

Location: Bahrain

Experience: 5-7 Years

*** Kindly share CVs to

Responsibilities:

  • Design and implement data pipelines and engineering infrastructure to support enterprise machine learning systems at scale.
  • Work closely with data scientists and engineering teams to deploy, monitor, and optimize machine learning models in production.
  • Identify, evaluate, and integrate new technologies to enhance performance, maintainability, and reliability of machine learning solutions.
  • Apply software engineering best practices to machine learning pipelines, including CI/CD, automation, monitoring, and version control.
  • Manage cloud infrastructure (AWS, Azure, GCP) and containerization (Docker, Kubernetes) to ensure scalable and efficient ML workloads.
  • Implement and maintain highly available and scalable machine learning environments.
  • Ensure the security and compliance of machine learning systems, adhering to governance and industry regulations.
  • Troubleshoot and optimize machine learning models and infrastructure for performance improvements.
  • Collaborate with IT and operational technology (OT) teams to ensure seamless integration of machine learning systems.
  • Use Infrastructure as Code (Terraform, CloudFormation) to automate the management and provisioning of infrastructure.
  • Implement automated processes for deployment, monitoring, logging, and performance tracking.
Required Skillsets:

    • ML Model Deployment & Containerization: Strong experience with Docker and Kubernetes.
    • Cloud Platforms: Expertise in AWS, Azure, or Google Cloud Platform (GCP).
    • DevOps Practices: In-depth knowledge of DevOps, CI/CD pipelines, and automation techniques.
    • Monitoring & Logging: Proficiency in setting up monitoring and logging for ML models and infrastructure.
    • Version Control: Expertise in Git or other version control systems.
    • IT-OT Integration: Experience integrating IT and operational technology (OT) systems.
    • Scalability & High Availability: Proven track record of designing scalable, highly available machine learning infrastructure.
    • Security & Compliance: Understanding of security protocols, compliance frameworks, and governance.
    • Infrastructure as Code (IaC): Proficiency with Terraform or CloudFormation for automating infrastructure management.
    • Scripting: Strong skills in Python or Bash scripting for automation.
    • Data Engineering: Familiarity with data engineering workflows and handling large datasets.
    • Troubleshooting: Excellent problem-solving and troubleshooting abilities in distributed systems.

Job Types: Full-time, Permanent

Posted by Career Maker