Job Description
At Kalibri Labs, we are helping to redefine and rebuild the way performance metrics are viewed in the hotel industry. We are looking for passionate, energetic, and hardworking people with an entrepreneurial spirit, who dream big and challenge the status quo. We are working on cutting-edge solutions for the industry as it navigates the recovery process, using our big data coupled with machine learning and AI to help highlight the path forward. Kalibri Labs is growing, so if you’re ready to make a difference and use your talents across a groundbreaking organization, please keep reading!
We are looking for an MLOps Engineer to work with a product team to build a high-quality, secure continuous deployment pipeline on our cloud-based infrastructure.
Responsibilities
- Work within cross-functional teams to ensure quality, availability and cost efficiency of Data Science solutions
- Integrate all aspects of a secure SDLC through highly available processes that support a painless continuous deployment pipeline
- Build observability throughout the product and pipeline. Detect and alert on issues before they impact customers
- Build CI/CD processes that support the machine learning lifecycle, including self-serve experimentation, performance analysis, feature store integration, model development, hyperparameter tuning, evaluation, and deployment, following DevOps best practices.
- Collaborate with project stakeholders to identify product and technical requirements. Conduct analysis to determine integration needs
- Maintain and expand existing AWS/Snowflake infrastructure with industry best practices, considering scalability, reliability, and cost
- Automate everything. Identify activities that can be automated, define and document the process, and then automate it.
Requirements
- Bachelor’s Degree in Computer Science, Information Systems, or a related technical field, or equivalent work experience.
- 3+ years of experience creating and managing deployment pipelines using critical Data Science packages such as TensorFlow, scikit-based packages, PyTorch, etc., in a cloud-based environment.
- 2+ years of experience designing and implementing scalable systems and applications on Amazon Web Services (AWS), with a focus on cloud architecture, cost optimization, and security.
- Experience automating infrastructure deployment and operations for scalable systems using tools like Terraform and Ansible.
- Confident working in container-based environments and with Docker.
- Linux administration and automation with shell, Bash, and/or Python.
- Experience instrumenting and operating observability systems such as New Relic, Grafana and Prometheus, or Datadog.
- Continuous integration and deployment with orchestration tools such as Bitbucket Pipelines, GitHub Actions, Airflow, Prefect, etc.