At Sumo Logic, we empower software-driven businesses by providing unprecedented real-time visibility into their applications and infrastructure via the analysis of vast quantities of machine data such as logs, metrics, and traces. However, the scale and complexity of this data continue to grow, outpacing the attention and capabilities of even the most skilled engineers and security professionals who rely upon our service. The Analytics team is therefore charged with exploring and delivering innovative product capabilities that augment human expertise with machine intelligence and global context.
As a Machine Learning Platform Engineer, you will be a critical member of this team, with responsibility for the infrastructure that powers these product features: data pipelines, training jobs, automated model testing, deployment to real-time prediction endpoints, and operational monitoring. This internal platform leverages the scalability and flexibility of the underlying Sumo Logic system and provides a horizontal intelligence layer that drives user-facing features across the product, such as Global Intelligence Services. In addition to meeting demanding scale and reliability requirements, our internal platform and processes must adhere to exceptionally rigorous standards around data privacy and security.
We are looking for a versatile technologist whose responsibilities will include:
- assessing needs and solutions for large-scale data and model infrastructure
- owning the uptime and reliability of ML-related services and capabilities
- collaborating with the team to develop solutions using the ML platform
- developing supporting tooling, automation, and microservices to extend the platform
- continuously measuring and improving quality, performance, and efficiency
Nice-to-have Technology Experiences
- B.S. / M.S. / Ph.D. in Computer Science or related disciplines
- ≥ 4 years of professional experience in software engineering (or ≥ 2 years with an M.S. degree or higher)
- excellent collaboration and communication skills
- strong technical background in: software engineering of production-grade services in cloud environments, theoretical thinking, problem formulation and solving
- operations orientation: SLIs/SLOs, monitoring and troubleshooting, on-call rotations
- Scala, Python
- Cloud provider ML frameworks (e.g., AWS SageMaker)
- Application and infrastructure deployment and management with Kubernetes and Terraform
- Workflow management (e.g., Airflow)
- Common ML libraries and concepts (e.g., scikit-learn, TensorFlow)
What we do:
We are a cloud-native SaaS machine data analytics platform, solving complex monitoring problems for DevOps, SecOps, and ITOps teams. Customers love our product because it allows them to easily monitor and optimize their mission-critical, large-scale applications.
Our microservices architecture in AWS ingests hundreds of terabytes daily across many geographic regions. Millions of queries a day analyze hundreds of petabytes of data.
Our mission is to democratize machine data analytics through the Sumo Logic platform, bringing real-time data insights securely through the cloud.