Senior Site Reliability Engineer

EPAM Systems, Ozorków, PL

About the Role

We are currently seeking an experienced Senior Site Reliability Engineer (SRE) to join our team at EPAM Systems. In this role, you will work closely with software developers and operations teams to ensure the high reliability, scalability, and efficiency of our systems. Your proactive involvement will be key to enhancing system reliability, optimizing resource utilization, and ensuring continuous improvement in our operational practices.

Responsibilities

  • Collaborate with development, security, quality, and operations teams to implement SRE practices and ensure system reliability.
  • Define and support the required level of reliability, availability, and performance for services and applications.
  • Troubleshoot and mitigate infrastructure and application issues promptly, and support their resolution.
  • Implement monitoring systems to ensure infrastructure and application reliability.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • Proven experience with at least one cloud platform (AWS, GCP, or Azure).
  • Experience implementing SRE practices such as SLOs/SLIs, error budgets, and incident management.
  • Proficiency in Python or other scripting/programming languages.
  • A strong background in monitoring tools and CI/CD tools.
  • Solid knowledge of container and orchestration technologies such as Docker and Kubernetes.

Nice to Have

  • Expertise in deployment and management of LLMs, including technologies like RAG.
  • Certification in Kubernetes, AWS/GCP/Azure, or similar technologies.
  • Proven experience in DevOps practices.
  • Knowledge of managing and optimizing AI/ML models in production environments.

Company Culture and Benefits

At EPAM, we gather like-minded people to form a collaborative engineering community of industry professionals. We offer a friendly team and an enjoyable working environment, a flexible schedule, and opportunities to work remotely within Poland, with a chance to work abroad for up to 60 days annually. We provide significant growth opportunities through outstanding career roadmaps and programs for leadership development, soft skills, and well-being. Plus, we host various corporate, social, and well-being events to enrich your work experience.

Please note that the set of benefits may vary by role; our recruiter will discuss the specifics during the interview process. We will contact selected candidates only.

Other Benefits

  • Flexible schedule and opportunity to work remotely within Poland
  • Chance to work abroad for up to 60 days annually
  • Business-driven relocation opportunities
  • Outstanding career roadmap
  • Leadership development and career advising
  • Soft skills and well-being programs
  • Certification opportunities (GCP, Azure, AWS)
  • Unlimited access to LinkedIn Learning and other platforms
  • Health insurance and multisport package
  • Employee Stock Purchase Plan
  • Referral bonuses and corporate events