Red Hat Service Reliability Engineer - OpenShift in Brno, Czech Republic
At Red Hat, we connect an innovative community of customers, partners, and contributors to deliver an open source stack of trusted, high-performing solutions. We offer cloud, Linux, middleware, storage, and virtualization technologies, together with award-winning global customer support, consulting, and implementation services. Red Hat is a rapidly growing company supporting more than 90% of Fortune 500 companies.
The Red Hat OpenShift Service Reliability Engineering (SRE) team is looking for a Service Reliability Engineer to join our team in Brno, Czech Republic, or one of our teams in Poland, Spain, or Italy. Red Hat OpenShift is an enterprise Kubernetes platform, and SRE is the first team to host and manage the code in the public cloud. In this role, you'll be responsible for keeping the Red Hat OpenShift Container Platform environment available and secure. Along with the rest of your team, you will cooperate with other SREs and product engineering associates around the world to deliver large, containerized cluster environments. You'll be responsible for provisioning, upgrades, problem detection and automated recovery scenarios, and incident management. You'll also need to understand complicated, interconnected data points to be able to resolve faults when issues arise. As a Service Reliability Engineer, you'll need the ability to work in a complicated and fast-paced environment while quickly learning new skills and creating ways to consistently meet service-level agreements (SLAs) and keep enterprise Kubernetes, a globally distributed, cloud-based, containerized service, running for our customers.
Primary job responsibilities
Interact with automated monitoring and healing infrastructure to ensure healthy environments
Develop automation to autocorrect or completely prevent issues in our online solution
Participate in release cycles of our offerings, deploying code to integration, staging, and production environments, integrating with continuous integration (CI) and continuous delivery (CD) tools, monitoring, and change management
Perform software updates, peer code reviews, testing, and common vulnerabilities and exposures (CVE) analyses; respond to security threats
Identify single points of failure and other high-risk architecture issues; propose and implement more resilient resolutions
Resolve customer issues in cooperation with Red Hat's global customer support team
Create and maintain standard operating procedures (SOPs) for performing maintenance tasks, applying configuration changes, and remediating problems in our environment
Participate in regular shifts and in on-call rotations; a weekend working schedule will be provided
Required skills
Commercial experience managing Linux servers like RHEL, CentOS, or Fedora, hosted at a cloud provider such as Amazon Web Services (AWS), Google Compute Engine (GCE), or Microsoft Azure
Experience with enterprise systems monitoring; knowledge of Zabbix or Nagios is a plus
Experience with enterprise configuration management software like Red Hat Ansible Automation, Puppet, or Chef
Experience delivering a hosted service
Basic scripting knowledge in Python and Bash
Demonstrated ability to quickly and accurately troubleshoot system issues
Solid understanding of standard TCP/IP networking and common protocols like DNS and HTTP
Experience with Kubernetes and Docker-based containers is a plus
Red Hat is proud to be an equal opportunity workplace and an affirmative action employer. We review applications for employment without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, physical or mental disability, medical condition, marital status, or any other basis prohibited by law.
Red Hat does not seek or accept unsolicited resumes or CVs from recruitment agencies. We are not responsible for, and will not pay, any fees, commissions, or any other payment related to unsolicited resumes or CVs except as required in a written contract between Red Hat and the recruitment agency or party requesting payment of a fee.
Job ID 67267
Category Software Engineering