Red Hat Associate Service Reliability Engineer - OpenShift in Brno, Czech Republic
At Red Hat, we connect an innovative community of customers, partners, and contributors to deliver an open source stack of trusted, high-performing solutions. We offer cloud, Linux, middleware, storage, and virtualization technologies, together with award-winning global customer support, consulting, and implementation services. Red Hat is a rapidly growing company supporting more than 90% of Fortune 500 companies.
The Red Hat OpenShift Service Reliability Engineering (SRE) team is looking for an Associate Service Reliability Engineer to join our team in Brno, Czech Republic. Red Hat OpenShift is a leading enterprise Kubernetes container platform, and SRE is the first team to host and manage the code in the public cloud. In this role, you will be a key member of the team, responsible for keeping the Red Hat OpenShift Container Platform environment available and secure. Along with the rest of your team, you will interact with other SRE teams and product engineering associates around the world to deliver large, containerized cluster environments. You’ll be responsible for provisioning, upgrades, problem detection and automated recovery scenarios, incident management, and understanding complicated, interconnected data points to resolve faults when issues arise. As an Associate Service Reliability Engineer, you’ll need to be able to work in a complicated and fast-paced environment while quickly learning new skills. In addition, you’ll create ways to consistently meet service-level agreements (SLAs) and keep the globally distributed, cloud-based, and containerized enterprise Kubernetes platform running smoothly for our customers.
Primary job responsibilities
Work to automatically detect potential issues in a large virtualized environment
Write automation scripts to auto-correct or completely prevent issues in our online offerings
Track and review changes in a highly dynamic environment
Identify single points of failure and other high-risk architecture issues and propose more resilient solutions
Perform and oversee releases to ensure that proper life cycle and policies are followed
Perform software updates, testing, and CVE analyses
Respond to security threats
Take part in both regular shifts and on-call rotations, with weekend work required
Required skills
Experience running Linux servers of any distribution, but preferably Red Hat Enterprise Linux (RHEL), CentOS, or Fedora
Basic knowledge of monitoring systems, preferably Zabbix or Nagios
Basic knowledge of configuration management systems like Puppet or Chef; experience with Red Hat Ansible Automation is preferred
Some experience with cloud service models and platforms, such as Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), OpenStack, Amazon Web Services (AWS), Google Cloud Platform, or Microsoft Azure
Demonstrated ability to quickly and accurately troubleshoot issues
Solid Bash scripting skills with an ability to read code and write simple scripts in other languages, preferably in Python
Red Hat is proud to be an equal opportunity workplace and an affirmative action employer. We review applications for employment without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, physical or mental disability, medical condition, marital status, or any other basis prohibited by law.
Red Hat does not seek or accept unsolicited resumes or CVs from recruitment agencies. We are not responsible for, and will not pay, any fees, commissions, or any other payment related to unsolicited resumes or CVs except as required in a written contract between Red Hat and the recruitment agency or party requesting payment of a fee.
Job ID 68827
Category Software Engineering