Site Reliability Engineer (shiftwork)
Ho Chi Minh City, VN, 700000
Hanoi, VN, 10000
About GFT
GFT Technologies is driving the digital transformation of the world’s leading financial institutions. Other sectors, such as industry and insurance, also leverage GFT’s strong consulting and implementation skills across all aspects of pioneering technologies, such as cloud engineering, artificial intelligence, the Internet of Things for Industry 4.0, and blockchain.
With its in-depth technological expertise, strong partnerships and scalable IT solutions, GFT increases productivity in software development. This provides clients with faster access to new IT applications and innovative business models, while also reducing risk.
We’ve been a pioneer of near-shore delivery since 2001 and now offer an international team spanning 16 countries, with a global workforce of over 9,000 people. GFT is recognised by industry analysts such as Everest Group as a leader amongst global mid-sized Service Integrators and is ranked among the Top 20 leading global Service Integrators in exponential technologies such as Open Banking, Blockchain, Digital Banking, and App Services.
Role Summary
The SRE ensures the smooth day-to-day operation of the Bank. An understanding of production system access and control, production deployment, Amazon Web Services, Kubernetes, continuous deployment and systems observability is essential for this role.
Key Responsibilities
- Participate in on-call rotations (*) to provide support for critical systems. Engineers are required to work on a rotating 2-2-2 schedule: 2 morning shifts followed by 2 days off, 2 afternoon shifts followed by 2 days off, and 2 night shifts followed by 2 days off.
  - Morning: 09:00 AM - 06:00 PM
  - Afternoon: 05:00 PM - 02:00 AM
  - Night: 01:00 AM - 10:00 AM
- Resolve system incidents when they occur
- Deploy changes to staging and production environments
- Work with Platform Engineers to understand the changes
- Develop deployment pipelines for changes
- Understand the changes and develop observability (monitoring and alerting) for them
- Develop resiliency testing solutions and conduct resiliency tests
- Continuously enhance the monitoring solution
- Create and update operation runbooks
- Automate operation runbooks
Required Skills and Qualifications
Technical Skills
- Strong experience with Amazon Web Services
- Strong experience with and understanding of Kubernetes
- Scripting skills with Python or Bash
- Experience with continuous deployment tools
  - Harness (good to have)
- Experience with infrastructure as code (IaC) tools
  - Terraform
- Experience with observability solutions
  - Prometheus & Grafana
  - SumoLogic (good to have)
Soft Skills
- Good communication skills and fluency in English
- Good problem-solving skills
- Self-motivated and able to learn fast
What can we offer you?
- Competitive salary
- 13th-month salary guarantee
- Performance bonus
- Professional English course for employees
- Premium health insurance
Due to the high volume of applications we receive, we are unable to respond to every candidate individually. If you have not received a response from GFT regarding your application within 10 working days, please assume that we have decided to proceed with other candidates. We truly appreciate your interest in GFT and thank you for your understanding.