Lead DevOps Engineer

Brick

Software Engineering
Posted on Friday, May 3, 2024
Tags
Product & Engineering
About Brick
Brick (https://www.onebrick.io/) is the leading financial API provider in Southeast Asia, backed by leading VCs (Better Tomorrow Ventures, Antler, 1982 Ventures, Trihill Capital, Flourish Ventures) and by founders and executives from leading fintechs (Modalku, BukuWarung, Aspire, Nium, GoCardless, Cred, Pine Labs, Plaid, TrueLayer). See TechCrunch, e27, and Tech in Asia ID.
The Brick team has been rethinking financial services at top fintechs. Now, we are on a mission to power the next generation of fintech companies in Southeast Asia, with modern fintech infrastructure that allows tech companies to offer personalised and inclusive financial services with a single line of code. Brick makes open finance easy for partners including the largest Indonesian conglomerates and fintechs.
What you will do
Provide 2nd-level on-call incident and change management support for customer incidents.
Debug production reliability issues across services, perform root cause analysis, and ensure solutions are well-engineered, maintainable, and delivered on schedule
Act as a bridge between development and operations by applying a software engineering mindset to the system.
Build self-service tools for user groups that rely on their services (e.g. integrating alert mechanisms into the system, creating internal tools to automate or simplify workloads, and building automatic provisioning of test environments, logs, and statistics visualization).
Closely collaborate with developers to ensure the designed solution meets non-functional requirements such as availability, performance, security, and maintainability (e.g. log management for searching logs across multiple applications, defining error standardization, etc.).
Prepare routine operation documentation.
Closely collaborate with the QA Engineer to perform site reliability testing for infrastructure and applications.
Continuously improve the software delivery pipeline for process efficiency. Monitor availability, latency, and overall system health, including adjusting infrastructure autoscaling to scale flexibly with traffic growth and seasonality.
Closely collaborate with the Engineers, QA team, VP of Engineering, and Product team to provide technical assistance to improve system performance, capacity, reliability, and scalability.
Enhance the team learning environment by providing and encouraging mentorship and technical leadership.
What you need
3+ years deploying distributed apps with containers (Docker) & orchestration (Kubernetes, EKS, GKE)
Demonstrated ability working within and building on AWS (IAM, Orgs, API Gateway, Lambda, KMS)
4+ years of Linux system engineering experience (Red Hat and Debian families, etc.).
3+ years of software version control experience (GitLab, GitHub, Bitbucket).
3+ years experience deploying with a CI orchestration service (Jenkins, Spinnaker)
2+ years of working experience in scripting/programming languages (Shell, Python, Golang, Ruby)
Experience working with the life-cycle of a help desk incident as 2nd-level support.
Experience with Linux OS (Red Hat and Debian families, etc.) and Git-based software version control (GitLab, GitHub, Bitbucket).
Have knowledge of distributed service architecture, such as load balancing, service discovery, distributed caching, and distributed tracing.
Experience analyzing, monitoring, and troubleshooting large-scale, high-traffic distributed systems.
Have strong programming skills (Python, Java, Go) and experience with scripting languages (Shell script).
Have knowledge of databases (PostgreSQL, Redis, NoSQL, etc).
Have knowledge about messaging systems (RabbitMQ, Kafka, SQS).
Experience with monitoring tools like Grafana, Datadog, and the ELK Stack.
Experience creating automation tools in Ansible or Jenkins.
Experience with container technology and orchestration (Docker, Kubernetes).
A team player with great communication skills, both verbal and written.
Detail-oriented, cautious, and prudent.
Passion and a high sense of responsibility for work
Preferred skills
Experience with automation tools like Terraform and Packer.
Experience with database optimization (PostgreSQL, Redis, NoSQL, etc).
Experience with monitoring tools like Datadog, Zabbix, Prometheus, etc.
Experience with load balancing tools like LVS, Nginx, OpenResty, or HAProxy.
Experience with container technology such as Docker, Kubernetes, and Apache Mesos.
Experience with Log Management and Analytics tools such as Splunk / ELK
Experience with High Availability system design.
What’s on offer
Competitive Salary package
Monthly incentive based on performance
Private Healthcare
Employee Stock Options
Yearly Bonus
Additional COVID-19 Benefits
Company Laptop
Remote friendly environment with flexible working hours