Is the Site Reliability Engineer role at Marscapital remote?

The Site Reliability Engineer role at Marscapital is an on-site position located in Dublin.

How do I apply for the Site Reliability Engineer position at Marscapital?

You can apply for the Site Reliability Engineer position at Marscapital directly through HireHere. Click the "Apply" button on the job listing to be taken to the application page.

Site Reliability Engineer

Marscapital, Dublin, 12h ago

Job Specification: Site Reliability Engineer (Mid-Level)

Role Overview

We are seeking a Site Reliability Engineer (Mid-level) with strong expertise in AWS cloud infrastructure, containerized platforms, and Azure DevOps CI/CD pipelines. The successful candidate will focus on improving system reliability, availability, performance, and scalability while enabling engineering teams to deliver high-quality services efficiently.

This role blends software engineering with operational excellence, emphasizing automation, observability, incident response, and continuous improvement across cloud-native environments.

Note: This is a reliability-focused engineering role with on-call responsibilities and involvement in platform modernization initiatives.

Key Responsibilities

Design, build, and operate highly available AWS infrastructure using Infrastructure as Code (Terraform / CloudFormation).
Develop and maintain CI/CD pipelines to support automated deployments and testing.
Implement and manage EC2 and containerised workloads using Docker with Kubernetes (EKS) or ECS.
Improve system reliability through automation, monitoring, alerting, and self-healing mechanisms.
Define and track SLIs/SLOs and error budgets for critical services.
Participate in incident response, lead root cause analysis, and drive post-incident improvements.
Build observability platforms using CloudWatch, Prometheus, Grafana, ELK, or similar tooling.
Automate operational tasks to reduce toil and improve deployment consistency.
Optimise AWS environments for performance, scalability, and cost efficiency.
Implement security best practices, including IAM, secrets management, and network segmentation.
Collaborate with development teams to improve application reliability and deployment strategies.
Maintain runbooks, architectural documentation, and operational playbooks.

Key Characteristics

Reliability-driven: Focused on uptime, performance, and resilience.
Automation-first mindset: Actively reduces manual effort and operational toil.
Ownership mentality: Takes responsibility for services from design through production.
Strong communicator: Clearly articulates incidents, improvements, and technical concepts.
Collaborative: Works closely with platform, security, and application teams.
Continuous learner: Keeps pace with SRE practices and cloud-native technologies.

Core Experience & Technical Skills

5–7 years of IT experience, including at least 3 years in SRE, DevOps, or Cloud Engineering roles.
Strong hands-on experience with AWS services including EC2, VPC, IAM, S3, RDS, CloudWatch, ALB/ELB, and Route53.
Proven experience creating, managing, and optimising CI/CD pipelines using Azure DevOps.
Solid Linux/Windows system administration and troubleshooting skills across production environments.
Hands-on experience with Docker for containerisation and working knowledge of container orchestration on Kubernetes (EKS) or ECS, including container networking, scaling, rolling deployments, and service mesh concepts.
Strong experience implementing Infrastructure as Code using Terraform and/or CloudFormation.
Scripting proficiency in Bash and Python for automation and operational tooling.
Experience automating infrastructure provisioning, deployments, and operational workflows.
Practical experience implementing observability platforms, including monitoring, logging, and alerting solutions.
Strong understanding of SRE principles, including SLIs, SLOs, error budgets, incident management, postmortems, and capacity planning.
Familiarity with performance tuning, load testing, and reliability optimisation techniques.
