Senior Site Reliability Engineer

About the Company

Chainlink Labs is the primary contributing developer of Chainlink, the decentralized computing platform powering the verifiable web. Chainlink is the industry standard for real-world data access, off-chain computation, and secure cross-chain interoperability across any blockchain. The company helps enable verifiable applications in industries such as banking, DeFi, global trade, and gaming, working with global leaders like Swift, DTCC, and ANZ, as well as top Web3 teams including Aave, Compound, GMX, Maker, and Synthetix. Recognized as one of the Global Top 100 Most Loved Workplaces by Newsweek 2025, Chainlink Labs is committed to innovation, reliability, and building a more connected and secure decentralized future.

About the Role

The Observability Team ensures reliability across Chainlink products and services, empowering engineers to deliver crucial blockchain infrastructure. As a Senior Site Reliability Engineer (SRE) specializing in observability, you will design, build, and scale a modern observability platform based on OpenTelemetry (OTel) that supports metrics, logs, and traces. This role is ideal for an engineer with a DevOps mindset, GitOps experience, and a passion for reliability and automation in distributed systems.

Responsibilities

  • Design, build, and orchestrate a modern observability platform leveraging OpenTelemetry.
  • Support multiple telemetry types (metrics, logs, traces) at scale.
  • Define and enforce observability governance standards.
  • Ensure that system reliability, security, and performance meet or exceed SLAs.
  • Collaborate with engineers to troubleshoot, deploy, and optimize services.
  • Lead the development of monitoring and alerting solutions that detect issues proactively.
  • Ingest, transform, and analyze data from multiple real-time sources.
  • Manage the availability and scalability of observability infrastructure.
  • Establish and refine alert response processes and operational playbooks.
  • Advocate for reliability and security best practices across engineering teams.

Required Skills

  • 7+ years of experience in DevOps, infrastructure, platform, or SRE roles.
  • Strong programming skills in C, C++, Java, Python, Go, Perl, or Ruby.
  • Proven expertise in designing and managing large-scale real-time systems.
  • Experience with observability tools such as Prometheus, Grafana, and the ELK, Splunk, or Grafana stacks.
  • Strong background in Kubernetes and container orchestration.
  • Familiarity with distributed systems and high-availability infrastructure.
  • Strong communication skills, with experience in planning, code reviews, and collaborative troubleshooting.

Preferred Qualifications

  • Enthusiasm for blockchain and Web3 technologies.
  • Prior experience running infrastructure in blockchain/Web3 environments.
  • Hands-on experience with AWS, Terraform/Terragrunt, ArgoCD, Calico, GitHub Actions, and Packer.
  • Experience working remotely in distributed engineering teams.
  • Proactive mindset focused on automation, scalability, and continuous improvement.
