Staff Software Engineer - Site Reliability and Observability

Teraswitch
Pittsburgh, PA

Who We Are

Teraswitch is on a mission to provide the highest-performance, lowest-latency bare metal servers in the world. With 20 datacenter locations around the world, Teraswitch has served thousands of customers across 185 countries with our solutions. Founded by Brendan Mannella, Teraswitch is one of the largest privately held infrastructure companies in the world.

The Job

The Software Engineering Site Reliability Engineer (SRE) is a Software Engineer responsible for ensuring the reliability, scalability, and performance of software systems. Their job profile includes:

  • System Monitoring and Troubleshooting: Monitoring the performance and availability of software systems, identifying and resolving issues, and implementing proactive measures to prevent future incidents.

  • Automation and Infrastructure: Developing and maintaining automation tools and infrastructure to streamline software deployment, configuration management, and system monitoring.

  • Performance Optimization: Analyzing system performance, identifying bottlenecks, and implementing optimizations to improve the efficiency and scalability of software systems.

  • Incident Response and Root Cause Analysis: Responding to incidents, conducting root cause analysis, and implementing corrective actions to prevent similar incidents in the future.

  • Collaboration with Development Teams: Collaborating with software development teams to ensure that reliability and scalability considerations are incorporated into the software design and implementation.

  • Continuous Improvement: Identifying opportunities for process improvement, implementing best practices, and driving initiatives to enhance the reliability and performance of software systems.

  • Develop Systems for Internal Developers: Identifying areas of the Software Development Lifecycle that can be improved to remove cognitive overhead from developers and keep them on the happy path towards developing sustainable, reliable, and resilient software using industry-standard practices.

Additional Job Description

What You'll Do

  • Implement a scalable, reliable, and secure SRE and observability platform to monitor the health of our production systems and provide a holistic view of the environment.

  • Deliver tools/software to improve the reliability, scalability and operability of services.

  • Collaborate with engineering teams to analyze and provide input on architecture, infrastructure resources, and observability to achieve reliability and scalability goals.

  • Serve as a technical leader for key initiatives across the organization, identify potential issues and opportunities, and lead teams to architect the next generation of reliability software.

  • Deliver impact by building software that helps maintain reliability on our backend and frontend systems.

  • Improve best practices through developing technical implementations that solve multiple developer and business needs.

  • Participate in the 24/7 on-call rotation for critical systems.

Your Skills & Abilities (Required Qualifications)

  • 7+ years of hands-on SRE experience (software development, systems monitoring) with software development experience (Java, Go, Python).

  • Experience building and operating high-availability, fault-tolerant, scalable, distributed software in production: building monitoring, defining alerts, writing runbooks, establishing dashboards, etc.

  • Experience with monitoring and logging tools such as Grafana, Loki, Logstash, ClickHouse, etc.

  • Experience with owning and maintaining software including the SDLC and deployment.

  • Strong working knowledge of Docker, Kubernetes, Terraform, and Chef or Ansible.

  • Experience troubleshooting production applications and driving mitigation and remediation.

  • BS/MS in Computer Science/Engineering preferred

Compensation and Benefits

Along with competitive pay, as a full-time Teraswitch employee, you are eligible for the following benefits at day 1 of hire:

Health, Dental and Vision Insurance

401k with company profit sharing

Flex PTO and 11 Company Paid Holidays

Posted 2026-02-10
