Software Engineer - Site Reliability Engineer (SRE)

Lovelace AI
Pittsburgh, PA

About Us:

Lovelace AI was born from the desire to apply state-of-the-art AI and systems engineering to the question of human safety, especially in dangerous conditions such as conflict, disaster response, anti-terrorism, and deterrence against AIs designed by adversaries to harm civilians.

How many lives can be saved by taming the information overload, confusion and conflicting priorities experienced by the people responsible for dealing with dangerous situations around the world? We believe the answer is potentially dramatic, and we are determined to create a team with the wisdom, skills, brainpower, thoughtfulness and experience to make this vision real.

Job Summary:

Lovelace AI is seeking a highly skilled and motivated Software Engineer - Site Reliability Engineer (SRE) to join our growing team. As an SRE at Lovelace AI, you will play a critical role in ensuring the availability, scalability, and performance of our cutting-edge AI-powered applications and infrastructure. You will bridge the gap between software development and operations, applying sound engineering principles and automation to maintain and improve our systems.

Key Responsibilities:

  • Design, implement, and maintain robust monitoring, alerting, and observability solutions to proactively detect and resolve issues before they impact end-users.
  • Lead troubleshooting efforts for complex production issues, providing detailed root cause analysis (RCA) and implementing preventative measures.
  • Develop and maintain automation scripts, build systems (Bazel) and infrastructure as code (IaC) using tools like Terraform, Ansible, or CloudFormation to eliminate manual tasks and improve system reliability and efficiency.
  • Collaborate closely with software engineering teams to influence the design of new services and applications, ensuring they are scalable, reliable, and resilient from the outset.
  • Participate in on-call rotations to respond to platform emergencies, alerts, and escalations, ensuring high service uptime.
  • Analyze system performance and recommend optimizations for scalability, reliability, and efficiency.
  • Implement and enforce best practices in deployment, monitoring, and incident management to continuously improve overall system reliability and reduce downtime.
  • Develop and maintain internal tools that streamline complex operations, track bugs, manage CI/CD pipelines, and facilitate cross-team communication.
  • Conduct post-incident reviews, documenting software problems and solutions in a shared knowledge base to prevent similar issues in the future.
  • Assist with vulnerability management, system patching, and implementing security measures to protect the integrity and availability of services.

Qualifications:

  • 5+ years of experience in site reliability engineering, DevOps, systems administration, or related roles.
  • Proven track record of managing complex infrastructure, troubleshooting production issues, and optimizing system performance in high-scale environments.
  • Strong experience with Linux/Unix administration and proficiency in scripting languages (e.g., Python, Bash, Go).
  • Deep understanding of cloud platforms (AWS, GCP, Azure) and related services (e.g., EC2, S3, Lambda, Kubernetes).
  • Experience with containerization and orchestration technologies like Docker and Kubernetes.
  • Proficiency with monitoring and observability tools (e.g., Prometheus, Grafana, Datadog, Dynatrace, ELK Stack).
  • Strong understanding of networking fundamentals (DNS, TCP/IP), load balancing, and CDNs.
  • Experience with CI/CD tools (e.g., Jenkins, GitLab CI, CircleCI) and infrastructure automation.
  • Familiarity with distributed systems and microservices architecture.
  • Excellent problem-solving and troubleshooting skills.
  • Strong analytical skills with the ability to identify Service Level Indicators (SLIs) and align efforts to meet availability and latency objectives.
  • Ability to balance both development and support roles effectively.
  • Strong interpersonal skills and excellent communication skills, with the ability to collaborate effectively across various teams.
  • Experience working on projects that span multiple business segments.

Benefits:

Lovelace AI offers competitive compensation packages and comprehensive benefits. We provide a supportive and inclusive work environment where your skills and expertise can make a significant impact on the safety and security of our communities.

Lovelace’s founding team includes:

Andrew Moore, who has a track record of building impactful AI systems, designing them with human rights impact assessments as a top priority, leading the AI division of one of the world’s foremost cloud companies, and actively participating in machine learning and AI research over the past two decades.

Brendan Dunne, a career Special Operations veteran and retired Army officer who has led high-performing, cross-functional teams for the past 20 years in the country’s premier national mission force. He was most recently in charge of US Special Operations Command’s Global Analytics Platform (aka the GAP), one of DoD’s leading technology platforms.

Toby Smith, well known in the Pittsburgh tech community for his engineering leadership and design skills, and who has led many of the most ambitious and complex system infrastructure projects at Google Pittsburgh and NetApp.

Here is a note from Andrew Moore to people who are reading these Job Postings:

“Hi folks, I’m so glad you are potentially interested in Lovelace AI. This area means a lot to me because while I am an AI optimist, I also think that we technologists owe it to a rightly skeptical world to show that modern intelligent systems can actually be useful. Usefulness comes in many guises: from life sciences to education and from transportation to entertainment and many others. For many of us, security and public safety is also very high on that list. That reasoning leads to this conclusion: I’m determined to make sure that the people building Lovelace AI gain a lot from the experience, including the chance to solve fascinating problems in computer science, AI, business development, customer success and product management. I also hope that we all learn from each other in a highly enriching work environment. But my main hope is that we have a shared sense of accomplishment as we see an increasing number of national security and public safety domains made safer through sensible and robust use of advanced computer science.”

Posted 2025-09-10
