Site Reliability Engineer Job at Cognizant, Remote

  • Cognizant
  • Remote

Job Description

Remote Site Reliability Engineer (AWS and Kubernetes focused)

Job Summary

As a Site Reliability Engineer, you are primarily responsible for the reliability, availability, and scalability of our production and non-production environments. You work alongside development and infrastructure teams to make sure new and existing code releases are stable and available to our internal customers. You develop ongoing observability through tools such as Datadog. You provide front-line support to all environments and respond to incidents as needed.

Responsibilities

· Work with engineering teams to gain deep knowledge of how our applications work so you can provide support and observability.

· Develop and maintain monitoring, alerting, and logging through observability tools such as Datadog.

· Automate tasks to ensure systems can self-heal where possible.

· Evaluate currently running systems and recommend improvements for performance and reliability.

· Take part in capacity planning and scaling initiatives for new environments.

· Develop runbooks for the SRE team and other incident response teams.

· Participate in on-call rotation and respond to incidents for production environments.

· As a long-term goal, develop SLIs and SLOs with engineering teams.

Qualifications

· 8 – 10 years of experience in software and infrastructure operations and support.

· Expertise in cloud platforms such as AWS, GCP, or Azure.

· Expertise in troubleshooting complex application and infrastructure issues with a focus on networking and messaging between services.

· Strong experience in modern software application and infrastructure performance monitoring and tuning.

· Strong experience with monitoring solutions such as Datadog or New Relic.

· Experience with application performance monitoring tools such as Datadog APM or similar.

· Experience working with containerized applications using Docker and running on Kubernetes or similar.

· Experience in automation scripting using Bash and/or Python.

· Experience in source control systems such as Git and hosted solutions like Bitbucket, GitHub, and/or GitLab for CI/CD.

· Experience in release management and making sure applications are stable post release.

· Experience in Linux system administration.

· Experience with messaging systems such as AWS SQS, RabbitMQ, Pulsar, or Kafka.

· Familiarity with relational databases such as PostgreSQL and Microsoft SQL Server.

· Familiarity with transaction testing tools such as Datadog Synthetic Tests and RUM.

· Familiarity with SRE concepts such as SLIs, SLOs, SLAs, and error budgets.

· Excellent communication and collaboration skills in working with cross-functional teams.

· Ability to take ownership of a project or system and complete it or make improvements while providing extensive communication and documentation.

· Ability to work independently and handle multiple priorities.

· A proactive mindset and comfort with being the point person on critical tasks.

  • The annual salary for this position is between $130,000 and $160,000 plus bonus, depending on the experience and other qualifications of the successful candidate.
    This position is also eligible for Cognizant’s discretionary annual incentive program, based on performance and subject to the terms of Cognizant’s applicable plans.
    Benefits: Cognizant offers the following benefits for this position, subject to applicable eligibility requirements:
  • Medical/Dental/Vision/Life Insurance 
  • Paid holidays plus Paid Time Off 
  • 401(k) plan and contributions 
  • Long-term/Short-term Disability 
  • Paid Parental Leave 
  • Employee Stock Purchase Plan 
    Disclaimer: The salary, other compensation, and benefits information is accurate as of the date of this posting. Cognizant reserves the right to modify this information at any time, subject to applicable law.
    Applications will be accepted until October 2nd, 2025.

Job Tags

Remote job, Full time, Temporary work
