NVIDIA
Senior Site Reliability Engineer - GeForce NOW
This job is now closed
Job Description
- Req#: JR1981750
NVIDIA is looking for a Senior Site Reliability Engineer (SRE) to join its GeForce NOW (GFN) team. SRE at NVIDIA ensures that our internal and external facing GPU cloud gaming services deliver the reliability and uptime promised to users, while enabling developers to make changes to the existing system through careful preparation and planning, with an eye on capacity, latency, and performance. As SREs are responsible for the big picture of how our systems relate to each other, we use a breadth of tools and approaches to tackle complex problems.
The person in this position will be responsible for Service Response and Workflows and will drive tools and service development to maintain and improve service SLOs. We partner with Service Owners to drive the reliability of the service. GFN is an exciting service in the growing game streaming industry.
What you will be doing:
- Build tools to improve SRE observability.
- Be part of the Kubernetes migration journey, including VMI setup and problem solving.
- Rapidly debug and triage incidents and user-reported issues.
- Take ownership of automating, scripting, and tooling of new and existing scripts to help the team achieve 100% automation of daily tasks.
- Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity management, and launch reviews.
- Be part of an on-call rotation to support production systems.
What we need to see:
- MS or BS in Computer Science/Engineering or a related field, or equivalent experience.
- 8+ years of site reliability engineering experience working on large-scale distributed microservices in a production environment, with a real passion for automation and tooling.
- Very strong Kubernetes background, including the ability to understand complex, highly available VMI setups on Kubernetes.
- Ability to lead significant production improvements, including change management, post-mortem reviews, and workflow processes, and to design and deliver software automation in various languages.
- Demonstrated strengths in problem solving and root-causing issues, while continuously seeking ways to drive optimization, efficiency, and the bottom line.
- Previous experience with Datadog, Prometheus, Alertmanager, or similar monitoring systems.
- Experience with Jenkins (or similar CI/CD) setup, configuration, and deployment is required.
- Excellent communication, presentation, social, and analytical skills; the ability to communicate complex interaction concepts clearly and persuasively across different audiences and varying levels of the organization.
Ways to stand out from the crowd:
- Experience with StackStorm, Prometheus, Kubernetes, and similar tools is a bonus.
- Prior experience as an SRE or in Service Engineering is a huge plus.
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you.
The base salary range is 164,000 USD - 258,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
About the company
Nvidia Corporation is an American multinational technology company incorporated in Delaware and based in Santa Clara, California.