This job is now closed
Job Description
- Req#: JR1978593
Triage customer issues involving DGX clusters (GPU-based supercomputers interconnected with network fabric), network adapters and DPUs, including both InfiniBand and Ethernet technologies
Take ownership and drive critical customer issues to resolution
Collaborate with engineering to document, recreate and solve issues
Develop features and tools as part of solution engineering efforts to support all Enterprise Service offerings including InfiniBand, Ethernet, and GPU server technologies
Occasional work on weekends and holidays to support customers
BS in Computer Science, Electrical Engineering, Computer Engineering, related field, or equivalent experience
10+ years of proven experience developing with C for network and/or server equipment or infrastructure
10+ years of proven experience within the customer support escalation path, or providing direct customer-facing support
Linux OS, including system and network administration at an RHCE level
Expertise in Linux and Unix-type OSes at the kernel, driver, system, and user-space layers
Proven ability to deeply analyze/develop networking protocols (ARP, STP, LACP, MLAG, IGMP, PIM, BGP, OSPF) within drivers/network operating systems
Experience analyzing vendor interoperability problems and RFCs
Experience with containerized solutions at the level of DCA and/or CKA
Excellent verbal and written communication skills in English
InfiniBand, OFED, MOFED, RDMA, RoCE and GPU Technology
Clustering or HPC Data-Center technologies including Upper Layer Protocols (e.g., NCCL, MPI)
Additional Operating Systems such as Microsoft Windows, VMware
Expertise with scripting (Python/Bash)
Networking and Linux Certifications such as CCIE, JNCIE, RHCE, VCDX
We are looking for an experienced software engineer who excels at solving customer problems with DGX clusters (GPU-based supercomputers interconnected with InfiniBand network fabric). The ideal candidate also has experience developing drivers for network adapters. We need someone who will bridge the gap between the NVIDIA Enterprise Experience (NVEX) support team and the product development groups. An exciting candidate is highly technical and can both triage sophisticated customer hardware and network issues and develop key software enhancements and tools for NVIDIA InfiniBand and GPU products.
Along with the NVEX support responsibilities, this job enables you to develop software for the same products you support. You can make a difference in both the current customer experience and the product's future! You will get to work with many NVIDIA teams and customers, so superb interpersonal and communication skills are critical. You must be able to understand, root-cause, and resolve complex issues, and provide detailed explanations of what you find.
What you will be doing:
What we need to see:
Ways to stand out from the crowd:
Knowledge and working experience with the following:
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
About the company
Nvidia Corporation is an American multinational technology company incorporated in Delaware and based in Santa Clara, California.