Job Description
- Req#: JR2001681
The NVIDIA Enterprise Experience (NVEX) Solutions Engineering team is looking for a senior Computer or Software Engineer who is ready to become an authority in groundbreaking network technology used in AI clusters. Our team of software engineers bridges the gap between the customer support teams and R&D, focusing on resolution of tough problems from the front lines and providing the highest level of support for InfiniBand, NVLink, and Spectrum-X network systems that interconnect GPUs and AI compute infrastructure.
Candidates must have a software development background in the networking industry, either for a network hardware manufacturer or a software integrator. It is essential to have a proven grasp of in-field, production network operations and experience in root-causing customer-found issues down to the source code level, primarily in C and Python. Breadth of experience is key. We want to see experience in multiple areas such as network operating systems (NOS), Linux network drivers and internals, network hardware, NIC software, SmartNICs, DPUs, embedded firmware, Software Defined Networking, and infrastructure management technologies. IPC, race conditions, finite state machines, event processing loops, queue management, network traffic and flow analysis, and software design gaps will be common areas of focus. The individual will get to work across many NVIDIA teams and often interact with both internal and external customers, so superb interpersonal and communication skills are essential. Candidates will need to understand, root cause, and resolve complex issues, and provide detailed explanations of what they find.
What you will be doing:
- Assist various network and AI cluster support teams in reproducing, resolving, and root causing sophisticated customer issues
- Work with R&D teams to develop bug fixes, workarounds, and solutions for critical customers using NVIDIA’s network technologies
- Become an authority in NVIDIA network technologies used in AI clusters such as InfiniBand, NVLink, and Spectrum-X
- Analyze network performance metrics and make tuning recommendations for high-performance, lossless networks
- Develop support and analysis tools to help analyze and root cause field issues
- Daily use of groundbreaking AI tools for software development, log and trace analysis, and source code debugging
- Occasional work on weekends or holidays to support customers
What we need to see:
- Minimum of a BS in Computer, Electrical, or Software Engineering (or equivalent experience)
- 5-10 years of experience in C programming in Linux and embedded systems
- Proficiency in Python
- At least 5 years of experience developing software for one or more of the following: Linux NIC drivers, switch ASICs and SDKs, embedded network device firmware, Linux-based network equipment (routers, switches, gateways, etc.), network operating systems, virtual routers, SDN stacks, virtual switching, DPDK, SR-IOV stacks
- At least 5 years of experience directly supporting end customers, partners, or integrators for network equipment and infrastructures
- Strong system software (firmware, BIOS, kernel, driver, operating system) expertise
- Experience with container environments (K8s and Docker)
- Professional-level communication skills, including adjusting communication to the technical level of the audience and staying calm and focused in negative situations
- Passion for learning innovative tech and motivation to work hard on groundbreaking products
Ways to stand out from the crowd:
- Background with AI infrastructure and HPC networking
- Experience programming switch and NIC ASICs and SDKs
- Experience with InfiniBand or other non-Ethernet network technologies
- Experience developing or supporting DPUs or SmartNICs
- Knowledge of HPC performance test tools and NVIDIA AI stacks (NCCL, MPI, DOCA, CUDA)
You will also be eligible for equity and benefits.
About the company
NVIDIA Corporation is an American multinational technology company incorporated in Delaware and based in Santa Clara, California.
Notice
Talentify is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status.
Talentify provides reasonable accommodations to qualified applicants with disabilities, including disabled veterans. Request assistance at accessibility@talentify.io or 407-000-0000.
Federal law requires every new hire to complete Form I-9 and present proof of identity and U.S. work eligibility.
An Automated Employment Decision Tool (AEDT) will score your job-related skills and responses. Bias-audit & data-use details: www.talentify.io/bias-audit-report. NYC applicants may request an alternative process or accommodation at aedt@talentify.io or 407-000-0000.