NVIDIA

Distinguished Engineer - Rack Scale Architecture

NVIDIA · Santa Clara, California, US
full-time · mid (15-25 yrs) · Posted 80d ago
Software Engineering · IC6 · IC + Management · On-site
Stack: System Architecture · Firmware · Kernel Drivers · Linux · Networking Protocols · Ethernet · InfiniBand · Redfish · IPMI · GPU Architecture · DPU · FPGA · CXL · UCIe · NVLink · Out-of-Band Management · In-Band Management · Platform Security · HPC Software Stack · Data Center Architecture · Cluster Management · Storage Technologies · Scale-Up Fabric Design

Summary

A Distinguished Engineer role at NVIDIA focused on driving end-to-end software architecture for rack-scale products, spanning firmware, kernel drivers, OS, networking, and manageability software. The role requires deep SW/HW interface expertise and 15+ years of system architecture experience, with direct engagement with hyperscaler/cloud customers.

About the role

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work.

NVIDIA has a rapidly expanding ecosystem of data center platform and node designs, from single-node HGX/DGX systems all the way up to large multi-node NVLink domain rack architectures. These designs have become core to NVIDIA's rapidly growing enterprise and cloud provider businesses, each bringing together the full power of NVIDIA GPUs, NVIDIA NVLink, NVIDIA InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. We're searching for a highly motivated technical leader to drive the engineering roadmap and innovation for our rack system software architecture, spanning firmware, kernel drivers, operating systems, networking, fabrics, and the associated user-mode drivers and manageability software. You will work with component leads internally and engage with industry-leading hyperscalers and cloud service providers to take these products to market.

What you’ll be doing:

  • Drive the end-to-end software architecture for NVIDIA's rack-scale products.

  • Maintain deep understanding of the product portfolio and roadmap; translate forward-looking plans into clear, formal software requirements that anchor execution across the organization.

  • Ensure high-quality, reliable software; serve as a trusted architectural partner to teams requiring guidance or oversight.

  • Work directly with major customers to understand their requirements and align their roadmaps with NVIDIA’s roadmap.

  • Work with business partners and vendors to shape their products to meet NVIDIA’s needs.

  • Develop a roadmap of new technologies and protocols; drive their design and adoption.

  • Mentor architects and engineering teams to grow them into future leaders.

  • Make key technical decisions even when faced with ambiguity.

What we need to see:

  • BS or MS degree in Computer Engineering, Computer Science, or a related field, or equivalent experience.

  • 15+ years of experience in system architecture and design.

  • Deep experience in designing architecture for scalable and performant server systems, particularly at the SW/HW interface.

  • Strong understanding of networking technology and protocols (e.g. Ethernet, InfiniBand).

  • Previous experience working with complex system software for accelerators such as GPUs, DPUs, or FPGAs.

  • Expertise in out-of-band and in-band management architectures.

  • Knowledge of system management protocols such as Redfish and IPMI.

  • Experience working with platform security experts to define tradeoffs between security and ease of use.

  • Demonstrable experience implementing a shift-left strategy to de-risk program execution.

  • Excellent written and verbal communication skills.

Ways to stand out from the crowd:

  • Knowledge of large-scale cloud and cluster-level deployment and management systems.

  • Experience with designing robust, resilient, and performant scale-up fabrics.

  • Demonstrated track record of leading data center products across the entire lifecycle, spanning inception, pre-silicon development, post-silicon bring-up, manufacturing, and deployment.

  • Familiarity with CXL, UCIe, and other C2C technology architectures.

  • Knowledge of storage and networking technologies.

We are widely considered to be one of the technology world’s most desirable employers, and as a result have some of the most forward-thinking and hardworking people in the world working for us. So if you're clever, creative, and driven, we'd love to have you join the team.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 320,000 USD - 488,750 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until April 6, 2026.

This posting is for an existing vacancy. 

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

What you'll do

1. Drive end-to-end software architecture for NVIDIA's rack-scale products including firmware, kernel drivers, OS, networking, and manageability software
2. Maintain deep understanding of the product portfolio and roadmap; translate forward-looking plans into formal software requirements
3. Ensure high quality and reliable software; serve as trusted architectural partner across engineering teams
4. Work directly with major hyperscaler/cloud customers to understand requirements and align roadmaps
5. Collaborate with business partners and vendors to shape products to meet NVIDIA's needs
6. Develop roadmap for new technologies and protocols; drive their design and adoption
7. Mentor architects and engineering teams to grow them into future leaders
8. Make key technical decisions under ambiguity

Requirements

15+ years of system architecture and design experience with deep expertise at the SW/HW interface for scalable, performant server systems
Deep knowledge of networking technology and protocols including Ethernet and InfiniBand, plus out-of-band/in-band management architectures
Hands-on expertise with complex system software for accelerators such as GPUs, DPUs, or FPGAs including system management protocols (Redfish, IPMI)
Proven ability to drive end-to-end software architecture for rack-scale or multi-node systems from inception through deployment
Strong communication skills with demonstrated ability to define formal software requirements, mentor engineering teams, and engage directly with hyperscaler/cloud customers

Nice to have

CXL
UCIe
Large-scale cloud deployment systems
Cluster management
Scale-up fabric design
Storage technologies
Data center product lifecycle leadership
Pre-silicon and post-silicon bring-up experience

Role overview

Role family: Software Engineering
Level: IC6 (platform)
Experience: 15–25 years
Type: Hybrid (IC + Management)
Remote policy: On-site
Visa sponsorship: Not offered

Tech stack analysis

INFRASTRUCTURE
NVLink · InfiniBand · CXL · UCIe · HGX · DGX · NVIDIA Grace CPU · GPU · DPU · FPGA
TOOLS
Redfish · IPMI

Green flags

5 items
Salary range explicitly disclosed ($320K–$488,750), well above market for senior engineering roles, plus equity (compensation)

Hiring insights

JD quality: 8/10
Urgency: medium
Autonomy: high
Team size: large (15+)

Red flags

3 items
15+ years experience plus expertise across firmware, kernel drivers, networking, security, and management protocols is an extremely broad and demanding requirements bar (requirements)

Career path

Next roles
NVIDIA Fellow · VP of Engineering · Chief Architect

About the company

NVIDIA is the world's leading designer of GPUs and AI computing platforms. Its chips power everything from gaming and data centers to autonomous vehicles and scientific research. With a market cap exceeding $2 trillion, NVIDIA's CUDA platform and AI accelerators have become the backbone of the global AI revolution.

HQ: Santa Clara, CA, USA
Build vs Maintain: both
Cross-functional: Yes