Furiosa Ai

NPU MI (Management Interface) Software Engineer

Furiosa Ai · Remote · 2 days ago
engineering

Responsibilities

  • Design and develop the NPU Management Interface (MI) firmware/software that enables communication between the host/BMC and NPU devices

  • Implement and maintain MCTP, PLDM, and custom MI command handling for out-of-band NPU management, monitoring, and control

  • Develop device-management features over SMBus/I²C, I3C, PCIe VDM, or custom sideband channels

  • Integrate MI functionality into the NPU firmware, including:

    • Health and error reporting

    • Thermal and power telemetry

    • Runtime status, utilization metrics, and debug information

  • Ensure compliance with industry specifications by performing spec-driven design, implementation, and validation

  • Support bring-up, interoperability testing, rack-scale platform integration, and system-level validation

  • Develop test strategies and validation tools based on MCTP and PLDM specifications

  • Perform protocol compliance testing, regression testing, and interoperability verification

Requirements

  • Strong proficiency in embedded C or C++

  • Experience developing firmware for NPUs/accelerators, GPUs, or SoCs

  • Understanding of management protocols, including MCTP (over I²C/SMBus, I3C, or PCIe VDM) and PLDM

  • Experience with low-level interfaces: SMBus/I²C, I3C, SPI, PCIe

  • Ability to interpret complex protocol specifications and convert them into robust implementations

  • Familiarity with device telemetry, sensor frameworks, watchdog/reset flows, and health monitoring

  • Experience with system-level debugging using logic/protocol analyzers and low-level debug tools

  • Knowledge of embedded systems, bare-metal or RTOS environments, and firmware lifecycle flows

Preferred Qualifications

  • Experience with BMC firmware stacks and management interfaces such as OpenBMC, Redfish, and IPMI, and with PLDM device-model implementations

  • Background in spec creation, requirement definition, or standards compliance validation

  • Experience defining FRU data, power/thermal management policies, and diagnostics frameworks

  • Familiarity with secure provisioning, firmware update mechanisms, and lifecycle state management

  • Experience with large-scale datacenter or HPC system integration (rack-level management, telemetry aggregation)

  • Contributions to accelerator firmware, MCTP/PLDM implementations, or open-source system firmware projects
