The Server R&D Team focuses on the research and development of server hardware in the fields of computing, storage, and heterogeneous computing, as well as software development and the design and delivery of integrated business solutions. The Heterogeneous Hardware and Systems group, part of the Heterogeneous Computing Software-Hardware Integration team within the Server R&D organization, is responsible for the evaluation and integration of heterogeneous computing components (e.g., accelerators, AI chips), and for planning scalable heterogeneous computing infrastructure architectures. The team delivers co-optimized software-hardware solutions for diverse workloads, and provides technical support and solutions to key internal business units, cloud platform teams, and enterprise computing services.
Your responsibilities include, but are not limited to:
1. Leading the planning and integration of vendor-provided heterogeneous computing hardware (e.g., GPUs, AI accelerators), as well as the architecture and system design of heterogeneous AI servers;
2. Providing technical support for quality operations of prototype hardware and software systems, including functional validation, stress testing, and troubleshooting;
3. Diagnosing and resolving system-level and performance-related issues encountered during heterogeneous computing workflows and deployment;
4. Conducting performance analysis, profiling, and benchmarking of heterogeneous workloads running on integrated systems;
5. Collaborating with production and engineering teams to gather, clarify, and translate business and technical requirements;
6. Independently addressing complex technical and operational challenges in a fast-paced, high-pressure environment.
Position Requirements
Minimum qualifications:
1. BS, MS, or Ph.D. in Computer Science, Computer Engineering, or a related field;
2. At least 5 years of experience in software development for heterogeneous computing with hardware-software integration, or in related fields;
3. Experience with heterogeneous hardware architectures and hardware accelerators;
4. Experience with heterogeneous computing frameworks and toolsets;
5. Ability to work independently, with good communication and strong interpersonal skills;
6. Reliability and self-motivation in a dynamic, product-oriented team.
Preferred qualifications:
1. Strong familiarity with AI chip architectures, including a hands-on understanding of the on-chip components and performance metrics that affect model efficiency;
2. Experience with large-scale distributed training;
3. Knowledge of heterogeneous computing algorithms and architectures.