GPU Platform Research Engineer - Cloud Hardware Reliability - Permanent
About Huawei
Huawei is a leading global provider of information and communications technology (ICT) infrastructure and smart devices. With integrated solutions across four key domains – telecom networks, IT, smart devices, and cloud services – we are committed to bringing digital to every person, home and organization for a fully connected, intelligent world.
At Huawei, innovation focuses on customer needs. We invest heavily in basic research, concentrating on technological breakthroughs that drive the world forward. We have more than 180,000 employees, and we operate in more than 170 countries and regions.
About the IRC
The mission of the Huawei Ireland Research Centre (IRC) is to position Huawei as a recognized technology leader and a global provider of information and communications technology (ICT) solutions. To achieve this, we are building an industry-recognized, multi-discipline Research Centre of experts with a focus on medium-term to long-term issues. The IRC will work closely with Huawei customers in an open, innovative ecosystem to address real-world issues. The IRC will also engage with key European universities to build a basic research capability to support Huawei technical projects.
About the job
Are you a researcher or engineer interested in the challenges of planet-scale cloud infrastructure? We are looking for people who are passionate about working on problems that lie at the intersection of academic research and practical industry implementation. This is a chance to be part of the strategic team that will tackle Cloud Hardware Reliability challenges. The team improves the reliability of cloud infrastructure by architecting and designing new features for future servers.
The Cloud Reliability Lab at the Huawei Ireland Research Centre has a mission to bring world-class reliability to Huawei Cloud by solving cross-functional problems that span Hardware, Software, Networking and Operations. We have teams working in all these areas with a diverse mix of talented people, including industry experts, academic researchers, and Ph.D. interns. In your role, you will collaborate with the local research teams, other European research centers, and other engineering teams spread across the globe.
Responsibilities
- Understand and investigate planet-scale technical problems, for example by defining new hardware reliability functionality for globally distributed data centers.
- Architect and help design RAS features for GPU platforms and develop manageability solutions to monitor and maintain system health.
- Present findings and solutions tailored to the needs of key stakeholders, including engineering teams, senior management, customers, and external partners.
- Gather insights from the cutting edge of industry and academia regarding GPU hardware development. Help translate customer requirements, feedback, and market dynamics into potential feature requests to ensure a high-reliability fleet.
Requirements
- Ph.D. or Master’s degree in Electrical Engineering or Computer Engineering or a related field
- Expert on GPU platform manageability including BMC firmware strategy for GPU systems
- Expert on GPU infrastructure technology (cloud/server platforms, data center specialized infrastructure)
- Expert on HBM memory reliability or similar memory technologies for GPU systems
Privacy Statement
Please read our West European Recruitment Privacy Notice before submitting your personal data to Huawei so that you fully understand how we process and manage the personal data we receive.
http://career.huawei.com/reccampportal/portal/hrd/weu_rec_all.html
- Department: Cloud Reliability Lab (Huawei Cloud)
- Locations: Dublin