Infra Engr, Infra Hybrid IT

Singapore, Singapore

Job Description


Company: Singtel Group

RESPONSIBILITIES:

  • Support High-Performance Computing (HPC) and IT Service Management (ITSM).
  • As a Systems Engineer specializing in HPC, you will be a key contributor to the design, implementation, and optimization of complex computational systems. Drawing on your expertise in HPC technologies, you will collaborate with cross-functional teams to ensure the seamless integration and performance of HPC environments.
System Design and Implementation:
  • Design, implement, and maintain high-performance computing systems to meet the organization's computational needs.
  • Collaborate with stakeholders to understand performance requirements and hardware specifications.
Parallel Computing:
  • Implement and optimize parallel computing techniques to enhance system performance.
  • Leverage parallel programming languages and frameworks for efficient task execution.
Cluster Management:
  • Manage and optimize HPC clusters, ensuring scalability and reliability.
  • Implement and maintain cluster management tools for efficient resource utilization.
Performance Tuning:
  • Analyze and fine-tune system configurations, hardware, and software for optimal performance.
  • Identify and resolve performance bottlenecks in HPC applications.
Job Scheduling:
  • Utilize job scheduling systems to allocate computational resources and manage workloads efficiently.
  • Collaborate with users to understand job requirements and prioritize computing tasks.
Networking and Interconnects:
  • Configure and optimize high-speed interconnects, such as InfiniBand, for fast data transfer between nodes.
  • Collaborate with network administrators to ensure seamless communication within HPC environments.
Distributed File Systems:
  • Implement and manage distributed file systems for efficient data storage and retrieval.
  • Optimize data access and transfer mechanisms to support large-scale computations.
Fault Tolerance and Reliability:
  • Implement strategies for fault tolerance to ensure system reliability during long-running computations.
  • Troubleshoot and resolve system issues to minimize downtime.
Documentation:
  • Create and maintain detailed documentation of HPC system configurations, processes, and best practices.
  • Develop user guides and training materials for HPC users.
Stay Updated:
  • Keep abreast of emerging trends and advancements in HPC technologies.
  • Evaluate and recommend new hardware and software solutions to enhance system capabilities.
REQUIREMENTS
  • Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
  • Proven experience as a Systems Engineer with a focus on High-Performance Computing.
  • Knowledge of HPC architectures, technologies, and parallel programming languages.
Technical Proficiency:
  • Familiarity with cluster management tools, job scheduling systems, and distributed file systems.
  • Experience with high-speed interconnects (e.g., InfiniBand) and networking in HPC environments.
Problem-Solving Skills:
  • Strong analytical and problem-solving skills to address complex HPC challenges.
Communication:
  • Excellent communication and collaboration skills to work effectively in interdisciplinary teams.

Singtel

Job Detail

  • Job Id
    JD1414433
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type
    Full Time
  • Salary
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    Singapore, Singapore
  • Education
    Not mentioned