Hybrid Hardware & Software Support Engineer - HPC
Job location
Primarily on-site at a customer facility near Reading, Berkshire, with occasional support for additional HPC installations across Europe.
Job description
Bull's High-Performance Computing (HPC), Artificial Intelligence & Quantum Business Unit is seeking a Hybrid Hardware & Software Support Engineer to join our HPC Services team. This is a highly visible, customer-facing operational role supporting advanced HPC infrastructures in the UK. You will work across computing, storage, and networking layers, ensuring the deployment, stability, and performance of large-scale Linux-based systems. While prior HPC experience is an advantage, it is not mandatory - strong Linux and infrastructure engineers eager to grow into HPC & AI are encouraged to apply.
Deployment & System Bring-Up
- Install, configure, and integrate HPC cluster components (compute, storage, networking).
- Perform system installation, initial configuration, and operational readiness checks.
- Apply patches, updates, and conduct routine maintenance activities.
Hybrid Hardware & Software Support
- Provide Level 1 and Level 2 operational support for HPC environments.
- Diagnose and resolve issues involving:
  - Linux operating systems
  - Enterprise server hardware
  - High-speed interconnects
  - Storage subsystems
- Conduct root cause analysis and implement corrective actions.
- Escalate appropriately within the global support organisation when needed.
Operations & Incident Handling
- Monitor system health and respond to incidents proactively.
- Perform troubleshooting in secure, mission-critical environments.
- Maintain detailed and accurate documentation of incidents and resolutions.
Customer Interface
- Act as the primary technical contact on-site.
- Communicate effectively regarding incidents, planned maintenance, and system status.
- Build trusted relationships with customer technical stakeholders.
- Represent Bull professionally in sensitive and high-profile environments.
Requirements
- Strong Linux expertise (Red Hat and/or Debian-based environments)
- Solid understanding of enterprise server hardware (CPU, memory, storage, diagnostics)
- Scripting skills in Bash and/or Python
- Strong networking fundamentals (TCP/IP, routing, switching, security basics)
- Hands-on experience with infrastructure deployment, configuration, and maintenance
- Excellent troubleshooting and analytical abilities
- Proactive mindset and ability to work independently
Desirable Skills & Experience
Valuable, but not mandatory:
- Experience with HPC clusters
- High-speed networking (40/100GbE, InfiniBand)
- Virtualisation technologies (KVM, OpenStack)
- Storage systems (Ceph, SAN/NAS)
- Parallel filesystems (Lustre, GPFS, BeeGFS)
- Containers (Docker, Podman, Kubernetes)
- Configuration management (Ansible, Puppet)
- Monitoring and observability tools (Prometheus, Grafana, Icinga)
- Workload managers (Slurm, PBS Pro)
- Git version control
The ideal candidate:
- Is hands-on, operationally focused, and detail-oriented
- Thrives in secure, mission-critical environments
- Approaches troubleshooting methodically, even under pressure
- Communicates clearly with both technical and non-technical stakeholders
- Takes full ownership of incidents through to resolution
- Is motivated to learn continuously and expand their technical expertise
Education & Experience
Option 1:
- Degree in Computer Science, Engineering, or a related field, plus at least 2 years of relevant experience
Option 2:
- 5+ years of relevant industry experience
Strong early-career candidates with solid technical foundations will also be considered.
Benefits & conditions
- Working on advanced HPC and digital infrastructure projects
- Continuous learning and technical skill development
- Career growth within a global technology organisation
- Participation in internal initiatives and community-focused activities