Linux HPC Systems Administrator
SpaceX was founded under the belief that a future where humanity is out exploring the stars is fundamentally more exciting than one where we are not. Today SpaceX is actively developing the technologies to make this possible, with the ultimate goal of enabling human life on Mars.
We are looking for a motivated individual who is well-versed in High Performance Compute (HPC) clusters. The ideal candidate will be flexible and will flourish in a fast-paced, dynamic environment. To excel in this position, the candidate should be a self-starter with strong motivation and excellent customer service skills.
Responsibilities:
- Lead the development and enhancement of monitoring and job processing for Linux parallel computing, working with applications, technical support, and operations throughout the design, development, and implementation of applications
- Provide expert-level Linux support for end-user desktop and server environments. Help customers identify ways to improve their workflows and use computational resources more efficiently
- Install and configure servers, storage, and other infrastructure resources
- Provide regular maintenance, monitoring, and backup of infrastructure systems
- Provide after-hours or weekend support when necessary to carry out high-risk changes or planned downtime of SpaceX IT systems for upgrades and maintenance
- Write instructional documentation and convey highly technical ideas in terms stakeholders and customers can understand
Basic Qualifications:
- Minimum 5 years of experience in Linux administration, including recent versions of distributions such as Red Hat, CentOS, and Ubuntu
- Minimum 3 years of experience with High Performance Compute (HPC) clusters
Preferred Skills and Experience:
- Bachelor’s degree in computer science, engineering, information systems/IT, physics, math or similar technical discipline
- Experience configuring, installing and troubleshooting Open MPI, Platform MPI, and Intel MPI
- Experience configuring, installing and troubleshooting Slurm Workload Manager (http://slurm.schedmd.com)
- Experience configuring, installing and troubleshooting engineering applications, primarily CFD and FEA, with a lesser focus on electromagnetic and electromechanical applications. Experience with application tuning and profiling
- Experience with Mellanox InfiniBand (IB) switches (Director Class, TORs, Gateways) and HCAs
- Experience with NVIDIA-based GPU computing
- Experience with troubleshooting and triaging analysis job results
- Experience helping customers debug custom-written code
- Demonstrated mastery of server hardening and performance tuning concepts for Linux systems in an HPC environment
- Expert-level familiarity with the Linux operating system, its standard tools, scripting languages, and shell commands
- Expert-level experience with IP networking and related network services
- Experience integrating Microsoft technologies with Linux open-source and commercial products
- Familiarity with Docker
- Familiarity with Linux networking and authentication infrastructure
- Working understanding of Kerberos authentication
- Strong verbal and written communication skills
- Motivated self-starter personality, able to work independently
- Familiarity with network licensing technologies such as FlexLM, RLM and HASP dongles
- To conform to U.S. Government space technology export regulations, including the International Traffic in Arms Regulations (ITAR), you must be a U.S. citizen, lawful permanent resident of the U.S., protected individual as defined by 8 U.S.C. 1324b(a)(3), or eligible to obtain the required authorizations from the U.S. Department of State.
SpaceX is an Equal Opportunity Employer; employment with SpaceX is governed on the basis of merit, competence and qualifications and will not be influenced in any manner by race, color, religion, gender, national origin/ethnicity, veteran status, disability status, age, sexual orientation, gender identity, marital status, mental or physical disability or any other legally protected status.
Applicants wishing to view a copy of SpaceX’s Affirmative Action Plan for veterans and individuals with disabilities, or applicants requiring reasonable accommodation to the application/interview process should notify the Human Resources Department at (310) 363-6000.