System Administrator L2
We are seeking System Administrators at all experience levels to support mission operations sustainment activities, including system monitoring, troubleshooting, maintenance, and operational support within large-scale Linux environments.
This position is heavily focused on Linux system administration in mission-critical environments supporting high-performance computing (HPC) infrastructure and operational platforms. Candidates should possess strong troubleshooting abilities and be comfortable supporting systems where each environment may require unique solutions and approaches.
Minimum Qualifications
- Active TS/SCI with Polygraph
- Bachelor’s degree in a technical field and 5+ years of relevant experience, OR an additional 4 years of relevant experience in lieu of a degree
Responsibilities
- Support mission operations sustainment activities including monitoring, troubleshooting, and system maintenance
- Install, configure, and maintain Linux operating systems and applications
- Perform file system configuration and system optimization
- Troubleshoot operating system, application, and network-related issues
- Support TCP/IP networking configuration and troubleshooting
- Compile, install, and maintain software packages and applications
- Develop and maintain Bash scripts to support operational efficiency and automation
- Support enterprise monitoring, alerting, and metrics platforms
- Collaborate with engineering and operations teams to maintain system stability and performance
- Participate in root cause analysis and issue resolution efforts
Required Qualifications
- Strong Linux system administration experience with:
  - RHEL
  - CentOS
  - Rocky Linux
  - SLES
  - Ubuntu
- Experience with:
  - OS installation and configuration
  - File system management
  - TCP/IP networking
  - Operating system and application troubleshooting
  - Bash scripting
  - Software compilation and installation
- Understanding of High-Performance Computing (HPC) architecture
- Knowledge of high-speed networking technologies
- Familiarity with the following tools and technologies:
  - Jira
  - Confluence
  - Grafana
  - Prometheus
  - Nagios
  - Slurm
  - Git
  - Salt
  - Ansible
- Strong troubleshooting and analytical skills
- Ability to work independently in complex operational environments where systems and solutions may vary significantly