Responsibilities
Navigate, troubleshoot, and recover dynamic infrastructure and long-running processes in real time using command-line tools.
Manage highly containerized environments, including orchestrating Dockerized sandboxes and CI/CD workflows.
Build, maintain, and optimize systems for AI model training and high-throughput compute environments.
Respond swiftly to system errors, replanning and recovering mid-operation as needed.
Collaborate with engineering and AI teams to ensure seamless integration, reliability, and performance.
Document system architectures, incident responses, and recovery protocols clearly and thoroughly.

Requirements
Demonstrated expert proficiency working in terminal environments for system builds, server administration, and infrastructure management.
Advanced problem-solving skills for multi-step troubleshooting, filesystem navigation, and process management within containerized settings.
Deep familiarity with build systems, package managers, databases, web servers, ML frameworks, version control, and cryptography tools.
Proven ability to recover dynamic infrastructure and optimize long-running processes under pressure.

Infrastructure Engineer | $70/Hr Remote
CROSSING HURDLES
Remote
Published 10 days ago