The Role:
As a Senior Datacenter Engineer, you will be responsible for leading day-to-day Tesla datacenter engineering operations for the new Gigafactory Berlin build in Germany, as well as for other off-site locations Tesla has in Europe. Our global team performs all the on- and off-premise datacenter work that supports the production and engineering work that makes Tesla a world leader in self-driving EV, energy storage, and solar power technology. Continuous deployment, monitoring, maintenance, improvement, and rapid turnaround on service requests from all over the organization are imperative to drive a successful production environment in the datacenter.

You’ll be the highly engaged, hands-on regional representative for a closely integrated, cross-functional, and versatile team that performs most racking, stacking, and wiring, and that designs, implements, and maintains all Tesla datacenter resources globally. With the ever-growing need for data and compute, both locally and in remote locations, datacenter operations need to follow suit and scale through more automated processes for deployment, monitoring, and alerting. You will be responsible for greatly improving the processes for precision deployments of production systems by leveraging the combined resources your team provides.

Responsibilities:
- Leverage and improve upon existing datacenter deployments to ensure continuous operation.
- Work with engineering teams to identify useful metrics to collect, and implement the corresponding monitoring and alerting with existing monitoring solutions at the datacenter level.
- Organize and document implemented solutions for long-term information retention in our internal ticketing and documentation system.
- Work closely with involved parties to build automated workflows that can be easily implemented by remote hands with little or no understanding of internal systems.
- As part of the team, respond to and document support tickets relating to the functionality of the various systems present in the datacenter.
- Help develop automated tools that collect information users can apply directly when creating root cause analyses for reported issues.
Requirements:
- BS in Computer Science, Electrical Engineering, or a related field, or a Bachelor’s degree with 3 years of additional equivalent experience
- 5+ years of experience with:
  - Compute deployment and operations (CPU / GPU)
  - Networking infrastructure deployment and operation
  - Wiring and cable management
  - Linux operating system flavors
  - Systems monitoring and alerting
- 3+ years of experience with:
  - Storage systems (on-prem and/or in-cloud)
  - DCIM-type software for monitoring, alerting, and automation
- Working knowledge of power and cooling infrastructure at datacenter scale, including planning for it
- Working knowledge of datacenter, network, and compute deployments at scale
- Working knowledge of programming and/or scripting with Python, Bash, or similar
- Excellent time management and communication skills are absolute musts
- Highly organized and able to work on multiple projects simultaneously
- Ability to work autonomously while being part of a global team
- Ability to step up and take ownership to bring complex tasks to completion
Nice to have:
- Experience with multi-site hybrid (on-prem and in-cloud) software and hardware deployments
- Familiarity with public cloud compute and storage resource orchestration
- Interest in and knowledge of energy-efficient high-performance computing and its planning, and an understanding of liquid cooling
Other position requirements:
- Travel as necessary to remote sites within the region
- Ability to lift 20 kg boxes and/or equipment on a regular basis