Job Description

Your mission:
As a DevOps/Platform Engineer (m/f/d), you will provide a secure, scalable, and observable foundation for our AI platform Alan and establish the "You build it, you run it" principle within the team. You will support the production teams with paved paths (self-service, guardrails) and ensure predictable performance and costs.
Your tasks

  • You will assume ownership of key platform/serving components.
  • You will operate K8s clusters, networking (Ingress), storage (databases, snapshots) and OS/kernel patching, ensuring their secure and stable operation.
  • You will model multi-cloud resources (especially Open Telekom Cloud) via console and IaC (Terraform).
  • You will build CI/CD pipelines and define release, versioning and rollback strategies.
  • In the area of Observability & Site Reliability Engineering, you will implement OpenTelemetry-based tracing, metrics and logs, and define SLIs/SLOs, alerting and error budgets.
  • Together with our AI Engineers, you will provide the platform for model serving: GPU scheduling, autoscaling, inference gateways and observability (latency/QPS/token costs).

Your profile

  • You have successfully completed a Master's degree or doctorate in a STEM subject or in a humanities subject with a STEM specialization.
  • You have at least 2 years of relevant professional experience in DevOps, Site Reliability Engineering or Platform Engineering and have demonstrably taken on responsibility for Kubernetes, IaC, CI/CD, Observability and production operations – ideally in a SaaS environment.
  • You have practical know-how in Git-based deployments, modular IaC and secret/config management, as well as hands-on experience with incidents.
  • You have security expertise in network security, secrets, hardening (CIS), software supply chain and access principles (least privilege).
  • Ideally, you have initial practical experience in operating inference workloads (vLLM or similar), GPU capacity management, autoscaling, and observability.
  • You are characterized by curiosity and a thirst for knowledge, as well as strong problem-solving and communication skills.
  • You communicate convincingly and efficiently in German and English.

Why us?

  • You will work on a state-of-the-art, scalable AI platform with plenty of creative freedom and take on early responsibility for key infrastructure and architecture decisions.
  • You will exchange professional ideas with your future colleagues on an equal footing and receive budget and time for your own innovation projects.
  • With us, you will grow professionally and personally through training courses, certifications and career development programs specifically tailored to you.
  • You can focus on and expand your areas of expertise.
  • In addition to an attractive fixed salary plus revenue and profit sharing, you can have overtime compensated and book travel time as working time.
  • With a free choice of workplace and flexible working hours, you can design your workday to fit your lifestyle.
  • You can also expect a top-equipped workplace, JobRad (company bike scheme), Body & Mind Workout, Games Nights, barbecues on our roof terrace, team activities with adventurous colleagues, summer parties with your family members and many other benefits.

About Us

Our family business