Job Description

Become our new Cloud AI Architect (m/f/d) – and shape the future of data-driven platforms!

As a Cloud AI Architect, you are responsible for the hands-on design, implementation and stable operation of AI and data-driven systems in complex enterprise environments.

You work AI-first, close to production and with high ownership: from the initial architecture decision to deployment strategies, stability, security and cost control in live operation.

This role is not purely conceptual or advisory. You make concrete technical decisions, implement them yourself, and take responsibility for ensuring that agentic systems, data platforms, and AI workloads run reliably, scalably, and economically.

Your tasks & responsibilities

AI-First Platform & Operational Architecture

You follow a consistent AI-first engineering approach.
Platforms and operating models are designed from the outset to meet the specific requirements of LLMs, agentic systems, and AI workloads.

This includes:

  • Development and enhancement of cloud platforms on Microsoft Azure for AI and data systems
    (compute, storage, network, identity, tooling)
  • Deployment and operation of Databricks as a central platform for data engineering, machine learning and AI workloads
  • Architecture of agent orchestration, runtime environments, control planes and tool integrations
  • Ensuring that platforms are production-ready, scalable, observable, and operationally manageable

DevOps, deployment & release management

You are responsible for the technical implementation and stable operation of AI, agent and data systems throughout their entire lifecycle.

  • Building and operating CI/CD pipelines for AI, agent, and data components in Azure
  • Definition and implementation of release and update strategies (e.g., canary releases, versioning, controlled agent updates)
  • Reproducible deployments and clean rollbacks, especially for Databricks and AI workloads
  • Close collaboration with engineering managers and product managers on architectural and operational decisions

Stability, Security & Governance

You are responsible for the secure and stable operation of business-critical systems.

  • Responsibility for availability, performance, fault tolerance and incident handling
  • Implementation of Security by Design in Azure and Databricks environments, including:
    • Role and authorization concepts (Azure IAM)
    • Secrets Management
    • Network and system isolation
    • Audit and compliance requirements
  • Development and enforcement of governance rules for agentic systems (access, guardrails, policies, control mechanisms)

Cost, Performance & Scalability

You ensure that technical excellence and economic efficiency go hand in hand.

  • Transparency and active control of operating costs in Azure and Databricks (compute, storage, token costs, latency)
  • Design of architectures that scale with growing data volumes and agent networks
  • Evaluation of technical and economic trade-offs together with the Engineering Manager and Product Strategist

Production launch & operation

You see systems through to production operation in the enterprise, and beyond.

  • Responsibility for go-live preparation, stabilization and transition to regular operation
  • Creation of runbooks, operational documentation, and architecture decision records
  • Preparing for handover to customer IT or internal operations teams
  • Acting as the technical point of contact in critical project and operational situations

What we are looking for

Core Profile

  • Highly experienced, hands-on engineer specializing in cloud platforms (Microsoft Azure), DevOps, and enterprise operations
  • Proven production experience with Azure and Databricks in enterprise environments
  • Experience with AI and data workloads in production environments (not a purely infrastructure or conceptual role)
  • Strong AI-first mindset with a clear understanding of stability, security, and operations

AI & Data Engineering

  • Experience with LLM-based systems and agentic architectures
  • Understanding of ML lifecycle concepts (training, inference, monitoring)
  • Architecture of agentic systems including guardrails, policies and control mechanisms

Cloud, DevOps & Platform

  • Strong practical experience with:
    • Microsoft Azure (compute, networking, storage, IAM, security)
    • Databricks (data engineering, ML & AI workloads)
  • CI/CD pipelines and infrastructure as code (e.g. Terraform, Bicep)
  • Monitoring, logging and observability in the enterprise environment

Security & Governance

  • Cloud Security Patterns
  • Identity & Access Management
  • Compliance and audit requirements in enterprise environments

Soft Skills & Working Method

  • Excellent communication skills:
    You explain technical decisions, risks and costs clearly, both internally and to customers.
  • Strong business and product understanding
  • Forward-deployed mindset:
    You enjoy working closely with customers and taking responsibility in real project situations.
  • Strong ownership mindset, pragmatism, and implementation skills

Language skills

  • Fluent German and English, both at negotiation level

Why you're in exactly the right place with us

  • Demanding data and AI projects using state-of-the-art technologies (Microsoft Azure, Databricks, modern AI platforms)
  • High-performance culture with a lot of responsibility and creative freedom
  • Steep learning curve & development opportunities, including Azure & Databricks certifications
  • Hands-on engineering culture with a strong team spirit
  • Close collaboration with customers, partners and the broader AI ecosystem