Dr. Michael Koehn

Scientist, AI Consultant & Engineer

Specialist in data analysis and machine learning, including transformer, convolutional and graph neural networks. Very experienced in time-series forecasting, anomaly detection and explainable AI. Python, R, C++, Matlab and database developer with a background in numerical analysis, parallel computing and processing of large datasets with modern tools. Dashboard visualization using Plotly Dash, R Shiny, JavaScript frameworks, Power BI or Tableau. Product deployment in the major cloud environments via DevOps/MLOps pipelines and Kubernetes clusters. Extensive agile project management experience. Post-graduate business education. Deep yet pragmatic thinker, very reliable and sincere partner. Trained to document and present findings to non-technical stakeholders in an accessible and actionable manner. Project work as senior expert in the automotive, energy, insurance, banking, telecommunications, engineering, retail, logistics, pharmaceuticals and research sectors.

Projects

Scientific Guidance in Product Development
Developer October 2024 - September 2025

Sector: Home and kitchen appliances

  • Design and implementation of algorithms and pipelines
    • to simulate the dynamical, physical system (magnets and magnetic field sensors)
    • to interpret measurements
    • to process, analyse and validate the simulation against experimental tests of the actual product
  • Simulation and detection algorithm of the physical system behind the product
    • Set up the simulation of a new accessory for the product.
    • Set up the detection algorithm, including, among other things, defining thresholds.
  • Infrastructure
    • Use Databricks and MLflow experiments to fully document the simulation runs and guarantee 100% reproducibility.
    • Creation of dashboards for end-of-line tests to allow quick analyses for the production stakeholders.
  • Support during development of a new product
    • Optimise the sensor positions for a new product.
    • Guide implementation of an unforeseen component change in close contact with the involved stakeholders.
    • Define end-of-line test parameters for the production of the new product.
  • Simulation and detection algorithm for the new product
    • Set up pipelines for the simulation of the new product’s dynamics.
    • Set up pipelines for the detection algorithm of the new product.
  • Experimental confirmation
    • Guide the experimental confirmation of the simulation setup in close contact with the client's responsible experimental team for all the above-mentioned products.
    • Analyse, visualise and document the findings.
  • Ad-hoc analyses
    • Support during unforeseen issues in production or development of the above-mentioned products, always in close contact with all the stakeholders involved.

Short-Term Trading Optimization for Energy Exchange Markets in Microsoft Azure
Developer February 2021 - March 2025

Sector: Engineering / Energy

  • Data analysis, profiling, cleansing
  • Algorithm engineering for quarter-hourly energy trading (EPEX) using mixed-integer linear programming optimization against different markets
  • Extensive work on trading strategy, measuring market volatility using average true range (ATR) on open-high-low-close (OHLC) charts, Renko charts, moving averages (MACD), on-balance volume (OBV) indicators and order book imbalance (difference between buy and sell orders at different price levels)
  • Integration of renewable plants into a portfolio, forecast and market
  • Taking into account SPOT market and operating reserve (FCR, aFRR, mFRR)
  • Set-up as service using FastAPI
  • Energy-price time-series forecasting using neural networks and benchmarking against the forecasting products of external vendors (ICIS, Procom, Price-IT)
  • Implementation of ENTSO-E data as forecasting features (load flows, load curves)
  • Implementation of data from regelleistung.net
  • Implementation of EPEX data
  • Documentation in client’s wiki system
  • Dashboard development using Plotly and Power BI for monthly evaluation of profit from algorithmic trading with respect to the defined business strategies
  • Designing and implementing a thermodynamic model for temperature prediction during high-current charging and discharging cycles
  • Azure pipeline development for scalable product development, CI/CD package registry, infrastructure as code, unit and integration testing
  • Dashboard access management via OAuth 2.0
  • Assisting in frontend development using C# Razor / Blazor
  • Agile project organization with biweekly sprints and code reviews
  • Documentation of results in client wiki system

Cloud-based Data Pipelines and Predictive Maintenance for Photovoltaic Power Plants
Developer May 2024 - Present

Sector: Engineering / Energy

  • Data analysis, profiling, cleansing
  • Edge-device config for data export to Azure Blob
  • Configuration of Modbus protocol for exporting on-site recorded data to Azure Blob and Cosmos storage solutions
  • Implementation of scalable Databricks pipelines for ingesting data from Azure Cosmos DB and Azure Blob Storage into Delta Lake:
    • Development of dedicated Python package hosted on Azure DevOps repo
    • Configuration of CI/CD for robust unit testing and deployment
    • Implementation via Apache Spark for distributed parallel data processing and structured streaming
    • Deployment with Databricks for processing and metrics
    • Configuration of Microsoft Fabric for hosting bronze, silver and gold workspaces
    • Integration of Cosmos DB Change Feed and Databricks Autoloader for real-time and batch ingestion
    • Implementation of medallion architecture (bronze, silver, gold) for optimal data organization and analytics
    • Symbiotic data architecture: data is ingested via Databricks from Azure storages to Microsoft Fabric’s bronze workspace, processed into silver, and prepared for analytics in gold
    • Codebase restructuring and revamping of legacy classic anomaly detection algorithms
    • CI/CD workflows, infrastructure as code, unit and integration testing
  • Development, analysis and fine-tuning of machine learning algorithms for anomaly detection and predictive maintenance, including Isolation Forests, Variational Autoencoders (conditional and vector-quantized), Self-Organizing Maps and Robust Covariance Estimation
  • Application of SHapley Additive exPlanations (SHAP) to interpret model outputs and explain detected anomalies
  • Design and development of customer-facing dashboards, built with Power BI and integrated into the Fabric ecosystem
  • Documentation in client’s wiki system

Retrieval-Augmented Mental Health Chatbot
Developer August 2024 - December 2024

Sector: Health

  • Built a mental health chatbot using RAG to give helpful, grounded responses.
  • Used OpenAI’s GPT-4 for generation, with retrieval from a curated set of mental health docs (CBT, mindfulness, WHO guides).
  • Set up FAISS for vector search, added metadata filters (topic, tone) to control what gets retrieved.
  • Logged daily moods from users, stored with timestamps for trend tracking.
  • Pulled local weather data via OpenWeatherMap API to add context to mood and responses.
  • Flagged critical messages (e.g. “I want to give up”) with basic keyword checks and sentiment thresholds.
  • Added daily check-ins and journaling prompts to keep users engaged.
  • Stored everything in Azure (OpenAI, Cosmos DB).
  • Responses grounded in retrieved docs to avoid hallucinations.
  • Used LangChain to wire up retriever + generator pipeline.
  • Simple admin panel built with Streamlit to monitor and update knowledge base.

Acoustic Monitoring Predictive Maintenance for Gas Turbines and Compressors
Developer April 2024 - December 2024

Sector: Energy

  • Data analysis, profiling, cleansing: Extensive exploratory data analysis, profiling of acoustic emission data represented as spectrograms, and cleansing for modeling readiness. Application of dimensionality reduction techniques and exploration of data separability to identify optimal feature spaces using Convolutional Neural Network (CNN) based Variational Autoencoders (VAE) implemented with Keras/TensorFlow and scikit-learn Principal Component Analysis (PCA).
  • Selection of suitable conditioning parameters for training a generative acoustic emission model in an unsupervised learning setting. Integration of internal machine parameters and external environmental parameters with variations on multiple timescales. Visualization of spectrograms in Python with Plotly.
  • Training dataset preparation: Selection of the most informative machine parameters measured by onboard sensors, excluding retrofitted microphones or vibration sensors. Limitation to frequency bands with an optimal signal-to-noise ratio. Restriction of data to normal operating conditions and appropriate subsets for in-domain training. Application of exponential mapping (and corresponding logarithmic inversion) to prevent non-physical predictions in generated spectrograms (Pandas and NumPy methods).
  • Training and evaluation of generative models: Development and evaluation of acoustic emission generative models using Conditional Variational Autoencoders (CVAE); implemented and trained in PyTorch. Model architecture optimization and hyperparameter tuning through comprehensive grid searches, including variations in encoder and decoder layer depth, latent space dimensions, and regularization of latent space (via Kullback-Leibler divergence/relative entropy) to balance spectrogram reconstruction and conditional generation performance.
  • Scientific reporting: Preparation of a comprehensive report detailing methodology, findings, and implications. Recommendations for subsequent steps toward anomaly detection based on generated spectrograms.
  • Roadmap for acoustic monitoring with real audio instead of spectrograms. Analysis and evaluation of state-of-the-art anomalous sound detection methods as seen in the DCASE Challenge. Feasibility analysis and planning for anomaly detection leveraging pre-trained audio transformers.

Reinforcement Learning Developer for AI-based Algorithmic Trading
Developer February 2024 - Present

Sector: Finance

  • Improved data pipelines
  • Further development and optimisation of an existing reinforcement learning agent for use in a fully automated trading system in the Forex market
  • Modelling of trading frictions and constraints: spreads, slippage, latency, per-trade/day limits, exposure caps
  • Reward design: Sharpe/Sortino, drawdown/CVaR penalties, transaction-cost awareness, action-change penalties
  • Policy architectures: transformer-based policies, distributional/value-based hybrids, and ensemble critics
  • Walk-forward splits, regime detection, stress tests around macro events, curriculum learning from calm to high-volatility periods.
  • Out-of-sample backtests, bootstrap confidence intervals, and live paper trading with drift/overfit diagnostics.
  • Automated hyperparameter search, experiment tracking (W&B), and reproducible seeds/checkpoints.
  • Real-time inference service, monitoring, alerting
  • Model versioning, and compliance-ready logs for orders, fills, and parameter changes.

Reinforcement Learning in Intelligent Control for Energy Systems
Developer February 2023 - March 2024

Sector: Engineering / Energy

  • Developed a reinforcement learning agent to manage and optimize electrical grid operations under dynamic conditions.
  • Used Grid2Op to simulate realistic power grid environments with line limits, contingencies, and redispatch actions.
  • Leveraged L2RPN datasets and competition frameworks to train and benchmark the agent against established performance metrics.
  • Combined RL (PPO, DQN) with rule-based safety constraints
  • Stress-tested agents on rare-event and high-demand scenarios
  • Optimized hyperparameters for robustness and efficiency
  • Improved significantly on various benchmarks.

Explainable AI for Neural Networks in Drug Development
Developer January 2024 - March 2025

Sector: Health / Pharmaceutical

  • Development of module for Explainable AI feature integrated into client Quantitative Structure-Activity Relationships (QSAR) neural network Python package for training and evaluating models
  • Evaluation of integrated gradients with Captum, counterfactuals generation and inference with Self-Referencing Embedded Strings (SELFIES) and Local Interpretable Model-Agnostic Explanations (LIME) with Activity Cliff benchmark datasets
  • Mapping of determined molecular activity contributions to atoms and molecular fingerprint bits as predicted by tabular, graph-based and chemical-language neural network architectures using RDKit Similarity Maps
  • Benchmarking and optimization of Explainable AI module processing for efficient end user workflows aiding rational drug design
  • Integration of optimized Explainable AI feature into deployed client QSAR framework

Bayesian Neural Networks for Automated Scientific Discovery
Developer July 2022 - December 2022

Sector: Nuclear Fusion Research

Design and implementation of a fully containerized, automated scientific discovery (ASD) platform for exploratory data analysis in high-dimensional scientific datasets from nuclear fusion and climate science research. Core contributions:

  • Scientific Discovery Algorithm Engineering
    • Developed and modularized a pipeline for detecting statistical predictability and correlations of arbitrary dimensions using Bayesian inference, neural networks, kernel density estimation, and Gaussian processes.
    • Incorporated novelty detection, sensitivity analysis, and automated feature relevance ranking using statistical and machine learning approaches.
    • Employed Streamlit-based visual analytics for scientific users without an AI background, including interactive dashboards for data inspection and correlation exploration.
  • Infrastructure & Containerization
    • Architected a reproducible, GPU-capable containerized application stack using Docker and TensorFlow that can be hosted locally or on AWS.
    • Built a Ray.io Cluster that scales dynamically based on the workloads’ needs.
    • Full network integration of all components over a secure tailnet using Headscale and Tailscale clients.
    • Automated container setup with full dependency management (Python, R, OS, and Node.js tools), creating isolated, reproducible environments.
    • Built runtime tooling for launching Streamlit UIs, Ray.io parallel tasks, and diagnostics dashboards inside containers.
  • DevOps & Deployment
    • Created an all-in-one installer script to build, configure, and run the full application locally or in cloud environments (tested on AWS).
    • Integrated system diagnostics and debugging tools to collect runtime info (CPU, memory, compiler versions) and logs.
    • Installed and configured services such as Tailscale VPN, AWS CLI, and git tooling for secure data access and collaboration.
    • Prepared for cloud deployment using YAML CloudFormation templates for EC2/Auto Scaling, Headscale service, and Step Functions orchestration.
  • Machine Learning and Forecasting
    • Used Bayesian hyperparameter optimization (Optuna) for model tuning.
    • Integrated distributed task execution using Ray.io and Prefect, allowing scalable processing of large datasets.
    • Leveraged tools such as PyTorch, XGBoost, LightGBM, and TPOT for benchmark comparisons in model discovery.
  • Scientific Toolkit Expansion
    • Implemented dependency support for both Python and R data science ecosystems (e.g., CVXR, Rcpp, mclust).
    • Enabled multi-language data pipelines for exploratory and statistical modeling workflows.

Supply Chain Optimization: statistical forecasting SAP IBP
Developer June 2023 - December 2024

Sector: HVAC Solutions / Renewable Energies

  • Data analysis, profiling, cleansing
  • Comprehensive assessment of the SAP IBP forecasting configuration, identifying limitations in handling volatile market phases
  • Implementation of a new ensemble forecast model in SAP IBP, significantly enhancing forecast accuracy and increasing model resilience to effectively manage volatile demand
  • Development of a data-driven forecast segmentation to improve forecast accuracy and responsiveness
  • Leveraged forward-looking order data to further enhance short-term forecast accuracy
  • Development and implementation of a fully-integrated accessory planning system, forecasting accessories in relation to core products to optimize inventory
  • Deployment of the tailor-made accessory planning system
  • Assessment of the currently used solutions for product lifecycle management and delivery of actionable recommendations for optimization
  • Agile project organization

High-Performance Scientific NLP Computing Cluster
Developer August 2024 - April 2025

Sector: Public

Design, implementation, and automation of a fully self-service, multi-GPU-accelerated computing environment for researchers and students at the University of Mannheim, deployed as a Kubernetes-based platform. Core contributions:

  • System Architecture and Infrastructure Deployment
    • Architected a robust cloud-native environment using K3s Kubernetes, 1 TB RAM, 4× NVIDIA H100 GPUs, and 40 TB NVMe storage.
    • Integrated JupyterHub provisioning, Dask orchestration, Conda‑Store, Argo workflows, and unified identity management via Keycloak.
    • Developed a reproducible, version-controlled setup with infrastructure-as-code and modular bash automation scripts for deployment, maintenance, and disaster recovery.
  • GPU Workload Scheduling and Fractional Allocation
    • Configured NVIDIA GPU Operator for K8s to support MIG partitioning, exposing fractional GPU slices to Kubernetes.
    • Developed dynamic notebook profiles that adapt to real-time MIG availability.
    • Implemented an idle culler to release unused MIG slices after 4h of inactivity, improving GPU utilization.
  • User Management and Secure Networking
    • Deployed Headscale as a self-hosted alternative to Tailscale control plane, with dynamic ACL generation for per-user network segmentation and secure user-to-container communication.
  • Deployment Automation & Configuration Tooling
    • Developed a CLI automation tool that orchestrates deployment actions: create, update, destroy, and maintenance, complete with interactive prompts, config validation, and dry-runs.
    • Enabled full-stack updates through idempotent scripts (e.g. SSL certificate renewal, GPU Operator upgrades, TLS secret rotation).
    • Implemented backup and restore flows using Proxmox ZFS snapshots and zfs.
  • Developer Enablement
    • Provided users with a preconfigured .bashrc template with Conda, Dask, GPU utilities, and CLI aliases.
    • Delivered a flexible local Conda/Nix package management environment, allowing persistent environments via Nix stores or shared Conda via NFS.
    • Built and maintained a custom JupyterLab Docker image to support tools such as RStudio, Julia, Ollama, Langflow, H2O LLM Studio, etc.
  • Documentation and Operational Excellence
    • Authored a comprehensive Software Requirements Specification and Operations Manual outlining installation, architecture, usage, and disaster recovery.
    • Maintained logs, telemetry, and dashboards for cluster monitoring (via Portainer and Grafana dashboards).

Time series forecasting for wind turbine sensors
Developer February 2024 - June 2024

Sector: Engineering / Energy

  • Time series specific data analysis, profiling, cleaning for training of neural networks
  • Classical statistical analysis of stationary and non-stationary time series data
  • Designed, developed and fine-tuned neural network and tree-based ensemble models for gap filling of environmental sensors for evaluation of wind turbine sites
  • Time series gap-filling with auto-regressive Long Short-Term Memory recurrent neural networks using limited data inputs
  • Integration of developed and tested neural network models into Azure data lake and processing pipelines
  • Application of trained neural network models for anomaly detection related to data quality in client time series databases
  • Integration of neural network model results into client dashboard system for exploratory data analysis
  • Prepared and led a workshop on machine learning and neural network approaches to time series prediction and gap filling
  • Documentation of developed code and model evaluation results in client’s Azure Boards user story system
  • Agile project organization with biweekly sprints

NLP for various use cases for Legal AI in eLearning
Developer February 2023 - June 2023

Sector: Legal

  • Data cleaning and preparation
  • Generation of new exemplary cases that are similar to existing ones concerning the juristic act but differ in the action itself. Test quality of outcome for different GPT versions.
  • Construction of tailored language models from scratch
  • Fine-tune GPT models with existing exemplary cases and let them evaluate true/false statements.
  • Evaluation of the performance via different metrics to cover the whole spectrum. Comparison with suitable open-source models from HuggingFace.
  • Comparison with baseline / non-fine-tuned davinci model.
  • Implementation of front-end for legal chatbot via Streamlit
  • Implement Semantic Search algorithm to provide case-specific information for the model as context input.
  • Regular meetings to present and discuss the results as well as the setup.
  • Present new ideas for use cases.

Churn forecasting based on competitor pricing data
Developer April 2023 - December 2023

Sector: Energy

  • Locating, obtaining and modelling of pricing data
  • Coupling to competitor data via Verivox and Check24
  • Construction of customer-level machine-learning churn models based on the prepared master dataset
  • Operationalization, CI pipeline setup in Azure cloud
  • Adaptation to customer cohorts across several dimensions: division (electricity, heating electricity, natural gas), supply status (basic supply, special product), acquisition sales channel, etc.
  • Dashboard development coupled to model results

Computer Vision for Article Image Analysis, Churn and Data Pipeline Engineering in Microsoft Azure
Developer March 2022 - April 2023

Sector: Retail / Fashion

  • Implementation of a deep-learning model for automatic tagging of product pictures with rich fashion attributes
    • U-net CNN for background separation
    • Clustering Algorithms for determining the color components/splits of multicolored fashion articles
    • CNN model with classification head for product tagging based on EfficientNet backbones (pattern, closing, logo, etc.)
    • Automated generation of article descriptions from images using NLP techniques
  • Implementation of state-of-the-art models for Churn prediction and Customer Lifetime value prediction
  • Dynamic pricing and elasticity via implementation of suitable algorithms using external factors such as competitor pricing
  • Engineering of complex data pipelines in the Azure Databricks environment
  • Visualization of results

Circadian clock prediction
Developer June 2020 - February 2022

Sector: Health / Pharmaceutical

  • Development of a predictive model for the circadian clock based on genetic samples
  • Integration into existing IT structure
  • Prediction of circadian clock using genetic data
  • Packaging as Windows tool with graphical frontend

Computer Vision on tracking data in professional sports
Developer March 2020 - August 2020

Sector: Sports

  • Analysis of real-time tracking data for a whole season of football matches
  • Development of high-level opponent report for set-pieces via automated statistical analyses and via machine learning
  • Autoencoder (convolution/deconvolution, pooling/upsampling, and fully-connected layers) that densely embeds players' trajectories into a 128-dimensional vector
  • Clustering for similar tactical approaches following set pieces
  • Authorship of research article and presentation of results

Predictive Parking
Developer November 2018 - July 2019

Sector: Financial Services / Insurance

  • Cleansing and preparation of large dataset
  • Development of a predictive model for parking availabilities in major cities
  • Collaboration with the ISMLL at the University of Hildesheim
  • Docker container orchestration

Convolutional Recurrent Deep Learning for Computer Vision
Developer March 2018 - July 2019

Sector: Research/Logistics/Care

  • Assessment of feasibility and authorship of research plan including project planning
  • Preparation of data pipeline using AWS services
  • Prototyping for automatic recognition of irregularities in video data from surveillance cameras using spatio-temporally sensitive neural networks
  • Communication with stakeholders at participating institutions

Engine Test Bench Analysis
Developer July 2018 - December 2018

Sector: Automotive

  • Conception of data model
  • Conception and implementation of data pipeline
  • Analysis of engine test data for irregularities using machine-learning tools
  • Development of reporting dashboard for subject-matter experts
  • Stakeholder management and project planning

Telemetry in free-floating car sharing
Developer November 2017 - April 2018

Sector: Automotive

  • Setup of a Spark cluster on a suitable EC2 instance in AWS
  • Deriving business insights from massive data set via statistical analysis using PySpark
  • Constructing models for predictive maintenance
  • Communication of results to various business stakeholders across the Group
  • Knowledge transfer to data scientists at subsidiaries in other countries
  • Visualization using Tableau

Creation of internal knowledge database using NLP techniques in AWS cloud / on premise
Developer January 2024 - June 2024

Sector: Health / Pharmaceutical

  • Developed a fast, host-independent, containerized scoring system based on PyTorch that reproduces a competitor's published research results on in-house data
  • Built a Streamlit-based dashboard to monitor tasks and compare performance of various endpoints
  • Built package to deploy inference API, cloud-native containerization
  • Improved existing codebase
  • Combination of multitask into “multi-class” predictions

Advanced Question-Answering Tool
Developer June 2023 - October 2023

Sector: Chemical

  • Used LangChain and GPT-4 to develop a tool capable of answering questions over PDF files.
  • Implemented a robust PDF parsing algorithm to extract and divide the content into smaller text pieces, optimizing the search process for precise information retrieval.
  • Leveraged the power of embeddings to generate vector representations for text fragments, enabling accurate and context-aware similarity search for relevant information.
  • Utilized FAISS, a high-performance similarity search library, to efficiently index and store the generated embedding vectors, facilitating fast and scalable retrieval of relevant text pieces.
  • Developed an advanced chatbot by integrating semantic search with GPT-4, enabling GPT-4 to provide accurate and factual answers.
  • Created a robust web application utilizing FastAPI and HTML/CSS/JavaScript, offering a user-friendly interface to the chatbot.

Recommender Systems
Developer January 2023 - June 2023

Sector: Publishing

  • In-depth analysis of existing user data and interactions
  • Comprehensive analysis of the existing recommender system based on collaborative filtering
  • Leveraged Deep Reinforcement Learning to refine the existing recommender system, enabling more accurate and personalized learning material recommendations for individual users
  • Utilized the state-of-the-art Framework RecSim to simulate user interactions
  • Presented findings and results to stakeholders, providing actionable insights

NLP app for searching scientific publications
Developer December 2022 - January 2023

Sector: Publishing

  • Data cleaning and preparation, construction of database from arXiv publications
  • Create encodings for content of scientific papers
  • Build app that
    • allows you to enter a search prompt (front-end)
    • performs asymmetric semantic search on search prompt and deposited paper content encodings (back-end)
    • returns papers that answer the provided question best
  • Test different LLMs to optimise results
  • Implement and test different chunking methods to optimise the encoding of the content of the scientific papers

NLP tool for an optimized mail search using open-source models for data protection
Developer July 2022 - August 2022

Sector: Showcase

  • ETL: Automated construction of dataset using SAP Analysis for Microsoft Office and maildir exports
  • Create encodings for the collection of mails
  • Perform asymmetric semantic search on a search prompt and the mail encodings, returning best candidates that answer the prompt
  • Test different LLMs to optimise results
  • Implement and test different chunking methods to optimise the encoding of the mails’ contents

Predictive maintenance for CNC industry drilling (AI-based chatter detection)
Developer May 2023 - October 2023

Sector: Engineering

  • Data cleaning and preparation
  • Training of the model on a range of existing video, audio and vibrational recordings from machine operations
  • Siamese network in which two clones of a network with shared weights work in tandem
  • Manual annotations
  • Convert the solution into a self-optimizing machining system (SOMS) by utilizing reinforcement learning which can vary the cut parameters and therefore not only detect but automatically avoid chatter altogether
  • Regular meetings to present and discuss the results as well as the setup.
  • Present new ideas for use cases.
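
A minimal sketch of the shared-weight idea behind a Siamese network, under stated assumptions: the class name, the single tanh layer and the contrastive loss are illustrative choices, not the architecture or loss used in the project. Both inputs of a pair pass through the same weights, and the distance between their embeddings indicates how similar the two recordings are.

```python
import math
import random


class SiameseEncoder:
    """One set of weights, applied to both inputs of a pair (hypothetical toy model)."""

    def __init__(self, n_in, n_out, seed=0):
        rng = random.Random(seed)
        self.weights = [
            [rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)
        ]

    def embed(self, x):
        """Single tanh layer; a real model would be a deeper network."""
        return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in self.weights]

    def distance(self, x1, x2):
        """Euclidean distance between the two embeddings from the shared encoder."""
        e1, e2 = self.embed(x1), self.embed(x2)
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(e1, e2)))


def contrastive_loss(distance, same_class, margin=1.0):
    """Pull matching pairs together; push non-matching pairs at least `margin` apart."""
    if same_class:
        return distance ** 2
    return max(0.0, margin - distance) ** 2
```

Because the weights are shared, the distance is symmetric in its two inputs, and identical inputs always map to distance zero.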

Neural Networks for Road Dynamics Measurement
Developer August 2021 - April 2022

Sector: Engineering

  • Data analysis, profiling, cleansing
  • Prototyping a deep-learning model for application in a wheel force transducer (explored models include Random Forests, Neural Networks, Convolutional Neural Networks, Long Short-Term Memory Networks, Transformer Networks)
  • Feasibility analysis for real-time application

Data Science for Marketing Analytics and Customer Relationship Management
Developer March 2021 - March 2023

Sector: Retail / Fashion

  • Data profiling and cleansing
  • Transforming data into knowledge by in-depth data science analyses, conception and evaluation of campaigns
  • Engineering of trigger-based SQL data pipelines in the Google Cloud Environment using Dataform
  • Implementation of state-of-the-art ML models for Churn prediction, Customer Lifetime Value prediction, Conversion Delay estimation and Return forecasting
  • Automated generation of article descriptions from images using NLP and computer vision techniques
  • Implementation of a data-driven dynamic Attribution Model based on Markov chains and Shapley values
  • Deployment of all models using GitHub Actions, Docker, Kubernetes, Terraform and Airflow
  • Setup and maintenance of data contracts
  • Communication and collaboration with stakeholders across the organization
  • Dashboard development for visualization of results
  Tags: Python, pandas, HuggingFace, FastAPI, NLP, JavaScript, CSS, HTML, WebSockets, Uvicorn, Nginx, GitLab CI
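
A minimal sketch of the Markov-chain part of such an attribution model, under stated assumptions: it uses the commonly used "removal effect" (how much the overall conversion probability drops when a channel is removed), all function and state names are hypothetical, and the Shapley-value component is not covered. Customer journeys are given as `(channels, converted)` pairs.

```python
from collections import defaultdict

START, CONV, NULL = "(start)", "(conversion)", "(null)"


def transition_probs(paths):
    """Estimate first-order transition probabilities from observed journeys."""
    counts = defaultdict(lambda: defaultdict(int))
    for channels, converted in paths:
        states = [START, *channels, CONV if converted else NULL]
        for a, b in zip(states, states[1:]):
            counts[a][b] += 1
    return {
        a: {b: n / sum(nxt.values()) for b, n in nxt.items()}
        for a, nxt in counts.items()
    }


def conversion_prob(probs, removed=None, tol=1e-12, max_iter=10_000):
    """Probability of reaching CONV from START; a removed channel acts like NULL."""
    value = defaultdict(float)
    value[CONV] = 1.0
    for _ in range(max_iter):
        delta = 0.0
        for state, nxt in probs.items():
            if state == removed:
                continue
            new = sum(p * value[t] for t, p in nxt.items() if t != removed)
            delta = max(delta, abs(new - value[state]))
            value[state] = new
        if delta < tol:
            break
    return value[START]


def attribution(paths):
    """Share of credit per channel, proportional to its removal effect."""
    probs = transition_probs(paths)
    base = conversion_prob(probs)
    channels = [s for s in probs if s != START]
    if base == 0.0:
        return {c: 0.0 for c in channels}
    effects = {c: 1.0 - conversion_prob(probs, removed=c) / base for c in channels}
    total = sum(effects.values())
    return {c: (e / total if total else 0.0) for c, e in effects.items()}
```

For example, `attribution([(["search", "email"], True), (["search"], False), (["email"], True), (["social"], False)])` assigns most of the credit to `email`, some to `search` and none to `social`, since removing `social` leaves the conversion probability unchanged.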

Explainable AI for Drug Development
Developer October 2021 - September 2022

Sector: Health / Pharmaceutical

  • Development of a full-scale explainable AI Python package for feature importance estimation (based on Shapley values) on models for ADMET (absorption, distribution, metabolism, excretion, toxicity) predictions
  • Leveraged Similarity Maps to analyze and visualize atomic contributions to the predicted ADMET probabilities
  • Dashboard development
  • API development for interaction with JS frontend
  • Implementation of an interface and API to leverage Message Passing Neural Networks (Chemprop) for molecular property prediction
  • Deployment of the Chemprop interface and API using Docker
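
A minimal sketch of Shapley-value feature importance, under stated assumptions: the function name and the baseline-replacement scheme (features outside a coalition are set to a reference value) are illustrative and not taken from the package described above. The exact enumeration shown here grows exponentially with the number of features, so it is only practical for a handful of them.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature of `x` relative to `baseline`.

    `predict` maps a list of feature values to a number. Features that are
    absent from a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight of a coalition of this size that excludes feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi
```

The values sum to the difference between the prediction at `x` and the prediction at `baseline`, which gives a quick sanity check for any model.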

Scoring, Prediction API and Dashboard for Drug Development
Developer November 2023 - April 2024

Sector: Health / Pharmaceutical

  • Developed a fast, host-independent, containerized scoring system based on PyTorch that reproduces a competitor’s published research results with own data
  • Built a Streamlit-based dashboard to monitor tasks and compare performance of various endpoints
  • Built a package to deploy the inference API with cloud-native containerization
  • Improved existing codebase
  • Combined multitask predictions into “multi-class” predictions

Spare-parts pricing and time-series
Developer February 2021 - August 2021

Sector: Automotive

  • Data analysis, profiling, cleansing
  • Translation of business logic into a technical use case
  • Algorithmic dynamic pricing for spare parts for an OEM
  • Time-series analysis of composite index graph
  • Automation of evaluation using cloud services on AWS platform

Skills

Experience

Berlin, Germany

Full-scale project work, from data preparation to prototyping to Kubernetes deployment to API construction to visualization.

AI Consultant

2019 - Present

Responsibilities:
  • Project work as senior expert in sectors automotive, energy, insurance, banking, telecommunications, engineering, retail, logistics, pharmaceuticals, research.

Undisclosed AI Consulting Company

2017 - 2019

Munich/Berlin

Major AI consulting company.

Senior Consultant Data Strategy

2017 - 2019

Responsibilities:
  • Consulting services for corporate clients and medium-sized enterprises
  • Implementation of machine-learning projects
  • Composition and presentation of concept papers & corporate trainings

Philadelphia, USA

Major US Ivy League university.

Faculty Member

-

Responsibilities:
  • Independent research

Potsdam, Germany

Albert Einstein Institute

Postdoctoral Researcher

-

Responsibilities:
  • Independent research

Education

Diploma in Physics
Courses Taken:
  • Quantum Field Theory
  • Topology & Functional Analysis
  • Theoretical Physics
  • Experimental Physics
Thesis:
Dilogarithm Identities and Characters of Exceptional Rational Conformal Field Theories
Supervisor:
PD Dr. Michael Flohr
Entrepreneurial Post-Graduate Education

Publications

Quantum Aspects and Arithmetic Structures of Cosmological Singularities in Gravitational Theories

PhD thesis

Dilogarithm Identities and Characters of Exceptional Rational Conformal Field Theories

Diploma thesis