Alexander Brauckmann

E-mail

Phone

Fax

Visitor's Address

alexander.brauckmann@tu-dresden.de

+49 (0)351 463 42336

+49 (0)351 463 39995

Helmholtzstrasse 18,3rd floor, BAR III59

01069 Dresden
Germany

Curriculum Vitae

Alexander Brauckmann received his Diploma degree in Computer Science from TU Dresden in March 2020, in which he has focused on Software Engineering, Systems Engineering, and Machine Learning. During his studies, he gained experience in various industry internships, as well as a student assistant at the Chair for Compiler Construction, where he wrote his final thesis on Machine Learning models for predictive and generative tasks on source code.

His research interests include

Machine Learning models of source code
Machine Learning-enabled compiler optimizations

Student Topics

I have several student topics in the intersection of compilers and machine learning available. For concrete descriptions, please email me. Generally, the topics can be adapted to several extends, fitting the scope of research projects or final theses:

Modeling Source Code for Machine Learning

Central to ML-Methods is the way how to represent and model data, with the goal to learn meaningful features for the given task. In compilers, knowledge from analysis can be exploited to construct better models. So far, we have explored representations at different levels and modeled them using Graph Neural Network (GNN) models. Your task will be to work on new models with additional compiler-internal semantics.

Requirements: C/C++, Python
Beneficial: Machine Learning, Graph Algorithms
Related Work: [1], [2], [3]

Publications

2025
Alexander Brauckmann, Anderson Faustino da Silva, Gabriel Synnaeve, Michael F. P. O'Boyle, Jeronimo Castrillon, Hugh Leather, "DFA-Net: A Compiler-Specific Neural Architecture for Robust Generalization in Data Flow Analyses", Proceedings of the 34th ACM SIGPLAN International Conference on Compiler Construction (CC 2025), Association for Computing Machinery, pp. 92–103, New York, NY, USA, Mar 2025. [doi] [Bibtex & Downloads]

DFA-Net: A Compiler-Specific Neural Architecture for Robust Generalization in Data Flow Analyses

Reference

Alexander Brauckmann, Anderson Faustino da Silva, Gabriel Synnaeve, Michael F. P. O'Boyle, Jeronimo Castrillon, Hugh Leather, "DFA-Net: A Compiler-Specific Neural Architecture for Robust Generalization in Data Flow Analyses", Proceedings of the 34th ACM SIGPLAN International Conference on Compiler Construction (CC 2025), Association for Computing Machinery, pp. 92–103, New York, NY, USA, Mar 2025. [doi]

Abstract
Data flow analysis is fundamental to modern program optimization and verification, serving as a critical foundation for compiler transformations. As machine learning increasingly drives compiler tasks, the need for models that can implicitly understand and correctly reason about data flow properties becomes crucial for maintaining soundness. State-of-the-art machine learning methods, especially graph neural networks (GNNs), face challenges in generalizing beyond training scenarios due to their limited ability to perform large propagations. We present DFA-Net, a neural network architecture tailored for compilers that systematically generalizes. It emulates the reasoning process of compilers, facilitating the generalization of data flow analyses from simple to complex programs. The architecture decomposes data flow analyses into specialized neural networks for initialization, transfer, and meet operations, explicitly incorporating compiler-specific knowledge into the model design. We evaluate DFA-Net on a data flow analysis benchmark from related work and demonstrate that our compiler-specific neural architecture can learn and systematically generalize on this task. DFA-Net demonstrates superior performance over traditional GNNs in data flow analysis, achieving F1 scores of 0.761 versus 0.009 for data dependencies and 0.989 versus 0.196 for dominators at high complexity levels, while maintaining perfect scores for liveness and reachability analyses where GNNs struggle significantly.

Bibtex

@InProceedings{brauckmann_cc25,
author = {Alexander Brauckmann and Anderson Faustino da Silva and Gabriel Synnaeve and Michael F. P. O'Boyle and Jeronimo Castrillon and Hugh Leather},
booktitle = {Proceedings of the 34th ACM SIGPLAN International Conference on Compiler Construction (CC 2025)},
title = {DFA-Net: A Compiler-Specific Neural Architecture for Robust Generalization in Data Flow Analyses},
doi = {10.1145/3708493.3712687},
isbn = {9798400714078},
location = {Las Vegas, NV, USA},
pages = {92–103},
publisher = {Association for Computing Machinery},
series = {CC 2025},
url = {https://doi.org/10.1145/3708493.3712687},
abstract = {Data flow analysis is fundamental to modern program optimization and verification, serving as a critical foundation for compiler transformations. As machine learning increasingly drives compiler tasks, the need for models that can implicitly understand and correctly reason about data flow properties becomes crucial for maintaining soundness. State-of-the-art machine learning methods, especially graph neural networks (GNNs), face challenges in generalizing beyond training scenarios due to their limited ability to perform large propagations. We present DFA-Net, a neural network architecture tailored for compilers that systematically generalizes. It emulates the reasoning process of compilers, facilitating the generalization of data flow analyses from simple to complex programs. The architecture decomposes data flow analyses into specialized neural networks for initialization, transfer, and meet operations, explicitly incorporating compiler-specific knowledge into the model design. We evaluate DFA-Net on a data flow analysis benchmark from related work and demonstrate that our compiler-specific neural architecture can learn and systematically generalize on this task. DFA-Net demonstrates superior performance over traditional GNNs in data flow analysis, achieving F1 scores of 0.761 versus 0.009 for data dependencies and 0.989 versus 0.196 for dominators at high complexity levels, while maintaining perfect scores for liveness and reachability analyses where GNNs struggle significantly.},
address = {New York, NY, USA},
month = mar,
numpages = {11},
year = {2025},
}

Downloads

2503_Brauckmann_CC [PDF]

Permalink

https://cfaed.tu-dresden.de/publications?pubId=3804

×

2021
Alexander Brauckmann, Andr'es Goens, Jeronimo Castrillon, "PolyGym: Polyhedral Optimizations as an Environment for Reinforcement Learning", Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 17-29, Sep 2021. [doi] [Bibtex & Downloads]

PolyGym: Polyhedral Optimizations as an Environment for Reinforcement Learning

Reference

Alexander Brauckmann, Andr'es Goens, Jeronimo Castrillon, "PolyGym: Polyhedral Optimizations as an Environment for Reinforcement Learning", Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 17-29, Sep 2021. [doi]

Abstract
The polyhedral model allows a structured way of defining semantics-preserving transformations to improve the performance of a large class of loops. Finding profitable points in this space is a hard problem which is usually approached by heuristics that generalize from domain-expert knowledge. Existing search space formulations in state-of-the-art heuristics depend on the shape of particular loops, making it hard to leverage generic and more powerful optimization techniques from the machine learning domain. In this paper, we propose a shape-agnostic formulation for the space of legal transformations in the polyhedral model as a Markov Decision Process (MDP). Instead of using transformations, the formulation is based on an abstract space of possible schedules. In this formulation, states model partial schedules, which are constructed by actions that are reusable across different loops. With a simple heuristic to traverse the space, we demonstrate that our formulation is powerful enough to match and outperform state-of-the-art heuristics. On the Polybench benchmark suite, we found the search space to contain transformations that lead to a speedup of 3.39x over LLVM O3, which is 1.34x better than the best transformations found in the search space of isl, and 1.83x better than the speedup achieved by the default heuristics of isl. Our generic MDP formulation enables future work to use reinforcement learning to learn optimization heuristics over a wide range of loops. This also contributes to the emerging field of machine learning in compilers, as it exposes a novel problem formulation that can push the limits of existing methods.

Bibtex

@InProceedings{brauckmann_pact21,
author = {Brauckmann, Alexander and Goens, Andrés and Castrillon, Jeronimo},
booktitle = {Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques (PACT)},
title = {PolyGym: Polyhedral Optimizations as an Environment for Reinforcement Learning},
month = sep,
doi = {10.1109/PACT52795.2021.00009},
pages = {17-29},
url = {https://ieeexplore.ieee.org/document/9563041},
year = {2021},
abstract = {The polyhedral model allows a structured way of defining semantics-preserving transformations to improve the performance of a large class of loops. Finding profitable points in this space is a hard problem which is usually approached by heuristics that generalize from domain-expert knowledge. Existing search space formulations in state-of-the-art heuristics depend on the shape of particular loops, making it hard to leverage generic and more powerful optimization techniques from the machine learning domain. In this paper, we propose a shape-agnostic formulation for the space of legal transformations in the polyhedral model as a Markov Decision Process (MDP). Instead of using transformations, the formulation is based on an abstract space of possible schedules. In this formulation, states model partial schedules, which are constructed by actions that are reusable across different loops. With a simple heuristic to traverse the space, we demonstrate that our formulation is powerful enough to match and outperform state-of-the-art heuristics. On the Polybench benchmark suite, we found the search space to contain transformations that lead to a speedup of 3.39x over LLVM O3, which is 1.34x better than the best transformations found in the search space of isl, and 1.83x better than the speedup achieved by the default heuristics of isl. Our generic MDP formulation enables future work to use reinforcement learning to learn optimization heuristics over a wide range of loops. This also contributes to the emerging field of machine learning in compilers, as it exposes a novel problem formulation that can push the limits of existing methods.},
}

Downloads

2109_Brauckmann_PACT [PDF]

Permalink

https://cfaed.tu-dresden.de/publications?pubId=3182

×

2020
Alexander Brauckmann, Andrés Goens, Jeronimo Castrillon, "ComPy-Learn: A Toolbox for Exploring Machine Learning Representations for Compilers", In Proceeding: 2020 Forum for Specification and Design Languages (FDL), pp. 1-4, Sep 2020. [doi] [Bibtex & Downloads]

ComPy-Learn: A Toolbox for Exploring Machine Learning Representations for Compilers

Reference

Alexander Brauckmann, Andrés Goens, Jeronimo Castrillon, "ComPy-Learn: A Toolbox for Exploring Machine Learning Representations for Compilers", In Proceeding: 2020 Forum for Specification and Design Languages (FDL), pp. 1-4, Sep 2020. [doi]

Abstract
Deep Learning methods have not only shown to improve software performance in compiler heuristics, but also e.g. to improve security in vulnerability prediction or to boost developer productivity in software engineering tools. A key to the success of such methods across these use cases is the expressiveness of the representation used to abstract from the program code. Recent work has shown that different such representations have unique advantages in terms of performance. However, determining the best-performing one for a given task is often not obvious and requires empirical evaluation. Therefore, we present ComPy-Learn, a toolbox for conveniently defining, extracting, and exploring representations of program code. With syntax-level language information from the Clang compiler frontend and low-level information from the LLVM compiler backend, the tool supports the construction of linear and graph representations and enables an efficient search for the best-performing representation and model for tasks on program code.

Bibtex

@InProceedings{brauckmann_fdl20,
author = {Alexander Brauckmann and Andr\'{e}s Goens and Jeronimo Castrillon},
title = {ComPy-Learn: A Toolbox for Exploring Machine Learning Representations for Compilers},
booktitle = {2020 Forum for Specification and Design Languages (FDL)},
year = {2020},
location = {Kiel, Germany},
month = sep,
pages={1-4},
doi={10.1109/FDL50818.2020.9232946},
url = {https://ieeexplore.ieee.org/document/9232946},
abstract = {Deep Learning methods have not only shown to improve software performance in compiler heuristics, but also e.g. to improve security in vulnerability prediction or to boost developer productivity in software engineering tools. A key to the success of such methods across these use cases is the expressiveness of the representation used to abstract from the program code. Recent work has shown that different such representations have unique advantages in terms of performance. However, determining the best-performing one for a given task is often not obvious and requires empirical evaluation. Therefore, we present ComPy-Learn, a toolbox for conveniently defining, extracting, and exploring representations of program code. With syntax-level language information from the Clang compiler frontend and low-level information from the LLVM compiler backend, the tool supports the construction of linear and graph representations and enables an efficient search for the best-performing representation and model for tasks on program code.},
}

Downloads

2009_Brauckmann_FDL [PDF]

Permalink

https://cfaed.tu-dresden.de/publications?pubId=2842

×
Alexander Brauckmann, Andrés Goens, Sebastian Ertel, Jeronimo Castrillon, "Compiler-Based Graph Representations for Deep Learning Models of Code", Proceedings of the 29th ACM SIGPLAN International Conference on Compiler Construction (CC 2020), Association for Computing Machinery, pp. 201–211, New York, NY, USA, Feb 2020. [doi] [Bibtex & Downloads]

Compiler-Based Graph Representations for Deep Learning Models of Code

Reference

Alexander Brauckmann, Andrés Goens, Sebastian Ertel, Jeronimo Castrillon, "Compiler-Based Graph Representations for Deep Learning Models of Code", Proceedings of the 29th ACM SIGPLAN International Conference on Compiler Construction (CC 2020), Association for Computing Machinery, pp. 201–211, New York, NY, USA, Feb 2020. [doi]

Bibtex

@InProceedings{brauckmann_cc20,
author = {Alexander Brauckmann and Andr\'{e}s Goens and Sebastian Ertel and Jeronimo Castrillon},
title = {Compiler-Based Graph Representations for Deep Learning Models of Code},
booktitle = {Proceedings of the 29th ACM SIGPLAN International Conference on Compiler Construction (CC 2020)},
year = {2020},
isbn = {9781450371209},
url = {https://doi.org/10.1145/3377555.3377894},
doi = {10.1145/3377555.3377894},
series = {CC 2020},
pages = {201–211},
numpages = {11},
publisher = {Association for Computing Machinery},
location = {San Diego, CA, USA},
month = feb,
address = {New York, NY, USA},
keywords = {conf},
}

Downloads

2002_Brauckmann_CC [PDF]

Permalink

https://cfaed.tu-dresden.de/publications?pubId=2561

×
Alexander Brauckmann, "Investigating Input Representations and Representation Models of Source Code for Machine Learning", Master's thesis, TU Dresden, Feb 2020. [Bibtex & Downloads]

Investigating Input Representations and Representation Models of Source Code for Machine Learning

Reference

Alexander Brauckmann, "Investigating Input Representations and Representation Models of Source Code for Machine Learning", Master's thesis, TU Dresden, Feb 2020.

Bibtex

@mastersthesis{Brauckmann-diplom20,
title={Investigating Input Representations and Representation Models of Source Code for Machine Learning},
author={Alexander Brauckmann},
year={2020},
month=feb,
school={TU Dresden},
}

Downloads

2002_Brauckmann_DA [PDF]

Permalink

https://cfaed.tu-dresden.de/publications?pubId=2839

×

2019
Andrés Goens, Alexander Brauckmann, Sebastian Ertel, Chris Cummins, Hugh Leather, Jeronimo Castrillon, "A Case Study on Machine Learning for Synthesizing Benchmarks", Proceedings of the 3rd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages (MAPL), ACM, pp. 38–46, New York, NY, USA, Jun 2019. [doi] [Bibtex & Downloads]

A Case Study on Machine Learning for Synthesizing Benchmarks

Reference

Andrés Goens, Alexander Brauckmann, Sebastian Ertel, Chris Cummins, Hugh Leather, Jeronimo Castrillon, "A Case Study on Machine Learning for Synthesizing Benchmarks", Proceedings of the 3rd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages (MAPL), ACM, pp. 38–46, New York, NY, USA, Jun 2019. [doi]

Abstract
Good benchmarks are hard to find because they require a substantial effort to keep them representative for the constantly changing challenges of a particular field. Synthetic benchmarks are a common approach to deal with this, and methods from machine learning are natural candidates for synthetic benchmark generation. In this paper we investigate the usefulness of machine learning in the prominent CLgen benchmark generator. We re-evaluate CLgen by comparing the benchmarks generated by the model with the raw data used to train it. This re-evaluation indicates that, for the use case considered, machine learning did not yield additional benefit over a simpler method using the raw data. We investigate the reasons for this and provide further insights into the challenges the problem could pose for potential future generators.

Bibtex

@InProceedings{goens_mapl19,
author = {Andr\'{e}s Goens and Alexander Brauckmann and Sebastian Ertel and Chris Cummins and Hugh Leather and Jeronimo Castrillon},
title = {A Case Study on Machine Learning for Synthesizing Benchmarks},
booktitle = {Proceedings of the 3rd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages (MAPL)},
year = {2019},
series = {MAPL 2019},
doi = {10.1145/3315508.3329976},
url = {http://doi.acm.org/10.1145/3315508.3329976},
acmid = {3329976},
isbn = {978-1-4503-6719-6/19/06},
pages = {38--46},
address = {New York, NY, USA},
month = jun,
publisher = {ACM},
keywords = {conf},
location = {Phoenix, AZ, USA},
numpages = {9},
abstract = {Good benchmarks are hard to find because they require a substantial effort to keep them representative for the constantly changing challenges of a particular field. Synthetic benchmarks are a common approach to deal with this, and methods from machine learning are natural candidates for synthetic benchmark generation. In this paper we investigate the usefulness of machine learning in the prominent CLgen benchmark generator. We re-evaluate CLgen by comparing the benchmarks generated by the model with the raw data used to train it. This re-evaluation indicates that, for the use case considered, machine learning did not yield additional benefit over a simpler method using the raw data. We investigate the reasons for this and provide further insights into the challenges the problem could pose for potential future generators.},
}

Downloads

1906_Goens_MAPL [PDF]

Permalink

https://cfaed.tu-dresden.de/publications?pubId=2450

×

Alexander Brauckmann

2025

2021

2020

2019