Alexander Brauckmann
Email: alexander.brauckmann@tu-dresden.de
Phone: +49 (0)351 463 42336
Fax: +49 (0)351 463 39995
Visitor's Address: Helmholtzstrasse 18, 3rd floor, BAR III59, 01069 Dresden
Alexander Brauckmann received his Diploma degree in Computer Science from TU Dresden in March 2020, with a focus on Software Engineering, Systems Engineering, and Machine Learning. During his studies, he gained experience in various industry internships and worked as a student assistant at the Chair for Compiler Construction, where he wrote his final thesis on Machine Learning models for predictive and generative tasks on source code.
His research interests include
- Machine Learning models of source code
- Machine Learning-enabled compiler optimizations
I have several student topics available at the intersection of compilers and machine learning. For concrete descriptions, please email me. Generally, the topics can be adapted to various extents to fit the scope of research projects or final theses:
Modeling Source Code for Machine Learning
Central to ML methods is how data is represented and modeled, with the goal of learning meaningful features for the given task. In compilers, knowledge from analysis can be exploited to construct better models. So far, we have explored representations at different levels and modeled them using Graph Neural Network (GNN) models. Your task will be to work on new models with additional compiler-internal semantics.
Requirements: C/C++, Python
Beneficial: Machine Learning, Graph Algorithms
Related Work: [1], [2], [3]
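As a rough illustration of the kind of representation described above, the following minimal Python sketch encodes a three-instruction program as a graph with control-flow and data-flow edges and applies one round of mean-aggregation message passing, a basic building block of many GNNs. The toy program, the opcode list, the edge kinds, and the function name `message_passing_step` are assumptions made for illustration; they are not taken from the chair's tooling.

```python
from collections import defaultdict

# Hypothetical toy program: x = a + b; y = x * 2; return y
# Each node is one instruction; its feature is a one-hot opcode vector
# (an assumption chosen for illustration only).
OPCODES = ["add", "mul", "ret"]


def one_hot(opcode):
    """Encode an opcode as a one-hot vector over OPCODES."""
    return [1.0 if opcode == o else 0.0 for o in OPCODES]


nodes = {0: "add", 1: "mul", 2: "ret"}

# Typed edges: (source, target, kind)
edges = [
    (0, 1, "control"),
    (1, 2, "control"),
    (0, 1, "data"),  # x flows from add to mul
    (1, 2, "data"),  # y flows from mul to ret
]


def message_passing_step(nodes, edges, features):
    """Add the mean of the incoming neighbours' features to each node's own features."""
    incoming = defaultdict(list)
    for src, dst, _kind in edges:
        incoming[dst].append(features[src])

    updated = {}
    for n in nodes:
        msgs = incoming.get(n, [])
        if msgs:
            mean = [sum(col) / len(msgs) for col in zip(*msgs)]
        else:
            mean = [0.0] * len(features[n])
        updated[n] = [own + m for own, m in zip(features[n], mean)]
    return updated


features = {n: one_hot(op) for n, op in nodes.items()}
features = message_passing_step(nodes, edges, features)
```

This sketch ignores the edge kind during aggregation. A model that uses compiler-internal semantics would typically treat control and data edges differently, for example with separate learned weights per edge type.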
2021
- Alexander Brauckmann, Andrés Goens, Jeronimo Castrillon, "PolyGym: Polyhedral Optimizations as an Environment for Reinforcement Learning", Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 17-29, Sep 2021.
Abstract
The polyhedral model allows a structured way of defining semantics-preserving transformations to improve the performance of a large class of loops. Finding profitable points in this space is a hard problem which is usually approached by heuristics that generalize from domain-expert knowledge. Existing search space formulations in state-of-the-art heuristics depend on the shape of particular loops, making it hard to leverage generic and more powerful optimization techniques from the machine learning domain. In this paper, we propose a shape-agnostic formulation for the space of legal transformations in the polyhedral model as a Markov Decision Process (MDP). Instead of using transformations, the formulation is based on an abstract space of possible schedules. In this formulation, states model partial schedules, which are constructed by actions that are reusable across different loops. With a simple heuristic to traverse the space, we demonstrate that our formulation is powerful enough to match and outperform state-of-the-art heuristics. On the Polybench benchmark suite, we found the search space to contain transformations that lead to a speedup of 3.39x over LLVM O3, which is 1.34x better than the best transformations found in the search space of isl, and 1.83x better than the speedup achieved by the default heuristics of isl. Our generic MDP formulation enables future work to use reinforcement learning to learn optimization heuristics over a wide range of loops. This also contributes to the emerging field of machine learning in compilers, as it exposes a novel problem formulation that can push the limits of existing methods.
Bibtex
@InProceedings{brauckmann_pact21,
author = {Brauckmann, Alexander and Goens, Andrés and Castrillon, Jeronimo},
booktitle = {Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques (PACT)},
title = {PolyGym: Polyhedral Optimizations as an Environment for Reinforcement Learning},
month = sep,
doi = {10.1109/PACT52795.2021.00009},
pages = {17--29},
url = {https://ieeexplore.ieee.org/document/9563041},
year = {2021},
abstract = {The polyhedral model allows a structured way of defining semantics-preserving transformations to improve the performance of a large class of loops. Finding profitable points in this space is a hard problem which is usually approached by heuristics that generalize from domain-expert knowledge. Existing search space formulations in state-of-the-art heuristics depend on the shape of particular loops, making it hard to leverage generic and more powerful optimization techniques from the machine learning domain. In this paper, we propose a shape-agnostic formulation for the space of legal transformations in the polyhedral model as a Markov Decision Process (MDP). Instead of using transformations, the formulation is based on an abstract space of possible schedules. In this formulation, states model partial schedules, which are constructed by actions that are reusable across different loops. With a simple heuristic to traverse the space, we demonstrate that our formulation is powerful enough to match and outperform state-of-the-art heuristics. On the Polybench benchmark suite, we found the search space to contain transformations that lead to a speedup of 3.39x over LLVM O3, which is 1.34x better than the best transformations found in the search space of isl, and 1.83x better than the speedup achieved by the default heuristics of isl. Our generic MDP formulation enables future work to use reinforcement learning to learn optimization heuristics over a wide range of loops. This also contributes to the emerging field of machine learning in compilers, as it exposes a novel problem formulation that can push the limits of existing methods.},
}
2020
- Alexander Brauckmann, Andrés Goens, Jeronimo Castrillon, "ComPy-Learn: A Toolbox for Exploring Machine Learning Representations for Compilers", 2020 Forum for Specification and Design Languages (FDL), pp. 1-4, Sep 2020.
Abstract
Deep Learning methods have not only shown to improve software performance in compiler heuristics, but also e.g. to improve security in vulnerability prediction or to boost developer productivity in software engineering tools. A key to the success of such methods across these use cases is the expressiveness of the representation used to abstract from the program code. Recent work has shown that different such representations have unique advantages in terms of performance. However, determining the best-performing one for a given task is often not obvious and requires empirical evaluation. Therefore, we present ComPy-Learn, a toolbox for conveniently defining, extracting, and exploring representations of program code. With syntax-level language information from the Clang compiler frontend and low-level information from the LLVM compiler backend, the tool supports the construction of linear and graph representations and enables an efficient search for the best-performing representation and model for tasks on program code.
Bibtex
@InProceedings{brauckmann_fdl20,
author = {Alexander Brauckmann and Andr\'{e}s Goens and Jeronimo Castrillon},
title = {ComPy-Learn: A Toolbox for Exploring Machine Learning Representations for Compilers},
booktitle = {2020 Forum for Specification and Design Languages (FDL)},
year = {2020},
location = {Kiel, Germany},
month = sep,
pages={1--4},
doi={10.1109/FDL50818.2020.9232946},
url = {https://ieeexplore.ieee.org/document/9232946},
abstract = {Deep Learning methods have not only shown to improve software performance in compiler heuristics, but also e.g. to improve security in vulnerability prediction or to boost developer productivity in software engineering tools. A key to the success of such methods across these use cases is the expressiveness of the representation used to abstract from the program code. Recent work has shown that different such representations have unique advantages in terms of performance. However, determining the best-performing one for a given task is often not obvious and requires empirical evaluation. Therefore, we present ComPy-Learn, a toolbox for conveniently defining, extracting, and exploring representations of program code. With syntax-level language information from the Clang compiler frontend and low-level information from the LLVM compiler backend, the tool supports the construction of linear and graph representations and enables an efficient search for the best-performing representation and model for tasks on program code.},
}
- Alexander Brauckmann, Andrés Goens, Sebastian Ertel, Jeronimo Castrillon, "Compiler-Based Graph Representations for Deep Learning Models of Code", Proceedings of the 29th ACM SIGPLAN International Conference on Compiler Construction (CC 2020), Association for Computing Machinery, pp. 201–211, New York, NY, USA, Feb 2020.
Bibtex
@InProceedings{brauckmann_cc20,
author = {Alexander Brauckmann and Andr\'{e}s Goens and Sebastian Ertel and Jeronimo Castrillon},
title = {Compiler-Based Graph Representations for Deep Learning Models of Code},
booktitle = {Proceedings of the 29th ACM SIGPLAN International Conference on Compiler Construction (CC 2020)},
year = {2020},
isbn = {9781450371209},
url = {https://doi.org/10.1145/3377555.3377894},
doi = {10.1145/3377555.3377894},
series = {CC 2020},
pages = {201--211},
numpages = {11},
publisher = {Association for Computing Machinery},
location = {San Diego, CA, USA},
month = feb,
address = {New York, NY, USA},
keywords = {conf},
}
- Alexander Brauckmann, "Investigating Input Representations and Representation Models of Source Code for Machine Learning", Master's thesis, TU Dresden, Feb 2020.
Bibtex
@mastersthesis{Brauckmann-diplom20,
title={Investigating Input Representations and Representation Models of Source Code for Machine Learning},
author={Alexander Brauckmann},
year={2020},
month=feb,
school={TU Dresden},
}
2019
- Andrés Goens, Alexander Brauckmann, Sebastian Ertel, Chris Cummins, Hugh Leather, Jeronimo Castrillon, "A Case Study on Machine Learning for Synthesizing Benchmarks", Proceedings of the 3rd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages (MAPL), ACM, pp. 38–46, New York, NY, USA, Jun 2019.
Abstract
Good benchmarks are hard to find because they require a substantial effort to keep them representative for the constantly changing challenges of a particular field. Synthetic benchmarks are a common approach to deal with this, and methods from machine learning are natural candidates for synthetic benchmark generation. In this paper we investigate the usefulness of machine learning in the prominent CLgen benchmark generator. We re-evaluate CLgen by comparing the benchmarks generated by the model with the raw data used to train it. This re-evaluation indicates that, for the use case considered, machine learning did not yield additional benefit over a simpler method using the raw data. We investigate the reasons for this and provide further insights into the challenges the problem could pose for potential future generators.
Bibtex
@InProceedings{goens_mapl19,
author = {Andr\'{e}s Goens and Alexander Brauckmann and Sebastian Ertel and Chris Cummins and Hugh Leather and Jeronimo Castrillon},
title = {A Case Study on Machine Learning for Synthesizing Benchmarks},
booktitle = {Proceedings of the 3rd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages (MAPL)},
year = {2019},
series = {MAPL 2019},
doi = {10.1145/3315508.3329976},
url = {http://doi.acm.org/10.1145/3315508.3329976},
acmid = {3329976},
isbn = {978-1-4503-6719-6},
pages = {38--46},
address = {New York, NY, USA},
month = jun,
publisher = {ACM},
keywords = {conf},
location = {Phoenix, AZ, USA},
numpages = {9},
abstract = {Good benchmarks are hard to find because they require a substantial effort to keep them representative for the constantly changing challenges of a particular field. Synthetic benchmarks are a common approach to deal with this, and methods from machine learning are natural candidates for synthetic benchmark generation. In this paper we investigate the usefulness of machine learning in the prominent CLgen benchmark generator. We re-evaluate CLgen by comparing the benchmarks generated by the model with the raw data used to train it. This re-evaluation indicates that, for the use case considered, machine learning did not yield additional benefit over a simpler method using the raw data. We investigate the reasons for this and provide further insights into the challenges the problem could pose for potential future generators.},
}