Data-centric Metaprogramming in Object-Oriented Languages
With great power comes great responsibility. As far as program transformations go, metaprogramming is the ultimate tool, allowing developers to transform any aspect of their program. However, wielding metaprogramming is difficult, especially in object-oriented languages, where even a slight signature change can break overriding relations, altering the program behavior.
In this talk I will introduce data-centric metaprogramming, a well-behaved subset of metaprogramming aimed at optimizing the data representation in a program. Data representation is becoming a hot topic again, with the Java community planning to add value class and specialization support at the virtual machine level. In the presentation, you will see how data-centric metaprogramming subsumes specialization and value class inlining while opening the door to more complex transformations.
For example, consider an immutable Vector collection and an Employee class corresponding to a database row. Both Vector and Employee are compiled separately, unaware of each other. This makes Vector[Employee] an awful data container, neither compact nor efficient. Using data-centric metaprogramming, developers can define a better representation, either a compressed byte array or a cache-friendly column-based storage. Then, compiler support allows this new representation to be used either locally or globally while offering strong correctness guarantees for the transformed code.
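To make the column-based alternative concrete, here is a minimal hand-written sketch in Scala. It is not the API of the compiler support described in this talk. It only illustrates the kind of representation a developer might define, which the transformation would then substitute for Vector[Employee] automatically. The Employee fields and the EmployeeColumns name are assumptions made for the example.

```scala
// Hypothetical row type; the field names are assumptions for illustration.
final case class Employee(id: Int, name: String, salary: Double)

// Hand-written column-based stand-in for a Vector[Employee]:
// one array per field, so primitive fields are stored unboxed and contiguously.
final class EmployeeColumns private (
    private val ids: Array[Int],
    private val names: Array[String],
    private val salaries: Array[Double]
) {
  def length: Int = ids.length

  // Rebuild a row object on demand, mirroring indexed access on a Vector[Employee].
  def apply(i: Int): Employee = Employee(ids(i), names(i), salaries(i))

  // A field-wise traversal reads only the salaries array.
  def totalSalary: Double = {
    var sum = 0.0
    var i = 0
    while (i < salaries.length) { sum += salaries(i); i += 1 }
    sum
  }

  // Convert back to the row-based type when crossing into untransformed code.
  def toVector: Vector[Employee] = Vector.tabulate(length)(i => apply(i))
}

object EmployeeColumns {
  // Split the rows into one array per field.
  def fromVector(rows: Vector[Employee]): EmployeeColumns =
    new EmployeeColumns(
      rows.map(_.id).toArray,
      rows.map(_.name).toArray,
      rows.map(_.salary).toArray
    )
}
```

Code written against this class has to call fromVector and toVector by hand at every boundary with code that still expects a Vector[Employee]. Inserting those conversions consistently is the tedious and error-prone part that compiler support is meant to take over.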
Being able to separate the interface used for programming (e.g. Vector[Employee]) from the actual data representation raises many interesting questions:
- Can it work under an open-world assumption?
- Is there a single “best representation” or are there many “good” representations?
- If data can have different representations, how can it be passed from one piece of code to another?
We’ll work through these questions and their answers together, and I will point out the research opportunities around this technique.
I am a PhD student at École polytechnique fédérale de Lausanne, in the Programming Methods Laboratory (LAMP). My supervisor is Prof. Martin Odersky, best known for designing the Scala programming language.
I am interested in performance-oriented compilation of high-level language constructs. Convenience and safety make generics great for productivity, but due to the erasure transformation, they perform sub-optimally when used with primitive numeric types. This is where I can help. My main project, dubbed miniboxing, is aimed at compiling generic classes down to very efficient bytecode. It is available at scala-miniboxing.org and can speed up generic code by up to 22x.