Polly Labs applied as a participating organization at Google’s Summer of Code. In previous years we participated indirectly through the LLVM Foundation and Julia.

Applying

If you are interested in participating in Google’s Summer of Code as a student for one of Polly Labs’ software projects, this is the section made for you.

We have no established method for applications. It is advised to contact us before applying for Summer of Code at Google. The preferred channel depends on the project you are interested in:

Send us some general information about yourself, what relevant experience you have and the project you are interested in. Either choose one from the list below or suggest a project on your own.

There are no general requirements for students other than having some experience with the technologies in question. For instance, an understanding of the polyhedral model might be useful, but not all projects require it.

Having said that, there is some competition between students and projects when applying at Google. The Julia project assembled a list of general advice here: Julia application guidelines.

Mentors

These people have agreed to potentially mentor a project.

  • Tobias Grosser
  • Sven Verdoolaege
  • Michael Kruse
  • Chandan Reddy
  • Oleksandr Zinenko
  • Philippe Clauss
  • Manuel Selva
  • Alexandre Isoard

Project Ideas

These are some projects students can apply for. The list is not exhaustive, we are open to more project suggestions.

PRL for CUDA

PRL is a runtime library for PPCG-generated code. Its main advantages over ocl_utilities.c used by default by PPCG are that it reuses the OpenCL objects (context, command queue, etc) and the programs compiled by the OpenCL driver so it does not have to be recompiled every time a PPCG-optimized region starts execution. It also allows reusing memory buffers allocated on the GPU, with fine-grained control over when they have to be transferred between host and accelerator.

This library currently only exists for OpenCL. In CUDA the program compilation and persistent objects are managed by CUDA and its runtime. However, there is currently no control over data transfers when using CUDA. The goal for this GSoC project is to add CUDA-support to PRL.

Things to do in this project:

  • Refactor out the shared parts for the OpenCL and CUDA versions of PRL.
  • Implement the PRL memory tagging API (e.g. prl_mem_manage) for CUDA.
  • Change the PPCG code generator to optionally generate CUDA code for PRL.
  • Add support to pencilcc, e.g. pass the required libraries to the linking pass.
  • Measure the speedup gained through memory tagging.
  • Maybe standardize the memory tagging API in PENCIL (e.g. if there are additional flags only useful for CUDA).

Possible mentor(s): Michael Kruse and/or Chandan Reddy

Check isl Memory Management Annotations

The Integer Set Library (isl) is a library for manipulating sets and relations of integer points bounded by linear constraints. It is widely used in the polyhedral community in projects such as LLVM Polly, GCC Graphite and CLooG.

Its implementation uses memory annotations in function declaration to state what happens with an object’s ownership and hence whose responsibility it is to release it. __isl_keep means the ownership remains at the caller; with __isl_take the called function will consume the object; and __isl_give passes the ownership to the caller, typically as a return value.

The task would be to implement an intra-procedural static analysis checker that checks the memory management annotations inside the isl source. That is, it should interpret the __isl_keep, __isl_take, __isl_give and __isl_null annotations and check that there are no memory leaks or double frees based on those annotations.

Some solutions have already been suggested and partially implemented. One such approach is to modify the Clang Static Analyzer. It already has a Retain Count Checker which was originally meant to check Objective-C Source Annotations. Its main drawbacks are that it was implemented only for that single use case and that it acts inter-procedurally, while the memory annotations should be correct for each function individually.

A second approach could be AST Matchers. These are less powerful than the Clang Static Analyzer, but could be sufficient for the isl coding style.

If there is time left, the student can also improve the execution paths on error conditions. There are less well-tested than the main code paths. Testing methods could include a static analyzer and unit tests.

Possible mentor(s): Tobias Grosser, Michael Kruse and/or Sven Verdoolaege

Merge Polly and APOLLO

Both, Polly and APOLLO are plugins for the LLVM compiler. While Polly is older and an official LLVM project, the authors of APOLLO had different requirements and hence started their own implementation. Most importantly, APOLLO makes extensive use of speculative execution and JIT compilation, which Polly is not designed for.

The goal of this project proposal is to bring Polly and APOLLO together, in the best case combining both communities into a single project.

Notable differences between the projects are:

  • APOLLO uses dynamic compilation, Polly is designed for AOT compilation.
  • APOLLO allows speculative execution and is able to roll-back previously executed code when a misprediction occurs.
  • APOLLO uses bones (basically individual StoreInsts) as statements. Polly uses BasicBlocks.
  • APOLLO handles DCoPS (Dynamic Control PartS) that cannot be handled by Polly’s SCoPs (Static Control Parts). The APOLLO system also allows marking loops that may be good candidates for runtime speculative optimizations.
  • Polly has been tested on more code and therefore should be the more mature implementation.
  • APOLLO uses the external projects PLUTO, Armadillo, OpenScop Library and CLooG. The different licenses (LGPL 2.1, MPL 2.0, BSD 3-clause, LGPL 2.1+) are a problem for Polly which aims to be included into LLVM which is distributed under a single license. Older versions of Polly could optionally use PLUTO and CLooG, but support for them have since been removed.

Possible mentor(s): Philippe Clauss, Manuel Selva, Tobias Grosser, and/or Michael Kruse

Fuzzing Polly and isl

Fuzzy-testing could be a useful tool for finding bugs in the existing implementation. Unfortunately for both projects it is not as easy as launching libFuzzer or American Fuzzy Lop.

In case of Polly, this merely tests the bitcode parser which has not been fortified against fuzzing itself yet. Also, the probability to hit something that just remotely looks like a SCoP is rather low. That probability is not much higher when using llvm-stress which also has the problem that it can only generate certain categories of nested loop structures. A tool that can find whether some condition can occur and produce reduced test cases that are otherwise difficult to find would be very valuable.

For isl simple fuzzing would just test the parser. More interesting would be if it could test the library’s other public functions as well. For instance, some error conditions can get hit only in unusual circumstances.

This project is more about developing tools for fuzzing and finding bugs than to fix them. The community would try to fix the bugs on the majority of cases. Possible strategies are:

  • Try to increase the likelihood of SCoPs being generated by afl or libFuzzer.
  • Modifying llvm-stress to more likely generate SCoP-like IR.
  • Write a tool that combines and mutates tests from Polly’s test suite into new test cases. (This is basically what afl and libFuzzer are doing as well, but on bit-level)
  • Fuzzing minimal programs that read a polyhedron as input, process it using isl functions and check the plausibility of the result’s correctness.

Possible mentor: Michael Kruse

Implement a faster SCoP detection in Polly

Last year, Sebastian Pop and Aditya Kumar published the article "SCoP Detection: A Fast Algorithm for Industrial Compilers". It presents a more compile-time efficient algorithm to find SCoPs on the IR-Level which they implemented in GCC Graphite.

This project’s idea is to implement the same algorithm in Polly. Polly’s ScopDetection pass is currently based on LLVM’s RegionInfo which in turn uses LLVM’s DominanceFrontier analysis. The latter is considered deprecated and has quadratic worst-case behavior. Not least the current ScopDetection does a lot of redundant computation.

Possible mentors: Tobias Grosser, Andreas Simbürger

Correctly preserve LLVM analyses after Polly’s CodeGen

Transformation passes in LLVM can chose to either preserve an analysis’ result by adapting it to the changes it made or let the pass manager throw the analysis away to force it being recomputed when needed again. Due to Polly’s current pass structure, Polly must preserve all analyses it uses or the pass manager will throw away Polly’s SCoP analysis themselves.

Currently Polly does not preserve all analysis as well as it could do. Most of the time it preserves just enough information such that other SCoPs in the same function can still be processed correctly. There is an ugly workaround to ensure that the analyses are recomputed for add other passes.

The following analyses must be preserved.

  • DominatorTree: Fully preserved.
  • ScalarEvolution: At the moment Polly just invalidates everything in the SCoP.
  • LoopInfo: Polly ensures that already found loops stay consistent, but does not add new loops in the generated code.
  • RegionInfo: Like loops, existing regions are preserved, but no new regions in generated code are found.
  • ScopDetection: Code-generated SCoPs should be invalidated.
  • DependenceInfo: Code-generated SCoPs should be invalidated.
  • Alias analyses: No effort done, probably also not necessary.

The main goal of this project is to ensure that all analysis are preserved correctly such that they pass their verifyAnalysis function and no miscompilation occurs with the barrier disabled. Stretch goals are to improve the quality of analysis preservation, for instance that loops in the generated code are added to LoopInfo.

Possible mentors: Michael Kruse