

Machine Learning Section


Department of Computer Science (DIKU)

University of Copenhagen


Universitetsparken 1

2100 København N



Room: 1-1-N110





Astronomy & Data Analysis

Current and upcoming projects in astronomy gather tera- and petabytes of data. Machine learning techniques have been identified as an important building block for the overall analysis. I am involved in the development of redshift estimation models for photometric data (grayscale images).







Exact Unsupervised Regularized Least-Squares Classification

An exact approach to the unsupervised regularized least-squares classification problem, implemented in Python using the NumPy and SciPy packages. The learning problem considered is a variant of the maximum margin clustering task, which can be regarded as the direct extension of support vector machines to unsupervised learning scenarios. The goal is to partition unlabeled data into two classes such that a subsequent application of a support vector machine would yield the overall best result (with respect to the optimization problem associated with support vector machines).



While very appealing from a conceptual point of view, the combinatorial nature of the induced optimization problem renders a direct application of this concept difficult. The prototype implementation provided below exhibits a polynomial runtime w.r.t. the number of patterns and can be applied to arbitrary (linear and non-linear) kernel matrices of fixed rank 2.
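For intuition, the combinatorial problem can be stated directly: for fixed labels y, the optimal regularized least-squares value has the closed form lambda * y^T (K + lambda*I)^{-1} y, so the unsupervised task amounts to a search over labelings. The following brute-force sketch illustrates this (it is exponential in n and meant for toy data only; the function names and the simple balance constraint are illustrative and not the paper's polynomial-time algorithm):

```python
import itertools
import numpy as np

def rls_objective(K, y, lam):
    """Optimal regularized least-squares value for fixed labels y:
    min_a ||y - K a||^2 + lam * a^T K a  =  lam * y^T (K + lam*I)^{-1} y."""
    n = K.shape[0]
    return lam * (y @ np.linalg.solve(K + lam * np.eye(n), y))

def unsupervised_rlsc_bruteforce(K, lam=1.0, min_per_class=1):
    """Try all +/-1 labelings (subject to a balance constraint) and keep
    the one with the smallest RLS objective. Exponential in n!"""
    n = K.shape[0]
    best_y, best_val = None, np.inf
    for bits in itertools.product([-1.0, 1.0], repeat=n):
        y = np.array(bits)
        n_pos = int(np.sum(y > 0))
        if n_pos < min_per_class or n - n_pos < min_per_class:
            continue  # rule out trivial single-class partitions
        val = rls_objective(K, y, lam)
        if val < best_val:
            best_y, best_val = y, val
    return best_y, best_val

# Toy data: two well-separated blobs, linear kernel
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3.0, 0.5, (4, 2)),
               rng.normal(3.0, 0.5, (4, 2))])
K = X @ X.T
y, val = unsupervised_rlsc_bruteforce(K, lam=1.0)
```

On such well-separated toy data, the minimizing labeling recovers the two blobs (up to a global sign flip).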


Source Code


The code is published under the MIT license. If you use the code for scientific work, please use the reference(s) below to cite us. The source code can be downloaded here.




Version 0.1 (November 2013)




  1. Fabian Gieseke, Tapio Pahikkala, and Christian Igel. Polynomial Runtime Bounds for Fixed-Rank Unsupervised Least-Squares Classification. In Proceedings of the 5th Asian Conference on Machine Learning (ACML). 2013, 62-71.   


Buffer k-d Trees

The bufferkdtree package is now hosted on




[old version]


The buffer k-d tree library is an OpenCL library that aims at accelerating nearest neighbor computations using both k-d trees and graphics processing cards (GPUs). The source code is published under the GNU General Public License (GPLv3).


Buffer k-d trees aim at scenarios where you are given both a large reference set (e.g., 100,000 points) and a huge query set (e.g., 1,000,000 or more points) with an input space of moderate dimensionality (e.g., from 4 to 20 dimensions). A description of the techniques used and an experimental evaluation of the implementation using massive astronomical data sets are provided in the corresponding paper.
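To make the target workload concrete, here is the same kind of query (many nearest-neighbor lookups against a fixed reference set in moderate dimensionality) expressed with SciPy's CPU k-d tree. This is a hypothetical baseline for comparison, not the bufferkdtree API, and the sizes are scaled down from the scenarios described above:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(42)
d = 10                                 # moderate dimensionality (4-20)
reference = rng.random((5000, d))      # stand-in for ~100,000 reference points
queries = rng.random((20000, d))       # stand-in for >= 1,000,000 queries

tree = cKDTree(reference)              # build once over the reference set
dists, idx = tree.query(queries, k=12) # 12 nearest neighbors per query
# dists.shape == (20000, 12); idx indexes into `reference`
```

The buffer k-d tree library targets exactly this setting, but batches and reorders the traversal work so that the per-leaf distance computations can be processed efficiently on the GPU.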




  • 21 October 2014: Added a note regarding the performances using different driver/CUDA settings.
  • 12 June 2014: Initial release




The current version of the source code is experimental (use the code at your own risk!). So far, it has only been tested extensively on a system with an Intel(R) Core(TM) i7-3770 CPU at 3.40GHz (4 cores, 8 hardware threads), 16GB RAM, a GeForce GTX 770 GPU (with 4GB RAM) and Ubuntu 12.04 (64 Bit) as operating system.


The implementation is based on the efficient use of implicit hardware caches. Thus, to obtain good speed-ups, the GPU at hand has to support this feature! Current architectures such as Nvidia's Kepler architecture exhibit such caches, see, e.g., the Kepler GK110 Whitepaper.




The following releases of the source code are available:





The code has been tested on Ubuntu 12.04 (64 Bit) only, see above. Given an appropriate installation of OpenCL, the source code can be compiled on a Linux-based system by running the Makefile in the "src" directory. IMPORTANT: You might have to change the GPU_PLATFORM_NUMBER in src/bufferkdtree/Makefile before compiling the sources. Other modifications might also be necessary (such as specifying another OpenCL path).

user@gpuserver:~/bufferkdtree$ cd src/
user@gpuserver:~/bufferkdtree/src$ make
cd ../build && rm -f bufferkdtree_* rm -f kdtree_cpu
cd bufferkdtree && make


user@gpuserver:~/bufferkdtree/src$ cd ..


This should generate executable files in the "build" directory:


user@gpuserver:~/bufferkdtree$ ls build/
bufferkdtree_cpu  bufferkdtree_gpu  kdtree_cpu 


which correspond to the following implementations:


  • bufferkdtree_gpu: The GPU-based buffer k-d tree implementation (uses both CPU and GPU)
  • bufferkdtree_cpu: A pure CPU-based buffer k-d tree implementation (does not use the GPU)
  • kdtree_cpu: A standard parallel k-d tree implementation (pointer-less, CPU only).


NOTE: The current implementation dynamically loads OpenCL kernels, which have to be compiled when invoked the first time (these kernels have to be recompiled whenever the problem parameters change, such as the number of nearest neighbors or the dimensionality of the input patterns). For this reason, the path to the OpenCL kernels is absolute (thus, the kernels subdirectory must not be moved/deleted).




To run the small toy example, do:


user@gpuserver:~/bufferkdtree$ cd examples/
user@gpuserver:~/bufferkdtree/examples$ ./astronomy_small
The following parameters will be used:
k=12 (number of nearest neighbors)
h=8 (tree height)
Reading training patterns ...


QUERY RUNTIME: 					0.6910200000
Nearest neighbors have been computed. Writing output to file ...


This bash script invokes both the buffer k-d tree implementation and a parallel (CPU) k-d tree implementation. Similarly, you can run the script astronomy_large, which will automatically download additional data (1GB) prior to running both implementations. An exemplary output obtained on the machine mentioned above can be found here; the data used corresponds to psf_model_mag in the paper mentioned below.


NOTE: The current implementation loads all test queries into the global memory of the GPU. For this reason, executing astronomy_large requires a GPU with a quite large global memory (4GB). In case not enough resources can be allocated, you might get the error "Error with errorcode: -4 in file gpu.c in line 399". One way to reduce this memory overhead is to process the test queries in smaller chunks (I am working on that; the next release will process such chunks automatically).
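The chunking idea is straightforward and can be sketched on the CPU (illustrative only; the function name and chunk size are made up, and a GPU version would upload one chunk to global memory at a time instead of the full query set):

```python
import numpy as np

def knn_in_chunks(reference, queries, k, chunk_size=100_000):
    """Brute-force k-NN that processes `queries` in chunks of `chunk_size`
    so that peak memory is bounded by one chunk's distance matrix."""
    all_idx = []
    for start in range(0, len(queries), chunk_size):
        chunk = queries[start:start + chunk_size]
        # squared Euclidean distances, shape (len(chunk), len(reference))
        d2 = ((chunk[:, None, :] - reference[None, :, :]) ** 2).sum(-1)
        all_idx.append(np.argsort(d2, axis=1)[:, :k])
    return np.vstack(all_idx)

rng = np.random.default_rng(0)
ref = rng.random((200, 5))
qs = rng.random((1000, 5))
idx = knn_in_chunks(ref, qs, k=12, chunk_size=128)  # several small chunks
```

The chunked result is identical to processing all queries at once; only the peak memory footprint changes.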


NOTE: The performance might depend on the particular OpenCL version (and driver). For instance, the results mentioned above were obtained on Ubuntu 12.04 (64 Bit) with kernel 3.8.0-44-generic, CUDA 5.5, and NVIDIA driver 319.23. The performance might be different on the same system using a different setup (e.g., using Ubuntu 14.04 (64 Bit) with kernel 3.13.0-36-generic, CUDA 6.5, and NVIDIA driver 340.29, the performance drops by about 30%).




If you wish to cite a paper that describes the techniques and the implementation for buffer k-d trees, please make use of the following work:


Fabian Gieseke, Justin Heinermann, Cosmin Oancea, and Christian Igel. Buffer k-d Trees: Processing Massive Nearest Neighbor Queries on GPUs. In Proceedings of the 31st International Conference on Machine Learning (ICML) 32(1), 2014, 172-180. [pdf]


The corresponding bibtex entry can be found here.




The source code is published under the GNU General Public License (GPLv3). The software is free for non-commercial use only; other licenses can be obtained upon request. The author is not responsible for any implications that stem from the use of this software.




Parts of this work have been generously supported by the German Academic Exchange Service (DAAD).





  • Stefan Oehmcke (Postdoc)
  • Andrew Chepreghy (MSc student, together with Adriano Agnello)
  • Ties Robroek (MSc student)
  • Konrad Gnoinski (MSc student, together with Maersk)


Collaborations and Projects


  • PI for joint project Drone Borne LiDAR and Artificial Intelligence for Assessing Carbon Storage (supported by Data+), together with Katerina Trepekli and Thomas Friborg (IGN) and Christian Igel (DIKU).
  • PI for joint project Monitoring Changes in Big Satellite Data via Massively-Parallel Artificial Intelligence (supported by the Independent Research Fund Denmark), together with Cosmin Oancea (DIKU), Marcos Vaz Salles (DIKU), IGN (Copenhagen), DMI (Denmark), and Wageningen University (Netherlands).
  • Co-PI for joint project Drone borne LiDAR and Artificial Intelligence for assessing carbon storage (MapCLand, supported by the Villum Foundation), together with Thomas Friborg (PI, University of Copenhagen) and Katerina Trepekli (University of Copenhagen).
  • Co-PI for joint project Monitoring tropical forest recovery capacity using RADAR Sentinel satellite data, together with Jan Verbesselt (PI, Wageningen University) and Wanda De Keersmaecker (Wageningen University).




  • Sizhuo Li (MSc student, together with Martin Brandt)
  • Frederik Faye (MSc student, together with Troels Petersen, 2019)
  • Vinnie Ko (Visiting PhD student, 2018-2019)
  • Svend Lund Breddam (MSc student, together with Cosmin Oancea, 2019)
  • Michael Haahr (BSc student, together with Stefan Oehmcke, 2019)
  • Andreas Borgstad (MSc student, together with Marcos Vaz Salles and Wouter Kouw, 2019)
  • Christoffer Thrysøe (MSc student, together with Marcos Vaz Salles and Wouter Kouw, 2019)
  • Ankit Kariryaa (Visiting PhD student, 2018)
  • Yimian Dai (Visiting PhD student, 2018)
  • Darrick Jian Wen Joo (MSc student, co-supervisor Kim Steenstrup Pedersen, 2018)
  • Rasmus Klitgaard (MSc student, co-supervisor Christian Igel, 2018)
  • Pantelis Kouris (MSc student, co-supervisor Alexander Prishchepov, 2018)
  • Felix Schaumann (MSc student, 2018)
  • Martin Mekota (MSc student, 2018)
  • Sorin Davidoi (MSc student, 2018)
  • Malte von Mehren (MSc student, 2017)
  • Nicholas Indrehus (MSc student, 2017)
  • Rasmus Bygvraa Nilsson (BSc student, 2017)
  • Alexander Steen Andersen (BSc student, 2017)
  • Sebastian Ardahl Skoett (BSc student, 2017)
  • Koen Basten (BSc student, 2016)
  • Jeftha Spunda (BSc student, 2016)
  • Joris van Vugt (BSc student, 2016)
  • Ties Robroek (BSc student, 2015)
  • Christopher Schenk (BSc student, 2014)
  • Jendrik Poloczek (MSc student, 2013, together with Oliver Kramer)
  • Helge Arjangui (MSc student, 2013, together with Oliver Kramer)
  • Justin Heinermann (MSc student, 2013, together with Oliver Kramer)