The DeepChip Project - Deep Learning for Resource-Constrained Systems


DeepChip focuses on deep learning techniques for resource-constrained systems. Many processes require the evaluation of complex numerical functions close to the machine or structure of interest, either to avoid the cost of data transfer or to enable short reaction times. Although the computing performance of embedded platforms is increasing, it is often significantly lower than what state-of-the-art algorithms require. With the advent of Deep Neural Networks (DNNs), the achievable classification performance has been pushed to new levels. Their high cost of execution, however, renders them unusable for many real-world applications. A possible approach is the use of hybrid processors (ARM+FPGA or similar), but this raises the question of how to auto-generate optimized DNN classifier implementations. In the DeepChip project, we tackle this problem by optimizing deep models in terms of sparsity, asynchrony and reduced precision, and by extending machine learning languages with a hybrid back-end that is responsible for code generation, automated partitioning and integration.

The DeepChip Project is a FWF/DFG co-funded D-A-CH project, run by Graz University of Technology and Ruprecht-Karls University of Heidelberg. While the first partner contributes expertise and experience in machine learning and its applications, the second partner has a strong background in application-specific computing systems of various scales. Within the DeepChip project, the partners jointly pursue the objective of building a productive and easy-to-use tool chain for designing custom hardware for deep learning, thereby helping to bring advanced machine learning techniques and principles to tiny embedded devices such as mobile chips, Internet of Things devices and more.

For questions or comments, please contact: Holger Fröning, holger.froening (at)

Workshop mini-series on Embedded Machine Learning (WEML)

We frequently host rather informal workshops that bring together experts and other interested people working on machine learning, particularly deep learning, for embedded or other resource-constrained systems. More information about recent editions can be found here:

Note that the next edition is postponed until the COVID situation allows better planning of an in-person event.

Workshop mini-series on IoT, Edge, and Mobile for Embedded Machine Learning (ITEM)

While WEML is a rather informal gathering with no proceedings or similar, ITEM is its academic counterpart, usually co-located with ECML-PKDD, a premier European forum on ML. For more information about recent and upcoming editions, please visit the workshop website. The next edition is scheduled for September 2021.


The DeepChip project is co-run by the

Current people:

  • Franz Pernkopf, Co-PI, Graz University of Technology, Austria

  • Holger Fröning, Co-PI, Heidelberg University, Germany

  • Günther Schindler, Heidelberg University, Germany

  • Wolfgang Roth, Graz University of Technology, Austria

Associated partners

  • Manfred Mücke, Materials Center Leoben, Austria

Former people:

  • Matthias Zöhrer, Graz University of Technology, Austria


Günther Schindler, Compressing and Mapping Deep Neural Networks on Edge Computing Systems, Dissertation, Heidelberg University, July 19, 2021. [doi]

[ICPR2021] Wolfgang Roth, Günther Schindler, Holger Fröning, and Franz Pernkopf, On Resource-Efficient Bayesian Network Classifiers and Deep Neural Networks, 25th International Conference on Pattern Recognition (ICPR2021), Milan, Italy, Jan. 2021.

[LOD2020] Günther Schindler, Wolfgang Roth, Franz Pernkopf and Holger Fröning, Parameterized Structured Pruning for Deep Neural Networks, 6th International Conference on Machine Learning, Optimization, and Data Science (LOD 2020), July 19-23, 2020, Siena, Italy. Best paper finalist.

[ICASSP2020] Markus Huber, Günther Schindler, Wolfgang Roth, Holger Fröning, Christian Schörkhuber, Franz Pernkopf, Towards Real-Time Single-Channel Single-Voice Separation with Pruned Multi-Scale DenseNets, 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, May 4-8, 2020.

[ARXIV] Wolfgang Roth, Günther Schindler, Matthias Zöhrer, Lukas Pfeifenberger, Robert Peharz, Sebastian Tschiatschek, Holger Fröning, Franz Pernkopf, Zoubin Ghahramani, Resource-Efficient Neural Networks for Embedded Systems. ArXiv:2001.03048 [stat.ML], Jan. 2020.

[ECML2019] Wolfgang Roth, Günther Schindler, Holger Fröning, Franz Pernkopf, Training Discrete-Valued Neural Networks with Sign Activations Using Weight Distributions, European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2019), Sept. 16-20, Würzburg, Germany. (acceptance rate: 17.7%, 130/734)

[CoRR.cs] Günther Schindler, Wolfgang Roth, Franz Pernkopf, Holger Fröning, Parameterized Structured Pruning for Deep Neural Networks, arXiv:1906.05180 [CoRR.cs], June 2019.

[HiPEAC2019EDLA] Christoph Gratl, Manfred Mücke, Günther Schindler and Holger Fröning, Towards efficient mapping of BNNs onto embedded targets using Tensorflow/XLA, 1st Workshop on Emerging Deep Learning Accelerators (EDLA), co-located with the HiPEAC 2019 Conference, January 21-23, 2019, Valencia, Spain.

[CoRR.cs] Franz Pernkopf, Wolfgang Roth, Matthias Zoehrer, Lukas Pfeifenberger, Günther Schindler, Holger Fröning, Sebastian Tschiatschek, Robert Peharz, Matthew Mattina, Zoubin Ghahramani, Efficient and Robust Machine Learning for Real-World Systems, arXiv:1812.02240 [CoRR.cs], December 2018.

[ECML2018] Günther Schindler, Matthias Zöhrer, Franz Pernkopf, and Holger Fröning, Towards Efficient Forward Propagation on Resource-Constrained Systems, European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2018), Sept 10-14, Dublin, Ireland. (acceptance rate: 26%, 92/354)

[ICASSP2018] Matthias Zöhrer, Lukas Pfeifenberger, Günther Schindler, Holger Fröning, and Franz Pernkopf, Resource Efficient Deep Eigenvector Beamforming, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 15–20 April 2018, Calgary, Alberta, Canada.

[UCHPC2017] Günther Schindler, Manfred Mücke, Holger Fröning, Linking Application Description with Efficient SIMD Code Generation for Low-Precision Signed-Integer GEMM, 10th Workshop on UnConventional High Performance Computing 2017 (UCHPC 2017), in conjunction with EuroPAR 2017, August 28/29, 2017, Santiago de Compostela, Spain.


Quantization for efficient DL inference on general-purpose processors

While most related work on quantization reports reduced test accuracy, in this work we show that quantization on general-purpose processors is possible without any loss in accuracy. Please refer to the ECML2018 paper for detailed coverage and discussion of the concept. The reproducibility repository can be found at:

Custom precision operators for ARM processors

In this work we explore the potential of ARM embedded processors for custom-precision linear algebra operations. We extend the Eigen BLAS library to support different quantization levels, ranging from 1 bit to 32 bits, and demonstrate how performance scales as precision decreases. For more information, see [UCHPC2017].
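As a rough illustration of what a low-precision signed-integer GEMM computes, the following Python sketch quantizes two floating-point matrices to signed integers of a chosen bit width, multiplies them with exact integer accumulation, and rescales the result. It is not the Eigen-based implementation from [UCHPC2017] and uses no SIMD code. The function names (`quantize`, `int_gemm`, `quantized_gemm`) and the symmetric per-matrix scaling scheme are assumptions made for this example, and the sketch only covers bit widths of 2 and above.

```python
def quantize(matrix, bits):
    """Symmetric uniform quantization of a float matrix to signed integers.

    Returns (q, scale) such that matrix is approximately scale * q.
    Assumption: one scale per matrix, derived from its largest magnitude.
    """
    if bits < 2:
        raise ValueError("this sketch covers signed integers of 2 bits or more")
    qmax = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits
    max_abs = max(abs(v) for row in matrix for v in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the representable range.
    q = [[max(-qmax, min(qmax, round(v / scale))) for v in row]
         for row in matrix]
    return q, scale


def int_gemm(a, b):
    """Integer matrix product C = A * B with exact integer accumulation."""
    inner = len(b)
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def quantized_gemm(a, b, bits):
    """Approximate the float product A * B using low-precision integers."""
    qa, scale_a = quantize(a, bits)
    qb, scale_b = quantize(b, bits)
    acc = int_gemm(qa, qb)
    # Undo both quantization scales to return to the float domain.
    return [[scale_a * scale_b * v for v in row] for row in acc]


if __name__ == "__main__":
    a = [[0.5, -1.0], [0.25, 0.75]]
    b = [[1.0, 0.0], [-0.5, 2.0]]
    for bits in (4, 8, 16):
        print(bits, quantized_gemm(a, b, bits))
```

Running the script prints the approximate product for several bit widths, which makes the trade-off between precision and approximation error visible. On real hardware the benefit of narrower integers comes from packing more operands into each SIMD register, which this scalar sketch does not model.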


We gratefully acknowledge the funding we receive from the Austrian FWF and the German DFG.