Sorting algorithms should be simple, robust and sustainable.
Simplicity is prerequisite for reliability ― Edsger W. Dijkstra
In the post-Moore era, it will be essential for algorithm designers and hardware architects to work together to find simple abstractions that designers can understand and that architects can implement efficiently ― Leiserson et. al1
Sorting algorithms must be simple for reliable industrial grade use. Complexity hinders widespread use and carries risks. Examples of this are the following:
Years after rolling out Quicksort in the C standard-library it turned out that the implementation was vulnerable for algorithmic complexity attacks. For application users and cloud providers it is no fun, when certain data input can turn the log-linear \(\mathcal{O}(N\log{N})\) runtime into quadratic \(\mathcal{O}(N^2)\) runtime. The reason for the vulnerability was a premature optimization that gained speed by using deterministic median pivots instead of the random pivots originally recommended by Hoare (the inventor of Quicksort). As of 2020 there is now a formally verified implementation of a modern Quicksort2.
Years after rolling out Timsort in Python and Java it turned out that the implementation did sort wrong in certain edge-cases.3 Timsort is an engineering masterpiece but was simply too complicated for complete quality assurance.
Years after publication Funnelsort adoption is negligible because Funnelsort is difficult to understand, difficult to implement, and simplifications (Lazy Funnelsort) sacrifice the cache-friendly memory-layout.
It is time to develop the discipline of resilient algorithms ― Moshe Y. Vardi4
Sorting algorithms must function robustly under divers conditions. Simple algorithms also tend to be more robust. Tuning for specific conditions not only makes algorithms more complex, the success of some tuning techniques also strongly depends on CPU and compiler version, notoriously for tuning against branch-misprediction. greeNsort® algorithms require less tuning than other algorithms. Even without tuning greeNsort® algorithms are very adaptive to many diverse data inputs, and some algorithms even reduce branch-misprediction. Still, greeNsort® algorithms may be tuned for specific preferences. The covid-19 pandemic increased appreciation for short local supply chains: greeNsort® algorithms prefer local memory access over distant memory access, where possible. Finally, greeNsort® algorithms allow the resilient execution under difficult conditions: some are designed for economic execution with negligible buffer memory, others allow to cope with temporary shortage of memory when calling them.
What can software professionals do? Software can be designed and tuned for efficiency and memory size, enabling client devices to remain viable for over eight years. Software upgrades should have as a design goal to avoid driving client hardware obsolescence. ― Andrew A. Chien5
We must redesign cloud software and hardware to flexibly follow renewable energy. For cloud computing, the majority of carbon emissions arise from power consumed during operation (80% for typical four-year use). But embodied carbon for hardware and data center infrastructure (scope 3) cannot be ignored. ― Andrew A. Chien6
The footprint of software is composed of fixed investment cost (hardware production) and variable runtime cost (energy consumption). However, academic research has almost exclusively compared sorting algorithms with regard to variable cost (typically measuring runtime or counts of certain operations such as moves and comparisons). Ignoring the fixed costs has two major drawbacks:
We introduce the sorting footprint7 for measuring and comparing algorithms with different memory needs: greeNsort® algorithms achieve better runtime with less hardware.
The most sustainable way is to not make things. The second most sustainable way is to make something very useful, to solve a problem that hasn’t been solved. ― Thomas Sigsgaard
greeNsort® algorithms are …
There’s a way to do it better. Find it. ― Thomas Edison
There’s plenty of room at the Top. What will drive computer performance after Moore’s law?↩︎
Peter Lammich 2020 Efficient Verified Implementation of Introsort and Pdqsort↩︎
Gow et. al. (2015) “OpenJDK’s Java.utils.Collection.sort() Is Broken: The Good, the Bad and the Worst Case”↩︎
Efficiency vs. Resilience: What COVID-19 Teaches Computing, COMMUNICATIONS OF THE ACM, 05/2020↩︎
What Do DDT and Computing Have in Common?, COMMUNICATIONS OF THE ACM, 06/2020↩︎
Driving the Cloud to True Zero Carbon, COMMUNICATIONS OF THE ACM, 02/2021↩︎
The footprint measure integrates fixed and variable costs by integrating resources such as CPU and RAM over \(runtime\) and the \(footprint\) measure allows to compare algorithms with different \(memory\) requirements. For single-threaded sorting the \(footprint\) is just \(\int_{t=0}^{runtime} memory_t \; dt\), for algorithms using constant \(memory\) this simplifies to \(runtime \cdot memory\), where \(energy\) is measured instead of \(runtime\), the \(efootprint\) is calculated as \(energy \cdot memory\)↩︎
Copyright © 2010 - 2024 Dr. Jens Oehlschlägel - All rights reserved - Terms - Privacy - Impressum