# Chapter 1 Introduction

## 1.1 Document-Structure

- in the
*Introduction*I explain my*Motivation&Scope*, the*greeNsort*^{®}*Perspective on Innovation*and its*Innovation-Taxonomy* - in the section
*Paradigmatic innovations*I describe the fundamental*greeNsort*^{®}innovations before diving into particular algorithms - in the section
*Methods&Measurement*I explain the employed techniques for measuring energy and other KPIs as well as the analytic methods used - in the section
*Quicksort Algorithms*I describe the*The Quicksort-Dilemma*, its*greeNsort*^{®}solution and explain consequences for partial sorting. A more detailed description including code-examples is in the Quicksort draft chapter of the*greeNsort*^{®}book. - the
*Split&Merge Algorithms*section is the main part of*greeNsort*^{®}: prior art reference algorithms and the*The Mergesort-Dilemma*are explained, then more memory parsimonious*greeNsort*^{®}algorithms are developed step-by-step and variations of the algorithms are explained. - the
*Partition&Pool Algorithms*section briefly transfers the learning of the*Quicksort Algorithms*and*Split&Merge Algorithms*sections to stable sorting via partitioning. - the
*Parallel sorting*section explains and analyzes parallel versions of some prior art and*greeNsort*^{®}algorithms - the
*Incremental sorting*section explores implications for sorting a stream of data INSERTs and data structures which allow DELETEs and queries - the
*Impact*section gives an outlook for the impact of the*greeNsort*^{®}innovations in various areas - the
*Conclusion&Outlook*section gives a summary of the report - the
*Author&Project*section explains who am I and how to reach me - the final
*Terms&Tables*sections list important classes of terms (*Definition&Convictions*,*Bold techniques*,*Algorithms*,*Extended&Further*), and then*Abbreviations*,*Glossary*and*Bibliography*.

## 1.2 Motivation&Scope

Sorting is a fundamental operation in IT, in my eyes even more fundamental than hashing, see the empirical tests in Oehlschlägel and Silvestri (2012). Sorting emits CO2 by consuming electricity during operation and by requiring certain hardware that embodies CO2. The idea of *greeNsort*^{®} is to bring CO2 savings to thousands of pieces of software and to billions of devices without a need to re-write software or a need to produce new hardware. The idea of *greeNsort*^{®} is to reduce the CO2 during sorting and to reduce the required hardware, and to develop better sorting-algorithms that are so simple, that they can easily be implemented and replace inefficient algorithms that today are found in many central libraries of popular programming languages.

The core scope of *greeNsort*^{®} is *simple general binary sorting in contiguous space with Divide&Conquer algorithms that can be described by recursion*. Why *general*? Because I don’t want to optimize just a special case, there are already some very efficient sorting-algorithms for special cases, for example radix-sorting algorithms. Why *simple*? Because simplicity enables quick low-cost implementation. Why *binary*? Because binary is simple, teaching sorting starts with binary and ends with k-ary; there are already some very efficient k-ary sorting-algorithms, but those are more complicated and pay-off only on hardware with particular properties. Why *contiguous space*? Because contiguous space is simple and algorithms operating in contiguous space can be designed to have good cache-properties; algorithms that operate in block-managed-memory^{1} can reduce hardware-memory requirements, but they are more complex and introduce random-access on block-level. Why *Divide&Conquer*? Because Divide&Conquer guarantees that the costs of sorting are limited at \(\mathcal{O}(log{N} \cdot f(N))\), for example \(\mathcal{O}(log{N} \cdot N)\) in the random-access-model. Why *by recursion*? Because recursion is an important *mental method* for developing algorithms, teaching algorithms, understanding algorithms and analyzing algorithms^{2}.

Having said this, out-of-scope are for example *in-place-merging* such as *Grailsort* (complicated and slow), algorithms such as *Funnelsort* that leverage the continuous van-Emde-Boas-Layout (k-ary, complicated and not practical for *N* that is not a power of two), *Lazy-Funnelsort* (k-ary, not contiguous) or *(I)n-place (P)arallel (S)uper (S)calar (S)ample(So)rt (IPS4o)* (k-ary, complicated, not contiguous, not completely general, not yet stable). All of these have been academically successful, none of these has found widespread adoption in core-libraries. *greeNsort*^{®} does not claim to beat these algorithms in their domain, although I expect that joining forces could create synergies, for example result in a stable IPS4o with still excellent CO2-properties.

## 1.3 Perspective on Innovation

*greeNsort*^{®} is a *web of interrelated intangible innovations* with the intention to *forever alter the way how sorting is taught and used*. The changes are fundamental and deconstruct many prior art beliefs that stand since 1945 resp. 1961, hence from an academic perspective, it could be appropriate to classify *greeNsort*^{®} as a *Paradigm Change* in the sense of Kuhn (1962). But could it be a *Leapfrog Innovation*? I believe *greeNsort*^{®} is a leapfrog innovation, in spite of three aspects of leapfrog innovations that I believe are myths about leapfrog innovations. Let’s begin with common ground: *Leapfrog innovations are innovations that change the world*.

### 1.3.1 Are leapfrog innovations tangible products?

A common myth about leapfrog innovations is that they are *tangible products*. For example the list of published in the Atlantic *The 50 Greatest Breakthroughs Since the Wheel* mentions almost exclusively tangible innovations (45), with the exception of three borderline cases (*paper money*, *electricity* and *the internet*) and two clearly intangible innovations (*alphabetization*, *the Gregorian calendar*). Here is a counterexample missing from the list without the world today would not be possible: *the logarithm*, I could list hundreds of such intangible innovations that shaped the world as we know it today. Tangible innovations are just the visible part of the iceberg, they are born out of an interrelated web of intangible innovations that made tangible innovations possible. Leapfrog innovations often are intangible ideas that change the world.

### 1.3.2 Do leapfrog innovations create profitable markets?

The next myth about leapfrog innovations is that they create *profitable markets* and that they make private inventors rich. Here is a counterexample that that changed peoples lives towards more wealth: the invention of the *three-field system* did not create a market or made its inventor rich. The knowledge about three-field system was simply a *public good*. Leapfrog innovations, particularly intangible leapfrog innovations often create public goods and their inventors remain poor. Leapfrog innovations often have no business model and require upfront sponsoring or meritocratic rewards afterwards.

### 1.3.3 Are leapfrog innovations monolithic?

The last myth about leapfrog innovations is that they are single impressive innovations like a flashing idea of a genius. Here is a counterexample: the iPhone, not technically new, was rather a *design driven, symbolic innovation*, see Steffen (2010). Leapfrog innovations can result from a web of interrelated smaller innovations, where each of the smaller innovations does not look convincing, but in combination they change the rules of the game. As the iPhone did. And as *greeNsort*^{®} can do, if supported.

### 1.3.4 The lackmus test

In 1998 John Chambers from Bell Labs was awarded the *ACM Software System Award* for *The S system, which has forever altered how people analyze, visualize, and manipulate data*. After the commercial *S+* implementation of *S* we got the free open-source *R* interpreter, we got libraries that allow S-style statistical analysis in *Python*, and we got *Julia*, a language that tries to combine the best of many worlds (*C++*, *R*, *Python*, *Matlab*). Except for SQL, virtually anyone today doing statistical analyses is using one of *R*, *Python*, *Julia*^{3}. I consider the S system a leapfrog innovation, although it is *intangible*, largely *free and open-source* and not a single monolithic invention. If you agree on this, you might well consider *greeNsort*^{®} a leapfrog innovation (in line with its logo): *greeNsort*^{®} is intangible, designed for the public good and is a composed web of interrelated innovations, that together enable more efficient, more sustainable algorithms.

## 1.4 Innovation-Taxonomy

In order to guide the reader through the *greeNsort*^{®} web of innovations, I use the following symbols to classify the innovations:

Type | Symbol | Level | Count |
---|---|---|---|

paradigmatic change | (*) | implications beyond sorting | 21 |

paradigmatic change | D | Definitions | 1 |

paradigmatic change | C | Convictions (Core beliefs) | 12 |

patentable methods | B | Bold techniques | 20 |

patentable methods | A | Algorithms | 21 |

patentable methods | E | Extended algorithms | 9 |

patentable methods | F | Further algorithms | 8 |

prior art | T | useful Techniques | 12 |

Each section of this report that describe innovations is preceded by formal entries of those innovations and their dependencies to prior innovation or prior art. For the sake of simplicity - although the classification allows “Extended algorithms” - I have refrained from adding tuned versions of the algorithms to the dependency net. The complete web of innovations can be visualized as a *(D)irected (A)cyclic (G)raph (DAG)*:

The *greeNsort*^{®} innovations will be introduced step-by-step in the following sections.

although with

*Jumpsort*and*Walksort**greeNsort*^{®}has two particularly efficient algorithms that use random access to blocks of data (not strictly contiguous)↩︎notwithstanding the fact that each recursive algorithms can be re-written in loops↩︎

Often using

*Jupyter notebooks*: Project Jupyter’s name is a reference to the three core programming languages supported by Jupyter, which are Julia, Python and R, and also a homage to Galileo’s notebooks recording the discovery of the moons of Jupiter↩︎