堆

定义

首先理解什么是堆。堆（英语：Heap）是计算机科学中一类特殊的数据结构的统称。堆通常是一个可以被看做一棵树的数组对象。

n个元素序列{k1,k2...ki...kn},当且仅当满足下列关系时称之为堆：

(ki <= k2i,ki <= k2i+1)或者(ki >= k2i,ki >= k2i+1), (i = 1,2,3,4...n/2)

堆的实现通过构造二叉堆（binary heap），实为二叉树的一种；由于其应用的普遍性，当不加限定时，均指该数据结构的这种实现。

堆（二叉堆）可以视为一棵完全的二叉树，完全二叉树的一个“优秀”的性质是，除了最底层之外，每一层都是满的，这使得堆可以利用数组来表示（普通的一般的二叉树通常用链表作为基本容器表示），每一个结点对应数组中的一个元素。

如下图，是一个堆和数组的相互关系：

对于给定的某个结点的下标 i，可以很容易的计算出这个结点的父结点、孩子结点的下标：

Parent(i) = floor(i/2)，i 的父节点下标
Left(i) = 2i，i 的左子节点下标
Right(i) = 2i + 1，i 的右子节点下标

二叉堆一般分为两种：最大堆和最小堆。

最大堆

最大堆中的最大元素值出现在根结点（堆顶）
堆中每个父节点的元素值都大于等于其孩子结点（如果存在）

最小堆

最小堆中的最小元素值出现在根结点（堆顶）
堆中每个父节点的元素值都小于等于其孩子结点（如果存在）

堆排序

原理

堆排序（Heapsort）是指利用堆这种数据结构所设计的一种排序算法。

通常堆是通过一维数组来实现的。在数组起始位置为0的情形中：

父节点i的左子节点在位置(2*i+1);
父节点i的右子节点在位置(2*i+2);
子节点i的父节点在位置floor((i-1)/2);

在堆的数据结构中，堆中的最大值总是位于根节点。堆中定义以下几种操作：

最大堆调整（Max_Heapify）：将堆的末端子节点作调整，使得子节点永远小于父节点
创建最大堆（Build_Max_Heap）：将堆所有数据重新排序
堆排序（HeapSort）：移除位在第一个数据的根节点，并做最大堆调整的递归运算

最大堆调整（MAX‐HEAPIFY）

作用是保持最大堆的性质，是创建最大堆的核心子程序，作用过程如图所示：

由于一次调整后，堆仍然违反堆性质，所以需要递归的测试，使得整个堆都满足堆性质。--递归代码maxHeapifyByRecursion部分。

递归调用需要压栈/清栈，和迭代相比，性能上有略微的劣势。--迭代代码maxHeapifyByIteration部分。

创建最大堆（Build-Max-Heap）

是将一个数组改造成一个最大堆，自下而上的调用 Max-Heapify 来改造数组，建立最大堆。因为 Max-Heapify 能够保证下标 i 的结点之后结点都满足最大堆的性质，所以自下而上的调用 Max-Heapify 能够在改造过程中保持这一性质。如果最大堆的数量元素是 n，那么 Build-Max-Heap 从 Parent(n) 开始，往上依次调用 Max-Heapify。流程如下：

堆排序（HeapSort）

是堆排序的接口算法，Heap-Sort先调用Build-Max-Heap将数组改造为最大堆，然后将堆顶和堆底元素交换，之后将底部上升，最后重新调用Max-Heapify保持最大堆性质。由于堆顶元素必然是堆中最大的元素，所以一次操作之后，堆中存在的最大元素被分离出堆，重复n-1次之后，数组排列完毕。整个流程如下：

代码

https://github.com/tclxspy/Articles/blob/master/algorithm/Code/HeapSort.java

public class HeapSort {    

    /**
     * the implement method of HeapSort
     * @param data which prepare for sorting
     * @return the array after sorted
     */
    public static int[] heapSort(int[] data) {    
        //from the end index, build the Max-Heapify        
        int startIndex = getParentIndex(data.length - 1);
        for(int i = startIndex; i >= 0; i--) {
            maxHeapifyByIteration(data, data.length, i);
            //or
            //maxHeapifyByIteration(data, data.length, i);
        }

        //swap the head and end. then maintain the Max-Heapify
        for (int i = data.length - 1; i > 0; i--) {
            swap(data, 0, i);
            maxHeapifyByIteration(data, i, 0);
            //or
            //maxHeapifyByIteration(data, i, 0);
        }
        return data;
    }
    /**
     * build the Max-Heapify by recursion method
     * @param data input data
     * @param heapSize length of data
     * @param index from the end index, build the Max-Heapify
     */
    private static void maxHeapifyByRecursion(int[] data, int heapSize, int index) {        
        //get the left and right index
        int left = getChildLeftIndex(index);
        int right = getChildRightIndex(index);

        //get the max data's index
        int largest = index;
        if(left < heapSize && data[index] < data[left]) {
            largest = left;
        }

        if(right < heapSize && data[largest] < data[right]) {
            largest = right;
        }

        //swap the max data to parent node. and then maintain the Max-Heapify again
        if(largest != index) {
            swap(data, largest, index);
            maxHeapifyByRecursion(data, heapSize, largest);
        }        
    }

    /**
     * build the Max-Heapify by iteration method
     * @param data input data
     * @param heapSize length of data
     * @param index from the end index, build the Max-Heapify
     */
    private static void maxHeapifyByIteration(int[] data, int heapSize, int index) {    
        while(true) {
            //get the left and right index
            int left = getChildLeftIndex(index);
            int right = getChildRightIndex(index);

            //get the max data's index
            int largest = index;
            if(left < heapSize && data[index] < data[left]) {
                largest = left;
            }

            if(right < heapSize && data[largest] < data[right]) {
                largest = right;
            }

            //swap the max data to parent node. and then maintain the Max-Heapify again
            if(largest != index) {
                swap(data, largest, index);
                index = largest;
            }    
            else {
                break;
            }
        }
    }

    /**
     * swap data[i] and data[j]
     * @param data
     * @param i
     * @param j
     */
    private static void swap(int[] data, int i, int j) {
        int temp = data[i];
        data[i] = data[j];
        data[j] = temp;
    }

    private static int getParentIndex(int current) {
        return (current - 1) / 2;
    }

    private static int getChildLeftIndex(int current) {
        return 2 * current + 1;
    }

    private static int getChildRightIndex(int current) {
        return 2 * (current + 1);
    }    

    public static void main(String[] args) {    
        int[] sort = new int[]{1, 0, 10, 20, 3, 5, 6, 4, 9, 8, 12, 17, 34, 11};        
        int[] result = heapSort(sort);
        for (int i = 0; i < result.length; i++) {
            System.out.print(result[i] + " ");
        }
    }
}

输出

0 1 3 4 5 6 8 9 10 11 12 17 20 34

复杂度

最差时间复杂度 О(nlogn)

最优时间复杂度 О(nlogn)

平均时间复杂度 О(nlogn)

最差空间复杂度总共О(n), 需要辅助空间O(1)

函数maxHeapify将指定子树的根节点"下沉"到合适的位置, 最终子树变成最大堆，该过程最坏时间复杂度为O(logn)。因此总共时间复杂度为 O(nlogn) 。

堆排序算法的空间复杂度是O(1)，从实现上很容易看出来，也叫原地堆排序。

动态图：

附

堆排序在排序复杂性的研究中有着重要的地位，因为它是我们所知的唯一能够同时最优地利用空间和时间的方法--在最坏的情况下它也能保证使用~2nlogn次比较和恒定的额外空间。当空间十分紧张的时候（例如在嵌入式系统或低成本的移动设备中）它很流行。

平均时间上，堆排序的时间常数比快排要大一些，因此通常会慢一些，但是堆排序最差时间也是O(nlogn)的，这点比快排好。

快排在递归进行部分的排序的时候，只会访问局部的数据，因此缓存能够更大概率的命中；而堆排序的建堆过程是整个数组各个位置都访问到的，后面则是所有未排序数据各个位置都可能访问到的，所以不利于缓存发挥作用。简答的说就是快排的存取模型的**局部性（locality）**更强，堆排序差一些。

速度和缓存的问题都反映了堆排序让数据过于大距离的移动，你观察某个元素在整个排序过程中的移动过程，会发现它是前后大幅度的跑动；而快速排序则是尽快的移动到最终的位置，然后做小范围的跳动。

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

算法#10--用简单的思维理解堆排序.md

算法#10--用简单的思维理解堆排序.md

堆

定义

最大堆

最小堆

堆排序

原理

最大堆调整（MAX‐HEAPIFY）

创建最大堆（Build-Max-Heap）

堆排序（HeapSort）

代码

输出

复杂度

附

Files

算法#10--用简单的思维理解堆排序.md

Latest commit

History

算法#10--用简单的思维理解堆排序.md

File metadata and controls

堆

定义

最大堆

最小堆

堆排序

原理

最大堆调整（MAX‐HEAPIFY）

创建最大堆（Build-Max-Heap）

堆排序（HeapSort）

代码

输出

复杂度

附