Merge algorithm

From Infogalactic: the planetary knowledge core
Jump to: navigation, search

Merge algorithms are a family of algorithms that take multiple sorted lists as input and produce a single list as output, containing all the elements of the inputs lists in sorted order. These algorithms are used as subroutines in various sorting algorithms, most famously merge sort.

Usages

The merge algorithm plays a critical role in the merge sort algorithm, a comparison-based sorting algorithm. Conceptually, merge sort algorithm consists of two steps:

  1. Recursively divide the list into sublists of (roughly) equal length, until each sublist containing only one element. A list containing a single element is, by definition, sorted.
  2. Repeatedly merge sublists to create a new sorted sublist until the single list contains all elements. The single list is the sorted list.

The merge algorithm is used repeatedly in the merge sort algorithm.

File:Merge sort algorithm diagram.svg
An example for merge sort

An example for merge sort is given above. We are given a unsorted array of integers. The given array has 7 elements. We then divide the array into 7 partitions, with each partition containing only 1element. Then we repeated merge partitioned units to produce new sublists until only 1 sublist remains.

Merging two lists

Merging two sorted lists into one can be done in linear time and linear space. The following pseudocode demonstrates an algorithm that merges input lists (either linked lists or arrays) A and B into a new list C.[1][2]:{{{3}}}:104 The function head yields the first element of a list; "dropping" an element means removing it from its list, typically by incrementing a pointer or index.

algorithm merge(A, B) is
    inputs A, B : list
    returns list

    C := new empty list
    while A is not empty and B is not empty do
        if head(A) ≤ head(B) then
            append head(A) to C
            drop the head of A
        else
            append head(B) to C
            drop the head of B

    // By now, either A or B is empty. It remains to empty the other input list.
    while A is not empty do
        append head(A) to C
        drop the head of A
    while B is not empty do
        append head(B) to C
        drop the head of B

    return C

When the inputs are linked lists, this algorithm can be implemented to use only a constant amount of working space; the pointers in the lists' nodes can be reused for bookkeeping and for constructing the final merged list.

In the merge sort algorithm, this subroutine is typically used to merge two sub-arrays A[lo..mid], A[mid..hi] of a single array A. This can be done by copying the sub-arrays into a temporary array, then applying the merge algorithm above.[1]:{{{3}}} The allocation of a temporary array can be avoided, but at the expense of speed and programming ease. Various in-place merge algorithms have been devised,[3] sometimes sacrificing the linear-time bound to produce an O(n log n) algorithm;[4] see Merge sort § Variants for discussion.

K-way merging

k-way merging generalizes binary merging to an arbitrary number k of sorted input lists. Applications of k-way merging arise various sorting algorithms, including patience sorting[5] and an external sorting algorithm that divides its input into k = <templatestyles src="Sfrac/styles.css" />1/M − 1 blocks that fit in memory, sorts these one by one, then merges these blocks.[2]:{{{3}}}:119–120

Several solutions to this problem exist. A naive solution is to do a loop over the k lists to pick off the minimum element each time, and repeat this loop until all lists are empty:

  • Input: a list of k lists.
  • While any of the lists is non-empty:
    • Loop over the lists to find the one with the minimum first element.
    • Output the minimum element and remove it from its list.

In the worst case, this algorithm performs (k−1)(n−<templatestyles src="Sfrac/styles.css" />k/2) element comparisons to perform its work if there are a total of n elements in the lists in total.[6] It can be improved by storing the lists in a priority queue (min-heap) keyed by their first element:

  • Build a min-heap h of the k lists, using the first element as the key.
  • While any of the lists is non-empty:
    • Let i = find-min(h).
    • Output the first element of list i and remove it from its list.
    • Re-heapify h.

Searching for the next smallest element to be output (find-min) and restoring heap order can now be done in O(log k) time (more specifically, 2⌊log k comparisons[6]:{{{3}}}), and the full problem can be solved in O(n log k) time (approximately 2n⌊log k comparisons).[6]:{{{3}}}[2]:119–120

A third algorithm for the problem is a divide and conquer solution that builds on the binary merge algorithm:

  • If k = 1, output the single input list.
  • If k = 2, perform a binary merge.
  • Else, recursively merge the first k/2⌋ lists and the final k/2⌉ lists, then binary merge these.

When the input lists to this algorithm are ordered by length, shortest first, it requires fewer than n⌈log k comparisons, i.e., less than half the number used by the heap-based algorithm; in practice, it may be about as fast or slow as the heap-based algorithm.[6]:{{{3}}}

Parallel merge

Parallel programming can accelerate Merge algorithms. This section will show you how to transform a serial merge algorithm that can run only on one process into an algorithm that can run on multiple processes or even multiple machines. Take a simple serial merge algorithm that merges two sorted list as an example:[7]

void merge_int(int * a_start, int* a_end, int* b_start, int* b_end, int* dst )
{
    while( a_start < a_end && b_start < b_end ) {
        if ( *a_start <= *b_start )  *dst++ = *a_start++; 
        else                        *dst++ = *b_start++;
    }
    while( a_start < a_end ) *dst++ = *a_start++;
    while( b_start < b_end ) *dst++ = *b_start++;
}

a_start, a_end, b_start, b_end are the pointers that specify the start and end of the list a and b, respectively. dst is the location of the output.

In the first while loop, the first if statement compares the first element from list a and the first element of list b. The statement writes whichever is smaller to the output list dst. The loop terminates when it reaches the end of either list a or list b. The second and third while loops output any remaining elements from the source lists. The complexity of this merge algorithm is O(n), where n is the number of output elements or the number of source elements combined.

Notice that this procedure contains a number of loops that covers the entire sequence, which means that it cannot be broken up into smaller tasks and distributed to multiple processes in an obvious fashion. On the other hand, divide-and-conquer algorithms usually can be parallelized easily. Let’s modify our serial merge algorithm into a divide-and-conquer merge algorithm.[7]

void merge_int_dac(int * src, int a1, int a2, int b1, int b2, int* dst, int r)
{
	int l1 = a2 – a1 + 1;
	int l2 = b2 – b1 + 1;
	if (l1 < l2)
	{
		int t = b1; b1 = a1; b1 = t;
		t = b2; b2 = a2; a2 = t;
		t = l2; l2 = l1; l1 = t;
	}
	if (l1 == 0) return;
	int q1 = (b1 + b2)/2;
	for (int i = b1; i < b2; ++i) 
	{
		if (src[i] == src[q1]){q2 = i; break;}
	}
	int q3 = r + (q1 – a1) + (q2 – b1);
	a[q3] = src[q1];
	merge_int_dac(src, a1, q1 – 1, b2, q2 – 1, dst, r);
	merge_int_dac(src, q1 + 1, a2, q2, b2, dst, q3 + 1);
 
}

Transforming the divide-and-conquer version of the serial merge algorithm turns out to be very simple using the Intel Threading Building Blocks (TBB) along with the C++ lambda function. Following is the parallel programming modification algorithm that uses TBB’s parallel_invoke() function, which invokes the two functions in the arguments simultaneously. parallel_invoke() can invoke up to ten functions.

void merge_int_dac_mp(int * src, int a1, int a2, int b1, int b2, int* dst, int r)
{
	int l1 = a2 – a1 + 1;
	int l2 = b2 – b1 + 1;
	if (l1 < l2)
	{
		int t = b1; b1 = a1; b1 = t;
		t = b2; b2 = a2; a2 = t;
		t = l2; l2 = l1; l1 = t;
	}
	if (l1 == 0) return;
	int q1 = (b1 + b2)/2;
	for (int i = b1; i < b2; ++i) 
	{
		if (src[i] == src[q1]){q2 = i; break;}
	}
	int q3 = r + (q1 – a1) + (q2 – b1);
	a[q3] = src[q1];
	tbb::parallel_invoke(
		[&] {merge_int_dac_mp(src, a1, q1 – 1, b2, q2 – 1, dst, r);}
		[&] {merge_int_dac_mp(src, q1 + 1, a2, q2, b2, dst, q3 + 1);}
	);
 
}

The function run each of the two subtasks (merge_int_dac_mp) on a separate process and the child processes/threads subsequently distribute the further divided tasks into more processes. Notice that this naïve implementation of parallel merge actually runs a lot slower than the single process version because of the overhead of fine-grained spawning of processes and threads. For example, the program will spawn two processes for merging two lists with one element in each list, which clearly takes more time in spawning processes than actually performing the merging. Further improvement can be made by stopping the recursion after the source elements to be merged are smaller than certain number.

Language support

Some computer languages provide built-in or library support for merging sorted collections.

C++

The C++'s Standard Template Library has the function std::merge, which merges two sorted ranges of iterators, and std::inplace_merge, which merges two consecutive sorted ranges in-place. In addition, the std::list (linked list) class has its own merge method which merges another list into itself. The type of the elements merged must support the less-than (<) operator, or it must be provided with a custom comparator.

Python

Python's standard library (since 2.6) also has a merge function in the heapq module, that takes multiple sorted iterables, and merges them into a single iterator.[8]

See also

References

  1. 1.0 1.1 Lua error in package.lua at line 80: module 'strict' not found.
  2. 2.0 2.1 2.2 Lua error in package.lua at line 80: module 'strict' not found.
  3. Lua error in package.lua at line 80: module 'strict' not found.
  4. Lua error in package.lua at line 80: module 'strict' not found.
  5. Lua error in package.lua at line 80: module 'strict' not found.
  6. 6.0 6.1 6.2 6.3 Lua error in package.lua at line 80: module 'strict' not found.
  7. 7.0 7.1 Cormen, T., Leiserson, C., Rivest, R., & Stein, C. (2009). Introduction to Algorithms (3rd ed.). Cambridge, Massachusetts: The MIT Press.
  8. https://docs.python.org/library/heapq.html#heapq.merge

Further reading

  • Donald Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89685-0. Pages 158–160 of section 5.2.4: Sorting by Merging. Section 5.3.2: Minimum-Comparison Merging, pp. 197–207.
  • Lua error in package.lua at line 80: module 'strict' not found.