Finding the majority element in a list

The problem is to find the majority element in a list if one exists. That is, return the element that occurs occupies >50% of the positions in the list. Additional constraints are that the algorithm should operate in linear time and use constant additional space ( O(N) time and O(1) space).

There are a few way of solving this problem. Initially I thought one solution might be to Radix sort the list in-place and then search for the longest run. This would certainly be linear time but in-place Radix sort it a bit of a pain.

Turns out there’s an easier solution called Moore’s Linear Time Majority Vote Algorithm. You can find a nice description here. The algorithm works iteratively. The first element is selected as the initial candidate and it’s count is set to 1. The algorithm then iterates over the remaining elements if the element matches the candidate it’s count is increased, if the count falls to zero the candidate is replaced with the current element (and the count set to 1). In this way the algorithm cancels out elements as it goes. If the element was the true majority item, then reducing the count just canceled out the non-majority element, and the element remains the majority item in the remainder of the array. If it wasn’t then we simply removed two non-majority items which does not effect the result.

The algorithm will result in a candidate, but we still need to test to make sure that it is actually the majority item. If there was no majority item canceling out “two non-majority” items above will effect the result. So we make a single pass over the array and count the number of occurrences of the candidate and test if this is >50%.

A C++ implementation of the algorithm is shown below:

#include <iostream>
#include <vector>
#include <stdlib.h>

using namespace std;

int find_candidate(vector<int> &data) {

int candidate=0;
int count=0;

candidate=data[0];
count = 1;

for(int n=1;n<data.size();n++) {
if(data[n] == candidate) count++;
else count–;

if(count==0) {candidate=data[n]; count=1;}
cout << "count: " << count << endl;
}

return candidate;
}

int main() {

vector<int> data;

srand(time(NULL));
for(int n=0;n<10;n++) data.push_back(rand()%5);
cout << "data: ";
for(int n=0;n<data.size();n++) cout << data[n] << " ";
cout << endl;

int candidate = find_candidate(data);
cout << "candidate: " << candidate << endl;

int count=0;
for(int n=0;n<data.size();n++) if(data[n] == candidate) count++;

if(count > (data.size()/2)) cout << "Majority item is: " << candidate << endl;
else cout << "No majority item" << endl;

}