The MATLAB Program

The documented code below executes the clustering algorithm and plots the points as the algorithm proceeds. Each cluster is drawn in a different color.

cluster.m

%Cluster reads data from a file called data.dat. It arbitrarily
%chooses one point to be a hub and clusters all the points around this hub.
%It then finds the point farthest away from its hub and makes this point a
%new hub. Next it assigns each point to the hub it is nearest. This
%process is repeated until the distance from every point to its hub is
%less than half the average distance between all pairs of hubs.
load data.dat;
graph_data;                          %Plots original points
[clusters,dist,hubs] = setup(data);  %Clusters all points around first point
counter = 1;      %Keeps track of the present number of hubs
keep_going = 1;   %1=true 0=false; indicates whether to continue forming new hubs
                  %("continue" is a reserved word in MATLAB and cannot be a variable name)
while keep_going
    counter = counter + 1;   %adding new hub
    [m,i] = max(dist);       %m = maximum value in the distance array
                             %i = location of the maximum value in the array
    hubs(counter) = i;       %assigns index to new hub
    [dist,clusters] = recluster(data,dist,clusters,i);  %Clusters points to nearest hub
    pause
    redraw                   %Draws new clusters
    maxdist = max(dist);     %squared distance of the point farthest from its hub
    keep_going = farout(counter,hubs,data,maxdist);  %Checks the stop condition
end

setup.m

function [clusters,dist,hubs]=setup(data)
%SETUP [clusters,dist,hubs]=setup(data)
% This function assigns to the cluster array all ones (telling that
% all points belong to Hub 1), assigns to the hub array a one in
% position 1 (telling that the first hub is point 1) and zeros in all
% other locations. The distance array contains the square of the
% distance from each point to hub one.
n = size(data,1);
clusters = ones(n,1);
hubs = zeros(n,1);
hubs(1,1) = 1;
dist = distance(data,clusters);
end

distance.m

function dist=distance(d,c)
%DISTANCE Distance from hub dist=distance(d,c)
% This finds the square of the distance between each point and its hub.
% d = data coordinates of points; c = cluster array containing the index
% of the hub to which each point is assigned; c(i)=j means point i belongs
% to the cluster whose hub is point j.
n = size(d,1);
dif = zeros(size(d));        %one row of coordinate differences per point
for i = 1:n
    dif(i,:) = d(c(i),:) - d(i,:);
end
dist = sum(dif.*dif, 2);     %sums the squares of the row elements
end

recluster.m

function [dist,clusters] = recluster(data,dist,clusters,i)
%RECLUSTER [dist,clusters] = recluster(data,dist,clusters,i)
% Reassigns each point to the nearest hub
n = size(data,1);
temp = i*ones(n,1);
newdist = distance(data,temp);   %squared distance from each point to new hub i
for r = 1:n
    if newdist(r) < dist(r)
        dist(r) = newdist(r);
        clusters(r) = i;
    end %end for if
end %end for for
end

farout.m

function keep_going = farout(counter,hubs,data,maxdist)
%FAROUT keep_going = farout(counter,hubs,data,maxdist)
% Calculates stop condition. Stops if the point farthest from its
% hub is within half the average distance between all pairs of hubs.
index = 0;
for i = 1:(counter-1)
    for j = i+1 : counter
        index = index + 1;
        dif(index,:) = data(hubs(i),:) - data(hubs(j),:);
    end
end
dist = sqrt(sum(dif.*dif, 2));        %distance between each pair of hubs
average_dist = sum(dist)/(2*index);   %half the average hub-to-hub distance
if sqrt(maxdist) < average_dist       %maxdist is a squared distance
    keep_going = 0;
else
    keep_going = 1;
end
end % end for farout

graph_data.m

%Graph_data plots the points and draws the axes
x = data(:,1);
y = data(:,2);
minx = min(x);
maxx = max(x);
miny = min(y);
maxy = max(y);
plot(x,y,'*')
axis([minx-1, maxx+1, miny - 1, maxy + 1]);

redraw.m

% Redraw draws the points so that each cluster is a different color.
% The hubs are represented by a + and the members are represented by a *.
n = size(data,1);
pointer = zeros(n, 1);
for i=1:counter
    pointer(hubs(i))=i;   % Pointer's indices are the data point indices
                          % Pointer's cells are the hub numbers for the points
end;
hold off;
% Color code the points based on the cluster number
for i = 1:n
    x = data(i,1);
    y = data(i,2);
    if pointer(clusters(i)) == 1
        if clusters(i) == i
            plot(x, y, 'y+');
        else
            plot(x, y, 'y*');
        end
        hold on;
    elseif pointer(clusters(i)) == 2
        if clusters(i) == i
            plot(x, y, 'm+');
        else
            plot(x, y, 'm*');
        end
        hold on;
    elseif pointer(clusters(i)) == 3
        if clusters(i) == i
            plot(x, y, 'c+');
        else
            plot(x, y, 'c*');
        end
        hold on
    elseif pointer(clusters(i)) == 4
        if clusters(i) == i
            plot(x, y, 'r+');
        else
            plot(x, y, 'r*');
        end
        hold on
    elseif pointer(clusters(i)) == 5
        if clusters(i) == i
            plot(x, y, 'g+');
        else
            plot(x, y, 'g*');
        end
        hold on
    elseif pointer(clusters(i)) == 6
        if clusters(i) == i
            plot(x, y, 'b+');
        else
            plot(x, y, 'b*');
        end
        hold on
    else
        if clusters(i) == i
            plot(x, y, 'w+');
        else
            plot(x, y, 'w*');
        end
        hold on
    end
end
% Sets up the axes
x = data(:,1);
y = data(:,2);
minx = min(x);
maxx = max(x);
miny = min(y);
maxy = max(y);
axis([minx-1, maxx+1, miny - 1, maxy + 1]);

data.dat

4 5
1 7
1 2
2 8
8 1
7 8
8 7
5 4
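
The program above pauses for a key press and redraws after every new hub, which makes it awkward to check the final result. The following is a minimal non-interactive sketch of the same loop, offered as a way to verify the clustering rather than as a replacement for the listings. It assumes the sample data is read as eight points with two coordinates each, and it inlines that data instead of loading data.dat. The names hub_list, closer, total and npairs are illustrative and do not appear in the original files.

```matlab
% Non-interactive sketch of the loop in cluster.m (no plotting, no pause).
% Assumption: data.dat holds eight 2-D points, inlined here.
data = [4 5; 1 7; 1 2; 2 8; 8 1; 7 8; 8 7; 5 4];
n = size(data,1);

clusters = ones(n,1);                                % every point starts with hub 1
dist = sum((data - repmat(data(1,:), n, 1)).^2, 2);  % squared distance to hub 1
hub_list = 1;                                        % hub point indices, in order chosen

keep_going = 1;
while keep_going
    [~, i] = max(dist);            % point farthest from its hub becomes a hub
    hub_list(end+1) = i;

    % Reassign the points that are strictly closer to the new hub
    newdist = sum((data - repmat(data(i,:), n, 1)).^2, 2);
    closer = newdist < dist;
    dist(closer) = newdist(closer);
    clusters(closer) = i;

    % Stop condition as in farout.m: compare the largest point-to-hub
    % distance with half the average distance between pairs of hubs
    k = numel(hub_list);
    total = 0;
    for a = 1:k-1
        for b = a+1:k
            total = total + norm(data(hub_list(a),:) - data(hub_list(b),:));
        end
    end
    npairs = k*(k-1)/2;
    keep_going = sqrt(max(dist)) >= total/(2*npairs);
end

disp(hub_list)     % indices of the points chosen as hubs
disp(clusters')    % for each point, the index of its hub
```

Traced by hand on the sample data, the loop stops after five hubs: hub_list is 1 5 7 3 2 and clusters' is 1 2 3 2 5 7 7 1, that is, the clusters {1, 8}, {5}, {6, 7}, {3} and {2, 4}. The repmat calls keep the sketch usable on MATLAB versions without implicit expansion.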