One-of-N Encoding for Nominal Variables
This is my MATLAB implementation of one-of-N encoding for the nominal variables of a table.
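One-of-N encoding is a very simple way of encoding classes for a machine learning method: a variable with N possible states is replaced by N binary variables, and for each case exactly one of them is set to 1.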
A class set (nominal variable) is a dataset column whose value can be one of several non-numeric categories.
The number of classes must be known ahead of time.
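As a small illustration of the idea, here is a minimal sketch for a single variable. The vector x and its integer state codes are made up for this example; the number of classes is taken from the data with unique.

```matlab
% Hypothetical nominal variable with three states, coded as integers.
x = [3; 1; 2; 3];

% unique returns the sorted states and, for each case, the index of its state.
[states, ~, idx] = unique(x);      % states = [1; 2; 3], idx = [3; 1; 2; 3]

% Row j of the identity matrix is the one-of-N code for state j,
% so indexing the rows by idx gives one code per case.
I = eye(numel(states));
encoded = I(idx, :);
% encoded =
%      0     0     1
%      1     0     0
%      0     1     0
%      0     0     1
```

Each row contains a single 1, in the column of the state that case belongs to.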
function Y = OneOfNEncodingNominal(X)
% X is a table of nominal variables; Y is the encoded table.
% Assumes the states can be compared with == (e.g., numeric codes).
nv = size(X, 2);   % number of variables
nc = size(X, 1);   % number of cases
Y = X;
% for each variable
for i = 1:nv
    atts = unique(table2array(X(:, i)));
    numVar = numel(atts);
    % Only encode variables with more than 2 states;
    % binary variables (e.g., 0 or 1) are left as they are
    if numVar > 2
        % create one new variable per possible state of the variable
        v = zeros(nc, numVar);
        % for each case
        for j = 1:nc
            % find the index of the state of the variable
            idx = find(atts == X{j, i});
            if isscalar(idx)
                v(j, idx) = 1;
            else
                error('Error: Index error when encoding.');
            end
        end
        % remove the variable and replace it with the new variables
        removedVarName = X.Properties.VariableNames{i};
        Y(:, removedVarName) = [];
        newVars = array2table(v);
        % rename the new variables after the removed variable
        for k = 1:numVar
            newVars.Properties.VariableNames{k} = [removedVarName '_v' int2str(k)];
        end
        Y = [Y newVars];
    end % end if
end
end % end function OneOfNEncodingNominal
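The loop over cases could also be written as a single vectorized comparison. The following is only a sketch of that alternative, not part of the function above. The table T and its variable name colour are hypothetical, and the comparison relies on implicit expansion, which needs MATLAB R2016b or later.

```matlab
% Hypothetical table with one nominal variable that has three states.
T = table([2; 0; 1; 2], 'VariableNames', {'colour'});

col  = T.colour;
atts = unique(col);                 % sorted states: [0; 1; 2]

% Compare every case (rows) with every state (columns) in one step.
v = double(col == atts');           % nc-by-numVar matrix of 0s and 1s

% Name the new variables after the original one, as in the function above.
names = arrayfun(@(k) sprintf('colour_v%d', k), 1:numel(atts), ...
                 'UniformOutput', false);
newVars = array2table(v, 'VariableNames', names);
```

Here newVars has the columns colour_v1, colour_v2 and colour_v3, and the first case (state 2) gets its 1 in colour_v3.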