Solved

# Projection of High Dimensional Data onto a subspace.

Posted on 2013-01-21
Medium Priority
579 Views
Hi Experts,
Does anyone have a nice, easy tutorial on projecting high-dimensional Gaussian data sitting somewhere in space onto a subspace? I would also like to see a MATLAB implementation of this, so that I can see what the Gaussian looks like after the projection.

Regards.
Question by:wish_C

LVL 15

Expert Comment

ID: 38805670
I believe what you need is Principal Component Analysis (PCA).

It is implemented in MATLAB's Statistics Toolbox as the PRINCOMP and PCA functions. Check their documentation for code examples. You can find many more examples with Google.

You can also try a good PCA example on FileExchange:
http://www.mathworks.com/matlabcentral/fileexchange/16487-pca-demonstration-example

Author Comment

ID: 38805765
Thanks, Yuk99, for the quick response. I am trying to open the link you sent me: http://www.mathworks.com/matlabcentral/fileexchange/16487-pca-demonstration-example

but the link does not exist anymore. Is there any other sample code that I can read to understand how it is coded?

Cheers.

Author Comment

ID: 38805999
Thanks, modus_operandi, for increasing the zones of my post.

Cheers.

LVL 15

Expert Comment

ID: 38806500
I can open the link to FileExchange right now.
Open the documentation pages for PRINCOMP and PCA. You will find links to examples and demos there.

Author Comment

ID: 38809687
The example you sent is confusing to me. I just need a simple one that follows this algorithm.

Given n observations $x_1, x_2, \ldots, x_n$, each an m-dimensional column vector:
1. Compute the mean vector $\mu = (x_1 + x_2 + \cdots + x_n)/n$.
2. Compute the covariance matrix by MLE: $C = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^T$.
3. Compute the eigenvalue/eigenvector pairs $(\lambda_i, u_i)$ of $C$, with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m \ge 0$.
4. Compute the first d principal components $y_i^{(j)} = x_i^T u_j$ for each observation $x_i$, $1 \le i \le n$, along the direction $u_j$, $j = 1, 2, \ldots, d$.

Thanks.

LVL 15

Expert Comment

ID: 38810957
Ah, you want to understand the algorithm at a low level. The functions I gave you do this internally.

0. X = rand(n,m);

1. M = mean(X);

2. C = cov(X);
Note that cov(X) normalizes by n-1; cov(X,1) normalizes by n, which matches the MLE formula in your step 2.
If you want the details of how the COV function works, type "edit cov" in MATLAB and go through the code; it's not difficult to understand.

3. [U L] = eig(C);

I hope you can finish the last part from here. If you cannot, let me know exactly what the problem is, or what exactly you don't understand.
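
For reference, the four steps of the algorithm quoted earlier in the thread can be sketched end to end as below. This is only a minimal sketch, not the internals of PRINCOMP or PCA: the sizes n, m and d and the names mu, Xc, lambda and idx are assumptions chosen for illustration, and the data is random. Step 4 follows the formula as posted (projecting the raw x_i); many treatments project the centered data Xc instead.

```matlab
% Minimal PCA sketch following the four steps (rows of X are observations).
n = 100; m = 3; d = 2;          % assumed sizes, chosen only for illustration
X = rand(n, m);                 % step 0: example data, n observations x m features

mu = mean(X, 1);                % step 1: 1 x m mean vector
Xc = X - repmat(mu, n, 1);      % centered data
C  = (Xc' * Xc) / n;            % step 2: covariance normalized by n (the MLE form)
C  = (C + C') / 2;              % enforce exact symmetry against round-off

[U, D] = eig(C);                % step 3: eigenvectors are the columns of U
[lambda, idx] = sort(diag(D), 'descend');
U = U(:, idx);                  % reorder so that lambda(1) >= lambda(2) >= ...

Y = X * U(:, 1:d);              % step 4: Y(i,j) = x_i' * u_j, one row per observation
```

eig does not promise any particular eigenvalue order, which is why the sort is applied before the first d columns are taken.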

Author Comment

ID: 38811902
Here is the code I came up with for the PCA implementation.
``````
K=1;
X = rand(5,2);
M = mean(X);
C = cov(X);
[U D]=eig(C);
L=diag(D);
[sorted index]=sort(L,'descend');
Xproj=zeros(d,K);

for j=1:K
Xproj(:,j)=U(:,index(j));
end

Y=X*Xproj;
plot(Y1,'d');
axis([4 24 -2 18]);
``````

I generated a 5 x 2 matrix and applied PCA to it. I want to project the data onto a one-dimensional subspace and plot the reduced data on a graph to see how it behaves. I also want to plot a line that shows the direction of the transformed data.

Is my code OK? How can I plot the data on the subspace, together with a line showing its direction?

LVL 15

Accepted Solution

yuk99 earned 2000 total points
ID: 38812240
I only looked at your code briefly. Here are a few issues.
It looks like you don't need the line that calculates the mean (M = mean(X)), because cov subtracts the mean itself. Also, d is not defined, but you can get it as d = size(U,1);
You can avoid the for-loop with Xproj = U(:,index(1:K)); replacing lines 8-12.

I haven't checked whether the last transformation is correct.
Y1 should probably be Y, or you missed a line.
To plot 1D data it's better to use, for example, plot(Y,zeros(size(Y)),'d'). In your case the x-axis values just represent the order of the points in Y.
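
Putting those fixes together, the posted code might look like the sketch below. It is one possible cleanup under the suggestions in this reply, not a verified reference implementation; the names W and sortedL are introduced here for illustration. The original axis([4 24 -2 18]) call is left out because rand produces values between 0 and 1.

```matlab
K = 1;                                   % number of principal components to keep
X = rand(5, 2);                          % 5 observations, 2 features
C = cov(X);                              % cov subtracts the column means itself
[U, D] = eig(C);                         % eigenvectors are the columns of U
[sortedL, index] = sort(diag(D), 'descend');
W = U(:, index(1:K));                    % d x K projection matrix, no loop needed

Y = X * W;                               % n x K coordinates along the kept directions
```

To look at the result, plot(Y, zeros(size(Y)), 'd') draws the projected points along a horizontal line.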

I found a very good PCA tutorial (starting from very basics):
http://www.ce.yildiz.edu.tr/personal/songul/file/1097/principal_components.pdf
Hope it's exactly what you need.

Author Comment

ID: 38815123
I actually already have the tutorial from that link. It is the one I am trying to follow, implementing its steps in MATLAB. Here is what I have:
``````
% Find the first K Principal Components of data X (n rows, d columns)
% X contains n pattern vectors with d features
X= [2.5,0.5,2.2,1.9,3.1,2.3,2.0,1.0,1.5,1.1];
Y= [2.4,0.7,2.9,2.2,3.0,2.7,1.6,1.1,1.6,0.9];
Data = [X;Y]';
mx = mean(X);
my = mean(Y);
C = cov(X,Y);
[U D]=eig(C);
L=diag(D);
[sorted index]=sort(L,'descend');
[FVector index]=sort(U,'descend');

%Final data using both eigenvectors in U

figure;
plot(X,Y, 'd');
axis equal;
axis([-2 5 -2 5])

figure;
axis([-2 2 -2 2])

figure;
plot(D,'*')
axis([-2 2 -2 2])

figure;
hold on
plot(D,'*')
axis([-2 2 -2 2])

figure;
plot(FData,'d')
axis([-2 2 -2 2])

figure;
axis([-2 2 -2 2])
``````

However, the attached plot is different from the one expected in the tutorial.

Even though I have the adjusted data, I don't know how to add the means back to X and Y to get the original data back.

I have used both eigenvectors in the code above to build the feature vector, so the result still has 2 dimensions. I would like to know what modification to my code would reduce the dimensions from 2 to 1, by taking only the eigenvector with the largest eigenvalue to transform the data from 2D to 1D. Any ideas on this, please?
Thanks
Plot.bmp

Author Comment

ID: 38823223
Any ideas on the issue with the code above, please?
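
One way the 2D-to-1D reduction and the "add the means back" step could be done is sketched below, using the data values from the code above. This is a hedged sketch rather than the tutorial's exact code: the names DataAdjust, u1, z, DataApprox and lineXY are assumptions introduced here, and the factor 2 for the line length is arbitrary. Note that sorting the entries of U itself reorders the numbers inside each eigenvector, so the sketch sorts the eigenvalues and uses the resulting index to pick a column of U.

```matlab
% Tutorial data: 10 observations, 2 features (values copied from the post above).
x = [2.5 0.5 2.2 1.9 3.1 2.3 2.0 1.0 1.5 1.1];
y = [2.4 0.7 2.9 2.2 3.0 2.7 1.6 1.1 1.6 0.9];
Data = [x; y]';                          % 10 x 2, one observation per row

mu = mean(Data, 1);                      % 1 x 2 vector of column means
DataAdjust = Data - repmat(mu, size(Data, 1), 1);   % mean-subtracted data

C = cov(Data);                           % 2 x 2 covariance matrix
[U, D] = eig(C);
[lambda, idx] = sort(diag(D), 'descend');
u1 = U(:, idx(1));                       % eigenvector with the largest eigenvalue

z = DataAdjust * u1;                     % 10 x 1: the data reduced from 2-D to 1-D
DataApprox = z * u1' + repmat(mu, size(Data, 1), 1);  % back in 2-D, means restored

% Endpoints of a line through the mean along u1, for drawing the direction.
lineXY = [mu - 2 * u1'; mu + 2 * u1'];
```

Because only one eigenvector is kept, DataApprox is an approximation of Data in which every point lies on the line through the mean along u1. Keeping both eigenvectors would recover Data exactly. To see the direction, plot(Data(:,1), Data(:,2), 'd', lineXY(:,1), lineXY(:,2), '-') overlays the line on the original points.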
