Thanks for posting this - introduced me to random projection. I liked this exposition of random projections a bit better.

This seems like a considerably simpler way to do dimensionality reduction compared to, say, PCA or SVD. I take it the drawback must be that it doesn’t preserve enough structure? Or that the real value of the other algorithms is that they’re much better at removing dimensions of lesser variance?

Going with my gut:

This method sets the target dimension from the theoretical upper bound on the number of dimensions required to keep distortion under the given epsilon (though is it really a hard upper bound?). It works for any input (random or not). It doesn’t care about structure, and it doesn’t try to fit the data. Really cool trick.
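To make that concrete, here is a rough sketch in Python with NumPy. It assumes one commonly quoted form of the bound, k >= 4 ln(n) / (eps^2/2 - eps^3/3), and a plain Gaussian projection matrix; the function names are made up for illustration and may not match what the original post uses.

```python
import math
import numpy as np

def jl_min_dim(n_points: int, eps: float) -> int:
    """Target dimension that suffices for distortion eps (assumed form of the bound)."""
    return math.ceil(4 * math.log(n_points) / (eps**2 / 2 - eps**3 / 3))

def random_projection(X: np.ndarray, k: int, seed: int = 0) -> np.ndarray:
    """Project the rows of X down to k dimensions with a random Gaussian matrix."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Scaling by 1/sqrt(k) keeps squared lengths unchanged in expectation.
    R = rng.standard_normal((d, k)) / math.sqrt(k)
    return X @ R

def pairwise_sq_dists(X: np.ndarray) -> np.ndarray:
    """Squared Euclidean distances for every distinct pair of rows."""
    sq = np.sum(X**2, axis=1)
    D = sq[:, None] + sq[None, :] - 2 * (X @ X.T)
    return D[np.triu_indices(X.shape[0], k=1)]

def distortion_ratios(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Ratio of projected to original squared distances (1.0 means no distortion)."""
    return pairwise_sq_dists(Y) / pairwise_sq_dists(X)
```

Note that `jl_min_dim` only looks at the number of points and eps, never at the data or its original dimension: for 100 points and eps = 0.5 it gives 222 whether you start from 1,000 dimensions or a million. Checking `distortion_ratios(X, random_projection(X, k))` on your own data shows how far the ratios actually stray from 1.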

But yeah, if you have structured data, and you care about that structure, you probably want to do something like PCA anyway (which is all about fitting your data). Even if you only care about reducing dimensionality, you want something smarter than this random projection, because the number of dimensions you need for structured data is usually far lower than the theoretical upper bound. So yes, methods like PCA get closer to optimal because they find the directions with little variance, which you can then drop.
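A small sketch of that point, again in Python with NumPy. The data here is synthetic and deliberately low-rank (5 latent directions plus a little noise, all of which is my assumption, not anything from the post), and PCA is done by hand via SVD of the centered data. The helper names are hypothetical.

```python
import numpy as np

def make_low_rank_data(n=200, d=500, rank=5, noise=0.01, seed=0):
    """Synthetic 'structured' data: n points in d dims that really live near a rank-dim subspace."""
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal((n, rank))
    mixing = rng.standard_normal((rank, d))
    return latent @ mixing + noise * rng.standard_normal((n, d))

def pca_components_needed(X, var_kept=0.99):
    """Smallest number of principal components explaining at least var_kept of the variance."""
    Xc = X - X.mean(axis=0)                    # PCA works on centered data
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, largest first
    explained = np.cumsum(s**2) / np.sum(s**2)
    # Index of the first cumulative fraction >= var_kept, converted to a count.
    return int(np.searchsorted(explained, var_kept) + 1)
```

On data like this, `pca_components_needed(make_low_rank_data())` should come out at about the planted rank, which is a handful of dimensions. The random-projection bound for the same 200 points depends only on the point count and eps, so it cannot take advantage of that structure.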

Since machine learning was mentioned, I wonder how feasible it would be to take one of these random matrices and train it to better match the structure (and then reduce dimensions).

Disclaimer: I know nothing about maths.