Answer:
Explanation:
Data in a high-dimensional space tends to be sparser than in lower dimensions. There are various ways to quantify this, but one way of thinking that may help your intuition is to start by imagining points spread uniformly at random in a three-dimensional box. Now flatten the box into a square, pushing two opposite sides together so all the points lie on a single plane. Flattening can only bring two points closer together, never push them further apart, so the average distance between a point and its nearest neighbor is now smaller. Now flatten the square into a line segment. The average distance between a point and its nearest neighbor is smaller still. Run the process in reverse and the same number of points spreads out more with every dimension you add.
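The flattening thought experiment can be tried numerically. The sketch below is only an illustration under stated assumptions: the function names, the unit box, the 200 points, and the seed are all made up for this example. It draws points in a three-dimensional box, then "flattens" them by dropping coordinates, and reports the mean nearest-neighbor distance each time.

```python
import math
import random


def mean_nearest_neighbor_distance(points):
    """Average, over all points, of the distance to the closest other point."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)


def flattened_means(n_points=200, seed=0):
    """Hypothetical helper: mean nearest-neighbor distance for the same cloud
    of points seen in 3, 2 and 1 dimensions (flattening = dropping coordinates)."""
    rng = random.Random(seed)
    cloud = [tuple(rng.random() for _ in range(3)) for _ in range(n_points)]
    return {
        k: mean_nearest_neighbor_distance([p[:k] for p in cloud])
        for k in (3, 2, 1)
    }


for k, value in flattened_means().items():
    print(f"{k} dimension(s): mean nearest-neighbor distance = {value:.4f}")
```

The printed values should shrink as the number of dimensions goes from 3 to 1, matching the box, square, line segment picture.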
There is no conflict between this and saying that the distance between any two points in a high-dimensional space tends to be similar. The latter statement doesn't imply density. The real number line is dense (it has no gaps), and yet the distance between points ranges from 0 to infinity. The point is that the higher the dimension of your space, the more likely the points are to lie near the edges of the space rather than the center.
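One way to see that pairwise distances become similar is to look at their relative spread (standard deviation divided by mean) as the dimension grows. The following is a rough sketch, not a rigorous experiment: `relative_spread`, the unit hypercube, the 100 points, and the seed are assumptions chosen for illustration.

```python
import itertools
import math
import random
import statistics


def relative_spread(dim, n_points=100, seed=0):
    """Std/mean of all pairwise distances among uniform points in [0, 1]^dim."""
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    distances = [math.dist(p, q) for p, q in itertools.combinations(points, 2)]
    return statistics.pstdev(distances) / statistics.mean(distances)


for dim in (2, 10, 100):
    print(f"dim={dim:3d}  relative spread of distances = {relative_spread(dim):.3f}")
```

The ratio should fall as `dim` increases: the typical distance keeps growing, but the distances cluster ever more tightly around it, which is what "similar" means here.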
Again, consider the dimensions we can actually see. Consider a circle with radius 1, inscribed in a square with sides of length 2. The circle occupies π/4 of the square's area, about 78.5%. Now consider a sphere of radius 1 inscribed in a cube with sides of length 2. The sphere occupies π/6 of the cube's volume, about 52.4%. As this example shows, the probability that a randomly placed point lies close to the center (with "close to the center" here meaning inside the circle or the sphere) gets lower as the dimension increases. Points are more likely to be in the corners. This is why in high dimensions the distances between points tend to be similar: randomly placed points tend to be close to the edges of the region.
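The circle and sphere figures extend to any dimension using the standard formula for the volume of a unit ball, π^(d/2) / Γ(d/2 + 1), divided by the volume 2^d of the enclosing cube. A small sketch (the function name is just chosen for this example):

```python
import math


def inscribed_ball_fraction(d):
    """Fraction of the cube [-1, 1]^d occupied by the unit ball inscribed in it."""
    ball_volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    cube_volume = 2.0 ** d
    return ball_volume / cube_volume


for d in (2, 3, 5, 10, 20):
    print(f"d={d:2d}  fraction inside the ball = {inscribed_ball_fraction(d):.3g}")
```

For d = 2 and d = 3 this reproduces the 78.5% and 52.4% above. By d = 10 the ball holds only about 0.25% of the cube, so almost all of the volume sits in the corners.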