The claim is that having an entry in a dense index for every row of the table makes the I/O cost the same for searching over the index as for searching over the table itself.
Because the indexes I'm familiar with are all "dense indexes", I'll use "index" to mean a dense index and "sparse index" for the other varieties.
That claim is simply misguided.
The purpose of an index is to significantly reduce the I/O cost of reading a table. An index conceptually does so by storing an ordered list of the keys. That structure can then be traversed quickly to find a particular value or, in many cases, a range of values.
Conceptually, you can think of an index as an ordered list of the keys searched by binary search. However, the actual implementation can be quite different. The most common implementations use balanced trees. More esoteric types use hash codes. And some types are specific to particular data types, such as GIS indexes and full-text indexes.
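To make that concrete, here is a toy sketch in Python (purely illustrative -- no real engine stores its index this way; it would typically be a B-tree on disk). The index is just a sorted list of (key, row id) pairs, and binary search answers both point and range lookups without scanning the table:

```python
import bisect

# Toy dense index: one (key, row_id) entry per row, kept sorted by key.
# Real engines use balanced trees, but the lookup idea is the same.
table = {0: ("alice", 34), 1: ("carol", 28), 2: ("bob", 34), 3: ("dave", 51)}
age_index = sorted((age, row_id) for row_id, (_, age) in table.items())
keys = [k for k, _ in age_index]

def point_lookup(age):
    # Binary search finds every matching entry without reading non-matching rows.
    lo, hi = bisect.bisect_left(keys, age), bisect.bisect_right(keys, age)
    return [table[row_id] for _, row_id in age_index[lo:hi]]

def range_lookup(low, high):
    # The same ordered structure answers range queries cheaply.
    lo, hi = bisect.bisect_left(keys, low), bisect.bisect_right(keys, high)
    return [table[row_id] for _, row_id in age_index[lo:hi]]

print(point_lookup(34))      # [('alice', 34), ('bob', 34)]
print(range_lookup(28, 34))  # [('carol', 28), ('alice', 34), ('bob', 34)]
```

Note that even though the index is dense (one entry per row), a lookup touches only a few index entries plus the matching rows, which is nothing like the cost of a full table scan.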
A very important property of an index is that it can locate the specific rows that have a particular value. A second important fact is that SQL tables represent unordered multisets (multisets are just sets that allow duplicate values).
If you put these together, you will realize that a sparse index really doesn't make sense, because there is no way to find rows whose values are not in the index.
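Here is a tiny sketch of why, using a hypothetical "sparse" index over an unordered heap of rows: for any key the index doesn't contain, nothing points you at the row, so you are back to scanning the whole table.

```python
# Hypothetical sparse index over an unordered heap: only some keys are indexed.
heap = [(42, "a"), (7, "b"), (99, "c"), (7, "d"), (13, "e")]  # unordered multiset of (key, payload)
sparse = {42: [0], 99: [2], 13: [4]}                          # key -> row positions, for a subset of keys

def lookup(key):
    if key in sparse:
        return [heap[pos] for pos in sparse[key]]
    # The key is not in the index and the heap is unordered, so the index
    # gives no starting point: the only option left is a full table scan.
    return [row for row in heap if row[0] == key]

print(lookup(7))  # [(7, 'b'), (7, 'd')] -- but only by scanning every row
```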
What this is probably referring to is a clustered index on a table. A clustered index is a particular type of index where the underlying data is actually stored sorted by the index key (there can be only one clustered index per table). As a space optimization on a clustered index, sparse storage makes some sense: scanning the records on a single page is often comparable in cost to descending a few more levels of index depth.
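A rough sketch of that optimization, again just illustrative: because the rows are already sorted by the key, the index only needs the first key of each page; binary search finds the right page, and a short scan of that one page finds the rows. (Duplicate keys that straddle a page boundary are ignored here for brevity.)

```python
import bisect

# Clustered table: rows stored in key order, split into fixed-size pages.
PAGE_SIZE = 4
rows = sorted((k, f"row{k}") for k in (3, 7, 7, 12, 19, 23, 31, 40, 44))
pages = [rows[i:i + PAGE_SIZE] for i in range(0, len(rows), PAGE_SIZE)]

# Sparse index: one entry per page -- just the first key stored on that page.
first_keys = [page[0][0] for page in pages]

def lookup(key):
    # Binary-search the sparse index for the page that could hold the key ...
    p = bisect.bisect_right(first_keys, key) - 1
    if p < 0:
        return []
    # ... then scan only that page, roughly one extra page read.
    return [row for row in pages[p] if row[0] == key]

print(lookup(23))  # [(23, 'row23')]
print(lookup(25))  # [] -- the clustered order proves the key cannot be anywhere else
```

This only works because the clustered order guarantees where a missing key would have to be; drop the ordering and you are back to the problem described above.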
I also want to point out that "sparse" has a more common usage in database-talk -- and that would be sparse data. Column-oriented databases are optimized for -- among other things -- columns that often have NULL values. However, I don't think this use of "sparse index" is related to sparse data.