Take a peek at the https://github.com/cstamas/maven-indexer-examples project.
In short: you don't need to download the GZ/ZIP (new/legacy format) manually; the indexer will take care of that for you. It will also handle incremental updates for you, when possible.
GZ is the "new" format. It contains data only and is independent of the Lucene index format (hence independent of the Lucene version). ZIP is the "old" format, which is actually a plain Lucene 2.4.x index zipped up. There is currently no difference in data content between the two, but a change is planned for the future.
As I said, there is no data content difference between the two, but some fields (as you noticed) are indexed but not stored in the index. Hence, if you consume the ZIP format, those fields will be searchable but not retrievable.
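To make the "indexed but not stored" point concrete, here is a minimal toy model in plain Java. It is not the Lucene or Maven Indexer API; the `ToyIndex` class, its methods, and the field names `artifactId` and `classnames` are made up purely for illustration. It only shows why a field can match a search and still come back empty when you try to read its value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: this is NOT the Lucene or Maven Indexer API.
class ToyIndex {
    // Search side: "field:value" -> ids of the documents containing that term.
    private final Map<String, List<Integer>> postings = new HashMap<>();
    // Retrieval side: per document, the field values that were stored.
    private final List<Map<String, String>> stored = new ArrayList<>();

    // Creates an empty document and returns its id.
    int newDocument() {
        stored.add(new HashMap<>());
        return stored.size() - 1;
    }

    // A field can be indexed (searchable), stored (retrievable), both, or neither.
    void addField(int docId, String field, String value, boolean indexed, boolean isStored) {
        if (indexed) {
            postings.computeIfAbsent(field + ":" + value, k -> new ArrayList<>()).add(docId);
        }
        if (isStored) {
            stored.get(docId).put(field, value);
        }
    }

    // Returns the ids of documents whose indexed field matches the value exactly.
    List<Integer> search(String field, String value) {
        return postings.getOrDefault(field + ":" + value, Collections.emptyList());
    }

    // Returns the stored value, or null when the field was not stored.
    String retrieve(int docId, String field) {
        return stored.get(docId).get(field);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        int doc = index.newDocument();
        // Hypothetical field names, chosen only for this example.
        index.addField(doc, "artifactId", "maven-indexer", true, true);
        index.addField(doc, "classnames", "IndexUpdater", true, false);

        System.out.println(index.search("classnames", "IndexUpdater")); // the document is found
        System.out.println(index.retrieve(doc, "classnames"));          // but the value is not there
        System.out.println(index.retrieve(doc, "artifactId"));          // stored fields do come back
    }
}
```

In this sketch the search goes through the postings map while retrieval goes through the stored values, so a field that was only indexed finds the document but has nothing to hand back. That is the same situation you hit with those fields when reading the ZIP format directly.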