Internally, a symbol column is already stored as a separate table. Although the table displays String values for symbol columns, internally the column
stores a 32-bit int. For financial use cases, ISINs and other tickers should always be symbols.
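To make the idea concrete, here is a minimal Python sketch of a dictionary-encoded column. It is a toy model of the concept only, not the database's actual implementation; the class name SymbolColumn, its methods, and the second ISIN are made up for illustration.

```python
from array import array


class SymbolColumn:
    """Toy dictionary-encoded column: each distinct string is stored once."""

    def __init__(self):
        self._id_by_value = {}   # string -> int key (the "symbol table")
        self._value_by_id = []   # int key -> string, for reads
        self._ids = array("i")   # one signed C int per row (usually 32-bit)

    def append(self, value):
        # Resolve the string to its int key, allocating a new key if unseen.
        key = self._id_by_value.get(value)
        if key is None:
            key = len(self._value_by_id)
            self._id_by_value[value] = key
            self._value_by_id.append(value)
        self._ids.append(key)

    def __getitem__(self, row):
        # Reads resolve the stored int back to the original string.
        return self._value_by_id[self._ids[row]]

    def rows_equal_to(self, value):
        # A lookup resolves the string once, then compares ints per row.
        key = self._id_by_value.get(value)
        if key is None:
            return []
        return [row for row, k in enumerate(self._ids) if k == key]

    def distinct_count(self):
        return len(self._value_by_id)


# Sample data: the second ISIN is only an illustrative value.
isin = SymbolColumn()
for value in ["GB00BH4HKS39", "US0378331005", "GB00BH4HKS39"]:
    isin.append(value)

matching_rows = isin.rows_equal_to("GB00BH4HKS39")
```

The point of the sketch is that a filter on a ticker touches the string only once, when it is resolved to its key, and everything after that is integer comparison. The size of the two lookup structures grows with the number of distinct values, not with the number of rows.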
Symbols are optimised for ticker lookups, such as the one below, which selects the entire time series of one ticker for a given period:
select isin, ... from tab where isin = 'GB00BH4HKS39' and ts = '2021-01'
Ticker aggregations:
select isin, sum(volume) from tab where ts = '2021-01'
The case for not using the symbol type is when your dataset has too many distinct values for the field. I would quantify
"too many" as above 100,000 values. At this point the performance of the code that has to resolve String to int and vice versa starts
to taper off.