Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
877 views
in Technique[技术] by (71.8m points)

csv - Index (type) for searching (large) unsorted array

I have program that loads a (sometimes) large CSV file into an array. The data cannot be sorted, and I do not know if the data is text or numbers. This is up to customers.

Example could be

1;JOHN;DOE
2;JANE;DOE;
3;BOBBY;NOTABLES

but it could also be strings

MB9384HJ;TEST1
B9284918;TEST2

The number of lines could be up to a few million.

I would like to seach for a specific value in a column (which one is known ahead of time, this is my "key index column"). Assume this is unique. Key is to find which row this column is in.

Currently the code is traversing from 1..n and comparing. This is obviously slower towards the end.

I am considering these options:

  • a memory SQLite database with key index value and record number
  • a TStringDictionary with key, record as the pairs
  • a Hashed stringlist

My idea is: instead of traversing the array, I query the index for the key (client provides item to search for, it must be random-access). Then I immediately get the rownumber of the array, and I can fetch the data.

Which of the these (or other, if any) would be a better path to follow ?

question from:https://stackoverflow.com/questions/65856825/index-type-for-searching-large-unsorted-array

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

SQLite is probably too much if you just want to search for the key. It would be interesting if you fill a SQLite table with the CSV and have to do complex queries not only on the keys but also the other columns.

A Hashed string list is probably the faster but there is a problem with hash collisions.

A Dictionary is probably the best solution in your specific case. And it is easy since Delphi RTL provide the required generic class.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...