c# - Fast access to key in file (without loading whole file to memory)


Background: I am writing a C# application for Windows Mobile that searches for definitions in a (scientific) dictionary stored on the file system. The file looks like this (it has 100k+ entries):

 word1:meanings(2)
 -meaning 1 bla bla bla
 -meaning 2 bla bla bla
 [...]

The user should be able to enter a word and get its meaning as fast as possible. Users usually look up only 1 or 2 words. For this, I created a second file containing a sorted list of the actual words and their byte offsets in the dictionary file. For example:

 word1:12344
 word2:32241
 word3:298

I search through the "index" (a simple loop through the lines, comparing each for equality) and then "random-access" the dictionary file using the byte offset. The problem is that this is still slow. I tried loading the index into an array/list/hashtable, but due to slow IO that takes too long (about 20 seconds to load the index). That is bad because the user usually looks up only 1 word.

Therefore, I am looking for some type of n-tree implementation that can work directly on the file (without traversing the whole index). Does anyone have advice on how to do this? My current solution looks like this (but it is buggy and dirty). The new index has this format:

 a:fileoffsetindictionary:fileoffsetof"ab" //the first 2-character entry starting with "a"
 b:fileoffsetindictionary:fileoffsetof"ba"
 c:fileoffsetindictionary:0 //"0" means no words starting with "c" (just an example)
 [...]
 ab:fileoffsetindictionary:fileoffsetof"aba"
 ac:fileoffsetindictionary:878878 //(just random values for illustration)
 [...]
 ba:fileoffsetindictionary:456
 [...]
 aba:fileoffsetindictionary:2342
 [...]

And searching is done this way:

 user enters the word "tree"
 look for "t" in the index by looping through the index
 if "t" is found, goto fileoffsetof2digit
 if "tr" is found, goto fileoffsetof3digit
 [...]
 [actually coded recursively]

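The correct answer would be to tell you to use a B-tree index, which is ideal for a disk-based index of this nature, or, better yet, if you are talking about Windows Mobile 6.5 and earlier, to use a SQL CE database. You can find a number of implementations of these; failing that, consider the following.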

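Using something along the lines of your current index file idea, make each index record a fixed size. If you know that a word is never more than 50 characters and that the offset fits in a 4-byte integer, you can create record entries in the index file of 54 bytes each (assuming ASCII words; adjust accordingly). Because record i then starts at byte i * 54, you can perform a binary search on the index file, seeking directly to each record you need, rather than scanning the entire file.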
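A minimal sketch of the fixed-size-record idea, assuming 50-byte ASCII keys padded with zero bytes followed by a 4-byte offset. The class and method names (`FixedIndex`, `Write`, `Find`, `ReadFully`) are made up for illustration and are not from the question. The sketch also assumes the index was written in ordinal sort order and is read on a machine with the same byte order as the one that wrote it, since `BitConverter` uses the native byte order.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: fixed-size index records of 50 key bytes + 4 offset bytes.
public static class FixedIndex
{
    public const int KeySize = 50;              // assumed maximum word length (ASCII bytes)
    public const int RecordSize = KeySize + 4;  // key plus 32-bit dictionary offset

    // Writes one 54-byte record per word.
    // The words must already be sorted with ordinal comparison.
    public static void Write(Stream output, string[] words, int[] offsets)
    {
        byte[] record = new byte[RecordSize];
        for (int i = 0; i < words.Length; i++)
        {
            Array.Clear(record, 0, RecordSize);  // zero padding after the key
            byte[] key = Encoding.ASCII.GetBytes(words[i]);
            if (key.Length > KeySize)
                throw new ArgumentException("Word too long: " + words[i]);
            Array.Copy(key, 0, record, 0, key.Length);
            Array.Copy(BitConverter.GetBytes(offsets[i]), 0, record, KeySize, 4);
            output.Write(record, 0, RecordSize);
        }
    }

    // Binary search over the records; returns the dictionary offset, or -1 if absent.
    public static int Find(Stream index, string word)
    {
        byte[] record = new byte[RecordSize];
        long lo = 0;
        long hi = index.Length / RecordSize - 1;
        while (lo <= hi)
        {
            long mid = lo + (hi - lo) / 2;
            index.Seek(mid * RecordSize, SeekOrigin.Begin);  // jump straight to record mid
            ReadFully(index, record);
            string key = Encoding.ASCII.GetString(record, 0, KeySize).TrimEnd('\0');
            int cmp = string.CompareOrdinal(word, key);
            if (cmp == 0) return BitConverter.ToInt32(record, KeySize);
            if (cmp < 0) hi = mid - 1; else lo = mid + 1;
        }
        return -1;
    }

    // Stream.Read may return fewer bytes than requested, so loop until the buffer is full.
    private static void ReadFully(Stream s, byte[] buffer)
    {
        int read = 0;
        while (read < buffer.Length)
        {
            int n = s.Read(buffer, read, buffer.Length - read);
            if (n == 0) throw new EndOfStreamException();
            read += n;
        }
    }
}
```

With an index opened as a `FileStream`, a lookup such as `FixedIndex.Find(indexStream, "tree")` touches at most 17 records for 100,000 entries, and the returned value can then be passed to `Seek` on the dictionary file. Note that a 4-byte signed offset limits the dictionary file to 2 GB.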

