--- doc:oodbcreate [2012/07/25 19:41] – created admin
+++ doc:oodbcreate [2012/07/26 08:15] – admin
@@ Line 1: / Line 1: @@
 ====== OODB - The own OOBD database format ======
-Why an own database format?
+===== Why an own database format? =====
-There are some reasons/advantages to set up an own format for the OOBD:
+There are several reasons for, and advantages to, using a custom format for the OOBD database:
   * the main one: there is no real need for a full-featured query engine like "''SELECT this, that FROM here, there WHERE all=nothing..''"; a simple key -> value(s) lookup is almost all we need
-  * low memory usage: the whole search is file based, memory is only used for the found data
+  * low memory usage: the whole search is file based; memory is only used temporarily for the found data
-  * generic inputstreams: The db only needs an inputstream as source, which is strictly read only in forward direction. By that e.g. encrypted data file can be used.
+  * generic input streams: the db only needs an input stream as its source, which is read strictly in the forward direction only. This means that, for example, encrypted data files can be used.
   * high speed: the whole search is just a balanced binary tree lookup, which makes it fast even on slow devices
-How to generate such OODB data files?
+===== How to generate such OODB data files? =====
 OODB data files are generated from a CSV (comma-separated values) input file in which the values are in fact separated by tabs rather than commas.
@@ Line 18: / Line 20: @@
    oodbCreateCli inputfile.csv > outputfile.oobd
 The outputfile.oobd must then be placed in the same directory as the Lua script that uses the database.
-===== Input file Format =====
+===== The Input file Format =====
 The file format of the input file must be as follows:
-HeaderLine \n
+HeaderLine \n \\
-Line 1 \n
+Line 1 \n \\
-..
+... \\
-Line n \n
+Line n \n \\
-HeaderLine = (colum_name 0) \t (colum_name 1) \t (.. colum_name n)
-Line = Key \t Values
+where
-Values = (Value_of_Colum 0) \t (Value_of_Colum 1) \t (..Value_of_Colum n)
+HeaderLine =  \\
+(column_name 0) \\
+[\t (column_name 1)] \\
+... \\
+[\t (column_name n)]
+Line =  \\
+Key \t Values
+Values = \\
+(Value_of_Column 0) \\
+[\t (Value_of_Column 1)] \\
+... \\
+[\t (Value_of_Column n)]
 The input file must be sorted in ascending order by its keys.
-If there are more as one Value per key, the Values must sorted in the sequence as they should be used later
+If there is more than one Value per key, the Values must be sorted in the sequence in which they should appear in the later query.
+Lines that begin with a # are treated as comment lines and are ignored.
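The input rules above can be checked with a small script before running oodbCreateCli. The following is only a sketch, not part of the OOBD tools: the function name parse_input is hypothetical, and it assumes that keys are compared as plain strings, that blank lines can be skipped, and that several values for one key are given as consecutive lines repeating that key.

```python
def parse_input(text):
    """Parse tab-separated OODB input text into (header, entries).

    Assumptions (not confirmed by the format description):
    - keys are compared as plain strings,
    - blank lines are skipped,
    - several Values for one key appear as consecutive lines with that key.
    """
    # Drop blank lines and comment lines starting with '#'
    lines = [ln for ln in text.split("\n") if ln and not ln.startswith("#")]
    header = lines[0].split("\t")
    entries = []  # list of (key, [values, ...]) in file order
    for line in lines[1:]:
        key, _, values = line.partition("\t")
        if entries and key < entries[-1][0]:
            raise ValueError(f"keys are not sorted ascending at {key!r}")
        if entries and entries[-1][0] == key:
            # Same key again: keep the values in the order given
            entries[-1][1].append(values.split("\t"))
        else:
            entries.append((key, [values.split("\t")]))
    return header, entries
```

Calling parse_input on the content of inputfile.csv either returns the header columns and the grouped entries, or raises ValueError at the first key that breaks the ascending order.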
-===== Input file Format =====
+===== The OODB (Output) file Format =====
 The generated output format will be as follows:
-HeaderLine 0x0
+HeaderLine 0x0 \\
-Entry 1
+Entry 1 \\
-..
+.. \\
-Entry n
+Entry n \\
-Entry = Key 0x0 (Offset, if key > Searchstring) (Offset, if key < Searchstring) Values 1  0x0 [..Values n  0x0] 0x0
+Entry = \\
+Key 0x0 \\
+Offset (for key > Searchstring) \\
+Offset (for key < Searchstring) \\
+Value 1 0x0  \\
+[..Values n  0x0] \\
+0x0
-Offset = binary unsigned 32-Bit Big Endian, calculated as skip() value from the fileposition after the second 4-Byte value up to the start of the next key string to be evaluated. To distingluish
+Offset =  \\
+binary unsigned 32-Bit Big Endian, calculated as skip() value from the file position after the second 4-Byte value up to the start of the next key string to be evaluated. To distinguish
 between an offset of 0 to the next key string and a 0 as the indicator for the end of the search tree, the skip() offset given in the file is always 1 higher than in reality, so 1 needs to be
-substracted to have the correct jump width (e.g. Offset in file: 9 means real jump width 9 -1 = 8   = skip(8)
+subtracted to get the correct jump width (e.g. offset in file: 9 means real jump width 9 - 1 = 8, i.e. skip(8)).
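The "stored value is one higher than the real jump width" rule can be illustrated with a short sketch. The helper names encode_offset and decode_offset are hypothetical and only serve this example; None is used here to represent "no further key in this direction".

```python
import struct

END_OF_TREE = 0  # stored value meaning "no further key in this direction"


def encode_offset(jump_width):
    """Pack a real jump width (or None for 'end of tree') as stored in the file."""
    stored = END_OF_TREE if jump_width is None else jump_width + 1
    return struct.pack(">I", stored)  # unsigned 32-bit, big endian


def decode_offset(raw):
    """Unpack 4 bytes from the file into a real jump width, or None at the end."""
    (stored,) = struct.unpack(">I", raw)
    return None if stored == END_OF_TREE else stored - 1
```

With these helpers, the stored bytes 00 00 00 09 decode to a jump width of 8, and a real jump width of 0 is stored as 00 00 00 01, so it can never be confused with the end marker.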
 How to read this file:
+<code>
 - Read Headerline (from the file beginning until the first 0x0). Store this data for later naming of the found columns.
-- read key value (string until the next 0x0) and the both next 4 Byte long skip() offsets (= relativefile positions) for the greater and smaller key value. If they are 0 (zero), there's no more smaller or greater key available
+- read key value (0x0-terminated string) and the next two 4-byte skip() offsets (= relative file positions) for the greater and smaller key value. If an offset is 0 (zero), there is no further greater or smaller key available
 - compare key with search string:
     - if equal, read the attached values into an array. This array then contains the search result(s). Return it together with the header line as a positive search result.
     - if smaller:
           if smaller file position is 0 (zero), then return from search with empty result array.
-          if smaller file position is not 0, jump per skip() to the file postion of the next index string and continue again with step 2
+          if smaller file position is not 0, jump via skip(value - 1) to the file position of the next index string and continue again with step 2
     - if bigger:
           if bigger file position is 0 (zero), then return from search with empty result array.
-          if bigger file position is not 0, jump per skip() to the file postion of the next index string and continue again with step 2
+          if bigger file position is not 0, jump via skip(value - 1) to the file position of the next index string and continue again with step 2
+</code>
-*/
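The reading steps can be sketched as a forward-only lookup. This is an illustration under stated assumptions, not the OOBD implementation: the function names are hypothetical, keys are compared as raw bytes, the text is assumed to be UTF-8, the header and each Values string are assumed to be tab-separated, and the first offset is taken literally from the entry layout as the one used when the stored key is greater than the search string.

```python
import io
import struct


def read_cstring(stream):
    """Read bytes up to (not including) the next 0x0 terminator."""
    out = bytearray()
    while True:
        b = stream.read(1)
        if not b:
            raise EOFError("unexpected end of stream")
        if b == b"\x00":
            return bytes(out)
        out += b


def skip(stream, count):
    """Move forward by reading and discarding, so no seek() is needed."""
    while count > 0:
        chunk = stream.read(min(count, 4096))
        if not chunk:
            raise EOFError("unexpected end of stream")
        count -= len(chunk)


def lookup(stream, search):
    """Return (header, values) for the key `search` (bytes); values is [] on a miss."""
    header = read_cstring(stream).decode("utf-8").split("\t")
    while True:
        key = read_cstring(stream)
        # Assumption: first offset applies if key > search, second if key < search
        if_key_greater, if_key_smaller = struct.unpack(">II", stream.read(8))
        if key == search:
            values = []
            while True:
                item = read_cstring(stream)
                if item == b"":  # lone 0x0 closes the entry
                    return header, values
                values.append(item.decode("utf-8").split("\t"))
        offset = if_key_greater if key > search else if_key_smaller
        if offset == 0:  # end of the search tree in this direction
            return header, []
        skip(stream, offset - 1)  # stored offset is always 1 too high
```

A typical call would be lookup(open("outputfile.oobd", "rb"), b"somekey"). Because only read() is used, any forward-only source, such as a decrypting stream wrapper, could be passed in instead of a plain file.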