doc:oodbcreate

Differences

This shows you the differences between two versions of the page.

doc:oodbcreate [2012/07/25 19:41] – created admindoc:oodbcreate [2014/03/02 08:08] (current) – removed admin
Line 1: Line 1:
-====== OODB - OOBD's own database format ======
  
-Why a custom database format?
-
-There are several reasons for giving OOBD a format of its own:
-  * the main one: there is no real need for a full-featured query engine along the lines of "''SELECT this, that FROM here, there WHERE all=nothing..''"; a simple key -> value(s) lookup is nearly all we need
-  * low memory usage: the whole search is file based, memory is only used for the data that is found
-  * generic input streams: the db only needs an input stream as its source, which is read strictly in the forward direction. This means that e.g. an encrypted data file can be used.
-  * high speed: the whole search is just a balanced binary tree lookup, which keeps it fast even on slow devices
- 
- 
- 
-How to generate such OODB data files? 
- 
-OODB data files are generated from a csv (comma separated value) input file, in which the values are in fact separated by tabs rather than commas.
-
-Such an input file, which must fulfill the requirements described below, is translated with the [[https://code.google.com/p/oobd/source/browse/trunk/tools/oodbcreate/|oodbCreateCli]] php script:
- 
-   oodbCreateCli inputfile.csv > outputfile.oobd 
-    
-    
-The outputfile.oobd then belongs in the same directory as the Lua script that wants to use the database.
- 
-===== Input file Format ===== 
- 
- 
-The format of the input file must be as follows:
- 
-HeaderLine \n 
-Line 1 \n 
-.. 
-Line n \n 
- 
-HeaderLine = (column_name 0) \t (column_name 1) \t (.. column_name n)
-Line = Key \t Values
-Values = (Value_of_Column 0) \t (Value_of_Column 1) \t (..Value_of_Column n)
- 
- 
-The input file must be sorted in ascending order by its keys.
-
-If there is more than one Value per key, the Values must be sorted in the sequence in which they should be used later.
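The sorting requirement can be checked before running the converter. The following is only a minimal sketch: the function name ''find_unsorted_keys'' is hypothetical, and it assumes that keys are compared as plain strings and that several values for one key appear as neighbouring lines with the same key. The comparison that oodbCreateCli itself expects is not stated on this page.

```python
def find_unsorted_keys(lines):
    """Return the 1-based numbers of lines whose key is smaller than the previous key.

    `lines` is the input file content as a list of strings, header line first.
    Equal neighbouring keys are accepted (assumed: several value lines per key).
    """
    bad = []
    previous = None
    for number, line in enumerate(lines[1:], start=2):  # line 1 is the header
        # The key is everything before the first tab.
        key = line.rstrip("\n").split("\t", 1)[0]
        if previous is not None and key < previous:  # assumption: plain string order
            bad.append(number)
        previous = key
    return bad
```

An empty result means the keys are in ascending order under the assumed comparison.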
- 
- 
- 
-===== Output file Format =====
- 
-The generated output format will be as follows: 
- 
-HeaderLine 0x0 
-Entry 1 
-.. 
-Entry n 
- 
-Entry = Key 0x0 (Offset, if key > Searchstring) (Offset, if key < Searchstring) Values 1  0x0 [..Values n  0x0] 0x0 
- 
-Offset = binary unsigned 32-bit big-endian value, calculated as the skip() value from the file position after the second 4-byte value up to the start of the next key string to be evaluated. To distinguish between an offset of 0 to the next key string and a 0 as the indicator for the end of the search tree, the skip() offset stored in the file is always 1 higher than the real one, so 1 needs to be subtracted to get the correct jump width (e.g. offset in file: 9 means real jump width 9 - 1 = 8, i.e. skip(8)).
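The "stored value is one higher" rule can be illustrated with a small decoder. This is a sketch rather than code from the OOBD sources; the name ''decode_offsets'' and the use of ''None'' for the end of the search tree are assumptions made for the example.

```python
import struct

END_OF_TREE = 0  # a stored 0 means: no further key in this direction


def decode_offsets(raw):
    """Decode the two stored 4-byte offsets that follow a key.

    Returns a pair in file order (offset if key > search string,
    offset if key < search string). Each element is the number of
    bytes to skip, or None when the stored value is 0.
    """
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")
    stored = struct.unpack(">II", raw)  # two unsigned 32-bit big-endian values
    # Stored offsets are 1 higher than the real jump width.
    return tuple(None if value == END_OF_TREE else value - 1 for value in stored)
```

With the example from the text, a stored 9 decodes to a jump width of 8, while a stored 1 decodes to 0, meaning the next key string follows immediately.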
- 
- 
- 
-How to read this file:
-
-1 - Read the header line (from the beginning of the file up to the first 0x0). Store this data for naming the found columns later.
-2 - Read the key value (the string up to the next 0x0) and the two 4-byte skip() offsets (= relative file positions) that follow it, for the greater and the smaller key value. If an offset is 0 (zero), there is no further smaller or greater key available.
-3 - Compare the key with the search string:
-    - if equal, read the attached values into an array. This array then contains the search result(s). Return it together with the header line as a positive search result.
-    - if smaller:
-          if the smaller offset is 0 (zero), return from the search with an empty result array.
-          if the smaller offset is not 0, jump via skip() to the file position of the next key string and continue again with step 2
-    - if bigger:
-          if the bigger offset is 0 (zero), return from the search with an empty result array.
-          if the bigger offset is not 0, jump via skip() to the file position of the next key string and continue again with step 2
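The reading steps can be sketched as a forward-only lookup over any byte stream. This is an illustration under stated assumptions, not the OOBD implementation: all function names are hypothetical, keys are compared as raw bytes, value strings are assumed to be non-empty (an empty string ends the entry), and the first stored offset is taken when the stored key is greater than the search string, following the Entry definition above. The ''entry'' helper and ''DEMO_DB'' only build a tiny hand-made file for demonstration.

```python
import io
import struct


def read_cstring(stream):
    """Read bytes up to (not including) the next 0x0 byte."""
    out = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended inside a string")
        if byte == b"\x00":
            return bytes(out)
        out += byte


def skip(stream, count):
    """Discard `count` bytes using forward reads only."""
    while count > 0:
        chunk = stream.read(min(count, 4096))
        if not chunk:
            raise EOFError("stream ended during skip")
        count -= len(chunk)


def lookup(stream, search):
    """Return (header, values); values is empty if the key is not found."""
    header = read_cstring(stream)                      # step 1
    while True:
        key = read_cstring(stream)                     # step 2
        raw = stream.read(8)
        if len(raw) != 8:
            raise EOFError("stream ended inside the offsets")
        if_key_greater, if_key_smaller = struct.unpack(">II", raw)
        if key == search:                              # step 3, equal
            values = []
            while True:
                value = read_cstring(stream)
                if not value:                          # final 0x0 closes the entry
                    return header, values
                values.append(value)
        stored = if_key_greater if key > search else if_key_smaller
        if stored == 0:                                # end of the search tree
            return header, []
        skip(stream, stored - 1)                       # stored offset is 1 too high


def entry(key, if_key_greater, if_key_smaller, values):
    """Build one entry (demo helper, not part of the format description)."""
    body = b"".join(v + b"\x00" for v in values) + b"\x00"
    return key + b"\x00" + struct.pack(">II", if_key_greater, if_key_smaller) + body


# Hand-made three-key file: root "b", followed by the leaves "a" and "c".
leaf_a = entry(b"a", 0, 0, [b"A1"])
leaf_c = entry(b"c", 0, 0, [b"C1", b"C2"])
root_tail = len(b"B1\x00\x00")  # bytes between the root's offsets and the next key
root = entry(b"b", root_tail + 1, root_tail + len(leaf_a) + 1, [b"B1"])
DEMO_DB = b"key\tvalue\x00" + root + leaf_a + leaf_c
```

For example, ''lookup(io.BytesIO(DEMO_DB), b"c")'' walks from the root entry to the leaf for "c" with a single skip. Because the stream is only read and skipped forward, the same function would work on a stream that cannot seek.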
- 
- 
- 
doc/oodbcreate.1343238083.txt.gz · Last modified: 2012/07/25 19:41 by admin