The new forums will be named Coin Return (based on the most recent vote)! You can check on the status and timeline of the transition to the new forums here.
The Guiding Principles and New Rules document is now in effect.

Coding help...sort tree

DisrupterDisrupter Registered User regular
edited October 2010 in Help / Advice Forum
Hey all. Long story short I am working on a project to build a bridge between mysql and an older flat file based database. The old database has the following:

A map file
a key file
X number of index files.


Map and key are very basic. Map file gives basic information about the database, key file contains the actual data with each entry being N bytes. Ive written a series of classes to read the map and then be able to grab data from the key file.

The index files are created by the old database as a way to quickly retrieve the record of indexed keys. What I need help with is determining what sort of tree the data is stored as given the limited documentation on the site.

Here is what the site says about the index files:
Automatic Index Header

The first 140 bytes of the index are as follows:
Offset Length Contents
0 2 "0xC139" - filePro 4.5 index magic number
2 2 Depth of index tree
4 2 "1" if root node is in block zero, else "0"
6 2 Maximum number of keys per node
8 4 Size of each node, in bytes
12 4 Block number of root node
16 4 Number of records in index
20 2 Length of sort key, in bytes
22 2 Length of record number portion of key in bytes
24 64 Sort information -- see below
88 4 Block number of freechain head
92 48 Index comment (NUL-terminated ASCII)

Sort Information

The sort information is the following 8 bytes repeated 8 times - once for each possible sort key.
Offset Length Contents
0 2 The field number
2 1 Associated field instance
3 1 "0" - Used for output formats only
4 2 Field length
6 1 "0" for ascending, "1" for descending
7 1 Field type

Non-leaf node

Each non-leaf node has the following format:
Offset Length Meaning
0 2 The number of node pointers within this node
2 4 Left node pointer
6 n*m n entries consisting of: Lowest key of node (n bytes), Node pointer (4 bytes). The node count does not include the left node pointer.

Leaf node

Each leaf node has the following format:
Offset Length Meaning
0 4 Backward node pointer link
4 4 Forward node pointer link
8 2 Extended block flag:
" 0" - A normal block
" 1" - This block contains a list of duplicate keys, and continues into the next block
" 2" - This block is the end of a chain of duplicate-key blocks.
10 2 Number of key values in this node
12 2*n Offset within block of start of each key value

Each key value is stored as:

Offset Length Meaning
0 n The key value
n m*o A list of the records for this key

Freechain

Each node on the freechain is stored in a singly-linked list as:
Offset Length Meaning
0 4 Forward link pointer

Reading this, im trying to figure out what data type to store the tree as and how to convert it (or if its already stored as one based on the node descriptions.)

First off, I dont understand what a freechain is.

Then I am unsure of how to interpret the nodes.

What does the left pointer point to on the non-leaf-nodes? Is it pointing to the parent? And all other pointers are pointing to children nodes? Id assume that a non-leaf-node would only contain at most, one other non-leaf-node and any amount of leaf nodes?

Leaf nodes contain the data. I get that. Backwards pointer points to the parent node. Forwards pointer possibly points to other leaf-nodes with a duplicate key?

Am I understanding this right? If so, what the heck are free chains?

616610-1.png
Disrupter on
Sign In or Register to comment.