Databases And Bitmaps

Authors

Publication

Pub Details

Date

Pages

See all articles from QL Hacker's Journal 23

I read an interesting article that introduced the idea of using bitmaps in implementing a database. I don’t know if this concept will be of use to any QL programmers, but I found it interesting just to know.

Conventional databases use B-trees and hashing to implement indexes. B-trees are just that, trees. An index is created in a tree structure lumping like records together. Hashing is using a mathematical formula to distribute the records into an array, so that when you need to find them again, you just plug the record into the formula and it takes you to the proper place in the array.

These structures work well in transaction-oriented systems (find, edit, delete), but they begin to have problems when the number of conditions of a query increase. The more ANDs and ORs the longer these structures take to search. Enter the bitmap index.

Bitmap indexes store all possible values for a field into a bitmap. Bitmaps can be ANDed and ORed quickly to return a result. Queries can be made with many conditions in the index without even looking at the actual data in the records.

Here is a example of a bitmap index. Using the illustration below, note that each record (1-7) has a 4-bit index which is holds the color of the car. With four possible values for color, the appropriate bit is turned on for each color.

          1  2  3  4  5  6  7
-------------------
Blue 1 0 0 0 0 1 0 Cars 1 and 6 are Blue.
Orange 0 1 0 0 0 0 0 Car 2 is Orange.
Green 0 0 0 0 1 0 1 Cars 5 and 7 are green.
Red 0 0 1 1 0 0 0 Cars 3 and 4 are red.

Below is the query for “which cars are red, made in 1995, and cost $19,000.” Each record has a bitmap index for each of the values. To get the result the three bitmaps are ANDed and the result shows in the Result column: Car #4. ORing three 7 bit numbers can be done very quickly.

        Red   1995   $19K   Result
----------------------------
1 0 0 0 0
2 0 0 1 0
3 1 0 0 0
4 1 and 1 and 1 = 1
5 0 0 0 0
6 0 0 0 0
7 0 1 0 0

Bitmap index have two major problems: they are hard to update and can’t handle high cardinality. Cardinality is the number of different values each field can have. If you have a database of cities, there are many possible values for the city. Where as in this car example, there are only so many different colors a car can be. Bitmaps can take substantially longer to update than a plain B-tree.

Like with most computer solutions, the most optimal solution may be a combination of all structures. Using B-trees, Hashing, and Bitmap indexes, in conjunction, may work out to be the best approach.

Products

 

Downloadable Media

 

Image Gallery

Scroll to Top