AdBrite

Saturday, June 16, 2007

Data structures: VB and VBA offer two options: arrays and collections

Data structures: VB and VBA offer two options: arrays and collections
by Mike Tuersley

This month is devoted to data structures. This isn't a computer science course, so let's define a data structure as a data holder--something that holds whatever data we need while our program executes. Visual BASIC and VBA use two basic data structures: the array and the collection. This month when I say VB, I also mean VBA.

ARRAYS

Most VB programmers are familiar with the array (figure 1). An array can be single or multidimensional, which means it can have rows and columns. While the array is the most commonly used data structure, it's not the best choice for at least two reasons. The first is data access. To find something within the array, you have to know exactly what you want. If you knew that, you wouldn't have to look, would you? To search for an item, you must iterate through the array.
[FIGURE 1 OMITTED]

The second reason is that the array must be dimensioned to a known limit. Once set, it's not easy to add an item after that limit is reached. For example, if you have a list of four people, you can't add a fifth person until you resize the array using ReDim or ReDim Preserve. ReDim resizes the array and clears any existing values, while ReDim Preserve resizes and keeps any existing values. Also, the upper limit of the array can't be a dynamic variable. An exception to the dimensioning rule is using an array with a command such as Split (figure 2).

[FIGURE 2 OMITTED]
Advertisement

If the array isn't dimensioned, the Split command dimensions and fills the array on the fly. Arrays are zero-based in VBA (and in VB, although you can use the Option Base statement to change it). In most cases, the array is limited to a single data type.

COLLECTIONS

A collection (figure 3) is usually a better choice for a data structure. All AutoCAD programmers have used collections whether they realize it or not, because AutoCAD uses them. Layers and blocks are two of the most commonly accessed collections within AutoCAD. A collection differs from an array in that it isn't dimensioned, it is one-based, and it can hold multiple data types.

[FIGURE 3 OMITTED]

Once the collection variable is declared, it must be set using the New command. Though sometimes the New command is used within the declaration, it's best to avoid doing it that way. Each new item should be specifically added using the collection object's Add method. Then when an item is needed, it's accessed using the collection object's Item method.

There are two main reasons to choose a collection over an array. The first is that you can add as many items as you need without worrying about a limit. The second is data access speed. When finding an item in the collection, the program doesn't have to iterate through each item. If you use the optional keyword when adding items, the program can go directly to the desired item.

In addition to the keyword, two more optional parameters--Before and After--let you add items to the collection in specific places. You can also use Count to return the number of items that the collection currently holds and Delete to remove items from the collection. If you start Delete, all items located after the deleted item are renumbered so there is not an empty position.

One problem with both an array and a collection is that neither includes data validation. For example, say you want only unique values stored within your array or collection. With an array, you can write a function that iterates all existing items and compares the new value to each one. If a match is found, the function does nothing. Otherwise it adds the item to the array.

With a collection, you can use keywords and trap the error encountered when a duplicate keyword is added (figure 4). Though using an On Error Resume Next statement is discouraged, it can work to your advantage here. You could also test for the error number 457, then clear and resume only if it's that number. On any other error, do whatever needs to be done.

[FIGURE 4 OMITTED]

KICK IT UP A NOTCH

Kick your applications up by using a Microsoft Scripting Dictionary as your data structure. To access the Dictionary object, make a reference to the Microsoft Scripting Runtime and then use it, as follows:

Dim sdGuys As Dictionary

Set sdGuys = New Dictionary

From here, the dictionary is just like a collection on steroids. It has more methods and properties such as Exist, which checks for a duplicate entry without using our error trick for the collection or writing an entire function as we did for the array.

Mike Tuersley is lead applications engineer and lead consultant at Imaginit Technologies

0 comments: