DataLab is a compact statistics package aimed at exploratory data analysis. Please visit the DataLab Web site for more information.



Guided Tour: Transforming Data

A particularly useful feature of DataLab is its support for arbitrary mathematical transformations, which can be applied to the data matrix or to parts of it. The same capability can also be used to create new artificial data (e.g. random numbers).

The principle of data transformation is simple: rows and columns are denoted by the characters 'R' and 'C' followed by a numerical index. The expression 'C13', for example, refers to variable (column) number 13. These identifiers can be used within a mathematical formula just like variables in a BASIC statement. An expression such as 'C2 = C3*(C10-C4)' subtracts column 4 from column 10, multiplies the difference by the values in column 3, and stores the result in column 2. The calculation is carried out for each row of the data matrix unless the range of rows is restricted.
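For readers who think in code, the row-by-row evaluation of 'C2 = C3*(C10-C4)' can be sketched in Python. This is only an illustration of the principle, not DataLab's implementation: the function name apply_c2_formula, its first_row/last_row parameters and the small 3 by 10 example matrix are all made up for this sketch.

```python
from typing import List, Optional

Matrix = List[List[float]]


def apply_c2_formula(matrix: Matrix, first_row: int = 1,
                     last_row: Optional[int] = None) -> None:
    """Emulate 'C2 = C3*(C10-C4)' in place, using 1-based row numbers.

    Hypothetical helper for illustration; not part of DataLab.
    """
    if last_row is None:
        last_row = len(matrix)
    for r in range(first_row - 1, last_row):
        row = matrix[r]
        # Column Cn lives at Python index n-1.
        row[1] = row[2] * (row[9] - row[3])


# Made-up 3 x 10 matrix: row 1 holds 1..10, row 2 holds 11..20, and so on.
data = [[float(10 * r + c) for c in range(1, 11)] for r in range(3)]
apply_c2_formula(data)
print([row[1] for row in data])  # [18.0, 78.0, 138.0]
```

Passing first_row and last_row mimics restricting the range of rows: only the rows inside the range are recalculated, all others keep their old values in column 2.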

To enter a formula, select the command Math/Apply Math Formula or click the corresponding shortcut button. This opens an editor in which you can enter the formula. The editor shows the last formula entered as the default. You can also pick an earlier formula from the history list by opening the drop-down list of the formula entry box.

For now, let's create a data matrix (100 by 5 elements) holding artificial data. First, set the number of rows and columns to 100 and 5, respectively (use the command Edit/Resize Data Matrix or the shortcut button in the matrix view). Then enter the following formula (command Math/Apply Math Formula):

C1 = SIN(0.33*RIX)*COS(0.13*RIX)
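If you would like to check the numbers outside DataLab, the same column can be computed with a few lines of Python. The sketch assumes that RIX stands for the row index counted from 1 and that SIN and COS take their arguments in radians; both are assumptions made for this example, as are the variable names.

```python
import math

N_ROWS, N_COLS = 100, 5

# Empty 100 x 5 data matrix, stored as a list of rows.
matrix = [[0.0] * N_COLS for _ in range(N_ROWS)]

# Assumption: RIX is the 1-based row index, angles are in radians.
for rix in range(1, N_ROWS + 1):
    matrix[rix - 1][0] = math.sin(0.33 * rix) * math.cos(0.13 * rix)

# Show the first few values of column 1.
print([round(row[0], 4) for row in matrix[:5]])
```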

This fills the first column with a smooth oscillating curve. You can plot this curve by opening a plot window and selecting a Col-Idx plot (use the command Window/New or the corresponding shortcut button, then use the setup button of the plot window to select a Col-Idx type plot). Now let's try another experiment: suppose we want to add some Gaussian noise to a copy of the signal. To do so, enter the following formula:

C2 = C1 + 0.1*GAUSS

This formula adds normally distributed random numbers, scaled by 0.1, to the first column and stores the sum in the second column. You can display the resulting signal either by opening another data plot or by switching to the next column in the existing plot (use the arrow buttons). The noisy signal looks much more realistic, doesn't it?
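The noise step can be reproduced in Python as well. The sketch below assumes that GAUSS delivers a fresh standard normal random number (mean 0, standard deviation 1) for every row, and that RIX is the 1-based row index; neither assumption is confirmed by this page. The fixed seed is only there to make the example repeatable.

```python
import math
import random

random.seed(0)  # fixed seed, only to make this sketch reproducible

# Column 1: the noise-free signal (assumes RIX is the 1-based row index).
signal = [math.sin(0.33 * rix) * math.cos(0.13 * rix) for rix in range(1, 101)]

# Column 2: signal plus noise. Assumption: GAUSS behaves like a standard
# normal random number drawn anew for each row.
noisy = [c1 + 0.1 * random.gauss(0.0, 1.0) for c1 in signal]

# Root mean square of the added noise; it should be close to 0.1.
rms = math.sqrt(sum((n - s) ** 2 for n, s in zip(noisy, signal)) / len(signal))
print(round(rms, 3))
```

Scaling the random numbers by 0.1 sets the standard deviation of the noise to roughly one tenth of the signal's maximum possible amplitude, which is why the underlying curve stays clearly visible.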


Last Update: 2012-Jul-25