


Format of ASC data file


This data file (text file, default extension 'ASC') has the following structure:

Line 1 Arbitrary header line, containing a maximum of 255 characters.
Line 2 Parameter NFEAT (integer): number of columns (variables, features) of the data matrix (not including the optional object names and class information). Any comment may follow this number as long as this comment is separated by at least one blank from the numeric value and the whole line is no longer than 255 characters.
Line 3 Parameter NOBJ: number of objects of the data matrix. Any comment may follow this number as long as this comment is separated by at least one blank from the numeric value and the whole line is no longer than 255 characters.
Line 4 Parameters FLAG_ROWATTRIB, FLAG_FEATNAMES, FLAG_OBJNAMES (possible values: 'TRUE' or 'FALSE'). These parameters control the presence or absence of additional information: the class information (FLAG_ROWATTRIB), the names of features (FLAG_FEATNAMES), and the names of objects (FLAG_OBJNAMES). If a parameter is 'TRUE', the corresponding information is included in the data table that follows, and the format of the data table is adjusted accordingly. The values of the parameters must be separated by at least one blank. Any comment may follow these parameters.
Lines 5..k Names of features: the following line(s), holding the names of the features, are present only if the parameter FLAG_FEATNAMES is set to 'TRUE'. The feature identifiers must be separated by at least one blank or any ASCII character below 32, and they must be stored in the same order as the variables. If a feature identifier contains blanks, it has to be enclosed in double quotes ("). A literal double quote can be included by writing two double quotes (""). The number of names has to equal the number of features. The feature names may be spread over any number of lines, and the lines may be of any length. Note that the maximum length of a column identifier is 50 characters.
Lines k+1..n Class information, object names, and data: the data table is stored row by row, starting with the first variable as the first entry. Each row of variables is preceded by optional class information and an optional row identifier (the object name). This additional information is stored only if the parameters FLAG_ROWATTRIB and/or FLAG_OBJNAMES are set to 'TRUE'. If an object name contains blanks, it has to be enclosed in double quotes ("). A literal double quote can be included by writing two double quotes ("").
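
The quoting rules for feature and object names can be illustrated with a small tokenizer. This is only a sketch in Python, not code from the SDL Component Suite; the function name `tokenize` is hypothetical, and the handling of an unterminated quote (the token simply runs to the end of the text) is an assumption the format description does not cover.

```python
def tokenize(text):
    """Split text into tokens separated by blanks or control characters.

    A token that starts with a double quote runs to the closing quote;
    a doubled quote ("") inside it stands for one literal quote.
    """
    tokens = []
    i, n = 0, len(text)
    while i < n:
        if ord(text[i]) <= 32:              # blank or ASCII character below 32
            i += 1
        elif text[i] == '"':                # quoted identifier
            i += 1
            chars = []
            while i < n:
                if text[i] == '"':
                    if i + 1 < n and text[i + 1] == '"':
                        chars.append('"')   # "" -> one literal quote
                        i += 2
                        continue
                    i += 1                  # closing quote
                    break
                chars.append(text[i])
                i += 1
            tokens.append("".join(chars))
        else:                               # plain identifier or number
            start = i
            while i < n and ord(text[i]) > 32:
                i += 1
            tokens.append(text[start:i])
    return tokens


print(tokenize('F1 F2 "oil speed"'))  # ['F1', 'F2', 'oil speed']
```

Because line breaks count as separators, the same function can be applied to the whole remainder of the file after line 4.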

Any number of blanks or carriage returns is allowed between the values of a row. Nevertheless, it is strongly recommended to store the data table in a way that is easy to read and edit.

The values may be stored in any format (integer, floating point, exponential notation) and must be separated by at least one blank. The class information must be of integer type; the row identifiers are interpreted as strings. The lines can have any length and must not contain any comment.
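
Writing such a file could look roughly like the following Python sketch. The names `quote_name` and `write_asc` are hypothetical and not part of the SDL Component Suite. The sketch assumes that all three flags are 'TRUE', and it quotes any name that contains a blank, a control character, or a double quote, which is slightly more conservative than the rules require.

```python
def quote_name(name):
    """Return the name in quoted form if it cannot be written as a bare token."""
    if name == "" or '"' in name or any(ord(c) <= 32 for c in name):
        return '"' + name.replace('"', '""') + '"'
    return name


def write_asc(header, feat_names, rows):
    """Build the text of an ASC file.

    rows is a list of (class_number, object_name, values) tuples;
    every values sequence must hold one number per feature.
    """
    nfeat = len(feat_names)
    for name in feat_names:
        if len(name) > 50:                  # limit stated for column identifiers
            raise ValueError("feature name too long: " + name)
    out = [
        header[:255],                       # header line, at most 255 characters
        f"{nfeat} ;number of features",
        f"{len(rows)} ;number of objects",
        "TRUE TRUE TRUE ;class info, feat.names, obj.names",
        " ".join(quote_name(n) for n in feat_names),
    ]
    for cls, name, values in rows:
        if len(values) != nfeat:
            raise ValueError("row has the wrong number of values")
        # Class as integer, name as (possibly quoted) string, values as floats.
        fields = [str(int(cls)), quote_name(name)]
        fields += [repr(float(v)) for v in values]
        out.append(" ".join(fields))        # no comments in data lines
    return "\n".join(out) + "\n"


text = write_asc("demo", ["F1", "oil speed"],
                 [(1, "S23X4", [3.38, -4]), (2, "S12 early", [1.5, 2])])
print(text)
```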

The following example shows an ASCII data file that contains 10 rows of 3 variables each. Class information, feature names, and object names are included.

This is a sample file
3                 ;number of features
10                ;number of objects
TRUE TRUE TRUE    ;class info, feat.names, obj.names
                   F1      F2      "oil speed"
1   S23X4         3.380    2.20    -4
1   S24X4        15.900   -2.20    -4.033E-01
1   C24X3         3.607    1.20    2.2
2   "S12 early"  -3.305    2.20    -4
2   S12          35.340   -2.20    2.888E-01
1   SWINTER      13.670    1.20    22
2   "SPG MER 9"  -3.376    2.20    4.0
1   B1           25.375   -2.20    -1.113E+01
2   B2           -1.650    1.20    -0.1
2   B3            2.509    1.20    -10.0
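
A file like the sample above could be read with a short routine such as the following. This is a minimal Python sketch under stated assumptions, not the reader shipped with the SDL Component Suite: the names `tokenize` and `read_asc` and the returned dictionary layout are hypothetical, no length limits are checked, and a truncated data table simply raises an exception.

```python
import re

# A quoted token (with "" as an escaped quote) or a run of characters above ASCII 32.
_TOKEN = re.compile(r'"((?:[^"]|"")*)"|([^\x00-\x20]+)')


def tokenize(text):
    """Split text at blanks and control characters, honouring double quotes."""
    tokens = []
    for m in _TOKEN.finditer(text):
        if m.group(1) is not None:
            tokens.append(m.group(1).replace('""', '"'))
        else:
            tokens.append(m.group(2))
    return tokens


def read_asc(text):
    """Parse the text of an ASC file into a dictionary."""
    lines = text.splitlines()
    header = lines[0]
    nfeat = int(lines[1].split()[0])        # anything after the number is a comment
    nobj = int(lines[2].split()[0])
    has_class, has_featnames, has_objnames = (
        flag.upper() == "TRUE" for flag in lines[3].split()[:3]
    )
    # From line 5 on, line breaks are just separators between tokens.
    tokens = iter(tokenize("\n".join(lines[4:])))
    feat_names = [next(tokens) for _ in range(nfeat)] if has_featnames else []
    classes, obj_names, data = [], [], []
    for _ in range(nobj):
        if has_class:
            classes.append(int(next(tokens)))   # class information is an integer
        if has_objnames:
            obj_names.append(next(tokens))      # object name is a string
        data.append([float(next(tokens)) for _ in range(nfeat)])
    return {"header": header, "feat_names": feat_names, "classes": classes,
            "obj_names": obj_names, "data": data}


sample = """This is a sample file
3 ;number of features
2 ;number of objects
TRUE TRUE TRUE ;class info, feat.names, obj.names
F1 F2 "oil speed"
1 S23X4 3.380 2.20 -4
2 "S12 early" -3.305 2.20 -4
"""
table = read_asc(sample)
print(table["feat_names"])  # ['F1', 'F2', 'oil speed']
print(table["obj_names"])   # ['S23X4', 'S12 early']
```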


Last Update: 2007-Apr-09