7.4. Data file reading

Consider the following two sequences of two-dimensional vectors:

1.1 2.2 , 4.4 5.5 , 4.3 6. , 7.7 8.8
0.5 1.5 , 1.5 2.5 , 4.5 5.5 , 8. 8. , 7. 8.

Those sequences can be encoded in a data file:

# A simple data file

[ 1.1 2.2 ] ; [ 4.4 5.5 ] ; [ 4.3 6. ] ; [ 7.7 8.8 ] ; 
[ 0.5 1.5 ] ; [ 1.5 2.5 ] ; [ 4.5 5.5 ] ; [ 8. 8. ] ; [ 7. 8. ] ;

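As the example shows, each vector is written between square brackets and followed by a semicolon, each sequence occupies one line, and a line beginning with # is a comment. The file must end with a newline. A sequence can span multiple lines if every line to be continued ends with a backslash (\).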
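To make the format concrete, here is a minimal sketch of a parser written in plain Java. It is not Jahmm's reader and makes no claim to match its behaviour in every detail: the class name SeqFormatSketch and its methods are hypothetical, and the handling of comments (everything after a #) and of blank lines (ignored) is an assumption based on the example file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of Jahmm: parses the text format described
// above into one list of double[] per sequence.
class SeqFormatSketch {

    static List<List<double[]>> parse(String text) {
        List<List<double[]>> sequences = new ArrayList<>();
        StringBuilder logical = new StringBuilder();   // one sequence, possibly joined from several lines

        for (String raw : text.split("\n", -1)) {
            String line = raw;
            int hash = line.indexOf('#');              // assumption: '#' starts a comment
            if (hash >= 0) {
                line = line.substring(0, hash);
            }
            line = line.trim();

            // A trailing backslash means the sequence continues on the next line.
            if (line.endsWith("\\")) {
                logical.append(line, 0, line.length() - 1).append(' ');
                continue;
            }
            logical.append(line);
            flush(logical, sequences);
        }
        flush(logical, sequences);                     // tolerate a dangling continuation
        return sequences;
    }

    // Adds the buffered sequence (if it is not blank) and clears the buffer.
    private static void flush(StringBuilder logical, List<List<double[]>> out) {
        String s = logical.toString().trim();
        if (!s.isEmpty()) {
            out.add(parseSequence(s));
        }
        logical.setLength(0);
    }

    static List<double[]> parseSequence(String line) {
        List<double[]> vectors = new ArrayList<>();
        for (String item : line.split(";")) {
            item = item.trim();
            if (item.isEmpty()) {
                continue;
            }
            if (!item.startsWith("[") || !item.endsWith("]")) {
                throw new IllegalArgumentException("Expected [ ... ] but found: " + item);
            }
            String[] fields = item.substring(1, item.length() - 1).trim().split("\\s+");
            double[] v = new double[fields.length];
            for (int i = 0; i < fields.length; i++) {
                v[i] = Double.parseDouble(fields[i]);  // "8." is accepted and read as 8.0
            }
            vectors.add(v);
        }
        return vectors;
    }

    public static void main(String[] args) {
        String file = "# A simple data file\n\n"
                + "[ 1.1 2.2 ] ; [ 4.4 5.5 ] ; [ 4.3 6. ] ; [ 7.7 8.8 ] ; \n"
                + "[ 0.5 1.5 ] ; [ 1.5 2.5 ] ; [ 4.5 5.5 ] ; [ 8. 8. ] ; [ 7. 8. ] ;\n";
        for (List<double[]> seq : parse(file)) {
            System.out.println(seq.size() + " vectors");
        }
    }
}
```

Applied to the example file above, the sketch yields two sequences, of 4 and 5 vectors respectively. In a real program, the reader classes provided by the library should be preferred.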

The following program extract reads this file (here named "test.seq") using the library's reader classes:

Reader reader = new FileReader("test.seq");
List<List<ObservationVector>> seqs = ObservationSequencesReader.
             readSequences(new ObservationVectorReader(), reader);
reader.close();

A 3-state HMM can be fitted to those sequences using code such as:

KMeansLearner<ObservationVector> kml =
     new KMeansLearner<ObservationVector>(3,
          new OpdfMultiGaussianFactory(2), seqs);
Hmm<ObservationVector> fittedHmm = kml.learn();

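The first argument of the KMeansLearner constructor (here 3) is the number of states of the HMM. The argument of the OpdfMultiGaussianFactory constructor (here 2) is the dimension of the vectors.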