Starting with Full-Sized Codebooks

If you want to start with more than one vector per codebook, you need an algorithm that can create multiple vectors. If you only need one vector per codebook, that vector can be computed in a single pass by averaging the training vectors. For more than one vector you will have to use the neural gas or k-means algorithm. Both are iterative. It is possible to organize the training so that every training iteration (one pass over the entire training set) equals one iteration of k-means or neural gas. Depending on the kind of data and the size of the codebooks, you might need a large number of iterations. Since reading the entire training set over and over again can take considerable time, you might prefer to first extract the features into codebook-specific files and run neural gas or k-means on those files.
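
To illustrate how one pass over the training set can correspond to one k-means iteration, here is a minimal sketch in plain Python. It is not this toolkit's implementation: the names nearest, kmeans_pass and train are hypothetical, the training set is assumed to be a re-iterable sequence of equal-length feature vectors, and the initial codebook is assumed to be drawn at random from the training vectors.

```python
import random


def nearest(codebook, x):
    """Return the index of the codebook vector closest to x (squared Euclidean distance)."""
    best, best_d = 0, float("inf")
    for i, c in enumerate(codebook):
        d = sum((a - b) ** 2 for a, b in zip(c, x))
        if d < best_d:
            best, best_d = i, d
    return best


def kmeans_pass(codebook, training_set):
    """Perform one k-means iteration with a single pass over training_set.

    Only running sums and counts are kept, so the training vectors
    can be streamed and never need to be held in memory all at once.
    """
    dim = len(codebook[0])
    sums = [[0.0] * dim for _ in codebook]
    counts = [0] * len(codebook)
    for x in training_set:
        i = nearest(codebook, x)
        counts[i] += 1
        for j in range(dim):
            sums[i][j] += x[j]
    # A codebook vector that attracted no training vectors keeps its old value
    # (an assumption of this sketch; other strategies are possible).
    return [
        [s / n for s in row] if n else list(old)
        for row, n, old in zip(sums, counts, codebook)
    ]


def train(training_set, k, iterations, seed=0):
    """Hypothetical driver: each loop iteration is one pass over the training set."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(list(training_set), k)]
    for _ in range(iterations):
        codebook = kmeans_pass(codebook, training_set)
    return codebook
```

With a codebook of a single vector, kmeans_pass returns the mean of all training vectors after one pass, which matches the one-vector case described above. If the features have been extracted into codebook-specific files beforehand, training_set would simply read from such a file instead of re-processing the raw data on every pass.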