Train the CRF using the stochastic gradient descent method.
When evaluation data is supplied (that is, the optional IO action
is a Just value), the training process reports the current accuracy
on the evaluation set after every full pass over the training set.
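
The optional-evaluation behaviour described above can be sketched as follows. This is only an illustration under stated assumptions, not this library's implementation: a one-feature logistic classifier stands in for the CRF, and the names Params, Example, predict, sgdStep, accuracy and train, along with their signatures, are hypothetical. Examples are visited in a fixed order rather than sampled at random, to keep the sketch deterministic.

```haskell
import Control.Monad (foldM, forM_)
import Data.List (foldl')

-- Hypothetical stand-ins for the real model and data types.
type Params  = (Double, Double)   -- (weight, bias)
type Example = (Double, Bool)     -- (feature value, label)

-- Classify one feature value with the current parameters.
predict :: Params -> Double -> Bool
predict (w, b) x = w * x + b > 0

-- One gradient step on the logistic loss for a single example.
sgdStep :: Double -> Params -> Example -> Params
sgdStep rate (w, b) (x, y) = (w - rate * g * x, b - rate * g)
  where
    p = 1 / (1 + exp (negate (w * x + b)))   -- predicted probability of True
    g = p - (if y then 1 else 0)             -- derivative of the loss w.r.t. the score

-- Fraction of correctly classified examples (0 for an empty list).
accuracy :: Params -> [Example] -> Double
accuracy ps xs
  | null xs   = 0
  | otherwise = fromIntegral correct / fromIntegral (length xs)
  where
    correct = length [() | (x, y) <- xs, predict ps x == y]

-- Train for iterNum full passes. If the evaluation action is a Just,
-- print the accuracy after each pass; if it is Nothing, stay silent.
train :: Int -> Double -> IO [Example] -> Maybe (IO [Example]) -> IO Params
train iterNum rate trainIO evalIO = do
  trainData <- trainIO
  evalData  <- sequence evalIO   -- Maybe (IO a) becomes IO (Maybe a)
  foldM (onePass trainData evalData) (0, 0) [1 .. iterNum]
  where
    onePass trainData evalData ps i = do
      let ps' = foldl' (sgdStep rate) ps trainData
      -- forM_ over a Maybe runs the body only in the Just case.
      forM_ evalData $ \es ->
        putStrLn ("iteration " ++ show i
                  ++ ": accuracy = " ++ show (accuracy ps' es))
      return ps'
```

Under these assumptions, a call such as train 5 0.5 (return trainSet) (Just (return evalSet)) prints one accuracy line per pass, while passing Nothing as the last argument trains without any output. Running the evaluation action once, before the loop, is a design choice of this sketch; the real function may load the data differently.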
TODO: Add custom feature extraction function.