Comments (6)

rcurtin commented on June 14, 2024

states = total no. of tags in training data ?

Yes, that is correct.

also, you mentioned that in my case the observations will be multi-dimensional vectors; I didn't understand that

Each column of a sequence (sentence) should correspond to one word. You can choose how you want to represent each word. One option is to store each word as a numeric index (e.g. "hello" corresponds to 0, "goodbye" corresponds to 1, and so on), but that will only work when the emission distribution is DiscreteDistribution. If you are using GaussianDistribution or GMM, then you will want to represent each word as a vector, e.g. an embedding or a one-hot encoding.
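To make the two representations concrete, here is a minimal sketch in plain C++ (standard library only). The `Vocabulary` class and the `EncodeSentence` and `OneHot` helpers are hypothetical names made up for illustration; they are not part of mlpack. In real code, the resulting values would be copied into the columns of an arma::mat.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: gives each distinct word the next free index.
class Vocabulary {
 public:
  std::size_t IndexOf(const std::string& word) {
    auto it = index_.find(word);
    if (it != index_.end()) return it->second;
    const std::size_t id = index_.size();
    index_.emplace(word, id);
    return id;
  }

  std::size_t Size() const { return index_.size(); }

 private:
  std::unordered_map<std::string, std::size_t> index_;
};

// Option 1: one numeric index per word (suits a discrete emission distribution).
// The values are stored as double because the observation matrix holds doubles.
std::vector<double> EncodeSentence(Vocabulary& vocab,
                                   const std::vector<std::string>& words) {
  std::vector<double> encoded;
  encoded.reserve(words.size());
  for (const std::string& w : words)
    encoded.push_back(static_cast<double>(vocab.IndexOf(w)));
  return encoded;
}

// Option 2: a one-hot vector per word (one possible multi-dimensional observation).
std::vector<double> OneHot(std::size_t index, std::size_t dim) {
  assert(index < dim);
  std::vector<double> v(dim, 0.0);
  v[index] = 1.0;
  return v;
}
```

With index encoding, each sentence becomes a matrix with 1 row and one column per word. With one-hot encoding, each sentence has as many rows as there are words in the vocabulary.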

of type size_t - is it to represent states as numbers?

Yes, each state is represented by its index.

Does each row = states for each sequence?

Yes, each row vector in stateSeq should correspond to the list of hidden states for each word in the corresponding sentence.

from mlpack.

rcurtin commented on June 14, 2024

So, the answer to the question depends on whether you are doing this from C++, or from a binding or command-line program. In both cases, it could be helpful to take a look at the tests to get an idea of some examples, although I do understand that looking at test code is not always the easiest:

In short, an HMM is trained on a series of sequences (optionally, you might know the hidden states for each observation in a sequence, but that is not required). In C++, this is represented as a std::vector<arma::mat>, where each element in the outer std::vector corresponds to a sequence, and each inner arma::mat (which is a sequence) has each observation in the sequence as a column.

In really simple cases, each observation might be a single scalar (e.g. the temperature); in this case, each arma::mat sequence would have 1 row (the temperature) and however many columns were in that sequence. Each sequence can have a different length (number of columns). In more complex cases, each observation may actually be a multidimensional vector; I think that will be the case with your parts-of-speech tagging.
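The shape convention can be sketched without any library. The `Sequence` struct below is only a stand-in for arma::mat (it mimics the column-major layout), and `FromScalars` is a hypothetical helper; neither is mlpack code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for arma::mat, used only to show the shape convention.
struct Sequence {
  std::size_t rows;          // dimensionality of one observation
  std::size_t cols;          // number of observations in the sequence
  std::vector<double> data;  // column-major: column t holds observation t

  Sequence(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}

  double& At(std::size_t row, std::size_t col) {
    assert(row < rows && col < cols);
    return data[col * rows + row];
  }
};

// Builds a one-dimensional sequence (1 row) from scalar readings,
// e.g. one temperature per time step.
Sequence FromScalars(const std::vector<double>& readings) {
  Sequence seq(1, readings.size());
  for (std::size_t t = 0; t < readings.size(); ++t)
    seq.At(0, t) = readings[t];
  return seq;
}
```

A training set is then a std::vector of such sequences, and the sequences may have different numbers of columns.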

In C++, it will be up to you to use data::Load() to load the matrix for each sequence and pack the matrices into a std::vector<arma::mat>. Of course, if you only have one sequence, then there is only a need for one element in the std::vector.

If you are using the bindings (e.g. command-line mlpack_hmm_train), you can pass in a single sequence with the input_file option, where that file is just a matrix that contains a single sequence as described above. Or, if you specify the batch option, then it is expected that the file specified by the input_file option contains a list of filenames, each of which specifies one sequence.

I hope this helps! It actually is on the short-term TODO list to clarify the expectations of these methods, so hopefully that should help out. Let me know if I can clarify anything. 👍

kiner-shah commented on June 14, 2024

@rcurtin Thanks for the above explanation. However, I still have some confusion regarding the inputs to Train().
For my use case, from what I understood:

  1. Sequence = sentence
  2. Words = observations
  3. States = POS tags
  4. Transition probability: probability of current state C given previous state was P
  5. Emission probability: probability of observation O given the current state C.
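When the tags are known for every word, the two probabilities in items 4 and 5 can be estimated simply by counting. The sketch below uses plain C++ and assumes words and tags have already been mapped to indices. `HmmEstimates`, `NormalizeRows`, and `EstimateByCounting` are hypothetical names, and the row-wise layout is this sketch's own convention, not necessarily how mlpack stores its matrices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

struct HmmEstimates {
  Matrix transition;  // transition[p][c] ~ P(current state c | previous state p)
  Matrix emission;    // emission[c][o]   ~ P(observation o | current state c)
};

// Divides each row by its sum; rows with no counts are left as zeros.
void NormalizeRows(Matrix& m) {
  for (auto& row : m) {
    double total = 0.0;
    for (double v : row) total += v;
    if (total > 0.0)
      for (double& v : row) v /= total;
  }
}

// words[s][t] is the word index and tags[s][t] the tag index of
// position t in sentence s.
HmmEstimates EstimateByCounting(
    const std::vector<std::vector<std::size_t>>& words,
    const std::vector<std::vector<std::size_t>>& tags,
    std::size_t numTags, std::size_t numWords) {
  assert(words.size() == tags.size());
  HmmEstimates est{Matrix(numTags, std::vector<double>(numTags, 0.0)),
                   Matrix(numTags, std::vector<double>(numWords, 0.0))};
  for (std::size_t s = 0; s < words.size(); ++s) {
    assert(words[s].size() == tags[s].size());
    for (std::size_t t = 0; t < words[s].size(); ++t) {
      est.emission[tags[s][t]][words[s][t]] += 1.0;
      if (t > 0) est.transition[tags[s][t - 1]][tags[s][t]] += 1.0;
    }
  }
  NormalizeRows(est.transition);
  NormalizeRows(est.emission);
  return est;
}
```

This is plain relative-frequency counting with no smoothing, so any word/tag pair that never appears in the training data gets probability zero.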

I will be using C++ for experimentation (but I would love to know how to use CLI bindings as well).
From what I have thought:

  1. I will call the constructor: HMM(states) // states = total no. of tags in training data ?
  2. Then I will call Train():
    Train(dataSeq,  // vector of size = no. of sentences in data set
                    // each element of vector is a matrix, where I am confused on what will be the rows and columns and what will each element of matrix hold
                    // also, you mentioned that in my case the observations will be multi-dimensional vectors; I didn't understand that
          stateSeq  // vector of rows (of type size_t - is it to represent states as numbers?) with arbitrary number of columns
                    // Does each row = states for each sequence?
    );
    

mlpack-bot commented on June 14, 2024

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍

kiner-shah commented on June 14, 2024

@rcurtin I was able to implement POS tagging with HMMs successfully.
I made a YouTube video explaining the steps from start to finish.
Also, you can find the code here.

Thanks for the guidance.

rcurtin commented on June 14, 2024

Awesome! I will point people towards that in the future when there are questions about the HMM code. Also, if you are interested in adapting that for the examples repository, I think it would be a nice addition, but don't feel obligated (it's easy enough to link to the repository you have).
