
Posts

NLG pipeline

NLG: NLG stands for Natural Language Generation, a field of AI that aims to generate understandable and appropriate text from raw data. We should differentiate the concept of NLG from NLP and NLU. NLP is Natural Language Processing, a field of AI that works on text in general; it contains speech recognition, speech synthesis, NLG and NLU, so NLU and NLG are subsets of NLP. While NLG generates text, NLU takes text as input and produces some pattern from it, such as a sentiment analysis or a summary. The pipeline of NLG: NLG can be divided into 3 phases: Document planning, Microplanning and Realisation. The purpose of Document planning is to choose what to say; the purpose of Microplanning and Realisation is to find how to say it. Each phase has several components. In a traditional NLG system, we have six components: Content Determination, Text Structuring, Aggregation, Lexicalisation, Referring Expression Generation, and Realisation. Content Determination: Content Determination is the set of enti...
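To make the three phases concrete, here is a minimal sketch of the pipeline in Python. The `document_plan`, `microplan` and `realise` helpers and the weather data are hypothetical illustrations, not part of any real NLG library:

```python
# Minimal sketch of the classic three-phase NLG pipeline.
# All helpers below are hypothetical illustrations.

def document_plan(raw_data):
    # Document planning: decide WHAT to say
    # (content determination + text structuring).
    return [("temperature", raw_data["temp"]), ("sky", raw_data["sky"])]

def microplan(messages):
    # Microplanning: aggregation, lexicalisation and
    # referring expression generation.
    return [f"the {name} is {value}" for name, value in messages]

def realise(phrases):
    # Realisation: produce a grammatical surface sentence.
    return "Today, " + " and ".join(phrases) + "."

raw = {"temp": "21C", "sky": "cloudy"}
print(realise(microplan(document_plan(raw))))
# Today, the temperature is 21C and the sky is cloudy.
```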

Generative Adversarial Networks

Generative Adversarial Networks (GAN) is a Neural Network architecture that simulates a zero-sum game. There are two parts in this Neural Network: the first is called the Generator and the other is called the Discriminator. The Generator tries to mimic the data and make fake data whose distribution looks like the real data, while the Discriminator tries to maximize the difference between real data and fake data. That is the reason we call it a zero-sum game: the two parts act against each other. This structure makes GAN an interesting Neural Network architecture, and it has many applications in both academia and industry. In modeling, GAN is an equilibrium approach, as networks such as the Boltzmann Machine are. It is an optimization problem with two simultaneous objectives: minimize over the Generator and maximize over the Discriminator.  $\min_{G}\max_{D} V(D,G)$  $\min_{G}\max_{D} V(D,G) = \mathbb{E}_{x\sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z\sim p_{z}(z)}[\log(1 - D(G(z)))]$ $V(D,G)$ is the optimization problem subject to G ...
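To make the objective concrete, here is a minimal numerical sketch of the value function $V(D,G)$, assuming the discriminator's outputs on real and generated samples are already available as arrays (all names and numbers are illustrative):

```python
import numpy as np

# Sketch of the GAN value function:
# V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))]
# D(.) is assumed to return the probability that a sample is real.

def value_function(d_real, d_fake):
    """d_real: D(x) on real samples; d_fake: D(G(z)) on generated ones."""
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# The discriminator ascends V; the generator descends it.
d_real = np.array([0.90, 0.80, 0.95])  # D is confident real data is real
d_fake = np.array([0.10, 0.20, 0.05])  # D is confident fakes are fake
print(value_function(d_real, d_fake))  # near 0 (the maximum): D is winning
```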

Where do variance and bias come from?

What are variance and bias? Variance and bias are measures for evaluating our model across different datasets. Both of them occur together, and we have to find a methodology to trade them off. A good model is one with small variance and small bias.  Wikipedia gives formal definitions of variance and bias: The bias is an error from erroneous assumptions in the learning algorithm. The variance is an error from sensitivity to small fluctuations in the training set.  (source: https://en.wikipedia.org/wiki/Bias–variance_tradeoff ) However, I want to make them more informal and practical, so we can understand them through the following definitions: Variance measures how much the model's precision fluctuates between the training data, the test data and real (new) data. If a model has high precision in the training and test phases but low precision on real data, that model has high variance. High variance is also called overfitting: Too...
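As a rough sketch of how high bias and high variance show up in practice, here is an illustrative scikit-learn experiment; the polynomial degrees and the synthetic data are assumptions, not from the post:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 30).reshape(-1, 1)               # training data
y = np.sin(2 * np.pi * x).ravel() + rng.normal(0, 0.2, 30)
x_new = rng.uniform(0, 1, 30).reshape(-1, 1)           # "real" (new) data
y_new = np.sin(2 * np.pi * x_new).ravel() + rng.normal(0, 0.2, 30)

for degree in (1, 15):                                 # underfit vs. overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x, y)
    print(degree, model.score(x, y), model.score(x_new, y_new))
# degree 1:  both R^2 scores low              -> high bias (underfitting)
# degree 15: train R^2 high, new-data R^2 low -> high variance (overfitting)
```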

Mutual information and feature selection

Feature selection is one of the most important steps in making your model work well. In data mining, feature selection is the first step and it affects the whole process. Feature selection helps the model on several points: - The model will train faster - It reduces overfitting - It simplifies the model - It reduces the dimension of the data Hence, feature selection is the kick-off step and it affects everything downstream, especially the model. There are 3 types of feature selection: filter methods, wrapper methods and embedded methods. Filter methods: these methods "filter" the data based on a correlation score. Normally, our data have many features and a label. We calculate the correlations between the features and the label. After that, we only retain the features that have a good (relevant) correlation score and remove the others. In this type of method we have several ways to calculate the correlation. - Pearson correlation: this one is based on the covariance between 2 continuous variables. $$ p_{X,Y} = \frac {Cov(X, Y)...
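A quick sketch of filter-style selection with the Pearson score, using numpy's corrcoef; the synthetic data and the 0.3 cutoff are arbitrary illustrations:

```python
import numpy as np

# Filter method sketch: keep the features whose absolute Pearson
# correlation with the label exceeds a chosen threshold.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))            # 4 candidate features
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(0, 0.1, 200)

scores = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
selected = [j for j, s in enumerate(scores) if s > 0.3]  # arbitrary cutoff
print(scores)    # features 0 and 2 score high; 1 and 3 are near zero
print(selected)  # -> [0, 2]
```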

Expectation maximization

Expectation maximization algorithm: Expectation maximization (EM) is an algorithm applied in many settings. EM can be used in the Hidden Markov Model (HMM) or in Bayes models. The algorithm basically has 2 steps: the Expectation step and the Maximization step. The main advantage of EM is that it can solve problems with incomplete data or with latent variables. Put simply, the E step makes an assumption, and the M step maximizes that assumption and produces the parameters for the next E step. The algorithm finishes when we reach convergence. We will talk about the main idea of the algorithm and the math behind it.  The most popular example of EM is flipping two coins A and B. Assume we have two biased coins A and B. We flip a coin $m$ times, each time for $n$ flips. The question is: what are the head probabilities of coin A and coin B, $\theta_{A}$ and $\theta_{B}$ respectively, in this experiment?  If full information is provided, i.e. which coin (A or B) was used in each round, we can calculate the probabilities eas...
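Here is a minimal sketch of EM on the two-coin problem, assuming $m = 5$ rounds of $n = 10$ flips; the observed counts and starting values are illustrative:

```python
import numpy as np

# EM sketch for the two-coin problem. `heads[i]` is the number of
# heads in round i of n flips; which coin was used each round is
# the latent variable.
def em_two_coins(heads, n, theta_a=0.6, theta_b=0.5, iters=20):
    heads = np.asarray(heads, dtype=float)
    for _ in range(iters):
        # E step: responsibility that each round came from coin A,
        # under the current estimates of theta_A and theta_B.
        like_a = theta_a**heads * (1 - theta_a)**(n - heads)
        like_b = theta_b**heads * (1 - theta_b)**(n - heads)
        w_a = like_a / (like_a + like_b)
        # M step: weighted maximum-likelihood update of both biases.
        theta_a = np.sum(w_a * heads) / np.sum(w_a * n)
        theta_b = np.sum((1 - w_a) * heads) / np.sum((1 - w_a) * n)
    return theta_a, theta_b

# m = 5 rounds of n = 10 flips each (illustrative counts)
print(em_two_coins([5, 9, 8, 4, 7], n=10))  # converges to about (0.80, 0.52)
```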

Introduction to recurrent neural networks

Recurrent neural networks (RNN) are among the most popular neural networks. If you have heard about LSTM (Long Short-Term Memory), it is one type of RNN. Notably, the RECURSIVE neural network is a generalization of the recurrent neural network. The difference between them is where the weights are shared: in a recursive neural network, shared weights are applied at every node, whereas in a recurrent neural network, shared weights are applied across the sequence. Problem: Could you tell which word fills in the sentence "I like French … "? If we represent the sentence as word indices based on a dictionary, we are facing a sequence problem: predicting the next word given the previous words. RNN does not only deal with sequence problems; it also builds a neural network that can remember, which is exactly what the brain does regularly. Normally, a feedforward neural network only processes information through its layers and forgets information in t...
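A minimal numpy sketch of one recurrent step; the vocabulary size, hidden size and word indices are toy assumptions, chosen only to show how the same weights are reused at every position:

```python
import numpy as np

# Minimal RNN step: the same weights (W_xh, W_hh) are shared across
# every position in the sequence, and the hidden state h carries the
# "memory" of the previous words.
rng = np.random.default_rng(0)
vocab, hidden = 10, 4                        # toy sizes, illustrative only
W_xh = rng.normal(0, 0.1, (hidden, vocab))   # input-to-hidden weights
W_hh = rng.normal(0, 0.1, (hidden, hidden))  # hidden-to-hidden weights
b_h = np.zeros(hidden)

def step(h, word_id):
    x = np.zeros(vocab)
    x[word_id] = 1.0                         # one-hot encoding of the word
    return np.tanh(W_xh @ x + W_hh @ h + b_h)

h = np.zeros(hidden)
for word_id in [3, 1, 7]:                    # e.g. indices of "I like French"
    h = step(h, word_id)
print(h)  # summary of the sequence, used to predict the next word
```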