Nowadays, instant messaging applications such as WhatsApp and Instagram are becoming the dominant trend in communication. If a chatbot follows the simplicity of an instant messaging application, it is likely to be successful. The hospital reservation system is a text-driven application, so the user can easily interact with the bot. The following are the phases for building the system:
Datasets are generated using the random() function in Python. Various patterns with the same meaning are listed together, e.g., ‘can you make an appointment’, ‘i want an appointment’, ‘i would like to book an appointment’, ‘can i have an appointment’, and one phrase is selected at random with random(). The conversation is grounded on facts such as specialization, doctor name and appointment day, together with a patient profile containing features such as gender and age. A sample knowledge base of the proposed system is shown in Table 3.1. Based on the facts provided by the user, an API call is issued with these specifications and a token number is generated. The user can change the features as desired, and the API calls are updated according to the changed specifications. From this information, one thousand samples are generated as the training dataset, another thousand as the test dataset and a final thousand as the validation dataset; the three datasets are mutually exclusive. While fitting the model, the training dataset may show increasing accuracy on every epoch, which is a sign of overfitting, so the validation dataset is used for regularization by early stopping.
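A minimal sketch of this generation step is shown below. The pattern list, fact values and the make_sample helper are illustrative stand-ins for the actual generator and knowledge base, not the exact script used.

```python
import random

# Illustrative paraphrase patterns for the same intent (booking an appointment).
APPOINTMENT_PATTERNS = [
    "can you make an appointment",
    "i want an appointment",
    "i would like to book an appointment",
    "can i have an appointment",
]

# Illustrative fact values; the real knowledge base is summarised in Table 3.1.
SPECIALIZATIONS = ["cardiology", "dermatology", "pediatrics"]
DOCTORS = ["dr. rao", "dr. menon", "dr. thomas"]
DAYS = ["monday", "tuesday", "wednesday", "thursday", "friday"]
GENDERS = ["male", "female"]

def make_sample():
    """Build one synthetic dialogue grounded on randomly chosen facts."""
    facts = {
        "specialization": random.choice(SPECIALIZATIONS),
        "doctor": random.choice(DOCTORS),
        "day": random.choice(DAYS),
        "gender": random.choice(GENDERS),
        "age": random.randint(1, 90),
    }
    dialogue = [
        (random.choice(APPOINTMENT_PATTERNS), "which specialization do you need?"),
        (facts["specialization"], "which doctor do you prefer?"),
        (facts["doctor"], "which day suits you?"),
        (facts["day"], "issuing api call"),
    ]
    # The API call carries the collected specifications; a token number would be returned.
    api_call = "api_call {specialization} {doctor} {day}".format(**facts)
    return {"facts": facts, "dialogue": dialogue, "api_call": api_call}

# One thousand samples per split (deduplication across splits omitted for brevity).
train = [make_sample() for _ in range(1000)]
test = [make_sample() for _ in range(1000)]
valid = [make_sample() for _ in range(1000)]
```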
Two variants of memory networks are used to build the system: end-to-end memory networks and gated end-to-end memory networks. Both models share the same architecture except for how the controller state is updated between memory hops. The architecture can be divided into four modules: the input module, the question module, the memory module and the answer module.
Each conversation comprises a user utterance and a bot response. An embedding matrix A is used to embed each sentence in a continuous space and obtain its vector representation. At time t, the previous utterances from the user (c_1^u, …, c_(t-1)^u) and responses from the bot (c_1^r, …, c_(t-1)^r) are appended to the memory: m = (AΦ(c_1^u), AΦ(c_1^r), …, AΦ(c_(t-1)^u), AΦ(c_(t-1)^r)), where Φ(·) is a mapping function that maps each utterance to a bag-of-words vector of dimension V (the vocabulary size) and A is the embedding matrix.
The last user utterance c_t^u is also embedded using the same matrix A, giving q = AΦ(c_t^u), which acts as the initial state of the controller.
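The sketch below illustrates how the memory m and the controller state q could be built from Φ and A. The toy vocabulary, the bag-of-words function phi and the random initialization of A are assumptions for illustration only; in the actual model A is learned.

```python
import numpy as np

# Toy vocabulary built from a few utterances; the real one covers the whole training set.
corpus = [
    "can you make an appointment",
    "which specialization do you need",
    "i want an appointment for cardiology",
]
vocab = {w: i for i, w in enumerate(sorted({w for s in corpus for w in s.split()}))}
V = len(vocab)          # vocabulary size
d = 64                  # embedding dimension

rng = np.random.default_rng(0)
A = rng.normal(scale=0.1, size=(d, V))   # embedding matrix A (d x V), learned in practice

def phi(utterance):
    """Phi(.): map an utterance to a bag-of-words vector of dimension V."""
    x = np.zeros(V)
    for w in utterance.lower().split():
        if w in vocab:
            x[vocab[w]] += 1.0
    return x

def embed(utterance):
    """A Phi(c): continuous vector representation of one utterance."""
    return A @ phi(utterance)

# Memory: embedded user utterances and bot responses up to time t-1.
history = ["can you make an appointment", "which specialization do you need"]
m = np.stack([embed(c) for c in history])

# Controller initial state: embedding of the last user utterance c_t^u.
q = embed("i want an appointment for cardiology")
```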
The memory module performs an attention mechanism over the memory to find the salient parts of the previous conversation that are relevant for producing a response. The controller defined in the question module performs the attention process. The match between the controller state q and each memory m_i defined in the input module is computed by taking the inner product followed by a softmax: p_i = Softmax(q^T m_i), where p is the probability vector over memories. The output of the memory module is the sum of the input sentence representations weighted by the matching probabilities: o = R Σ_i p_i m_i, where R is a d×d square matrix. This type of attention is known as soft attention, because it is easy to compute gradients and backpropagate through this function.
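A self-contained numpy sketch of this soft attention step is given below; the randomly initialized m, q and R stand in for the values produced by the input and question modules.

```python
import numpy as np

rng = np.random.default_rng(0)
d, num_memories = 64, 6
m = rng.normal(size=(num_memories, d))   # memory slots from the input module
q = rng.normal(size=(d,))                # controller state from the question module
R = rng.normal(scale=0.1, size=(d, d))   # d x d output matrix

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# p_i = Softmax(q^T m_i): probability over memory slots.
p = softmax(m @ q)

# o = R * sum_i p_i m_i: attention-weighted read from memory.
o = R @ (p @ m)
```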
Finally, the answer module generates the response. The controller state is updated in different ways in end-to-end memory networks and gated end-to-end memory networks. In end-to-end memory networks, the controller state is updated as q_2 = o + q. In gated end-to-end memory networks, it is updated as q_2 = o ⊙ T(q) + q ⊙ (1 − T(q)), where T(q) = σ(W_T q + b_T), W_T and b_T are the hop-specific parameter matrix and bias term, and T is the transform gate for that hop. The transform gate determines how much information is transferred from the input to the next layer. The memory can be iteratively reread through the controller for k hops to look for more relevant information; in our experiments we use 3 hops. The final prediction is â = Softmax(q_(k+1)^T W Φ(y_1), …, q_(k+1)^T W Φ(y_C)), where y contains the C candidate responses and W is of dimension d×V. The entire model is trained with the Adam optimizer, minimizing the standard cross-entropy loss between the predicted value â and the actual value a.
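The sketch below walks through the gated controller update for 3 hops and the final softmax over candidate responses. The shapes, the gate parameters shared across hops (rather than hop-specific, for brevity) and the random placeholders for m, q and the candidate bags-of-words are assumptions made for illustration; training itself is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
d, V, num_memories, C, K = 64, 50, 6, 8, 3   # embedding dim, vocab size, memory slots, candidates, hops

R = rng.normal(scale=0.1, size=(d, d))
W = rng.normal(scale=0.1, size=(d, V))       # scores candidates against the final controller state
W_T = rng.normal(scale=0.1, size=(d, d))     # transform-gate weights (shared across hops here for brevity)
b_T = np.zeros(d)

m = rng.normal(size=(num_memories, d))       # memory slots
q = rng.normal(size=(d,))                    # initial controller state
cand_bows = rng.integers(0, 2, size=(C, V)).astype(float)  # Phi(y_1), ..., Phi(y_C)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(K):
    p = softmax(m @ q)                       # attention over memory
    o = R @ (p @ m)                          # memory output for this hop
    T = sigmoid(W_T @ q + b_T)               # T(q): transform gate
    q = o * T + q * (1.0 - T)                # gated update; plain MemN2N would use q = o + q

# a_hat = Softmax(q_{K+1}^T W Phi(y_j)) over the C candidate responses.
scores = cand_bows @ (q @ W)
a_hat = softmax(scores)
# Training would minimize the cross-entropy between a_hat and the true response index, using Adam.
```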