IMRSV at the Montreal AI Symposium
We recently attended the Montreal AI Symposium as both spectators and participants. We wanted to give a quick rundown of our experience as two-year veterans and talk a bit about some of the highlights we saw!
This was our second year participating in the symposium, and our experience was even better than the first year (which was really great!). The collection of people, companies and researchers was exceptional, and the quality of the crowd, presenters and participants made for a memorable event.
One of the first presentations came from a research group at MILA working on Duckietown. They train a DNN within a simulation, varying lighting conditions, obstacles, etc. during training. The task is then to drive a REAL small toy car around a small track: the trained model is loaded into a physical car, and the environment is physically built as well. What was impressive about Duckietown is their focus on taking models trained on simulated data and environments and deploying them in a real-life scenario. This is a fascinating area of research that we plan on watching closely.
Two of the more popular trends, which we expect to be reflected in this year's publications, were reinforcement learning and Generative Adversarial Networks (GANs). The popularity of these approaches for solving a variety of problems in machine learning seems to have really accelerated in the last year.
One element we noticed missing from this year's symposium was any work in speech-to-text. It is an area of interest for us and one in which we continue to pursue research. Hopefully the gap is filled at next year's symposium!
We spoke with a company called Imagia, who are working in the medical imaging space. Interestingly, they also talked about leveraging Natural Language Processing (NLP) and speech processing.
We also met with IVADO, who are bringing together industry and academic research. We had an excellent discussion about some of our challenges and how we could improve collaboration between the two sectors.
Finally, during our presentation we highlighted two of our works: Semi-Supervised Extractive Summarization of Meeting Transcripts, and Detection of Abusive Online Behaviour Using Multi-Label Classification.
Presenting Semi-Supervised Extractive Summarization of Meeting Transcripts on behalf of IMRSV Data Labs were Hichem Mezaoui and Qianhui Wan. Qianhui presented her approach to transcription, which uses multi-channel beamforming for speech enhancement as a preprocessing step. This is followed by a DNN/HMM hybrid model for speech-to-text, together with an integrated trigram language model for grammar correction.
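To give a feel for what a trigram language model contributes, here is a minimal sketch in Python. It is not the model that was presented: the toy corpus, the add-one smoothing, and the idea of using the score to choose between two candidate transcriptions are all assumptions made for illustration.

```python
from collections import Counter
import math


def train_trigram(sentences):
    """Count trigrams and their two-word contexts over tokenized sentences."""
    tri, bi, vocab = Counter(), Counter(), set()
    for tokens in sentences:
        padded = ["<s>", "<s>"] + tokens + ["</s>"]
        vocab.update(padded)
        for i in range(2, len(padded)):
            tri[tuple(padded[i - 2:i + 1])] += 1
            bi[tuple(padded[i - 2:i])] += 1
    return tri, bi, vocab


def log_prob(tokens, tri, bi, vocab):
    """Add-one smoothed log-probability of a token sequence."""
    padded = ["<s>", "<s>"] + tokens + ["</s>"]
    v = len(vocab)
    total = 0.0
    for i in range(2, len(padded)):
        context = tuple(padded[i - 2:i])
        trigram = tuple(padded[i - 2:i + 1])
        total += math.log((tri[trigram] + 1) / (bi[context] + v))
    return total


# Toy training corpus (made up for this example).
corpus = [
    "we will meet on monday".split(),
    "we will review the budget".split(),
    "the team will meet on friday".split(),
]
tri, bi, vocab = train_trigram(corpus)

# Two hypothetical recognizer outputs that sound alike.
candidates = [
    "we will meet on friday".split(),
    "we will meat on friday".split(),
]
best = max(candidates, key=lambda c: log_prob(c, tri, bi, vocab))
print(" ".join(best))
```

The candidate whose word triples were seen in training gets the higher score, which is the basic mechanism by which a language model can favour well-formed output.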
Hichem presented his work on summarization of the transcripts. He used a bidirectional LSTM model with attention to punctuate the raw speech-to-text output, which otherwise lacks any punctuation. He then used a random walk model for sentence embeddings, which he found outperformed more sophisticated RNNs and LSTMs on tasks such as textual similarity, extractive summarization and entailment. Finally, he quantified the importance of sentences using an approach that ignores syntax information in order to score the semantic importance of the text relative to the key topics.
In the future, the embeddings obtained from this approach could be used as features in downstream supervised tasks, since it has been shown that, even though the approach does not take word order into account, it exploits the semantics better than RNNs or LSTMs.
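As an illustration of an order-free sentence embedding, the sketch below uses smooth inverse frequency (SIF) weighting, a technique derived from a random walk model of text generation. Whether this matches the exact model presented is an assumption; the word vectors, word probabilities and the parameter `a` are made-up toy values, and the common-component removal step that usually follows is omitted for brevity.

```python
import math

# Toy 3-dimensional word vectors and unigram probabilities (hypothetical values).
WORD_VECTORS = {
    "the": [0.1, 0.1, 0.1],
    "budget": [0.9, 0.1, 0.0],
    "costs": [0.8, 0.2, 0.1],
    "rose": [0.5, 0.0, 0.4],
    "meeting": [0.0, 0.9, 0.2],
    "schedule": [0.1, 0.8, 0.3],
    "changed": [0.2, 0.3, 0.6],
}
WORD_PROB = {
    "the": 0.5, "budget": 0.05, "costs": 0.05, "rose": 0.05,
    "meeting": 0.05, "schedule": 0.05, "changed": 0.05,
}


def sif_embedding(tokens, a=1e-3):
    """Average word vectors with weight a / (a + p(w)); word order is ignored."""
    dim = len(next(iter(WORD_VECTORS.values())))
    total = [0.0] * dim
    known = [t for t in tokens if t in WORD_VECTORS]
    for t in known:
        weight = a / (a + WORD_PROB[t])  # frequent words get small weights
        for i, x in enumerate(WORD_VECTORS[t]):
            total[i] += weight * x
    n = max(len(known), 1)
    return [x / n for x in total]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


s1 = sif_embedding("the budget rose".split())
s2 = sif_embedding("the costs rose".split())
s3 = sif_embedding("the meeting schedule changed".split())
print(cosine(s1, s2), cosine(s1, s3))
```

Sentences about similar topics end up close together even though no sequence model is involved, and the resulting vectors could be fed as features to any downstream classifier.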
The latent semantic analysis proposed for the extractive summarization could be generalized to n-th order tensors, using higher-order generalizations of the SVD such as Tucker or PARAFAC decompositions, for multi-document summarization, content-based detection of information, and behaviour analysis.
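For readers unfamiliar with latent semantic analysis in the matrix (single-document) case, here is a minimal sketch. It is a simplification and not the presented system: it keeps only the leading topic, found by power iteration on the sentence-by-sentence Gram matrix, and scores each sentence by its weight on that topic. The sentences are made up, and a real pipeline would use a library SVD and keep several topics.

```python
import math
from collections import Counter


def term_sentence_matrix(sentences):
    """Rows are terms, columns are sentences, entries are raw counts."""
    vocab = sorted({w for s in sentences for w in s})
    counts = [Counter(s) for s in sentences]
    return [[c[w] for c in counts] for w in vocab]


def leading_right_singular_vector(A, iterations=100):
    """Power iteration on G = A^T A; its top eigenvector is the top right singular vector of A."""
    n = len(A[0])
    G = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Toy tokenized sentences with stop words already removed (made up).
sentences = [
    ["budget", "review", "marketing", "budget"],
    ["marketing", "costs", "rose"],
    ["lunch", "served", "noon"],
]
A = term_sentence_matrix(sentences)
scores = [abs(x) for x in leading_right_singular_vector(A)]

# Rank sentences by how strongly they load on the dominant topic.
ranking = sorted(range(len(sentences)), key=lambda j: -scores[j])
print(ranking)
```

The off-topic sentence receives a score close to zero, so an extractive summary built from the top-ranked sentences would leave it out.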
Presenting Detection of Abusive Online Behaviour Using Multi-Label Classification on behalf of IMRSV Data Labs was Isuru Gunasekara. Isuru’s objective was to classify sentences in a multi-label environment (a single sentence could attract multiple labels). A demo of the classifier is available at https://imrsv.ai/toxic-language-identifier
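To make the multi-label setting concrete, here is a small sketch in which each label gets its own independent yes/no decision, so one sentence can receive zero, one or several labels. The label names, keyword weights, bias and threshold are hypothetical and hand-set for illustration; they are not taken from the classifier in the demo.

```python
import math

LABELS = ["toxic", "insult", "threat"]

# Hypothetical hand-set keyword weights per label; a real model would learn these.
WEIGHTS = {
    "toxic": {"stupid": 2.5, "hate": 2.5, "hurt": 2.5},
    "insult": {"stupid": 3.0, "idiot": 3.0},
    "threat": {"hurt": 3.0, "find": 1.0},
}
BIAS = -2.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(sentence, threshold=0.5):
    """Return every label whose independent probability reaches the threshold."""
    tokens = sentence.lower().split()
    probs = {}
    for label in LABELS:
        z = BIAS + sum(WEIGHTS[label].get(t, 0.0) for t in tokens)
        probs[label] = sigmoid(z)  # one sigmoid per label, not a softmax over labels
    return sorted(label for label, p in probs.items() if p >= threshold)


print(predict("you are stupid"))
print(predict("i will find you and hurt you"))
print(predict("see you at the meeting"))
```

The key design point is the per-label sigmoid: unlike a softmax, the probabilities are not forced to compete, which is what allows multiple labels on the same sentence.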
Isuru compared how different representations of the text affected the accuracy of the classifications, and identified advantages and drawbacks of different classification methods. He also investigated the effectiveness of different neural network structures for language classification and how different models can be combined to achieve better results. Finally, he worked on developing effective methods for cleaning language used online to catch things such as abbreviations, acronyms, and misspellings.
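As a rough idea of what cleaning online language can involve, the sketch below lowercases text, collapses stretched letters, expands a few abbreviations and snaps near-misses to a known vocabulary. The abbreviation table, vocabulary and similarity cutoff are hypothetical choices for this example and do not describe the methods that were presented.

```python
import difflib
import re

# Hypothetical lookup tables; a real system would use much larger resources.
ABBREVIATIONS = {"u": "you", "r": "are", "pls": "please", "idk": "i do not know"}
VOCABULARY = ["you", "are", "ridiculous", "please", "stop", "really", "annoying"]


def clean(text):
    """Normalize informal text into plain lowercase words."""
    text = text.lower()
    # Collapse runs of three or more identical characters: "stooop" -> "stop".
    text = re.sub(r"(.)\1{2,}", r"\1", text)
    tokens = re.findall(r"[a-z']+", text)
    cleaned = []
    for tok in tokens:
        if tok in ABBREVIATIONS:
            cleaned.extend(ABBREVIATIONS[tok].split())
        elif tok in VOCABULARY:
            cleaned.append(tok)
        else:
            # Replace a likely misspelling with its closest vocabulary word, if any.
            match = difflib.get_close_matches(tok, VOCABULARY, n=1, cutoff=0.8)
            cleaned.append(match[0] if match else tok)
    return " ".join(cleaned)


print(clean("U r ridiculus, pls STOOOP"))
```

Normalizing text this way before classification means the model sees one spelling of a word instead of many, at the cost of occasionally over-correcting unusual but legitimate words.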
Overall, we had an amazing time sharing our work with the community in Montreal and participating in the AI Symposium. We hope there are many more years to come!