Thesis defense of Ting ZHANG
26 October 2017, 14:00 - 17:00
Ting Zhang (IPI team) will defend the thesis entitled "New Architectures for Handwritten Mathematical Expressions Recognition"
on Thursday 26 October 2017 at 14:00, in Amphitheater 1 of the ISITEM building at Polytech.
The presentation will be given in English.
Jury: Christian Viard-Gaudin (Professor, Université de Nantes, supervisor), Harold Mouchère (Associate Professor, HDR, Université de Nantes, co-supervisor), Laurence Likforman-Sulem (Associate Professor, HDR, Telecom ParisTech, reviewer), Thierry Paquet (Professor, Université de Rouen, reviewer), Christophe Garcia (Professor, INSA de Lyon, examiner).
Abstract:
Handwritten mathematical expression recognition is an appealing topic in pattern recognition: it poses a significant research challenge and underpins many practical applications. Both the large symbol set (more than 100 symbols) and the 2-D structure of expressions make the problem harder. In this thesis, we focus on online handwritten mathematical expression recognition using BLSTM networks with a CTC topology, and ultimately build a graph-driven recognition system that avoids the high time complexity and manual work of classical grammar-driven systems. To allow a sequence classifier to handle this 2-D structured language, we extend the chain-structured BLSTM to an original Tree-based BLSTM, which can label tree-structured data. The CTC layer is adapted with local constraints so that it aligns the outputs while still benefiting from the additional 'blank' class. The proposed system treats recognition as a graph-building problem. The input expression is a sequence of strokes, from which an intermediate graph is derived by considering the temporal and spatial relations among strokes. Next, several trees are derived from the graph and labeled with the Tree-based BLSTM. The last step merges these labeled trees to build an admissible stroke label graph (SLG), which models a 2-D formula uniquely. One major difference from traditional approaches is that there are no explicit segmentation, recognition, and layout extraction steps; instead, a single trainable system directly produces an SLG describing a mathematical expression. The proposed system, without any grammar, achieves competitive results in online mathematical expression recognition.
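To make the graph-building idea in the abstract more concrete, here is a minimal sketch of two of its steps: deriving an intermediate graph from temporal and spatial relations among strokes, and merging several labeled trees into one stroke label graph. Everything in it is an assumption made for illustration and is not taken from the thesis: the function names, the representation of a stroke by a single center point, the distance threshold `max_dist`, the label strings, and the simple majority vote used for merging. The Tree-based BLSTM labeling step is not shown; the labeled trees are given by hand.

```python
from collections import Counter, defaultdict


def build_stroke_graph(centers, max_dist):
    """Return edges (i, j) with i < j between strokes that are
    consecutive in time or close in space (hypothetical criteria)."""
    edges = set()
    n = len(centers)
    # Temporal relation: strokes written one after the other.
    for i in range(n - 1):
        edges.add((i, i + 1))
    # Spatial relation: strokes whose centers are within max_dist.
    for i in range(n):
        for j in range(i + 1, n):
            dx = centers[i][0] - centers[j][0]
            dy = centers[i][1] - centers[j][1]
            if (dx * dx + dy * dy) ** 0.5 <= max_dist:
                edges.add((i, j))
    return edges


def merge_labeled_trees(labeled_trees):
    """Merge (node_labels, edge_labels) pairs by majority vote.
    This voting rule is an illustrative stand-in, not the thesis method."""
    node_votes = defaultdict(Counter)
    edge_votes = defaultdict(Counter)
    for node_labels, edge_labels in labeled_trees:
        for stroke, label in node_labels.items():
            node_votes[stroke][label] += 1
        for edge, label in edge_labels.items():
            edge_votes[edge][label] += 1
    slg_nodes = {s: c.most_common(1)[0][0] for s, c in node_votes.items()}
    slg_edges = {e: c.most_common(1)[0][0] for e, c in edge_votes.items()}
    return slg_nodes, slg_edges


# Three strokes: stroke 2 is written last but lies next to stroke 0.
centers = [(0.0, 0.0), (5.0, 0.0), (0.3, 0.1)]
graph_edges = build_stroke_graph(centers, max_dist=0.5)

# Hand-written stand-ins for trees labeled by a classifier.
trees = [
    ({0: "x", 1: "x", 2: "2"}, {(0, 1): "same_symbol", (1, 2): "Superscript"}),
    ({0: "x", 1: "x", 2: "z"}, {(0, 1): "same_symbol"}),
    ({0: "x", 2: "2"}, {(0, 2): "Superscript"}),
]
slg_nodes, slg_edges = merge_labeled_trees(trees)
```

In this toy input, the spatial rule adds an edge between strokes 0 and 2 that the temporal order alone would miss, and the vote over three trees overrides the single tree that labeled stroke 2 as "z".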