Academy & Industry Research Collaboration Center (AIRCC)

Volume 13, Number 03, February 2023

Molecular Design Method based on New Molecular Representation and Variational Auto-encoder

  Authors

Li Kai, Li Ning, Zhang Wei, Gao Ming, Beijing Information Science and Technology University, China

  Abstract

This paper presents a novel neural network model that builds on the traditional variational auto-encoder (VAE) and uses SELFIES, a recent molecular representation, to improve the generation of new molecules. In the encoding layer, a multi-layer convolutional network and Fisher information are added to learn the characteristics of the data and guide the encoding process, which makes the features in the hidden layer more tightly aggregated. In the decoding layer, a Long Short-Term Memory (LSTM) network is integrated for better data generation. Together, these changes address the degradation that arises in the encoding and decoding layers of the original VAE model. Experiments on the ZINC molecular dataset show that the similarity achieved by the new VAE is 8.47% higher than that of the original VAE. SELFIES is also better at generating a variety of molecules than the traditional molecular representation, SMILES. The experiments show that combining SELFIES with the new VAE model presented in this paper improves the effectiveness of generating new molecules.
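
As a rough illustration of the pipeline the abstract describes, the sketch below shows how a SELFIES string might be split into its bracketed symbols and one-hot encoded before it reaches an encoder, together with the reparameterization step and KL term of a standard VAE. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the use of `[nop]` as a padding symbol, and the toy inputs are all hypothetical, and the convolutional encoder, Fisher information term, and LSTM decoder from the paper are not reproduced here.

```python
import math
import random
import re

# All names below are hypothetical helpers for illustration only.
SELFIES_TOKEN = re.compile(r"\[[^\[\]]+\]")
PAD = "[nop]"  # assumed padding symbol


def tokenize(selfies_string):
    """Split a SELFIES string into its bracketed symbols."""
    tokens = SELFIES_TOKEN.findall(selfies_string)
    if "".join(tokens) != selfies_string:
        raise ValueError("input is not a sequence of bracketed symbols")
    return tokens


def build_vocab(strings):
    """Map each symbol to an index; index 0 is reserved for padding."""
    symbols = {tok for s in strings for tok in tokenize(s)}
    symbols.discard(PAD)
    return {sym: i for i, sym in enumerate([PAD] + sorted(symbols))}


def one_hot(selfies_string, vocab, max_len):
    """Encode a SELFIES string as a max_len x len(vocab) 0/1 matrix."""
    tokens = tokenize(selfies_string)
    if len(tokens) > max_len:
        raise ValueError("string is longer than max_len")
    tokens = tokens + [PAD] * (max_len - len(tokens))
    matrix = []
    for tok in tokens:
        row = [0] * len(vocab)
        row[vocab[tok]] = 1
        matrix.append(row)
    return matrix


def reparameterize(mu, logvar, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_divergence(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, 1)) for a diagonal Gaussian."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))


if __name__ == "__main__":
    data = ["[C][C][O]", "[C][=C]"]  # toy SELFIES strings
    vocab = build_vocab(data)
    x = one_hot(data[0], vocab, max_len=4)
    print(len(x), "positions,", len(vocab), "symbols")
    print("latent sample:", reparameterize([0.0, 0.0], [0.0, 0.0]))
    print("KL term:", kl_divergence([0.0, 0.0], [0.0, 0.0]))
```

In a full model, the one-hot matrix would be the input to the encoder, and the KL term would be combined with a reconstruction loss over the decoder's predicted symbols; how the paper weights these terms and incorporates Fisher information is not stated in the abstract.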

  Keywords

VAE, Molecular notation, Multilayer convolutional network, Fisher information, LSTM.