Big industrial data mining through explainable automated machine learning
Machine learning (ML) has penetrated all aspects of modern life, bringing greater convenience and better predictions of variables of interest. This has engaged many more people, not necessarily experts, in analytics tasks. However, building ML solutions is a time-consuming and challenging process that requires considerable technical expertise, and selecting and parametrizing ML models involves tedious episodes of trial and error. Domain experts, moreover, often lack the expertise to apply advanced analytics. They therefore need frequent consultations with data scientists, and such collaborations tend to cause delays and carry risks such as human-resource bottlenecks. As the complexity of these tasks increases, so does the demand for support solutions. In response, the field of automated ML (AutoML) offers a data mining-based formalism that aims to reduce human effort and speed up the development cycle through automation. It can be applied to create pipelines of traditional ML models and ensembles, or to search for neural network architectures (neural architecture search, NAS). Existing approaches include Bayesian optimization, evolutionary algorithms, and reinforcement learning. These approaches have focused on assisting users by automating part or all of the data analysis process, but without concern for the impact of that automation on the analysis. The goal has generally been performance, leaving aside other important and even crucial aspects such as computational complexity, confidence, and transparency. In this talk I will present an overview of the main components of AutoML, the search spaces representing ML models, the search strategies, and the explainability of automated AI.