Application of the FunSearch Algorithm to Feature Selection for Training Prediction Models

This work builds on FunSearch [1], an iterative genetic algorithm driven by a Large Language Model, and applies it to preprocessing and selecting important features in different types of datasets. One experiment was run per dataset. Each experiment consists of writing an efficient evaluator function, running the algorithm for a specified number of iterations, and comparing the outcome with Kaggle users' results on the same dataset. For simpler, smaller datasets, or those with consistent columns, the algorithm performs slightly worse than the Kaggle data scientists, though its results improve consistently over the iterations. For more complex and poorly cleaned datasets, the number of features and the inconsistency of the information make it challenging to build efficient evaluator functions.
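The evaluator-driven loop described above can be illustrated with a minimal sketch. It is not the implementation used in this work: FunSearch asks a Large Language Model to propose new candidate programs, whereas here a random single-feature flip stands in for that proposal step, and the candidate is reduced to a binary feature mask. The toy dataset, the correlation-based score, the per-feature penalty, and all function names (`make_dataset`, `evaluate`, `propose`, `search`) are assumptions made for illustration only.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def make_dataset(n_rows=1000, n_features=6, seed=0):
    """Toy data (assumption): only features 0 and 2 determine the target."""
    rng = random.Random(seed)
    rows = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_rows)]
    target = [3.0 * r[0] - 2.0 * r[2] for r in rows]
    return rows, target


def evaluate(mask, rows, target, penalty=0.2):
    """Hypothetical evaluator: reward features correlated with the target,
    and charge a fixed penalty for every feature that is kept."""
    score = 0.0
    for j, keep in enumerate(mask):
        if keep:
            column = [r[j] for r in rows]
            score += abs(pearson(column, target)) - penalty
    return score


def propose(parent, rng):
    """Stand-in for the LLM proposal step: flip one randomly chosen feature."""
    j = rng.randrange(len(parent))
    return tuple(b ^ 1 if i == j else b for i, b in enumerate(parent))


def search(rows, target, iterations=300, top_k=3, seed=1):
    """Keep every scored candidate and derive new ones from the current top_k."""
    rng = random.Random(seed)
    start = tuple(1 for _ in rows[0])  # begin with all features selected
    scored = {start: evaluate(start, rows, target)}
    for _ in range(iterations):
        elite = sorted(scored, key=scored.get, reverse=True)[:top_k]
        child = propose(rng.choice(elite), rng)
        if child not in scored:
            scored[child] = evaluate(child, rows, target)
    best = max(scored, key=scored.get)
    return best, scored[best]


rows, target = make_dataset()
best_mask, best_score = search(rows, target)
print(best_mask, round(best_score, 3))
```

The evaluator is the only component that encodes what a good feature set means, which is why its quality dominates the outcome: on a noisy or inconsistent dataset, a score like the one assumed here would be much harder to define reliably.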
