Alejandro Beltran

Fiscal Data from Audit Reports with NLP and ML

Updated: Oct 26, 2020

This project uses a collection of audit reports scraped from the website of the Auditoria Superior del Estado de Sinaloa (the State Audit Office). The reports detail the spending of each municipality from 2008 through 2016. The objective is to identify the municipalities in which the auditor reported major observations and the monetary amount of those discrepancies. The project is broken into the following sections:

1. Preprocessing documents
2. Implementing an LDA topic modeller
3. Implementing a text classification model
4. Extracting information using NER
5. Results
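To give a rough sense of the final extraction step, here is a minimal sketch of pulling monetary amounts out of an observation sentence. It is not the project's code: the project uses NER, while this sketch substitutes a plain regular expression for illustration. The pattern `AMOUNT_RE`, the helper `extract_amounts`, and the sample sentence are all hypothetical and assume amounts are written with a `$` sign, commas as thousands separators, and a period as the decimal mark.

```python
import re
from decimal import Decimal

# Hypothetical pattern (an assumption, not taken from the project):
# a "$" sign, optional space, digits with optional thousands commas,
# and an optional decimal part.
AMOUNT_RE = re.compile(r"\$\s?(\d[\d,]*(?:\.\d+)?)")


def extract_amounts(text):
    """Return every peso amount found in `text` as a Decimal."""
    return [Decimal(raw.replace(",", "")) for raw in AMOUNT_RE.findall(text)]


# Invented sample sentence in the style of an audit observation.
sample = (
    "Se observó un importe de $1,234,567.89 sin documentación "
    "comprobatoria y $500.00 pendientes de aclarar."
)

amounts = extract_amounts(sample)
total = sum(amounts, Decimal("0"))
print(amounts, total)
```

Using `Decimal` rather than `float` keeps the cent values exact when amounts are summed across reports. A regular expression like this one will miss amounts written out in words or without a currency sign, which is one reason a trained NER model may be preferable on real audit text.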

Please see the code and presentation at this interactive Google Colab notebook.


