Co-authored with Yasodara Córdova. Also available in Spanish.

This blog was originally posted at Governance for Development, World Bank Group.


Every month in Brazil, the government team in charge of processing reimbursement expenses incurred by congresspeople receives more than 20,000 claims. This is a manually intensive process that is prone to error and susceptible to corruption. Under Brazilian law, this information is available to the public, making it possible to check the accuracy of this data with further scrutiny. But it’s hard to sift through so many transactions. Fortunately, Rosie, a robot built to analyze the expenses of the country’s congress members, is helping out.

Rosie was born from Operação Serenata de Amor, a flagship project we helped create with other civic hackers. We suspected that data provided by members of Congress, especially regarding work-related reimbursements, might not always be accurate. There were clear, straightforward reimbursement regulations, but we wondered how easily individuals could maneuver around them.

Furthermore, we believed that transparency portals and the public data weren’t realizing their full potential for accountability. Citizens struggled to understand public sector jargon and make sense of the extensive volume of data. We thought data science could help make better sense of the open data provided by the Brazilian government.

Using agile methods, specifically Domain Driven Design, a flexible and adaptive process framework for solving complex problems, our group started studying the regulations, and converting them into software code. We did this by reverse-engineering the legal documents–understanding the reimbursement rules and brainstorming ways to circumvent them. Next, we thought about the traces this circumvention would leave in the databases and developed a way to identify these traces using the existing data. The public expenses database included the images of the receipts used to claim reimbursements and we could see evidence of expenses, such as alcohol, which weren’t allowed to be paid with public money. We named our creation, Rosie.

Rosie isn’t working alone. Beyond translating the law into computer code, the group also created new interfaces to help citizens check up on Rosie’s suspicions. The same information that was spread in different places in official government websites was put together in a more intuitive, indexed and machine-readable platform. This platform is called Jarbas – its name was inspired by the AI system that controls Tony Stark’s mansion in Iron Man, J.A.R.V.I.S. (which has origins in the human “Jarbas”) – and it is a website and API (application programming interface) that helps citizens more easily navigate and browse data from different sources. Together, Rosie and Jarbas helps citizens use and interpret the data to decide whether there was a misuse of public funds. So far, Rosie has tweeted 967 times. She is particularly good at detecting overpriced meals. According to an open research, made by the group, since her introduction, members of Congress have reduced spending on meals by about ten percent.

The results are promising – not just from the perspective of social accountability, fighting misuse of public funds and state-capture, issues that have long eroded the healthy development of countries. Results are also promising in terms of citizen engagement and social accountability. This project covered a small amount of spending, relative to overall public spending in Brazil, but it shows the potential for artificial intelligence (AI) techniques to offer new tools for detection and targeting scarce resources. AI can also be a tool for narrowing the scope of the general governmental open data to very specific citizen-focused contexts, making data more actionable, reachable by the very own individuals. This is more easily said than done, but it is essential to understand the ingredients that made this initiative so fruitful.

Read part two to learn about the key ingredients that made this initiative so successful.