Hacking Democracy Conference
Our Learnings from the Projects of the Ill-Gotten Party Assets Settlement Committee

Group picture with the speakers

On December 9, 2022, FNF Global Innovation Hub held the “Hacking Democracy Conference: Using Data to Innovate Advocacy Approaches”. By introducing data science through training and by sharing how Taiwan applies data science to facilitate transitional justice, we helped advocates better understand how to systematically collect and process data and transform it into useful information.

Ms. Anna Marti, Head of FNF Global Innovation Hub, opened the conference by explaining its purpose: to explore new ways of promoting democracy. That is why FNF Global Innovation Hub wants to showcase how data science can serve as a tool for advocates of democracy and human rights to discover innovative approaches. She also noted that when it comes to data, we often think of a series of numbers in Excel, but data is more than that. After hearing the four speakers describe their processes and stories of handling data, FNF Global Innovation Hub finds that data plays an important role in promoting democracy and human rights.

First, Mr. Tsong-Shyan Lin, Commissioner of the Ill-Gotten Party Assets Settlement Committee (CIPAS), shared with us why CIPAS wanted to initiate the Text Analysis System and Digital Storytelling project. He mentioned that by investigating history, they hoped to correct wrongdoings in the past caused by the authoritarian rule of the state, and to urge the political parties that obtained property by taking advantage of authoritarian rule to return these illegally obtained assets to the people. In the process of restoring the truth, they’ve made good use of modern technology to make their investigations easier and more thorough, and to further help them share the results with the general public.

In dealing with transitional justice, one must study a large number of historical materials, and since assets are involved, the CIPAS project team includes not only experts in history and law but also specialists in land economics and finance. Without data science, however, it would be difficult for these experts to integrate their work. They therefore sought help from data scientists so that they could see the full picture of history more clearly and sort out the context of the ill-gotten party assets. To present the results, CIPAS also worked with data scientists to set up a “Historical Stories” page on its official website, compiling some of the more representative or interesting stories, with scanned historical documents attached, for the public to read. In addition, CIPAS organizes seminars and field trips, taking people to visit historical sites of party assets and injustice.

Ms. Helene Chien, Data Journalist and Editor of Digital Storytelling, went on to share how she used data visualization and infographic tools to improve the reading experience. She stated that people can use existing open-source tools available online to create digital storytelling reports or surveys, even if they don’t know how to write a program. Drawing on her experience of making digital reports that help readers understand the work of CIPAS and the historical materials of transitional justice, she shared her five key steps for handling issues or news:

  1. Focus on certain aspects: Find the entry point of the issue to help the public understand the core focus.
  2. Build and present the structure of information: Create a table of contents for long articles. The table of contents presents a clear structure of information and thus lets readers choose what to read first and what to focus on.
  3. Visualize data or present data with interactive charts: Turn statistics and information into charts in order to clearly present results and insights.
  4. Summarize and interpret information: Extract important information from all the materials you have and rewrite legal, political or accounting terminology in plain language.
  5. Enhance the visibility of your report by processing report materials: Make full use of audio-visual and graphic materials to introduce cases and stories, thereby increasing the exposure of existing materials.

The third speaker, Mr. Chun-Yin Lee from the Institute of Sociology at Academia Sinica, gave us a detailed introduction to the search system that he and the Data for Social Good team developed for CIPAS. Mr. Lee explained why “word segmentation” is necessary: unlike English, Chinese words are not separated by spaces in writing, so when dealing with Chinese historical materials, the text must first be analyzed by a word segmentation system to identify meaningful words and phrases. To enhance the accuracy of the word segmentation system, the team also developed a dictionary of proper nouns relating to ill-gotten party assets, focusing on the entity identification of proper nouns. In particular, when different words refer to the same person, thing or object, how can the system identify and categorize them? For example, the dictionary helps the system learn that Chiang Chung-Cheng, Chiang Kai-Shek, and Generalissimo Chiang all refer to the same person. With the dictionary, the system can also identify whether a word refers to a person, a location, a date, an object, and so on, thereby increasing the accuracy of the word segmentation tool.
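To make the idea of dictionary-assisted segmentation concrete, here is a minimal sketch in Python. It is not the system Mr. Lee's team built; it assumes a simple "forward maximum matching" strategy, and the `LEXICON` entries, the `Entry` type, and the `segment` function are all hypothetical names invented for illustration. It shows how a proper-noun dictionary can both split unspaced Chinese text and map different names onto one canonical entity with a type.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    canonical: str  # one shared name for all aliases of the same entity
    kind: str       # entity type, e.g. PERSON, ORG, OBJECT


# Hypothetical proper-noun dictionary (illustrative entries only).
LEXICON = {
    "蔣中正": Entry("Chiang Kai-Shek", "PERSON"),    # Chiang Chung-Cheng
    "蔣介石": Entry("Chiang Kai-Shek", "PERSON"),    # Chiang Kai-Shek
    "蔣委員長": Entry("Chiang Kai-Shek", "PERSON"),  # Generalissimo Chiang
    "土地": Entry("land", "OBJECT"),
}


def segment(text: str, lexicon=LEXICON) -> list[tuple[str, Optional[Entry]]]:
    """Forward maximum matching: at each position, take the longest
    dictionary word; fall back to a single character if none matches."""
    max_len = max(len(word) for word in lexicon)
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in lexicon or size == 1:
                tokens.append((piece, lexicon.get(piece)))
                i += size
                break
    return tokens


if __name__ == "__main__":
    for word, entry in segment("蔣委員長接收土地"):
        print(word, entry)
```

In this sketch, all three names of the same person resolve to the same `canonical` value, which is the behavior the dictionary is described as providing. A production segmenter would also handle words that are not in the dictionary instead of falling back to single characters.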

Aside from the abovementioned tools, they’ve also set up a search and recommendation system. The search system works much like Google, except that it is combined with the dictionary of proper nouns relating to ill-gotten party assets. When a user searches for key words, the system filters for articles that contain them. The recommendation system breaks down the key words in an article, presents them in a table, and uses a word matrix to list the number of times each word is mentioned in the articles, thereby identifying the topic of an article and helping researchers quickly find and read related historical materials.
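A rough sketch of how keyword filtering and a word matrix might fit together is shown below. The talk did not describe how the real system scores related articles, so the cosine similarity used here is an assumption, and the sample documents and the function names `term_matrix`, `search`, and `recommend` are hypothetical. The documents are assumed to be already segmented into words.

```python
import math
from collections import Counter

# Hypothetical, already-segmented documents (illustrative only).
DOCS = {
    "doc_a": ["party", "assets", "land", "transfer", "land"],
    "doc_b": ["land", "transfer", "committee"],
    "doc_c": ["radio", "broadcast", "license"],
}


def term_matrix(docs):
    """Word matrix: for each document, how many times each word appears."""
    return {doc_id: Counter(tokens) for doc_id, tokens in docs.items()}


def search(matrix, keyword):
    """Keyword search: keep only the documents that contain the keyword."""
    return sorted(d for d, counts in matrix.items() if counts[keyword] > 0)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def recommend(matrix, doc_id, top_n=1):
    """Recommend the documents whose word counts are most similar."""
    scores = [(cosine(matrix[doc_id], counts), other)
              for other, counts in matrix.items() if other != doc_id]
    scores.sort(reverse=True)
    return [other for score, other in scores[:top_n] if score > 0]


if __name__ == "__main__":
    matrix = term_matrix(DOCS)
    print(search(matrix, "land"))
    print(recommend(matrix, "doc_a"))
```

Raw counts are the simplest choice and match the description of listing how many times a word is mentioned; a real system might weight words differently so that very common words do not dominate.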

Finally, Data Scientist Mr. Yen-Ting Su showed us some achievements of the project, including the entity identification of the word segmentation system and a social network analysis. The latter presents data in interactive charts, using nodes and the links between them to show how important a word is and how many times it is mentioned in historical documents. Researchers can use the social network analysis together with the recommendation system to find more related literature.
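The data behind such a network chart can be sketched in a few lines of Python. This is an illustrative assumption rather than the project's actual method: nodes are weighted by how often an entity is mentioned, and a link between two entities is weighted by the number of documents in which both appear. The sample documents and the `build_network` function are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical lists of entities found in each document (illustrative only).
DOCS = [
    ["Chiang Kai-Shek", "Central Party Headquarters", "Chiang Kai-Shek"],
    ["Central Party Headquarters", "land"],
    ["Chiang Kai-Shek", "land"],
]


def build_network(docs):
    """Return (mentions, links).

    mentions: node weights, i.e. total mentions of each entity.
    links: edge weights, i.e. number of documents where a pair co-occurs.
    """
    mentions = Counter()
    links = Counter()
    for entities in docs:
        mentions.update(entities)
        # Sort so each pair is stored under one consistent key.
        for pair in combinations(sorted(set(entities)), 2):
            links[pair] += 1
    return mentions, links


if __name__ == "__main__":
    mentions, links = build_network(DOCS)
    for node, count in mentions.most_common():
        print(node, count)
    for (a, b), weight in links.items():
        print(a, "<->", b, weight)
```

The two counters can then be handed to any charting or graph library to draw node sizes and link thicknesses.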

Asked about the future prospects of the project, Mr. Su responded that one problem remains unsolved: historical materials are difficult to digitize, yet good digitized data is the foundation of data science. Some raw data, such as graphics or data that cannot be read by machines, still requires manual work at the beginning before it can be quickly analyzed by machines. On this basis, CIPAS and the data scientists are dedicated to improving the application of digital technology to such cases. Internally, they use a natural language processing solution to help researchers find and identify data more quickly in a large database; externally, they use tools like digital storytelling and data visualization to help the public easily understand the research findings of CIPAS.

At the end, the speakers also mentioned that the systems are open-source and accessible for public viewing. Where possible, even the tools for digital storytelling reports are based on existing open-source software. The significance of insisting on and implementing an open-source system is that the analysis tools are open and transparent to the public, so that errors can be detected and corrected quickly. It also serves as a reference for those who want to develop similar projects.

Training workshop by Data for Social Good project of DSP

Before this case study sharing session, we also entrusted the Data for Social Good team from DSP to hold a data science training workshop for democracy advocates. The workshop aimed to give our participants an introduction to data science through lectures and discussions. Through this workshop and conference, FNF Global Innovation Hub hopes to help advocates of democracy, freedom and human rights process data with greater proficiency, and to facilitate interdisciplinary cooperation between activists and data scientists.
