Each academic year, we at Artes Research kick off the Digital Scholarship Module – a training for first-year PhD researchers at the Faculty of Arts – with a session dedicated to research data workflows. Three researchers from the Faculty of Arts offer a behind-the-scenes look at their research workflows by outlining how they approach and structure their research, the tools they use, and the kinds of data they work with. The goal of this session is to provide examples of more advanced workflows for the first-year PhD researchers as they embark on their research journey. Hopefully this recap of the session can spark some inspiration for you!
Vicente Parrilla López – Plain text and structured notetaking
Vicente’s research, which is in the field of musicology, focuses on reviving the Renaissance practice of improvised counterpoint. Besides being a PhD researcher, he is also a musician and recorder player. In his research workflow, Vicente consistently seeks out tools to enhance efficiency and further streamline the structure of his work.
Vicente introduced us to the versatility and accessibility of plain text files, highlighting the benefit of this file format, as it is universally usable across various computers and software platforms. One drawback, however, lies in readability due to the absence of text formatting and smaller typography. Fortunately, applications like iA Writer, which allow users to use markdown to apply additional formatting, address this issue.
Vicente highlights the advantages of using plain text files for structured notetaking in conjunction with applications like iA Writer:
- Distraction-free writing: plain text notetaking ensures an undisturbed writing experience with basic formatting; once you are finished you can preview your text for example as HTML or PDF output.
- Versatility: plain text files are very adaptable; they can be exported to various formats such as HTML for websites, DOC for Microsoft Word, and PDF, and the same plain text format underlies code, data, and markup files such as Python, Java, JSON, CSS, XML, and LaTeX, among others.
- Interconnectedness: notetaking tools like these often incorporate a tagging system that facilitates connections between concepts and ideas.
- Search capability: these tools also offer robust search functionalities, ensuring swift and efficient retrieval of desired information.
An important aspect of Vicente’s notetaking workflow is the integration of structured metadata. Vicente implements a dedicated metadata section at the beginning of each note, enhancing the categorization and contextualization of his notes. In general, adding metadata in a systematic way offers several advantages. By recording key details like creation date, authorship, and related keywords, metadata enriches a note by adding surrounding context. Additionally, metadata enhances searchability by allowing the user to search for specific information or themes across an entire note repository. Lastly, structured metadata can foster collaboration, not only between users but also across different projects.
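To make the idea of a metadata section concrete, here is a minimal sketch in Python. It assumes a hypothetical layout in which each note opens with a block of `key: value` lines between two `---` lines; the session did not specify the exact format Vicente uses, and the field names and sample note below are invented for illustration.

```python
def parse_note(text):
    """Split a note into (metadata, body).

    Assumes a hypothetical metadata block at the top of the note,
    delimited by '---' lines and holding 'key: value' entries.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no metadata block: the whole note is body
    metadata = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":  # closing delimiter found
            body = "\n".join(lines[i + 1:]).strip()
            return metadata, body
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip().lower()] = value.strip()
    return {}, text  # block never closed: treat everything as body


def notes_with_keyword(notes, keyword):
    """Return the titles of notes whose 'keywords' field lists the keyword."""
    hits = []
    for text in notes:
        meta, _ = parse_note(text)
        keywords = [k.strip().lower() for k in meta.get("keywords", "").split(",")]
        if keyword.lower() in keywords:
            hits.append(meta.get("title", "(untitled)"))
    return hits


# Invented sample note, for illustration only.
NOTE = """---
title: Cadence formulas
date: 2024-01-15
author: Example Author
keywords: counterpoint, improvisation
---
Body text of the note in Markdown.
"""

meta, body = parse_note(NOTE)
print(meta["keywords"])
print(notes_with_keyword([NOTE], "improvisation"))
```

Because the metadata lives in the same plain text file as the note, a short script like this can filter a whole folder of notes by keyword, date, or author without any dedicated software.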
Vicente also introduced us to the concept of text expanders. The purpose of this type of software is to replace designated keystrokes, known as ‘shortcuts’ or ‘abbreviations,’ with expanded text segments. Its strength lies in expediting the writing process by swiftly inserting frequently used words or phrases into articles, grant applications, and more. It can also help to easily integrate standardized metadata and bibliographic entries. Using the text expander software allows Vicente to have a streamlined writing experience. When used systematically, it also helps him create consistency across various documents. Moreover, the program saves him the time that would be spent on manually inserting phrases or words he uses frequently in his research and writing.
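The core mechanism of a text expander can be sketched in a few lines of Python. This is only an illustration of the principle, not how any particular text expander is implemented; the abbreviations and the leading `;` convention are assumptions made up for the example.

```python
import re

# Hypothetical abbreviations; real text expanders let users define their own.
SNIPPETS = {
    ";ku": "KU Leuven",
    ";dh": "Digital Humanities",
    ";ic": "improvised counterpoint",
}


def expand(text, snippets=SNIPPETS):
    """Replace every known abbreviation in text with its expansion."""
    if not snippets:
        return text
    # Try longer abbreviations first so a short one never shadows a long one.
    keys = sorted(snippets, key=len, reverse=True)
    # The lookahead keeps ';dh' from firing inside a longer token like ';dhx'.
    pattern = re.compile("(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)")
    return pattern.sub(lambda m: snippets[m.group(1)], text)


print(expand("My thesis on ;ic is a ;dh project at ;ku."))
```

Keeping the abbreviations in one table is what produces the consistency mentioned above: every document that uses `;ku` gets exactly the same expanded wording.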
Stijn Carpentier – Digitized source material and distant reading
Within the Negotiating Solidarity project, Stijn’s research aims to uncover and contextualize the wide variety of contacts between actors within Belgian civil society and the rapidly growing influx of foreign guest workers from the 1960s to the 1990s. Despite labeling himself as a hobbyist in the Digital Humanities realm, Stijn presented to us an inspirational workflow where he merges historical research with digital tools.
Stijn’s journey into DH was triggered by his source material. For his research, he wanted to explore how guest workers in Belgium were communicating about their activities and their ideas through periodicals and other types of serial sources. As the term suggests, serial sources are published at regular intervals, resulting in an overwhelming volume of material that cannot always be read entirely during the timeframe of a PhD project. Consequently, Stijn sought an efficient method to comprehensively analyze this extensive array of sources without having to read them all in full.
The first step to achieve this goal was digitization. Stijn encountered both undigitized and poorly OCR’d digitized sources, prompting him to undertake the digitization process himself. However, digitization is time-consuming; hence, Stijn emphasizes the importance of collaboration with the archives or institutions housing the materials. They may offer assistance in digitizing the content or provide access to their scanning equipment and OCR software. Stijn stresses that while digitized sources offer many advantages such as searchability, it remains crucial to engage with the physical materials. Understanding the contextual nuances of their creation and preservation is imperative, rather than treating them merely as isolated PDF files.
Once he tackled the first hurdle of digitization, Stijn delved into distant reading, a text analysis method enabling insights into vast corpora without the need for exhaustive reading. To conduct this analysis, he used the software AntConc.
Upon uploading his documents to AntConc, Stijn could perform basic word searches and proximity-based word analysis. The tool also enables tracking keyword mentions over time, which helps to get an overview of patterns and how they evolved. As a result, Stijn could efficiently extract core ideas from an extensive corpus, a task that would have been impossible for him to complete during his PhD if he were using close reading methods. Such tools not only extract information but also foster creativity in research, encouraging novel perspectives on the research material that might otherwise remain unexplored.
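For readers curious about what happens under the hood, the three operations described here (keyword search in context, proximity-based analysis, and keyword counts over time) can be approximated in plain Python. This is a rough sketch of the general ideas, not AntConc's own implementation; the function names are hypothetical and the tiny corpus consists of invented placeholder sentences, not material from Stijn's sources.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def concordance(text, keyword, window=3):
    """Return (left context, keyword, right context) for every hit."""
    tokens = tokenize(text)
    keyword = keyword.lower()
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits


def neighbours(text, keyword, window=3):
    """Count the words that occur within `window` tokens of the keyword."""
    counts = Counter()
    for left, _, right in concordance(text, keyword, window):
        counts.update(left.split())
        counts.update(right.split())
    return counts


def mentions_per_year(docs, keyword):
    """Count keyword mentions per year; docs is a list of (year, text) pairs."""
    counts = Counter()
    keyword = keyword.lower()
    for year, text in docs:
        counts[year] += tokenize(text).count(keyword)
    return dict(sorted(counts.items()))


# Invented placeholder corpus, for illustration only.
CORPUS = [
    (1968, "The union invited workers to a meeting. Workers asked about housing."),
    (1975, "A new association for workers opened an office."),
    (1975, "The parish hosted a language course."),
]

for left, word, right in concordance(CORPUS[0][1], "workers", window=2):
    print(f"{left:>15} [{word}] {right}")
print(mentions_per_year(CORPUS, "workers"))
```

A dedicated tool adds much more on top of this, but the sketch shows why clean OCR matters: every step depends on the words in the digitized text being recognized correctly.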
Stijn concluded by comparing Digital Humanities to a Swiss army knife: it is like a versatile tool that doesn’t necessarily need to be the focal point of your project but serves as a valuable instrument for exploring both your sources and your research domain. Beyond that, DH facilitates connections with peers. Belgium boasts a vibrant Digital Humanities community, offering ample opportunities for networking and learning from a diverse group of experts and enthusiasts.
If you want to get involved in the DH community in Belgium you can join the DH Virtual Discussion group for Early Career Researchers. The discussion group meets on a monthly basis via MS Teams. Each meeting features a presentation from a member of the Belgian DH community, a moment to share DH-related news, and a chance to network.
Tom Gheldof – A day in the (tool) life
Tom Gheldof is the CLARIAH-VL coordinator at the Faculty of Arts. Over the years, he has been involved in several projects in the field of Digital Humanities, such as the Trismegistos project at the Research Unit of Ancient History. Currently, he is a scientific researcher on the ‘CLARIAH-VL: Advancing the Open Humanities Service Infrastructure’ project, which aims to develop and enhance digital tools, practices, resources, and services for researchers in many fields of the humanities.
Tom provided an insider’s view of his typical day, shedding light on the various tools he employs:
- Identification: to introduce himself, Tom showcased his ORCID iD, a persistent digital identifier that sets researchers apart regardless of name similarities. It serves as a central hub to which you can link all of your research output. Not only does it boost the visibility of your work, it also streamlines administrative tasks, as you only need to update one platform that you can then connect with your funder, publishers, etc.
- Text recognition: given that Tom’s research relies on manuscripts, he has familiarized himself with automated text recognition. His primary tool for this is Transkribus, a platform that uses machine learning technology to automatically decipher handwritten and printed texts. Through a transcription editor, users within the Transkribus community transcribe historical documents, training the system to recognize diverse text forms – be it handwritten, typewritten, or printed – across various languages, predominantly European.
- Annotation: Tom relies on Recogito for his research on place names. This online annotation tool offers a user-friendly interface for both texts and images. Recogito provides a personalized workspace to upload, collect, and organize diverse source materials such as texts, images, and tabular data. Moreover, it facilitates collaborative annotation and interpretation of these resources.
- Coding: for coding tasks, Tom uses Visual Studio Code, a free code editor compatible with multiple programming languages. To collaborate and access code with open licenses, he turns to GitHub, a platform for hosting code repositories where people share their code, fostering a collaborative coding environment.
- Relational databases: Tom has a lot of expertise when it comes to building relational databases. A relational database allows you to represent complex datasets and the connections between and within different types of data. He uses the FileMaker environment, which has broad functionalities and permits export of the data to a range of other formats.
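To illustrate the relational database idea from the last item, here is a small sketch using Python's built-in `sqlite3` module as a stand-in. It is not FileMaker and not Tom's actual data model; the tables, column names, and sample rows are hypothetical, loosely inspired by the place-name research mentioned above.

```python
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Two kinds of records (texts and places) plus a linking table that
# records which text mentions which place (a many-to-many relation).
conn.executescript("""
CREATE TABLE texts  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE places (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE mentions (
    text_id  INTEGER NOT NULL REFERENCES texts(id),
    place_id INTEGER NOT NULL REFERENCES places(id),
    PRIMARY KEY (text_id, place_id)
);
""")

# Invented sample rows, for illustration only.
conn.executemany("INSERT INTO texts VALUES (?, ?)",
                 [(1, "Letter A"), (2, "Letter B")])
conn.executemany("INSERT INTO places VALUES (?, ?)",
                 [(1, "Alexandria"), (2, "Memphis")])
conn.executemany("INSERT INTO mentions VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1)])


def texts_mentioning(conn, place_name):
    """Return the titles of all texts that mention the given place."""
    rows = conn.execute(
        """
        SELECT t.title
        FROM texts t
        JOIN mentions m ON m.text_id = t.id
        JOIN places p   ON p.id = m.place_id
        WHERE p.name = ?
        ORDER BY t.title
        """,
        (place_name,),
    )
    return [row[0] for row in rows]


print(texts_mentioning(conn, "Alexandria"))
```

Each place is stored only once, however many texts mention it; the linking table carries the connections. That separation is what lets a relational database represent complex relationships without duplicating data.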
To familiarize yourself with these and similar tools and methods, Tom recommends exploring the tutorials that are available at The Programming Historian, a DH journal that offers novice-friendly, peer-reviewed instructional guides.
Through trial and error, the presenters have figured out their own workflows, which can hopefully inspire you to tailor your own data management processes. However, they all emphasized that the best research workflow is the one that works for you. For further inspiration when it comes to DH and research data, consider joining DH Benelux 2024, hosted by KU Leuven. This year’s conference, with the theme “Breaking Silos, Connecting Data: Advancing Integration and Collaboration in Digital Humanities”, is sure to offer plenty of ideas for organizing, manipulating, and sharing research data.