The following is an interview between Alisa Grishin, Artes Research intern 2022-2023, and Heike Pauli, BiblioTech Hackathon participant and MA student in linguistics. The hackathon took place in March 2023. It was a 10-day event and included a pre-hackathon orientation moment called “Meet the Data, Meet the People.” Heike’s group, the Digital Peripatetics, worked on the Lovaniensia dataset. Lovaniensia comprises the old academic collection, featuring the work published by members of the Old University of Leuven and output linked to the university spanning the period between 1425 and 1797. You can learn more about the team’s project by having a look at their project poster in the BiblioTech Zenodo community. To read more about the hackathon and the results, you can visit the BiblioTech website.
Heike delivers her team’s project presentation at the closing event of the hackathon. On the right, three members of the Digital Peripatetics team pose for a team picture.
What first interested you in the hackathon? Have you done one before? What is your background?
I had not done one before. My initial interest was that it was an opportunity to combine my passion for Latin, Greek, and AI. You don’t see that overlap often, and before the hackathon I hadn’t seen a place where I could combine my interests in AI and the Classics. This was a unique opportunity for that. Then a few of my professors mentioned it, and I continued hearing about it, probably like three times, and took it as a sign [laughs].
What was your primary concern when beginning the project?
I would say, for me, the OCR. But that was a surprising concern. We had to skip it entirely due to the quality and the number of errors. Apart from that, we didn’t have much trouble. And once we established that the OCR would be a challenge, we all just worked together. It was a missed opportunity because there was a lot we could have done and a lot available, but since we decided to look at the metadata and not the OCR, it was a bit restricting. I would have liked to look at the text more. There is still a lot of room to explore this data.
What was your primary audience for this project?
Our target audience was mostly people interested in the Old University. Our group leader was a prime example of the type of person we were catering to. We tried to look at the university itself: what was being created, who was there. So, the target audience was basically the people who were in the group [laughs]. It was almost like we were creating a resource for ourselves.
How did you establish your methodology and approach to the dataset? Were you inspired by any other platforms or projects?
Indeed, I think some people were inspired by past projects. We just had to start with the metadata. We did the usual stuff like making a word cloud and other things you would typically do for Natural Language Processing (NLP); it was pretty exploratory. During the introduction at the “Meet the Data, Meet the People” event, the explanation of metadata was also really helpful. And it just started with the basic approach, and then, from there, we knew where to go. We used the tools we already had. I was at first disappointed that we couldn’t make a crazy new tool or code, but it’s challenging to do that in 10 days.
What was the brainstorming process like for this project?
We had a lot of ideas at first. At “Meet the Data, Meet the People” everyone was writing all over the brainstorming paper. Then we had to tone it down to make it coherent and achievable. We picked three or four tasks, which was only the tip of the iceberg. We had to think about what was manageable, so the different analyses were the best approach.
What was your role in the project and how was it different/in line with what you were expecting?
I was expecting that I’d need to program a lot, but I’m not an expert or anything. So it wasn’t necessary; we found that a lot of tools exist which could help with that. I focused more on the language. It was nice because we all had a place where we could best apply our skills. Some people were really great at social network analysis and worked on that, and my job was to work on, or maybe get mad at, the OCR [laughs]. Sometimes it was really accurate and helpful and then other times there were errors. In the end, I found my place on the language analysis side. So, looking at my background, Natural Language Processing was an obvious approach. And I also loved presenting the project at the closing event. We divided the tasks according to what people wanted to do. At first it was hard to organize it that way, but it helped make it a single, coherent project.
How did you use your academic experience to help with the hackathon?
The easiest part was the Latin. We were the most academically prepared for this. For me, it also helped to see that I could do some useful stuff with my interest in computational linguistics; I already knew a bit of Python, so it was nice to apply my DH skills and see that they are useful.
Was there any main motivator/goal that encouraged the team when things didn’t go as expected?
It helped that we all had a background and were interested in the dataset ourselves, so that was our main encouragement. It was within our field and we all had a common love for the material.
What kind of advice would you give to a team doing their first hackathon?
Just do it! Don’t overthink it. Even though there was a prize, the main interest was to get knowledge and to explore the data. We didn’t worry too much, honestly. Sometimes there can be frustrations or things won’t work, but those aren’t necessarily failures and you can explore new approaches.
What kind of advice would you give to someone in your field, specifically?
You can bring a new perspective even if you don’t know about technology. You can help those with a technological background if you’re open to it. There’s a stereotype for people in the Classics that we don’t know about computers and only care about books [laughs], but you don’t have to become a professional programmer during or after the hackathon. The knowledge you gain could be helpful for research, your career, anything. It’s not just for people who know how to program. It’s an option for everyone. Blur the lines.
I found it very useful to have a chance to explore all the fields I’m interested in. I would consider participating again. The 10 days was also a good timeframe in the grand scheme of things.