Sunday, May 14, 2006

A Research Genealogy Project? (2)

I circulated the idea of a Research Genealogy Project among a few colleagues, who have offered some comments, giving me a bit more to ponder, particularly the basic question: what is this really for? What purpose does it serve?

My tentative response at the moment is that the long-term goal is to understand how higher levels of knowledge, understanding and insight can propagate, flourish and advance. At a more mundane level, a large body of genealogy data might offer clues to the kinds of conditions that are more likely to lead to successful research activities, which could be useful for funding bodies.

In terms of a genealogy project based on formal research qualifications, I would focus initially on the relationships rather than the objects. There are many kinds of relationships, and a standard each-way link without any meaning is usually not appropriate: the existing Maths Genealogy project already has some sense of ordering or direction, in which the professor is generally the one who imparts knowledge to the student until the student absorbs and understands it.

There are other inputs that could be modelled, ranging from formal instruction to collaboration to influence. Looking back at my own Ph.D. (Use of Formal Methods for Safety-critical Systems), apart from my supervisor, I was given guidance by a few other staff and learnt from quite a number of researchers in the field. For instance, at the start I had to learn from those who had developed the formal theoretical foundations (e.g. the theory of testing equivalences of processes), whilst others provided certain contextual background (the application domain of medical device communications). When it came to applying some new theory, I used some methodologies (that applied safety analysis techniques) that adapted or built on the work of contemporary Ph.D. students. All these informed and influenced me in my own research, but in different ways.
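To make the idea of typed, directed relationships a little more concrete, here is a minimal sketch in Python. The relationship kinds, the class names and the people in the usage example are all illustrative assumptions on my part, not an agreed vocabulary or real data.

```python
from dataclasses import dataclass

# Hypothetical relationship kinds, loosely following the examples above.
KINDS = {"supervised", "guided", "foundation", "context", "methodology"}


@dataclass(frozen=True)
class Influence:
    source: str  # the person whose work or teaching was imparted
    target: str  # the person who received it
    kind: str    # what sort of input it was


class GenealogyGraph:
    """Directed links from source to target, each labelled with a kind."""

    def __init__(self):
        self._incoming = {}  # target -> list of Influence records

    def add(self, source, target, kind):
        if kind not in KINDS:
            raise ValueError(f"unknown relationship kind: {kind}")
        self._incoming.setdefault(target, []).append(
            Influence(source, target, kind)
        )

    def influences_on(self, person, kind=None):
        """Return sources that influenced `person`, optionally of one kind."""
        return [
            link.source
            for link in self._incoming.get(person, [])
            if kind is None or link.kind == kind
        ]


# Usage with placeholder names (not real people).
graph = GenealogyGraph()
graph.add("Supervisor A", "Student X", "supervised")
graph.add("Theorist B", "Student X", "foundation")
graph.add("Domain Expert C", "Student X", "context")
print(graph.influences_on("Student X", kind="foundation"))
```

Keeping the kind on each link, rather than a bare connection, is what lets you later ask different questions of the same data, such as "who supervised whom?" versus "whose theory did this work rest on?".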

I corresponded with some of these by email, but although it might be interesting to model correspondence between researchers (nice graph theory applications), I can't see how you can dig into these emails in practice and in any case they were just a small proportion of authors that influenced my work.

It's going to be easier if you can work with what has been freely published, which brings us back to the thesis. What if theses could be marked up in such a way that you can extract meaning? Then you could know, for a particular thesis, whose work had provided the foundations and who was doing similar work. This is a task for experts in knowledge representation, retrieval and analysis. Patterns might emerge that show coalescence among some theses, where a lot of researchers tackle a popular topic and related issues; further, some theses may show a lot of interconnectivity not only within subject areas but across subject areas, which might suggest making particular areas for co-operation and joint conferences more explicit. On the other hand, some research may be shown to go out on a limb and have little to do with other work. Some nice visuals will make this much easier to see!
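As a rough illustration of the kind of pattern-finding I have in mind, the sketch below assumes each thesis has already been marked up with a subject area and a list of the theses it builds on. The record layout, the identifiers and the subject areas are made up for the example; real markup would be much richer.

```python
from collections import Counter

# Hypothetical marked-up records: thesis id -> subject area and
# the ids of theses whose work it builds on.
theses = {
    "T1": {"area": "formal methods", "builds_on": ["T2", "T3"]},
    "T2": {"area": "formal methods", "builds_on": []},
    "T3": {"area": "medical devices", "builds_on": []},
    "T4": {"area": "safety analysis", "builds_on": []},
}


def cross_area_links(records):
    """Count links that join two different subject areas."""
    counts = Counter()
    for record in records.values():
        for other_id in record["builds_on"]:
            area_a = record["area"]
            area_b = records[other_id]["area"]
            if area_a != area_b:
                # Sort the pair so (a, b) and (b, a) are counted together.
                counts[tuple(sorted((area_a, area_b)))] += 1
    return counts


def isolated(records):
    """Return theses that neither build on, nor are built on by, any other."""
    linked = set()
    for thesis_id, record in records.items():
        if record["builds_on"]:
            linked.add(thesis_id)
            linked.update(record["builds_on"])
    return sorted(set(records) - linked)


print(cross_area_links(theses))
print(isolated(theses))
```

Pairs of areas with many links between them might be candidates for joint conferences, while the isolated list is one crude way of spotting research that has gone out on a limb.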


Thurston said...


I am the person who created the academic genealogies at the University of Notre Dame. My inspiration was visiting the chemistry dept at Michigan State Univ. They put the academic genealogy of the dept on the wall of the main lobby of the bldg.

The project of creating an academic genealogy at first seems rather easy and straightforward. But the more I have grown it, the more difficulties I encounter - adding updates, verifying information, site maintenance, and personally, a lack of computer experience. I am beginning to think that a wiki might be the way to distribute/centralize the work.

You might be interested to know about the Chemistry academic genealogy at Univ. Illinois. Vera Mainz, the creator, tries to include who was influenced by whom in addition to the advisor, plus a little bio for each person.

Paul Trafford said...

Hello Thurston,

Thanks for leaving your comments. It's nice to hear how your project was sparked, though, as you have experienced, any kind of 'family' genealogy project soon becomes quite a commitment if the number of maintainers doesn't grow in line with the number of people in the genealogy. It then becomes very much a task for dedicated enthusiasts only!

Your idea of a wiki certainly addresses the problem of scalability, distributing the workload and allowing the project to grow organically. It's got a good chance of coverage now that the majority of academics are online, but who can edit whose entry? It's difficult to assure the accuracy without some form of identity management. Such issues have been quite exposed in Wikipedia, though I'm actually very fond of this encyclopaedia and carry a snapshot with me on a PDA.

I'm lucky in that I get to chat to some tech-savvy people at work (often on Friday lunchtimes, when fish and chips is a popular British menu item!), one of them being Stuart Yeates of OSS-Watch. He suggests (being rather blunt about the scalability issue) RDF/FOAF as a solution, which very much puts individuals in control of their own records, though how does one fill in the gaps?
Of course, maintaining one's own record won't be sufficient as it only covers people alive today. I also think that there are certain structures that need to be designed in to allow us to answer certain questions.

This project could be a real test of the 'semantic web' :-)