
Saturday, May 31, 2025

Two Research Papers on Religious Experience and Artificial Intelligence

I have just drafted two papers: How might Artificial Intelligence be Approached for Research in Religious Experience? (available on Academia or on my own website), and How might Artificial Intelligence Support Research in Religious Experience? (also available on Academia or on my own website).

These papers came about after the Journal for the Study of Religious Experience  put out a call for papers for a special issue on AI. I was initially reluctant to respond as I had (and still have) reservations about the list of topics and areas. However, I was urged to apply by a member of the Alister Hardy Trust, who pointed out that I have studied both computer science and religion. So, I sent off an enquiry, establishing that this was to be the first issue to treat AI.

After a bit of further thought, I submitted the following abstract for consideration:

This paper explores the subject of Artificial Intelligence with a view to enhancing research activities in religious experience.  It starts with a broad examination of AI from a historical perspective as an evolution in notions of computation and in the development of machines, with comparisons made between machine and human intelligence. Further evaluation is carried out on the nature of the systems and theories, using a Buddhist lens, focusing especially on intention, agency and creativity, with references from the Theravada canon.  Observations on stillness draw out profound distinctions, providing indications of what is or is not appropriate in terms of AI expectations and engagement.  From these considerations, a practical application of AI emerges in terms of augmenting human intellect, along the lines envisaged by Douglas Engelbart, a pioneer in the field. Accordingly, initial indications are provided of how the work of RERC may be enhanced by the analysis of its archive of personal testimonies using machine learning to implement techniques in natural language processing.  These may include topic analysis of, for example, subject matter and circumstances; and sentiment analysis of emotional states induced, thereby helping to corroborate and extend existing findings.


The journal’s editorial team accepted the proposal, indicating that they were particularly interested in the possibility of AI assisting with the database of accounts at the Alister Hardy Religious Experience Research Centre (RERC). So I subsequently registered for database access, chose a theme (stillness) as a basis for comparing human and machine intelligence, and proceeded to draft a paper, which I managed to submit by the deadline.

My submission was reviewed and after a few weeks I received some feedback. This made me pause to reconsider, and after a few days of deliberation I decided to withdraw my submission. I shan’t go far into the reasons, but evidently I had not conveyed the main argument, set out in the opening pages, clearly enough, and the various applications of AI drew attention away from that argument towards the affordances AI offers, which is exactly what I wanted to avoid. So the paper needed to be split into two (the argument followed by the application), whilst keeping the existing order. After the split, I expanded the first paper a little to add a key message: scholars and practitioners of spiritual and religious experience should become more involved in the evaluation of AI, especially because they have a fuller understanding of what it means to be human.

It is becoming common for universities to address this question (as a Google search over the Oxford University domain shows), quite often in interdisciplinary panels, but I find these do little to affect the general direction of travel, which continues with rapid extension and expansion of development, albeit with some regulatory hurdles. One of the underlying problems is the repeated pairing of human intelligence and machine intelligence, as though the two were comparable side by side. Other established terms reinforce the anthropomorphic language (machine learning, deep learning, and so on), which compounds the issue and leads to tacit acceptance.

There are, in my view, more suitable alternatives. Whilst working on the KEPLAIR project, I was made more aware of the contrast between neural networks as a ‘black box’ technique and transparent techniques based on logical reasoning. In this connection, I was introduced to Douglas Engelbart’s work on Augmenting Human Intellect. Rather than frame questions around AI, we can frame them in terms of enhancing humanity. Accordingly, this framework underpins the second paper, which is more conventional in that it suggests machine learning applications to support research into religious experience, particularly the RERC accounts. These lie largely in the area of hermeneutics (for which Voyant Tools is well suited), though the requirements on data protection encourage more unusual possibilities.
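Purely by way of illustration (this is not taken from either paper), here is a minimal JavaScript sketch of the kind of lightweight analysis the second paper has in mind: keyword-based topic tagging and lexicon-based sentiment scoring. The accounts, topic keywords and sentiment lexicon below are invented placeholders, not RERC data, and a real study would of course use proper NLP tooling.

  // Minimal sketch: keyword-based topic tagging and lexicon-based sentiment
  // scoring of testimony texts. All data below is an invented placeholder.
  const accounts = [
    "A deep stillness came over me while walking alone in the hills.",
    "I felt afraid at first, then an overwhelming sense of peace and joy."
  ];

  const topics = {
    stillness: ["stillness", "silence", "calm", "quiet"],
    nature: ["hills", "sea", "garden", "walking"]
  };

  const sentimentLexicon = { peace: 1, joy: 1, calm: 1, afraid: -1, fear: -1, despair: -1 };

  // Lower-case word tokens only; punctuation is discarded.
  const tokenize = text => text.toLowerCase().match(/[a-z']+/g) || [];

  function analyse(text) {
    const tokens = tokenize(text);
    const topicHits = {};
    for (const [topic, words] of Object.entries(topics)) {
      topicHits[topic] = tokens.filter(t => words.includes(t)).length;
    }
    const sentiment = tokens.reduce((sum, t) => sum + (sentimentLexicon[t] || 0), 0);
    return { topicHits, sentiment };
  }

  accounts.forEach((text, i) => console.log(`Account ${i + 1}:`, analyse(text)));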

The papers are currently unpublished drafts. I’d welcome suggestions for open access publication.

Monday, May 03, 2021

Now available: Buddhism and Computing

(Subtitle: A milestone is but a beginning)

Buddhism and Computing: front cover

I’m pleased to announce that Buddhism and Computing: How to Flourish in the Age of Algorithms has been released and is now available.

The book marks the culmination of investigations spanning more than a decade, focused on the design of technology that benefits humanity in the long term.  The enquiries have been primarily from the perspective of Buddhist ethics, but are substantially interdisciplinary, brought together in a way that I hope will foster much-needed dialogue involving everyone.

In that spirit, I'm pleased that the Daily Alternative has published an introductory piece.

I very much welcome feedback – comments, reviews, emails, etc.  Please spread the word!

Thank you.

- Paul


Saturday, April 10, 2021

Adventures in the Read/Write Web

[This is an article about my explorations of the Internet, particularly some of the tools and applications I've used and developed to foster collaboration and enable broad participation in the 'read/write' Web.  It has involved various experiments with software and hence is quite technical in nature.  Sorry if it's difficult to understand, but I'm happy to try to explain and clarify.
Updated 2 October 2021 with a section on static and offline search
]

I became acquainted with the Internet at the third time (and university) of asking.   My first hint came in 1988, as a maths undergraduate at Southampton University.  A friend studying computer science showed me pages and pages of dot matrix printout that he had received by electronic mail from his friend at Warwick University.  I was unmoved.   The second hint came in 1991, as a maths postgrad at Glasgow University, when another student shared with me the joys of e-mail with friends from abroad, pointing to her computer terminal.  Again, I passed.

Two years later, in 1993, as a PhD student in computer science at Kingston University, I found the Internet intrinsic to my research and it was then that I dived in; I soon became immersed in e-mail, Usenet, Gopher, and another service that seemed to be taking the world by storm, the World Wide Web (or, simply, ‘the Web’).

At that time, I shared a research lab with students from various other disciplines, including Maria Winnett, who specialised in Computer-Supported Cooperative Work (CSCW).  Although I did not know it at the time, these kinds of initiatives were just further iterations of efforts ongoing over the decades, exemplified by Douglas Engelbart’s ‘Mother of All Demos’ of 1968 (in brief / in full) and later reflected upon, inspired in turn by Vannevar Bush’s 1940s essay “As We May Think”.  Furthermore, these approaches would, perhaps unconsciously, influence some of my own work.

Tim Berners-Lee had conceived and implemented the Web only a few years before.  A core part of his vision was that the Web should be read/write; the first browser was thus a browser-editor, called WorldWideWeb.  Whilst most of the ingredients were in place — the http protocol, HTML, Web browser and httpd server — some aspects were not complete.  For instance, his editor could only ‘write’ to local files; the HTTP PUT request method was yet to be devised (more about this later …)

I first explored the Web using NCSA Mosaic, which I also used to take some baby steps in authoring my first HTML markup using its Personal Annotations feature.  I then started contributing to departmental pages and I was on my way, but my usage was conventional and most of my attention was on my own research.

OSS Wisdom and Easy Web Editing

Whilst I had been using Unix (Solaris) for my research, I wasn’t really cognisant of free and open source software until I started preparing for my first full time job in 1998 at the University of Derby.  I took over the technical development of MultiFaithNet, an online gateway to religious resources.  The site needed a new home; I was asked to set up and maintain a new server, something of which I had zero experience.  Faced with a dwindling budget, I bought a book on RedHat 5.1, a distribution of the Linux operating system, and installed it on my home computer from the accompanying cover CD.  I acquainted myself with the main components, including the Apache httpd web server, CGI/Perl and regular expressions.  PHP and MySQL came a bit later.

The site contained a mixture of informational content and community-oriented facilities.  The project team maintained editorial control with some pages retaining the same structure, but being revised on a fairly frequent basis.  Team members really wanted something as easy to use as a word processor without having to know HTML or worry about the technicalities of uploading to a server.   Netscape anticipated such demand with Netscape Composer, which added editing facilities to its Navigator browser.   It provided a WYSIWYG interface, making it amenable to all my colleagues, who were not concerned with the underlying HTML that it generated.

There remained the problem of how to enable an easy means to upload edited files to the server.  I found a solution thanks to the introduction of the PUT request method, in the HTTP/1.1 specification of the international Web standards.  Such a facility had been missing when Sir Tim was first editing web pages using WorldWideWeb (they could only be updated locally under the file URI scheme).  The provision of PUT was a pivotal step in opening up the read/write web and, on the client side, support was quickly added to Netscape Composer.  

Accordingly, I followed the guidance and implemented support for the PUT request method on the server.  However, as the Apache Week article intimates, it came with risks and soon became deprecated, so I secured it with various measures, as per the article, adding IP address restrictions for good measure.
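For anyone curious what such an upload looks like at the protocol level today, here is a minimal sketch using the modern fetch API. The URL is a placeholder; in the setup described above the client was Netscape Composer, and the authentication and IP restrictions were applied in the Apache configuration rather than in any script.

  // Minimal sketch of a read/write-Web page upload over HTTP PUT, using the
  // modern fetch API (the original client was Netscape Composer). The URL is
  // a placeholder; access control is assumed to be enforced server-side.
  const html = "<html><body><h1>Edited page</h1></body></html>";

  fetch("https://example.org/pages/index.html", {
    method: "PUT",
    headers: { "Content-Type": "text/html" },
    body: html
  }).then(response => {
    // Expect 201 Created for a new page or 204 No Content for an update.
    console.log("Server responded with status", response.status);
  });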
 

Annotations in CritLink

MultiFaithNet was a platform for dialogue and engagement.  To support this paradigm in the technology infrastructure, I explored web annotations and came across CritLink (and the wider toolset, CritSuite), a project of the Foresight Institute developed by Ka-Ping Yee.   It used a proxy server approach to add comments non-invasively, with a rich feature set including bi-directional links, which are not natively supported on the Web.

I quickly felt it had a lot of potential to support collaborative working, and downloaded and installed it on MFN with the aim of encouraging internal use to begin with.  I also contacted Ka-Ping Yee to give some encouragement, suggesting that he present his work in the field of CSCW.  Perhaps already having this in mind, he duly delivered a paper.  As I started speculating about what lessons might be learnt about free and open source software, I mentioned CritSuite in Open Sources: A Higher Consciousness in Software Development, a paper I gave at an unusual conference, To Catch the Bird of Heaven, which hosted various perspectives on wisdom.

However, these initiatives subsequently came to a halt as funding for the MultiFaithNet project dried up and I needed to find other employment.  The uptake of CritSuite never reached critical mass, partly because of a lack of resources (reflected in the Crit.org server often struggling with poor performance) and partly because rivals took some of the limelight.  Of probably greater bearing, though, was that larger organisations took exception to having anyone provide ‘frank’ comments, which they regarded as akin to graffiti; there was no moderation (apart from the facility to comment on comments).  Nevertheless, among those willing to give CritLink a try, it gained considerable respect.

NetWedit: A WYSIWYG Web editing widget

Today, it’s fairly easy for anyone to create and maintain their own websites.  Probably the majority of web content is written and updated through web forms, much of it in content management systems such as WordPress, which accounts for a large proportion of all web sites.  WordPress was designed as a blogging platform, and blogging was, according to Sir Tim when interviewed by Mark Lawson for the BBC in 2005, a major step towards the read/write web.

Blogging consists not only of jotting down one’s thoughts, as in a manuscript, but also of presenting them as a Web document.  A key enabler, usually powered by JavaScript, is the WYSIWYG editing box.  It is now taken for granted, but twenty years ago, Web editing wasn’t so amenable.  Whilst there were sporadic attempts to realize in situ editing via the browser’s own client, as I had explored with HTTP PUT, the normal procedure for updating websites was more convoluted.  Typically, an author would write copy and send it to someone with technical skills to prepare a web version on their PC using an authoring tool such as Dreamweaver or else hand code the HTML and CSS.  Then the files were transferred in a separate process via ftp and finally reviewed by the original author.

However, by the turn of the century, database-driven content management systems (CMS) were already emerging, where it was no longer a matter of editing individual pages, but instances of data that were supplied to templates.  Some CMS had large numbers of users who each needed to maintain a few paragraphs and update them whenever they liked.  I was faced with such a situation after I had moved down to Oxford to join the Humanities Computing Unit: as part of an online teaching resources database system, theologians needed a simple way to edit some content through a web-based interface.

With necessity being the mother of invention, I released my first open source contribution in 2001 under the LGPL – I’m not a great fan of 'infective' licenses. :-/  It was a rich text editing widget called NetWedit, designed for Netscape Navigator (version 4 upwards) and also supported by Mozilla 0.9+ and IE4+ (Windows only).  It gained some exposure in HEIs after its inclusion in Paul Browning’s TTW WYSIWYG Editor Widgets list.  Apart from the Java solutions, all the other entries on the list would only work on one operating system or in one browser; NetWedit was possibly the world’s first non-Java-based cross-browser solution, albeit not fully cross-platform.

Implementation

I was not a JavaScript guru, so I chose the path of least resistance, targeting the textarea HTML input area, which is generally used as a form element for more substantial amounts of text.  For users to make dynamic updates without server-side scripting, I figured that I needed editing to take place in a pop-up window, whilst storing the HTML source in the textarea box of the parent window.  I realised that dealing with user interactions, especially changes to content, would be a complex process, so I settled on a workflow process whereby the user would type text (without markup) in the boxes and then press a button to launch a pop-up, in which to then carry out the markup (formatting, links, etc.).  So, it’s more accurate to say that NetWedit is a markup tool.

Once I had figured out this approach, the implementation was quite neat.  The code consisted of a single JavaScript library file.  When loaded on a given page, any number of form textarea elements could be enabled for WYSIWYG markup.  Here’s an extract from a sample page with two editable areas.

NetWedit example 1

To the left you can see a bit of text followed by a couple of textarea boxes in a Web form.  These boxes are actually the HTML source views.  Normally with rich text editors, we see either the source view or the rich text view, not both, but a feature of NetWedit is that you can see both side by side.  It’s thus a handy instructional tool.

I’ve pressed the [Markup 1] button to launch a markup window.  In fact, launching the window itself already showed the use of the <p> tag.

NetWedit example 2

Any markup applied in the popup was instantly reflected in the source windows, thereby offering interesting possibilities to learn the markup. Thus, highlighting the word, ‘sample’, and pressing the [U] button is instantly reflected as follows:

NetWedit example 3

As mentioned above, this ‘editor’ didn’t actually allow edits to the text itself - these had to be done in the source textarea.
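Stripped of the screenshots, the pattern can be reduced to a short sketch. This is not the original NetWedit source, just an illustration of the parent-textarea-plus-markup-popup arrangement, with invented names:

  // Reduced sketch of the NetWedit pattern (not the original source).
  // The parent page keeps the HTML source in a <textarea>; a popup window
  // shows the rendered view, and any markup applied there is written back
  // to the textarea, so source and rendered views sit side by side.
  function openMarkupWindow(textareaId) {
    const textarea = document.getElementById(textareaId);
    const popup = window.open("", "markup", "width=600,height=400");
    popup.document.write("<div id='view'><p>" + textarea.value + "</p></div>");
    popup.document.close();

    // A toolbar button in the popup would call something like
    // applyMarkup("b") or applyMarkup("u").
    popup.applyMarkup = function (tag) {
      const view = popup.document.getElementById("view");
      view.innerHTML = "<" + tag + ">" + view.innerHTML + "</" + tag + ">";
      textarea.value = view.innerHTML;   // the source view updates instantly
    };
  }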

Visual Shenanigans

In the Document Object Model of that time, there was only one method to capture user-selected text, document.getSelection(), which returned a string.  (Later revisions returned an object and allowed one to determine exactly where in the DOM the selection was made.)  With only a string, however, there was no immediate way to locate the selection's whereabouts - if you selected ‘the’, which ‘the’ was it?  There was nothing more granular.

I got round the limitation by using a kind of visual deception through (an abuse of) CSS, inspired by WordPerfect’s ‘reveal codes’.  Using the <span> tag, I inserted a unique sequence of characters after the first character of each word, hidden by a CSS setting of zero display width.  When the intended word was selected, getSelection() would include the invisible characters, so that when it came to identifying a particular instance of a word, the system was actually searching for a unique string.

All is revealed by viewing the frame source for the text being marked up:

NetWedit example 4

Behold the generated code in its marked-up glory!  Note especially the fragment, <b>t<span class=hidden>|7|</span>o</b>,

NetWedit example 5

But none of this formatting was applied to the source, so only the intended markup, <b>to</b>, was retained:

NetWedit example 6
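In outline, the trick can be sketched as follows (again, not the original NetWedit code; the marker format and class name are illustrative):

  // Sketch of the 'invisible marker' trick (not the original NetWedit code).
  // Markers are injected into the plain text typed into the textarea, before
  // any markup is applied, so that the string returned by
  // document.getSelection() identifies exactly which occurrence was selected.
  // CSS assumed elsewhere: .hidden { font-size: 0; } /* one way to get zero display width */
  function injectMarkers(text) {
    let n = 0;
    return text.replace(/\b(\w)(\w*)/g, (match, first, rest) =>
      first + "<span class='hidden'>|" + (n++) + "|</span>" + rest
    );
  }

  function stripMarkers(selection) {
    return selection.replace(/\|\d+\|/g, "");
  }

  // Selecting the second 'to' on a page might come back as "t|7|o", which is
  // unique even though the visible word "to" is not.
  console.log(injectMarkers("From here to there and back to here"));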


The widget was successfully deployed in the Theology Faculty Teaching Resources site, where Faculty could maintain their own profiles – research interests and publications.  It was considered simple to use and was all that was needed at that time. 

I also received quite a bit of response to the release of the software, especially in the first year or two, when it was tried in custom web-based content management systems (WordPress wasn’t released until 2003).  I even wondered about deploying it in CritSuite to make it easier to make granular text selections.   However, as browser support became more comprehensive, more sophisticated solutions such as FCKEditor came along (also in 2003), and I knew I would have to take a back seat, though I did try to up my game with RTWedit.   

Alas, my editing widgets don’t work in any modern browsers, because the method I used was superseded, though I guess it’s just a matter of working out what the new methods are (and hopefully, there’s no new security constraint).   However, it's still possible to see NetWedit, as released in 2001, working on Windows 10: download Netscape 9 from an archive and then launch Navigator and load the page from the Wayback machine (under plain http).

The Aesthetics of XML editing via XSLT

The problem with HTML is that it became increasingly about presentation, despite being in fact a special instance of SGML, designed for documents in general and their semantics (think about how to encode Shakespeare!)  This fact was emphasized by colleagues in the HCU, especially its Director, Lou Burnard, and his expert assistant, Sebastian Rahtz. They were authorities on semantic markup and had a great influence on web developments at OUCS.

I discovered this when working on Interactive Audio Comprehension Materials (IACM), a web-based system to train students in various modern languages by listening to passages and answering questions to check their understanding.  The data – passage content and multimedia timings – was stored entirely in XML.  With Sebastian's help, another team member, Paul Groves, had already developed a delivery system using Perl’s XML::Sablotron, but now the Faculty wanted to add a web-based editing system.

So I devised a system that took a user's web form input and turned it into custom XSLT that transformed one XML document into another.  As I later reflected, it had the nice aesthetic property of being amenable to recording not only changes to the data, but how those changes were made.  I showed this solution to Sebastian, who was intrigued and, characteristically, almost immediately wondered whether it could handle more complex scenarios.  But the need didn’t arise.
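As a toy illustration of the idea (the element names and the edit are invented, and this uses the XSLTProcessor built into modern browsers rather than the original server-side Perl setup): the edit itself is expressed as a stylesheet, namely an identity transform plus one template that replaces the element the form changed.

  // Toy sketch of form-input-to-XSLT editing (element names invented; the
  // real IACM system ran server-side with Perl and XML::Sablotron).
  const xmlSource = '<passage><title>Old title</title><line n="1">Bonjour</line></passage>';
  const newTitle = "New title";   // imagine this came from the web form

  const xsltSource =
    '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">' +
    '  <xsl:template match="@*|node()">' +
    '    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>' +
    '  </xsl:template>' +
    '  <xsl:template match="title"><title>' + newTitle + '</title></xsl:template>' +
    '</xsl:stylesheet>';

  const parser = new DOMParser();
  const processor = new XSLTProcessor();
  processor.importStylesheet(parser.parseFromString(xsltSource, "application/xml"));
  const result = processor.transformToDocument(parser.parseFromString(xmlSource, "application/xml"));
  console.log(new XMLSerializer().serializeToString(result));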
 

Ingredients for Innovation and Collaboration

OUCS was a very collaborative environment; it helped that almost everyone had a certain level of technical competence and many were experts in their field, so exchanges were fruitful.  Everything was available in-house – from R&D to hosting and administration, which facilitated greater exploration and faster turnarounds.

The department was prepared to experiment and take risks, which yielded a lot of innovation.  In particular, Sebastian architected the OUCS web site based on TEI XML.  It meant all staff having to learn TEI, which required some effort, but at least they could author content using an editor of their choice, ranging from vi to XMetaL.  Once the documents were written, the system effortlessly delivered HTML, PDF and many other formats using Apache AxKit.

OUCS is now history, along with many of the processes, systems and services that were developed.  Nowadays, the content and presentation of departmental websites are more managed and controlled.  The presentation is more visual, yet the result seems to me to be increased homogeneity, not only in look and feel, but in the kind of content.  It reads well because it’s edited for a general audience, yet it feels intellectually less exciting, lacking some freshness and spontaneity.

Nevertheless, aware of cycles in development, I remain optimistic, as upward spirals are always possible.  Looking at annotations, among the various initiatives Hypothes.is seems to be gaining traction and keeping alive the collaborative spirit of the early pioneers.  There are still some good prospects for a truly read/write Web.

Offline Search for Content Management Systems

Core to ‘reading’ the web is search, another facet we can delve into.

Again in 2001, I became involved in the development of a multimedia Chinese language learning system, featuring a range of interactive exercises.  It was implemented as a website driven by CGI/Perl.  I was asked to ‘put it on a CD’, with the option to support further updates to its contents.  I eventually delivered a production system, with admin facilities, that output a static site according to a specified folder hierarchy.  There was, however, one wish that I never got round to fulfilling – a search function.  The most promising cross-platform solutions available at the time were mainly based on Java, but I couldn’t manage to incorporate them.

Almost twenty years later, in 2020, I am using WordPress and have almost the same requirement – I wish to generate a static version of a WP site that I can search offline.  Surely there’s a nice plugin available?

I duly wandered over to the WordPress plugins directory.  After a while, it became evident that the search facilities available required some server infrastructure, whether that was a third-party service like Algolia or some experimental ‘serverless’ solution.  All of these, ‘serverless’ or otherwise, still require server infrastructure, even if the site itself is fully static.  They are no use for searching a static site on a memory stick when you have no Internet.

Looking for something that I might (legally) adapt for my purposes, I found WP Static Search, a plugin that looked far more promising, being built on an established JavaScript package, Lunr.js.  The author is a developer working with this kind of technology in commercial applications, but it turned out that the plugin had some bugs, and the author hasn’t yet responded to suggested fixes.  Even so, I could make the changes myself by forking the repository on GitHub.

After applying the fixes, I then tackled the requirement to make this work offline (without reference to any server).  The key was to take account of the JavaScript security model’s restriction on loading local files – basically, a search index stored in a separate data file cannot be fetched in the usual way when the site is opened from disk.  So the index had to be incorporated into the library itself, in one of the JavaScript files.  Thus, I modified the index-builder to write the index inside lunr-index.js itself.  I’ve published the modified code on GitHub: https://github.com/paultraf/wp-static-search .  You can go ahead and download the master zip, but then rename it wp-static-search.zip before installing in WordPress.
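For anyone curious, here is a rough sketch of that index-building step as a stand-alone Node script; the paths, variable names and field names are illustrative and do not reflect the plugin's actual layout.

  // Rough sketch of building a Lunr index and writing it *inside* a
  // JavaScript file, so a fully offline page can pull it in via a <script>
  // tag instead of fetching a separate data file (which the browser's
  // security model blocks for pages opened from local disk).
  const fs = require("fs");
  const lunr = require("lunr");

  const pages = JSON.parse(fs.readFileSync("pages.json", "utf8")); // [{ url, title, body }, ...]

  const index = lunr(function () {
    this.ref("url");
    this.field("title");
    this.field("body");
    pages.forEach(function (page) { this.add(page); }, this);
  });

  fs.writeFileSync(
    "lunr-index.js",
    "var LUNR_INDEX = " + JSON.stringify(index) + ";\n" +
    "var LUNR_PAGES = " + JSON.stringify(pages) + ";\n"
  );
  // The search page then includes <script src="lunr-index.js"></script> and
  // restores the index with lunr.Index.load(LUNR_INDEX) - no server needed.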
  
It’s rudimentary, but it works.  I use it for the Sigala Research  site, in conjunction with wget, which creates the static version for me.  


Friday, March 26, 2021

Press release: Buddhism and Computing

I am pleased to share the press release for Buddhism and Computing: How to Flourish in the Age of Algorithms (Mud Pie Books), which will be published on 1 May.  

Download the PDF file (or click on the image below).



A few notes are available in a previous blog post.  More to follow.


Friday, December 04, 2020

Forthcoming book: Buddhism and Computing

During the past decade and a half, I have explored a range of topics on this blog relating to Buddhist theory and current practice (e.g., on interventions).  I’m pleased to report that many of these ideas and strands of thought have been distilled in a forthcoming book, Buddhism and Computing, which will be published by Mud Pie Books on 1 May 2021 (pandemic permitting). 

Its chapters are:

  Prologue
  1 Numbers and Computation
  2 Buddhist Ethics
  3 Computers and Intelligence
  4 Modelling Mind and Processes
  5 Designing Sustainable Social Networks
  6 Ensuring Online Safety
  7 Are We Here Yet?
  Epilogue

Being just a small volume, there isn’t space for a comprehensive introduction to Buddhism, but I have tried to provide some of the basics (a great exercise in being concise!).  The application to computing draws on my background in mathematics, computer science and IT. 

We are faced with many concerns about the potential impact of machines and AI for the future of humanity.  They are widely reported in the media, but the responses tend to be focused on particular issues and get immersed in technical and legal details.  So, as a start, we need to create the cognitive space to delve more deeply into the nature of human agency and well-being.  Then we are in a better position to evaluate AI and other advanced technologies.  Everyone should be involved in this process.

The book will be a contribution to the 'Mud Pie Slices' series, where readers can explore many interesting and sometimes unexpected connections that Buddhism offers.  To be informed about Buddhism and Computing and other forthcoming titles, please subscribe to the Mud Pie mailing list.  (Note that after submitting your details through the form, you should receive an email asking for confirmation.)

Several interesting titles are already being lined up…


Thursday, October 28, 2010

New PDFs for an Old Ph.D. in formal methods

I completed my Ph.D. in 1997 on The Use of Formal Methods for Safety-Critical Systems. Happy to share the findings, I put copies of the thesis online, but whilst LaTeX -> dvi -> PostScript may have been routine practice for users of UNIX systems running X Windows, it was not very Web friendly for many others (much as Ghostview is good, it doesn't have such popular appeal!). Finally I've got round to converting them to PDFs and have added them to my PhD page. I tried a few years ago, but my initial attempts generated huge files and I couldn't work out why; fortunately ps2pdf.com's settings were far more reasonable and the entire thesis would almost fit on a 1.4MB floppy disk!

As to the subject matter, I was unable to progress the research as I wished; the small group at Kingston University soon petered out and there were relatively few openings elsewhere, so that's when I moved more towards I.T. from computer science (they are very different activities). In the '90s there were growing hopes that formal methods would gain a more general foothold, but when I glance at FM sites now, it seems their use remains very niche; references to LOTOS, the process algebra that I used to model a communications protocol for medical devices, point to materials that are rather old - the World-wide Environment for Learning LOTOS is indicative of this. There are still research activities, typically in compilation, but overall it's a bit surprising and disappointing. Yet given the greater computing power on tap, particularly cloud computing, perhaps this area may yet develop a lot further...?

Sunday, January 14, 2007

ZX Spectrum and Scrabble Nostalgia

I've been prompted to delve back into my teenage years by a surprising article in the February issue of PC Pro which urges us to 'Forget 3D games, says Dick Pountain, it's Scrabble that PCs really need to get to grips with.'

From an early age I had a penchant for words and numbers, their calculation and manipulation. This manifested in several ways during my time at secondary school: I became keen on Scrabble and I started teaching myself to program computers.

Like many of my generation, I owe my first steps to Sir Clive Sinclair: initially, I started with a Sinclair ZX81, which my parents kindly bought for me for Christmas. With 1K of RAM, I was limited in what I could develop, though I was able to validate UK VAT registration numbers! Nevertheless, it was enough to introduce me to a new world and I could write my first programs - in Sinclair's implementation of BASIC. Within a year, there was another breakthrough with the ZX Spectrum and soon after I persuaded my parents again to invest in this new toy that boasted 16K RAM, 16 colours, sound and a wider range of software titles.

I then made a concerted effort to produce a Scrabble program, where the computer could act as one of the players. Whereas previously I had been content to write everything in BASIC, in this instance, I learnt sufficient Z80 machine code to be able to convert the main 'thinking' algorithms. Result: the computer responded in a few seconds rather than a couple of minutes!

It was quite a work of art to cram as much as possible into the 16K. The computer had a 500-word vocabulary and included an algorithm for ensuring it played the highest-scoring move, but generally it wasn't a strong opponent. In actuality, 7K was devoted to the screen display, which could be set as part of the procedure to load the program. Normally that leaves you with just 9K, but I allotted the display space to a detailed logo in one section and a simple blocky 'Welcome' title in another, which 'hid' about 2K of instructions that could be read as a scrolling message along the bottom of the screen. Once the game had started, though, the whole screen display was refreshed, so these instructions could not be revisited unless the game was reloaded.

Having played Scrabble competitively, I wanted to see a version developed that was a much stronger player. After a while, a highly polished product appeared, Psion Scrabble. I wrote to them at the beginning of '86 and described tactics that could enhance the software. Three months later I received a kind response thanking me for the ideas and wishing me well in my 'A' levels, but the overall message was that Psion was going to concentrate on the development of its hardware products. (Perhaps I should have bought some shares?!)

I stopped development of the code around that time, but retained some interest in how tactics could be encoded. About 10 years later, I happened to come across a journal article by Steven Gordon concerning Scrabble algorithms. I corresponded a little by email and learnt that he had implemented a number of similar ideas, but I think far more systematically! So he's probably a good contact for Mr. Pountain.

Over 20 years later...

I've come across an old cassette tape with a copy of the program on it and, having invested in an external sound card to digitise these tapes, I decided to undertake a conversion. After some fiddling, I worked out the right settings and the process works fine - the Creative Player is able to sample at the right frequency and bit depth, and I used MakeTZX to convert the recording into a tape archive format.

If you're curious to see the program (and don't have high expectations!) you are welcome to download a copy, available as a zip package. When you've unzipped the package, you will see a number of files, including a readme and instructions. Look through those and then launch the .tzx file in an emulator — I've found emuZwin works very well on a PC.

Sunday, July 16, 2006

RAMBLE Project blog - hiatus and archival

This post concerns a work-related blog I have been maintaining, which disappeared off the radar for a couple of weeks or so. This is to explain what has happened.

From Autumn 2004 until Spring 2005 I managed a small externally-funded project in mobile learning called RAMBLE, which concerned blogging on PDAs and other handheld devices and linking them with institutional learning environments. A readable overview was published in an online journal called Ariadne.

As part of the process, I maintained a project blog and the budget included all the hosting needs, but once the project had finished - as so often happens - the blog could only be maintained on good will and very minimal resources. Even so, the blog server software, Pebble weblog, impressed several colleagues and even the Director hosted his blog there... But alas we were hit by spam, which escalated in magnitude, so it was decided to remove the service, and I don't think it will come back online :-(

For a while none of the blogs were available at all, but I've found a way of creating an archive that, all being well, preserves the original addresses of the posts, i.e. the permalinks. Pebble stores everything to do with each blog in flat files, so I simply copied the files across to a fresh local installation of Pebble and ran a spidering tool (wget) to grab a static snapshot, and then the sys admin could copy these files to the server. As I type there's a wget-generated archive available at the moment, but it's not yet complete and retains options for posting comments etc.

Another blog, pault@LTG, has suffered the same problems, and I need to find a replacement; I'm thinking of setting up on Educause as I'm a registered member, due to attend the 2006 conference in Dallas in October.

Sunday, April 30, 2006

A Research Genealogy Project?

The Mathematics Genealogy project provides a field to categorise dissertations according to the Math Subject Class. Seeing how broad the selection is, e.g. covering computer science, I was prompted to wonder: what about genealogy projects for other subjects? There appear to be a few ideas and initiatives, including Thomas Witten's proposal for a Physics PhD Genealogy project, the High Energy Physics directory, the Software Engineering Academic Genealogy, the Theoretical Computer Science Genealogy and the Notre Dame University academic genealogy, which covers current members of its departments of Chemistry & Biochemistry and Physics.

It's a very fragmented picture, with independently developed systems, very partial coverage of researchers and yet already some duplication. It will become even more so as subject disciplines keep growing...

So it makes sense to me to take a fundamentally more integrated view that incorporates research in any field, one that can also have a richer model, taking into account different kinds of research qualifications, not just PhDs; and different kinds of relationships, not just formal supervisor-student; thereby responding to issues raised in the Mathematics PhD in the United Kingdom.

The findings yielded on this broader base will be fascinating, showing among other things how disciplines evolve over the generations, shedding light on questions such as: What happened to descendants of those who studied classics? What did the ancestors of computer scientists research? Many trends can be observed. There's a lot of talk in the UK about lifelong learning, so how about considering lifelong and generational research?

Another aspect that needs attention is the quality of entries. It's a tall order for just one central team responsible for verifying information received and compiling the database, which is the current arrangement at the Mathematics Genealogy Project. It would be better to distribute the workload and make use wherever possible of local expert knowledge, suitably authorised to update data in the areas with which they are familiar, whilst allowing for as wide public participation as possible.

So what's the solution?

I'm quite sure that the biggest consideration is organisational, not technical. It's probably a workflow problem and perhaps can be addressed by appealing to other international networks, most likely business networks. The quality control needs to rest with academic departments and it seems sensible that they should deal with information relating first to their department, then their institution and then neighbouring institutions. So I envisage an international network of genealogy research nodes where public contributions would be submitted through their nearest research node, rather like "contact your nearest reseller."

A few days ago I attended a presentation by someone who has done work for the World Wide Web Consortium and he reiterated the point that if there's one technical issue affecting software above all others it's scalability. So any proposal probably ought to design and develop a system that distributes the processing (CPU and resources) as well as the administration, though the computing power need not be distributed per site (big companies typically use a few data centres containing large numbers of rack-mounted PCs). This suggests an application for a parallel computing grid.

I don't know what the implementation itself should look like: it could well be underpinned by a relational database or might even be a special kind of wiki (thinking about how that can really grow rapidly). However, the data model should certainly be given careful consideration. How to deploy it on the Internet? How to authenticate and authorise? Lots of questions will pop up if one investigates further!

Mathematics Genealogy: Indexing

Exploring some of the entries in the Mathematics Genealogy project has led me to learn about some interesting and unexpected connections, but it's also highlighted quite a number of limitations regarding accuracy and promptness of updates. Saying this is really just an indication that if you offer something good, then people will be looking for more!

One particular issue is that the total number of descendants requires a separate process to run as explained by the FAQ, which says:

Because of the time required to run the descendant counting program, it is only run once per week (early morning US Central Time on Sundays), while our data is updated nightly.

That surprises me somewhat as with around 100,000 people with not very many details stored per user and few relations, it's not a big complex database. The issue here is probably that it's a relational database and the advisor-student relationship is hierarchical, somewhat like a tree structure. However, it's not a tree because of having multiple parents (multiple advisors), but rather a directed graph, where the nodes represent the mathematicians and the edges correspond to the advisory relationship. [I'm taking definitions from MathWorld, an encyclopaedia that provides clear and nicely formatted explanations with diagrams]. Further, I think there is a fair chance that it would be more general than a simple directed graph from the scenario of the same supervisor supervising a candidate in more than one thesis - although it might sound unlikely today, it is quite plausible a few centuries ago, when a researcher could be at the forefront in a number of fields. I'd also expect it to be an oriented graph, in that supervision is expected to go in one direction, but it's not inconceivable that a student produces a thesis separately in two fields under two supervisors and then shares the knowledge back across.

Returning to the problem of counting, hierarchical relationships are easy to model in a relational database, but retrieving even summary counts may mean a lot of spidering through the hierarchy, which can be very slow. The key consideration is how to index the database. I'm not a database expert, but I have seen this issue in the daily work I undertake as an administrator of WebLearn, an e-learning system based on software called Bodington, which is essentially a web database application. The system contains various resources arranged hierarchically, in trees, so it is more specific than the genealogy case.

Jon Maber, the original developer, had started work on Bodington in the mid 90s and had thought about the issue of efficient queries about resources within a given branch; he reviewed approaches to indexing and decided to adopt the tree visitation model devised by Joe Celko. Celko had given consideration to this graph theory problem and came up with "SQL for Smarties: A Look at SQL Trees", an article that appeared in DBMS, March 1996. Basically, each node or vertex gets two indices - left and right - numbered according to a complete tour of all the nodes, visiting each twice. It means that selecting the number of descendants of a resource requires only a simple SQL statement that subtracts one index from the other at the given node. However, there is a trade-off in that every time you update the database you need to update the index, so if lots of changes are being made it can be a major performance issue.
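By way of illustration (invented example data, in memory rather than SQL), the numbering and the descendant count look roughly like this; note that this form of it assumes a strict tree, whereas multiple advisors would make the structure a directed graph needing more care.

  // Small sketch of Celko's nested-set ('tree visitation') numbering.
  // Each node gets a left and right index from a depth-first tour; the
  // number of descendants is then (right - left - 1) / 2, with no need to
  // spider through the hierarchy. The tree below is invented example data.
  const tree = {
    name: "Advisor A",
    children: [
      { name: "Student B", children: [{ name: "Student D", children: [] }] },
      { name: "Student C", children: [] }
    ]
  };

  function numberNodes(node, counter = { n: 1 }, out = []) {
    node.left = counter.n++;
    node.children.forEach(child => numberNodes(child, counter, out));
    node.right = counter.n++;
    out.push(node);
    return out;
  }

  for (const node of numberNodes(tree)) {
    const descendants = (node.right - node.left - 1) / 2;
    console.log(`${node.name}: left=${node.left}, right=${node.right}, descendants=${descendants}`);
  }

  // In SQL the same count is roughly (table and column names hypothetical):
  //   SELECT (rgt - lft - 1) / 2 FROM mathematicians WHERE name = ?;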

Celko's solution may not be appropriate in this case, but it looks like the kind of approach that may lead to a suitable index allowing real-time queries of how many descendants a person has. The article was published more than 10 years ago, so I expect research has progressed a fair bit since then.

Saturday, April 22, 2006

Structure and Flow - an example in XML editing

Carrying on with the balance of structure and flow, although I left formal methods research a long time ago, I still come across it as a recurring theme in IT work.

At my present workplace, as a software developer I found myself with the task of extending a web-based system to allow anyone to use the web to edit some data encoded in XML. XML (short for 'eXtensible Markup Language') is a hot topic that promises the recording of meaningful information, its long term preservation and wonderful exchange and interoperability among software systems (e.g., because it's stored in a text file, so you can read an XML file in Notepad). An XML file is a data file, basically a hierarchical structure of tags and content. It's structure and data in one.

So where's the flow? That comes in the editing, because to edit the documents I devised a system that used a functional programming language called XSLT [well, it looks like it should be functional, though proof "by example" doesn't look like proof!]. Every change to an XML document is carried out in terms of XSLT, i.e. suppose we have XMLDOC1, then apply an XSLT stylesheet xslt1 to get XMLDOC2; then apply xslt2 to get XMLDOC3, and so on. In practice, each stylesheet defines a slight change in the document, with all else remaining the same. The operative verbs are simply: add, delete, and update. Perhaps you could use the word 'perturbation' for this?

As it happens, an XSLT stylesheet is itself an XML document, so again it has all those nice qualities described above. This means the system gives you a convenient text-based history not only of the documents but also of the transformations, and I can share not only the data but the transformations needed to carry out the changes. There's a walkthrough illustrating what I mean through a number of screenshots.
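To sketch the flow in code (invented helper names, and using the XSLTProcessor available in modern browsers rather than the setup described above):

  // Sketch of the document stream: each edit is an XSLT stylesheet (itself
  // an XML document), applied in sequence, so keeping the stylesheets gives
  // a replayable, shareable history of how the data changed.
  function applyStylesheet(xmlDoc, xsltSource) {
    const processor = new XSLTProcessor();
    processor.importStylesheet(new DOMParser().parseFromString(xsltSource, "application/xml"));
    return processor.transformToDocument(xmlDoc);
  }

  // perturbations: an ordered list of small add/delete/update stylesheets
  function replayHistory(initialXmlSource, perturbations) {
    let doc = new DOMParser().parseFromString(initialXmlSource, "application/xml");
    for (const xslt of perturbations) {
      doc = applyStylesheet(doc, xslt);   // XMLDOC1 -> XMLDOC2 -> XMLDOC3 ...
    }
    return doc;
  }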

Now there's an irony in using XSLT for change: because its instances are XML documents, the transformations are themselves data and structure. So in one sense we have a sequence of data structures - where the data and the way it changes are in the same format. But what I haven't addressed is how you actually generate and carry out the transformations. This requires a processor!

I've grown to appreciate this system for it gives me a sense of holism - a stream of documents and transformations in one flow. It certainly intrigued one of my colleagues, for whom XML and XSLT are very much his bread and butter!

Notes on reading 'Wholeness and the implicate order': Introduction.

A copy of Bohm's book (Routledge Classics 2002) arrived last week, conveniently just before I set off for a few days' holiday, staying at my father's house. It looks fascinating, so I'll jot down some responses, though at this stage I don't know how far I'll take this. In any case, I should say I can be a very slow reader!

The introduction develops some rationale for Bohm's new perspective, which appears to have emerged from deep personal observation, a state of absorption, as well as his considerable experience as a physicist.

When I thought of 'wholeness' and 'reality,' what came first to mind were the elements, especially depicted in the dhammakaya meditation tradition as a sphere - the four elements of earth, fire, air and water at cardinal points surrounding the space element at the centre and within that the element of consciousness.

Things can be observed at different levels, on different scales. My impression is that at any given scale, science is familiar with progress/movement through stages and has developed laws of motion that model this accurately. However, what laws or models are there for movements between scales? What about the flow between levels of abstraction? I raise this because in the bit of literature I encounter, there seem to be different models for macro and micro, so what is happening on the journey from macro to micro?

The tension between, and balance of, structure and flow can be found in many disciplines. I came across it whilst doing research in the field of [concurrent] formal methods in computer science, in which mathematical techniques are used to specify and analyse software systems. You can make a crude division in terms of orientation: one is 'structure'-based, viz the so-called 'axiomatic' techniques of VDM, Z etc. that are oriented around sets; the other is 'flow'-based, which is the emphasis in process algebras - systems are defined in terms of the actions that can be carried out from state to state rather than descriptions of the states per se, hence action-based or operational semantics. This was brought home to me by a very valuable survey of formal methods by Jonathan Ostroff ["Formal Methods for the Specification and Design of Real-Time Safety Critical Systems", The Journal of Systems and Software, Vol. 18, No. 1, pp. 33-60, Elsevier Science Publishing Co. Inc., New York, April 1992].

[p. xii] The content of consciousness to be 'reality as a whole'? It's quite an assumption that there can be consciousness of whole reality - is that possible? I'm glad Bohm emphasises the importance of view - it affects everything!

In the introductory class on Buddhist texts that I attended in Spring, Richard Gombrich explained how the Buddha always taught about consciousness of... [and the teachings state that viññana (translated as consciousness) is one of the 5 heaps that are not part of deathless nibbana].

[p. xiii] I can see that this work is very much contraflow vs prevailing views that have become entrenched since the so-called 'Age of Reason.' A process-oriented view was something the Buddha expounded 2500 years ago, expressed succinctly in Pali as sabbe sankhara anicca... - "all conditioned formations are impermanent." The growing interest in the Buddha's teachings presents a veritable challenge to those who separate subject from object and take a materialistic view, which seems the predominant characteristic of European thought during the past few hundred years.

[p. xiv] A language with verbal emphasis. Again, the Buddha focused his teachings a great deal on processes of mind: indeed the path to Enlightenment, the Noble Eightfold Path, is expressed in terms of verbs, starting with 'Right View', and detailed modes of practice, as expressed in e.g. the Satipatthana Sutta, describe exercises through the four modes of mindfulness (body, feelings, mind, mental qualities) - always working with change; magga is a flow/process of going through stages, and something worth noting is that what also occurs is a subtle progression in the nature of observation.

We can go further in emphasising verbs, and one of the most striking examples can be found in the Buddha's instruction to Bahiya (see previous entry), "in the seeing, just the seen; in the hearing, just the heard, ..." But this is for a very, very advanced practitioner, on the brink of full final attainment. So conventionally the subject-object paradigm is often more practical ... I wonder what Bohm's 'rheomodes' are all about and how far this language can be taken...?