Paul Trafford's blog: innovation


Wednesday, February 05, 2025

Be Free to Comment – Heedfully

[7 Feb 2025: Added a bit more detail concerning the term 'heedfulness', with a link to my research.
21 Feb 2025: Added an abstract
24 Feb 2025: Added a word cloud, generated by Voyant Tools.]

Abstract

News media have for years allowed members of the public to add their comments on some of their published output (bulletins, reports, features, and so on), which we refer to as items. However, such opportunities to have your say have been gradually diminishing, raising concerns about bias and control of the narrative. Yet, with increasing divisions in society across the world, the need for freedom of expression, to learn other points of view and enter into dialogue so as to grow in understanding, is surely greater than ever.

To address the problem, we propose a new overarching service that will allow members of the public to comment on items published especially by professional news media from around the world. All comments will be brought together in one place, with links to the original news sources. It will, in particular, where legally acceptable to do so, allow commenting on items (or summaries, where access to items is restricted) that do not yet have a comments area. It will also act as a knowledge base, pooling sources from the articles and the comments, allowing different voices, facilitating a neutral point of view (NPOV).

Such an undertaking requires a heightened level of heedfulness, attentiveness about the consequences of our verbal actions. We therefore propose guidelines for contributions, based on the Buddha’s teachings, on what we term ‘5-star speech’: truthful, offered in a mind of good will, expressed in a kind manner, something relevant and beneficial to the recipient, and at the right time.

An enhanced form of moderation is recommended to help maintain the guidelines. This is to be steered not by academics or those in the news media industry, but by those who are skilled in conflict resolution. Accordingly, we recommend as candidates for this role those who are trained in Nonviolent Communication or similar.

Introduction

I offer some initial thoughts for a novel service, a kind of overarching comment system for news media across the world.

First, we need some background.

Then we can introduce our intentions for the service.

A narrowing field

Ever since news organisations started publishing on the Web over 20 years ago, their current affairs content has steadily grown from front page headlines and news bulletins to in-depth articles and features, ranging from local to international affairs. Nowadays, with digital editions having largely overtaken print-based media, the majority of their output and subscribers are online.

Opportunities were soon identified for reader contributions through feedback mechanisms, discussion fora and online communities, following a practice that was established before the arrival of the Web with various Internet services such as bulletin board systems (BBS), Listservs and UseNet newsgroups. Eventually, news media organisations provided opportunities as part of their Web presence. Hence, the BBC’s Have Your Say and The Guardian’s Comment is Free, which allowed members of the public to share their views on a variety of topical issues.

The Guardian usefully provides various entry points to their conversations, including an index of themes. Although it was last updated in 2013, it still links to a list of current articles on the given theme, such as Christianity. However, relatively few articles now have a comment count, whereas the same page as it stood 11 or 12 years ago (archived) shows a comment count for almost every article.

This reflects a general pattern. In recent years, the number of articles open for comment has dwindled as has the visibility of comments — in some cases, only succinct emails are solicited. The media companies have become more selective, increasingly erring on the side of caution, particularly on matters that they regard as sensitive (to whom, is not always clear).

And this change has been noticed:

(Google's auto-complete suggestions on entering a search about The Guardian's service draw on existing usage patterns.)

Feelings of Discomfort as signs of Opportunities

Why the change? The inclination towards closure grew stronger as more people joined the conversation. Whilst at first glance this was democracy in action, it also led, for some, to an unacceptable decrease in overall quality. Such a sentiment was expressed in 2015 by a journalist in The Guardian, who described feeling under attack, with odious comments becoming prominent and worthwhile contributions buried. Yet, as the article hints, avoidance can mean missing out on something valuable. And the issue of quality is not a new phenomenon: I recall a contributor to a Usenet group confiding to me in around 1995 that he was disinclined to post further because the signal-to-noise ratio had become too low, but he still continued to make contributions where there was something of interest.

Having forbearance can be worthwhile. Especially with newcomers. It did not take me long to get into a bit of hot water, receiving a couple of angry responses following my first Usenet posting (on ‘Quantum Theory and Meditation’). In hindsight, I was naive and not following good practice: the choice of topic (too ambitious), the way I expressed myself (perhaps dismissive in tone) and the use of markdown (what?!). Altogether probably injudicious. Within two days the first irate message appeared in my Inbox, “No! No! No! … ” , quite a sobering experience. Yet, I never intended to offend and, undeterred, I persevered in trying to understand and to articulate my point. The thread continued, gaining some further interest, others extended the conversation and I was happy to see it develop freely. I became on friendly terms with several participants and along the way received glimpses of the life of an atomic physicist working in a laboratory. Although I had got bitten, I found the experience valuable. Viewed as opportunities to learn, it is worth persevering with such situations and seeking to improve one’s conduct. Potentially everyone can benefit.

The problem with quality on the Web today is in large part due to the user interface, where the design is often optimised for ease of use. Following Steve Krug’s influential work, Don’t Make Me Think, Web designers have made it a trivial process to enter one’s thoughts in a box and hit [Send]; the interfaces are slick, as in oil slick, with words potentially gushing forth without due consideration.

Freedom of Expression and the Right to be Heard

Much is made, especially in the West, of the right to freedom of expression, as declared in Article 19 of the United Nations’ Universal Declaration of Human Rights (1948):

Everyone has the right to freedom of opinion and expression; this right includes freedom to hold opinions without interference and to seek, receive and impart information and ideas through any media and regardless of frontiers.

(English language version; many other languages available via the translation project.)

However, the International Covenant on Civil and Political Rights added a limiting clause to Article 19 (1966):

3. The exercise of the rights provided for in paragraph 2 of this article carries with it special duties and responsibilities. It may therefore be subject to certain restrictions, but these shall only be such as are provided by law and are necessary:
(a) For respect of the rights or reputations of others;
(b) For the protection of national security or of public order (ordre public), or of public health or morals.

Who decides what is necessary? What is really in the national interest? These clauses are open to varying interpretation, but in general there is a spectrum of views from those who declare themselves to be ‘free speech absolutists’ to others who would like it closely controlled.

In trying to foster appropriate contributions to Comment is Free, The Guardian weighs in with a quote from a former editor and owner of the Manchester Guardian, C.P. Scott, taken from his centenary essay:

“Comment is free, but facts are sacred.”

The subclause is heavily leaned upon as a restraint, but in reality it may not be a straightforward matter to establish those facts and what may be promulgated as such can sometimes fade away (consider, for example, the origins of the Covid-19 virus). However, it is soon followed by another sentence that also merits quoting:

“The voice of opponents no less than that of friends has a right to be heard.”

This becomes more urgent in a society that has an increasing tendency to see complex situations in simplistic terms as black and white, which often creates or reinforces division. In that contentious area of ‘freedom of speech’, it is evident that many people find it very hard to listen to other points of view and to discuss them with an open mind.

Balance and Impartiality

Some media outlets have maintained comment sections across a broad spectrum of topics and taken a relatively liberal position in terms of allowing critical views. To enable this they still maintain oversight and will promptly delete comments regarded as falling foul of their community guidelines. They often require participants to have a paid subscription. One example is the Daily Telegraph, which provides conversation space for perhaps the majority of their articles, with journalists occasionally joining in for a kind of real-time Q&A.

Yet, even there, certain topics or people still seem out of bounds, which cannot be explained by the fulfilment of legal obligations alone. So the spotlight is turned on networks and relationships, particularly those of a financial nature — who are the sponsors and what do they have to gain? The Daily Telegraph Global Health Security is funded by the Bill & Melinda Gates Foundation. Can the paper really retain full editorial independence on matters relating to health and the pharmaceutical industry?

Cases of vaccine harms emerged as soon as the vaccines were issued: UK CV Family, which has long known about problems, was granted core participant status in module 4 of the UK's Covid-19 Public Inquiry in 2023, but had been reporting problems the year before with mRNA-based vaccines from Pfizer and Moderna, as well as AstraZeneca's viral vector vaccine. However, the group was only mentioned by the Telegraph in 2024, and the article confines itself to AstraZeneca, after that vaccine had left the market. Gradual acceptance of Covid-19 vaccine harms and their seriousness is belatedly emerging: ‘The truth about Covid vaccines may finally be coming to light’, but the article is brief and comments conflict markedly with its general direction. Similarly, an investigation by a former Government minister has again focused solely on AstraZeneca. Newspapers are expected to investigate, to reveal different sides of the story, but have fallen short with this virus. It brings to mind the old adage, “He who pays the piper calls the tune.”

Many comments have made points to this effect, but for media that are supposedly on the pulse, the response on matters like this has been glacial. It tends to shift faster the more such valid data, properly established, become common knowledge. Providing a global forum for the public to share their responses and insights can help with this process.

Acknowledging and working with diverse views

The Internet has doubtless assisted many people in freedom of expression. However, it is generally recognised that there is increasing polarisation of views, fostering growing anxiety and talk of the need for more constraints (usually of others). Differences are evident in the political views expressed in the editorials of The Daily Telegraph and The Guardian, but are even more pronounced among their readers, who appear to form distinct camps. Attitudes to other points of view are often derogatory and dismissive (“LOL”); some [im]posters seize the opportunity to engage in trolling, searching for places to score points, often using language designed to upset. There’s little benefit in the short term and probably negative effects in the long term.

This dynamic is nothing new, but one can rise above it.

Or he might say: 'Whereas some recluses and brahmins, while living on the food offered by the faithful, engage in wrangling argumentation, (saying to one another): "You don't understand this doctrine and discipline. I am the one who understands this doctrine and discipline." — "How can you understand this doctrine and discipline?" — "You're practising the wrong way. I'm practising the right way." — "I'm being consistent. You're inconsistent." — "What should have been said first you said last, what should have been said last you said first." — "What you took so long to think out has been confuted." — "Your doctrine has been refuted. You're defeated. Go, try to save your doctrine, or disentangle yourself now if you can" — the recluse Gotama abstains from such wrangling argumentation.'

Brahmajāla Sutta (the All-embracing Net of Views), Digha Nikaya (Long Discourses of the Buddha) number 1, translated by Bhikkhu Bodhi

This quote describes the behaviour of spiritual wanderers 2,500 years ago. Today, across all walks of life, there are still these same futile patterns of argumentation, which bring the weariness that we’ve already indicated. However, the Buddha spent much of his time giving teachings, being engaged in conversation, in which he made a valid case, drew on solid evidence and developed sound lines of reasoning. So, in that spirit, we can try to foster healthy, constructive discussion.

In fact, a proportion of readers cross over and subscribe to other papers of different hues and contribute comments there. There is great potential in this scenario. I remember being told about a reader with an avid interest in current affairs who regarded himself as a liberal, naturally inclined to The Guardian. However, he regularly read the Daily Telegraph to seek out other points of view and developed an appreciation of its “cosy” perspective without a fundamental shift in his views. He sounds like the kind of person able to enter dialogue with a relaxed openness that I would like to foster online.

The potential of collective support is also worth noting. In 1999, I attended a remarkable session of the annual conference of the International Interfaith Centre, Facing The Past – Freeing the Future — in which there was a wary dialogue between a Turkish Cypriot and a Greek Cypriot, both honed in arguing for their respective opposing positions. The audience, for a long time without speaking, tried to support them, willing them on to seek cooperation rather than confrontation.

At the last minute, as recorded by David Craig for the IIC Newsletter 11, June 1999, there was a breakthrough:

“There was a tangible tension in the room but when Deepak Naik asked what each was going to do when he got back to his community, common ground had been found: each would tell the young people in his island group that the other side wanted peace, and co existence and an end to division.”

Sources

News media often omit citing primary sources, seemingly content to receive information by word of mouth, in press briefings, though a fairly recent trend has been to embrace ‘fact checkers’ to ostensibly give definitive answers. As to why this is the case, there has (of course!) been some discussion with suggestions that the Internet’s disruptive nature is difficult for them to handle.

From my experience, whereas the BBC makes reference to primary sources in some of its articles (with clear disclaimers, as it sees fit), I have found the Daily Telegraph hardly does so. Articles seldom contain a dedicated reference section (which might be expected at the foot of the article) and linked text that sounds like it should point to a source (such as “according to a recently released report”) usually leads instead to another Telegraph article, effectively a kind of clickbait. That might boost the service’s analytics and aid monetisation in the short term, but when repeated, I am put off.

With trust ebbing away from mainstream media, access to primary sources really ought to be given priority.

Outline of a Proposal

Here’s my idea. I propose Be Free to Comment — Heedfully, a new service reflecting the primary intention of freedom of expression conducted in an appropriate manner. It’s a working title to capture what it’s about. By ‘heedful’ I have in mind the Pali Buddhist term appamāda, a notion I have researched in relation to the Fifth Precept, concerning the avoidance of intoxicants. The heedfulness is self-directed, keeping in mind the consequences of one's actions, and should be in reference to 5-star speech, which is described below.

The service acts as a meta discussion area in which — subject to legal compliance in respective jurisdictions — each article of a mainstream newspaper has an accompanying discussion area with a link to the original source (which is automatically archived, for example, by the Wayback Machine). Where access to articles is restricted, for example, by requiring a subscription, a paragraph summary of the item is created as a basis for a broader discussion of the topic as a whole.

This is just an outline to indicate my approach.

Which Sources?

Following a well-established process, considering how the largest social media have evolved, the service should start on a small scale and be expanded gradually.

So, it might begin in the UK, with a few national newspapers and long-established broadcasters. Candidate sources include (in alphabetical order): the BBC, The Guardian, The Independent, the Daily Mail, The Daily Mirror, The Observer, The Sun, The Telegraph, and The Times. There may follow a horizontal expansion (across the UK) or vertical (across the world). Whichever way is chosen, the intention is to include ‘alternative’ sources, which enhances their visibility and therefore their scrutiny.

The articles themselves can initially be confined to certain topics. One possibility is to choose those where opportunities for comments are sparse. However, these are likely to be the most controversial topics, the most difficult to handle well. A fledgling service should start with easier fare, build up experience and confidence, and then expand in a measured way to take in more challenging areas.

News Access

The system will use a database to store basic details of each news item: the publication (source), URL, title, date, author, type (bulletin, feature, etc.), keywords, comments status (open, moderated, email, closed, etc.) and access details. The design can follow conventions established by news aggregator sites and apps, of which there is a large choice. (Note that Reddit has news aggregator functionality, but it’s more general than that.)
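To make the list of stored details more concrete, here is a minimal sketch of one news item record in Python. The class and field names (NewsItem, CommentStatus, item_type and so on) are my own illustrative choices rather than a settled schema, and the example publication and URL are placeholders.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class CommentStatus(Enum):
    """Comment states named in the text; the enum itself is illustrative."""
    OPEN = "open"
    MODERATED = "moderated"
    EMAIL = "email"
    CLOSED = "closed"


@dataclass
class NewsItem:
    """One stored news item; field names are assumptions, not a fixed schema."""
    source: str            # the publication
    url: str
    title: str
    published: date
    author: str
    item_type: str         # e.g. "bulletin" or "feature"
    keywords: list[str] = field(default_factory=list)
    comment_status: CommentStatus = CommentStatus.CLOSED
    access: str = "free"   # e.g. "free" or "subscription"


item = NewsItem(
    source="Example Gazette",          # hypothetical publication
    url="https://example.org/news/1",  # placeholder URL
    title="An example headline",
    published=date(2025, 2, 5),
    author="A. Reporter",
    item_type="feature",
    keywords=["climate change"],
    comment_status=CommentStatus.OPEN,
)
```

In a real deployment these fields would map onto database columns; a dataclass is used here only because it keeps the shape of a record easy to see.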

There should be search and browse facilities to quickly find specific items and items on a particular topic or theme. Insights may be gleaned from news coverage as a whole. So, we can find answers to questions such as: What proportion of articles on climate change are open to comments?
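The question about climate change can be answered with a very small calculation once items carry keywords and a comments status. The following is a rough sketch only: the function name proportion_open and the dictionary keys are hypothetical, and the sample data is invented purely to show the arithmetic.

```python
def proportion_open(items, keyword):
    """Return the share of items tagged with `keyword` whose comments are open.

    Each item is a dict with "keywords" (list of str) and "comments" (str).
    Returns None when no item carries the keyword.
    """
    matching = [i for i in items if keyword in i["keywords"]]
    if not matching:
        return None
    open_count = sum(1 for i in matching if i["comments"] == "open")
    return open_count / len(matching)


# Invented sample data: three climate change items, one of them open.
sample = [
    {"keywords": ["climate change"], "comments": "open"},
    {"keywords": ["climate change"], "comments": "closed"},
    {"keywords": ["climate change", "energy"], "comments": "email"},
    {"keywords": ["sport"], "comments": "open"},
]
share = proportion_open(sample, "climate change")
```

Here one of the three matching items is open, so the share is one third. Returning None for an unknown keyword avoids reporting a misleading zero.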

Machines can use techniques in natural language processing (NLP) to carry out topic analysis and sentiment analysis.

Participation

There are many successful online communities, ranging from Meta’s offerings of Facebook groups and WhatsApp to Reddit and Stack Exchange. These last two are both distributed question-and-answer (Q&A) systems, but whereas Reddit encourages subjective contributions, Stack Exchange is not interested in opinions, but objective solutions. As with Stack Exchange's history, many started out as programmers’ hobbies and extra-curricular activities, but subsequently the user base has grown and diversified, with the subject coverage becoming encyclopaedic. The experience gained in these kinds of communities should inform service design, whilst I also wish to introduce a Buddhist perspective, as I believe this is rooted in a deep understanding of behaviour in human interactions.

Following general practice, discussions of current affairs are text-based and do not include audiovisual multimedia, only links to such content. This should be easier to manage.

User accounts

Should contributions require a user to be registered (with minimal personal data: first and last name for identification, a forum username and an email address)? Perhaps the option of anonymous comments might be supported, albeit with additional constraints (as per The Register’s Anonymous Coward).

Bots are not allowed (and don’t try to impersonate one and say “Did you come here because you are irritated? Come, come, elucidate your thoughts.”)

Guidelines based on the Buddha’s teaching on speech

The main guidance on contribution would be ‘Five Star Speech’, a direct appropriation of the Buddha’s teachings where one contributes:

truthfully
in a mind of good will
expressed in a kind manner
something relevant and beneficial to the recipient
at the right time.

Vāca Sutta (On Speech), Anguttara Nikaya (Numerical Discourses of the Buddha) number 5.98, translated by Thanissaro Bhikkhu

These are not rules, but users are strongly encouraged to try to contribute along these lines as far as possible as it will enhance the overall experience and make it easier to adhere to the actual rules. These will be in the terms of service, which will be along existing lines and may borrow from those that already exist for social media.

Moderation

This is the main requirement of managing the service and for its basis, I quote again from the Buddha:

Intention, I tell you, is kamma. Intending, one does kamma by way of body, speech, & intellect.

Nibbedhika Sutta, Anguttara Nikaya (Numerical Discourses of the Buddha) number 6.63, translated by Thanissaro Bhikkhu

In Buddhism, actions in body, speech and mind, termed in Sanskrit karma, sow the seeds of happiness and unhappiness. When comments are laced with ill feeling and hatred, we refer to it as hateful speech. The words that result from hatefulness might be termed hate speech, but need to be contextualised since the same words expressed in other contexts, with a benevolent mind, are entirely different in nature. This means that more effort is needed to ensure that contributions are appropriate, which we will address below.

Requiring users to register has some positive impact in terms of self-restraint, but is not sufficient. Skilful moderation is key to the viability of any such service, maintaining its civility. I envisage a combination of manual oversight supported by automated monitoring, with the adoption of some existing best practices, but with more focus on the nature of speech.

There are many facets to this, so, just as an example, monitoring should spot trolling patterns and then advise (with evidence) the poster to review their contributions.

Human moderators

Who should moderate? What should the criteria be? Essentially those who understand, uphold and encourage the principles of well-informed and kindly dialogue in the context of online discussions of current affairs.

The kind of scenario they would deal with would be a subscriber email or automated notification that one or more posts in a particular thread has crossed a ‘bad speech’ threshold. Their role is then to pay closer attention to that discussion, using their experience to facilitate it in a way that keeps it on topic and generally in accordance with 5-star speech. Mindful of the controversy concerning editorial practice at Wikipedia, expressed by its co-founder, Larry Sanger, for example, in an interview with UnHerd, I feel special care is needed to avoid and mitigate bias as far as possible.
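The automated notification might work along the following lines. This is a sketch under stated assumptions: each post is taken to have already been given a score between 0 and 1 (higher meaning further from 5-star speech) by some scorer that is not shown here, and the function name, the threshold of 0.8 and the thread identifiers are all hypothetical.

```python
def threads_needing_attention(post_scores, threshold=0.8, min_posts=1):
    """Return thread ids with at least `min_posts` posts scoring above `threshold`.

    `post_scores` maps a thread id to a list of per-post scores in [0, 1],
    where a higher score means the post looks further from 5-star speech.
    How those scores are produced is outside this sketch.
    """
    flagged = []
    for thread_id, scores in post_scores.items():
        over = [s for s in scores if s > threshold]
        if len(over) >= min_posts:
            flagged.append(thread_id)
    return sorted(flagged)


# Invented scores for three threads.
scores = {
    "thread-a": [0.1, 0.2, 0.95],
    "thread-b": [0.3, 0.4],
    "thread-c": [0.85, 0.9],
}
flagged = threads_needing_attention(scores)
strict = threads_needing_attention(scores, min_posts=2)
```

The output is only a list of threads for a human moderator to look at more closely; nothing is hidden or deleted automatically.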

So, the kind of person I would like to largely carry out this role is not someone with expertise or strong views on the topic, but rather someone who is a good listener and discerning in human interactions. A service that brings together people who are potentially at opposite ends needs very skilful support. So who is qualified for such facilitation? I have in mind experienced practitioners of Nonviolent Communication, a practice developed by Marshall Rosenberg to solve problems through communication rooted in empathy. Ideally, when problems arise, they enter into a dialogue with the poster, but it depends on resourcing.

Whilst they might not by themselves be able to determine the pertinence of an argument, the truth of a statement or validity of evidence put forward, they are trained in techniques to facilitate that process. Suppose that someone keeps insisting on an assertion, but complaints are received claiming that the assertion is false. What then? The original poster may be invited to clarify, but at this point some authority in the subject matter might be called upon to arbitrate. If no resolution is found, then what kind of closure is appropriate? Closing the thread? Or, as a general measure, limiting the number of posts to a given thread? Whilst wikis can descend into edit wars, in these fora, one can only edit one’s own posts. Some ‘hot topic’ or ‘contested’ flag might be raised.

There ought to be some kind of rewards system as well for those who have made substantial and highly regarded contributions. These might include badges and free newspaper subscriptions.

The use of automation to support moderation

We can enable the sites to be self-moderating to some degree through the use of devices such as a reputation system, commonly indicated by ‘likes’; in Stack Exchange, users largely determine this by indicating support for or approval of posts by others.

However, to aid efficiency and to improve coverage, automated moderation should be deployed, running in the background. Just as topic analysis and sentiment analysis can be used to give an overview, so it can also be used to discern problematic posts such as the trolling already mentioned. Algorithms can fulfil many assistive functions such as automatically grading topics for sensitivity, to give a quick overview, refined with human guidance.

At this point, a note about artificial intelligence (AI). The meaning of AI has narrowed to be, in many people’s minds, synonymous with machine learning, with the emphasis on machine capability. In contrast, I give centrality to the paradigm of Augmenting Human Intellect, as conceived by Douglas Engelbart in his report for Stanford Research Institute in 1962. This approach, as exemplified by his ‘Mother of all demos’, emphasizes the potential of machines to help us do things better and more efficiently.

The actual methods used should also be given careful consideration — I would prefer to avoid the use of ‘black box’ methods such as neural networks wherever possible, especially for reasons of transparency.

Management of Sources

The system should collate, store and readily make available metadata about the original articles, so that visitors to the site know their provenance. Ideally, each one should be archived in more than one way, including on the Wayback Machine at Internet Archive. The sources typically depend in turn on other sources, so there is scope for web crawlers, with many possible foci, such as exploring dependencies of thought.
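As a small illustration of checking provenance, the Internet Archive offers a public availability lookup for the Wayback Machine. The sketch below only builds the lookup URL and reads a response of the shape that service returns, to the best of my knowledge; it makes no network call, the helper names are my own, and the article URL is a placeholder.

```python
import json
from urllib.parse import urlencode

# Availability lookup endpoint of the Wayback Machine (assumed unchanged).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(article_url):
    """Build the lookup URL for the closest archived snapshot of `article_url`."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": article_url})


def closest_snapshot(response_text):
    """Return the snapshot URL from an availability response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


query = availability_query("https://example.org/news/1")  # placeholder URL

# A hand-written response in the expected shape, for illustration only.
example_response = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20250205000000/https://example.org/news/1",
            "timestamp": "20250205000000",
            "status": "200",
        }
    }
})
snapshot = closest_snapshot(example_response)
```

An item with no snapshot would return None, which could prompt the service to request a fresh archive copy.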

Discussions will bring up further sources. So, contributions should be mined also, building up a knowledge base available for analysis. This knowledge can be made available in various ways through the user interface and programmatically through APIs. For example, each article can have a tab that summarises the sources mentioned, perhaps with weightings, though that needs to be robust against gaming the system.
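A first step towards such a sources tab could look like the sketch below, which pulls links out of comment text and counts how many comments cite each domain. It is a simple illustration, not a robust extractor: the regular expression, the function name summarise_sources and the sample comments are all my own assumptions. Counting a domain at most once per comment is one modest guard against a single post repeating a link to inflate its weighting.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# A deliberately simple pattern: http(s) links up to whitespace or brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]]+")


def summarise_sources(comments):
    """Count how many distinct comments cite each domain."""
    counts = Counter()
    for text in comments:
        domains = set()
        for raw in URL_PATTERN.findall(text):
            cleaned = raw.rstrip(".,;:!?")  # drop trailing sentence punctuation
            domain = urlparse(cleaned).netloc.lower()
            if domain:
                domains.add(domain)
        counts.update(domains)  # each domain counted once per comment
    return counts


# Invented comments using placeholder domains.
comments = [
    "See https://example.org/report and https://example.org/data.",
    "Source: http://stats.example.net/table (official)",
    "The same figures are at https://example.org/report, I think.",
    "No link here",
]
summary = summarise_sources(comments)
```

Weightings that resist gaming would need more than this, for example taking account of who is posting, but the counts give a starting point.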

System Architecture

I’ve hardly given any thought to this as I’ve only been pondering what the service might look like. However, one aspect that particularly interests me is the potential for a distributed architecture. So, I would like to explore further Bluesky’s decentralised approach, as discussed in less technical terms in The Pragmatic Engineer.

Conclusion

So there it is. An idea, a tentative proposal, which, I hope, will give fresh impetus to commentary on current affairs, facilitating a fresh dialogue among and between people of differing views. A service for individuals, citizens of the world, that might even help in resurrecting ‘What the papers say’, last seen at the BBC.

I’d like it to be turned into a business case. I’m not the best person to do this, but I can see how it might be of professional benefit to many parties: it should drive traffic to news sources and various professionals will find it a valuable repository for public opinion.

I’ll just add that an important consideration is how to maintain the service’s independence and protect it from undue influence (in the spirit of the declaration on freedom of expression). Policy statements are just one aspect. To increase its chances of longevity, the funding needs to be well spread and, when implemented, the moderation principles to be firmly maintained, true to the original intention.

Saturday, April 10, 2021

Adventures in the Read/Write Web

[This is an article about my explorations of the Internet, particularly some of the tools and applications I've used and developed to foster collaboration and enable broad participation in the 'read/write' Web. It has involved various experiments with software and hence is quite technical in nature. Sorry if it's difficult to understand, but I'm happy to try to explain and clarify.
Updated 2 October 2021 with a section on static and offline search]

I became acquainted with the Internet at the third time (and university) of asking. My first hint came in 1988, as a maths undergraduate at Southampton University. A friend studying computer science showed me pages and pages of dot matrix printout that he had received by electronic mail from his friend at Warwick University. I was unmoved. The second hint came in 1991, as a maths postgrad at Glasgow University, when another student shared with me the joys of e-mail with friends from abroad, pointing to her computer terminal. Again, I passed.

Two years later, in 1993, as a PhD student in computer science at Kingston University, the Internet was intrinsic to my research and it was then that I dived in; I soon became immersed in e-mail, Usenet, Gopher, and another service that seemed to be taking the world by storm, the World Wide Web (or, simply, ‘the Web’).

At that time, I shared a research lab with students from various other disciplines, including Maria Winnett, who specialised in Computer-Supported Cooperative Work (CSCW). Although I did not know it at the time, these kinds of initiatives were just further iterations of ongoing efforts over the decades, as exemplified by Douglas Engelbart’s ‘Mother of All Demos’ given in 1968 (in brief / in full), and later reflected on, inspired in turn by Vannevar Bush’s essay of the '40s “As We May Think”. Furthermore, these approaches would, perhaps unconsciously, influence some of my own work.

Tim Berners-Lee had conceived and implemented the Web only a few years before. A core part of his vision was that the Web should be read/write; the first browser was thus a browser-editor, called WorldWideWeb. Whilst most of the ingredients were in place — the http protocol, HTML, Web browser and httpd server — some aspects were not complete. For instance, his editor could only ‘write’ to local files; the HTTP PUT request method was yet to be devised (more about this later …)

I first explored the Web using NCSA Mosaic, which I also used to take some baby steps in authoring my first HTML markup using its Personal Annotations feature. I then started contributing to departmental pages and I was on my way, but my usage was conventional and most of my attention was on my own research.

OSS Wisdom and Easy Web Editing

Whilst I had been using Unix (Solaris) for my research, I wasn’t really cognisant of free and open source software until I started preparing for my first full-time job in 1998 at the University of Derby. I took over the technical development of MultiFaithNet, an online gateway to religious resources. The site needed a new home; I was asked to set up and maintain a new server, something of which I had zero experience. Faced with a dwindling budget, I bought a book on Red Hat Linux 5.1, a distribution of the Linux operating system, and installed it on my home computer from the accompanying cover CD. I acquainted myself with the main components, including the Apache httpd web server, CGI/Perl and regular expressions. PHP and MySQL came a bit later.

The site contained a mixture of informational content and community-oriented facilities. The project team maintained editorial control with some pages retaining the same structure, but being revised on a fairly frequent basis. Team members really wanted something as easy to use as a word processor without having to know HTML or worry about the technicalities of uploading to a server. Netscape anticipated such demand with Netscape Composer, which added editing facilities to its Navigator browser. It provided a WYSIWYG interface, making it amenable to all my colleagues, who were not concerned with the underlying HTML that it generated.

There remained the problem of how to enable an easy means to upload edited files to the server. I found a solution thanks to the introduction of the PUT request method, in the HTTP/1.1 specification of the international Web standards. Such a facility had been missing when Sir Tim was first editing web pages using WorldWideWeb (they could only be updated locally under the file URI scheme). The provision of PUT was a pivotal step in opening up the read/write web and, on the client side, support was quickly added to Netscape Composer.

Accordingly, I followed the guidance and implemented support for the PUT request method on the server. However, as the Apache Week article intimates, it came with risks and soon became deprecated, so I secured it with various measures, as per the article, adding IP address restrictions for good measure.
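To give a flavour of what securing PUT involves, here is a minimal sketch in Python. It is not the Apache setup I used at the time; the document root, the permitted network and all the function names are hypothetical, chosen purely for illustration. It shows the two checks that matter most: only accept uploads from known addresses, and never let a request path escape the document root.

```python
import ipaddress
from http.server import BaseHTTPRequestHandler
from pathlib import Path

# Hypothetical settings, for illustration only.
DOC_ROOT = Path("site").resolve()
ALLOWED_NETWORKS = [ipaddress.ip_network("192.168.1.0/24")]


def client_allowed(client_ip, networks=ALLOWED_NETWORKS):
    """Return True if client_ip falls inside any permitted network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)


def resolve_target(url_path, doc_root=DOC_ROOT):
    """Map a request path to a file under doc_root, or None if it escapes."""
    target = (doc_root / url_path.lstrip("/")).resolve()
    if target == doc_root or not target.is_relative_to(doc_root):
        return None
    return target


class PutHandler(BaseHTTPRequestHandler):
    """Accept PUT uploads only from permitted clients, only under DOC_ROOT."""

    def do_PUT(self):
        target = resolve_target(self.path)
        if target is None or not client_allowed(self.client_address[0]):
            self.send_error(403)
            return
        length = int(self.headers.get("Content-Length", 0))
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(self.rfile.read(length))
        self.send_response(204)
        self.end_headers()
```

The handler can be passed to `http.server.HTTPServer` for local experiments. A real deployment would also want authentication, which this sketch leaves out.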

Annotations in CritLink

MultiFaithNet was a platform for dialogue and engagement. To support this paradigm in the technology infrastructure, I explored web annotations and came across CritLink (and the wider toolset, CritSuite), a project of the Foresight Institute developed by Ka-Ping Yee. It used a proxy server approach to add comments non-invasively with a rich feature set, including bi-directional links, which the Web itself does not support.

I quickly felt it had a lot of potential to support collaborative working, and downloaded and installed it on MFN with the aim of encouraging internal use to begin with. I also contacted Ka-Ping Yee to give some encouragement, suggesting that he present his work in the field of CSCW. Perhaps already having this in mind, he duly delivered a paper. As I started speculating about what lessons might be learnt about free and open source software, I mentioned CritSuite in Open Sources: A Higher Consciousness in Software Development, a paper I gave at an unusual conference, To Catch the Bird of Heaven, which hosted various perspectives on wisdom.

However, these initiatives subsequently came to a halt as funding for the MultiFaithNet project dried up and I needed to find other employment. CritSuite failed to gain critical mass, partly because of lack of resources (reflected in the Crit.org server often struggling with poor performance) and partly because some rivals took some of the limelight. However, of probably greater bearing was that larger organisations took exception to having anyone provide ‘frank’ comments, which they regarded as graffiti. There was no moderation (apart from the facility to comment on comments). Nevertheless, among those willing to give CritLink a try, it gained considerable respect.

NetWedit: A WYSIWYG Web editing widget

Today, it’s fairly easy for anyone to create and maintain their own websites. Probably the majority of web content is written and updated through web forms, much of it in content management systems such as WordPress, which accounts for a large proportion of all websites. It was designed as a blogging platform and, as such, marked a major step towards the read/write web, according to Sir Tim when interviewed by Mark Lawson for the BBC in 2005.

Blogging consists not only of jotting down one’s thoughts, as in a manuscript, but presenting it as a Web document. A key enabler, usually powered by JavaScript, is the WYSIWYG editing box. It is now taken for granted, but twenty years ago, Web editing wasn’t so amenable. Whilst there were sporadic attempts to realize in situ editing via the browser’s own client, as I had explored with HTTP PUT, the normal procedure for updating websites was more convoluted. Typically, an author would write copy and send it to someone with technical skills to prepare a web version on their PC using an authoring tool such as Dreamweaver or else hand code the HTML and CSS. Then the files were transferred in a separate process via ftp and finally reviewed by the original author.

However, by the turn of the century, database-driven content management systems (CMS) were already emerging, where it was no longer a matter of editing individual pages, but instances of data that were supplied to templates. Some CMS had large numbers of users who each needed to maintain a few paragraphs and update them whenever they liked. I was faced with such a situation after I had moved down to Oxford to join the Humanities Computing Unit: as part of an online teaching resources database system, theologians needed a simple way to edit some content through a web-based interface.

With necessity being the mother of invention, I released my first open source contribution in 2001 under LGPL – I’m not a great fan of 'infective' licenses. :-/ It was a rich text editing widget called NetWedit, designed for Netscape Navigator (version 4 upwards) and also supported by Mozilla 0.9+ and IE4+ (Windows only). It gained some exposure in HEI after its inclusion in Paul Browning’s TTW WYSIWYG Editor Widgets list. Apart from the Java solutions, all the other entries on the list would only work on one operating system or in one browser; NetWedit was possibly the world’s first non-Java-based cross-browser solution, albeit not fully cross-platform.

Implementation

I was not a JavaScript guru, so I chose the path of least resistance, targeting the textarea HTML input area, which is generally used as a form element for more substantial amounts of text. For users to make dynamic updates without server-side scripting, I figured that I needed editing to take place in a pop-up window, whilst storing the HTML source in the textarea box of the parent window. I realised that dealing with user interactions, especially changes to content, would be a complex process, so I settled on a workflow process whereby the user would type text (without markup) in the boxes and then press a button to launch a pop-up, in which to then carry out the markup (formatting, links, etc.). So, it’s more accurate to say that NetWedit is a markup tool.

Once I had figured this approach, the implementation was quite neat. The code consisted of a single JavaScript library file. When loaded on a given page, any number of form textarea elements could be enabled for WYSIWYG markup. Here’s an extract from a sample page with two editable areas.

To the left you can see a bit of text followed by a couple of textarea boxes in a Web form. These boxes are actually the HTML source views. Normally with rich text editors, we see either the source view or the rich text view, not both, but a feature of NetWedit is that you can see both side by side. It’s thus a handy instructional tool.

I’ve pressed the [Markup 1] button to launch a markup window. In fact, launching the window itself already showed the use of the tag.

Any markup applied in the popup was instantly reflected in the source windows, thereby offering interesting possibilities to learn the markup. Thus, highlighting the word, ‘sample’, and pressing the [U] button is instantly reflected as follows:

As mentioned above, this ‘editor’ didn’t actually allow edits to the text itself - these had to be done in the source textarea.

Visual Shenanigans

Under the Document Object Model of that time, there was only one method to capture user-selected text, document.getSelection(), which returned a string. Later revisions returned an object and allowed one to determine exactly where in the DOM the selection was made. However, here there was no immediate method to locate its whereabouts - if you selected ‘the’, which ‘the’ was it? There was nothing more granular.

I got round the limitation by using a kind of visual deception through (an abuse of) CSS, inspired by WordPerfect’s ‘reveal codes’. Using the tag, I inserted a unique sequence of characters after the first character of each word, hidden by a CSS setting of zero display width. When selecting the intended word, getSelection() would include the invisible characters, so that when it came to identifying a particular instance of a word, the system was actually searching for a unique word.
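The idea is easier to see in a few lines of code. What follows is a minimal sketch in Python rather than the original JavaScript; the function names and the |n| marker format are assumptions made for illustration, and the CSS hiding is left out altogether. It shows how a hidden, numbered marker turns an ambiguous word into a unique search target, and how the markers are stripped again so that only the intended markup survives.

```python
import re

# Hypothetical marker format: a number between vertical bars, e.g. |7|.
MARKER = re.compile(r"\|\d+\|")


def add_markers(text):
    """Insert a unique |n| marker after the first character of each word."""
    counter = 0

    def tag(match):
        nonlocal counter
        counter += 1
        word = match.group(0)
        return f"{word[0]}|{counter}|{word[1:]}"

    return re.sub(r"\w+", tag, text)


def strip_markers(text):
    """Remove every marker, leaving the visible text and any markup."""
    return MARKER.sub("", text)


def wrap_selection(marked_text, selection, tag="u"):
    """Wrap the selected fragment (markers included) in a tag, then clean up."""
    start = marked_text.index(selection)  # unambiguous thanks to the marker
    end = start + len(selection)
    wrapped = f"{marked_text[:start]}<{tag}>{selection}</{tag}>{marked_text[end:]}"
    return strip_markers(wrapped)


marked = add_markers("the cat saw the dog")
# The selection of the second 'the' arrives with its hidden marker attached.
result = wrap_selection(marked, "t|4|he")
```

The sketch assumes that a selection always starts at the beginning of a word and that the text itself never contains anything resembling a marker; either assumption failing would bring the ambiguity back.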

All is revealed by viewing the frame source for the text being marked up:

Behold the generated code in its marked-up glory! Note especially the fragment, t|7|o,

But none of this formatting was applied to the source, so only the intended markup, to, was retained:

The widget was successfully deployed in the Theology Faculty Teaching Resources site, where Faculty could maintain their own profiles – research interests and publications. It was considered simple to use and was all that was needed at that time.

I also received quite a bit of response to the release of the software, especially in the first year or two, when it was tried in custom web-based content management systems (WordPress wasn’t released until 2003). I even wondered about deploying it in CritSuite to make it easier to make granular text selections. However, as browser support became more comprehensive, more sophisticated solutions such as FCKEditor came along (also in 2003), and I knew I would have to take a back seat, though I did try to up my game with RTWedit.

Alas, my editing widgets don’t work in any modern browsers, because the method I used was superseded, though I guess it’s just a matter of working out what the new methods are (and hopefully, there’s no new security constraint). However, it's still possible to see NetWedit, as released in 2001, working on Windows 10: download Netscape 9 from an archive and then launch Navigator and load the page from the Wayback Machine (under plain http).

The Aesthetics of XML editing via XSLT

The problem with HTML is that it became increasingly about presentation, despite being in fact a special instance of SGML, designed for documents in general and their semantics (think about how to encode Shakespeare!). This fact was emphasized by colleagues in the HCU, especially by its Director, Lou Burnard, and his expert assistant, Sebastian Rahtz. They were authorities on semantic markup and had a great influence on web developments at OUCS.

I discovered this when working on Interactive Audio Comprehension Materials (IACM), a web-based system to train students in various modern languages by listening to passages and answering questions to check their understanding. The data – passage content and multimedia timings – was stored entirely in XML. With Sebastian's help, another team member, Paul Groves, had already developed a delivery system using Perl’s XML::Sablotron, but now the Faculty wanted to add a web-based editing system.

So I devised a system that took the user’s web form input and turned it into custom XSLT to transform the XML to XML. As I later reflected, it had the nice aesthetic property of being amenable to recording not only changes to data, but how those changes were made. I showed this solution to Sebastian, who was intrigued and, characteristically, almost immediately wondered whether it could handle more complex scenarios. But a need didn’t arise.
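To illustrate the general idea, here is a minimal sketch in Python (the original system was written in Perl, and this is not its code). It turns one form submission, assumed to be an XPath-style match pattern plus the replacement text, into a small stylesheet: an identity template copies the document unchanged, and a second template overrides just the edited element. The function name, the template layout and the sample element names are all hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical stylesheet skeleton: identity copy plus one override.
XSLT_TEMPLATE = """<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Identity template: copy everything unchanged by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
  <!-- Override: keep the element and its attributes, replace its text. -->
  <xsl:template match={match}>
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:text>{text}</xsl:text>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
"""


def build_edit_stylesheet(match_pattern, new_text):
    """Turn one form edit into an XML-to-XML stylesheet.

    match_pattern is assumed to come from trusted application code, not
    typed in by the user; new_text is the user's input and is escaped.
    """
    return XSLT_TEMPLATE.format(
        match=quoteattr(match_pattern),
        text=escape(new_text),
    )


stylesheet = build_edit_stylesheet("passage[@id='p2']/title", "Fish & chips")
```

Applying the stylesheet needs an XSLT processor, which is not shown here. Because each edit is itself a small document, saving the generated stylesheets in order gives a record of how the data was changed, not just what it ended up as.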

Ingredients for Innovation and Collaboration

OUCS was a very collaborative environment; it helped that almost everyone had a certain level of technical competence and many were experts in their field, so exchanges were fruitful. Everything was available in-house – from R&D to hosting and administration, which facilitated greater exploration and faster turnarounds.

The department was prepared to experiment and take risks, which yielded a lot of innovation. In particular, Sebastian architected the OUCS web site based on TEI XML. It meant all staff having to learn TEI, which required some effort, but at least they could author content using an editor of their choice, ranging from vi to XMetaL. Once the documents were written, the system effortlessly delivered HTML, PDF and many other formats using Apache AxKit.

OUCS is now history, along with many of the processes, systems and services that were developed. Nowadays, the content and presentation of departmental websites are more managed and controlled. The presentation is more visual, yet the result seems to me to be increased homogeneity, not only in look and feel, but in the kind of content. It reads well because it’s edited for a general audience, yet it feels intellectually less exciting, lacking some freshness and spontaneity.

Nevertheless, aware of cycles in development, I remain optimistic as upward spirals are always possible. Looking at annotations, among the various initiatives, it looks like Hypothes.is is gaining traction and keeping the spirit of collaboration of the early pioneers. There are still some good prospects for a truly read/write Web.

Offline Search for Content Management Systems

Core to ‘read’ing the web is search, another facet we can delve into.

Again in 2001, I became involved in the development of a multimedia Chinese language learning system, featuring a range of interactive exercises. It was implemented as a website driven by CGI/Perl. I was asked to ‘put it on a CD’, with the option to support further updates to its contents. I eventually delivered a production system, with admin facilities, that output a static site according to a specified folder hierarchy. There was, however, one wish that I never got round to fulfilling – a search function. The most promising cross-platform solutions available at the time were mainly based on Java, but I couldn’t manage to incorporate them.

Almost twenty years later, in 2020, I am using WordPress and have almost the same requirement – I wish to generate a static version of a WP site that I can search offline. Surely, there’s a nice plugin available?

I duly wandered over to the WordPress plugins directory. After a while, it became evident that the search facilities on offer all depended on something running elsewhere, whether that was a third-party service like Algolia or some experimental ‘serverless’ solution. All of these, whether ‘scriptless’ or otherwise, still require server infrastructure, even if the site itself is fully static. They are no use for searching a static site on a memory stick when you have no Internet.

Looking for something that I might (legally) adapt for my purposes, I found WP Static Search, a plugin that looked far more promising, being built on an established JavaScript package, Lunr.js. Here too, the author is a developer who works with this kind of technology in commercial applications. It turned out that the plugin had some bugs, and the author hasn’t yet responded to suggested fixes. Even so, I could make these changes myself by forking the repository on GitHub.

After applying the fixes, I then tackled the requirement to make this work offline (without reference to any server). The key to this was to take account of the JavaScript security model’s restriction on loading files – basically, any file like a search index cannot be loaded in the usual way. So, the index had to be incorporated in the library itself, in one of the JavaScript files. Thus, I modified the index-builder to write the index inside lunr-index.js itself. I’ve published the modified code on GitHub: https://github.com/paultraf/wp-static-search . You can go ahead and download the master zip, but then rename it wp-static-search.zip before installing in WordPress.
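The principle can be sketched in a few lines of Python. This is not the plugin's code, and the index here is a toy word-to-pages map rather than a real Lunr.js index; the function names, the variable name SEARCH_INDEX and the sample pages are all assumptions for illustration. The point is simply that the index is emitted as a JavaScript assignment, so a page opened from disk can pull it in with an ordinary script tag instead of fetching a separate data file.

```python
import json
import re


def build_index(pages):
    """Map each lowercase word to the sorted list of pages containing it."""
    index = {}
    for url, text in pages.items():
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            index.setdefault(word, []).append(url)
    return {word: sorted(urls) for word, urls in sorted(index.items())}


def index_as_script(index, variable="SEARCH_INDEX"):
    """Serialise the index as a JavaScript assignment, not a JSON file."""
    return f"var {variable} = {json.dumps(index)};\n"


# Hypothetical sample content standing in for the exported static pages.
pages = {
    "about.html": "About the read/write Web",
    "search.html": "Offline search for the Web",
}
script = index_as_script(build_index(pages))
```

Writing `script` out to a .js file alongside the static pages and referencing it from each page makes the index available to the search code as a plain global variable, with no network request involved.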

It’s rudimentary, but it works. I use it for the Sigala Research site, in conjunction with wget, which creates the static version for me.
