Research Articles in Simplified HTML (RASH)
The Research Articles in Simplified HTML (RASH) format is a markup language for writing academic research articles that restricts HTML to a subset of 32 elements. It allows authors to embed RDF annotations. In addition, RASH strictly follows the Digital Publishing WAI-ARIA Module 1.0 for expressing the structural semantics of its markup elements.
The development of RASH started from the whole HTML5 grammar and proceeded by removing elements and restricting how the remaining ones can be used, so that the language is expressive enough to represent the structures of scholarly papers while being fully compliant with the theory of structural patterns for XML documents. These patterns make it possible to create unambiguous, manageable and well-structured markup languages and, consequently, documents, fostering reusability (e.g., inclusion and conversion) among different languages. Also, thanks to the regularity they provide, complex operations can easily be performed on pattern-based documents even when very little is known about their vocabulary (e.g., automatic visualisation of documents and inferences on the document structure).
The formal grammar of RASH has been developed in RelaxNG, a simple, easy-to-learn and powerful schema language for XML, and is accompanied by descriptive documentation. The grammar is organised in four distinct logical blocks of syntactic rules, defining respectively elements, attributes, content models for the elements, and their related attribute lists.
The 32 HTML5 elements that can be used in RASH are: a, blockquote, body, code, em, figcaption, figure, h1, head, html, img, li, link, math, meta, ol, p, pre, q, script, section, span, strong, sub, sup, svg, table, td, th, title, tr, ul.
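As a rough illustration of the element restriction, the following Python sketch reports the elements in a document that fall outside the list above. It is not the RASH validator, which relies on the RelaxNG grammar and also checks attributes and content models; the names ElementChecker and disallowed_elements are hypothetical, and the sketch assumes that descendants of math and svg belong to the MathML and SVG vocabularies and are therefore not checked.

```python
from html.parser import HTMLParser

# The 32 element names listed above.
RASH_ELEMENTS = frozenset("""
a blockquote body code em figcaption figure h1 head html img li link math
meta ol p pre q script section span strong sub sup svg table td th title
tr ul
""".split())

# Assumption: content nested inside math or svg is MathML/SVG markup
# and is not compared against the list.
FOREIGN_ROOTS = {"math", "svg"}


class ElementChecker(HTMLParser):
    """Collects the names of elements that are not in RASH_ELEMENTS."""

    def __init__(self):
        super().__init__()
        self.disallowed = []
        self._foreign_depth = 0  # > 0 while inside <math> or <svg>

    def handle_starttag(self, tag, attrs):
        if self._foreign_depth:
            if tag in FOREIGN_ROOTS:
                self._foreign_depth += 1
            return
        if tag in FOREIGN_ROOTS:
            self._foreign_depth = 1
        elif tag not in RASH_ELEMENTS:
            self.disallowed.append(tag)

    def handle_endtag(self, tag):
        if self._foreign_depth and tag in FOREIGN_ROOTS:
            self._foreign_depth -= 1


def disallowed_elements(html_text):
    """Return, in document order, the element names outside the list."""
    checker = ElementChecker()
    checker.feed(html_text)
    checker.close()
    return checker.disallowed


print(disallowed_elements("<p>Some <b>bold</b> text</p>"))  # ['b']
```

An empty result only means that no element name lies outside the list; it does not imply that the document is valid against the full grammar.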
In addition, RASH defines different ways to implement formulas. The standard specification for representing mathematics on the Web is MathML, which can be used in RASH. However, even if MathML is the most accessible way of writing mathematical formulas, the markup needed to define even a fairly simple formula is verbose, which is an understandable obstacle to its direct adoption. Thus, it is also possible to define a formula by means of an image (element img with @role = 'formula'), or by using the element span (with @role = 'formula') containing a LaTeX or AsciiMath formula, which can be rendered correctly via MathJax.
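The three encodings described above can be told apart from the markup alone. The following Python sketch, with the hypothetical names FormulaFinder and formula_kinds, labels each formula it finds as MathML, image or text; it assumes the attribute value role="formula" exactly as stated above and does not try to distinguish LaTeX from AsciiMath.

```python
from html.parser import HTMLParser


class FormulaFinder(HTMLParser):
    """Records how each formula in a document is encoded."""

    def __init__(self):
        super().__init__()
        self.kinds = []

    def handle_starttag(self, tag, attrs):
        role = dict(attrs).get("role")
        if tag == "math":
            self.kinds.append("mathml")
        elif tag == "img" and role == "formula":
            self.kinds.append("image")
        elif tag == "span" and role == "formula":
            self.kinds.append("text")  # LaTeX or AsciiMath source


def formula_kinds(html_text):
    """Return the encoding of every formula, in document order."""
    finder = FormulaFinder()
    finder.feed(html_text)
    finder.close()
    return finder.kinds


sample = (
    '<p><math><mi>x</mi></math></p>'
    '<p><img role="formula" src="f.png" alt="x squared"/></p>'
    '<p><span role="formula">x^2</span></p>'
    '<p><span>not a formula</span></p>'
)
print(formula_kinds(sample))  # ['mathml', 'image', 'text']
```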
Two applications have been developed in order to check whether a document is compliant with the RASH grammar:
a Bash script that enables RASH users to check their documents both against the specific requirements of the RASH RelaxNG grammar and against the HTML specification through the W3C Nu HTML Checker;
a Python application that enables one to validate RASH documents against the RASH grammar, and that also provides a Web interface for visualising all the validation issues found in RASH documents.
The visualisation of a RASH document is rendered by the browser by means of appropriate CSS3 stylesheets and JavaScript scripts developed for this purpose. In particular, RASH adopts external libraries, such as Bootstrap and jQuery, in order to provide the current visualisation and include additional tools for the user, such as the footbar with statistics about the paper (i.e., number of words, figures, tables and formulas) and a menu to change the layout of the page.
In addition, these scripts also implement the automatic rendering of paper items, such as references to a bibliographic entry or a figure, so as to reduce the cognitive effort of an author when writing a RASH paper by means of a text editor.
The SPAR Xtractor Suite is a Java application that performs the automatic enrichment of RASH documents with RDFa annotations defining the actual structure of such documents in terms of the FRBR-aligned Bibliographic Ontology (FaBiO) and the Document Component Ontology (DoCO). SPAR Xtractor is designed as a one-click tool that automatically adds structural semantics to a RASH document.
In particular, SPAR Xtractor takes a RASH document as input and returns a new RASH document where all its markup elements have been annotated with their actual structural semantics by means of RDFa. The tool associates a set of FaBiO or DoCO types with specific HTML elements. The set of HTML elements and their associations with FaBiO or DoCO types can be customised according to specific expressivity needs.
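The idea of a customisable association between elements and types can be sketched in a few lines of Python. This is not the SPAR Xtractor implementation (which is a Java application): the function annotate, the table DEFAULT_TYPES and the particular element-to-type pairs are hypothetical, and the sketch assumes a well-formed, namespace-free document and the use of the RDFa typeof and prefix attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical element-to-type table; the tool's real defaults may differ.
DEFAULT_TYPES = {
    "section": "doco:Section",
    "p": "doco:Paragraph",
    "table": "doco:Table",
}

# Prefix declaration so that the doco: CURIEs can be resolved.
DOCO_PREFIX = "doco: http://purl.org/spar/doco/"


def annotate(xml_text, types=DEFAULT_TYPES):
    """Return the document with a typeof attribute on every mapped element."""
    root = ET.fromstring(xml_text)
    root.set("prefix", DOCO_PREFIX)
    for element in root.iter():
        rdf_type = types.get(element.tag)
        if rdf_type is not None:
            element.set("typeof", rdf_type)
    return ET.tostring(root, encoding="unicode")


document = "<html><body><section><h1>Intro</h1><p>Text</p></section></body></html>"
print(annotate(document))
# A different association can be supplied to customise the output:
print(annotate(document, {"h1": "doco:SectionTitle"}))
```

Passing the association as a plain dictionary mirrors the customisation described above: changing which elements are annotated, or with which types, does not require touching the traversal code.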
Some of the venues that have adopted RASH as a submission format:
2017 International Workshop on Semantics, Analytics, Visualisation: Enhancing Scholarly Data (SAVE-SD 2017), held during the 26th International World Wide Web Conference (WWW 2017)
3rd Workshop on Managing the Evolution and Preservation of the Data Web (MEPDaW 2017), held during the 14th Extended Semantic Web Conference (ESWC 2017)
7th International Workshop on Consuming Linked Data (COLD2016), held during the 15th International Semantic Web Conference (ISWC 2016)
4th International Workshop on Linked Data for Information Extraction (LD4IE 2016), held during the 15th International Semantic Web Conference (ISWC 2016)
6th Workshop on Linked Science (LISC 2016), held during the 15th International Semantic Web Conference (ISWC 2016)
2nd Workshop on Mobile Deployment of Semantic Technologies (MoDeST 2016), held during the 15th International Semantic Web Conference (ISWC 2016)
PROV: Three Years Later 2016 Workshop, held during the Provenance Week 2016
Semantic Publishing Challenge 2016 (SemPub2016), held during the 13th Extended Semantic Web Conference (ESWC 2016)
4th International Workshop on Linked Media (LIME 2016), held during the 13th Extended Semantic Web Conference (ESWC 2016)
20th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2016)
2016 Workshop on Web APIs and RESTful Design (WS-REST2016), held during the 16th International Conference on Web Engineering (ICWE2016)
2016 International Workshop on Semantics, Analytics, Visualisation: Enhancing Scholarly Data (SAVE-SD 2016), held during the 25th International World Wide Web Conference (WWW 2016)
2015 International Workshop on Semantics, Analytics, Visualisation: Enhancing Scholarly Data (SAVE-SD 2015), held during the 24th International World Wide Web Conference (WWW 2015)
2015 International Workshop on Learning in the Cloud (LC2015), held during the 26th ACM Conference on Hypertext and Social Media (Hypertext 2015)
Semantic Publishing Challenge 2015 (SemPub2015), held during the 12th Extended Semantic Web Conference (ESWC 2015)
1st International Workshop on LINKed EDucation at the ISWC 2015, held during the 14th International Semantic Web Conference (ISWC 2015)
3rd International Workshop on Linked Data for Information Extraction (LD4IE 2015), held during the 14th International Semantic Web Conference (ISWC 2015)