This repository is archived and no longer being maintained. Active development will be formally managed in the unified-doc organization.
unified document renderer for content.
Content as structured data. -- unified
Knowledge is unified abstractly across humanity. We share common goals of acquiring, storing, and sharing knowledge. Content represents the physical manifestation of storing knowledge, and is stored in various digital formats in the modern computing age. Sharing content seamlessly across formats is a current challenge in unifying human knowledge.
Various software programs act on content types to parse, process, and render the underlying data for human consumption. Many solutions try to be interoperable, but are largely limited by the lack of a common interface across content types and programs. These solutions can largely be described as API interactions between software, not as interactions with the actual content. The unified initiative addresses this problem by representing content in unified syntax trees, where programs can work closely with the underlying structured content.
unified-doc is a project of unified document renderers and associated utilities that use the unified ecosystem to render any supported content type into HTML-based markup. It represents content as structured data and preserves the fidelity of the original source content in the rendered document, while supporting powerful features that enrich the document (e.g. annotations) and remaining interoperable with standard and evolving web technologies.
The following section covers the design of how unified-doc renderers and programs are implemented.
At the time of writing, unified-doc supports parsing the following content types into hast trees:
This is done through the processor module, which provides a single entry point to define how supported content types are parsed into
processor applies an opinionated (but configurable) sanitization step using the
Now that the source content is represented as a unified hast tree, everything downstream can be implemented consistently. Let's talk about compiling and rendering the hast tree into an actual document.
A document refers abstractly to the output of compiling and rendering the hast tree. This output should be HTML-based markup, so that the document can easily be enriched further with available web technologies.
unified-doc supports the following renderers:
Renderers should use the processor module internally so that they support all content types that processor supports. They can optionally include rehype plugins, depending on the features to be supported.
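The relationship between a processor and a renderer can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual unified-doc API: the simplified hast-like types, the `parsers` registry, `processContent`, `renderHtml`, and the `addClass` plugin are all hypothetical names introduced here, and plugins are modelled as plain tree-to-tree functions rather than real rehype plugins.

```typescript
// Simplified hast-like node types (an assumption; real hast has more node kinds).
type HastText = { type: "text"; value: string };
type HastElement = {
  type: "element";
  tagName: string;
  properties: Record<string, string>;
  children: HastChild[];
};
type HastChild = HastText | HastElement;
type HastRoot = { type: "root"; children: HastChild[] };

// A parser turns raw source content into a tree; a plugin transforms a tree.
type Parser = (content: string) => HastRoot;
type Plugin = (tree: HastRoot) => HastRoot;

// Hypothetical registry: one parser per supported content type.
const parsers: Record<string, Parser> = {
  // Plain text: every non-empty line becomes a paragraph element.
  "text/plain": (content) => ({
    type: "root",
    children: content
      .split("\n")
      .filter((line) => line.trim() !== "")
      .map((line): HastChild => ({
        type: "element",
        tagName: "p",
        properties: {},
        children: [{ type: "text", value: line }],
      })),
  }),
};

// Single entry point: pick a parser by content type, then run plugins in order.
function processContent(
  content: string,
  contentType: string,
  plugins: Plugin[] = []
): HastRoot {
  const parse = parsers[contentType];
  if (!parse) {
    throw new Error(`Unsupported content type: ${contentType}`);
  }
  return plugins.reduce((tree, plugin) => plugin(tree), parse(content));
}

function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Renderer: compile the tree into an HTML string.
function renderHtml(node: HastRoot | HastChild): string {
  if (node.type === "text") return escapeHtml(node.value);
  const inner = node.children.map((child) => renderHtml(child)).join("");
  if (node.type === "root") return inner;
  const attrs = Object.entries(node.properties)
    .map(([key, value]) => ` ${key}="${escapeHtml(value)}"`)
    .join("");
  return `<${node.tagName}${attrs}>${inner}</${node.tagName}>`;
}

// Example plugin (hypothetical): add a class to every top-level element.
const addClass: Plugin = (tree) => ({
  ...tree,
  children: tree.children.map((child) =>
    child.type === "element"
      ? { ...child, properties: { ...child.properties, class: "doc" } }
      : child
  ),
});

const exampleTree = processContent("Hello <world>", "text/plain", [addClass]);
console.log(renderHtml(exampleTree)); // <p class="doc">Hello &lt;world&gt;</p>
```

Because the renderer only ever sees the tree, adding a new content type in this sketch means registering one more parser; the rendering code and the plugins stay untouched.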
react-unified-doc uses the hast-util-annotate utility to support annotation features on hast trees processed by
One of the more important and useful features when rendering documents is supporting annotations. Here are some use cases of annotations in common document workflows:
- Highlighting: Text content is highlighted in the document with custom styles. This is the broadest domain, and there are many UI/UX implementations tailored to specific document workflows.
- Bookmarking: Loading a document with a and clicking on a valid anchor link will scroll to the bookmarked annotation.
- Commenting: Clicking on an annotation loads associated comments.
- Redlining: Text content is underlined, showing the difference between two versions of the document.
annotation represents text content that is visibly marked to the user and does not disrupt the rest of the document layout.
The definition above is intentionally worded to emphasize the following:
- text content: Only text content is meaningful to the viewer. For HTML-based markup, this is semantically represented by
- visibly marked: Annotated text nodes should apply visual cues indicating they are annotated or 'marked'. For HTML-based markup, this is represented semantically by mark nodes, and visual customization of these nodes is important in conveying annotation information.
- does not disrupt: Annotations should be purely semantic additions to the document that do not affect the layout of the rendered document.
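The definition above can be made concrete with a small sketch. This is not the hast-util-annotate implementation: the simplified node types, the `Annotation` shape (character offsets into the document's text content), and the `annotate` and `textContent` helpers are assumptions made for illustration. It shows one way to wrap the annotated range in mark nodes while leaving the text content itself unchanged.

```typescript
// Simplified hast-like node types (an assumption; real hast has more node kinds).
type HastText = { type: "text"; value: string };
type HastElement = {
  type: "element";
  tagName: string;
  properties: Record<string, string>;
  children: HastChild[];
};
type HastChild = HastText | HastElement;

// Hypothetical annotation shape: offsets into the concatenated text content.
type Annotation = { id: string; start: number; end: number };

// Concatenate all text values: the "text content" that annotations refer to.
function textContent(nodes: HastChild[]): string {
  return nodes
    .map((node) => (node.type === "text" ? node.value : textContent(node.children)))
    .join("");
}

// Return a new tree in which the annotated range is wrapped in <mark> nodes.
// `offset` tracks how many characters of text have been visited so far.
function annotate(
  nodes: HastChild[],
  annotation: Annotation,
  offset = { value: 0 }
): HastChild[] {
  const result: HastChild[] = [];
  for (const node of nodes) {
    if (node.type === "element") {
      // Elements are kept as-is; only their text descendants can be split.
      result.push({ ...node, children: annotate(node.children, annotation, offset) });
      continue;
    }
    const nodeStart = offset.value;
    const nodeEnd = nodeStart + node.value.length;
    offset.value = nodeEnd;
    // Overlap between the annotation and this text node, in node-local offsets.
    const from = Math.max(annotation.start, nodeStart) - nodeStart;
    const to = Math.min(annotation.end, nodeEnd) - nodeStart;
    if (from >= to) {
      result.push(node); // no overlap: keep the text node untouched
      continue;
    }
    const before = node.value.slice(0, from);
    const marked = node.value.slice(from, to);
    const after = node.value.slice(to);
    if (before) result.push({ type: "text", value: before });
    result.push({
      type: "element",
      tagName: "mark",
      properties: { "data-annotation-id": annotation.id },
      children: [{ type: "text", value: marked }],
    });
    if (after) result.push({ type: "text", value: after });
  }
  return result;
}

// Usage: annotate "lo bra" in "Hello brave world", spanning two text nodes.
const exampleDoc: HastChild[] = [
  {
    type: "element",
    tagName: "p",
    properties: {},
    children: [
      { type: "text", value: "Hello " },
      {
        type: "element",
        tagName: "strong",
        properties: {},
        children: [{ type: "text", value: "brave" }],
      },
      { type: "text", value: " world" },
    ],
  },
];
const exampleAnnotated = annotate(exampleDoc, { id: "a1", start: 3, end: 9 });
console.log(textContent(exampleAnnotated) === textContent(exampleDoc)); // true
```

An annotation that crosses element boundaries produces one mark node per affected text node, so the existing element structure is never restructured. Only text nodes are split, which keeps the annotation a purely additive change.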
Annotations should support intuitive user interactions (e.g. clicking, hovering). These interactions allow building useful features that enrich the document (e.g. tooltips, permalinks, updating annotations).
Note: As mentioned earlier, it is important to view annotations as a purely additive operation when rendering documents. Annotation implementations should never couple the rendering of documents with the rendering of annotations, nor affect the document layout. This ensures that downstream applications of plugins and web technologies work seamlessly.
Just as all content and programs are interoperable in the unified ecosystem, the unified-doc renderers should be compatible with the rehype plugin ecosystem. See the react-unified-doc plugins docs for an example of how this is achieved.
This project is built on top of the unified ecosystem. Please check out all the inspirational and ambitious projects happening there!
Help contribute towards making content and knowledge more accessible for machines and humans.
There are no formal contribution guidelines yet. Be respectful and nice!
Useful information about the project:
- The project is linted with xo with some custom configuration.
- While the project uses typescript, it is not a typescript project and uses it purely to aid development. This is intentional to make the code more accessible to the broader JS community.
- Tests are managed with
- Docs are managed with