@atjson/source-html

Create atjson documents from HTML

🧭 @atjson/source-html

The HTML source turns an HTML document into an annotated document, with the raw HTML source as the text, and all the tags (and attributes) as annotations.
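To make the idea of "raw HTML as text, tags as annotations" concrete, here is a minimal, hand-rolled sketch. It is not the package's API or parser: `TagAnnotation`, `AnnotatedDocument`, and `annotate` are hypothetical names invented for this illustration, and the scanner only handles well-formed, properly nested tags with double-quoted attributes. The annotation types and offsets the real package produces may differ.

```typescript
// Hypothetical shapes for illustration only; not the types exported by @atjson/source-html.
interface TagAnnotation {
  type: string; // tag name, e.g. "a"
  start: number; // offset of the "<" that opens the element in the raw HTML
  end: number; // offset just past the ">" of the matching closing tag
  attributes: Record<string, string>;
}

interface AnnotatedDocument {
  content: string; // the raw HTML, left untouched
  annotations: TagAnnotation[];
}

// Simplified scanner (an assumption of this sketch): no void elements,
// comments, unquoted attributes, or mismatched tags.
function annotate(html: string): AnnotatedDocument {
  const annotations: TagAnnotation[] = [];
  const openTags: TagAnnotation[] = [];
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)([^<>]*)>/g;

  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(html)) !== null) {
    const whole = match[0];
    const isClosing = match[1] === "/";
    const name = match[2].toLowerCase();
    const rest = match[3];

    if (!isClosing) {
      // Collect name="value" pairs from the opening tag.
      const attributes: Record<string, string> = {};
      const attributePattern = /([a-zA-Z-]+)="([^"]*)"/g;
      let attr: RegExpExecArray | null;
      while ((attr = attributePattern.exec(rest)) !== null) {
        attributes[attr[1]] = attr[2];
      }
      openTags.push({ type: name, start: match.index, end: -1, attributes });
    } else {
      // Close the most recently opened tag if the names match.
      const opened = openTags.pop();
      if (opened !== undefined && opened.type === name) {
        opened.end = match.index + whole.length;
        annotations.push(opened);
      }
    }
  }

  return { content: html, annotations };
}

const example = annotate('<p>Read the <a href="https://example.com">docs</a>.</p>');
for (const annotation of example.annotations) {
  // Each annotation points back into the unchanged raw HTML.
  console.log(annotation.type, example.content.slice(annotation.start, annotation.end));
}
```

Because the text is never modified, every annotation can be resolved back to the exact span of HTML it describes, which is what makes later conversion to other formats possible.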

This source can be used to parse HTML pages and convert them into another form of markup, such as Markdown. The snippet of code to do this is:

import HTMLSource from "@atjson/source-html";
import CommonMarkRenderer from "@atjson/renderer-commonmark";
import OffsetSource from "@atjson/offset-annotations";

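function htmlToMarkdown(html: string) {
  // Parse the raw HTML, convert its annotations to offset annotations,
  // then render the converted document as CommonMark.
  return CommonMarkRenderer.render(
    HTMLSource.fromRaw(html).convertTo(OffsetSource)
  );
}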

🔮 Insights into your HTML

The HTML source is particularly useful for taking existing HTML and modernizing it into a richer experience. As an example of how powerful this can be, at Condé Nast we used atjson to turn a complex static / JS-rendered webpage into a React application.

๐Ÿ’โ€โ™‚๏ธ How Annotations are generated

We dynamically generate the HTML annotations for this package directly from the WHATWG HTML spec. To regenerate the annotations for this source, run the script in scripts/generate-annotations.js:

node ./scripts/generate-annotations.js

This will regenerate all files in the annotations directory, so beware! Any manual changes to those files can, and probably will, be overwritten the next time the script runs, since the HTML spec evolves over time.

If you find any bugs or have a feature request, please open an issue on GitHub!
