#header,next=TZ2_COMMANDS.html,prev=./RESPONSIVE.html

Bigger, faster, smarter - the monster returns.

Objectives

Create a JavaScript program to convert LaTeX math formulas into responsive MathML. In order of importance

Reference Material

Design

LaTeX Packages

Like the original, TeXZilla 2 does not load and process LaTeX packages. Instead it has a library of about 900 commands taken from widely used LaTeX packages. These commands are directly converted to MathML by imitating the intended effect of the LaTeX command.

The advantage of this approach is that dependencies on LaTeX packages are completely avoided, as are conflicts with user macros. It also means some commands can be implemented with more natural syntax, and very little user configuration is required.

The disadvantage is that TeXZilla 2 is not 100% compatible with LaTeX. This is no big deal because there are already quite a few compatibility "issues" within LaTeX itself, and also with MathJax and KaTeX. If a high degree of compatibility is required, fully bracket command arguments and avoid "low-level" and "non-math" LaTeX commands.

Argument Marshalling

TeXZilla uses rules similar to TeX for marshalling macro arguments. An argument is either a single token or a group of tokens between curly brackets.

However, the rules for marshalling command arguments sometimes differ. Each unbracketed argument extends until its "natural" end token. For example, like TeX, a length dimension extends until its unit field. But unlike TeX, a subformula starting with a \left command extends until the corresponding \right command, and a number extends until whitespace or a non-numeric token is encountered. So, for example, TeXZilla interprets 12^34 as 12 to the power 34, not as 1 followed by 2 to the power 3 followed by 4. To get full LaTeX compatibility for the two different interpretations, write {12}^{34} or 1 2^3 4.
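The number rule can be illustrated with a minimal sketch. This is not the actual TeXZilla 2 tokenizer: the function name `tokenize`, the token shape, and the decision to treat only the digits 0-9 as numeric are all assumptions made for illustration.

```javascript
// Hypothetical sketch, not TeXZilla 2 source code.
// A run of consecutive digits becomes one number token; whitespace ends
// a number and is otherwise skipped; every other character is its own token.
function tokenize(src) {
  const tokens = [];
  let i = 0;
  while (i < src.length) {
    const ch = src[i];
    if (/\s/.test(ch)) {
      i++;
      continue;
    }
    if (/[0-9]/.test(ch)) {
      let j = i;
      while (j < src.length && /[0-9]/.test(src[j])) j++;
      tokens.push({ type: "number", text: src.slice(i, j) });
      i = j;
      continue;
    }
    tokens.push({ type: "char", text: ch });
    i++;
  }
  return tokens;
}

// Helper that keeps only the token text, for easy inspection.
const texts = (src) => tokenize(src).map((t) => t.text);

console.log(texts("12^34"));   // 12 and 34 are each a single token
console.log(texts("1 2^3 4")); // whitespace splits the digits apart
```

With this grouping, the base of ^ in 12^34 is the whole number 12, whereas in 1 2^3 4 the base is the single digit 2.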

Layers

TeXZilla 2 has four layers of processing. The first layer walks the DOM tree finding strings of TeX code that need converting to MathML. It calls the other layers to convert them into strings of MathML code which it then injects back into the DOM tree.

The second layer takes a TeX string and converts it into a stream of tokens according to the venerable TeX syntax rules, expanding macros along the way.

The stream of tokens is passed to the third layer, the interpreter, which carries out any special syntax parsing required. It reduces the stream of about 900 different TeX tokens down to a more manageable 30-odd MathML tree-building commands.

The fourth layer takes the stream of commands and builds a MathML-like tree. It then serializes the tree to a string of MathML code and passes the results back to the first layer.
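As a rough illustration of the last step of the fourth layer, the sketch below serializes a small MathML-like tree to a string. The node shape (`tag`, `attrs`, `children`, `text`) and the function names are hypothetical; the real tree structure used by TeXZilla 2 is not described here.

```javascript
// Hypothetical sketch, not TeXZilla 2 source code.
// A node is { tag, attrs?, children? } for containers
// or { tag, attrs?, text } for token elements.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function serialize(node) {
  const attrs = Object.entries(node.attrs || {})
    .map(([name, value]) => ` ${name}="${escapeXml(String(value))}"`)
    .join("");
  const body = node.children
    ? node.children.map(serialize).join("")
    : escapeXml(node.text || "");
  return `<${node.tag}${attrs}>${body}</${node.tag}>`;
}

// Tree for the input 12^34 under the TeXZilla number rule.
const tree = {
  tag: "math",
  children: [
    {
      tag: "msup",
      children: [
        { tag: "mn", text: "12" },
        { tag: "mn", text: "34" },
      ],
    },
  ],
};

console.log(serialize(tree));
// <math><msup><mn>12</mn><mn>34</mn></msup></math>
```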

Input

All input is Unicode; it is not limited to ASCII. TeXZilla interprets its input as a stream of tokens.

Greek Letters

TeX does not include the Latin-like Greek capital letters; it simply uses the Latin equivalents. To integrate better with Unicode, TeXZilla includes them as separate letters with the names \Alpha etc. Also, to align with the Mathematical Alphanumeric Symbols Unicode block, it includes two additional Greek capital letters, \Digamma and \Thetasym.

Mathematical Symbols

Almost all mathematical symbols have been assigned Unicode code points in the extended plane and are supported by the common MathML fonts, so no other special fonts are needed. Commands like \mathfrak and \mathcal are implemented simply by mapping A-Z etc. into this plane.
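A minimal sketch of such a mapping for \mathfrak is shown below. The function name `toFraktur` is hypothetical and this is not the TeXZilla 2 implementation. The code points are those of the Unicode Mathematical Alphanumeric Symbols block, where five Fraktur capitals (C, H, I, R, Z) live in the Letterlike Symbols block instead and need an exception table.

```javascript
// Hypothetical sketch, not TeXZilla 2 source code.
// Fraktur capitals start at U+1D504 and small letters at U+1D51E.
// C, H, I, R and Z were encoded earlier in the Letterlike Symbols block.
const FRAKTUR_EXCEPTIONS = new Map([
  ["C", 0x212d],
  ["H", 0x210c],
  ["I", 0x2111],
  ["R", 0x211c],
  ["Z", 0x2128],
]);

function toFraktur(text) {
  let out = "";
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (FRAKTUR_EXCEPTIONS.has(ch)) {
      out += String.fromCodePoint(FRAKTUR_EXCEPTIONS.get(ch));
    } else if (cp >= 0x41 && cp <= 0x5a) {
      out += String.fromCodePoint(0x1d504 + (cp - 0x41)); // A-Z
    } else if (cp >= 0x61 && cp <= 0x7a) {
      out += String.fromCodePoint(0x1d51e + (cp - 0x61)); // a-z
    } else {
      out += ch; // anything else is passed through unchanged
    }
  }
  return out;
}

console.log(toFraktur("AC g"));
```

Other alphabets such as script or double-struck follow the same pattern with a different base code point and a different exception table.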

Fonts

TeXZilla uses a single OpenType Math base font to provide almost all glyphs. The base font to be used is specified in the configuration. STIX Two Math is used as a fallback to provide glyphs missing from the base font. If a glyph is missing from both the base font and STIX, it is sourced from a system font.

Most OpenType Math fonts do not include complete sets of Chancery and Roundhand script characters, so some or all of these character sets are sourced from STIX (which is why they often look the same when the base font is changed). Glyphs which do not have assigned Unicode code points are sourced from the STIX private use area. These include "blackboard italic" Latin glyphs, "sans serif" Greek letters and a small number of "negated" relations.

Oldstyle numerals are not available in STIX Two Math. If the base font doesn't provide them, they are sourced from the system cursive font.

Handwriting Fonts

There are three handwriting fonts available: neat, untidy and marker pen. These fonts are synthesized by combining glyphs from an OpenType Math base font and a TrueType handwriting font.

Macros

Macros are user-defined commands that perform text substitution using strings of tokens. Arguments for macros are marshalled by looking ahead in the input stream for the next token, or for a string of tokens contained between a balanced pair of { }. As a result of this rule, macro argument values always contain balanced sets of braces. The outer pair of braces is not considered to be part of the argument value.

In TeX there is no implicit boxing of command arguments. However, many of the AMS commands appear to get their arguments automatically boxed in braces. I'm guessing the reason for this discrepancy is that the AMS commands were once macros which boxed their arguments internally.
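The look-ahead rule for macro arguments can be sketched as follows. This is an illustration only: it works on raw characters rather than on a real token stream, and the function name `readArgument` and its return shape are assumptions, not part of TeXZilla 2.

```javascript
// Hypothetical sketch, not TeXZilla 2 source code.
// Reads one macro argument from src starting at pos.
// Returns the argument value (outer braces stripped) and the index after it.
function readArgument(src, pos = 0) {
  while (pos < src.length && /\s/.test(src[pos])) pos++; // skip leading spaces
  if (pos >= src.length) throw new Error("missing argument");

  if (src[pos] === "{") {
    let depth = 0;
    for (let i = pos; i < src.length; i++) {
      if (src[i] === "\\") {
        i++; // skip the escaped character, e.g. \{ or \}
      } else if (src[i] === "{") {
        depth++;
      } else if (src[i] === "}" && --depth === 0) {
        return { value: src.slice(pos + 1, i), next: i + 1 };
      }
    }
    throw new Error("unbalanced braces");
  }

  if (src[pos] === "\\") {
    // A control sequence: backslash plus letters, or backslash plus one character.
    const m = /^\\(?:[A-Za-z]+|[\s\S])/.exec(src.slice(pos));
    if (!m) throw new Error("dangling backslash");
    return { value: m[0], next: pos + m[0].length };
  }

  return { value: src[pos], next: pos + 1 }; // a single character token
}

console.log(readArgument("{a{b}c}d")); // value "a{b}c", next 7
console.log(readArgument("xy"));       // value "x", next 1
```

Because the braced branch only returns when the nesting depth falls back to zero, any value it yields contains balanced braces, which matches the rule stated above.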

Math Style

The following user-selectable italicization styles have been implemented.

Math Style   Roman               Greek               Numbers
             Upper     Lower     Upper     Lower
ISO          italic    italic    italic    italic    upright
TeX          italic    italic    upright   italic    upright
French       upright   italic    upright   upright   upright
Upright      upright   upright   upright   upright   upright

When a handwriting font is selected, Roman letters and numbers are neither italic nor upright; they have a "handwritten" style.

When the Euler Math font is selected, all characters are upright (this is a feature of the font).
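The style table above translates naturally into a lookup structure. The sketch below is one possible encoding, not the TeXZilla 2 implementation: the names `MATH_STYLES`, `classify` and `styleFor` are hypothetical, the Greek ranges are the basic Unicode Greek block, and the handwriting and Euler exceptions are deliberately left out.

```javascript
// Hypothetical sketch, not TeXZilla 2 source code.
// One row per math style, one property per column of the table.
const MATH_STYLES = {
  ISO:     { romanUpper: "italic",  romanLower: "italic",  greekUpper: "italic",  greekLower: "italic",  number: "upright" },
  TeX:     { romanUpper: "italic",  romanLower: "italic",  greekUpper: "upright", greekLower: "italic",  number: "upright" },
  French:  { romanUpper: "upright", romanLower: "italic",  greekUpper: "upright", greekLower: "upright", number: "upright" },
  Upright: { romanUpper: "upright", romanLower: "upright", greekUpper: "upright", greekLower: "upright", number: "upright" },
};

// Assign a character to one of the table's columns, or null if none applies.
function classify(ch) {
  const cp = ch.codePointAt(0);
  if (cp >= 0x41 && cp <= 0x5a) return "romanUpper";
  if (cp >= 0x61 && cp <= 0x7a) return "romanLower";
  if (cp >= 0x391 && cp <= 0x3a9) return "greekUpper";
  if (cp >= 0x3b1 && cp <= 0x3c9) return "greekLower";
  if (cp >= 0x30 && cp <= 0x39) return "number";
  return null;
}

// Returns "italic", "upright", or null for characters outside the table.
function styleFor(mathStyle, ch) {
  const row = MATH_STYLES[mathStyle];
  if (!row) throw new Error(`unknown math style: ${mathStyle}`);
  const column = classify(ch);
  return column ? row[column] : null;
}

console.log(styleFor("TeX", "Γ"));    // upright
console.log(styleFor("ISO", "Γ"));    // italic
console.log(styleFor("French", "x")); // italic
```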

Unsupported LaTeX Commands

The following low-level LaTeX class commands are not supported because there is no reliable way to translate them to MathML:

\mathopen \mathclose \mathop \mathbin \mathord \mathpunct \mathrel

It's usually possible to use some high-level command instead. If there is no alternative, the following MathML-like commands taken from TeXZilla 1 can be used.

\mi \mn \mo \mtext

Each takes one text argument, which becomes the MathML tag content. For example, to create a double increment operator or a number in scientific format:

\mo{++} \mn{−1.234E18}
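To make the intended output concrete, here is a small sketch of how these four commands could map to MathML token elements. It is not the TeXZilla 2 implementation: the function name `tokenCommandToMathML` is hypothetical, and the sketch assumes the argument contains no nested braces.

```javascript
// Hypothetical sketch, not TeXZilla 2 source code.
// Matches \mi{...}, \mn{...}, \mo{...} or \mtext{...} with a brace-free argument.
const TOKEN_COMMAND = /^\\(mi|mn|mo|mtext)\{([^{}]*)\}$/;

function escapeXml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Returns the MathML element as a string, or null if src is not one of the commands.
function tokenCommandToMathML(src) {
  const m = TOKEN_COMMAND.exec(src.trim());
  if (!m) return null;
  const [, tag, content] = m;
  return `<${tag}>${escapeXml(content)}</${tag}>`;
}

// Backslashes are doubled only because these are JavaScript string literals.
console.log(tokenCommandToMathML("\\mo{++}"));         // <mo>++</mo>
console.log(tokenCommandToMathML("\\mn{−1.234E18}"));  // <mn>−1.234E18</mn>
```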


To Do

Other possibilities:


OpenType Math Fonts

List of features supported by OpenType Math fonts.

Feature            STIX   ASAN   CAMB   CONC   EREW   EULR*  FIRA   GARA   GFSN   LATM   LETE   LIBR   NCMM   NOTO   XCHR   XITS
Calligraphic       def    -      def    def    priv   def    -      ss03   def    def    def    -      def    ss01   -      ss01
Bold Calligraphic  def    -      def    def    priv   def    -      ss03   -      def    def    -      def    -      -      ss01
Script             ss01   def    -      -      def    -      -      def    -      -      -      def    ss01   def    def    def
Blackboard Italic  priv   -      -      -      -      -      -      -      -      -      -      -      -      -      -      ss06
Blackboard Bold    -      -      -      -      priv   priv   -      -      -      -      priv   -      ss03   -      priv   ss05
Sans Serif Greek   priv   -      -      priv   priv   priv   -      -      -      -      priv   -      priv   -      priv   ss02
Full size primes   ss04   def    -      ssty   ssty   ssty   -      -      -      -      ssty   ssty†  -      ssty   ssty   -
Wide hat etc.      -      10ffa6 -      e520   e520   e520   -      -      -      -      e3e1   -      -      -      e520   -
Upright Integrals  priv   def    def    ss03   ss03   def    ss01   ss07   -      -      ss08   def    ss02   ss02   ss03   ss08
Mathematical g     ss02   -      -      -      -      def    -      -      -      -      cv11   -      -      -      -      -
Oldstyle Numerals  -      yes    yes    yes    yes    yes    -      -      -      -      yes    -      -      -      yes    -

* all glyphs upright   † not quadruple prime

Common Font Issues

Other Font Issues


Testing

Browsers: Chrome, Firefox, Safari, Edge. Also Samsung Internet and Opera.

Platforms: macOS, Windows, Android, iOS.

Hardware: MacBook, Windows Desktop, Samsung Phone, iPad.