MathML
From Better Practices
Overview
MathML (the Mathematical Markup Language) is the W3C (World Wide Web Consortium) standard for encoding mathematics. It belongs to the XML (eXtensible Markup Language) family of markup languages. XML is the standard for encoding specialized information, particularly for the web. Other XML languages include CML (for chemistry), MusicXML (for music), SVG (for graphics), XSL (for style), and InkML (for digital ink). There are hundreds of specialized XML languages, and new ones are being defined all the time. It's probably fair to say that just about every software or technology company has defined at least one. All XML languages share a common syntax, which makes them very portable, easy to design, easy to understand, and easy to modify. We could define our own "better practices" XML language if we wanted to. (Let's call it BPML.)
MathML comes in two flavors: Presentation MathML and Content MathML. Presentation MathML encodes how mathematical expressions are built up and displayed in two-dimensional form (box, sub, super, pre, under, over, etc.). Content MathML encodes the mathematical structure of an expression (sum, product, integral, power, vector, matrix, etc.). Both flavors encode much more of the structure of a mathematical expression than a typesetting language like TeX does. The two flavors can be mixed in a single expression.
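As a small illustration of the two flavors, the sketch below marks up x squared both ways and combines them with the semantics element, which is one way MathML 2.0 allows the two to be mixed. The choice of expression is ours and is not one of the examples further down the page.

```xml
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <!-- Presentation: a base "x" with a superscript "2" -->
    <msup>
      <mi>x</mi>
      <mn>2</mn>
    </msup>
    <!-- Content: the power operator applied to x and 2 -->
    <annotation-xml encoding="MathML-Content">
      <apply><power/>
        <ci>x</ci>
        <cn>2</cn>
      </apply>
    </annotation-xml>
  </semantics>
</math>
```

The presentation half says where the 2 sits on the page; the content half says that it is an exponent.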
Content MathML is currently limited to mathematics at about the sophomore college level and below. However, like any good XML language, Content MathML can be extended by the author with new constructs. (But of course, the new constructs would not be "standard".) To display Content MathML, a browser needs an XSL stylesheet to translate the Content MathML to Presentation MathML.
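One way to add a non-standard construct in MathML 2.0 is the csymbol element, which names a new symbol and points to its definition. In the sketch below, the operator name "bpop" and the definition URL are both made up for illustration.

```xml
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <!-- "bpop" and its definitionURL are hypothetical; a stylesheet or
       renderer would have to be taught how to display the new symbol -->
  <apply>
    <csymbol encoding="text"
             definitionURL="http://www.example.com/bpml/bpop">bpop</csymbol>
    <ci>x</ci>
    <ci>y</ci>
  </apply>
</math>
```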
Both versions are extremely verbose, precisely because so much information is encoded. To give just one simple example, you cannot simply write "2" in MathML. You have to mark it up to indicate that, yes, you really do mean the number 2, as opposed to, say, a variable named "2". The verbosity is good in the sense that much of the semantics of an expression is captured, but bad for authors who have to write the markup.
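For instance, the expression 2 + x needs every token wrapped in an element that says what kind of thing it is. A minimal sketch in Presentation MathML (the expression is chosen only for illustration):

```xml
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <mn>2</mn>   <!-- mn: a number -->
    <mo>+</mo>   <!-- mo: an operator -->
    <mi>x</mi>   <!-- mi: an identifier (variable) -->
  </mrow>
</math>
```

In Content MathML the corresponding token elements are cn for numbers and ci for identifiers.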
Basic Structure
Technically, we are discussing compound XML documents, with exposition in XHTML and with embedded MathML. The first part of a compound XML document contains declarations about the XML languages used, the document type definition, and other information. A document with MathML should be saved with the extension xhtml or xml, and should have the following initial declarations:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN"
"http://www.w3.org/Math/DTD/mathml2/xhtml-math11-f.dtd">
<?xml-stylesheet type="text/xsl" href="mathml.xsl"?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
You don't really have to understand what all this means, but if you're curious, the first line declares that this is an XML document and gives the version and encoding. The second line declares that this is a compound document with MathML, and gives a link to a document type definition (DTD). The third line gives a reference to an XSL style sheet, and finally the html tag gives a reference to a namespace.
You will need to download the MathML style sheets mathml.xsl, pmathml.xsl, pmathmlcss.xsl, and ctop.xsl and put them in your document directory. (The files are available at the W3C Math site.) The last of these, ctop.xsl, is only needed for Content MathML and hence can be omitted if you are just going to use Presentation MathML.
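Putting the pieces together, a complete minimal document might look like the sketch below. It assumes that mathml.xsl sits in the same directory as the document, and it uses the XHTML 1.1 plus MathML 2.0 document type, whose root element is html. The title, the sentence, and the sample expression are placeholders.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN"
  "http://www.w3.org/Math/DTD/mathml2/xhtml-math11-f.dtd">
<?xml-stylesheet type="text/xsl" href="mathml.xsl"?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <title>MathML test page</title>
  </head>
  <body>
    <p>The area of a circle of radius r is</p>
    <!-- The MathML namespace is declared on the math element itself -->
    <math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
      <mrow>
        <mi>A</mi><mo>=</mo>
        <mi>&#x3C0;</mi><mo>&#x2062;</mo>
        <msup><mi>r</mi><mn>2</mn></msup>
      </mrow>
    </math>
  </body>
</html>
```

The character references &#x3C0; (the Greek letter pi) and &#x2062; (invisible times) are written numerically so that the file stays within the declared iso-8859-1 encoding.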
Examples
Presentation MathML
The following markup in Presentation MathML
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow>
    <mi>x</mi><mo>=</mo>
    <mfrac>
      <mrow>
        <mo>-</mo><mi>b</mi><mo>±</mo>
        <msqrt>
          <msup>
            <mi>b</mi><mn>2</mn>
          </msup>
          <mo>-</mo>
          <mn>4</mn><mo>&#x2062;</mo><mi>a</mi><mo>&#x2062;</mo><mi>c</mi>
        </msqrt>
      </mrow>
      <mrow>
        <mn>2</mn><mo>&#x2062;</mo><mi>a</mi>
      </mrow>
    </mfrac>
  </mrow>
</math>
gives the quadratic formula x = (-b ± √(b² - 4ac)) / (2a).
Note the following:
- The use of rows (mrow) to group the blocks of the expression
- The markup of identifiers such as variables (mi)
- The markup of numbers (mn)
- The markup of operators (mo)
- The use of "invisible times" (the character U+2062) for implied multiplication
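The invisible operators are easiest to see in a small case. The sketch below marks up f(x) + 2x. It writes the invisible characters as numeric character references (U+2061 for function application, U+2062 for invisible times) so that they show up in the source. The expression is ours, chosen for illustration.

```xml
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <!-- f(x): function application between f and its argument -->
    <mi>f</mi><mo>&#x2061;</mo>
    <mrow><mo>(</mo><mi>x</mi><mo>)</mo></mrow>
    <mo>+</mo>
    <!-- 2x: invisible times between the two factors -->
    <mrow><mn>2</mn><mo>&#x2062;</mo><mi>x</mi></mrow>
  </mrow>
</math>
```

Neither character is drawn on the screen, but each tells software reading the markup whether two adjacent symbols form a product or a function call.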
Content MathML
The following markup in Content MathML
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <apply><eq />
    <apply><int />
      <bvar><ci>z</ci></bvar>
      <lowlimit>
        <apply><minus />
          <infinity />
        </apply>
      </lowlimit>
      <uplimit><infinity /></uplimit>
      <apply><exp />
        <apply><times />
          <apply><minus />
            <cn type="rational">1<sep />2</cn>
          </apply>
          <apply><power />
            <ci>z</ci>
            <cn>2</cn>
          </apply>
        </apply>
      </apply>
    </apply>
    <apply><root />
      <apply><times />
        <cn>2</cn>
        <pi />
      </apply>
    </apply>
  </apply>
</math>
gives the integral equation ∫ exp(-z²/2) dz = √(2π), where the integral runs over z from -∞ to ∞.
Note the following:
- The markup of identifiers such as variables (ci)
- The markup of numbers (cn), including the rational number 1/2
- The structure of operators and relations (eq, times, minus, power, root)
- The markup of special functions (exp) and constants (pi, infinity)
- The structure of the integral (bound variable, limits)
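For comparison with the Presentation example, here is a sketch of the discriminant b² - 4ac from the quadratic formula in Content MathML. Each apply element holds an operator followed by its arguments, so no invisible times is needed.

```xml
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply><minus/>
    <!-- b squared -->
    <apply><power/>
      <ci>b</ci>
      <cn>2</cn>
    </apply>
    <!-- 4ac: times accepts any number of arguments -->
    <apply><times/>
      <cn>4</cn>
      <ci>a</ci>
      <ci>c</ci>
    </apply>
  </apply>
</math>
```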
Specifications and Tutorials
- MathML Home
This is the MathML home page at W3C, with links to myriad resources.
- MathML 2.0 Specification
This is the official specification for the current (2.0) version of MathML. It has an introductory chapter, followed by chapters on fundamentals, Presentation MathML, Content MathML, mixing the two types, characters and fonts, the MathML interface, and the document object model (DOM).
- Putting Math on the Web with MathML
This is a brief tutorial, also from W3C, on how to prepare the header information of an XHTML document with embedded MathML so that the document will display properly with various browsers.
- A Gentle Introduction to MathML
This is a very nice tutorial for authors of MathML by Robert Miner of Design Science, one of the co-chairs of the W3C Math Working Group. As with any other language, it's important that you understand the basic structure of MathML, even if you use authoring tools that shield you from much of the code.
Tools
- Amaya
Amaya is the W3C's testbed browser and editor. It's free and open source, and available for Windows, Mac, or Linux. It has a split display that shows the document in one pane and the source code in another. You can edit either pane, so you can choose between WYSIWYG and source editing. Amaya has six palettes of MathML constructs. You simply click on a template in a palette and fill in the various boxes. This is easy, but very slow.
- TtM
TtM translates a TeX or LaTeX document into an HTML document with embedded Presentation MathML. The Linux version is free; the Windows version costs $39.95. The website also has an interactive page where you can input small bits of TeX or LaTeX and see the MathML output.
- ASCIIMathML
ASCIIMathML is based on JavaScript. The script processes an HTML document with math encoded in TeX or in a simple calculator-like language. The resulting HTML document with embedded Presentation MathML is then displayed in the browser. The source document simply has a link to the ASCIIMathML JavaScript file. ASCIIMathML is free and open source.
- MathType
MathType is a WYSIWYG math editor from Design Science, and is an enhanced (and much improved) version of the Equation Editor that comes with Microsoft Word 2003 and earlier. (But not Word 2007, which has native support for math with its own input language; see the Word page for more details.) Math expressions are constructed by clicking on templates in palettes and then filling in the boxes. MathType can save mathematical expressions as GIF images, as LaTeX markup, or as Presentation MathML markup. It's very easy to use, but like all WYSIWYG tools, too slow for serious mathematics. The academic version of MathType costs $57. You can download a free 30-day evaluation copy.
- WebEQ
WebEQ is a suite of tools, also from Design Science, for authoring MathML. WebEQ Editor is a WYSIWYG tool, similar to MathType. WebEQ Publisher is a translator that takes as input an HTML document with mathematics in a LaTeX-like language called WebTeX, and produces as output an HTML document with embedded Presentation MathML. Thus, WebEQ Publisher is similar in concept to TtM. The academic version costs $57. You can download a free 30-day evaluation copy.
- Content Pseudo-TeX Translator
Our colleague Tom Leathrum has written a translator that takes his own version of LaTeX as input and produces Content MathML as output. The tool is in the form of an interactive web page with input and output text boxes. This is the only tool that I know of that produces Content MathML.
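To make the ASCIIMathML workflow from the list above concrete, a source page might look like the sketch below. It assumes that the script has been saved as ASCIIMathML.js next to the page and that math is delimited by backquotes, which is the convention as we recall it from the ASCIIMathML documentation. Check both against the version you download.

```xml
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>ASCIIMathML test page</title>
    <!-- Assumed file name and location of the script -->
    <script type="text/javascript" src="ASCIIMathML.js"></script>
  </head>
  <body>
    <!-- The text between backquotes is translated to Presentation MathML -->
    <p>The roots are `x = (-b +- sqrt(b^2 - 4ac))/(2a)`.</p>
  </body>
</html>
```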
What's New in MathML 3
- Closer alignment of Content MathML with OpenMath
OpenMath is another extensible standard for representing the semantics of mathematical objects.
- Support for elementary school math
Markup is being designed for the display of the division algorithm (which varies widely from country to country), for vertical alignment of columns for other arithmetic operations, etc.
- MathML for CSS
This is a specification for how a subset of Presentation MathML could be rendered with CSS (Cascading Style Sheets), version 2 or 3. This will provide an alternative way to render MathML, in addition to native support (Firefox) and plug-ins (Internet Explorer). It will allow rendering of MathML in Opera, for example, which has superb CSS support. It may also lead to more consistent rendering of MathML across browsers.
- Standard linear input language
This would be a standard, TeX-like input language that could be faithfully translated into Presentation MathML. Hopefully, this would be supported by tools (WebEQ, ASCIIMathML, etc.), so that authors would not be tied to a particular tool.
--Kyle 15:57, 14 July 2007 (EDT)
