Semantic Latex


Semantic Latex

Tim Daly
There has been an effort in the past to extract mathematics from
LaTeX. It seems that the usual LaTeX markup does not carry enough
semantic information to disambiguate expressions.

Axiom occasionally has a similar problem: the interpreter
tries to guess types, while the compiler insists on explicit
type specifications.

Axiom provides an abbreviation for each type, such as FRAC for
Fraction and INT for Integer.
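
In the Axiom interpreter the abbreviation and the full name are
interchangeable in type expressions, for example:

    -- the abbreviated and spelled-out forms declare the same type
    1/2 :: FRAC INT
    1/2 :: Fraction Integer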

Might it be possible to create LaTeX macros that take advantage
of this to provide unambiguous markup? For instance, instead of

\frac{3x+b}{2x}

we might have a latex markup of

\FRAC[\INT]{3x+b}{2x}

where there is a LaTeX macro for each Axiom type. These macros
would be collected into a LaTeX package, say \usepackage{AxiomType}.
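
A minimal sketch of what the preamble of such a package might
contain (the macro bodies here are illustrative assumptions, not
an existing package):

    % hypothetical AxiomType macros: the optional argument carries
    % the Axiom type and is simply dropped when typesetting
    \newcommand{\INT}{\mathrm{INT}}             % type tag for Integer
    \newcommand{\FRAC}[3][\INT]{\frac{#2}{#3}}  % Fraction over type #1
    % \FRAC[\INT]{3x+b}{2x} now typesets exactly as \frac{3x+b}{2x}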

There would be a map from the \FRAC[\INT] form to the \frac
form, which seems reasonably easy to do in LaTeX. There would
also be a parser that maps \FRAC[\INT] to the Axiom input syntax.
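
On the Axiom side, the parse target for the example above might
look something like this (a sketch; the coercion target is an
assumption, since the numerator is really a polynomial rather
than a bare integer):

    -- what the parser might emit for \FRAC[\INT]{3x+b}{2x}
    (3*x + b)/(2*x) :: FRAC POLY INT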

The problem would be to take the NIST Math Handbook sources
(is the LaTeX available?) and decorate them with this additional
markup so that they parse to valid Axiom input and remain valid
LaTeX input (which they already are, but this would validate the
mapping back to LaTeX).

Comments?

Tim



Re: Semantic Latex

James Davenport
Indeed, the semantics of LaTeX is pretty weak. I REALLY wouldn't like to start from there - even (good) MathML-P, with &InvisibleTimes; etc., is much better.
However, LaTeX is what we have, and what we are likely to have in the near future, so we must live with it, and yours seems like as good an accommodation as any.
Another problem is that mathematicians do not mean what they write: $\frac{x+1}2$ is logically an element of Z(x), but the mathematician probably intended Q[x].
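
Spelled out, the two readings of that one formula (an editorial
illustration):

    % read literally, a quotient of integer polynomials, i.e. in Z(x):
    \frac{x+1}{2}
    % read as intended, a polynomial with rational coefficients, in Q[x]:
    \frac{1}{2}x + \frac{1}{2}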
James

Sent from my iPhone

> On 10 Aug 2016, at 11:50, "Tim Daly" <[hidden email]> wrote:
> [Tim's original message, quoted in full above]


Re: Semantic Latex

Richard Fateman
On 8/14/2016 11:05 AM, James Davenport wrote:
> Indeed, the semantics of LaTeX is pretty weak. I REALLY wouldn't like to start from there - even (good) MathML-P, with &InvisibleTimes; etc. is much better.
> However, LaTeX is what we have, and what we are likely to have in the near future, so we must live with it, and yours seems like as good an accommodation as any.
Face it: mathematicians do it all the time. They read journal
articles with no more information than the position of glyphs on
a piece of paper.

There are poorly printed papers and reference books in which it is
impossible to be sure what the glyphs are (especially books printed
on crude paper in Moscow...).

And there are poorly written papers which cannot be read in isolation --
the authors (and reviewers, editors) have so much absorbed the context of
their field that they neglect to define their peculiar notation.

Nevertheless, that's what the literature looks like.

When I was trying to scan Gradshteyn & Ryzhik or similar books, we
stumbled over it page by page. I recall finding a place where we
figured out what the typeset integration result was by trying out
our various semantic opinions and differentiating.

Talking with run-of-the-mill professional academic applied
mathematicians is sometimes revealing. At one demonstration (in
Essen, Germany, at a conference on "Retrodigitalization" of
mathematics -- it may sound better in German), a program of mine
read in a page or two from Archiv der Mathematik and spit it out --
but with a modern font, in two columns, and with other changes too.
The mathematicians in the audience were thunderstruck, because they
thought that the program must have understood the mathematics to
make that kind of transformation.

Of course, all it did was guess at the appropriate TeX to produce
the equivalent spacing, and "knew" nothing of the semantics.

Actually, I was amazed by the result when I saw it, for two reasons:
  (a) someone else had actually used my program;
  (b) there were no errors.

[The reason for (b) is that the recognition program had been trained
on exactly -- maybe only -- that page; it was trained so that
defective/broken/linked characters were mapped to the right answers.]

But the point remains that if we wrote a program as smart as
(the collection of...) the smartest human mathematicians, then TeX
would be enough semantics.





>  
> Another problem is that mathematicians do not mean what they write: $\frac{x+1}2$ is logically an element of Z(x), but the mathematician probably intended Q[x].
I think that most people using DLMF.nist.gov  would not know or care.
It's not their part of mathematics.

It is probably unfortunate if Axiom (or OpenMath or MathML) cares and
consequently requires such users to know.

RJF

> James
>
>> On 10 Aug 2016, at 11:50, "Tim Daly" <[hidden email]> wrote:
>> [Tim's original message, quoted in full above]



Re: Semantic Latex

Tim Daly

>>Another problem is that mathematicians do not mean what they write:
>> $\frac{x+1}2$ is logically an element of Z(x), but the mathematician probably intended Q[x].

>I think that most people using DLMF.nist.gov  would not know or care.
> It's not their part of mathematics

I think this is the fundamental change that Axiom (and computational mathematics
in general) adds to the question. I've previously called this the "provisos" question.
What are the conditions on a formula for it to be valid and applicable?

What "semantic latex" would introduce is explicit conditions on the formulas such as
"Fraction Integer" or "Complex", as well as conditions on the analytic continuations
for branch cuts.
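
As a sketch of what a proviso-carrying formula might look like (the
\proviso macro is hypothetical; amsmath/amssymb assumed), using a
classic branch-cut example:

    % hypothetical markup: a formula paired with its condition of validity
    \newcommand{\proviso}[2]{#1 \qquad \text{(#2)}}
    % valid over the positive reals:
    \proviso{\log(xy) = \log x + \log y}{$x, y > 0$}
    % over C the principal branch needs a correction term:
    \proviso{\log(xy) = \log x + \log y + 2\pi i k,\ k \in \{-1,0,1\}}
            {$x, y \in \mathbb{C}\setminus\{0\}$}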

The likely result would be that one formula in G&R might turn into several because
a simplification that is available over C might not be available over R.

This adds to the size of the tables but makes their use in computational
mathematics much easier. Suppose that 3.7.14 were split, based on provisos,
into one formula valid over R and one valid over C (based on a simplification).

A "computational mathematics G&R" would be quite useful, possibly leading to a
normalization of assumptions made by the various systems....as in, "Oh, we were
using 3.7.14a (valid only over R) and you were using 3.7.14b (valid over C)"

It would also highlight research questions, as in "we can handle integration of
3.7.14a but not 3.7.14b".


On Sun, Aug 14, 2016 at 7:05 PM, Richard Fateman <[hidden email]> wrote:
[Richard Fateman's message, quoted in full above]

_______________________________________________
Axiom-developer mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/axiom-developer