Pandoc support #18

Closed
abchugh opened this issue Feb 7, 2020 · 8 comments

Comments

@abchugh

abchugh commented Feb 7, 2020

Firstly, thank you so much for this awesome plugin. It powers large parts of my new non-profit.

This is a feature request to make this plugin compatible with pandoc (the best and most widely known open-source document converter).

A large portion of our project is converting LaTeX documents into Markdown so that they can be used on our website (e.g., see bit.ly/sphz-art). To do this, we use pandoc to convert LaTeX to Markdown and markdown-it-texmath for rendering.

However, there are a couple of major issues when rendering pandoc output:

  1. Inline block LaTeX - pandoc allows block (display) formulas inline (they don't need to be in a new paragraph) and renders them correctly if requested
  2. TeX spanning multiple lines

For example, both of these issues show up in the following text generated by pandoc:

The upper bound is $${\ell({x^n})} \leq |n| \, {\ell({x})}$$ for all $n \in 
{\mathbb{Z}}$. 

I would be really glad if you could provide support for this.

@goessner
Owner

goessner commented Apr 5, 2020

@abhishekchugh: Your request seems to be quite similar to kramdown syntax, which markdown-it-texmath already supports.

So I added pandoc syntax as described by you as a (beta) feature to version 0.6.5.

Please have a closer look into that pandoc usage and test it.

Thanks

@goessner
Owner

goessner commented Apr 5, 2020

Closing ... please reopen in case of pandoc usage problems.

goessner closed this as completed on Apr 5, 2020
@abchugh
Author

abchugh commented Apr 6, 2020

@goessner: Thank you so much for working on this request. The new code identified '$$-quoted' LaTeX strings correctly. There are a couple of minor issues that can be easily resolved:

  1. The new inline block (with '$$' as tag) should be 'math_block' instead of 'math_inline'
  2. I didn't see support for TeX overflowing into multiple lines. As a hack, I removed the '\r\n' from the rex of 'math_inline', like so:

'rex: /\$(\S[^$]*?[^\s\\]{1}?)\$/gy,'

I don't know if this introduces any other bugs, but I didn't find any.
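
A minimal sketch of what this change does at the regex level, using only the two patterns quoted in this thread. The helper `matchAt` and the sample text are illustrative assumptions, not part of the plugin, and the plugin's `pre`/`post` checks are ignored here.

```javascript
// Inline rule whose character class excludes \r and \n (single-line only).
const singleLine = /\$(\S[^$\r\n]*?[^\s\\]{1}?)\$/gy;
// The hack described above: \r\n removed, so line breaks are allowed.
const multiLine = /\$(\S[^$]*?[^\s\\]{1}?)\$/gy;

// Hypothetical helper: try a sticky regex at one position and
// return the captured formula, or null if it does not match there.
function matchAt(rex, text, pos) {
  rex.lastIndex = pos;
  const m = rex.exec(text);
  return m ? m[1] : null;
}

// Pandoc-style output where the inline formula wraps onto a second line.
const text = 'for all $n \\in\n{\\mathbb{Z}}$.';
const start = text.indexOf('$');

console.log(matchAt(singleLine, text, start)); // null: the line break blocks the match
console.log(JSON.stringify(matchAt(multiLine, text, start)));
```

Both patterns behave the same on formulas that stay on one line, so the difference should only show up for wrapped formulas like the one above.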

Also, you may want to fix this for all delimiter types (not just 'pandoc'), because the current behavior makes the plugin inconsistent with pandoc's general philosophy of ignoring single line breaks.

Expected input and output example:
[Image: Screenshot from 2020-04-06 10-16-56]

@goessner
Owner

goessner commented Apr 6, 2020

Unfortunately, it only works by accident. A closer look at the pandoc inline rules shows that only the $...$ rule is activated. So ...

  • I need to investigate why two $s are matched.
  • I also need to investigate if and when those \n\r's are needed.
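
One regex-level observation that may be relevant here, offered as a sketch rather than a confirmed diagnosis of the plugin's behavior: the single-dollar pattern quoted in this thread starts its capture group with `\S`, and `\S` also accepts a `$`. The snippet below runs that pattern alone (without the plugin's `pre`/`post` checks) against the pandoc output from the original report.

```javascript
// The single-dollar inline pattern quoted in this thread.
const inlineDollar = /\$(\S[^$\r\n]*?[^\s\\]{1}?)\$/gy;

// Pandoc output from the original report, with display math inline.
const source = '$${\\ell({x^n})} \\leq |n| \\, {\\ell({x})}$$ for all';

inlineDollar.lastIndex = 0;
const hit = inlineDollar.exec(source);

// \S accepts any non-whitespace character, including a second "$",
// so the match swallows "$$" at the start and stops at the first closing "$".
console.log(hit[0]);
console.log(hit[1]);
```

In this isolated test the captured group begins with a `$`, and the last of the four dollar signs is left unmatched. Whether this is what happens inside markdown-it-texmath depends on rule order and the `pre`/`post` hooks, which this sketch does not model.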

I forgot what exactly prevents $...$ and $$...$$ from coexisting in inline mode.
Investigating this effect will take much more time than I have at present.

BTW: Forcing an inline $$...$$ rule to produce a display-mode formula would be no problem.

I would keep the pandoc delimiters in the list as a beta implementation. You only need to decide whether to ...

  • keep $$...$$ in inline mode, which would result in kramdown mode.
  • keep $...$ in inline mode, which would result in dollar mode.

I would like to eliminate those \n\r's then, as you are suggesting, for testing.

Taking the second alternative, combined with a simple preprocessing step on your side that changes $$...$$ to \n\n$$...$$\n\n, would be an interim solution and should be no problem.
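
A minimal sketch of such a preprocessing step. The function name `isolateDisplayMath` is hypothetical and not part of markdown-it-texmath. It assumes formulas contain no `$` characters and does not handle escaped dollars (`\$`) or `$$` inside code spans.

```javascript
// Hypothetical helper (not part of markdown-it-texmath): put every
// $$...$$ formula into its own paragraph before handing the text to markdown-it.
function isolateDisplayMath(markdown) {
  // [^$]+? also spans line breaks, so multi-line formulas are covered.
  // A replacer function is used so "$" in the output is not treated
  // as a special replacement pattern.
  return markdown.replace(/\$\$([^$]+?)\$\$/g,
    (_match, tex) => '\n\n$$' + tex + '$$\n\n');
}

const input = 'The upper bound is $$a \\leq b$$ for all $n$.';
console.log(isolateDisplayMath(input));
```

Single-dollar formulas such as `$n$` are left untouched, so the dollar rules can still treat them as inline math.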

I need to publish Version 0.6.6 in time, so ... what do you think?

@abchugh
Author

abchugh commented Apr 6, 2020

Stefan, I copy the plugin code from this repository and modify it slightly (because I need to change the rendering to create a custom Angular plugin). So I have created the custom delimiters I needed (pasted at the end of this comment). If I find any issues, I will use your preprocessing suggestion.

Having said that, pandoc support will make this plugin more widely used, since pandoc is extremely popular and multi-line support feels quite important when you are working with large formulas.

In any case, I am grateful for this work - I just can't wrap my head around how markdown-it plugins work and the authors refuse to write better documentation. Feel free to come back to this if you think this is important and you have time to spend on this.

custom_pandoc: {
        inline: [
            {   name: 'math_block',
                rex: /\${2}([^$\r\n]*?)\${2}/gy,
                tmpl: '<eq>$1</eq>',
                tag: '$$'
            },
            {   name: 'math_inline',
                rex: /\$(\S[^$]*?[^\s\\]{1}?)\$/gy,
                // rex: /\$(\S[^$\r\n]*?[^\s\\]{1}?)\$/gy,
                tmpl: '<eq>$1</eq>',
                tag: '$',
                pre: texmath.$_pre,
                post: texmath.$_post
            },
            {   name: 'math_single',
                rex: /\$([^$\s\\]{1}?)\$/gy,
                tmpl: '<eq>$1</eq>',
                tag: '$',
                pre: texmath.$_pre,
                post: texmath.$_post
            }
        ],
        block: [
            {   name: 'math_block_eqno',
                rex: /\${2}([^$]*?)\${2}\s*?\(([^)$\r\n]+?)\)/gmy,
                tmpl: '<section class="eqno"><eqn>$1</eqn><span>($2)</span></section>',
                tag: '$$'
            },
            {   name: 'math_block',
                rex: /\${2}([^$]*?)\${2}/gmy,
                tmpl: '<section><eqn>$1</eqn></section>',
                tag: '$$'
            }
        ]
    },

@goessner
Owner

goessner commented Apr 7, 2020

ok ... simply change the first inline rule from

            {   name: 'math_block',
                rex: /\${2}([^$\r\n]*?)\${2}/gy,
                tmpl: '<eq>$1</eq>',
                tag: '$$'
            },

to the following, in order to get a display-mode formula:

            {   name: 'math_block',
                rex: /\${2}([^$\r\n]*?)\${2}/gy,
                tmpl: '<section><eqn>$1</eqn></section>',
                tag: '$$'
            },

and you will see that it is not invoked, because the second rule matches ...

good luck

@goessner
Owner

goessner commented Jun 14, 2020

@abchugh: I finally found some time and successfully implemented your requested features ... at least I think so. Can you please test it in your environment?

At the same time, I removed the texmath.rules['pandoc'] entry. I also like that behavior a lot, so I enhanced the texmath.rules['dollars'] rules accordingly. If there are reasons to keep the pandoc namespace, you might simply specify

texmath.rules['pandoc'] = texmath.rules['dollars'];

somewhere.

thanks

@abchugh
Author

abchugh commented Jul 13, 2020

@goessner I finally got some time to update my code to the latest version. The inline block mode worked quite well. Did you add support for inline TeX overflowing onto multiple lines? I couldn't get that working. It could be a mistake on my end, too.
