Preprocessor to embed markdown images #1067

juhasch · 2017-08-25T12:44:58Z

Images in markdown ![](myimage.png) will be added as attachments to the cell.

TODO

Test
Documentation

soamaven · 2017-08-25T18:39:52Z

Just to be explicit, looks like this only embeds MD referenced images. HTML embedding still needs to be done. Glad to see the progress though :D

soamaven · 2017-08-26T06:19:25Z

For some reason, I can't get attachments working with my Jupyter notebook (v5.0.0), so I can't test some things.

Is it possible to refer to an attachment in the HTML? E.g.:
<img src="data:image/png;base64,a325bd..." width="50%">
becomes
<img src="data:image/png;base64,attachment:myimage.png" width="50%">

If so, that makes embedded images very nice to deal with in MD cells, and gives full HTML control over display.

juhasch · 2017-08-26T07:35:33Z

Attachments only work with markdown images. If you use base64 images in html img tags, they will be sanitized by the notebook.

So it will need to be a two-step solution:

1. A preprocessor to embed markdown images as attachments. This allows
   jupyter nbconvert --to notebook mynotebook.ipynb --EmbedImagesPreprocessor.embed_images=True
   to create a copy of the notebook with embedded images.
2. An HTML exporter that will embed all images.
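As a rough illustration of the first step, here is a minimal sketch of how a markdown image reference could be turned into a cell attachment. It is not the code from this PR: the function name `embed_markdown_images`, the `read_bytes` callable, and the regex are assumptions made for the example. It relies only on the notebook format's convention that a cell's `attachments` field maps a name to a `{mime type: base64 data}` dict and that the source refers to it as `attachment:<name>`.

```python
import base64
import mimetypes
import re

# Hypothetical pattern for markdown images such as ![alt](path "optional title").
MD_IMAGE = re.compile(r'!\[(?P<alt>[^\]]*)\]\((?P<url>[^)\s]+)(?P<rest>[^)]*)\)')


def embed_markdown_images(cell, read_bytes):
    """Return a copy of a markdown cell dict with local images stored as attachments.

    `read_bytes` is an assumed callable that maps an image path to its raw bytes.
    """
    attachments = dict(cell.get("attachments", {}))

    def replace(match):
        url = match.group("url")
        # Keep existing attachments, data URIs and remote images untouched.
        if url.startswith(("attachment:", "data:", "http://", "https://")):
            return match.group(0)
        name = url.rsplit("/", 1)[-1]
        mime = mimetypes.guess_type(name)[0] or "application/octet-stream"
        data = base64.b64encode(read_bytes(url)).decode("ascii")
        attachments[name] = {mime: data}
        return "![{}](attachment:{}{})".format(
            match.group("alt"), name, match.group("rest"))

    source = MD_IMAGE.sub(replace, cell["source"])
    # Build a new dict so the input cell is not modified.
    return dict(cell, source=source, attachments=attachments)
```

Passing the file reader in as a parameter keeps the sketch independent of where the notebook lives on disk; a real preprocessor would resolve paths relative to the notebook.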

soamaven · 2017-08-26T17:53:47Z

> If you use base64 images in html img tags, they will be sanitized by the notebook.

Thanks, this is an important detail. When does this happen? Upon render and kernel startup?

This PR #1067 helps to embed MD images -- the htmlexporter.py didn't do this.

What I am looking to achieve is:

1. Embed HTML and/or markdown images before export.
2. Export to slides. I want to have one presentation file I can send to collaborators that works in their browser, much like a PowerPoint presentation is one file.

It seems like step 1 is not possible for HTML base64 images because of sanitization, which means step 2 will fail. Is the only option to re-implement the --to slides exporter so that it also embeds HTML images?

juhasch · 2017-08-26T18:36:07Z

I believe you need a new exporter to embed images and convert to slides.

Maybe @mpacer can share his thoughts on what a good approach would be?

gabyx · 2017-08-30T19:05:38Z

So would the goal for the export_embedded extension be the following?

"Download Embedded HTML"

Call the embedhtml.py exporter, which runs this preprocessor first (is it possible to hook that up?), then replaces all remaining image tags and outputs .html.

"Download Embedded Notebook"

Call the notebook exporter with your preprocessor (which should also handle images inside Markdown and store them as attachments).

So the path would be: first transform the notebook into a new notebook in which all images are embedded, then export to HTML.

For the HTML export we could then use the default HTML exporter.
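To make "replaces all remaining image tags" concrete, here is a minimal sketch of rewriting `<img src="...">` references into base64 data URIs in exported HTML. It is an illustration only, not the embedhtml.py implementation: the function name `embed_html_images`, the `read_bytes` callable, and the simple regex are assumptions, and a real exporter would likely want a proper HTML parser.

```python
import base64
import mimetypes
import re

# Hypothetical pattern: captures the part before the src value, the value, and the closing quote.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)


def embed_html_images(html, read_bytes):
    """Replace local <img src="..."> references with base64 data URIs.

    `read_bytes` is an assumed callable that maps an image path to its raw bytes.
    """
    def replace(match):
        src = match.group(2)
        # Already embedded or remote: return the matched text unchanged.
        if src.startswith(("data:", "http://", "https://")):
            return match.group(0)
        mime = mimetypes.guess_type(src)[0] or "application/octet-stream"
        payload = base64.b64encode(read_bytes(src)).decode("ascii")
        return "{}data:{};base64,{}{}".format(
            match.group(1), mime, payload, match.group(3))

    return IMG_SRC.sub(replace, html)
```

Because this runs on the exported HTML rather than on the notebook, the notebook's sanitization of base64 `img` tags does not come into play.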

gabyx · 2017-08-30T20:35:09Z

@juhasch Could you please have a look at my branch, where I tried to use your preprocessor inside embed_html? It is called but somehow does not work: I get a KeyError in self.attachements. Do you know why?
Link to Commit

juhasch · 2017-08-31T05:49:18Z

Actually, all you need to do is copy the html_embed exporter and replace the base class:

class EmbedSlidesExporter(SlidesExporter):

And instantiate your exporter in the from_notebook_node function accordingly:

    def from_notebook_node(self, nb, resources=None, **kw):
        output, resources = super(
            EmbedSlidesExporter, self).from_notebook_node(nb, resources)
...

Now, what I don't like is that we now have the code for embedding twice, so we should find a solution for that.

You don't need the preprocessor. It is useful if you want to stay in higher-level formats, i.e. notebook to notebook conversion.

gabyx · 2017-09-26T21:13:27Z

Bugfix:
bugfix.patch.zip

match.string does not do what the code expects: match.group(0) is the whole match, whereas
match.string is, surprisingly, the whole line (the string passed to the regex).

jcb91 · 2017-10-13T13:01:37Z

docs/source/exporting.rst

+Embedding images in notebooks
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. autoclass:: EmbedPreprocessor


I think this should be EmbedImagesPreprocessor, no?

Would say so too, it's more explicit :-)

> it's more explicit

But more importantly, at the moment I don't think EmbedPreprocessor is an extant class, so I don't think it'll produce anything at all...

jcb91 · 2017-10-16T15:34:40Z

So where is this sat, at the moment? @gabyx are you saying there's a bug in the current implementation? Can you give a little more explanation as to what that is, and maybe an example of how to trigger it?

gabyx · 2017-10-16T18:33:57Z

src/jupyter_contrib_nbextensions/nbconvert_support/pre_embedimages.py

+            else:
+                return match.string
+        elif url.startswith('attachment'):
+            return match.string


[asd](attachment:asd.jpg) # this comment...
will result here in
[asd](attachment:asd.jpg) # this comment... # this comment...
It should be match.group(0).

match.string is not what should be returned if we do not want to substitute (no?).
match.string is the whole string passed to match/search: https://docs.python.org/3/library/re.html#re.match.string
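The difference can be reproduced with a small standalone snippet. The pattern below is a simplified stand-in, not the regex used in pre_embedimages.py; the point is only how a replacement callback behaves when it returns `match.string` versus `match.group(0)`.

```python
import re

# Simplified stand-in pattern for a markdown link/image target.
MD_LINK = re.compile(r'\[(.*?)\]\((.+?)\)')

line = "[asd](attachment:asd.jpg) # this comment..."

# match.string is the entire input, so the whole line gets pasted in
# place of the match and the trailing text ends up duplicated.
wrong = MD_LINK.sub(lambda m: m.string, line)

# match.group(0) is only the matched text, so the line is left unchanged.
right = MD_LINK.sub(lambda m: m.group(0), line)

print(wrong)  # [asd](attachment:asd.jpg) # this comment... # this comment...
print(right)  # [asd](attachment:asd.jpg) # this comment...
```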

gabyx · 2017-10-16T18:34:09Z

src/jupyter_contrib_nbextensions/nbconvert_support/pre_embedimages.py

+            if self.embed_remote_images:
+                data = urlopen(url).read()
+            else:
+                return match.string


Ditto: same issue as in my other comment on this file.

mpacer · 2017-10-17T01:05:33Z

If this is getting a full EmbedPreprocessor working that is compatible with the existing HTMLExporter and SlidesExporter (and ideally the MarkdownExporter), I'd be interested to restart the conversation on how that would fit into nbconvert core.
As far as needing a separate exporter goes: if the only difference is enabling the EmbedPreprocessor, you might want to wait for "[WIP] Download converted documents with uploadable configuration" (jupyter/notebook#2413) to land. It creates a new endpoint that you can use to include a config file (JSON), and nbconvert will attach that config file to the exporter, meaning you could set this just by having prespecified config files associated with the extension.

Preprocessor to embed markdown images

7e7c4c0

juhasch mentioned this pull request Aug 25, 2017

Preprocessor for embedding images #1064

Open

Keep existing attachments

2293ab9

soamaven approved these changes Aug 25, 2017

gabyx mentioned this pull request Aug 31, 2017

Changed regex parsing in embedhtml.py to XML parsing with lxml, regex… #1052

Merged

juhasch added 9 commits September 16, 2017 13:51

Add test for EmbedPreprocessor

9811acb

Add docstring to docs

6076f06

Fix Python2 failure

d445f5d

Make Travis lint happy

12d7a74

Add image resizing option

0d68281

Add image resizing option

fd30aaf

Make lint happy

019772c

Make isort happy

c4b75ee

Move optional resizing into own function

9f54d50

gabyx mentioned this pull request Sep 26, 2017

Convert notebook to Static HTML -- Markdown cells with html image references not viewable jupyter/nbconvert#328

Closed

gabyx mentioned this pull request Sep 26, 2017

Properly incorporated pre_embedimages into EmbedHTMLExporter #1113

Closed

jcb91 reviewed Oct 13, 2017

gabyx reviewed Oct 16, 2017

Jürgen Hasch added 2 commits February 17, 2018 11:51

Address comments in code review

502af43

Merge branch 'master' into feature/pre_embedimages

ed3e91e

juhasch merged commit 86050e8 into ipython-contrib:master Feb 17, 2018

juhasch deleted the feature/pre_embedimages branch February 17, 2018 12:25

juhasch mentioned this pull request Feb 23, 2018

Export embedded images to html #1248

Open
