Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Properly incorporated pre_embedimages into EmbedHTMLExporter #1113

Closed
wants to merge 16 commits into from

Conversation

gabyx
Copy link
Contributor

@gabyx gabyx commented Sep 26, 2017

I have incoporated the changes in the pull request #1067 into the pull request #1052
Once #1052 is merged, I will sync and we can discuss this here:

So far I hooked the pre_embedimages.py preprocesser into the EmbedHTMLExporter (embedhtml.py) such that :

  • First pre_embedimages.py in Preprocessor to embed markdown images #1067 embeds all markdown images ([](/path/to/image.png)) as attachments with base64 to make a standalone notebook.
  • Another preprocessor in embedhtml.py gives all attachements a unique id
    this is necessary since we can have same attachment names in different cells
    and after the HTMLExporter has run, we are doomed since we don't know which <img src:"attachement:name"/> map to which attachements. So we save all unique attachements in the resources which we then use again in the next step:
  • EmbedHTMLExporter runs which converts all <img src="..."> tags to base64 embeded tags (attachements and all urls) by using resources["unique-attachements"]

I think this is pretty neat :-).
I am not sure about the more hacky unique id stuff... but was the only way I could solve it now.

Important Test Notebook:
EmbedImages.zip

bildschirmfoto 2017-09-26 um 23 32 45

@gabyx
Copy link
Contributor Author

gabyx commented Oct 11, 2017

some linting and python2.7 issues needs fixing...

@jcb91
Copy link
Member

jcb91 commented Oct 11, 2017

Once #1052 is merged, I will sync and we can discuss this here

#1052 was already merged, 2017-10-05, at 10:52:23 GMT.

@jcb91
Copy link
Member

jcb91 commented Oct 11, 2017

Did you mean once #1067 is merged?

@gabyx
Copy link
Contributor Author

gabyx commented Oct 11, 2017

No, the pull request I opened before the merge of #1052
Here I sync the developments on the branch in #1067:
These changes here are improvements to the embedhtml.py
and it only depends on #1067 by (pre_emebedimages.py).
Would be cool if someone could review my idea :-)

@gabyx
Copy link
Contributor Author

gabyx commented Oct 12, 2017

ON linux (except linting for pre_embedimages.py which should be fixed in #1064
it works properly, on windows it fails completey, wtf...? really?
Some strange issue,-> PITA

@jcb91
Copy link
Member

jcb91 commented Oct 13, 2017

Ok, I'm afraid I'm a little confused about the status of this & #1067 & where we're aiming to go with the two. As I currently understand it, the idea seems to be:

For me the first problem with this PR as it stands is that it seems to be based on an older version (2293ab9) of #1067, and need rebasing onto the current version in order for much to make sense...

@gabyx
Copy link
Contributor Author

gabyx commented Oct 13, 2017

sure we can update this of course. can i just merge the branch from #1067 into this request again? I tested this with nbconvert and had no problems with attachements? everything works as expected

@jcb91
Copy link
Member

jcb91 commented Oct 13, 2017

can i just merge the branch from #1067 into this request again?

you could, but it would be neater (I think) to rebase onto it. This (rebasing a published branch) could cause confusion if others have already based work on it, but it hasn't been published long, so I think we can risk that for a little more clarity with what's happening. Does that make sense?

I tested this with nbconvert and had no problems with attachements?

Sure, but that's the point of this PR, isn't it? What I was asking about was whether, if I attempt to export a notebook that has attachments using base nbconvert with no extensions, it actually works (attachment images appear in output)?

@gabyx
Copy link
Contributor Author

gabyx commented Oct 14, 2017 via email

@juhasch
Copy link
Member

juhasch commented Oct 15, 2017

Having the preprocessor allows generating notebooks with all images included, which is helpful when exchanging notebooks. That said, people use <img> tags, too. This is required because the current markdown flavor does not provide a way to center or size images. You can't embed those images in the preprocessor. Also, using <img> tags breaks non-html exports...

So exporting is a mess, and the best way I see is to have a preprocessor and an exporter.

@jcb91 nbconvert seems to ignore attachments right now. The extractoutput preprocessor only looks at output data right now.

I would prefer two individual PRs: one for the preprocessor and one for the exporter. As the feature set of both things is independent, this makes things easier, IMO.

@jcb91
Copy link
Member

jcb91 commented Oct 16, 2017

So exporting is a mess, and the best way I see is to have a preprocessor and an exporter.

👍 Ok, I think I follow that, and it seems like a sensible solution.

@jcb91 nbconvert seems to ignore attachments right now. The extractoutput preprocessor only looks at output data right now.

Ok, that did (does!) seem strange, but at least I'm not missing something obvious! I guess the notebook format has jumped without nbconvert having time to catch up 🤦‍♂️ Would it be worth adding somethign like #1067 as a PR to nbconvert itself?

I would prefer two individual PRs: one for the preprocessor and one for the exporter. As the feature set of both things is independent, this makes things easier, IMO.

That would seem reasonable, except that the feature-set doesn't quite seem independent - it makes sense to be running the preprocessor for the exporter, no? Otherwise don't we miss out any attachments in the markdown images? But yeah, it seems that #1067 is the one to be working on for now, at any rate

@gabyx
Copy link
Contributor Author

gabyx commented Oct 16, 2017

That would seem reasonable, except that the feature-set doesn't quite seem independent - it makes sense to be running the preprocessor for the exporter, no?

I strongly think so too, they are not really independent, but should be viewed as combined in a single exporting pipeline, like running the preoproc. first and afterwards the exporter...

Just to be clear, this PR's idea is exactly to run the pre_processor (automatically) before the exporter, thats all it does. And so far (also not with #1067) we have no solution which properly exports everything to html like the above attached notebook file, this PR covers this:
https://github.com/ipython-contrib/jupyter_contrib_nbextensions/files/1335074/EmbedImages.zip

@gabyx
Copy link
Contributor Author

gabyx commented Oct 25, 2017

Configuration options are now available: jupyter/notebook#2413
So we will be able to upload a config json for example to adjust image conversion =)

@mpacer
Copy link

mpacer commented Oct 25, 2017

well… it's not merged yet. And I have a huge backlog of issues that various people want fixed before I can get to merging it… and it gets set back every time another change is made to the nbconvert handlers 😭 , which we'll need to do to get jupyter/notebook#2974 fixed

@juhasch
Copy link
Member

juhasch commented Feb 3, 2018

@gabyx Would you like to give this another go ?
Let's get it merged here, and eventually get it into nbconvert later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants