Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better way to add optional extension to specified template file #491

Merged
merged 8 commits into from
Jan 6, 2017

Conversation

takluyver
Copy link
Member

Using a custom Jinja loader which wraps the FileSystemLoader. This means we only add the file extension for the default loader, and assume anyone who is providing extra loaders knows what they're doing.

Closes gh-490

@takluyver takluyver added this to the 5.1 milestone Dec 16, 2016
@mpacer
Copy link
Member

mpacer commented Dec 16, 2016

Will this solution break on files ending with .tplx?

@takluyver
Copy link
Member Author

I don't think so. Nothing I did was specific to any particular ending.

"""
def __init__(self, loader, extension):
self.loader = loader
self.extension = extension
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if you change this to

 def __init__(self, loader, extension_list):
      self.loader = loader
      self.extension_list = extension_list 

then you could accommodate (below) something that is similar to:

if any([template.endswith(ext) for ext in self.extension_list]):
    raise
return self.loader.get_source(environment, template+self.extension_list[0])

which would default to returning .tpl

and then later on you'd need to do something like:

template_extension_list = List(['.tpl','tplx']).tag(config=True, affects_environment=True)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see why now that this is unnecessary. It's handled via the LaTeXExporter, which means that this mechanism will adapt to whichever value to which we set that traitlet.

@mpacer
Copy link
Member

mpacer commented Dec 16, 2016

under template_extension it expects .tpl as the extension rather than as either .tpl or .tplx.

My guess is the reason why this works with only one string is that we just overwrite the Unicode traitlet for LaTeX & pdf export to be .tplx.

Right now we have no instance where we might use some templates with either extension, but is that necessarily the case?

@mpacer
Copy link
Member

mpacer commented Dec 16, 2016

Or alternatively, could we change it so that nbconvert didn't expect templates to have any particular extension? I'm trying to figure out where the extension is actually being used to see if this is possible, and is it just in the code where we're finding and loading the templates that come with nbconvert by default?

It seems there should be an elegant way of handling in memory and on disk templates that doesn't require pretending like in memory templates were passed in with a file extension.

@mpacer
Copy link
Member

mpacer commented Dec 20, 2016

Realized we should have someone else look over the code than @takluyver and I.

@Carreau @minrk should this solution for in memory templates be ok?

@takluyver
Copy link
Member Author

The file extension for Latex templates is different because we use a different template syntax there, to avoid clashing with Latex syntax. So those template files are effectively a different format, and you can't mix and match the two kinds. So I don't think it's necessary for a single loader to support multiple extensions.

If I was writing this code from scratch, I wouldn't bother automatically adding extensions; I think it's clearer what's going on if users have to supply the extension themselves. But since it already works without users providing the extension, I want to keep it working.

It seems there should be an elegant way of handling in memory and on disk templates that doesn't require pretending like in memory templates were passed in with a file extension.

That's what this does. We only wrap the FileSystemLoader in this machinery to add extensions, so if you supply an in-memory template as in #490, the code does not deal with extensions at all.

@mpacer
Copy link
Member

mpacer commented Dec 20, 2016

@takluyver Ok, that makes sense. I misunderstood how this was working.

You said it ignores the extension, but it looks like it doesn't ignore the extension, but rather it adds if it's not present in https://github.com/jupyter/nbconvert/pull/491/files#diff-43c2e238630461c0a2c422a0b09fae55R77. But I think that's only going to happen if you use the FileLoader. So this also has a feature improvement (since people can specify templates without their extensions now), right?

The reason why none of that machinery is extension sensitive if you use another loader is because those loaders are just prepended onto the list and thus are operating independently of this change, right?

Do we have separate machinery for interpreting .tplx templates then? It seems to be mostly the same but the only difference is that the LatexExporter is specifying a different template extension in https://github.com/jupyter/nbconvert/blob/master/nbconvert/exporters/latex.py#L43. Is that why it counts as a different format?

Since .tplx are effectively a different format, I added a test specifically for .tplx templates.

@mpacer
Copy link
Member

mpacer commented Dec 21, 2016

I said I was going to be able to send the patch release tomorrow when this gets in and it only needs a few more eyes.

I just broke everything with the last push, and I will be confident that once I can get this expected failure to actually fail, I will understand this well enough to merge it. That may take more than tomorrow as I thought I'd get further than I did tonight. Still made progress though, but I'm going to edit the title to include [WIP].

@mpacer mpacer changed the title Better way to add optional extension to specified template file [WIP] Better way to add optional extension to specified template file Dec 21, 2016
@mpacer
Copy link
Member

mpacer commented Dec 21, 2016

Ok @takluyver so the place where I'm stuck is getting to directly pass the loader that is to be used so that it errs in the correct place.

Figuring out why something is sometimes a Unicode object and therefore not convertible to string.

In the most recent version of test_result.txt, the failure is coming back as a failure, but from the wrong place.

And in previous versions, it was successfully hitting the "if" statement, but failing, now it seems to be passing the if statement but then not raising the error. I find it extremely confusing at this point.

To verify this last bit I'm having it print to the log the status of that conditional evaluation (which is true as of the last push, meaning it should go inside the conditional and then raise the error from there).

@takluyver
Copy link
Member Author

So this also has a feature improvement (since people can specify templates without their extensions now), right?

It was already possible to specify a template file without the extension, because of this code adding the extension. But we were doing that regardless of where the template was coming from. This pushes the extension handling down a level so it only happens when the template is being loaded from the filesystem.

The reason why none of that machinery is extension sensitive if you use another loader is because those loaders are just prepended onto the list and thus are operating independently of this change, right?

Right. In normal usage, we set up just a single FileSystemLoader to get templates from files. That loader is what's being wrapped in the new ExtensionTolerantLoader. In Matthew's case, he is adding a DictLoader to the start of the list. That will be asked for the template before our own loader, and because it's not wrapped in ExtensionTolerantLoader, it will only get the name as it was specified, with no additions.

(If someone wanted the extension-adding behaviour for their own loader, they could also wrap it in ExtensionTolerantLoader. But I'm not planning to document or support it as a public API.)

Do we have separate machinery for interpreting .tplx templates then?

It uses the same machinery, but we tell Jinja2 to look for different markers using this code. Specifying different syntax like this is a documented feature of Jinja2, but I think it's not very widely used.

@@ -0,0 +1,65 @@
============================= test session starts ==============================
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a particular reason you committed the test results? This isn't something we normally do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I know we normally don't do this, but I was trying to allow you to see what I was seeing as I was stuck. The hope was that in seeing the error wherever it was I ended up getting stuck you could suggest a remedy. The idea is to put the test in here for now while writing the code and remove it later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal was to regain one of the advantages of pair coding that is lost when we only push successful changes on Github.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, gotcha. Good idea, but I'd suggest doing that by putting the log in a gist and linking to it. Anything that's committed to the repository ends up in everyone's copy of the git history for ever, unless we fiddle around with rebasing and squashing commits, so we tend not to commit things that we intend to delete again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand what you're saying, but that only considers the commit history as a concise, clean history it has been developed and as it is considered retrospectively. It is not treating version control a tool for working on and thinking about things as they are being developed and recording the sequence of thoughts themselves. Git and GitHub act as both of those tools so we should not reflect only the one and not the other. Right now the emphasis in the project leans heavily toward the concise, retrospective, clean history side of things. This has made learning and understanding the code base substantially more difficult than it would have been if it was a more balanced approach.

Unfortunately, doing it as a separate gist would (at least) triple the overhead needed to do this (including requiring that we explicitly go onto GitHub each time we want to include a step in the testing procedure and need to write a comment in order to include the appropriate link). For this an other reasons detailed in the following paragraphs, I don't think a gist is an appropriate solution.

What was most impressively useful about this was that I could just do this in the course of other work (e.g., by running commit -am after making changes and using py.test nbconvert/exporters/tests/test_templateexporter.py --runxfail > test_result.txt || cat test_result.txt).

Think of it in analogy to a scientific notebook. Suppose you have two labs, Lab 1 and 2. In lab 1, suppose you're working in notebook A. If you need to open a new notebook, file it, and create a direct reference to the new notebook in notebook A each time you want to document partial progress in an experiment. In contrast, suppose there is Lab 2 in which you just include an additional page in notebook A to log work in progress. Assuming that the two labs perform task at roughly the same rate, you're going to see much more logging of progress in Lab 2 than in Lab 1.

But even more so, I think stuff like this can be useful as part of the commit history (and not squashed and rebased!). If I am just looking at the commit history, I would have no way to know that this test was run and posted if it was only mentioned in the PR's top level comments (or worse, comments on the code itself, which could be extremely difficult to find if the code changes).

@Carreau has been specifically encouraging me to logging this sort of incomplete solution and in-progress work (though we've not talked specifically about this). In this conversation there are usually 2 aims: 1. people who are outside the project who want to understand how we arrived at what we arrived at can better grok that if they can follow our reasoning steps and 2. people who are in the project trying to recreate why something is the way it is can do so more easily if they have access to finer grained commits (including commits that record the state of the tests at particular points in the process of developing the code).

I know that if I'd been able to see people's error logs and the changes that they made in response to those logs, it would have sped up my ability to comprehend the code base. This one of the biggest advantages gained from watching someone else work through a problem live. By recording tests in this manner, it makes it possible to do so in retrospect as well.

I'm not saying that we should be doing this all the time, but if we're running into a tricky problem (like the catching errors that stop the "right" error from being thrown) I'm going to keep trying to do this. Allow me to try this selectively for a while and let's look back at the practice in retrospect rather than trying to decide before we have any evidence as to whether the method is or is not effective.

Note, given that you helped figure out what was happening in the other comment, I'm including the use in this case to have been a success.

def test_in_memory_template_failure_to_find(self):
# Loads in an in memory template using jinja2.DictLoader
# creates a class that uses this template with the template_file argument
# converts an empty notebook using this mechanism
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test here doesn't match up to its description; I'm not sure which you intend it to be ;-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's the comment from the code I had previously added and had copied as a basis for this code. Apologies for not having changed it, but I had wanted to get stuff working before I focused on comments.

I hadn't intended this as fully ready for production but was pushing as I was going in hopes of on-going collaboration between the two of us. This is in contrast to pushing only completed solutions that do little to convey how they we arrived at. Assume that the function is supposed to fail and (based additionally on my other commit messages and GitHub comments in which I mentioned what I was trying to do and how it was failing) it should be failing in a way that provides better code coverage. Thus, it should be hitting the raise in the new ExtensionTolerantLoader. It does not seem to be.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going from this commit message:

Now it's happening in the Choice Loader and not the ExtensionTolerantLoader

I think I see the confusion. ChoiceLoader is what wraps the series of loaders to try, so that the user can provide extra_loaders with higher priority than our default loader. ChoiceLoader will try loaders in sequence. If one of those loaders throws TemplateNotFound, ChoiceLoader catches that and moves on to try the next loader. If all loaders fail like that, ChoiceLoader finally raises TNF itself.

So the TNF exception you get from using our TemplateExporter class will always come from ChoiceLoader, because it catches the exceptions from individual loaders.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So since this is hitting the line of code I was aiming at, does that mean that this is good enough? Or should I switch out the ChoiceLoader in the environment so it doesn't catch the error and it gets passed back directly?

My inclination is that we've got the coverage result from a more or less reasonable use case (someone specifying a template in one place, and then thinking they need to add the suffix manually even though that template isn't available on the path when the template without the suffix is), so we should be good to go.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also how would you specify the extra_loaders value if you weren't including it in the definition of the Exporter itself? It looked like it didn't like directly setting it because of it's identity as a traitlet List.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be sufficient to test it from this level, though I think we can simplify the test a bit.

I'm not sure exactly what you're asking about setting extra_loaders. It should work with either of these patterns:

# Either:
exporter = TemplateExporter(extra_loaders=[foo])

# Or:
exporter = TemplateExporter()
exporter.extra_loaders = [foo]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second exporter.extra_loaders = [foo] was giving me an error, I can't access it anymore (because squash) and it's not giving me an error here, so I must have been doing something differently and incorrectly.

@mpacer
Copy link
Member

mpacer commented Dec 21, 2016

Ok I'm at an impasse at figuring out how to switch out ChoiceLoader, since it's fairly deep in the environment stack.

takluyver and others added 5 commits December 21, 2016 13:34
Using a custom Jinja loader which wraps the FileSystemLoader. This means
we only add the file extension for the default loader, and assume anyone
who is providing extra loaders knows what they're doing.

Closes jupytergh-490
Add test for key suffix

Add comments to tests
Remove test for key suffix

Add comments to latex test
add import

add class definition

relocate to surface at top level

Test result

clean up code

Use other configuration method

try a plain raise

start writing direct debugging code, structure seems to be in place

This should actually fail

Nope, everything got worse

Here's the error report, note the object definition is not ExtensionTolerantLoader

Raise the right exceptiontype

Change it around to directly set the loader

test result failure

This is failing where I wouldn't expect it to fail

Ok, now we're getting somewhere. Now it seems to be an issue with the conditional which was never being hit.

Approaches traitlets incorrectly, but now we may in the ballpark

Now it's happening in the Choice Loader and not the ExtensionTolerantLoader

check that the conditional is passing. satifice for tonight.
@mpacer
Copy link
Member

mpacer commented Dec 21, 2016

squashed away the tests, sad to change history, but in this case I think it makes sense.

return self.loader.get_source(environment, template+self.extension)

def list_templates(self):
return self.loader.list_templates()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this line used by anything?

It's not being hit by any current tests, is it being used for something or is it just a nice utility that will be useful eventually?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing in our code currently uses it, but it's part of the API for template loaders in Jinja, and since it's easy to implement, I implemented it.

with open(template, 'w') as f:
f.write(test_output)
exporter = MyExporter()
exporter.template_file = template + exporter.template_extension
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can simplify this a bit. All we need to test it, I think, is a template name that we can be reasonably sure won't exist, like does_not_exist.tpl. Then we don't need to subclass TemplateExporter, or create a temporary directory, or create a template that does exist. We can just instantiate TemplateExporter(template_file='does_not_exist.tpl') and check that trying to use it raises TemplateNotFound.

@mpacer
Copy link
Member

mpacer commented Dec 22, 2016

Took a little bit of figuring out how the load_templates() functions work in Jinja2, since there are actually 2 different kinds of them but you'd never know it by looking at their API docs ( Environment.load_templates(extensions=None, filter_func=None)& *Loader.load_templates()).



@pytest.mark.xfail(strict=True, raises=TemplateNotFound)
def test_in_memory_template_failure_to_find(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test name is misleading, because it's not doing anything with in-memory templates.

Once that's fixed, I think this is good.

…o error output will be useful. Change file name
@mpacer mpacer changed the title [WIP] Better way to add optional extension to specified template file Better way to add optional extension to specified template file Jan 3, 2017
@mpacer
Copy link
Member

mpacer commented Jan 3, 2017

Sorry for the delay on this, it should all be good now.

@mpacer
Copy link
Member

mpacer commented Jan 4, 2017

@takluyver ping, I'm thinking that if this and a couple of the other PRs get in it'll be good for a 5.1 release (which then will also let me shift back rtd to stable).

out, resources = exporter.from_notebook_node(nb)


@pytest.mark.xfail(strict=True, raises=TemplateNotFound)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a stylistic point, I think it's nicer to use pytest.raises(), making it specific to the statement we expect to raise an error, rather than marking the whole test as an expected failure. More info in docs here:

http://doc.pytest.org/en/latest/assert.html#assertions-about-expected-exceptions

I'm not going to hold up merging for this, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants