EPUB: option to omit titlepage #6097

Closed
vv01f opened this issue Jan 30, 2020 · 36 comments

@vv01f

vv01f commented Jan 30, 2020

feature request

I'd like to be able to produce an EPUB with no additional title page (the single page with the title, not the cover), and maybe other formats as well,

e.g. with titlepage: false in YAML or --titlepage=false as a CLI option.

My quick hack was custom CSS, section.level1 h1 { display: none; }, to hide the section in question. Another option would be a post-processing step that removes it from the DOM.

@mb21
Collaborator

mb21 commented Jan 30, 2020

what pandoc version are you on? I think this should already work...

see https://pandoc.org/MANUAL.html#templates and https://github.com/jgm/pandoc/blob/master/data/templates/default.epub3#L21

@jgm
Owner

jgm commented Jan 30, 2020

Messing with the template will probably yield a blank title page rather than none at all, because it will still be referred to in the epub's index.

Why do you want to omit the title page: it's a standard part of a book?

@tarleb
Collaborator

tarleb commented Feb 8, 2020

Closing for lack of feedback; please reopen if necessary.

@tarleb tarleb closed this as completed Feb 8, 2020
@NickBarreto

Hello, coming across this now and I think this option would be useful also, and would suggest reopening.

@jgm is absolutely right that a title page is a standard part of a book and generating one is useful. However, I could see value in being able to suppress the inclusion of one, generally to introduce another title page from within the document source.

For example: the way we produce our title pages in EPUB at the moment (not using pandoc) is in one of two cases:

  • Case 1: text title page
    We have an html title page which consists, essentially, of an h1 with the title, an h2 with the author, and an image logo.

  • Case 2: title page is a 'patch'
    We have a title page which has 2 images: a title page 'patch' image, used so the book's title and author are rendered in a simplified black and white version of the typography in the book's cover, with alt text of 'book title by author name', and the same logo above. This is necessary because text in a cover is often distressed or otherwise altered in ways that embedding the same font as the cover would not necessarily result in the same visual presentation. Alt-text is very important here, as ever, for accessibility.

I was testing out how we would do this with pandoc, but I'm having trouble including the logo image used in both cases with the epub3 template. It isn't an image linked to in the source document, so pandoc doesn't know to package it in the resulting EPUB, and I get a broken image link. I have managed to work around this by base64-encoding the logo image in the template, but this is not an ideal implementation.

It also doesn't solve the problem of Case 2, where I want to pass an image that is specific to this book into the template. Again, I could do that in base64 in a custom metadata attribute, but that would add a lot of bloat to a YAML file!

Both of these scenarios are easily dealt with if I could use --titlepage=false as suggested, and have a filter of my own which builds a title page from content in my source document, which would be able to be aware of image references as any other images in a pandoc source.

Alternatively, being able to flag a specific media file for inclusion in an EPUB would also solve the issue in this use case.
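
For readers who want to try the base64 workaround described in this comment, a rough Python sketch follows. It is not taken from the thread: the helper names image_to_data_uri and logo_img_tag are hypothetical, and the sketch assumes the target reading system accepts data: URIs in img elements, which nothing in this thread verifies.

```python
import base64
import html
import mimetypes
from pathlib import Path


def image_to_data_uri(path: str) -> str:
    """Return a data: URI that embeds the image at `path` as base64."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        raise ValueError(f"cannot guess a MIME type for {path!r}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def logo_img_tag(path: str, alt: str = "logo") -> str:
    """Build an <img> element that can be pasted into a custom template.

    Hypothetical helper: the result replaces a normal src="..." path,
    so pandoc does not need to package the image file itself.
    """
    return f'<img src="{image_to_data_uri(path)}" alt="{html.escape(alt, quote=True)}"/>'
```

The output of logo_img_tag("images/logo.jpg") would be pasted into the title page part of the template. As noted above, the embedded data grows the template (or the YAML file) considerably, which is why the comment calls this approach less than ideal.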

@jgm
Owner

jgm commented Feb 19, 2020

Would it work to have a metadata field with the image? This would ensure that the image is linked in the source document, hence included in the EPUB container. And you could just use the variable corresponding to the metadata field in the titlepage portion of the template.

@NickBarreto

NickBarreto commented Feb 19, 2020

Yes, it would absolutely work if I could specify an image path in a metadata element and that image gets added to the package. That would also let me use an if in the template to include the patch image if one is present, or html h1 and h2 title and author if not, and solve both of those use cases.

This isn't existing functionality though, right?

I have quickly edited the base template to reference a metadata field I added in an external yaml file:

metadata.yaml:

logo_image: images/logo.jpg

template.epub3:

<div> 
    <img src="$logo_image$" alt="logo"/>
</div>

If I build an epub with this at the moment, the referenced resource is missing.

$ pandoc -o test.epub -t epub3 source.md --metadata-file=metadata.yaml --template=template.epub3

I get the same error if I add that metadata element within source.md as a yaml block. Either way would be fine in the end, though I generally prefer external metadata files.

@jgm
Owner

jgm commented Feb 19, 2020

I was thinking rather

logo_image: ![logo](images/logo.jpg)

and in template

<div>
   $logo_image$
</div>

Try that!

@NickBarreto

That doesn't appear to work at the moment. If I have logo_image: ![logo](images/logo.jpg) in an external yaml file, pandoc appears to fail to load the entire metadata.

If I add it in a YAML block at the start of the file itself, when I run pandoc I get the following warning:

[WARNING] Could not parse YAML metadata at line 5 column 1: Lexical error near " ![logo](images/logo.jpg)"

The resulting EPUB doesn't contain the referenced image. It is, at least, valid according to EPUBCheck: since that YAML element hasn't been parsed, the metadata field is blank, so the template doesn't evaluate it and I don't have a broken image link. Where the img HTML element should be in the generated file is just blank (which makes sense).

(Just to confirm, I am on pandoc 2.9.2)

@tarleb
Collaborator

tarleb commented Feb 20, 2020

The value has to be quoted (because of the !). Try

logo_image: '![logo](images/logo.jpg)'
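
Some background on why the quotes matter: in YAML, a leading ! introduces a tag, so an unquoted value that starts with ![ is not read as a plain string. When metadata files are generated by a script, the quoting can be automated. The Python sketch below is only an illustration; yaml_single_quoted and image_metadata_line are hypothetical helper names, and it assumes the value fits on one line.

```python
def yaml_single_quoted(value: str) -> str:
    """Wrap `value` as a YAML single-quoted scalar.

    Inside single quotes the only character that needs escaping is the
    single quote itself, which is written twice.
    """
    return "'" + value.replace("'", "''") + "'"


def image_metadata_line(key: str, alt: str, path: str) -> str:
    """Return one metadata line whose value is a Markdown image link."""
    markdown_image = f"![{alt}]({path})"
    return f"{key}: {yaml_single_quoted(markdown_image)}"
```

Calling image_metadata_line("logo_image", "logo", "images/logo.jpg") produces the same line as the quoted example above.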

@NickBarreto

Aha! @tarleb thank you very much. It works like a charm.

@vv01f
Author

vv01f commented Jul 22, 2020

Why do you want to omit the title page: it's a standard part of a book?

Sorry, I needed to change my notification settings on GH to receive notice.

Not every EPUB needs to be a full-blown book, and metadata can stay metadata instead of being repeated in the content. Nice use cases are extracts (in my case, for educational purposes, an article from one issue of a collection, where the sources have been referenced sufficiently in the text's footnotes and the metadata) or single chapters for comfortable reading (not starting with heading1 for # but e.g. heading2 for ## in the markdown source). In that case a heading1 is added with the title, so the title is repeated right after the title page. Another idea I had was --top-level-division=chapter, but that had no effect: the heading1 is still filled with the title. All of that would be fine without the extra title page.

When adding a TOC, the default toc-title (which I couldn't set to an empty string with the metadata variable, so I used ol.toc #toc-li-1 { display: none; } for now) might show the title a third time. All together, a bit much titleage.

now pandoc version 2.10, since today. :)

@kensmosis

At risk of reviving a zombie thread, I'll second (third?) that request for an option to disable title-page generation.

Many publishers (myself included) use images for the pretitle, title, and copyright pages. This results in a nicer looking ebook, and allows far greater flexibility in the layout of those pages.

In my case, I extract pdf page images from the print version, convert these to png files, and then embed each on a separate page in the frontmatter. I do this via raw html in the markdown (mainly to control centering/etc) but probably could do it using markdown's own image insertion capabilities just as well.

All these steps are automated via a Makefile. Right now, I have to use an extra script to unzip the epub file, remove the pandoc-generated title page, modify the toc, rezip it, etc. This is very clumsy and almost obviates the use of pandoc (as opposed to just using a python generation script for everything in the first place).

It would be great if pandoc had an option to omit the generated title page, and I could skip all this rigamarole and keep things clean and simple. Pandoc is absolutely amazing, and I love it. But sometimes tiny things like this can sabotage the utility of otherwise incredible software for specific use cases. On the other hand, this probably is one more reason I should learn Haskell, so I can contribute instead of complaining :)

Cheers,
Ken
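
The unzip, edit, and rezip step described in this comment could be sketched in Python roughly as follows. This is not Ken's script and not part of pandoc; it is an illustration under stated assumptions: the title page file is named title_page.xhtml (the name that appears in an output listing later in this thread), the package document ends in .opf, and function names such as remove_title_page are hypothetical. The sketch only drops the file and its manifest, spine, and guide entries. It does not update nav.xhtml or toc.ncx, which would still need the "modify the toc" step mentioned above.

```python
import xml.etree.ElementTree as ET
import zipfile

OPF_NS = "http://www.idpf.org/2007/opf"
# Keep the usual prefixes when the package document is written back out.
ET.register_namespace("", OPF_NS)
ET.register_namespace("dc", "http://purl.org/dc/elements/1.1/")

TITLE_PAGE = "title_page.xhtml"  # assumed file name, see lead-in


def strip_title_page_from_opf(opf_bytes: bytes) -> bytes:
    """Remove manifest, guide, and spine entries that point at the title page."""
    root = ET.fromstring(opf_bytes)
    removed_ids = set()
    # First pass: anything with an href to the title page (manifest, guide).
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("href", "").endswith(TITLE_PAGE):
                if child.get("id"):
                    removed_ids.add(child.get("id"))
                parent.remove(child)
    # Second pass: spine itemrefs that refer to a removed manifest id.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("idref") in removed_ids:
                parent.remove(child)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def remove_title_page(src_path: str, dst_path: str) -> None:
    """Copy an EPUB, leaving out the generated title page.

    Entries are copied in their original order with their original
    compression, so the mimetype entry stays first and uncompressed.
    """
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():
            if info.filename.endswith("/" + TITLE_PAGE):
                continue  # drop the title page itself
            data = src.read(info)
            if info.filename.endswith(".opf"):
                data = strip_title_page_from_opf(data)
            dst.writestr(info, data)
```

Running the result through EPUBCheck is advisable, because the navigation documents still reference the removed page until they are edited as well.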

@jgm
Owner

jgm commented Sep 5, 2021

So if your title page is, essentially, an image, you can handle that by using a custom template.
Modify the default template by finding the section under $if(titlepage)$..$endif$ and inserting what you like there.

@kensmosis

Thanks jgm (and for the quick response)! For some reason, I was under the impression templates only could be used for certain output formats (latex, etc) but not epub. This should do the trick nicely, as well as make my life easier in other ways too!

Cheers,
Ken

@mb21 mb21 changed the title from "option to omit titlepage" to "EPUB: option to omit titlepage" on Sep 5, 2021
@kensmosis

Hmm... looks like I spoke prematurely. The template gives me flexibility to insert html, but I can't import an image. Pandoc doesn't scan the template for image links, so the image never gets pulled in. I tried a dummy --metadata-file solution proposed by someone online but it did not pull in the file (and even if it did, it's not clear that the internally assigned file00n.png would be associated with an explicit link in the template file). I've monkeyed around with this quite a bit, and there doesn't seem to be a way to make it work.

There's another problem with the title page too. Because it comes first, I can't have a half-title before it, and this forces an unconventional reordering of my frontispiece image and copyright page as well.

It may seem like just leaving the autogen'ed title page blank is a minor issue, but this may not be the case. The problem is that this autogen'ed page usurps "title page" status. Like lots of people, I generate an epub as an intermediate step to converting to kindle format using amazon's mandatory online converter. Having 2 distinct title pages likely will screw up the conversion process, and readers will land on a blank starting page or encounter other navigational oddities.

Cheers,
Ken

@jgm jgm reopened this Sep 12, 2021
@NickBarreto

If you have a look back through the thread, @kensmosis, I did get title patch images working with a template. You can't include the image in the template directly (unless you base64-encode it in the template file, which is not ideal), but if you reference an image in a metadata file, you can include it in your page that way.

Do note that it's important to include the image as a markdown image link, and not just a source path.

Usage is just as @jgm proposes above, but you need to wrap the image link in the metadata file in quotes, otherwise the YAML is invalid.

It works nicely but probably won't address your half title requirements (though I personally think a half title page is redundant in an ebook).

@kensmosis

kensmosis commented Sep 12, 2021

@NickBarreto: Thanks for the reply. Unfortunately, I still must be doing something wrong. The file is indeed slurped in and it almost does what is needed, but the generated title page has an extra <img..> command.

YAML file: ktitleimage: '![title page](gen/title.png)'

Relevant Template code: (in $if(titlepage)$ section):

<title>The Delivery</title>
<center>
<img src="$ktitleimage$" data-custom-style="imgFull" style="width:100.0%;height:100.0%" alt="title page"/>
</center>

Output (EPUB/text/title_page.xhtml):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" xml:lang="en-GB">
<head>
  <meta charset="utf-8" />
  <meta name="generator" content="pandoc" />
  <title>The Delivery</title>
  <link rel="stylesheet" type="text/css" href="../styles/stylesheet1.css" />
</head>
<body epub:type="frontmatter">
<title>The Delivery</title>
<center>
<img src="<img src="../media/file0.png" title="fig:" alt="title page" />" data-custom-style="imgFull" style="width:100.0%;height:100.0%" alt="title page"/>
</center>
</body>
</html>

And here is the call:

pandoc gen/ebook.md -o out/eb/book.epub --wrap=none --standalone --verbose --fail-if-warnings --strip-comments -M lang="en-GB" --top-level-division=chapter --toc-depth=1 -t epub -f markdown+smart+header_attributes --metadata-file=./NeededDummyMetadata.md --template ./KenEpubTemplate.pandoc --css ./KenEpubStyles.css

FYI, I'm running pandoc 2.13.

Cheers,
Ken

@kensmosis

kensmosis commented Sep 12, 2021

Ok, I now see what is happening. I'm not sure if it's a difference in version, but the value is being expanded to <img ...> when referenced as $ktitleimage$. In case anyone comes across a similar problem, what seems to work for me is the following:

YAML metadata file: ktitleimage: '<img src="gen/title.png"/>'

Relevant part of template (after $if(titlepage)$):

$ktitleimage$

Using the <img ...> instead of ![]() in the YAML file was the key. It still slurps in the relevant image, but doesn't result in a double <img ...> tag. If you try to use the ![]() definition in the YAML file, and just type $ktitleimage$ in the template, an error will result because ![]() is markdown and doesn't get processed into an <img..> tag early enough (at least in my version). Anyway, this trick worked for me.

Cheers,
Ken

@bgoldowsky

I'd like to add my support for the original request - have an option that omits generation of a title page entirely.

My use case is this: I am working on an educational application (see https://cisl.cast.org/products/about-clusive ) which is based on EPUB. Some of the EPUBs we have are long books, some are short readings. We want to allow teachers to upload Word documents (or potentially other file formats) and behind the scenes use Pandoc to convert them to EPUB. However, the documents that teachers are likely to upload this way are mostly going to be short - a page or a few pages. Adding a title page to these is excessive.

Bottom line - EPUB is a useful and flexible interchange format for content. There are many ways it can be used, not all of which follow the traditional "book" model.

@mrichar1

mrichar1 commented Apr 7, 2022

To add another situation where omitting the title page is necessary...

We have a book in HTML with explicitly numbered chapters (e.g. <h1>3. Some Chapter</h1>), which are used in references (e.g. See Chapter 3). When generating EPUB the new title page is added as the first item in the TOC, numbered 1. This means that the TOC numbering doesn't line up.

Ideally we'd like to not have a title page OR have it in the TOC, but being able to remove it from the TOC is essential.

@jgm
Owner

jgm commented Apr 7, 2022

When generating EPUB the new title page is added as the first item in the TOC, numbered 1. This means that the TOC numbering doesn't line up.

I'm not sure what you mean, as this doesn't accord with my experience, but this may be because you're using a different epub reader. If you could share a sample (a small input file plus the pandoc command you're using to turn it into an epub, and the pandoc version you're using) I could take a look.

@mrichar1

mrichar1 commented Apr 7, 2022

Ah, thanks for the tip that this might be a bug and not normal behaviour.

It turns out that in our html the first item in <body> is an <a> before the <h1>. Removing this fixes the problem.

With further experimenting I've discovered that it looks like any content will trigger this. Here's a simple test-case.

pandoc --toc --standalone -o out.epub test.html

<html>
<head>
<title>Test Title</title>
</head>
<body>
<a id="chapter_1" />
<h1>Chapter 1</h1>
</body>
</html>

I've tested this with both 2.9.2.1 and 2.18 on Ubuntu, viewing the EPUB file with both fbreader and Calibre's ebook-viewer. Let me know if I can provide any other debugging info!

@jgm
Owner

jgm commented Apr 7, 2022

OK, I see. This doesn't have anything to do with the title page; it has to do with the way the EPUB chapters are created based on the structure of the input document. We don't want to just throw away content before the first heading, so we put this in a separate chapter with no name.

@mrichar1

mrichar1 commented Apr 7, 2022

OK - it looks like the default behaviour is to create a Title Page then the TOC, then this 'extra' page, which is added to the contents with a heading the same as the Title page, with whatever content comes before the first <h1>

I can see why this happens, but in our specific case the <a> is a fragment anchor for the chapter, allowing internal document links to reference it. I've tried moving the header inside the anchor, i.e. <a id="chapter_1"><h1>Chapter 1</h1></a> but it still happens, which is a little counter-intuitive.

I can see that the content itself shouldn't be lost, but it would be nice if there was a way to say that only items with explicit headers should end up in the TOC?

@jgm
Owner

jgm commented Apr 7, 2022

Just put the anchor in the h1 element itself

<h1 id="chapter_1">
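
To find input files affected by this, one could scan the HTML for anything that appears in the body before the first heading. The Python sketch below is a rough heuristic and not pandoc's actual chapter-splitting logic; the class and function names are hypothetical, and it only looks at tags and non-whitespace text.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class PreHeadingScanner(HTMLParser):
    """Collect tags and text that occur in <body> before the first heading."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.done = False
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif self.in_body and not self.done:
            if tag in HEADINGS:
                self.done = True  # stop collecting at the first heading
            else:
                self.found.append(f"<{tag}>")

    def handle_data(self, data):
        if self.in_body and not self.done and data.strip():
            self.found.append(data.strip())


def content_before_first_heading(html_text: str) -> list:
    """Return the tags and text found ahead of the first heading, if any."""
    scanner = PreHeadingScanner()
    scanner.feed(html_text)
    scanner.close()
    return scanner.found
```

For the test case posted above, the scan reports the stray a element. With the id moved onto the h1 as suggested, it reports nothing.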

@szanni

szanni commented May 2, 2022

I'd like to add my support for the original request - have an option that omits generation of a title page entirely.
Bottom line - EPUB is a useful and flexible interchange format for content. There are many ways it can be used, not all of which follow the traditional "book" model.

Exactly this. I stumbled across this issue trying to suppress the generation of a title page myself. I am trying to publish various small songbooks - no need for a title page, copyright or anything. EPUB to me is just the most versatile exchange format.

@danj-ca

danj-ca commented Sep 23, 2022

I would also like the option to suppress creation of an automatic title page. In my case, I am providing front matter pages (including the title, copyright, dedication, etc) as inputs to pandoc, along with my book's manuscript.

Having pandoc unilaterally decide where the title page should go hinders my workflow.

Suggested alternative: There is already a mechanism for identifying specific front matter pages in epub3. I suggest that, if a heading is already labeled {epub:type=titlepage}, pandoc suppresses the automatic addition of a title page.

@duplaja

duplaja commented Nov 2, 2022

I'd like to throw my support behind the original request as well. My use case is converting individual HTML pages, before combining at the end to a single larger epub file. I am having to manually delete every title page upon merging, at this point.

jgm added a commit that referenced this issue Nov 2, 2022
So far this isn't used, but it contributes to solving #6097.
@fullstopslash

I too would like to toss my hat in the ring supporting this. I thought I was running into a bug, as EPUB is the only format I'm converting my document into that duplicates the title. The title shows up 3 times if I also generate the document with --toc, the third time being the half-title I suppose. In an EPUB with a cover image that also has the title on it, this is quite a lot in my opinion.

@jgm jgm closed this as completed in aaa3bea Dec 4, 2022
@jgm
Owner

jgm commented Dec 4, 2022

Testing welcome; you can compile from source or try a nightly in 24 hours.

@fullstopslash

fullstopslash commented Dec 8, 2022

Thank you for making this happen! I unfortunately won't have time to test it until later this weekend. It does occur to me, however: are there other instances in which a title page gets automatically inserted and could be suppressed? I think converting to pdf's also automatically inserts a title page, and this could be another place where people want to switch it off. My apologies, however, I don't have the time to investigate this at the moment.

@jgm
Owner

jgm commented Dec 8, 2022

converting to pdf's also automatically inserts a title page

No, not unless you use certain book classes, but this is all LaTeX dependent.

@adrianbienias

adrianbienias commented Jan 17, 2023

I set # Dummy heading {epub:type=titlepage} in one of my markdown pages, but the default title page is still generated.
Am I doing something wrong, or are there some additional things required?

@jgm
Owner

jgm commented Jan 17, 2023

I went with the --epub-title-page=false option, not the one suggested above (looking for an alternative epub:type=titlepage heading).
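
For anyone scripting their builds, the option could be wired in roughly as in the Python sketch below. Only the --epub-title-page=false spelling comes from this thread; the function names and the file names in the usage note are hypothetical, and the sketch assumes a pandoc build recent enough to include the option is on the PATH.

```python
import shutil
import subprocess


def pandoc_epub_command(source: str, output: str, title_page: bool = True) -> list:
    """Build the pandoc argument list for an EPUB3 conversion."""
    cmd = ["pandoc", source, "-o", output, "-t", "epub3"]
    if not title_page:
        # Option name as given in the comment above.
        cmd.append("--epub-title-page=false")
    return cmd


def build_epub(source: str, output: str, title_page: bool = True) -> None:
    """Run pandoc, raising if it is missing or exits with an error."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(pandoc_epub_command(source, output, title_page), check=True)
```

A call such as build_epub("source.md", "test.epub", title_page=False) then runs the conversion without the generated title page.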

@jgm
Owner

jgm commented Jan 17, 2023

If desired, please open another issue for the suggestion

if a heading is already labeled {epub:type=titlepage}, pandoc suppresses the automatic addition of a title page

@adrianbienias

Got it. Thanks for clarifying.
