feat(gatsby-transformer-remark): fall back to pruneLength if no content separator present #19137

Merged (14 commits) on Dec 10, 2019
2 changes: 1 addition & 1 deletion packages/gatsby-transformer-remark/README.md
@@ -242,7 +242,7 @@ If that is the case, you can set `truncate` option on `excerpt` field, like:

### Excerpts for HTML embedded in Markdown files

-If your Markdown file contains HTML `except` will not return a value.
+If your Markdown file contains HTML `excerpt` will not return a value.

In that case, you can set an `excerpt_separator` in the `gatsby-config.js` file:

@@ -212,6 +212,13 @@ Object {
}
`;

exports[`Excerpt is generated correctly from schema given MARKDOWN without excerpt separator, falls back to pruneLength 1`] = `
Object {
"excerpt": "Where oh where **is** my little pony? Lorem…
",
}
`;

exports[`Excerpt is generated correctly from schema given PLAIN correctly uses excerpt separator 1`] = `
Object {
"excerpt": "Where oh where is my little pony?",
23 changes: 23 additions & 0 deletions packages/gatsby-transformer-remark/src/__tests__/extend-node.js
@@ -290,6 +290,29 @@ In quis lectus sed eros efficitur luctus. Morbi tempor, nisl eget feugiat tincid
{ pluginOptions: { excerpt_separator: `<!-- end -->` } }
)

const contentWithoutSeparator = `---
title: "my little pony"
date: "2017-09-18T23:19:51.246Z"
---
Where oh where **is** my little pony? Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi auctor sit amet velit id facilisis. Nulla viverra, eros at efficitur pulvinar, lectus orci accumsan nisi, eu blandit elit nulla nec lectus. Integer porttitor imperdiet sapien. Quisque in orci sed nisi consequat aliquam. Aenean id mollis nisi. Sed auctor odio id erat facilisis venenatis. Quisque posuere faucibus libero vel fringilla.

In quis lectus sed eros efficitur luctus. Morbi tempor, nisl eget feugiat tincidunt, sem velit vulputate enim, nec interdum augue enim nec mauris. Nulla iaculis ante sed enim placerat pretium. Nulla metus odio, facilisis vestibulum lobortis vitae, bibendum at nunc. Donec sit amet efficitur metus, in bibendum nisi. Vivamus tempus vel turpis sit amet auctor. Maecenas luctus vestibulum velit, at sagittis leo volutpat quis. Praesent posuere nec augue eget sodales. Pellentesque vitae arcu ut est varius venenatis id maximus sem. Curabitur non consectetur turpis.
`

bootstrapTest(
`given MARKDOWN without excerpt separator, falls back to pruneLength`,
contentWithoutSeparator,
`excerpt(pruneLength: 40, format: MARKDOWN)`,
node => {
expect(node).toMatchSnapshot()
expect(node.excerpt.length).toBe(45)
expect(node.excerpt).toBe(
`Where oh where **is** my little pony? Lorem…\n`
)
},
{ pluginOptions: { excerpt_separator: `<!-- end -->` } }
)

const content = `---
title: "my little pony"
date: "2017-09-18T23:19:51.246Z"
52 changes: 30 additions & 22 deletions packages/gatsby-transformer-remark/src/extend-node-type.js
@@ -347,10 +347,10 @@ module.exports = (
}

async function getExcerptAst(
+fullAST,
markdownNode,
{ pruneLength, truncate, excerptSeparator }
) {
-const fullAST = await getHTMLAst(markdownNode)
if (excerptSeparator && markdownNode.excerpt !== ``) {
return cloneTreeUntil(
fullAST,
@@ -375,12 +375,13 @@ module.exports = (
}

const lastTextNode = findLastTextNode(excerptAST)
-const amountToPruneLastNode =
-pruneLength - (unprunedExcerpt.length - lastTextNode.value.length)
+const amountToPruneBy = unprunedExcerpt.length - pruneLength
+const desiredLengthOfLastNode =
+lastTextNode.value.length - amountToPruneBy
if (!truncate) {
lastTextNode.value = prune(
lastTextNode.value,
-amountToPruneLastNode,
+desiredLengthOfLastNode,
`…`
)
} else {
@@ -398,7 +399,8 @@ module.exports = (
truncate,
excerptSeparator
) {
-const excerptAST = await getExcerptAst(markdownNode, {
+const fullAST = await getHTMLAst(markdownNode)
+const excerptAST = await getExcerptAst(fullAST, markdownNode, {
pruneLength,
truncate,
excerptSeparator,
@@ -415,18 +417,20 @@ module.exports = (
truncate,
excerptSeparator
) {
-if (excerptSeparator) {
+// if excerptSeparator in options and excerptSeparator in content then we will get an excerpt from grayMatter that we can use
+if (excerptSeparator && markdownNode.excerpt !== ``) {
return markdownNode.excerpt
}
-// TODO truncate respecting markdown AST
-const excerptText = markdownNode.rawMarkdownBody
-if (!truncate) {
-return prune(excerptText, pruneLength, `…`)
-}
-return _.truncate(excerptText, {
-length: pruneLength,
-omission: `…`,
+const ast = await getMarkdownAST(markdownNode)
+const excerptAST = await getExcerptAst(ast, markdownNode, {
Contributor

In your changes, `getExcerptAst` expects its first param to be an HTML AST, not a Markdown AST (I'm still trying to wrap my head around why this is needed). Is this correct here?

Contributor Author

@samrae7 samrae7 Nov 1, 2019

@pieh thanks for the review. My understanding is that `getExcerptAst` is a function that will take either kind of AST (HTML or Markdown).

Based on the `pruneLength`, it identifies the last node that should be included. It prunes or truncates that last node so that you have an AST that represents just the excerpt. If there is a content separator, it will instead use that to generate the tree, and if the content is shorter than the `pruneLength`, it will just return the full tree.

It is being used by `getExcerptHtml` to get the excerpt when `format: HTML` is specified, so I started using it for the Markdown case. An alternative would be to simply prune/truncate the `rawMarkdownBody` (which is what it was doing before: https://github.com/gatsbyjs/gatsby/blob/master/packages/gatsby-transformer-remark/src/extend-node-type.js#L422). In that case my change would be simpler and would just add a check to this line: https://github.com/gatsbyjs/gatsby/blob/master/packages/gatsby-transformer-remark/src/extend-node-type.js#L418 to check that `markdownNode.excerpt !== ""`.

I thought that the reason the HTML method doesn't do this is that it could end up returning strings where the Markdown symbols have been truncated (i.e. you could end up with a sentence cut off with only part of some Markdown, like `Hello **World**` being returned as `Hello **World*`). I am not sure about this though (do you know?). I could do some testing to find out.

Did I address your question? I will test the above when I get time (within the next day or so). In the meantime, if you have context on it, please let me know.
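
For readers following along, here is a minimal sketch of the last-node pruning step described above. It assumes a simplified tree of plain objects rather than the plugin's real AST. `collectText` and `pruneLastTextNode` are hypothetical names, `findLastTextNode` is reimplemented here only for illustration, and a plain `slice` stands in for `prune` (which, as far as I understand, also avoids cutting words in half). The tree is assumed to be already cut so that only the last text node overflows.

```javascript
// Simplified stand-in for the pruning step; not the plugin's actual code.
// Text nodes look like { type: "text", value }, parents like { type, children }.

// Concatenate the value of every text node, in document order.
function collectText(node) {
  if (node.type === "text") return node.value
  return (node.children || []).map(collectText).join("")
}

// Depth-first search, right to left, for the last text node in the tree.
function findLastTextNode(node) {
  if (node.type === "text") return node
  const children = node.children || []
  for (let i = children.length - 1; i >= 0; i--) {
    const found = findLastTextNode(children[i])
    if (found) return found
  }
  return null
}

// Shorten only the last text node so the visible text fits pruneLength.
function pruneLastTextNode(tree, pruneLength) {
  const unprunedExcerpt = collectText(tree)
  if (unprunedExcerpt.length <= pruneLength) return tree // nothing to prune
  const lastTextNode = findLastTextNode(tree)
  // Same arithmetic as in the diff: how much to drop, then what is left.
  const amountToPruneBy = unprunedExcerpt.length - pruneLength
  const desiredLengthOfLastNode = lastTextNode.value.length - amountToPruneBy
  lastTextNode.value =
    lastTextNode.value.slice(0, Math.max(0, desiredLengthOfLastNode)) + "…"
  return tree
}

// Usage: markup nodes such as `strong` survive; only the trailing text shrinks.
const tree = {
  type: "root",
  children: [
    {
      type: "paragraph",
      children: [
        { type: "text", value: "Where oh where " },
        { type: "strong", children: [{ type: "text", value: "is" }] },
        { type: "text", value: " my little pony? Lorem ipsum dolor" },
      ],
    },
  ],
}
pruneLastTextNode(tree, 40)
console.log(collectText(tree))
```

Because the length is measured on the concatenated text values, the same routine works whether the tree came from the HTML or the Markdown parser, which is the point being made in this comment.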

Contributor Author

@samrae7 samrae7 Nov 3, 2019

@pieh I've added a test that shows what I'm talking about (I hope). The latest test I've added wouldn't pass if we didn't use the AST to generate the pruned excerpt. Pruning the raw Markdown would count Markdown characters in the `pruneLength` and would return cut-off Markdown in some cases, i.e. in the test case I've added it would return `Where oh where **is…` as opposed to `Where oh where **is**…`

Does this answer your question or were you asking something different?
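
The difference can be sketched as follows, assuming a hypothetical mini-AST with only `text` and `strong` nodes. `naiveExcerpt` and `treeExcerpt` are illustrative names and are not functions from the plugin; the real code goes through `getExcerptAst` and a Markdown stringifier instead.

```javascript
// Naive approach: cut the raw Markdown string, markup characters included.
function naiveExcerpt(rawMarkdown, pruneLength) {
  if (rawMarkdown.length <= pruneLength) return rawMarkdown
  return rawMarkdown.slice(0, pruneLength) + "…"
}

// Tree-aware approach: only text counts against the budget, and the
// `**` markers around a strong node are always emitted as a pair.
function treeExcerpt(nodes, pruneLength) {
  let remaining = pruneLength
  let truncated = false

  function render(node) {
    if (truncated) return "" // budget already used up
    if (node.type === "text") {
      if (node.value.length <= remaining) {
        remaining -= node.value.length
        return node.value
      }
      truncated = true
      const kept = node.value.slice(0, remaining)
      remaining = 0
      return kept
    }
    // "strong" is the only parent type in this sketch.
    const inner = node.children.map(render).join("")
    return inner === "" ? "" : "**" + inner + "**"
  }

  const out = nodes.map(render).join("")
  return truncated ? out + "…" : out
}

const raw = "Where oh where **is** my little pony?"
const nodes = [
  { type: "text", value: "Where oh where " },
  { type: "strong", children: [{ type: "text", value: "is" }] },
  { type: "text", value: " my little pony?" },
]

console.log(naiveExcerpt(raw, 19)) // Where oh where **is…
console.log(treeExcerpt(nodes, 19)) // Where oh where **is** m…
```

With the same `pruneLength`, the naive version spends four characters of its budget on `**` markers and leaves the emphasis unclosed, while the tree-aware version keeps the markup balanced and counts only visible text.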

Contributor Author

@samrae7 samrae7 Nov 11, 2019

@pieh could you take a look at my reply above? Also, if I misunderstood your question, please let me know. Thanks. I just reread your question, and the short answer is that either kind of AST will be transformed into an excerpt AST by that method.

Contributor Author

@pieh ^

+pruneLength,
+truncate,
+excerptSeparator,
})
+var excerptMarkdown = unified()
+.use(stringify)
+.stringify(excerptAST)
+return excerptMarkdown
}

async function getExcerptPlain(
@@ -552,14 +556,18 @@ module.exports = (
},
},
resolve(markdownNode, { pruneLength, truncate }) {
-return getExcerptAst(markdownNode, {
-pruneLength,
-truncate,
-excerptSeparator: pluginOptions.excerpt_separator,
-}).then(ast => {
-const strippedAst = stripPosition(_.clone(ast), true)
-return hastReparseRaw(strippedAst)
-})
+return getHTMLAst(markdownNode)
+.then(fullAST =>
+getExcerptAst(fullAST, markdownNode, {
+pruneLength,
+truncate,
+excerptSeparator: pluginOptions.excerpt_separator,
+})
+)
+.then(ast => {
+const strippedAst = stripPosition(_.clone(ast), true)
+return hastReparseRaw(strippedAst)
+})
},
},
headings: {