html ebook links are bogus #929

Closed
peff opened this issue Jan 21, 2017 · 4 comments

@peff
Member

peff commented Jan 21, 2017

Since #917, we abuse the ebook_html database field to store the sha1 of the last book version we imported. But that field is also used to generate a link from each book page. These now look like https://git-scm.com/book/en/1f92accca99927c6a9c74f6fb620a3541048c30e, which does nothing useful (in some cases it just links back to the same book page, in others it's a 500).

The "right" fix is to actually change the database schema to store the sha1 elsewhere. But then what would we put in ebook_html? We don't actually build a downloadable HTML version of the book anymore. So I'm wondering if this link should simply go away.
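For illustration only, here is a minimal sketch of the two ideas above: keep the import sha1 in its own field, and emit the HTML download link only when `ebook_html` holds something usable. This is not the site's actual code. The `Book` class, the `imported_sha` field name, and both helper functions are hypothetical, and the URL shape is simply inferred from the example link quoted above.

```python
import re
from dataclasses import dataclass
from typing import Optional

# A full SHA-1 is 40 lowercase hex characters.
SHA1_RE = re.compile(r"[0-9a-f]{40}")


@dataclass
class Book:
    """Hypothetical stand-in for a book row; not the real schema."""
    code: str                            # language code, e.g. "en"
    ebook_html: Optional[str] = None     # intended use: downloadable HTML file
    imported_sha: Optional[str] = None   # hypothetical: last imported version


def looks_like_sha1(value: str) -> bool:
    """Return True if the whole value is a 40-character hex digest."""
    return bool(SHA1_RE.fullmatch(value))


def html_download_link(book: Book) -> Optional[str]:
    """Build the download link, or return None if there is nothing to link to.

    An empty field, or a field that still holds an import sha1,
    yields no link instead of a bogus one.
    """
    value = book.ebook_html
    if not value or looks_like_sha1(value):
        return None
    # URL shape assumed from the example link in the issue text.
    return f"https://git-scm.com/book/{book.code}/{value}"
```

With a guard like this, a book page would show no HTML link while the field holds a sha1, which matches the "link should simply go away" option without requiring the schema change first.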

That raises a similar question for the pdf, mobi, etc, versions. Those are now out of date with the content shown on the site. It looks like you can now build them straight from the progit2 repo. I wonder if we should be doing so, but I think there's a storage question. They're probably a bit large for shoving into the database. I'm not sure I want to get into automatically posting them to S3, though.

/cc @jnavila

@jnavila
Contributor

jnavila commented Jan 21, 2017

Ah, I didn't think hard enough. The HTML files are offered for download. Anyway, as you said, these versions will rapidly become obsolete, and they don't contain all the fixes that were introduced since Atlas stopped working. I guess that for now, the sanest path is to disable those downloads.

I know nothing of S3 or Heroku. As mentioned in progit/progit2#625, https://www.gitbook.com/ might be worth investigating.

@peff
Member Author

peff commented Jan 21, 2017

I think the "right" way to do it with Heroku would be to have the rake job push the built pdfs up to S3, and then store the link in the database. That seems like a lot of complication (including a new billing source!) for a handful of 10MB files. We could probably get by with sticking them in the database.

That said, I'm awfully tempted to just rip out the links completely and see if anybody actually cares.

@jnavila
Contributor

jnavila commented Jan 21, 2017

Until we can come up with a way to generate the PDFs and offer them for download, I'd rather the old files weren't offered at all. The version differences are a source of annoying questions.

Do you have any data on the number of downloads?

@peff
Member Author

peff commented Jan 21, 2017

Agreed.

No, I don't have any data. It doesn't look like git-scm.com has any kind of analytics or persistent logging setup. But even if it did, I think the interesting question is hits to the progit2 S3 buckets. I'm not sure who owns those, but probably @ben or @schacon.
