[RFC] Start task for publishing progit2 books from git #917

jnavila · 2017-01-03T17:10:52Z

Right now, it seems there is a local problem with utf-8 encoding in URLs.

This task is already language aware.

peff · 2017-01-04T01:37:49Z

Thanks so much for starting on this! I don't really have enough expertise to offer intelligent comments on the implementation itself. I think the interesting thing would be to import a version we already have (from the existing publishing procedure) and then diff the results. That may be a bit complicated because it exists only in the database, but perhaps we can dump it into a filesystem representation (or get fancy with a Ruby diff gem).

jnavila · 2017-01-04T11:21:50Z

The html generated by Atlas is quite different from the one generated by asciidoctor. So a simple diff on the generated html can't be performed.

As for dumping the values, I can do it on my dev db and post the two versions of some chapters for human comparison. Would there be a way to test-run this import? Maybe @schacon could help here and tell if this PR is close to what he was thinking.

Fiddling with the code, I realized that I haven't seen any code for polling and detecting changes in Atlas. I guess the task that imports books is triggered by an external hook from atlas, in which case the import from git won't be triggered for now.

peff · 2017-01-04T18:29:26Z

Heroku runs rake genbook2 every 10 minutes. Though I think there is also an HTTP endpoint you can hit to add new books.

So as long as the rake task polls the progit2 repo for updates, I think we should be fine. That can run periodically, or it can be kicked off manually.

jnavila · 2017-01-05T21:38:53Z

OK, so I don't really understand how the update of the books used to work. But I know for sure that I have to modify my task to enable the polling of the git repos.

peff · 2017-01-05T21:56:18Z

I don't know how terrible it would be to simply have a file in the repository containing the sha1 of the progit2 commit we want to build, and update it manually. Sort of a poor-man's submodule.

There's a slight hassle because it requires an update here to pull in the new content. But it also means that we would not be surprised if the book makes changes that require us to tweak our formatting, etc.

I dunno. To be honest, the whole rails-app thing seems like huge overkill for what is essentially a static site. I kind of wonder if we could move to a system where the whole thing could be built locally, driven by make or something similar, and pushed up to a static host like GitHub Pages (or even the whole thing could be Jekyll-ified, and let GitHub do the build, though I think we are pushing the limits there between asciidoctoring the manpages and assembling the progit2 material).

jnavila · 2017-01-06T10:54:14Z

The code already uses the ORM to manage the pages. The SHA1 of the HEADs can be stored in DB for later comparison.

As for the pseudo-module system, the path to book generation was copied on the way the git manual pages are generated: use octokit to access the github repos and retrieve the necessary files. The point here is that the book is sliced into sections and subsections which are served as standalone pages from DB, with the need to maintain the cross references between those pages (using a special service using a cross-ref table). Jekyll-ifying it would require a similar process, just that the resulting chunks would be stored to the filesystem instead of the DB, with the added burden to statically generate the correct cross-ref URLs. From what I infer, writing to the filesystem was avoided, I guess for security or due to Heroku limitations.

peff · 2017-01-06T20:18:06Z

From what I infer, writing to the filesystem was avoided, I guess for security or due to Heroku limitations.

Yeah, my secret goal would actually be to take it off of Heroku. The site is largely static except for occasionally pulling in content from a few other sources. We pay Heroku $230/mo for hosting it (three 2x dynos, plus $80 in addons), which seems silly. It gets a lot of hits, I'm sure, but a CDN in front of a static site could handle that easily.

Anyway, that does not need to be resolved for this PR. I mostly just wanted to throw it out there as a possible future direction.

jnavila · 2017-01-06T20:20:03Z

OK, Here is how it's supposed to work with Atlas:

the atlas build has a hook and it pushes to the site, via an http request for update processed in app/controllers/books_controller.rb which updates in DB the status of the book and the download URLs.
Then the crontab task generates the books according to the processed flag of each book in DB.

I plan to replace this by a polling loop of the repos. The thing is that I don't want to force an update of the schema of the database. So, I would reuse the existing ebook_html field to store the SHA1 of the HEAD of each repo.

peff · 2017-01-06T20:31:30Z

That sounds like a good plan. The reuse of ebook_html is a little funny, but I think it conceptually makes sense, and it should be easy to distinguish the two cases by content if need be. I don't know how much pain there is in a schema update, though.

jnavila · 2017-01-08T15:05:23Z

So I went ahead and made all the necessary changes to make this workflow workable. The task works with all the languages.

I plan to allow other languages without requiring the repos to be part of progit. For instance, the german teams has been asking to be incorporated for quite a long time now, and it would be quite easy to pull from the repo they designed from now on.

peff · 2017-01-10T07:35:38Z

I plan to allow other languages without requiring the repos to be part of progit. For instance, the german teams has been asking to be incorporated for quite a long time now, and it would be quite easy to pull from the repo they designed from now on.

Yeah, that sounds like a good plan.

So what's the status of this PR? I was able to pull and build the book locally by setting GENLANG manually. So it seems to work OK, but wouldn't auto-update as the progit2 repo is updated, even if I added a schedule rake remote_genbook2 task.

I'm also not sure what will happen to the non-HTML versions of the book. They'll gradually go out of date with respect to what's on the site, right?

jnavila · 2017-01-10T08:10:02Z

The status of this PR: that's good that you were able to reproduce my build. I have to check again all the cross links (they are all managed by the script) because it's not sure at all that they were all caught for all languages.

Now, the task polls the remote repo and rebuilds the pages in case of update. So it can be added to the schedule. The genbook2 must be disabled at the same time.

The non-web versions are not rebuilt at all. Asciidoctor would allow such a thing, but a lot of additional packages are missing, mostly for PDF.

I hope to let you know soon that the PR is ok for me. If you happen to know somebody versed in Ruby for a code review, that would be great.

This task is already language aware.

The new workflow is creating a single file, so the references to images are all at one single place.

The URL contain translated texts which are UTF-8, so they need to be escaped properly.

jnavila · 2017-01-11T21:22:21Z

I had a bit more of work than expected:

paragraphs at the start of chapters were missing
some internal links were not collected
the xref and pre-link rewriters were buggy

For me, it's good now, I've checked with French that all the links work.

jnavila · 2017-01-18T13:56:38Z

ping @peff . Seems OK to me.

peff · 2017-01-18T14:10:06Z

Yeah, sorry, I've just been busy. I'll merge/deploy later today, and switch the cronjob over to your new rake task.

peff · 2017-01-18T19:52:38Z

OK, merged and deployed, and the book content rebuilt. I updated the cron job to run this rather than genbook2.

It was previously running every 10 minutes, which seemed like quite a lot. I've bumped it to daily, which is how often we built new manpages. If the really is getting updates that frequently, we can perhaps switch it to hourly.

jnavila · 2017-01-18T20:53:26Z

OK, thanks. I'll monitor the issues for new issues.

jnavila mentioned this pull request Jan 6, 2017

Atlas is broken progit/progit2#625

Closed

jnavila force-pushed the local_progit2 branch 2 times, most recently from 79f0974 to 355396d Compare January 8, 2017 14:46

jnavila force-pushed the local_progit2 branch from 355396d to 6d59156 Compare January 11, 2017 21:10

jnavila added 3 commits January 11, 2017 22:16

Task for publishing progit2 books from git

bbe6cf1

This task is already language aware.

Add images for new publishing workflow

4d71b81

The new workflow is creating a single file, so the references to images are all at one single place.

Fix missing URL encodings in books' links

355e954

The URL contain translated texts which are UTF-8, so they need to be escaped properly.

jnavila force-pushed the local_progit2 branch from 6d59156 to 355e954 Compare January 11, 2017 21:17

peff merged commit 6325d8e into git:master Jan 18, 2017

jnavila deleted the local_progit2 branch January 18, 2017 20:53

ben mentioned this pull request Jan 19, 2017

Getting a repository section has plain text rendered as monospaced progit/progit2#709

Closed

peff mentioned this pull request Jan 21, 2017

html ebook links are bogus #929

Closed

YueLinHo mentioned this pull request Feb 12, 2017

Git官网中文版电子书未更新/无法下载 progit/progit2-zh#325

Closed

peff mentioned this pull request Feb 14, 2017

Added fix to replace escaped html entities in <pre> elements with literal character. #740

Closed

YueLinHo mentioned this pull request Oct 26, 2017

自動建置已停止更新 progit/progit2-zh-tw#28

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RFC] Start task for publishing progit2 books from git #917

[RFC] Start task for publishing progit2 books from git #917

jnavila commented Jan 3, 2017

peff commented Jan 4, 2017 •

edited

Loading

jnavila commented Jan 4, 2017

peff commented Jan 4, 2017

jnavila commented Jan 5, 2017

peff commented Jan 5, 2017

jnavila commented Jan 6, 2017

peff commented Jan 6, 2017

jnavila commented Jan 6, 2017

peff commented Jan 6, 2017

jnavila commented Jan 8, 2017

peff commented Jan 10, 2017

jnavila commented Jan 10, 2017

jnavila commented Jan 11, 2017

jnavila commented Jan 18, 2017

peff commented Jan 18, 2017

peff commented Jan 18, 2017

jnavila commented Jan 18, 2017

[RFC] Start task for publishing progit2 books from git #917

[RFC] Start task for publishing progit2 books from git #917

Conversation

jnavila commented Jan 3, 2017

peff commented Jan 4, 2017 • edited Loading

jnavila commented Jan 4, 2017

peff commented Jan 4, 2017

jnavila commented Jan 5, 2017

peff commented Jan 5, 2017

jnavila commented Jan 6, 2017

peff commented Jan 6, 2017

jnavila commented Jan 6, 2017

peff commented Jan 6, 2017

jnavila commented Jan 8, 2017

peff commented Jan 10, 2017

jnavila commented Jan 10, 2017

jnavila commented Jan 11, 2017

jnavila commented Jan 18, 2017

peff commented Jan 18, 2017

peff commented Jan 18, 2017

jnavila commented Jan 18, 2017

peff commented Jan 4, 2017 •

edited

Loading