A link to a sample compendium would be useful #3

Open
tmalsburg opened this issue May 23, 2015 · 13 comments

Comments

@tmalsburg

No description provided.

@benmarwick
Contributor

@tmalsburg
Author

Looks good. In addition, it might make sense to have a dummy repository that illustrates the structure but does not contain other, irrelevant material. rrrpkg itself could be used for that.

@cboettig
Member

cboettig commented Jun 2, 2015

We should probably link the original compendium example from Robert
Gentleman: http://dx.doi.org/10.2202/1544-6115.1034 and the original
Compendium paper: http://biostats.bepress.com/bioconductor/paper2/ (even
though these are somewhat older). There certainly are other examples from
other folks (I have a few others as well, but a variety of authors and styles
is probably best); I think there must be some stuff in J Biostatistics
(not counting things like JSS papers, of course).

I loosely maintain a template like that for my own use:
https://github.com/cboettig/template but I'm not sure whether it is a good
idea for this. devtools and other R tools already support
creating package skeletons really quickly, with good templates included. I
worry that adding a template here could become dated quickly and, more
importantly, might look like overkill for the minimum we're trying to suggest
here.

I do think we need some examples that are much lighter-weight -- e.g.
things that don't necessarily pass R CMD check or have all the bells and
whistles. I wonder if it might be worth adapting some existing paper that
just provides some data files and some script files so that it looks like
an R package.
e.g. something like: https://github.com/duffymeg/BroodParasiteDescription
(see the author's blog post on this too, which is also relevant to this
discussion:
https://dynamicecology.wordpress.com/2015/05/28/my-first-experience-with-github-for-sharing-data-and-code/comment-page-1/).
e.g. just dump the R scripts into R/, the data into data/, fix some
file path issues and add a minimal DESCRIPTION file.
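
A minimal sketch of those steps (copy the scripts, copy the data, write a DESCRIPTION), written in Python only to keep it independent of any R tooling. The function name `scaffold_compendium`, the `script_dir` parameter, the set of data suffixes, and every value in the DESCRIPTION template are assumptions made for illustration; only the field names are taken from "Writing R Extensions". Fixing file paths inside the copied scripts is not attempted here.

```python
import re
import shutil
from pathlib import Path

# Field names follow "Writing R Extensions"; every value is a placeholder
# that a real project would replace.
DESCRIPTION_TEMPLATE = """\
Package: {name}
Title: Data and Code for an Example Paper
Version: 0.0.1
Author: A. N. Author
Maintainer: A. N. Author <author@example.org>
Description: Data files and analysis scripts accompanying a paper.
License: GPL-3
"""

# Hypothetical choice of which file types count as "data".
DATA_SUFFIXES = {".csv", ".txt"}


def scaffold_compendium(src, dest, name, script_dir="R"):
    """Copy scripts and data from a flat project into a package-like layout."""
    # R package names: letters, digits and dots; start with a letter,
    # at least two characters, not ending in a dot.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9.]*[A-Za-z0-9]", name):
        raise ValueError(f"not a valid R package name: {name!r}")

    src, dest = Path(src), Path(dest)
    (dest / script_dir).mkdir(parents=True, exist_ok=True)
    (dest / "data").mkdir(parents=True, exist_ok=True)

    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix == ".r":
            target = dest / script_dir
        elif suffix in DATA_SUFFIXES:
            target = dest / "data"
        else:
            continue  # leave anything else (notes, figures, ...) behind
        shutil.copy2(path, target / path.name)

    (dest / "DESCRIPTION").write_text(
        DESCRIPTION_TEMPLATE.format(name=name), encoding="utf-8"
    )
    return dest
```

Calling `scaffold_compendium("flat_project", "pkg", "BroodParasite")` would leave `pkg/DESCRIPTION`, `pkg/R/` and `pkg/data/`; the `script_dir` argument lets the scripts go somewhere other than `R/`.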


@gmbecker

gmbecker commented Jun 2, 2015

Carl,

I would argue that R scripts (as opposed to R functions/software) don't
belong in the R/ directory of a compendium. Internally at Genentech, our
spec calls for a separate analysis/ directory, which prevents them from
being run during install/build, but bundles them with any included data or
functions. It provides an (albeit loose) demarcation between the software
(functions) and the analysis code (scripts).

If this were adopted, tooling around it to run scripts from an analysis
package would be pretty straightforward to develop, I think.

~G
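
One way such tooling might look, sketched in Python under the assumptions that the compendium is a source checkout with `.R` scripts directly inside `analysis/` and that `Rscript` is on the PATH. The function names and the `runner` parameter are hypothetical, and this is not the Genentech tooling mentioned above.

```python
import subprocess
from pathlib import Path


def find_analysis_scripts(compendium, analysis_dir="analysis"):
    """Return the .R scripts in <compendium>/<analysis_dir>, sorted by name."""
    root = Path(compendium) / analysis_dir
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir() if p.is_file() and p.suffix.lower() == ".r"
    )


def run_analysis(compendium, analysis_dir="analysis", runner=subprocess.run):
    """Run each analysis script with Rscript from the compendium root.

    `runner` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without R installed.
    """
    base = Path(compendium)
    scripts = find_analysis_scripts(base, analysis_dir)
    for script in scripts:
        # Paths are passed relative to the compendium root, which is also the
        # working directory, so relative paths inside the scripts resolve there.
        # check=True stops at the first script that exits with a non-zero status.
        runner(["Rscript", str(script.relative_to(base))], cwd=str(base), check=True)
    return scripts
```

Sorting by file name means a numeric prefix (`01-clean.R`, `02-model.R`) is enough to fix the execution order.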


Gabriel Becker, PhD
Computational Biologist
Bioinformatics and Computational Biology
Genentech, Inc.

@cboettig
Member

cboettig commented Jun 2, 2015

Ah right, I think that's what's in the rrrpkg readme as well -- analysis
would be better. (One might call it code or scripts, but it does seem
like there is momentum behind analysis, and that is a nicely more relaxed
term should things that are not strictly scripts, e.g. Rmd files, be placed
in there.) Good call.


@cboettig
Member

cboettig commented Jun 2, 2015

Okay, how's this for a more minimal example: https://github.com/cboettig/BroodParasiteDescription

I've tried to make the bare minimum number of changes to https://github.com/duffymeg/BroodParasiteDescription to put it in R package format (see https://dynamicecology.wordpress.com/2015/05/28/my-first-experience-with-github-for-sharing-data-and-code/comment-page-1/; I think this is a simple and realistic example).

Let me know if anyone has feedback on these changes: whether it looks like what we're going for, or needs more (or fewer?) modifications to be realistic & useful. If we think this is good then maybe it's worth making a PR to Meg with these changes, so that we can link her original repo.

@tmalsburg
Author

@cboettig This example is very useful, but it doesn't have the directories R, manuscript, and vignettes. It would be good if everything that is covered by the proposal was part of the "minimal" example.

@cboettig
Member

cboettig commented Jun 2, 2015

@tmalsburg thanks. I'm not sure that those things should be included in the definition of "minimal" -- that project didn't need any user-defined functions, so no R directory. We already have the examples that @benmarwick mentioned which include all of those directories.

Perhaps something more intermediate would still be nice as well (e.g. has R/, maybe manuscript to show a .Rmd example (with pandoc->word as the output format?!), but not all the extra stuff like Docker and Travis that are in the other two examples Ben mentioned).

@benmarwick
Contributor

That's very interesting, your rearrangement of BroodParasiteDescription is the most minimal R package I've ever seen! And I can install it just fine; building it gives a few notes and warnings, but that's fine. If you make a PR to the original authors, I'll make a PR to this readme to add some more detail according to the discussion on this thread, and link to some examples (I'll link to your repo for now, and update it if your PR is accepted).

benmarwick pushed a commit to benmarwick/rrrpkg that referenced this issue Jun 3, 2015
jennybc pushed a commit that referenced this issue Jun 3, 2015
Added some examples, clarify role of DESCRIPTION, and address other issues in #2 and #3
@jennybc
Member

jennybc commented Jun 3, 2015

Thanks @cboettig I think that's a very useful contribution. An example that shows just how thin the "R package layer" can be is very valuable!

@jhollist
Member

@jennybc per your request!

Another example: Modeling Lake Trophic State. I'm happy to add it and submit a PR, but wasn't exactly sure where to add it. This example is kind of in between the intermediate and complex examples. It is also pretty real-world, as the nice clean initial setup got a bit messy, with most code in functions but a lot also embedded in the Rmd.

@benmarwick
Contributor

@jhollist I think that would be a great intermediate example, please do add a mention of it with a PR!

@jennybc
Member

jennybc commented Oct 13, 2015

@jhollist's example added to README in f83ca4a

6 participants