
update tskit #101

Merged: 6 commits merged into MesserLab:master from the tskit_update branch on Aug 28, 2020
Conversation

petrelharp
Collaborator

@petrelharp petrelharp commented Jul 29, 2020

Changes (see the tskit changelog):

  • There is now a mutation.time column! So, tsk_mutation_table_X methods need to use it (a hedged sketch of the updated add_row call is at the end of this comment).
  • table collections loaded from files no longer live on top of the underlying kastore memory.
  • vargen genotypes are now ints, not uints.
  • We can now disable table collection integrity checking when sorting tables, which we do when not in a DEBUG build.
  • Everything now has metadata, so tsk_edge_table_X and tsk_migration_table_X now have metadata and metadata length arguments. But, we don't use these at all, so this is trivial.

TODO:

  • Remove mutation time from the mutation metadata, and use the mutation time column instead. (I think we should do this? This is a file format change!) (not doing this)
  • Set the metadata schemas on the tables.

This hasn't been tested at all yet. It would be nice to have some developer documentation that says how to run the tests.
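For the first bullet above, here is a minimal hedged sketch of what the updated call could look like, assuming the tsk_mutation_table_add_row() signature from the tskit 0.3 C API (the time argument now sits between parent and derived_state); the wrapper name is invented and error handling is omitted:

// Hedged sketch only; not SLiM's actual code.
#include <string.h>
#include <tskit.h>

tsk_id_t
add_mutation_sketch(tsk_table_collection_t *tables, tsk_id_t site, tsk_id_t node,
                    double time, const char *derived_state,
                    const char *metadata, tsk_size_t metadata_length)
{
    return tsk_mutation_table_add_row(&tables->mutations, site, node,
                                      TSK_NULL,             // parent mutation
                                      time,                 // new in tskit 0.3
                                      derived_state, (tsk_size_t) strlen(derived_state),
                                      metadata, metadata_length);
}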

@bhaller
Contributor

bhaller commented Jul 29, 2020

"table collections loaded from files no longer live on top of the underlying kastore memory" -so this fixes #72 ? Or is that a different issue?

@bhaller
Contributor

bhaller commented Jul 29, 2020

"It would be nice to have some developer documentation that says how to run the tests." Well, the problem is that this is pretty technical and not very automated, at present. You can run the -eidosTest and -slimTest tests, which are documented, but they don't exercise tree-sequence recording significantly; they are more tests of APIs and such than tests that complex models work. Then there's the https://github.com/MesserLab/SLiM-Tests repository, but since the tests there are complex models, their results are stochastic, and there's not a well-defined "right" answer for most of them; you have to compare current results against past results, and when there's a discrepancy, track down the cause and verify that it's an intended change in behavior, not a bug. This is an ongoing problem; testing code that is stochastic is tricky. I'm not saying it's not solvable, but I haven't had the time to solve it in a way that is documentable. Sorry! This is why we need a funding line for a QA/testing person who can build better tests that automatically validate the software; it's a hard problem, and is really a full-time job for someone.

@bhaller
Contributor

bhaller commented Jul 29, 2020

These diffs include the new hook for external sorting of the edge table, right? I want to plug into that.

Yes, we can do the file format change, but of course it's tricky since we want to preserve backward compatibility. Let's chat about that.

Anyway, regarding the testing, once the PR feels final, I'll just take it and then I'll do the testing on my side. Not great, but it's what we've got.

@petrelharp
Collaborator Author

Yes, this will close #72.

@petrelharp
Collaborator Author

So: about the metadata - I forget how much of this you know already, but:

First, tskit now has the sort of thing we do with metadata built in - one specifies a metadata schema, which is a new (optional) property of tables, and then tskit will automatically validate and decode it for you (much like pyslim already does). This happens at the Python level, so this will only affect us in that we'll add appropriate metadata schemas to the tables. We could also check that the metadata schemas are correct when loading in tables, but this seems unnecessary.

Second, there is table collection-level metadata also. We should make up a schema for SLiM, and put the things we need to know but currently are pulling out of provenance in there: WF/nonWF, SLiM generation, not sure what else.

Third, there's the mutation time. Currently we store origin_generation in metadata; the time column is in tskit time-ago. We could leave everything the way it is, which would be convenient, since we wouldn't have to translate back and forth. But we really should not be storing the same thing in more than one place. So, we should remove origin_generation from the mutation metadata struct, and get the information out of the mutation time column. This should be straightforward, except for dealing with the discrepancy between tskit_generation and generation.
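Purely as an illustration of that last point (not something from the PR): if the table collection is written out at SLiM generation G, a mutation whose metadata records origin generation g would get tskit time of roughly G - g, since tskit measures time backwards from the present. A hypothetical sketch, ignoring the fractional offsets discussed later in this thread:

// Hypothetical sketch; the function name is invented, and the early()/late()
// and addSubpopSplit() offsets discussed below are ignored.
#include <cstdint>

double TskitTimeFromOriginGeneration(int64_t origin_generation, int64_t current_generation)
{
    // tskit time is "time ago" relative to when the tables are written out;
    // SLiM's origin generation counts forward from the start of the model.
    return static_cast<double>(current_generation - origin_generation);
}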

@bhaller
Contributor

bhaller commented Jul 29, 2020

Checks are failing because the current GitHub head fails a couple of -eidosTest unit tests due to changes in progress under the hood. I can commit a fix that will resolve all of those issues, and then this PR should be able to pass checks, once you rebase it to the current head. All the changes are in Eidos, not SLiM, so you should have zero conflicts. Shall I go ahead and do that commit?

@petrelharp
Collaborator Author

Oh, and: yes, this does have the new sorting hooks in! Here's an example of how to do it: https://github.com/tskit-dev/tskit/blob/master/c/examples/cpp_sorting_example.cpp
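For reference, a minimal hedged sketch of that hook, following the linked cpp_sorting_example.cpp and assuming tskit's tsk_table_sorter_t API; the actual edge-sorting body and error handling are omitted:

// Hedged sketch only; see cpp_sorting_example.cpp for a complete version.
#include <tskit.h>

static int
my_sort_edges(tsk_table_sorter_t *sorter, tsk_size_t start)
{
    // Sort sorter->tables->edges from row `start` onward (e.g. externally, on disk),
    // keeping edges grouped by parent with parent times nondecreasing.
    (void) sorter;
    (void) start;
    return 0;
}

void
sort_with_external_hook(tsk_table_collection_t *tables)
{
    tsk_table_sorter_t sorter;
    tsk_table_sorter_init(&sorter, tables, 0);
    sorter.sort_edges = my_sort_edges;    // install the custom edge sorter
    tsk_table_sorter_run(&sorter, NULL);  // NULL means sort the whole table collection
    tsk_table_sorter_free(&sorter);
}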

@petrelharp
Collaborator Author

Shall I go ahead and do that commit?

Sure, go ahead, but no hurry. I can ignore the Eidos tests for now.

@bhaller
Contributor

bhaller commented Jul 29, 2020

So: about the metadata - I forget how much of this you know already, but:

I forget how much of it I know already, too. :->

First, tskit now has the sort of thing we do with metadata built in - one specifies a metadata schema, which is a new (optional) property of tables, and then tskit will automatically validate and decode it for you (much like pyslim already does). This happens at the Python level, so this will only affect us in that we'll add appropriate metadata schemas to the tables. We could also check that the metadata schemas are correct when loading in tables, but this seems unnecessary.

OK. Checking that the schemas are what we expect them to be on load actually seems like a good idea to me – this would be a way to check that everything is in "SLiM format", without relying on the rest of the load code to catch discrepancies/mismatches when it isn't. It would just be a simple string equality test, right? We should define our schemas in const std::string objects in slim_sim.cpp, and then on load we can just do a schema == standard_schema check and throw an error if there's a mismatch. It would act as a gatekeeper against users trying to do things that we don't allow, and probably either crashing or giving a cryptic error message from something downstream – or even worse, interpreting non-SLiM metadata as SLiM metadata and somehow succeeding in running an incorrect simulation (which would be very unlikely, but still).
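A hedged sketch of that gatekeeper, with invented constant and function names (the real schema string and error machinery would live in slim_sim.cpp):

// Hypothetical sketch only; the schema string is a placeholder.
#include <stdexcept>
#include <string>

static const std::string gSLiM_mutation_metadata_schema = R"({"codec":"struct","...":"..."})";

void CheckMutationMetadataSchema(const std::string &loaded_schema)
{
    // Simple string equality: anything that isn't byte-for-byte our schema is rejected,
    // rather than risking a cryptic downstream error or a misinterpreted simulation.
    if (loaded_schema != gSLiM_mutation_metadata_schema)
        throw std::runtime_error("tree sequence does not use the SLiM mutation metadata schema");
}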

Second, there is table collection-level metadata also. We should make up a schema for SLiM, and put the things we need to know but currently are pulling out of provenance in there: WF/nonWF, SLiM generation, not sure what else.

Yep, this makes sense. Some decisions to be made regarding what belongs in the metadata versus the provenance, probably.

Can pyslim insulate end users from the fact that these things are moving around? I.e., can the old way of accessing all this metadata continue to work (perhaps designed as deprecated/legacy APIs, if you like), so that people don't need to dance around this in their Python code? The alternative, I guess, if people want to write Python code that works with both old and new .trees files, would be that they would have to check the version of the file somehow, and use different pyslim APIs depending on the file version, which would be really gross.

Third, there's the mutation time. Currently we store origin_generation in metadata; the time column is in tskit time-ago. We could leave everything the way it is, which would be convenient, since we wouldn't have to translate back and forth. But we really should not be storing the same thing in more than one place. So, we should remove origin_generation from the mutation metadata struct, and get the information out of the mutation time column. This should be straightforward, except for dealing with the discrepancy between tskit_generation and generation.

I'm a little bit leery of this. The origin_generation field is an integer, defined/managed by SLiM, and actually SLiM even allows models to manipulate it for their own purposes if they want to – it can be used as an extra tag field, in a sense, in models that don't care about the actual origin generation of mutations. It is not modifiable in Mutation, but any value may be passed in to addNewMutation() / addNewDrawnMutation(). This is not exactly recommended, but it's allowed. The new time column in the mutation table, on the other hand, needs to be the true time of introduction of the mutation, or the tree sequence won't even pass integrity checks etc. So it is potentially different.

It is also a floating-point number, right? Since tskit time is continuous? And it would potentially contain fractional values, even for SLiM models, because of the way we play with the tree sequence time to compensate for early()/late(), etc., right? All of this would make it very complicated to use that column anyway.

And then – not that I expect this to be an issue for anybody any time soon – there is the fact that the range of 64-bit integers is much larger than the range of integer values that can be fully represented by double-precision floats; they only get 53 bits of significand, plus their exponent and sign, so above 2^53 they can no longer represent each successive integer exactly; the set of integers they can represent starts to have gaps. So a model that ran for more than 2^53 generations would lose origin generation precision. Again, I don't expect that to bite anyone any time soon, realistically; but I don't like stuffing a 64-bit int value into a double, it sets off alarm bells in my brain, and "oh, this will never bite anybody" is a common source of problems that do eventually bite someone. :->
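A tiny self-contained demonstration of that precision cliff (not SLiM code, just the 53-bit significand limit):

// Demonstration only: 2^53 + 1 is not representable as a double and rounds to 2^53.
#include <cstdint>
#include <iostream>

int main()
{
    int64_t big = (int64_t) 1 << 53;
    std::cout << ((double) big == (double) (big + 1)) << std::endl;   // prints 1 (true)
    return 0;
}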

@petrelharp
Collaborator Author

Checking that the schemas are what we expect them to be on load actually seems like a good idea to me

Ok, let's do it. I think you're right, it'll be just string comparison. (There are probably equivalent schemas that differ in e.g. whitespace, but we can ignore that.)

Can pyslim insulate end users from the fact that these things are moving around?

Yes, totally.

origin_generation can be used as an extra tag field, in a sense,

Oh goodness.

It is also a floating-point number, right? And it would potentially contain fractional values, even for SLiM models, because of the way we play with the tree sequence time to compensate for early()/late(), etc., right?

Hm, good point. It would contain fractional time values for msprime-added neutral mutations, unless we require people to add mutations using the discrete-time Wright-Fisher model, which we don't want to do. I don't think we use fractional values, at present, to denote early/late, however?

So a model that ran for more than 2^53 generations would lose origin generation precision.

Hm. As you say, unlikely to hit anyone anytime soon... 10^15 generations is firmly beyond what anyone is doing.

So, ok - we'll keep this mostly-but-not-entirely redundancy in there. But we better document it!

@bhaller
Contributor

bhaller commented Jul 29, 2020

Checking that the schemas are what we expect them to be on load actually seems like a good idea to me

Ok, let's do it. I think you're right, it'll be just string comparison. (There are probably equivalent schemas that differ in e.g. whitespace, but we can ignore that.)

Good.

Can pyslim insulate end users from the fact that these things are moving around?

Yes, totally.

Great.

origin_generation can be used as an extra tag field, in a sense,

Oh goodness.

Yeah, allowing that might have been a bad decision. :->

It is also a floating-point number, right? And it would potentially contain fractional values, even for SLiM models, because of the way we play with the tree sequence time to compensate for early()/late(), etc., right?

Hm, good point. It would contain fractional time values for msprime-added neutral mutations, unless we require people to add mutations using the discrete-time Wright-Fisher model, which we don't want to do. I don't think we use fractional values, at present, to denote early/late, however?

Ah, not a fractional value for early/late, but the tree-seq time counter flips at a different time than SLiM's counter does in the generation cycle, IIRC. But we do use fractional times in SLiM to resolve time conflicts involved in successive calls to addSubpopSplit() within a single generation; search on tree_seq_generation_offset_ in SLiM's code to find that little bundle of joy.

So a model that ran for more than 2^53 generations would lose origin generation precision.

Hm. As you say, unlikely to hit anyone anytime soon... 10^15 generations is firmly beyond what anyone is doing.

So, ok - we'll keep this mostly-but-not-entirely redundancy in there. But we better document it!

Yep.

Let me know when the ball is in my court. Thanks for dealing with this stuff, happy to help when help is needed. I'll commit those Eidos fixes soon. Right now, going back to assembling IKEA furniture. :-O

@petrelharp
Collaborator Author

Yeah, allowing that might have been a bad decision. :->

It might be worth making a list of overly-flexible things to disallow in SLiM 4.0...

@bhaller
Contributor

bhaller commented Jul 29, 2020

Yeah, allowing that might have been a bad decision. :->

It might be worth making a list of overly-flexible things to disallow in SLiM 4.0...

Sure. Want to kick it off with a new issue? :->

@petrelharp
Collaborator Author

It looks to me from ReadProvenanceTable that the only things we get from provenance are:

  • modelType
  • generation

... so, these should go in metadata instead (this will be a file format bump, but it will greatly simplify things, since we won't have to parse provenance any more). So, it'll look like this:

{
  "model_type" : "WF",
  "generation" : 123
}

For reference, here are the other things written to provenance.

Would it also be helpful to say whether we've got a reference sequence? Anything else?

We also need to say "this is a SLiM tree sequence", but I think we'll do this by checking whether the metadata schema - i.e., the thing that says what should be in the metadata - is a SLiM metadata schema (but, I'm checking with Ben J about best practices here...).

@bhaller
Contributor

bhaller commented Jul 31, 2020

It looks to me from ReadProvenanceTable that the only things we get from provenance are:

  • modelType
  • generation

... so, these should go in metadata instead (this will be a file format bump, but it will greatly simplify things, since we won't have to parse provenance any more). So, it'll look like this:

{
  "model_type" : "WF",
  "generation" : 123
}

This sounds good. Of course we can't actually get rid of the provenance-parsing code, because we need to be able to read the old-format files for backward compatibility; but it'll certainly streamline the code path taken by up-to-date users.

Would it also be helpful to say whether we've got a reference sequence? Anything else?

Hmm. I don't think there's anything else we need, in the sense that we seem to be getting by fine with what we had. But perhaps we should write out a flag saying whether the model was declared as nucleotide-based or not (and if it was, that would imply that a reference sequence is present, since a reference sequence is required by SLiM for all nucleotide-based models). And perhaps we should write out the spatial dimensionality of the model, as "", "x", "xy", or "xyz", and write out which dimensions are periodic (which would be "", "x", "y", "xy", "xz", "yz", or "xyz"). And perhaps we should write out whether it is a separate-sex model (i.e., initializeSex() was called) or not, with a boolean flag. I don't think we need any of those things, and there's no need to do anything with any of them on read, for now; but since we have a convenient/correct place to put them now, there's no harm in writing them out, and perhaps we or someone else will find them useful in future, and perhaps it will even prevent us from having to revise the file format again some time down the road. If you're not sure where to get the necessary info from, I can add these things after your PR is accepted, of course.

We also need to say "this is a SLiM tree sequence", but I think we'll do this by checking whether the metadata schema - i.e., the thing that says what should be in the metadata - is a SLiM metadata schema (but, I'm checking with Ben J about best practices here...).

Perhaps also by checking that the tree sequence metadata is present, and that its length matches the expected length? But yeah, checking the schema makes sense to me.

@bhaller
Contributor

bhaller commented Jul 31, 2020

Oh, and: I was going to fix Eidos so the checks don't fail, but I realized that that's waiting on the quantile() function getting fixed; at present it fails its unit tests, so the check for this PR is going to fail until that gets fixed.

@petrelharp
Collaborator Author

Ok, so:

{
  "model_type" : "WF",
  "generation" : 123,
  "spatial_dimensionality" : "xy",
  "spatial_periodicity" : "x",
  "separate_sexes" : true,
  "nucleotide_based" : false
}

@petrelharp
Collaborator Author

The other Ben suggests putting all this stuff under a top-level "slim" key, so that other programs can add other stuff. So:

{
 "slim" : {
  "model_type" : "WF",
  "generation" : 123,
  "spatial_dimensionality" : "xy",
  "spatial_periodicity" : "x",
  "separate_sexes" : true,
  "nucleotide_based" : false
 }
}

And: thinking more, I don't think we should just do string comparison to check for slim-tree-sequence-ey-ness: really, all we need to do is check for presence of certain metadata; that means checking for presence of certain keys in the schema. If other stuff is there, that's OK, so it leaves the door open for some other program to make a SLiM tree sequence with extra information in it that SLiM doesn't use. Since you've already got json-parsing code in there, this will be easy. So, we just need to check if the top-level metadata has model_type and generation; if the mutation metadata has selection_coeff and origin_generation and the other things, etcetera.
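A hedged sketch of that key-presence check, assuming an nlohmann::json-style parser (whichever JSON library the existing parsing code uses would look similar); the error reporting is a placeholder:

// Hypothetical sketch only, not SLiM's actual load path.
#include <stdexcept>
#include <string>
#include "json.hpp"

void CheckTopLevelMetadata(const std::string &metadata_json)
{
    nlohmann::json meta = nlohmann::json::parse(metadata_json);

    // Extra keys added by other programs are fine; we only require what SLiM needs.
    if (meta.find("slim") == meta.end())
        throw std::runtime_error("not a SLiM tree sequence: no top-level \"slim\" key");

    const nlohmann::json &slim = meta["slim"];
    if (slim.find("model_type") == slim.end() || slim.find("generation") == slim.end())
        throw std::runtime_error("SLiM metadata is missing model_type or generation");
}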

@bhaller
Contributor

bhaller commented Aug 2, 2020

OK, the build is green on Travis-CI again now, all tests passing including those for quantile(). So if you rebase this PR onto the current head, hopefully Travis will say this PR is green, too.

@bhaller
Contributor

bhaller commented Aug 7, 2020

So, the Travis build is now succeeding on Xenial but failing on macOS. The macOS failure has to do with std::isnan() and std::isfinite() calls not getting matched up to the correct internal machinery. This is not a problem in the master branch, so it is due to changes in the PR.

I think the culprit is the #ifdef __clang__ stuff in treerec/tskit/core.h:

#ifdef __clang__
 /* Work around bug in clang >= 6.0, https://github.com/tskit-dev/tskit/issues/721
  */
 #if __has_builtin(__builtin_isnan)
 #undef isnan
 #define isnan __builtin_isnan
 #else
 abort();
 #endif
 #if __has_builtin(__builtin_isfinite)
 #undef isfinite
 #define isfinite __builtin_isfinite
 #else
 abort();
 #endif
 #endif

Messing with these kinds of internal math library definitions seems pretty evil to me (although I realize it's trying to work around a compiler bug), especially when done with preprocessor redefinitions like this, and this seems directly related to the build errors on Travis, which look like this:

/Users/travis/build/MesserLab/SLiM/core/genomic_element_type.cpp:137:6: error: no member named '__builtin_isfinite' in namespace 'std'; did you mean simply '__builtin_isfinite'?
                                !std::isfinite(row_1) || !std::isfinite(row_2) || !std::isfinite(row_3) || !std::isfinite(row_4))
                                 ^~~~~~~~~~~~~
                                 __builtin_isfinite

Whatever bug in clang >= 6.0 that stuff in the header is trying to fix, it breaks the build of C++ files that (directly or indirectly) include core.h, because it has the side effect of messing up the C++ std::isfinite() and std::isnan() functions; the preprocessor is replacing those with std::__builtin_isfinite() and std::__builtin_isnan() due to the redefinitions in core.h, which is obviously wrong. I also wonder whether this clang bug is even present in recent versions of clang; if this is just a fix for an obsolete bug in old versions of clang, then maybe we don't even need to cater to those old versions, since we're now on clang 10.0 and 9.0 is pretty widespread? But in any case, the fix as it stands in core.h simply doesn't work in a C++ context.

More broadly, I would suggest that it might be a bad idea for treerec/tskit/core.h to include <math.h> at all, or to try to do math-y stuff in the header. This can be a problem, for a header that gets included in both C and C++ contexts, because C++ files aren't really supposed to include the C math header; they're supposed to include <cmath> instead. It's not clear whether including both <math.h> and <cmath> is prohibited by the C++ standard, or even discouraged by the C++ standard; but be that as it may, it can cause problems on some platforms and so it should be avoided. This has bitten me before, repeatedly. The fix is to avoid putting an include of <math.h> anywhere that might end up getting included by a C++ source file. So the <math.h> include in treerec/tskit/core.h would ideally go into a .c file, not in the .h file. If math-y stuff really needs to be in the .h file, then the .h file might need to do a #ifdef __cplusplus and include <cmath> instead when it is being included in a C++ context; but that would probably be a headache.
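For what it's worth, a hedged sketch of the shape of that guarded include (not a tested fix for core.h):

/* Sketch only: a header shared between C and C++ could pull in the C++ math
 * header for C++ translation units and the C header otherwise. */
#ifdef __cplusplus
#include <cmath>
#else
#include <math.h>
#endif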

But it may be that fixing/removing that #ifdef __clang__ stuff somehow would suffice to fix this build problem. I think including <math.h> in one of the tskit .h files is kinda asking for trouble, but maybe you will be able to get away with it if the first issue gets fixed.

Anyway, apart from that issue, it looks like the self-tests are passing on Xenial, so it looks like you fixed the bug that was biting you yesterday, so that's great. I'm a little nervous about the nature of the fix, though. This whole "missing data" thing seems like a problem to me; I find it worrisome that tskit might be inferring that "missing data" is present in a SLiM tree sequence just because of a particular pattern of ancestry – a pattern that is (as I understand it) omnipresent in forward simulation, no less! I'm not sure it's the right decision to ship tskit 0.3 with this issue present, and I'm concerned about shipping SLiM 3.5 with this issue present, too. I know Jerome wants to get 0.3 out the door, but I'm concerned that letting this problem go out in a major release may prove to be a mistake. I'd like to see whether the full R test suite for SLiM passes with crosscheck enabled. If it does, then perhaps I'll relax about this, but "ship it wrong and we'll think about it later" just feels like a bad idea to me.

@bhaller
Contributor

bhaller commented Aug 7, 2020

It looks like the clang bug in question was fixed in January of 2019; I don't know how to tell what version of clang/llvm that means the fix was shipped in.

If removing that fix is not possible, and moving the math-y stuff to the .c file and getting rid of the math.h include in core.h is not possible, then perhaps core.h could at least clean up after itself? If it redefined isnan and isfinite only within its own private context, and restored the original definitions at the end of the header, that would un-break std::isfinite() and std::isnan(). But doing this is not supported in bog-standard C: https://stackoverflow.com/a/54916469/2752221
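If compiler extensions were acceptable, the push_macro/pop_macro pragmas would do this cleanup; they are not standard C, but clang, gcc, and MSVC all support them. A hedged sketch, not the actual core.h change:

/* Sketch only: save the <math.h> macro, use the builtin locally, then restore
 * whatever definition was in effect before this header. */
#pragma push_macro("isnan")
#undef isnan
#define isnan __builtin_isnan
/* ... declarations that need the builtin ... */
#pragma pop_macro("isnan")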

@bhaller
Contributor

bhaller commented Aug 7, 2020

I think I'll tag @jeromekelleher here, since it's already Friday afternoon in the UK and @petrelharp probably won't be online for a couple hours. Jerome, see the two comments above; this is a high-priority bug since it breaks C++ code that includes core.h when using clang.

@petrelharp
Collaborator Author

petrelharp commented Aug 7, 2020

I'll find some input on the library problem - it'd be nice to fix it in a way that works upstream, too, and I don't know what I'm doing here.

About missing data: an isolated sample with no mutations above it is a genome that we don't know anything about, so I think it does make sense that this is 'missing data'. I do think that one that has mutations above it shouldn't be missing. At least the behavior is optional; we should put a big note about setting impute_missing_data=True in the pyslim docs if it doesn't change.

@petrelharp probably won't be online for a couple hours

hah!

@bhaller
Contributor

bhaller commented Aug 7, 2020

@petrelharp probably won't be online for a couple hours

hah!

Your chief weapon is surprise!

@bhaller
Contributor

bhaller commented Aug 7, 2020

About missing data: an isolated sample with no mutations above it is a genome that we don't know anything about, so I think it does make sense that this is 'missing data'. I do think that one that has mutations above it shouldn't be missing.

I don't think I agree. It's common to start a SLiM simulation with, e.g., 1000 individuals (2000 genomes), one of which contains a mutation (a beneficial sweep mutation, say), the other 1999 of which are empty. We know the same amount of information about all of them: we know that the one genome contains the sweep mutation, and we know that the other 1999 don't. None of them should be taken to be "missing data" – they are all in a known, deliberately chosen state, and we have definitive information about the presence or absence of the sweep mutation for all 2000 of them. I am very concerned about designating any of them as being "missing" according to some heuristic; I think to be designated as "missing" ought to involve some kind of explicit declaration that says "this node is not just empty, it is actually missing".

Empty nodes happen all the time in forward simulation. Even if it is presently harmless for tskit to consider empty first-generation individuals "missing", that may not remain true forever; at some point, it may make sense for tskit to treat "missing" nodes differently from other nodes in this or that respect, and at that point, when the "missing" versus "not missing" distinction is made more consequential, I am concerned that that would then expose this past decision as being the wrong choice – but it would then be too late to fix it easily, because it would have become ossified into explicit policy, other code would depend upon it, etc.

"Missing data" is not a concept that exists in SLiM, and so I think it is a mistake to allow tskit to infer that any node in SLiM is "missing". Even if it's harmless now, I think it may well come back to bite us later.

@bhaller
Contributor

bhaller commented Aug 9, 2020

Hey. Made some comments on your last commit. Maybe I should have done them as a "review"; does it matter? What's the difference? Anyway, looking pretty good. I should do a more comprehensive code review once you've got things final-ish. Thanks!

@petrelharp
Collaborator Author

I'm still fuzzy on the distinction also, but I think that 'reviewing' lets the comments come through in one discrete chunk (like, I don't get them 'til you're done with the review, so you can wait 'til you're done going through entirely to send them to me), and also gives a more formal way to "approve" the changes. Either way works.

@petrelharp
Collaborator Author

Ok - I think everything is working? I've got things synced with pyslim and passing tests over there, at tskit-dev/pyslim#89 . Now'd be a good time to have a look, @bhaller.

Contributor

@bhaller bhaller left a comment

Hi Peter. Great stuff, looks like you had to wade through a lot of tricky things!

Review comments (resolved) on: core/slim_globals.h (5), treerec/implementation.md (3), treerec/tests/test_mutation_output.py, treerec/tskit/convert.c
@petrelharp
Collaborator Author

I think I've addressed all these comments. Have a look? You should read the metadata schema over here: https://github.com/tskit-dev/pyslim/pull/89/files#diff-e1abd0a63c6a0ff314e018d205d42ca4R17 .

@bhaller
Contributor

bhaller commented Aug 12, 2020

I think I've replied to your replies as they came at me. Is there a place to see only the loose ends that remain? When I view "Files changed" now it's still showing me lots of conversations that I think we have resolved. What's left?

@petrelharp
Collaborator Author

The "Files changed" tab gives you all the associated changes (don't forget to refresh it!), and if you go to the "Commits" tab you can click on particular commits; all my edits are in this commit except for adding the meaning of values in this commit.

Review comments (resolved) on: core/slim_globals.h, core/slim_sim.cpp (3), treerec/tskit/text_input.c
@petrelharp
Collaborator Author

Ok - I think I got all those issues, in 32747cd.

@petrelharp
Collaborator Author

And renaming the constants, in 3cdf4e8

CMakeLists.txt Outdated
- set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -std=c11 -Wno-deprecated-register -Wno-attributes -Wunused-label -Wimplicit -Wunused-variable -Wunused-value -Wno-pragmas -Wimplicit-fallthrough -Wempty-body -Wshadow -Wparentheses -Wdangling-else -Wmissing-prototypes -Wswitch -Wpointer-sign -Wsign-compare -Wstrict-prototypes -Wconversion -Wno-sign-conversion -Wuninitialized")
- set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -Wno-deprecated-register -Wno-attributes -Wunused-label -Wimplicit -Wunused-variable -Wunused-value -Wno-pragmas -Wimplicit-fallthrough -Wempty-body -Wshadow -Wparentheses -Wdangling-else -Wmissing-prototypes -Wswitch -Wpointer-sign -Wsign-compare -Wstrict-prototypes -Wconversion -Wno-sign-conversion -Wuninitialized")
+ set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -std=c11 -Wno-attributes -Wunused-label -Wimplicit -Wunused-variable -Wunused-value -Wno-pragmas -Wimplicit-fallthrough -Wempty-body -Wshadow -Wparentheses -Wdangling-else -Wmissing-prototypes -Wswitch -Wpointer-sign -Wsign-compare -Wstrict-prototypes -Wno-sign-conversion -Wuninitialized")
+ set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -Wno-attributes -Wunused-label -Wunused-variable -Wunused-value -Wno-pragmas -Wimplicit-fallthrough -Wempty-body -Wshadow -Wparentheses -Wdangling-else -Wswitch -Wsign-compare -Wno-sign-conversion -Wuninitialized")
Contributor

I'm about to commit these same changes on the master branch; I'm trying to build on Ubuntu now and the warnings are bugging me. So you can just revert this file back to the state of master.

petrelharp and others added 5 commits August 27, 2020 20:33
.

time adjustments

passes treerec/tests!

genotypes fix!

.

top-level metadata

first pass at metadata schemas

schemas schemas more schemas

.

fighting with metadata

pyslim passes tests

update to remove spaces

minor update of tskit C

edits

added possible values

fixes

rename constants

change migration_rec_count_ to uint32_t

fix two warnings

clean up warnings on gcc

schema update

update is_null type in schema
@petrelharp
Collaborator Author

OK, I rebased and squashed down a bunch of those commits. I've done the same for pyslim. This could be merged, if that's the plan.

@bhaller
Contributor

bhaller commented Aug 28, 2020

Awesome! I'll do it tomorrow morning, since if there's fallout I don't want to deal with it tonight. Thanks!

@petrelharp petrelharp mentioned this pull request Aug 28, 2020
@bhaller bhaller merged commit a4048e8 into MesserLab:master Aug 28, 2020
@bhaller
Contributor

bhaller commented Aug 28, 2020

OK, passes all tests with crosscheck on. Deleting the branch.

@bhaller bhaller deleted the tskit_update branch August 28, 2020 23:31