Provide a peek goal to easily view BUILD metadata from command line #11347

jriddy · 2020-12-19T14:52:50Z

Problem

It would be convenient for Pants to be able to export build metadata, for either human or programmatic consumption. This seems to be the essence of issue #4861, which this PR addresses.

Solution

This PR adds a new peek goal, which allows us to look at the BUILD files for given targets.

Eric-Arellano

This is awesome, really cool work!

--

For raw output, I agree with the decision to simply print the whole BUILD file. Way too tricky to otherwise try to identify which part is applicable. And we hope/expect BUILD files to be fairly small thanks to dep inference.

However, I could see it getting confusing when you specify multiple targets/files, and mapping what belongs to what. Thoughts on detecting the case of >1 "spec", and logging a warning to consider using the JSON format? We do that with ./pants py-constraints:

pants/src/python/pants/backend/python/mixed_interpreter_constraints/py_constraints.py

Lines 173 to 180 in 50b3e52

    if len(addresses) > 1:
        merged_constraints_warning = (
            "(These are the constraints used if you were to depend on all of the input "
            "files/targets together, even though they may end up never being used together in "
            "the real world. Consider using a more precise query or running "
            "`./pants py-constriants --summary`.)\n"
        )
        output_stdout(indent(fill(merged_constraints_warning, 80), "  "))

--

The JSON output is excellent: it shows the user what Pants's view of a particular file/target actually is. That transparency is great for debugging. We might want to make JSON the default?

I'd recommend two tweaks:

1. Rather than `alias`, call it `target_type`, or something similar. "Alias" is an implementation detail.
2. I think we want to render default values? That way, we show a complete view of what Pants is doing. It also allows for safer JSON queries, as you're guaranteed the key will exist. If it gets too chatty, we could consider an option like `--ignore-defaults`.

--

In #4861, we mentioned adding a table format in a followup. I'm not sure that would really be necessary - the mix of Raw and JSON is very helpful already.

--

In #4861, I pondered

Should [we] evaluate the sources and dependencies fields, or show the raw values?

After seeing this PR, I think no, we should not. ./pants dependencies already exists for deps, and ./pants filedeps for sources. It's redundant imo, and also non-trivial, to include those here, including a performance hit.

src/python/pants/backend/project_info/peek.py

Eric-Arellano · 2020-12-20T02:53:34Z

If you have a chance, it'd also be great to add some tests. You'd want to use Approach #3: Rule Runner, maybe also some normal unit tests (Approach #1).

Here's something to get you started:

# Copyright 2020 Pants project contributors (see CONTRIBUTORS.md).
# Licensed under the Apache License, Version 2.0 (see LICENSE).

import pytest
from textwrap import dedent

from pants.core.target_types import Files
from pants.backend.project_info import peek
from pants.backend.project_info.peek import Peek
from pants.testutil.rule_runner import RuleRunner


@pytest.fixture
def rule_runner() -> RuleRunner:
    return RuleRunner(rules=peek.rules(), target_types=[Files])


def test_example(rule_runner: RuleRunner) -> None:
    rule_runner.add_to_build_file("project", "# A comment\nfiles(sources=[])")
    result = rule_runner.run_goal_rule(Peek, args=["project"])
    assert result.stdout == dedent(
        """\
        -------------
        project/BUILD
        -------------
        # A comment
        files(sources=[])

        """
    )

Note that we likely do want to do exact string matches, as formatting matters for Pants goals and we don't want to regress.

It would be good to test running on both a BUILD target address (like project:app) and a file address, like project/app.py or project/app.py:lib.
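The expected string in the starter test suggests a banner layout for raw output: a row of dashes as long as the BUILD file path, the path, another row of dashes, then the file contents and a blank line. The following is only a sketch of that layout, not the code in peek.py; `render_raw` and its mapping argument are hypothetical names chosen for illustration.

```python
from typing import Mapping


def render_raw(build_files: Mapping[str, str]) -> str:
    """Render each BUILD file under a dashed banner (hypothetical helper)."""
    blocks = []
    for path, content in build_files.items():
        banner = "-" * len(path)  # banner width follows the path length
        # Assumption: one trailing blank line separates consecutive files.
        body = content.rstrip("\n")
        blocks.append(f"{banner}\n{path}\n{banner}\n{body}\n\n")
    return "".join(blocks)


if __name__ == "__main__":
    print(render_raw({"project/BUILD": "# A comment\nfiles(sources=[])"}), end="")
```

Pinning the whole rendered string this way is what makes an exact-match assertion practical: any change to banner width or spacing fails the test immediately.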

jriddy

This is awesome, really cool work!

Thanks! Glad to finally be able to contribute!

For raw output, I agree with the decision to simply print the whole BUILD file. Way too tricky to otherwise try to identify which part is applicable. And we hope/expect BUILD files to be fairly small thanks to dep inference.

I played (a very small bit) with some inspection tools to find the applicable part, but it looked complicated. It could be an issue in its own right if we wanted to pursue it more.

However, I could see it getting confusing when you specify multiple targets/files, and mapping what belongs to what. Thoughts on detecting the case of >1 "spec", and logging a warning to consider using the JSON format? We do that with ./pants py-constraints:

I don't think a warning is necessary if we make JSON the default output, but I'll defer to your judgement on this.

The JSON output is excellent: it shows the user what Pants's view of a particular file/target actually is. That transparency is great for debugging. We might want to make JSON the default?

I'd recommend two tweaks:
1. Rather than `alias`, call it `target_type`, or something similar. "Alias" is an implementation detail.

2. I think we want to render default values? That way, we show a complete view of what Pants is doing. It also allows for safer JSON queries, as you're guaranteed the key will exist. If it gets too chatty, we could consider an option like `--ignore-defaults`.

I agree with these suggestions.

Make JSON the default output style
Rename alias in output to target_type
Show default values by default
Add --exclude-defaults flag (optional)
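As a rough illustration of the checklist above, the sketch below renders one target as a JSON object keyed by `target_type`, filling in default values unless the caller opts out. Everything here is assumed for the example: `FIELD_DEFAULTS`, `render_target`, and the field names are hypothetical and do not reflect the Pants target API.

```python
import json
from typing import Any, Dict

# Hypothetical defaults for a made-up target type; not taken from Pants.
FIELD_DEFAULTS: Dict[str, Any] = {"dependencies": None, "description": None, "tags": None}


def render_target(
    address: str,
    target_type: str,
    explicit: Dict[str, Any],
    exclude_defaults: bool = False,
) -> Dict[str, Any]:
    """Build the JSON-ready view of one target (illustrative only)."""
    if exclude_defaults:
        # Only fields the BUILD file set explicitly are shown.
        fields = dict(explicit)
    else:
        # Defaults first, then explicit values override them, so every
        # known key is guaranteed to be present in the output.
        fields = {**FIELD_DEFAULTS, **explicit}
    return {"address": address, "target_type": target_type, **fields}


if __name__ == "__main__":
    full = render_target("project:app", "files", {"sources": []})
    print(json.dumps(full, indent=2))
```

With defaults included, a query for a key such as `tags` never has to handle a missing key, which is the "safer JSON queries" point made in the review.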

In #4861, we mentioned adding a table format in a followup. I'm not sure that would really be necessary - the mix of Raw and JSON is very helpful already.

Agreed. It's human-readable enough, and machine-readable with tools like jq.

In #4861, I pondered

Should [we] evaluate the sources and dependencies fields, or show the raw values?

After seeing this PR, I think no, we should not. ./pants dependencies already exists for deps, and ./pants filedeps for sources. It's redundant imo, and also non-trivial, to include those here, including a performance hit.

Agreed, and if you really want that, jq + xargs are your friends:

./pants peek ... | jq -r ... | xargs ./pants filedeps

Although I admit that trying to come up with a good example jq query made me realize that it doesn't seem to play well with arbitrary keys like we're using, so perhaps making a list of objects is a better way forward. Any opinion on this, @Eric-Arellano ?
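To make the two shapes under discussion concrete, here is a small sketch that converts an object keyed by address into a list of objects that each carry their own `address` field. The sample addresses and field names are invented for illustration; the actual peek output may differ.

```python
import json
from typing import Any, Dict, List

# Shape 1: one object keyed by arbitrary target addresses (sample data).
keyed: Dict[str, Dict[str, Any]] = {
    "project:app": {"target_type": "files"},
    "project:lib": {"target_type": "files"},
}


def to_list(by_address: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Shape 2: a list of objects, with the address moved into each entry."""
    return [{"address": addr, **fields} for addr, fields in by_address.items()]


as_list = to_list(keyed)
# With a list, a consumer can iterate without knowing any keys in advance.
addresses = [target["address"] for target in as_list]

if __name__ == "__main__":
    print(json.dumps(as_list, indent=2))
    print(addresses)
```

The list form trades direct lookup by address for uniform iteration, which is usually what a pipeline feeding another command wants.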

jriddy · 2020-12-19T20:07:39Z

src/python/pants/backend/project_info/peek.py

+
+    def default(self, o):
+        """Return a serializable object for o."""
+        if isinstance(o, Requirement):


I don't know if this is the best way to handle objects we find in build metadata. Will they all be meaningfully serializable? Should we just do our best and str() what we can't encode to JSON directly?
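One way to read the "str() what we can't encode" option is an encoder whose `default` hook falls back to `str()` for any value the standard library cannot serialize. This is a minimal sketch of that idea, not the encoder from the PR; `HypotheticalRequirement` is a stand-in class invented for the example.

```python
import json
from typing import Any


class HypotheticalRequirement:
    """Stand-in for a non-JSON-native value found in a target field."""

    def __init__(self, spec: str) -> None:
        self.spec = spec

    def __str__(self) -> str:
        return self.spec


class FallbackEncoder(json.JSONEncoder):
    def default(self, o: Any) -> Any:
        # json only calls default() for objects it cannot serialize natively,
        # so built-in types (str, int, list, dict, ...) are unaffected.
        return str(o)


encoded = json.dumps(
    {"requirements": [HypotheticalRequirement("ansicolors>=1.18")], "sources": ["*.py"]},
    cls=FallbackEncoder,
    sort_keys=True,
)

if __name__ == "__main__":
    print(encoded)
```

The trade-off is that the output always serializes, but the quality of each value depends on how informative that type's `__str__` is.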

jriddy · 2020-12-20T14:33:31Z

If you have a chance, it'd also be great to add some tests. You'd want to use Approach #3: Rule Runner, maybe also some normal unit tests (Approach #1).

I always intended to add tests, just didn't get around to it on the first pass, as I wanted to make sure the behavior was right before I started baking it in. Thanks for the template.

rule runner tests
unit tests

jsirois · 2020-12-20T15:13:47Z

Although I admit that trying to come up with a good example jq query made me realize that it doesn't seem to play well with arbitrary keys like we're using, ...

FWIW jq supports arbitrary keys by quoting them (you can also index an object with a quoted key - .["..."]) so I don't think jq needs to be a strong constraint on the output format you choose:

$ echo '
{"foo": "bar"}
{"//super/strange/pants:key": {"baz": 42}}
{"eggs": "spam"}
' | jq '."//super/strange/pants:key".baz'
null
42
null

jriddy · 2020-12-20T22:01:06Z

I've addressed most of these issues, except for the tests, which I'll start adding.

I do have one question left, which is about the encoding of BUILD files. I get them as bytes in the digest, but how do I know what encoding to use to decode those bytes back to a string? The encoding could theoretically vary depending on how the file was saved, right? Or do you normalize to a particular encoding like utf-8 when converting to a digest?
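The thread does not record an answer to the encoding question. As a sketch only, one defensive option is to assume UTF-8 (the default source encoding for Python 3 files) and fall back to replacement characters instead of failing on undecodable bytes. `decode_build_file` is a hypothetical helper, and the UTF-8 assumption is mine, not something confirmed here about how Pants stores digests.

```python
def decode_build_file(content: bytes) -> str:
    """Decode BUILD file bytes, assuming UTF-8 (an assumption, not Pants policy)."""
    try:
        return content.decode("utf-8")
    except UnicodeDecodeError:
        # Keep going with U+FFFD replacement characters so that a single
        # badly encoded file does not abort the whole goal.
        return content.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_build_file("files(sources=['café.txt'])".encode("utf-8")))
    print(decode_build_file(b"# bad byte: \xff"))
```

Whether to replace or raise is a design choice: replacing favors always producing output, while raising would surface a mis-encoded BUILD file earlier.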

Eric-Arellano

Awesome, looking great!

jriddy · 2020-12-22T01:32:54Z

Thanks for the feedback. I'll update according to the comments and add some real test cases, but I won't have time to do this until about the 24th or so. Since it's the holidays, I won't expect responses. Have a good one!

# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]

jriddy · 2021-07-17T14:02:28Z

Sorry I left this dormant so long. I mostly just forgot about it (or had it at the very back of my mind) until @cczona pinged me to remind me about this, so thanks for that!

I've addressed the previous comments and added some tests. I couldn't find a much better way to test it than to test raw output generation directly. I could come up with more scenarios using a generator, but y'all don't seem to be using hypothesis or another generative testing tool, so I'll leave it at some hand-written examples.

Let me know if there's anything left I need to do to make this cut the mustard.

Eric-Arellano

Great stuff! One small change

src/python/pants/backend/project_info/peek.py

src/python/pants/backend/project_info/peek_test.py

stuhood

Looks great: thanks a lot for iterating on this! Will merge on green.

jriddy marked this pull request as ready for review December 19, 2020 20:03

jriddy changed the title ~~WIP: Implement a peek goal~~ Provide a peek goal to easily view BUILD metadata from command line Dec 19, 2020

Eric-Arellano reviewed Dec 20, 2020

jriddy commented Dec 20, 2020

Eric-Arellano reviewed Dec 21, 2020

Eric-Arellano reviewed Dec 23, 2020

jriddy force-pushed the peek-goal branch from 03f9a6d to 6a9b015 Compare January 7, 2021 00:42

jriddy force-pushed the peek-goal branch from 6a9b015 to a7781c2 Compare March 5, 2021 15:06

Base automatically changed from master to main March 19, 2021 19:20

jriddy added 12 commits July 16, 2021 17:22

implement simplest possible peek goal version (16b26a5)
complete render raw output (095fa30)
clean up peek impl (3d996aa)
render json output (6d4ca53)
force str() serialization of unknown value types (e15134c)
use unexpanded targets (9045c7b)
make json serialization default (f7528ae)
simplify raw output rendering (bf8e279)
add outputting mixin (9c22c25)
leave defaults in by default (55e887e)
add test skeleton (159d3bb)
style fixes after PR comments (547936b)

jriddy force-pushed the peek-goal branch from a7781c2 to 547936b Compare July 16, 2021 21:30

jriddy added 3 commits July 17, 2021 08:03

bring current (1a565a4)
shore up testing for peek goal (bedf8de)
expand tests to cover several json output scenarios (ac6db67)

Eric-Arellano approved these changes Jul 19, 2021

Eric-Arellano requested review from benjyw and stuhood July 19, 2021 17:56

Eric-Arellano reviewed Jul 19, 2021

rely on targets injection into peek params (9ed76cd)

jriddy force-pushed the peek-goal branch from ce31890 to 9ed76cd Compare July 19, 2021 19:54

Eric-Arellano approved these changes Jul 19, 2021

stuhood approved these changes Jul 19, 2021

stuhood merged commit b5781b5 into pantsbuild:main Jul 20, 2021

Eric-Arellano mentioned this pull request Aug 2, 2021

Pants peek goal #4861

Closed
