
a lot of data with more questions than pictures in SEED-Bench-2 level L2, is this reasonable? #15

nemonameless opened this issue Dec 12, 2023 · 5 comments

nemonameless commented Dec 12, 2023

An example from SEED-Bench_v2_level1_2_3.json:

        {
            "answer": "B",
            "choice_a": "The man and woman in the image are both looking away from the camera.",
            "choice_b": "The woman's hair is black.",
            "choice_c": "The woman's dog is on the couch next to her in the image.",
            "choice_d": "There are two people in the image.",
            "data_id": [
                "task23/ICL_images/in_context_attribute_2/1.jpg",
                "task23/ICL_images/in_context_attribute_2/2.jpg",
                "task23/ICL_images/in_context_attribute_2/3.jpg"
            ],
            "data_source": "SEED-Bench v2",
            "data_type": "Interleaved Image",
            "level": "L2",
            "question": "<img>: The predominant color of the uniforms worn by the players is blue. <img>: The most notable color present in the woman's outfit is orange. <img>:",
            "question_id": "23_0",
            "question_type_id": 23,
            "subpart": "Interleaved Image & Text Comprehension",
            "version": "v2"
        },

There are 360 questions that end in this pattern (`<img>:` with nothing after it). Did you upload the wrong data?

@nemonameless nemonameless changed the title a lot of data with more questions than pictures in level L2, is this reasonable? a lot of data with more questions than pictures in SEED-Bench-2 level L2, is this reasonable? Dec 12, 2023
@Bohao-Lee
Collaborator

Thank you very much for your interest in our work. I have made the necessary modifications. Currently, there are 509 occurrences of the "<img>" token, which mark the positions of the corresponding images in the L2 problems. For In-Context Captioning, there are 120 problems with 3 images per problem, resulting in 360 images. Interleaved Image-Text Analysis consists of 49 problems with a total of 149 images.
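
For reference, here is a minimal, unofficial sanity check of these counts. It assumes the records look like the JSON example above and that the file stores them either as a bare list or under a top-level "questions" key (adjust the loading step if your copy differs):

```python
import json
from collections import Counter

# Unofficial sanity check of the "<img>" counts quoted above; the file name and
# the "questions" key are assumptions based on the example in this thread.
with open("SEED-Bench_v2_level1_2_3.json") as f:
    data = json.load(f)
records = data["questions"] if isinstance(data, dict) else data

img_counts = Counter()
for rec in records:
    if rec.get("level") == "L2":
        # Each "<img>" placeholder marks where an image from "data_id" is
        # interleaved into the question text.
        img_counts[rec.get("subpart", "unknown")] += rec["question"].count("<img>")

print(img_counts)                # per-subpart "<img>" totals
print(sum(img_counts.values()))  # expected to be 509 per the comment above
```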

@nemonameless
Author

nemonameless commented Dec 12, 2023

You only fixed `[img]` -> `<img>`, but that does not resolve the question I asked.

 "question": "<img>: The predominant color of the uniforms worn by the players is blue. <img>: The most notable color present in the woman's outfit is orange. <img>:",

There is still no question after the last `<img>`.
Please fix it. Thanks.

@Bohao-Lee
Collaborator

In fact, this format is intentional. As mentioned in our paper, "Part-2 evaluates MLLMs' comprehension of arbitrary interleaved image-text inputs, including In-Context Captioning. In this task, two examples of image-caption pairs along with an image are provided, and the model is expected to describe the specific aspect of the image." For more details, please refer to Section 3.2.2 of our SEED-Bench-2 paper.
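
To make the structure concrete, here is a small, unofficial illustration (not code from this repo) of how the `<img>` placeholders line up with the paths in "data_id" for the record quoted above: two completed image-caption pairs followed by a query image whose caption the model must supply.

```python
# Hedged illustration only: splitting the question on "<img>" yields one text
# segment per image, so the last image carries an empty caption to be filled in.
record = {
    "question": ("<img>: The predominant color of the uniforms worn by the players is blue. "
                 "<img>: The most notable color present in the woman's outfit is orange. <img>:"),
    "data_id": ["task23/ICL_images/in_context_attribute_2/1.jpg",
                "task23/ICL_images/in_context_attribute_2/2.jpg",
                "task23/ICL_images/in_context_attribute_2/3.jpg"],
}

segments = record["question"].split("<img>")[1:]  # text that follows each image
for img_path, text in zip(record["data_id"], segments):
    caption = text.strip(" :")  # drop the ": " marker around the caption
    print(img_path, "->", caption or "(to be completed by the model)")
```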

@nemonameless
Author

Could you please provide a prompt or question? We just want to add a question after the last `<img>` for our model testing.
For example:
"Please select the description below that best describes the last image."
"Which of the following options provides the same type of description for the last picture?" ...

@geyuying
Collaborator

Hi, following the few-shot setting of Flamingo [1], we do not provide a specific prompt for evaluating in-context captioning. Since we adopt PPL (perplexity) as the evaluation metric, it may not be necessary to add a question for model testing.

[1] Flamingo: a Visual Language Model for Few-Shot Learning
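
For readers unfamiliar with PPL-based evaluation, below is a minimal, text-only sketch of the idea. It is not the official SEED-Bench evaluation script, the model name is a placeholder, and a real run would also feed the three images to a multimodal model: each candidate answer is appended after the final `<img>:` and the choice whose tokens receive the lowest loss (i.e., lowest perplexity) is taken as the prediction.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Text-only sketch of PPL-style choice ranking; "gpt2" is a placeholder LM for
# illustration, not the model or script used by SEED-Bench.
model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def choice_loss(prompt: str, choice: str) -> float:
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    full_ids = tok(prompt + " " + choice, return_tensors="pt").input_ids
    labels = full_ids.clone()
    # Score only the choice tokens (BPE boundary effects are ignored in this sketch).
    labels[:, : prompt_ids.shape[1]] = -100
    with torch.no_grad():
        return model(full_ids, labels=labels).loss.item()

question = ("<img>: The predominant color of the uniforms worn by the players is blue. "
            "<img>: The most notable color present in the woman's outfit is orange. <img>:")
choices = {"A": "The man and woman in the image are both looking away from the camera.",
           "B": "The woman's hair is black.",
           "C": "The woman's dog is on the couch next to her in the image.",
           "D": "There are two people in the image."}

pred = min(choices, key=lambda k: choice_loss(question, choices[k]))
print(pred)  # the lowest-loss option is taken as the model's answer
```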
