Issues while running test generation #1718

Closed
ayulockin opened this issue Dec 2, 2024 · 2 comments · Fixed by #1733
Labels
bug Something isn't working module-testsetgen Module testset generation

Comments

@ayulockin
Contributor

I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug

I installed ragas from the main branch and am trying to run the test generation quick start (see Code to Reproduce). I get error trace 1 (AttributeError: 'MultiHopSpecificQuerySynthesizer' object has no attribute 'get_node_clusters').

I commented out this line:

MultiHopSpecificQuerySynthesizer(llm=llm),

Running the code again gives error trace 2 (openai.APIConnectionError: Connection error.). I initially thought it was an OpenAI issue, but it originates from a RuntimeError: Event loop is closed error. The event loop is either being closed prematurely or reused after being closed.
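For readers unfamiliar with this failure mode, here is a minimal standard-library sketch of it. It does not use ragas or openai; `capture_loop` is a hypothetical helper written only for this illustration. It shows that the loop created by `asyncio.run` is closed once the call returns, and that scheduling anything on it afterwards raises the same `RuntimeError` that appears at the bottom of error trace 2.

```python
import asyncio


async def capture_loop() -> asyncio.AbstractEventLoop:
    # Hypothetical helper: hand back the loop that asyncio.run created for us.
    return asyncio.get_running_loop()


# asyncio.run creates a fresh loop, runs the coroutine, then closes the loop.
loop = asyncio.run(capture_loop())
loop_closed = loop.is_closed()

try:
    # Anything still holding a reference to the old loop fails at this point,
    # much like the transport cleanup in the traceback (call_soon -> _check_closed).
    loop.call_soon(lambda: None)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

print(loop_closed, error_message)
```

An object that holds on to the loop from an earlier `asyncio.run` call and is used again in a later one would hit this path, which is one plausible reading of the trace rather than a confirmed diagnosis.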

It arises from this method in executor.py:

def results(self) -> t.List[t.Any]:

I was able to fix this issue with a slight modification to the method above:

# top of the script import
import nest_asyncio
nest_asyncio.apply()

# existing code....

# modification to the method
def results(self) -> t.List[t.Any]:
    """
    Execute all submitted jobs and return their results.
    Uses an existing event loop if one is running, otherwise creates a new one.
    """
    # Check if an event loop is running
    if is_event_loop_running():
        # Reuse the running event loop; calling run_until_complete on a loop
        # that is already running only works because nest_asyncio.apply() patched it
        loop = asyncio.get_event_loop()
        results = loop.run_until_complete(self._process_jobs())
    else:
        # No loop is running: asyncio.run creates a new one and closes it afterwards
        results = asyncio.run(self._process_jobs())

    # Each result is an (index, value) pair; sort by job index and return the values
    sorted_results = sorted(results, key=lambda x: x[0])
    return [r[1] for r in sorted_results]
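The same control flow can be sketched with the standard library only. This is not the ragas executor: `is_loop_running`, `_job`, `_process_jobs` and `results` below are hypothetical stand-ins, and the running-loop branch uses a worker thread instead of nest_asyncio so that the snippet has no third-party dependency. It illustrates the two ideas in the method above: detect whether a loop is already running, then restore submission order by sorting on the job index.

```python
import asyncio
import typing as t
from concurrent.futures import ThreadPoolExecutor


def is_loop_running() -> bool:
    # get_running_loop raises RuntimeError when no loop runs in this thread.
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return False
    return True


async def _job(index: int, delay: float) -> t.Tuple[int, str]:
    await asyncio.sleep(delay)
    return index, f"job-{index}"


async def _process_jobs() -> t.List[t.Tuple[int, str]]:
    # Jobs finish out of order on purpose (1, then 2, then 0).
    coros = [_job(0, 0.03), _job(1, 0.01), _job(2, 0.02)]
    return [await fut for fut in asyncio.as_completed(coros)]


def results() -> t.List[str]:
    if is_loop_running():
        # Assumption for this sketch: run the batch on a separate thread with
        # its own loop rather than re-entering the running loop.
        with ThreadPoolExecutor(max_workers=1) as pool:
            raw = pool.submit(asyncio.run, _process_jobs()).result()
    else:
        raw = asyncio.run(_process_jobs())
    # Sort by job index to restore submission order, then drop the index.
    return [value for _, value in sorted(raw, key=lambda pair: pair[0])]


async def _call_from_running_loop() -> t.List[str]:
    return results()


from_script = results()
from_running_loop = asyncio.run(_call_from_running_loop())
```

Note that the thread-based branch blocks the calling loop until the batch finishes, which is acceptable for a synchronous `results()` call but would not suit code that must keep the outer loop responsive.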

Now I get error trace 3 (AttributeError: 'PersonaThemesMapping' object has no attribute 'mappping'. Did you mean: 'mapping'?). I therefore commented out this line as well:

MultiHopAbstractQuerySynthesizer(llm=llm),

After that, test generation worked.

To summarize:

  • Both MultiHopAbstractQuerySynthesizer and MultiHopSpecificQuerySynthesizer fail for me.
  • Without the modifications I made to the executor.py file, I am unable to run test generation.
  • This is not a Jupyter session; I am running the code from a script.

Ragas version: '0.2.7.dev5+geb5f745'
Python version: 3.10.12

Code to Reproduce

from langchain_community.document_loaders import DirectoryLoader
from ragas.testset import TestsetGenerator

from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings

path = "Sample_Docs_Markdown/"
loader = DirectoryLoader(path, glob="**/*.md")
docs = loader.load()

generator_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o"))
generator_embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings())

generator = TestsetGenerator(llm=generator_llm, embedding_model=generator_embeddings)
dataset = generator.generate_with_langchain_docs(docs, testset_size=10)

Error trace

  1. This error trace comes from running the test generation from the Quickstart.
Applying HeadlineSplitter:   0%|                                                                                                                                                  | 0/12 [00:00<?, ?it/s]unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
Applying SummaryExtractor:  33%|██████████████████████████████████████████████▎                                                                                            | 2/6 [00:02<00:04,  1.22s/it]Property 'summary' already exists in node '227147'. Skipping!
Applying [EmbeddingExtractor, ThemesExtractor, NERExtractor]:   0%|                                                                                                               | 0/30 [00:00<?, ?it/s]Property 'summary_embedding' already exists in node '227147'. Skipping!
Traceback (most recent call last):                                                                                                                                                                       
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/test_generation.py", line 21, in <module>
    dataset = generator.generate_with_langchain_docs(docs, testset_size=10)
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/testset/synthesizers/generate.py", line 188, in generate_with_langchain_docs
    return self.generate(
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/testset/synthesizers/generate.py", line 330, in generate
    query_distribution = query_distribution or default_query_distribution(
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/testset/synthesizers/__init__.py", line 30, in default_query_distribution
    if query.get_node_clusters(kg):
AttributeError: 'MultiHopSpecificQuerySynthesizer' object has no attribute 'get_node_clusters'
  2. This error occurs after commenting out MultiHopSpecificQuerySynthesizer.
Applying HeadlineSplitter:   0%|                                                                                                                                                  | 0/12 [00:00<?, ?it/s]unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
Applying SummaryExtractor:  33%|██████████████████████████████████████████████▎                                                                                            | 2/6 [00:02<00:04,  1.13s/it]Property 'summary' already exists in node '563d1b'. Skipping!
Applying [EmbeddingExtractor, ThemesExtractor, NERExtractor]:   0%|                                                                                                               | 0/30 [00:00<?, ?it/s]Property 'summary_embedding' already exists in node '563d1b'. Skipping!
Generating personas:   0%|                                                                                                                                                         | 0/3 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/openai/_base_client.py", line 1576, in _request
    response = await self._client.send(
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/httpx/_client.py", line 1631, in send
    response = await self._send_handling_auth(
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/httpx/_client.py", line 1659, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/httpx/_client.py", line 1696, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/httpx/_client.py", line 1732, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/httpx/_transports/default.py", line 394, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 256, in handle_async_request
    raise exc from None
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 236, in handle_async_request
    response = await connection.handle_async_request(
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/httpcore/_async/connection.py", line 103, in handle_async_request
    return await self._connection.handle_async_request(request)
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/httpcore/_async/http11.py", line 135, in handle_async_request
    await self._response_closed()
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/httpcore/_async/http11.py", line 250, in _response_closed
    await self.aclose()
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/httpcore/_async/http11.py", line 258, in aclose
    await self._network_stream.aclose()
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/httpcore/_backends/anyio.py", line 53, in aclose
    await self._stream.aclose()
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/anyio/streams/tls.py", line 201, in aclose
    await self.transport_stream.aclose()
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 1287, in aclose
    self._transport.close()
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/asyncio/selector_events.py", line 706, in close
    self._loop.call_soon(self._call_connection_lost, None)
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/asyncio/base_events.py", line 753, in call_soon
    self._check_closed()
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/asyncio/base_events.py", line 515, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/test_generation.py", line 21, in <module>
    dataset = generator.generate_with_langchain_docs(docs, testset_size=10)
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/testset/synthesizers/generate.py", line 188, in generate_with_langchain_docs
    return self.generate(
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/testset/synthesizers/generate.py", line 369, in generate
    self.persona_list = generate_personas_from_kg(
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/testset/persona.py", line 145, in generate_personas_from_kg
    persona_list = run_async_batch(
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/executor.py", line 226, in run_async_batch
    return executor.results()
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/executor.py", line 200, in results
    results = asyncio.run(self._process_jobs())
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/executor.py", line 140, in _process_jobs
    result = await future
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/asyncio/tasks.py", line 571, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/executor.py", line 45, in sema_coro
    return await coro
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/executor.py", line 96, in wrapped_callable_async
    raise e
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/executor.py", line 92, in wrapped_callable_async
    result = await callable(*args, **kwargs)
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/prompt/pydantic_prompt.py", line 125, in generate
    output_single = await self.generate_multiple(
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/prompt/pydantic_prompt.py", line 185, in generate_multiple
    resp = await llm.generate(
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/llms/base.py", line 100, in generate
    result = await agenerate_text_with_retry(
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/tenacity/asyncio/__init__.py", line 189, in async_wrapped
    return await copy(fn, *args, **kwargs)
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/tenacity/asyncio/__init__.py", line 111, in __call__
    do = await self.iter(retry_state=retry_state)
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/tenacity/asyncio/__init__.py", line 153, in iter
    result = await action(retry_state)
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/tenacity/_utils.py", line 99, in inner
    return call(*args, **kwargs)
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/tenacity/__init__.py", line 398, in <lambda>
    self._add_action_func(lambda rs: rs.outcome.result())
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/tenacity/asyncio/__init__.py", line 114, in __call__
    result = await fn(*args, **kwargs)
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/llms/base.py", line 223, in agenerate_text
    return await self.langchain_llm.agenerate_prompt(
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 796, in agenerate_prompt
    return await self.agenerate(
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 756, in agenerate
    raise exceptions[0]
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 924, in _agenerate_with_cache
    result = await self._agenerate(
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/langchain_openai/chat_models/base.py", line 825, in _agenerate
    response = await self.async_client.create(**payload)
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 1661, in create
    return await self._post(
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/openai/_base_client.py", line 1843, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/openai/_base_client.py", line 1537, in request
    return await self._request(
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/openai/_base_client.py", line 1600, in _request
    return await self._retry_request(
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/openai/_base_client.py", line 1670, in _retry_request
    return await self._request(
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/openai/_base_client.py", line 1600, in _request
    return await self._retry_request(
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/openai/_base_client.py", line 1670, in _retry_request
    return await self._request(
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/openai/_base_client.py", line 1610, in _request
    raise APIConnectionError(request=request) from err
openai.APIConnectionError: Connection error.
  3. Issue with the MultiHopAbstractQuerySynthesizer:
Applying HeadlineSplitter:   0%|                                                                                                                                                  | 0/12 [00:00<?, ?it/s]unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
unable to apply transformation: 'headlines' property not found in this node
Applying SummaryExtractor:  50%|█████████████████████████████████████████████████████████████████████▌                                                                     | 3/6 [00:03<00:02,  1.16it/s]Property 'summary' already exists in node '1526e0'. Skipping!
Applying [EmbeddingExtractor, ThemesExtractor, NERExtractor]:   0%|                                                                                                               | 0/30 [00:00<?, ?it/s]Property 'summary_embedding' already exists in node '1526e0'. Skipping!
Generating personas: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:01<00:00,  1.67it/s]
Generating Scenarios:   0%|                                                                                                                                                        | 0/2 [00:04<?, ?it/s]
Traceback (most recent call last):
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/test_generation.py", line 21, in <module>
    dataset = generator.generate_with_langchain_docs(docs, testset_size=10)
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/testset/synthesizers/generate.py", line 188, in generate_with_langchain_docs
    return self.generate(
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/testset/synthesizers/generate.py", line 413, in generate
    raise e
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/testset/synthesizers/generate.py", line 410, in generate
    scenario_sample_list: t.List[t.List[BaseScenario]] = exec.results()
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/executor.py", line 205, in results
    results = asyncio.run(self._process_jobs())
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/nest_asyncio.py", line 30, in run
    return loop.run_until_complete(task)
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/nest_asyncio.py", line 98, in run_until_complete
    return f.result()
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/executor.py", line 143, in _process_jobs
    result = await future
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/asyncio/tasks.py", line 571, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/executor.py", line 48, in sema_coro
    return await coro
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/executor.py", line 99, in wrapped_callable_async
    raise e
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/executor.py", line 95, in wrapped_callable_async
    result = await callable(*args, **kwargs)
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/testset/synthesizers/base.py", line 94, in generate_scenarios
    scenarios = await self._generate_scenarios(
  File "/Users/ayushthakur/integrations/ragas_repo/ragas-exp/ragas/src/ragas/testset/synthesizers/multi_hop/abstract.py", line 119, in _generate_scenarios
    persona_item_mapping=persona_concepts.mappping,
  File "/Users/ayushthakur/miniconda3/envs/ragas-dev-main/lib/python3.10/site-packages/pydantic/main.py", line 856, in __getattr__
    raise AttributeError(f'{type(self).__name__!r} object has no attribute {item!r}')
AttributeError: 'PersonaThemesMapping' object has no attribute 'mappping'. Did you mean: 'mapping'?


cc: @jjmachan, do let me know if I am missing anything while running test generation. If not, and my analysis is right, I can open a PR to at least fix the executor.py file (I still need to test the fix in a Jupyter session, though).

@ayulockin ayulockin added the bug Something isn't working label Dec 2, 2024
@dosubot dosubot bot added the module-testsetgen Module testset generation label Dec 2, 2024
@jjmachan
Member

jjmachan commented Dec 2, 2024

Hey @ayulockin, thanks a lot for this detailed issue. Do make the PR and we can fix this over there.

What I'm wondering is why the tests for the executor don't catch this. I might have missed something there, but these are the tests, if it helps you.

thanks a lot 🙂
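One possible reason a test suite could miss this, offered as an assumption and not as a diagnosis of the ragas tests: the error in trace 2 appears on a later batch, after an earlier `asyncio.run` call has already closed its loop. A test that runs only a single batch per process would never reach that state. The sketch below uses only the standard library; `LoopBoundClient`, `run_batch` and `second_batch_outcome` are hypothetical names standing in for an async client that remembers the first loop it ran on.

```python
import asyncio
import typing as t


class LoopBoundClient:
    """Hypothetical stand-in for an async client that caches its first loop."""

    def __init__(self) -> None:
        self._loop: t.Optional[asyncio.AbstractEventLoop] = None

    async def request(self) -> str:
        if self._loop is None:
            self._loop = asyncio.get_running_loop()
        # Schedule the response on the cached loop, as a loop-bound transport would.
        future = self._loop.create_future()
        self._loop.call_soon(future.set_result, "ok")
        return await future


def run_batch(client: LoopBoundClient) -> str:
    # Each batch gets its own loop, which asyncio.run closes when it returns.
    return asyncio.run(client.request())


def second_batch_outcome() -> t.Tuple[str, str]:
    client = LoopBoundClient()
    first = run_batch(client)
    try:
        second = run_batch(client)
    except RuntimeError as exc:
        second = f"RuntimeError: {exc}"
    return first, second


outcome = second_batch_outcome()
```

A regression test shaped like `second_batch_outcome`, with two batches sharing one client in the same process, exercises the reuse path that a single-batch test skips.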

@shahules786
Member

Hey @ayulockin, I have raised a fix for this: #1733

jjmachan pushed a commit that referenced this issue Dec 5, 2024
I spent the last hour trying a few things to fix the issue
documented in #1718 (except the `MultiHopAbstractQuerySynthesizer` and
`MultiHopSpecificQuerySynthesizer` errors).

This PR proposes the most trivial fix, but it fixes this test generation
issue:

```
from langchain_community.document_loaders import DirectoryLoader
from ragas.testset import TestsetGenerator

from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings

path = "Sample_Docs_Markdown/"
loader = DirectoryLoader(path, glob="**/*.md")
docs = loader.load()

generator_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o"))
generator_embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings())

generator = TestsetGenerator(llm=generator_llm, embedding_model=generator_embeddings)
dataset = generator.generate_with_langchain_docs(docs, testset_size=10)
```

Also both `test_executor.py` and `test_executor_in_jupyter.ipynb` are
passing.

cc: @jjmachan
jjmachan pushed a commit that referenced this issue Dec 5, 2024