Improve AI usage of newly open tabs #254

Merged: 5 commits, Jan 6, 2025

rmarescu
Member

@rmarescu rmarescu commented Jan 4, 2025

  • Automatically focus on newly opened tabs
  • Remove hard-coded Mailosaur email address
    • The email is AW-specific and cannot be used by external contributors
    • Remove the need for a separate ENV variable for the email by using a standard format: shortest@<MAILOSAUR_SERVER_ID>.mailosaur.net
  • Improve README instructions for Mailosaur
  • Replace flaky test with a more deterministic one
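
As a rough illustration of the standard address format described above, the sketch below derives the login email from a server ID and fails early when the ID is missing. The function name `buildMailosaurEmail`, the error message, and the `abc123` ID are hypothetical and not part of this PR; only the `shortest@<id>.mailosaur.net` pattern comes from the test code in the diff.

```typescript
// Hypothetical helper (not from the PR): builds the login address from a
// Mailosaur server ID, following the pattern used in the updated test.
function buildMailosaurEmail(serverId: string | undefined): string {
  const id = (serverId ?? "").trim();
  if (id.length === 0) {
    // Assumption: failing fast is preferable to producing "shortest@.mailosaur.net".
    throw new Error("MAILOSAUR_SERVER_ID is not set");
  }
  return `shortest@${id}.mailosaur.net`;
}

// Example with a placeholder server ID (not a real one).
const exampleEmail = buildMailosaurEmail("abc123");
console.log(exampleEmail);
```

In a test file, the argument would presumably be `process.env.MAILOSAUR_SERVER_ID`, so each contributor only has to configure their own server ID.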

@rmarescu rmarescu self-assigned this Jan 4, 2025

vercel bot commented Jan 4, 2025

The latest updates on your projects.

Name: shortest
Status: ✅ Ready
Updated (UTC): Jan 4, 2025 4:19pm

shortest("Verify that buttons on the landing page are rounded");
shortest("Login using this email: [email protected]");
shortest("Log in", { email: loginEmail });
shortest("Verify that the user can access the /dashboard page");
Member Author

This test doesn't seem to work as the user with Mailosaur email doesn't have GitHub connected, and the AI is confused by the Dashboard page:

[Screenshot: CleanShot 2025-01-03 at 20 11 52@2x]

I'm unsure what a good fix would be. Maybe change the test, or just delete it for now?

Contributor

We can delete the test for now.

@rmarescu
Member Author

rmarescu commented Jan 4, 2025

The test is failing because the AI seems confused about using the new tab that opens after clicking the Sign in button in the email tab.

During the run below, screenshot-2025-01-04T14-12-54-219Z.png is actually the email tab, and the AI then tries to switch tabs using keyboard shortcuts, which are Windows-specific.

Attempting to adjust the system prompt to handle this better.

[Screenshot: CleanShot 2025-01-04 at 06 13 26@2x]

@slavingia
Contributor

Weird, as I believe it was working before with the old prompt.

@rmarescu
Member Author

rmarescu commented Jan 4, 2025

I've executed the same test from main, and the behaviour is quite different. It passed 2/4 times:

Run 1: passed once it had checked that the email was received and contained a link (it did not click the link)

[Screenshot: CleanShot 2025-01-04 at 06 44 24@2x]

Runs 2 & 3: failed because the AI couldn't switch to the new tab after clicking the email link (it kept retrying by opening new tabs 5 times)

All the screenshots taken were of the email tab.

[Screenshot: CleanShot 2025-01-04 at 06 47 51@2x]

Run 4: passed after clicking the email link, but didn't check that the new page loaded

[Screenshot: CleanShot 2025-01-04 at 06 53 57@2x]


const loginEmail = `shortest@${process.env.MAILOSAUR_SERVER_ID}.mailosaur.net`;
shortest("Log in", { email: loginEmail }).expect(
"Check Manage Account page from user icon menu",
Member Author

Added a chained test to verify that the user is actually logged in.

- If you need to test a condition that involves seeing the contents of an email, use the "check_email" tool.
- For email validation, you MUST always use 'Click' and 'Mouse' action instead of using keyboard shortcuts.
- This tool will grab the latest email from the email address given to you and will render it in a new tab for you to see.
- Once you are done with validating the email, navigate back to the original tab.
Member Author

Since we now automatically focus the new tab when a link opens one, these instructions are no longer needed. In my tests, the AI couldn't navigate between tabs.

@@ -62,6 +62,12 @@ export class BrowserTool extends BaseBrowserTool {
this.viewport = { width: config.width, height: config.height };
this.testContext = config.testContext;

// Update active page reference to a newly opened tab
this.page.context().on("page", async (newPage) => {
Member Author

This fixes the reported issue, where the new tab opened by clicking the email link was not treated as the AI's active tab.

@m2rads do you think this could cause any possible side effects?
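
The diff hunk above is truncated, so the following is only a sketch of the general technique it points at: listening for a browser context's "page" event and repointing the active-page reference at the new tab. `PageLike` and `ContextLike` are minimal stand-ins that mirror the shape of Playwright's `Page.waitForLoadState()` and `BrowserContext.on("page", ...)`. The `ActivePageTracker`, `FakePage`, and `FakeContext` names, and the choice to wait for the load state before switching, are assumptions for illustration and not the PR's actual code.

```typescript
// Minimal structural stand-ins for Playwright's Page and BrowserContext;
// only the members used below are modeled.
interface PageLike {
  waitForLoadState(): Promise<void>;
  url(): string;
}

interface ContextLike {
  on(event: "page", handler: (page: PageLike) => void): void;
}

// Hypothetical tracker: keeps `active` pointing at the most recently opened tab.
class ActivePageTracker {
  active: PageLike;
  // Resolves once the latest tab switch has finished (handy in tests).
  pending: Promise<void> = Promise.resolve();

  constructor(initial: PageLike, context: ContextLike) {
    this.active = initial;
    context.on("page", (newPage) => {
      this.pending = this.switchTo(newPage);
    });
  }

  private async switchTo(newPage: PageLike): Promise<void> {
    // Assumption: wait until the new tab has loaded before treating it as active.
    await newPage.waitForLoadState();
    this.active = newPage;
  }
}

// In-memory fakes so the sketch runs without a browser.
class FakePage implements PageLike {
  private readonly address: string;
  constructor(address: string) {
    this.address = address;
  }
  async waitForLoadState(): Promise<void> {}
  url(): string {
    return this.address;
  }
}

class FakeContext implements ContextLike {
  private handlers: Array<(page: PageLike) => void> = [];
  on(_event: "page", handler: (page: PageLike) => void): void {
    this.handlers.push(handler);
  }
  // Simulates a link opening a new tab.
  open(page: PageLike): void {
    for (const handler of this.handlers) handler(page);
  }
}

const demoContext = new FakeContext();
const demoTracker = new ActivePageTracker(new FakePage("https://example.test/email"), demoContext);
demoContext.open(new FakePage("https://example.test/dashboard"));
demoTracker.pending.then(() => console.log(demoTracker.active.url()));
```

One side effect worth checking in a sketch like this: if several tabs open in quick succession, whichever finishes loading last becomes the active one, and nothing switches back when a tab is closed.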

@rmarescu rmarescu changed the title from "Remove hard-coded Mailosaur email address" to "Improve AI usage of newly open tabs" Jan 4, 2025
@rmarescu rmarescu requested a review from m2rads January 4, 2025 17:46
@slavingia slavingia merged commit a79d4f5 into main Jan 6, 2025
5 checks passed
@slavingia slavingia deleted the rmarescu/mailosaur-email branch January 6, 2025 18:27

3 participants