Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addon interferes with Azure portal #747

Closed
MohitM90 opened this issue Sep 5, 2021 · 16 comments · Fixed by #761
Closed

Addon interferes with Azure portal #747

MohitM90 opened this issue Sep 5, 2021 · 16 comments · Fixed by #761

Comments

@MohitM90
Copy link

MohitM90 commented Sep 5, 2021

Hi,
I'm using the 10ten Japanese Reader (Rikaichamp) addon Version 1.2.3 in Microsoft Edge Version 93.0.961.38. Recently, it started to cause problems with the Azure portal (portal.azure.com). When the addon is enabled, the workflow of Logic Apps is not visualized. They are only visualized when the addon is disabled. It was working fine a few weeks ago.

Example:
10ten addon disabled:
10ten disabled

10ten addon enabled:
10ten enabled

I have no idea why it is behaving like that. I have disabled all other addons. The website only behaves like that when 10ten is enabled...

@birtles
Copy link
Member

birtles commented Sep 5, 2021

Hi @MohitM90, thank you very much for filing this bug report!

That's very frustrating that 10ten interferes like that. I wonder if you would be able to open the developer console (F12 then go to "Console") and take a screenshot of what you see? There may be some errors there.

(My guess is that the Azure portal sees the messages sent by 10ten between iframes and panics because it does not recognize them.)

@MohitM90
Copy link
Author

MohitM90 commented Sep 6, 2021

Indeed, there is an iframe-related error that does not occur when the addon is disabled.

When it is enabled, I can see the following error in the DevTools:
10ten enabled_devtools

Without 10ten, I see this:
10ten disabled_devtools

@birtles
Copy link
Member

birtles commented Sep 6, 2021

Wow! I just tried creating a Logic Apps resource myself to test this and got the same error:

image

@birtles
Copy link
Member

birtles commented Sep 6, 2021

Thank you for the confirmation. I think we need to rework the iframe handling to send messages via the background page instead.

That's going to be a bit of work but hopefully it works and will be more robust.

@MohitM90
Copy link
Author

MohitM90 commented Sep 6, 2021

Thank you for the quick response and looking forward to the fix! 🙂

@birtles
Copy link
Member

birtles commented Sep 6, 2021

I started looking into this and it looks quite hard.

In particular, if we go via the background page, even if we have a frameId, mapping that to an actual HTMLIFrameElement is quite tricky. We can try to match on the src of the iframe, but it's quite possible to have multiple iframes with the same src. Furthermore, after an iframe is navigated its src will differ from the src its parent has. And when we have nested cross-origin iframes we can't even iterate through all the candidates from the root.

For some cases we might be able to disambiguate based on the fact that at least in Firefox, the frameIds appear to be based on load order and the ordering of frames within a window appear to be based on when they were inserted in to the document (spec ref) but again, that doesn't help with nested iframes.

In fact, even just determining which iframe is the topmost is hard since we previously relied on using postMessage to determine that.

This is going to take much more thought but we should bear in mind that the case where the top-most frame with the extension loaded is not the root window (with frameId 0) is very uncommon and basically only occurs in Firefox as best I can tell. In Chrome etc. if the topmost window is not eligible for the content script, none of the descendant iframes that get it even if they would otherwise be eligible.

So we might be able to simplify the problem by just assuming top-most window === top-most extension frame === frameId: 0 and then adding a workaround for the Firefox+Live TL case (e.g. if we don't get enable? message with frameId 0 for a given tab, then telling the frame with the lowest frameId that it is topmost via a config update).

But there still remain plenty of other problems like matching frameIds to HTMLIFrameElement objects. In the same-origin case we can reach up to our frameElement and set a data attribute there after the background page tells us our frameId but that's not going to help with the cross-origin case.


As a completely different approach, given that the two sites where we suspect this is causing problems are English sites, perhaps we just need to avoid calling postMessage unless we encounter Japanese text. i.e. stop calling postMessage on startup to determine if we are the top-most iframe and use some other mechanism.

@birtles
Copy link
Member

birtles commented Sep 6, 2021

As a completely different approach, given that the two sites where we suspect this is causing problems are English sites, perhaps we just need to avoid calling postMessage unless we encounter Japanese text. i.e. stop calling postMessage on startup to determine if we are the top-most iframe and use some other mechanism.

A couple of quick notes after looking into this approach: we'll need to be careful to not send the initial isPopupShown query and we'll also need to avoid sending redundant popupHidden messages. Finally, we'll have to be quite careful about detecting Japanese text accurately in order to avoid sending speculative lookup messages.

Update: Add we need to stop firing moveEarth events for every mousemove event that occurs in iframes too -- perhaps #746 can help with that.

Update #2: Stopped firing moveEarth events in 48aabf8

@birtles
Copy link
Member

birtles commented Sep 6, 2021

On further thought, I think the "just try not to send postMessage messages unless we encounter Japanese text" approach is problematic because:

  • It's hard to ensure we don't regress it. It's probably at least a day's work just to write a suitable automated test for that and even then it will probably miss some cases
  • If someone simply uses the Japanese localization of the Azure portal, they'll likely hit the same problem
  • Any Japanese site that chokes on unexpected messages will still break

So we're better off to try to avoid using postMessage at all, if we can find a way to make that work.

@birtles
Copy link
Member

birtles commented Sep 6, 2021

I have a rough plan for approaching this but it's going to take a bit of work so I'll try to get onto it tomorrow.

@birtles
Copy link
Member

birtles commented Sep 7, 2021

Started on this today and boy is it a lot of work. I'm working on it over in the no-postmessage branch but it will take several days.

Here's my rough plan:

  • Use the background page for a determining the top-most window
  • Convert lookup, highlightText, and clearHighlighText to use the background page
  • Convert common content messages (clearResult, nextDictionary, toggleDefinition, movePopup, enterCopyMode, exitCopyMode, nextCopyEntry, copyCurrentEntry) to use the background page
  • Convert popup hidden status messages (popupHidden, popupShown, isPopupShown) to use the background page
  • Convert moonMoved message to use the background page

It's quite a risky change and will need quite a bit of testing but I don't see any other way at the moment.

The first two items are the hardest/riskiest so if we clear those, the rest should be manageable although the last one has pretty bad failure behaviour (i.e. when we can't successfully locate the target frame, the puck just stops looking up content for that frame or even looks it up in the wrong frame. That should only happen on some combinations of cross-origin iframes which already don't work in activeTab mode, i.e. Safari, so maybe it's not such a big deal.)

@birtles
Copy link
Member

birtles commented Sep 9, 2021

I spoke with @shirakaba yesterday who suggested we try hooking window.addEventListener to hide our events from the host page.

I gave this a try on the postmessage-proxy branch but it turns out this doesn't work because the content script runs in its own isolated context so modifications to window / Window.prototype are isolated from the host page (see Sharing objects with page scripts).

I've tried using wrappedJSObject to circumvent this without luck and I notice that it only applies to Firefox in any case. It seems the general approach folks have used to work around this is to inject a <script> element into the document to do their bidding:

That seems even more invasive than what we currently do so I suspect we shouldn't go there.

Furthermore, that approach still has the difficulty that it only hooks at the point when the listener is added. If a page adds a listener before the content script is injected we'll have no way to intercept messages and hide them from the page.

The MDN docs for postMessage also seem to be encouraging extensions to use runtime.sendMessage instead so perhaps that's the way to go after all.

@birtles
Copy link
Member

birtles commented Sep 14, 2021

I implemented the final step today and while it works fine in test cases, it seems to struggle with the various cross-origin iframes on asahi.com, for example.

I'm a bit fuzzy on the details but I think what happens is that when we navigate an iframe in Firefox, at least, it ends up getting a new frameId (from memory when you navigate a tab, however, it doesn't get a new tabId however).

Now, we don't clear out old frames from our list of frames when an iframe is navigated. I don't recall why (we used to but I dropped that code at some point) except that I think it was a combination of (a) not being sure if all browsers behave the same way with regards to assigning new frameIds or not (I might have even tested that and found they don't, I forget), and (b) no explicit API for iterating through a tab's frames allowing us to perform clean up as an "idle task" the way we do for tabs. We do clear all the iframes when the top-most window is navigated, however.

On asahi.com there are plenty of iframes are doing all sorts of requests and, possibly, redirects. One way or another, we end up with a long list of frames with the same initialSrc and when we go to move the puck over an iframe on asahi.com we end up finding a number of matches and choosing, I think, a now-expired iframe.

We can possibly fix this by being more strict about managing frames and being more careful about clearing out navigated iframes. That would allow us to keep a consistent architecture for all inter-frame messages.

However, I'm not completely confident we could do this reliably, particularly not across all browsers and all kinds of navigations. Furthermore, we already know that for this particular message where we need to go from HTMLIFrameElement to frameId—the only message where we need to do this—our lookup method is never going to be perfect. If a page has a number of same-sized cross-origin iframes with the same source, we simply cannot reliably distinguish between them since to the parent page they all have the same attributes (same initial src, same width and height).

So I think I might tweak the approach to have a singular exception for the moonMoved message (which I plan to rename to puckMoved) and allow this message and this message only to use postMessage.

The adverse consequences of postMessage are mitigated by the fact this will only ever occur when using the puck so it will require user interaction first making any breakage more understandable and at least somewhat avoidable.

We can further ameliorate it by making our message listener listen during the capture phase and calling stopImmediatePropagation() on the event. We can't guarantee that we will get the message first, but hopefully we can at least guarantee no-one gets it after us.

If that still proves problematic we can either:

  • Use the background page for same-origin messages and only use postMessage for cross-origin messages to limit the blast radius, or
  • Try again to use the background page for this and track frames better (I will push the commit that includes this approach so we can revert to that if need be)

@birtles
Copy link
Member

birtles commented Sep 14, 2021

Ok I think I've finally got this all working with all the wrinkles ironed out.

I had a lot of trouble with Safari because it turns out it doesn't support sending to specific frames and so I had to add extra handling to ensure that frames ignored messages that weren't intended for them.

Test plan:

  • Test on Firefox with iframes test case, asahi.com and focussing in different frames
  • Test on Firefox with activeTab mode enabled
  • Test on Chrome with iframes test case and focussing in different frames
  • Test on Firefox that Logic Apps in Azure portal now works
  • Test on Safari with iframes test case and asahi.com
  • Test on iOS
  • Test that Live TL still works on Firefox

@birtles
Copy link
Member

birtles commented Sep 15, 2021

I've just submitted version 1.3.5 which includes a fix for this.

It is already available for Firefox, should be available for Chrome in ~1 day and for Edge in ~1 week (the Edge review process is quite slow).

@birtles
Copy link
Member

birtles commented Sep 22, 2021

@MohitM90 Sorry for not updating this issue sooner, but the Edge update took less time than expected and should have been available from about Sep 17 last week. Hopefully Azure portal should work by now.

@MohitM90
Copy link
Author

@MohitM90 Sorry for not updating this issue sooner, but the Edge update took less time than expected and should have been available from about Sep 17 last week. Hopefully Azure portal should work by now.

Yes, it works fine now. Thank you so much! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants