Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trace命令添加JVM MONITORENTRY的耗时跟踪 #2004

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

mikawudi
Copy link
Contributor

目前trace命令只能跟踪到增强函数内部的methodInvoke,但是除了方法调用耗时,很多情况下是synchronized(obj)进入临界区的耗时,如果函数内有多段临界区,没发准确感知进入每个临界区的耗时
使用bytekit的AtSyncEntry来实现对进入临界区的耗时跟踪,trace命令添加参数skipMonitor来决定是否启用,默认为true

@hengyunabc
Copy link
Collaborator

个人感觉,synchronized的耗时场景是比较少遇到的,给所有的trace类增强 增加了复杂度,感觉不值得😂。

@mikawudi
Copy link
Contributor Author

mikawudi commented Oct 19, 2021

嗯,那我把--skipMonitor 默认设置为true可否?
在skipMonitor的情况下,在transform的时候不会对monitorentry进行增强,不影响原先逻辑
//具体参考com.taobao.arthas.core.advisor.Enhancer #160

主要是在跟踪kafka client的时候,里面一个函数有好几个synchronized块,
buffer满的时候经常性synchronized阻塞,然后用现在的trace完全不知道到底是哪个块阻塞.....
于是萌生了给synchronized加上耗时监控,
从而定位到这种多块synchronized的时候知道哪个synchronized块发生阻塞的功能撒

@mikawudi
Copy link
Contributor Author

另外,还有点想加入monitorenter之后,到monitorexit之前的耗时监控,从而知道到底是传哪些参数的情况下会在临界区占用大量时间导致进入临界区阻塞撒...大佬可以评估价这个的价值~要是意义不大我就不提pr了哈😂

@hengyunabc
Copy link
Collaborator

嗯,那我把--skipMonitor 默认设置为true可否? 在skipMonitor的情况下,在transform的时候不会对monitorentry进行增强,不影响原先逻辑 //具体参考com.taobao.arthas.core.advisor.Enhancer #160

主要是在跟踪kafka client的时候,里面一个函数有好几个synchronized块, buffer满的时候经常性synchronized阻塞,然后用现在的trace完全不知道到底是哪个块阻塞..... 于是萌生了给synchronized加上耗时监控, 从而定位到这种多块synchronized的时候知道哪个synchronized块发生阻塞的功能撒

最好给出具体输出结果,或者提供一个测试demo程序。

@mikawudi
Copy link
Contributor Author

好的,我补个demo代码和输出截图,弄好贴这边~

@mikawudi
Copy link
Contributor Author

mikawudi commented Oct 19, 2021

-----------简单贴下测试demo-------------------

private static Deque<Object> deque1 = new ArrayDeque<Object>();
private static Deque<Object> deque2 = new ArrayDeque<Object>();
//假设为用户耗时敏感操作
private static void doSthWithManySyncBlock() {
    doFunc1();
    synchronized (deque1) {
        System.out.println("do sth for deque1");
    }
    doFunc2();
    synchronized (deque2) {
        System.out.println("do sth for deque2");
    }
    doFunc3();
}
public static void main(String[] args) throws InterruptedException {
	new Thread(new Runnable() {
	    @Override
	    public void run() {
	        Random random = new Random();
	        while (true) {
	            synchronized (deque1) {
	                //模拟在同步代码块中耗时长的坏味道代码
	                sleep(1);
	            }
	            sleep(Math.abs(random.nextInt())%10);
	        }
	    }
	}).start();
	while (true) {
        doSthWithManySyncBlock();
        TimeUnit.SECONDS.sleep(1);
    }
}

-------------------------原trace输出--------------------
t1

在没有对monitor进行耗时监控的时候,执行trace,只能知道三个子函数调用并不耗时,即使jad反编译了,或者看源码了,也不知道是
第一个synchronized块阻塞还是第二个synchronized块阻塞,还是两个都有
--------------------------对monitorenter进行增强后--------------------------
t2

使用--skipMonitor false参数后,可以很明显一眼看出,阻塞都是发生在第一个synchronized(),后续根据代码,查找deque1引用来找到坏味道的代码块的位置

并且,--skipMonitor true的情况下,并不会使用@AtSyncEnter进行增强,和原版的trace命令是一样的增强逻辑,零性能损耗
另外,按bytekit的增强逻辑,其实如果trace的目标中不存在synchronized的话,其实等于没有进行增强,如果有synchronized,按我提交的增强处理逻辑,其实和方法调用增强的性能损耗应该是一致的,这是我把skipMonitor默认设置为false的原因

@mikawudi
Copy link
Contributor Author

最耗时显示的红色高亮好像有点BUG,刚开始没注意,我修复完commit一下

@@ -71,6 +73,7 @@
private final AdviceListener listener;
private final boolean isTracing;
private final boolean skipJDKTrace;
private final boolean skipMonitor;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个skipMonitor感觉容易让用户迷惑,不如直接叫 skipSync ?

final InvokeTraceable listener = (InvokeTraceable) adviceListener;
listener.beforeEntrySync(classLoader, clazz, methodName, methodDesc, lineNumber);
} catch (Throwable e) {
logger.error("class: {}, invokeInfo: {}", clazz.getName(), methodInfo, e);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里的日志不应该是 invokeInfo ,应该是 methodInfo

@@ -94,6 +95,13 @@ public void setSkipJDKTrace(boolean skipJDKTrace) {
this.skipJDKTrace = skipJDKTrace;
}

@Option(longName = "skipMonitor")
@DefaultValue("true")
@Description("skip MONITORENTRY command trace, default value false.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MONITORENTRY 尽管在字节码层是对的,但用户不容易理解。可能叫 sync block会更好点。

//render method name
if (node instanceof MethodNode) {
MethodNode methodNode = (MethodNode) node;
//clazz.getName() + ":" + method.getName() + "()"
sb.append(methodNode.getClassName()).append(":").append(methodNode.getMethodName()).append("()");
sb.append("[type:method_call] ").append(methodNode.getClassName()).append(":").append(methodNode.getMethodName()).append("()");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里有点多余了,原来的输出没啥问题,用户直观就知道是函数调用。

@@ -109,6 +118,12 @@ private void renderNode(StringBuilder sb, TraceNode node, Ansi highlighted) {
.append(" #").append(throwNode.getLineNumber())
.append(" [").append(throwNode.getMessage()).append("]");

} else if (node instanceof SyncNode) {
SyncNode syncNode = (SyncNode)node;
sb.append("[type:").append(MONITOR_ENTER_TYPE).append("] ").append(syncNode).append(syncNode.getClassName()).append(":").append(syncNode.getMethodName());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里不如直接输出类似: [sync block] xxxMethod() #13 ,不需要输出长长的类名等信息。

@hengyunabc
Copy link
Collaborator

另外,这个demo不是用户直接粘到ide里就能跑的,最好把完整可运行代码贴上来。

@mikawudi
Copy link
Contributor Author

好的,前两周在工作交接没注意github的事情,这两天我修改一下~

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@24kpure
Copy link

24kpure commented May 5, 2022

Cool !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants