Exceptions Caused by the Binder Mechanism
- Overview
- Exception analysis
- can't deliver broadcast
- FAILED BINDER TRANSACTION
- DeadObjectException
- TransactionTooLargeException
- DeadSystemException
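Overview
In day-to-day development, the binder-related exception you meet most often is probably RemoteException. This article, however, only analyzes exceptions caused by the binder mechanism itself; a RemoteException is usually just how some other failure in the server-side logic shows up on the client side.
The exceptions tied to the binder mechanism are: android.app.RemoteServiceException: can't deliver broadcast; JavaBinder: !!! FAILED BINDER TRANSACTION !!!; TransactionTooLargeException; DeadSystemException; and DeadObjectException. Some of these probably look familiar.
All of them are closely tied to the numbers discussed in the previous article, Android IPC with binder - Several Important Numbers.
Android IPC with binder - In Practice
Android IPC with binder - Several Important Numbers
Android IPC with binder - debug transaction
Android IPC with binder - The Essential Tool aidl
Android IPC with binder - The Upper-Layer Protocol IPCThreadState
Android IPC with binder - The Utility Class Parcel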
Exception analysis
can’t deliver broadcast
Fatal Exception: android.app.RemoteServiceException: can't deliver broadcast
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1813)
at android.os.Handler.dispatchMessage(Handler.java:102)
at android.os.Looper.loop(Looper.java:154)
at android.app.ActivityThread.main(ActivityThread.java:6776)
at java.lang.reflect.Method.invoke(Method.java)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:1520)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1410)
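At first glance this stack trace looks like a system problem that has nothing to do with the app. That was my first judgment too, and I stubbornly believed that BroadcastQueue itself must be broken.
Analyzing this problem means reading the source. First pin down the code around the point where the exception is raised. Let's read it together (all code in this article comes from aospxref, android11):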
void performReceiveLocked(ProcessRecord app, IIntentReceiver receiver,
        Intent intent, int resultCode, String data, Bundle extras,
        boolean ordered, boolean sticky, int sendingUser)
        throws RemoteException {
    // Send the intent to the receiver asynchronously using one-way binder calls.
    if (app != null) {
        if (app.thread != null) {
            // If we have an app thread, do the call through that so it is
            // correctly ordered with other one-way calls.
            try {
                app.thread.scheduleRegisteredReceiver(receiver, intent, resultCode,
                        data, extras, ordered, sticky, sendingUser, app.getReportedProcState());
            // TODO: Uncomment this when (b/28322359) is fixed and we aren't getting
            // DeadObjectException when the process isn't actually dead.
            //} catch (DeadObjectException ex) {
            // Failed to call into the process. It's dying so just let it die and move on.
            // throw ex;
            } catch (RemoteException ex) {
                // Failed to call into the process. It's either dying or wedged. Kill it gently.
                synchronized (mService) {
                    Slog.w(TAG, "Can't deliver broadcast to " + app.processName
                            + " (pid " + app.pid + "). Crashing it.");
                    app.scheduleCrash("can't deliver broadcast");
                }
                throw ex;
            }
        } else {
            // Application has died. Receiver doesn't exist.
            throw new RemoteException("app.thread must not be null");
        }
    } else {
        receiver.performReceive(intent, resultCode, data, extras, ordered,
                sticky, sendingUser);
    }
}
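The exception originates from the RemoteException thrown while calling scheduleRegisteredReceiver: the binder call through app.thread failed on its way to the server side. "Server" here does not mean the system_server process; it means the server end of the binder, which in this case is the app process. (A side question: where is this IBinder object assigned?) Since this is a binder call, there must be an AIDL interface, so let's find its definition: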
frameworks/base/core/java/android/app/IApplicationThread.aidl
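The method declaration in the AIDL file looks perfectly ordinary. At first glance it is a plain synchronous binder call, yet it is actually asynchronous. So where is the oneway keyword?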
void scheduleRegisteredReceiver(IIntentReceiver receiver, in Intent intent,
int resultCode, in String data, in Bundle extras, boolean ordered,
boolean sticky, int sendingUser, int processState);
The whole AIDL interface is declared oneway:
oneway interface IApplicationThread {
...
}
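Every method in the IApplicationThread interface is therefore a oneway call.
To go further we need to understand how an asynchronous binder call is executed.
One related point: on the server side, oneway calls to a given binder object are executed by only one thread at a time, so they are handled one after another. The driver code has the details, and a later article on the internals will analyze it closely.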
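To make the one-at-a-time behavior concrete, here is a toy model in plain Java. It does not use binder at all: a single-threaded executor stands in for the queue of oneway calls aimed at one binder object, and the names OnewayQueueModel and deliver are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model only: a single-threaded executor stands in for the queue of
// oneway transactions that target one binder object.
final class OnewayQueueModel {

    // Submits `count` fake oneway calls and returns the order in which the
    // "server" handled them. Call 0 is slow; the rest are instant.
    static List<Integer> deliver(int count) {
        List<Integer> handled = Collections.synchronizedList(new ArrayList<>());
        ExecutorService server = Executors.newSingleThreadExecutor();
        for (int i = 0; i < count; i++) {
            final int id = i;
            // submit() returns immediately, like a oneway call on the client side.
            server.submit(() -> {
                if (id == 0) {
                    sleepQuietly(200); // a slow handler delays everything behind it
                }
                handled.add(id);
            });
        }
        server.shutdown();
        try {
            server.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(handled);
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Calls are handled strictly in submission order.
        System.out.println(deliver(5)); // prints [0, 1, 2, 3, 4]
    }
}
```

In this model the client never waits, but calls 1 to 4 cannot start until the slow call 0 has finished. With real binder, the data of the waiting calls would also keep occupying buffer space in the meantime.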
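The arguments of this interface do not add up to much, so how can the call exceed 512 KB? The space had already been taken by earlier binder calls: synchronous and asynchronous binder transactions share the same buffer region. To resolve the problem we still need to find out which earlier binder calls were involved, how much data they transferred, and whether they were slow to execute.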
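The following sketch models that accounting in plain Java. It is a toy, not driver code, and its numbers are assumptions: a receiver buffer of 1 MB minus two 4 KB pages, with half of it usable by oneway calls (which is where a figure of roughly 512 KB would come from). The class and method names are hypothetical.

```java
// Toy accounting model, not driver code. Sizes are assumptions:
// a 1 MB - 8 KB buffer, half of which may be used by oneway (async) calls.
final class AsyncSpaceModel {
    static final int BUFFER_SIZE = 1024 * 1024 - 2 * 4096; // assumed buffer size
    static final int ASYNC_LIMIT = BUFFER_SIZE / 2;        // assumed async budget

    private int asyncInUse = 0;

    // Returns true if a oneway transaction of `size` bytes still fits.
    boolean tryQueueOneway(int size) {
        if (asyncInUse + size > ASYNC_LIMIT) {
            return false; // the caller would see a failed transaction
        }
        asyncInUse += size;
        return true;
    }

    // Called when the receiver has finished handling a transaction.
    void release(int size) {
        asyncInUse -= size;
    }

    public static void main(String[] args) {
        AsyncSpaceModel model = new AsyncSpaceModel();
        int queued = 0;
        // A stuck receiver never calls release(), so 100 KB payloads pile up.
        while (model.tryQueueOneway(100 * 1024)) {
            queued++;
        }
        System.out.println("queued before failure: " + queued); // 5
        // A 20 KB call now fails although it is small by itself.
        System.out.println("small call fits: " + model.tryQueueOneway(20 * 1024)); // false
    }
}
```

The point of the model is that the failing call is not necessarily the large one. A small scheduleRegisteredReceiver call can be rejected because earlier calls still hold the space.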
FAILED BINDER TRANSACTION
DeadObjectException
With the principle analyzed above, this failed binder transaction is also a binder buffer space problem.
E/JavaBinder: !!! FAILED BINDER TRANSACTION !!! (parcel size = 40084)
W/System.err: android.os.DeadObjectException: Transaction failed on small parcel; remote process probably died
W/System.err: at android.os.BinderProxy.transactNative(Native Method)
W/System.err: at android.os.BinderProxy.transact(Binder.java:764)
W/System.err: at c.t.myapplication.IMyAidlInterface$Stub$Proxy.failedBinderError(IMyAidlInterface.java:149)
W/System.err: at c.t.myapplication.MainActivity$2.run(MainActivity.java:47)
W/System.err: at java.lang.Thread.run(Thread.java:764)
The code that produces this exception:
// Client-side calling code
int i = 0;
int[] val = new int[10000];
while (i < 30) {
    i++;
    new Thread(new Runnable() {
        @Override
        public void run() {
            try {
                binder.failedBinderError(val);
            } catch (RemoteException e) {
                e.printStackTrace();
            }
        }
    }).start();
}
// Service-side code; sleep simulates a long-running operation
@Override
public void failedBinderError(int[] val) throws RemoteException {
    try {
        Thread.sleep(1000000); // simulated long-running operation
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}
TransactionTooLargeException
W/System.err: android.os.TransactionTooLargeException: data parcel size 1200084 bytes
W/System.err: at android.os.BinderProxy.transactNative(Native Method)
W/System.err: at android.os.BinderProxy.transact(Binder.java:764)
W/System.err: at c.t.myapplication.IMyAidlInterface$Stub$Proxy.failedBinderError(IMyAidlInterface.java:149)
W/System.err: at c.t.myapplication.MainActivity$2.run(MainActivity.java:47)
W/System.err: at java.lang.Thread.run(Thread.java:764)
For easy comparison, here is the code that produces this exception as well:
// Client-side calling code
int i = 0;
int[] val = new int[300000];
while (i < 30) {
    i++;
    new Thread(new Runnable() {
        @Override
        public void run() {
            try {
                binder.failedBinderError(val);
            } catch (RemoteException e) {
                e.printStackTrace();
            }
        }
    }).start();
}
// The service-side code is the same function as above
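The code behind these two exceptions is almost identical: both start 30 threads that call the binder method at the same time, and the synchronous binder method is slow. Every call that goes through occupies memory on the binder service side. After enough repetitions the memory is exhausted, and one of the calls returns a failure. Do you remember how much service-side memory a binder call can occupy in this example? If not, go back to the previous article on binder's important numbers.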
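As a rough sanity check, the arithmetic can be sketched in Java. The buffer size below (1 MB minus two 4 KB pages) is an assumption for illustration, the parcel sizes are the ones printed in the two logs, and InFlightCalls is a made-up name. Real allocations also carry alignment and bookkeeping overhead, so treat the result as an estimate only.

```java
// Back-of-the-envelope model: how many blocked synchronous calls fit into
// the receiver's binder buffer. BUFFER_SIZE is an assumption (1 MB - 8 KB).
final class InFlightCalls {
    static final int BUFFER_SIZE = 1024 * 1024 - 2 * 4096;

    // Number of calls of `parcelSize` bytes that can be pending at once.
    static int maxInFlight(int parcelSize) {
        return BUFFER_SIZE / parcelSize;
    }

    public static void main(String[] args) {
        System.out.println(maxInFlight(40084));   // 25: the 26th pending call fails
        System.out.println(maxInFlight(1200084)); // 0: not even one call fits
    }
}
```

Under these assumptions, the 30 threads in the first test exhaust the buffer before all of them get through, while in the second test a single parcel is already larger than the whole buffer.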
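The two exceptions differ only because the int array val is allocated with a different size. When the transaction fails, a parcel of at most 200 KB produces the first exception, and a parcel larger than 200 KB produces the second.
Straight to the code:
After a binder call fails, signalExceptionForError runs and throws the exception that matches the specific error. This article only cares about FAILED_TRANSACTION.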
void signalExceptionForError(JNIEnv* env, jobject obj, status_t err,
        bool canThrowRemoteException, int parcelSize)
{
    switch (err) {
        case UNKNOWN_ERROR:
            jniThrowException(env, "java/lang/RuntimeException", "Unknown error");
            break;
        case NO_MEMORY:
            jniThrowException(env, "java/lang/OutOfMemoryError", NULL);
            break;
        case INVALID_OPERATION:
            jniThrowException(env, "java/lang/UnsupportedOperationException", NULL);
            break;
        case BAD_VALUE:
            jniThrowException(env, "java/lang/IllegalArgumentException", NULL);
            break;
        case BAD_INDEX:
            jniThrowException(env, "java/lang/IndexOutOfBoundsException", NULL);
            break;
        case BAD_TYPE:
            jniThrowException(env, "java/lang/IllegalArgumentException", NULL);
            break;
        case NAME_NOT_FOUND:
            jniThrowException(env, "java/util/NoSuchElementException", NULL);
            break;
        case PERMISSION_DENIED:
            jniThrowException(env, "java/lang/SecurityException", NULL);
            break;
        case NOT_ENOUGH_DATA:
            jniThrowException(env, "android/os/ParcelFormatException", "Not enough data");
            break;
        case NO_INIT:
            jniThrowException(env, "java/lang/RuntimeException", "Not initialized");
            break;
        case ALREADY_EXISTS:
            jniThrowException(env, "java/lang/RuntimeException", "Item already exists");
            break;
        case DEAD_OBJECT:
            // DeadObjectException is a checked exception, only throw from certain methods.
            jniThrowException(env, canThrowRemoteException
                    ? "android/os/DeadObjectException"
                    : "java/lang/RuntimeException", NULL);
            break;
        case UNKNOWN_TRANSACTION:
            jniThrowException(env, "java/lang/RuntimeException", "Unknown transaction code");
            break;
        case FAILED_TRANSACTION: {
            ALOGE("!!! FAILED BINDER TRANSACTION !!! (parcel size = %d)", parcelSize);
            const char* exceptionToThrow;
            char msg[128];
            // TransactionTooLargeException is a checked exception, only throw from certain methods.
            // FIXME: Transaction too large is the most common reason for FAILED_TRANSACTION
            // but it is not the only one. The Binder driver can return BR_FAILED_REPLY
            // for other reasons also, such as if the transaction is malformed or
            // refers to an FD that has been closed. We should change the driver
            // to enable us to distinguish these cases in the future.
            if (canThrowRemoteException && parcelSize > 200*1024) {
                // bona fide large payload
                exceptionToThrow = "android/os/TransactionTooLargeException";
                snprintf(msg, sizeof(msg)-1, "data parcel size %d bytes", parcelSize);
            } else {
                // Heuristic: a payload smaller than this threshold "shouldn't" be too
                // big, so it's probably some other, more subtle problem. In practice
                // it seems to always mean that the remote process died while the binder
                // transaction was already in flight.
                exceptionToThrow = (canThrowRemoteException)
                        ? "android/os/DeadObjectException"
                        : "java/lang/RuntimeException";
                snprintf(msg, sizeof(msg)-1,
                        "Transaction failed on small parcel; remote process probably died");
            }
            jniThrowException(env, exceptionToThrow, msg);
        } break;
        case FDS_NOT_ALLOWED:
            jniThrowException(env, "java/lang/RuntimeException",
                    "Not allowed to write file descriptors here");
            break;
        case UNEXPECTED_NULL:
            jniThrowNullPointerException(env, NULL);
            break;
        case -EBADF:
            jniThrowException(env, "java/lang/RuntimeException",
                    "Bad file descriptor");
            break;
        case -ENFILE:
            jniThrowException(env, "java/lang/RuntimeException",
                    "File table overflow");
            break;
        case -EMFILE:
            jniThrowException(env, "java/lang/RuntimeException",
                    "Too many open files");
            break;
        case -EFBIG:
            jniThrowException(env, "java/lang/RuntimeException",
                    "File too large");
            break;
        case -ENOSPC:
            jniThrowException(env, "java/lang/RuntimeException",
                    "No space left on device");
            break;
        case -ESPIPE:
            jniThrowException(env, "java/lang/RuntimeException",
                    "Illegal seek");
            break;
        case -EROFS:
            jniThrowException(env, "java/lang/RuntimeException",
                    "Read-only file system");
            break;
        case -EMLINK:
            jniThrowException(env, "java/lang/RuntimeException",
                    "Too many links");
            break;
        default:
            ALOGE("Unknown binder error code. 0x%" PRIx32, err);
            String8 msg;
            msg.appendFormat("Unknown binder error code. 0x%" PRIx32, err);
            // RemoteException is a checked exception, only throw from certain methods.
            jniThrowException(env, canThrowRemoteException
                    ? "android/os/RemoteException" : "java/lang/RuntimeException", msg.string());
            break;
    }
}
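This code hardly needs explanation; it reads almost like a list of keywords. It only assembles the exception message and throws the exception into the VM through JNI.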
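For readers who prefer Java, the FAILED_TRANSACTION branch above can be restated as a small standalone sketch. The class and method names are invented for illustration; only the 200 KB threshold and the exception names come from the native code. The 84 bytes added to the array payload are simply what both logs in this article show (40084 and 1200084 bytes for 10000 and 300000 ints), not a general rule.

```java
// Plain-Java restatement of the FAILED_TRANSACTION branch, for illustration.
// Only the threshold and the exception names come from the native code.
final class FailedTransactionHeuristic {
    static final int LARGE_PARCEL_THRESHOLD = 200 * 1024;

    // Name of the exception the native code would throw for a failed transaction.
    static String exceptionFor(int parcelSize, boolean canThrowRemoteException) {
        if (canThrowRemoteException && parcelSize > LARGE_PARCEL_THRESHOLD) {
            return "android.os.TransactionTooLargeException";
        }
        return canThrowRemoteException
                ? "android.os.DeadObjectException"
                : "java.lang.RuntimeException";
    }

    // int[] payload plus the 84 bytes of overhead observed in both logs.
    static int observedParcelSize(int intCount) {
        return intCount * 4 + 84;
    }

    public static void main(String[] args) {
        // 40084 bytes: below the threshold, reported as DeadObjectException
        System.out.println(exceptionFor(observedParcelSize(10000), true));
        // 1200084 bytes: above the threshold, reported as TransactionTooLargeException
        System.out.println(exceptionFor(observedParcelSize(300000), true));
    }
}
```

Note that the exception type reflects only the size of the failing parcel, not the actual cause. In the first test the remote process is alive and merely out of buffer space, yet the message still says it "probably died".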
DeadSystemException
/**
 * Rethrow this exception when we know it came from the system server. This
 * gives us an opportunity to throw a nice clean
 * {@link DeadSystemException} signal to avoid spamming logs with
 * misleading stack traces.
 * <p>
 * Apps making calls into the system server may end up persisting internal
 * state or making security decisions based on the perceived success or
 * failure of a call, or any default values returned. For this reason, we
 * want to strongly throw when there was trouble with the transaction.
 *
 * @throws RuntimeException
 */
@NonNull
public RuntimeException rethrowFromSystemServer() {
    if (this instanceof DeadObjectException) {
        throw new RuntimeException(new DeadSystemException());
    } else {
        throw new RuntimeException(this);
    }
}
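The sketch below shows how this plays out for a caller. So that it runs without the Android SDK, it uses stand-in classes whose "Fake" names are hypothetical; they only mimic the inheritance of RemoteException, DeadObjectException, and DeadSystemException. The callSystemService method imitates the common pattern of catching RemoteException and rethrowing it through rethrowFromSystemServer.

```java
// Stand-in classes so the sketch compiles without the Android SDK. The
// "Fake" names are hypothetical and only mimic the android.os hierarchy.
final class RethrowSketch {
    static class FakeRemoteException extends Exception {
        RuntimeException rethrowFromSystemServer() {
            if (this instanceof FakeDeadObjectException) {
                throw new RuntimeException(new FakeDeadSystemException());
            } else {
                throw new RuntimeException(this);
            }
        }
    }

    static class FakeDeadObjectException extends FakeRemoteException {}

    static class FakeDeadSystemException extends FakeDeadObjectException {}

    // Imitates a wrapper around a binder call into a system service.
    static void callSystemService(boolean transactionFails) {
        try {
            if (transactionFails) {
                // For example, a failed transaction on a small parcel.
                throw new FakeDeadObjectException();
            }
        } catch (FakeRemoteException e) {
            throw e.rethrowFromSystemServer();
        }
    }

    public static void main(String[] args) {
        try {
            callSystemService(true);
        } catch (RuntimeException e) {
            // prints "FakeDeadSystemException"
            System.out.println(e.getCause().getClass().getSimpleName());
        }
    }
}
```

Any DeadObjectException on such a call path ends up reported as a dead system, whether or not the remote side actually died.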
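Throughout an app's execution, its lifecycle, view display, and so on all require communication with services provided by system_server (activity, window, package). If a binder call into the system goes wrong, you may be puzzled when analyzing the log: system_server looks perfectly healthy, yet the app reports that the system is dead.
That wraps up the common binder exceptions. Next, we will talk about how to debug binder problems and find the root cause.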
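Copyright notice: Article title: Android IPC with binder - Possible Exceptions. The content was contributed by a community member and reflects only the author's views. To republish, contact the author and cite the source: https://m.elefans.com/dongtai/1729332620a1196587.html. This site only provides information storage, holds no ownership, and assumes no related legal liability. If you find content on this site suspected of plagiarism, infringement, or illegality, it will be removed immediately once verified.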