admin管理员组文章数量:1534191
今天有收到测试同事提的一个ANR的问题单,大家在平时的开发过程中应该都有碰到过这样的问题,尤其是跑monkey,各种错误,之前我们项目还有出现过在Fragment当中直接调用getActivity空指针的,monkey真是厉害啊!什么样的错误都能搞出来。碰到这种问题,一般来说都是应用进程内部处理耗时太久导致的,当然也有其他原因,大家可以在网上查一下,其他场景的我就不说了,今天只说我现在碰到的。
问题单的场景是这样的,跑monkey时,手机上原生的google地图应用出现ANR了,首先呢,我们来看一下trace.txt文件:
----- pid 26814 at 2016-07-06 19:13:28 -----
Cmd line: com.google.android.apps.maps
Build fingerprint: 'Letv/Le1s_IN/X3_HK:6.0/DHXOSOP5846507024D/1467449287:user/release-keys'
ABI: 'arm64'
Build type: optimized
Zygote loaded classes=4117 post zygote classes=4590
Intern table: 61923 strong; 25 weak
JNI: CheckJNI is off; globals=325 (plus 383 weak)
Libraries: /system/app/Maps/Maps.apk!/lib/arm64-v8a/libgmm-jni.so /system/lib64/libandroid.so /system/lib64/libcompiler_rt.so /system/lib64/libjavacrypto.so /system/lib64/libjnigraphics.so /system/lib64/libmedia_jni.so /system/lib64/libwebviewchromium_loader.so /system/priv-app/GmsCore/GmsCore.apk!/lib/arm64-v8a/libconscrypt_gmscore_jni.so /system/priv-app/GmsCore/GmsCore.apk!/lib/arm64-v8a/libgmscore.so libjavacore.so (10)
Heap: 1% free, 22MB/22MB; 86210 objects
suspend all histogram:Sum: 20.196ms 99% C.I. 3us-9267.200us Avg: 420.750us Max: 10277us
DALVIK THREADS (30):
"main" prio=5 tid=1 Runnable
| group="main" sCount=0 dsCount=0 obj=0x75dfe570 self=0x7faf047400
| sysTid=26814 nice=0 cgrp=default sched=0/0 handle=0x7fb45992c0
| state=R schedstat=( 1308888726 426333689 2604 ) utm=96 stm=34 core=1 HZ=100
| stack=0x7fecdee000-0x7fecdf0000 stackSize=8MB
| held mutexes= "mutator lock"(shared held)
at libcore.io.Memory.peekInt(Memory.java:160)
at java.nio.MemoryBlock.peekInt(MemoryBlock.java:246)
at java.nio.DirectByteBuffer.getInt(DirectByteBuffer.java:232)
at com.android.dex.Dex$Section.readInt(Dex.java:529)
at libcore.reflect.AnnotationAccess.getAnnotationSetOffset(AnnotationAccess.java:221)
at libcore.reflect.AnnotationAccess.getAnnotationReader(AnnotationAccess.java:529)
at libcore.reflect.AnnotationAccess.isAnonymousClass(AnnotationAccess.java:506)
at java.lang.Class.isAnonymousClass(Class.java:1236)
at java.lang.Class.getSimpleName(Class.java:1081)
at com.google.android.apps.gmm.l.y.<clinit>(PG:76)
at com.google.android.apps.gmm.l.ab.a(PG:41)
at com.google.android.apps.gmm.l.ao.a(PG:1047)
at com.google.android.apps.gmm.l.b.b.a(PG:1023)
at a.a.b.a(PG:46)
- locked <0x0b294a9a> (a a.a.b)
at com.google.android.apps.gmm.base.w.c.b(PG:171)
at com.google.android.apps.gmm.l.b.a.a(PG:1031)
at a.a.b.a(PG:46)
- locked <0x026bdfcb> (a a.a.b)
at com.google.android.apps.gmm.base.b.h.onNewIntent(PG:1088)
at android.app.Instrumentation.callActivityOnNewIntent(Instrumentation.java:1219)
at android.app.Instrumentation.callActivityOnNewIntent(Instrumentation.java:1231)
at android.app.ActivityThread.deliverNewIntents(ActivityThread.java:2796)
at android.app.ActivityThread.performResumeActivity(ActivityThread.java:3370)
at android.app.ActivityThread.handleResumeActivity(ActivityThread.java:3470)
at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2730)
at android.app.ActivityThread.-wrap12(ActivityThread.java:-1)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1572)
at android.os.Handler.dispatchMessage(Handler.java:111)
at android.os.Looper.loop(Looper.java:207)
at android.app.ActivityThread.main(ActivityThread.java:5896)
at java.lang.reflect.Method.invoke!(Native method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:789)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:679)
"Signal Catcher" daemon prio=5 tid=2 Runnable
| group="system" sCount=0 dsCount=0 obj=0x12c230a0 self=0x7fa8e52000
| sysTid=26821 nice=0 cgrp=default sched=0/0 handle=0x7fae6fe440
| state=R schedstat=( 7360537 222154 15 ) utm=0 stm=0 core=2 HZ=100
从这个文件当中,第一行我们可以看到当前发生ANR的进程包名,com.google.android.apps.maps就是我们的谷歌地图,后边是当前的调用栈信息,具体怎么看,我就不详细说了,大家请自行百度。从第一条主线程可以看出我们应用中的调用执行流程,整个的一个调用过程都很清楚。其次我们来看一下,system_app_anr@1467803621120文件:
Process: com.google.android.apps.maps
Flags: 0x38d83e45
Package: com.google.android.apps.maps v926101224 (9.26.1)
Activity: com.google.android.apps.maps/com.google.android.maps.MapsActivity
Subject: Input dispatching timed out (Waiting because no window has focus but there is a focused application that may eventually add a window when it finishes starting up.)
Build: Letv/Le1s_IN/X3_HK:6.0/DHXOSOP5846507024D/1467449287:user/release-keys
Android time :[2016-07-06 19:13:28.00] [103113.336]
CPU usage from 84623ms to 29ms ago:
31% 27729/com.google.android.googlequicksearchbox:search: 29% user + 2.1% kernel / faults: 75909 minor 4974 major
28% 18431/system_server: 17% user + 10% kernel / faults: 150254 minor 7422 major
17% 13882/com.yahoo.mobile.client.android.weather: 9.2% user + 8.4% kernel / faults: 256489 minor 6471 major
14% 192/mmcqd/0: 0% user + 14% kernel
11% 18062/surfaceflinger: 3.7% user + 7.8% kernel / faults: 15025 minor 314 major
6% 8656/com.google.android.gms: 5.1% user + 0.8% kernel / faults: 61837 minor 1800 major
…………………………
Android time :[2016-07-06 19:13:38.06] [103123.398]
CPU usage from 1ms to 8900ms later:
62% 18431/system_server: 31% user + 31% kernel / faults: 13821 minor 449 major
32% 27729/com.google.android.googlequicksearchbox:search: 25% user + 6.5% kernel / faults: 8209 minor 375 major
21% 13882/com.yahoo.mobile.client.android.weather: 8.3% user + 13% kernel / faults: 16612 minor 404 major
13% 26814/com.google.android.apps.maps: 8.7% user + 5% kernel / faults: 17040 minor 1139 major
10% 8656/zygote64: 8.6% user + 2.2% kernel / faults: 5543 minor 141 major
这个文件当中清楚的记录着发生ANR的时间点,说明系统是在19:13:38.06时间点发生并记录下ANR日志的,最后再来看一下我们的log日志文件,所有的main系统日志都记录在这里的,这个文件也是我们分析的主要目标,我们首先根据system_app_anr文件当中ANR的时间点找到对应的日志:
07-06 19:13:38.135 3371 3380 E Parcel : Reading a NULL string not supported here.
07-06 19:13:38.138 18431 18459 E ANRManager: ANR in com.google.android.apps.maps (com.google.android.apps.maps/com.google.android.maps.MapsActivity), time=103113235
07-06 19:13:38.138 18431 18459 E ANRManager: Reason: Input dispatching timed out (Waiting because no window has focus but there is a focused application that may eventually add a window when it finishes starting up.)
07-06 19:13:38.138 18431 18459 E ANRManager: Load: 39.27 / 36.57 / 35.24
07-06 19:13:38.138 18431 18459 E ANRManager: Android time :[2016-07-06 19:13:38.06] [103123.398]
很明显,是在140157行,紧跟着这行的前面几行记录了发生ANR的原因,这里的描述也非常清楚,Reason: Input dispatching timed out (Waiting because no window has focus but there is a focused application that may eventually add a window when it finishes starting up.)这句日志的打印是在ActivityManagerService当中的appNotResponding方法的5626行拼接打印出来的,appNotResponding方法的调用代码如下:
final void appNotResponding(ProcessRecord app, ActivityRecord activity,
ActivityRecord parent, boolean aboveSystem, final String annotation) {
ArrayList
firstPids = new ArrayList
(5);
SparseArray
lastPids = new SparseArray
(20);
//[+LEUI][RUBY-13467]zhangp add dump focued window at monkey
if (ActivityManager.isUserAMonkey() && mWindowManagerService.getFocusedWindowToken() == null) {
ActivityRecord ar = mStackSupervisor.getFocusedStack().mResumedActivity;
if (ar != null) {
Slog.i(TAG, "ANR ***** Current resumed activity window status as below *****");
mWindowManager.dumpWindowStatus(ar.appToken);
} else {
Slog.i(TAG, "ANR ***** Current resumed activity is null *****");
}
if (activity != null && ar != null && ar != activity) {
Slog.i(TAG, "ANR ***** Current anr activity window status as below *****");
mWindowManager.dumpWindowStatus(activity.appToken);
} else {
Slog.i(TAG, "ANR ***** Current anr activity cannot dump *****");
}
}
//[+LEUI]end
if (mController != null) {
try {
// 0 == continue, -1 = kill process immediately
int res = mController.appEarlyNotResponding(app.processName, app.pid, annotation);
if (res < 0 && app.pid != MY_PID) {
app.kill("anr", true);
}
} catch (RemoteException e) {
mController = null;
Watchdog.getInstance().setActivityController(null);
}
}
long anrTime = SystemClock.uptimeMillis();
if (MONITOR_CPU_USAGE) {
updateCpuStatsNow();
}
synchronized (this) {
// PowerManager.reboot() can block for a long time, so ignore ANRs while shutting down.
if (mShuttingDown) {
Slog.i(TAG, "During shutdown skipping ANR: " + app + " " + annotation);
return;
} else if (app.notResponding) {
Slog.i(TAG, "Skipping duplicate ANR: " + app + " " + annotation);
return;
} else if (app.crashing) {
Slog.i(TAG, "Crashing app skipping ANR: " + app + " " + annotation);
return;
}
// In case we come through here for the same app before completing
// this one, mark as anring now so we will bail out.
app.notResponding = true;
// Log the ANR to the event log.
EventLog.writeEvent(EventLogTags.AM_ANR, app.userId, app.pid,
app.processName, app.info.flags, annotation);
// Dump thread traces as quickly as we can, starting with "interesting" processes.
firstPids.add(app.pid);
int parentPid = app.pid;
if (parent != null && parent.app != null && parent.app.pid > 0) parentPid = parent.app.pid;
if (parentPid != app.pid) firstPids.add(parentPid);
if (MY_PID != app.pid && MY_PID != parentPid) firstPids.add(MY_PID);
for (int i = mLruProcesses.size() - 1; i >= 0; i--) {
ProcessRecord r = mLruProcesses.get(i);
if (r != null && r.thread != null) {
int pid = r.pid;
if (pid > 0 && pid != app.pid && pid != parentPid && pid != MY_PID) {
if (r.persistent) {
firstPids.add(pid);
} else {
lastPids.put(pid, Boolean.TRUE);
}
}
}
}
}
// Log the ANR to the main log.
StringBuilder info = new StringBuilder();
info.setLength(0);
info.append("ANR in ").append(app.processName);
if (activity != null && activity.shortComponentName != null) {
info.append(" (").append(activity.shortComponentName).append(")");
}
info.append("\n");
info.append("PID: ").append(app.pid).append("\n");
if (annotation != null) {
info.append("Reason: ").append(annotation).append("\n");
}
if (parent != null && parent != activity) {
info.append("Parent: ").append(parent.shortComponentName).append("\n");
}
final ProcessCpuTracker processCpuTracker = new ProcessCpuTracker(true);
File tracesFile = dumpStackTraces(true, firstPids, processCpuTracker, lastPids,
NATIVE_STACKS_OF_INTEREST);
//[+LEUI][XIII-13037]add
long percent = processCpuTracker.getCpuPercentByName("system_server");
if(DEBUG_TRACEVIEW && percent >= 50 ){
Slog.d(TAG, " debug traceview opened, auto save traceview file because system high cpu:"+percent);
asyncOpenTraceView(false);
}//[+LEUI]end
String cpuInfo = null;
if (MONITOR_CPU_USAGE) {
updateCpuStatsNow();
synchronized (mProcessCpuTracker) {
cpuInfo = mProcessCpuTracker.printCurrentState(anrTime);
}
info.append(processCpuTracker.printCurrentLoad());
info.append(cpuInfo);
}
info.append(processCpuTracker.printCurrentState(anrTime));
Slog.e(TAG, info.toString());
if (tracesFile == null) {
// There is no trace file, so dump (only) the alleged culprit's threads to the log
Process.sendSignal(app.pid, Process.SIGNAL_QUIT);
}
addErrorToDropBox("anr", app, app.processName, activity, parent, annotation,
cpuInfo, tracesFile, null);
if (mController != null) {
try {
// 0 == show dialog, 1 = keep waiting, -1 = kill process immediately
int res = mController.appNotResponding(app.processName, app.pid, info.toString());
if (res != 0) {
if (res < 0 && app.pid != MY_PID) {
app.kill("anr", true);
} else {
synchronized (this) {
mServices.scheduleServiceTimeoutLocked(app);
}
}
return;
}
} catch (RemoteException e) {
mController = null;
Watchdog.getInstance().setActivityController(null);
}
}
// Unless configured otherwise, swallow ANRs in background processes & kill the process.
boolean showBackground = Settings.Secure.getInt(mContext.getContentResolver(),
Settings.Secure.ANR_SHOW_BACKGROUND, 0) != 0;
synchronized (this) {
mBatteryStatsService.noteProcessAnr(app.processName, app.uid);
if (!showBackground && !app.isInterestingToUserLocked() && app.pid != MY_PID) {
app.kill("bg anr", true);
return;
}
// Set the app's notResponding state, and look up the errorReportReceiver
makeAppNotRespondingLocked(app,
activity != null ? activity.shortComponentName : null,
annotation != null ? "ANR " + annotation : "ANR",
info.toString());
//Set the trace file name to app name + current date format to avoid overrinding trace file
String tracesPath = SystemProperties.get("dalvik.vm.stack-trace-file", null);
if (tracesPath != null && tracesPath.length() != 0) {
File traceRenameFile = new File(tracesPath);
String newTracesPath;
int lpos = tracesPath.lastIndexOf (".");
if (-1 != lpos)
newTracesPath = tracesPath.substring (0, lpos) + "_" + app.processName + "_" + mTraceDateFormat.format(new Date()) + tracesPath.substring (lpos);
else
newTracesPath = tracesPath + "_" + app.processName;
traceRenameFile.renameTo(new File(newTracesPath));
}
// Bring up the infamous App Not Responding dialog
Message msg = Message.obtain();
HashMap
map = new HashMap
(); msg.what = SHOW_NOT_RESPONDING_MSG; msg.obj = map; msg.arg1 = aboveSystem ? 1 : 0; map.put("app", app); if (activity != null) { map.put("activity", activity); } mUiHandler.sendMessage(msg); } }
当然,括号当中的原因reason是外边传进来的,我们往上追踪一下,此处的调用是源于ActivityManagerService当中的inputDispatchingTimedOut来的,当然具体的原因reason还不是在这里出现的,此方法的代码如下:
/**
* Handle input dispatching timeouts.
* Returns whether input dispatching should be aborted or not.
*/
public boolean inputDispatchingTimedOut(final ProcessRecord proc,
final ActivityRecord activity, final ActivityRecord parent,
final boolean aboveSystem, String reason) {
if (checkCallingPermission(android.Manifest.permission.FILTER_EVENTS)
!= PackageManager.PERMISSION_GRANTED) {
throw new SecurityException("Requires permission "
+ android.Manifest.permission.FILTER_EVENTS);
}
final String annotation;
if (reason == null) {
annotation = "Input dispatching timed out";
} else {
annotation = "Input dispatching timed out (" + reason + ")";
}
if (proc != null) {
synchronized (this) {
if (proc.debugging) {
return false;
}
if (mDidDexOpt) {
// Give more time since we were dexopting.
mDidDexOpt = false;
return false;
}
if (proc.instrumentationClass != null) {
Bundle info = new Bundle();
info.putString("shortMsg", "keyDispatchingTimedOut");
info.putString("longMsg", annotation);
finishInstrumentationLocked(proc, Activity.RESULT_CANCELED, info);
return true;
}
}
mHandler.post(new Runnable() {
@Override
public void run() {
appNotResponding(proc, activity, parent, aboveSystem, annotation);
}
});
}
return true;
}
此方法的处理当中拼接了一部分我们最终看到的日志当中的Input dispatching timed out这部分,上面传过来的reason如果为空,那么原因就只有这段了,如果不为空,那就将reason原因拼接在括号里边。proc就是我们当前的进程,它是一个ProcessRecord对象,一般都是不为空的,然后呢就往ActivityManagerService的类变量mHandler中添加了一个Runable消息,最终就会执行这个Runable对象的run方法,就调用到我们上面了。
那好,我们继续往上追,inputDispatchingTimedOut方法是在InputMonitor类的notifyANR方法当中被调用的,这个方法的代码如下:
/* Notifies the window manager about an application that is not responding.
* Returns a new timeout to continue waiting in nanoseconds, or 0 to abort dispatch.
*
* Called by the InputManager.
*/
@Override
public long notifyANR(InputApplicationHandle inputApplicationHandle,
InputWindowHandle inputWindowHandle, String reason) {
AppWindowToken appWindowToken = null;
WindowState windowState = null;
boolean aboveSystem = false;
synchronized (mService.mWindowMap) {
if (inputWindowHandle != null) {
windowState = (WindowState) inputWindowHandle.windowState;
if (windowState != null) {
appWindowToken = windowState.mAppToken;
}
}
if (appWindowToken == null && inputApplicationHandle != null) {
appWindowToken = (AppWindowToken)inputApplicationHandle.appWindowToken;
}
if (windowState != null) {
Slog.i(WindowManagerService.TAG, "Input event dispatching timed out "
+ "sending to " + windowState.mAttrs.getTitle()
+ ". Reason: " + reason);
// Figure out whether this window is layered above system windows.
// We need to do this here to help the activity manager know how to
// layer its ANR dialog.
int systemAlertLayer = mService.mPolicy.windowTypeToLayerLw(
WindowManager.LayoutParams.TYPE_SYSTEM_ALERT);
aboveSystem = windowState.mBaseLayer > systemAlertLayer;
} else if (appWindowToken != null) {
Slog.i(WindowManagerService.TAG, "Input event dispatching timed out "
+ "sending to application " + appWindowToken.stringName
+ ". Reason: " + reason);
} else {
Slog.i(WindowManagerService.TAG, "Input event dispatching timed out "
+ ". Reason: " + reason);
}
mService.saveANRStateLocked(appWindowToken, windowState, reason);
}
if (appWindowToken != null && appWindowToken.appToken != null) {
try {
// Notify the activity manager about the timeout and let it decide whether
// to abort dispatching or keep waiting.
boolean abort = appWindowToken.appToken.keyDispatchingTimedOut(reason);
if (! abort) {
// The activity manager declined to abort dispatching.
// Wait a bit longer and timeout again later.
return appWindowToken.inputDispatchingTimeoutNanos;
}
} catch (RemoteException ex) {
}
} else if (windowState != null) {
try {
// Notify the activity manager about the timeout and let it decide whether
// to abort dispatching or keep waiting.
long timeout = ActivityManagerNative.getDefault().inputDispatchingTimedOut(
windowState.mSession.mPid, aboveSystem, reason);
if (timeout >= 0) {
// The activity manager declined to abort dispatching.
// Wait a bit longer and timeout again later.
return timeout;
}
} catch (RemoteException ex) {
}
}
return 0; // abort dispatching
}
我们当前的场景中,执行逻辑是进入到140行的else/if分支当中了,它是直接通过binder进程间调用来调用ActivityManagerService当中的inputDispatchingTimedOut方法的,其实这里也不是进程间调用了,而是同一进程,因为当前的InputMonitor和ActivityManagerService都是在系统进程SystemServer当中的,好了,我们再往上追。这里的notifyANR方法其实是在InputManagerService的同名方法notifyANR被调用的,这里的代码很简单,只有一句:
private long notifyANR(InputApplicationHandle inputApplicationHandle,
InputWindowHandle inputWindowHandle, String reason) {
return mWindowManagerCallbacks.notifyANR(
inputApplicationHandle, inputWindowHandle, reason);
}
mWindowManagerCallbacks是一个WindowManagerCallbacks对象,WindowManagerCallbacks是一个接口,实现类就是我们上面分析的InputMonitor,所以才会进入到InputMonitor的notifyANR方法当中。从代码的注释中很明显可以看出来,此处的调用是从native层过来的,因为Android的事件最根上都是由Linux驱动都检测到的,所以都是从那里一层一层分发上来的,那么中间肯定是要通过JNI了,从文件名称,我们就可以拼出来,JNI的文件名称肯定是com_android_server_input_InputManagerService.cpp,打开这个文件,我们可以看到定义了一个gServiceClassInfo结构体,当中都是调用Java层的方法名称,这也是Android当中JNI代码的习惯,我们在很多地方都可以找到这样的代码,我们来看一下具体的定义:
static struct {
jmethodID notifyConfigurationChanged;
jmethodID notifyInputDevicesChanged;
jmethodID notifySwitch;
jmethodID notifyInputChannelBroken;
jmethodID notifyANR;
jmethodID filterInputEvent;
jmethodID interceptKeyBeforeQueueing;
jmethodID interceptMotionBeforeQueueingWhenScreenOff;
jmethodID interceptKeyBeforeDispatching;
jmethodID dispatchUnhandledKey;
jmethodID checkInjectEventsPermission;
jmethodID getVirtualKeyQuietTimeMillis;
jmethodID getExcludedDeviceNames;
jmethodID getKeyRepeatTimeout;
jmethodID getKeyRepeatDelay;
jmethodID getHoverTapTimeout;
jmethodID getHoverTapSlop;
jmethodID getDoubleTapTimeout;
jmethodID getLongPressTimeout;
jmethodID getPointerLayer;
jmethodID getPointerIcon;
jmethodID getKeyboardLayoutOverlay;
jmethodID getDeviceAlias;
} gServiceClassInfo;
static struct {
jclass clazz;
} gInputDeviceClassInfo;
static struct {
jclass clazz;
} gKeyEventClassInfo;
static struct {
jclass clazz;
} gMotionEventClassInfo;
// --- Global functions ---
template
inline static T min(const T& a, const T& b) {
return a < b ? a : b;
}
template
inline static T max(const T& a, const T& b) {
return a > b ? a : b;
}
static jobject getInputApplicationHandleObjLocalRef(JNIEnv* env,
const sp
& inputApplicationHandle) {
if (inputApplicationHandle == NULL) {
return NULL;
}
return static_cast
(inputApplicationHandle.get())->
getInputApplicationHandleObjLocalRef(env);
}
static jobject getInputWindowHandleObjLocalRef(JNIEnv* env,
const sp
& inputWindowHandle) { if (inputWindowHandle == NULL) { return NULL; } return static_cast
(inputWindowHandle.get())-> getInputWindowHandleObjLocalRef(env); } static void loadSystemIconAsSprite(JNIEnv* env, jobject contextObj, int32_t style, SpriteIcon* outSpriteIcon) { PointerIcon pointerIcon; status_t status = android_view_PointerIcon_loadSystemIcon(env, contextObj, style, &pointerIcon); if (!status) { pointerIcon.bitmap.copyTo(&outSpriteIcon->bitmap, SkBitmap::kARGB_8888_Config); outSpriteIcon->hotSpotX = pointerIcon.hotSpotX; outSpriteIcon->hotSpotY = pointerIcon.hotSpotY; } } enum { WM_ACTION_PASS_TO_USER = 1, WM_ACTION_WAKE_UP = 2, WM_ACTION_GO_TO_SLEEP = 4, }; // --- NativeInputManager --- class NativeInputManager : public virtual RefBase, public virtual InputReaderPolicyInterface, public virtual InputDispatcherPolicyInterface, public virtual PointerControllerPolicyInterface { protected: virtual ~NativeInputManager(); public: NativeInputManager(jobject contextObj, jobject serviceObj, const sp
& looper); inline sp
getInputManager() const { return mInputManager; } void dump(String8& dump); void setDisplayViewport(bool external, const DisplayViewport& viewport); status_t registerInputChannel(JNIEnv* env, const sp
& inputChannel, const sp
& inputWindowHandle, bool monitor); status_t unregisterInputChannel(JNIEnv* env, const sp
& inputChannel); void setInputWindows(JNIEnv* env, jobjectArray windowHandleObjArray); void setFocusedApplication(JNIEnv* env, jobject applicationHandleObj); void setInputDispatchMode(bool enabled, bool frozen); void setSystemUiVisibility(int32_t visibility); void setPointerSpeed(int32_t speed); void setShowTouches(bool enabled); /* --- InputReaderPolicyInterface implementation --- */ virtual void getReaderConfiguration(InputReaderConfiguration* outConfig); virtual sp
obtainPointerController(int32_t deviceId); virtual void notifyInputDevicesChanged(const Vector
& inputDevices); virtual sp
getKeyboardLayoutOverlay(const String8& inputDeviceDescriptor); virtual String8 getDeviceAlias(const InputDeviceIdentifier& identifier); /* --- InputDispatcherPolicyInterface implementation --- */ virtual void notifySwitch(nsecs_t when, uint32_t switchValues, uint32_t switchMask, uint32_t policyFlags); virtual void notifyConfigurationChanged(nsecs_t when); virtual nsecs_t notifyANR(const sp
& inputApplicationHandle, const sp
& inputWindowHandle, const String8& reason); virtual void notifyInputChannelBroken(const sp
& inputWindowHandle); virtual bool filterInputEvent(const InputEvent* inputEvent, uint32_t policyFlags); virtual void getDispatcherConfiguration(InputDispatcherConfiguration* outConfig); virtual bool isKeyRepeatEnabled(); virtual void interceptKeyBeforeQueueing(const KeyEvent* keyEvent, uint32_t& policyFlags); virtual void interceptMotionBeforeQueueing(nsecs_t when, uint32_t& policyFlags); virtual nsecs_t interceptKeyBeforeDispatching( const sp
& inputWindowHandle, const KeyEvent* keyEvent, uint32_t policyFlags); virtual bool dispatchUnhandledKey(const sp
& inputWindowHandle, const KeyEvent* keyEvent, uint32_t policyFlags, KeyEvent* outFallbackKeyEvent); virtual void pokeUserActivity(nsecs_t eventTime, int32_t eventType); virtual bool checkInjectEventsPermissionNonReentrant( int32_t injectorPid, int32_t injectorUid); /* --- PointerControllerPolicyInterface implementation --- */ virtual void loadPointerResources(PointerResources* outResources); private: sp
mInputManager; jobject mContextObj; jobject mServiceObj; sp
mLooper; Mutex mLock; struct Locked { // Display size information. DisplayViewport internalViewport; DisplayViewport externalViewport; // System UI visibility. int32_t systemUiVisibility; // Pointer speed. int32_t pointerSpeed; // True if pointer gestures are enabled. bool pointerGesturesEnabled; // Show touches feature enable/disable. bool showTouches; // Sprite controller singleton, created on first use. sp
spriteController; // Pointer controller singleton, created and destroyed as needed. wp
pointerController; } mLocked; void updateInactivityTimeoutLocked(const sp
& controller); void handleInterceptActions(jint wmActions, nsecs_t when, uint32_t& policyFlags); void ensureSpriteControllerLocked(); // Power manager interactions. bool isScreenOn(); bool isScreenBright(); static bool checkAndClearExceptionFromCallback(JNIEnv* env, const char* methodName); static inline JNIEnv* jniEnv() { return AndroidRuntime::getJNIEnv(); } };
对notifyANR方法的调用是在NativeInputManager::notifyANR方法当中的,代码如下:
nsecs_t NativeInputManager::notifyANR(const sp
& inputApplicationHandle,
const sp
& inputWindowHandle, const String8& reason) {
#if DEBUG_INPUT_DISPATCHER_POLICY
ALOGD("notifyANR");
#endif
JNIEnv* env = jniEnv();
jobject inputApplicationHandleObj =
getInputApplicationHandleObjLocalRef(env, inputApplicationHandle);
jobject inputWindowHandleObj =
getInputWindowHandleObjLocalRef(env, inputWindowHandle);
jstring reasonObj = env->NewStringUTF(reason.string());
jlong newTimeout = env->CallLongMethod(mServiceObj,
gServiceClassInfo.notifyANR, inputApplicationHandleObj, inputWindowHandleObj,
reasonObj);
if (checkAndClearExceptionFromCallback(env, "notifyANR")) {
newTimeout = 0; // abort dispatch
} else {
assert(newTimeout >= 0);
}
env->DeleteLocalRef(reasonObj);
env->DeleteLocalRef(inputWindowHandleObj);
env->DeleteLocalRef(inputApplicationHandleObj);
return newTimeout;
}
这里传入的inputApplicationHandleObj、inputWindowHandleObj对象和我们Java层的InputApplicationHandle inputApplicationHandle、InputWindowHandle inputWindowHandle两个对象是一一对应的,在Android源码当中,我们见到比较多的,应该都是从Java层调用的C++的,像这样从C++反过来调用Java确实还不是很多,当然这里就是一个例子了,大家也可以看到这样的调用方式,都是用我们当前的JNIEnv*指针来调用。执行此方法的是一个NativeInputManager对象,它的notifyANR也是重写了纯虚类InputDispatcherPolicyInterface的notifyANR方法的,我们可以来看一下InputDispatcherPolicyInterface类的定义,在InputDispatcher.h当中:
class InputDispatcherPolicyInterface : public virtual RefBase {
protected:
InputDispatcherPolicyInterface() { }
virtual ~InputDispatcherPolicyInterface() { }
public:
/* Notifies the system that a configuration change has occurred. */
virtual void notifyConfigurationChanged(nsecs_t when) = 0;
/* Notifies the system that an application is not responding.
* Returns a new timeout to continue waiting, or 0 to abort dispatch. */
virtual nsecs_t notifyANR(const sp
& inputApplicationHandle,
const sp
& inputWindowHandle,
const String8& reason) = 0;
/* Notifies the system that an input channel is unrecoverably broken. */
virtual void notifyInputChannelBroken(const sp
& inputWindowHandle) = 0;
/* Gets the input dispatcher configuration. */
virtual void getDispatcherConfiguration(InputDispatcherConfiguration* outConfig) = 0;
/* Returns true if automatic key repeating is enabled. */
virtual bool isKeyRepeatEnabled() = 0;
/* Filters an input event.
* Return true to dispatch the event unmodified, false to consume the event.
* A filter can also transform and inject events later by passing POLICY_FLAG_FILTERED
* to injectInputEvent.
*/
virtual bool filterInputEvent(const InputEvent* inputEvent, uint32_t policyFlags) = 0;
/* Intercepts a key event immediately before queueing it.
* The policy can use this method as an opportunity to perform power management functions
* and early event preprocessing such as updating policy flags.
*
* This method is expected to set the POLICY_FLAG_PASS_TO_USER policy flag if the event
* should be dispatched to applications.
*/
virtual void interceptKeyBeforeQueueing(const KeyEvent* keyEvent, uint32_t& policyFlags) = 0;
/* Intercepts a touch, trackball or other motion event before queueing it.
* The policy can use this method as an opportunity to perform power management functions
* and early event preprocessing such as updating policy flags.
*
* This method is expected to set the POLICY_FLAG_PASS_TO_USER policy flag if the event
* should be dispatched to applications.
*/
virtual void interceptMotionBeforeQueueing(nsecs_t when, uint32_t& policyFlags) = 0;
/* Allows the policy a chance to intercept a key before dispatching. */
virtual nsecs_t interceptKeyBeforeDispatching(const sp
& inputWindowHandle,
const KeyEvent* keyEvent, uint32_t policyFlags) = 0;
/* Allows the policy a chance to perform default processing for an unhandled key.
* Returns an alternate keycode to redispatch as a fallback, or 0 to give up. */
virtual bool dispatchUnhandledKey(const sp
& inputWindowHandle, const KeyEvent* keyEvent, uint32_t policyFlags, KeyEvent* outFallbackKeyEvent) = 0; /* Notifies the policy about switch events. */ virtual void notifySwitch(nsecs_t when, uint32_t switchValues, uint32_t switchMask, uint32_t policyFlags) = 0; /* Poke user activity for an event dispatched to a window. */ virtual void pokeUserActivity(nsecs_t eventTime, int32_t eventType) = 0; /* Checks whether a given application pid/uid has permission to inject input events * into other applications. * * This method is special in that its implementation promises to be non-reentrant and * is safe to call while holding other locks. (Most other methods make no such guarantees!) */ virtual bool checkInjectEventsPermissionNonReentrant( int32_t injectorPid, int32_t injectorUid) = 0; };
好,到这里呢,还没有看到我们的reason,还需要继续往上跟踪,调用InputDispatcherPolicyInterface的notifyANR是在InputDispatcher当中的,看到这,是不是感觉离我们的终点越来越近了?它的调用是在InputDispatcher的doNotifyANRLockedInterruptible方法当中的,方法代码如下:
void InputDispatcher::doNotifyANRLockedInterruptible(
CommandEntry* commandEntry) {
mLock.unlock();
nsecs_t newTimeout = mPolicy->notifyANR(
commandEntry->inputApplicationHandle, commandEntry->inputWindowHandle,
commandEntry->reason);
mLock.lock();
resumeAfterTargetsNotReadyTimeoutLocked(newTimeout,
commandEntry->inputWindowHandle != NULL
? commandEntry->inputWindowHandle->getInputChannel() : NULL);
}
mPolicy是InputDispatcher的一个强引用的类变量,类型就是InputDispatcherPolicyInterface,调用它的notifyANR时候,会由实现方法NativeInputManager来执行。传入的参数三个参数inputApplicationHandle、inputWindowHandle、reason都通过一个CommandEntry*指针的类变量来获取的,CommandEntry*指针是在InputDispatcher::onANRLocked方法构建好的,InputDispatcher::onANRLocked方法的代码调用如下:
void InputDispatcher::onANRLocked(
nsecs_t currentTime, const sp
& applicationHandle,
const sp
& windowHandle,
nsecs_t eventTime, nsecs_t waitStartTime, const char* reason) {
float dispatchLatency = (currentTime - eventTime) * 0.000001f;
float waitDuration = (currentTime - waitStartTime) * 0.000001f;
ALOGI("Application is not responding: %s. "
"It has been %0.1fms since event, %0.1fms since wait started. Reason: %s",
getApplicationWindowLabelLocked(applicationHandle, windowHandle).string(),
dispatchLatency, waitDuration, reason);
// Capture a record of the InputDispatcher state at the time of the ANR.
time_t t = time(NULL);
struct tm tm;
localtime_r(&t, &tm);
char timestr[64];
strftime(timestr, sizeof(timestr), "%F %T", &tm);
mLastANRState.clear();
mLastANRState.append(INDENT "ANR:\n");
mLastANRState.appendFormat(INDENT2 "Time: %s\n", timestr);
mLastANRState.appendFormat(INDENT2 "Window: %s\n",
getApplicationWindowLabelLocked(applicationHandle, windowHandle).string());
mLastANRState.appendFormat(INDENT2 "DispatchLatency: %0.1fms\n", dispatchLatency);
mLastANRState.appendFormat(INDENT2 "WaitDuration: %0.1fms\n", waitDuration);
mLastANRState.appendFormat(INDENT2 "Reason: %s\n", reason);
dumpDispatchStateLocked(mLastANRState);
CommandEntry* commandEntry = postCommandLocked(
& InputDispatcher::doNotifyANRLockedInterruptible);
commandEntry->inputApplicationHandle = applicationHandle;
commandEntry->inputWindowHandle = windowHandle;
commandEntry->reason = reason;
}
这种调用方式,就像我们构建一个命令,然后执行这个命令一样,这里可以看到给命令传入的方法名就是doNotifyANRLockedInterruptible,在当前的InputDispatcher.cpp当中,我们可以直接搜索我们最终看到的原因:Waiting because no window has focus……,可以看到,我们最终看到的原因就是在这里产生的,然后一步步往下传,最后到ActivityManagerService当中打印出来的,还有两步调用就比较简单了,就是findFocusedWindowTargetsLocked方法当中会调用handleTargetsNotReadyLocked方法,转而调用onANRLocked,这里将两个方法的代码贴出来,具体的场景就不分析了,哇,看一下时间,0:06了!!夜深了,老婆叫我上床呢!!呵呵……
int32_t InputDispatcher::handleTargetsNotReadyLocked(nsecs_t currentTime,
const EventEntry* entry,
const sp
& applicationHandle,
const sp
& windowHandle,
nsecs_t* nextWakeupTime, const char* reason) {
if (applicationHandle == NULL && windowHandle == NULL) {
if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_SYSTEM_NOT_READY) {
#if DEBUG_FOCUS
ALOGD("Waiting for system to become ready for input. Reason: %s", reason);
#endif
mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_SYSTEM_NOT_READY;
mInputTargetWaitStartTime = currentTime;
mInputTargetWaitTimeoutTime = LONG_LONG_MAX;
mInputTargetWaitTimeoutExpired = false;
mInputTargetWaitApplicationHandle.clear();
}
} else {
if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY) {
#if DEBUG_FOCUS
ALOGD("Waiting for application to become ready for input: %s. Reason: %s",
getApplicationWindowLabelLocked(applicationHandle, windowHandle).string(),
reason);
#endif
nsecs_t timeout;
if (windowHandle != NULL) {
timeout = windowHandle->getDispatchingTimeout(DEFAULT_INPUT_DISPATCHING_TIMEOUT);
} else if (applicationHandle != NULL) {
timeout = applicationHandle->getDispatchingTimeout(
DEFAULT_INPUT_DISPATCHING_TIMEOUT);
} else {
timeout = DEFAULT_INPUT_DISPATCHING_TIMEOUT;
}
mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY;
mInputTargetWaitStartTime = currentTime;
mInputTargetWaitTimeoutTime = currentTime + timeout;
mInputTargetWaitTimeoutExpired = false;
mInputTargetWaitApplicationHandle.clear();
if (windowHandle != NULL) {
mInputTargetWaitApplicationHandle = windowHandle->inputApplicationHandle;
}
if (mInputTargetWaitApplicationHandle == NULL && applicationHandle != NULL) {
mInputTargetWaitApplicationHandle = applicationHandle;
}
}
}
if (mInputTargetWaitTimeoutExpired) {
return INPUT_EVENT_INJECTION_TIMED_OUT;
}
if (currentTime >= mInputTargetWaitTimeoutTime) {
onANRLocked(currentTime, applicationHandle, windowHandle,
entry->eventTime, mInputTargetWaitStartTime, reason);
// Force poll loop to wake up immediately on next iteration once we get the
// ANR response back from the policy.
*nextWakeupTime = LONG_LONG_MIN;
return INPUT_EVENT_INJECTION_PENDING;
} else {
// Force poll loop to wake up when timeout is due.
if (mInputTargetWaitTimeoutTime < *nextWakeupTime) {
*nextWakeupTime = mInputTargetWaitTimeoutTime;
}
return INPUT_EVENT_INJECTION_PENDING;
}
}
reason的原因最根上是从InputDispatcher::findFocusedWindowTargetsLocked方法的1237行传进来的,是从C++层调到我们Java层来的,这个方法的代码如下:
int32_t InputDispatcher::findFocusedWindowTargetsLocked(nsecs_t currentTime,
const EventEntry* entry, Vector
& inputTargets, nsecs_t* nextWakeupTime) {
int32_t injectionResult;
String8 reason;
// If there is no currently focused window and no focused application
// then drop the event.
if (mFocusedWindowHandle == NULL) {
if (mFocusedApplicationHandle != NULL) {
injectionResult = handleTargetsNotReadyLocked(currentTime, entry,
mFocusedApplicationHandle, NULL, nextWakeupTime,
"Waiting because no window has focus but there is a "
"focused application that may eventually add a window "
"when it finishes starting up.");
goto Unresponsive;
}
ALOGI("Dropping event because there is no focused window or focused application.");
injectionResult = INPUT_EVENT_INJECTION_FAILED;
goto Failed;
}
// Check permissions.
if (! checkInjectionPermission(mFocusedWindowHandle, entry->injectionState)) {
injectionResult = INPUT_EVENT_INJECTION_PERMISSION_DENIED;
goto Failed;
}
// Check whether the window is ready for more input.
reason = checkWindowReadyForMoreInputLocked(currentTime,
mFocusedWindowHandle, entry, "focused");
if (!reason.isEmpty()) {
injectionResult = handleTargetsNotReadyLocked(currentTime, entry,
mFocusedApplicationHandle, mFocusedWindowHandle, nextWakeupTime, reason.string());
goto Unresponsive;
}
// Success! Output targets.
injectionResult = INPUT_EVENT_INJECTION_SUCCEEDED;
addWindowTargetLocked(mFocusedWindowHandle,
InputTarget::FLAG_FOREGROUND | InputTarget::FLAG_DISPATCH_AS_IS, BitSet32(0),
inputTargets);
// Done.
Failed:
Unresponsive:
nsecs_t timeSpentWaitingForApplication = getTimeSpentWaitingForApplicationLocked(currentTime);
updateDispatchStatisticsLocked(currentTime, entry,
injectionResult, timeSpentWaitingForApplication);
#if DEBUG_FOCUS
ALOGD("findFocusedWindow finished: injectionResult=%d, "
"timeSpentWaitingForApplication=%0.1fms",
injectionResult, timeSpentWaitingForApplication / 1000000.0);
#endif
return injectionResult;
}
从这个方法的名字就很容易可以判断出,它是要寻找当前的焦点窗口。Android的事件分发流程当中,就是要将当前产生的事件分发到我们的焦点窗口上,从一开始的注释If there is no currently focused window and no focused application then drop the event.,也很容易明白,我们要找的焦点窗口肯定是在当前激活的进程当中的,否则事件分发就乱了,如果当前的激活进程不为空,但是焦点窗口为空,则要丢弃这个事件,因为焦点窗口才是我们事件分发的最终目标,目标都没有了,那就根本不用分发了。我们的问题单当中出现的也正是这样的原因。那么为什么焦点窗口没有呢?
回想一下我们前面讲过的Activity窗口的添加过程,一个窗口的添加是在ActivityThread当中的handleResumeActivity方法的wm.addView(decor, l)这句代码,最终调用到WindowManagerService中的addWindow方法才为我们添加上窗口的,那我们顺着这个思路,从这个时间点往前看一下,寻找ActivityThread关键字,可以找到如下信息:
07-06 19:13:21.008 26814 26814 V ActivityThread: Handling launch of ActivityRecord{ed5fa7c token=android.os.BinderProxy@b559f05 {com.google.android.apps.maps/com.google.android.maps.MapsActivity}} startsNotResumed=false
07-06 19:13:21.008 13882 16209 W SyncManager: The last updated time millis for woeid [12712251] was inserted with [1467803598024]
07-06 19:13:21.010 13882 26014 D NativeCrypto: ssl=0x7fa8e7bf00 NativeCrypto_SSL_get_certificate => NULL
07-06 19:13:21.015 13882 13882 D ActivityThread: BDC-RECEIVER handled : 0 / ReceiverData{intent=Intent { act=com.yahoo.mobile.client.android.weathersdk.action.ACTION_WEATHER_FORECAST_CHANGED flg=0x10 pkg=com.yahoo.mobile.client.android.weather cmp=com.yahoo.mobile.client.android.weather/.service.NotificationReceiver (has extras) } packageName=com.yahoo.mobile.client.android.weather resultCode=0 resultData=null resultExtras=null}
在这里的时候,应用主线程已经开始启动我们的目标窗口了,那肯定是进到我们应用当中了,那窗口为什么没有呢?再联系一下trace.txt当中的调用栈信息,肯定是我们在Activity的onResume方法当中逻辑耗时太久,导致WindowManagerService没有成功添加Activity的窗口,所以事件找不到目标,无法分发,所以产生ANR了。好了,这是我们的一个判断,那我们还需要找证据。事件分发,我们找一下InputDispatcher关键字,可以找到如下的信息:
07-06 19:13:27.781 18431 19430 D SettingsInterface: from settings cache , name = sys_storage_threshold_max_bytes , value = null
07-06 19:13:27.813 18431 18597 I InputDispatcher: Application is not responding: AppWindowToken{34356a0 token=Token{8c7dc0f ActivityRecord{32a3d6e u0 com.google.android.apps.maps/com.google.android.maps.MapsActivity, isShadow:false t2134}}}. It has been 8004.7ms since event, 8002.9ms since wait started. Reason: Waiting because no window has focus but there is a focused application that may eventually add a window when it finishes starting up.
07-06 19:13:27.827 18431 18597 I WindowManager: Input event dispatching timed out sending to application AppWindowToken{34356a0 token=Token{8c7dc0f ActivityRecord{32a3d6e u0 com.google.android.apps.maps/com.google.android.maps.MapsActivity, isShadow:false t2134}}}. Reason: Waiting because no window has focus but there is a focused application that may eventually add a window when it finishes starting up.
可以明显看到,在开始创建窗口,到大约6秒后,系统已经开始分发事件了,此时已经发现没有目标窗口,系统此时已经提醒我们一次了,当然后边呢,目标窗口还是没有创建好,最终就导致ANR了,所以方法就很明确了,就是在我们应用的onNewIntent方法启动Activity时(Instrumentation.callActivityOnNewIntent),执行了耗时逻辑,最终导致了ANR的发生,因为这里的应用是google地图,没有源码,所以无法往下分析,也只是给大家点个思路,大家以后如果有这样的bug,可以顺着这个思路展开。
无意在网上看到一些比较好的博客,也放在这里,推荐大家看看:
Android 系统稳定性 - ANR(一)
Android 系统稳定性 - ANR(二)
Android 系统稳定性 - ANR(三)
Android 系统稳定性 - OOM(一)
Android 系统稳定性 - OOM(二)
Android 性能优化(一)Android 系统稳定性 - Watchdog
Thread.java中定义的状态 | Thread.cpp中定义的状态 | 说明 |
TERMINATED | ZOMBIE | 线程死亡,终止运行 |
RUNNABLE | RUNNING/RUNNABLE | 线程可运行或正在运行 |
TIMED_WAITING | TIMED_WAIT | 执行了带有超时参数的wait、sleep或join函数 |
BLOCKED | MONITOR | 线程阻塞,等待获取对象锁 |
WAITING | WAIT | 执行了无超时参数的wait函数 |
NEW | INITIALIZING | 新建,正在初始化,为其分配资源 |
NEW | STARTING | 新建,正在启动 |
RUNNABLE | NATIVE | 正在执行JNI本地函数 |
WAITING | VMWAIT | 正在等待VM资源 |
RUNNABLE | SUSPENDED | 线程暂停,通常是由于GC或debug被暂停 |
| UNKNOWN | 未知状态 |
版权声明:本文标题:Android ANR问题原因分析 内容由热心网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:https://m.elefans.com/dongtai/1726874811a1088225.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论