From FrameTime to Stutter: a discriminator language for the experience of lag.
Why high frame rates can still feel choppy, and what the machine sees the moment your eyes say it's stuttering.
the smallest unit of a frame
Every conversation about smoothness eventually collapses into a single number: the time interval between two visible frames. That is FrameTime: put simply, the time it takes for one frame to be rendered and finally reach your eyes.
But there is a distinction that gets quietly ignored: GPU render completion is not the same event as display refresh. The GPU finishing a frame (eglSwapBuffers) does not mean the frame has been pushed in front of you. What the player ultimately sees is the rhythm of the display, not the rhythm of the GPU.
the two faces of frame rate
FPS is usually read as "frame rate": the average number of frames shown per second. But that's only one face. It has a quieter twin, instantaneous frame rate: the real-time FPS computed back from a single frame's FrameTime.
The gap between these two numbers is exactly where the truth of "high FPS that still feels choppy" hides.
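To make the gap concrete, here is a minimal sketch that computes both numbers from a list of frame times. The sample data (39 on-time frames plus one 350ms hitch) is invented for illustration and does not come from any real capture.

```typescript
// Average FPS over a window: frames shown divided by elapsed time.
function averageFps(frameTimesMs: number[]): number {
  const totalMs = frameTimesMs.reduce((sum, t) => sum + t, 0);
  return (frameTimesMs.length / totalMs) * 1000;
}

// Instantaneous FPS: the rate implied by one frame's FrameTime.
function instantaneousFps(frameTimeMs: number): number {
  return 1000 / frameTimeMs;
}

// Hypothetical one-second window: 39 frames at ~16.67 ms and a single 350 ms hitch.
const sampleFrameTimes: number[] = [...new Array<number>(39).fill(1000 / 60), 350];

const avgFps = averageFps(sampleFrameTimes); // about 40 fps on paper
const worstFps = Math.min(...sampleFrameTimes.map(instantaneousFps)); // about 2.9 fps during the hitch

console.log(avgFps.toFixed(1), worstFps.toFixed(1));
```

The average looks respectable while the worst instantaneous value shows what the eye actually registered.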
Apple, at WWDC18, offered a now-classic comparison:
Aiming for 60 fps but landing at 40. FrameTime jumps violently, peaking at 117ms.
Feel: stop-and-go (micro-stuttering).
Locked at 30 fps. FrameTime stays uniform at 33ms.
Feel: unmistakably smooth.
A higher frame rate is no guarantee of smoothness.
The key to smooth isn't more frames; it's steadier ones. (WWDC18 · Apple)
At Google I/O 2012, Android 4.1 (Jelly Bean) arrived with an effort that carried a sweet codename: Project Butter. Before it, UI stutter was Android's signature flavour; after it, "smoothness" was written into the system-level design goals for the first time.
Project Butter turned "smoothness" into an engineering metric: anchored to the hardware vsync, each vsync that passes without a new image on screen counts as one Jank. It was an early industry admission that smoothness isn't a matter of averages; it's a matter of distribution.
To meet that metric, Google deployed four weapons at once:
VSync. All UI rendering aligns to the display's hardware refresh signal. The GPU no longer paints whenever it pleases; it commits one frame per vsync beat, eliminating tearing at the source.
Triple buffering. A third, spare buffer on top of the front/back pair. When the GPU occasionally lags by a beat, the display can still pick up a fresh frame from the spare, so no vsync fires empty.
Choreographer. An OS-level heartbeat. Input, animation, layout, and draw are all pinned to vsync moments and run in a fixed order, every frame: input → animate → measure → draw → display.
Touch boost. The instant a finger touches the screen, the CPU briefly raises its clock speed so the first response frame doesn't miss the vsync window because of power throttling. There must be no perceptible gap between press and reaction.
Single-buffer "tearing" sounds abstract, but it is very concrete. The display scans line by line, top to bottom, taking about 16.67ms per refresh at 60Hz. If the buffer is swapped for the next frame mid-scan, the top half still shows the old frame while the bottom half already shows the new one, and your eye sees a crack.
The scanline sweeps top → bottom; the buffer may be swapped only after a full pass.
At 38% of the scan, the buffer is overwritten. The bottom half jumps to a different frame, and the seam between the two is the physical "tear".
Triple buffering tackles tearing and dropped frames, but it also slips an invisible latency between the user's finger and the screen: a command has to queue through three buffers before it reaches the panel. Every buffering strategy therefore carries its own latency cost on the way from the GPU to the player's eyes.
After that combination, PerfDog Jank, Apple Smoothness, and the Web Performance Budget all stand on its shoulders. Decomposing "smoothness" into a measurable, comparable, optimizable engineering language is Project Butter's real legacy to the mobile industry.
"Align to vsync" was still getting patches in 2024. In 2019 Google released Swappy, a frame-pacing library built for the "render rate ≠ display rate" beat mismatch. Say a game renders at 90fps on a 60Hz panel: Swappy paces presentation to the refresh rate or an integer fraction of it (60 or 30fps here), preventing the micro-stutter of a misaligned cadence. Same 2012 idea, new tool, same goal.
visual inertia & cinematic frames
Where does the feeling of lag actually come from? The answer hides in two seemingly unrelated ideas:
Visual inertia. The brain unconsciously uses the rhythm of recent frames to predict the next one. Steady at 60 fps, it expects 60. The moment the rhythm drops to 25, the prediction breaks, and that is where lag is born.
Same 25 fps, two different worlds: a steady 25 doesn't feel like lag; a sudden fall from 60 to 25 does.
The cinematic frame. Films typically run at 24 fps, with each frame lasting about 41.67ms. The text treats this as a physiological threshold: below this frame rate, the eye begins to detect discontinuity.
Put these two together and you arrive at PerfDog's full threshold system: measure rhythmic stability against the average of the previous three frames; measure absolute-duration tolerance in multiples of the cinematic frame.
Picture three balls, each looping left → right → left on its own track. Watch the rhythm, not the speed:
A: smooth, with a perfectly even rhythm.
B: faster on average, yet every pause is felt by the body.
C: half as fast, but uniform, and it feels far smoother than B.
In 2012 every conversation about smoothness assumed 60Hz. Today's flagship phones start at 120Hz, and iPhone Pro uses LTPO panels to vary the refresh rate from 1 to 120Hz, dropping to 1Hz when static to save battery and climbing to 120Hz when scrolling. The same image has a completely different rhythm at each of three refresh rates:
Readable, but with a faint smear during fast flicks.
The industry standard. "Smooth" was originally defined around this number.
The faster you flick, the more obvious the difference. But the per-frame budget is cut in half, so the render burden doubles.
a discriminator function
To turn those two intuitions into a function the tooling can actually run, PerfDog uses a paired condition: a frame must violate both "rhythmic stability" and "duration tolerance" to count as one Jank.
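Written out, the paired condition looks roughly like this. The symbols are my own notation ($T_n$ is the FrameTime of frame $n$ in milliseconds); the thresholds are the ones the text names, "2 × the average of the previous three frames" and "N × the cinematic frame".

```latex
\begin{aligned}
T_{\text{movie}} &= \frac{1000}{24}\ \text{ms} \approx 41.67\ \text{ms} \\[4pt]
\text{Jank}(n) &\iff T_n > 2 \cdot \frac{T_{n-1} + T_{n-2} + T_{n-3}}{3}
  \ \land\ T_n > 2\,T_{\text{movie}} \approx 83.33\ \text{ms} \\[4pt]
\text{BigJank}(n) &\iff T_n > 2 \cdot \frac{T_{n-1} + T_{n-2} + T_{n-3}}{3}
  \ \land\ T_n > 3\,T_{\text{movie}} = 125\ \text{ms}
\end{aligned}
```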
The two formulas differ by a single number: swap "2 × cinematic frame (83ms)" for "3 × cinematic frame (125ms)". Jank is the threshold where the eye starts to feel lag; BigJank is where it definitely registers a hitch. A frame is counted only if it crosses both lines: the relative "2 × average of the previous three frames" and the absolute "N × cinematic frame".
"2 × average of the previous three" and "1000/24 × 2" look like back-of-the-napkin numbers, but each has a clear engineering rationale. The relative threshold encodes visual inertia: a frame that takes twice as long as the recent rhythm breaks the brain's prediction. The absolute threshold encodes the cinematic frame: a frame has to outlast two film frames before its sheer duration becomes intolerable.
PerfDog is one discriminator among several. Two other mainstream definitions exist, and given the same frame data, the three may return entirely different answers:
Vsync counting. Every missed vsync is one Jank. The simplest definition, but it ignores the felt difference between brief jitter and a long freeze.
PerfDog. A paired condition: the rhythm must break and the frame must be absolutely too long. Also graded into two severities, Jank and BigJank.
Hitch time. Doesn't count events; it sums the portion of each frame that runs past vsync and reports it in ms/s. Apple's guidance treats a hitch time ratio under 5 ms/s as good and above 10 ms/s as critical.
Feed the same 12-frame data from FIG 08 into all three discriminators and the results diverge significantly:
Vsync counting: F4 misses 5 vsyncs · F9 misses 8 · F10 misses 2. Total: 15 Janks.
PerfDog: F4 trips Jank · F9 trips BigJank · F10 crossed 33ms but not 83ms, so it isn't counted. Total: 1 Jank + 1 BigJank.
Hitch time: sum of the excess over 16.67ms: 73 + 123 + 33 = 229 ms, roughly 230 ms of hitch time, reported as a ratio.
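The comparison can be reproduced with a short sketch. The frame values are an assumption: the text gives only the excess times (73, 123, 33ms), so F4, F9 and F10 are reconstructed as roughly 90, 140 and 50ms, and every other frame is assumed to land on a 16.67ms vsync. The three functions are simplified readings of each definition, not any vendor's actual implementation.

```typescript
const VSYNC_MS = 16.67;           // 60 Hz period, rounded as in the text
const MOVIE_FRAME_MS = 1000 / 24; // cinematic frame, ~41.67 ms

// Hypothetical reconstruction of the 12 frames (F1..F12).
const fig08Frames: number[] = [
  16.67, 16.67, 16.67, 90, 16.67, 16.67, 16.67, 16.67, 140, 50, 16.67, 16.67,
];

// 1) Vsync counting: each refresh that passes without a new frame is one Jank.
function countMissedVsyncs(frames: number[]): number {
  return frames.reduce((n, t) => n + Math.max(0, Math.ceil(t / VSYNC_MS) - 1), 0);
}

// 2) PerfDog-style paired condition: rhythm broken AND absolutely too long.
function countPerfDogJanks(frames: number[]): { jank: number; bigJank: number } {
  let jank = 0;
  let bigJank = 0;
  frames.forEach((t, i) => {
    if (i < 3) return; // need three previous frames for the average
    const prevAvg = frames.slice(i - 3, i).reduce((s, x) => s + x, 0) / 3;
    if (t <= 2 * prevAvg) return;            // rhythm not broken
    if (t > 3 * MOVIE_FRAME_MS) bigJank++;   // longer than 125 ms
    else if (t > 2 * MOVIE_FRAME_MS) jank++; // longer than ~83.33 ms
  });
  return { jank, bigJank };
}

// 3) Hitch time: total time spent past the vsync deadline.
function sumHitchTimeMs(frames: number[]): number {
  return frames.reduce((sum, t) => sum + Math.max(0, t - VSYNC_MS), 0);
}

console.log(countMissedVsyncs(fig08Frames));         // 15
console.log(countPerfDogJanks(fig08Frames));         // { jank: 1, bigJank: 1 }
console.log(sumHitchTimeMs(fig08Frames).toFixed(0)); // "230"
```

Note how F10 escapes the PerfDog count twice over: it stays under 83ms, and the 140ms frame just before it has already inflated the three-frame average.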
ratio over count
Jank is a count. Stutter is a ratio. The latter exists to answer a question Jank cannot: how severe is each stutter?
"Three Janks" can mean two very different experiences:
A and B share the same Jank count, but their Stutter values differ by nearly an order of magnitude. That is why Jank and Stutter must be read together, never as substitutes.
Assume both runs are 10-second tests, with 270ms and 1800ms of stutter time respectively. Drop the numbers into the definition (stutter time ÷ total time) and the gap is undeniable: 2.7% versus 18%.
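As a small sketch of that arithmetic, assuming Stutter is simply accumulated stutter time divided by total test time (the function name is my own):

```typescript
// Stutter as a percentage: time spent in janky frames over total test time.
function stutterPercent(stutterTimeMs: number, totalTimeMs: number): number {
  return (stutterTimeMs * 100) / totalTimeMs;
}

const runA = stutterPercent(270, 10_000);  // 2.7
const runB = stutterPercent(1800, 10_000); // 18

console.log(`A: ${runA}%  B: ${runB}%  ratio: ${(runB / runA).toFixed(1)}x`);
```

Same Jank count, but run B spends between six and seven times as much of the session stuck.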
Each segment of the industry has its own acceptable Stutter band, which can serve as a quick diagnostic ruler.
Jank counts events, Stutter measures share. But even together they bury one last layer of information: the shape of the FrameTime distribution. Frame times are inherently long-tailed: most are fast, and a handful drag the whole experience down. The mean alone is fooled by that tail.
The conclusion is simple: no Jank → Stutter is necessarily 0; with Jank → the two trend together but not linearly. Read both, or you'll miss the full picture of the experience. Add a distributional view (P50 / P95 / P99) on top, and only then have you really pinned down "smoothness".
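A distributional view takes only a few lines. The sketch below uses the nearest-rank percentile method, which is one common convention among several, and an invented sample of 100 frames with a long tail.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100)); // 1-based rank
  return sorted[rank - 1];
}

// Hypothetical capture: 93 on-time frames, 5 slow frames, 2 long hitches.
const frameSamples: number[] = [
  ...new Array<number>(93).fill(16.7),
  ...new Array<number>(5).fill(33.3),
  ...new Array<number>(2).fill(200),
];

const meanMs = frameSamples.reduce((s, t) => s + t, 0) / frameSamples.length;

console.log("mean", meanMs.toFixed(1)); // looks mild
console.log("P50", percentile(frameSamples, 50));
console.log("P95", percentile(frameSamples, 95));
console.log("P99", percentile(frameSamples, 99)); // exposes the tail
```

The mean stays close to the typical frame, while P99 lands on the hitches that users actually remember.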
where to look, by scenario
Should every app and every game watch FPS, Jank, and Stutter all at once? The honest answer: it depends on the scenario. The same number carries a different meaning, a different threshold, and a different fix in different contexts.
Games are the one place where none of the three can be dropped. The player's finger, the image, and their expectations are all synchronized on a single frame. FPS shapes the feel of response; Jank decides whether key moments collapse; Stutter sets the baseline quality of the entire session.
Two practical notes:
Apps don't get the luxury of a single metric set for everything. Within one app, the login page, the feed, and the video player each speak a totally different metric language:
Static page. Theoretical FPS should be 0: with nobody interacting, nothing should redraw. Anything above zero means an animation or timer is working in secret, which translates directly into hidden heat and battery drain.
Scroll / feed. Lock FPS at a reasonable value (30 / 60); don't chase 120 for its own sake. With a finger gliding at constant speed, 60fps is plenty; the extra frames just burn GPU and battery without being noticed.
Flick / spring. The source of mobile UI "responsiveness", and the very scenario Project Butter was born to fix. The faster the finger, the more sensitive the image, so Jank becomes lethal here: a single 100ms hitch is enough for a user to suspect a broken screen.
Target: FPS > 55, Stutter < 1%.
Video playback. Source video typically runs 18–24 fps; FPS must not drop and Jank must be 0. Any stutter desyncs lips from sound, and the viewer notices instantly.
An unusually high FPS here is suspicious: is the decoder repainting the same frame over and over?
| Scene | FPS | Jank | Stutter | Target |
|---|---|---|---|---|
| Game · combat | ≥ 55 | < 3 / min | < 1% | All three, bucketed by scene |
| Game · menu | 30 OK | < 1 / min | < 0.5% | UI smoothness first |
| App · static page | = 0 | — | — | Anything > 0 is a leak |
| App · scroll / feed | ≥ 30 | < 5 / min | < 3% | Stability over peak fps |
| App · flick / spring | ≥ 55 | < 2 / min | < 1% | Touch response first |
| Video playback | source fps | = 0 | < 0.5% | An unusually high FPS is also a bug |
knowing it lagged isn’t enough
The first six chapters defined what a stutter is. The harder engineering question follows immediately: why did this frame stutter? A long FrameTime can come from the CPU, the GPU, JS, layout, IO, hardware throttling, even the carrier's network. The height of the FrameTime bar alone won't tell you which one made it tall.
From engineering experience, the most common root causes fall into 6 families, each with its own "fingerprint" and a matching diagnostic tool:
CPU. JS, layout, or complex business logic synchronously monopolizing the main thread. Fingerprint: FrameTime spikes suddenly; one CPU core stays pegged at 100% for that frame.
GPU. Overdraw, heavy shaders, large texture uploads. Fingerprint: the CPU has already submitted its commands, but the frame is still waiting for the GPU to finish; the profiler shows GPU busy time longer than one vsync.
GC. Garbage collection in a JS or Java runtime firing synchronously on the main thread (Swift uses reference counting, so it has no collector pause of this kind). Fingerprint: periodic 50–200ms hitches every few seconds, accompanied by a sharp memory dip.
IO. Synchronous file reads and writes or synchronous RPC reaching the main thread. Fingerprint: FrameTime variance is enormous (from 10ms to several seconds) and strongly correlated with disk / network load.
COMPILE. Shaders compile on first use; V8 / JSCore promote hot JS to JIT. Fingerprint: hitches that appear only on first encounter. They reproduce after a restart and vanish after warm-up.
THERMAL. The kernel down-clocks once the device gets hot; tasks get scheduled onto the LITTLE cores by mistake. Fingerprint: a game holds 60fps for the first 5 minutes, then settles at 40fps from minute 6 onward, with the whole FrameTime baseline lifted.
When you actually debug, the profiler dices a frame's 16.67ms budget into stacked time blocks (Perfetto, systrace, and Instruments all present it in much the same way). The block that is abnormally long is the root cause for that frame.
Stitch all the concepts above into a real investigation. Scenario: an e-commerce product list page, where fast scrolling occasionally produces a clear ~200ms hitch. The dashboard shows an abnormal P99 FrameTime spike, and 5% of users report that "scrolling feels off". Five steps to catch it:
Step 1. P99 FrameTime normally holds at ~25ms; after a release went out today it jumped to 213ms, concentrated in the "product list · fast scroll" bucket. Stutter rose from 0.4% to 8%.
Step 2. Record 5 seconds of list scrolling in Perfetto / Instruments. On the frame timeline almost every frame sits near 16ms, except for one red 213ms frame in the middle, an anomaly visible at a glance.
Step 3. Open the 213ms frame; the profiler splits it into "input · animate · layout · draw · composite". Layout alone takes 180ms (normal: < 5ms). The culprit is locked in: layout.
Step 4. Expand the flame graph of the layout segment and find 100 identical stacks: scroll handler → checkVisible() → getBoundingClientRect(). Each call triggers a synchronous reflow: 100 × 1.8ms = 180ms. That pinpoints the offending line.
Step 5. Replace the synchronous getBoundingClientRect with IntersectionObserver, so visibility checks leave the scroll handler and arrive through the browser's built-in asynchronous notifications. After redeploy, P99 drops from 213ms back to 24ms and Stutter from 8% to 0.3%. Loop closed.
keeping Jank from happening
Chapter 7 told you where it lags. This one tells you how to keep it from lagging. The toolbox is large, but underneath there are really only three strategies: do less work per frame, move the work elsewhere, or accept a compromise.
Do less. Shrink the main-thread work in each frame. If you can skip a computation, skip it; if you can compute less, compute less. The plainest direction, and often the most effective.
Move it elsewhere. Same workload, different runner. Web Workers, async queues, and idle callbacks are all tools for moving must-do work off the main thread.
Accept compromise. Admit that a frame can't fit everything, then deliberately choose which frame to skip or what to draw more coarsely: lower the frame rate while scrolling, cull what is outside the viewport, downgrade quality on low-end devices. This is engineering honesty.
Task chunking. Slice a long task into pieces and let a vsync slip a frame in between each pair. A 50ms job split into 5 × 10ms lets the main thread paint frames in the gaps.
Virtualization. Out of 100,000 rows, render only the 20 currently on screen. React-window / RecyclerView / Compose LazyList all belong here.
Worker offload. Move JSON parsing, image encoding, and heavy computation to a Worker thread. The main thread receives the result without doing the work, so it doesn't stall.
Scheduling. requestAnimationFrame drives visuals; requestIdleCallback handles low-priority work. React 18's startTransition comes from the same school.
Pre-warming. Pre-compile shaders (PSO cache), pre-decode images, cache JS bytecode: pay the "first time" cost up front.
Dynamic downgrade. When FrameTime starts stretching, downgrade actively: halve the particle count, turn off shadows, skip animation frames. An honest compromise beats freezing solid.
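The last tactic, dynamic downgrade, can be sketched as a tiny governor. Everything here is hypothetical: the three quality tiers, the 20% margins, and the function name are my own choices to illustrate the idea, not values taken from any engine.

```typescript
const MAX_TIER = 2; // 0 = lowest quality, 2 = highest (hypothetical tiers)

// Decide the next quality tier from recent render times and the per-frame budget.
// The 1.2x / 0.8x margins give some hysteresis so the tier doesn't flap every frame.
function nextQualityTier(current: number, recentRenderMs: number[], budgetMs: number): number {
  if (recentRenderMs.length === 0) return current;
  const avgMs = recentRenderMs.reduce((s, t) => s + t, 0) / recentRenderMs.length;
  if (avgMs > budgetMs * 1.2) return Math.max(0, current - 1);        // over budget: step down
  if (avgMs < budgetMs * 0.8) return Math.min(MAX_TIER, current + 1); // clear headroom: step up
  return current;                                                     // close to budget: hold
}

console.log(nextQualityTier(2, [22, 23, 21], 16.67)); // 1: shed load
console.log(nextQualityTier(1, [10, 11, 9], 16.67));  // 2: headroom, restore quality
console.log(nextQualityTier(1, [16, 17, 16], 16.67)); // 1: hold
```

In a real renderer each tier would map to concrete switches such as particle count or shadow resolution; the point is only that the decision is driven by measured frame cost rather than by device model alone.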
The simplest example: a 50ms JSON parse hanging off a click handler. Run synchronously, it leaves the next 3 vsyncs without a new image, which is 3 Janks. Slice it into 5 pieces and yield between them, and the main thread can paint frames in the gaps. The amount of work is unchanged; the felt difference is an order of magnitude.
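A minimal sketch of the slicing idea, under two assumptions: the payload can be processed record by record (a single JSON.parse call cannot itself be interrupted), and a zero-delay setTimeout stands in for whatever yield primitive the platform offers. The helper names are hypothetical.

```typescript
// Give control back to the host so queued work (rendering, input) can run.
// setTimeout(0) is a portable stand-in; platforms may offer better primitives.
const yieldToHost = (): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, 0));

// Apply `work` to every item, pausing between chunks instead of blocking in one go.
async function mapInChunks<T, R>(
  items: readonly T[],
  chunkSize: number,
  work: (item: T) => R,
  pause: () => Promise<void> = yieldToHost,
): Promise<R[]> {
  if (chunkSize < 1) throw new Error("chunkSize must be at least 1");
  const results: R[] = [];
  for (let start = 0; start < items.length; start += chunkSize) {
    for (const item of items.slice(start, start + chunkSize)) {
      results.push(work(item));
    }
    // Yield only if more chunks remain.
    if (start + chunkSize < items.length) await pause();
  }
  return results;
}

// Usage: 5 chunks of 200 records, with 4 pauses in between.
const rawRecords = Array.from({ length: 1000 }, (_, i) => `{"id":${i}}`);
mapInChunks(rawRecords, 200, (text) => JSON.parse(text) as { id: number }).then((parsed) => {
  console.log(parsed.length); // 1000
});
```

The chunk size is the tuning knob: small enough that one chunk fits comfortably inside a frame budget, large enough that the yielding overhead stays negligible.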
Stitch Chapter 7's 6 root causes and the tactics above into one table. Next time you've named the cause, you can look up the prescription directly:
| Root cause | Primary fix | Typical tool / API |
|---|---|---|
| CPU · long main-thread task | Chunking · async scheduling · Web Worker | scheduler.yield · requestIdleCallback · postMessage |
| GPU · pipeline bottleneck | Reduce overdraw · smaller textures · simpler shaders | RenderDoc · texture atlas · mip-mapping |
| GC · pause | Object pools · fewer temporaries · avoid boxing | Object Pool · TypedArray · avoid closure temps |
| IO · main-thread IO / network | Async · preload · cache | async/await · Service Worker cache · IndexedDB |
| COMPILE · first-time shader / JIT | Pre-warm · PSO cache · bytecode cache | Vulkan PSO Cache · V8 snapshot · pre-warm |
| THERMAL · throttling | Dynamic downgrade · cap fps · pre-compute peaks | FPS auto-throttle · QoS class · pre-compute |
Six causes, three strategies, one matrix: that is more or less the whole optimization map. The remaining job isn't "knowing what tools exist" but "picking the right one for this scenario". There is no silver bullet, only the loop: measure, fix, measure again.
smoothness as a cross-platform dialect
All the concepts above sound mobile-specific. They aren't: they have same-shape, different-name twins on the Web and on iOS. Once you recognize "this is the same language", crossing platforms stops feeling foreign:
INP (Interaction to Next Paint). Replaced FID in 2024 as one of the three Core Web Vitals. It measures the time from a user interaction to the next paint, with a target under 200ms. Fundamentally, it is the Web's version of "Jank time".
Hitch time ratio. A system-level metric since iOS 14. The time by which each frame overshoots its vsync is summed and reported in ms per second. Apple's guidance treats a ratio under 5 ms/s as good and above 10 ms/s as critical.
Response-time limits. A 1968 IBM study: a response within 100ms feels "instant", under 1s feels "continuous", over 10s feels "interrupted". It has nothing to do with FrameTime, but it shares a root with the feeling of smoothness: both describe the user's tolerance curve for latency.
Smoothness (steady FrameTime) and responsiveness (low input latency) are different things, and they are frequently conflated. A complete "press → see it react" path crosses 5 stages, any one of which can become the bottleneck.
So the next time someone says "we're at 60fps, smooth", push back: What's your P99 FrameTime? Your input latency? Your hitch ratio? Truly smooth products can answer all three, and their numbers corroborate each other.
"Smoothness" sounds like a feeling.
In engineering it splits into FrameTime, FPS, visual inertia, cinematic frames, discriminator thresholds, and ratios.
Each number corresponds to a discomfort your body has memorised but cannot name.