
mustaqahmed/user-activation-v2


User Activation v2 (UAv2)

A frame-hierarchy based model to track active user interaction.


⚠️ This page is no longer maintained! ⚠️

This is now part of the HTML spec (see the section on tracking user activation) and has shipped in Chrome 72.



Introduction

What's a user activation?

The term user activation means the state of a browsing session with respect to user actions: an "active" state typically implies either the user is currently interacting with the page through some input mechanism (typing, clicking with mouse etc.), or the user has completed some interaction since the page got loaded. (User gesture is a misleading term occasionally used to express the same idea, e.g. "allowing something only with a user gesture", even though a swipe gesture doesn't typically activate a page.)

Browsers control access to "abusable" APIs through user activation. The most obvious example of such an API is opening popups through window.open(): when rogue developers started to abuse the API to open popups arbitrarily, most (all?) browsers began to block popups when the user is not actively interacting with the page. Since then, browsers have gradually made many other APIs dependent on activation (more precisely, made them user activation gated), such as making an element fullscreen, vibrating a mobile device, and autoplaying media. To highlight the scope, ~30 different APIs in Chrome are user activation gated.

What's the problem today?

The Web is in a terrible state today in terms of user activation behavior. Because each browser has incrementally added user activation dependence to its own set of APIs over the course of many years, we see widely divergent behavior among major browsers. For example, popup-blocking behavior is inconsistent among major browsers for all non-trivial cases of user activation.

More importantly, the current HTML spec can't really fix the broken situation in the Web today because it lacks important details and doesn't fully reflect any current implementation.

How are we proposing to solve the problem?

User Activation v2 (UAv2) introduces a new user activation model that is simple enough for cross-browser implementation, and hence calls for a new spec from scratch as a long term fix for the Web. We prototyped the model in Chromium behind the flag --enable-features=UserActivationV2 in M67.

Details of the new model

Two-bit state per frame

The new model maintains a two-bit user activation state at every window object in the frame hierarchy:

  • HasSeenUserActivation: This is a sticky bit for the APIs that only need a signal of past user activation. The bit gets set on the first user action, and is never reset during the lifetime of the window object. Example APIs: <video> autoplay and Navigator.vibrate().

  • HasConsumableUserActivation: This is a transient bit for the APIs that need limited invocation per user interaction. The bit gets set on every user interaction, and is reset either after an expiry time defined by the browser or through a call to an activation-consuming API (e.g. window.open()).
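The two bits above can be sketched as a small state object. This is a minimal illustration only, not Chromium's implementation: the class name `UserActivationState`, its method names, and the 5000 ms expiry are all assumptions made for the example (the model leaves the expiry time to the browser). The transient bit is encoded here as an expiry timestamp so that "reset after an expiry time" needs no timer.

```typescript
// Hypothetical model of the per-window two-bit state (names are illustrative).
class UserActivationState {
  private readonly expiryMs: number;
  private hasSeen = false;                 // sticky bit: HasSeenUserActivation
  private transientExpiresAt = -Infinity;  // transient bit, stored as an expiry time

  constructor(expiryMs: number) {
    this.expiryMs = expiryMs;
  }

  // Called on every user interaction: sets both bits.
  activate(nowMs: number): void {
    this.hasSeen = true;
    this.transientExpiresAt = nowMs + this.expiryMs;
  }

  hasSeenUserActivation(): boolean {
    return this.hasSeen;
  }

  hasConsumableUserActivation(nowMs: number): boolean {
    return nowMs < this.transientExpiresAt;
  }

  // Resets the transient bit; returns false if there was nothing to consume.
  consume(nowMs: number): boolean {
    if (!this.hasConsumableUserActivation(nowMs)) return false;
    this.transientExpiresAt = -Infinity;
    return true;
  }
}

const activation = new UserActivationState(5000); // 5000 ms is an arbitrary choice
activation.activate(0);
const firstConsume = activation.consume(100);   // succeeds: the transient bit is set
const secondConsume = activation.consume(200);  // fails: the bit was already consumed
const stickyAfterConsume = activation.hasSeenUserActivation(); // sticky bit survives

activation.activate(1000);
const activeAfterExpiry = activation.hasConsumableUserActivation(7000); // expired by then
```

Passing the current time in explicitly keeps the sketch deterministic; a real implementation would read a clock instead.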

State propagation across frames

  • Any user interaction in a window object sets the activation bits in the window objects of all ancestor frames (including the window being interacted with). (See Related Links below for an API to modify this default behavior.)

  • Any consumption of the transient bit resets the transient bits in the window objects of the whole frame tree.
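The two propagation rules can be illustrated with a toy frame tree. This is a sketch under stated assumptions: `FrameNode` and its methods are hypothetical names, and the expiry of the transient bit is left out to keep the focus on propagation.

```typescript
// Hypothetical frame-tree model (names are illustrative; expiry is omitted).
class FrameNode {
  readonly parent: FrameNode | null;
  readonly children: FrameNode[] = [];
  sticky = false;     // HasSeenUserActivation
  transient = false;  // HasConsumableUserActivation

  constructor(parent: FrameNode | null = null) {
    this.parent = parent;
    if (parent) parent.children.push(this);
  }

  // Rule 1: an interaction sets both bits on this frame and on all its ancestors.
  notifyUserInteraction(): void {
    for (let f: FrameNode | null = this; f !== null; f = f.parent) {
      f.sticky = true;
      f.transient = true;
    }
  }

  // Rule 2: consumption resets the transient bit in the whole frame tree.
  consumeTransientActivation(): boolean {
    if (!this.transient) return false;
    let root: FrameNode = this;
    while (root.parent !== null) root = root.parent;
    const stack: FrameNode[] = [root];
    while (stack.length > 0) {
      const f = stack.pop()!;
      f.transient = false;
      stack.push(...f.children);
    }
    return true;
  }
}

const mainFrame = new FrameNode();
const child = new FrameNode(mainFrame);
const sibling = new FrameNode(mainFrame);
const grandchild = new FrameNode(child);
const frames = [mainFrame, child, sibling, grandchild];

child.notifyUserInteraction();
// Only the interacted frame and its ancestor become active.
const transientAfterClick = frames.map((f) => f.transient);

const consumed = mainFrame.consumeTransientActivation();
// Every transient bit is cleared; the sticky bits are untouched.
const transientAfterConsume = frames.map((f) => f.transient);
const stickyAfterConsume = frames.map((f) => f.sticky);
```

Note that the sibling and grandchild frames never become active in this example, since the interaction happened in neither of them nor in any of their descendants.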

Major functional changes

In Chromium, the main change introduced by this model is replacing stack-allocated per-process gesture tokens with per-frame states as described above. This effectively:

  1. removes the need for token storing/passing/syncing for every user API,

  2. changes activation visibility from stack-scoped to frame-scoped, and

  3. fuses multiple user interactions within the expiry time interval into a single activation.

Design docs

For further details on the model and Chromium implementation, see:

Classifying user activation gated APIs

Modern browsers already show different levels of activation-dependence for activation-aware APIs, and the Web needs a spec for this behavior. The UAv2 model induces a classification of user APIs into three distinct levels, making it easy for any user API to spec its activation-dependence in a concise yet precise manner. The levels are as follows, sorted by their "strength of dependence" on user activation (from strongest to weakest):

  • Transient activation consuming APIs: These APIs require the transient bit, and they consume the bit in each call to prevent multiple calls per user activation. E.g. window.open() in most (all?) browsers today.

  • Transient activation gated APIs: These APIs require the transient bit but don't consume it, so multiple calls are allowed per user activation until the transient bit expires. E.g. Element.requestFullscreen() in Chromium and many other browsers.

  • Sticky activation gated APIs: These APIs require the sticky activation bit, so they are blocked until the very first user activation. E.g. <video> autoplay and Navigator.vibrate() in Chromium.
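The three levels can be expressed as a single gate check over the two bits. The following is a minimal sketch, not an actual browser API: `ApiLevel`, `ActivationBits`, and `tryInvoke` are hypothetical names introduced for the example, and expiry of the transient bit is ignored.

```typescript
// Hypothetical gate check for the three API levels (names are illustrative).
type ApiLevel = "transient-consuming" | "transient-gated" | "sticky-gated";

interface ActivationBits {
  sticky: boolean;     // HasSeenUserActivation
  transient: boolean;  // HasConsumableUserActivation
}

// Returns whether a call at the given level would be allowed,
// consuming the transient bit where the level requires it.
function tryInvoke(bits: ActivationBits, level: ApiLevel): boolean {
  switch (level) {
    case "transient-consuming":
      if (!bits.transient) return false;
      bits.transient = false; // one call per user activation
      return true;
    case "transient-gated":
      return bits.transient;  // allowed repeatedly while the bit is set
    case "sticky-gated":
      return bits.sticky;     // allowed any time after the first activation
  }
}

// State right after a user interaction: both bits are set.
const bits: ActivationBits = { sticky: true, transient: true };
const results = [
  tryInvoke(bits, "transient-gated"),     // allowed
  tryInvoke(bits, "transient-gated"),     // allowed again, nothing was consumed
  tryInvoke(bits, "transient-consuming"), // allowed, and consumes the transient bit
  tryInvoke(bits, "transient-consuming"), // blocked until the next interaction
  tryInvoke(bits, "transient-gated"),     // blocked as well
  tryInvoke(bits, "sticky-gated"),        // still allowed
];
```

The ordering of the calls shows why the levels are sorted by strength: a consuming call affects every later transient check, while a sticky-gated call is unaffected by consumption.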

Our prototype implementation preserved all the APIs' past behavior in Chromium after a few (mostly minor) changes.

Demo

Compare these demos in Chrome 72+ vs. in all other browsers to see why UAv2 makes sense.

Related links