Lighthouse and Core Web Vitals have brought concepts such as Largest Contentful Paint and Time to Interactive into many front-end developers’ lingo.
However, these three-letter acronyms are easy to confuse with each other. What’s CLS again? What’s the difference between FID and TTI?
I’d like to propose an alternative language, from the user’s experience:
When Can Users Do X?
User starts loading
User can see something
User can hear something
User can read
User can scroll
User can zoom
User can interact
User thinks they can interact
User knows they can interact (from previous experience)
User actually successfully interacted
User can share
User can close
This potentially goes broader than a particular browser life-cycle event. It starts to ask questions like “what’s the user’s goal in visiting this page?” Is it to read something themselves (say, a recipe), or to share the link with someone else (say, sending a booking page to a partner)?
Perhaps the user tried zooming, but because the page was still loading, content moved around like a newborn calf on a trampoline. So you’d say that the user couldn’t zoom until layout stopped shifting. I’m thinking less in terms of “Cumulative Layout Shift” and more in terms of “some users like to zoom, so when can they start doing that?”
And it begins to break the assumption that all users read the page by seeing, when some read by hearing instead. As great as tools like Lighthouse have been, they sort of assume the user is interacting with the page visually.
In the late 90s Steve Jobs was faced with a dilemma. Apple’s charismatic CEO couldn’t convince Adobe and Microsoft, makers of Photoshop and Excel, to make software for Apple’s upcoming operating system. Missing a ‘killer app’ like Photoshop on the Mac meant many people would buy a Windows PC instead. Apple conceded by investing substantial engineering effort in ‘Carbon’, a backwards-compatible software layer added to both their old and new operating systems. This stopgap gave Apple’s most important third-party developers a bridge from one system to the next, and Apple a bridge from one century to the next.
It worked. Apple dominated the first two decades of the 21st century, reinventing computers and upending markets. Now one of the most valuable companies ever with reliable quarterly profits in the tens of billions, it appears Apple has mastered the recipe for success. However, its key ingredients of user-friendly design, tight software/hardware integration, and ground-breaking technology are today all tightly controlled or obstructed by Apple. The same recipe is out of reach to many would-be competitors and collaborators, preventing the next wave of ‘killer apps’ from thriving.
Apple began their journey into people’s pockets twenty years ago with the iPod. The music player succeeded not only due to its striking industrial design and memorable ad campaigns with colourful dancing silhouettes, but thanks to deals with record companies Jobs had helped broker himself. He told writer Steven Levy “we walked in and we said, ‘We want to sell songs a la carte’”. The iTunes Music Store sold over one million songs in its first week.
This success was repeated on a grander scale with the iPhone and App Store, with ten million downloads of apps in its first weekend. Both were more capable versions of their predecessors: the iPhone had a larger screen and faster chips than the iPod, and apps could communicate with the internet and were far more interactive than a song. A wave of software accessible to everyday people followed — Instagram, TikTok, Uber, Airbnb — that took advantage of pocketable, always-online devices.
However, a recent court case between Apple and Epic Games found that 70% of Apple’s App Store revenue is from games. That leaves only 30% for everything else, including productivity software used by businesses such as Excel and Photoshop, which makes reports from Apple that they have “paid out over $200bn to developers since 2008” much less compelling. According to Sensor Tower, Adobe makes an estimated $10m per month from the App Store, which is roughly 1% of its $1bn total monthly revenue. And Microsoft earns even less from the App Store, even though its revenue is ten times Adobe’s.
New productivity tools like Notion, Jira, and GitHub are all web-first: while they may offer a companion app, the primary experience is designed to be used from a web browser, like Gmail or Google Docs. Another is Figma, a design tool founded ten years ago, which made $75m in 2020 and forecasted double that for 2021. GitHub, used to collaborate on the code behind software projects, was acquired by Microsoft in 2018 for $7.5bn, and was reported at the time to be making $200m–$300m annually. Microsoft’s own Office productivity suite, including Word and Excel, now works in web browsers (this article was written using it), and Adobe is working on the same for its flagship apps Photoshop and Illustrator.
The web has several key advantages over apps sold via the App Store. First, the web is an open standard that is purposefully designed to not be controlled by any single company. While web browsers such as Google’s Chrome, Microsoft’s Edge, Mozilla’s Firefox, and Apple’s Safari are all made by large corporations with their own interests, they all must work to a single standard that they collaborate over. iPhone apps are made using Apple-provided tools that are updated every year, sometimes breaking what previously worked, which adds maintenance overhead. The first web page made in 1991 still works in the browsers of today.
This also means developers who build using web standards aren’t locked into a particular platform. Instead of writing one app for iPhone, another for Android, and yet more for tablet and desktop platforms, they can build once and have the result adapt to whichever device a customer uses. Platform providers are incentivised to continue working with as many existing websites as possible, which puts much of the burden of maintenance and compatibility on them. And if a new device is introduced, it likely takes a developer less work to adapt an existing website than to provide another app.
Second, anyone can create and share a website provided they know how. Apple requires a $99 annual membership and submission to a board of reviewers before apps appear in their store. Membership can be revoked, as Epic Games found when their account was terminated in 2020, preventing them from being included in the App Store again. And reviewers are human and subject to the whims of Apple’s broader strategy. Apple Arcade is a game subscription service integrated into the iPhone, which helped contribute to the $54bn services revenue Apple made from its users in 2020. That same year Apple rejected the game streaming service xCloud. Its maker Microsoft responded “Apple stands alone as the only general purpose platform to deny consumers from cloud gaming”. xCloud is now available to iPhone users through the only alternative available: the web.
Finally, makers of web software can integrate with whatever payment system they prefer, whether it be PayPal, Stripe, or even cryptocurrency. Businesses can find the service that offers the lowest cost, or which makes refunding and recurring subscriptions easiest. On the App Store, Apple requires online payments be done through them, taking 15–30%. And refunds can’t be issued by a developer looking to please a dissatisfied customer. Instead, users must submit a request themselves via an Apple support form.
Innovation requires trial and error, and on the iPhone app developers are prevented from properly trialling their ideas. Apple prescribes their business model: they can’t charge users for an upgrade from one app version to the next, but instead must either offer new features for free or use a subscription model unpopular with many customers. Novel app ideas often never find their way onto the store as their creators find themselves reading an opaque Apple rejection letter. This prevents original ideas from even being started, since Apple only accepts submissions for fully-functional apps and not early concepts, making starting new ventures as risky as betting what the weather will be in 200 days’ time.
The web offers an escape hatch, but here Apple has leverage too. On the iPhone only a single implementation of the web standard is available, and that is Apple’s. Google, who primarily makes its money advertising on the web, is incentivised to make websites more like apps, and has been pushing the web standards in that direction for a number of years. Apple is incentivised to keep the web more limited than the platform-specific apps that help sell its devices. (They counterbalance this with a different approach: Google is estimated to have paid Apple $15bn in 2021 to remain the default search engine on the iPhone & iPad.)
Steve Jobs once said “design is how it works”. If innovation is being prevented from working by the company he founded, then a new wave of design is surely being missed.
When you visit a new city, one thing you expect to see is landmarks. Statues, botanical gardens, theaters, skyscrapers, markets. These landmarks help us navigate around unfamiliar areas by being reference points we can see on the horizon or on a map.
As makers of the web, we can also provide landmarks to people. These aren’t arbitrary. There are eight landmarks that are part of the HTML standard:
navigation
main
banner
search
form
contentinfo
complementary
region
Some of these seem obvious, but some are odd — what on earth does “contentinfo” mean? Let’s walk through what they are, why they are important to provide, and finally how we can really easily use them.
Nearly all websites have a primary navigation. It’s often presented as a row of links at the top of the page, or under a hamburger menu.
Stripe’s navigation provides links to the primary pages people want to visit. It’s clear, and follows the common practices of showing only a handful of links and placing the sign-in link on the far right.
Most visual users would identify this as the primary navigation of the site, and so you should denote it as such in your HTML markup. Here’s what you might write for Stripe’s navigation:
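A minimal sketch of that markup might look like the following. Only the Support and sign-in links are mentioned in this article; the other link names and every URL are placeholders, not Stripe’s actual markup.

```html
<!-- Sketch only: link names and hrefs are placeholders, not Stripe's real markup -->
<nav>
  <ul>
    <li><a href="/products">Products</a></li>
    <li><a href="/pricing">Pricing</a></li>
    <li><a href="/support">Support</a></li>
    <li><a href="/sign-in">Sign in</a></li>
  </ul>
</nav>
```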
Here we use HTML 5’s <nav> element, which automatically has the navigation landmark role.
If there’s only one navigation landmark on a page, then people using screen readers can jump straight to it and the links inside. They can visit Stripe’s Support page in a few seconds. It’s like a city subway that connects key landmarks, allowing fast travel between them.
What if you have multiple navigations? Let’s look at GitHub for an example.
Here we have a black bar offering links to the main parts of the GitHub experience: my pull requests, my issues, the marketplace, notifications, etc.
But I am on the page for a particular repository, and it also has its own navigation: Code, Issues, Pull requests, Actions, etc.
So how do we offer both? And how do people using screen readers know the difference? By attaching labels to each navigation: the top navigation has the label Global, and the repository-specific navigation has the label Repository. It’s like a city having multiple sports stadiums: here in Melbourne we have the MCG (used for football and cricket) and the Rod Laver Arena (used for tennis and music). They have clearly different names, which means people can find them easily and won’t mix them up.
Now people using screen readers or similar browser tools can see that there are two navigations to pick from, one named Global and one named Repository.
Note also we have an aria-current="page" attribute on the link that represents the page the user is on. This is equivalent to a “You Are Here” mark on a public map.
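As a rough sketch, the two labelled navigations could be marked up like this. The labels Global and Repository and the link names come from the description above; the URLs are placeholders and this is not GitHub’s actual markup.

```html
<!-- Sketch only: hrefs are placeholders, not GitHub's real markup -->
<nav aria-label="Global">
  <a href="/pulls">Pull requests</a>
  <a href="/issues">Issues</a>
  <a href="/marketplace">Marketplace</a>
</nav>

<nav aria-label="Repository">
  <!-- aria-current marks the link for the page the user is on -->
  <a href="/owner/repo" aria-current="page">Code</a>
  <a href="/owner/repo/issues">Issues</a>
  <a href="/owner/repo/pulls">Pull requests</a>
  <a href="/owner/repo/actions">Actions</a>
</nav>
```

The label should not include the word “navigation”, since screen readers typically announce the role alongside the label.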
When watching a show on Netflix, you’ll often be presented with a Skip intro button. This fast-forwards past the intro content, which is often the same every time, to the part you want to watch: the new episode.
Imagine if that Skip intro button didn’t exist: what would you do? You could watch the minute-long intro every time. Or you could attempt to fast-forward to the spot where the show actually starts. One is tedious and the other is error-prone. It would be a poor user experience.
On the web, our users might find themselves in the same situation. If they use a screen reader, they’ll probably hear all the items in our navigation and header. And then eventually they’ll reach the headline or the part that’s new — the part they are interested in — just like a TV episode. They could fast-forward, but that also would be error-prone. It would be great if we could allow them to skip past the repetitive stuff to the part they are actually interested in.
Enter <main>. Use this to wrap the part of the page where your ‘episode’ actually starts. People using screen readers can then skip past the tedious navigation and other preambles.
By using <main> we have allowed users to skip the intro.
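A minimal page skeleton, with placeholder content, shows where <main> sits relative to the repeated header and navigation:

```html
<body>
  <header>
    <!-- Repeated on every page: the 'intro' people want to skip -->
    <nav aria-label="Primary">…</nav>
  </header>
  <main>
    <!-- The 'episode': content unique to this page -->
    <h1>Page headline</h1>
    <p>The content people came for starts here.</p>
  </main>
</body>
```

A page should have only one visible <main> element, so that there is a single unambiguous place to skip to.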
We’ve already talked about the top strip on most websites, and these also have a role. Banners hold the primary navigation and also: logo, search field, notifications, profile, or other site-wide shortcuts. The banner often acts as the consistent branding across all pages.
Here’s GitHub’s banner when I’m signed in. The part I’ve highlighted with the yellow outline is the navigation (using <nav>). The entire element uses <header>, which automatically gains the role of banner if it meets the following (via MDN):
Assistive technologies can identify the main header element of a page as the banner if it is a descendant of the body element, and not nested within an article, aside, main, nav or section subsection.
So the following <header> has the role of banner:
<body> <header>…</header> </body> <!-- This <header> gains the banner role -->
While this one doesn’t:
<body> <main> <header>…</header> </main> </body> <!-- This <header> does not gain the banner role -->
And you can use multiple <header> elements for things other than banners, if you nest them inside “article, aside, main, nav or section” as MDN mentions.
Because of this, I might recommend that you add the banner role explicitly, as it will make it easier to identify and also target with CSS (e.g. header[role=banner] selector).
<header role="banner"> <!-- Add the role explicitly -->
<header> <!-- Because this is nested inside <main>, it won’t gain the banner role -->
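Put together as a full page, a sketch might look like this. The site name, headings, and the border style are placeholders chosen for illustration; the header[role=banner] selector is the one suggested above.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Banner example</title>
  <style>
    /* Matches only the explicit site-wide banner, not nested headers */
    header[role=banner] { border-bottom: 1px solid; }
  </style>
</head>
<body>
  <header role="banner">
    <a href="/">Site name</a>
    <nav aria-label="Primary">…</nav>
  </header>
  <main>
    <!-- Nested inside <main>, so this header is not a banner -->
    <header>
      <h1>Page title</h1>
    </header>
    <p>…</p>
  </main>
</body>
</html>
```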
Banners don’t necessarily have to be a horizontal strip. Twitter has a vertical banner:
The banner here is the entire left-hand column containing Home, Explore, etc. It’s also implemented with a <header role="banner">. The HTML 5 elements are named more for their concept than their visual appearance.
Search is one of the things that makes the web great. You have an idea of what you are looking for, you type it in, and in seconds you’ll likely be shown it.
Again we see a <form> with role="search". If you decide to add a search form to your site, make sure it has the search role.
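A minimal search form with the search role could be sketched as follows. The action URL, the field name q, and the id are assumptions for illustration.

```html
<!-- Sketch only: action, name, and id values are placeholders -->
<form role="search" action="/search" method="get">
  <label for="site-search">Search this site</label>
  <input type="search" id="site-search" name="q">
  <button type="submit">Search</button>
</form>
```

The visible label also gives the input an accessible name, so people know what they are searching.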
If you have another form not used for search, say for signing in or creating a new document, then the form role helps out here. The built-in <form> element actually already has the form role implicitly. So what’s left to do?
First, ensure it is labelled so people know what the form is for. That way if there’s multiple forms on a page, they can tell them apart. Also, people can jump straight to the form and start filling it out.
You can add a label by adding an aria-label attribute (note: avoid title):
<form aria-label="Create a new repository">
<h2>Create a new repository</h2>
…
</form>
Or by identifying which heading acts as the form’s label:
<form aria-labelledby="new-repo-heading">
<h2 id="new-repo-heading">Create a new repository</h2>
</form>
Note in both cases we still have a heading — your forms should probably have a label that is readable by all users, not just those using assistive-tech.
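To show how labels let people tell forms apart, here is a sketch of a page with two labelled forms. The sign-in form, the field names, and the action URLs are hypothetical. Browsers generally only expose a form as a landmark once it has an accessible name, which is another reason to label it.

```html
<!-- Sketch only: field names and action URLs are placeholders -->
<form aria-labelledby="sign-in-heading" action="/session" method="post">
  <h2 id="sign-in-heading">Sign in</h2>
  <label for="email">Email</label>
  <input type="email" id="email" name="email" autocomplete="email">
  <label for="password">Password</label>
  <input type="password" id="password" name="password" autocomplete="current-password">
  <button type="submit">Sign in</button>
</form>

<form aria-labelledby="new-repo-heading" action="/repositories" method="post">
  <h2 id="new-repo-heading">Create a new repository</h2>
  <label for="repo-name">Repository name</label>
  <input type="text" id="repo-name" name="name">
  <button type="submit">Create repository</button>
</form>
```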
Ok, so the names have been pretty logical so far. And then we come to contentinfo. What on earth does that mean?
Let’s show some examples of where contentinfo has been used in the wild:
It’s a footer! With lots of links. And a copyright.
Akin to the banner role and its automatic provider <header>, we can use <footer>:
<footer> <!-- Because this is nested inside <main>, it won’t gain the contentinfo role -->
<footer role="contentinfo"> <!-- Add the role explicitly -->
And also like <header>, it only gains the role when it belongs to <body> itself, that is, when it is not nested inside an article, aside, main, nav or section. However, it’s recommended that you add role="contentinfo" explicitly to the desired element due to long-running issues with Safari and VoiceOver.
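A sketch of both kinds of footer on one page follows. The links, the byline, and the company name are placeholders.

```html
<body>
  <main>
    <article>
      <h1>Post title</h1>
      <p>…</p>
      <!-- Nested inside <main>: this footer has no contentinfo role -->
      <footer>
        <p>Posted by the author</p>
      </footer>
    </article>
  </main>
  <!-- Site-wide footer: role added explicitly -->
  <footer role="contentinfo">
    <nav aria-label="Footer">
      <a href="/about">About</a>
      <a href="/privacy">Privacy</a>
    </nav>
    <p>© Example Company</p>
  </footer>
</body>
```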
Hierarchy is a core principle of visual design. Some parts of a design will be more important than others, and so it is important that the reader is aware of what they should draw their attention to, and what is less important.
Visual users are aided by size, layout, contrast — and so we need a semantic approach too for non-visual users. This might be a user using a screen-reader. Or it might be a search engine’s web crawler, or someone using the reader view available in Safari and Firefox.
A simple hierarchical relationship is primary content supported by complementary content. Some examples of these are:
Footnotes to an article
Links to related content
Comments on a post
Here’s an example article with footnotes, pull quotes, and related links:
<h1>Why penguins can’t fly</h1>
<p>Penguins are … </p>
<p>Their feathers … </p>
Penguins swim fast due to air bubbles trapped in their feathers<sup><a href="#footnote-1">1</a></sup>
<p>Speeds of … </p>
<p>They eat … </p>
<a href="https://www.nationalgeographic.com/magazine/2012/11/emperor-penguins/">National Geographic: Escape Velocity</a>
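The listing above shows only the content, not the elements that wrap it. One way to expose the supporting pieces as complementary content is the <aside> element, which maps to the complementary role. The sketch below assumes that approach; the wrapper elements, labels, and footnote text are illustrative rather than the original markup. Each <aside> is given a label so it can be told apart from the others.

```html
<main>
  <article>
    <h1>Why penguins can’t fly</h1>
    <p>Penguins are … </p>
    <!-- Pull quote: supports the article but is not part of its main flow -->
    <aside aria-label="Pull quote">
      <blockquote>Penguins swim fast due to air bubbles trapped in their feathers<sup><a href="#footnote-1">1</a></sup></blockquote>
    </aside>
    <p>Speeds of … </p>
  </article>
  <aside aria-label="Footnotes">
    <ol>
      <!-- Placeholder footnote text -->
      <li id="footnote-1">Source for the claim goes here.</li>
    </ol>
  </aside>
  <aside aria-label="Related links">
    <a href="https://www.nationalgeographic.com/magazine/2012/11/emperor-penguins/">National Geographic: Escape Velocity</a>
  </aside>
</main>
```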
We have covered seven landmarks — what’s left? The generic landmark of region. Use it as a last resort — first reach for one of the above landmarks.
Again, HTML 5 helps us out here: we can use <section>. It’s important that you add an aria-label attribute (or aria-labelledby) to name the landmark, so a user knows why it is important and can tell it apart from other landmarks.
<section aria-label="quick summary">
In this Smashing TV webinar recording, join Léonie Watson (a blind screen reader user) as she explores the web…
</section>
This allowed Léonie (who suggested the change) to identify the summary, and skip it if she liked.
Remember, use the navigation, banner, and contentinfo roles (<nav>, <header>, <footer>) before using region. The HTML spec gives these examples of sections:
Examples of sections would be chapters, the various tabbed pages in a tabbed dialog box, or the numbered sections of a thesis. A Web site’s home page could be split into sections for an introduction, news items, and contact information.
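The home page case from the spec could be sketched like this, with each section named by its own heading through aria-labelledby. The ids and headings are placeholders.

```html
<main>
  <!-- Each section is named by its heading, making it a labelled region -->
  <section aria-labelledby="intro-heading">
    <h2 id="intro-heading">Introduction</h2>
    <p>…</p>
  </section>
  <section aria-labelledby="news-heading">
    <h2 id="news-heading">News</h2>
    <p>…</p>
  </section>
  <section aria-labelledby="contact-heading">
    <h2 id="contact-heading">Contact information</h2>
    <p>…</p>
  </section>
</main>
```

Using aria-labelledby rather than aria-label keeps the name visible to everyone and avoids maintaining the same text in two places.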
We’ve been using <article> in some of the examples previously — is this also a landmark? The answer is technically no, but more or less yes. Bruce Lawson goes into detail on why you should use <article> over <section>:
So a homepage with a list of blog posts would be a <main> element wrapping a series of <article> elements, one for each blog post. You would use the same structure for a list of videos (think YouTube) with each video being wrapped in an <article>, a list of products (think Amazon) and so on. Any of those <article>s is conceptually syndicatable — each could stand alone on its own dedicated page, in an advert on another page, as an entry in an RSS feed, and so on.
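The structure described in that quote can be sketched as follows. The post titles, URLs, and summaries are placeholders.

```html
<main>
  <h1>Latest posts</h1>
  <!-- Each post could stand alone on its own page or in a feed -->
  <article>
    <h2><a href="/posts/first-post">First post</a></h2>
    <p>Summary of the first post…</p>
  </article>
  <article>
    <h2><a href="/posts/second-post">Second post</a></h2>
    <p>Summary of the second post…</p>
  </article>
</main>
```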
An article element also helps stripped-back browsers, such as the one on Apple Watch or a reader view, know what content to jump to. And many screen readers will surface articles as places of interest.
I encourage you to view landmarks on news sites, social media such as Twitter, web apps such as GitHub, and everything in between. You’ll find that there’s a fair amount of consistency, and some will be better than others. You’ll also have a bar to meet when building your own.
These landmarks apply to all websites: landing pages, documentation, single-page-apps, and everything in between. They ensure all users can orient themselves to quickly become familiar with and navigate around your creation.
They also provide a consistent language that we can design and build around. Share this and other articles (which I’ll link to below) with developers, designers, and managers on your team. Landmarks provide familiarity, which leads to happier users.
It’s rare to find job titles like software gardener, or information librarian (even though they would be just as valid as other terms we’ve made up like software engineer or information architect). Outside of the context of open source projects, we don’t talk much about maintenance. We’re much more likely to talk about making.
Anyone who’s spent any time working on design systems can tell you there’s no shortage of enthusiasm for architecture and making—“let’s build a library of components!”
There’s less enthusiasm for gardening, care, communication and maintenance. But that’s where the really important work happens.