Headless browser - Puppeteer

About

Puppeteer is a Node library that provides a high-level API to control Chrome or Chromium, which it runs headless by default.

Puppeteer communicates with the browser via the DevTools Protocol.

API

The Puppeteer API is hierarchical and mirrors the browser structure.
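
As a rough illustration of that hierarchy, the sketch below walks a browser from the top down: Browser, then BrowserContext, then Page, then Frame. The helper names describeBrowser and describeFrame are hypothetical. The sketch assumes the objects expose browserContexts(), pages(), mainFrame(), childFrames() and url() as in the Puppeteer API.

```javascript
// Hypothetical helper: returns a nested description of a Puppeteer Browser.
// Assumed hierarchy: Browser > BrowserContext > Page > Frame.
async function describeBrowser(browser) {
  const contexts = [];
  for (const context of browser.browserContexts()) {
    const pages = [];
    for (const page of await context.pages()) {
      pages.push({
        url: page.url(),
        mainFrame: describeFrame(page.mainFrame()),
      });
    }
    contexts.push({ pages });
  }
  return { contexts };
}

// Recursively describes a frame and its child frames (for example iframes).
function describeFrame(frame) {
  return {
    url: frame.url(),
    children: frame.childFrames().map((child) => describeFrame(child)),
  };
}

// Usage with a real browser (requires puppeteer to be installed):
// const browser = await require('puppeteer').launch();
// console.log(JSON.stringify(await describeBrowser(browser), null, 2));
```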

Puppeteer Architecture

Component

puppeteer-core

puppeteer-core is a library that can drive anything that supports the DevTools Protocol. puppeteer-core doesn't download Chromium when installed. Being a library, puppeteer-core is fully driven through its programmatic interface and disregards all the PUPPETEER_* environment variables.

Usage:

const puppeteer = require('puppeteer-core');
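
Because puppeteer-core ships without a browser, the script has to say which browser to drive. The following is a minimal sketch, assuming a Chrome or Chromium binary is already installed. The helper name openWithCore and the example path are hypothetical, and the module is passed in as a parameter so the helper itself has no hard dependency.

```javascript
// Hypothetical helper: launches an already-installed browser through
// puppeteer-core and opens a new page.
// `puppeteerCore` is the module returned by require('puppeteer-core');
// `executablePath` must point to a local Chrome/Chromium binary (assumption).
async function openWithCore(puppeteerCore, executablePath) {
  if (!executablePath) {
    throw new Error('executablePath is required by this helper');
  }
  const browser = await puppeteerCore.launch({ executablePath });
  const page = await browser.newPage();
  return { browser, page };
}

// Usage (the path is only an example, adjust it to your system):
// const puppeteer = require('puppeteer-core');
// const { browser, page } = await openWithCore(puppeteer, '/usr/bin/chromium');
// await browser.close();
```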

puppeteer

When installed, puppeteer downloads a version of Chromium, which it then drives using puppeteer-core. Unlike puppeteer-core, it honors the PUPPETEER_* environment variables: https://github.com/puppeteer/puppeteer/blob/master/docs/api.md#environment-variables

Example

Integration

Javascript - Jest-puppeteer with typescript configuration

API / Doc

Launch

const browser = await puppeteer.launch({
  headless: false,
  slowMo: 200, // slow down every operation by 200 ms
  devtools: true,
  args: [
    '--disable-infobars', // Removes the butter bar.
    '--start-maximized',
    // '--start-fullscreen',
    // '--window-size=1920,1080',
    // '--kiosk',
  ],
});

Snippet

Serialize and Deserialize a date

Puppeteer - How to pass back and forth a date (or a complex type) to the headless browser via the evaluate function
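
A minimal sketch of one way to do this, assuming that values crossing the evaluate boundary are serialized to JSON-compatible data, so that a Date does not arrive as a Date. The date is sent as an ISO 8601 string, rebuilt inside the page, and the result is converted back on the Node side. The helper name addDaysInPage is hypothetical.

```javascript
// Hypothetical helper: sends a Date to the page, lets the page compute a
// new date, and returns the result as a real Date on the Node side.
async function addDaysInPage(page, date, days) {
  const isoResult = await page.evaluate(
    (iso, n) => {
      // Browser side: rebuild the Date from its ISO string.
      const d = new Date(iso);
      d.setUTCDate(d.getUTCDate() + n);
      // Return a string again, since a Date is assumed not to survive the trip.
      return d.toISOString();
    },
    date.toISOString(), // Node side: serialize before crossing the boundary
    days
  );
  // Node side: deserialize the string returned by the page.
  return new Date(isoResult);
}

// Usage, with a Puppeteer page:
// const tomorrow = await addDaysInPage(page, new Date(), 1);
```

The same pattern applies to other complex types: convert them to plain JSON data before the call and rebuild them on the other side.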

Execute JavaScript inside the page

Example with local storage and passing parameters

await page.evaluate(
  (storageKey) => { localStorage.removeItem(storageKey); }, 
  'theKey'
);

Add a breakpoint

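There are two execution contexts: the Node.js process that runs the Puppeteer script, and the browser page that runs the code passed to page.evaluate. Each one needs its own kind of breakpoint.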
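
The separation between the Node.js process and the browser page also explains why values must be passed to page.evaluate as arguments. The sketch below does not use Puppeteer at all. It is a simplified simulation, under the assumption that the function is sent to the page as source text and its arguments as JSON. simulateEvaluate is a hypothetical stand-in for page.evaluate, not its real implementation.

```javascript
// Simplified stand-in for page.evaluate (not the real implementation):
// the function travels as source text and the arguments as JSON, so the
// closure of the original function is lost on the other side.
function simulateEvaluate(fn, ...args) {
  const source = fn.toString();
  const wireArgs = JSON.parse(JSON.stringify(args));
  // Indirect eval rebuilds the function in the global scope,
  // away from the variables that surrounded it in the caller.
  const revived = (0, eval)(`(${source})`);
  return revived(...wireArgs);
}

function demo() {
  const storageKey = 'theKey'; // exists only in the "Node" scope

  // Works: the value is passed explicitly as an argument.
  const passed = simulateEvaluate((key) => `removing ${key}`, storageKey);

  // Fails: the function refers to a variable it can no longer see.
  let failure = null;
  try {
    simulateEvaluate(() => `removing ${storageKey}`);
  } catch (error) {
    failure = error.name;
  }
  return { passed, failure };
}

console.log(demo());
```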

Timeout

If you are going to use breakpoints, increase the test timeout accordingly; otherwise the test times out while execution is paused.

In a test file, jest is available as a global object, so the timeout can be set directly:

jest.setTimeout(100000);

The value is in milliseconds and applies to every test and hook in that file.

Node breakpoint

const browser = await puppeteer.launch({
  headless: false,
  slowMo: 250, // slow down every operation by 250 ms
});

Browser breakpoint

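const browser = await puppeteer.launch({devtools: true}); // opens DevTools for each page
const page = await browser.newPage();
await page.evaluate(() => {debugger;}); // pauses inside the browser's DevTools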

Select

Given this HTML:

<div class="tweet">
    <div class="retweet">10</div>
</div>

the nested element can be selected and its text checked:

/**
 * @type {import("puppeteer").ElementHandle<HTMLDivElement>}
 */
const tweetHandle = await page.$('.tweet .retweet');
expect(await tweetHandle.evaluate(node => node.innerText)).toBe('10');

Debug

https://developers.google.com/web/tools/puppeteer/debugging

Documentation / Reference