My quest with Puppeteer and a few drafts on Devblog


Hello Devblog 👋

Now that every one of us is stuck at home, I decided it was high time I started writing again. I started thinking about what my first story on Devblog would be. I have heard so much about this amazing platform.

So here it goes.

I was playing around with the settings of my new blog. There were tons of customizations on offer. And then I happened to click on "Unpublished Drafts".

unpublished-drafts.png

See, I joined the platform right around when the Hashnode team launched it last year. I had been dormant all this while. But what I discovered was a bit of a surprise. I had a few drafts in there.

drafts.png

This was interesting. I saw that every one of them was created on May 11, 2019, at roughly the same time, around 5:00 AM. This had to be a mistake. Why would I have hundreds of drafts?

It took me a minute, but I realised there could have been a bug in the early days of the platform. How else would I have created so many drafts?

I have OCD. I am quite particular about "digital organisation". I am that guy who likes the inbox to be at zero at all times. I also like taking the trash out in Gmail. But you don't need to read this. Moving on.

So I started deleting these empty drafts.

One by one.

Five deletions later, I got bored to death. So, I looked for the "select and delete" option. There wasn't one.

Who would think of adding such a feature? For some random user trying to clean up their stash of empty drafts? No one is ever going to see this list!

But remember, I have OCD, right?

Let's see, I should count the drafts. Boy, I had a lot of them to delete by hand.

count.png

Nah! I am not doing this by hand. I have better things to do than delete drafts at 10:00 PM on a Wednesday night. I could be watching Netflix with my lovely wife. But my nerdy mind won't let me walk away. My OCD was adding fuel to the urge.

Heck, I am a developer. I should automate this. Would that not be a fun challenge?

So, my Puppeteer quest began.

So, what is Puppeteer?

It is a Node.js library. It helps you interact with the Chrome and Chromium browsers. And it is an amazing tool to have with you at all times. It is great for doing browser automation tasks.

You can learn more about Puppeteer here: pptr.dev

You can write all the code from scratch, but if you are lazy like me, you can cheat and use a browser extension to record a script for you. That's right, it is exactly like a macro recorder.

All you do is record the steps, like clicking a button or entering some text into an input, and when you are done, the fun part begins. The extension will spit out JavaScript that you can take and add to your Puppeteer script. More often than not, you are good to go.

I recorded deleting one draft and got this neat script from the recorder:

const puppeteer = require("puppeteer");
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  const navigationPromise = page.waitForNavigation();
  await page.goto("https://hashnode.com/drafts");
  await page.setViewport({ width: 1280, height: 934 });
  await page.waitForSelector(
    ".w-full > .post-card:nth-child(1) > .flex > .flex:nth-child(2) > span"
  );
  await page.click(
    ".w-full > .post-card:nth-child(1) > .flex > .flex:nth-child(2) > span"
  );
  await navigationPromise;
  await browser.close();
})();

Great, right? Let's get to work then.

Let's start by making a list of the user stories:

  1. We need to sign in.
  2. We need to loop this over and over, for however many empty drafts we have.
  3. We also need to handle the page navigation after each deletion.

Let's tweak the script that the extension has written for us.

Puppeteer will launch a new instance of Chrome. We can write some logic that will go through the sign-in flow and navigate to the drafts page. We can then add logic for deleting the empty posts.

But the sign-in flow is going to be a pain. We are going to be doing a lot of trial and error while coming up with the final script. Also, it's not part of the actual automation loop.

Note that our actions are:

  1. Navigate to the /drafts page.
  2. Find and click Delete.
  3. Click OK on the Confirm dialog.
  4. Wait for the page to (re)load and repeat all these steps again.

Thanks to Puppeteer, we have a convenient way of "hooking" onto a pre-existing session. A browser instance with a session that we have already signed into. This way, we can skip the authentication. This only works because we are attempting the automation on our local machine as a real user. It gets the job done.

Let's launch a browser with remote debugging enabled. The command below exposes a port and a WebSocket that we will use later. We can run it in a terminal:

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 --no-first-run --no-default-browser-check --user-data-dir=$(mktemp -d -t 'chrome-remote_data_dir')

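I borrowed the command above from this article for macOS. You should check it out for tips on your OS.

Now, we can sign in to https://hashnode.com on the Chrome instance that we launched. When Chrome starts, it prints its WebSocket URL in the terminal (a line beginning with "DevTools listening on"), and yours will differ from mine. We will re-use the session by connecting to this instance from Puppeteer like so: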

const browser = await puppeteer.connect({
  browserWSEndpoint:
    "ws://127.0.0.1:9222/devtools/browser/c52e0020-15e2-4c09-85ab-68a750f96338",
});

const page = await browser.newPage();

let pageUrl = "https://hashnode.com/drafts";

await page.goto(pageUrl, {
  waitUntil: "networkidle0",
});

await page.setViewport({ width: 960, height: 934 });
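The WebSocket URL hard-coded above contains an ID that changes every time Chrome is launched. Here is a minimal sketch of looking it up instead of copying it by hand, assuming Chrome was started with `--remote-debugging-port=9222` as shown earlier. The `getWebSocketEndpoint` helper is my own illustrative name, not part of Puppeteer. It reads the `webSocketDebuggerUrl` field that Chrome serves at `/json/version` on the debugging port.

```javascript
const http = require("http");

// Ask a Chrome instance started with --remote-debugging-port for its
// current browser-level WebSocket URL.
// (Hypothetical helper; host and port defaults are assumptions.)
function getWebSocketEndpoint(host = "127.0.0.1", port = 9222) {
  return new Promise((resolve, reject) => {
    http
      .get({ host, port, path: "/json/version" }, (res) => {
        let body = "";
        res.setEncoding("utf8");
        res.on("data", (chunk) => (body += chunk));
        res.on("end", () => {
          try {
            // The JSON document includes a "webSocketDebuggerUrl" field.
            resolve(JSON.parse(body).webSocketDebuggerUrl);
          } catch (error) {
            reject(error);
          }
        });
      })
      .on("error", reject);
  });
}

// Usage inside an async function:
// const browser = await puppeteer.connect({
//   browserWSEndpoint: await getWebSocketEndpoint(),
// });
```

If you would rather not write a helper at all, `puppeteer.connect` also accepts a `browserURL` option (for example `http://127.0.0.1:9222`), which should do this lookup for you.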

Great, so far so good. I can check that I do not get any prompts to sign in and that I am on the page with a list of all the drafts that I need to work on. Onward then?

Let's see if we can count the drafts. I update the bit of script that the recorder gave me, and grab the selectors that I need. Cleaning it up, we have:

let deleteButtonParentSelector = ".post-card";
let deleteButtonSelector =
  ".w-full > .post-card:nth-child(1) > .flex > .flex:nth-child(2) > span";

let count = await page
  .waitForSelector(deleteButtonParentSelector)
  .then(async () => {
    return (await page.$$(deleteButtonParentSelector)).length;
  });

console.log("found " + count + " drafts");

You may notice that I am using a weak logic for counting the posts. I know all the posts on this page are drafts that I want to delete. I also know that all the posts are right here on this page, there is no pagination. You may want to adjust your selector strategy as per your need.
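If your drafts page also held posts you wanted to keep, counting every `.post-card` would be too broad. Here is a minimal sketch of a narrower count, assuming (hypothetically) that an empty draft can be recognised from the visible text of its card. Both `countMatchingCards` and the `isEmpty` predicate are illustrative names of mine, and the "Untitled" text in the usage comment is a guess, not something I checked against Hashnode's markup.

```javascript
// Count only the cards whose visible text satisfies a predicate.
// `page` is a Puppeteer Page; `isEmpty` is a caller-supplied test (hypothetical).
async function countMatchingCards(page, cardSelector, isEmpty) {
  // $$eval runs the callback inside the browser and returns its
  // serializable result, here an array of strings.
  const texts = await page.$$eval(cardSelector, (cards) =>
    cards.map((card) => card.textContent.trim())
  );
  // The filtering happens back in Node, so the predicate can be any function.
  return texts.filter(isEmpty).length;
}

// Usage inside an async function (the "Untitled" text is an assumption):
// const count = await countMatchingCards(page, ".post-card", (text) =>
//   text.startsWith("Untitled")
// );
```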

Great. Moving forward, let's try and delete one of the posts. I used the recorder's script again and came up with:

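// Wait for the delete button to appear, then click it.
// .then() needs a callback, so the click is wrapped in an arrow function.
await page
  .waitForSelector(deleteButtonSelector)
  .then(() => page.click(deleteButtonSelector));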

Except it did not work. Bummer. Ah, there is no handling of the confirmation box! Darn you, recorder! You did not capture that confirmation dialog.

That should be an easy fix. We can hook into the dialog handler provided by Puppeteer.

page.on("dialog", async (dialog) => {
  await dialog.accept();
});
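The handler above accepts every dialog the page opens. Since the delete flow uses a confirm dialog, a slightly more cautious variant could accept only that type and dismiss anything unexpected. This is a sketch under that assumption; `acceptOnlyConfirms` is an illustrative name of mine, while `type()`, `accept()` and `dismiss()` are methods on the dialog object that Puppeteer passes to the listener.

```javascript
// Accept confirm() dialogs and dismiss everything else.
// `dialog` is the object Puppeteer passes to a "dialog" event listener.
async function acceptOnlyConfirms(dialog) {
  if (dialog.type() === "confirm") {
    await dialog.accept();
  } else {
    // alert, prompt and beforeunload dialogs are not part of the delete flow.
    await dialog.dismiss();
  }
}

// Usage inside the script:
// page.on("dialog", acceptOnlyConfirms);
```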

Okay, this is getting us somewhere. Let's throw it all together, shall we?

Here is the full script.

//  #!/usr/bin/env node

const puppeteer = require('puppeteer');

// a promise-ified delay helper
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

const main = async () => {
  // 1. Connect to an existing browser instance
  const browser = await puppeteer.connect({
    browserWSEndpoint:
      'ws://127.0.0.1:9222/devtools/browser/c52e0020-15e2-4c09-85ab-68a750f96338'
  });

  // 2. Open a new tab, navigate to the drafts page and set the
  // viewport width and height
  const page = await browser.newPage();

  let pageUrl = 'https://hashnode.com/drafts';
  await page.goto(pageUrl, {
    //waiting for network requests is a good idea.
    waitUntil: 'networkidle0'
  });
  await page.setViewport({ width: 960, height: 934 });

  // 3. Add a handler for alert dialogs (we will use this later).
  page.on('dialog', async dialog => {
    await delay(100); // add a microscopic delay for the dialog to be ready.
    await dialog.accept();
    await delay(100);
  });

  // 4. Prepare selectors for elements that we will interact with
  let deleteButtonParentSelector = '.post-card';
  let deleteButtonSelector =
    '.w-full > .post-card:nth-child(1) > .flex > .flex:nth-child(2) > span';

  // 5. Count the no. of posts, we can use this number to loop and repeat things
  let count = await page
    .waitForSelector(deleteButtonParentSelector)
    .then(async () => {
      return (await page.$$(deleteButtonParentSelector)).length;
    });

  console.log('found ' + count + ' drafts');

  // 6. Loop over the no. of drafts we found.
  for (let index = 0; index < count; index++) {
    try {
      // 7. Promise.all starts the navigation wait together with the click,
      // which avoids a race between the page load and the click.
      await Promise.all([
        // 8. Wait for the page load to complete before the next run.
        page.waitForNavigation({
          waitUntil: 'networkidle0'
        }),

        // 9. Find and click the delete button. The arrow function makes
        // sure the click runs only after the selector has appeared.
        // The dialog handler (No. 3 earlier) will be called here.
        page
          .waitForSelector(deleteButtonSelector)
          .then(() => page.click(deleteButtonSelector))
      ]);

      // catch any errors for debugging later
    } catch (error) {
      console.log('error', error);
    }
  }

  console.log('done');
  await browser.close();
};

main();
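The loop in the script relies on the count taken once at the start. Here is a sketch of an alternative that simply repeats until no delete button is left, assuming (as above) that the page reloads after each deletion. `deleteAllDrafts` and its `maxRounds` safety cap are names I made up for illustration. The `page` argument is a Puppeteer Page with the dialog handler already attached.

```javascript
// Keep deleting the first draft until none is left, or until a safety cap
// is reached. Returns the number of deletions performed.
async function deleteAllDrafts(page, deleteButtonSelector, maxRounds = 500) {
  let deleted = 0;
  while (deleted < maxRounds) {
    // page.$ resolves to null once no delete button matches any more.
    const button = await page.$(deleteButtonSelector);
    if (button === null) break;

    // Start waiting for the reload and trigger the click together,
    // so the navigation cannot finish before we begin listening for it.
    await Promise.all([
      page.waitForNavigation({ waitUntil: "networkidle0" }),
      page.click(deleteButtonSelector),
    ]);
    deleted += 1;
  }
  return deleted;
}

// Usage inside main(), after the dialog handler is registered:
// const deleted = await deleteAllDrafts(page, deleteButtonSelector);
// console.log("deleted " + deleted + " drafts");
```

One more thing to keep in mind with a browser you connected to rather than launched: `browser.close()` closes that Chrome instance, while `browser.disconnect()` only detaches Puppeteer and leaves the window open.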

And voila! 15 minutes later, we have deleted all those drafts. Now, this is a small quest, but it helped me do a mundane task and helped me learn a trick.

Hope you liked this story. Until the next one.

Stay safe and stay home. We got this.

Comments (3)

Edidiong Asikpo

Wow! I must say that I am really impressed at the way you were able to figure out a way to solve the problem you encountered. Like you said, who wants to delete 100 drafts, when you can simply select all of them and delete it at once.

I can't wait to read more articles from you. Keep them coming Mrugesh Mohapatra

Mrugesh Mohapatra

Technology & Community, freeCodeCamp.org | developer 👨‍💻 • music addict 🎸 • open-source enthusiast 🌟 • photography noob 📷 • travel 🥑

Thanks for your kind words Edidiong Asikpo. I will try and keep these coming now and then :)

Edidiong Asikpo

Software Engineer at Interswitch Group

Mrugesh Mohapatra My pleasure