Web Scraping with Node.js and Puppeteer

In this post we will explore web scraping with Node.js and Puppeteer. We will scrape a web page by automating a browser from JavaScript.

What is Web Scraping?

What is Puppeteer?

Setting up the Project

$ mkdir web_scrapping_app
$ cd web_scrapping_app
$ yarn init
$ code .
$ yarn add puppeteer

Work with Puppeteer

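const puppeteer = require('puppeteer');

// launch() returns a promise that resolves to a browser instance.
puppeteer.launch().then(async browser => {
  //...
})

// Pass { headless: false } to watch the browser window while the script runs.
puppeteer.launch({ headless: false })

// Full script: open the page, save a screenshot, then close the browser.
const SCRAPE_URL = `https://coincodex.com/`;

(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto(SCRAPE_URL);
  await page.screenshot({ path: "screenShot.png" });
  await browser.close();
})();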
Screenshot generated with Puppeteer
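
One thing the script above leaves open is what happens when `page.goto` or `page.screenshot` rejects: `browser.close()` is never reached, which matters most when the scraping code is part of a longer-running program that catches the error and carries on. The following is a minimal sketch of one way to guard against that with `try`/`finally`. The helper name `withBrowser` and its `launch` parameter are hypothetical and not part of Puppeteer; the launcher is passed in as a function, so the helper only assumes the returned object has a `close()` method.

```javascript
// Hypothetical helper (not a Puppeteer API): runs `task` with a browser
// and always closes the browser afterwards.
// `launch` is any function returning a promise for an object with close(),
// for example () => puppeteer.launch({ headless: false }).
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    // Runs whether `task` resolved or threw.
    await browser.close();
  }
}

// Possible usage with the script above (assumes puppeteer is installed):
// const puppeteer = require("puppeteer");
// withBrowser(() => puppeteer.launch({ headless: false }), async (browser) => {
//   const page = await browser.newPage();
//   await page.goto("https://coincodex.com/");
//   await page.screenshot({ path: "screenShot.png" });
// }).catch(console.error);
```

Any error thrown by the task still propagates to the caller; the helper only makes sure the browser is closed first.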

Extracting data

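// Runs inside the async function above, after page.goto(SCRAPE_URL).
// querySelector returns only the first matching element.
const coinName = await page.evaluate(() =>
  document.querySelector(".full-name").textContent.trim()
);
const coinCurrency = await page.evaluate(() =>
  document.querySelector(".currency").textContent.trim()
);
console.log(coinName);
console.log(coinCurrency);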
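
`document.querySelector` returns `null` when nothing matches, so the calls above throw a `TypeError` if the site renames a class or the element has not rendered yet. A small sketch of a null-safe lookup follows. `textOf` is a hypothetical helper name, and it assumes only that `root` exposes a `querySelector` method (the `document` or any element).

```javascript
// Hypothetical helper: returns the trimmed text of the first element
// matching `selector` under `root`, or null when nothing matches.
function textOf(root, selector) {
  const el = root.querySelector(selector);
  return el ? el.textContent.trim() : null;
}

// Possible usage inside the page (the helper must live inside the callback):
// const coinName = await page.evaluate(() => {
//   const textOf = (root, selector) => {
//     const el = root.querySelector(selector);
//     return el ? el.textContent.trim() : null;
//   };
//   return textOf(document, ".full-name");
// });
```

The function passed to `page.evaluate` runs in the page rather than in Node, so a helper defined in the Node script is not visible there; that is why the usage sketch defines it inside the callback.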

Extracting all data

// querySelectorAll returns every match; Array.from makes the NodeList mappable.
const coinName = await page.evaluate(() =>
  Array.from(document.querySelectorAll(".full-name"))
    .map((coin) => coin.innerText.trim())
);
const coinCurrency = await page.evaluate(() =>
  Array.from(document.querySelectorAll(".currency"))
    .map((currency) => currency.innerText.trim())
);
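
The two calls above produce two parallel arrays, so a name and its price are related only by position. As a rough sketch of how they could be paired up in Node, the hypothetical `zipCoins` helper below combines them by index; the sample values in the usage comment are made-up placeholders, not scraped data.

```javascript
// Hypothetical helper: pairs names[i] with prices[i].
// Stops at the shorter array so no entry has an undefined field.
function zipCoins(names, prices) {
  const length = Math.min(names.length, prices.length);
  return Array.from({ length }, (_, i) => ({
    coinName: names[i],
    coinPrice: prices[i],
  }));
}

// Possible usage with the arrays from above:
// const coins = zipCoins(coinName, coinCurrency);
// With placeholder input zipCoins(["Coin A"], ["$ 1.00"]) the result is
// [{ coinName: "Coin A", coinPrice: "$ 1.00" }].
```

Pairing by index silently misaligns the data if some row lacks one of the two elements, which is one reason to read both fields from the same table row instead.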

Refactoring code

// One page.evaluate call: read the name and price from each table row.
const coining = await page.evaluate(() =>
  Array.from(document.querySelectorAll("table tr.coin.ng-star-inserted")).map(
    (row) => ({
      coinName: row.querySelector(".full-name").innerText.trim(),
      coinPrice: row.querySelector(".currency").innerText.trim(),
    })
  )
);
console.log(coining);
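
The `coinPrice` values come back as display strings. If numbers are needed for sorting or arithmetic, something along these lines could convert them. This is only a sketch: `parsePrice` and `withNumericPrices` are hypothetical names, and the code assumes the price text looks like `$ 1,234.50` (currency symbol, comma thousands separators, dot as decimal point), which has not been checked against the live site.

```javascript
// Hypothetical helper. Assumes a format like "$ 1,234.50".
// Returns a number, or null when the text holds no parsable number.
function parsePrice(text) {
  // Keep digits, dots and minus signs; drop "$", commas and whitespace.
  const cleaned = text.replace(/[^0-9.-]/g, "");
  const value = Number.parseFloat(cleaned);
  return Number.isNaN(value) ? null : value;
}

// Returns copies of the scraped rows with coinPrice converted to a number.
function withNumericPrices(rows) {
  return rows.map((row) => ({ ...row, coinPrice: parsePrice(row.coinPrice) }));
}

// Possible usage with the result from above:
// console.log(withNumericPrices(coining));
```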

Conclusion

Written by

Product Designer and Frontend Developer | https://twitter.com/ishan02016
