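NodeJS + PhantomJS: Automated Front-End Resource Monitoring

Preface: I have recently been working on front-end resource monitoring and read through many examples, but none of them achieved what I wanted. My first complaint is the official PhantomJS documentation, which is of little practical use. My second is the examples online: most of them only introduce PhantomJS usage and do not show how to build a complete implementation, so they hardly differ from the official documentation.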

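I. What does it implement?

This article works through just one simple example, measuring the load time of a URL. The aim is to demonstrate an approach rather than to build complex functionality.

Main flow:

The front end calls the Node interface when the page loads -> passes the URL to the interface -> Node reads the URL and runs a system command through a child process (child_process) -> the command starts PhantomJS

-> PhantomJS opens the page at that URL and measures the load time -> returns the result to the Node interface -> the interface processes the data -> saves it to the database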

II. Main implementation

1. The front end gets the URL

let param = {'url': window.location.protocol + '//' + window.location.host};

Pass the URL to the interface as a parameter.
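As a minimal sketch of how the front end might send this parameter: the endpoint path `/resourceMonitor/create` and the function names `buildMonitorRequest` and `reportPageLoad` are assumptions for illustration, since the article does not give the actual route. The sketch also assumes a browser that provides `fetch`.

```javascript
// Hypothetical endpoint; the article does not state the actual route.
const MONITOR_ENDPOINT = '/resourceMonitor/create';

// Build the request from a Location-like object ({protocol, host}).
function buildMonitorRequest(location) {
  const param = { url: location.protocol + '//' + location.host };
  return {
    endpoint: MONITOR_ENDPOINT,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(param)
    }
  };
}

// Send the request to the Node interface (browser only).
function reportPageLoad() {
  const request = buildMonitorRequest(window.location);
  return fetch(request.endpoint, request.options);
}

// Register the call for the page load event when running in a browser.
if (typeof window !== 'undefined') {
  window.addEventListener('load', reportPageLoad);
}
```

Keeping the payload construction in its own function means it can be checked without a browser or a running server.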

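2. NodeJS receives the URL from the front end, starts PhantomJS to open it, then collects and processes the data

This article uses the Sails framework.

First, write the model that defines the fields: ResourceMonitor.js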

    module.exports = {
      connection: '<database name>', // placeholder: replace with your own value
      tableName: '<table name>',     // placeholder: replace with your own value
      autoCreatedAt: true,
      autoUpdatedAt: true,
      attributes: {
        id: {
          type: 'integer',
          primaryKey: true,
          autoIncrement: true
        },
        webAddress: {
          type: 'string'
        },
        loadingTime: {
          type: 'string'
        }
      }
    };

Next, write the service: resourceMonitorService.js

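    module.exports = {
      // Update the record when an id is supplied; otherwise insert a new one.
      async create(resourceMonitor) {
        if (!resourceMonitor.id) {
          return ResourceMonitor.create(resourceMonitor);
        }
        let oldResourceMonitor = await ResourceMonitor.findOne({id: resourceMonitor.id});
        if (!oldResourceMonitor) {
          throw new Error('resourceMonitor not found: ' + resourceMonitor.id);
        }
        Object.assign(oldResourceMonitor, resourceMonitor);
        // Wrap the callback-style save() so that its error or completion reaches the caller.
        await new Promise(function (resolve, reject) {
          oldResourceMonitor.save(function (err) {
            if (err) return reject(err);
            resolve();
          });
        });
        return oldResourceMonitor;
      }
    };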

Then the controller: resourceMonitorController.js

    const Cmd = require('cmd')
    module.exports = {
      async create(req, res) {
        try {
          let url = req.body['url'];
          let resource = await Cmd.resourceMonitorPhantom(url); // start PhantomJS
          if (!resource) {
            return res.send({code: -1, msg: "no output from PhantomJS"});
          }
          resource = resource.replace(/\s+/g, ""); // strip spaces, newlines and carriage returns
          // Capture the text between 'total:' and 'ms', and between 'webAddress:' and 'total:'.
          let timeMatch = resource.match(/total:(\S*)ms/);
          let addressMatch = resource.match(/webAddress:(\S*)total:/);
          if (!timeMatch || !addressMatch) {
            // For example, PhantomJS printed 'fail to load the address'.
            sails.log.error("invalid parameters " + JSON.stringify(resource));
            return res.send({code: -1, msg: "invalid parameters"});
          }
          let resourceMonitor = {webAddress: addressMatch[1], loadingTime: timeMatch[1] + ' ms'};
          let data = await resourceMonitorService.create(resourceMonitor);
          return res.send({code: 1, msg: "success", data: data});
        } catch (err) {
          sails.log.error("create resourceMonitor failure " + err);
          return res.send({code: -1, msg: "create resourceMonitor failure"});
        }
      }
    };
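The parsing step in the controller can also be isolated into a small pure function, which makes it easy to check against sample output. This is only a sketch: the name `parsePhantomOutput` is hypothetical, and it assumes the output format `webAddress:<url>total:<n>ms` printed by the PhantomJS script in this article.

```javascript
// Parse PhantomJS output such as "webAddress:http://example.comtotal:123ms".
// Returns null when the output does not have the expected shape,
// for example when the script printed "fail to load the address".
function parsePhantomOutput(raw) {
  // Remove spaces, newlines and carriage returns, as the controller does.
  const compact = String(raw).replace(/\s+/g, '');
  // Greedy (.+) backtracks to the last "total:<digits>ms" in the string.
  const match = compact.match(/webAddress:(.+)total:(\d+)ms/);
  if (!match) return null;
  return { webAddress: match[1], loadingTime: match[2] + ' ms' };
}
```

Returning `null` instead of throwing lets the caller decide how to report a failed page load.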

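The method that starts PhantomJS from the NodeJS command line: cmd.js

// Uses a synchronous method from Node's child_process module (the execSync family), so the code that follows only runs after PhantomJS has processed the data and returned.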

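    const execFileSync = require('child_process').execFileSync;
    module.exports = {
      async resourceMonitorPhantom(url) {
        // Pass the URL as a separate argument instead of interpolating it into a
        // shell command, so characters such as ';' or '&' in it cannot run other commands.
        let resourceMonitorArgs = ['./phantom/resourceMonitorPhantom.js', url];
        // The synchronous call blocks until PhantomJS exits. It returns a binary
        // Buffer, so convert it to a string.
        let resourceMonitorOut = execFileSync('phantomjs', resourceMonitorArgs).toString();
        return resourceMonitorOut;
      }
    };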
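A synchronous child process call blocks Node's event loop while PhantomJS runs, so other requests wait. As an alternative sketch, not the approach used in this article, the same script could be started with the asynchronous `execFile`. The helper `buildPhantomArgs`, the http/https restriction and the 30 second timeout are assumptions added for illustration.

```javascript
const { execFile } = require('child_process');

// Path of the PhantomJS script used in this article.
const PHANTOM_SCRIPT = './phantom/resourceMonitorPhantom.js';

// Accept only absolute http(s) URLs and return the PhantomJS argument list.
// Note that URL normalizes the address, e.g. it appends a trailing "/".
function buildPhantomArgs(url) {
  const parsed = new URL(url); // throws a TypeError on malformed input
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error('unsupported protocol: ' + parsed.protocol);
  }
  return [PHANTOM_SCRIPT, parsed.href];
}

// Run PhantomJS without a shell and without blocking the event loop.
// timeoutMs is an assumed limit, not a value taken from the article.
function resourceMonitorPhantom(url, timeoutMs = 30000) {
  return new Promise((resolve, reject) => {
    let args;
    try {
      args = buildPhantomArgs(url);
    } catch (err) {
      return reject(err);
    }
    execFile('phantomjs', args, { timeout: timeoutMs }, (err, stdout) => {
      if (err) return reject(err);
      resolve(stdout); // stdout is a utf8 string by default
    });
  });
}
```

Because the function still returns a promise, a caller that already uses `await Cmd.resourceMonitorPhantom(url)` would not need to change.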

PhantomJS measures the resource load time: resourceMonitorPhantom.js

    var page = require('webpage').create();
    var system = require('system');
    var address;
    if (system.args.length === 1) {
      // No URL was passed on the command line.
      console.log(system.args);
      phantom.exit();
    } else {
      var t = Date.now(); // start time
      address = system.args[1];
      page.open(address, function (status) {
        if (status !== 'success') {
          console.log('fail to load the address');
        } else {
          t = Date.now() - t; // elapsed time in milliseconds
          // The controller's regular expressions depend on this exact format.
          console.log('webAddress:' + address + 'total:' + t + 'ms');
        }
        phantom.exit();
      });
    }
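The script above measures only the total page load time. As a hedged sketch of how timing for individual resources might be collected, PhantomJS's `page.onResourceRequested` and `page.onResourceReceived` callbacks can be paired by request id. The helper `createResourceTimer` and the output format are hypothetical and are not part of the original script, so the controller's regular expressions would not match this output as is.

```javascript
// Pair request and response events by id and record each resource's duration.
function createResourceTimer() {
  var started = {};
  var results = [];
  return {
    onRequested: function (requestData) {
      started[requestData.id] = {
        url: requestData.url,
        start: new Date(requestData.time).getTime()
      };
    },
    onReceived: function (response) {
      // Each response fires for stage 'start' and 'end'; only 'end' is final.
      var entry = started[response.id];
      if (response.stage !== 'end' || !entry) return;
      results.push({
        url: entry.url,
        ms: new Date(response.time).getTime() - entry.start
      });
      delete started[response.id];
    },
    results: function () { return results; }
  };
}

// Wire the timer to a page only when running inside PhantomJS.
if (typeof phantom !== 'undefined') {
  var system = require('system');
  var page = require('webpage').create();
  var timer = createResourceTimer();
  page.onResourceRequested = timer.onRequested;
  page.onResourceReceived = timer.onReceived;
  page.open(system.args[1], function (status) {
    if (status === 'success') {
      timer.results().forEach(function (r) {
        console.log(r.ms + 'ms ' + r.url);
      });
    }
    phantom.exit();
  });
}
```

The code sticks to ES5 syntax because PhantomJS does not support newer language features.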

III. Frameworks and runtime environment involved

1. NodeJS

Node.js® is a JavaScript runtime built on Chrome's V8 JavaScript engine. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient. Node.js' package ecosystem, npm, is the largest ecosystem of open source libraries in the world.

Official site: https://nodejs.org/

2. Sails

Sails.js (or Sails) is a Model-View-Controller (MVC) web application framework developed atop the Node.js environment, released as free and open-source software under the MIT License. It is designed to make it easy to build custom, enterprise-grade Node.js web applications and APIs. Emulating the MVC architecture of other frameworks, like Ruby on Rails it offers similar pattern and familiarity, reducing the cognitive burden when switching between other frameworks/languages.

Official site: http://sailsjs.com/

3. PhantomJS

PhantomJS is a headless WebKit scriptable with a JavaScript API. It has fast and native support for various web standards: DOM handling, CSS selector, JSON, Canvas, and SVG.

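Put simply, it is a WebKit-based browser with no user interface that can only be used from the command line. It can do many things, such as capturing web page screenshots, scraping page data, and measuring how long network resources take to load.

Official site: http://phantomjs.org/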