PhantomJs Cloud
PhantomJs Cloud provides web page rendering and scraping services via an API, enabling automated website information retrieval and screenshot capture.

PhantomJs Cloud is a web automation and data extraction platform that lets users capture website information and screenshots programmatically. The PhantomJs Cloud connector for Swimlane Turbine retrieves detailed website information in JSON format, including rendering and structure, and captures high-quality screenshots of web pages. Security teams can use it to collect web intelligence and visual evidence for investigations and monitoring without writing complex code.

Prerequisites

To use the PhantomJs Cloud connector within Turbine, you need the following for API key authentication:

URL: the endpoint for the PhantomJs Cloud API service.
API Key: a unique identifier used to authenticate requests to PhantomJs Cloud.

Notes

Lower your cost: use "requestSettings": {"doneWhen": [{"event": "domReady"}]}. In the PhantomJs Cloud example, the page loads in about 6 seconds with this setting, whereas the original request without domReady loads in about 20 seconds. The difference is that the page is rendered as soon as the browser signals domReady. Normally, rendering waits until all resources (ads, CSS, images, etc.) are loaded.

Track cost: view the HTTP response headers returned with every API call. They include details such as the cost of your page and how many credits remain. See the pjsc-billing headers.

What resources failed to load: some pages have resources (images, CSS, fonts, etc.) that fail to load, usually because of broken links. You can track this by inspecting the HTTP response headers. See the pjsc-content-resource headers.

All metadata about the page / processing: set the outputAsJson property to true. The response then contains your content plus all details about load times, cookies, iframes, sub-resources, etc.

Actions Setup

Example requests
Minimal request

    {
      "url": "http://www.google.com",
      "renderType": "jpg",
      "outputAsJson": false,
      "requestSettings": {
        "doneWhen": [{ "event": "domReady" }]
      }
    }

Request with all inputs

    {
      "url": "http://google.com",
      "content": null,
      "urlSettings": { "operation": "GET", "encoding": "utf8", "headers": {}, "data": null },
      "renderType": "jpg",
      "outputAsJson": true,
      "requestSettings": {
        "doneWhen": [{ "event": "domReady" }],
        "ignoreImages": false,
        "disableJavascript": false,
        "userAgent": "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/534.34 (KHTML, like Gecko) Safari/534.34 PhantomJS/2.0.0 (PhantomJsCloud.com/2.0.1)",
        "authentication": { "username": "guest", "password": "guest" },
        "xssAuditingEnabled": false,
        "webSecurityEnabled": false,
        "resourceWait": 15000,
        "resourceTimeout": 35000,
        "maxWait": 35000,
        "waitInterval": 1000,
        "stopOnError": false,
        "resourceModifier": [],
        "customHeaders": {},
        "clearCache": false,
        "clearCookies": false,
        "cookies": [],
        "deleteCookies": []
      },
      "suppressJson": [
        "events.value.resourceRequest.headers",
        "events.value.resourceResponse.headers",
        "frameData.content",
        "frameData.childFrames"
      ],
      "renderSettings": {
        "quality": 70,
        "pdfOptions": {
          "border": null,
          "footer": { "firstPage": null, "height": "1cm", "lastPage": null, "onePage": null, "repeating": "%pageNum%/%numPages%" },
          "format": "letter",
          "header": null,
          "height": null,
          "orientation": "portrait",
          "width": null
        },
        "clipRectangle": null,
        "renderIFrame": null,
        "viewport": { "height": 1280, "width": 1280 },
        "zoomFactor": 1,
        "passThroughHeaders": false
      },
      "scripts": { "domReady": [], "loadFinished": [] }
    }

Configurations

API Key Authentication

Authenticates using an API key.

Configuration parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| URL | A URL to the target host | string | Required |
| apikey | API key | string | Required |
| Verify SSL | Verify SSL certificate | boolean | Optional |
| HTTP Proxy | A proxy to route requests through | string | Optional |

Actions

Get Website Information

Retrieve detailed information about a website in JSON format, including rendering and structure, using the specified URL and renderType.

Endpoint

Method: POST

Input argument name

| Input argument name | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Optional | URL of the page to process |
| content | object | Optional | Page content |
| urlSettings | object | Optional | Settings for the request to the URL |
| urlSettings.operation | string | Optional | HTTP method used for the request (GET in the examples) |
| urlSettings.encoding | string | Optional | Encoding used for the request (utf8 in the examples) |
| urlSettings.headers | object | Optional | HTTP headers for the request |
| urlSettings.data | object | Optional | Data sent with the request |
| renderType | string | Optional | Type of output to render (jpg in the examples) |
| outputAsJson | boolean | Optional | Return the content together with all page and processing metadata as JSON |
| requestSettings | object | Optional | Settings that control how the page is loaded |
| requestSettings.doneWhen | array | Optional | Conditions that mark the page as done |
| requestSettings.doneWhen.event | string | Optional | Event that marks the page as done (domReady in the examples) |
| requestSettings.ignoreImages | boolean | Optional | Whether to ignore images |
| requestSettings.disableJavascript | boolean | Optional | Whether to disable JavaScript |
| requestSettings.userAgent | string | Optional | User agent string sent with the request |
| requestSettings.authentication | object | Optional | Credentials for the target page |
| requestSettings.authentication.username | string | Optional | Username |
| requestSettings.authentication.password | string | Optional | Password |
| requestSettings.xssAuditingEnabled | boolean | Optional | Whether XSS auditing is enabled |
| requestSettings.webSecurityEnabled | boolean | Optional | Whether web security is enabled |
| requestSettings.resourceWait | number | Optional | Wait time for resources (15000 in the examples) |
| requestSettings.resourceTimeout | number | Optional | Timeout for a resource (35000 in the examples) |
| requestSettings.maxWait | number | Optional | Maximum wait time (35000 in the examples) |
| requestSettings.waitInterval | number | Optional | Wait interval (1000 in the examples) |
| requestSettings.stopOnError | boolean | Optional | Whether to stop on error |

Input example

    {"json_body": {"url": "http://google.com", "content": null, "urlSettings": {"operation": "GET", "encoding": "utf8", "headers": {}, "data": null}, "renderType": "jpg", "outputAsJson": true, "requestSettings": {"doneWhen": [{"event": "domReady"}], "ignoreImages": false, "disableJavascript": false, "userAgent": "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/534.34 (KHTML, like Gecko) Safari/534.34 PhantomJS/2.0.0 (PhantomJsCloud.com/2.0.1)", "authentication": {"username": "guest", "password": "guest"}, "xssAuditingEnabled": false, "webSecurityEnabled": false, "resourceWait": 15000, "resourceTimeout": 35000, "maxWait": 35000, "waitInterval": 1000, "stopOnError": false, "resourceModifier": [], "customHeaders": {}, "clearCache": false, "clearCookies": false, "cookies": [], "deleteCookies": []}, "suppressJson": ["events.value.resourceRequest.headers", "events.value.resourceResponse.headers", "frameData.content", "frameData.childFrames"], "renderSettings": {"quality": 70, "pdfOptions": {"border": null, "footer": {"firstPage": null, "height": "1cm", "lastPage": null, "onePage": null, "repeating": "%pageNum%/%numPages%"}, "format": "letter", "header": null, "height": null, "orientation": "portrait", "width": null}, "clipRectangle": null, "renderIFrame": null, "viewport": {"height": 1280, "width": 1280}, "zoomFactor": 1, "passThroughHeaders": false}, "scripts": {"domReady": [], "loadFinished": []}}}

Output

| Parameter | Type | Description |
| --- | --- | --- |
| status_code | number | HTTP status code of the response |
| reason | string | Response reason phrase |
| statusCode | number | Status value |
| statusMessage | object | Status value |
| originalRequest | object | The request as received by PhantomJs Cloud |
| originalRequest.webSecurityEnabled | boolean | Output field originalRequest.webSecurityEnabled |
| originalRequest.pages | array | Pages included in the request |
| originalRequest.pages.url | string | URL of the page |
| originalRequest.pages.content | object | Page content |
| originalRequest.pages.urlSettings | object | Settings for the request to the URL |
| originalRequest.pages.urlSettings.operation | string | HTTP method used for the request |
| originalRequest.pages.urlSettings.encoding | string | Encoding used for the request |
| originalRequest.pages.urlSettings.headers | object | HTTP headers for the request |
| originalRequest.pages.urlSettings.data | object | Data sent with the request |
| originalRequest.pages.renderType | string | Type of output rendered |
| originalRequest.pages.outputAsJson | boolean | Output field originalRequest.pages.outputAsJson |
| originalRequest.pages.requestSettings | object | Output field originalRequest.pages.requestSettings |
| originalRequest.pages.requestSettings.doneWhen | array | Output field originalRequest.pages.requestSettings.doneWhen |
| originalRequest.pages.requestSettings.doneWhen.event | string | Output field originalRequest.pages.requestSettings.doneWhen.event |
| originalRequest.pages.requestSettings.ignoreImages | boolean | Output field originalRequest.pages.requestSettings.ignoreImages |
| originalRequest.pages.requestSettings.disableJavascript | boolean | Output field originalRequest.pages.requestSettings.disableJavascript |
| originalRequest.pages.requestSettings.userAgent | string | Output field originalRequest.pages.requestSettings.userAgent |
| originalRequest.pages.requestSettings.authentication | object | Output field originalRequest.pages.requestSettings.authentication |
| originalRequest.pages.requestSettings.authentication.username | string | Username |
| originalRequest.pages.requestSettings.authentication.password | string | Output field originalRequest.pages.requestSettings.authentication.password |

Output example

    {"status_code": 200, "response_headers": {"pjsc-billing-credit-cost": "0.000151384", "pjsc-billing-elapsedms": "2192", "pjsc-billing-bytes": "257,914", "pjsc-billing-proxy-ingress-bytes": "0", "pjsc-billing-proxy-ingress-cost": "0", "pjsc-billing-total-credits-remaining": "0.048475749", "pjsc-billing-daily-subscription-credits-remaining": "0.047475749", "pjsc-billing-prepaid-credits-remaining": "0.001", "local-address": "190.195.70.130", "pjsc-backend-id": "1 05np", "pjsc-content-status-code": "200", "pjsc-content-name" ...

Get Website Screenshot

Capture a screenshot of a specified website using PhantomJs Cloud, requiring
the URL and render type.

Endpoint

Method: POST

Input argument name

| Input argument name | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Optional | URL of the page to process |
| content | object | Optional | Page content |
| urlSettings | object | Optional | Settings for the request to the URL |
| urlSettings.operation | string | Optional | HTTP method used for the request (GET in the examples) |
| urlSettings.encoding | string | Optional | Encoding used for the request (utf8 in the examples) |
| urlSettings.headers | object | Optional | HTTP headers for the request |
| urlSettings.data | object | Optional | Data sent with the request |
| renderType | string | Optional | Type of output to render (jpg in the examples) |
| outputAsJson | boolean | Optional | Return the content together with all page and processing metadata as JSON |
| requestSettings | object | Optional | Settings that control how the page is loaded |
| requestSettings.doneWhen | array | Optional | Conditions that mark the page as done |
| requestSettings.doneWhen.event | string | Optional | Event that marks the page as done (domReady in the examples) |
| requestSettings.ignoreImages | boolean | Optional | Whether to ignore images |
| requestSettings.disableJavascript | boolean | Optional | Whether to disable JavaScript |
| requestSettings.userAgent | string | Optional | User agent string sent with the request |
| requestSettings.authentication | object | Optional | Credentials for the target page |
| requestSettings.authentication.username | string | Optional | Username |
| requestSettings.authentication.password | string | Optional | Password |
| requestSettings.xssAuditingEnabled | boolean | Optional | Whether XSS auditing is enabled |
| requestSettings.webSecurityEnabled | boolean | Optional | Whether web security is enabled |
| requestSettings.resourceWait | number | Optional | Wait time for resources (15000 in the examples) |
| requestSettings.resourceTimeout | number | Optional | Timeout for a resource (35000 in the examples) |
| requestSettings.maxWait | number | Optional | Maximum wait time (35000 in the examples) |
| requestSettings.waitInterval | number | Optional | Wait interval (1000 in the examples) |
| requestSettings.stopOnError | boolean | Optional | Whether to stop on error |

Input example

    {"json_body": {"url": "http://google.com", "content": null, "urlSettings": {"operation": "GET", "encoding": "utf8", "headers": {}, "data": null}, "renderType": "jpg", "requestSettings": {"doneWhen": [{"event": "domReady"}], "ignoreImages": false, "disableJavascript": false, "userAgent": "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/534.34 (KHTML, like Gecko) Safari/534.34 PhantomJS/2.0.0 (PhantomJsCloud.com/2.0.1)", "authentication": {"username": "guest", "password": "guest"}, "xssAuditingEnabled": false, "webSecurityEnabled": false, "resourceWait": 15000, "resourceTimeout": 35000, "maxWait": 35000, "waitInterval": 1000, "stopOnError": false, "resourceModifier": [], "customHeaders": {}, "clearCache": false, "clearCookies": false, "cookies": [], "deleteCookies": []}, "suppressJson": ["events.value.resourceRequest.headers", "events.value.resourceResponse.headers", "frameData.content", "frameData.childFrames"], "renderSettings": {"quality": 70, "pdfOptions": {"border": null, "footer": {"firstPage": null, "height": "1cm", "lastPage": null, "onePage": null, "repeating": "%pageNum%/%numPages%"}, "format": "letter", "header": null, "height": null, "orientation": "portrait", "width": null}, "clipRectangle": null, "renderIFrame": null, "viewport": {"height": 1280, "width": 1280}, "zoomFactor": 1, "passThroughHeaders": false}, "scripts": {"domReady": [], "loadFinished": []}}}

Output

| Parameter | Type | Description |
| --- | --- | --- |
| status_code | number | HTTP status code of the response |
| reason | string | Response reason phrase |
| file | object | The returned file |
| file.file_name | string | Name of the file |
| file.file | string | Output field file.file |

Output example

    {"status_code": 200, "response_headers": {"pjsc-billing-credit-cost": "0.000131835", "pjsc-billing-elapsedms": "1754", "pjsc-billing-bytes": "252,335", "pjsc-billing-proxy-ingress-bytes": "0", "pjsc-billing-proxy-ingress-cost": "0", "pjsc-billing-total-credits-remaining": "0.048495298", "pjsc-billing-daily-subscription-credits-remaining": "0.047495298", "pjsc-billing-prepaid-credits-remaining": "0.001", "local-address": "190.195.70.130", "pjsc-backend-id": "1 727r", "pjsc-content-status-code": "200", "pjsc-content-name" ...

Response Headers

| Header | Description | Example |
| --- | --- | --- |
| access-control-expose-headers | HTTP response header access-control-expose-headers | WWW-Authenticate,Server-Authorization |
| alt-svc | HTTP response header alt-svc | h3=":443"; ma=2592000,h3-29=":443"; ma=2592000 |
| cache-control | Directives for caching mechanisms | no-cache |
| content-disposition | HTTP response header content-disposition | filename="www.google.com.jpeg" |
| content-encoding | HTTP response header content-encoding | gzip |
| content-type | The media type of the resource | image/jpeg |
| date | The date and time at which the message was originated | Wed, 28 Dec 2022 18:46:48 GMT |
| local-address | HTTP response header local-address | 190.195.70.130 |
| pjsc-backend-id | HTTP response header pjsc-backend-id | 1 727r |
| pjsc-billing-bytes | HTTP response header pjsc-billing-bytes | 257,914 |
| pjsc-billing-credit-cost | Credit cost of the request | 0.000151384 |
| pjsc-billing-daily-subscription-credits-remaining | Daily subscription credits remaining | 0.047475749 |
| pjsc-billing-elapsedms | HTTP response header pjsc-billing-elapsedms | 2192 |
| pjsc-billing-prepaid-credits-remaining | Prepaid credits remaining | 0.001 |
| pjsc-billing-proxy-ingress-bytes | HTTP response header pjsc-billing-proxy-ingress-bytes | 0 |
| pjsc-billing-proxy-ingress-cost | HTTP response header pjsc-billing-proxy-ingress-cost | 0 |
| pjsc-billing-total-credits-remaining | Total credits remaining | 0.048475749 |
| pjsc-content-done-detail | HTTP response header pjsc-content-done-detail | {"reason": "match doneWhen {"event": "domReady"}", "statusCode": 200 } |
| pjsc-content-event-phase | HTTP response header pjsc-content-event-phase | "load" |
| pjsc-content-name | HTTP response header pjsc-content-name | www.google.com.jpeg |
| pjsc-content-page-exec-last-waited-on | HTTP response header pjsc-content-page-exec-last-waited-on | waitInterval(1000) not yet met still need to wait 40 |
| pjsc-content-resource-aborted | HTTP response header pjsc-content-resource-aborted | 0 |
| pjsc-content-resource-active | HTTP response header pjsc-content-resource-active | 1 |
| pjsc-content-resource-complete | HTTP response header pjsc-content-resource-complete | 12 |
| pjsc-content-resource-failed | HTTP response header pjsc-content-resource-failed | 0 |
| pjsc-content-resource-late | HTTP response header pjsc-content-resource-late | 0 |
| pjsc-content-status-code | HTTP response header pjsc-content-status-code | 200 |
| pjsc-content-url | HTTP response header pjsc-content-url | http://www.google.com/ |
| transfer-encoding | HTTP response header transfer-encoding | chunked |
| vary | HTTP response header vary | Origin,Accept-Encoding |
| via | HTTP response header via | 1.1 google |
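
The notes on lowering and tracking cost can be tied together with a small helper. The sketch below is illustrative only and makes no network calls: it builds a request body shaped like the minimal request, then condenses the billing and resource headers listed above into numbers. The function names `build_page_request` and `summarize_pjsc_headers` are hypothetical, and the camelCase keys and hyphenated lowercase header names are assumptions about the exact spelling the service uses.

```python
def build_page_request(url, render_type="jpg", output_as_json=False, dom_ready=True):
    """Build a request body shaped like the minimal request in this document.

    Key spellings (renderType, outputAsJson, requestSettings, doneWhen,
    domReady) are assumed; adjust them if your connector expects otherwise.
    """
    request = {
        "url": url,
        "renderType": render_type,
        "outputAsJson": output_as_json,
    }
    if dom_ready:
        # Render as soon as the browser signals DOM ready instead of
        # waiting for every resource, which the notes say lowers cost.
        request["requestSettings"] = {"doneWhen": [{"event": "domReady"}]}
    return request


def summarize_pjsc_headers(headers):
    """Extract cost and resource counters from a response-header mapping.

    Header names are matched case-insensitively. Missing headers yield None.
    """
    lowered = {name.lower(): value for name, value in headers.items()}

    def number(name, cast):
        raw = lowered.get(name)
        if raw is None:
            return None
        # Values such as "257,914" carry thousands separators.
        return cast(raw.replace(",", ""))

    return {
        "credit_cost": number("pjsc-billing-credit-cost", float),
        "credits_remaining": number("pjsc-billing-total-credits-remaining", float),
        "elapsed_ms": number("pjsc-billing-elapsedms", int),
        "bytes": number("pjsc-billing-bytes", int),
        "resources_failed": number("pjsc-content-resource-failed", int),
        "content_status": number("pjsc-content-status-code", int),
    }
```

Passing the `response_headers` object from an action's output to `summarize_pjsc_headers` gives a quick way to log the cost per page or to flag pages whose `resources_failed` count is above zero.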