This guide provides general information about bots and describes the bots that Layer0 detects.

Layer0 examines the user-agent header of each incoming request for strings that identify a bot. If it finds one, it sets the x-0-device-is-bot request header to 1; otherwise, it sets the header to 0. The header is visible to your server code.
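As a minimal sketch of how server code might read this header, the helper below checks for the value 1. The function name isBotRequest is hypothetical and not part of Layer0; the sketch only assumes that your server exposes request headers as an object with lowercase keys, as Node's http module does.

```javascript
// Hypothetical helper (not a Layer0 API): returns true when Layer0
// has flagged the request as coming from a bot.
function isBotRequest(headers) {
  // Node's http module lowercases incoming header names.
  return headers['x-0-device-is-bot'] === '1'
}

// Plain objects stand in for req.headers here.
console.log(isBotRequest({ 'x-0-device-is-bot': '1' })) // true
console.log(isBotRequest({ 'x-0-device-is-bot': '0' })) // false
```

In a real handler you would pass req.headers and branch on the result, for example to skip personalization for bots.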

The following table lists the user agent strings that Layer0 looks for and describes the corresponding bots.

| User Agent | Bot Description |
| --- | --- |
| embedly | Embed.ly web crawler bot that performs HTTP requests most often in automatic mode. |
| facebookexternalhit | Facebook bot that crawls the HTML of social plugins, apps, and websites shared on Facebook. The bot gathers and caches data (title, description, thumbnail image) about the shared content and presents the data as a preview. |
| flipboard | Flipboard Proxy Service bot that runs in response to a user request for the service to scan a social media feed such as Twitter, and construct a processed feed of items to deliver in real time. |
| googlepagesspeed | Google bot that assists in ranking search results based on page load speed. |
| Google web/snippet | Google+ Enterprise bot that extracts high-level data from a URL posted on Google+ Enterprise and presents the data as a snippet of the URL. |
| headless | Bots, usually scripts, that run on a scheduled basis or are triggered from an external system. Headless bots usually perform activities like sending alerts or daily digest messages. The scripts usually run for a short time, then terminate. |
| ia_archiver | Amazon Alexa bot that crawls web sites for issues related to Amazon's Site Audit service. |
| outbrain | Outbrain Recommendation Platform chat bot. |
| pinterest | Automated Pinterest bot that creates boards and schedules pins to post to customer accounts. |
| prerender | Prerender.io hosted service bot that produces an easily crawled version of dynamically rendered pages, allowing indexing by search engines. |
| preview | Yahoo bot that extracts data (title, description, thumbnail images) from a URL embedded in an email and presents the data as a preview of the URL. |
| qwantify | Web crawler bot that indexes content for the Qwant search engine. |
| scanner | Bots that analyze how well your website and its security measures respond to various bot threats. |
| slurp | Yahoo Search bot for crawling and indexing web page information. |
| spider | General purpose automated bots that crawl the web to index web page information. |
| tumblr | Tumblr bot that performs automated HTTP requests as a web crawler. |
| vkshare | VK social network bot that performs automated HTTP requests usually as a web crawler. |
| w3c_validator | W3C bot that checks Web documents in formats like HTML and XHTML for conformance to W3C Recommendations and other standards. |
| whatsapp | WhatsApp platform chat bot. |
| xing-contenttabreceiver | Xing social network crawler bot that indexes content for the Xing social network. |
| yahoo | Another Yahoo Search robot for crawling and indexing web page information. |
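The check described above can be approximated with a simple substring test. This is only an illustrative sketch, not Layer0's actual implementation: the names BOT_TOKENS and deviceIsBot are hypothetical, and the case-insensitive comparison is an assumption, since this guide does not say how Layer0 handles letter case.

```javascript
// User agent strings from the table above.
const BOT_TOKENS = [
  'embedly', 'facebookexternalhit', 'flipboard', 'googlepagesspeed',
  'Google web/snippet', 'headless', 'ia_archiver', 'outbrain', 'pinterest',
  'prerender', 'preview', 'qwantify', 'scanner', 'slurp', 'spider', 'tumblr',
  'vkshare', 'w3c_validator', 'whatsapp', 'xing-contenttabreceiver', 'yahoo',
]

// Hypothetical function: returns the value that would go in the
// x-0-device-is-bot header for a given user-agent string.
function deviceIsBot(userAgent) {
  // Assumption: matching ignores case. A missing header counts as "not a bot".
  const ua = (userAgent || '').toLowerCase()
  const matched = BOT_TOKENS.some((token) => ua.includes(token.toLowerCase()))
  return matched ? '1' : '0'
}

console.log(deviceIsBot('facebookexternalhit/1.1')) // 1
console.log(deviceIsBot('Mozilla/5.0 (Windows NT 10.0; Win64; x64)')) // 0
```

Because the test is a substring match, short tokens such as preview or spider would also flag any user agent that happens to contain those words.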

If the set of bots detected by Layer0 is not sufficient for your needs, you can easily add your own bot detection through EdgeJS and its match and setRequestHeader APIs:

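// `router` is your existing EdgeJS Router instance.
router.match(
  {
    headers: {
      // Placeholder: replace with a regex that matches your bots' user agents.
      'user-agent': /^regex-for-your-bot-detection$/i,
    },
  },
  ({ setRequestHeader }) => {
    // Flag the request so that later routes and your backend can detect it.
    setRequestHeader('my-bot-detection-is-bot', '1')
  }
)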
// ... all your other routes go here and they can match on `my-bot-detection-is-bot: 1`

The code above matches any request that has a user-agent header whose value matches the given regex, and injects the my-bot-detection-is-bot request header with a value of 1. Once the header has been injected, later routes can test for it and implement bot handling. Alternatively, you can let the header pass upstream and handle it in your backend.
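When you fill in the regex placeholder, keep in mind that the ^ and $ anchors require the entire user-agent value to match the pattern. The sketch below builds an unanchored, case-insensitive regex from a list of tokens instead. The bot names examplebot and acme-crawler and the helper escapeRegex are hypothetical and used for illustration only.

```javascript
// Hypothetical bot tokens; replace with the ones you need to detect.
const customBots = ['examplebot', 'acme-crawler']

// Escape regex metacharacters so each token is matched literally.
const escapeRegex = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')

// Unanchored and case-insensitive: matches a token anywhere in the user agent.
const customBotRegex = new RegExp(customBots.map(escapeRegex).join('|'), 'i')

console.log(customBotRegex.test('Mozilla/5.0 (compatible; ExampleBot/2.1)')) // true
console.log(customBotRegex.test('Mozilla/5.0 (Windows NT 10.0; Win64; x64)')) // false
```

A regex built this way can be used as the 'user-agent' value in the match criteria shown earlier.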
