New Feature: Amazon S3 Integration

We're pleased to announce public availability of two new features to streamline your integrations via Amazon Web Services (AWS).

1. Output Buckets

Many of our users have a data pipeline that involves picking up our callback POST, immediately fetching the SERP, and storing the resulting JSON in an S3 bucket.

We can now eliminate this entire process by automatically pushing the data straight into your S3 bucket from our crawl nodes as soon as it's available. You can then choose either to keep receiving the callbacks, or simply to trigger an AWS Lambda function that does further processing automatically.
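For illustration, a minimal Lambda handler triggered by the S3 event notification might look like the sketch below. The event layout is the standard AWS S3 notification format; the bucket contents and processing step are assumptions, not part of our API, and the boto3 fetch is left as comments so the sketch stays self-contained.

```python
def parse_s3_event(event):
    """Extract (bucket, key) pairs from a standard S3 event notification."""
    return [
        (rec["s3"]["bucket"]["name"], rec["s3"]["object"]["key"])
        for rec in event.get("Records", [])
    ]

def handler(event, context):
    """Lambda entry point: process each newly pushed SERP JSON object."""
    for bucket, key in parse_s3_event(event):
        # In a real deployment you would fetch and parse the object here:
        # import boto3, json
        # body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
        # serp = json.loads(body)
        print(f"new SERP object: s3://{bucket}/{key}")
```

Wiring this up is a one-time step: configure the bucket's event notifications to invoke the function on `s3:ObjectCreated:*`.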

We also support custom filename templates for your S3 objects that can include any of the following variables:

  • [id] - the request ID
  • [check_id] - the check_id of the SERP
  • [engine_code] - e.g. google_en-us
  • [yyyymmdd] - the date of the SERP
  • [timestamp] - the unix timestamp of the SERP
  • [queue] - the queue name: daily, priority or delayed
  • [query] - an MD5 sum of the final query
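As an example of how a template might expand, consider `[engine_code]/[yyyymmdd]/[id].json`. The substitution sketched below is an assumption about the behaviour, using the variable names from the list above; the exact rules are handled on our side.

```python
# Hypothetical expansion of an S3 filename template. Variable names match
# the list above; the substitution logic itself is an assumed sketch.
def expand_template(template, values):
    for var, value in values.items():
        template = template.replace(f"[{var}]", str(value))
    return template

key = expand_template(
    "[engine_code]/[yyyymmdd]/[id].json",
    {"engine_code": "google_en-us", "yyyymmdd": "20240115", "id": "12345"},
)
# → "google_en-us/20240115/12345.json"
```

A date-first template such as `[yyyymmdd]/[engine_code]/[id].json` is often a better fit if you plan to expire old objects with an S3 lifecycle rule on the date prefix.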

If you'd like to use this feature, contact support (info@) and we'll get you set up and tested.

2. Bulk CSV Upload

[Note: this feature is limited to the delayed queue only]

For our larger clients, queuing tens or hundreds of thousands of requests every day introduces logistical and devops overhead that they would rather avoid. For this reason, we've introduced CSV upload via AWS S3 as a whitelisted feature.
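As a sketch of the client side, building and pushing a batch might look like this. The CSV column names and bucket/key below are hypothetical placeholders, not our actual upload format; the real column layout is on the documentation page, and the boto3 upload is left commented since it needs AWS credentials.

```python
import csv
import io

# Hypothetical CSV layout: one row per request. The real column names
# are defined on the documentation page.
rows = [
    {"query": "best coffee grinder", "engine_code": "google_en-us"},
    {"query": "espresso machines", "engine_code": "google_en-gb"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["query", "engine_code"])
writer.writeheader()
writer.writerows(rows)

# Push the batch to the whitelisted bucket (bucket and key are placeholders):
# import boto3
# boto3.client("s3").put_object(
#     Bucket="your-upload-bucket",
#     Key="batches/requests.csv",
#     Body=buf.getvalue(),
# )
```

Because uploads go to the delayed queue, a single daily batch file is usually the simplest pattern.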

Full integration details can be found on our documentation page.