
Data Delivery Setup for Octaprice

Octaprice delivers your product and price tracking data directly to your AWS S3 bucket. You can receive the data in one or more of three formats: JSON, CSV, and JSONL (JSON Lines).

Configuration Steps

To configure data delivery to your S3 bucket, please provide the following credentials and settings:

  • AWS Access Key ID: The access key ID of the AWS credentials Octaprice should use to write to your bucket.
  • AWS Secret Access Key: The secret access key paired with that access key ID.
  • Bucket Name: The name of your S3 bucket.
  • Bucket Path: An optional folder path within your bucket.
  • Bucket Region: The AWS region where your bucket is located.
  • Delivery Format: Select one or more formats—JSON, CSV, or JSONL.

After entering the required information, click Save to finalize the configuration. Octaprice will securely transmit your data to your specified S3 bucket according to your delivery settings.

Required AWS Permissions

Ensure that the AWS credentials provided have the following permissions:

  • s3:PutObject
  • s3:GetBucketLocation
  • s3:ListBucket

Octaprice needs s3:PutObject to write data files to your S3 bucket and s3:GetBucketLocation to verify the bucket's location. s3:ListBucket allows listing the objects in the bucket.
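As an illustration, the three permissions above could be granted through an IAM policy like the one this sketch generates. The bucket name my-octaprice-data, the prefix exports, and the helper build_delivery_policy are placeholders, not values Octaprice prescribes. Limiting s3:PutObject to the configured bucket path is an assumption; drop the prefix if delivery fails under it.

```python
import json


def build_delivery_policy(bucket, prefix=""):
    """Build an IAM policy document granting the three listed S3 permissions.

    Hypothetical helper: bucket and prefix are your own values.
    """
    prefix = prefix.strip("/")
    # s3:PutObject is an object-level action, so it targets object ARNs.
    if prefix:
        object_resource = f"arn:aws:s3:::{bucket}/{prefix}/*"
    else:
        object_resource = f"arn:aws:s3:::{bucket}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "WriteDeliveredFiles",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": object_resource,
            },
            {
                # Bucket-level actions target the bucket ARN itself.
                "Sid": "LocateAndListBucket",
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


policy = build_delivery_policy("my-octaprice-data", "exports")
print(json.dumps(policy, indent=2))
```

Attaching a policy like this to a dedicated IAM user, rather than reusing personal credentials, keeps the access key limited to what delivery requires.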

Security Protocols

At Octaprice, we prioritize the security of your sensitive information. All AWS credentials are encrypted and stored securely in compliance with industry-leading security standards.

Data Transmission Process

Initiating a Crawl Job

Each time Octaprice crawls a source, it initiates a crawl job encompassing all URLs to be checked. The job may produce data in multiple batches, each identified by a unique Batch ID. By default, each batch contains 5 items; however, this can be customized by contacting our support team.
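To make the batching concrete, here is a minimal sketch of how a job's URLs could be split into batches of 5. The function split_into_batches and the use of uuid4 for Batch IDs are illustrative assumptions; the actual Batch ID format is not specified here.

```python
from itertools import islice
from uuid import uuid4


def split_into_batches(urls, batch_size=5):
    """Yield (batch_id, urls) pairs with at most batch_size URLs each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(urls)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            break
        # uuid4 is a placeholder for a unique Batch ID (assumed format).
        yield uuid4().hex, chunk


urls = [f"https://example.com/product/{n}" for n in range(12)]
job_batches = list(split_into_batches(urls))
print([len(chunk) for _, chunk in job_batches])  # [5, 5, 2]
```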

Data Delivery Mechanism

Upon completion of data collection, the data is formatted according to your selected delivery formats (JSON, CSV, or JSONL) and transmitted to your S3 bucket. The sections below describe the available formats, the file path structure, and the upload process.

Data Formats

On the Data Delivery Configuration page, select your preferred format(s):

  • JSON: Standard JSON format.
  • CSV: Comma-Separated Values format.
  • JSONL: Each line contains a single JSON object (JSON Lines).
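On the receiving side, the three formats can be read with the Python standard library. The sketch below is a hypothetical loader; the field names url and price are made up for illustration, and it assumes a JSON file holds either a single object or an array of objects.

```python
import csv
import io
import json


def load_records(text, fmt):
    """Parse delivered file contents into a list of dicts."""
    if fmt == "json":
        data = json.loads(text)
        # Assumption: a JSON file is an array of items or a single item.
        return data if isinstance(data, list) else [data]
    if fmt == "jsonl":
        # One JSON object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    if fmt == "csv":
        # The first row is treated as the header; all values are strings.
        return list(csv.DictReader(io.StringIO(text)))
    raise ValueError(f"unsupported format: {fmt!r}")


jsonl_text = '{"url": "https://example.com/p/1", "price": 19.99}\n{"url": "https://example.com/p/2", "price": 5.0}\n'
csv_text = "url,price\nhttps://example.com/p/1,19.99\n"

print(len(load_records(jsonl_text, "jsonl")))
print(load_records(csv_text, "csv")[0]["price"])
```

Note that CSV carries no type information, so numeric fields such as prices need to be converted after loading.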

Bucket Path Structure

Data files are organized in your S3 bucket using a structured path that includes the date and IDs pertinent to your organization and crawl job. The structure is as follows:

[bucket-path]/[organizationId]/[sourceId]/[year]/[month]/[day]/[crawlJobId].[extension]

  • bucket-path: The bucket path you specified.
  • organizationId: Your organization's unique ID.
  • sourceId: The source identifier (e.g., amazon.com).
  • year/month/day: The date when the crawl job was completed.
  • crawlJobId: The unique ID for the crawl job.
  • extension: The file extension based on the selected format (json, csv, jsonl).

Example Path for a JSON File:

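Suppose you are tracking products from amazon.com with no bucket path configured, and the crawl job completed on September 23, 2024. With an organization ID of 123, a source ID of 567, and a crawl job ID of 890, the first file received might be located at: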

/123/567/2024/09/23/890.json
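The path structure can be reproduced with a small helper, which is handy when you need to locate or list delivered files. This is a sketch: build_object_key is a hypothetical name, and the result has no leading slash because it is not stated whether the delivered key begins with one.

```python
from datetime import date

EXTENSIONS = {"json", "csv", "jsonl"}


def build_object_key(bucket_path, organization_id, source_id,
                     completed_on, crawl_job_id, extension):
    """Assemble an object key following the documented path structure."""
    if extension not in EXTENSIONS:
        raise ValueError(f"unsupported extension: {extension!r}")
    parts = [
        bucket_path.strip("/"),   # optional; dropped below when empty
        str(organization_id),
        str(source_id),
        f"{completed_on:%Y}",     # four-digit year
        f"{completed_on:%m}",     # zero-padded month
        f"{completed_on:%d}",     # zero-padded day
        str(crawl_job_id),
    ]
    return "/".join(part for part in parts if part) + "." + extension


key = build_object_key("", 123, 567, date(2024, 9, 23), 890, "json")
print(key)  # 123/567/2024/09/23/890.json
```

Dropping the file name from the key gives a prefix such as 123/567/2024/09/23/ that can be used to list every file delivered for a source on a given day.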

Process Overview

The system validates the selected format and uploads the data to your S3 bucket accordingly:

  • Format Validation: Ensures that the format is one of the supported types—JSON, CSV, or JSONL. If no format is specified, JSON is used by default.
  • Data Handling:
    • JSON Files: Data is converted to JSON format and stored with the content type application/json.
    • CSV Files: Data is converted to CSV format and stored with the content type text/csv.
    • JSONL Files: Each data item is written as a separate JSON object on a new line.
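The validation and conversion steps above can be sketched as follows. This is not Octaprice's implementation: serialize is a hypothetical function, the CSV header is assumed to be the union of item keys, and the JSONL content type is returned as None because it is not stated here.

```python
import csv
import io
import json

SUPPORTED_FORMATS = ("json", "csv", "jsonl")


def serialize(items, fmt=None):
    """Return (body, content_type) for a list of dict items."""
    fmt = (fmt or "json").lower()  # JSON is the default when none is given
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    if fmt == "json":
        return json.dumps(items), "application/json"
    if fmt == "csv":
        buffer = io.StringIO()
        # Header: every key seen across items, in first-seen order (assumed).
        fieldnames = list(dict.fromkeys(k for item in items for k in item))
        writer = csv.DictWriter(buffer, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(items)
        return buffer.getvalue(), "text/csv"
    # JSONL: one JSON object per line; content type not documented.
    return "".join(json.dumps(item) + "\n" for item in items), None


items = [
    {"url": "https://example.com/p/1", "price": 19.99},
    {"url": "https://example.com/p/2", "price": 5.0},
]
body, content_type = serialize(items, "csv")
print(content_type)
print(body.splitlines()[0])
```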

Enhanced Security Measures

Octaprice employs rigorous security protocols to safeguard your data:

  • Data Encryption: All credentials and sensitive data are encrypted during storage and transmission.
  • Secure Storage: Credentials are stored in secure environments with strict access controls.
  • Regulatory Compliance: Octaprice adheres to industry-standard security practices and compliance requirements.

For further assistance, please contact our support team.