
Connectors

Connectors define how the platform connects to external data sources. Each connector type brings its own authentication method, data access patterns, and capabilities.

Available Connectors

PostgreSQL

The most mature connector with full streaming, incremental ingestion, and real-time preview.

Feature | Status
Authentication | Username/Password
Schema Browser
Custom SQL Queries
Preview (AI-enriched)
Incremental Ingestion | ✅ (change-tracking)
Scheduled Execution

How it works: The platform connects to your PostgreSQL database, runs the configured query (or full table extract), streams the rows as CSV, and imports them into your data warehouse.
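The streaming step described above can be illustrated with a minimal sketch. It assumes the rows arrive as an iterable of tuples (for example from a database cursor); the function name and the `chunk_size` parameter are hypothetical and are not part of the platform.

```python
import csv
import io
from typing import Iterable, Iterator, Sequence


def stream_rows_as_csv(
    header: Sequence[str],
    rows: Iterable[Sequence[object]],
    chunk_size: int = 1000,
) -> Iterator[str]:
    """Yield CSV text in chunks so the full result set never sits in memory."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    pending = 0
    for row in rows:
        writer.writerow(row)  # None values are written as empty fields
        pending += 1
        if pending >= chunk_size:
            yield buffer.getvalue()
            buffer.seek(0)
            buffer.truncate(0)
            pending = 0
    # Flush whatever is left (the header alone if there were no rows).
    if buffer.tell() > 0:
        yield buffer.getvalue()
```

Each yielded chunk can be forwarded to the warehouse loader as soon as it is produced, which is the main reason to stream instead of building one large string.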


Google Sheets / Drive (In progress)

Google Sheets/Drive is visible in the product as In progress and is temporarily unavailable for new connections, previews, tests, and scheduled jobs.

Feature | Status
Authentication | OAuth 2.0 (Google)
Preview | In progress
Scheduled Execution | In progress
Incremental Ingestion | ❌ (full extract)

Configuration fields:

  • Spreadsheet ID — Found in the spreadsheet URL: docs.google.com/spreadsheets/d/SPREADSHEET_ID/edit
  • Range — e.g. Sheet1!A:Z or just A:Z for the default sheet

Planned behavior: The platform will use the Google Sheets API v4 to read spreadsheet data through OAuth. The first row is treated as headers, and subsequent rows are converted to CSV before loading into the warehouse.
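As a rough illustration of the configuration fields and the planned header handling, the sketch below extracts the spreadsheet ID from a URL and converts a grid of values (first row as headers) to CSV. The function names are hypothetical, and the input shape assumes a list of rows like the one the Sheets API v4 returns for a range, where trailing empty cells are omitted.

```python
import csv
import io
import re
from typing import List, Optional

_SHEET_URL = re.compile(r"/spreadsheets/d/([A-Za-z0-9_-]+)")


def extract_spreadsheet_id(url: str) -> Optional[str]:
    """Return the ID segment of a Google Sheets URL, or None if absent."""
    match = _SHEET_URL.search(url)
    return match.group(1) if match else None


def values_to_csv(values: List[List[object]]) -> str:
    """Treat the first row as headers and emit every row at header width."""
    if not values:
        return ""
    width = len(values[0])
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in values:
        # Pad short rows with empty cells; cells beyond the header width are dropped.
        padded = list(row[:width]) + [""] * (width - len(row))
        writer.writerow(padded)
    return out.getvalue()
```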


Microsoft (Excel / SharePoint)

Connect your Microsoft 365 account via OAuth to access Excel Online files, SharePoint folders, and SharePoint lists.

Feature | Status
Authentication | OAuth 2.0 (Microsoft)
Preview
Scheduled Execution
Incremental Ingestion | ❌ (full extract)

Three source types are supported:

Excel File

Read a specific Excel workbook from OneDrive or SharePoint.

SharePoint Folder

Ingest all CSV and Excel files from a SharePoint document library folder.

SharePoint List

Read data from a SharePoint list and convert to tabular format.


MySQL

Fully supported with schema browsing, SQL queries, previews, and scheduled executions.

Feature | Status
Authentication | Username/Password
Schema Browser
Preview
Incremental Ingestion | ✅ (via column watermark tracking)
Scheduled Execution
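
Column watermark tracking, as listed in the table above, generally means remembering the highest value seen in a monotonically increasing column and fetching only newer rows on the next run. The sketch below shows that idea under assumptions: the function names are hypothetical, the `%s` placeholder follows the style of common MySQL drivers, and the table and column names are assumed to come from trusted configuration because they are interpolated into the SQL text.

```python
from typing import Any, Dict, Iterable, Optional, Tuple


def build_incremental_query(
    table: str, column: str, watermark: Optional[Any]
) -> Tuple[str, tuple]:
    """Return (sql, params); the first run has no watermark and reads everything."""
    if watermark is None:
        return f"SELECT * FROM {table} ORDER BY {column}", ()
    sql = f"SELECT * FROM {table} WHERE {column} > %s ORDER BY {column}"
    return sql, (watermark,)


def advance_watermark(
    rows: Iterable[Dict[str, Any]], column: str, current: Optional[Any]
) -> Optional[Any]:
    """Move the watermark forward to the largest non-null value seen."""
    values = [row[column] for row in rows if row.get(column) is not None]
    if not values:
        return current
    newest = max(values)
    return newest if current is None or newest > current else current
```

The watermark only advances after a batch has been read, so a failed run can be retried from the previous value.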

SQL Server

Fully supported for enterprise workloads, including SQL authentication.

Feature | Status
Authentication | Username/Password
Schema Browser
Preview
Incremental Ingestion | ✅ (via column watermark tracking)
Scheduled Execution

Inbound API (Push)

External systems push data into the platform via a dedicated webhook endpoint. Each connection gets a unique, revocable ingest token.

Feature | Status
Authentication | Ingest Token (auto-generated, 256-bit)
Direction | Push (external → platform)
Preview | ✅ (last received data)
Multi-Table Support | ✅ (hybrid single-request routing)
Auto-Healing Schema | ✅ (zero-cost pre-filtering & AI Schema Matcher)
Scheduled Execution | N/A (event-driven / queue worker)

How it works: When you create an Inbound API connection, the platform generates a unique endpoint URL with an ingest token. External systems then POST JSON data (single-table arrays or multi-table structures) to that endpoint.
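
A push from an external system might look like the sketch below. Several details are assumptions for illustration only: the endpoint URL, sending the token as a Bearer `Authorization` header, and the exact payload shapes (a list of rows for a single table, a mapping of table name to rows for multiple tables). Check the connection screen for the actual URL and token placement.

```python
import json
import urllib.request
from typing import Any


def build_ingest_request(
    endpoint_url: str, ingest_token: str, payload: Any
) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST to an ingest endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the ingest token is passed as a Bearer token.
            "Authorization": f"Bearer {ingest_token}",
        },
    )


# Assumed payload shapes, for illustration only.
single_table = [{"id": 1, "total": 19.9}, {"id": 2, "total": 5.0}]
multi_table = {
    "customers": [{"id": 10, "name": "Ana"}],
    "orders": [{"id": 1, "customer_id": 10}],
}
```

Passing the returned request to `urllib.request.urlopen` would send it.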


AI API Client (Pull)

Connect to any REST API without writing code. Provide technical documentation (ReDoc, a Swagger schema, JSON/TXT files, or raw endpoint text) and describe in natural language which tables you want (e.g., "Pull won opportunities from ERP"). The AI Integrator then builds and tests the integration automatically in a secure background sandbox.

Feature | Status
Authentication | Bearer Token / API Key / Basic Auth / Custom Headers
Direction | Pull (Iara Data → External API)
Frictionless Mapping | ✅ (full analysis of manuals and schema routes)
Active Domain Duplicate Detection | ✅ (automatically reuses valid credentials for existing host domains)
Adaptive Auth Guidance | ✅ (visual copy-and-paste steps based on API security checks)
Silent Background Compilation | ✅ (generated logic and trial runs happen in the background)
Execution & Scheduling | ✅ (custom recurrence, schema review, type mapping, and keys)

Unified No-Code Experience

Connections and pipelines are consolidated into a single continuous wizard:

  1. Analyze Documentation: Enter the target API root URL and paste technical references (API manuals, ReDoc pages, or plain-text descriptions). The agent scans them for the API's security structures.
  2. Credential Guides: Instead of leaving you to guess headers, the wizard provides tailored copy-and-paste steps explaining where to retrieve the API key or token.
  3. Domain Duplicate Prevention: If you try to create a connection for an API host domain that is already configured, the platform flags the duplicate and offers a 1-Click Reuse action that safely reuses the existing valid credentials.
  4. Click Recommended Pipelines: The model maps the endpoints and suggests a set of typical data cards (e.g. Customers, Invoices, Logs). Clicking a card triggers creation of the bridge.
  5. Silent Test Run: The system compiles the connector, runs a trial query, and returns a verified data preview within seconds, so business users never have to deal with scripts or code approvals.
  6. Review and Schedule: Check the final preview tables, configure standard transformations, set PII masking, select keys, and set up the cron schedule on the review screen.

Shopify

Ingest data from your Shopify store via the Admin API.

Feature | Status
Authentication | Admin API Access Token
Supported Objects | Orders, Products, Customers, Inventory Items, Collections
Preview
Pagination | ✅ (cursor-based via Link header)
Incremental Ingestion | ✅ (via updated_at change tracking)
Scheduled Execution
Plan | Starter+

Configuration fields:

  • Shop Domain — e.g. my-store.myshopify.com
  • Access Token — From a Shopify custom app (Admin API)
  • API Version — Defaults to 2024-01

How it works: The platform calls the Shopify Admin REST API, paginates through all records using cursor-based pagination (Link header), flattens nested objects, and uploads the data as CSV to the data warehouse.
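
Cursor-based pagination through the Link header comes down to reading the URL tagged `rel="next"` from each response and requesting it until no such URL remains. A minimal parsing sketch follows; the function name is hypothetical and the sample header in the usage note is illustrative.

```python
import re
from typing import Optional

# Matches each `<url>; rel="name"` entry in a Link header.
_LINK_PART = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Return the URL marked rel="next", or None when this is the last page."""
    if not link_header:
        return None
    for url, rel in _LINK_PART.findall(link_header):
        if rel == "next":
            return url
    return None
```

A fetch loop would call `next_page_url(response_headers.get("Link"))` after each page and stop when it returns `None`.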

To get an access token:

  1. Go to Shopify Admin → Settings → Apps and sales channels → Develop apps
  2. Create a custom app and configure Admin API scopes (read_orders, read_products, etc.)
  3. Install the app and copy the Admin API access token

Stripe

Ingest payment and billing data from Stripe.

Feature | Status
Authentication | Restricted API Key (read-only)
Supported Objects | Charges, Subscriptions, Customers, Invoices, Payouts, Disputes, Products, Prices
Preview
Pagination | ✅ (cursor-based via starting_after)
Incremental Ingestion | ✅ (via created timestamp filter)
Scheduled Execution
Plan | Starter+

Configuration fields:

  • API Key — Restricted key with read-only permissions

How it works: The platform calls the Stripe REST API with cursor-based pagination, flattens nested objects (metadata, address, etc.), and uploads to the data warehouse.
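
The pagination and flattening described above can be sketched as follows. The `fetch_page` callable is a hypothetical stand-in for the HTTP call, so the example runs without network access; it is assumed to return a Stripe-style list response with `data` and `has_more` fields. Joining nested keys with an underscore is also an assumption about the column naming.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def flatten(obj: Dict[str, Any], parent: str = "", sep: str = "_") -> Dict[str, Any]:
    """Turn nested dicts into a single level, e.g. address.city -> address_city."""
    flat: Dict[str, Any] = {}
    for key, value in obj.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def iter_stripe_objects(
    fetch_page: Callable[[Optional[str]], Page]
) -> Iterator[Dict[str, Any]]:
    """Walk every page, passing the last object's ID as starting_after."""
    starting_after: Optional[str] = None
    while True:
        page = fetch_page(starting_after)
        items = page.get("data", [])
        for item in items:
            yield flatten(item)
        if not page.get("has_more") or not items:
            return
        starting_after = items[-1]["id"]
```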

To get an API key:

  1. Go to Stripe Dashboard → Developers → API keys
  2. Create a restricted key with read-only permissions for the data you need

HubSpot

Ingest CRM data from HubSpot.

Feature | Status
Authentication | Private App Access Token
Supported Objects | Contacts, Companies, Deals, Tickets, Products, Line Items
Preview
Pagination | ✅ (cursor-based search API)
Incremental Ingestion | ✅ (via updatedAt)
Scheduled Execution
Plan | Growth+

Configuration fields:

  • Access Token — From a HubSpot private app

How it works: The platform uses the HubSpot CRM v3 Search API to paginate through records. Properties are automatically flattened from the nested properties object.
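
To make the flattening concrete, the sketch below lifts the nested `properties` object of a CRM record to top-level columns and reads the next-page cursor from a search response. The function names are hypothetical, and the response shape (`results`, `paging.next.after`) is assumed from the CRM v3 Search API.

```python
from typing import Any, Dict, Optional


def flatten_hubspot_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Merge the nested properties object into a flat row keyed by property name."""
    flat: Dict[str, Any] = {"id": record.get("id")}
    flat.update(record.get("properties") or {})
    return flat


def next_after(page: Dict[str, Any]) -> Optional[str]:
    """Return the cursor for the next search request, or None on the last page."""
    return ((page.get("paging") or {}).get("next") or {}).get("after")
```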

To get an access token:

  1. Go to HubSpot → Settings → Integrations → Private Apps
  2. Create a private app with CRM object read scopes
  3. Copy the access token

TOTVS Protheus

Ingest data from the TOTVS Protheus ERP system via its REST API.

Feature | Status
Authentication | Basic Auth / Bearer Token / API Key
Supported Entities | Customers (SA1), Products (SB1), Sales Orders (SC5), Invoices (SF2), Financials (SE1/SE2), custom
Preview
Pagination | ✅ (offset-based)
Field Mapping | ✅ (map Protheus fields to standard names)
Scheduled Execution
Plan | Growth+

Configuration fields:

  • Base URL — e.g. https://protheus.company.com:8888
  • Auth Type — Basic, Bearer, or API Key
  • Environment / Company / Branch — Protheus-specific context headers

How it works: The platform calls the TOTVS Protheus REST API using offset pagination. Pre-configured entity endpoints (SA1, SB1, SC5, SF2, SE1, SE2) map to standard business objects. Custom endpoints can be specified for non-standard entities.


S3 / GCS (Cloud Bucket)

Ingest files from Amazon S3 or Google Cloud Storage buckets.

Feature | Status
Authentication | Access Key (S3) / Service Account (GCS)
Supported Formats | CSV, JSON, JSONL, XLSX (Excel), Parquet
Preview
Multi-File Ingestion
Binary Ingestion | ✅ (raw buffers download preventing file corruption)
Incremental Ingestion | ✅ (by file path history tracking & filename watermarks)
S3-Compatible | ✅ (MinIO, DigitalOcean Spaces, etc.)
Scheduled Execution
Plan | Growth+

Configuration fields:

  • Provider — S3 or GCS
  • Bucket Name — Target bucket
  • Region — S3 region (e.g. us-east-1)
  • Custom Endpoint — For S3-compatible services like MinIO

How it works:

  1. Binary-safe Download: Files are downloaded as raw buffers, fully supporting binary formats like Excel (.xlsx) and Apache Parquet (.parquet) without text-encoding corruption.
  2. Incremental Ingestion (File Log): The platform maintains a history of ingested files (boitata_ingested_files). When scheduled, it performs an outer join check to skip files that were already successfully processed, preventing duplicate imports.
  3. Filename Date Watermarks: If configured, the platform extracts timestamps from filename patterns to dynamically advance the job's watermark, skipping older files.
  4. Ingestion: A _source_file column is added to track the origin of each row. Combined data is structured and loaded into Nessie/Iceberg tables.
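
Steps 2 to 4 above can be sketched as a file-selection routine plus a row-tagging helper. This is an illustration under assumptions: the ingestion history is modeled as an in-memory set instead of the boitata_ingested_files table, the filename pattern is assumed to be YYYY-MM-DD, and files without a date in their name are assumed to be kept. All function names are hypothetical.

```python
import re
from datetime import date
from typing import Any, Dict, Iterable, List, Optional, Set

_DATE_IN_NAME = re.compile(r"(\d{4})-(\d{2})-(\d{2})")  # assumed filename pattern


def filename_date(key: str) -> Optional[date]:
    """Extract a date from an object key, or None if absent or invalid."""
    match = _DATE_IN_NAME.search(key)
    if not match:
        return None
    try:
        return date(int(match[1]), int(match[2]), int(match[3]))
    except ValueError:
        return None


def select_new_files(
    keys: Iterable[str], already_ingested: Set[str], watermark: Optional[date]
) -> List[str]:
    """Skip files already in the history and files dated at or before the watermark."""
    selected = []
    for key in keys:
        if key in already_ingested:
            continue
        stamp = filename_date(key)
        if watermark is not None and stamp is not None and stamp <= watermark:
            continue
        selected.append(key)
    return selected


def tag_rows(rows: Iterable[Dict[str, Any]], source_file: str) -> List[Dict[str, Any]]:
    """Add the _source_file column so each row can be traced to its file."""
    return [{**row, "_source_file": source_file} for row in rows]
```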

Salesforce

Ingest CRM and business data from Salesforce using SOQL queries.

Feature | Status
Authentication | OAuth Access Token
Custom SOQL Queries
Supported Objects | All standard and custom Salesforce objects
Preview
Pagination | ✅ (automatic via nextRecordsUrl)
Incremental Ingestion | ✅ (via LastModifiedDate)
Scheduled Execution
Plan | Growth+

Configuration fields:

  • Instance URL — e.g. https://mycompany.salesforce.com
  • Access Token — OAuth access token
  • API Version — Defaults to v59.0

How it works: The platform executes SOQL queries against the Salesforce REST API. Results are automatically paginated using nextRecordsUrl. You can specify individual objects with field selection or write custom SOQL queries.
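
A minimal sketch of the two pieces described above: building a SOQL query with an optional LastModifiedDate filter, and following nextRecordsUrl until the response reports `done`. The `fetch` callable is a hypothetical stand-in for the authenticated HTTP GET, and the function names are illustrative only.

```python
from typing import Any, Callable, Dict, Iterator, Optional, Sequence


def build_soql(
    sobject: str, fields: Sequence[str], modified_after: Optional[str] = None
) -> str:
    """Build a query; modified_after is an ISO 8601 datetime (SOQL leaves it unquoted)."""
    soql = f"SELECT {', '.join(fields)} FROM {sobject}"
    if modified_after:
        soql += f" WHERE LastModifiedDate > {modified_after}"
    return soql + " ORDER BY LastModifiedDate"


def iter_records(
    fetch: Callable[[str], Dict[str, Any]], first_url: str
) -> Iterator[Dict[str, Any]]:
    """Yield records from every page, dropping the per-record attributes block."""
    url: Optional[str] = first_url
    while url:
        page = fetch(url)
        for record in page.get("records", []):
            row = dict(record)
            row.pop("attributes", None)
            yield row
        url = None if page.get("done", True) else page.get("nextRecordsUrl")
```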


MongoDB

Ingest documents from MongoDB via the Atlas Data API.

Feature | Status
Authentication | Atlas Data API Key
Custom Queries | ✅ (filter, projection, sort)
Preview
Scheduled Execution
Plan | Starter+

Configuration fields:

  • Data API URL — Atlas Data API endpoint
  • API Key — Data API key
  • Data Source — Cluster name (e.g. Cluster0)
  • Database — Target database

How it works: The platform uses the MongoDB Atlas Data API /action/find endpoint. Extended JSON types ($oid, $date, $numberDecimal) are automatically converted to primitive values. Arrays and nested objects are JSON-serialized into string columns.
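
The conversion described above might look like the sketch below. It is an illustration, not the platform's code: the function names are hypothetical, `$numberDecimal` is mapped to Python's `Decimal` as one reasonable choice, and only top-level Extended JSON wrappers are unwrapped before nested values are serialized.

```python
import json
from decimal import Decimal
from typing import Any, Dict


def to_primitive(value: Any) -> Any:
    """Unwrap single-key Extended JSON wrappers such as {"$oid": "..."}."""
    if isinstance(value, dict) and len(value) == 1:
        (key, inner), = value.items()
        if key == "$oid":
            return str(inner)
        if key == "$date":
            # Canonical form nests {"$numberLong": "<millis>"}; relaxed form is an ISO string.
            if isinstance(inner, dict) and "$numberLong" in inner:
                return int(inner["$numberLong"])
            return inner
        if key == "$numberDecimal":
            return Decimal(inner)
    return value


def flatten_document(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Convert one document to a row; arrays and nested objects become JSON strings."""
    row: Dict[str, Any] = {}
    for key, value in doc.items():
        value = to_primitive(value)
        if isinstance(value, (dict, list)):
            value = json.dumps(value, default=str)
        row[key] = value
    return row
```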


FTP / SFTP

Connect to FTP or SFTP servers to ingest files.

Feature | Status
Authentication | Username/Password or Private Key (SFTP)
Connection Test | ✅ (TCP reachability)
Full Execution | ⚠️ Requires the ssh2-sftp-client and basic-ftp dependencies
Plan | Starter+

Note: Full FTP/SFTP execution requires the ssh2-sftp-client and basic-ftp npm packages to be installed. The connection test checks TCP reachability only.


BigQuery

Run SQL queries against Google BigQuery and ingest the results.

Feature | Status
Authentication | Service Account JSON (JWT)
Custom SQL Queries
Preview
Schema Types | ✅ (preserves BigQuery type metadata)
Scheduled Execution
Plan | Business+

Configuration fields:

  • Project ID — GCP project ID
  • Service Account JSON — Service account key with BigQuery Reader role
  • Location — BigQuery dataset location (e.g. US, EU)

How it works: The platform signs a JWT using the service account key, exchanges it for an access token, then executes the SQL query via the BigQuery REST API. Schema metadata is preserved (field types from BigQuery schema).


Snowflake

Run SQL queries against Snowflake and ingest the results.

Feature | Status
Authentication | Username/Password or Key-Pair
Custom SQL Queries
Preview
Scheduled Execution
Plan | Business+

Configuration fields:

  • Account — Snowflake account identifier (e.g. xy12345.us-east-1)
  • Username / Password — Snowflake credentials
  • Warehouse — Compute warehouse
  • Database / Schema / Role — Default context

How it works: The platform submits SQL queries via the Snowflake SQL REST API (/api/v2/statements). Results are returned synchronously for small queries. Column names and data types are extracted from resultSetMetaData.
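
Reading column names and types from resultSetMetaData can be sketched as below. The response shape (a `rowType` list of column descriptors plus a `data` array of row arrays) is assumed from the SQL REST API, and the function name is hypothetical.

```python
from typing import Any, Dict, List, Tuple


def rows_from_statement_response(
    response: Dict[str, Any]
) -> Tuple[List[Dict[str, Any]], Dict[str, str]]:
    """Return (rows keyed by column name, mapping of column name to type)."""
    row_type = response["resultSetMetaData"]["rowType"]
    names = [column["name"] for column in row_type]
    types = {column["name"]: column["type"] for column in row_type}
    rows = [dict(zip(names, values)) for values in response.get("data", [])]
    return rows, types
```

Values are left exactly as returned; casting them according to `types` would be a separate step.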


SAP (OData)

Ingest data from SAP systems via OData services.

Feature | Status
Authentication | Basic Auth / OAuth / API Key
OData Queries | ✅ ($select, $filter, $expand)
Preview
Pagination | ✅ ($skip/$top with inline count)
Scheduled Execution
Plan | Business+

Configuration fields:

  • OData Service URL — SAP Gateway service URL
  • Auth Type — Basic, OAuth, or API Key
  • SAP Client — (optional) Client number (e.g. 100)

How it works: The platform calls the SAP OData service with $skip/$top pagination, using $inlinecount=allpages to determine total record count. Metadata objects (__metadata, __deferred) are automatically stripped from results.
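
A sketch of the paging URL and metadata stripping described above. The function names are hypothetical, and the response envelope (`d.results` with a string `d.__count`) is assumed from the OData v2 JSON format that `$inlinecount` belongs to.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple
from urllib.parse import quote, urlencode


def build_page_url(
    service_url: str,
    entity_set: str,
    skip: int,
    top: int,
    select: Optional[Sequence[str]] = None,
    filter_expr: Optional[str] = None,
) -> str:
    """Build one page request using $skip/$top and an inline total count."""
    params: Dict[str, Any] = {"$skip": skip, "$top": top, "$inlinecount": "allpages"}
    if select:
        params["$select"] = ",".join(select)
    if filter_expr:
        params["$filter"] = filter_expr
    # Keep "$" and "," literal so the query options stay readable.
    query = urlencode(params, safe="$,", quote_via=quote)
    return f"{service_url.rstrip('/')}/{entity_set}?{query}"


def strip_metadata(value: Any) -> Any:
    """Drop __metadata keys and deferred navigation properties, recursively."""
    if isinstance(value, list):
        return [strip_metadata(item) for item in value]
    if isinstance(value, dict):
        return {
            key: strip_metadata(item)
            for key, item in value.items()
            if key != "__metadata"
            and not (isinstance(item, dict) and "__deferred" in item)
        }
    return value


def parse_page(payload: Dict[str, Any]) -> Tuple[List[Dict[str, Any]], int]:
    """Return (clean records, total record count) from one response."""
    body = payload["d"]
    return strip_metadata(body["results"]), int(body["__count"])
```

Paging continues by increasing `skip` by `top` until it reaches the total returned by `parse_page`.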


Kafka

Consume messages from Kafka topics via the Confluent REST Proxy.

Feature | Status
Authentication | SASL (PLAIN / SCRAM)
Protocol | Confluent REST Proxy (v2)
Preview
Offset Control | ✅ (earliest / latest)
Scheduled Execution
Plan | Business+

Configuration fields:

  • REST Proxy URL — Confluent REST Proxy endpoint
  • Schema Registry URL — (optional) For Avro/Protobuf deserialization
  • SASL Credentials — (optional) Username/password

How it works: The platform creates a consumer instance via the REST Proxy, subscribes to the specified topic, consumes messages, and then cleans up the consumer. Message values are flattened from JSON; metadata (topic, partition, offset, key, timestamp) is preserved.
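
The flattening of consumed messages can be sketched as follows. The underscore-prefixed metadata column names and the underscore-joined nested keys are assumptions for illustration, as is the function name; records are assumed to be dicts carrying topic, partition, offset, key, and a JSON value.

```python
from typing import Any, Dict


def _flatten(obj: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts, joining keys with underscores."""
    flat: Dict[str, Any] = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(_flatten(value, name))
        else:
            flat[name] = value
    return flat


def flatten_kafka_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Keep message metadata in prefixed columns and spread the JSON value into columns."""
    fields = ("topic", "partition", "offset", "key", "timestamp")
    row: Dict[str, Any] = {f"_{field}": record.get(field) for field in fields}
    value = record.get("value")
    if isinstance(value, dict):
        row.update(_flatten(value))
    else:
        # Non-object payloads (strings, numbers) are kept in a single column.
        row["value"] = value
    return row
```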

Note: Native Kafka connections (without the REST Proxy) will rely on the kafkajs package; this mode is coming soon.


Notion

Ingest data from Notion databases and pages.

Feature | Status
Authentication | Integration Token
Supported Objects | Databases, Pages
Preview
Pagination | ✅ (cursor-based)
Incremental Ingestion | ✅ (via last_edited_time)
Scheduled Execution
Plan | Growth+

Configuration fields:

  • Integration Token — From a Notion integration (secret_...)

How it works: The platform queries Notion databases using the API's /databases/{id}/query endpoint. All 18+ Notion property types (title, rich_text, number, select, multi_select, date, checkbox, URL, email, phone, formula, relation, rollup, people, files, created_time, last_edited_time, status) are automatically flattened to scalar values.
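
To show what flattening to scalar values can mean in practice, the sketch below handles a subset of property types and falls back to a JSON string for the rest. The function names and the fallback behavior are assumptions; the input shape (each property carries a `type` and a value under the key of that name) follows the Notion API's page objects.

```python
import json
from typing import Any, Dict

_PASSTHROUGH = {
    "number", "checkbox", "url", "email", "phone_number",
    "created_time", "last_edited_time",
}


def flatten_property(prop: Dict[str, Any]) -> Any:
    """Reduce one Notion property object to a scalar value."""
    kind = prop.get("type")
    data = prop.get(kind)
    if kind in ("title", "rich_text"):
        return "".join(part.get("plain_text", "") for part in (data or []))
    if kind in ("select", "status"):
        return data["name"] if data else None
    if kind == "multi_select":
        return ", ".join(option["name"] for option in (data or []))
    if kind == "date":
        return data["start"] if data else None
    if kind in _PASSTHROUGH:
        return data
    # Types not handled above are kept as a JSON string in this sketch.
    return json.dumps(data, default=str)


def flatten_page(page: Dict[str, Any]) -> Dict[str, Any]:
    """Build one row from a page returned by a database query."""
    row: Dict[str, Any] = {
        "id": page.get("id"),
        "last_edited_time": page.get("last_edited_time"),
    }
    for name, prop in page.get("properties", {}).items():
        row[name] = flatten_property(prop)
    return row
```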

To get an integration token:

  1. Go to notion.so/my-integrations
  2. Create a new integration
  3. Share the target database with the integration

Slack

Export messages, users, and channels from Slack workspaces.

Feature | Status
Authentication | Bot User OAuth Token (xoxb-...)
Supported Data Types | Messages, Users, Channels
Preview
Pagination | ✅ (cursor-based)
Incremental Ingestion | ✅ (via message ts timestamp)
Scheduled Execution
Plan | Growth+

Configuration fields:

  • Bot Token — Slack Bot User OAuth Token (xoxb-...)

Data types:

  • Messages — Channel message history with reactions, threads, attachments
  • Users — Workspace members with profile info, email, status
  • Channels — Public and private channels with topic, purpose, member count

To get a bot token:

  1. Go to api.slack.com/apps and create a new app
  2. Add Bot Token Scopes: channels:history, channels:read, users:read, users:read.email
  3. Install the app to your workspace and copy the Bot User OAuth Token

Connector Architecture

All connectors follow the same pipeline pattern:

External Source → Connector Engine → Data Processing → Iara Data Warehouse
  1. Connection stores encrypted credentials (OAuth tokens or passwords)
  2. Job defines what to extract (query, spreadsheet, folder) and the schedule
  3. Execution engine routes to the appropriate connector
  4. Data is always normalized before being imported into your data warehouse
  5. The platform handles schema inference, table creation, and data loading
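
The routing and normalization steps in this pipeline can be pictured as a registry that maps a connector type to an extractor function. The sketch below is purely illustrative: the registry, the `inline` connector, and the column-name normalization rule are hypothetical and are not taken from the platform's implementation.

```python
from typing import Callable, Dict, Iterable, List

Extractor = Callable[[dict], Iterable[dict]]
CONNECTORS: Dict[str, Extractor] = {}


def register(connector_type: str) -> Callable[[Extractor], Extractor]:
    """Decorator that adds an extractor to the registry under its type name."""
    def decorator(func: Extractor) -> Extractor:
        CONNECTORS[connector_type] = func
        return func
    return decorator


@register("inline")
def extract_inline(job: dict) -> Iterable[dict]:
    """Toy connector that returns rows embedded in the job definition."""
    return job["rows"]


def normalize(row: dict) -> dict:
    """Normalize column names: trimmed, lowercase, spaces replaced by underscores."""
    return {str(k).strip().lower().replace(" ", "_"): v for k, v in row.items()}


def run_job(connector_type: str, job: dict) -> List[dict]:
    """Route the job to its connector, then normalize every extracted row."""
    extractor = CONNECTORS.get(connector_type)
    if extractor is None:
        raise ValueError(f"Unknown connector type: {connector_type}")
    return [normalize(row) for row in extractor(job)]
```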

Pricing Tiers

Tier | Connectors Included
Starter | PostgreSQL, MySQL, SQL Server, Microsoft Excel, File Upload, Inbound API, AI API Client, Shopify, Stripe, MongoDB, FTP/SFTP
Growth | + HubSpot, TOTVS Protheus, S3/GCS, Salesforce, Notion, Slack
Business | + BigQuery, Snowflake, SAP, Kafka