What is Data Capture and how do I create an extract job?

What is Data Capture?

Data Capture is a feature that logs interactions with le Chat and our various APIs on our servers and lets you fetch that data.

📌 This logged data can serve various purposes, such as analyzing usage patterns, debugging issues, or creating datasets for fine-tuning machine learning models.

Creating an extract job

Creating an extract job is quite straightforward:

1. Navigate to Data Capture

Access the Data Capture section from the main navigation menu on the left side of the platform interface.

Clicking Data Capture in the left-hand navigation menu

2. Initiate a New Job

On the main Data Capture view, locate and click the New Extract Job button.

Clicking the New Extract Job button on the Data Capture page

3. Configure the Extract Job

A configuration modal window appears. Specify the parameters for your data extraction:

  1. Choose a Data Source: Select either API or le Chat depending on the interactions you want to extract (1).

  2. Select a Date Range: Specify the Start Date and End Date for the data extraction (2).

  3. (Optional) Choose a Model: Select a specific model from the dropdown list (e.g., a base Mistral model or a fine-tuned model you own) to restrict the extraction to that model (3). If you do not select one, all models are included by default.

  4. Click Create Job to submit your configuration.

🔑 The date range of a single extract job cannot exceed 31 days.

Configuring the extract job: (1) Data Source, (2) Date Range, (3) Optional Model Selection, before clicking the Create Job button
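
If the period you want to extract is longer than 31 days, one option is to create several extract jobs with consecutive date ranges. The Python sketch below computes such ranges. It assumes that the 31-day limit applies to the span from Start Date to End Date and that both dates count toward it; the function name and the sample dates are illustrative only and are not part of the platform.

```python
from datetime import date, timedelta

# Assumption: the limit applies to the Start Date..End Date span, counted inclusively.
MAX_DAYS = 31


def split_date_range(start: date, end: date, max_days: int = MAX_DAYS):
    """Split [start, end] into consecutive windows of at most max_days days each."""
    if end < start:
        raise ValueError("end date must not be before start date")
    windows = []
    window_start = start
    while window_start <= end:
        # A window of max_days days ends max_days - 1 days after it starts.
        window_end = min(window_start + timedelta(days=max_days - 1), end)
        windows.append((window_start, window_end))
        window_start = window_end + timedelta(days=1)
    return windows


# Example: one Start Date / End Date pair per extract job to create.
for window_start, window_end in split_date_range(date(2025, 1, 1), date(2025, 3, 15)):
    print(window_start.isoformat(), "to", window_end.isoformat())
```

Each printed pair can be entered as the Start Date and End Date of a separate extract job. If the limit turns out to be counted differently, adjust `MAX_DAYS` accordingly.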

4. Monitor Job Completion and Access Details

After you create the job, it appears in the list on the Data Capture page with a status (e.g., Pending, Running, Completed, Failed).

Wait for the job's lifecycle status to become Completed. Once completed, click on the job's ID in the list.

Clicking on a completed job's ID in the extract job list

5. Review Job Details and Download Output

Clicking the job ID opens the specific Data Capture Extract Job details page. This page displays:

  • A summary of the job configuration (status, data source, date range, model used).

  • Lifecycle timestamps (creation time, completion time).

  • One or more links to download the output file(s) in .jsonl format. These files contain the extracted log data based on your configuration.

Capture overview and Output files to download

You can now download the generated files for your analysis or fine-tuning workflows.
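
As a starting point for analysis, a downloaded .jsonl file can be read one line at a time, since each line holds one JSON value. The Python sketch below makes no assumption about the field names in the log files: it only parses each line and counts which top-level keys occur. The helper names and the file name in the usage note are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON value per non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                yield json.loads(line)
            except json.JSONDecodeError as error:
                raise ValueError(f"{path}: invalid JSON on line {line_number}") from error


def summarize_keys(records):
    """Count how often each top-level key appears across dict records."""
    counts = Counter()
    for record in records:
        if isinstance(record, dict):
            counts.update(record.keys())
    return counts
```

For example, `summarize_keys(read_jsonl("extract_output.jsonl"))` returns a counter of the top-level keys found in a file saved under that (hypothetical) name, which gives a quick view of the file's structure before further processing.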

🔎 For details on the structure of the downloaded JSON Lines files, please refer to the article: What is the format of the log files you provide?
