The TrustTest client provides a powerful interface for persisting and retrieving evaluation artifacts. It lets you save and load evaluation scenarios, test sets, evaluators, and evaluation results through a consistent API.

Client Implementations

TrustTest offers two client implementations:

NeuralTrustClient

The NeuralTrustClient connects to the NeuralTrust API service and provides the following methods:

from trusttest.clients import NeuralTrustClient

# Initialize client with your API token (configured in app settings)
client = NeuralTrustClient(token="your_api_token")

Configuration

The token is defined in your app settings. You can either:

  • Pass it directly when initializing the client
  • Set it as an environment variable NEURALTRUST_TOKEN
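
Both options look like this in practice (a minimal sketch; the no-argument constructor falling back to NEURALTRUST_TOKEN is an assumption based on the second option above):

import os

from trusttest.clients import NeuralTrustClient

# Option 1: pass the token directly.
client = NeuralTrustClient(token="your_api_token")

# Option 2: set NEURALTRUST_TOKEN in the environment (usually in your shell
# or deployment config rather than in code) and omit the argument.
os.environ["NEURALTRUST_TOKEN"] = "your_api_token"
client = NeuralTrustClient()  # assumed to fall back to NEURALTRUST_TOKEN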

Evaluation Scenarios

# Save an evaluation scenario
client.save_evaluation_scenario(evaluation_scenario)

# Load an evaluation scenario by ID
scenario = client.get_evaluation_scenario("scenario_id")

Test Sets

# Save a test set for a specific scenario
client.save_evaluation_scenario_test_set("scenario_id", test_set)

# Update an existing test set, or create it if it doesn't exist
client.upsert_evaluation_scenario_test_set("scenario_id", test_set)

# Load a test set for a specific scenario
test_set = client.get_evaluation_scenario_test_set("scenario_id")

Evaluation Results

# Save evaluation run results
client.save_evaluation_scenario_run(evaluation_run)

# Load evaluation run results for a scenario
results = client.get_evaluation_scenario_run("scenario_id")

Evaluators

# Save a custom evaluator with optional name and description
client.save_evaluator(evaluator, name="my_evaluator", description="Custom evaluator")

# Load an evaluator by name
evaluator = client.get_evaluator("my_evaluator")

FileSystemClient

The FileSystemClient stores evaluation artifacts locally as JSON files. It implements the same interface as NeuralTrustClient, so the two can be swapped without changing your code.
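
For example, the calls from the sections above work unchanged (a minimal sketch; the default constructor is used here, and any options for choosing the storage directory are omitted):

from trusttest.clients import FileSystemClient

# Same interface as NeuralTrustClient, backed by local JSON files.
client = FileSystemClient()

client.save_evaluation_scenario(evaluation_scenario)
scenario = client.get_evaluation_scenario("scenario_id")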

Key Capabilities

All TrustTest clients support these core operations:

  • Evaluation Scenarios: Save and retrieve evaluation scenario definitions
  • Test Sets: Manage test sets associated with evaluation scenarios
  • Evaluation Results: Persist and load evaluation run results
  • Evaluators: Store custom evaluator configurations

Example Workflow

Here’s how to use the client in a typical evaluation workflow:

from trusttest.clients import FileSystemClient
from trusttest.evaluation_scenarios import EvaluationScenario
from trusttest.probes import TestSet

# Initialize a client
client = FileSystemClient()

# Create and save an evaluation scenario
scenario = EvaluationScenario(name="My Test", description="Testing functionality")
client.save_evaluation_scenario(scenario)

# Save a test set for the scenario
test_set = TestSet(test_cases=[...])
client.save_evaluation_scenario_test_set(scenario.id, test_set)

# Later, retrieve the scenario and its test set
loaded_scenario = client.get_evaluation_scenario(scenario.id)
loaded_test_set = client.get_evaluation_scenario_test_set(scenario.id)

# After running an evaluation, save the results
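# (`evaluation_run` is the run object produced by your evaluation step,
# which is outside the scope of this snippet)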
client.save_evaluation_scenario_run(evaluation_run)

The client abstraction ensures your evaluation artifacts are consistently stored and retrieved regardless of the underlying storage mechanism you choose.
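
For instance, code written against the shared interface can switch storage backends with a one-line change (a minimal sketch; the helper persist_artifacts is illustrative, and only methods shown above are used):

from trusttest.clients import FileSystemClient, NeuralTrustClient

def persist_artifacts(client, scenario, test_set, evaluation_run):
    """Persist a scenario, its test set, and a run with any TrustTest client."""
    client.save_evaluation_scenario(scenario)
    client.save_evaluation_scenario_test_set(scenario.id, test_set)
    client.save_evaluation_scenario_run(evaluation_run)

# Local JSON files during development...
persist_artifacts(FileSystemClient(), scenario, test_set, evaluation_run)

# ...or the NeuralTrust API service in production.
persist_artifacts(NeuralTrustClient(token="your_api_token"), scenario, test_set, evaluation_run)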