TrustTest provides a flexible framework for evaluating any LLM target. The core of this flexibility lies in the Target base class, which you can inherit from to create your own custom model implementations.

Creating a Custom Target

To create your own custom target, inherit from the Target class and implement the required abstract methods. The base class provides the foundation for both synchronous and asynchronous operations.

Basic Implementation

Here’s a simple example of a custom target:
from typing import Optional

from trusttest.targets.base import Target

class DummyTarget(Target):
    """A simple dummy model that always returns the same response."""
    
    async def async_respond(self, message: str) -> Optional[str]:
        """Get a response for a single message.
        
        Args:
            message (str): The input message to get a response for.
            
        Returns:
            Optional[str]: The model's response.
        """
        return "This is a dummy response to: " + message
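
Before wiring a target into a scenario, it can help to call async_respond directly and inspect the result. The sketch below is self-contained so that it runs without TrustTest installed: the Target class in it is a hypothetical stand-in for trusttest.targets.base.Target, and it assumes nothing about the real base class beyond the async_respond signature shown above.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Optional


class Target(ABC):
    """Hypothetical stand-in for trusttest.targets.base.Target (assumption)."""

    @abstractmethod
    async def async_respond(self, message: str) -> Optional[str]:
        """Return the target's response to a single message."""


class DummyTarget(Target):
    """Same dummy target as above, built on the stand-in base class."""

    async def async_respond(self, message: str) -> Optional[str]:
        return "This is a dummy response to: " + message


# Drive the coroutine from synchronous code using the standard library.
reply = asyncio.run(DummyTarget().async_respond("Hello"))
print(reply)
```

With the real library, the import from trusttest.targets.base would replace the stand-in class and the rest of the check would stay the same.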

Using Your Custom Target

Once you’ve created your custom target, you can use it in any TrustTest scenario:
from trusttest import Scenario

# Create an instance of your custom model
target = DummyTarget()

# Create and run a scenario
scenario = Scenario(
    target=target,
    # Add your scenario configuration here
)
results = scenario.run()

Conversation Targets

For models that need to handle multi-turn conversations, TrustTest provides the ConversationTarget class. This class extends the base Target class and adds support for conversation history.

Creating a Conversation Target

Here’s an example of how to create a custom Conversation Target:
from typing import List, Optional

from trusttest.targets.base import ConversationTarget

class DummyConversationTarget(ConversationTarget):
    """A simple dummy model that handles conversation history."""
    
    async def async_respond_conversation(
        self, conversation: List[str], **kwargs
    ) -> Optional[str]:
        """Get a response for a conversation history.
        
        Args:
            conversation (List[str]): List of messages representing the conversation history.
            **kwargs: Additional keyword arguments to pass to the target.
            
        Returns:
            Optional[str]: The model's response to the conversation.
        """
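        # Example: Return a response that summarizes the conversation history.
        # Guard against an empty history so that indexing never raises IndexError.
        last_message = conversation[-1] if conversation else ""
        return f"Responding to conversation with {len(conversation)} messages. Last message: {last_message}"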
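
Many chat-style APIs expect role-tagged messages rather than a flat list of strings. The following is a minimal sketch of one possible conversion inside a conversation target. It assumes, which TrustTest does not state, that the history strictly alternates between user and assistant turns, starting with the user. The helper name to_chat_messages and the role/content dictionary shape are illustrative, not part of TrustTest.

```python
from typing import Dict, List


def to_chat_messages(conversation: List[str]) -> List[Dict[str, str]]:
    """Tag each message with a role, assuming strict user/assistant alternation."""
    roles = ("user", "assistant")  # assumption: the user always speaks first
    return [
        {"role": roles[index % 2], "content": message}
        for index, message in enumerate(conversation)
    ]


history = ["Hi", "Hello! How can I help?", "What is TrustTest?"]
messages = to_chat_messages(history)
```

If the real history carries explicit speaker information, that should be used instead of inferring roles from position.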

Using Conversation Targets

Conversation Targets can be used in the same way as regular targets, and they provide additional methods for handling conversation history:
from trusttest import Scenario

# Create an instance of your Conversation Target
target = DummyConversationTarget()

# Create and run a scenario with conversation history
scenario = Scenario(
    target=target,
    # Add your scenario configuration here
)
results = scenario.run()

The ConversationTarget class provides both synchronous and asynchronous methods for handling conversations:
  • respond_conversation(): Synchronous method for getting responses
  • async_respond_conversation(): Asynchronous method that must be implemented by subclasses

This makes it easy to evaluate models that need to maintain context across multiple turns of conversation, such as chatbots or dialogue systems.

The Target base class handles the necessary infrastructure, so you can focus on implementing the core model logic in the async_respond method, whether the target is a local model, an API-based service, or any other implementation. Your custom target must implement async_respond, which is the method responsible for generating responses to input messages. The base class handles the conversion between synchronous and asynchronous calls automatically.
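
One common way for a base class to offer a synchronous method on top of an asynchronous one is to run the coroutine with asyncio.run. The sketch below illustrates that general pattern only. SketchTarget, EchoTarget, and the respond method are hypothetical names, and TrustTest's actual implementation may differ.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Optional


class SketchTarget(ABC):
    """Hypothetical base class; not TrustTest's actual Target."""

    @abstractmethod
    async def async_respond(self, message: str) -> Optional[str]:
        """Subclasses implement only the asynchronous method."""

    def respond(self, message: str) -> Optional[str]:
        # asyncio.run starts an event loop, awaits the coroutine, and closes
        # the loop. It raises RuntimeError if a loop is already running in
        # this thread, so this simple wrapper suits plain synchronous callers.
        return asyncio.run(self.async_respond(message))


class EchoTarget(SketchTarget):
    async def async_respond(self, message: str) -> Optional[str]:
        await asyncio.sleep(0)  # placeholder for real asynchronous I/O
        return f"echo: {message}"


result = EchoTarget().respond("ping")
```

The design keeps a single source of truth: model logic lives in the asynchronous method, and the synchronous entry point is a thin wrapper around it.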