LLM Classifier

The LLM Classifier in Stalwart enhances the spam filtering capabilities by utilizing AI models to detect threats, unsolicited emails, and commercial messages. By integrating with any of the configured AI models, the LLM Classifier improves the accuracy of spam detection, analyzing the content of incoming messages through advanced natural language processing techniques. This approach allows the spam filter to better understand the nature of the emails and identify potential risks.

The LLM Classifier operates by sending a customizable prompt to the AI model along with the subject and body of the email. The default prompt instructs the model to place the email in one of four categories: Unsolicited, Commercial, Harmful, or Legitimate. This classification tells the system whether the message is unwanted, promotional, malicious, or safe for delivery. The model also assigns a confidence level (High, Medium, or Low) to each classification, which lets the spam filter weigh how certain the classification is when deciding how to handle the message.

Once the AI model returns its classification and confidence levels, the spam filter converts this information into tags. Each tag is associated with a customizable score, which influences the overall spam score of the email. For example, if the AI model identifies an email as Unsolicited with a high confidence level, the spam filter assigns the tag LLM_UNSOLICITED_HIGH, which has a default score of 3.0. This indicates a strong likelihood that the email is unsolicited and increases its spam score. On the other hand, if the message is categorized as Legitimate with high confidence, the filter assigns the tag LLM_LEGITIMATE_HIGH, which has a default score of -3.0. This lowers the spam score and signals that the message is likely legitimate and should be delivered safely. These tags and their corresponding scores can be fully customized to align with the specific needs and policies of the organization.
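As a rough illustration of how tag scores feed into the overall spam score, here is a minimal Python sketch. It is not Stalwart's implementation. The two scores are the defaults quoted above. The function name `apply_tag_scores` and the choice to treat unlisted tags as 0.0 are assumptions made only for this example.

```python
# Default scores quoted in the text above. Other LLM_* tags are not listed
# there, so this sketch treats any unknown tag as 0.0 (a placeholder, not
# necessarily Stalwart's actual default).
TAG_SCORES = {
    "LLM_UNSOLICITED_HIGH": 3.0,
    "LLM_LEGITIMATE_HIGH": -3.0,
}


def apply_tag_scores(base_score: float, tags: list[str]) -> float:
    """Add the score of every assigned tag to the message's spam score."""
    return base_score + sum(TAG_SCORES.get(tag, 0.0) for tag in tags)


# A message that already scored 1.0 from other rules:
print(apply_tag_scores(1.0, ["LLM_UNSOLICITED_HIGH"]))
print(apply_tag_scores(1.0, ["LLM_LEGITIMATE_HIGH"]))
```

A positive tag score pushes the message toward the spam threshold, while a negative one pulls it away.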

Enterprise feature

This feature is available exclusively in the Enterprise Edition of Stalwart and is not included in the Community Edition.

Caveats

While integrating LLMs into Stalwart's spam filtering system offers powerful capabilities for detecting unsolicited and harmful emails, it is important to consider the performance and cost implications, especially when processing a high volume of incoming messages.

Performance Considerations

When self-hosting AI models, the performance of the system depends heavily on the hardware used. Without access to powerful hardware, especially GPUs (Graphics Processing Units) designed for machine learning tasks, running LLMs can introduce significant delays in processing each email. Passing the subject and body of every incoming message through the AI model requires substantial computational resources, and CPU-only processing can slow email intake considerably. This latency may become problematic in high-traffic environments, where prompt email delivery is critical. Organizations opting for self-hosted models should therefore ensure they have sufficient hardware, preferably with GPU acceleration, to avoid bottlenecks in their email processing workflow.
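A back-of-the-envelope way to size a self-hosted deployment is to multiply the message arrival rate by the time one classification takes. The Python sketch below does only that. The function name and the example figures are hypothetical. It assumes a steady arrival rate and one message per model worker at a time.

```python
import math


def required_concurrency(messages_per_second: float,
                         seconds_per_classification: float) -> int:
    """Estimate how many classifications must run in parallel to keep up.

    Assumes a steady arrival rate and that each model worker handles one
    message at a time; real traffic is bursty, so treat this as a floor.
    """
    return math.ceil(messages_per_second * seconds_per_classification)


# Hypothetical figures: 5 messages per second, 2 seconds per classification.
print(required_concurrency(5, 2.0))
```

If the hardware cannot sustain the resulting level of parallelism, messages queue up and delivery latency grows.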

Cost Implications

For users integrating cloud-based AI models, such as those from OpenAI or Anthropic, the primary concern shifts from performance to cost. These providers typically charge based on the number of tokens processed per request. Since each email is passed to the AI model for analysis, each request consumes tokens depending on the length of the email's subject and body, along with the response generated by the model. As the volume of incoming emails increases, the number of tokens processed can scale rapidly, leading to significant costs over time. Organizations should carefully consider these factors and evaluate their budget when relying on cloud-based LLM providers for spam filtering, especially in high-traffic email environments.
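To get a feel for how token-based pricing scales with volume, the following Python sketch estimates a monthly bill. Every number and parameter name is a placeholder chosen for illustration, not any provider's real pricing. Input tokens cover the prompt plus the email's subject and body, and output tokens cover the model's response.

```python
def estimate_monthly_cost(emails_per_day: int,
                          avg_input_tokens: int,
                          avg_output_tokens: int,
                          input_price_per_million: float,
                          output_price_per_million: float,
                          days: int = 30) -> float:
    """Estimate the monthly cost of classifying every incoming email.

    Prices are expressed per one million tokens; substitute the actual
    rates published by your provider.
    """
    total_emails = emails_per_day * days
    input_cost = total_emails * avg_input_tokens / 1_000_000 * input_price_per_million
    output_cost = total_emails * avg_output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


# Illustrative placeholder values only:
cost = estimate_monthly_cost(
    emails_per_day=10_000,
    avg_input_tokens=1_000,   # prompt + subject + body
    avg_output_tokens=50,     # category, confidence, short explanation
    input_price_per_million=1.0,
    output_price_per_million=2.0,
)
print(f"{cost:.2f}")
```

Because the classifier prompt is sent with every message, shortening it reduces the input-token count of every request.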

In both cases, whether using self-hosted or cloud-based AI models, there are trade-offs between performance, cost, and the quality of spam detection. While LLMs provide enhanced detection capabilities, the solution must scale within both performance and budget limits to remain sustainable over the long term.

Configuration

By default, the LLM Classifier is disabled in Stalwart. To enable it, set the spam-filter.llm.enable option to true in the configuration file. This activates the LLM Classifier, allowing it to process incoming emails and assign tags based on the AI model's classification.

Model

The LLM Classifier configuration requires specifying the AI model to be used for processing email content. This is done by setting the spam-filter.llm.model option in the configuration file. The value of this option should correspond to the ID of the AI model defined in the enterprise.ai.<id> section of the configuration file.

Example:

[spam-filter.llm]
model = "chat"

Prompt

The LLM Classifier's prompt is fully configurable, allowing administrators to adjust the instructions given to the AI model based on the requirements of their email environment. Whether the focus is on general spam detection or more specialized content filtering, administrators can fine-tune the model's behavior. The prompt is set through the spam-filter.llm.prompt option in the configuration file, for example:

[spam-filter.llm]
prompt = """You are an AI assistant specialized in analyzing email content to detect unsolicited, commercial, or harmful messages. Your task is to examine the provided email, including its subject line, and determine if it falls into any of these categories. Please follow these steps:

- Carefully read the entire email content, including the subject line.
- Look for indicators of unsolicited messages, such as:
   * Lack of prior relationship or consent
   * Mass-mailing characteristics
   * Vague or misleading sender information
- Identify commercial content by checking for:
   * Promotional language
   * Product or service offerings
   * Call-to-action for purchases
- Detect potentially harmful content by searching for:
   * Phishing attempts (requests for personal information, suspicious links)
   * Malware indicators (suspicious attachments, urgent calls to action)
   * Scams or fraudulent schemes
- Analyze the overall tone, intent, and legitimacy of the email.
- Determine the most appropriate single category for the email: Unsolicited, Commercial, Harmful, or Legitimate.
- Assess your confidence level in this determination: High, Medium, or Low.
- Provide a brief explanation for your determination.
- Format your response as follows, separated by commas: Category,Confidence,Explanation
  * Example: Unsolicited,High,The email contains mass-mailing characteristics without any prior relationship context.

Here's the email to analyze, please provide your analysis based on the above instructions, ensuring your response is in the specified comma-separated format:"""

Temperature

The LLM Classifier allows you to adjust the temperature parameter used during the sampling process when generating responses from the AI model. The temperature parameter controls the randomness of the responses, with lower values producing more deterministic outputs and higher values introducing more randomness. By default, the temperature is set to 0.5, which provides a balance between generating diverse responses and maintaining coherence. To customize the temperature, set the spam-filter.llm.temperature option in the configuration file.

Example:

[spam-filter.llm]
temperature = "0.5"
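For readers unfamiliar with the parameter, the Python sketch below shows the usual way temperature reshapes a model's token probabilities: the raw scores (logits) are divided by the temperature before the softmax is applied. The logits are made-up numbers. This illustrates the general technique, not the internals of any particular model or of Stalwart.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Model APIs often treat 0 as "always pick the top token";
        # this sketch simply rejects it to avoid dividing by zero.
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
for t in (0.2, 0.5, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token. For a classifier that must return one of a fixed set of labels, this generally means more repeatable answers.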

Responses

Response parsing extracts the category, confidence level, and explanation from the AI model's output. The LLM Classifier needs this information to tag the email. The returned category and confidence level are combined to form the tag: as the example tags show, both values appear in uppercase, joined by an underscore (_) and prefixed with LLM_. For example, if the AI model classifies an email as Unsolicited with a high confidence level, the tag assigned to the email will be LLM_UNSOLICITED_HIGH.

The following options are available for response parsing:

spam-filter.llm.separator: The character used to separate the category, confidence level, and explanation in the AI model's response. The default value is a comma (,).
spam-filter.llm.index.category: The index of the category in the response string. The default value is 0.
spam-filter.llm.index.confidence: The index of the confidence level in the response string. The default value is 1.
spam-filter.llm.index.explanation: The index of the explanation in the response string. The default value is 2.
spam-filter.llm.categories: A list of categories that the AI model can assign to the email. A response with a category not in this list will be ignored. The default value is ["Unsolicited", "Commercial", "Harmful", "Legitimate"].
spam-filter.llm.confidence: A list of confidence levels that the AI model can assign to the email. A response with a confidence level not in this list will be ignored. The default value is ["High", "Medium", "Low"].

Example:

[spam-filter.llm]
separator = ","
categories = ["Unsolicited", "Commercial", "Harmful", "Legitimate"]
confidence = ["High", "Medium", "Low"]

[spam-filter.llm.index]
category = 0
confidence = 1
explanation = 2
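The parsing options above can be pictured with the following Python sketch, which mirrors the default values. It is an approximation for illustration, not Stalwart's code. The function name `response_to_tag`, the whitespace trimming, and the exact, case-sensitive matching against the allowed lists are assumptions.

```python
from typing import Optional

# Values mirroring the defaults documented above.
SEPARATOR = ","
INDEX_CATEGORY = 0
INDEX_CONFIDENCE = 1
CATEGORIES = ["Unsolicited", "Commercial", "Harmful", "Legitimate"]
CONFIDENCE = ["High", "Medium", "Low"]


def response_to_tag(response: str) -> Optional[str]:
    """Return a tag such as LLM_UNSOLICITED_HIGH, or None if the response is ignored."""
    parts = [part.strip() for part in response.split(SEPARATOR)]
    if len(parts) <= max(INDEX_CATEGORY, INDEX_CONFIDENCE):
        return None  # response too short to contain both fields
    category = parts[INDEX_CATEGORY]
    confidence = parts[INDEX_CONFIDENCE]
    # Assumption: values must match the configured lists exactly.
    if category not in CATEGORIES or confidence not in CONFIDENCE:
        return None
    return f"LLM_{category.upper()}_{confidence.upper()}"


print(response_to_tag("Unsolicited,High,The email contains mass-mailing characteristics."))
print(response_to_tag("Spam,High,Not one of the configured categories."))
```

Note that an explanation containing the separator character is split into several fragments. This sketch only reads the category and confidence fields, so those extra fragments are harmless here.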

Headers

The LLM Classifier allows you to add the X-Spam-LLM header to the email, which contains the classification and confidence level determined by the AI model. This header can be used for further processing or analysis of the email content. To enable this feature, set the spam-filter.header.llm.enable option to true in the configuration file. The name of the header can be customized using the spam-filter.header.llm.name option.

Example:

[spam-filter.header.llm]
enable = true
name = "X-Spam-LLM"
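For downstream processing, the header can be read with any standard mail library. The Python sketch below uses the standard library's email package to fetch the raw header value. The helper name `llm_header` and the sample header value are hypothetical, and no assumption is made about the exact format Stalwart writes into the header.

```python
from email import message_from_string
from typing import Optional


def llm_header(raw_message: str, name: str = "X-Spam-LLM") -> Optional[str]:
    """Return the value of the LLM classification header, or None if absent."""
    msg = message_from_string(raw_message)
    value = msg.get(name)  # header lookup is case-insensitive
    return value.strip() if value is not None else None


# "example-value" is a stand-in; inspect a real message to see the actual format.
raw = "X-Spam-LLM: example-value\r\nSubject: hello\r\n\r\nMessage body\r\n"
print(llm_header(raw))
```

If the header name is changed through spam-filter.header.llm.name, pass the same name as the second argument.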
