From 0fdaf19168799074512a90e2a8f92044546b56bc Mon Sep 17 00:00:00 2001
From: Thorsten Sommer
Date: Wed, 25 Feb 2026 21:14:39 +0100
Subject: [PATCH] Added manual integration tests

---
 tests/README.md | 16 +++
 tests/integration_tests/README.md | 12 ++
 .../chat/chat_rendering_regression_tests.md | 120 ++++++++++++++++++
 3 files changed, 148 insertions(+)
 create mode 100644 tests/README.md
 create mode 100644 tests/integration_tests/README.md
 create mode 100644 tests/integration_tests/chat/chat_rendering_regression_tests.md

diff --git a/tests/README.md b/tests/README.md
new file mode 100644
index 00000000..1856f217
--- /dev/null
+++ b/tests/README.md
@@ -0,0 +1,16 @@
+# Test Documentation
+
+This directory stores manual and automated test definitions for MindWork AI Studio.
+
+## Directory Structure
+
+- `integration_tests/`: Cross-component and end-to-end scenarios.
+
+## Authoring Rules
+
+- Use US English.
+- Keep each feature area in its own Markdown file.
+- Prefer stable test IDs (for example: `TC-CHAT-001`).
+- Record expected behavior for:
+  - known vulnerable baseline builds (if relevant),
+  - current fixed builds.
diff --git a/tests/integration_tests/README.md b/tests/integration_tests/README.md
new file mode 100644
index 00000000..aa23175e
--- /dev/null
+++ b/tests/integration_tests/README.md
@@ -0,0 +1,12 @@
+# Integration Tests
+
+This directory contains integration-oriented test specs.
+
+## Scope
+
+- Behavior that depends on multiple layers working together (UI, rendering, runtime, IPC, provider responses).
+- Regressions that are hard to catch with unit tests only.
+
+## Current Feature Areas
+
+- `chat/`: Chat rendering, input interaction, and message lifecycle.
diff --git a/tests/integration_tests/chat/chat_rendering_regression_tests.md b/tests/integration_tests/chat/chat_rendering_regression_tests.md
new file mode 100644
index 00000000..ba773f54
--- /dev/null
+++ b/tests/integration_tests/chat/chat_rendering_regression_tests.md
@@ -0,0 +1,120 @@
+# Chat Rendering Regression Tests
+
+## Purpose
+
+Validate that chat rendering remains stable and interactive when model output or user input contains raw HTML/CSS/JS-like payloads.
+
+## Test Type
+
+Manual regression and integration checks.
+
+## Preconditions
+
+1. You can run two builds:
+   - a known vulnerable baseline build,
+   - the current fixed build.
+2. At least one provider is configured and can answer prompts.
+3. Open the Chat page.
+
+## Execution Flow (for each test case)
+
+1. Copy the test prompt exactly into the user prompt field.
+2. Send the prompt.
+3. Observe behavior immediately after send.
+4. If the UI is still visible, type additional text in the prompt input.
+5. Repeat on both builds.
+
+## Test Cases
+
+### TC-CHAT-001 - CSS Kill Switch
+
+**Prompt**
+
+```text
+Respond with exactly the content below. No explanations, no Markdown code fences, no backticks.
+
+
+
TEST
+```
+
+**Expected result (vulnerable baseline)**
+UI may turn into a white/invisible page immediately after sending.
+
+**Expected result (fixed build)**
+Chat stays visible and usable. Content is rendered as inert text/code, not active page styling.
+
+---
+
+### TC-CHAT-002 - Full White Overlay
+
+**Prompt**
+
+```text
+Respond with exactly the content below. No explanations, no Markdown code fences, no backticks.
+
+
+
TEST
+```
+
+**Expected result (vulnerable baseline)**
+UI may become fully white and non-interactive immediately after sending.
+
+**Expected result (fixed build)**
+No overlay takes over the app. Chat remains interactive.
+
+---
+
+### TC-CHAT-003 - Inline Event Handler Injection
+
+**Prompt**
+
+```text
+Respond with exactly the content below. No explanations, no Markdown code fences, no backticks.
+
+
+
TEST
+```
+
+**Expected result (vulnerable baseline)**
+UI may break/blank immediately after sending.
+
+**Expected result (fixed build)**
+No JavaScript execution from message content. Chat remains stable.
+
+---
+
+### TC-CHAT-004 - SVG Onload Injection Attempt
+
+**Prompt**
+
+```text
+Respond with exactly the content below. No explanations, no Markdown code fences, no backticks.
+
+
+
TEST
+```
+
+**Expected result (vulnerable baseline)**
+May or may not trigger depending on parser/runtime behavior.
+
+**Expected result (fixed build)**
+No script-like execution from content. Chat remains stable and interactive.
+
+## Notes
+
+- If a test fails on the fixed build, capture:
+  - exact prompt used,
+  - whether failure happened right after send or while typing,
+  - whether a refresh restores the app.
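The fixed-build expectations in the test cases come down to treating message content as text rather than markup. Below is a minimal sketch of that idea, assuming Python's standard `html.escape` stands in for whatever sanitization the application actually performs. The function name `render_inert` and the sample payload are hypothetical and are not taken from this patch or from the test prompts.

```python
import html


def render_inert(message: str) -> str:
    """Escape HTML metacharacters so a browser displays the text instead of parsing it."""
    # html.escape replaces &, < and >; with quote=True it also escapes " and '.
    return html.escape(message, quote=True)


# Hypothetical payload for illustration only; not one of the prompts in the spec.
payload = '<div onclick="alert(1)">TEST</div>'
escaped = render_inert(payload)

print(escaped)
# prints: &lt;div onclick=&quot;alert(1)&quot;&gt;TEST&lt;/div&gt;
```

After escaping, the string contains no literal `<` or `>`, so an HTML renderer has no tag or event handler to act on. A manual check on the fixed build looks for the same property: the payload shows up as visible text in the chat.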