Recipe

Plagiarism similarity check

Compare two bodies of text and receive a similarity breakdown: an overall similarity score, the matched spans with their positions in the source and suspect texts, and a verdict.

How it works

Meridian tokenizes both inputs, builds n-gram shingles, and computes a weighted Jaccard similarity score. Matched spans are aligned via longest-common-subsequence backtracking so you can see exactly which passages overlap.
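
As a rough illustration of the steps above, here is a minimal Python sketch. It is not Meridian's implementation: the word-level tokenizer, the count-based weighting (each shingle weighted by how often it occurs), and the function names `tokenize`, `shingles`, `weighted_jaccard`, and `lcs_alignment` are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase word tokens. The real tokenizer is not documented here.
    return re.findall(r"\w+", text.lower())


def shingles(tokens, n=5):
    # Multiset of contiguous n-token windows (n-gram shingles).
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def weighted_jaccard(a, b):
    # Assumed weighting: sum of per-shingle minimum counts
    # divided by the sum of per-shingle maximum counts.
    keys = a.keys() | b.keys()
    if not keys:
        return 0.0
    overlap = sum(min(a[k], b[k]) for k in keys)
    total = sum(max(a[k], b[k]) for k in keys)
    return overlap / total


def lcs_alignment(x, y):
    # Fill the classic longest-common-subsequence table.
    m, n = len(x), len(y)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if x[i] == y[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])

    # Backtrack from the bottom-right corner to recover aligned index pairs.
    pairs = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]


source_tokens = tokenize("The quick brown fox jumps over the lazy dog")
suspect_tokens = tokenize("A quick brown fox jumps over the sleepy dog")
score = weighted_jaccard(shingles(source_tokens, 3), shingles(suspect_tokens, 3))
aligned = lcs_alignment(source_tokens, suspect_tokens)
```

Here `aligned` holds pairs of token indices (source index, suspect index) that line up. Consecutive pairs can be merged into runs to approximate matched spans.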

Request

POST /v1/recipe/plagiarism-check
Content-Type: application/json

{
  "source": "string (original text)",
  "suspect": "string (text to compare)",
  "threshold": 0.6
}
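
A request like the one above could be built and sent from Python roughly as follows. This sketch uses only the standard library. The base URL `https://api.example.com` is a placeholder, the helper names `build_request` and `check` are hypothetical, and authentication is omitted because it is not described in this recipe.

```python
import json
import urllib.request

# Placeholder host (assumption). Replace with your actual Meridian API base URL.
BASE_URL = "https://api.example.com"


def build_request(source, suspect, threshold=0.6, base_url=BASE_URL):
    """Build, but do not send, the POST request for the plagiarism check."""
    body = json.dumps(
        {"source": source, "suspect": suspect, "threshold": threshold}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/v1/recipe/plagiarism-check",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check(source, suspect, threshold=0.6):
    """Send the request and return the decoded JSON response as a dict."""
    req = build_request(source, suspect, threshold)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes it possible to inspect or log the payload without making a network call.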

Response

{
  "similarity": 0.74,
  "matched_spans": [
    {
      "source_range": [12, 47],
      "suspect_range": [8, 43],
      "text": "matched substring..."
    }
  ],
  "verdict": "likely_plagiarized"
}
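
A response of this shape could be unpacked in Python as sketched below. The `MatchedSpan` class and `parse_response` helper are hypothetical names. The sketch also assumes each range is a `[start, end]` pair of offsets into the corresponding text; whether the end offset is inclusive is not stated here, so the width calculation is only indicative.

```python
import json
from dataclasses import dataclass


@dataclass
class MatchedSpan:
    source_range: tuple
    suspect_range: tuple
    text: str

    def source_width(self):
        # Assumption: ranges are [start, end] offsets, so width is end - start.
        start, end = self.source_range
        return end - start


def parse_response(raw):
    """Return (similarity, list of MatchedSpan, verdict) from a JSON string."""
    data = json.loads(raw)
    spans = [
        MatchedSpan(tuple(s["source_range"]), tuple(s["suspect_range"]), s["text"])
        for s in data["matched_spans"]
    ]
    return data["similarity"], spans, data["verdict"]


sample = """
{
  "similarity": 0.74,
  "matched_spans": [
    {"source_range": [12, 47], "suspect_range": [8, 43], "text": "matched substring..."}
  ],
  "verdict": "likely_plagiarized"
}
"""
similarity, spans, verdict = parse_response(sample)
```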

Parameters

  • threshold: Minimum similarity to flag (0.0–1.0, default 0.6).
  • ngram_size: Shingle width in tokens (default 5).
  • min_match_len: Minimum characters for a span to be reported (default 20).
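
The parameters above could be validated client-side before a request is sent, as in the sketch below. The `build_payload` helper is hypothetical. It assumes that `ngram_size` and `min_match_len` are passed in the same JSON body as `threshold`, which the request example in this recipe does not confirm. The lower bounds on `ngram_size` and `min_match_len` are likewise assumptions.

```python
def build_payload(source, suspect, threshold=0.6, ngram_size=5, min_match_len=20):
    """Assemble a request body using the documented defaults."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if ngram_size < 1:
        # Assumed lower bound: a shingle needs at least one token.
        raise ValueError("ngram_size must be at least 1")
    if min_match_len < 0:
        # Assumed lower bound: a length cannot be negative.
        raise ValueError("min_match_len must not be negative")
    return {
        "source": source,
        "suspect": suspect,
        "threshold": threshold,
        # Assumption: these two are accepted in the body alongside threshold.
        "ngram_size": ngram_size,
        "min_match_len": min_match_len,
    }


payload = build_payload("original text", "text to compare", threshold=0.8)
```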