A. Sentence Matching Scores:
A sentence matching score is the percentage probability
that two sentences have the same meaning. This number
can also be interpreted as the complement of the probability
that the two sentences are similar by chance. For
example, a score of 90% means that there is a 90% probability
that the two sentences have the same meaning, and
about a 10% probability that they are similar by chance
(not because of plagiarism).
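The arithmetic in the example above can be sketched in a few lines. This is only an illustration of how the two percentages relate (they sum to 100%); the function name `chance_probability` is hypothetical and is not part of SafeAssignment.

```python
def chance_probability(match_score_percent: float) -> float:
    """Return the percentage probability that a match is coincidental.

    Assumes the score and the chance probability sum to 100%,
    as in the 90% / 10% example in the text.
    """
    if not 0.0 <= match_score_percent <= 100.0:
        raise ValueError("score must be between 0 and 100")
    return 100.0 - match_score_percent


print(chance_probability(90.0))  # prints 10.0
```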
B. Overall Matching Score:
The overall matching score is essentially an average of
all sentence scores, weighted by a) the length of
the sentence; b) the "commonness" of the
sentence (calculated from the average typical
usage frequency of the words in the sentence).
This score does not have a simple statistical definition,
but it is highly correlated with a) the probability
that the paper contains some text matching other
documents; b) the amount of matching text in the paper.
In general, this score should be treated as a warning
indicator. We strongly recommend reviewing all reports
with high Overall Matching Scores. For analysis of
matching scores, the following interpretation scale
should be used:
1. Scores below 15% - papers with such scores
usually contain some quotes and a few "typical"
phrases that match other documents. In most cases,
they do not require any further analysis, and the
reports show no evidence of plagiarism.
2. Scores between 15% and 40% -
papers with such scores can either contain plagiarism
or have a significant amount of quoted material. We
usually recommend reviewing the reports with such
scores before making any judgments about the papers.
3. Scores over 40% - papers with
such scores usually contain some text copied from
elsewhere, and, even if this text is properly cited,
such an amount of cited material is considered excessive
in most cases. Therefore, such scores give a clear
warning to instructors. However, there are a few cases
in which such scores can be given to authentic papers,
for example, when the paper was legitimately published
online before it was sent for processing (instructors
just have to "Delete" the source pointing
to the legitimate copy), or when the same student
has already submitted this paper or a similar paper
to another class.
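The weighted average and the interpretation scale above can be sketched as follows. This is a minimal illustration, not SafeAssignment's actual formula, which the text says has no simple statistical definition. The `SentenceMatch` type, the word-count length unit, the 0 to 1 commonness scale, and the choice to give common sentences less weight are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SentenceMatch:
    score: float       # sentence matching score, 0-100
    length: int        # sentence length in words (assumed unit)
    commonness: float  # 0-1, higher means more typical wording (assumed scale)


def overall_score(sentences: Sequence[SentenceMatch]) -> float:
    """Weighted average of sentence scores.

    Assumed weighting: longer sentences count more,
    sentences made of common words count less.
    """
    weights = [s.length * (1.0 - s.commonness) for s in sentences]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * s.score for w, s in zip(weights, sentences)) / total


def interpret(score: float) -> str:
    """Map an overall score to the three bands described in the text."""
    if score < 15:
        return "below 15%: usually no further analysis needed"
    if score <= 40:
        return "15% to 40%: review the report before judging"
    return "over 40%: clear warning, review the report"


paper = [
    SentenceMatch(score=100.0, length=20, commonness=0.5),
    SentenceMatch(score=0.0, length=10, commonness=0.0),
]
print(overall_score(paper))             # prints 50.0
print(interpret(overall_score(paper)))
```

Both sentences end up with the same weight here (20 × 0.5 and 10 × 1.0), so the overall score is the plain mean of 100 and 0. Whatever band the score falls into, it remains a warning indicator rather than a verdict.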
NOTE: SafeAssignment does NOT make
any verdicts about plagiarism; it only identifies
matches between blocks of text. Always keep in mind
that not all marked sentences are plagiarized, and
that sometimes there can be legitimate reasons for
high matching scores. Also note that SafeAssignment
ignores quotation marks and highlights all material
in quotation marks as well. This is intentional
behavior meant to help instructors verify the validity
of citations. For example, if a student paper includes
three or four quotations in a row, and this block
of quotations is matched to a Web page or a research
paper containing the same quotations used in the same
order, most probably the student used that other source
as a research surrogate, and therefore the material
is not used legitimately despite its citation.
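The same-order check described in this example can be sketched as a small helper. This only illustrates the manual reasoning an instructor might apply; it is not how SafeAssignment matches text, and the function name `quotes_in_same_order` is hypothetical.

```python
from typing import Sequence


def quotes_in_same_order(quotes: Sequence[str], source_text: str) -> bool:
    """Return True if every quote occurs in source_text, in the given order."""
    position = 0
    for quote in quotes:
        found = source_text.find(quote, position)
        if found == -1:
            return False
        # Continue searching after the end of this quote.
        position = found + len(quote)
    return True


source = "Intro. alpha beta. Middle. gamma delta. End. epsilon zeta."
paper_quotes = ["alpha beta", "gamma delta", "epsilon zeta"]
print(quotes_in_same_order(paper_quotes, source))  # prints True
```

A True result for three or four consecutive quotations is the pattern the note describes; it suggests, but does not prove, that the matched source was used as a research surrogate.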