Comparing Open Source Search Engine Functionality, Efficiency and Effectiveness with Respect to Digital Forensic Search

  • Joachim Hansen
  • Kyle Porter
  • Andrii Shalaginov
  • Katrin Franke

Abstract

Keyword search is one of the key components of cybercrime investigations. It
has a direct influence on the precision and relevance of the data found on seized data carriers.
However, many developers of digital forensics tools do not reveal the underlying
algorithms or source code of their search engines, which makes it challenging
to verify their accuracy and efficiency. Open-source search engines, on the other hand,
offer an alternative to proprietary keyword search tools: they have extensive
functionality and perform well on large-scale datasets. The goal of this paper is to
explore the applicability of such search engines to forensic search. The contribution of the
paper is twofold. First, a thorough literature review and comparison of the supported
functionality documented by open-source search engines and open-source digital forensic
tools was performed. In addition, a survey of existing publicly-available digital forensics
datasets was conducted. Second, out of reviewed search engines, Solr and Elasticsearch
were selected and compared by their functionality, efficiency in searching and indexing,
and effectiveness of search results with respect to digital forensic search using relevant
datasets. Our findings should assist members of the digital forensic community in choosing
an appropriate open-source search engine for keyword search in large-scale datasets.

Published
2018-10-09