Digital Investigation 18 (2016) A1-A3
Editorial
Focused digital evidence analysis and forensic distinguishers
Pressure is mounting to perform forensic analysis of digital evidence in a more focused manner, for several reasons. The need for timely results is a constant driving force. As digital evidence becomes a critical component in more cases, decision makers are demanding prompt responses to specific questions. Steady growth in data volume is also challenging forensic analysts to conceive of more efficient ways to perform their duties.
In addition, the U.S. National Commission on Forensic Science (NCFS) is promoting an approach to reducing cognitive bias by restricting forensic analysis to task-relevant information (see “Views Document on Ensuring that Forensic Analysis is Based Upon Task-Relevant Information,” https://www.justice.gov/ncfs/work-products-adopted-commission). The second annual report of the UK Government Chief Scientific Adviser also promotes the approach of forensic examiners evaluating “only contextually relevant information” (see “Cognitive And Human Factors” by Itiel Dror, Chapter 4, in http://www.forensic-access.co.uk/wp-content/uploads/2016/01/gs-15-37b-forensic-science-beyond-evidence.pdf). However, an open question is how to determine what is relevant versus irrelevant prior to performing forensic analysis of digital evidence.
Furthermore, some U.S. courts are debating how the particularity requirements in the Fourth Amendment to the U.S. Constitution should be applied to searches of digital evidence. In essence, the question is “Should digital forensic analysis be limited to specific types of data and areas of a device specified in a search warrant or affidavit?”
There is also an increasing demand to link similar or related activities using distinctive digital traces. Broadly sharing patterns that are indicative of suspicious activities such as malware, data hiding, and evidence destruction enables forensic analysts to reuse knowledge from past cases in order to find similar activities in new cases.
For forensic intelligence purposes, sharing between jurisdictions those highly selective traces that distinguish specific criminals or groups (a.k.a. “forensic distinguishers”) is becoming an international priority in order to find significant linkages between activities committed by the same organized criminal groups, terrorist networks, and cyber attackers.
http://dx.doi.org/10.1016/j.diin.2016.08.004 1742-2876/© 2016 Published by Elsevier Ltd.
Timeliness
Expedience is not the same as timeliness. It can be dangerous to produce results quickly at the expense of overlooking crucial inculpatory or exculpatory digital evidence. Incomplete results can allow offenders to continue offending, can leave victims vulnerable to continued victimization, and can cause an innocent person to be falsely accused.
Most proposed tactics to speed up processing of digital evidence are based on the assumption that relevant information will be found in locations where evidence has been found in past cases. Such tactics will overlook evidence in previously unknown locations and new locations, which is a very real problem with new technology and criminal behaviour. More fundamentally, such tactics do not address how to select files of potential value/relevance based on objective measures.
The clustering approach presented in this issue, “A Suspect-Oriented Intelligent and Automated Computer Forensic Analysis” by Fudong Li, is designed to focus on relevant information within a data source. Further work is needed to determine whether such an approach can be extended to additional data types, and to mitigate the risk of important information being overlooked.
The open question in this area is which characteristics are most effective for selecting relevant data. The focus should be on meaningful selection, rather than convenient and expedient selection of data. Until this open question can be resolved with a high level of assurance, these methods might be useful for triage with the caveat that important information might be missed.
Cognitive bias
Although objective decisions depend on disregarding extraneous factors, incorrect decisions can also result from having too little information.
Research indicates that a forensic analyst's objectivity can be adversely impacted by exposure to investigators' assumptions, such as suspicions about a specific individual. This phenomenon is called cognitive bias, and poses a very serious risk in forensic science. Even the way a forensic question is asked can introduce cognitive bias: asking “does the evidence implicate the victim's husband?” increases the likelihood that a forensic analyst will reach an affirmative conclusion. Ideally, to avoid cognitive bias, forensic analysts would receive the minimum evidence necessary to address a specific question. For instance, phrasing questions in more general terms can reduce cognitive bias, e.g., “is the selected evidence compatible with any particular individual?”
The primary problem with this ideal is that someone must first decide what subset of evidence to select and which questions to consider. These initial decisions could be biased, thus skewing all subsequent forensic analysis.
Rather than attempting to implement general restrictions on forensic analysis based on limited understanding of what might or might not be relevant in a particular case, a balanced approach is recommended to mitigate the risk of cognitive bias. One such approach, called linear sequential unmasking, initially restricts the information that forensic analysts receive, allowing them time to produce initial findings without risk of contextual bias. After forming initial hypotheses, forensic analysts could review additional evidence and contextual details to ensure that important evidence is not overlooked and to decide whether or not their hypotheses are compatible with all available facts.
Privacy
Performing focused digital forensic analysis is not the same as putting on digital blinders. Addressing forensic questions should be the focus, rather than excluding potentially relevant data sources. Some courts in the U.S. are considering whether to restrict how digital forensic analysts can search for data of probative value. In response to these concerns, the Scientific Working Group on Digital Evidence (SWGDE) recently released a draft document “Comments on Forced Minimization Requirements for the Seizure of Digital Evidence” for public comment (https://www.swgde.org/documents/draftsForPublicComment).
The idea of restricting forensic analysis to specific classes of data or to specific areas of a device returns to the conundrum echoed throughout this editorial: how to know what is relevant in advance? The short answer is, you cannot; any data structure or device area has the potential to contain relevant digital evidence.
In actuality, digital forensic analysis is already focused because we cannot look at everything. Forensic analysts employ a variety of data reduction techniques, including keyword searching, timeline reconstruction, thumbnail review, and inspecting larger files. The purpose of these techniques is not to examine everything, but rather to find a potentially relevant digital trace (needle in the haystack) and then examine its context to determine its meaning and significance in order to address forensic questions.
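
As a rough illustration of two of the data reduction techniques mentioned above (keyword searching and inspecting larger files), the following Python sketch narrows a small set of in-memory files to keyword hits with their surrounding context, and to files above a size threshold. The names `Hit`, `keyword_hits`, and `larger_files`, the context window, the size threshold, and the sample data are all hypothetical and are not taken from any forensic tool or from the works cited in this editorial.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str       # file in which the keyword was found
    offset: int     # byte offset of the match within that file
    context: bytes  # bytes surrounding the match, kept for later examination


def keyword_hits(files, keyword, window=16):
    """Return one Hit per case-insensitive occurrence of keyword (ASCII bytes)."""
    needle = keyword.lower()
    hits = []
    for name, data in files.items():
        haystack = data.lower()
        start = haystack.find(needle)
        while start != -1:
            lo = max(0, start - window)
            hi = min(len(data), start + len(needle) + window)
            hits.append(Hit(name, start, data[lo:hi]))
            start = haystack.find(needle, start + 1)
    return hits


def larger_files(files, min_size):
    """Return names of files of at least min_size bytes, largest first."""
    big = [(len(data), name) for name, data in files.items() if len(data) >= min_size]
    return [name for _, name in sorted(big, reverse=True)]


# Hypothetical sample data standing in for files extracted from a device.
files = {
    "notes.txt": b"Meeting moved. Send the INVOICE to Bob before Friday.",
    "holiday.jpg": b"\xff\xd8" + b"\x00" * 2000,
    "todo.txt": b"buy milk",
}

for hit in keyword_hits(files, b"invoice"):
    print(hit.name, hit.offset, hit.context)
print(larger_files(files, 1000))
```

The sketch returns the context around each hit rather than only the file name, reflecting the point that a trace is found first and its surroundings are then examined. It does nothing to reduce the risk, noted throughout this editorial, that relevant data lies outside whatever the chosen keyword or threshold selects.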
Case linkage
The primary aim of case linkage is to find meaningful connections, and exclude irrelevant ones. When attempting to find related activities across multiple cases, the focus should be on meaningful linkages rather than general commonalities. Finding deleted files in the Recycle Bin created by Microsoft Windows on storage media from different cases is not meaningful in itself, but finding these deleted files within the Recycle Bin and categorized under the same security identifier (SID) across multiple cases might be a linkage worth pursuing. In the context of cyberattacks, connections to Google infrastructure might help establish meaningful linkage between related incidents. In most other contexts, however, the fact that Google infrastructure was utilized is not relevant to the activities under investigation.
Novel methods for finding child abuse media using distinguishing characteristics are presented in “iCOP: Live forensics to reveal previously unknown criminal media on P2P networks” by Claudia Peersman et al. This system can also provide investigators with links to related illegal activities on P2P networks, such as files shared by the same application-level globally unique identifier (GUID).
Cyber threat intelligence is making use of infrastructure and identity-related information to link attacks committed by the same individuals or groups. However, cyber threat intelligence has not yet explored ways to leverage other forensic analysis methods to extract additional forensic distinguishers from digital evidence, such as biometric information.
The sale of narcotics and banned substances via the Internet poses challenges and opportunities from a forensic perspective. Anonymity in online marketplaces makes it difficult to ascertain the source of illegal activities, and innovations in packaging make it more difficult to detect drugs being mailed regionally, or even internationally.
Combining results from multiple forensic disciplines, including digital forensic analysis and chemical analysis, can help overcome such linkage blindness if forensic distinguishers are shared. Drug dealers, whether online or on the street, typically use mobile devices that capture details about their activities and identities that can be analysed to find linkages. Savvy criminals will dispose of mobile devices frequently to avoid apprehension. Additional research is needed to determine whether any forensic distinguishers persist when a criminal changes from one mobile device to another.
One practical consideration for case linkage is how to facilitate sharing of forensic distinguishers between international law enforcement agencies. Current methods are inadequate, as detailed in this issue in “A survey of mutual legal assistance involving digital evidence” by James & Gladyshev. A more ambitious international initiative is needed to combat criminal activities that span national boundaries.
Another important consideration for comparing cases and assessing similarity is representing and sharing information in a standardized, structured format. The community-developed format called Cyber-investigation Analysis Standard Expression (CASE, https://github.com/casework) is being developed in unison with the Unified Cyber Ontology (UCO), and is aligned as much as feasible with Digital Forensic Analysis eXpression (DFAX) and Cyber Observable eXpression (CybOX).
Conclusions
There is no doubt that digital forensic analysts need more effective methods to pinpoint relevant information within a digital crime scene, and to find meaningful linkages between cases. However, we must resist the “CSI effect” of unrealistic expectations that forensic analysis can magically resolve a case within an hour-long television episode. The reality is that digital forensic analysis is already focused by necessity and design. There are many incentives to home in on relevant evidence more efficiently but, ultimately, there are limits to what can be accomplished in this area. There certainly will never be a “Find Evidence Button” that speedily produces all the relevant evidence, and only relevant evidence. Stated more formally, no algorithm can detect all traces left by an activity of interest. The search for traces must necessarily be guided by some expertise, which is intrinsically limited and biased by human reasoning. This role is for forensic analysts.
Eoghan Casey
Ecole des Sciences Criminelles, University of Lausanne, Switzerland
E-mail address:
[email protected]