Evaluating the impact of removing common features on similar-file search strategies
Abstract
Digital forensic investigations face a major problem: the large number of files stored on seized devices. To analyze these devices more efficiently, investigators use similarity search strategies, which rely on approximate matching techniques to find files that are identical, or merely similar, to a given set of files. However, this search can be impaired by common blocks, such as headers, that appear in many different files. This work evaluates the impact of removing common blocks on the performance of these strategies. The results show a significant reduction in the false positive rate with an acceptable increase in execution time.
Keywords:
Digital Forensics, Approximate Matching, Similarity Search Strategies, Common Blocks
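As a rough illustration of why common blocks can inflate similarity scores, the toy sketch below splits files into fixed-size blocks, hashes each block, and compares files by the fraction of shared block hashes. It is not the method evaluated in the paper: the fixed 8-byte blocks, the SHA-1 block hash, the `min_files` threshold, and all function names are assumptions made only for this example.

```python
from __future__ import annotations

import hashlib
from collections import Counter

BLOCK_SIZE = 8  # toy block size (assumption); real tools use other chunking schemes


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> set[str]:
    """Hash each fixed-size block of the input and return the set of hashes."""
    return {
        hashlib.sha1(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    }


def common_blocks(digests, min_files: int) -> set[str]:
    """Return block hashes that occur in at least `min_files` digests (hypothetical rule)."""
    counts = Counter(h for digest in digests for h in digest)
    return {h for h, n in counts.items() if n >= min_files}


def similarity(a: set[str], b: set[str]) -> float:
    """Shared blocks divided by the size of the smaller digest."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Three toy files that share the same 16-byte header (two blocks).
header = b"FMTHDR01FMTHDR02"
files = {
    "a": header + b"a" * 8 + b"b" * 8,
    "b": header + b"c" * 8 + b"d" * 8,  # unrelated content, same header
    "c": header + b"a" * 8 + b"e" * 8,  # shares one content block with "a"
}

digests = {name: block_hashes(data) for name, data in files.items()}
common = common_blocks(digests.values(), min_files=3)
filtered = {name: digest - common for name, digest in digests.items()}

before_ab = similarity(digests["a"], digests["b"])
after_ab = similarity(filtered["a"], filtered["b"])
before_ac = similarity(digests["a"], digests["c"])
after_ac = similarity(filtered["a"], filtered["c"])

print(f"a vs b: {before_ab:.2f} -> {after_ab:.2f}")
print(f"a vs c: {before_ac:.2f} -> {after_ac:.2f}")
```

In this toy setup, files "a" and "b" match only through the shared header, so their score drops to zero once the header blocks are excluded, while "a" and "c" still match through a genuinely shared content block.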
Published
October 4, 2021
How to Cite
VELHO, João P. B.; MOIA, Vitor H. G.; HENRIQUES, Marco A. A. Avaliação do impacto da remoção de características comuns em estratégias de busca de arquivos similares. In: SIMPÓSIO BRASILEIRO DE SEGURANÇA DA INFORMAÇÃO E DE SISTEMAS COMPUTACIONAIS (SBSEG), 21., 2021, Belém. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 239-252. DOI: https://doi.org/10.5753/sbseg.2021.17319.