Investigating Malicious Document File

Faishol Hakim
MII Cyber Security Consulting Services
Apr 3, 2023

Malicious Excel files are a popular vector for delivering malware, phishing, and other types of cyberattacks. Because Excel files are widely used to share data and macros, they are an attractive target for cybercriminals. This article discusses how to investigate a potentially malicious Excel file and identify the threats it carries, using a sample Microsoft Excel file.

Identify

The first step in investigating a potentially malicious Excel file is to identify the file in question. It is important to obtain the original Excel file and not a copy, as copying can alter metadata that may be useful in the investigation. The file can be identified by its file name or by how it arrived, such as an email attachment, a download from a website, or a file share. Knowing where the file was found, whether in an email attachment, on an external disk, or elsewhere, helps identify it further.

Handle the file with care: do not open or otherwise act on it until it is stored in a sandbox environment.

Basic Analysis

Once the file is identified, perform a basic analysis to determine whether it is malicious. We can start with the file command or exiftool to confirm the file type and read its metadata.

$ file safe_excel.xls 
safe_excel.xls: Composite Document File V2 Document, Little Endian,
Os: Windows, Version 10.0, Code page: 1252, Author: BRT, Create Time/Date: Tue Aug 10 07:24:04 2021,
Last Saved Time/Date: Tue Aug 10 07:24:09 2021, Security: 0
$ exiftool safe_excel.xls 
ExifTool Version Number : 12.42
File Name : safe_excel.xls
Directory : .
File Size : 145 kB
File Modification Date/Time : 2021:08:11 09:46:14+00:00
File Access Date/Time : 2023:03:30 15:55:43+00:00
File Inode Change Date/Time : 2023:03:30 15:55:41+00:00
File Permissions : -rw-rw-r--
File Type : XLS
File Type Extension : xls
MIME Type : application/vnd.ms-excel
Author : BRT
Create Date : 2021:08:10 07:24:04
Modify Date : 2021:08:10 07:24:09
Security : None
Thumbnail Clip : (Binary data 86480 bytes, use -b option to extract)
Code Page : Windows Latin 1 (Western European)
Company :
App Version : 16.0000
Scale Crop : No
Links Up To Date : No
Shared Doc : No
Hyperlinks Changed : No
Title Of Parts : Foglio1
Heading Pairs : Fogli di lavoro, 1
Comp Obj User Type Len : 42
Comp Obj User Type : (Foglio di lavoro di Microsoft Excel 2003

Sometimes this step already surfaces something unusual. In the exiftool output, for example, a comment field may be interesting, or binary data may be attached. Also check whether the file is password protected: malicious Excel files may be password protected to hinder analysis or reverse engineering.
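The file type check can also be scripted. The following is a minimal sketch, assuming we only want to tell an OLE2 compound file (such as .xls) from a ZIP-based Office file (such as .xlsx) by its first bytes. The function names sniff_container and sniff_file are hypothetical helpers for illustration, not part of any tool used in this article.

```python
# Well-known file signatures ("magic bytes").
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # Compound File Binary (.xls, .doc)
ZIP_MAGIC = b"PK\x03\x04"                        # ZIP container (.xlsx, .xlsm, .docx)


def sniff_container(header: bytes) -> str:
    """Classify a file by its leading bytes, ignoring the file extension."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first 8 bytes of the file; the document is never parsed."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(8))
```

A mismatch between the result and the file extension, for example an .xlsx file that is reported as "ole2", is worth noting for the later steps.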

Content Analysis

After the basic analysis, the next step is to analyze the content of the Excel file. Since Office 2007, Microsoft has used XML-based file formats such as .docx, .xlsx, and .pptx, which are essentially compressed archives that can be decompressed to view all their plaintext components. For backward compatibility, Microsoft still supports OLE2 (.bin) files, which can be found among the decompressed files. If a file in one of these formats cannot be decompressed, we can treat it as suspicious.
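For the XML-based formats, the decompression step can be sketched with Python's standard zipfile module. This is an illustrative sketch only: list_bin_members is a hypothetical helper name, and it simply reports members whose names end in .bin without judging whether they are malicious.

```python
import io
import zipfile
from typing import List


def list_bin_members(data: bytes) -> List[str]:
    """Return the names of .bin members inside a ZIP-based Office document.

    Raises zipfile.BadZipFile when the data is not a valid ZIP archive,
    which matches the "cannot be decompressed" case described above.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return [name for name in archive.namelist()
                if name.lower().endswith(".bin")]
```

Not every .bin member is an embedded macro project (printer settings are also stored this way), so a hit such as xl/vbaProject.bin is a pointer for further analysis and not a verdict.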

Typically, attackers use macros and OLE2/.bin files in these documents to execute malicious code. With oledump we can list the streams inside a file. Here I use two files to compare a sample without macros (other_sample.xls) and a sample with macros (safe_excel.xls, which despite its name is the suspicious one).

$ oledump.py other_sample.xls 
1: 73 '\x01CompObj'
2: 20 '\x01Ole'
3: 116 '\x05DocumentSummaryInformation'
4: 312 '\x05SummaryInformation'
5: 9528 'Workbook'
$ oledump.py safe_excel.xls 
1: 118 '\x01CompObj'
2: 248 '\x05DocumentSummaryInformation'
3: 86644 '\x05SummaryInformation'
4: 34132 'Workbook'
5: 461 '_VBA_PROJECT_CUR/PROJECT'
6: 104 '_VBA_PROJECT_CUR/PROJECTwm'
7: m 1181 '_VBA_PROJECT_CUR/VBA/Foglio1'
8: M 5457 '_VBA_PROJECT_CUR/VBA/Questa_cartella_di_lavoro'
9: 2988 '_VBA_PROJECT_CUR/VBA/_VBA_PROJECT'
10: 2139 '_VBA_PROJECT_CUR/VBA/__SRP_0'
11: 331 '_VBA_PROJECT_CUR/VBA/__SRP_1'
12: 2495 '_VBA_PROJECT_CUR/VBA/__SRP_2'
13: 876 '_VBA_PROJECT_CUR/VBA/__SRP_3'
14: 464 '_VBA_PROJECT_CUR/VBA/__SRP_4'
15: 106 '_VBA_PROJECT_CUR/VBA/__SRP_5'
16: 565 '_VBA_PROJECT_CUR/VBA/dir'

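From the listings, we can tell which file contains multiple VBA streams whose purpose is still unknown: safe_excel.xls has a _VBA_PROJECT_CUR storage, and oledump flags streams 7 and 8 with its macro indicators (M for a stream with macro code, m for a stream with only attribute statements).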
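When many files need triage, the stream listing can be filtered automatically. The sketch below parses the text of a listing like the ones above and returns the streams carrying the M or m indicator. It assumes the line layout shown in this article (index, optional indicator, size, quoted name); StreamEntry, parse_listing, and macro_streams are hypothetical names, and this is not part of oledump itself.

```python
import re
from typing import List, NamedTuple, Optional


class StreamEntry(NamedTuple):
    index: int
    indicator: Optional[str]  # e.g. "M" or "m", or None when absent
    size: int
    name: str


# index, optional one-character indicator, size, quoted stream name
_LINE = re.compile(r"^\s*(\d+):\s+([A-Za-z!])?\s*(\d+)\s+'(.*)'\s*$")


def parse_listing(text: str) -> List[StreamEntry]:
    """Parse the lines of a stream listing, skipping lines that do not match."""
    entries = []
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            index, indicator, size, name = match.groups()
            entries.append(StreamEntry(int(index), indicator, int(size), name))
    return entries


def macro_streams(entries: List[StreamEntry]) -> List[StreamEntry]:
    """Keep only the streams flagged with a macro indicator."""
    return [entry for entry in entries if entry.indicator in ("M", "m")]
```

Applied to the second listing above, this would single out streams 7 and 8 as the ones to dump next.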

Check for Malware

The next step is to check for malware within the Excel file. From the previous phase, we know whether the file contains macros or VBA code. For further analysis, we can keep using oledump to dump the contents of the streams identified in the last step.

$ oledump.py safe_excel.xls -v -s 7
Attribute VB_Name = "Foglio1"
Attribute VB_Base = "0{00020820-0000-0000-C000-000000000046}"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = True
Attribute VB_TemplateDerived = False
Attribute VB_Customizable = True

This gives more information about the malicious file. We can comb through every stream we deem suspicious to map out what it does, as with this function found in the sample:

Function tellmo() As Variant
For Each KK In SPP("" & Cells(70, 3), 3)
If Not (IsNumeric(KK)) Then x = LTrim(Left(KK, Len(KK) - 1)) Else x = LTrim(KK)
Tk = Tk & Chr(x)
Next
tellmo = Split(Tk, "j")
End Function

The search can go as far as finding suspicious functions in a particular stream. In this sample, for example, we find several functions that interact with the operating system. As an alternative to oledump, we can use excelpeek from slaughterjames to analyze Excel files like this sample.

For a further summary of the file, we can use olevba once the file has been identified, along with other tools available on REMnux such as olebrowse, oleid, olemeta, oletimes, oledir, olefile, olemap, and oleobj. For example, olevba can identify the macros used and flag suspicious sections.

$ olevba safe_excel.xls 
XLMMacroDeobfuscator: pywin32 is not installed (only is required if you want to use MS Excel)
olevba 0.60.1 on Python 3.8.10 - http://decalage.info/python/oletools
...
...
...
VBA MACRO Foglio1
in file: safe_excel.xls - OLE stream: 'Foglio1'
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(empty macro)
+----------+--------------------+---------------------------------------------+
|Type      |Keyword             |Description                                  |
+----------+--------------------+---------------------------------------------+
|Suspicious|Run                 |May run an executable file or a system       |
|          |                    |command                                      |
|Suspicious|Chr                 |May attempt to obfuscate specific strings    |
|          |                    |(use option --deobf to deobfuscate)          |
|Suspicious|Hex Strings         |Hex-encoded strings were detected, may be    |
|          |                    |used to obfuscate strings (option --decode to|
|          |                    |see all)                                     |
+----------+--------------------+---------------------------------------------+

How far to go depends on the tools we choose and on the depth of our analysis skills.
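The idea behind such a keyword table can be illustrated with a few lines of Python. This is a deliberately simplified sketch: the keyword list and descriptions below are assumptions chosen for illustration and are not olevba's actual rule set, and scan_vba is a hypothetical helper name.

```python
import re
from typing import Dict, List

# Illustrative keywords only; NOT the rule set used by olevba.
SUSPICIOUS_KEYWORDS: Dict[str, str] = {
    "Shell": "may run an executable file or a system command",
    "Run": "may run an executable file or a system command",
    "CreateObject": "may create a COM object",
    "Chr": "may be used to build obfuscated strings",
}


def scan_vba(source: str) -> List[str]:
    """Return the keywords that occur as whole words in the VBA source."""
    found = []
    for keyword in SUSPICIOUS_KEYWORDS:
        pattern = r"\b" + re.escape(keyword) + r"\b"
        # VBA is case-insensitive, so match regardless of case.
        if re.search(pattern, source, re.IGNORECASE):
            found.append(keyword)
    return found
```

Keyword matching of this kind only points to places worth reading. It produces false positives on benign macros and misses anything an attacker has obfuscated beyond the listed words.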

Identify the Attack Vector

Once the malware is identified, it is important to identify the attack vector that was used to deliver the Excel file. This can help prevent similar attacks from occurring in the future. Some common attack vectors for Excel files include phishing emails, compromised websites, or social engineering tactics.

We can identify the file's source using metadata such as the Zone.Identifier or something similar. Another option is to take identifying information from the file, such as its hash, and search for it in common threat intelligence sources like VirusTotal.

$ md5sum safe_excel.xls 
491dfc470849972dfb648a4778b04e24 safe_excel.xls

Sometimes the file is already known there, and analysis reports from other platforms can be very helpful in identifying it.
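Both lookups can be scripted. The sketch below computes MD5 and SHA-256 digests of a file in chunks and parses the text of a Zone.Identifier stream, which Windows stores as an NTFS alternate data stream in an INI-like format. The helper names file_hashes and parse_zone_identifier are hypothetical, and the sketch assumes the stream content has already been read into a string.

```python
import configparser
import hashlib
from typing import Dict


def file_hashes(path: str, chunk_size: int = 65536) -> Dict[str, str]:
    """Compute MD5 and SHA-256 of a file without loading it all into memory."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}


def parse_zone_identifier(text: str) -> Dict[str, str]:
    """Parse the [ZoneTransfer] section of a Zone.Identifier stream."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep key case, e.g. "ZoneId"
    parser.read_string(text)
    if not parser.has_section("ZoneTransfer"):
        return {}
    return dict(parser["ZoneTransfer"])
```

A ZoneId of 3 means the file came from the Internet zone. When fields such as HostUrl or ReferrerUrl are present, they can point to where the file was downloaded from.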

Investigating a malicious file such as an Excel file requires a thorough understanding of the Excel file structure and of malware analysis techniques. This is only a simple sample; samples in the wild are more complicated and confusing. It is important to remember that prevention is key when it comes to cyberattacks, and a robust security program can help mitigate the risks associated with malicious Excel files.
