Structured document files, such as those used by standard productivity applications or for portable documents, can have malicious computer-executable instructions embedded within them. Modifications to such files can prevent the execution of such malware. Modifications can operate at a file sector level, such as fragmenting or defragmenting the file, or at a file record level, such as removing records, adding records, or rearranging the order of records. Other modifications include writing random data into records deemed likely to contain malware, removing unaccounted-for space, or removing inordinately large records that are not known to be good. A scan of the structured document file can identify relevant information and inform the selection of the modifications to be applied.
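The record-level modifications described above can be illustrated with a minimal sketch. It assumes a prior scan has already tagged each record; the `Record` type, its `known_good` and `suspicious` flags, the `sanitize` function, and the 4096-byte size threshold are all hypothetical names and values chosen for illustration, not part of any real document format or of the described system.

```python
import os
from dataclasses import dataclass, replace
from typing import Iterable, List


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for one record of a structured document file."""
    kind: str                  # illustrative record type tag
    payload: bytes             # raw record contents
    known_good: bool = False   # assumption: set by a prior scan
    suspicious: bool = False   # assumption: set by a prior scan


def sanitize(records: Iterable[Record], max_unknown_size: int = 4096) -> List[Record]:
    """Apply two of the record-level modifications to a list of records.

    1. Drop records that are not known to be good and are inordinately large.
    2. Overwrite the payload of records deemed likely to contain malware
       with random data of the same length.
    """
    kept: List[Record] = []
    for rec in records:
        if not rec.known_good and len(rec.payload) > max_unknown_size:
            continue  # remove the oversized, unrecognized record
        if rec.suspicious:
            # Same length, random contents: any embedded code is destroyed.
            rec = replace(rec, payload=os.urandom(len(rec.payload)))
        kept.append(rec)
    return kept
```

In this sketch the scan and the modification are deliberately separate steps, so the flags produced by the scan decide which modification is applied to each record. A real implementation would also have to rewrite any offsets or length fields that the file format keeps for the affected records.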
Malware detection systems and methods are described for determining whether a collection of data not expected to include executable code is suspected of containing malicious executable code. In some embodiments, a malware detection system may disassemble a collection of data to obtain a sequence of possible instructions and determine whether the collection of data is suspected of containing malicious executable code based, at least partially, on an analysis of the sequence of possible instructions. In one embodiment, the analysis of the sequence of possible instructions may comprise determining whether the sequence of possible instructions comprises an execution loop. In a further embodiment, a control flow of the sequence of possible instructions may be analyzed. In a further embodiment, the analysis of the sequence of possible instructions may comprise assigning a weight that is indicative of a level of suspiciousness of the sequence of possible instructions. In a further embodiment, the sequence of possible instructions may begin with a possible instruction that comprises at least one candidate operation code (opcode) that has been determined to occur frequently in executable code.
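The analysis described above can be sketched on an already-disassembled sequence. This is a simplified illustration under stated assumptions, not the described system: the `Insn` type stands in for the output of some disassembler, a loop is approximated as any branch whose destination is the current or an earlier instruction in the sequence, and the weight values (1 for a frequent first opcode, 3 for a loop) are arbitrary illustrative constants.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Insn:
    """Hypothetical possible instruction produced by a disassembler."""
    offset: int                   # byte offset within the collection of data
    opcode: int                   # first opcode byte
    mnemonic: str                 # human-readable form, for illustration only
    target: Optional[int] = None  # branch destination offset, if any


def has_execution_loop(seq: List[Insn]) -> bool:
    """Return True if some branch jumps to itself or to an earlier instruction."""
    seen_offsets: Set[int] = set()
    for insn in seq:
        seen_offsets.add(insn.offset)
        if insn.target is not None and insn.target in seen_offsets:
            return True
    return False


def suspicion_weight(seq: List[Insn], frequent_opcodes: Set[int]) -> int:
    """Assign an illustrative weight; higher means more suspicious."""
    if not seq:
        return 0
    weight = 0
    if seq[0].opcode in frequent_opcodes:
        weight += 1  # sequence starts with a candidate opcode
    if has_execution_loop(seq):
        weight += 3  # loops are typical of decoder stubs
    return weight


# Illustrative sequence resembling a small XOR decoder loop.
DECODER = [
    Insn(0, 0xB9, "mov ecx, 0x40"),
    Insn(5, 0x80, "xor byte ptr [esi], 0x5a"),
    Insn(8, 0x46, "inc esi"),
    Insn(9, 0xE2, "loop 5", target=5),
]
```

With the hypothetical candidate set `{0xB9}`, `suspicion_weight(DECODER, {0xB9})` combines both contributions. A fuller control-flow analysis would follow branch targets rather than scanning linearly, and would compare the resulting weight against a threshold before flagging the data.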