Friday, July 31, 2009

How to sort a malware collection

Hi, malware collectors of the world!

Today I'll discuss the choices we have to make when sorting our malware collection.

Collection packed or collection unpacked?

I have always considered that keeping the collection packed is the best decision, for multiple reasons. Almost every consideration favors a packed collection and there are almost no cons, while an unpacked collection has lots of cons in my opinion.

Pros of having the collection packed:

* Making backups will be easier.

You create new archives for new samples, so backups are incremental: there is no need to back up everything every time.

* The collection will take less space on the hard disk.

* KAV scans a packed collection as fast as an unpacked one. Some tests even say that it's faster.

* Verifying the integrity of the collection is easier.

You just need to run the test function of WinZip to know if everything is OK. Checking whether something is wrong with an unpacked collection takes more time, as you must run a check of the whole drive storing the collection.
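The same kind of check can be scripted. Below is a minimal sketch using Python's standard `zipfile` module instead of WinZip; the function name `find_damaged_archives` is made up for this example, and it assumes the archives are not password protected.

```python
import zipfile
from pathlib import Path


def find_damaged_archives(archive_dir):
    """Return {archive name: problem} for every ZIP in archive_dir that fails the test."""
    damaged = {}
    for path in sorted(Path(archive_dir).iterdir()):
        if path.suffix.lower() != ".zip":
            continue
        try:
            with zipfile.ZipFile(path) as archive:
                # testzip() reads every member and checks its CRC; it returns
                # the name of the first bad member, or None if all are fine.
                bad_member = archive.testzip()
        except zipfile.BadZipFile as error:
            damaged[path.name] = f"not a valid ZIP: {error}"
            continue
        if bad_member is not None:
            damaged[path.name] = f"CRC error in {bad_member}"
    return damaged
```

An empty result means every archive passed the test.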

The only con is the time required to compress new files, but since we will compress just a few files every day, that's not relevant.

There are other reasons but I'll discuss some of them in future posts.

How to name files?

Some traders used to name files after the identification given by KAV. I have always considered this a mistake: identifications can change, and then the file name is wrong.

I think naming files by a hash, such as MD5, SHA-1 or SHA-256, has more advantages.

You can use RenFiles to rename files to MD5 or SHA-256.
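To show what hash-based naming means in practice, here is a rough Python sketch. It is not how RenFiles works internally; the function names `sha256_of_file` and `rename_to_hash` are invented for this example, and refusing to overwrite an existing file is just one possible policy.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash the file in chunks so large samples do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rename_to_hash(path):
    """Rename path to <sha256><original extension> and return the new path."""
    path = Path(path)
    target = path.with_name(sha256_of_file(path) + path.suffix)
    if target.exists():
        if target.samefile(path):
            return path  # already named by its hash
        # Same hash means same content, so this is a duplicate sample.
        raise FileExistsError(f"duplicate of {target.name}: {path.name}")
    path.rename(target)
    return target
```

A nice side effect of this naming is that duplicates show up as name collisions.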

How to name file extensions?

With KAV the file extension is not relevant, since the identification does not depend on whether the file has the right extension.

Some collectors prefer extensions like .VXE or .VLL instead of .EXE and .DLL to avoid infections.

A good collector should be able to manage a collection whose files have the right extensions, because he handles the files in a safe environment. A safe environment is one where you cannot run a virus or other malware accidentally.

If you want to name files by their right extension, use RenFiles.
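As an illustration of what "the right extension" can mean for Windows executables, the sketch below tells .EXE from .DLL by reading the PE header. This is only an assumption about one reasonable approach, not a description of RenFiles; the function name `guess_pe_extension` is made up, and other PE-based types (drivers, screen savers and so on) would simply come out as ".exe".

```python
import struct

IMAGE_FILE_DLL = 0x2000  # Characteristics flag that marks a PE file as a DLL


def guess_pe_extension(data):
    """Return ".dll" or ".exe" for PE data, or None if it is not a PE file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return None  # plain DOS executable or damaged file
    # The 20-byte COFF header follows the signature; Characteristics is its
    # last field, 18 bytes in.
    characteristics_offset = pe_offset + 4 + 18
    if len(data) < characteristics_offset + 2:
        return None
    (characteristics,) = struct.unpack_from("<H", data, characteristics_offset)
    return ".dll" if characteristics & IMAGE_FILE_DLL else ".exe"
```

Reading the first few kilobytes of a sample is normally enough to feed this function.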

What folder structure should I use to store the collection?

If you follow my tip and keep the collection packed, you don't need a folder structure. Just decide on a size limit for each ZIP (I recommend ZIP for packing) and add new files until you reach the limit. When you reach it, continue with the next archive. You can name the archives with consecutive numbers, like VIRUS001.ZIP, VIRUS002.ZIP, etc.
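The fill-until-the-limit routine can be sketched in a few lines of Python. This is only an illustration under some assumptions: the helper names `current_archive` and `add_sample` are invented, the archives follow the VIRUS001.ZIP naming, and the default limit of 250 MB is an arbitrary pick.

```python
import re
import zipfile
from pathlib import Path

ARCHIVE_PATTERN = re.compile(r"VIRUS(\d+)\.ZIP")


def current_archive(archive_dir, limit_bytes):
    """Return the path of the archive that should receive the next file."""
    archive_dir = Path(archive_dir)
    numbers = [
        int(match.group(1))
        for entry in archive_dir.iterdir()
        if (match := ARCHIVE_PATTERN.fullmatch(entry.name))
    ]
    newest = max(numbers, default=1)
    path = archive_dir / f"VIRUS{newest:03d}.ZIP"
    # Move on to the next number once the newest archive has reached the limit.
    if path.exists() and path.stat().st_size >= limit_bytes:
        path = archive_dir / f"VIRUS{newest + 1:03d}.ZIP"
    return path


def add_sample(sample_path, archive_dir, limit_bytes=250 * 1024 * 1024):
    """Append one file to the current archive and return that archive's path."""
    archive = current_archive(archive_dir, limit_bytes)
    # Mode "a" appends to an existing archive or creates a new one.
    with zipfile.ZipFile(archive, "a", compression=zipfile.ZIP_DEFLATED) as target:
        target.write(sample_path, arcname=Path(sample_path).name)
    return archive
```

The limit is checked before adding, so an archive can end up slightly over it; pick the limit with that margin in mind.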


If you want an unpacked collection, continue reading.

Years ago many collectors liked a folder structure based on the KAV identification name.




Several tools were created to process files and copy/move them to such structures using KAV logs.

If you like that way of sorting the collection, you can download VS2000 GUI and use it. You can get VS2000 GUI from here.

The feature is under the "Virus organizer" tab.

There are 5 different folder structure types available. You can see examples of how the collection will look by clicking the "?" buttons.

If I'm forced to use a folder structure, the one I prefer is the one called "Bulk", which is based on the hash of the file. There is a root folder containing 16 folders, named 0-9 and A-F. Each of those contains 16 subfolders named with the first 2 characters of the hash, so there are 16*16 second-level folders in total. For example, a file whose hash starts with A3 is stored under folder A, in subfolder A3.


This is one of the five available structures in VS2000 GUI.
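The "Bulk" layout described above is easy to compute from a file name. The following Python sketch is my own illustration, not code from VS2000 GUI; the function name `bulk_path` is made up, and it assumes the file name starts with the hexadecimal hash.

```python
from pathlib import PurePath

HEX_DIGITS = "0123456789ABCDEF"


def bulk_path(root, file_name):
    """Return root/<first hex digit>/<first two hex digits>/<file_name>."""
    prefix = file_name[:2].upper()
    if len(prefix) < 2 or any(ch not in HEX_DIGITS for ch in prefix):
        raise ValueError(f"name does not start with a hex hash: {file_name!r}")
    # First level: one of 16 folders (0-9, A-F).
    # Second level: one of 16 subfolders named with the first 2 characters.
    return PurePath(root) / prefix[0] / prefix / file_name
```

Since hash values are spread evenly, the 256 second-level folders fill up at about the same rate.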

And that's all you must decide about how to sort your malware collection. A quick summary:

Decide whether you want the collection packed or unpacked

Decide how to name files

Decide how to name extensions

If you choose an unpacked collection, decide the folder structure.

My "setup" is:

Collection packed (using ZIP format).

File names by their SHA-256

Files having the right extension

File size for archives: between 200 and 300 MB. Larger archives can be problematic for KAV.

File names for archives: VIRUS001.ZIP, VIRUS002.ZIP, etc

And that's all for now. See you soon!

2 comments:

  1. What to do if you have all your collection on Linux? This is my case. I created a directory structure like /malware/0/00/000, /malware/a/a0/a00, ... and all the malware was renamed to its own SHA-1 hash. A MySQL database holds the original name, the other hashes, and all other relevant information about each sample. I use f-prot free as my AV.

  2. What to do if you have your collection on Linux?

    Sorry, but I don't understand your question. It's too vague and I don't know what you want me to answer.