Converting dd image to vmdk for analysis


Astute readers will notice that the names for the images used in this part are not the same as in Part 1. Good for you, astute reader. I pulled a 16GB Quantum Fireball out of an old desktop that had not spun up in at least two years. When I last booted the system it was a fully functional Windows XP SP3 desktop.

I imaged this drive using the method in Part 1, verified the image, and copied it from the Kali Linux laptop that I dropped the initial image onto to an external USB drive. Why? Because in forensics you NEVER want to work with the initial image. The entire process was about 45 minutes for all three steps. The external drive is USB3 and that definitely made the copy phase faster.

Converting this dd image to a vmdk file and then booting it is obviously going to change the hash. Just booting a Windows system adds multiple entries to the event log which is more than enough for verification to fail. Not to mention that the OS is going to install drivers for all of the new devices that are used by VMWare. In summary, NEVER WORK OFF OF THE INITIAL IMAGE.
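To make the point concrete, here is a minimal sketch that uses small throwaway files in place of real images. The file names are hypothetical and a single overwritten byte stands in for the writes a booted guest OS would make; it is only meant to show that the working copy's hash changes while the original still verifies.

```shell
#!/bin/sh
# Sketch: a one-byte change in a working copy changes its hash, while the
# untouched original still verifies. All file names here are hypothetical.
workdir=$(mktemp -d)
original="$workdir/original.dd"
working="$workdir/working.dd"

# Stand-in for an acquired image: 1 MiB of zeros.
head -c 1048576 /dev/zero > "$original"
cp "$original" "$working"

hash_of() { sha1sum "$1" | cut -d' ' -f1; }

original_before=$(hash_of "$original")
working_before=$(hash_of "$working")

# Simulate a guest OS writing to the disk: overwrite one byte in the copy.
printf 'X' | dd of="$working" bs=1 seek=4096 conv=notrunc 2>/dev/null

original_after=$(hash_of "$original")
working_after=$(hash_of "$working")

echo "original: $original_before -> $original_after"
echo "working:  $working_before -> $working_after"
rm -rf "$workdir"
```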

Before we get to the actual conversion, note that there is no reason this conversion couldn’t be from dd to VHD or VDI. I have VMWare Workstation installed on my laptop and not VirtualBox or Virtual PC. I have used all of these and can’t say I have strong feelings for any one over the others.

The Good Stuff

Get the qemu utils:

apt-get install qemu-utils

Next we use the qemu-utils to convert to vmdk:

qemu-img convert -O vmdk /path/image.dd /path/output.vmdk

qemu-utils conversion

Get yourself a drink and stretch your legs. Forensics is a time consuming process. The conversion of this ~16GB dd file to vmdk took about 90 minutes.
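If you want to script the conversion, something like the following sketch may help. The paths are placeholders and a tiny zero-filled file stands in for the real dd image; the script hashes the source before and after so you can confirm the conversion only read from it. It assumes qemu-img is on the PATH and simply reports if it is not.

```shell
#!/bin/sh
# Sketch: convert a working copy of a raw dd image to VMDK and confirm the
# source file was not modified. Paths are placeholders; the tiny test image
# stands in for a real acquisition.
workdir=$(mktemp -d)
src="$workdir/image.dd"
dst="$workdir/output.vmdk"
head -c 1048576 /dev/zero > "$src"

src_hash_before=$(sha1sum "$src" | cut -d' ' -f1)

if command -v qemu-img >/dev/null 2>&1; then
    # -f raw names the input format explicitly; -O vmdk selects the output.
    qemu-img convert -f raw -O vmdk "$src" "$dst"
    qemu-img info "$dst"
    converted=yes
else
    echo "qemu-img not found; install qemu-utils first" >&2
    converted=no
fi

src_hash_after=$(sha1sum "$src" | cut -d' ' -f1)
[ "$src_hash_before" = "$src_hash_after" ] && echo "source image unchanged"
rm -rf "$workdir"
```

Naming the input format with -f raw is a small safety choice: it stops qemu-img from guessing the format of an image whose first sectors might look like something else.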

The next step was to attach the disk to an existing virtual machine to ensure it would spin up. I happened to have a Windows XP virtual machine that I keep around mostly to run old software. Depending on how you plan on testing this hard drive you can take a snapshot of the drive to allow any changes to be rolled back; I didn’t do this simply because I could always convert the copy of the dd file again if I made some catastrophic change. If you wanted to boot directly into the XP operating system it would probably be necessary to run a repair install off of either a disk or ISO image containing the installation files. The chances that the underlying physical hardware is the same as the virtual hardware are just about zero. That is why VMWare has a physical to virtual converter.

Attached to VM

Here is where the pen testing part comes in. I spun the VM up and opened the drive in Windows Explorer to ensure that it worked. Oh look right at the root. The Tax Backup folder. If this wasn’t my own drive I’d probably start there.

Pen Test Gold

What next? Well we have covered acquiring the image and converting it to a virtual disk format. The next article in this series will cover juicy places to look in both the file system and Windows registry. It will probably have a cheat sheet for different operating systems to find data that will help you look good during the report writing phase of penetration testing. You know that phase? The one that everyone hates doing? Might as well look good doing it.


Geographic Information Theory

There are two main types of geographic information found in files. Geotagging is the information placed in a file with the GPS coordinates of the location. EXIF (Exchangeable Image File Format) contains the geotagging information as well as device type and speed. EXIF contains more information and is normally limited by the capabilities of the device creating the file.

What are the common weaknesses? Data leakage from the geographic information can pinpoint the exact location where a file was created. This information can be used to find detailed maps using software such as Google Earth or to build detailed patterns of movement.
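EXIF stores latitude and longitude as degrees, minutes and seconds plus a hemisphere letter, while mapping tools usually want signed decimal degrees. The sketch below shows that conversion; the function name and the sample coordinates are made up for illustration and are not taken from any real file.

```shell
#!/bin/sh
# Sketch: convert an EXIF-style GPS value (degrees, minutes, seconds plus a
# hemisphere letter) to signed decimal degrees. The sample coordinates are
# made up for illustration.
dms_to_decimal() {
    # usage: dms_to_decimal DEG MIN SEC REF   (REF is N, S, E or W)
    awk -v d="$1" -v m="$2" -v s="$3" -v ref="$4" 'BEGIN {
        value = d + m / 60 + s / 3600
        if (ref == "S" || ref == "W") value = -value
        printf "%.6f\n", value
    }'
}

lat=$(dms_to_decimal 22 54 30 S)
lon=$(dms_to_decimal 43 12 0 W)
echo "$lat,$lon"
```

South and west come out negative, which is the convention most mapping software expects when you paste a coordinate pair into a search box.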

What are you trying to do? We are going to connect to Twitter and do geolocation on the @FIFAWorldCup account. Why the FIFAWorldCup account? We know where the World Cup is happening so it is easy to see if the information is correct.

 Getting Started

Get creepy from here: http://ilektrojohn.github.io/creepy/

Ready to Go

For this tutorial it is installed in a Windows 7 virtual machine. The version in the Kali apt-get repositories was not the latest when this was written. Besides, the OS is just a tool; we don’t need to get caught up in an ideological battle about how somebody has to use a certain tool to be a ‘real’ hacker. Being effective is more important than being a zealot.

Edit the configuration: Edit -> Plugins Configuration then select Twitter Plugin -> Run Configuration Wizard -> Next. Enter your Twitter ID and password to authorize creepy by clicking Authorize APP.

Creepy Twitter Authorization Screen

Authorize Creepy

Wouldn’t this also be a great time to follow us @SecureNM? I’m not trying to make you feel guilty but you are here reading our stuff. Copy the PIN that Twitter generates into the text box at the bottom of the window and click the finish button.

Creepy Twitter Plugin  Configuration Complete

Creepy Twitter Plugin Configuration

Creepy should now be authorized but just to be sure select Twitter Plugin and then click the Test Plugin Configuration button. Yay, we are ready to get started. Click OK a few times to get back to the main screen.

Twitter Plugin Success

From the file menu select Creepy -> New Project -> Person Based Project. This will start the project wizard. Fill in the information as you see fit.

Creepy Project Configuration

Project Configuration

Add the information and select the proper plugin then select Search. In this case we used @FIFAWorldCup.

Creepy Search Results

Search Results

Click the ID or IDs that you want to creep on, see what I did there? Then select Add to Targets. I added all of the IDs that were found to ensure data for this tutorial.

Select Next -> Next -> Finish.

Analyze the project by selecting the project and clicking the Analyze button

There are many analyze buttons like it but this one is mine

Sao Paulo, Rio de Janeiro, and the Maldives are all among the locations of tweets sent by the Twitter IDs that creepy analyzed. Select one of these locations on the map and through the power of Google and GPS you can see the location and possibly a street view. In the immortal words of Keanu Reeves, Whoa!

Full Map of Tweets

Location of Maldives Tweet

I know what you’re thinking, wow that was cool but so what. So what you say? This is how you would use it on a real life security engagement. You get a black box test with nothing but a URL. You find the company’s Twitter account on the website. Feeding this information into creepy gives you locations that are potential targets for social engineering, physical infiltration, and WiFi attacks. See how just a little information can turn the tide in an assessment?

Moving between OS types can be a pain. How many times have you typed ls into a windows machine when trying to list the files in a directory? This is a handy dandy guide comparing commands between the two.

It will be updated over time as we get requests or as we feel it is justified.



Linux:   ls
Windows: dir
Both ls and dir list the contents of a directory

Linux:   pwd
Windows: echo %cd%
Print the current directory

Print the current user

Linux:   cat <filename>
Windows: type <filename>
Print the contents of a file to the screen

Linux:   > and >>
Windows: > and >>
Send the contents of cat or type to another file. > overwrites the target file, >> appends to the end of the file

Prints human readable content from binary files. SysInternals has a downloadable Windows equivalent

Schedule tasks to run at a specific time

Linux:   ps aux
Windows: tasklist /v
Print the list of running tasks

Linux:   grep <search string>
Windows: find
Search for a pattern in a file

Linux:   | (SHIFT + \ usually)
Windows: |
Send the output of one command as the input to the next command

Print network settings. iwconfig prints wireless network settings in Linux.

Print the computer name


Pipe usage:

Command1 | Command2

Search a file for a string in Linux:

cat samplefile.txt | grep secret

Search a file for a string in Windows:

type samplefile.txt | find /I "secret"

    /I ignores the case of the search string. Note that find requires the search string to be wrapped in double quotes.
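The Linux side of redirection and pipes can be tried safely with a throwaway file. This sketch builds its own sample file (the contents are invented) and compares a case-sensitive search through a pipe with a case-insensitive search that reads the file directly.

```shell
#!/bin/sh
# Sketch: redirection and pipes on Linux, using a throwaway sample file.
workdir=$(mktemp -d)
sample="$workdir/samplefile.txt"

echo "nothing to see here" > "$sample"      # > creates or overwrites the file
echo "the Secret is 42"    >> "$sample"     # >> appends to the end of the file
echo "another secret line" >> "$sample"

# Pipe: cat's output becomes grep's input. This match is case-sensitive.
sensitive=$(cat "$sample" | grep -c secret)

# grep can read the file directly; -i ignores case, like find /I on Windows.
insensitive=$(grep -ci secret "$sample")

echo "case-sensitive matches:   $sensitive"
echo "case-insensitive matches: $insensitive"
rm -rf "$workdir"
```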

dcfldd for disk imaging



If you don’t have a USB write blocker remount the drive to prevent unintentional changes to the drive. This is good forensics. dcfldd is a part of Kali Linux but if you want to add it from source it is available here.

Find the installed drives on the system:

ls -l /dev | grep sd

grep image

/Dev output from grep

If you need more information about the drives to determine which is the source and which is the destination for the image:

fdisk -l

fdisk output

Output from fdisk

Change the drive permissions to read-only:

chmod 440 /dev/sdb

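The drive is now set to read-only for both owner and group; it is time to start the image. Keep in mind that root can still write to the device regardless of these permission bits, so a hardware write blocker remains the safer option. Imaging is a time-consuming process: dd takes a significant amount of time to copy the disk to an image.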
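As a rough sketch, the read-only step can be wrapped in a small helper. The function name is hypothetical, and it is demonstrated on a temporary file so it runs without root. For a real block device it also calls blockdev --setro (part of util-linux), which asks the kernel to refuse writes; that is an addition to the chmod step above, not something the original procedure used.

```shell
#!/bin/sh
# Sketch: mark a source read-only before imaging. Demonstrated on a temporary
# file so it can run without root; on a real engagement the target would be
# the source device (for example /dev/sdb in this article).
make_readonly() {
    target=$1
    chmod 440 "$target"                      # read-only for owner and group
    # For real block devices, also ask the kernel to refuse writes.
    # blockdev is part of util-linux and needs root.
    if [ -b "$target" ] && command -v blockdev >/dev/null 2>&1; then
        blockdev --setro "$target"
        blockdev --getro "$target"           # prints 1 when read-only
    fi
}

demo=$(mktemp)
make_readonly "$demo"
mode=$(stat -c '%a' "$demo")                 # GNU stat, as shipped with Kali
echo "mode is now $mode"
rm -f "$demo"
```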

An astute commenter mentioned that changing the block size to 4096 bytes dramatically increases the speed of the image, which is true. The default value is 512 bytes. Leaving the default at 512 bytes is a belt and suspenders approach: if a block is bad dcfldd writes 0’s into the image. Using the smaller and slower default option minimizes the potential data loss. On a known good drive increasing the block size may be appropriate depending on the circumstances. Where forensics are concerned, especially if the image may be used in a disciplinary or legal setting, it is best to err on the side of caution.

The hashwindow option hashes the image in fixed-size windows as it is read, so each portion of the image is individually hashed. This has the benefit of allowing a failed hash to invalidate only a portion of the total image. The option takes an input value in bytes; to use it, add hashwindow=1G to the command.

dcfldd if=/dev/sdb hash=sha1,md5 md5log=/image/md5.txt sha1log=/image/sha1.txt of=/image/toshiba.dd

Now you’re imaging.
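To see why per-window hashing limits the damage of a bad hash, the idea can be emulated with dd and md5sum on small test files. This is only an illustration with made-up sizes and names; dcfldd does the windowing itself when hashwindow= is set.

```shell
#!/bin/sh
# Sketch: emulate per-window hashing with standard tools to show why a bad
# window only invalidates part of an image. Sizes and names are illustrative;
# dcfldd does this itself when hashwindow= is set.
workdir=$(mktemp -d)
image="$workdir/image.dd"
copy="$workdir/copy.dd"
window=1048576                               # 1 MiB windows for the demo
head -c $((window * 4)) /dev/zero > "$image" # 4 MiB stand-in image
cp "$image" "$copy"

# Flip one byte inside the third window (index 2) of the copy.
printf 'X' | dd of="$copy" bs=1 seek=$((window * 2 + 10)) conv=notrunc 2>/dev/null

window_hash() {
    # usage: window_hash FILE INDEX
    dd if="$1" bs="$window" skip="$2" count=1 2>/dev/null | md5sum | cut -d' ' -f1
}

mismatched=""
i=0
while [ "$i" -lt 4 ]; do
    if [ "$(window_hash "$image" "$i")" != "$(window_hash "$copy" "$i")" ]; then
        mismatched="$mismatched $i"
    fi
    i=$((i + 1))
done
echo "windows that failed verification:$mismatched"
rm -rf "$workdir"
```

Only the window containing the changed byte fails; the other three still verify, which is the property that makes windowed hashes useful on large or flaky drives.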

One of the advantages of dcfldd is the built in hashing function. When the image is completed we can verify that the image is sound by checking the hash log and a separately generated hash.

Check the hash values created during the imaging process:

cat /image/md5.txt

cat /image/sha1.txt

To generate a separate hash for verification run the following and compare the outputs:

md5sum /dev/sdb

sha1sum /dev/sdb

Verify that the image file is good and that data hasn’t changed. You can use md5sum or sha1sum but they take forever to run. Dcfldd has a built in verification function.

dcfldd if=/dev/sdb vf=/image/toshiba.dd verifylog=/image/verifytoshiba.txt
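The manual comparison of a recorded hash against a freshly generated one can also be scripted. In this sketch temporary files stand in for the source drive, the image and the log, and the log holds a bare hash; a real dcfldd log has extra text around the hash, so treat the file format here as an assumption.

```shell
#!/bin/sh
# Sketch: compare a hash recorded at acquisition time with a freshly computed
# one. Temporary files stand in for the source drive, image and log.
workdir=$(mktemp -d)
source_dev="$workdir/source.bin"             # stand-in for /dev/sdb
image="$workdir/toshiba.dd"
recorded="$workdir/md5.txt"

head -c 524288 /dev/zero > "$source_dev"
cp "$source_dev" "$image"
md5sum "$source_dev" | cut -d' ' -f1 > "$recorded"   # hash noted while imaging

fresh=$(md5sum "$image" | cut -d' ' -f1)
if [ "$fresh" = "$(cat "$recorded")" ]; then
    result="match"
else
    result="MISMATCH"
fi
echo "image verification: $result"
rm -rf "$workdir"
```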

This is what the whole process looks like.

Complete Image Process

If the verification fails you will get a message similar to this:

Verification Fail



This article was edited on 7/20 to include great feedback that was received from reddit users as well as direct comments. The comment was published because of the email address used. If you want recognition please feel free to contact us and I’ll update it again. Thanks to F. Erensics and HackThe______ for providing insight.

Get Foca

Download Link: http://www.informatica64/foca.aspx

Enter an email address in the textbox at the bottom of the page labeled “Cuenta de correo”. A download link will be sent out; this link will expire.

Install from downloaded package.

Now What?

What are you trying to do?

  1. Create Project
    a. Click Project -> New project
    b. Enter the Project name, Domain name and Alternative domains. Alternative domains such as ftp.website.com or secondarywebsite.com are used by Foca during later analysis.
    c. Be aware that on a website with a large number of documents disk space can be consumed quickly. Take this into account when selecting the “Folder where save documents” location.
    d. Click the Create button

Figure 1: New Project Creation

    e. Save the .FOCA file. This file contains the project definitions and the information that will be found later, with the exception of documents, which will be stored in the location provided earlier.
  2. Network and Domains Information
    a. Select the Network icon. The buttons will have black text when active and the text will be grey when deselected. By default all searches are selected.
    b. Click Start. This can take a significant amount of time because of the dictionary search option. All of the names or IPs added when the project was created as domain website or alternative domains will be searched.
    c. Select the Domains icon. Information gathered about the domain will be in the center pane.

Figure 2: Domain Information

    d. When the Domains icon is selected, three main options are available to gather more information about the selected targets. This information is used later to gather metadata.
    e. Clicking the Technology Recognition button will identify the web server type in use (Apache, IIS, ColdFusion, etc.). See Figure 2 for an example.
    f. The Crawling button uses the Google and Bing search engines to list the known files and folders.

Figure 3: Search Engine Crawling

    g. The Files button crawls the target using Google, Bing and Exalead to find documents of the selected type. These files form the basis of the metadata analysis in Section 4.
  3. Vulnerability Enumeration
    a. Currently there are no examples available of vulnerabilities found using Foca to populate this section.
  4. Document Enumeration and Metadata Analysis
    a. Select the Metadata icon. The files found in step 2g will be listed. To ensure a complete list, select the Select all button.
    b. Download files for analysis. Select a subset of files and select Download from the right click menu. Alternatively right click any file and select Download All.
    c. Once all files are downloaded right click any file and select Extract All Metadata from the menu. When this completes select Analyze Metadata from the right click menu. This will populate information into the other sections of Foca.
    d. Examples of Metadata Usage
      i. The names found in the Users section create a list of accounts to brute force.
      ii. The Folders and Printers sections provide names of internal systems that can be targeted if a foothold is found externally.
      iii. The Software, Emails, and Operating Systems sections are the most useful when combined in a spear phishing attack. Knowing the specific operating system and software in use allows a very targeted exploit to be created. Combine this exploit with the specific information in a document from the user tied to the email address and a very effective phishing campaign should be possible.

DNS Enumeration Theory

Already a genius? Skip this part. The Domain Name System (DNS) is a distributed database containing the address for every domain on a network. When a browser requests a website by name, the local DNS cache is checked to see if a record already exists. If a record doesn’t exist, a DNS lookup is sent to the servers configured in the network settings. If the servers configured for this connection don’t have a cached record, they will forward the request up the hierarchy until either a server knows the address or one of the thirteen root DNS servers that are authoritative for the entire internet responds. Complicated, right? Computers love numbers, and people love words. DNS does the heavy lifting of converting between what people like and what computers like.
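One way to picture the hierarchy is to list the zones involved in resolving a single name, starting at the root. The sketch below only manipulates strings and sends no DNS queries; the host name and the function name are examples, not anything specific to this article.

```shell
#!/bin/sh
# Sketch: list the zones a resolver walks from the root down to a name.
# The host name is only an example.
zones_for() {
    echo "$1" | awk -F. '{
        zone = ""
        print "."                            # the root zone
        for (i = NF; i >= 1; i--) {
            zone = $i "." zone
            print zone
        }
    }'
}

chain=$(zones_for www.example.com | tr '\n' ' ')
echo "$chain"
```

Each entry in the chain is a place where a name server can either answer or refer the resolver one level further down, which is the part enumeration tools take advantage of.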
