NTFS: Sometimes accurate file times are not in $FILE_NAME but in $STANDARD_INFORMATION

Project: JNode FS
Component: Code
Category: bug report
Priority: normal
Assigned: Daniel Noll
Status: closed
Description

NTFS has two attributes, $STANDARD_INFORMATION and $FILE_NAME, each of which contains all four FS timestamps (created, modified, changed, accessed). OSes are expected to update both, but evidently some versions of Windows (not Vista, I have checked) update only the times in $STANDARD_INFORMATION.

JNode's driver uses only the values from $FILE_NAME, which leads to misleading numbers even though it is reporting the times actually present in the disk image.

Sadly, I have found further information suggesting that some systems update $FILE_NAME and not $STANDARD_INFORMATION, so I figure it's possible for either one to be correct depending on who last touched the file record. Because of this, I'm proposing that the driver reads both and takes the most recent to be the most accurate.
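A minimal sketch of the "take the most recent" idea, not the attached patch itself. The class and method names are hypothetical, and it assumes both values have already been read as raw timestamps in the same unit, so that a plain numeric comparison orders them chronologically.

```java
/** Hypothetical helper illustrating the "most recent wins" proposal. */
final class NtfsTimeMerge {
    private NtfsTimeMerge() {}

    /**
     * Returns the later of the two raw timestamps.
     * Assumption: both values use the same unit and epoch and are
     * non-negative when held in a signed long.
     */
    static long mostRecent(long standardInfoTime, long fileNameTime) {
        return Math.max(standardInfoTime, fileNameTime);
    }
}
```

The same call would be applied to each of the four timestamps in turn, e.g. `NtfsTimeMerge.mostRecent(siModified, fnModified)`.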

Patch in progress.

#1

Status:active» patch (code needs review)

Attaching proposed patch.

It is going through simultaneous code review here to find any typos and silly errors, but it's a relatively simple change. The only downsides are the additional time to scan for one more attribute and the need to read the file time from two places instead of one. Then again, the buffer it's reading from is actually a byte[] in memory, so it's probably not a problem.

Probably.

Attachment               Size
ntfs-file-times.patch    6.76 KB

#2

Updated. A Javadoc comment was inaccurate; this was found by our internal review.

Attachment               Size
ntfs-file-times.patch    6.76 KB

#3

Thank you, Daniel. I committed the patch.

#4

There is one worry remaining in my mind. My patch assumes the most recent timestamp is always correct, but is this really OK for the Created timestamp?

Some heavier testing (finally found a disk image with the problem... coincidentally it's one on which I'm investigating another supposed problem) is starting to show that Windows is incorrectly overwriting the value in $FILE_NAME to be the same as the MFT Change time, while the one in $STANDARD_INFORMATION is left untouched and shows what is presumably the correct Created time.

So I'm starting to think that perhaps just for Created, it should be Math.min instead of Math.max.

The original reason I worried about using Math.min is because if Windows is forgetting to update $FILE_NAME, maybe it forgets to set it in the first place which would leave it as a junk value. But maybe that's nothing to worry about?
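The Created-only variant floated in this comment could be sketched roughly as follows. This is an illustration under stated assumptions, not the committed code: `TimestampSet` and `merge` are hypothetical names, and the junk-value concern raised above is deliberately not handled.

```java
/** Hypothetical holder for the four NTFS timestamps from one attribute. */
final class TimestampSet {
    final long created;
    final long modified;
    final long mftChanged;
    final long accessed;

    TimestampSet(long created, long modified, long mftChanged, long accessed) {
        this.created = created;
        this.modified = modified;
        this.mftChanged = mftChanged;
        this.accessed = accessed;
    }

    /**
     * Combines the sets read from $STANDARD_INFORMATION and $FILE_NAME:
     * the earlier value wins for Created, the later value wins for the rest.
     * No attempt is made to detect an uninitialised (junk) value.
     */
    static TimestampSet merge(TimestampSet standardInfo, TimestampSet fileName) {
        return new TimestampSet(
                Math.min(standardInfo.created, fileName.created),
                Math.max(standardInfo.modified, fileName.modified),
                Math.max(standardInfo.mftChanged, fileName.mftChanged),
                Math.max(standardInfo.accessed, fileName.accessed));
    }
}
```

With this rule, a $FILE_NAME Created value that has been overwritten with the later MFT Change time loses to the older value kept in $STANDARD_INFORMATION.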

#5

It looks like there are situations where, if you copy a file, it keeps its modified time but the created time is updated. So it's possible for created to be later than modified, and this might make it impossible to just switch to Math.min for this. :(

Nonetheless Math.min is giving reasonable results so far...

#6

Is it not possible that the way these attributes are used depends on some kind of version number or meta attribute characterizing the whole file system? What you write looks quite scary, and I wonder how all those Windows users and Microsoft itself handle this problem in countless mission-critical, real-world use cases. Is there any information about this on the net?

#7

Windows itself uses only $STANDARD_INFORMATION for anything accessed via its API. Whether the API is reading or writing, it uses that set of stamps. I've investigated a couple of forensic tools and the values they show are also from $STANDARD_INFORMATION. (This is despite the fact that users can trivially mess with the data, which shows how seriously most forensics software developers treat timestamps.)

$FILE_NAME is supposed to be a store of the *actual* timestamps, written by the driver, which the user can't directly mess with (though if they're admin they can obviously write into the MFT... let's ignore that.) So if you extract a file from a zip file, the stamps in $FILE_NAME will show the current date and time, whereas $STANDARD_INFORMATION's modification date will be overwritten.

The original author of JNode's driver probably used the fact that the user can tamper with $STANDARD_INFORMATION as a reason not to use it. And this would be fine, *but* Windows XP and whatever other versions have that problem where $FILE_NAME isn't being updated, or where, when it does get updated, all fields are updated instead of just the one that is supposed to change. :(

The resource I've been using (a book) says that when $FILE_NAME lies, it tends to be the case that all four timestamps are the same. Investigation today has shown that this is *not* the case, and has additionally shown that when it is out of whack, it's impossible to determine Created and Modified reliably.

I'm starting to lean towards tossing away the Math.min/max idea entirely, and just reading them only from $STANDARD_INFORMATION. Even if it won't result in the most accurate timestamp, everybody else uses that one anyway so it will reflect what users actually see, even if it's wrong.
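For illustration, reading the four stamps from $STANDARD_INFORMATION alone might look like the sketch below. The class name and constructor are hypothetical and are not JNode's API. It assumes `contentOffset` points at the start of the attribute's content inside an in-memory byte[], and relies on the commonly documented layout: four consecutive 8-byte little-endian values (created, modified, MFT changed, accessed), each counting 100 ns ticks since 1601-01-01 UTC.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical reader for the timestamps in $STANDARD_INFORMATION. */
final class StandardInfoTimes {
    /** Number of 100 ns ticks between 1601-01-01 and 1970-01-01 (UTC). */
    private static final long EPOCH_DIFF_TICKS = 116_444_736_000_000_000L;

    final long createdMillis;
    final long modifiedMillis;
    final long mftChangedMillis;
    final long accessedMillis;

    /** contentOffset is assumed to be the start of the attribute content. */
    StandardInfoTimes(byte[] buffer, int contentOffset) {
        ByteBuffer bb = ByteBuffer.wrap(buffer).order(ByteOrder.LITTLE_ENDIAN);
        createdMillis = toMillis(bb.getLong(contentOffset));
        modifiedMillis = toMillis(bb.getLong(contentOffset + 0x08));
        mftChangedMillis = toMillis(bb.getLong(contentOffset + 0x10));
        accessedMillis = toMillis(bb.getLong(contentOffset + 0x18));
    }

    /** Converts NTFS 100 ns ticks to milliseconds since the Unix epoch. */
    static long toMillis(long fileTime) {
        return Math.floorDiv(fileTime - EPOCH_DIFF_TICKS, 10_000L);
    }
}
```

The resulting values can be passed to `new java.util.Date(millis)` for display. Nothing here consults $FILE_NAME, which matches the "report what Windows reports" approach described above.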

#8

Along these lines, a patch which takes the approach of using only $STANDARD_INFORMATION, against current trunk.

It looks like we're going to give up on figuring out which is the more accurate timestamp, and just sit with the ones Windows is using for the API side.

Attachment                 Size
ntfs-file-times-2.patch    1.83 KB

#9

Committed.

#10

Status:patch (code needs review)» fixed

Marking fixed as Levente committed the code.

#11

Status:fixed» closed

Automatically closed -- issue fixed for two weeks with no activity.

#13

Manually closed