Article describes Multics’ ability to have a file's directory entry on disk but ...

newhouseb · on Sept 9, 2020

Dropbox Smart Sync [1] is HSM using reparse points on Windows and kauth on MacOS. We prototyped using fanotify on Linux, but there were a number of edge cases around moving files and permissions that weren't comfortable with (if I recall correctly).

After we shipped HSM, Microsoft rebuilt the functionality into their Cloud Sync APIs which are used by OneDrive and others [2]. On MacOS, the File Provider APIs [3] provided similar functionality for Cocoa apps using Cocoa APIs but not for POSIX (which made it a no-go for us).

[1] https://www.dropbox.com/smart-sync

[2] https://docs.microsoft.com/en-us/windows/win32/cfapi/build-a...

[3] https://developer.apple.com/documentation/fileprovider

Source: I started these efforts at Dropbox (but left a couple years ago)

a1369209993 · on Sept 9, 2020

You could also implement this pretty easily in FUSE: if the dirent is present on the underlying (ext or whatever) filesystem, just forward the operations, otherwise leave the syscall blocked and hunt down the relevant backup. I don't know that anyone's actually written that, though.

acdha · on Sept 9, 2020

Nothing about this is “easily” once you work through the edge cases for performance and reliability. The only places which need HSM have enough data volume and range of applications to stress any simple solution (e.g. with the approach you outlined: what happens when someone runs find on that volume?). One of the more interesting challenges is how to deal with not having quite enough fast storage to batch the slow storage. Your system can appear to work well with one test workload and then fail miserably when two people start running different tasks at the same time.

After a couple decades of this, I generally think this class of software is a mistake. Any time you misrepresent one class of storage as another it inevitably leads to very complex software which is still pretty fragile and confuses its users on a regular basis, and the cost savings never deliver to the hoped-for degree.

a1369209993 · on Sept 9, 2020

Well obviously the performance is going to be terrible, but it's probably better than the zero performance you get if you block everything while waiting for the system to fully restore from backup.

> what happens when someone runs find on that volume?

It stalls until all the directories are restored? And hopefully pushes those directories to the front of the to-be-restored-from-backup queue, but even without that it's still better than not being able to run any operations on that volume.

acdha · on Sept 9, 2020

It can be worse than zero: your tape drive get hit with lots of small file requests, running much slower than it would be to stream a restore of a large batch containing all of the files you need, and causing increased failure rates on the hardware and media because tape drives are designed to stream, not seek. I’ve had to explain this to multiple HSM admin teams who were trying to save a few bucks on staging HDD capacity and surprised to see it taking over a month to restore a terabyte of data (not joking - and that was with multiple drives!) and hardware failing at like 5x the manufacturer’s estimates.

What you’re trying to do is akin to saying you can write an interface layer to make a railroad look like Uber: at some point the fundamental differences between the architectures are too much to paper over. The situation has improved now that the major operating systems have offline file support so you can make it more obvious that some files are not instantly available but you still need all of your client software to handle that gracefully.

gumby · on Sept 9, 2020

Except on Multics it was supposed to be not at the file level but at the segment (think block) level. In theory it was one set of segments not all accessible at the same speed, with “filesystem” just a thin layer grouping some sets of segments together.

I’m not sure that design actually survived to the real world; files seemed more coherent to me, but that could have been me projecting: I was pretty young back then.

skissane · on Sept 9, 2020

My understanding is that on Multics, segment = file.

Multics had a segmented memory model, much like segmented memory models on the 286 and 386 – indeed, Multics was one of the influences on the designers of the 286 and 386 – although newer x86 operating systems moved to flat memory model instead, so segmented memory only ever saw significant use on 16-bit versions of Windows and OS/2.

What made Multics unique was that all files were mmaped – opening a file gave you the ID of a memory segment, which you'd then use much as you'd use a segment selector on x86.

unused0 · on Sept 9, 2020

https://multicians.org/mgs.html#segment

segment User-visible subdivision of a process's address space, mapped onto a storage system file. Each segment is 1MB long, and has a zero address. The term "segment" is used interchangeably with "file" -- except not really: the things that are files in other systems are implemented as segments; also, the term "file" includes multi-segment files, and when talking in terms of COBOL, PL/I, or FORTRAN language runtime objects, one speaks of files. Programs are spoken of as stored in (procedure) segments. Correct use of the terms "file" and "segment" is a sure sign of a Multician.

edub · on Sept 9, 2020

It sounds similar to "selective sync" offered by Resilio Sync, which works on Win/Mac/Linux. I don't know how it is implemented though.