What am I looking at here ?
===========================

This section is a guide to the output of an AutoArchaeologist (AA)
excavation, and also an introduction to the underlying data model.

We will be using the Datamuseum.dk excavation of two hard-disk images
from the Commodore-900 computer:

    https://datamuseum.dk/aa/cbm900/

That page consists of the following parts::

    Decorative header.
    the top link
    the index links
    Table of top level artifacts

The decorative header is just that, a "letter head".

The top link is present on all pages in the excavation and will always
bring you back to this page, the top of the excavation, no matter how
deeply you have wandered into the artifacts or how lost you have gotten.

The index links take you to a sorted index of everything covered by
the current page. We will explain what that means shortly.

The table of top level artifacts contains the input files to this AA
excavation, in this case two files from the Datamuseum.dk BitArchive,
and it looks something like this::

    Artifact      Unique      Description
    ⟦eafc30061⟧   41/500      Bits:30001199 Commodore 900 hard disk image
                              Ar_file, UNIX file
    ⟦f27320a65⟧   1508/1967   Bits:30001972 Commodore 900 hard disk image with partial source code
                              Ar_file, UNIX file

The left "Artifact" column shows the unique name of each artifact in
the AA, recognizable by the "MATHEMATICAL LEFT/RIGHT WHITE SQUARE BRACKET"
as the Unicode consortium so poetically named them. You will see these
links all over the place, and clicking on one will take you to the
HTML page for that artifact.

The center "Unique" column shows two numbers. The second is how many
artifacts are derived from this top level artifact, and the first is how
many of those are present only on this top level artifact. In this case
the first top level artifact has 41 artifacts (out of 500) not found on
any other top level artifact, whereas the second has 1508 unique
artifacts (out of 1967). The links in this column take you to pages
with more details.
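The two numbers in the "Unique" column can be thought of as a set
computation. The following is only a minimal sketch of that idea, not
AA's actual implementation: it assumes each top level artifact has
already been reduced to the set of names of the artifacts derived from
it, and the function name ``unique_counts`` and the sample data are
hypothetical.

```python
def unique_counts(derived):
    """Map each top level artifact to a (unique, total) pair.

    `derived` maps a top level artifact name to the set of names of
    all artifacts derived from it (a hypothetical input format).
    """
    result = {}
    for top, artifacts in derived.items():
        # Collect everything derived from any *other* top level artifact.
        elsewhere = set()
        for other, other_artifacts in derived.items():
            if other != top:
                elsewhere |= other_artifacts
        # Unique = derived here and nowhere else; total = all derived here.
        result[top] = (len(artifacts - elsewhere), len(artifacts))
    return result


# Made-up example data: three floppy disks, two of them identical.
floppies = {
    "disk_a": {"aaa", "bbb", "ccc"},
    "disk_b": {"ccc", "ddd"},
    "disk_c": {"ccc", "ddd"},
}

for name, (unique, total) in unique_counts(floppies).items():
    print(f"{name}  {unique}/{total}")
```

In this made-up data, ``disk_b`` and ``disk_c`` both come out as 0/2,
since everything on them is also found on another disk.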
The "Unique" column is particularly useful when ingesting data media
in a collection, for instance a pile of floppy-disks, because it will
tell you which disks are just copies of each other or subsets of other
disks.

The "Description" column shows, on the first line, any information
provided to AA about the artifact. In this case that is information
from the BitArchive, but typically it is just the filename specified.
The second line shows some of the things AA found in that artifact.

Now, click on the ⟦eafc30061⟧ link, and you get a page for that
specific artifact, which has the following parts::

    Decorative header.
    the top link - the metrics link
    Header about the artifact
    One or more interpretations

The header lists the properties of the artifact. Some, such as the
length, are common to all artifacts, but others, such as the "ST506 Disk",
are annotations added during the excavation.

In this case it is just a very mechanical partitioning of the harddisk
into four equally sized partitions, which is not very interesting, so we
rapidly move further down by clicking the ⟦5ec4c54f2⟧ link and get
an artifact page with more parts.

The index-links on an artifact page cover only the artifacts found
inside that artifact. To also see the artifacts from the other partitions
or the other disk-image, click the "top" link and use the top level index.

The other new part is the "Derivation". We will get back to that in a
moment with a better example.

The only interpretation on this page is as a UNIX filesystem, shown as a
"namespace table" which looks a lot like the output of "ls -li".

At the bottom of the page is a "Full View" link, because there is
no point in loading very detailed "for further study" interpretations
unless people want to see them.

In this case, clicking "Full View" also shows an interpretation where
the UNIX filesystem is taken apart, data structure by data structure.

Never mind that for now. Instead, search for "fortunes" to find the
"usr/games/lib/fortunes" file and click on its ⟦5caa31010⟧ link.
The first thing to notice is that there is now also a "download" link,
which allows you to download this artifact.

The Derivation is much more interesting, because it illustrates the
fundamental data model in the AA::

    Derivation
    └─⟦eafc30061⟧ Bits:30001199 Commodore 900 hard disk image
        └─⟦8281d0872⟧ UNIX Filesystem
            └─⟦294235107⟧ »vol3.fd« UNIX Filesystem
                └─⟦this⟧ »usr/games/lib/fortunes«
    └─⟦f27320a65⟧ Bits:30001972 Commodore 900 hard disk image with partial source code
        └─⟦2d53db1df⟧ UNIX Filesystem
            └─⟦this⟧ »games/lib/fortunes«
    └─⟦eafc30061⟧ Bits:30001199 Commodore 900 hard disk image
        └─⟦5ec4c54f2⟧ UNIX Filesystem
            └─⟦this⟧ »usr/games/lib/fortunes«

This specific artifact, these precise 1338 bytes, has been found in
three different places in this excavation. The topmost one is the most
illustrative, and tells us that:

* In the toplevel artifact ⟦eafc30061⟧
* … there is a UNIX Filesystem ⟦8281d0872⟧ (in a partition)
* … which contains a file ⟦294235107⟧ called "vol3.fd" (a floppy disk image)
* … which contains another UNIX filesystem
* … which contains a file called "usr/games/lib/fortunes"
* … which contains this artifact.

First, note that AA had no trouble with a UNIX file containing another
UNIX filesystem.

Second, note that in this case the three filenames were nearly identical
(the second lacks the leading "usr/"), but they need not be, because the
identity of an artifact to the AA is the bytes contained in the artifact.

Or, if you want to get technical about it: the SHA256 hash of the bytes
in the artifact, because the name ⟦5caa31010⟧ is simply the first part
of the SHA256 hash of this artifact.

Under the Derivation, this artifact is interpreted as a "TextFile",
containing the somewhat limited wisdom of the fortune(7) command on
the "Coherent" UNIX clone from Mark Williams.

And that completes our tour of the AA output.

If you want to, you can visit some of the other public AA excavations
from Datamuseum.dk here:

    https://datamuseum.dk/aa

Yes, I know it is in Danish, but I'm sure you get the gist :-)
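To make the identity rule described above concrete, here is a minimal
sketch of content-based naming. It is not AA's actual code: the helper
names ``artifact_name`` and ``bracketed``, the sample file contents, and
the choice of nine lowercase hex digits (picked only to match the length
of the names shown in this guide) are all assumptions.

```python
import hashlib


def artifact_name(data: bytes, length: int = 9) -> str:
    """Return the first `length` hex digits of the SHA256 of `data`.

    The default of 9 digits is an assumption based on the names
    shown in this guide.
    """
    return hashlib.sha256(data).hexdigest()[:length]


def bracketed(name: str) -> str:
    """Wrap a name in MATHEMATICAL LEFT/RIGHT WHITE SQUARE BRACKETs."""
    return "\u27e6" + name + "\u27e7"


# Made-up files: two paths holding the same bytes, one holding others.
files = {
    "usr/games/lib/fortunes": b"Example fortune text\n",
    "games/lib/fortunes": b"Example fortune text\n",
    "etc/motd": b"Something else\n",
}

# Group paths by artifact name: identical bytes collapse into one
# artifact, no matter what the file was called.
by_name = {}
for path, data in files.items():
    by_name.setdefault(artifact_name(data), []).append(path)

for name, paths in by_name.items():
    print(bracketed(name), paths)
```

With this made-up data the three paths collapse into two artifacts,
because two of the files contain exactly the same bytes.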