Synopsis
MediaKeg is a lightweight solution for importing digital photos, video, and audio. It is powerful and flexible enough to support professional content creators who want to improve their workflow, and easy enough for everyday users who wish to organize or archive their family photos.
This document provides an in-depth view of MediaKeg's features and capabilities, which can help in evaluating if MediaKeg is right for you. For advanced users, this document also discusses concepts and provides details on how to customize MediaKeg. For new users wanting to get up and running with MediaKeg for the first time, refer to the Quickstart Guide instead.
Overview
MediaKeg is a lightweight solution for importing multimedia content, such as digital photos, video, and audio. The term import is often used by applications that process and manage multimedia content. As such, the term's strict meaning is application-specific. However, in virtually all cases, import involves some form of copying files from a source device (such as an SD card, camera, or file share) to an application-managed folder or enclave. The import operation also often involves updating proprietary databases and catalog files, and can even include non-ingestion work such as generating preview images needed by the application.
With MediaKeg, the scope of an import operation starts and ends with ingesting files into a MediaKeg library, which is an ordinary file system folder containing only a small configuration file at its root to distinguish it from a non-library folder. This narrow, single-function scope makes MediaKeg highly efficient at importing large volumes of content and also helps improve the day-to-day workflow of photographers and videographers who regularly ingest new content onto their workstations.
By default, MediaKeg organizes and names imported media assets according to the date and time they were captured or created. The date and time information is read from the metadata embedded in each file. Consider the following file listing:
/Volumes/Sample/Files
├── Capture0001-Edit.jpg
├── Capture0001.jpg
├── Capture0001.nef
└── Capture0001.xmp
The file list is flat (lacks an organizational hierarchy), and the filenames are weak, meaning there is no way to tell whether a different set of files with the same names reflects the same scenes. For that matter, there's no way to be sure (based on the filenames) that the files belong to the same scene, although in this case there's a good chance they do because they appear side by side in the same folder and have similar filenames.
Note: For now, think of a scene as what the camera saw at the time the digital image was captured. The topic of scenes is covered in detail later in this document.
Now, consider the following file listing, which shows how the same files might be organized and named upon being imported into a library having the default configuration:
/Volumes/Photos/Libraries/Personal
└── 2018
    └── 02
        ├── 20180221T211619-S570000-5ZMGA-00.JPG
        ├── 20180221T211619-S570000-5ZMGA-00.NEF
        ├── 20180221T211619-S570000-5ZMGA-00.xmp
        └── 20180221T211619-S570000-5ZMGA-01.JPG
As can be seen, the files are organized by year and month, and their names contain embedded information that seems to follow a convention. The year and month information corresponds to the date the images were originally captured. The embedded information does, in fact, follow the MediaKeg filename convention, which this document covers in detail.
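The year/month layout and the leading timestamp in the filenames above can be sketched in a few lines of Python. This is only an illustration of the idea, not MediaKeg's implementation: the function names are hypothetical, and the remaining filename fields (such as S570000 and 5ZMGA) are deliberately left out because they are covered by the filename convention rather than by this overview.

```python
from datetime import datetime
from pathlib import PurePosixPath


def default_collection(captured: datetime) -> PurePosixPath:
    """Return the year/month folder for a capture timestamp (hypothetical helper)."""
    return PurePosixPath(f"{captured:%Y}") / f"{captured:%m}"


def timestamp_prefix(captured: datetime) -> str:
    """Return the leading date-time portion seen in the example filenames."""
    return captured.strftime("%Y%m%dT%H%M%S")


# Capture time taken from the example filenames (21 February 2018, 21:16:19).
captured = datetime(2018, 2, 21, 21, 16, 19)
print(default_collection(captured))  # 2018/02
print(timestamp_prefix(captured))    # 20180221T211619
```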
MediaKeg also provides the ability to customize how imported assets are named and organized using library templates. Library templates contain tokens that are replaced with like-named metadata tag values during an import operation.
MediaKeg supports any multimedia asset having an audio, image, or video MIME type, provided the asset contains the required metadata information. MediaKeg also has several advanced features and capabilities that provide a high degree of control, flexibility, and correctness over how assets are imported. The following list highlights these features and capabilities:
- Stateless operation
- Cross-platform support (Linux, macOS, Windows)
- Command-line Interface (CLI)
- Flexible templates
- Open and compatible
- Deterministic
- Scene aware
- Duplicate prevention
- Error detection
- Write metadata option
- Timeshift option
- Rollback option
- Detailed logging and reports
- Parallel processing
- Utilities
The remainder of this section covers each feature and capability in detail.
Stateless Operation
MediaKeg functions solely at the file system level and requires no databases or catalogs for maintaining state over successive import operations. The stateless design was a key goal in the development of MediaKeg because it brings several inherent benefits, as follows:
- Library portability
- Minimal maintenance tasks
- Interoperability with other applications
- Simplified installation and removal
- Easier standalone and cross-platform operation
The tradeoffs are:
- Less flexibility over how imported library assets are named
- Fewer opportunities to optimize performance (as compared to a stateful design)
To enable high performance with a stateless design, MediaKeg enforces rules over how imported media assets are named. These rules are covered in the Library Layout and Filenames section, and they result in a pattern that MediaKeg leverages for optimizing import performance.
Note: The file naming rules do not preclude the user from customizing how imported assets are named.
Cross-platform Support
MediaKeg is supported on the following OS platforms as a desktop application:
- Linux
- macOS
- Windows
The MediaKeg runtime library is currently available with a Command-line Interface (CLI), with plans to make a graphical user interface (GUI) available in the future.
Command-line Interface
The Command-line Interface (CLI) provides full access to the MediaKeg runtime library and is an efficient means of executing commands. The following example illustrates the efficiency with which an import operation can be executed using the CLI:
$ mkeg i /Volumes/DCIM
Note: This example assumes the user has already set up a default library and that the assets to import are located in the folder shown (and not a subfolder).
The MediaKeg CLI makes it quick and easy to ingest assets to a workstation or server without needing to launch a heavyweight application that combines ingestion with other import tasks.
See the MediaKeg CLI Reference Guide for more information.
Flexible Templates
MediaKeg provides control over the layout (organization) and filenames of imported media assets using layout and filename templates, respectively. These templates contain tokens, which MediaKeg exchanges for Exif metadata (or user-supplied values) at import to generate asset library paths and filenames.
For added flexibility over how imported assets are organized and named, template tokens can be static or dynamic. Static tokens always receive a replacement value even if their respective backing properties are unavailable, in which case default values are used in place of actual metadata. Conversely, a dynamic token is ignored if its backing property is unavailable.
MediaKeg templates also support user tokens, which receive their backing values in the form of user input to an import operation. This capability provides an additional degree of freedom for how assets are organized and named, and can be used as an alternative to multiple libraries when wanting to separate assets according to user-defined criteria.
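The difference between static, dynamic, and user tokens can be sketched as follows. The token syntax here is an assumption made purely for illustration ({name} for a static token, {name?} for a dynamic one); MediaKeg's actual template syntax is documented elsewhere, and the expand function and its parameters are hypothetical.

```python
import re

# Hypothetical syntax: {name} is a static token, {name?} is a dynamic token.
TOKEN = re.compile(r"\{(?P<name>\w+)(?P<dynamic>\?)?\}")


def expand(template: str, values: dict, defaults: dict) -> str:
    """Replace each token with its backing value, a default, or nothing."""
    def replace(match: re.Match) -> str:
        name = match["name"]
        if name in values:
            return values[name]      # backing property is available
        if match["dynamic"]:
            return ""                # dynamic token: ignored when unavailable
        return defaults[name]        # static token: always gets a value
    return TOKEN.sub(replace, template)


metadata = {"year": "2018", "month": "07"}   # no camera model available
defaults = {"year": "0000", "month": "00", "model": "unknown"}

print(expand("{year}/{month}/{model}", metadata, defaults))   # 2018/07/unknown
print(expand("{year}{month}{model?}", metadata, defaults))    # 201807

# A user token is backed by user input rather than by asset metadata.
user_input = {"project": "wedding"}
print(expand("{project}/{year}/{month}", {**metadata, **user_input}, defaults))
# wedding/2018/07
```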
Open and Compatible
There is an abundance of applications available for processing and managing multimedia content, each having its own import solution. It's also common for artists, especially photographers, to work with multiple similar applications where each has a unique trait or capability that makes it optimal for a particular task. MediaKeg is not a replacement for these applications. Instead, MediaKeg complements such applications by providing a consistent cross-platform and cross-application solution for how assets are ingested and structured onto a workstation (or server).
Because MediaKeg libraries are ordinary file system folders, there are no compatibility or interoperability issues when accessed from other applications. Moreover, it's expected that such applications will create or add foreign files and assets to a MediaKeg library. The term foreign is used when referring to files copied to or created in a MediaKeg library by some external means. MediaKeg does not overwrite library content and incorporates sub-indexing into its file naming scheme to avoid filename collisions. Therefore, foreign files are not at risk of being overwritten.
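The way a subindex avoids filename collisions can be sketched as a search for the first unused name. This is a minimal illustration, assuming a two-digit subindex appended to the name as in the example listing earlier in this document; next_free_path is a hypothetical helper, not a MediaKeg API, and a real implementation would also need to guard against two writers choosing the same name at once.

```python
import tempfile
from pathlib import Path


def next_free_path(directory: Path, scene_name: str, extension: str) -> Path:
    """Return the first '<scene>-NN<ext>' path that does not exist yet."""
    for subindex in range(100):  # two-digit subindex, as in the examples
        candidate = directory / f"{scene_name}-{subindex:02d}{extension}"
        if not candidate.exists():
            return candidate
    raise FileExistsError(f"no free subindex left for {scene_name}{extension}")


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    first = next_free_path(folder, "20180221T211619-S570000-5ZMGA", ".JPG")
    first.touch()  # simulate the first asset being imported
    second = next_free_path(folder, "20180221T211619-S570000-5ZMGA", ".JPG")
    print(first.name)   # 20180221T211619-S570000-5ZMGA-00.JPG
    print(second.name)  # 20180221T211619-S570000-5ZMGA-01.JPG
```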
MediaKeg does not eliminate any additional steps that a third-party application performs as part of its own import process. Hence, the user may need to reimport the files he or she wants to work with into that application's workspace. A benefit of inserting MediaKeg into the workflow is that the user can ingest the files onto their workstation without needing to commit to a particular application. Once the files are ingested, the application can pick up where MediaKeg left off instead of also needing to copy the files.
Deterministic
MediaKeg prioritizes correctness over other factors when importing assets, which can lead to some assets being quarantined or an import operation being aborted. The quarantine folder is a special library folder that separates indeterminate assets from the rest of the library, so they don't contaminate the main population with misinformation. This approach also prevents such assets from being left behind at the source. Once corrected or mitigated, attempts to reimport the quarantined assets can be made.
Some of the approaches that MediaKeg uses to ensure correctness are as follows:
- An asset must contain metadata that indicates its date and time of capture or creation. If not, the asset is considered indeterminate and is quarantined. Using the file last modified date as a fallback could make for a more friction-free import experience; however, this is not a reliable source of such information and is therefore not used.
- An asset must contain metadata indicating its MIME type and must be of type application, audio, image, or video. If not, the asset is considered indeterminate and is quarantined, even if the asset file extension matches a known media type.
- All assets are checksum verified using an MD5 hash algorithm (default) to ensure no bit errors occurred in the process of copying a file from source to library destination. The digest value is also used for duplicate detection. This is an example of how MediaKeg prioritizes correctness over added performance.
- When using the option to write or update metadata values to import assets, MediaKeg stages the write operations and verifies they have been written correctly before moving them to their final library destination. This comes at a significant performance penalty and is another example of how MediaKeg prioritizes correctness over added performance.
MediaKeg provides options to relax some import requirements at the user's discretion, but is strict by default.
See the MediaKeg CLI Reference Guide for more information.
Scene Aware
Imagine that you have two photos of the same scene — an original and an edited version of the original — and the edit has metadata removed for privacy and sharing purposes. Since MediaKeg relies on metadata for determining how imported assets are organized and named, the edit could become separated from the original if the missing metadata is referenced by the library templates. MediaKeg addresses this problem by making the import process scene aware.
By making import scene aware, MediaKeg can help ensure that assets originating from the same capture or creation (a scene) are organized and named similarly after being imported. This feature is particularly useful when deep scanning multiple drives for assets to organize into a single library, where it is not unusual for various edits (variants) of an original to exist. For photographers shooting RAW+JPEG, variants exist straight out of the camera, before the editing and sharing process even begins. In advanced scenarios, where the library template includes a custom metadata tag specific to a RAW file, the scene aware feature ensures the JPEG is organized and named in a manner consistent with the RAW file.
See the Advanced Concepts for more information about scenes.
Duplicate Prevention
Duplicate prevention helps prevent two or more identical files from being imported into the same library. The algorithm used to detect duplicates is highly efficient and remains performant even as a library grows very large under normal use-case scenarios.
One of the scenarios where duplicate prevention plays an important role is in finding and reorganizing all of one's photos into a single library. Imagine that you've accumulated years of photos spread out over multiple drives and it's time to get organized, or that you want to be sure not to lose any photos before sending the drives off to be recycled, or both. Without the proper tools, this is a painstaking process to complete without losing photos or amassing lots of duplicates by copying folders instead of files, especially if you're the type of person who likes to create multiple backups.
The scenario just described is one that MediaKeg was specifically developed to handle in a thorough yet performant manner. Simply point MediaKeg at the root of each drive to import photos and it takes care of scanning for photos (and other multimedia assets if desired) and importing them into a single library, leaving behind any duplicates. Any copies of an original file that have been subsequently edited do not count as duplicates. Rather, only files having identical file digests (a form of file signature or fingerprint) are considered duplicates. By default, MediaKeg uses the MD5 algorithm for calculating file digests.
Note: MediaKeg will not overwrite a previously imported asset with another asset resolving to the same library path and filename. Naming conflicts could occur if two or more edited versions of an asset having the same file extension are imported because they are not duplicates but have the same metadata. MediaKeg incorporates a file subindex into its file naming scheme to prevent such collisions.
Duplicate prevention also plays a role in the workflow of ingesting multimedia content from removable storage (SD card, CFast card, etc.) to a workstation or server. Unless the removable storage device is reformatted prior to reuse, the possibility of accumulating duplicates or overwriting a previously imported and edited file exists without some form of duplicate detection and prevention strategy.
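The idea of treating only byte-identical files as duplicates can be illustrated with Python's hashlib module. The sketch below is not MediaKeg's algorithm: it keeps the digests it has seen in an in-memory set, whereas MediaKeg operates without stored state, and the helper names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unique_files(paths):
    """Yield each path whose content has not been seen before."""
    seen = set()
    for path in paths:
        digest = file_digest(path)
        if digest in seen:
            continue          # identical content: leave the duplicate behind
        seen.add(digest)
        yield path


with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp, "a.jpg")
    backup = Path(tmp, "b.jpg")      # byte-for-byte copy of the original
    edited = Path(tmp, "c.jpg")      # edited version, so not a duplicate
    original.write_bytes(b"same pixels")
    backup.write_bytes(b"same pixels")
    edited.write_bytes(b"edited pixels")
    kept = [p.name for p in unique_files([original, backup, edited])]

print(kept)  # ['a.jpg', 'c.jpg']
```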
Error Detection
Data corruption is rare in modern computing devices, but such errors can occur if a storage device is starting to fail or when uncorrected memory errors occur. The probability of hitting such an error increases when copying large volumes of data, such as when importing to and backing up multimedia libraries. Most consumer PCs and laptops lack the error-correcting code (ECC) memory that protects against the latter, which is typically reserved for high-end professional workstations due to its added cost.
To help guard against data corruption of imported media assets, MediaKeg recalculates the MD5 digest for each imported asset after the file copy operation is complete and compares it to the source digest. If the digests do not match, MediaKeg automatically re-attempts to copy the file (up to three times). Multiple failed attempts (where the copy operation was allowed but the digests do not match) most likely indicate a hardware issue with the source or destination device. MediaKeg warns and maintains a log of such events for diagnostic purposes.
Note: MediaKeg ensures that the imported file matches the source but does not prevent a corrupted source file from being imported, unless the source file is so badly corrupted that a metadata read error occurs. In that case, the asset is quarantined in a special folder so it is not left behind while also not being added to the general population.
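The copy, verify, and retry cycle described above can be sketched as follows. This is an illustration under stated assumptions rather than MediaKeg's code: copy_verified and md5_of are hypothetical names, and the sketch treats three as the total number of copy attempts, which is one possible reading of "up to three times".

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

MAX_ATTEMPTS = 3  # assumption: three copy attempts in total


def md5_of(path: Path) -> str:
    """Return the MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(source: Path, destination: Path) -> bool:
    """Copy a file and confirm the destination digest matches the source."""
    expected = md5_of(source)
    for _attempt in range(MAX_ATTEMPTS):
        shutil.copyfile(source, destination)
        if md5_of(destination) == expected:
            return True       # digests match: the copy is trusted
    return False              # repeated mismatch: likely a hardware issue


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "source.nef")
    source.write_bytes(b"raw sensor data")
    copied_ok = copy_verified(source, Path(tmp, "imported.nef"))

print(copied_ok)  # True
```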
Write Metadata Option
MediaKeg provides the option to write new or updated metadata to media assets as they are being imported. The write values are applied to all assets processed for the import session, so this should be kept in mind when considering this option. For example, you probably wouldn't want to update the camera make and model information for a lifetime's worth of digital photos in a single import session. On the other hand, adding artist, copyright, or keyword information could make sense depending on how the source files are organized.
MediaKeg inserts an intermediary staging step into the import pipeline when the write option is invoked to accomplish the following:
- To perform duplicate detection on the modified binary instead of the original source.
- To ensure all assets accept the update values prior to importing them into the target library.
Note: The staging folder is located at the root of the target library by default and is automatically removed at the end of an import operation. The import command provides the option to specify an alternate path.
Because updating metadata requires assets to be modified, duplicate detection operates on the modified binary instead of the original source. This guards against the same asset slipping past the duplicate detection logic if reimported using the same write values. If the original asset is later reimported with different write values (or as-is), then it is not a duplicate of its previously updated self and will be imported.
By default, MediaKeg aborts an import operation if any asset rejects a write operation (or if the write value did not take for some reason). The user can override the default behavior to be best effort, where the import operation is allowed to continue even if a write operation did not take on one or more assets. This might be necessary if the import collection spans multiple file formats, because the set of supported metadata tags, and whether each is writable, varies by file format.
When expanding templates to determine asset library paths and filenames, the write value is used in place of the original value when referenced by a template token. Overriding the default import behavior to best effort is therefore ill-advised in such cases. Doing so could lead to inconsistencies in how some assets are organized and named if reimported into another library, unless the user can somehow manage to retrace the import operations of the existing library when reimporting.
Note: When using the write option to update asset metadata, a copy of the original is kept in a folder at the root of the target library. The import command provides the option to specify an alternate path. This feature is not intended to be a substitute for a proper backup solution and applies only to assets imported using the write option.
Note: When using the write option to update asset metadata, error detection verifies assets are copied correctly from source to stage and then from stage to final library destination. However, error detection does not check for any data corruption errors that may have occurred during the write operation itself. Therefore, it's important to always keep a backup of the original asset.
Note: MediaKeg uses ExifTool for all metadata read and write operations. ExifTool is a widely-used and trusted application within the photographic community, and its open source library is trusted and used by several other third-party applications.
See also: MediaKeg CLI Reference
- --keywords, --timeshift, --timezone options
- /w option
Timeshift Option
Imagine that you've just purchased a new camera or traveled to a different region of the world without setting or resetting the camera date and time. When the time comes to review or import your photos, it becomes apparent that the photos you took yesterday are recorded as having been taken several days, weeks, or even years ago, or the family photo you took on a sunny beach elsewhere in the world is recorded as being captured at midnight. This is probably not the desired result if you care about such things and is a problem the timeshift option addresses.
The timeshift option is a special write metadata option with extra safeguards to protect against unintended consequences. Specifically, it updates date and time metadata indicating when an asset was captured or created by a time duration offset. MediaKeg does not allow directly setting date and time values because import typically involves processing several assets, and it's atypical that all assets in an import collection would be captured or created at the same instant in time.
As an additional safeguard, the timeshift option also requires that all assets from an import collection originate from the same device. The reasoning is that it's unlikely images captured from two or more different devices would need to be offset by the same duration value.
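The two safeguards described above, shifting by a duration rather than setting an absolute value and requiring a single source device, can be sketched as follows. The function and the (device, captured) tuple layout are hypothetical and only illustrate the rule; they do not reflect how MediaKeg represents assets internally.

```python
from datetime import datetime, timedelta


def timeshift(assets, offset: timedelta):
    """Shift capture times by a fixed offset.

    `assets` is a list of (device, captured) tuples. All assets must come
    from the same device, mirroring the safeguard described above.
    """
    devices = {device for device, _ in assets}
    if len(devices) > 1:
        raise ValueError(f"assets originate from multiple devices: {sorted(devices)}")
    return [(device, captured + offset) for device, captured in assets]


# A camera whose clock was left nine hours behind local time.
collection = [
    ("D850", datetime(2018, 2, 21, 0, 5, 0)),
    ("D850", datetime(2018, 2, 21, 0, 7, 30)),
]
shifted = timeshift(collection, timedelta(hours=9))
print(shifted[0][1])  # 2018-02-21 09:05:00
print(shifted[1][1])  # 2018-02-21 09:07:30
```

Because every asset moves by the same duration, the relative spacing between captures is preserved.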
See also: MediaKeg CLI Reference
- --timeshift option
Rollback Option
MediaKeg provides the ability to roll back (undo) import operations. Rollback removes all assets previously imported into a library for a specified import operation. The typical use case is immediately after an import, when the user realizes he or she selected the wrong source or targeted the wrong library. However, rollback can be applied to any previous import operation on a library.
Note: The rollback option depends on log files that MediaKeg creates for each import operation, which are saved under the target library. This is the one exception where MediaKeg requires saved state to carry out an operation. Should the logs directory be deleted, import will continue to function normally but the user will lose the ability to roll back all previous import operations.
See also: MediaKeg CLI Reference
- rollback command
Detailed Logging and Reports
MediaKeg writes detailed logs for all import operations. With the exception of rolling back previous import operations, the logs are not required for MediaKeg to function. In addition to enabling rollback, logs can assist with the following scenarios:
- Tracing library assets back to their original sources and filenames.
- Reviewing profile data for optimizing import performance.
- Reviewing warning and error details associated with an import operation.
- Running scripts for auditing changes to library assets since imported.
In general, log data consumes a very small percentage of storage relative to imported media assets. If desired, logs can be safely deleted at any time to recover storage; however, this should only be done once you are certain there is no future need for the log data or for the ability to roll back a prior import operation.
Parallel Processing
MediaKeg parallelizes import workloads across multiple processes to help reduce the total amount of time needed to complete a job. By default, the total number of processes scales with the total number of processor cores available on the host computer. For most systems, maximum throughput is gated by storage performance so there are limits to how much multiprocessing can help.
MediaKeg logs simple profiler information that can assist in finding bottlenecks and fine-tuning performance. The total number of processes assigned to an import operation can be set globally or tuned for file read (metadata), digest, and copy operations.
See also: Performance Tuning
Utilities
MediaKeg provides several helpful utility commands in addition to the import command, such as:
- Library management
- Exif metadata viewer
- Fast file find with regular expression matching
- Timeshift calculator
See the MediaKeg CLI Reference Guide for the complete list of CLI commands and usage details.
Library Structure
A library consists of files and directories categorized as follows:
- Library configuration
- Resident assets
- Auxiliary files
- Foreign assets and files
Library configuration
A library has a hidden configuration file named .keg at its root, as shown here:
library
├── .keg
This file contains all of the information necessary to complete an import operation. This approach makes libraries portable, meaning it is possible to rename or move them without incurring additional maintenance tasks to resume import operations.
Resident assets
Resident assets are assets that were successfully imported into one or more library collections. A collection is a subdirectory of the library whose relative path is expanded using a layout template. The default layout template organizes assets according to the year and month captured or created. If the assets were captured in July 2018, for example, they appear under the following subdirectory:
library
├── 2018
│   └── 07
│       └── (assets)    # "2018/07" collection
The collection name assumes the same name as its path, which is "2018/07" in the preceding example.
Note: An asset is not considered imported until it finds its way into a library collection using the process described above. Assets may be copied elsewhere in the library structure for reasons explained below, but such assets are not considered successfully imported.
Auxiliary files
Auxiliary files are the artifacts of an import operation and do not play an active role in subsequent import operations. Auxiliary files fall into one of the following groups:
- Log files
- Stage files
- Originals (when write metadata and / or sidecar option is invoked)
- Quarantined files
A folder for each group can be found at the library root, as illustrated here:
library
├── .logs
├── .stage
├── _originals
├── _quarantine
Important: Before deciding to delete auxiliary files, be sure to consider the information provided below to understand the tradeoffs.
Note: The .logs folder is the only folder that is always present after an import operation. The presence of the other folders shown is conditional, as discussed below.
Note: The import command provides the option to specify an alternate path for staged assets and originals.
Log files
The .logs directory is a hidden folder that contains detailed logs for each import operation. The logs are organized according to the date and time (UTC) an import operation starts. For example, the log files for an import operation that started on 2020-03-18T003435-07:00 (local time) are found under the following folder:
library
├── .logs
│   └── 2020
│       └── 03
│           └── 18
│               └── 073435    # UTC time
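The mapping from a local start time to the UTC-based log folder can be reproduced with Python's datetime module. This is a minimal sketch of the conversion shown in the example above; log_directory is a hypothetical helper, not part of MediaKeg.

```python
from datetime import datetime, timedelta, timezone
from pathlib import PurePosixPath


def log_directory(started: datetime) -> PurePosixPath:
    """Return the log folder, relative to the library root, for an import start time.

    `started` should be timezone-aware; it is converted to UTC before formatting.
    """
    utc = started.astimezone(timezone.utc)
    return PurePosixPath(".logs") / utc.strftime("%Y/%m/%d/%H%M%S")


# 2020-03-18T003435-07:00 (local time) from the example above.
started = datetime(2020, 3, 18, 0, 34, 35, tzinfo=timezone(timedelta(hours=-7)))
print(log_directory(started))  # .logs/2020/03/18/073435
```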
The log files contain the following types of information:
- Import summary
- Sources of imported media content and change log (auditing trail)
- Import errors and warnings
- Troubleshooting and performance tuning information
- Rollback journals
The amount of storage the log files take is very small relative to imported multimedia assets. Therefore, it's recommended to keep the log files if possible in case they are needed at a later date. If the log files are deleted, then the ability to roll back an import operation is lost, which is the one exception to the stateless operation claim made earlier in this document.
Stage files
The .stage directory is a temporary folder for staging assets when invoking the option to write metadata properties during import. The stage directory serves two functions:
- Ensure assets are updated correctly prior to importing them into the library.
- Enable duplicate detection to be performed on the modified binary instead of the original source.
Once the stage directory has served its purpose, the assets contained within are copied to their final destination and it is then deleted.
Important: Never store files in the .stage directory should it fail to be deleted at the end of an import operation. If present, the .stage directory is automatically deleted at the start of the next import operation.
Note: An alternate path for the staging directory can be set as an import option.
Originals
The _originals directory is populated when invoking the option to write metadata properties during import. For each asset or file that is modified, the original binary is copied into the _originals folder under a directory path reflecting the asset's library destination and scene name. The same process applies to sidecar files when invoking the option to include sidecar files, because sidecars must be modified to reflect the new filename of the asset they're associated with after being imported.
Consider a file named IMG001.CR2 that is modified and imported to the following library path:
2016/05/20160530T120936-D850S210000-7T18Q-00.CR2
In this example, the scene name is 20160530T120936-D850S210000-7T18Q. As such, a copy of the original file can be found as illustrated below:
library
├── _originals
│   └── 2016
│       └── 05
│           └── 20160530T120936-D850S210000-7T18Q
│               └── IMG001.CR2
Note that the original file also retains its original filename.
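The relationship between an asset's library path and the location of its original can be sketched as a small path computation. The sketch assumes, based only on the example above, that the scene name is the imported filename with its trailing subindex and extension removed; originals_path is a hypothetical helper.

```python
from pathlib import PurePosixPath


def originals_path(library_path: str, original_name: str) -> PurePosixPath:
    """Return where the unmodified original is kept, relative to the library root."""
    imported = PurePosixPath(library_path)
    # Assumption: the scene name is everything before the final "-NN" subindex.
    scene_name, _, _subindex = imported.stem.rpartition("-")
    return PurePosixPath("_originals") / imported.parent / scene_name / original_name


print(originals_path("2016/05/20160530T120936-D850S210000-7T18Q-00.CR2", "IMG001.CR2"))
# _originals/2016/05/20160530T120936-D850S210000-7T18Q/IMG001.CR2
```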
Note: The _originals folder is not a replacement for a proper backup solution. Only assets that are modified using the import write option are copied into the _originals folder, and a backup of the entire library should be set up and maintained on a separate drive using backup software.
Note: An alternate path for originals can be set as an import option.
Quarantined files
The _quarantine directory contains assets that cannot be imported due to insufficient metadata or metadata read errors; these are called indeterminate assets. Copying indeterminate assets to the quarantine folder ensures they are not left behind at the source, where they could be forgotten about or overlooked. At the same time, it prevents them from contaminating the resident population with misinformation (by attempting to use unreliable information in place of metadata).
Because indeterminate assets lack the information to import, MediaKeg uses a different strategy for organizing quarantined assets (as compared to the _originals folder). Specifically, assets are copied into directories named with a hash of the following information:
- Hostname
- Source directory of quarantined asset
The layout structure is illustrated by the following example:
library
├── _quarantine
│   └── 5a3e126d05e75d202c3fa026a8195899
│       ├── _source.html
│       ├── seattle.png
│       └── avatar.jpg
Note: An alternate path for quarantined assets can be set as an import option.
The _source.html file provides the directory path information for the quarantined files. The host name accounts for shared media such as a file share or removable storage.
{
    "host": "MacPro.local",
    "path": "/Volumes/Backups/Photos/2003"
}
The _quarantine folder structure prevents deeply nested directory structures from forming under _quarantine, which could make working with quarantined files tedious.
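The flat, hash-named layout can be sketched as follows. The exact hash input and algorithm that MediaKeg uses are not specified here, so this example assumes an MD5 digest of the host name and source directory joined by a colon; the function names are hypothetical, and the digest it produces is not expected to match the folder name in the listing above.

```python
import hashlib
import json
from pathlib import PurePosixPath


def quarantine_directory(host: str, source_dir: str) -> PurePosixPath:
    """Return a flat, hash-named quarantine folder for one host and source directory."""
    # Assumption: the hash input format below is illustrative only.
    key = f"{host}:{source_dir}".encode("utf-8")
    return PurePosixPath("_quarantine") / hashlib.md5(key).hexdigest()


def source_record(host: str, source_dir: str) -> str:
    """Return the text recording where the quarantined files came from."""
    return json.dumps({"host": host, "path": source_dir}, indent=4)


folder = quarantine_directory("MacPro.local", "/Volumes/Backups/Photos/2003")
print(len(folder.name))  # 32 hexadecimal characters, however deep the source path
print(source_record("MacPro.local", "/Volumes/Backups/Photos/2003"))
```

Hashing the host together with the path keeps files from identically named folders on different machines or shares in separate quarantine directories.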
After each import operation or at some regular interval, it's a good idea to check for quarantined files and fix or delete them, so they do not accumulate over time. This is especially true after deep scanning entire drives for assets to import. Deep scans often encounter images downloaded from the Internet, application generated preview files, and thumbnail images, where metadata is often stripped away for privacy purposes.
Note: Use the --minsize import option to help prevent low-resolution Internet files and cache data from being imported. Such files often end up in quarantine because they are indeterminate.
The absence of metadata informing the date and time of capture or creation is the most common reason for quarantining an asset. The quarantine folder makes it possible to efficiently discover such assets so that the user can fix the problem and attempt reimport. In such cases, the fix is to edit the metadata using a utility such as ExifTool. If the capture date and time cannot be set, then the fallback solution is to rename the file. By default, MediaKeg extracts this information from the filename provided it conforms to the MediaKeg filename convention.
Note: Setting the import --strictness option to level 4 (Brutal) disables extracting timestamp information from filenames.
Foreign assets and files
MediaKeg libraries are ordinary file system folders, and there is an expectation that users will work with library assets using other applications. In doing so, files will inevitably be added to or created in the library structure through some means other than an import operation. Such files are called foreign assets or foreign files, depending on whether or not they are multimedia assets.
Filename Convention
MediaKeg defines a filename convention for how imported media assets are named. The convention enables MediaKeg to operate without stored state (databases, catalog files, etc.), have high performance, and preserve asset affinity to scenes at a file system level (i.e., so all assets that are part of the same scene have the same filename except for subindex and file extension). The tradeoff is that MediaKeg filenames are long, and there is reduced flexibility to customize how assets are named.
This section breaks down the parts of a fully qualified file path for an imported media asset, including the filename convention. The following illustration of an example file path serves as a visual guide to the remainder of this section:
Library Root
The library root is a directory containing a library configuration file (.keg) and is the target of an import operation. All library contents are expressed relative to the library root.
Collection
Imported media assets are organized into collections, which are subdirectories of the library root. The collection path is expressed relative to the library root, and the collection name bears the same name as the collection path. For example, the collection path and name from the above illustration is 2018/06.
The library layout template determines the collection path for each asset by expanding it with asset metadata (and user tag values where applicable). The default layout template organizes assets by year and month.
See Layout Templates for details on how to create custom layout templates.
Filename
Imported assets have filenames that conform to the naming convention covered by this section. The following terms are defined and applied to the convention:
Basename
The basename is the part of the fully qualified path following the last path separator, as highlighted using boldface in the following example:
/Volumes/Pictures/Library/2018/06/20180620T153205S120000-6CC58-01.JPG
Note: For the remainder of this section, the directory path leading up to the basename is excluded from the examples highlighting the various basename parts.
The term basename is used throughout the documentation only when needing to make the distinction between basename and filename. Since MediaKeg does not alter the file extension during the import process (except for letter case depending on library configuration), the documentation focuses primarily on the filename part.
File extension
The file extension is the suffix at the end of the basename, as delimited by a period. The file extension is an indication of a media type, and is also referred to as file type throughout the documentation.
20180620T153205S120000-6CC58-01.JPG
Filename
Regular expression pattern:
^((?:[^\/\\.#%|<>?*":]*)((?:(?:18|19|20)(?:\d{2}))(?:0[1-9]|1[012])(?:0[1-9]|[12][0-9]|3[01]))(?:[^\/\\.#%|<>?*":]*)(?:T)((?:0\d|1\d|00|20|21|22|23)[0-5]\d[0-5]\d)(?:[^\/\\.#%|<>?*":]*)([S|M|F|C])(\d{6})-([A-Z0-9]{5}))-(\d{2,})(?:[^\/\\.#%|<>?*":]*)?$
The filename is the basename minus the file extension.
20180620T153205S120000-6CC58-01.JPG
The filename is also the concatenation of the timestamp, index, deviceId, and subindex parts.
Declarative Part
Regular expression pattern:
^(?:[^\/\\.#%|<>?*":]*)((?:(?:18|19|20)(?:\d{2}))(?:0[1-9]|1[012])(?:0[1-9]|[12][0-9]|3[01]))(?:[^\/\\.#%|<>?*":]*)(?:T)((?:0\d|1\d|00|20|21|22|23)[0-5]\d[0-5]\d)(?:[^\/\\.#%|<>?*":]*)$
The declarative part of the filename is expanded from the filename template, which includes the date and time an asset was captured or created in abbreviated ISO 8601 form. The default filename template includes just the date and time information, as shown here:
20180620T153205S120000-6CC58-01.JPG
The filename template can be customized to include additional information. For example, the template can be modified to have all assets start with the letter P and include device make and model information, as shown here:
P20180620T153205-NIKON-D850S120000-6CC58-01.JPG
See Filename Templates for more information.
Index
Regular expression pattern:
^([S|M|F|C])(\d{6})$
The file index (or index) prevents filename collisions for assets captured from the same device at subsecond intervals. If the asset contains metadata providing subsecond time information, then this value is used for the index by default. If this information is not available, then MediaKeg attempts to find a suitable property fulfilling a similar role, such as shutter count or file number.
The file index is a seven character sequence consisting of a prefix and six digits. The prefix informs the index source and the digits represent the time or count value. Values less than six digits in length are zero padded. Time values are left and right padded and count values are left padded. Values greater than six digits are truncated from the right.
The following example shows a filename having an index value of S120000:
20180620T153205S120000-6CC58-01.JPG
The S prefix indicates that the index is a subsecond time value and the time value is 120 milliseconds (see below for explanation).
The following table lists index prefixes in rank preference order and their meanings:
Prefix | Rank | Source |
---|---|---|
S or M | 0 | Custom list of ranked sources specified by settings.index.tags. |
S | 1 | The asset capture subsecond time value. |
M | 2 | Default list of ranked metadata source properties. |
F | 3 | Parsed from the asset source filename. |
C | 4 | Import session counter. |
MediaKeg seeks interval values using the sources listed, in rank order from lowest to highest. The source value must be available and numeric (0-9); otherwise it is skipped, and processing continues to the next item in the list. The metadata sources (M prefix) are ranked sub-lists of metadata tags serving as providers of interval data.
S-value
The index source is metadata indicating the subsecond time value for when an asset was captured or created.
Note: Most cameras that include subsecond time values in metadata do so at centisecond (2 decimal places) or millisecond (3 decimal places) resolution. The value shown is to the right of the decimal, where 100000 represents 100 milliseconds, and 001000 represents 1 millisecond. Subsecond time values are padded from both ends to ensure the sort order reflects the correct sequence, and then to fill out the 6-digit sequence.
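The padding and truncation rules for the index can be sketched in Python as follows. The function name and the `is_time` flag are hypothetical, and the sketch assumes the subsecond value arrives as the string of digits to the right of the decimal point (leading zeros preserved), so filling out the sequence amounts to appending zeros on the right.

```python
def format_index(prefix: str, value: str, is_time: bool) -> str:
    """Build a seven-character index: one prefix letter plus six digits.

    Assumptions (not confirmed by the documentation): `value` is a string of
    ASCII digits; time values are the fractional-second digits as recorded.
    """
    if len(prefix) != 1 or prefix not in "SMFC":
        raise ValueError("prefix must be one of S, M, F, or C")
    if not (value.isascii() and value.isdigit()):
        raise ValueError("value must consist of digits 0-9 only")
    if is_time:
        digits = value.ljust(6, "0")  # fractional digits: pad on the right
    else:
        digits = value.rjust(6, "0")  # counters: pad on the left
    return prefix + digits[:6]       # longer values are truncated from the right


print(format_index("S", "12", is_time=True))    # S120000 (120 milliseconds)
print(format_index("S", "001", is_time=True))   # S001000 (1 millisecond)
print(format_index("M", "42", is_time=False))   # M000042
```

Right-padding the time value keeps filenames sorted chronologically even when devices record different subsecond resolutions.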
M-value
The index source is a metadata property value other than subsecond time. The property selected is the first to have a numeric value from a ranked list of metadata tags. The default list can be customized using the settings.index.tags property in the library configuration.
See settings.index.tags for information on setting and listing the index tags.
F-value
The index is parsed from the asset source filename if it contains a numeric sequence matching one of two patterns.
If the filename follows the filename convention described herein, then the index already encoded in the filename is selected. This situation can occur if the import asset belongs to the same scene as a resident asset, or if the source is from another library. If the scene includes a resident asset, then its index value is selected.
If the filename does not follow convention, then it must contain a sequence of 3 to 5 consecutive digits, inclusive (expressed as [3, 5]). It is common for digital cameras to index images using a [3, 5] sequence. If there are multiple matches, then the most frequently occurring match is selected.
Note: A minimum sequence of 3 digits helps prevent file copy subindex values from being selected (e.g., DSC001 Copy 1.JPG). A maximum of 5 is required to avoid date and time encoded values from being selected.
The table below provides examples of indices parsed from asset filenames belonging to the same scene. An empty cell indicates a match could not be found and the scene will be assigned an auto-incremented C-value, as described below.
Scene Asset Filenames | Index | Comments |
---|---|---|
IMG_012345 | F012345 | |
IMG_0012 | F000012 | |
IMG_00012-Copy 04 | F000012 | |
IMG_12 | | No numeric sequence [3, 5] |
IMG_123456 | | No numeric sequence [3, 5] |
IMG_089_123456 | F000089 | |
IMG_0012, IMG_0012-Copy 1, Wedding1234 | F000012 | More occurrences of 0012 |
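The following Python sketch shows one way the non-conforming filename case could be implemented. The regular expression is an assumption chosen because it reproduces every row of the table above (leading zeros are ignored before counting 3 to 5 digits); it is not taken from MediaKeg itself, and the function name is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Assumed pattern: optional leading zeros, then 3 to 5 digits, with no digit
# directly before or after the run.
_SEQUENCE = re.compile(r"(?<!\d)0*(\d{3,5})(?!\d)")


def f_index(scene_filenames: Iterable[str]) -> Optional[str]:
    """Return the F-value for a scene, or None if no sequence qualifies."""
    counts: Counter = Counter()
    for name in scene_filenames:
        for match in _SEQUENCE.finditer(name):
            counts[int(match.group(1))] += 1
    if not counts:
        return None  # the scene would fall back to a C-value
    value, _ = counts.most_common(1)[0]  # most frequently occurring match
    return f"F{value:06d}"


print(f_index(["IMG_0012", "IMG_0012-Copy 1", "Wedding1234"]))  # F000012
print(f_index(["IMG_123456"]))                                  # None
```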
C-value
The index is set to the scene identifier, which is an auto-incremented counter for each new scene. The counter is reset at the start of each import operation.
Device identifier
Regular expression pattern:
^([A-Z0-9]{5})$
The device identifier (deviceId) helps to uniquely identify a media asset in time. The deviceId is a 5-character, base-36 encoded value derived from available device make, model, and serial number information.
Note: The virtualized tag properties *Make, *Model, and *SerialNumber are used for make, model, and serial number information, respectively.
A dash separator always precedes the device identifier.
The following example shows a filename having a device identifier of 6CC58:
20180620T153205S120000-6CC58-01.JPG
MediaKeg generates a deviceId using all available information. If one of the above tag properties is not available, then a deviceId is still generated, but the chance of another device having the same identifier is much higher than if all three properties are available. If no properties are available, then the deviceId is set to 00000.
Note: It is highly improbable that two different devices have the same deviceId if *Make, *Model, and *SerialNumber are available for either device.
Note: The device identifier is derived from a hash of device identifying information (make, model, and serial number), which is not reversible and therefore should not be a privacy concern. MediaKeg libraries can also set settings.salt to salt the hash input string with a user-defined value, which results in a different device identifier being generated for the same device information.
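To make the idea concrete, here is a minimal Python sketch of deriving a 5-character, base-36 identifier from a salted hash. The use of SHA-256, the `|` joiner, and the reduction modulo 36^5 are all assumptions for illustration; the documentation does not state which hash or encoding steps MediaKeg uses.

```python
import hashlib
from typing import Optional

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # base-36 digits


def device_id(make: Optional[str] = None, model: Optional[str] = None,
              serial: Optional[str] = None, salt: str = "") -> str:
    """Return a 5-character base-36 identifier (hypothetical derivation)."""
    parts = [make, model, serial]
    if not any(parts):
        return "00000"  # no device information available
    text = salt + "|".join(part or "" for part in parts)
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    number = int.from_bytes(digest, "big") % 36 ** 5
    chars = []
    for _ in range(5):
        number, remainder = divmod(number, 36)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


print(device_id("NIKON", "D850", "1234567"))
print(device_id())  # 00000
```

Five base-36 characters give roughly 60 million possible values, which is why the chance of two devices sharing an identifier rises when fewer properties feed the hash.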
Subindex
Regular expression pattern:
^(\d{2,})(?:[^\/\\.#%|<>?*":]*)?$
The subindex exists to resolve filename conflicts when two or more assets belonging to a scene have the same file extension. The subindex begins with two or more digits followed by zero (0) or more characters, which may also include spaces. A dash separator always precedes the subindex.
The following example shows a filename having a subindex of 01:
20180620T153205S120000-6CC58-01.JPG
The filename always includes a subindex, even in the absence of filename conflicts. In the event of a filename conflict, the importer auto-increments the subindex value until a free slot is found. If a sidecar is paired to the asset, then the sidecar target name is also considered when seeking a free slot.
The subindex pattern provides freedom for the user and external applications to copy library assets without violating the filename convention. The following list illustrates subindices for multiple assets having the same scene name and file extension:
20180620T153205S120000-6CC58-00.JPG
20180620T153205S120000-6CC58-01.JPG
20180620T153205S120000-6CC58-01-01.JPG
20180620T153205S120000-6CC58-01 Copy 1.JPG
The last two entries from this list were created by the user or an external application because MediaKeg always sets a numeric value.
Scene name
Regular expression pattern:
^(?:[^\/\\.#%|<>?*":]*)((?:(?:18|19|20)(?:\d{2}))(?:0[1-9]|1[012])(?:0[1-9]|[12][0-9]|3[01]))(?:[^\/\\.#%|<>?*":]*)(?:T)((?:0\d|1\d|00|20|21|22|23)[0-5]\d[0-5]\d)(?:[^\/\\.#%|<>?*":]*)([S|M|F|C])(\d{6})-([A-Z0-9]{5})$
The scene name is the basename minus the subindex and file extension. All assets that originate from the same scene also share the same scene name.
Note: The scene name is also the concatenation of the declarative, index, and deviceId parts.
The following example shows a filename having a scene name of 20180620T153205S120000-6CC58:
20180620T153205S120000-6CC58-01.JPG
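The parts described in this section can be pulled out of a basename with a short Python sketch. The pattern below is a deliberately simplified version of the regular expressions shown above: it assumes the default filename template (no static fill or extra tokens in the declarative part), and the function and group names are hypothetical.

```python
import os
import re
from typing import Dict, Optional

# Simplified pattern for default-template filenames only (an assumption).
_FILENAME = re.compile(
    r"^(?P<declarative>\d{8}T\d{6})"   # date and time, e.g. 20180620T153205
    r"(?P<index>[SMFC]\d{6})"          # index prefix plus six digits
    r"-(?P<device_id>[A-Z0-9]{5})"     # device identifier
    r"-(?P<subindex>\d{2,}[^/\\.]*)$"  # subindex, possibly with a suffix
)


def parse_basename(basename: str) -> Optional[Dict[str, str]]:
    """Split a basename into its convention parts, or return None."""
    filename, extension = os.path.splitext(basename)
    match = _FILENAME.match(filename)
    if match is None:
        return None
    parts = match.groupdict()
    parts["scene"] = f'{parts["declarative"]}{parts["index"]}-{parts["device_id"]}'
    parts["extension"] = extension.lstrip(".")
    return parts


parts = parse_basename("20180620T153205S120000-6CC58-01.JPG")
print(parts["scene"])     # 20180620T153205S120000-6CC58
print(parts["subindex"])  # 01
```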
Library Templates
Library templates (or templates) are user-customizable strings that define how assets are organized and named in a library.
- The layout template determines how assets are organized.
- The filename template determines how assets are named.
The library configuration (.keg) file contains the template declarations.
Tokens
Library templates contain token parameters (or tokens), which are replaced by arguments in a process called expanding the template. The templates are expanded for each scene using asset metadata and optional user tag values.
A token is delimited by a tag pair consisting of a start tag and an end tag. The general format is as follows:
<start-tag token end-tag>
The token inclusion rules are as follows:
- Token names are case-insensitive.
- Token names must be alphanumeric and contiguous (i.e., a single word with no special characters).
- Spaces between token and delimiters are allowed for readability, but not required.
- A token may appear only once per template.
The tag delimiters are specific to each of the three (3) token types: timestamp, metadata, and user.
Each token type is discussed below.
Tokens have one of the following behaviors:
- Static tokens
- Dynamic tokens
A static token always receives a substitution value. If the mapped property is unavailable or if the property value is not set then a default value is used in its place. Default values are specified as part of the library configuration file, and are required for each static token referenced by the library templates.
See Static Token Defaults for details on setting defaults.
A dynamic token is dropped if the substitution value is unavailable.
Token Delimiters
Token delimiters are string parsable entities consisting of start and end tags specific to each token type and behavior, as indicated by the following table:
Token | Start Tag | End Tag | Behavior | Comments |
---|---|---|---|---|
Timestamp | <@= | @> | Static | Dynamic tags not supported |
Metadata | <%= | %> | Static | |
Metadata | <%? | %> | Dynamic | |
User | <&= | &> | Static | |
User | <&? | &> | Dynamic | |
Static Fill
Templates can also include static fill, which is any text external to tag delimited tokens. Except as noted below, static fill is transferred as-is to the expanded template output (meaning the static fill is not replaced by metadata or user tag value).
- Must be alphanumeric, except as noted below.
- Hyphens (-) are also allowed.
- Must not contain spaces.
Hyphens receive special handling as follows:
- Two or more adjacent hyphens are reduced to a single hyphen.
- Hyphens are trimmed from the start and end of the expanded output.
This special handling helps prevent extraneous hyphens when used alongside dynamic tokens, which may not receive a substitution value.
The following example shows a filename template that starts with the letter P:
P<@=*date@><@=*time@>
In this example, the letter P is undelimited (i.e., all by itself), which causes MediaKeg to treat it as static fill. The result is that all imported media assets will have filenames starting with the letter P, as shown in the following example:
P20180620T153205S120000-6CC58-01.JPG
Although this example inserts static fill at the start of the template string, fill can be added anywhere in the template.
Timestamp Tokens
Timestamp tokens are used to organize and name assets according to the date and time they were originally captured or created.
Timestamp tokens must be listed in rank order (where used), as indicated in the table below.
Timestamp tokens need not be adjacent, meaning they can be interleaved with other token types.
Filename templates must include *DATE and *TIME tokens.
It's recommended that layout templates contain one or more timestamp tokens to prevent collections from becoming excessively large.
Timestamp tokens represent the date and time parts corresponding to the *Created virtual tag, as outlined in the following table:
Timestamp Part | Rank(s) | Information Contained | Expanded Value (Example) |
---|---|---|---|
*DATE | 1, 2, 3 | Year, Month, Day | 20180212 |
*YEAR | 1 | Year | 2018 |
*MONTH | 2 | Month | 02 |
*DAY | 3 | Day | 12 |
*TIME | 4, 5, 6 | Hour, Minute, Second | T143205 |
*HOUR | 4 | Hour | 14 |
*MINUTE | 5 | Minute | 32 |
*SECOND | 6 | Second | 05 |
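The expanded values in the table above map naturally onto `strftime` format codes. The following Python sketch illustrates that mapping; the function name and the lookup table are hypothetical, and the sketch says nothing about how MediaKeg implements expansion internally.

```python
from datetime import datetime

# Token name (without the leading asterisk) -> strftime format.
_TIMESTAMP_FORMATS = {
    "date": "%Y%m%d",
    "year": "%Y",
    "month": "%m",
    "day": "%d",
    "time": "T%H%M%S",
    "hour": "%H",
    "minute": "%M",
    "second": "%S",
}


def expand_timestamp_token(name: str, created: datetime) -> str:
    """Expand a timestamp token such as *DATE for a capture date and time."""
    key = name.lstrip("*").lower()  # token names are case-insensitive
    return created.strftime(_TIMESTAMP_FORMATS[key])


created = datetime(2018, 2, 12, 14, 32, 5)
print(expand_timestamp_token("*DATE", created))  # 20180212
print(expand_timestamp_token("*TIME", created))  # T143205
```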
Metadata Tokens
Metadata tokens map to metadata tags by the same name as the following table illustrates:
Metadata Token | Metadata Tag | Expanded Value Example |
---|---|---|
Artist | Artist | Ansel Adams |
Make | Make | Canon |
Model | Model | EOS 5R |
Virtual Metadata Tokens
Virtual metadata tokens (or virtual tokens) are metadata tokens that map to virtual metadata tags. Except for this distinction, virtual tokens behave like real metadata tokens.
The following table lists the virtual tokens that library templates support:
Virtual Token Name | Virtual Tag | Comment |
---|---|---|
*Make | *Make | Device manufacturer name. |
*MediaType | *MediaType | See mediaTypeTransforms setting. |
*Model | *Model | Device model name. |
*SerialNumber | *SerialNumber | |
Note: Use timestamp tokens for tag values corresponding to the *Created virtual tag.
Note: Using real metadata tokens (i.e., Make, Model, SerialNumber) in place of the corresponding virtual tokens is supported but not recommended, because virtual tokens have a higher chance of being backed by metadata.
Application Notes
For a list of known tags, enter the following command using the MediaKeg CLI:
$ mkeg list-tags
For known metadata tags, metadata tokens are case-insensitive. For all other metadata tags, metadata tokens should be entered in proper case.
Note: The list-tags command contains several options for filtering the output, including by tag name search pattern and tag category. Refer to the documentation for usage details.
When customizing library templates using metadata tokens, consider that the tags set on an asset vary according to capture device make and model. In general, try to limit usage to the most common tags to help ensure that replacement values are always or usually available. For static tokens, a default value is assigned if the tag value is unavailable.
To see if an asset contains a tag, enter the following CLI command:
$ mkeg tags <path-to-file>
User Tokens
User tokens map to user tags, which are provided as inputs to the import process alongside their respective values. User tokens can be static or dynamic, but declaring them as dynamic is generally the best option so they are dormant unless activated via user input.
Consider the following layout template and an import operation for an asset captured in June 2019:
"layout": {
"template": "<&?Category&>#<@=year@>#<&?Event&>#<@=month@>"
},
The following examples illustrate how dynamic user tokens are activated via the MediaKeg CLI:
Example 1: No activation
$ mkeg import /Volumes/DCIM
The layout template expands to:
2019/06
The dynamic tokens are dormant because the import command contains no user tag option values.
Note: The raw template substitution for this example yields #2019##06. The adjacent hashtags (##) appear because the Event tag is dormant. MediaKeg collapses adjacent hashtags into a single hashtag prior to expanding the file system path separators.
Example 2: Partial activation
$ mkeg import /u:category=Racing /Volumes/DCIM
The layout template expands to:
RACING/2019/06
Note: This example assumes the default template lettercase setting, which is for token values to expand to uppercase.
Example 3: Full activation
The following example illustrates how a user token is activated via the MediaKeg CLI:
$ mkeg import /u:category=Racing /u:event="24 Hours of Le Mans" /Volumes/DCIM
The layout template expands to:
RACING/2019/24HOURSOFLEMANS/06
Note: This example assumes the default template format settings, which is for expanded token values to contain only alphanumeric characters and no whitespace.
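The three examples above can be reproduced with a small Python sketch of layout template expansion. It handles only timestamp and user tokens, hard-codes the default format (alphanumeric only) and lettercase (uppercase) behavior described in the notes, and uses hypothetical function and parameter names; it is an approximation for illustration, not MediaKeg's parser.

```python
import re
from typing import Mapping, Optional

# Matches <@=name@>, <&=name&>, and <&?name&>, with an optional leading asterisk.
_TOKEN = re.compile(r"<([@&])([=?])\s*\*?(\w+)\s*\1>")


def expand_layout(template: str, timestamp: Mapping[str, str],
                  user_tags: Optional[Mapping[str, str]] = None,
                  separator: str = "/") -> str:
    """Expand a layout template into a collection path (simplified sketch)."""
    stamp = {key.lower(): value for key, value in timestamp.items()}
    tags = {key.lower(): value for key, value in (user_tags or {}).items()}

    def substitute(match: "re.Match[str]") -> str:
        kind, behavior, name = match.group(1), match.group(2), match.group(3).lower()
        value = stamp.get(name) if kind == "@" else tags.get(name)
        if value is None:
            if behavior == "?":
                return ""        # dormant dynamic token is dropped
            value = "Unknown"    # static token without a default
        # Assumed defaults: alphanumeric characters only, uppercase.
        return re.sub(r"[^0-9A-Za-z]", "", value).upper()

    raw = _TOKEN.sub(substitute, template)
    collapsed = re.sub(r"#+", "#", raw).strip("#")  # reduce and trim hashtags
    return collapsed.replace("#", separator)


template = "<&?Category&>#<@=year@>#<&?Event&>#<@=month@>"
stamp = {"year": "2019", "month": "06"}
print(expand_layout(template, stamp))                          # 2019/06
print(expand_layout(template, stamp, {"category": "Racing"}))  # RACING/2019/06
```

Collapsing the hashtags before substituting the path separator mirrors the order of operations described in the note for Example 1.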
Layout Templates
Layout templates determine how assets are organized or, more precisely, how collection paths are expanded from tokens and static fill. The hashtag (#) is a special character specific to layout templates, which is replaced by the platform-specific path segment separator:
- / on Linux and macOS
- \ on Windows
The following table provides layout template examples and expanded collection paths:
Layout Template | Collection Path | Comments |
---|---|---|
<@=*year@>#<@=*month@> | 2018/02 | Default Template |
<%?artist%>#<@=*year@>#<@=*month@> | Ansel Adams/2018/02 | Dynamic behavior |
<@=*year@>#<%=make%>#<@=*month@> | 2018/Canikon/02 | Static behavior |
<&?category&>#<@=*year@>#<@=*month@> | Racing/2018/02 | Dynamic behavior |
Note: Adjacent or dangling path separators are reduced or trimmed from the expanded template value. This case can occur if one or more substitution values are unavailable for dynamic tokens. For example, if the category information is unavailable in the example provided above, then the template expands to 2018/02 instead of /2018/02.
Filename Templates
Filename templates determine how assets are named by expanding the declarative filename part. The following table provides examples of filename templates and their expanded output:
Filename Template | Filename | Comments |
---|---|---|
<@=*date@><@=*time@> | 20190315T212520S120000-0SPTS-00.NEF | Default Template |
P<@=*date@><@=*time@>-<%=*make%>-<%=*model%> | P20190315T212520-NIKON-D850S120000-0SPTS-00.NEF | |
Note: The filename template only determines the declarative part of the filename. The index, device identifier, and subindex parts follow the declarative part to form the complete filename, with a hyphen (-) preceding the device identifier and the subindex.
Static Defaults
Static tokens always expand to a value. If the token substitution information is unavailable, then a default value is substituted when expanding a template. The token default can be defined inline, as part of the template declaration, or using a library configuration setting, as shown below.
Note: Setting defaults is recommended, but not required. If a token default is not set, then Unknown is used.
The token default can be defined inline with the token using the following syntax:
<token>:[default]
The following table illustrates how to set a token default inline with the template declaration:
Example | Default Value | Comments |
---|---|---|
<&=category&> | Unknown | Default value not specified. |
<&=category:Racing&> | Racing | |
<%=artist:Ansel Adams%> | Ansel Adams | |
<&?category:Racing&> | N/A | Default values do not apply to dynamic tokens. |
<@=date:20171203@> | N/A | Default values do not apply to timestamp tokens. |
Template defaults can alternatively be set as part of the template configuration, as the following example illustrates for a layout template:
{
"templates": {
"layout": {
"template": "<&=category&><@=year@>#<@=month@>",
"defaults": {
"category": "Racing"
}
}
}
}
Which method to use in setting a default is a matter of personal preference. Using a combination of both methods is also allowed. The inline default is chosen if a default is assigned to the same token using both methods.
See Template Options for more information.
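The precedence between the two methods can be summarized with a short Python sketch: an inline default wins, then a configured default, then the Unknown fallback. The function name is hypothetical, and the token body is assumed to be the text between the delimiters (for example, `category:Racing`).

```python
from typing import Mapping


def static_default(token_body: str, configured: Mapping[str, str]) -> str:
    """Resolve the default for a static token (illustrative sketch).

    `token_body` is the text inside the delimiters, such as "category:Racing".
    `configured` stands in for the template's "defaults" object.
    """
    name, _, inline = token_body.partition(":")
    name = name.strip().lower()  # token names are case-insensitive
    inline = inline.strip()
    if inline:
        return inline            # the inline default takes precedence
    lowered = {key.lower(): value for key, value in configured.items()}
    return lowered.get(name, "Unknown")


print(static_default("category", {}))                             # Unknown
print(static_default("category", {"category": "Racing"}))         # Racing
print(static_default("category:Rally", {"category": "Racing"}))   # Rally
```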
Template Options
The following options are available to layout and filename templates:
- Defaults
- Format
- Lettercase
- Maximum expanded token length (maxlen)
- Template
The options are settable for each template as part of the library configuration. The following shows how to apply options under the templates section for a library configuration:
"templates": {
"layout": {
"template": "<%=make%>#<%?model%>#<@=*year@>#<@=*month@>",
"defaults": {
"make": "Unspecified"
}
},
"filename": {
"template": "<%=serialnumber:00000%>-<@=*date@><@=*time@>",
"format": "packed",
"lettercase": "lower",
"maxlen": 12
}
}
See Library Configuration for a description of each option and their default values. If the default value is acceptable, then there is no need to set it in the configuration file explicitly.
Note: The above settings snippet illustrates the two different methods of setting static token defaults. The layout template uses the defaults option, where the defaults are entered as key-value pairs. The filename template uses the inline method, where a colon (:) delimits the token from its default value.
Library Configuration
A library is any directory containing a .keg configuration file at its root. The configuration file contains a small amount of JSON data conforming to the MediaKeg Library Schema. Except for optional library metadata, the configuration is static and should not be modified after the first import operation. The optional metadata fields can be changed at any time because they do not have a role in import operations.
The schema defines defaults for the required properties. The following JSON shows a minimal configuration, which is the default configuration:
{
"doctype": "https://mkeg.io/schemas/document/library-1-0-0.json",
"identity": "00000000-0000-0000-0000-000000000000"
}
Note: An actual configuration file must contain a non-empty UUID identity value.
The following JSON shows a more declarative configuration that includes metadata and library settings:
{
"doctype": "https://mkeg.io/schemas/document/library-1-0-0.json",
"identity": "00000000-0000-0000-0000-000000000000",
"metadata": {
"maker": {
"app": "MediaKeg CLI (mkeg)",
"appver": "1.0.0",
"created": "2019-08-03T23:40:22-07:00",
"username": "username",
"hostname": "hostname"
},
"name": "Library name",
"desciption": "Library description",
"owner": "Library owner name",
"artist": "Artist name",
"copyright": "Copyright info"
},
"settings": {
"extension": {
"lettercase": "uppercase"
},
"salt": ""
},
"templates": {
"layout": {
"template": "<@=*year@>#<@=*month@>",
"format": "alphanumeric",
"lettercase": "uppercase",
"maxlen": 16
},
"filename": {
"template": "<@=*date@><@=*time@>",
"format": "alphanumeric",
"lettercase": "uppercase",
"maxlen": 16
}
}
}
The remainder of this section details the library metadata and settings properties, which are listed in JSON dot notation:
- A required property with a default assumes the default value if not set in the configuration file.
- A required property without a default must be set in the configuration file.
doctype
Required Property
Identifies the document as a library configuration file and the schema version. This property must be set to the following value:
https://mkeg.io/schemas/document/library-1-0-0.json
This property has no default value and must be present, else the document will fail to load.
identity
Required Property
Sets the library identity, which is a Universally Unique Identifier (UUID). The UUID must be formatted as follows:
xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx
The 4 bits of digit M indicate the UUID version, and the 1–3 most significant bits of digit N indicate the UUID variant. To help ensure anonymity, MediaKeg uses UUID v4 when making a new configuration file. The UUID version cannot be attested to if the configuration file is made using a different method.
This property has no default value and must be present, else the document will fail to load. The value must also be a non-empty UUID, meaning it cannot contain all zeros.
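When creating a configuration file by hand, a quick check with Python's standard `uuid` module can confirm that an identity value has the expected shape and is not all zeros. The function name is hypothetical, and the check is a sketch of the rules stated above rather than MediaKeg's own validation.

```python
import re
import uuid

# Hyphenated 8-4-4-4-12 hexadecimal layout.
_UUID_FORMAT = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_valid_identity(value: str) -> bool:
    """Return True for a well-formed UUID string that is not all zeros."""
    if not _UUID_FORMAT.fullmatch(value):
        return False
    return uuid.UUID(value).int != 0


new_identity = str(uuid.uuid4())  # random (version 4) UUID
print(new_identity, is_valid_identity(new_identity))
print(is_valid_identity("00000000-0000-0000-0000-000000000000"))  # False
```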
metadata
Optional Object
This object contains several settable properties listed below. They provide descriptive information about the library and have no functional role in the import process. All metadata properties are therefore optional, including the metadata object itself.
MediaKeg uses library metadata to provide richer information about a target library in reporting and library maintenance commands.
metadata.artist
Optional Property
The name of the artist who created the library content.
metadata.copyright
Optional Property
The copyright notice for the library content.
metadata.desciption
Optional Property
The library description.
metadata.maker
Optional Object
This object contains several settable properties listed below. The properties inform details relevant to how the configuration file was made.
metadata.maker.app
Optional Property
The name of the application or entity that created the library configuration.
This property is automatically set when using MediaKeg to make a new library.
metadata.maker.appver
Optional Property
The version of the application or entity that created the library configuration.
This property is automatically set when using MediaKeg to make a new library.
metadata.maker.created
Optional Property
The library creation date and time in ISO 8601 format, as illustrated below:
2019-05-18T14:12:13-07:00
This property is automatically set when using MediaKeg to make a new library.
metadata.maker.username
Optional Property
The login username of the person or entity that created the library configuration.
This property is automatically set when using MediaKeg to make a new library.
metadata.maker.hostname
Optional Property
The system hostname or identity that created the library configuration.
This property is automatically set when using MediaKeg to make a new library.
metadata.name
Optional Property
The library friendly name or title.
metadata.owner
Optional Property
The name of the person who owns the library content.
settings
settings.digest.defaultMethod
Optional Property
Allowed Values: md5fast | md5 | sha1 | sha256 | sha512
Default: md5
Sets the default message-digest algorithm for duplicate and import bit error detection. The following options are available:
Option | Bits | Comments |
---|---|---|
md5fast | 128 | Improves performance by hashing only a portion of each file. |
md5 | 128 | Recommended for most users. |
sha1 | 160 | |
sha256 | 256 | |
sha512 | 512 | |
Note: md5 is the only option supported for the current version of MediaKeg.
MediaKeg calculates a message-digest for each file participating in an import operation. Think of a message-digest as a fingerprint that uniquely identifies a file, and where even a single bit difference produces a radically different value.
Message-digests are calculated using a hash function producing an output n-bits in length, where the bit length depends on the hash function. While two different files can generate the same hash value (a collision), such an occurrence is highly improbable (1 in 1.47e29 for MD5). Higher bit algorithms further decrease the chances of a collision but incur an additional performance penalty. MD5 is generally considered sufficient to verify data integrity against unintentional corruption.
The md5fast option uses the MD5 hash algorithm to calculate message-digests by reading only a portion of each file. This option has little or no practical impact on the reliability of duplicate detection, but renders error detection unreliable. Therefore, only use this option if you're willing to forgo validation against bit-errors for imported assets. In addition, your library will not contain logs with message-digests that can be used to check for bit rot at a later date.
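The difference between a full and a partial digest can be illustrated with Python's `hashlib`. In this sketch the `limit` parameter stands in for the md5fast idea of hashing only part of a file; which portion MediaKeg actually reads is not documented here, so hashing the first bytes is an assumption, and the function name is hypothetical.

```python
import hashlib
from typing import Optional


def file_digest(path: str, method: str = "md5",
                chunk_size: int = 1024 * 1024,
                limit: Optional[int] = None) -> str:
    """Return the hex digest of a file, read in chunks.

    With `limit` set, only the first `limit` bytes are hashed (an assumed
    stand-in for md5fast). Changes beyond that point go undetected.
    """
    digest = hashlib.new(method)
    remaining = limit
    with open(path, "rb") as handle:
        while True:
            size = chunk_size if remaining is None else min(chunk_size, remaining)
            if size == 0:
                break
            chunk = handle.read(size)
            if not chunk:
                break
            digest.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return digest.hexdigest()
```

A typical call is `file_digest("IMG_0012.JPG")` for a full MD5, or `file_digest("IMG_0012.JPG", limit=65536)` for a partial one. Two files that differ only after the first 65,536 bytes would produce the same partial digest, which is why partial hashing cannot be relied on for bit-error detection.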
settings.index.tags
Optional Property
Allowed Values: [ *"item1", *"item2", ... ]
An ordered list of metadata tags providing index assignments. Use this property to add, remove, or reorder the default list.
- Setting settings.index.exclusive to true clears the default list.
- Entries are prepended to the default tags list.
- Use *SubSec for the Exif subsecond time value.
Note: *SubSec is a virtual tag, as indicated by the asterisk (*) symbol, and is included in the default list of index tags. It's a required tag and is added to the end of the list if not present in the optionally configured list.
Note: Any index tags set using this option are also used for scene identification, even if not explicitly set using the settings.scene.tags option.
Index tags are evaluated in the order listed and their values are expected to be numeric (base-10). The importer selects the first numeric value encountered for the scene index value. If none is found, then the importer attempts to extract an index value from the filename.
This property expects a string array, where each entry conforms to the following rules:
- Must be alphanumeric.
- The dash (-) and asterisk (*) characters are also allowed.
- Must contain no whitespace.
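The naming rules above can be checked with a single regular expression. This is a minimal sketch; it assumes that "alphanumeric" means ASCII letters and digits, and is_valid_tag is a hypothetical helper rather than part of MediaKeg.

```python
import re

# Letters, digits, dash, and asterisk only; at least one character.
TAG_NAME = re.compile(r"[A-Za-z0-9*-]+")


def is_valid_tag(name: str) -> bool:
    """Return True when name satisfies the index tag naming rules."""
    return TAG_NAME.fullmatch(name) is not None


for candidate in ["*SubSec", "ShutterCount", "Shutter Count", "Image_Number"]:
    print(candidate, is_valid_tag(candidate))
```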
See Intervals for more information.
Application Notes
For a list of known tags, enter the following command using the MediaKeg CLI:
$ mkeg list-tags
The following list shows the default index tags:
[
"*SubSec",
"ImageCount",
"ShutterCount"
]
Use the following command to list the default index tags from the MediaKeg CLI:
$ mkeg list-tags --scope=index
Use the following command to list the index tags as configured for a target library:
$ mkeg list-tags --scope=index --target=<library alias or path>
Example 1
The following setting modifies the list as shown below:
"settings": {
"index" : {
"tags" : ["FileNumber", "ImageNumber"]
}
}
The default list as modified per above settings example:
[
"FileNumber",
"ImageNumber",
"*SubSec",
"ImageCount",
"ShutterCount"
]
Note: The actual list may differ from the example depending on MediaKeg version.
Example 2
The following settings clear the default list for more control over tag ordering:
"settings": {
"index" : {
"tags" : ["*SubSec", "ShutterCount", "ImageNumber"],
"exclusive" : true
}
}
A new list created by setting exclusive to true:
[
"*SubSec",
"ShutterCount",
"ImageNumber"
]
Note: When setting custom index tags, inspect multiple images from the same device to verify that the tag values are integers that increase sequentially with time.
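The two examples follow from a small merge rule, sketched here in Python. The effective_index_tags function is hypothetical, the default list is copied from this section and may differ by MediaKeg version, and dropping repeated entries is an assumption.

```python
DEFAULT_INDEX_TAGS = ["*SubSec", "ImageCount", "ShutterCount"]
REQUIRED_TAG = "*SubSec"


def effective_index_tags(custom, exclusive=False, defaults=DEFAULT_INDEX_TAGS):
    """Combine custom tags with the defaults as described in this section."""
    # Custom entries come first; the defaults follow unless exclusive is set.
    merged = list(custom) if exclusive else list(custom) + list(defaults)
    # Assumption: a repeated tag keeps only its first position.
    result = list(dict.fromkeys(merged))
    # The required virtual tag is appended when the list lacks it.
    if REQUIRED_TAG not in result:
        result.append(REQUIRED_TAG)
    return result


print(effective_index_tags(["FileNumber", "ImageNumber"]))
print(effective_index_tags(["*SubSec", "ShutterCount", "ImageNumber"], exclusive=True))
```

To confirm the list that MediaKeg actually applies to a library, use the list-tags command with the --target option.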
settings.index.exclusive
Optional Property
Allowed Values: true | false
Setting to true prevents the default index tags from being included with settings.index.tags, if set.
settings.salt
Optional Property
The salt value to use in generating a device identifier. If specified, the value must not contain whitespace. Special characters are allowed.
settings.scene.tags
Optional Property
Allowed Values: Object
An object for customizing the default scene detection tags. Use this property to add, remove, or replace the default values.
The scene tags list always includes the following tags, irrespective of how this property is set:
- Index tags
- Device identifier tags
- Any metadata tags referenced by layout and filename templates.
Note: Use the settings.index.* settings to customize the default index tags.
The device identifier tags are as follows:
[
"*Make",
"*Model",
"*SerialNumber"
]
Use the following settings pattern to add an entry to settings.scene.tags:
settings.scene.tags.{ <tag1>:<score>, <tag2>:<score>, ... }
Example:
"settings": {
"scene" : {
"tags" : {
"Artist" : 232
},
"exclusive" : false
}
},
The metadata tag (object key) rules are as follows:
- Must be alphanumeric.
- The dash (-) and asterisk (*) characters are also allowed.
- Must contain no whitespace.
The score (object value) is an integer that sets the tag's contribution to the cumulative scene score.
Application Notes
For a list of known tags, enter the following command using the MediaKeg CLI:
$ mkeg list-tags
MediaKeg performs a case-insensitive look-up of tag names from the known tags list and substitutes the name in its correct lettercase. This behavior is provided to help reduce the possibility of error, but it's best not to rely on this behavior because it doesn't work for unknown Exif tags.
Use the following command to list the default scene tags from the MediaKeg CLI:
$ mkeg list-tags --scope=scene
Tip: Using the MediaKeg CLI to list the default scene tags ensures that the list is correct for your version of MediaKeg.
Use the following command to list the scene tags as configured for a target library:
$ mkeg list-tags --scope=scene --target=<library alias or path>
settings.scene.exclusive
Optional Property
Allowed Values: true | false
Setting to true prevents the default scene tags from being included with settings.scene.tags, if set.
settings.style.extension.lettercase
Required Property
The lettercase for the imported file extension.
Options: lower | original | upper
Default: upper
settings.tags.virtual.mediaTypeTransforms
Optional Property
Allowed Values: Object
An object for customizing the default *MediaType virtual tag transforms. Use this property to replace the default values and to create new mappings. The *MediaType virtual tag is intended primarily for customizing layout templates, so that a library can be organized according to MIME type (and subtype, if desired). This property lets the user customize the token substitution value for each MIME type.
A MIME type consists of a type and a subtype, delimited by a forward slash (/):
type/subtype
For example:
application/vnd.adobe.photoshop
audio/mpeg
image/jpeg
image/x-nikon-nef
video/mp4
The *MediaType virtual tag is set on each asset according to its MIME type, which is used to cross-reference the *MediaType value that gets set for the asset. MediaKeg selects the most specific cross-reference match found, matching first on type/subtype and then on type.
The following table shows the default values for each MIME type:
MIME Type | *MediaType |
---|---|
application | Photos |
audio | Music |
image | Photos |
timelapse | Timelapses |
video | Videos |
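The most-specific-match rule can be sketched in Python as follows. The defaults are copied from the table above. The resolve_media_type function is hypothetical, and treating MIME strings as case-insensitive is an assumption.

```python
DEFAULT_TRANSFORMS = {
    "application": "Photos",
    "audio": "Music",
    "image": "Photos",
    "timelapse": "Timelapses",
    "video": "Videos",
}


def resolve_media_type(mime, overrides=None):
    """Look up type/subtype first, then fall back to the bare type."""
    table = {**DEFAULT_TRANSFORMS, **(overrides or {})}
    # Assumption: MIME strings compare case-insensitively.
    table = {key.lower(): value for key, value in table.items()}
    mime = mime.lower()
    if mime in table:
        return table[mime]
    # No exact entry; try the part before the slash.
    return table.get(mime.split("/", 1)[0])


custom = {"application/vnd.adobe.photoshop": "Photoshop"}
print(resolve_media_type("image/jpeg"))
print(resolve_media_type("application/vnd.adobe.photoshop", custom))
print(resolve_media_type("application/pdf", custom))
```

With the custom mapping in place, Photoshop files resolve to Photoshop, while other application types still resolve through the bare application entry.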
Use the following settings pattern to override the default *MediaType mappings:
settings.tags.virtual.mediaTypeTransforms.{ <type>:<value>, <type>:<value>, ... }
The following example illustrates changing the default value assignment for image from Pictures to Photos.
"settings" : {
"tags" : {
"virtual" : {
"mediaTypeTransforms" : {
"image" : "Photos"
}
}
}
}
Include the subtype to create a more specific mapping, as shown here:
settings.tags.virtual.mediaTypeTransforms.{ <type/subtype>:<value>, ... }
The following example illustrates changing the default value assignment for application/vnd.adobe.photoshop from Pictures to Photoshop.
"settings" : {
"tags" : {
"virtual" : {
"mediaTypeTransforms" : {
"application/vnd.adobe.photoshop" : "Photoshop"
}
}
}
}
settings.types.exclusive
Optional Property
Allowed Values: true | false
Default: false
Setting to true prevents the default list of known file types from being included with the settings.types.known list, if set. The settings.types.known list must contain at least one (1) entry for this property to take effect.
settings.types.known
Optional Property
Allowed Values: Object
An unordered collection of file extensions to apply as search criteria when searching for assets to import. Use this property to add to or replace the default, built-in list of types that MediaKeg otherwise applies.
When setting this property, each key in the object represents a file extension to include in the search criteria for import. Entries are case-insensitive and exclude the period (.) filename delimiter, as shown here:
{
"doctype": "https://mkeg.io/schemas/document/library-1-0-0.json",
"identity": "d7cf1e2d-9887-4b70-af27-bd7633244597",
"settings": {
"types" : {
"exclusive" : true,
"known" : {
"FFF" : { "mime" : "image/x-raw", "description" : "Hasselblad File Format" }
}
}
}
}
Setting settings.types.exclusive to true results in the listed items replacing the default, built-in list. If settings.types.exclusive is false or not set, the listed items are appended to the default, built-in list.
To display the list of default, built-in known types, enter the following command using the MediaKeg CLI:
$ mkeg known-types
To display the known-types list as configured for a library, enter the following command, where [target] is a library path or alias:
$ mkeg known-types [target]
See the MediaKeg CLI Reference Guide for more information.
settings.types.prune
Optional Property
Allowed Values: [ *"item1", *"item2", ... ]
An unordered list of file extensions to remove from the default, built-in list that MediaKeg uses when searching for assets to import.
Note: This setting is ignored when settings.types.exclusive is set to true.
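The interaction between settings.types.known, settings.types.exclusive, and settings.types.prune can be sketched as shown below. The effective_types function and the EXAMPLE_DEFAULTS list are hypothetical; the real built-in list is displayed by the known-types command. The sketch assumes that pruning applies only to the built-in list.

```python
EXAMPLE_DEFAULTS = ["jpg", "nef", "mp4"]  # placeholder, not the real built-in list


def _normalize(extensions):
    """Lower-case each extension and drop any leading period."""
    return [str(ext).lower().lstrip(".") for ext in extensions]


def effective_types(defaults, known=(), exclusive=False, prune=()):
    """Return the sorted file extensions searched for during an import."""
    defaults = _normalize(defaults)
    known = _normalize(known)  # iterating a dict yields its keys
    # Pruning is ignored whenever exclusive is set to true.
    pruned = set() if exclusive else set(_normalize(prune))
    # Exclusive takes effect only when at least one known type is listed.
    if exclusive and known:
        return sorted(set(known))
    kept = [ext for ext in defaults if ext not in pruned]
    return sorted(set(kept) | set(known))


print(effective_types(EXAMPLE_DEFAULTS, known={"FFF": {"mime": "image/x-raw"}}))
print(effective_types(EXAMPLE_DEFAULTS, known=["FFF"], exclusive=True))
print(effective_types(EXAMPLE_DEFAULTS, prune=["NEF"]))
```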
settings.types.scope
Optional Property
Sets the import scope to the MIME type specified.
Allowed Values: application | audio | image | video
Default: [unset]
Note: Leaving scope unset (default) configures MediaKeg to import application, audio, image, and video file formats.
templates.filename.format
Required Property
Sets the expanded token style.
Options: alphanumeric | freeform | packed
Default: alphanumeric
See Template Options for more information.
templates.filename.lettercase
Required Property
Sets the expanded token lettercase.
Allowed Values: lower | original | upper
Default: upper
See Template Options for more information.
templates.filename.maxlen
Required Property
Sets the maximum length for the expanded token. If the value exceeds this length, then it is trimmed down to maxlen from the right.
The value must be between 8 and 64, inclusive.
Default: 16
See Template Options for more information.
templates.filename.template
Required Property
Sets the filename template value.
Default: <@=date@><@=time@>
See Filename Template for more information.
templates.layout
Optional Object
This object contains the layout template properties listed below.
templates.layout.format
Required Property
Sets the expanded token style.
Allowed Values: alphanumeric | freeform | packed
Default: alphanumeric
See Template Options for more information.
templates.layout.lettercase
Required Property
Sets the expanded token lettercase.
Allowed Values: lower | original | upper
Default: upper
See Template Options for more information.
templates.layout.maxlen
Required Property
Sets the maximum length for the expanded token. If the value exceeds this length, then it is trimmed down to maxlen from the right.
The value must be between 8 and 64, inclusive.
Default: 16
See Template Options for more information.
templates.layout.template
Required Property
Sets the layout template value.
Default: <@=year@>#<@=month@>
See Filename Templates for more information.
Advanced Concepts
Scenes
The concept of scenes applies primarily to digital photographs and images, where it's common to have multiple versions (variants) of an original capture or creation. For example, a black and white version of a color photograph, or a RAW image saved to JPG format. In the case of a digital photograph, think of the scene as what the sensor saw at the time the image was captured. With this in mind, how an image is edited or reformatted does not alter the scene. Hence, a scene is immutable and unique in time since no two devices can capture the same scene (even if they look identical to the naked eye).
Much of this document is devoted to how MediaKeg imports assets into a library. However, when MediaKeg performs an import operation, it's actually importing scenes. This detail has been left out until now for the sake of simplicity. During an import operation, assets are grouped into scene payloads, and much of the processing occurs at the payload level. Layout and filename templates, for example, are actually expanded using scene metadata, which is a rollup of metadata (scene tags) for the assets contained in the payload.
Assets belonging to the same scene satisfy the following conditions:
- Have matching timestamps
- Have congruent scene tag values
Such assets are said to be congruent with each other.
The purpose of a scene is to ensure that all of the assets it contains exist side-by-side in the target library after being imported. (See scene aware section for problem statement.) Expanding the layout and filename templates for each scene payload — as opposed to the individual assets — accomplishes this goal.
Consider the following example list of imported asset library paths:
/2019/03/20190316T221232S2270000-0SPTS-00.JPG
/2019/03/20190316T221232S2270000-0SPTS-00.NEF
/2019/03/20190316T221232S2270000-0SPTS-00.TIF
/2019/03/20190316T221232S2270000-0SPTS-01.JPG
/2019/03/20190316T221232S2270000-0SPTS-02.JPG
/2019/03/20190316T221232S2270000-0SPTS-03.JPG
/2019/03/20190316T221232S2270000-0SPTS-04.JPG
/2019/03/20190316T221238S4380000-0SPTS-00.NEF
In this example, all but the last asset originates from the same scene, which is evident from the filenames. As covered under filename convention, assets originating from the same scene also have the same scene name. The scene names for the files listed are:
- 20190316T221232S2270000-0SPTS
- 20190316T221238S4380000-0SPTS
Hence, the listing contains two (2) scenes. The first scene has six (6) variants of an original RAW image (.NEF) for a total of seven (7) images. The second scene consists of a single RAW image. As this example shows, the scene concept is also the cornerstone of how imported media assets are named: except for the subindex and file extension parts, the basename is the scene name.
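The grouping described above can be reproduced with a short Python sketch. It assumes the basename layout visible in the listing, where the subindex is the part after the last dash. The scene_name and group_by_scene helpers are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath

PATHS = [
    "/2019/03/20190316T221232S2270000-0SPTS-00.JPG",
    "/2019/03/20190316T221232S2270000-0SPTS-00.NEF",
    "/2019/03/20190316T221232S2270000-0SPTS-00.TIF",
    "/2019/03/20190316T221232S2270000-0SPTS-01.JPG",
    "/2019/03/20190316T221232S2270000-0SPTS-02.JPG",
    "/2019/03/20190316T221232S2270000-0SPTS-03.JPG",
    "/2019/03/20190316T221232S2270000-0SPTS-04.JPG",
    "/2019/03/20190316T221238S4380000-0SPTS-00.NEF",
]


def scene_name(path):
    """Strip the file extension and the trailing subindex from a basename."""
    stem = PurePosixPath(path).stem  # basename without the extension
    name, _, _subindex = stem.rpartition("-")
    return name


def group_by_scene(paths):
    """Map each scene name to the library paths that share it."""
    scenes = defaultdict(list)
    for path in paths:
        scenes[scene_name(path)].append(path)
    return dict(scenes)


scenes = group_by_scene(PATHS)
for name, members in scenes.items():
    print(name, len(members))
```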
Scene Congruency
MediaKeg uses select metadata tags to determine which assets are part of the same scene. These tags are called scene tags and are discussed below. Assets having congruent scene tags are associated with the same scene. The term congruent, as used here, means that overlapping scene tag values must match, where set. The following tables and their explanations illustrate the concept.
Example 1: One scene from three assets
The following assets are congruent and therefore part of the same scene.
Scene | Asset | Timestamp | Make | Model | SerialNumber |
---|---|---|---|---|---|
1 | A | 20110304T125612 | Canon | EOS-1D X MARK II | 002011000061 |
1 | B | 20110304T125612 | Canon | | 002011000061 |
1 | C | 20110304T125612 | | | |
For assets to be congruent, only the metadata tags that have non-empty values are evaluated for equivalence. Hence, in the above example, the lack of Model information for Asset B does not prevent it from being included in the same scene as Asset A. The same reasoning applies to Asset C, which is congruent with Assets A and B.
Note: The example uses only a partial list of the default scene properties for illustrative purposes.
Example 2: Two scenes from three assets
The following assets are incongruent and therefore split into two (2) scenes.
Scene | Asset | Timestamp | Make | Model | SerialNumber |
---|---|---|---|---|---|
1 | A | 20110304T125612 | Canon | EOS-1D X MARK II | 002011000061 |
1 | B | 20110304T125612 | Canon | | 002011000061 |
2 | C | 20110304T125612 | Canon | EOS-R | |
In this example, the Model information for Asset C conflicts with the value set on Asset A, which results in Asset C being assigned to a separate scene. Asset C is said to be incongruent with Scene 1.
Note: In this example, an alternate solution would group Assets B and C into the same scene, since it is Assets A and C that are mutually exclusive.
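The congruency test used in both examples can be sketched as a pairwise comparison. The congruent function and the dictionary representation of an asset are hypothetical, and the sketch covers only the partial tag list used in the tables.

```python
def congruent(a, b):
    """Return True when two assets could belong to the same scene."""
    # Timestamps must always match.
    if a["Timestamp"] != b["Timestamp"]:
        return False
    # Compare only the tags that both assets set to a non-empty value.
    for tag in a.keys() & b.keys():
        if tag == "Timestamp":
            continue
        if a[tag] and b[tag] and a[tag] != b[tag]:
            return False
    return True


A = {"Timestamp": "20110304T125612", "Make": "Canon",
     "Model": "EOS-1D X MARK II", "SerialNumber": "002011000061"}
B = {"Timestamp": "20110304T125612", "Make": "Canon",
     "Model": "", "SerialNumber": "002011000061"}
C1 = {"Timestamp": "20110304T125612"}  # Asset C from Example 1
C2 = {"Timestamp": "20110304T125612", "Make": "Canon", "Model": "EOS-R"}  # Example 2

print(congruent(A, B), congruent(A, C1), congruent(B, C1))
print(congruent(A, C2), congruent(B, C2))
```

The second line of output reflects the ambiguity in Example 2: Asset C conflicts with Asset A but not with Asset B, so more than one grouping is possible.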
Scene Tags
MediaKeg uses scene tags to evaluate assets for congruency, which is the heuristic used to determine whether one or more assets originate from the same scene.
The default scene tags can be listed by entering the following CLI command:
mkeg list-tags --scope=scene
The scene tags can also be customized for a library using one or both of the following methods:
- Adding one or more metadata tokens to a library template
- Using the library configuration settings.scene.tags setting
To list the scene tags as configured for a specific library, enter the following CLI command:
mkeg list-tags --scope=scene --target=<library alias or path>
Template tokens
Any metadata token (real or virtual) declared in a layout or filename template automatically adds the associated metadata tag to the list of scene tags.
Scene tags setting
The library configuration settings.scene.tags setting provides explicit control over the scene tags list, with the following exceptions:
- The *Created virtual tag is always used to evaluate congruency, irrespective of library configuration.
- Any metadata tags mapped to template tokens are automatically included, as noted above.
Performance Tuning
MediaKeg uses parallel processing to maximize import performance for the following types of operations:
- Metadata extraction
- File digest calculation
- File copy
All three types of operations are storage I/O intensive, so storage performance is generally the limiting factor for import performance. This applies to both the source and destination storage devices. For example, when importing from an older Class 2 SD card (2MB/s), the card will likely be the limiting factor on a modern workstation, even if the import operation is configured to use several CPU cores and the destination device is an internal high-performance SSD.
With the above caveat out of the way, MediaKeg lets you customize the number of processes to fork for I/O intensive operations. For tuning purposes, the logs directory contains a profiler.dat file for each import operation, which records the amount of time spent in regions, as illustrated below:
ENTER_REGION: IMPORT
    ENTER_REGION: FIND.SOURCE.FILES
    LEAVE_REGION: FIND.SOURCE.FILES (0ms)
    ENTER_REGION: SOURCE.FACTORY
        ENTER_REGION: READ.SOURCE.EXIF
            ENTER_REGION: READ.SOURCE.STAT
            LEAVE_REGION: READ.SOURCE.STAT (175ms)
        LEAVE_REGION: READ.SOURCE.EXIF (505ms)
        ENTER_REGION: MAKE.ASSETS
        LEAVE_REGION: MAKE.ASSETS (0ms)
    LEAVE_REGION: SOURCE.FACTORY (505ms)
    ENTER_REGION: COLLATE.SOURCES
    LEAVE_REGION: COLLATE.SOURCES (0ms)
    ENTER_REGION: FIND.TARGET.FILES
    LEAVE_REGION: FIND.TARGET.FILES (1ms)
    ENTER_REGION: TARGET.FACTORY
        ENTER_REGION: READ.TARGET.STAT
        LEAVE_REGION: READ.TARGET.STAT (132ms)
        ENTER_REGION: READ.TARGET.EXIF
        LEAVE_REGION: READ.TARGET.EXIF (356ms)
        ENTER_REGION: MAKE.ASSETS
        LEAVE_REGION: MAKE.ASSETS (0ms)
    LEAVE_REGION: TARGET.FACTORY (489ms)
    ENTER_REGION: PRE.STAGE
        ENTER_REGION: COLLATE.TARGETS
        LEAVE_REGION: COLLATE.TARGETS (0ms)
        ENTER_REGION: CREATE.SCENES
        LEAVE_REGION: CREATE.SCENES (0ms)
        ENTER_REGION: EXPAND.TEMPLATES
        LEAVE_REGION: EXPAND.TEMPLATES (0ms)
    LEAVE_REGION: PRE.STAGE (1ms)
    ENTER_REGION: POST.STAGE
        ENTER_REGION: RESOLVE.DUPLICATES
        LEAVE_REGION: RESOLVE.DUPLICATES (0ms)
        ENTER_REGION: RESOLVE.PATHS
        LEAVE_REGION: RESOLVE.PATHS (0ms)
    LEAVE_REGION: POST.STAGE (1ms)
    ENTER_REGION: IMPORT.ASSETS
    LEAVE_REGION: IMPORT.ASSETS (162ms)
LEAVE_REGION: IMPORT (1.2s)
Each region represents a discrete step or phase internal to an import operation. As the import operation progresses, it enters and exits regions, as indicated by the ENTER_REGION and LEAVE_REGION declarations preceding the region names. Each region maintains its own timer, and the total time elapsed for the region is displayed on the LEAVE_REGION line. Regions can be nested, as shown above, where indentation highlights the nested structure.
Note: The actual regions shown can vary depending on the MediaKeg version and the import options.
Note: The log file example shown uses indentations (tabs) to highlight region nesting. This is for illustrative purposes only, and the actual log files do not indent lines.
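Because the actual log files are not indented, a small script can recover the nesting when reviewing a profile. The following Python sketch assumes the line format shown in the example above, including the ms and s duration suffixes. The parse_profile function is hypothetical and is not part of MediaKeg.

```python
import re

LINE = re.compile(r"^(ENTER|LEAVE)_REGION:\s+(\S+)(?:\s+\(([\d.]+)(ms|s)\))?$")


def parse_profile(text):
    """Return (depth, region, milliseconds) for each LEAVE_REGION line."""
    stack, rows = [], []
    for raw in text.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue  # ignore lines that do not follow the assumed format
        kind, name, amount, unit = match.groups()
        if kind == "ENTER":
            stack.append(name)
            continue
        if not stack or stack[-1] != name:
            raise ValueError("unbalanced region: " + name)
        stack.pop()
        millis = None
        if amount is not None:
            millis = float(amount) * (1000 if unit == "s" else 1)
        # After the pop, the stack depth equals the nesting level.
        rows.append((len(stack), name, millis))
    return rows


SAMPLE = """ENTER_REGION: IMPORT
ENTER_REGION: SOURCE.FACTORY
ENTER_REGION: READ.SOURCE.EXIF
LEAVE_REGION: READ.SOURCE.EXIF (505ms)
LEAVE_REGION: SOURCE.FACTORY (505ms)
LEAVE_REGION: IMPORT (1.2s)"""

rows = parse_profile(SAMPLE)
for depth, name, millis in rows:
    print("    " * depth + name, millis)
```

Sorting the returned rows by duration is a quick way to see which regions dominate an import.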
The following table lists the dominant I/O operations for regions that implement parallel processing:
Region | Dominant I/O Operation |
---|---|
READ.SOURCE.EXIF | Metadata extraction |
READ.SOURCE.STAT | File digest calculation |
READ.TARGET.EXIF | Metadata extraction |
READ.TARGET.STAT | File digest calculation |
STAGE.COPY | File copy |
STAGE.STAT | File digest calculation |
STAGE.VALIDATE | Metadata extraction |
IMPORT.ASSETS | File copy |
The import command includes import options for customizing the number of processes to fork for each I/O operation type, as listed in the following table:
Import option | I/O operation | Default Scaling Factor |
---|---|---|
--ranks | All | 70% |
--ranks-copy | File copy | 70% |
--ranks-exif | Metadata extraction | 70% |
--ranks-stat | File digest calculation | 70% |
The --ranks option sets the number of processes for each I/O operation type. By default, MediaKeg uses a scaling factor of 70%, which is the percentage of system cores to utilize. Hence, for a 10-core system, seven (7) processes are forked. Fine-tuning can be achieved by setting the import option for the more specific I/O operation type (e.g., --ranks-copy), which takes precedence over the --ranks option.
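The precedence rule and the default scaling can be sketched in Python as follows. The function names are hypothetical, and rounding down with a minimum of one process is an assumption; this document only states the 10-core example.

```python
DEFAULT_SCALING_PERCENT = 70


def default_ranks(cores):
    """Processes forked by default: 70% of the cores (rounding is assumed)."""
    return max(1, cores * DEFAULT_SCALING_PERCENT // 100)


def ranks_for(operation, cores, ranks=None, specific=None):
    """Resolve the process count for one I/O operation type.

    operation is "copy", "exif", or "stat"; ranks models --ranks, and
    specific models the --ranks-copy, --ranks-exif, and --ranks-stat values.
    """
    specific = specific or {}
    if operation in specific:  # the specific option takes precedence
        return specific[operation]
    if ranks is not None:      # then the general --ranks value
        return ranks
    return default_ranks(cores)


print(default_ranks(10))
print(ranks_for("copy", 10, ranks=4, specific={"copy": 2}))
print(ranks_for("exif", 10, ranks=4, specific={"copy": 2}))
```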
Glossary
This section defines the terms used throughout this document. The terms and their definitions are specific to MediaKeg and may take on a different or broader meaning as applied elsewhere.
Asset
A multimedia asset (or asset) is any file with an audio, image, or video MIME type, and the term is used to differentiate multimedia assets from other file types. Digital photographs and scanned images have an image MIME-type. Some files having an application MIME-type can also be multimedia assets, such as Adobe Photoshop and Adobe Illustrator files.
Assets are classified as one of the following:
- Import candidate
- Resident
- Indeterminate
- Foreign
Import candidate
An import candidate is an asset satisfying the search criteria for an import operation and awaiting further processing to determine whether it is importable. If importable, it goes on to become a resident asset (assuming no downstream processing errors occur, such as a file copy error); otherwise, it becomes indeterminate.
Resident asset
A resident asset is an import candidate that was successfully added to a library in a previous import operation. MediaKeg will not rename, move, or modify a resident asset. Any filename conflicts for incoming assets are automatically resolved by incrementing the subindex value as needed until a free slot is found.
Foreign asset or file
A foreign asset or file is an asset or file that was added to a library through some external means, such as another application or by being directly copied into the library. Foreign assets can freely coexist with resident assets. Just as with resident assets, MediaKeg does not rename, move, or modify foreign assets and will automatically resolve filename collisions where they occur.
Foreign assets that do not conform to the MediaKeg filename convention are excluded from duplicate detection. Also, foreign assets do not appear in MediaKeg import logs. They are excluded from rollback operations, even if the user replaces a resident asset with a foreign asset by the same name (unless it is an exact copy).
When importing the contents of a library into another library, any foreign assets in the source library become resident assets of the target library and are automatically organized and renamed in accordance with the target library configuration settings.
Indeterminate Asset
An indeterminate asset is an asset or file that matches the import search criteria but cannot be imported because it is corrupt or lacks the required metadata information. For an asset to be importable (i.e., not indeterminate) it must meet the following conditions:
- Must have a MIME type of audio, image, or video. A MIME type of application is allowed for application files known to be multimedia content, such as Adobe Photoshop.
- Must contain metadata indicating the date and time the asset was captured or created.
An importable asset must contain metadata indicating its original creation date and time. Assets lacking this information are indeterminate and copied to a quarantine folder.
Basename
A file basename (or basename) is the combined filename and file extension component of a file path. For example, consider the following file path:
/Volumes/DCIM/IMG003.JPG
In this example, the basename component is IMG003.JPG.
Collection
A library collection (or collection) is one or more resident assets appearing alongside each other in a library. The name of a collection corresponds to its directory path relative to the library root. By default, MediaKeg organizes media assets according to the year and month they were created. Hence, an asset captured in June 2019 belongs to the 2019/06 collection.
Library
A library is the target of a MediaKeg import operation and is an ordinary file system folder containing a hidden configuration file named .keg at its root. The configuration file includes a library identifier and optional metadata about the library. It is also used to customize how assets are imported into the library.
Note: The term folder is often used in place of directory when working with files from a graphical interface. The term directory is typically used when working with files from a command-line interface. The terms are otherwise equivalent and can be used interchangeably.
There is no limit to the number of libraries MediaKeg can target for import operations. Because each library has its own configuration, it's possible to optimize each for usage. For example, how a professional organizes client photos differs from personal photos.
Metadata
Metadata is information about an asset that is typically embedded in the file header. Metadata can also accompany an asset in a sidecar file, which is an external file that exists alongside the asset it describes. Metadata is read and written using key-value pairs, where the key is called a metadata tag. The following is an example of a metadata tag (Make) and its value (Canon):
Make : Canon
There are thousands of metadata tags in use today, covering several industry standards and vendor-specific tags. The actual number of tags set on an asset varies according to the make and model of the device that captured or created the asset, and the file format. It's common for an asset to contain anywhere from a few to a few hundred metadata tags (i.e., a small fraction of the total tags defined for use today). This is important to keep in mind when customizing library templates.
The following are examples of metadata common to photos:
- Capture date and time
- Camera make and model
- Lens make and model information
- Exposure information
- GPS coordinates at the time of capture (if the camera has a GPS function)
- Artist name (if set by user in camera settings)
Exif is perhaps the most well-known standard, so much so that it is often used (incorrectly) when referring to metadata covered by other standards. This is because it's common for assets to contain metadata from multiple standards, sometimes by the same tag name. For example, digital photos typically include both Exif and IPTC tags, and applications showing metadata properties often don't distinguish the backing standard. This is true for MediaKeg as well, which does not require the user to distinguish between metadata standards when referencing (with template tokens) and writing metadata.
Note: MediaKeg uses ExifTool for all metadata read and write operations. ExifTool is a widely-used and trusted application within the photographic community, and its open source library is trusted and used by several other third-party applications.
Scene
A scene is one or more assets originating from the same capture or creation. For digital photos, think of a scene as the field of view captured by the sensor to produce the original image that gets saved to an SD card. The image can be subsequently copied, edited, or saved to a different format, but the resultant files still reflect the same scene.
A scene helps to ensure the assets belonging to it resolve to the same collection and filename (ignoring file extension and subindex) after being imported. The metadata used to determine an asset's library path and filename is evaluated at the scene level, which has a rollup of all relevant metadata for the assets it contains. With scenes, assets that could otherwise become separated from their variants stay together, even if they lack the metadata needed to resolve collection and filename.
See Scenes for more information.
Sidecar
A sidecar file (or sidecar) contains asset metadata stored in an external file. A sidecar file exists alongside an asset by the same name, hence the term sidecar file. Sidecar files are not assets; however, the import command provides the option to include any sidecar associated with an imported asset, provided the following conditions are met:
- Must exist alongside and have the same filename as the asset.
- Must have a .xmp file extension (case-insensitive).
- Must be a valid Extensible Metadata Platform (XMP) document.
The processing of each sidecar file depends on if the asset is importable or not. If importable, MediaKeg modifies and saves the sidecar alongside the asset. The sidecar also assumes the same filename as the imported asset. MediaKeg saves a backup of the original sidecar to the originals folder.
If the asset is indeterminate, MediaKeg quarantines the sidecar alongside the asset.
Source
A source is any file or directory containing media assets to import. The file or directory path must be accessible to the host operating system and the user. If the source is a directory, then MediaKeg scans for files matching the search criteria. By default, the search criteria include files having well-known multimedia file extensions.
MediaKeg provides several options to refine the search criteria, such as:
- Recursive search
- Regular expression filtering
- File size constraints
- File last modified constraints
- Custom file types
See the MediaKeg CLI Reference Guide for the complete list of options and details.
If the source is a file, MediaKeg skips the directory scan and otherwise processes the file as usual. Selecting the sidecar option will also include the sidecar if one is present.
Tag
A tag is the key component of a key-value pair when working with the following types of information:
- Metadata
- User supplied token substitutions
- Virtual metadata
Metadata Tag
A metadata tag is the name of the metadata property. The following are examples of common Exif metadata tags:
- DateTimeOriginal
- Make
- Model
- Artist
The list of tags available for a particular asset varies by the device manufacturer and model that created the asset, and the file format.
User Tag
A user tag is a named property for a user-supplied value that participates in token substitution when expanding library templates. For library templates containing user tokens, the user tag and its value give the user the ability to influence how assets are named and organized (instead of relying solely on metadata).
User tags are provided as an option to the import command so they can be activated and set for each import operation. User tags can also be set using a library alias so that multiple aliases targeting the same library can have different user tags associated with a descriptive alias name.
Virtual Tag
A virtual tag is the name of a virtual metadata property. Virtual metadata is a MediaKeg construct that defines properties whose values come from an ordered list of metadata properties. The first property in the list that has a property value determines the value of the virtual property.
Virtual tags are used in place of real tags (i.e., metadata tags) when the types of information they represent have an influential role in determining how assets are named and organized. Virtual tag names always start with an asterisk (*).
For example, the *Created virtual tag is used by MediaKeg to represent the date and time an asset was created or captured. The *Created tag takes its value from the first metadata property whose value is set in the following list of metadata tags:
[
"SubSecDateTimeOriginal",
"SubSecCreateDate",
"DateTimeOriginal",
"DateTimeDigitized",
"GPSDateTime",
"CreateDate",
"MediaCreateDate"
]
If none of the properties from the list is set, then the virtual tag is not set.
Note: The above list is for illustrative purposes only. The actual list may be different and can change at any time.
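The first-match rule for virtual tags can be sketched in Python as follows. The resolve_virtual function and the sample metadata are hypothetical, and the source list is the illustrative one shown above.

```python
CREATED_SOURCES = [
    "SubSecDateTimeOriginal",
    "SubSecCreateDate",
    "DateTimeOriginal",
    "DateTimeDigitized",
    "GPSDateTime",
    "CreateDate",
    "MediaCreateDate",
]


def resolve_virtual(sources, metadata):
    """Return the value of the first listed tag that is set, else None."""
    for tag in sources:
        value = metadata.get(tag)
        if value not in (None, ""):
            return value
    return None  # the virtual tag stays unset


sample = {
    "SubSecCreateDate": "",
    "DateTimeOriginal": "2019:03:16 22:12:32",
    "CreateDate": "2019:03:17 08:00:00",
}
print(resolve_virtual(CREATED_SOURCES, sample))
```

In this sample, DateTimeOriginal wins because it appears earlier in the list than CreateDate, and the empty SubSecCreateDate value is skipped.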
Target
A library target (or target) is the target of an import operation. For a given library, the target can be expressed as the library root directory path, or as an alias should one exist. If the target is an alias, MediaKeg attempts to resolve the alias to a library path.
An alias is a convenience feature that associates an easy-to-remember name with a library. By creating an alias, the user can save some typing when initiating an import operation from the CLI by not needing to enter the library path. The user can also assign a default alias, which is automatically resolved without needing to specify a target.
Note: Library aliases are stored in a file named aliases.json located in the user settings folder. This is not a violation of stateless operation as aliases are a convenience feature that simply resolves names into import command arguments.
Template
A library template (or template) defines the rules for how MediaKeg names or organizes assets as part of an import operation. The template is a string value containing tokens, delimiters, and optional static text. The import operation substitutes tokens for metadata and user-supplied values to produce an expanded value. This process is called expanding the templates or template expansion.
There are two types of templates:
- The layout template expands to the path relative to library root, thereby determining how assets are organized.
- The filename template expands to the expansion part of a filename, thereby influencing how assets are named.
Both templates are expanded for each scene in the import collection.
Token
A token is a template substitution value. Consider the following filename template example:
"filename": {
"template": "<@=*date@><@=*time@>-<%=*make%>-<%=*model%>-<%=artist%>",
"lettercase" : "lower"
}
In this example, the following substrings are tokens:
[
"*date",
"*time",
"*make",
"*model",
"artist"
These tokens represent tags (virtual and metadata) by the same name and are exchanged for their respective tag values during template expansion.
Note: Token names are case-insensitive, meaning token letter case is not a factor when substituting tokens for tag values by the same name.
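Token extraction can be sketched with a regular expression. This sketch assumes the two delimiter styles visible in the example template and does not interpret what distinguishes them. Lower-casing the names reflects the case-insensitivity noted above. The extract_tokens function is hypothetical.

```python
import re

# Matches <@=name@> and <%=name%>; the closing marker must mirror the opening one.
TOKEN = re.compile(r"<([@%])=(.+?)\1>")


def extract_tokens(template):
    """Return the token names in a template, lower-cased, in order of appearance."""
    return [match.group(2).lower() for match in TOKEN.finditer(template)]


template = "<@=*date@><@=*time@>-<%=*make%>-<%=*model%>-<%=artist%>"
print(extract_tokens(template))
```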