MetaResource(s) is a binary file format for "resource files". Its purpose is to ease the localization of programs, development of applications in multiple languages and deployment of logical data.
The most eye-catching and unique features are its "Resource Polymorphism" characteristics, like: Resource Aliasing, Resource Inheritance, Resource Overriding and Resource Overloading. These features are reviewed later on this page.
The project was created from scratch and has been an integral part of Heinz Engine since its creation. It is now in its next development iteration (v1.1 mature beta), and I am thinking of making a standalone public version to be used as a universal resource file format.
The official implementation defines the following file extensions:
mrf, mr: MetaResource file. This is the final output resource file.
mrs, ms: MetaResource script. This is the script file used to compile a resource file.
What are resource files and why should you use them?
The answers below are quoted from ResMagik's documentation page.
What's a resource file?
Most applications use text, images, sounds, videos and many other external files or data. These objects are also known as resources, and they serve different purposes depending on their type. As you might have already noticed, many programs you have downloaded use images. But where are these images stored if the program is only a single executable file, sometimes shipped with a few satellite assemblies that are not even related to an image file? The answer is that these programs use resource files. Resource files are files like any other, but they act as "packages" that store other files; programs are able to read these packages and retrieve data. There are many custom engines for data retrieval, but the main concept and the popularity of resource files were born with the old Win16 resource. You are probably asking yourself now: why should I use a resource file if there are many cool packaging file formats out there? Because archive formats such as RAR, ZIP, ACE, TAR, etc. are used only to package or compress files, while resource files store many other data types (including files), such as strings, dialogs, menus, accelerators and many other serializable objects. Resource files can also be used in web sites (dynamic pages). Resource files differ from other file types in the way they can be used: they may be used directly by an application or (one of their most important features) they can be embedded inside executables or libraries (the executable reads itself to retrieve data).
"...A resource is any nonexecutable data that is logically deployed with an application..." .NET Framework Developer's Guide.
Why use a resource file?
Because user-visible output text inside your source files is "mixed" with the source code. Finding a specific string inside a huge source file can be very tedious and time consuming. Suppose you want to do a spell check: you would have to find every string in the source and then correct each one. If you use resource files, all your strings are stored in a single place (a table), making corrections and translation much easier and faster.
Because file size is reduced by storing repeated strings and objects only once.
Because files get packed and the distribution of your software is more efficient. Otherwise, when you distribute your software, you have to make sure that every file is available and in the right path. Also, if you don't use resource files you will probably have to distribute hundreds of files along with your main executable and libraries, filling your hard drive or file system with files and entries.
Because resource files help developers create applications with multi-language support. Resources are localized for a specific culture to build translated versions of the program.
Because resource files give your data a very basic level of protection by embedding it in satellite libraries and executables. They also protect web content from direct download.
Because file access is much faster:
NOTE: This was written in 2007. Today we have SSDs that mitigate seek and read latency, but there is still the overhead of polling the filesystem for every single file plus triggering the OS file access chained events.
What is a MetaResource?
Basic concepts
The MetaResource ecosystem understands the concepts of a ResourceManager, Language, Folder, Resource, MetaResource and MetaData. A ResourceManager is the core object: it represents a 'physical' resource file and is used to manage operations on all the other objects. A ResourceManager contains one or more Languages, which are exactly what the name says. A Language is the root localized object; it can contain Resources directly, as well as Folders that in turn can contain Resources and sub-folders. Folders are flexible sub-container objects that allow tree-like structures to be created, which makes design-time representation possible with many tools, such as XML, the Windows registry, a file system and any other software technology capable of working with trees. A Resource is an object that represents logical data in its complete state (resource data + metadata).
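As a rough illustration of this object tree, here is a minimal C++ sketch. All type, member and function names (Resource, Folder, Language, ResourceManager, find_resource) are hypothetical stand-ins chosen for this example; they are not the actual MetaResource API, and metadata is left out.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical model of the object tree; none of these names are the real API.
struct Resource {
    std::string type;  // name of the MetaResource handler, e.g. "STRING"
    std::string data;  // resource data (metadata omitted in this sketch)
};

struct Folder {
    std::map<std::string, Resource> resources;
    // shared_ptr allows a Folder to hold sub-folders of its own type
    std::map<std::string, std::shared_ptr<Folder> > folders;
};

struct Language {
    Folder root;  // a Language holds Resources directly and Folders
};

struct ResourceManager {
    std::map<std::string, Language> languages;
};

// Walks a '/'-separated path such as "Screen1/Title" inside one language.
// Returns nullptr when the language, a folder or the resource is missing.
const Resource* find_resource(const ResourceManager& rm,
                              const std::string& lang,
                              const std::string& path) {
    auto l = rm.languages.find(lang);
    if (l == rm.languages.end()) return nullptr;
    const Folder* folder = &l->second.root;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type slash = path.find('/', start);
        if (slash == std::string::npos) break;  // last segment is the resource
        auto f = folder->folders.find(path.substr(start, slash - start));
        if (f == folder->folders.end()) return nullptr;
        folder = f->second.get();
        start = slash + 1;
    }
    auto r = folder->resources.find(path.substr(start));
    return r == folder->resources.end() ? nullptr : &r->second;
}
```

The tree shape is what makes a file system or XML view of the same structure straightforward: each Folder maps to a directory or element, each Resource to a leaf.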
What is a MetaResource?
A MetaResource is an arbitrary data handler object that, in combination with other data, produces the previously mentioned Resource object. Internally, a Resource has a "MetaData" associated with it that gets embedded into the resource file. This MetaData can then be read back by using the corresponding MetaResource, or even be "metamorphosed" by overriding the MetaResource information. A MetaResource also describes how resource data is written to and read from the resource file, and performs other operations on the internal MetaData.
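To make the handler idea more concrete, here is a small sketch under heavy assumptions: the embedded MetaData is modelled as a plain byte vector, and the two handlers (UInt32Handler and ColorHandler) are invented for this example. It only illustrates how the same stored bytes could be read back through a different, layout-compatible handler; the real MetaResource interfaces may look quite different.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;  // stand-in for embedded MetaData

// Hypothetical handler: stores a 32-bit value as four little-endian bytes.
struct UInt32Handler {
    static Bytes write(std::uint32_t v) {
        Bytes b(4);
        for (std::size_t i = 0; i < 4; ++i)
            b[i] = static_cast<std::uint8_t>(v >> (8 * i));
        return b;
    }
    static std::uint32_t read(const Bytes& b) {
        if (b.size() != 4) throw std::runtime_error("incompatible metadata");
        std::uint32_t v = 0;
        for (std::size_t i = 0; i < 4; ++i)
            v |= static_cast<std::uint32_t>(b[i]) << (8 * i);
        return v;
    }
};

// Hypothetical second handler with a compatible layout: the same four bytes
// are interpreted as an RGBA color instead of an integer.
struct Color {
    std::uint8_t r, g, b, a;
};

struct ColorHandler {
    static Color read(const Bytes& b) {
        if (b.size() != 4) throw std::runtime_error("incompatible metadata");
        Color c = { b[0], b[1], b[2], b[3] };
        return c;
    }
};
```

In this sketch, a value written with UInt32Handler::write can be read back either with UInt32Handler::read or with ColorHandler::read, because both handlers agree on a four-byte structure. A handler that expects a different size rejects the data.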
Why META?
In today's IT world, the word meta seems to be somewhat overused, because its wide, abstract and generic nature allows it to be used as a prefix for almost any other word. Of the MANY definitions, the word meta fits the project in more than one sense:
Meta- "reference of itself or its type", commonly interpreted as "an X about X": Resource files store internal data about other data that when read becomes new data. This is known as metadata (data about data), used analogously by databases.
Meta- "change or transformation": This refers to the metamorphic data interface, meaning that metadata can be transformed from its original form and be read as another type, provided that the MetaResource handler is compatible with the metadata structure.
So, the project is called MetaResource(s) because "a resource file packages data about data that can be transformed and be read as new data about resources" (that sounds dizzy). Such is the way of a resource file, to naturally fall into the definition of the word meta.
Other name candidates were "Object Oriented Resource" files, because it implicitly describes the inheritance and overriding components; "Polymorphic Resource" files, because of the static+dynamic polymorphism features; and finally "Binary Inherited Embedded Resource" files (B.I.E.R.), because oh boy, how I am forcing it to fit that acronym (it means beer in German, and who wouldn't like that).
General features
It comes with a complete suite: the resource I/O API, a command-line resource compiler and a GUI resource editor.
The latest version has been developed exclusively for C++11. I'm looking for a way to make a C client for interop with other languages and platforms.
In theory, it runs on every computer system, platform and device capable of setting up a C++11 runtime: Windows, FreeBSD, Linux, MacOS, Android, PlayStation, Xbox, Nintendo Switch, etc.
Dynamic binary file structure optimized for the best balance between smallest file size and read performance. It comes from a game engine background, where performance and optimizations are priorities. It is now successfully being used for general applications.
Dual official implementations: There is the dynamic structure one that helps to reduce file size (great for embedded systems) and there is the fixed size structure one that takes more space but helps a little bit with read performance (usually not noticeable at all).
Flexible and extensible specification. Custom resource types can be embedded by implementing the MetaResource interfaces. What type of data can be embedded is limitless.
In version 1.0, a single resource manager class was used to write and read resource files, making it very easy to create frontends for the API. In version 1.1 the library has been split into two namespaces: one for writing and the other for reading resource files. This allows a more lightweight (uses less RAM) and further optimized (for read speed) implementation, which was the priority in this new version.
The write namespace provides all the static polymorphism functionality, like static analysis, type checking and alias resolution (for optimization), while the read interface provides all the runtime and dynamic polymorphism functionality, like inheritance, overrides, aliasing, and MetaResource and MetaData handling.
2022 Update: Entirely rewritten resource data compression algorithm. It has been optimized for seek performance over data size. Compression is applied per resource, so selected resources can be compressed while read-critical resources are left uncompressed. The target compression level can also be configured per resource.
BIG 2022 Update: Brand new, freshly rewritten encryption strategy with new algorithms. It has been optimized for security over any other aspect; otherwise there would be no reason for a proper encryption mechanism to exist. I took the hard decision to drop my own stream cipher implementations in favor of a 3rd party library, allowing me to focus on the resource system itself and not on cryptography, which is a whole field of its own (I could revert this decision in the future). Under x86, the selectable stream cipher algorithms are hardware-based AES256-GCM and the newly introduced XChaCha20-Poly1305. Both algorithms provide message authentication and integrity checking.
Targeted confidentiality: Different segments of the MetaResource file structure can be encrypted independently. This is well documented with suggestions based on secrecy levels.
Per folder aliasing. This allows for selective inheritance and overriding of folders and resources. Think of it like a "sub-inheritance": Like a C++ class can derive from one or more other classes inheriting their members, now the members can in addition inherit and override the members of other sub-members.
It comes with built-in MetaResource types registered for the most commonly used data types. This can be extended arbitrarily by the programmer, making the file format an all-around file format: it can be used to create, for example, video, audio, map, model and pretty much any other file format, with localized data as a bonus!
The built-in MetaResource "STRING" and "STD_STRING" store strings in UTF-8 encoded streams, performing sequence validation at compile time. But any other string encoding can be embedded.
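For readers unfamiliar with UTF-8 sequence validation, the following is a minimal, self-contained sketch of such a check. The function name is_valid_utf8 is made up for this example and this is not the validator used by the resource compiler; it simply rejects the usual classes of malformed input (stray or missing continuation bytes, overlong encodings, surrogates and code points above U+10FFFF).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Returns true if 's' is a well-formed UTF-8 byte sequence.
bool is_valid_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        if (lead < 0x80) { ++i; continue; }  // single-byte (ASCII)

        std::size_t len;     // total bytes in this sequence
        std::uint32_t cp;    // code point being decoded
        std::uint32_t min;   // smallest code point allowed for this length
        if ((lead & 0xE0) == 0xC0)      { len = 2; cp = lead & 0x1F; min = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; min = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; min = 0x10000; }
        else return false;   // stray continuation byte or invalid lead byte

        if (i + len > n) return false;  // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min) return false;                      // overlong encoding
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        i += len;
    }
    return true;
}
```

Running a check like this when the resource file is compiled means the reader can trust stored strings without re-validating them on every load.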
Implicit optimization of duplicated data. It can be turned on or off.
Built-in integrity and authenticity mechanisms.
Minor (optional) in-file information or utility attributes, such as a compilation timestamp and a magic number. Useful for creating identifiable files with a unique hash.
Automatic endianness handling. The output resource file can be explicitly created as BigEndian or LittleEndian. The API is compile time and runtime endian independent, meaning that it will work regardless of the file's and machine's endianness. It is all handled internally with minimal or no overhead.
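One common way to get this kind of endian independence is to assemble multi-byte values from individual bytes with shifts, so that the code never depends on the host's byte order. The sketch below shows that general technique; ByteOrder, read_u32 and write_u32 are names invented for this example, and the actual API may handle endianness differently.

```cpp
#include <cstdint>

enum class ByteOrder { Little, Big };  // byte order declared by the file

// Builds the value byte by byte, so the result is identical on
// little-endian and big-endian hosts.
inline std::uint32_t read_u32(const std::uint8_t* p, ByteOrder order) {
    if (order == ByteOrder::Little) {
        return  static_cast<std::uint32_t>(p[0])        |
               (static_cast<std::uint32_t>(p[1]) << 8)  |
               (static_cast<std::uint32_t>(p[2]) << 16) |
               (static_cast<std::uint32_t>(p[3]) << 24);
    }
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// Writes the value in the requested byte order, again without relying on
// how the host stores integers in memory.
inline void write_u32(std::uint32_t v, std::uint8_t* p, ByteOrder order) {
    for (int i = 0; i < 4; ++i) {
        const int shift = (order == ByteOrder::Little) ? 8 * i : 8 * (3 - i);
        p[i] = static_cast<std::uint8_t>(v >> shift);
    }
}
```

Compilers typically reduce shift-based code like this to a plain load, or a load plus a byte swap, which is consistent with the "minimal or no overhead" claim.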
Built-in cache system that caches only referenced elements: It loads what you use, not the entire file into system memory. This helps a lot with the seek times for elements referenced more than once.
NEW 2020 feature: Brand new "index" mechanism. On top of a fresh set of read and seek optimizations (and bug detections), a new indexing algorithm has been implemented that DRAMATICALLY increases read performance and reduces I/O latency. It has an incremental nature: It feeds and grows only from referenced elements, it does not build and bloat the entire index at once, only what you use and need. Additionally, index records are retroactive: Existing records help to find new records to increase seek performance.
The index requires minimal extra memory and can be optionally disabled for small/simple files or for benchmark comparisons.
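The actual indexing algorithm is not described on this page, so the following is only a guess at the general idea of an index that grows from what is referenced. It assumes a hypothetical record table that can only be walked sequentially, and it remembers every record it passes, so later lookups either hit the index directly or resume scanning from where the previous lookup stopped. RecordTable and IncrementalIndex are invented names.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the on-disk record table: (name, data offset)
// pairs that can only be walked in order.
typedef std::vector<std::pair<std::string, std::size_t> > RecordTable;

class IncrementalIndex {
public:
    explicit IncrementalIndex(const RecordTable& table)
        : table_(table), scanned_(0), steps_(0) {}

    // Looks up 'name'; on success stores its offset and returns true.
    bool find(const std::string& name, std::size_t& offset) {
        auto hit = index_.find(name);
        if (hit != index_.end()) {  // already indexed: no scanning needed
            offset = hit->second;
            return true;
        }
        // Resume where the previous scan stopped and index every record
        // passed on the way, so earlier work helps later lookups.
        while (scanned_ < table_.size()) {
            const std::pair<std::string, std::size_t>& rec = table_[scanned_++];
            ++steps_;
            index_.emplace(rec.first, rec.second);
            if (rec.first == name) {
                offset = rec.second;
                return true;
            }
        }
        return false;
    }

    std::size_t indexed() const { return index_.size(); }  // records in index
    std::size_t steps() const { return steps_; }           // records scanned

private:
    RecordTable table_;
    std::unordered_map<std::string, std::size_t> index_;
    std::size_t scanned_;
    std::size_t steps_;
};
```

With this shape, a program that only ever touches the first few records never pays for indexing the rest of the file, and disabling the index would simply mean always scanning.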
Inherited elements can be optionally cached and even be pre-cached. Useful for improving seek performance in complex aliased structures.
Resource files can be loaded not only from physical files but from any arbitrary source like: Memory streams, network streams, Win32 resource files (yep, that's inception right there, a MetaResource file embedded into a Windows resource file, and quite useful might I add) and any custom readable byte stream.
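Loading from arbitrary sources usually comes down to programming the reader against a small byte-stream abstraction. The sketch below shows what such an abstraction might look like; ByteSource and MemorySource are hypothetical names for this example, not the interfaces the library actually exposes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical minimal interface a resource reader could be written against.
class ByteSource {
public:
    virtual ~ByteSource() {}
    // Copies up to n bytes into dst and returns how many were copied.
    virtual std::size_t read(void* dst, std::size_t n) = 0;
    // Moves the read position; returns false if pos is out of range.
    virtual bool seek(std::size_t pos) = 0;
};

// One possible source: a block of memory, for example a resource file that
// was embedded in the executable and is already mapped into the process.
class MemorySource : public ByteSource {
public:
    explicit MemorySource(const std::vector<std::uint8_t>& bytes)
        : bytes_(bytes), pos_(0) {}

    std::size_t read(void* dst, std::size_t n) override {
        const std::size_t count = std::min(n, bytes_.size() - pos_);
        if (count > 0) std::memcpy(dst, bytes_.data() + pos_, count);
        pos_ += count;
        return count;
    }

    bool seek(std::size_t pos) override {
        if (pos > bytes_.size()) return false;
        pos_ = pos;
        return true;
    }

private:
    std::vector<std::uint8_t> bytes_;
    std::size_t pos_;
};
```

A file-backed or network-backed source would implement the same two functions, and the reading code would not need to know which one it was given.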
Resource Polymorphism key features and concepts
The MetaResource file format provides a smart and fast way to design multi-language programs with the introduction of the following Resource Polymorphism (static and dynamic polymorphism) concepts:
Resource Aliasing
Resources have a property called "Alias" that can be set to point to an equivalent resource located at the same path, in another language. When you set a resource's alias you are "linking" that resource to its alias, making it reference its alias's data instead of its own.
Consider the following example: You have created an application in German. Some time later you decide to make an English translation of the program, with the ability to switch between the two languages. So, with MetaResources, what you do is add an English equivalent for every single German entry in the MetaResource file. You get a structure like this one:
When the program is set to run in English then it will fetch all the corresponding assets from the English language table. You expect the application to look something like this:
That is standard resource file behaviour. But you could put Resource Aliasing into play. The German and English language tables each contain their own resources for their respective culture, and these differ greatly from one another. But there are elements common to both: for example, the program name is unique and is the same for both languages. There is also the image, which doesn't change no matter the language. So, what you do in MetaResources is set the alias of the resources "Program_Name" and "Screen1_Image_1" in the English language to point to their German language equivalents. This explicitly optimizes storage space and memory usage by referencing only a single data instance. This is a simple diagram of how resources reference each other:
Running the program would yield the same looking application but with the previously mentioned benefits/optimizations. All of the resources would be loaded from the English language table except for the program name and image, those would be loaded from the German language table (where the actual data is located).
Even more complex cases of Resource Aliasing can be created (if you need to, or just because). Suppose you create a MetaResource file with 4 languages for an application. The program name would be the same for all languages, so the most obvious thing to do is to set the alias of 3 languages to point to the resource of a single base one. That would be correct and efficient, but it is also possible to do a "chained aliasing": set the resource alias of the last language to point to the resource of the previous language, then set the resource alias of that previous language to point to the resource of its own previous language, and so on, causing a chain reaction until the alias resolves to the base resource where the data is actually stored. This diagram shows what this chaining would look like:
In the example above, if the application was run in Finnish, it would try to get the program's name. That resource is aliased to the Italian language, the Italian resource is aliased to the English language, and the English resource is aliased to the German language, where the data is actually stored, so the alias finally resolves and the data is fetched. This chain reaction seems somewhat inefficient, but it is not: at compile time the API is able to optimize away this chaining and resolve the links statically.
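The Finnish, Italian, English and German chain above can be sketched as a small alias-flattening pass. This is only an illustration of what "resolving linking statically" could mean: the Entry and Table types, and the functions resolve_owner and flatten, are hypothetical, and the real compiler's data structures are not shown on this page. For simplicity the table covers a single resource path, keyed by language.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>

struct Entry {
    std::string alias;  // language holding the equivalent resource;
                        // empty when the data is stored right here
    std::string data;
};

// Language -> entry, for ONE resource path (for example "Program_Name").
typedef std::map<std::string, Entry> Table;

// Follows the alias chain from 'lang' to the language that stores the data.
// Throws on a dangling alias or on a cycle.
std::string resolve_owner(const Table& t, const std::string& lang) {
    std::set<std::string> seen;
    std::string cur = lang;
    for (;;) {
        Table::const_iterator it = t.find(cur);
        if (it == t.end()) throw std::runtime_error("dangling alias: " + cur);
        if (it->second.alias.empty()) return cur;  // data lives here
        if (!seen.insert(cur).second)
            throw std::runtime_error("alias cycle at: " + cur);
        cur = it->second.alias;
    }
}

// Rewrites every alias to point directly at the owning language, so a
// reader needs a single hop instead of walking the whole chain.
void flatten(Table& t) {
    std::map<std::string, std::string> owners;
    for (Table::const_iterator it = t.begin(); it != t.end(); ++it)
        if (!it->second.alias.empty())
            owners[it->first] = resolve_owner(t, it->first);
    for (std::map<std::string, std::string>::const_iterator o = owners.begin();
         o != owners.end(); ++o)
        t[o->first].alias = o->second;
}
```

After flatten runs on the four-language example, the Finnish, Italian and English entries all point straight at the German one, which is why the chain costs nothing at read time.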
So, Resource Aliasing means "Resource Linking" (actually the term is still a candidate for the final release, together with the "Link" property instead of "Alias"). Aliasing or linking is the core step to enable certain dynamic polymorphism features when applied to elements other than resources. We will review them next.
Resource Inheritance
Resources are not the only elements that support the concept of aliasing. Languages and folders also support aliasing. When the "Alias" property of a language or folder is set, it is instructed to "behave like its equivalent alias language/folder". This means that a language or folder can have its own child sub-folders and resources, but must also inherit the ones from its alias equivalent.
When is inheritance used? Consider the following example: You are developing an application in multiple languages, each one with its own localized resources. Among these languages there is "en-US" (English, United States), and for personal reasons you want to add the language "en-GB" (English, Great Britain). But it just happens that, in the program's current state, all resources for both of these English languages are equal, hence it would not be practical to duplicate all the resources from en-US into en-GB. The solution is to set the "Alias" property of en-GB to point to en-US. This way it will inherit all of the children from en-US, callable from en-GB, while optimizing storage space and memory usage. Such a structure would simply look like this:
More flexible or complex structures can be created by aliasing folders. This enables selective inheritance by importing only members from aliased sub-folders. Such a hybrid structure could look like this:
Chained aliasing is also supported on languages and folders, recursively inheriting the corresponding members and further extending the complexity of structures that can be created.
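As a rough model of language-level inheritance, the sketch below computes which resources are visible from a language by walking its alias chain. The types and the function visible_members are hypothetical and only cover languages with flat resource lists; folders are left out for brevity.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>

struct Language {
    std::string alias;                             // empty if not aliased
    std::map<std::string, std::string> resources;  // name -> data
};

typedef std::map<std::string, Language> ResourceFile;  // keyed by culture name

// Returns every resource name visible from 'lang', mapped to the language
// that actually stores it. Walks the alias chain recursively (as a loop).
std::map<std::string, std::string> visible_members(const ResourceFile& f,
                                                   const std::string& lang) {
    std::map<std::string, std::string> out;
    std::set<std::string> visited;  // guards against alias cycles
    std::string cur = lang;
    while (!cur.empty() && visited.insert(cur).second) {
        ResourceFile::const_iterator it = f.find(cur);
        if (it == f.end()) break;  // alias points to a missing language
        for (std::map<std::string, std::string>::const_iterator r =
                 it->second.resources.begin();
             r != it->second.resources.end(); ++r) {
            // insert() keeps an existing entry, so the definition nearest
            // to 'lang' is the one that stays visible
            out.insert(std::make_pair(r->first, cur));
        }
        cur = it->second.alias;
    }
    return out;
}
```

In the en-GB example, en-GB stores nothing itself, yet visible_members reports every en-US resource as reachable from en-GB, each one owned by en-US.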
Resource Overriding
Aliased languages and folders inherit members from their respective aliases and they are also allowed to have their own. As a rule of hierarchy and inheritance, if a direct member has the same name as an inherited member then the first overrides the second. This is quite useful for languages that have much in common but want their differences to prevail. Let's bring this to an example:
You are developing an application in English. But you want to cover a wide spectrum of English cultures, so you create a MetaResource file with support for en-US (English, United States) and en-GB (English, Great Britain). Language en-GB has been aliased to en-US and inherits all of en-US's members, because both languages have all their resources in common, except for one string that is written differently in British English than in American English: "Choose a colour" (en-GB) vs "Choose a color" (en-US). In such a case, en-GB will provide its own string and override the inherited one. Such a structure is easy to implement and would look something like this:
In this example, if the program is run with the en-GB language then it will load all the resources from en-US except for the overridden string that en-GB provided. Such an application, if run in both languages, would look something like this:
Here is another, more complex example: Inherited members can also be overridden with other inherited members:
In the example above, the language de-DE is aliased and points to the language de-AT, inheriting all of its members. Then de-DE inherits the folder "Folder1" with two resources in it. One would think that those resources are obtained from de-AT, but that is not correct, because de-DE overrides its inherited "Folder1" from de-AT by providing its own "Folder1" that is aliased and points to "Folder1" located at de-CH. So, the inherited resources from de-AT get overridden by the inherited resources from de-CH because they come from a higher level alias. If de-DE had provided only its own "Folder1" without aliasing it, then it would have inherited the resources from de-AT, because there would have been no other resources (inherited or not) to override the inherited ones.
When there is chained aliasing present on a language or folder, members are recursively inherited and the same overriding rules are perfectly applied for each and every single element, ascending or descending, in all levels and dimensions.
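The de-DE, de-AT and de-CH case can be sketched as a lookup with a fixed priority order. The order used here (direct members first, then members reached through the folder's own alias, then members inherited through the language's alias) is my reading of the text above, not a documented rule, and all names in the code are hypothetical. The sketch supports a single folder level and assumes that an alias names another language holding an equivalent folder at the same path.

```cpp
#include <map>
#include <string>

struct Folder {
    std::string alias;                             // language with the equivalent folder
    std::map<std::string, std::string> resources;  // name -> data
};

struct Language {
    std::string alias;                      // language to inherit from
    std::map<std::string, Folder> folders;  // one folder level only
};

typedef std::map<std::string, Language> ResourceFile;

// Assumed lookup order: 1) direct member, 2) the folder's own alias,
// 3) the equivalent folder inherited through the language's alias.
const std::string* lookup(const ResourceFile& f, const std::string& lang,
                          const std::string& folder, const std::string& name,
                          int depth = 0) {
    if (depth > 16) return nullptr;  // crude guard against alias cycles
    ResourceFile::const_iterator l = f.find(lang);
    if (l == f.end()) return nullptr;

    std::map<std::string, Folder>::const_iterator fo =
        l->second.folders.find(folder);
    if (fo != l->second.folders.end()) {
        std::map<std::string, std::string>::const_iterator r =
            fo->second.resources.find(name);
        if (r != fo->second.resources.end()) return &r->second;  // 1) direct
        if (!fo->second.alias.empty()) {                         // 2) folder alias
            const std::string* via =
                lookup(f, fo->second.alias, folder, name, depth + 1);
            if (via) return via;
        }
    }
    if (!l->second.alias.empty())                                // 3) language alias
        return lookup(f, l->second.alias, folder, name, depth + 1);
    return nullptr;
}
```

With de-DE aliased to de-AT and de-DE's "Folder1" aliased to de-CH, a lookup in de-DE finds the de-CH data first. If the folder alias is removed, the same lookup falls through to de-AT, which matches the behaviour described in the example.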
Resource Overloading
This is an experimental feature that will almost certainly be excluded from the final release. It allows languages and folders to have multiple resources of different types but with the same name. This is pretty much equivalent to function overloading in C++, where multiple versions of a function with the same name but different parameters can coexist. It brings little benefit, makes seek functions slower and carries extra overhead, because it requires the resource type to be specified in every function call for disambiguation.
Using them all together
All of these features are compatible with each other and can be used all at the same time, multiple times, at different levels and dimensions. The degree of complexity that can be achieved by a MetaResource structure is beyond imagination.
The most common workflow for creating a multi-language structure would be:
Create a "NEUTRAL" language. This would be a master language that is the greatest common denominator and contains all the resources that all languages have in common.
Create base culture languages and alias them to point to the neutral language to inherit the base resources. Then add the localized resources.
Create derived languages and alias them to point to their corresponding base culture language to inherit all of the neutral resources plus the localized ones. Additionally apply resource overrides specific to the derived culture.
History
The past
The first MetaResource file format experiments started in 2007, together with my first game engine studies, but it was in 2010 that the MetaResource API implementation matured and became widely used in private custom programs. It was indeed created as a deployment solution for the engine, allowing it to pack all the resources and assets (text, sound files, texture files, model files, level data, custom binary data, etc.) into a single resource file. It was mainly used for GUI together with the engine.
The present
How much the overall project has evolved is amazing. It is more flexible, simpler to use, more powerful and faster than ever! And I still use it for whatever program I write (even for apps that are neither games nor multimedia).
A project like this looks simple to the naked eye, but it's certainly not! With 6500+ lines of code (and counting), the project is still growing in complexity and functionality, which is way overkill for a resource file format. The programming of multi-level, multi-dimensional inheritance can twist someone's brain. It is complex, difficult and extensive, but I'd say it can be done by a single person. One needs only understanding and time.
The future
Today I'm thinking of a way to make it a public, general purpose, universal resource file format, but the amount of time and resources required is enormous. The integration with the game engine is something that can be overcome but still requires re-engineering. It would require a native C implementation (or at least a client), which is a lot of work. Plus the overall time and money to set up the project, website+domain, the documentation, the support...all while still trying to eat and survive...is really expensive. It's gonna take me a while to materialize a public release. Even more so, if the process becomes unsustainable then I may have to drop it. In the meantime I leave this page as a "Technology Preview Page".
Thank you for taking the time to read this article. This is the kind of thing I invent in my free time. I just follow my intuition: I can hear/feel the god(s) in my mind/heart telling me to do this stuff and how to do it.