Evolvix BEST Names for semantic reproducibility across code2brain interfaces

Names in programming are vital for understanding the meaning of code and big data. We define code2brain (C2B) interfaces as maps in compilers and brains between meaning and naming syntax, which help to understand executable code. While working toward an Evolvix syntax for general‐purpose programming that makes accurate modeling easy for biologists, we observed how names affect C2B quality. To protect learning and coding investments, C2B interfaces require long‐term backward compatibility and semantic reproducibility (accurate reproduction of computational meaning from coder‐brains to reader‐brains by code alone). Semantic reproducibility is often assumed until confusing synonyms degrade modeling in biology to deciphering exercises. We highlight empirical naming priorities from diverse individuals and roles of names in different modes of computing to show how naming easily becomes impossibly difficult. We present the Evolvix BEST (Brief, Explicit, Summarizing, Technical) Names concept for reducing naming priority conflicts, test it on a real challenge by naming subfolders for the Project Organization Stabilizing Tool system, and provide naming questionnaires designed to facilitate C2B debugging by improving names used as keywords in a stabilizing programming language. Our experiences inspired us to develop Evolvix using a flipped programming language design approach with some unexpected features and BEST Names at its core.


Social contracts about cars and computers
Why are we allowed to drive such dangerous machines as cars if we individually do not understand them well enough to build or fix them? The answer comes in the form of implicit and indirect, yet powerful social contracts between the users of cars and the manufacturers of cars who developed the collective expertise on how to make these complicated machines easy to use for most humans (most of the time). While we still need to learn about brakes and steering wheels, we can ignore most technical details. Yet remarkably, knowing how to build a car is not enough for driving today. The collaborative use of many cars on shared roads has given rise to social conventions that are completely unnecessary from a purely automotive technological perspective. But these conventions are vital for the semantic reproducibility of coordination signals required for collaborative driving in concurrent traffic networks. Imagine it was difficult to recognize the right side of the road, whatever that may be.
These non-technical conventions are critical for efficiency as they facilitate fast decisions in life-or-death situations. Not all research is a life-or-death experience, but most research could be made substantially more efficient by social contracts that settle "on which side of the road" to drive, thereby greatly reducing inessential complexity. While care is needed to ensure that such contracts do not inhibit the innovative forging of new paths that necessarily start as single tracks, the initial human genome project demonstrated how much such coordination can contribute to efficiency in research. 1,2 The level of semantic reproducibility achieved by red traffic lights is impressive. Such reproducibility of interpretation would greatly increase the efficiency of computational modeling or any other programming, yet such clarity seems like a distant dream. The gap is exemplified by the "No Warranty" disclaimers in packages frequently used for biological analyses such as Excel, Matlab or R. Naïve readers might interpret these disclaimers to indicate that not even elementary operations like adding two numbers are guaranteed to work. Few non-technical users realize the extent to which this is true, as numerical limitations exhaust the precision of 64-bit floating point numbers before counting the bacteria in a few hundred humans (most CPUs cannot efficiently add a single cell to a count of 2^53 ≈ 10^16 cells, since rounding errors annihilate the addition of numbers that differ too much in absolute value). It is as if computers are silently running out of names for such numbers. The real problem is not the existence of such limits, but rather the absence of warnings, which can make it prohibitively difficult to check whether said problems actually occurred or not.
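The limit described above can be observed directly; a minimal Python sketch:

```python
# Illustrates the integer-precision limit of 64-bit floats: beyond 2**53,
# adding a single unit is silently lost to rounding.
count = 2.0 ** 53          # ~9.0e15 "cells", still exactly representable
updated = count + 1.0      # the added cell vanishes in rounding
print(updated == count)    # True: the addition was silently annihilated

# By contrast, Python's arbitrary-precision integers do not lose the cell.
exact = 2 ** 53 + 1
print(exact - 2 ** 53)     # 1
```

No warning is issued in the first case, which is exactly the problem: the code cannot tell, after the fact, that the addition never happened.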
Addressing the problem of how to calculate correctly in light of numerical limitations is not trivial, 3-12 is the subject of international standards 13-16 and is essential for the reproducibility of computational results in biology. 17 The absence of warnings about ambiguous code that is potentially misleading also creates many unexpected problems in another area of importance for calculations: operator precedence, as seen in the example in Figure Main1A of the main text (when calculating "-3^2").

Every name connects three communicators:
• The naming communicator connecting the name and the item
• The sending communicator using the name for any purpose
• The receiving communicator interpreting the name

For communication to be free of semantic rot (see below), all three communicators have to agree on a shared mapping between the name and its content, which serves as the definition of the name. For this to happen, the sending and receiving communicators have to learn the definition of the name from the naming communicator with perfect accuracy. It is important to note that this triangle of communicators exists irrespective of the nature of the communicators, which could include humans, computers, complex virtual machines, very simple look-up code, hard-wired tools, animals, plants, cells, signaling molecules, phosphorylation chains, and many other biological communication processes, as well as physical interactions between elementary particles or molecules (where communication is defined as interaction and the naming communicator would be the shared laws of physics).
There are, of course, many more communicators than those three that may share a name. Given the infinities of potential priorities for naming and the infinities of potential users for names (which reflect corresponding queries for content), it becomes clear that a great diversity of names can easily exist for any item. This is true even in a perfect world where all communicators would have perfect access to and understanding of the named item, perfect communication while naming, a perfect memory of agreed names (for all time), and perfect agreement about the priorities of naming. The latter implies a perfect understanding of the uses of the name.

Purely random names do not carry any meaning at all, except for their ability to serve as a label for the item they point to; thus they do not provide any hints that might help to remember or reconstruct them, nor do they provide help for finding where they point to. Such names are like islands that can only be found by those who already know where they are. Such "pure" names can also collide. Their uniqueness depends on a very low probability for generating the same name twice by chance. This probability is usually not as low as one might naively suspect (see the birthday problem in hashing below); hence, pure names may need a substantial length if they are to provide a reasonable guarantee of uniqueness among the set of all elements that have ever received and will ever receive such pure names. Details depend on the use case and are ideally known before implementation, as their limitations are hard-coded by the length of the supported hash values.
As soon as the probability of name collisions increases, such "pure" names can turn into a management nightmare if no precautions have been taken to handle the case where two identical names are no longer unique, but instead (are expected to) point to two different boxes, each with its own content (the data structures known as hash tables have efficient mechanisms for that). Below we give a Figure with collision probabilities due to the birthday problem to help choose a reasonable trade-off.
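The birthday-problem trade-off mentioned above is easy to estimate numerically; the following sketch uses the standard approximation p ≈ 1 - exp(-n(n-1)/2N) for n names drawn from a space of N values:

```python
import math

def collision_probability(n_names: int, name_space: int) -> float:
    """Approximate probability that at least two of n_names randomly
    chosen names collide in a space of name_space possible values
    (birthday-problem approximation: 1 - exp(-n(n-1) / 2N))."""
    return 1.0 - math.exp(-n_names * (n_names - 1) / (2.0 * name_space))

# Even a 32-bit name space is exhausted surprisingly fast:
print(collision_probability(100_000, 2 ** 32))   # ~0.69: collisions likely
# A 128-bit space keeps the same number of names essentially unique:
print(collision_probability(100_000, 2 ** 128))  # vanishingly small
```

This is why pure names need "a substantial length": the required hash length follows from the expected number of names ever to be generated, not from the number in use at any one time.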
Classifying properties of names using four simple states. Table Main1 in the main text highlights many more aspects that turn naming into a challenge and provides questions that illustrate their impact on naming. These aspects can be organized in ways that highlight a certain potential property of a name, reflecting some naming priority, aspect, or approach. It is possible to turn many if not all of these aspects of names into "pure essential properties" that define
• a corresponding well-specified "perfect" state of completion for the property,
• a catch-all for one or more complete opposites that totally lack the "perfect" property,
• a catch-all for the many intermediary states, where a bit of the property is present, and
• all other names to which this property is not applicable in any meaningful way.
Importance of automated versioning. When developing software, it is unfortunately very easy to drop the seemingly mundane detail of precise version information for the code that defines the corresponding type of names if versioning is done manually. Tracking such details requires organizational and computational overheads that mostly translate into explicit or implicit naming complexity for implementing a system that allows all communicating parties to reliably deduce type name version information. It is costly to develop such automated versioning solutions for a new application; thus many developers ignore this problem until it is too late. However, such overheads can in principle be greatly reduced by delegating the tedious tracking work to an appropriately implemented compiler. While the cost of such an implementation will likely exceed the cost of any single ad hoc solution for a given application, it is easy to see that moving such automated versioning tools into a compiler provides an excellent return on investment for the whole software ecosystem, as the corresponding capabilities are then ready to use for every developer of every app ("batteries included"). Implementing such ideas greatly benefits from the Flipped Programming Language Design approach chosen for Evolvix; such generalized versioning can quickly become unworkable or unnecessarily complex if a given compiler-language pair cannot adjust some fundamental design decisions that easily get in the way (such as not storing any versioning information).
Versioning by hash and dependency tree. This problem can be substantially simplified if there is a way of delineating unique complex sets of type definitions that change over time. For example, NIX is a functional programming language with delayed evaluation that removes the problems caused by incompatibilities between diverse software packages that demand conflicting dependencies 24. It does so by renaming each version of each software package it installs, combining the usual package name with a long, unique hash value that essentially uniquifies the name. This allows many mutually incompatible versions to peacefully coexist in one system; newly installed packages are then redirected to the corresponding assigned unique names for the right versions of the packages they need. Thus, different incompatible versions that create great headaches on other systems can peacefully coexist if installed by NIX. This approach is powerful enough to run NIXOS 24, a Linux version that is entirely managed by NIX. Thus, a seemingly simple and slightly cumbersome renaming exercise can prevent uncounted hours of debugging nightmares in what is otherwise known as "dependency hell".
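The principle of such hash-uniquified names can be sketched in a few lines of Python; this is an illustration of the idea only, not NIX's actual store-path scheme, and the function and inputs shown are hypothetical:

```python
import hashlib

def uniquified_name(package_name: str, version: str, build_inputs: bytes) -> str:
    """Sketch of NIX-style naming: prefix the usual package name with a
    hash over everything that defines this exact build, so that mutually
    incompatible versions can coexist under distinct unique names."""
    digest = hashlib.sha256(
        package_name.encode() + version.encode() + build_inputs
    ).hexdigest()[:32]
    return f"{digest}-{package_name}-{version}"

# Two builds that differ in any input receive different names:
a = uniquified_name("libfoo", "1.0", b"built against openssl-1.1")
b = uniquified_name("libfoo", "1.0", b"built against openssl-3.0")
print(a != b)  # True: both can be installed side by side
```

Because the hash covers the dependency tree, "the same" package built against different dependencies is, by name, a different package, which is exactly what dissolves dependency hell.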

Naming and cache invalidation are two sides of the same coin
Shared name definitions and caches. The above need for definitions shared among independent communicators that each have their own localized storage creates a deep connection between the problem of naming and the problem of cache invalidation: if the meaning of names is allowed to change, such as when a type definition is unilaterally changed, then how is the other party supposed to know that this meaning changed, if the last copy that defined that meaning is available but not clearly recognizable as no longer valid? If there was a naming scheme that uniquely identified every version of a type that ever existed, by a name that is guaranteed to be unique, and every communication including content of that type were to also communicate the precise unique name of that type (and its version), then the recipient communicator could compare the received type name with the locally stored type name and determine whether an update of the local type definition is necessary for interpreting the content or not. Updating a type definition by downloading (for computers) or asking (among humans) is not difficult in principle, albeit creating a small slowdown. However, not knowing whether a given type definition is actually applicable is a much more complicated decision problem.
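The comparison step described above reduces to a trivial name check once versions are part of unique names; a minimal sketch (the type and version names are hypothetical):

```python
# Sketch: if every message carries the unique, versioned name of the type
# needed to interpret it, the receiver detects a stale local type
# definition by a simple name comparison instead of guesswork.
local_types = {"Sample": "Sample_v1r0"}   # receiver's local cache

def needs_update(type_name: str, received_type_version: str) -> bool:
    """True if the locally cached type definition must be refreshed
    before the received content can be interpreted safely."""
    return local_types.get(type_name) != received_type_version

print(needs_update("Sample", "Sample_v1r0"))  # False: cache still valid
print(needs_update("Sample", "Sample_v2r0"))  # True: local copy invalidated
```

The hard decision problem (whether a definition is applicable at all) disappears only because the unique versioned name answers it by construction.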
Merge conflicts in Git. Thus, content that comes with a correctly controlled version number that references a required type for interpreting itself can easily be used for invalidating a local version that has a different version number. This can be used to highlight "merge conflicts", where the computer cannot easily predict correctly and unambiguously which version is the latest revision that includes the latest changes; the distributed version control system "Git" 25 is very good at bringing such merge conflicts to the attention of users and demands a "simple" naming decision that identifies one of two conflicting files as "more recent" and the other as "outdated" (this is indeed simple for a handful of files that were all recently created, but that is not a case where Git is needed or used; much of the frustration developers experience with Git comes from the challenging nature of performing this "simple" naming task for large sets of changes).
Auto-merged loss of data in the cloud. The same problem of tracking names and content exists for all distributed computing systems, including storage in the cloud. Yet not all systems have naming conventions or other mechanisms for safely handling all cases. An example is given in the main text (Fig. Main1B), where Alice keeps editing a local document that is automatically synced via cloud to Bob's computer. As soon as Alice has finished editing, she tells Bob in real time that she is done. Bob, assuming "the cloud works", starts editing his local copy of Alice's document right away, but does not realize that he is actually working with an older version that does not contain the latest important edits from Alice. If Bob continues to work on a file with the same name, either Alice's or Bob's changes are likely to be lost (and if not lost, they require merging in a painstaking manual process). An equivalent scenario was experienced by some of the authors when using a major "cloud" that was accidentally observed to create such problems (and also to make them disappear again without a trace, deleting Alice's edits along the way; observing this required making local copies of the affected files). To circumvent these problems, the following very simple shared naming convention was developed.

For example, if all agree on the shared naming convention that all changing files end with a suffix such as anyfile_v1r0p1_XY-Busy, this indicates that
• only user XY is allowed to change the file ("busy editing this file");
• everybody else is allowed to copy and read the file, but must never add changes.

All static files end with a suffix such as anyfile_v1r0p2_XY-Sent, which indicates that
• user XY has completed all changes of the file and submitted it ("sent to others");
• changing from "Busy" to "Sent" also creates a new unchangeable version variant;
• nobody (including owner XY) is allowed to change this file, making it immutable;
• everybody can make a copy of the file and change the suffix to "-Busy" after increasing the version variant number and including the initials of whoever intends to change the file (thus taking responsibility).

Usability. As much as this simple solution can solve the problem in principle, assuming perfect adherence to this policy might be challenging in groups that are less than well-trained and highly disciplined. Thus, creating the necessary shared name-space for human users requires much effort, which could be greatly reduced by a labeled transition system that implements a corresponding finite state machine for the states in which any text file could find itself. This system would store all possible transitions to other valid states and support users by automatically following the specified policies for copying/naming/renaming, whenever a file is read or modified.
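A minimal sketch of such a labeled transition system, assuming the file name encodes a version variant, a user, and a state (Busy or Sent) as in the convention above, and offering only the transitions the convention permits:

```python
import re

# File names follow the convention stem_v<V>r<R>p<P>_<USER>-<STATE>.
NAME = re.compile(
    r"(?P<stem>.+)_v(?P<v>\d+)r(?P<r>\d+)p(?P<p>\d+)_(?P<user>\w+)-(?P<state>Busy|Sent)$"
)

def submit(filename: str) -> str:
    """Busy -> Sent: the owner freezes the file under a new, immutable
    version variant (the variant number p is increased)."""
    m = NAME.match(filename)
    assert m and m["state"] == "Busy", "only a Busy file can be submitted"
    return f"{m['stem']}_v{m['v']}r{m['r']}p{int(m['p']) + 1}_{m['user']}-Sent"

def take_over(filename: str, new_user: str) -> str:
    """Sent -> Busy: a copy with an increased variant number and the new
    editor's initials (taking responsibility); the Sent original stays."""
    m = NAME.match(filename)
    assert m and m["state"] == "Sent", "only a Sent file can be taken over"
    return f"{m['stem']}_v{m['v']}r{m['r']}p{int(m['p']) + 1}_{new_user}-Busy"

print(submit("anyfile_v1r0p1_XY-Busy"))           # anyfile_v1r0p2_XY-Sent
print(take_over("anyfile_v1r0p2_XY-Sent", "AB"))  # anyfile_v1r0p3_AB-Busy
```

Because invalid transitions simply cannot be expressed, the discipline the convention demands of human users is enforced by the tool instead.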
Semantic rot. This example also illustrates how semantic rot works. Semantic rot first affects the file name (which no longer links to what Alice and Bob think it links to). This triggers more semantic rot, either by dropping edits or by reducing the time that Bob or Alice can spend on fighting semantic rot elsewhere. Scenarios like these are becoming increasingly important as applications migrate to and through the clouds of distributed computing.
The name of the coin. This section would be incomplete without a tell-tale episode illustrating how difficult it can be to draw conclusions about naming that are rather obvious with hindsight (or from a different perspective). We had been using the "two sides of a coin" metaphor for a very long time before one of us eventually realized the irony in our situation: if naming and cache invalidation are really two sides of the same coin, representing the struggle to appropriately share the same reality among multiple interacting communicators, then "naming" and "cache invalidation" are indeed two names for something so big, important, and challenging that it integrates the two biggest problems in computer science. The fact that we struggle to name this, even if we use the length of a Summarizing Name, does not reduce its real impact on our struggles to write reliable distributed computing code or to name the parts of biological systems consistently. While it is clearly beyond the scope of our study to attempt finding an appropriate single name, it is clear that different dialects and synonyms could greatly illuminate model building if such deep connections can be captured appropriately.

Life cycles of names
In order to avoid problems like those of Bob and Alice in distributed computing scenarios, a well-defined life cycle for names becomes critically important. Here is not the place to define this fully, so a brief sketch has to suffice. The life cycle of a name starts with the realization that a type is needed for referencing an item that is to be named. Both type and item require an implicit name in the form of a storage location before there is any name that can possibly become meaningful. Thus, questions of naming are related to questions of memory management, since the relative location in memory is equivalent to an implicit name that can always be given (even if most programming languages never allow accessing it). However, even if the content of a name did not require storage, the name itself still could. The life cycle of names is intricately intertwined with the life cycles of any other bodies of information that are described as text and employ some form of code in which names matter. Such life cycles of sequences of characters include the combined changes of any program ever written, the full set of updates to any database ever compiled, the software development life cycle in general, the provenance of each entry in an ontology as well as the life cycle of the ontology itself, the cyclic improvement of scientific models that introduce and drop names as needed, the revisions of legal laws, books, dictionaries, etc., and finally the evolution of language itself. It is often impossible to tell where one of these life cycles stops and another one starts, as they form an intricate net of definitions that might also be captured in some form of ontology computing (as defined in Table Main2 of the main text).

Code2Brain interfaces
One way code. In Fig. Main1A of the main text, an overview of C2B interfaces is given. Every time a human brain inspires the writing of code, the B2C interface translates the ideas in the brain into characters in a file that can be compiled. Computer science has excelled at developing methods of ensuring that the code written by humans is translated correctly into the binary code that a CPU can execute. This combination of B2C and C2C interfaces has powered much progress in computer science. However, this model works best if an isolated programmer writes the code and uses it only as long as her brain remembers the meaning of the various names and structures that are important for understanding the code. On some occasions this can be a matter of weeks or months, on others it can be years or decades. The details depend on the complexity of the problem, the documentation of the code, the knowledge of the programmer, but to a very large degree also on the quality of the names that are used in the program source code for denoting variables, functions and other entities.
Language definitions. One might think that programming languages are orthogonal to this problem, because they are carefully designed and documented, far more so than most specialized applications. Thus one might expect little ambiguity about what a certain syntax means. This may be true for the designer of that language and everybody who knows the specification. However, it does not necessarily hold for all code written in that language. Why?
Writing code in a misleading language. Before a coder can write code, the coder needs to read other code that somewhat resembles fragments of the code to be written. Thus the coder needs to use her C2B interface for reading code and constructing a mental representation of the meaning of this code. The fragments of meaning implied in a programming language definition are organized by a programmer into a sequence of expressions that she believes can solve the problems the code is written for. If the coder interprets (i.e. reads) code wrongly, then semantic rot at the C2B interface will have impacted that coder's ability to write reliable code. If this becomes a learned error, it will later translate into a cascade of errors, as this misleading mental representation is repeatedly used to create code with corresponding errors of meaning, with multiplicative effects on all users of the code.

Reading code in a misleading language. New developers who might join the effort or try to debug a program will also use their C2B interfaces to construct a mental picture of what the program is doing and why this might cause the intended results or not. This process can be greatly accelerated by compilers that automatically check for common errors by alerting programmers to the precise line and symbol that created the problem. However, errors not caught by the compiler will still need to be resolved by users. It can be useful to distinguish the severity of coding errors as discussed next.

Coding Error Classes
It is not reasonable to expect any non-trivial program to be free of errors from the start. As compiled by Dr. Ray Panko on the Human Error Website (see http://panko.shidler.hawaii.edu/HumanErr/Index.htm for more details and references), the basic human error rate for typing or other mechanical activities is about 0.5%, with some variability that depends on the precise activity. Experts can be substantially more accurate, but rarely reduce their error rates more than 10-fold compared to non-experts. Rates of logic errors are higher than those of mechanical errors, and logic errors are also more difficult to detect. Yet by far the most difficult errors to detect are errors of omission.
Thus getting computer code to work correctly is always associated with substantial efforts that are summarized under the label "debugging". Over the years, computer scientists have accumulated a substantial arsenal of weapons for automatically detecting as many "bugs" as possible when compilers analyze code in order to translate it into other code that can be executed more directly on a machine operating at a lower level.
Programming errors can be organized in a hierarchy according to how difficult they are to find (and thus how much damage they can cause); difficulty increases with each point:
• Literal (insect) bugs: long gone are the days when small crawling insects could cause computer malfunction by walking across sensitive electronic components.
• Known errors for which compilers create clear and accurate error messages. While these might frustrate beginners by standing in the way of a working program, experts often recommend having compilers report as many errors as possible; every error caught earlier does not have to be caught later at much greater cost.
• Errors that always crash a program at run time are next easiest.
• Errors that often crash at run time and produce obvious nonsense help by making their existence clear beyond the shadow of a doubt.
• Errors that rarely crash and usually, but not always, produce full results can be very difficult to fix, because most of the time it is not clear that they exist at all, and if they do, it is often not clear how to reproduce them.
• Errors that are rare can be very difficult to find on distributed computing systems, where no two runs of the program will result in exactly the same execution path because of a great variety of diversifying circumstances. It can be useful to consider what randomly changed bits can do; some programming errors can cause computers to behave as if the crawling insects of old could again randomly make connections, or as if a cat walked over the keyboard, or as if obviously malicious input were being provided. Good programs catch these cases.
• The second-hardest errors to catch are errors in the logic of the program, because these require a deep understanding of the purpose of the program and its problem domain. These errors are often caused by omissions (what should be there, but isn't).
• The hardest errors to catch are errors in the logic of the programming language, because programmers assume that the language designers will have done a good job by selecting a logic that adequately represents the problem domain of the programming language.
Programmers can always program around the deficiencies of any Turing-complete language; however, this requires solving the same problem without the support of a compiler infrastructure and without the freedom to define an appropriate syntax. Figure Main1A illustrates a few examples where semantic rot is caused by poor naming of constructs in programming languages that lead to ambiguities which are interpreted in one way by many and in another way by many others.
Even though Excel and R interpret -3^2 differently, neither produces an error message warning users that their assumptions (C2B) could be wrong. It may come as a surprise to many programmers that even a frequently used operator such as the logical "AND" can easily be misinterpreted as "UNION" in the context of the Venn diagrams so frequently used in biology, even if programmers meant to obtain the "INTERSECTION" of two sets.
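The precedence ambiguity can be shown in Python, which, like R, binds unary minus more loosely than exponentiation, whereas Excel effectively negates first:

```python
# Python parses -3**2 as -(3**2): exponentiation first, then negation.
print(-3 ** 2)    # -9: the R-style reading
# Excel's parsing of -3^2 corresponds to (-3)**2:
print((-3) ** 2)  # 9: the Excel-style reading
```

Neither reading is "wrong"; the problem is that both are silently plausible, so code moved between tools can change its meaning without any warning.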

Defining Semantic Rot and Semantic Reproducibility
Let us assume the total amount of meaning M_t in a message M can be quantified by a measure of semantic units (such as the number of correctly re-identified elements of a set or the percentage of bits of a message; the details do not matter here). Naïve estimates of the meaning M_tD that a receiving communicator D obtains from a sending communicator E grossly underestimate the true communication problems, which also include those that D is unaware of (M_tD,Unknown2Miss, M_tD,Unknown2MadeUp). Thus, a careful comparison with E would reveal that D actually only received:

M_tD = M_t - M_tD,Unknown2Miss - M_tD,Unknown2MadeUp

where M_tD,Unknown2MadeUp is subtracted, as it replaces the resting and non-confusing silence sent from E (such as before or after sending) with confusing and draining misinformation. It is interesting to note that in most real-life circumstances M_tD < M_t, and that equality can only be approached under the ideal conditions where D can somehow guarantee to catch and reverse all instances where it would otherwise make something up and not know it, and to catch and remedy all instances where it would otherwise miss something and not know it (assuming such a remedy is possible with E and eventually leads to success). Now we can define the semantic reproducibility, R_e, of content as it is in the process of being transmitted from E to D:

R_e = M_tD / M_t

Similarly, we can define semantic rot, R_t, as any subtraction or addition of the original message:

R_t = 1 - R_e = (M_t - M_tD) / M_t

For this to work, sender and receiver of M would of course have to agree on how to measure semantic content. Generally, any system X that can measure the semantics of another system Y has to be more expressive (i.e. be able to differentiate and capture more cases of interest) than the system that is being measured. Thus, Kurt Gödel's Incompleteness Theorem would apply here 26,27. This theorem states that it is impossible for a sufficiently expressive formal system S (that is, one capable of performing basic arithmetic operations) to assign a truth-value to each of the statements in S, nor is it possible for S to prove that its derivations will always be consistent.
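A toy quantification of these definitions, counting "semantic units" as set elements (the element names are purely illustrative):

```python
# Sender E transmits a set of semantic units; receiver D reconstructs some
# of them, misses others, and makes one up.
sent = {"geneA", "geneB", "geneC", "geneD"}   # meaning E transmitted (M_t = 4)
received = {"geneA", "geneX"}                 # what D reconstructed

missed = sent - received      # units D failed to receive (Unknown2Miss)
made_up = received - sent     # misinformation D invented (Unknown2MadeUp)

# Effective meaning received: made-up units are subtracted, since they
# replace non-confusing silence with misinformation.
m_td = len(sent) - len(missed) - len(made_up)

r_e = m_td / len(sent)   # semantic reproducibility
r_t = 1.0 - r_e          # semantic rot
print(r_e, r_t)          # 0.0 1.0: one correct unit, fully cancelled
```

Here D correctly received one of four units but also invented one, so under this accounting the made-up unit cancels the correct one and reproducibility drops to zero, illustrating why misinformation is worse than silence.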
To accomplish these tasks, one needs a more powerful formal system S*, whose own completeness and consistency also cannot be proven without referring to a yet more powerful formal system, and so on. Therefore, it is much easier to transmit a message without semantic rot than to precisely quantify the amount of transmitted rot without severely limiting the semantics of the message, such that it is too simple for Gödel's theorem to apply to it.
Such greatly simplified messages are completely defined by their type systems, which live between two extremes: they are either guaranteed to be correct, which means they are so simple that they are approximately useless outside of extremely well-defined narrow use cases; or they are powerful enough to be used for anything in principle, which makes them so complex that it is impossible to guarantee their correctness. Thus, quantifying the semantic reproducibility of scientific research would have to rely on a type system with a logic that very closely mirrors the physical world. How to construct such a system, which has to deal with ambiguity and vagueness in a rigorous way, remains an open question in logic research 28. In our own research, we have repeatedly encountered situations where the Boolean logic that allows only for true and false statements falls short: sometimes the answer is precise but represents a probability between true and false, and sometimes it is impossible to compute an answer on time, a case in which it would be inappropriate to apply Boolean logic.
These limitations of Boolean logic are likely to repeatedly trigger the need to work around their limits when constructing biological models where many statements are less than clear-cut; yet, the need for imprecision and non-applicability in biology does not reduce the utility and efficiency of Boolean logic for many cases where it applies. In Evolvix, these tensions are being resolved by developing the data type "BioBinary", which requires two bits for storing one of the four states of the "OKScale", which helps to better capture the many exceptions observed in biology, without artificially forcing them into inappropriate True or False statements that generate semantic rot by forcing all code to pretend a level of clarity that simply does not exist. Using the Brief, Explicit, and Summarizing Names for the Type OKScale, which will become a keyword in Evolvix (hence no leading dots):

OKS OKScale OKScale_UsedFor_Enumerating_BasicAlternatives_of_BioBinary_Types
Similarly, each of the four alternative states (including some synonyms) is defined analogously. Since closed-world systems like computer programs are well-defined, and naming presents the enormous challenges discussed above, it is possible to summarize a formally rigorous naming approach as follows: time spent on getting the naming balance right is time spent writing documentation, which ideally should help outsiders without the current naming context to understand the meaning of the various short names that are not understandable outside of this particular type of context. Thus again, the definition of a type system (here for context) carries the information necessary for rigorously interpreting a set of names.
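A hypothetical sketch of such a four-state, two-bit type in Python follows; the member names below are placeholders of our own choosing, not the actual Evolvix OKScale state names, but they illustrate how four states avoid forcing unclear cases into True or False:

```python
from enum import Enum

class FourState(Enum):
    """BioBinary-like four-state logic; two bits suffice for four states.
    Member names are illustrative placeholders, not Evolvix keywords."""
    OK = 0b00              # clearly true
    NOT_OK = 0b01          # clearly false
    INTERMEDIATE = 0b10    # precise, but a probability between true and false
    NOT_APPLICABLE = 0b11  # e.g. no answer computable on time

def from_probability(p):
    """Map an observation to a state without pretending fake clarity."""
    if p is None:
        return FourState.NOT_APPLICABLE
    if p >= 1.0:
        return FourState.OK
    if p <= 0.0:
        return FourState.NOT_OK
    return FourState.INTERMEDIATE

print(from_probability(0.7))   # FourState.INTERMEDIATE
print(from_probability(None))  # FourState.NOT_APPLICABLE
```

Code written against such a type is forced to handle the intermediate and not-applicable cases explicitly, which is precisely where Boolean code silently generates semantic rot.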
Experts of a local context who use names frequently tend to abbreviate them; outsiders new to the context usually prefer longer names carrying more of the information that connects them to the semantics defined by the given context. This is the essence of communicating by pointing to a "thing", stating that it does some "thing", which is clarified in a very specific context by pointing to another "thing". An equivalent email reply, "please find my answers in the link https://drive.google.com/open?id=alcnekwejtb", was received as part of this study and made perfect sense to the recipient at the time.
A more serious example closer to programming languages is given in Figure Main1A. The statement "Set A is Set B AND Set C" is surprisingly ambiguous, given how many people consider it to be unambiguous: hard-core logicians will recognize the logical "and" in a Venn diagram and conclude that "A" is the intersection; many other persons, upon reading "B AND C", imagine areas and convince themselves that both areas are included, hence claiming that "A" contains the union. This has contributed to the resolve to rename logic operators in Evolvix in order to increase the clarity of the language for non-programmers (for fragments of this work in progress, see Section 6.1 on pages 26-27).
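The two readings can be made concrete in a few lines of Python (the set contents are invented for illustration): the logician's "AND" is set intersection, while the lay reading yields the union, and the two answers differ:

```python
# "Set A is Set B AND Set C": which set is A?
B = {"host", "parasite", "vector"}
C = {"parasite", "reservoir"}

A_logician = B & C   # intersection: only elements in both sets
A_lay      = B | C   # union: "both areas are included"

# The two interpretations agree only when one set contains the other.
assert A_logician != A_lay
```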

The cost of names
Explicit and implicit names have fundamentally different cost structures. For implicit names, the cost of naming is non-existent, because items already have implicit names that are ready to use and defined by a specific type context (array indices, geolocation coordinates, and more). However, the cost of interpreting such implicit names can easily become prohibitive if the cost of understanding the semantics of the context exceeds the time and processing capabilities of the communicator aiming to use those names without semantic rot.
For explicit names, the cost of naming in a way that avoids semantic rot can be prohibitive, because the time and information available for naming may not suffice to find an appropriate name that is correctly interpreted by a very large number of communicators. This cost is only worth paying when very large numbers of communicators actually need to use a name correctly and the consequences of incorrect use are severe.

Common naming problems in programming and modeling
The concept of BEST Names was developed by modelers to handle synonym overload in biology. It became clear that naming in the following use cases also involves related trade-offs between longer, more readable names and shorter, more cryptic names.

Beginners vs Experts
Beginners and experts need convenient access to the same semantic units but have different preferences. Beginners like names with more information about the semantics of an identifier, to reduce lookup time. Experts remember and abbreviate, to reduce processing time, even if outsiders can no longer follow. Speed and readability are both important, but can rarely be satisfied simultaneously by identifiers with a single name. A combination of parser and 'pretty-printer' could automatically translate between different dialects (assuming all names have been defined appropriately), allowing readers to choose their own readability trade-off.
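Such a parser/pretty-printer translation can be sketched as follows, assuming hypothetical dialect tables keyed by canonical identifiers (all names and tables here are invented for illustration, not part of any Evolvix implementation):

```python
import re

# Hypothetical dialect tables: canonical identifier -> dialect name.
BRIEF    = {"population_growth_rate": "r", "carrying_capacity": "K"}
EXPLICIT = {"population_growth_rate": "population_growth_rate",
            "carrying_capacity": "carrying_capacity"}

def pretty_print(code, src, dst):
    """Translate identifiers in `code` from dialect `src` to dialect `dst`."""
    reverse = {v: k for k, v in src.items()}  # dialect name -> canonical
    def swap(match):
        canonical = reverse.get(match.group(0))
        return dst.get(canonical, match.group(0)) if canonical else match.group(0)
    return re.sub(r"[A-Za-z_][A-Za-z0-9_]*", swap, code)

expert_code = "dN = r * N * (1 - N / K)"
novice_code = pretty_print(expert_code, BRIEF, EXPLICIT)
# novice_code: "dN = population_growth_rate * N * (1 - N / carrying_capacity)"
```

A real implementation would work on a parse tree rather than raw text, so that strings, comments, and scoping are handled correctly; the dictionary round-trip shown here is only the core idea.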

Code vs. Math
Names ideally follow the priorities and culture of the environment in which they are used. Thus, identifiers in source code should reflect at least some aspects of their semantics, even if abbreviated. Names in mathematical equations are usually reduced to single letters with subscripts. Abstracting such details away maximizes compactness and facilitates a quick grasp of the structure of mathematical expressions, at the cost of hiding most biological semantics in systems biology models. The resulting semantic irreproducibility can have severe consequences, for example when confusing the indices in (n1 - n2) swaps hosts and parasites. Code in programs is often more explicit, to facilitate debugging. A model analyzed in both code and equations can thus add two manually managed 'dialects' per publication, which is inherently error-prone and hinders reproducibility. The manual work involved could be much reduced by automated support (e.g. for detecting inconsistencies).

Compare different implementations of the same model
It can be desirable to have independent implementations of simulation models of the same biological system, yet building these often adds synonyms. In the absence of automated support, a precise comparison of the different names in these codes can cost substantial amounts of time for complex models.

Getting different models to interact with each other
It can be advantageous to combine independent models addressing different aspects of a larger problem. For example, different biochemical pathways in the same cell are usually studied by different researchers. To study interactions between these pathways, the models will need to be thoughtfully combined to ensure that different molecules get distinct names in the combined model, even if they had the same name in their original, more limited models (see the Amylase example). Conversely, identical molecules should get identical names, even if they were named differently originally (e.g. 'H2O' and 'Water', each from a different model, should map to a common identity). Adjusting identifiers manually is easy in small models, but the lack of automation quickly becomes prohibitive in large ones. These problems have stimulated the development of tools for extracting specific reactions from the Biomodels database to reuse them as bricks in larger, combined models (e.g. 31). Wrestling with this combination problem has also revealed a fundamental difficulty in applying the concept of 'modularity' as currently understood in software engineering to 'modularity' in living cells 32. Software modules usually seek to hide details behind small, elegant interfaces, which are then said to become implementation independent ('black-box modules'). Cells operate in the opposite way. They are made of 'white-box modules' that all share one name space (with varying degrees of probability, as mediated by spatial structure in cells). Since every molecule in a cell can in principle interact with every other molecule in the same cell, Neal et al. 32 developed the "semantics-based adaptable interface modularity" (SAIM) approach to building white-box models. It recommends that entities carry their semantics in their names. This can be a daunting proposition as the semantics become more intricate, or it can limit the precision of models if no appropriate measures exist for dealing with 'imprecise names'.
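The bookkeeping for unifying synonyms when merging two models can be sketched as follows (species names and copy numbers are invented; a real tool would also have to disambiguate clashing names that denote different molecules, e.g. by prefixing them with their model of origin):

```python
# Two hypothetical pathway models: species name -> copy number.
model_a = {"Water": 1000, "ATP": 50}
model_b = {"H2O": 1000, "Glc6P": 7}

# Curated identity map: names that look different but denote
# the same molecule are sent to one canonical name.
synonyms = {"H2O": "Water"}

def merge(a, b, synonyms):
    """Combine two models, unifying synonymous species names."""
    merged = {}
    for name, count in list(a.items()) + list(b.items()):
        canonical = synonyms.get(name, name)
        merged[canonical] = merged.get(canonical, 0) + count
    return merged

combined = merge(model_a, model_b, synonyms)
```

The hard part in practice is not this loop but constructing the `synonyms` map itself, which is exactly the curation work that nomenclature committees and tools like SAIM aim to support.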
Many formal systems are Turing complete, and a system for describing 'the' semantics of entities in names is likely to be either very limited or Turing complete (i.e. able, in principle, to express any computation ever performed, given infinite resources). While it is unclear where the limits of white-boxing are, Neal et al. 32 did computational biology a service by highlighting that white-box code-level coupling can bring all elements into the same namespace without fear of confusion, since all elements have been properly named, so that only those elements interact that are supposed to 32. Such semantics-based interfaces require careful management and automated support to become reliable in light of human typo error rates 32, 33. Installing and maintaining these tools might be too involved for ad-hoc use.

Reading and documenting a foreign code base
Much time in modeling is invested in trying to understand models by reading the code of strangers (a category that may include the author of said code after a year); hence, no quick help from friendly colleagues is available. Venturing into a foreign code base is a research expedition unto itself. It ought to be possible to record notes of key findings (e.g. on the meaning of x). Ideally, such findings go where they were found and are likely to be used again. Translating the names of variables, functions, and classes from the naming idiosyncrasies of one programmer into those of another is a never-ending, thankless task. As a reviewer of this paper pointed out, everyone has their favorite naming conventions. We can confirm from experience, and paraphrase: everyone has a favorite coding dialect. And it is usually not one of the four identified in the Evolvix BEST Names concept; thus, a proper implementation of BEST Names must facilitate the creation and management of arbitrary new dialects. Languages whose compilers have learned to check for naming clashes and can 'pretty-print-translate' code between any dialects will stimulate new ad-hoc dialects that help navigate a new code base from a different perspective (while guaranteeing the integrity of the original code). Scientific software is often implemented in contexts that provide little time for documentation. Hence, tools that facilitate repeated rounds of editing could make coding more pleasant by making it easier to find good names.

Dynamic names are challenging
Situations where names change faster than processors can update their (cache) memory of them provide another way of showing why cache invalidation and naming are very closely related problems (see discussion above and in main text).
While it may seem that cache invalidation is only important for the memory of fast CPUs, it is equally important in research, where (i) many researchers can work for a long time before a new phenomenon becomes clear enough to give it a non-confusing name, which implies many confusing updates to local 'brain-caches' along the way; and (ii) the same discoveries can be made independently by different researchers, resulting in different names, as happens often in biology.
This use of different names creates the equivalent of undetected invalid cache memory and generates much work for nomenclature and ontology committees who seek to reduce confusion from such invalid entries (e.g. names that look different but really point to the same thing underneath, or one name that denotes a collection of things that need to be identified individually). The same dynamic unfolds in most non-trivial software development projects: code structures may need names long before their true purpose is clear enough to give them an appropriate one, and different programmers may independently develop parallel structures. Little can be done to avoid these fundamental difficulties of naming in principle (yet taking a few moments to find a good name before naming a data structure is surprisingly useful).
However, computers could make the human task much easier by offering their stellar memorizing, sorting, and comparing capabilities to help humans find the needles in the haystacks that cause problems. This is essentially what the integrated development environments of modern compilers do: automate and simplify code navigation. Still, there is much room for improvement beyond providing simple graphical user interfaces or other well-known tools. The goal is to get computers to produce as much modeling code as possible automatically, without loss of accuracy, while providing a more efficient Code2Brain interface that preserves semantic reproducibility for all who attempt to work on a model.

Blacklisting confusing keywords in simulations of biology
It can take considerable effort to identify the confusion caused by some commonly used names in trans-disciplinary research environments and to identify alternatives. Hence, storing the results of such lessons learned might be worth the effort.
In Evolvix, the goal is to avoid computer-science jargon words that also have a clear meaning in biology and might reasonably be used in a model to describe actual biology. Avoiding such jargon can be difficult, since it is quite entrenched. However, it is worth the effort, since it greatly simplifies discussions of Evolvix code across the domains of code and biology.

Uniquified names by versioning or by hashing
It is often desirable to quickly and systematically obtain names that are either guaranteed to be unique or that are unique with such a high probability that naming collisions can reasonably be assumed not to occur over the lifetime of a given system; we call such names "uniquified".
Algorithms that produce such names are of particular interest for BEST Names implementations, as they provide the uniquified names required for defining Stable Meanings.
Versioning. To guarantee uniqueness, a versioning system can be employed that increments a counter whenever a new version needs to be uniquified. Such names can be small and the system fast; however, it requires central management of the version counter, and hence a communication cost that can be prohibitive in distributed-computing contexts.
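The counter-based approach can be sketched in a few lines; the class name and `prefix_vN` naming scheme below are illustrative choices, not a prescribed format:

```python
import itertools

class VersionedNamer:
    """Guarantees unique names via one central counter.

    The guarantee has a coordination cost: in a distributed system,
    every participant must reach this single counter before naming.
    """
    def __init__(self, prefix):
        self._prefix = prefix
        self._counter = itertools.count(1)  # central, ever-increasing

    def next_name(self):
        return f"{self._prefix}_v{next(self._counter)}"

namer = VersionedNamer("OKScale")
```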
Hashing. An alternative strategy based on hash functions does not depend on such central management, but in return offers no guarantee of uniqueness and generates names of considerable size. Such names, unambiguous instantly (and, most likely, forever), are generated by applying a hash function to the unique content to be named, provided the resulting hash has sufficient length and randomness.
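The hash-based alternative can be sketched with Python's standard library; the choice of SHA-256 and a name length of 16 hex digits (64 bits) is an arbitrary illustrative trade-off between name size and collision probability:

```python
import hashlib

def uniquified_name(content: bytes, length: int = 16) -> str:
    """Content-derived name: a hex prefix of a SHA-256 fingerprint.

    No central counter is needed, and the same content always yields
    the same name; collisions are merely improbable, not impossible,
    so `length` must be chosen generously for long-lived systems.
    """
    return hashlib.sha256(content).hexdigest()[:length]

n1 = uniquified_name(b"model v1: dN = r*N")
n2 = uniquified_name(b"model v2: dN = r*N*(1-N/K)")
assert n1 != n2                                      # distinct content
assert n1 == uniquified_name(b"model v1: dN = r*N")  # deterministic
```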

Figure S5-1: How likely are collisions between uniquified names?
The event to avoid for uniquified names is a hash collision, in which two or more randomly generated hashes end up being the same (and thus, by definition, are no longer unique). The probability of this event is predicted by the birthday problem, which gives the probability that any two or more persons in a group of given size share a birthday; the name reflects the observation that in any room it is much more likely than most would suspect that two persons have their birthday on the same day. The length of a hash value and the number of hash values used determine the probability with which randomly selected hashes will collide after a given large number of uses. The birthday problem is well understood under the assumption that the content and the hash function are both indeed perfectly random. While this is a reasonable initial assumption for most purposes if the hash values are long enough, it is important to note that hash functions are still an active area of research, and deviations from perfectly random distributions usually increase the probability of collisions substantially.
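Under the perfect-randomness assumption, the standard birthday-problem approximation P ≈ 1 − exp(−n(n−1)/2^(b+1)) for n names of b random bits can be evaluated directly; the parameter choices below are illustrative only:

```python
import math

def collision_probability(n_names: int, bits: int) -> float:
    """Birthday-problem approximation for n names of `bits` random bits.

    P(collision) ~= 1 - exp(-n*(n-1) / (2 * 2**bits)),
    assuming perfectly random names (a real hash function can only
    be worse than this idealized bound).
    """
    return 1.0 - math.exp(-n_names * (n_names - 1) / 2 / 2.0**bits)

# One billion 64-bit names: collisions are already a few-percent risk.
p_small = collision_probability(1_000_000_000, 64)

# The same billion names at 256 bits: collisions are negligible.
p_large = collision_probability(1_000_000_000, 256)
```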
The names computed by hash functions are either pure names (when the input to the hash function is completely independent of the content to be named) or their polar opposite, where all content of the box to be named is fed into the hash function to generate a hash value that fingerprints the full content.
In such systems, names can be used to lock in guaranteed content. They enable elegant data structures, such as blockchains and Merkle trees 34. These rather abstract-sounding advances are critically important for implementations of many important applications, such as Git, peer-to-peer file-sharing networks, the ZFS file system, and many other systems that provide guarantees for the correctness of the content they store. Figure S5-1 illustrates some of the trade-offs involved in selecting hash functions.
Hash functions, collisions, and versioning are all active areas of research to be explored elsewhere.

Perspectives on naming from the humanities
The following two sections report on a rather unique project involving two humanities editors who employed their English expertise to improve the overall clarity of a programming language design for non-programming outsiders. Improving clarity is a key goal for the development of the Evolvix modeling language, which aims to greatly simplify the construction of biological models (and hence also programming) for biologists.
Please refer to the main text for a brief overview of "Flipped Programming Language Design," a process by which important names are rigorously and repeatedly reviewed by developers and potential users before implementation of the programming infrastructure that makes a compiler work. Such implementation simultaneously locks into place many of the big pillars of a language, such as logic operators, elementary math, basic data structures, and many other things that cannot, or should not, ever be changed in a language without changing its name. The following two texts were written independently and have been edited only slightly to improve clarity for technical readers.
Summary: It is well worth listening to very diverse voices; cross-disciplinary work highlights different assumptions made in separate fields. The value of a clear grasp on English syntax and semantics is paramount for the semantic reproducibility of program code across Code2Brain Interfaces. Experiences like these suggest that a humanities education could have more value than might otherwise be assumed by a software industry that aims to produce stable, maintainable and understandable code. After all, the clarity of code is mediated by the clarity of names given in the code. What is true of software application code and software libraries is even more true for programming languages.
While many programming concepts require training in formal logic, semantics, syntax, and beyond, much of their formal elegance is lost when using them turns into an exercise in memorizing syntactic rules and exceptions that are unnecessarily complicated due to arbitrary language-implementation decisions or poor naming strategies. This accidental complexity may not matter for computer science students who work with computers all the time and are steeped in such exceptions. For those without such training, the concepts are difficult enough; the inessential complexity 35 resulting from poorly chosen names, and the resulting substantial increase in the time needed to learn the basics, means that many will not be able to learn the concepts, even if they otherwise could, and actually would benefit from the use of a programming language in their research.
We do not need very many programming languages; just one that works as expected would be enough. Engaging editors from the humanities is not a silver bullet that could create such a language if there were not already an idea for how to build one. However, they surely help avoid many of the semantic cliffs that are easy to miss when we trade clarity for speed. Maybe pair programming is so effective because all code will have had to cross at least one more brain before it is committed.

Experiences with Flipped Programming Language Design in Evolvix Perspective of Editor 1
My background in the humanities became very useful in unexpected ways after I joined the lab to support the development of Evolvix, a programming language being designed to maximize ease of use for intelligent non-programmers. I have since been highly engaged in the naming process for some parts of Evolvix, most notably the renaming of the well-known truth-function operators used in Boolean logic and the development of what we called the Evolvix Stability Schema Scaffolding for supporting the organization of work and data in Evolvix (to better reflect its general usefulness, we renamed it the POST Network while finalizing this paper; it is described in the Supporting Online Material of this paper).
It should come as no surprise that my lack of training in computer science and advanced mathematics meant that some programming concepts required for discussing the syntax of a general model description programming language were difficult to understand. Far more surprising, however, was the ease with which I was able to join the discussion and make important contributions. My ability to productively interact with the group resulted less from my formal training in historical research and more from my experiences trying to communicate history research results in accessible prose during a decade of study in the field. Three years of experience editing undergraduate papers as a teacher also helped hone my abilities to examine language in fine detail and guide others as they tried to clarify their ideas.
The following example presents a single, early contribution that, I believe, enhanced the clarity of truth functions in Evolvix. While I have made many more contributions in a variety of ways, this one provides an easy-to-understand example that suggests the benefits of including trained writers in the language design process. Afterward, this essay concludes with sets of strengths and weaknesses that I see in the Flipped Programming Language Design Process.
As early as my first week in the lab, I began assessing the semantics of truth-function operators as I interpreted them from the names presented to me by other members of the lab. While I understood such words as "AND" and "OR" in the common English usage, I had no prior experience using them as operators in a formal logic context. Before I joined the lab, our members had identified several semantic ambiguities in how logic operators were interpreted from the perspective of non-logic-trained biology undergraduates. They had made significant progress in renaming them so that someone without training in programming or formal logic might understand them intuitively. These earlier renaming efforts facilitated my understanding of the concepts, as did the Venn diagram representations attached to each name. However, several of the then-current names still confused me, including the following:

OnlyUsual, NotOnlyUsual, OnlyOther, NotOnlyOther

Here the terms "Usual" and "Other" represent input set 1 and input set 2, respectively. Lab members had previously determined that a good way to define operators whose outputs contained values strictly in the Usual or Other set would be to use the word "Only" (first and third names above). To get an output of all values except those strictly in the Usual or Other set, we should only have to negate the "Only" statement. However, this naming strategy seemed ambiguous to me: "Only" can function as an adjective and an adverb, which means it can modify either "Usual" or "Not". Thus, while lab members interpreted the term as "Not(OnlyUsual)", a negation of OnlyUsual, an inexperienced user could easily interpret the same name as "(NotOnly)Usual", which could mean everything in the combined input sets, or even everything literally, whether known or unknown.
After some discussion, I suggested that the solution would be to find a short, commonly used word that could only be an adjective (and therefore only modify a noun) and that conveyed a sense of exclusivity. I suggested "Pure", and a brief examination showed that it was a good solution: its denotation is "unmixed", and it cannot be misconstrued as an adverb. This revision was done remarkably quickly, despite the fact that the previous best solution had been stable for quite some time; the solution I brought as an outsider to the problem stands until today and looks as follows:

PureUsual, NoPureUsual, PureOther, NoPureOther

Note that, with the revised names, it is still possible to put brackets around parts of a name, but now the parentheses only make sense in the following reading: "No(PureUsual)". If an inexperienced user were to mentally place parentheses around the first two words, "(NoPure)Usual", the term becomes nonsensical: "NoPure" would indicate the absence of any pure output, whether Usual or Other, but the use of "Usual" still allows for an output of "PureOther", creating a contradiction that is easy to catch intuitively. Thus, "(NoPure)Usual" is meaningless (as well as a non-intuitive reading of the word string), eliminating the ambiguity that hindered comprehension of the earlier naming strategy.
After eight months of participating in the Flipped Programming Language Design Process as I have experienced it in the development of Evolvix in our lab, I can now see the following strengths and weaknesses of the process.

Collaborators from other branches of the Academy can provide fresh eyes.
Although my training was in history and cultural anthropology (neither focusing explicitly on topics like writing mechanics), the humanities' strong emphasis on effective writing for a broad audience provides significant training in linguistic form and style. Scientists and programmers may also be good writers, but many of them do not consider the production of English prose narratives as the ultimate product of research. Moreover, humanities students are trained to explore and analyze diverse worldviews in their research work, making it easier to forecast the needs and expectations of varied audiences. If programming languages are to become more intuitive and accessible for a broad audience in addition to becoming more efficient, then expanding the types of people who help create such languages will be essential.

Programming concepts become more intuitive.
Many programming concepts require training in formal logic, semantics, syntax, and beyond. For those without such training in programming and computer sciences, the concepts can be difficult. Poor naming choices only add to the problem, leading to unnecessary complexities and lost time trying to map names to meanings; many will likely not bother to learn these programming concepts as a result, even if they could and might actually benefit from the use of a programming language in their research. I have found that many of the names developed in our lab have helped me understand difficult concepts better and more quickly, and several have been so useful that I have chosen to employ them as abbreviations and annotations in my day-to-day work.

It is easier for non-programmers to contribute than many would expect.
While I needed explanations of various programming concepts in order to work effectively, my lack of experience in programming, mathematics, and other clearly relevant disciplines has not yet prevented me from making valuable contributions to the Evolvix language design. Since I bring an important skillset that complements (instead of overlapping with) the skills of other lab members, I have consistently been able to offer unique contributions to the naming and structuring activities under consideration in our lab. Thus I believe that effective human-language communication skills are as important to programming language design as math and logic, if a language is to be widely used and understood.

A steep learning curve.
My lack of programming experience has helped me see things in ways a programmer would not, but it has also created a steep learning curve. When contributing to the design of a large, complex system, it is helpful to understand that system to the greatest extent possible. But I do not know how, for example, the algorithms at the core of Evolvix work, and I never will. The same could be said for many other parts of this language. As I have attempted to connect the dots during my work on various parts, I have consistently wanted to know how these parts fit into the whole; it is not always easy to "fly blind," and my lack of overview may have prevented me from suggesting even better solutions to various problems. This means that people with more complete knowledge must take time to explain to me what I am working on, a process that could also be seen as part of language review, as I keep asking "why" in places where computer scientists might not. In a nutshell, the result tends to be slower progress but better results.

No basis of comparison.
Because I have no idea what other programming languages look like, I have no concept of how our language compares to others -I am working in a vacuum. This can provide benefits (I don't internalize poor practices developed generations ago and adopted by virtually all programmers), but it also means that I have no benchmarks to measure progress against the strengths and weaknesses of other programming languages. It is therefore likely that some of my suggestions reinvent the wheel or are simply not practical, since they would conflict with too many important and well-chosen terms in computer science. Without programming knowledge, contributors wear blinders that may feel frustrating and inefficient, but are probably the inevitable consequence of bringing an outsider's perspective to anything.

A few contributors from the humanities do not make a general public.
Because one of the key strengths of the humanities is exploring varied peoples, cultures, world-views, etc., humanities-trained contributors may feel as if they are asked to speak for the world. In the initial stages of language development, such contributors can provide much needed skill in conveying complex concepts and terms more simply, and they can offer insight into what diverse audiences need or expect if they are to use the language. But these efforts must be complemented with the opinions of a much larger sampling of the population; people trained to study and understand other people still have misconceptions, biases, and other limitations. The Flipped Programming Language Design Process will certainly not resolve all ambiguities in how outsiders read the code of programming languages just by running semantics past a small number of unorthodox contributors. However, I have become convinced that regularly listening to the perspectives of some pioneering outsiders will go a long way towards simplifying programming languages and ultimately also towards expanding the pool of contributors.

Naming experiences with Evolvix Development Perspective of Editor 2
When I was asked to help with providing names for Evolvix, I was immediately struck by the prosaic nature of the task. Naming? That must be the most basic and elemental of tasks associated with computer programming! Why such a fuss over it? But it occurred to me that it is also one of the most important and fundamental steps for anything that is created in this wide world of ours. Baby-naming is sacred in many cultures; indeed, anyone who has children knows how fraught with emotional meaning (both good and bad) this ritual can be. Families have shunned new parents for not favoring certain relatives with a namesake, and have also warmly welcomed back prior outcasts just for christening their firstborn in honor of great-grandfather Arnold. All I can say is, poor child.
Which brings me back to the task of naming for Evolvix. The idea of a "poor child" whose moniker is such a clunker that he will never be accepted on the playground is apt in this regard as well. It quickly became apparent at our naming meetings that it was a priority to avoid inadvertently choosing a scary sounding or confusing name. It is as hard as it sounds. First, the impulse in naming, at least in my case, is to always try to consider the audience for which you are naming. This audience was daunting in many ways: it included computer programmers and engineers, biologists and other scientists, as well as high-school students -all with very diverse experiences and expertise.
What names might qualify as a common denominator for virtually anyone without seeming to be too coarse-grained or simple to the point of losing any significant meaning? Second-guessing, normally a frustrating diversion in most processes, turned out to be a remarkably successful tactic in the naming process. Does it sound overly complicated? Is it misleadingly simple? Can anyone read this the wrong way? After finding a name we deemed to be a good candidate, we tested it by comparing it against a series of criteria that we had established as important in the corresponding naming process so far. Detecting ambiguity and ensuring words were used correctly was a top priority. Hilarious misuse of a word? Laugh and come up with another name.
There are some high-level concepts and algorithms in computer programming that do amazing things. One might think these actions deserve apt names that convey what they do in a breathtakingly elegant way. However, even programmers have found it very difficult to find such names, so many whimsical names are given instead. With fits of rarefied humor woven throughout computational systems, the field almost begs you to join in on the fun. Why not let our imaginations rip?
As an example, watch a young child play a sophisticated computer game some time. You'll notice that she is certainly not aware of networks, arrays and code sequences, yet she is solving problems within the rules of the game. She is world-building. She creates life forms out of her imagination, depicting actions and features and giving them names.
Facing the complexities of rational naming, many programmers prefer to choose irrational, accidental or even randomized names to avoid the slowdown necessary for choosing good names. I can't help but wonder if this task could be made easier by combining the intuitive, natural instincts of a child with the structure and sophisticated analysis of an adult, a process that I believe Evolvix helps facilitate.

Mini survey on improving names
There are many different ways to approach naming, which is the common task of finding a label that refers to a specified content in a given context. A person's approach to naming is influenced by a wide variety of factors, including personality, discipline, experience, life history, and other traits. These differences in background often cause people to have different priorities and outlooks on naming.
To further explore the diversity of naming approaches with a view on how to improve naming as a process, we solicited informal feedback among ourselves and from colleagues. We received a total of 32 responses, reported below:
• 16 from persons who identify as "non-programmers" (10 "bio"; 6 "non-bio") and
• 16 from persons with substantial programming experience (6 "bio"; 10 "non-bio"),
where among programmers and non-programmers, some persons had a background in biology (total 16 "bio") and some did not (total 16 "non-bio"). Their overall feedback is informally summarized in a paragraph below each question, and their more detailed responses are presented in aggregated bullet points beneath the summaries (each person could provide as many naming priorities as desired).

Conclusion:
The variation in responses supports the notion that a system such as the BEST Names Dialects can facilitate naming by disentangling conflicting naming priorities, which surface when different users of a common language meet in the same namespace and thus need to accommodate a variety of naming priorities.

Questions: Bold text below marks questions or text in the survey sent by email.
Summaries: Non-bold, non-italic text below marks our overviews of responses received.

Responses: Italics below mark answers received and aggregated into similar groups.
In the questions below, the word "Aspect" was chosen to be a deliberately broadly defined property; interpret it any way you like, possibly including: precision, meaning, brevity, completeness, type-indication, classification, fun, formality, correctness, ease-of-use, memorability, speed of finding, speed of reading, speed of learning, adherence to a system, and/or many more ...

8.1.
When you give something (or someone) a name, what aspects of the name do you consider to be important?

… In a general context?
The most frequent aspect desired by respondents is that a chosen name is not overly complex, remains jargon-free, short, and easy to pronounce and spell (13 respondents). Many respondents also indicated they desire names which are descriptive, reflecting at least some meaning (12 respondents). Others supported similar aspects, such as precise, unambiguous names (10 respondents) and names which are easy to learn and remember (7 respondents). A smaller but significant number of respondents preferred name choices that are intuitive / rational (5 respondents) or unique or "catchy" (5 respondents). Others valued an acceptable "sound" (3 respondents) or its history (3 respondents).

… In the context of computer programming (variables, functions, etc.)
For important aspects of names in the context of programming, preferences leaned heavily in favor of descriptive, easy-to-infer meanings (20 respondents; see also approachability, usability, ease of typing and remembering) and unambiguous, unique names (9 respondents). Many preferred short names (9 respondents). To some it seemed important that preferred names were, respectively, internally consistent (following any established project guidelines, 5 respondents), clearly defined right away (4 respondents), precise (4 respondents), and consistent with conventions (4 respondents). Several respondents hinted at what might be described as a need for building names by combination when requesting names that can be customized in order to become capable of being used in multiple contexts for different-but-related purposes (3 respondents; see also 'able to use with modifiers', and 'creates patterns of names'). In addition to pointing to the general aspects above, these points were given:

8.2.
Let's assume you learn about a name that someone else has given to something (or someone) and you can get all the information you want about this. What would you generally want to know and are there aspects that make learning about this name easier or harder for you?

… In a general context?
There were fewer points of agreement on this question compared to previous ones, and 5 respondents found this question difficult to answer in a general context. Most wanted to know the basic meaning of the name or how it relates to its content (12 respondents). Fewer were interested in how the name linked to what they already knew (7 respondents), whether any synonyms and/or abbreviations existed (6 respondents), what the name's history, original context, and root words were (6 respondents), or why this name was chosen (6 respondents). A number of diverse concerns were brought up by one or two respondents, such as pronunciation (2+1), additional examples (2), connotations, and various aspects that could be used to assess the quality of the name.
• Want to know basic meaning, how name relates to object (12)
• Links to what is already known (7) (4)
• Hard to answer for general context (1)

…In the context of computer programming (variables, functions, etc)
The largest point of agreement (10 respondents) was that names should clearly imply or describe known meanings that are easy to find. Some desired an example of how the names are used in context (6 respondents). Fewer responses preferred knowledge of why a name was chosen and which alternatives were considered (4 respondents), the types and domains of use (3 respondents), any similar names with potential for confusion (3 respondents), as well as various other aspects. Several pointed to the previous question, indicating a substantial overlap of naming in a general and a computational context (4 respondents).
• Name clearly implies the described meaning; else desire to know that meaning, or wish to quickly and easily find the meaning (10)
• Examples of how used in context (6)
• Why this name was chosen, other alternatives considered (4)
• See point 2a above (4)
• Type and domain of use (3)
• Related variables, other names which could potentially be confused with this name (3)
• Specific, unique (2), harder with ambiguous or contradictory info on a name (1)
• Naming documentation (2)
• If fits with pattern, conventions (2)
• Links to what is already known (1), relevant previously published materials (1)

… In the context of computer programming (variables, functions, etc.)
The largest point of agreement was that consistently used contextual clues are key to understanding new words (11 respondents). Smaller numbers suggested descriptive words (4 respondents) and recognizable words associated with other known fields or recognizable languages, if the associations make sense (4 respondents).

8.4.
Are there situations, where you would prefer names, respectively, that are …

… longer?
A plurality of respondents favored longer names if the concept being named is very large, complex, or confusing (7 respondents), if longer names were required to remove ambiguity and be more descriptive (7 respondents), or if uncommon situations required more explanations (combined 7 respondents). Justifications for increasing the length of names included improved understanding, readability, and the need to remember less; or names that included additional information like metadata or modifiers, or that were generated from the combination of meaningful parts (combined 7 responses). While a few felt that esthetic considerations could justify increasing name length, at least one respondent felt that longer names were under no circumstances preferable.

… shorter?
Most respondents preferred shorter names, all else being equal, but named different conditions for when shorter names would be acceptable. Some required maintaining precision (6 respondents), others that short names be used only for something simple, basic, or small tasks (5 respondents), or that there be a particular need to read or write efficiently when used often, such as for variables in mathematical equations (combined 7 responses). Some thought that unique names did not need additional clarification and could be short (3 respondents), while others preferred shorter names in general (combined 4 respondents), or only locally (1 respondent). Some saw the potential for shorter names to simplify remembering, reading or communicating (combined 3 respondents), and parallels to question 1b were noted.

… other preferred properties?
Fewer respondents noted additional preferences. Some restated their preference for short names in general, and in particular for concepts related to math (3 respondents). Individual respondents emphasized the need for different names for different audiences or use cases; the need for additional explanations or for a dictionary that maps short to long names (which might even adjust the balance between length and descriptiveness); or the need for longer names when required to reduce ambiguity. The importance of a common naming scheme for related functions and variables was highlighted, as was paying attention to case to improve readability.

• Prefer shorter names (2), especially for math (1)
• Common name for similar functions and variables (1)
• Technical names when need to be very precise or show expertise (1)
• General names when trying to collaborate with people from other fields or with general audience (1)
• Consistent internally and with standard conventions (1)
• Case is important for readability (1)
• Would like dictionary to map short to long names (1)
• Explanations for names (1)
• Longer names when need to reduce ambiguity (1)
• Balance between length / descriptiveness (1)

Summary of additional comments from respondents
Further comments also indicated that the willingness of respondents to engage in discussions about naming varied greatly: from the recommendation to use ad hoc names or to efficiently delegate this tedious work in order to avoid wasting too much time on it (at one extreme), to the willingness to invest time according to its importance, to actually enjoying the process as one that facilitates a deeper understanding of the item being named, and hence possibly stimulates new discoveries (someone described naming as "illuminating and fun"). Some noted that even though naming can sometimes be tedious, sticking with the process until a really good name is found was important and worth their time. When exactly that endpoint is reached, however, was difficult to pinpoint according to some comments. It is also likely that collaborative naming can be substantially improved. For example, a larger group could be broken down into small independent groups that quickly generate and evaluate new names, thus efficiently producing naming proposals to be combined in a quest to find better names. Survey responses suggested that this and other systematic process improvements might substantially increase the speed at which different credible naming proposals can be made and evaluated.
Challenging aspects of naming that emerged included its general complexity, the lack of knowing when exactly a name was "good enough", and the need to revisit names, sometimes more than once, until a good one could be found. Naming discussions are frustrating for people who want to get on quickly with using the thing being named and then find themselves stuck in a discussion on names ("there are more important problems"). One person remarked on building a "tolerance" for naming discussions, while another thought that ad hoc names were usually good enough to convey the meaning necessary in a programming context. One person remarked on the ability of names to include or exclude, highlighting the great responsibility that comes with naming.
By and large, respondents with a math/computing background seemed mostly interested in limiting the time spent on naming and preferred shorter names. Not everybody responded to the mini survey, and a noticeable majority of non-respondents had computational backgrounds. Hence the observable bimodality in the following mini histogram should be interpreted with much caution and may at best be taken as an incidental observation in need of more systematic scrutiny.

If you could change the naming process anyway you wanted, what would you do?
(Please check with x any that apply or describe what you would suggest.) It was allowed to choose more than one:
0x__ 1. "Don't ask me to work with others on naming: naming my things is my business, naming your things is your business"
2x__ 2. "I see that naming is necessary, but I hate doing it, so please do it for me and just tell me the conclusions"

How to use these Naming Forms
1. The goal of these forms is quality improvement for Naming, which requires evaluating different Names for use by diverse audiences. Different equivalent (i.e. strictly synonymous) Names are defined as
• different ways of expressing ("Syntax")
• one precisely defined meaning ("Semantics").
These forms were made only to improve the quality of names and naming processes, not for investigating human capabilities. They do not record what is needed for useful conclusions on the latter. The forms aim to uncover hints of how a multitude in a larger population might misunderstand the names investigated to motivate searches for better alternatives; such goals are incompatible with the study of human capabilities.

Why you should stay anonymous if you are a reviewer of names
It is of utmost importance for the success of Naming that your evaluations of names are based on
• your actual understanding of these names without further help, as that is also how other people encounter such names in real life;
• not on idealized knowledge that you feel you "ought to know" or "could easily look up"; and
• not on someone else's likely knowledge, and certainly not that of someone who is "trained".
The last reason is most distorting, as humans can learn any name with enough effort; that does not make it a good name, which needs to be interpreted correctly by the many who have not seen that name before. Thus reviewing names requires an openness in communication style that can be very difficult for humans to achieve if any fear of judgment or other unintended consequences of speech leads to "self-censoring". Thus, please do stay anonymous and do speak your mind. If you say something unusual or highlight something you don't understand, you might speak for millions on a global scale; thus, please be precise.

Don't be intimidated by the many questions: Rather pick and choose as you see fit.
Skip what you can't speak to easily if you don't have time; even partial answers can be very useful.

4.
If you need more space on a page, use the space below, then attach pages; always annotate clearly: • Page number, then MeaningTag, YYYY_MM_DD Date , and ContributorTags.
• Question number on a page, to create a reference back for additional content on extra pages.

Please rate all traits of the given expressions on these scales
Check one of the vertical boxes at the extremes if the criterion is not applicable or you want to abstain from rating to strengthen the voice of other raters (by a "non-vote"; else choose some intermediate value):

Balance Scale
How well is the given trade-off managed?

Ideal Scale
How close does this get to being ideal?

Trouble Scale
How bad are the troubles caused by this?

[_] You would like to be acknowledged for particularly useful ideas using this acknowledgement info:

Overall Self-Assessment only for Naming efforts touched by this submission
With respect to anything (e.g. history, names, meanings, ways of expressing them, problems, concepts, research, etc.) relevant for particular Naming efforts advanced by this submission: do you think you are
[_] total outsider, [_] beginner, [_] rare drop-in, [_] trainee, [_] core contributor, or [_] expert in this Specific Naming Effort
[_] Are you a leading research expert on some of the issues raised here? (stay anonymous or tell us how we can ask questions)
It can help if you answer the following, but please don't answer if you prefer more anonymity: [_] as PostGrad, [_] as Postdoc, [_] in Industry, [_] at Uni, [_] at Home, [_] as Prof, [_] ___
12. Do you see yourself as literate in these areas?

Detailed Self-Assessment on Naming Related Skills
Total Outsiders and Beginners are often excellent reviewers of Usability for the General Public, hence they can contribute pivotally important insights, even if they may not feel it at the time.
Some problems are best solved by experts, and some best by newcomers. Indicating where you see yourself in this particular naming discussion will help maximize the impact of your contributions, especially any comments where you simply indicate "confusion".

Abstract
Organizing is the art of avoiding inessential complexity. Organizing becomes more difficult with project size, as increasing numbers of moving parts tend to throw everything into disarray more quickly. Good abstractions can enable amazing efficiencies by balancing the needs of standardizing and customizing, but they can be (too) costly to find. Much of this cost is related to naming, which in turn is closely related to organizing. Names are made to cut search times by labeling boxes of organized content. Poor naming strategies result in poor organization and increased search times. The quality of a naming/organizing strategy shows as numbers of items increase. Almost no strategy is needed for a few items (tempting many to regard naming as trivial), but finding appropriate strategies for millions of items or more is usually difficult (some say impossible). The lack of sensible strategies can frustrate much further research, a research potential that can be unlocked by the proposal of a workable nomenclature (i.e. naming strategy; see, for example, how modern biology was impacted by the organization and names proposed by Linnaeus' taxonomy).
Cancer cell biology, evolutionary systems biology, and many other areas of current biology are in great need of naming/organizing the millions of parts required for mechanistic computer simulations of relevant biological systems that incorporate all current knowledge. Many diverse ontologies, taxonomies, databases, genome-browsers, models, tools, simulations and other projects have made great progress in consolidating the dizzying jungle of synonyms in biology. However, their independent origins have also generated a new jungle of (near-)synonyms, as each project tends to use idiosyncratic ways of handling equivalent tasks, such as tracking bugs, tasks, reliability, versions, recurring types of modifications and more. These and many more inessential differences make it near impossible to develop programs that treat the scattered big data of biology in a uniform way. Most importantly, there is no universally agreed stability scale that enables researchers from distant fields to easily track the reproducibility, maturity, and reliability of a given result as assessed by relevant experts. A similar stability scale is also essential for developing programming languages aiming to respect the time of their users by not breaking the code of programs when releasing the next version (i.e. by providing long-term backwards compatibility).
The need for such a high-quality stability scale in the modeling language Evolvix inspired the development of the POST system presented here. It uses the BEST Names concept that disentangles conflicting naming priorities by distinguishing Brief, Explicit, Summarizing, and Technical (BEST) Name Dialects. Most Brief Names in POST are double-capital letters such as RR, SS, and TT that mirror Explicit Names like ReviewedRelease, StableSource, and TrustedTested respectively, where TT marks the difficult-to-achieve long-term backwards-compatible end of the POST sequence of stability levels that starts with MM for MockupModel. The rest of POST evolved around this StabilizingZone in order to provide the project organization necessary for the development of the Evolvix programming language and to support the simulation system it implements.

"… from so simple a beginning endless forms most beautiful … are being evolved." 1

Preface 1
It is a truth universally acknowledged, rarely pure and never simple: managing complex projects is … complex. Surprisingly, none of them start that way. Even the most complex projects are born from simple, elegant ideas in the mind(s) of their initiators before they begin to grow and evolve.
As humans, we generally abhor (ugly) complexity and admire (elegant) simplicity. Thus we will continue to start new projects as we try to 'stand out' while 'blending in', and 'be extraordinary' while 'remaining normal' (defined in as many ways as people exist on the planet).
These tensions are not new, and those before us have pioneered two response strategies: standardize and customize. Standardizing helps with simplicity, blending in, and compatibility to others, while customizing enables extraordinary feats of outstanding innovation, albeit at the cost of having to tolerate some inessential and annoying complexity.
Take cars, for example. Most people want a car (standardize), usually one that is different from others (customize). The pattern continues for driving behavior: roads require driving on the right side (whichever that may be; standardize), while most people are free to choose which road they take (customize their journey). Our social contracts to pre-decide which side of the road to drive on are not imposed on us by the physics of cars; yet they have huge practical and technical consequences that simplify and speed up many decisions, save countless lives, and even affect cars physically.
We suggest that parallels exist to navigating multiple complex information-based projects efficiently. When working with multiple projects, it cannot be efficient when each project stores its tasks in a different location or has a different system for indicating stability. The POST system distills the best ideas for standardizing the organization of complex information processing projects of any kind.
1 Who may want to consider using POST? We know from experience that most researchers, developers, and organizers are right in not looking for systems that can manage multiple complex projects; mostly, because what they already use works well for them and searching for better systems would waste much time (see below for reasons). The POST system was developed because of very real needs encountered in developing Evolvix; we realized only later that POST is much more widely applicable, more flexible, and easier to use than we had thought (and most people suspect). To highlight this generality, we added this broad introduction that explains how POST could be used efficiently for very many information based projects (see text up to example in Figure P1). However, we do not wish to imply that all such projects would benefit from switching (potentially at great cost). POST was designed for complex networks of interacting projects that have much to win from standardizing (see the computational biology use-cases described), and it may also get used by those who like the POSTcode names we found; but if there is no need for improving coordination, there is little need for considering the POST system presented here. Why would most people be wasting time when evaluating systems for managing multiple complex projects? Such searches tend to be too tedious for anybody who wants to get started with actual work. So much so, that ad hoc strategies 'made-up along the way' quickly become irresistible; there is rarely a point in paying the large up-front costs of evaluating and introducing a complex system while a project is small and uses only a fraction of the capabilities. In comparison, it is much more efficient to use most capabilities of a system that has no up-front costs (such as naming a few folders and keeping a few tasks on a list). 
The adhocery of the latter does not matter if the project remains small or both grow together; transitioning later to an appropriately capable management system is usually not prohibitive for one such project. Attempting to avoid that cost is tricky, as there is no guarantee that the choice of a given large complex management system will pay off: it may turn out to lack one critical feature that forces complete re-organization or limits further growth. Given how unpredictable such critical features are in real life, most small projects do best to pick any system available at (next to) no startup cost, lest the project itself never gets started. Exceptions to these rules of thumb are projects aiming to set up structures that are intended to become long-term backwards compatible and/or projects that need to coordinate across many potentially independent subprojects, and thus require replicated and dependable structures (such as Evolvix).

Thus, a firm requirement for developing this POST system has been to welcome new users by keeping initial startup simple, albeit without killing projects indirectly later by no longer being able to support their growth.
POST aims to standardize aspects of organizing that do not really change between projects, and to 'pick one side of the road' by choosing standardized names for frequently recurring needs to store certain types of information. It then lets projects customize any given POST Home Folder for their specific purpose by allowing certain types of folders to be present or not. Using POST is no guarantee for the success of a project, just as driving on the right side of the road will not guarantee that a journey serves its purpose. However, those who give the POST system a try may find some of its catchy Brief and Explicit Names to be memorable, and the organizational support offered to be surprisingly useful. Progress in projects becomes much simpler if there is no need to arrive at customized decisions about which side of the road to pick at each turn of each project's journey. This more general perspective may help to set the frame for the more specific introduction of POST that follows next.
The Project Organization Stabilizing Tool (POST) system has been designed for any
• Project or complex set of tasks that depend on information processing in need of good
• Organization for improving work efficiency and reducing inessential complexity by
• Stabilizing flows of information using well-defined containers, building an automatable
• Tool that is easily customized and can help to separate the wheat from the chaff in messy project data while simultaneously encouraging creativity, rapid innovation, testing, and the slow emergence of reliable standards with long-term stability.
First we will introduce the POST system with a brief overview aimed at a more general audience; then we will highlight the original use-case it was developed for in computational biology, before we will switch gears and summarize important features of its design. We will also address the question of how to define progress towards long-term backwards compatibility, including an assessment of its current status, and areas of further development, all of which will be important as the POST system evolves towards increasing stability. An overview of the next sections follows:

3.1. Motivating problem: the need to organize
3.2. Solution: organize project content using BEST Names (Table P1, Table P2)
3.3. 'Hello world' example: Using a POST Home Folder to write a paper together (Figure P1)
3.4. High-level structure of the POST system
3.5. Overviews of POST system definitions using BEST Names (pointing to Figures P2, P3)
4. Why TrustedTested (TT) Stability could be pivotal for personalized medicine and evolutionary systems biology
5. Translation in POST folders
6. POST specification: current status, stability and future work
7. References and POST overview figures: (i) InfoFlow in POST, (ii) Brief Dictionary of POST BEST Names.
This overview and the last two highly condensed pages provide perspectives on the POST system, which is defined by its BEST Names that specify a nucleus for developing POST standard semantics.

Introduction
This part is aimed at a more general audience of potential POST users.

Motivating problem: the need to organize
Ever wondered how to stay on top of a dynamic project that requires collecting and processing a lot of information? How to track the many improvements, small and big, that make a polished final version? Where to keep to-do lists, lessons learned, current problems? How can the final product be separated from all the information needed to produce it without making further improvements cumbersome? These challenges compound as projects become more complex, more contributors join, and the quest for quality intensifies in order to develop an outstanding product meant to last for a long time. The reason is: Every contributor needs the right information at hand to make a difference and be efficient.
It is natural for many people to sort information into hierarchical ad hoc file trees and other organizational structures created on the fly. However, such improvisation can make it very difficult to work together efficiently, especially if organizational problems are solved in very different ways by the people involved. For example, different contributors might regularly prepare files with tasks to be delegated to other contributors, but everybody stores them in a folder with a different name, such as 'Delegated', 'Tasks', 'Todo', 'Work', 'Actionable', and so on. As naming preferences multiply, collaborators will spend an increasing amount of time communicating where files are (or risk that tasks remain undone and work is misplaced). If this confusion is to be avoided, a common project management strategy is required, but there is rarely enough time to develop one when the project "just needs to get done". Adopting an existing strategy may trigger a possibly prohibitively complex evaluation process to ensure that the system of choice meets all needs without imposing undue costs from inessential complexity. Our efforts to support programming language developers and users have revealed the need for a simple-yet-scalable organizational scheme that makes it quick to start small projects in a way that does not prohibit them from gradually growing until they become complex long-term projects. Ideally, this scheme also quantifies the stability of different parts of large, complex projects and provides stable points of reference that can be used to measure progress towards aims of long-term stability (and backwards compatibility) without stifling the innovative processes that put a project on the map in the first place (the "StabilizingZone" introduced below aims to accomplish this).

Solution: organize project content using BEST Names
We developed the POST system described below to organize the development of the Evolvix programming language and its many nested smaller projects in a stable network of well-defined containers. We aimed to avoid recurrent, costly episodes of complete re-organization (such as the unavoidable one that triggered our work on POST). Due to the broad scope of Evolvix, we designed POST to offer a flexible infrastructure for supporting the development of very diverse project types. At its core is a set of carefully chosen standard "StabilityCodes", or "POSTcodes", aimed at separating highly refined information that moves very slowly from more transitory to more stable states as it is reviewed with increasing rigor. These StabilityCodes form a sequence that follows the alphabet and defines "DoubleCaps" as keywords, starting with MM, NN, OO, … and ending with … RR, SS, TT. Other POSTcodes denote other aspects of project management that are independent of these StabilityCodes, but also reflect in some ways on the stability of the information they annotate (such as AnyArrival for arbitrary project-relevant material that has just arrived, or VersionVariant for historic releases of variants). Drawing on the alphabet's familiar progression, the POSTcodes and StabilityCodes roughly organize the initial project phases with earlier letters and the increasingly mature phases with later letters.
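Because the StabilityCodes follow the alphabet, their relative maturity can be compared by plain string ordering. A minimal sketch in Python may illustrate this (purely illustrative; only the DoubleCaps named in the text are used, and the Explicit Names for MM, RR, SS, and TT come from the abstract; the function name is our own):

```python
# StabilityCodes are DoubleCaps keywords whose alphabetical order mirrors
# increasing stability: MM < NN < OO < ... < RR < SS < TT.
STABILITY_SEQUENCE = ["MM", "NN", "OO", "RR", "SS", "TT"]  # subset named in the text

# Explicit Names mirrored by the Brief double-capital letters (from the abstract):
EXPLICIT = {
    "MM": "MockupModel",
    "RR": "ReviewedRelease",
    "SS": "StableSource",
    "TT": "TrustedTested",
}

def is_more_stable(code_a: str, code_b: str) -> bool:
    """True if code_a denotes a later (more stable) level than code_b.

    Plain string comparison suffices because the sequence follows the alphabet.
    """
    return code_a > code_b
```

For example, `is_more_stable("TT", "MM")` is true, reflecting that TrustedTested marks the most stable end of the sequence that starts with MockupModel; the alphabetical convention means no separate rank table needs to be maintained.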
These POSTcodes annotate containers, such as folders, with a short description of their intended purposes, and they exist as the first three standardized synonymous names defined by the BEST Names concept:
• Brief Names: here mostly double capital letters for speed, brevity, and memorability;
• Explicit Names: here mostly memorable keywords reflecting the capital letters; and
• Summarizing Names: cheat-sheet-like summary sentences that act like mini-manuals.
• Technical Names: will be defined later to serve as the core for corresponding StableNames.
Our use of BEST Names aims to enhance the intuitiveness and memorability of POSTcodes; indeed, without BEST Names it is unlikely that POST would have been developed. Many of its use cases require very frequent use of POSTcodes at very short notice, leaving little room for typing and little time for remembering cryptic names, as seen in the example given next.
Table P2: Brief, Explicit, and Summarizing Names of basic POSTcodes used for choosing in Fig P1.
For example, a research paper in the folder MyResearch might select from this list the folders shown in Figure P1 below. Usually MyResearch starts as a very small folder, only storing arriving ideas and information in unstructured form in the AnyArrival (AA) drop box. As the collection of content continues, arriving files are sorted into CollectedContent. Planning prioritizes aims in AnyAimsAdmin (AAA) structures. The latest and greatest version of the final product grows and resides in the GrandGallery (GG), and a LabLog (LL) may help record a chronology of the work done in preparation for the big finale. To declutter, old variants or unused files are moved from GG to the HistoryHeap (HH), and good ideas for later from AAA to the BackBurner (BB). Writing and working often produce JammedJobs (JJ) that need solutions, and increasing project complexity might lead to a reorganization of content that results in new GG folders for different workspaces. Shared manuscript variants are stored as UploadsUsed (UU), and incoming comments are collected in FeedbackFlow (FF). See Figures P2 and P3 for more potential workflows. The complete POST system includes many more features than are needed in most use cases and is being designed to provide developers with powerful abstractions for facilitating the automation of data flows.
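The pairing of Brief and Explicit Names lends itself to a simple lookup table. The following sketch collects only the pairs mentioned in this text; it is an illustrative subset, not the complete or authoritative POST dictionary (see Figure P3 for that):

```python
# Illustrative subset of Brief -> Explicit Name pairs for POSTcodes
# mentioned in this text (not the complete POST dictionary).
POSTCODES = {
    "AA":  "AnyArrival",      # unstructured drop box for arriving material
    "AAA": "AnyAimsAdmin",    # planning and prioritizing aims
    "BB":  "BackBurner",      # good ideas postponed for later
    "FF":  "FeedbackFlow",    # incoming comments and reviews
    "GG":  "GrandGallery",    # latest and greatest version of the product
    "HH":  "HistoryHeap",     # old variants and unused files
    "JJ":  "JammedJobs",      # stuck tasks that need solutions
    "LL":  "LabLog",          # chronology of the work done
    "UU":  "UploadsUsed",     # shared manuscript variants
    "MM":  "MockupModel",     # fast 'agile' prototypes (StabilizingZone entry)
    "RR":  "ReviewedRelease", # organized public releases
    "SS":  "StableSource",    # standards-grade stability
    "TT":  "TrustedTested",   # long-term backwards compatibility
    "VV":  "VersionVariant",  # archive of historic releases
    "XX":  "XenoXero",        # temporary build space
}

def explicit_name(brief: str) -> str:
    """Expand a Brief Name to its Explicit Name."""
    return POSTCODES[brief]

print(explicit_name("GG"))  # GrandGallery
```

In practice the Brief Name alone suffices for labeling folders; the Explicit Name acts as its memorable expansion.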
However, none of this needs to infringe on personal organizing preferences for users who do not need automation. POST's beauty is in its flexibility: it encourages using StabilityCodes as needed and ignoring all others; they will wait until content arrives that they can organize.
Figure P1: Using a POST Home Folder for organizing a mini-project such as writing a paper. Exemplary activities are shown in columns and the folders they generate in rows as they persist.
The names presented here were found to be easy to understand with minimal explanation in our internal tests. More complete definitions using Brief Names (e.g. "AA"), Explicit Names (e.g. "AnyArrival"), and Summarizing Names (e.g. a short description of how to use "AA") are given in Table P2 and in the overviews of the POST system in Figures P2 and P3 below. Brief and Explicit Names are given here together for clarity, but this would not usually be done in practice (Brief is enough). The POST system can tolerate many variations when used manually, but groups of contributors usually benefit from following the naming rules more strictly to avoid confusion, while POST automation requires following all formal rules. POST can help organize data in diverse storage media, including manual use on paper.

High-level structure of the POST system
Generality, flexibility and origin. The POST system is meant to be as general as reasonably possible so that anyone can use it for efficiently managing projects, small or large. This is the case even though (or rather because) it was primarily developed for supporting the implementation infrastructure of the general-purpose programming features of the Evolvix modeling language. This language aims to support a very broad set of use cases to enable efficient work in computational biology (hence the generality, which is needed to accommodate the broad diversity of biological use cases). The POST system facilitates standardizing and customizing development work across complex and independent code, text, and project management structures. Users can drop features they do not need initially, only to introduce them as necessary when a project keeps growing in complexity.
The POST system is designed to encourage automated file management while also providing an effective structure for manual organization that makes it easy to set up and manually operate a single POSTbox consisting of a single POST Home Folder. Simultaneously, it aims to support the growth of massively distributed collaborative projects that eventually combine many automatically managed PHFs into larger POSTnets (e.g. to meet citizen science project requirements described elsewhere 2). Such networks facilitate globally distributed collaboration by automating well-defined information flows between individual PHFs, while clearly communicating the stability of the code involved (albeit without burdening users who do not need this for their POSTboxes). We define:
• a POST Home Folder (PHF) as the one top-level folder of a given POSTbox, containing its own independent active area, ZZ* folders, and POST_Cabinet folder (which, in the case of automation, must be stored under the name "_POST", or "a_POST" if leading underscores are forbidden; manual use is more flexible); all folders and data in a POSTbox are stored relative to the PHF, which serves as a local anchor in its broader context;
• a POSTbox as the implementation of a single PHF that allows for manual management, just like the old-fashioned mail service that requires hands-on file management and is simple to use; a POSTbox may have links to other POSTboxes (see ZZ, ZZA, ZZ* folders);
• a POSTnet as a combination of many PHFs that can easily become very intricate and difficult to keep in a consistent state manually; to maintain consistency, POSTnets are usually best managed automatically, especially when multiple PHFs are nested, interact in complicated ways, and/or contain data that is costly to re-synchronize after incomplete manual operations;
• the POST system (or simply 'POST') as the abstract type specification defined in this document or a future POST standard; it shows how to implement a POSTbox or POSTnet.
Nesting and automation. PHFs can be nested inside and linked to other PHFs, thus facilitating great flexibility as they can replicate and network at all levels of a project hierarchy. While such networks can be managed manually in principle, it is more likely that they require automated POSTnets to reduce operational errors.
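The POST_Cabinet naming convention suggests a simple mechanical way to discover nested PHFs. The following is a minimal sketch, assuming a file-system implementation where automated PHFs store their cabinet as "_POST" (or "a_POST" where leading underscores are forbidden); the directory layout in the usage example is hypothetical:

```python
# Sketch: locate all POST Home Folders (PHFs) nested under a root by
# finding their POST_Cabinet folders, stored as "_POST" or "a_POST"
# in automated PHFs. Assumes a file-system implementation of POST.
from pathlib import Path

def find_phfs(root: str) -> list[Path]:
    """Return every folder that contains a POST_Cabinet, i.e. every PHF."""
    phfs = set()
    for cabinet in ("_POST", "a_POST"):
        for hit in Path(root).rglob(cabinet):
            if hit.is_dir():
                phfs.add(hit.parent)  # the PHF is the cabinet's parent
    return sorted(phfs)
```

Such a scanner could serve as the entry point for POSTnet automation, since each discovered PHF is a self-contained anchor for the data stored relative to it.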
Simple decisions before starting a POST Home Folder. A project using POST should consider:
• Will it be operated manually, like most projects, and thus use the GrandGallery as the place for developing and updating the latest content, adding other folders only as necessary? For a simple example, see Figure P1 above.
• Will it aim for some form of stability or backwards compatibility to help guard the investments of outside users who rely on the stability of the system developed by this POST? If so, using the StabilizingZone is recommended to encourage stability, even if TT may not be the goal. New content is then developed in MM or NN and moved up the rungs of the StabilizingZone as it matures, up to RR or SS, possibly with automated support.
• Will the PHF eventually be automated, or managed manually throughout its lifetime?
Automated PHFs follow stricter rules that are easier to enforce from the start, even if the PHF is initially managed manually. Costly re-organizations may be avoided by following these rules right away.
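The "start small, grow as needed" decision above can be sketched as a tiny helper that creates a minimal PHF. This is a hypothetical illustration, not part of the POST specification; it assumes folders are named by their Brief Names alone:

```python
# Hypothetical helper: start a minimal POST Home Folder with only the
# folders a simple manual project needs, adding others as content
# arrives. Folder naming by Brief Names alone is an assumption here.
from pathlib import Path

def start_phf(root: str, extra=()) -> Path:
    """Create a PHF with a GrandGallery (GG) plus optional extra folders."""
    phf = Path(root)
    for name in ("GG", *extra):
        (phf / name).mkdir(parents=True, exist_ok=True)
    return phf

# Start small; add AA and LL only once they are actually needed.
phf = start_phf("MyResearch", extra=("AA", "LL"))
print(sorted(p.name for p in phf.iterdir()))  # ['AA', 'GG', 'LL']
```

Because `mkdir(..., exist_ok=True)` is idempotent, the same call can be repeated later with more folders without disturbing existing content.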

Rigidity of structure.
If automation is not needed, then users of a PHF may interpret POST-related structures more flexibly. Humans are much better at navigating ambiguity than computers, although consistent use has substantial advantages for humans as well, especially when many contributors are involved. For example, it is easy to append tags that serve as WorkspaceIDs to further break down navigation complexity, even if not yet supported by automation (e.g. see "GG-Figures" in Figure P1); it may not matter to humans whether peer reviews are stored as UploadsUsed (by reviewers) or next to potentially many other files inside FeedbackFlow from the web, but automated PHFs need a standard answer that allows them to expect defined content in a fixed place, thereby reducing inessential code complexity. Most areas where the POST system requires additional work involve such decisions about where to locate content within the folders already defined in Figures P2 and P3. Such choices do not alter the core meaning of POST StabilityCodes and hence rarely matter in a manual POSTbox.

Overviews of POST system definitions using BEST Names
Nucleus for a POST system standard. While Figure P1 above provides an incomplete introductory overview of POST parts frequently used in simple projects, the Figures at the very end of this text provide complete overview perspectives of the core of all top-level POSTcodes, including the StabilityCodes that facilitate the project organizing and stabilizing aspects of POST:
• Figure P2: Information flow overview illustrating some inner connections between StabilityCodes and which folder is located where (if a POST is implemented as folders in a file system, which is not the only possible option).
• Figure P3: Brief Dictionary of all core POSTcodes giving interchangeable BEST Names.
The core POSTcodes presented in these figures can be complemented by more specific POSTcodes that belong to reserved and controlled vocabulary lists of keywords and that streamline the naming of more specific areas, such as organizing observed data, estimates of parameters, and simulation results. Discipline-, research-area-, or industry-specific keyword lists are conceivable, whose applicability to a given POST Home Folder is defined by the type assigned to the PHF. Overall, the POST system design aims to minimize the number of these keywords; where possible, POST development will aim to keep these sets of words non-redundant and free of "near-synonyms" to ensure overall simplicity of use by maximizing the re-use of core names.
The core names defined in these lists also form the core material for generating Technical Names that become part of the StableName and StableMeaning as defined by the BEST Names concept to ensure unambiguity in the presence of potentially many translations and (perfect) synonyms.

Computational biology use-cases
Next we will present details of the use cases that motivated the development of POST.

Why TrustedTested (TT) Stability could be pivotal for personalized medicine and evolutionary systems biology.
Not all software systems, let alone non-IT systems, require TrustedTested or TT stability. However, the equivalent of TT stability is critical for systems that depend on the results of very large numbers of contributors who need to build upon each other's work over very long periods of time to achieve overall success. This requires using a common language to improve their efficiency of communication; if the number of required contributors is so large that newcomers need to be recruited, then the chances of success will increase dramatically if learning this shared language is easy.
Modeling biological systems has shown that computational techniques can in principle be applied to vastly more biological systems than are currently modeled. While this discrepancy may have many reasons, one important reason is certainly the lack of a TT stable modeling language that efficiently helps biologists to formally describe their mechanistic understanding of molecular and other systems they investigate (including experimental results) to allow for corresponding computational systems biology analyses. The emergence of some standardization (e.g. see SBML.org 3) is encouraging, but there is a long way to go until the computational tools for biology can integrate current biological understanding well enough to
• enable personalized medicine analyses that efficiently build on all known evidence (and not only the datasets that happen to be available), 4 or
• enable evolutionary systems biology analyses predicting phenotypes and fitness from genotypes and environments, in order to simulate how populations are likely to evolve, 5 or
• enable similarly bold and challenging visions in predictive cancer research, conservation biology, or the development of policies to slow antibiotic resistance evolution in bacteria.
All these areas will eventually need to combine biological observations, models, and results from many decades in the past and in the future. Only a reliable computational integration can enable future biologists to efficiently build on the foundation of past results without losing them to semantic rot or paying the steep cost of continually re-implementing past modeling code. Experiences with the first simulation project that used Evolvix 6 highlighted the potential for automating such an integration.
To efficiently enable personalized medicine or evolutionary systems biology it is essential to find a way to keep active the models that are currently being buried in immutable online publications. Activation could be greatly simplified by implementing models in code that is easy to read and has the TTv1 stability allowing future biology students to easily extend old model code using new TTv1 downloads.
This vision drives Evolvix, and inspired the quest to define a StabilizingZone leading to TrustedTested and the POST system. Evolvix adopts POST to increase code stability. There is no reason why other projects cannot use the StabilizingZone and POST to improve their stability as well.

Area-specific controlled vocabulary lists in POST
To avoid a confusing chaos of folder names, POST defines some keywords for the purpose of streamlining the locations that programs would have to know in order to find corresponding files. For example, in POST, simulation results should always be in a folder that has "Sim" in its path, even if many other aspects remain flexible, while "Obs" always denotes some observation of a system that is not a simulation. The following list of Brief and Explicit Names is given to reduce confusion. In addition, we offer a short explanation that will be used to create a Summarizing Name at a later point.
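The "Sim"/"Obs" convention can be checked mechanically. A minimal sketch, assuming paths are plain strings and the keywords appear somewhere in the path; the folder layouts in the examples are hypothetical:

```python
# Minimal sketch of the "Sim"/"Obs" path convention: simulation results
# carry "Sim" somewhere in their path, observations carry "Obs".
# Example paths are hypothetical; only the keyword rule comes from POST.
def classify(path: str) -> str:
    """Classify a path as simulation, observation, or unknown."""
    if "Sim" in path:
        return "simulation"
    if "Obs" in path:
        return "observation"
    return "unknown"

print(classify("GG/Results/SimBatch7/out.csv"))  # simulation
print(classify("AA/ObsFieldTrip/counts.tsv"))    # observation
```

A tool like this lets automated PHFs route files by keyword alone, which is exactly why POST restricts these fragments to a controlled vocabulary.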
This list is by no means complete. As has long been realized in biomedical research, there are numerous benefits to using a system that can combine terms and fragments to derive new composed terms denoting new meanings (often based on Latin or Greek roots 7). POST will not need all conceivable terms (which would squander the advantage of simplicity); indeed, a case can be made for introducing discipline-, application-, or industry-specific vocabularies, which should be defined in combination with the specification of a controlled type for the PHF. Ultimately, any such efforts require the construction of an ontology 8,9 or type system to keep POST well defined.

Top Level Key words in Brief Dictionary
Other Keywords with reserved top folders and independent subfolder namespaces: • QAS is to be expanded substantially to facilitate a wide range of human annotations using controlled vocabulary for quickly annotating repeating quality assessment scenarios, including diverse fuzzy situations, where it is difficult to assess quality because of many unknowns.

Flexible Modifiers of other keywords in the Brief Dictionary
These modifiers cannot stand alone, but otherwise work very well in combination with some of the keyword fragments defined above:

Defining TrustedTested in POST
The unusual nature and long-term importance of TrustedTested as a StabilityLevel merit a separate list of requirements and comments to explain how TT works.
Projects aiming to reach TT are encouraged to ask questions about how to enable long-term stability as early as possible and are cautioned against elevating code too quickly to SS without evidence that its design has what it takes to go all the way to TT; thus SS can be used as a testing ground that buffers TT from integrating solutions that have not received sufficient review. There is no need to risk elevating new features too quickly to TT. Many successful software products have demonstrated that remaining at the equivalent of RR does not hamper success, and many high-quality IT standards that provide stability at SS do so without the need for TT promises. TT stability is very difficult but not impossible to achieve, as IT systems change continually and TT systems need to anticipate and abstract these changes well enough to isolate them from affecting any code written for such TT systems. Thus, in the absence of evidence for extraordinary stability or if in any doubt, software projects providing a single high-quality implementation of their design are recommended to stay at RR, and official standards without guarantees for long-term backwards compatibility at SS. However, a precise definition of the POST requirements for transitioning from QQ to RR to SS and to TT is beyond the scope of this study and remains to be reported elsewhere.
a. Simplify use for outsiders. Any downloadable system developed using POST and marked "TTv1" indicates that it belongs to the version variant family "TrustedTested version 1" = "TTv1": expect long-term stability that follows a well-defined set of requirements providing the following capabilities (where some details may need to be defined by the corresponding POST project):
b. Work with all older code written for stability. The latest downloadable release or patch of a TT version variant family can correctly interpret any code produced for any previous release or patch of this TT version variant family (e.g. TT version 1 can interpret any code for any release or patch back to v1r0p0, the original version 1, release 0, patch 0).
c. Use Stabilizing Versioning. It is beyond the scope of this study to describe the stabilizing version variant naming system implied here; once fully defined, it is to become part of POST.
e. Clearly mark code that is not (yet) stable, either as TTv0, by omitting TT, or by adding other StabilityLevels:
"TTv0" do not yet expect long-term stability
"SSv1" do not expect long-term stability
"RRv1" do not expect medium-term stability
"QQv1" do not expect short-term stability
… expect less and less short-term stability
"MMv1" expect least short-term stability
Given the many strict requirements and prolonged review processes required for developing code at the TT StabilityLevel, projects aiming for TT first need to denote many version variants that do not yet meet TT criteria. Code that does not contain any TT level code is to be annotated at the appropriate level (MM…SS). Code that modifies any pure TTv1 variant with less reliable changes is to be denoted by adding the Brief Names of its lower StabilityLevels to the version variant label, thereby indicating the loss of TT stability.
For example:
"TTv1r2" has only features of TT version 1 release 2;
"TTv1r2_OOv3r4" is a TT core (v1r2) with OO extensions v3r4;
"TTv1r2_OOv3r4_MMv0r0p3" is further extended by a hack at MM level.
The overall stability is specified by the lowest StabilityLevel present, even if parts of the system are more stable because they were not modified. However, the presence of any modifications makes it difficult to exclude interactions that destabilize the whole system. Hence all modifications of any code above TTv1r0p0 require full review in the context of how they affect the whole system they have been added to.
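The lowest-StabilityLevel rule above can be expressed directly in code. A sketch, assuming the alphabetical MM…TT ordering described earlier (later letters are more stable) and the underscore-separated label format shown in the example:

```python
# Sketch: compute the overall StabilityLevel of a POST version variant
# label such as "TTv1r2_OOv3r4_MMv0r0p3". Assumes the alphabetical
# StabilityCode ordering MM..TT (later letters = more stable) and the
# underscore-separated label format from the text.
import re

# StabilityCodes from least to most stable (alphabetical order).
STABILITY_ORDER = ["MM", "NN", "OO", "PP", "QQ", "RR", "SS", "TT"]

def overall_stability(label: str) -> str:
    """Return the lowest StabilityLevel present in a version variant label."""
    codes = []
    for part in label.split("_"):
        m = re.match(r"([A-Z]{2})v\d+", part)
        if m and m.group(1) in STABILITY_ORDER:
            codes.append(m.group(1))
    if not codes:
        raise ValueError(f"no StabilityCode found in {label!r}")
    # The lowest level present determines the stability of the whole system.
    return min(codes, key=STABILITY_ORDER.index)

print(overall_stability("TTv1r2"))                  # TT
print(overall_stability("TTv1r2_OOv3r4"))           # OO
print(overall_stability("TTv1r2_OOv3r4_MMv0r0p3"))  # MM
```

Taking the minimum over the ordering mirrors the rule that a single MM-level hack demotes the stability promise of the whole assembled system.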
f. Minimize contradictions. All output from TT systems is required to be accurate to the best of current knowledge, which may advance as research progresses. This implies the emergence of new algorithms and/or fixes for bugs in known algorithms. Bug-fixing TT patches can in principle change output if the older system produced output that is demonstrably wrong. However, appropriate review at other stability levels is expected to catch such bugs before an algorithm is advanced to TT. In areas that are known to be difficult to standardize, such as numbers and arithmetic systems, alternative systems can be distinguished by corresponding differences in names, thus enabling the addition of new interpretation systems without breaking the compatibility of code that relies on previously implemented systems. Similarly, the discovery of improved algorithms does not require abandoning the possibility of using previously used algorithms. However, new algorithms shall eventually result in TT releases that automate running new algorithms in addition to old ones, facilitating comparisons of results. Design and algorithm choices that are TrustedTested are expected to have survived multiple rounds of rigorous conceptual and usability review, repeated design simplifications, prolonged use in professional production environments, and many automated test cases. A full list of criteria for justifying the progression of code to TT is beyond the scope of this study; suffice it to say that progression should not happen too fast, in order to minimize contradictions and to simplify design (see next).
g. Use rigorous review for reducing clutter in namespaces. Careless design decisions can quickly and irrevocably clutter the TTv1 namespace of a given project and thereby increase its inessential complexity 10. This can quickly degrade the prospects of long-term survival for the project. The cause is the special nature of TTv1 features, which signal to users and programmers that they will all remain available over the long term. Thus the leadership of POST-based projects is advised to be slow and careful when allowing names to enter the TTv1 namespace. The POST StabilizingZone does not require such rigor for TTv0 or any full versions of other stability levels (MM … SS), providing many opportunities for experimenting with mutually incompatible competing implementations of a new feature before selecting one of them for TTv1. While POST generally expects an increase of stability from MM to SS and with increasing version numbers within a StabilityLevel, it expects even more that systems become public at RR and noticeably reduce the changes that remain possible as stability moves through the levels RR -> SS -> TT, while incorporating worthwhile improvements suggested in a FeedbackFlow from public users.
The path to TrustedTested: an overview of the StabilizingZone.
Here GG serves as the "Null-Element" that holds equivalent content outside of the StabilizingZone if there is no aim to ever attain long-term backwards compatibility. There is no difference in the assurance of stability between GG and MM, only a difference in ultimate intention.
VV serves as the installation location and ultimate archive for any products produced by the StabilizingZone.
XX provides temporary build space needed for generating the final installation-ready files (its stability will be copied from independent upstream code that produces it, hence XenoXero).

Design
The following information is advanced and meant for software architects, designers, and developers; it is not intended for a general audience.

Functional requirements and features
The POST system was optimized in a trade-off between the following requirements:
a. Easy entry point for newcomers with as few required structures as possible (none).
b. Simple growth by adding only structures that are needed locally (one by one).
c. Conceptual clarity, providing roles for each structure that are well defined and of the highest importance for corresponding large projects.
d. Brief memorable keywords, including the use of all double capital letters of the English alphabet as Brief Names for StabilityCodes with matching Explicit Names carefully chosen as memorable reminders of their meaning (example: HH for "HistoryHeap").
e. Memorable organization of StabilityCode names along the alphabet roughly tracking stages of project progression (start with early letters for early stages; minimize exceptions).
f. Use of the alphabet to group StabilityCodes by types for well-defined use cases (choosing the clearest Explicit Names with corresponding initials to sort accordingly with the alphabet).
g. Flexible to allow extremely complex or simple project structures to exist concurrently.
h. Clear transitions and support for both manual and automated operation of POST in many diverse storage structures (folders in file systems, Git repositories, sets with nested subsets and elements, arbitrary data structures, diverse off-line paper-based systems, etc.).
i. No upper limits for nesting full-scale POST systems within other POST systems unless explicitly named (such as absolute file path length limits, storage size limits, etc.).
j. Clearly defined rules for making it easy to create tools for automating tedious tasks (such as activating or versioning the content of folders with various levels of stability).
k. Support for all key aspects of system development from first idea over the various stages of the software development life cycle to archiving the last results.
l. Defined information flows between various containers with different StabilityCodes (specifying which flows are allowed and which are not).
m. Utmost reduction of inessential complexity to maximize usability, but without sacrificing scalability or other functions that are essential for helping to organize and stabilize projects (together with alphabetical and mnemonic requirements this resulted in considerable naming challenges, especially for identifying memorable Explicit Names; we cannot see how to meet these challenges without at least separating Brief and Explicit Names; much of the value of POST is defined by the quality of this integration task).
n. Summarizing Names for StabilityCodes that serve as "mini documentation", reduce the need to consult a manual, and provide a nucleus for standardizing POST system semantics.
o. Define a StabilizingZone with defined StabilityLevels to serve as an anchor for a stabilizing version variant numbering system that can help software projects steer code towards long-term backwards compatibility by explicitly disentangling the speed of fast 'agile' prototype development (such as MockupModel, MM) from slow, highly organized 'waterfall' project development (typically published as ReviewedRelease, RR), and in turn from the glacial pace of development of most international standards (usually StableSource, SS, albeit without excluding future changes or promising migration capabilities); keep these use cases separate from the endpoint of the StabilizingZone (TrustedTested, TT), which is required to remain 'stable forever' to the best of current knowledge or provide a migration path (see below).
p. Encourage long-term backwards compatibility of systems developed with POST by helping end users to quickly recognize version variants offering such maximal stability by providing a clear and memorable label (TrustedTested, TT) linked to a rigorous review process.
These requirements must be included in future extensions of the POST system that ideally extend and refine this list in order to move POST closer to the stability denoted as TrustedTested.
The POST definitions given here are not yet at StabilityLevel TT, but were designed with the aim to encourage and measure progress towards developing TT stable code in and for Evolvix (in order to reduce the cost of irreproducible research). This implies that the POST system defined here (and in particular its StabilizingZone) is closer to TT than any other code developed in or for Evolvix (see the statement on status in the section below). Thus, good reasons need to be provided for either
• removing any requirement specified above, to justify why the corresponding POST use cases are best expressed in other ways, or
• adding requirements, to justify why the corresponding use cases are important enough to merit inclusion in POST more than they merit removal to serve the aim of reducing cognitive complexity for all users of POST (since beginners and experts pay for clutter caused by inessential complexity in POST, albeit in different ways).
POST aims to reduce organizational burdens in info-processing projects of any size by associating frequently used types of meaning with standardized Brief and Explicit 'StabilityCode' Names chosen to be memorable. You choose the StabilityCodes your project needs from the general scaffolding, organizing and stabilizing blueprint given by POST; it does not matter whether you are writing a small text or a big complex nested software system that needs long-term backwards compatibility.