An Ontological Characterization of the ‘Structured Information Flow’ (SIF) Model

Shishir Bhatia (shbhatia@vt.edu)
James D. Arthur (arthur@vt.edu)
Department of Computer Science, Virginia Tech
Blacksburg, Virginia, 24060, USA

Abstract

In the wake of the Internet revolution, countless organizations recognized the potential of electronic communication. While many raced to harness the Internet to reach a mass customer base, those not built for an Internet-like paradigm were only partially adapted to work with the World Wide Web. Now, almost a decade into the new millennium, those ‘modern’ organizations are still behind the curve due to the absence of a seamless Information flow created by an automated approach applicable to each organization’s unique processes. Structured Information Flow (SIF) is a flexible framework that structures the end-to-end Information flow of an organization, from initial input, to processing, to the final output. SIF utilizes the organization’s reporting structure to create an intelligent model of its automated Information flow infrastructure, thus preserving and yet automating an organization’s processes, making the approach simpler, intuitive, and acceptable to organizations. SIF models Information flow that seamlessly integrates input from and output to the external domain with the organization’s internal processes. The SIF model utilizes Information Labeling and a set of Rules to fully automate the Information flow between the Nodes. Any change in the two basic components, Information and Nodes, is known as an event, causing a state transition. The events and the transition are triggered by any change in the organization, such as Information exchange. This body of work includes a theoretical model and a visual representation schema to compactly and completely capture the events and state transitions in SIF.
The SIF framework must be conceptualized with a powerful tool that transcends platforms, facilitating use across various organization types. Therefore, ontology is well-suited to characterize SIF components and conceptualize the proposed framework throughout this body of work.

Keywords: End-to-End Information Flow, Paperless Office, Information Flow Framework, Ontological Characterization, Information Flow Automation Model, Electronic Information Flow

1. INTRODUCTION

The advent of the Internet compelled enterprise systems to morph quickly to take advantage of the World Wide Web (WWW) and all the newly available technology. There was time to be saved and dollars to be made (Tao, 2006) by organizations that seized the potential of electronic communication, improving processes and eliminating the handling of endless reams of paper. Simpler, streamlined, paper-free processes meant less labor, fewer errors, and greater potential for growth and profit. The dream of the paperless office was born. But enterprise systems that were not designed for electronic information flow were dependent on processes that required paper to complete the loop. The fully digital paperless office was still just a dream. Even by the end of the 20th century, 2.4 billion sheets of paper were placed in folders each day to complete Information flow processes in modern organizations (Liu, 2000)! And today, years later, those ‘modern’ organizations are still mired in paper. The problem is the absence of an automated Information flow approach that lends itself to each organization’s unique processes, rendering them seamless. This body of work proposes an ontological conceptualization of such an approach: Structured Information Flow (SIF).

Background

By the 1980s, the newest innovations in computer technology promised a paperless office using electronic Information flow (Liu, 2000). At that time, computers were not standard in the workplace or in the average home.
But as technology developed and public interest grew, the equipment became both more user-friendly and more readily available. Then came the age of the Internet and cheap computer hardware. PCs were finally becoming common in traditional households and empowered the masses to access Information instantaneously and at will. Quickly, it became an expectation that organizations not only provide Information through the Internet, but also allow customers to interface and receive dynamic and personalized responses. Although many organizations embraced the public’s desire for online services like bill payment, most organizations were challenged by costly intranet installations that were not geared to harness the Internet infrastructure. They were able to provide some electronic communication but could not meet the demands of a fully automated system, such as allowing the customer to interact directly with the organization’s domain. Therefore, in this seemingly IT-automated structure, the basic connection of processes still had to be performed ‘by hand.’ For example, when a customer request was received, a printout of the order was sent to trigger the delivery of a product or service. A set of truly interoperating systems leading to true automation was never achieved by most organizations. Even though electronic communication in place of paper-based communication guaranteed a great many advantages (Petrakis, 2000), to date we do not see many organizations that have totally embraced the concept of a paperless office. Problem Why is it that we do not very often see an organization’s tools and systems interoperating seamlessly to fully automate the flow of Information? The problem is that most of the systems currently used by organizations are ‘dumb’ (Bayer, 1996), meaning they hold the Information but do not decide who should have access. 
What is needed for true automation is an intelligent system – a system that can decide who should have access to what Information from the time the Information is created and route it accordingly. However, various attempts to create such intelligent systems called for radical changes in organizational functioning; most organizations, for monetary and cultural reasons, have not been able or perhaps willing to struggle with this daunting task (Smart, 1995). So, the real challenge is not only to devise an intelligent system-based approach that automates, but also one which is simple and does not force the organization to change. Ideally, once such an approach is devised, it should be standardized for use across various organizations and platforms. These three requirements alone have so far eluded the best efforts of Information structure designers. Moreover, once such a concept is created, it is critical to standardize (Moraga, 2006) it before communicating clearly across industries. It requires a specification or conceptualization schema both powerful and widely accepted. Only then can the knowledge of automation efforts be shared across organizations and platforms, which is critical for the successful implementation of such a system. We do witness paperless commerce in our everyday life from vendors like Amazon and eBay, which is testament that complete automation is possible using current technology. The question is not whether complete automation is now possible or not, but how to create the infrastructure for the Information flow that will automate even the most complex, largest, and oldest of organizations. Solution Approach In this paper, we propose the SIF Model and its ontological conceptualization. SIF is a framework that structures the end-to-end Information flow of an organization, from initial input, to processing, to the final output. 
SIF utilizes the organization’s reporting structure in order to create an intelligent model of its automated Information flow infrastructure. This way, it is preserving and yet automating an organization’s processes, making the approach simpler, intuitive, and acceptable to organizations. SIF models Information flow in a manner that seamlessly integrates input from and output to the external domain with the organization’s internal processes. SIF utilizes Information Labeling along with a set of Rules to fully automate the Information flow between the nodes. Information and Nodes are the two basic components of SIF; any change in either of the two, known as an event, causes the state to change, known as state transition. The events and the transition occur when there is any change in the organization, such as Information exchange. We have also developed a theoretical model and a visual representation schema to compactly and completely capture the changes. As mentioned in the Problem section, the framework itself must be conceptualized using a schema or model that transcends platforms. We believe ontology is the right tool for conceptualizing the SIF framework. 2. SIF FRAMEWORK’S ONTOLOGICAL CONCEPTUALIZATION From the Information perspective, at the most fundamental level, any organization is comprised of two elements: Entities that create/consume Information, and the Information itself. In SIF, the entities that create and consume Information are called Nodes. They are the abstract representation of an employee or any role in any organization. The fundamental unit of Information that a node creates, edits, or deletes is called an Information Piece. In the SIF model, these Nodes and Information Pieces are organized in a manner that is reflective of the organization’s reporting structure; for example, a real-life senior-subordinate relationship is represented by a senior-subordinate node relationship in the SIF model. 
As we mentioned earlier, SIF uses the reporting structure of the organization to model how the Information Pieces flow between Nodes. Information flow is regulated by rules based on the organization’s reporting structure. For example, one simple and intuitive rule for Information propagation is: a senior node can view a subordinate node’s Information, while a subordinate cannot view a senior’s Information. Another is: a senior node can extend the right to view its subordinate’s Information to any other node of any hierarchical standing. Given that a senior has access to all of its subordinates’ Information, a logical superset of Information Pieces (containing the senior’s own Information Pieces as well as those of all its subordinates) exists at the senior node. We refer to this logical bundling of the Information at the nodes as Aggregation. Understandably, the Aggregate grows as we go up the nodal hierarchy. Aggregation is a powerful way of logically organizing Information for fully automated Information flow.

Ontological Conceptualization of SIF Components

According to the definition of ontology (Wikipedia), ontology analysis of an item requires establishing the following:
* Taxonomy
* Individual (Instances)
* Classes (Concept)
* Attributes
* Relationships

The sections below use ontology to characterize the two components of SIF: Information and Nodes. Figure 1 and Figure 2 represent the Taxonomy of the Information and Node components, respectively. The element of Relationships requires a more involved explication and is discussed in a separate section of its own. Typically, in the context of ontology, a Relationship exists between the elements of one Taxonomy. In this body of work, however, we investigate the relationship between the objects of the two taxonomies (Figure 1 and Figure 2), as that is what drives much of the Information flow in SIF.
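The Aggregation idea described above can be illustrated with a minimal Python sketch. The `Node` class, the `aggregate` function, and the three sample roles are hypothetical names introduced here for illustration only; the paper does not prescribe a data structure. The sketch assumes that an Aggregate is simply the union of a node's own Information Pieces and those of every node below it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A role in the organization (hypothetical minimal representation)."""
    name: str
    owned: set[str] = field(default_factory=set)            # pieces owned by this node
    subordinates: list["Node"] = field(default_factory=list)


def aggregate(node: Node) -> set[str]:
    """Logical superset: the node's own pieces plus those of all its subordinates."""
    pieces = set(node.owned)
    for sub in node.subordinates:
        pieces |= aggregate(sub)
    return pieces


# Illustrative three-level hierarchy: head > manager > clerk (assumed example data)
clerk = Node("clerk", {"complaint-17"})
manager = Node("manager", {"area-report"}, [clerk])
head = Node("head", {"budget"}, [manager])

for n in (clerk, manager, head):
    print(n.name, sorted(aggregate(n)))
```

Under these assumptions the set returned for `head` contains the set returned for `manager`, which mirrors the observation that the Aggregate grows toward the top of the hierarchy.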
The Ontology of Information Component

Refer to Figure 1 in the Appendices:

Class: Information Aggregate
Object: Information Piece
Attributes:
* Owner (the node that has all the privileges over an Information piece: View, Edit, and Delete)
* Hierarchical standing of owner
* List of nodes with Extended rights
* History
* External Designator

Information is the humanly perceptible data that is created to perform, track, and validate the functions that achieve the organization’s objective (Boisot, 2004). In SIF, this definition is narrowed down to the Information that pertains to the organization under consideration. As represented in the taxonomy in Figure 1, the Information in the organization can be Captured or Created, Restricted or Unrestricted, and With or Without External Access/Designator. The Information classifications are explained below.

Captured or Created Information

Information is generated in SIF in two ways: it is Captured from the external domain, or it is Created internally in the domain. When an entity external to the organization registers new Information with the organization, that Information is Captured. For example, when a citizen lodges a complaint with the appropriate node at a local municipality office, that complaint becomes Information captured at that node. When an Information piece is generated within the organization rather than being captured from outside the domain, that Information is Created. For example, when a municipality’s communication office generates a report of complaints sorted by area, the report becomes Created Information. The node where the Information is captured or created becomes the owner of the Information, and the Information physically resides at that node in the Information bank of the node.
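As a rough illustration of the attribute list and the Captured/Created distinction above, the following Python sketch models an Information Piece as a record. All identifiers (`InformationPiece`, `Origin`, `capture`, `create`) and the use of an integer for hierarchical standing are assumptions made for this example, not part of the SIF specification.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Origin(Enum):
    CAPTURED = "captured"   # registered by an entity in the external domain
    CREATED = "created"     # generated inside the organization


@dataclass
class InformationPiece:
    content: str
    owner: str                       # node holding View, Edit, and Delete privileges
    owner_standing: int              # hierarchical standing of the owner (assumed numeric)
    origin: Origin
    extended_rights: set[str] = field(default_factory=set)   # nodes with Extended rights
    history: list[str] = field(default_factory=list)
    external_designator: Optional[str] = None


def capture(content: str, node: str, standing: int) -> InformationPiece:
    """The node where Information is captured becomes its owner."""
    piece = InformationPiece(content, node, standing, Origin.CAPTURED)
    piece.history.append(f"captured at {node}")
    return piece


def create(content: str, node: str, standing: int) -> InformationPiece:
    """The node where Information is created becomes its owner."""
    piece = InformationPiece(content, node, standing, Origin.CREATED)
    piece.history.append(f"created at {node}")
    return piece


# The two municipality examples from the text, with assumed node names and standings
complaint = capture("Citizen complaint", "complaints-desk", 1)
report = create("Complaints sorted by area", "communication-office", 2)
```

Keeping the access-related fields on the piece itself reflects the paper's view that attributes are an integral part of an Information Piece.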
Restricted or Unrestricted Information Just as in real life organizations, in SIF, Information is labeled as either Restricted, Information to which the access needs to be controlled and managed based on rules and rights; or Unrestricted, Information that is universally accessible without requiring any permission or privilege. The owner of the Information determines its accessibility and labels it either Restricted or Unrestricted. If labeled as restricted, the Information can be accessed only by the nodes with rights to access it. In this paper, we primarily discuss the Restricted Information, rather than Unrestricted Information, as it is the Restricted Information that requires control and regulation using rules and access rights. We will address the Access Rights in detail in the Rules section. Information With or Without External Access/Designator In some instances, Captured (registered from the external domain) Information pieces require some kind of receipt sent to the external entity that registered the Information. In our previous example of a citizen lodging a complaint, a receipt indicating that the complaint has been received by the municipality and a resolution time estimate would be ideal. This receipt can be sent to the external domain entity as an email reply, or retrieved via password access, or in a variety of other ways. The receipt, which is restricted for retrieval only by the designated external entity, is termed Restricted Information with External Designator. The Ontology of a Node Component Figure 2. Taxonomy of Node Component. 
Class: Group of Nodes [a Group is a conceptual aggregation of Nodes and represents Sub-Organizations]
Object: Node [with respect to another node, a node could be Senior, Subordinate, or Peer]
Attributes:
* Hierarchical standing
* Information Owned
* Viewable Information Pieces
* Direct Senior node
* Direct Subordinate node

As discussed previously, SIF utilizes Nodes to represent the Information processing entities in an organization, organized in a hierarchical manner that mimics the organization’s reporting structure. Why does SIF use only a hierarchical topology? It is not difficult to observe that most organizations with which we interact, such as banks, commercial institutions, and educational institutions, can easily be mapped to a hierarchical work structure. In SIF, the hierarchical arrangement of the nodes is an important source of the Information access rights that are granted to any node. More specifically, the relative hierarchical standing of the node in the topology determines certain Information access rights that a node can exercise. As mentioned above, a node, relative to another node, could be Senior, Subordinate, or Peer. For example, as shown in Figure 3, for Node X, Nodes Y and Z are senior, V is subordinate, and W is a peer node. Nodes S and T are not related to X.

Figure 3. Senior, Subordinate and Peer Nodes

However, there are certain scenarios in which the hierarchical relationship between certain nodes is not obvious. The Information Based Precedence Table (Bhatia, 2006), a method developed as a part of this body of work but outside the scope of this paper, allows us to establish a clear hierarchy between nodes when one may not naturally exist. This enables us to map even Pseudo-Hierarchical structures as strictly hierarchical structures in SIF.

Ontological Characterization of Relationship between the SIF Components and Rules

The previous sections define Information and Nodes as they are utilized in SIF.
However, it is the relationships between the Information and Nodes, and between the nodes themselves, that hold the key to Information flow. The privileges and constraints, or rules, attendant to those relationships allow the system to automatically route Information to its appropriate destination. In this section, we state the high-level rules, which remain static in the SIF framework, and explain the formulas that allow for customized rules based on an individual organization’s structure. The rules establish the pathways along which Information will flow; the triggers that prompt the flow are discussed in the Events in SIF section. Here we examine the privileges and constraints that govern Information flow based on relationships both between the Nodes and between the Nodes and Information.

Privileges/Constraints Based on Node Relationships

* A node can view any Information owned by any of its subordinates in the direct chain of the hierarchy.
* An owner node can extend its right to edit or delete Information to another node.
* A node can extend to any other node view rights over its Information or that of any direct subordinate.

Relationship Between Node and Information

From an ontology perspective, a maximum of four types of relationships can exist between a node and Information in SIF. The ontological node-Information Relationship definitions are as follows:

* node (Can View) Information Piece
* node (Can Copy) Information Piece
* node (Can Edit) Information Piece
* node (Can Delete) Information Piece

If none of the above four relationships exists, then the node cannot access the Information. This is the Default relationship and is defined as:

node (Cannot Access) Information Piece

Based on the above characterizations of Privileges/Constraints and Relationships, we formally define the Rules that govern the Information flow in SIF below.
Rules of Information Flow

We have used a simple, formal language to document the Rules, where the symbols mean the following:

* $N_R$ is the hierarchical standing of the Requestor node
* $N_O$ is the hierarchical standing of the Owner node
* $N_E$ is the hierarchical standing of the node that Extends a right
* $>$ is the greater than operation, which indicates that the node on the LHS is the Superior of the node on the RHS
* $\equiv$ is the ‘Same’ operator, which indicates that the node on the LHS is the same node as on the RHS
* $\leftarrow$ is the Extension of rights operator: the node on the RHS has granted the view right to the node on the LHS
* ; is the True If operator: the Rule on the LHS of the semicolon is true if the condition on the RHS is satisfied
* / stands for OR

Node (Can View) Information Piece OR Node (Can Copy) Information Piece

We have aligned the View and Copy operations at the same level of privilege. In SIF, if you can view an Information piece, you can copy it to create a new Information piece. Based on the Privileges/Constraints discussed above, there are three scenarios under which a node can view and/or copy an Information piece:

a. The Requestor node is Superior to the Owner node:
   $N_R > N_O$ (1)
b. The Requestor node is itself the owner of the Information piece:
   $N_R \equiv N_O$ (2)
c. The Requestor node is subordinate to the owner node, but a superior node extends it the right to view the Information. Given $N_O > N_R$ and $N_E > N_O$, the Requestor can view the Information if:
   $N_R \leftarrow N_E$ (3)

In summary, the above three scenarios can be combined in OR logic:

node (Can View/Copy) Information Piece ; $N_R > N_O$ / $N_R \equiv N_O$ / $N_R \leftarrow N_E$ (4)

Node (Can Edit) Information Piece OR Node (Can Delete) Information Piece

The Edit and Delete operations are possible only by the Owner node of the Information, as represented in (2).

Node (Cannot Access) Information Piece

In this scenario, the requestor node is neither senior to the owner nor the owner of the Information, and none of the superior nodes has extended the view rights to the requestor node.
In such a scenario, a requestor node can request a senior node to grant view access, after which the requestor node will be in compliance with Rule ‘c’.

3. SIF MODELING

The previous sections have detailed the ontological conceptualization of the SIF components, Nodes and Information, and the Information Flow Rules formalized using ontological relationships. This structure sets the stage for the SIF model of an organization’s Information flow, as well as its triggers. The hypothetical organization depicted in Figure 4 illustrates how an organization’s Information flow infrastructure is modeled in SIF. In this organization, A is the manager who owns Information piece L, and subordinates B, C, and D own Information pieces I, J, and K, respectively. This hypothetical organization is modeled as shown in Figure 5.

Figure 4. Hypothetical Organization

Figure 5. Hypothetical Organization mapped to SIF.

In Figure 5, dotted cylinders represent the Information Banks of the respective nodes, and the amoeba-shaped structure is the Information Aggregate. Only the senior has an Information Aggregate, as only the senior has subordinates and access to their Information Banks. While the above organization and its SIF model are over-simplified, the premise and basic elements translate conceptually to organizations that are many times bigger and more complex. The beauty of the design is, in fact, its simplicity, which allows for great versatility.

Events in SIF

Figure 5 represents the state of the organization at time t as mapped in SIF. Any change in the Information or node component, such as the creation of new Information, is called an Event and triggers a change of this state, called a State Transition. Events that lead to a State Transition, or transition events, when diagrammed as above, are the SIF mappings of the day-to-day action in an organization.
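To make the Figure 4 organization concrete, the sketch below encodes it in Python and applies the view and edit rules of Section 2 to it. The dictionaries and function names (`SENIOR_OF`, `OWNER_OF`, `is_senior`, `can_view`, `can_edit`, `grant_view`) are hypothetical, and the sketch follows the Rules of Information Flow in treating Edit and Delete as owner-only. It is one possible reading of rules (1) to (4), not a reference implementation.

```python
# Figure 4: A is the manager; B, C, and D report to A.
SENIOR_OF = {"A": None, "B": "A", "C": "A", "D": "A"}   # node -> direct senior
OWNER_OF = {"L": "A", "I": "B", "J": "C", "K": "D"}     # piece -> owner node


def is_senior(a, b):
    """True if node a sits above node b in b's direct chain of the hierarchy."""
    current = SENIOR_OF.get(b)
    while current is not None:
        if current == a:
            return True
        current = SENIOR_OF.get(current)
    return False


def can_view(requestor, piece, grants=None):
    """Rule (4): senior of the owner, the owner itself, or holder of an extended right."""
    owner = OWNER_OF[piece]
    granted = (grants or {}).get(piece, set())
    return is_senior(requestor, owner) or requestor == owner or requestor in granted


def can_edit(requestor, piece):
    """Edit and Delete are possible only for the owner, as in rule (2)."""
    return requestor == OWNER_OF[piece]


def grant_view(grantor, grantee, piece, grants):
    """Rule (3): the owner or one of its seniors extends the view right."""
    owner = OWNER_OF[piece]
    if grantor != owner and not is_senior(grantor, owner):
        raise PermissionError(f"{grantor} may not extend rights over {piece}")
    grants.setdefault(piece, set()).add(grantee)


grants = {}
print(can_view("A", "I"))          # manager viewing a subordinate's piece
print(can_view("B", "J", grants))  # peer without an extended right
grant_view("A", "B", "J", grants)
print(can_view("B", "J", grants))  # same peer after A extends the right
```

When none of the three conditions holds, `can_view` returns `False`, which corresponds to the default node (Cannot Access) Information Piece relationship.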
Formally, the generic representation of a transition event in SIF is represented and explained below:

(5)

where:
* symbolizes the Transition caused by the Event in state component X. For example, a transition caused by the change in the node is represented by
* and are the States at time ‘t’ and ‘t+1,’ respectively
* is the event that caused the change in any state component.

The list of Events in SIF that can cause a State Transition, and their granular formal representation, is presented below. Due to space constraints, the full visual representation schema for state transitions is demonstrated only once.

Node Based Events

Any change in the reporting structure of the organization results in a change in the SIF model. For example, a new role created in the organization prompts a new node in the SIF model. Any State Transition caused by a node-based Event is generally represented as:

(6)

where N is the node under consideration and the generic node modifier is either Alpha (α), which stands for Addition, or Beta (β), which stands for the Deletion action. The remainder of the symbols are defined as above. If a new node J is added to the model, the formal representation appears as follows:

(7)

where the generic node modifier is replaced by the symbol Alpha (α), which indicates the action of adding a node, and J represents the particular node added to the model. Similar to Addition, the event of the Deletion of node J is captured in the notational language as:

(8)

where the symbol Beta (β) indicates the act of Deleting the node J. Other role changes and transfers that are seemingly not addressed by Additions and Deletions can easily be performed by breaking each change down into a series of deletions and re-additions of Nodes.

Information-Based Events

Information is the most fluid of all the SIF components, and is expected to change constantly as the organization goes through its daily functions. These Information-based events are a more dynamic element of SIF.
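The node-based events described earlier (Addition, Deletion, and transfers decomposed into a deletion followed by a re-addition) can be sketched as functions that take the state at time t and return the state at time t+1. The representation of a state as a mapping from node to direct senior, the function names, and the refusal to delete a node that still has subordinates are all assumptions of this sketch; the paper does not specify them.

```python
from typing import Optional

# Assumed state representation: node name -> direct senior (None for the top node)
State = dict[str, Optional[str]]


def add_node(state: State, node: str, senior: Optional[str]) -> State:
    """Addition (Alpha): return the state at t+1 with `node` placed under `senior`."""
    if node in state:
        raise ValueError(f"node {node} already exists")
    if senior is not None and senior not in state:
        raise ValueError(f"unknown senior {senior}")
    return {**state, node: senior}


def delete_node(state: State, node: str) -> State:
    """Deletion (Beta): return the state at t+1 without `node`."""
    if node not in state:
        raise KeyError(node)
    if any(senior == node for senior in state.values()):
        # Assumption of this sketch: subordinates must be moved or removed first.
        raise ValueError(f"node {node} still has subordinates")
    return {n: s for n, s in state.items() if n != node}


def transfer(state: State, node: str, new_senior: str) -> State:
    """A role transfer expressed as a Deletion followed by a re-Addition."""
    return add_node(delete_node(state, node), node, new_senior)


state_t = {"A": None, "B": "A", "C": "A"}
state_t1 = add_node(state_t, "J", "B")     # Addition of node J under B
state_t2 = transfer(state_t1, "J", "C")    # J moves from B to C
print(state_t2)
```

Each function returns a new mapping rather than mutating its input, so the states before and after an event remain available for comparison.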
Any State Transition caused by an Information-based Event is generally represented as:

(9)

where the event symbol indicates an Information-based event in which an action is performed on a specific Information piece I, owned by node N. The action is Alpha (α) for Addition, Mu (μ) for Editing, or Beta (β) for Deletion. Below, the SIF Model demonstrates how State Transitions are impacted by Information-based Events.

Addition - The act of addition itself can originate from two sources, Capturing and Creation, as discussed in the Information section. If an Information piece P is created/captured at a node J, this transition event is represented as:

(10)

where the symbols are defined as above.

Editing Information - Editing refers to a change in an existing Information piece or its attributes. The list of Information attributes is given with the ontology description of Information. If node J extends its ownership right for Information piece P to the node K, this transition generates a two-fold effect:

a) The owner of the Information P is changed to K.
b) The new owner is now reflected in the Information attributes.

This transition is represented by:

(11)

An Information attribute change has a much broader application than it may at first appear, because it captures the change in the privileges of a particular Information piece and, hence, is actually a mechanism to implement the Rules of Information Flow. Whether it is a View privilege granted by a senior node, or a node passing its Ownership right over an Information piece to another node, it is all captured in the Information attributes. Any change in the attributes is an Information Editing event because we consider Information attributes an integral part of any Information Piece. Eventually, when any node attempts to access any Information piece, the Information attributes will determine whether the access operation is allowed or restricted.
Deleting Information - This Event occurs when existing Information is deleted by the Owner. For instance, if node K decides to delete the Information piece P, this Event is represented as:

(12)

where Beta (β) stands for the Deletion action and the rest of the symbols are defined as above. As a consequence of an Information-based event, appropriate change agents, through implicit invocation, effect a corresponding Information flow/reorganization.

Visual Representation

Thus far, we have described how SIF works: the components of SIF, how they come together to model an organization, what the states and events in the model are, and how a state transition can be represented in a compact, formal manner. The formal language, however, captures only the components undergoing transition. Going forward, we provide a visual representation of the model as a whole at any given time t. The presented visual schema can represent the whole organization, or a subset of it, at a given time t and at time t+1 following an event. Thus, the visual representation gives us the whole story of: a) what the organization looked like before the transition event, b) what event triggered the transition, and c) what the new state of the organization is at time t+1.

For the purpose of demonstration, let us assume that the SIF model in Figure 5 represents an organization at time t = 8:00. We further assume that at t = 8:10 a new role, represented by node F, joins the organization as a subordinate of the role represented by node D, and that node F begins the first day at work with an initial set of Information it owns, represented by M. As the visual representation shows, the addition of node F to the organization triggers two changes:

a) A new node is introduced in the model, formally represented by:

(13)

b) New Information M is added to the model, represented by:

(14)

Figure 6 in the Appendices represents the state transition caused by the new role joining the organization, as explained in the example.
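The same example can be traced in code. The short Python sketch below builds the state at t = 8:00 from Figure 5, applies the two events (node F added under D, Information M added at F), and computes the resulting Information Aggregates. The variable and function names are hypothetical, and the sketch assumes an Aggregate is the union of the Information Banks at and below a node.

```python
def aggregate(node, senior_of, owned):
    """Pieces owned by `node` or by any node below it in the hierarchy."""
    pieces = set(owned.get(node, set()))
    for sub, senior in senior_of.items():
        if senior == node:
            pieces |= aggregate(sub, senior_of, owned)
    return pieces


# State at t = 8:00 (Figure 5): A manages B, C, and D.
senior_t = {"A": None, "B": "A", "C": "A", "D": "A"}
owned_t = {"A": {"L"}, "B": {"I"}, "C": {"J"}, "D": {"K"}}

# Events at t = 8:10: node F is added under D, then Information M is added at F.
senior_t1 = {**senior_t, "F": "D"}
owned_t1 = {**owned_t, "F": {"M"}}

for node in ("D", "A"):
    before = aggregate(node, senior_t, owned_t)
    after = aggregate(node, senior_t1, owned_t1)
    print(node, sorted(before), "->", sorted(after))
```

In this sketch the only difference between the two states for D and A is the added piece M, while the sets for B and C are unchanged.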
Notice that in State t=8:10, after the new node is created, its initial set of Information M is reflected in the Information Aggregates of its direct senior nodes (D and A). 4. OTHER CONSIDERATIONS AND CONCLUSION The scope and length of this paper do not allow a detailed discussion of the many considerations accommodated in the design and refinement of the SIF model. Below is a brief discussion of some of the considerations worth mentioning for the purpose of this conference. SIF’s Event Driven Architecture In this paper, we have seen how SIF proposes a conceptual Information flow infrastructure for a hierarchically oriented organization. The node and Information configuration in SIF suggests an Implicit Event Driven Architecture. Each node should be equipped with a procedure to identify any change in its disposition within the SIF model, such as an Information addition at the node. Once the event is identified at the node, it should trigger other procedures to propagate the change throughout the model, such as the change in Information Aggregates discussed in the Visual representation example Figure 6, and others. We believe the SIF model naturally lends itself to an Implicit Event Driven Architecture implementation. Universal Information Pool While a critical feature of the SIF model is Restricted Information, and the rules to access it, equally important to the model is the management and accessibility of Unrestricted Information. Free access to Unrestricted Information by all internal and external domain entities is made possible by a pragmatic mechanism in SIF called the Universal Information Pooler. Interfaces in SIF Interfaces are the mechanism in SIF that implement the rules- and constraint-based Information flow. Interfaces reside with every node and act as a logical gateway for all the Information being accessed from the node. The logic that renders SIF intelligent, capable of determining which node can access what Information, resides in the Interfaces. 
While we have conceptualized the SIF model, Interfaces cannot be so generally represented. Interfaces capture the unique processes of the organization, and their construction must be contingent upon the implementation considerations. Contribution By defining the theory of SIF, through an ontological conceptualization, we have defined a generic, simple, and robust Information flow approach, applicable across platforms. More specific contributions of this body of work are detailed below. Model for Information Flow The theory of SIF offers, for the first time, an approach to automate the Information flow using an organization’s own work/reporting structure, roles, and Information to define the automation model. We have conceptualized the Model using an ontology that defines the Information flow infrastructure of complex hierarchical organizations of any size. This conceptual infrastructure enables the rules-based Information flow within the organization, as well as Information exchange with external entities. Notational Language and Visual Representation The ontological conceptualization of the SIF approach put forward in this body of work is supported by the formal language that represents State Transitions and Transition Events. This language, specially developed for the SIF model, is a compact way to capture events. Visual representation, on the other hand, gives a full view of the SIF model before and after the state transition. Both the Notational and Visual representations provide a powerful means to model, understand, and even simulate Information flow infrastructure prior to implementation. Conclusion In conclusion, we believe that the SIF approach proposed in this paper provides seamless connections between entities within and outside of an organization by automating Information flow, supporting a paperless office for simple to complex organizations. 5. REFERENCES Bayer, S. (1996) “Embedding Speech In Web Interfaces,” Proceedings of ICSLP, October, pp. 
1684-1688.
Bhatia, S. (2006) “Structured Information Flow (SIF) Framework for Automating End-to-End Information Flow for Large Organizations.”
Boisot, M. and A. Canals (2004) “Data, Information and Knowledge: Have We Got It Right?”
Liu, Z. and D. G. Stork (2000) “Is Paperless Really More?” Communications of the ACM, November, pp. 94-97.
Moraga, M. A., C. Calero and M. Piattini (2006) “Ontology Driven Definition of a Portlet Functionality Model.” Proceedings of EDOC '06, 10th IEEE International, October, pp. 405-408.
“Ontology.” Wikipedia, 2007.
Petrakis, J. M. and M. J. Engiles (2000) “Creating a Paperless Municipal Court.” Proceedings of WSC, pp. 2029-2035.
Smart, K. L. (1995) “The Paperless Office: Facts and Fictions.” Proceedings IPCC, September, pp. 141.
Tao, T., J. Yang and H. Jia (2006) “Business Collaboration Development: A Case Study in Capital Market.” Proceedings EDOC '06, 10th IEEE International, October, pp. 449-452.