i4i's patent, which Microsoft has been found liable for infringing in Word 2007, is about a generational leap in the capability of computers to process data.
- HTML was the first generation. If you wanted the author's name to appear in bold, you wrapped it in a tag, thus: <b>bold text</b>
- SGML was the next generation: it operated with a special complement of descriptive "verbs" or "tags", together with software that could process and interpret these efficiently. The advantage of this approach was that different stylesheets could render the same text differently, which promoted reusability of data. The problem lay in the restricted number of tags.
- XML broke this limitation of SGML: now one could create one's own tags, and browsers could render the document so long as it was "well-formed", i.e. every open-tag had a corresponding close-tag and vice versa, and close-tags followed open-tags predictably within the document. Stricter and more sophisticated rules of logic and structure could be imposed on the document through Document Type Definitions (DTDs) or Schemas. This provided another breakthrough in the range of applications: no longer did one need standard ERP software in order to exchange data. XML encoders and decoders did the job, and organisations could simply exchange XML files representing transactional data, independent of database software. Thus markup languages made data interchange easy and cheap. Many other applications followed, making XML a development of nearly revolutionary proportions.
- However, through all these developments, tags (which are commands to the computer) remained interspersed with the data. This meant that when reading the data stream, the computer first had to apply logic to determine whether each character it read was part of the data or part of a command. This slowed down the computer's ability to read and process a document or an object. While this may not be apparent on the scale of data that most of us are used to dealing with, where there are mountains of data to process it is a serious time-and-efficiency robber.
- This is where the elegant concept of i4i's patent comes in. If commands can be interpreted independently of the content, the computer can read all the content in one go and then process it by applying the commands in serial order. In other words, if all the commands in a data object ("file") are found in one place and all the content in another, the computer no longer needs to evaluate every character to decide whether it is part of a command or part of the content. This affords a huge, generational leap in efficiency. For the same computing power, a lot more data can be crunched in much less time. In effect, this could make computing power cheaper by raising the efficiency with which computers process information.
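
To make the idea of keeping commands in one place and content in another more concrete, here is a minimal Python sketch. It is only an illustration of the separation described above, not i4i's implementation and not a reading of the patent's claims. The names split_markup, merge_markup and tag_map are hypothetical, and the sketch assumes only simple tags such as <b> and </b>, with no attributes, comments or entities.

```python
import re
from typing import List, Tuple

# Matches a simple open or close tag such as <b> or </b>.
# Assumption for this sketch: no attributes, comments or entities.
TAG = re.compile(r"</?[A-Za-z][A-Za-z0-9]*>")


def split_markup(marked_up: str) -> Tuple[str, List[Tuple[int, str]]]:
    """Split tagged text into tag-free content and a list of (offset, tag).

    Each offset is the position in the tag-free content at which the tag
    applies, so the two parts can be stored and read separately.
    """
    content_parts: List[str] = []
    tag_map: List[Tuple[int, str]] = []
    offset = 0  # current position in the tag-free content
    cursor = 0  # current position in the marked-up input
    for match in TAG.finditer(marked_up):
        text = marked_up[cursor:match.start()]
        content_parts.append(text)
        offset += len(text)
        tag_map.append((offset, match.group()))
        cursor = match.end()
    content_parts.append(marked_up[cursor:])
    return "".join(content_parts), tag_map


def merge_markup(content: str, tag_map: List[Tuple[int, str]]) -> str:
    """Rebuild the tagged text by applying the tags in serial order."""
    out: List[str] = []
    cursor = 0
    for offset, tag in tag_map:
        out.append(content[cursor:offset])
        out.append(tag)
        cursor = offset
    out.append(content[cursor:])
    return "".join(out)


source = "<name><b>Jane Doe</b></name> wrote this."
content, tag_map = split_markup(source)
print(content)   # Jane Doe wrote this.
print(tag_map)   # [(0, '<name>'), (0, '<b>'), (8, '</b>'), (8, '</name>')]
print(merge_markup(content, tag_map) == source)  # True
```

Once the split has been done, the content can be read as one uninterrupted string, and the tags are consulted only at the offsets recorded in the list. Whether this yields the efficiency gain described above in practice depends on the workload; the sketch only shows that the two parts can be held apart and recombined without loss.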