M/MUMPS-4b

From VistApedia
Revision as of 20:15, 21 October 2022 by DavidWhitten (talk | contribs) (Response)

From The Roots to InterSystems

https://community.intersystems.com/post/roots-intersystems

Robert Cemper · Aug 28, 2017
From The Roots to InterSystems
#Caché #InterSystems IRIS

This is a rather personal view of the history before Caché.

It is in no sense meant to compete with the excellent books by Mike Kadow discussed in an earlier article. We have different histories, so this is meant to offer a different perspective on the past.

The whole story started in 1966 at MGH (Massachusetts General Hospital) on a PDP-7, Ser. #103, with 8K of memory (18-bit words) [today = 18 KB], used as a spare system.

"Serial Number 103 - was located in the basement of the now demolished Thayer Building, currently [2014] the site of the Cox Cancer Center at MGH."

"Neil Papparlardo and Curt Marble under the guidance of Octo Barnett developed and released the initial software on this machine."  
They named it MUMPS. (source)
http://www.soemtron.org/pdp7no103systeminfo.html

The language itself was rather close to old-style BASIC.
But there were remarkable improvements over other programming languages:

The big idea was to store and retrieve persistent data without the need to deal with a file system. This was an enormous step forward at that time compared to other systems, where storing and reading persistent data could easily take 30% or more of your available memory, with no notion of sorting, indexing, and the like.
No strong data types anymore, and no data types imposed by names (ALGOL, FORTRAN, ...), which were an endless source of formal errors and conversions.

Dynamic (sparse) arrays without a frozen structure or pre-allocated, half-empty space in memory.

Indexing persistent data with variable-length structured indices (subscripts), allowing easy sorting, grouping, subgrouping, ...
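
To make the last two points concrete, here is a minimal Python sketch of a sparse array addressed by variable-length subscripts. It is not MUMPS and not how any real Global Module works; the class name SparseArray and its methods are hypothetical, the data lives only in memory, and the sketch assumes that subscripts at the same level have the same type so that Python can sort them.

```python
from bisect import insort


class SparseArray:
    """Toy model of a sparse array addressed by variable-length subscripts."""

    def __init__(self):
        self._values = {}  # subscript tuple -> value; only nodes that were set exist
        self._keys = []    # the same subscript tuples, kept in sorted order

    def set(self, subscripts, value):
        key = tuple(subscripts)
        if key not in self._values:
            insort(self._keys, key)  # keep keys ordered as they are inserted
        self._values[key] = value

    def get(self, subscripts, default=""):
        return self._values.get(tuple(subscripts), default)

    def children(self, prefix=()):
        """Distinct next-level subscripts below prefix, in sorted order."""
        prefix = tuple(prefix)
        depth = len(prefix)
        found = []
        for key in self._keys:
            if len(key) > depth and key[:depth] == prefix:
                # Keys are sorted, so equal next-level subscripts are adjacent.
                if not found or found[-1] != key[depth]:
                    found.append(key[depth])
        return found


patients = SparseArray()
patients.set(("patient", 1001, "name"), "Smith")
patients.set(("patient", 1001, "ward"), "B2")
patients.set(("patient", 7, "name"), "Jones")

by_id = patients.children(("patient",))         # patient numbers, already sorted
fields = patients.children(("patient", 1001))   # fields stored for one patient
missing = patients.get(("patient", 8, "name"))  # unset nodes return the default
```

Nothing is declared or pre-allocated: a node exists only once it has been set, and grouping by patient or by field falls out of the subscript order.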

You may want to compare it to old code in COBOL, FORTRAN or PL/1 to appreciate the scale of that revolution.

The new software followed the fast-moving hardware development until it reached the PDP-11 and was finally known as MUMPS 4b.

  • 1978 was a remarkable year:
    InterSystems was founded by Terry Ragon
    DEC rolled out its first VAX-11 Cluster (at Carnegie Mellon ?)
    DEC completed DSM-11 (Digital Standard MUMPS):
    Besides following the rather fresh standard, it had a new Global Module
    that improved storage performance radically.
    It easily outperformed any other product calling itself a database by orders of magnitude.
    The author of this Global Module was a brilliant engineer with international experience: Terry Ragon.
I myself also joined DEC in 1978 as Sales and Support Engineer for DSM-11, meeting Terry at the first support training in Maynard.

DEC at that time was completely high on the new VAX-11 and the VAX-Cluster.

The new high-performing DB was ignored and its power totally misunderstood.

All requests from software developers to have DSM native on VAX to take advantage of the new box were ignored.

This persistent disregard of customer requests led a customer of mine to invite me:
"If they don't do it, join us and we will do it!" [How often do you get an offer to write an OS like this from scratch?]
I just couldn't resist. I joined, and we wrote it from scratch on a bare-bones VAX-750.
The OS was named VISOS and lived as long as the supported VAX models existed.
Some time later DEC presented DSM as a layered product on top of VMS. In the beginning, performance was dictated by the underlying RMS and didn't reflect the gain in processing power. It moved out of my scope and I didn't care about it anymore. Years later, the best thing DEC did, in my opinion, was to sell its unloved product DSM to InterSystems, not too long before DEC itself was sold.

When I joined InterSystems 20 years later, I found in Caché so many details I had implemented myself. So I could enjoy a very warm feeling of being at home.

Caché is today far, far away from all its predecessors but still source compatible. The power of Globals is still there. There might be only a few constellations where you can't outperform a competing DB.

My favorite example out of many others:
GAIA Project run by the European Space Agency (ESA)

https://www.intersystems.com/de/library/library-item/european-space-agency-chooses-intersystems-cach-database-for-gaia-mission-to-map-milky-way/

This is obviously a quite personal perspective on technological history and part of a personal story. If you have questions or feel the need to correct me, you are welcome. With my location in Vienna (Austria), I always had the impression of watching decisions in Cambridge, Maynard and Boston from far, far away at the border of the Milky Way. ;-)

Response Rich Taylor · Aug 29, 2017 to Robert Cemper

Robert,

Great history lesson!  I have a question for you though.  As you were there at the beginning or close to it perhaps you might have some insight.  I came from a background in MultiValued databases (aka PICK, Universe, Unidata) joining InterSystems in 2008 when they were pushing Cache's ability to migrate those systems.
From the beginning I was amazed at the parallel evolution of both platforms. In fact, when I was preparing for my first interviews, having not heard of Cache before, I thought it was some derivative of PICK. Conceptually, MUMPS and PICK have a lot in common, differing in implementation of course. I have long harbored the belief that there had to be some common heritage: some white papers or other IP that influenced both. Would you have any knowledge of how the original developers of MUMPS arrived at the design concepts they embraced? Does the name Don Nelson ring a bell?

Thanks again for the history.

Response Robert Cemper Sep 10, 2017 to Rich Taylor

Hi Rich,
I remember we met several times at internal meetings and at Devcon / Summit. The common branch of M and MV might be "The Ubiquitous B-Tree" (1979) by Douglas Comer. On the other hand, in the mid-60s it was time to have something new to support creative and faster development. So they might quite well have taken ideas from each other, just as you find many language constructs that are pretty similar to Java.

Don Nelson didn't pass my way. But I have a personal gap from '85 to '99, when I was on a completely different road.

Response Athanassios Hatzis to Robert Cemper · Sep 9, 2017

Hi Robert, 
thank you for sharing this great piece of computer history with the rest of us. They say life moves in circles. I believe it's about time for MUMPS to make history again in database management and database modeling with the Associative Semiotic Hypergraph engine ( http://healis.eu/r3dm_project/post000109/ ), built on top of InterSystems Cache globals, and a powerful OOP API in Python for data analytics. Stay tuned. I am fond of the old pioneers of computer technology and I have great respect for their efforts and the struggles of their time. We build powerful meaningful relationships easily ;-)

Robert Cemper Sep 10, 2017 to Athanassios Hatzis

Thank you Athanassios! 
I see these cycles everywhere. Almost every relational DB today has its B-tree index. Well known here since DSM-11.
Similarly, when I did an evaluation of HBase and scratched a little bit under the surface, I found a tiny Global structure with limited subscripts.
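
One way to picture what the B-tree index and the "tiny Global structure" have in common: both keep composite keys in sorted order, so that everything below a given prefix can be read with one positioned scan. The Python sketch below illustrates only that idea under simplifying assumptions. The function name prefix_scan and the sample keys are hypothetical, a sorted list with binary search stands in for a real B-tree, and all subscripts are strings. Unlike a fixed-depth cell address, the tuples here may have any number of levels.

```python
from bisect import bisect_left


def prefix_scan(sorted_keys, prefix):
    """Return all keys starting with prefix; binary search finds the start."""
    start = bisect_left(sorted_keys, prefix)
    result = []
    for key in sorted_keys[start:]:
        if key[:len(prefix)] != prefix:
            break  # keys are sorted, so the matching run has ended
        result.append(key)
    return result


# Hypothetical subscript paths of different depths, kept in sorted order.
keys = sorted([
    ("orders", "2017", "09", "C44"),
    ("orders", "2017", "08", "A17"),
    ("patients", "1001"),
    ("orders", "2018", "01", "D01"),
    ("orders", "2017", "09", "B03"),
])

september = prefix_scan(keys, ("orders", "2017", "09"))
year_2017 = prefix_scan(keys, ("orders", "2017"))
```

Here september holds the two September 2017 orders and year_2017 all three orders of that year, each found without touching unrelated keys before the start position.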

You are right there is a lot of power left for new development on this base. 

Athanassios Hatzis Sep 10, 2017 to Robert Cemper

And have you noticed that whatever the model and data structure, we cannot escape from the fundamental principle of managing data allocation space with references, i.e. pointer-based logic and memory addressing? Isn't this the fundamental mechanism of programming languages too? The problem I see with all these modern NoSQL databases, especially graph databases, is that they provide a higher-level abstraction for the end developer but completely hide and lock access to the low-level storage and retrieval mechanism, including indexes. Even in key-value stores you cannot see or understand the sorting of indexes, and you cannot easily reference data values.

Transparency in computer science is a huge issue. Wizards and pioneers of computer hardware and software have created multiple abstraction layers, and here comes the next generation that is asked to program the machine without understanding what is going on underneath. And even if there is such a desire, the environment, the language and the tools DO NOT help in this direction. InterSystems Caché does make a difference in many respects. There is a built-in database with subscripted arrays and multi-dimensional keys, similar to the variables used by most programming languages to access main memory.
Let me repeat this, a programming language MUMPS-Cache objectscript with a built-in database. I think this is a fundamental aspect that they have been missing when others invented new programming languages. They are missing the innate common characteristic that both databases and programming languages share which is the pointer, reference based logic. So I believe it's time to return back and fix this for new generation databases AND post-modern programming languages too. What do you think ;-0

Rob Tweed Sep 11, 2017 to Athanassios Hatzis

"Let me repeat this, a programming language MUMPS-Cache objectscript with a built-in database. I think this is a fundamental aspect that they have been missing when others invented new programming languages. They are missing the innate common characteristic that both databases and programming languages share which is the pointer, reference based logic. So I believe it's time to return back and fix this for new generation databases AND post-modern programming languages too."

This is a core part of the QEWD.js project: to make JavaScript a first-class language for Global Storage databases - and therefore give JavaScript a built-in database.

The cache.node module provides the high-performance in-process connection needed to allow the intimate relationship between JavaScript and the Cache database engine. The ewd-document-store module aims to provide the JavaScript equivalent of the ^ in COS (i.e. blurring the distinction between in-memory and on-disk JavaScript objects).

JavaScript's dynamic, schemaless objects are a perfect fit with the dynamic, schemaless nature of Global Storage, making it an ideal modern substitute language instead of COS.
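
The fit between schemaless objects and subscripted storage can be illustrated with a short sketch. It is written in Python with nested dictionaries standing in for JavaScript objects, and it is not the cache.node or ewd-document-store API; the function names flatten and unflatten and the sample document are hypothetical. Each leaf value of a nested object becomes one node addressed by its path of keys, which is the shape a Global node has.

```python
def flatten(obj, prefix=()):
    """Yield (subscripts, value) pairs for every leaf of a nested dict."""
    for key, value in obj.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            yield from flatten(value, path)  # descend one subscript level
        else:
            yield path, value


def unflatten(nodes):
    """Rebuild the nested dict from (subscripts, value) pairs."""
    root = {}
    for path, value in nodes:
        target = root
        for key in path[:-1]:
            target = target.setdefault(key, {})
        target[path[-1]] = value
    return root


document = {
    "name": "Smith",
    "address": {"city": "Vienna", "country": "Austria"},
    "visits": {"2017-08-28": {"ward": "B2"}},
}

nodes = list(flatten(document))
restored = unflatten(nodes)
```

No schema is needed in either direction: the paths carry the structure. One limitation of this sketch is that empty dictionaries produce no nodes and are therefore lost in the round trip.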

For more information see the online tutorial at http://docs.qewdjs.com/qewd_training.html - specifically parts 17 - 27

Athanassios Hatzis Sep 12, 2017 to Rob Tweed

Hi Rob, thank you for the update on your QEWD project. One of the main reasons I chose Python rather than JavaScript as the binding language for Cache in our project is that in data analysis and data science Python is already well established and extremely popular, and there is big momentum behind developing a vast collection of tools and libraries that extend the language. But JavaScript clearly wins the battle in web platform development. I am sure I will revisit your project and perhaps ask you to collaborate when I reach the stage of developing the front end and/or another client API. For our readers I must also mention that your article on a universal NoSQL engine using Globals is a must-read for anyone who wants to understand the power of multi-dimensional, schema-free, hierarchically structured, sparse, dynamic arrays, i.e. Global Storage databases. And for the record, it appeared right at the birth (re-birth) of the NoSQL movement back in 2010 ;-)

Robert Cemper · Sep 10, 2017

You hit the point:

Transparency is important: not something developers are forced to use, but an offer to them that makes the underlying mechanics visible.