Open Source Commitment

Diving into Open Source requires making a commitment and can lead you to places you might not expect. Here are some lessons learnt on the job.

Philip K. Dick asked, "Do Androids Dream of Electric Sheep?".

With Digital Transformation, Industry 4.0 and the drive for Intelligent Automation, where skills in Information & Communications Technology (ICT) are seen as essential, is Open Source adoption also a key element of digital strategy?

In the world of "Digital":

  • do industry leaders dream of programming?
  • are software skills an essential capability? and
  • is Open Source adoption an essential element of digital strategy?

Your approach is likely to be dictated by a combination of priorities and skills, because while Open Source is seen by some as "free", its adoption requires significant commitment. Here is a summary of my personal journey with Open Source over the last few years.


The Context

There is a mantra in IT of "working software" over documentation, which gets erroneously interpreted as a lack of need for requirements and architecture. My response is that there should be "just enough". How much documentation is needed will depend on:

  • Domain knowledge of problem space that the development team has,
  • Complexity & Scale of task at hand and
  • Approach to project risk management.

As an enterprise architect I believe it is easy to over-specify and, if out of touch with current technology, to significantly under- or over-estimate the effort & cost of achieving the result.

I have seen first hand how organisations have struggled with the transition to software-centric "new ways of working" and strategy execution that includes: "cloud first", "leveraging and integrating Open Source (rather than using and integrating enterprise software packages)" and "having software and integration skills as a key competency".

Symptoms include: a large number of coaches (from the same suppliers they were looking to supplant), requests for suppliers to act as agents for their Open Source needs, a lot of sabre rattling within the IT leadership team while adoption of the strategy is always pushed a long way down the reporting chain, and failure to invest in the skills uplift of their own staff.

To counter this I believe that organisations should:

  • Organise IT systems ownership based on the operational processes and KPIs they support,
  • Adopt and contribute to Open Source directly and not by proxy,
  • Recognise that Software as a Service (SaaS) is the new commercial enterprise software,
  • Leverage Infrastructure as a Service (IaaS) based on cost benefits and
  • Ground teams in software practice and nurture T model leaders (breadth with depth in a specialist domain).

The Practice

In mid 2017 I got to the point where I had spent nearly 12 months immersed in future telecommunications architecture, built around Software Defined Networks (SDN) and Network Function Virtualisation (NFV). It became obvious that this next generation of networking was not a priority for an organisation that was focused on completing the build of its old generation of networks, so it was time for a move.

I moved to a new role with the belief that building next generation autonomic and programmable networks would require strong IT development skills, and that this should be done using a devops model to drive clear, operationally targeted goals.

Inventory and Telemetry are of central importance to realising the goals of automatic service design and closed loop automation. With a traditional TMForum (Fulfilment, Assurance, Billing) layered view of the world and using enterprise software (from the likes of Oracle, IBM, BMC, Amdocs, Nokia and Cisco) this is hard to achieve. The new stack for SDN, NFV and NFVi (NFV Infrastructure) included: Linux Kernel-based Virtual Machine (KVM), OpenStack, Kubernetes, Kafka, Contrail, Collectd, ArangoDB, Cassandra, Prometheus and lots of other Open Source technologies.

So there is a steep learning curve to take the strategy and architecture from paper to practice. To help I got a couple of multi-core server machines as I needed to validate and prove the NFVi telemetry architecture.

The telemetry architecture was defined around using the Collectd framework for Linux KVM hosts with Fluentd providing equivalent capabilities for Kubernetes based infrastructure monitoring.


The Lessons

Success with rapid software development depends on success with testing, so it is very important that your development team has the required environment available for testing. My starting point was to validate the use of Collectd to provide telemetry feeds to a central Kafka collection hub. This allows "agentless monitoring" of the running virtual machines (VMs) as the Collectd agent is running on the hypervisor host, not within the VM guest.

The architecture is outlined in my blog post "Trams, Telemetry, Agile & Architecture".
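As a rough illustration of the kind of record such a telemetry feed carries, here is a minimal Python sketch that flattens a collectd JSON payload into one row per data source. It assumes the feed is published in collectd's JSON output format; the sample host name, interface name and values are invented for illustration, and the helper names (identifier, flatten) are hypothetical, not part of collectd or Kafka.

```python
import json


def identifier(rec):
    """Build a collectd-style identifier: host/plugin[-instance]/type[-instance]."""
    plugin = rec["plugin"]
    if rec.get("plugin_instance"):
        plugin += "-" + rec["plugin_instance"]
    type_ = rec["type"]
    if rec.get("type_instance"):
        type_ += "-" + rec["type_instance"]
    return f"{rec['host']}/{plugin}/{type_}"


def flatten(payload):
    """Return (identifier, data source name, time, value) rows from a JSON payload."""
    rows = []
    for rec in json.loads(payload):
        ident = identifier(rec)
        # Each record carries parallel lists of data source names and values.
        for name, value in zip(rec["dsnames"], rec["values"]):
            rows.append((ident, name, rec["time"], value))
    return rows


# Invented sample record: interface counters for one guest, as seen from the host.
SAMPLE = """[{"values": [1024, 2048],
  "dstypes": ["derive", "derive"],
  "dsnames": ["rx", "tx"],
  "time": 1570000000.0,
  "interval": 10.0,
  "host": "vm-guest-01",
  "plugin": "virt",
  "plugin_instance": "",
  "type": "if_octets",
  "type_instance": "vnet0"}]"""

for row in flatten(SAMPLE):
    print(row)
```

In a real deployment the payload would come from a Kafka consumer rather than a string constant; that part is left out here to keep the sketch free of external dependencies.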

In initial testing I used the main KVM hosting machine as the test machine. This quickly proved to be non-viable, so the alternatives were to use the second host as a test machine or to establish nested virtualisation.
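Before relying on nested virtualisation it helps to confirm that the host has it switched on. The following is a minimal Python sketch, assuming a Linux KVM host where the kvm_intel or kvm_amd module exposes a "nested" parameter under /sys/module; the function names are hypothetical and the sys_root argument exists only so the check can be pointed at a test directory.

```python
from pathlib import Path


def flag_enabled(raw):
    """Interpret a kernel module boolean parameter ("Y"/"N" or "1"/"0")."""
    return raw.strip().lower() in ("y", "1")


def nested_status(sys_root=Path("/sys")):
    """Return {module: enabled} for each KVM module that exposes 'nested'."""
    status = {}
    for module in ("kvm_intel", "kvm_amd"):
        param = Path(sys_root) / "module" / module / "parameters" / "nested"
        if param.is_file():
            status[module] = flag_enabled(param.read_text())
    return status


# An empty result means neither KVM module is loaded on this machine.
print(nested_status())
```

If the flag is reported as disabled, it is normally enabled through a module option on the host, and the guest also needs the virtualisation CPU feature exposed to it.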

Using Ubuntu 18.04 I quickly hit a bug with collectd, so I took up the Open Source adopt and contribute model rather than taking a proxy stance. The result was the uncovering of a number of other bugs and the need to push the community very hard to get upstream updates back into the Ubuntu distribution (via Debian).

As part of bug discovery and testing I needed a FreeBSD testing environment.  So again I used KVM and nested virtualisation, but now with FreeBSD and its bhyve hypervisor.

The diversion to FreeBSD uncovered a further set of bugs in FreeBSD that had to be addressed to allow regression testing of "collectd" across Ubuntu and FreeBSD, as the "collectd" community had RedHat regression testing already covered.

In summary the work with collectd took over 9 months to flow from bug discovery to fix through to the Ubuntu 19.04 release, but a further 9 months to get various bugs fixed for the FreeBSD regression testing environment. The number of defects found and fixed included:

  • Collectd - 3 Bugs and 24 Commits (from May 2019 - Nov 2019) and Debian Packaging Push to upstream into Ubuntu
  • FreeBSD - 4 Bugs and 2 Fixes - one for KVM Networking and one for bhyve hypervisor
  • Libvirt - 1 Bug and 1 Fix - to address controlling FreeBSD bhyve hypervisor via Libvirt
  • Ubuntu - 1 Linux Kernel Bug and 1 Fix

So the venture into "collectd" expanded to quite a few additional projects and areas, each of which required delving into the code to find the source of bugs and fixes. The surprise was that there were significant bugs in quite mature software (even the Linux Kernel), as all this software is under continuous evolution. The other lesson was that the initial investment (in time / effort) in the testing framework was of equivalent size to the "working software" investment.

By taking the time to pinpoint and document a defect you will get good support from the community. Based on the number of bugs, it is obvious that some projects need more contributors and testers.

The time between starting and getting final "working software" flowing from upstream into releases can be longer than you might have initially expected.

The management of major Linux distributions like Ubuntu provides a very interesting illustration of large and complex CI/CD tool chains. Working on a kernel bug and getting collectd upstream updates flowing into the Ubuntu distribution provided an opportunity to see this in action.


The Working Software and a Bit More

Working on Open Source projects contributed to getting "working software" with: Ubuntu 20.04 now including an updated Collectd, Ubuntu 19.10 having a fixed kernel cursor issue, LibVirt providing better control of the bhyve hypervisor and FreeBSD 11.4 & 12.2 networking working out of the box with KVM.

But there were other useful outcomes including:

  • building relationships within Open Source community,
  • documenting findings and results to help others with similar issues and
  • keeping my own technical skills fresh across a number of areas.

The work and effort also highlights that working with Open Source requires commitment and persistence. It is unlikely that you or your organisation will be successful with Open Source without similar commitment.

As a professional does this make you more or less employable as an IT architect or strategist?

That is an open question, but I believe it does make you more informed and will give you more relevancy in today's IT environment. I strongly believe that there is no such thing as being over qualified, and that continuing to learn and exercise the mind is essential for long term mental health.


Final Thoughts

In the world of "Digital"

  • Do industry leaders dream of programming?

The S&P 500 now has Information Technology as the biggest industry constituent (27%) and when combined with Communications Services the total is 38% of market by industry.

But even if you look at the organisation you work for, you will find many leaders who are not programmers, though they might be realistic dreamers and they are likely to be looking to hire highly technically savvy architects and developers (develoopers). So there are still plenty of opportunities out there beyond programming.

This might be obvious to most but sometimes we IT people do live in a bit of a bubble.  It is good to bring some balance and perspective, as a good architect should ;-) .

The S&P 500 by Industry - IT & Communications is around 38% of Market

NOTE: Industry Breakdown from "New York Life Investments".

  • Are software skills an essential capability?

In IT and Communication, which are intrinsically engineering based, then yes, and more so as we move more and more into digital markets and API based eco-systems, which inevitably span across all industries (see the above S&P 500 breakdown and note that Amazon is in the "Consumer Discretionary" bucket).

  • Is Open Source adoption an essential element of digital strategy?

Adoption of Open Source is sound but it requires commitment; do not expect to get something for nothing. It is also good for your employees as it allows them to contribute to a larger universe and interact with a broader range of people in the industry. At the same time you get to leverage a much larger community than you would if you tried to go it alone. I believe that the Open Source communities now play the role of gathering point for common needs that was previously the domain of the enterprise application providers and their product managers. It is also important to recognise that there is a significant effort required to integrate and validate Open Source based solutions.

Most Software as a Service products are built from Open Source, so if you want Open Source with support then this is available through SaaS packaging. AWS and Azure have many of these available for easy consumption.


Thanks

I work as an Architect for Ericsson, which is leading the world in the development of 5G networks. Ericsson is both a producer of proprietary communications hardware and software and a large adopter of and contributor to Open Source projects. 5G is a great example of a software-centric product, being built as cloud native software to run on cloud infrastructure.

Ericsson is a company in the ICT business that sees a much broader picture of the role of communication in the world, and that perspective is part of its corporate DNA. The Ericsson leadership dreams of more than programming, but it also recognises that software is part of its core technical competency.

NOTE: This is a personal blog and does not represent in any way the views of Ericsson.


References & Links:

Manifesto Agile Software Development - a set of values and principles for the development of software. Do not get too hung up on the "manifesto" or, as someone once said to me, you need a strategy to catch a mouse and to build a house.

Accenture Industry X - we are now moving on from Industry 4.0

ETSI NFV Architecture - Lots of papers and specifications on Network Function Virtualisation that provided the blueprint for NFV and NFVi

TmForum Frameworx - The TmForum OpenAPIs still have strong foundations in the Frameworx models. Has had a huge impact on the way traditional OSS/BSS systems have been built for over 20 years. This is now changing with the move to Open Source, SDN, NFV & NFVi and APIs.

"Trams, Telemetry, Agile & Architecture" - my blog on NFVi Telemetry

IBM Autonomic Computing - Much of what we aim for with Telco Orchestration and Closed Loop Automation is consistent with what IBM defined as "Autonomic Computing" back in 2001. I just happen to remember it as I worked for IBM at the time.

Collectd github home - a framework for collecting metrics from anything ;-) with an architecture to send these to a central collection point. The starting point for my Open Source adventure.

Nested Virtualisation with KVM and Ubuntu - another problem to solve to get a working test environment for collectd.

Nested FreeBSD Virtualisation with bhyve - FreeBSD is a much less trodden path than Linux, especially with KVM as the host and bhyve acting as both guest and host in layer 1 of a nested hypervisor deployment. Found some bugs here, but the testing was much appreciated and bhyve guru Peter Grehan was extremely helpful in resolving these.

LibVirt - a hypervisor and emulation abstraction layer, which includes support for the FreeBSD bhyve hypervisor. Again very helpful community support in getting the bug fixed and flowing through to release. Thank you Roman Bogorodskiy.

FreeBSD - BSD Unix is the spiritual leader of Open Source. Unfortunately the FreeBSD project has a bit of a blind spot for KVM so this was a source of many bugs, all reported via Bugzilla. The most important being the one that broke networking with KVM. Finding the root cause of this took a long long time, but is now fixed and released in 11.4 and 12.2.

"New York Life Investments" - S&P 500 by Industry is available for purchase. Here is another view of the S&P 500 from the point of Market Capitalisation share rather than Industry, which shows 50% of market capitalisation is in: Apple, Amazon, Alphabet (Google), Microsoft and Facebook (ie the Tech Titans).

ASX By Sector - I did not use Australian Stock Exchange by sector in this analysis as it paints a vastly different picture.

Using AWS to support Security Scanning - my testing included using an AWS based security scanner, and security scanning should be part of CI/CD practice and validation.