Massive Content, Validation & Serverless: Cloud Expo 2016 Recap

The Cloud Expo was held June 7-9, 2016 in New York City, and Iron.io sent a team to present our vision for the future, collaborate with other attendees and answer questions. Below is a summary of three technical sessions representative of the Containers track at the conference:

Scaling Massive Content Stores in the Cloud

John Newton, Alfresco Founder & CTO

Alfresco manages the most important content: documents and web pages that are regulated, secure and require a set of controls around the content. A billion documents is quite a bit. An enterprise content management system tends to handle a million documents or more. Companies have information problems well in excess of petabytes of data.

What are Enterprises Doing to Create That Much Data?

  • Medical and personnel records. Each of us in our lifetime generates thousands of documents.
  • Transportation and logistics. FedEx stores about a billion slips each year.
  • Government Records & Archives
  • Claims & Case Processing
  • Research & Analysis
  • Real-time Video. Police forces around the world are storing video of everything that happens on a daily basis.
  • Internet of Things
  • Discovery & Litigation
  • Loans & Processes

At Alfresco, we call these things “content” — some may call them files, or documents. We want to store not just the documents and files but the metadata associated with them: Security, relationships between content, authorization and authentication, full-text search, multiple types of representation. This rich data model is stored not just in the database but in a full-text index.

Traditional Approaches

If we look at how traditional software has dealt with document storage, we’ve seen poor practices as far as redundancy, scalability, geographic distribution, and the lack of agility involved in setting up and modifying repositories; even the setup process alone can take years.

Next Generation Relational Approaches

Amazon has created a version of MySQL that’s designed to take advantage of Amazon’s infrastructure, using as much memory as possible and distributed geographically. I can appreciate the no-fail architecture they have created. There’s always a constant backup, and each of the nodes is constantly checking on the others through a quorum algorithm to detect any failed node. This leads to a resilient database infrastructure — and you can do this with any system.
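
The quorum idea can be sketched in a few lines. This is only an illustration of majority-based failure detection, not Amazon’s actual implementation; the function name `failed_nodes` and the shape of the `reports` dictionary are assumptions made for the example.

```python
from typing import Dict, Set


def failed_nodes(reports: Dict[str, Dict[str, bool]]) -> Set[str]:
    """Return nodes that a strict majority of their peers report as unreachable.

    reports[observer][target] is True when `observer` could reach `target`.
    (Hypothetical data shape, chosen for this sketch.)
    """
    nodes = set(reports)
    failed = set()
    for target in nodes:
        peers = nodes - {target}
        # Count the peers that could not reach the target.
        down_votes = sum(1 for p in peers if not reports[p].get(target, False))
        # A node is declared failed only when more than half of its peers agree.
        if peers and down_votes > len(peers) / 2:
            failed.add(target)
    return failed


# Three nodes: "a" and "b" can reach each other, but neither can reach "c".
reports = {
    "a": {"b": True, "c": False},
    "b": {"a": True, "c": False},
    "c": {"a": True, "b": True},
}
print(failed_nodes(reports))
```

Requiring a majority means a single node with a flaky network link cannot get a healthy peer declared dead on its own.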

It’s a lot cheaper and more effective to build out redundant enterprise CMSes this way; we’ve found a 10x improvement in performance. You don’t have to build a huge underground store for all your company’s information.

How to Transform Your Cloud Validation Strategy from Cloudy to Clear

Vandana Viswanathan, Associate Director, Process and Quality Consulting, Cognizant

As applications are moving from on-prem to the cloud, there are questions about being able to maintain the same level of compliance and face inspections with the same level of rigor as our current on-prem applications.

Cloud Hosting Overview & Challenges

We’re all familiar with the IaaS, PaaS and SaaS cloud service delivery models. In an IaaS model, the provider takes responsibility for the virtualization and hardware; in PaaS the provider also takes responsibility for the platform; and in the SaaS model the application is the responsibility of the provider as well.
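
The split of responsibilities can be written down as a small lookup. This is a simplified sketch: the four layer names and the `responsibilities` helper are assumptions for illustration, and real providers divide the stack more finely.

```python
# Stack layers from bottom to top (simplified, hypothetical naming).
LAYERS = ["hardware", "virtualization", "platform", "application"]

# How many layers, counted from the bottom, the provider manages in each model.
PROVIDER_MANAGED = {"IaaS": 2, "PaaS": 3, "SaaS": 4}


def responsibilities(model: str) -> dict:
    """Split the stack into provider-managed and client-managed layers."""
    cut = PROVIDER_MANAGED[model]
    return {"provider": LAYERS[:cut], "client": LAYERS[cut:]}


for model in PROVIDER_MANAGED:
    print(model, responsibilities(model))
```

Whatever lands on the client side of the split is what the client still has to validate itself.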

Clients run into issues with cloud adoption, such as:

  • Regulatory compliance (e.g. SOX, FDA, HIPAA)
  • Data security and privacy (PHI)
  • Reliability and performance
  • Governance and change management
  • Legacy IT
  • Existing investment in on-prem IT
  • Lack of resources/expertise

Regulated, Non-Regulated and Business Critical Applications

Regulated applications must meet regulatory requirements, such as SOX, HIPAA and GxP for banking and financial services, health care, and life sciences, respectively.

Additionally, business and software requirements need to be captured and tested based on business risks. Specifically, for regulated applications, we will need to check for impacts on patient safety, product quality, good lab/manufacturing practices, electronic records and signatures regulations, and FDA inspections.

Cloud Provider Assessment

You have to assess whether your cloud provider’s SDLC is on par with the client’s SDLC. You need to perform a gap analysis between the two and give the client a report with recommended remediation alternatives. At that point, you’ll end up at the contract stage with the cloud provider.

This assessment will be done on security & privacy policies & controls, operating procedures, SDLC and inspection support. Inspection support is important, because the cloud provider needs to be able to provide sufficient documentation to an external inspector, such as the FDA, to prove that the application is secure.
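
At its core, a gap analysis like this is a comparison of two checklists. The sketch below assumes each assessment area maps to a set of named controls; the control names and the `gap_analysis` function are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Set


def gap_analysis(
    client_required: Dict[str, Set[str]],
    provider_offered: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """List, per assessment area, the controls the client requires but the provider lacks."""
    gaps = {}
    for area, required in client_required.items():
        missing = required - provider_offered.get(area, set())
        if missing:
            gaps[area] = sorted(missing)
    return gaps


# Hypothetical control names, for illustration only.
client = {
    "security & privacy": {"encryption at rest", "access reviews"},
    "SDLC": {"code review", "validation testing"},
    "inspection support": {"audit trail export"},
}
provider = {
    "security & privacy": {"encryption at rest", "access reviews"},
    "SDLC": {"code review"},
}

for area, missing in gap_analysis(client, provider).items():
    print(f"{area}: missing {', '.join(missing)}")
```

Each reported gap would then be paired with a remediation alternative in the report to the client.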

Achieving a Serverless Developer Experience

Ivan Dwyer – Head of Business Development, Iron.io @fortyfivan

We’re in a very interesting time right now of digital transformation, where businesses have to deliver continuous innovation to customers, and developers thereby have to deliver continuous innovation to their companies. Developer teams are under a lot of pressure. We demand faster time-to-market, shorter release cycles, and resilient systems at scale. How can we empower our developers to be more productive, but do it in a sustainable way within reasonable bounds?

Microservices, if architected properly, enable independent workloads, which allows us to move away from the tightly-coupled architectures of the past. In addition, containers now enable a portable runtime. Pair that with DevOps best practices around automation and reactive architecture and you enable a serverless developer experience.

Defining Serverless

Here’s an attempt to define a serverless architecture:

An application architecture that enables single purpose jobs
to be packaged independently as portable units of compute
that execute through a fully automated pipeline
when a predetermined event occurs.
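
A toy version of that definition, as a sketch: single-purpose jobs are registered against event types and run when a matching event is emitted. The `EventRouter` class and the event name are assumptions made for the example. Everything here runs in one process, whereas a real platform would run each job as its own container through an automated pipeline.

```python
from typing import Callable, Dict, List

Job = Callable[[dict], object]


class EventRouter:
    """Maps event types to the single-purpose jobs that should run for them."""

    def __init__(self) -> None:
        self._jobs: Dict[str, List[Job]] = {}

    def on(self, event_type: str) -> Callable[[Job], Job]:
        """Decorator that registers a job for an event type."""
        def register(job: Job) -> Job:
            self._jobs.setdefault(event_type, []).append(job)
            return job
        return register

    def emit(self, event_type: str, payload: dict) -> list:
        """Run every job registered for the event and collect the results."""
        return [job(payload) for job in self._jobs.get(event_type, [])]


router = EventRouter()


@router.on("image.uploaded")  # hypothetical event name
def make_thumbnail(payload: dict) -> str:
    # A single-purpose job: it does one thing and knows nothing about other jobs.
    return f"thumbnail for {payload['name']}"


results = router.emit("image.uploaded", {"name": "cat.png"})
print(results)
```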

Common Use Cases

Five years of running our platform in production has allowed us to see a lot of use cases.

  • Back-end transactions that you want to remove from the user experience, such as credit card processing, sending an email, and bots.
  • Batch processing, such as multiple file processing, crunching a big data set, or multi-step workflows.
  • Data pipelines, such as ETL pipelines, processing machine data and custom CI/CD pipelines.
  • Scheduled jobs, which are still a staple of many applications, including daily notifications and database scrubs.

The Serverless Developer Experience with Iron.io

After building your single-purpose function as a Docker image, you upload it to an image registry, set your event triggers, configure runtime parameters, and inspect the processing results. There’s no need to provision the resources, configure the system or manage the components.
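
As a rough idea of what such a single-purpose function might look like before it is packaged into an image, here is a minimal entry point. It assumes the platform hands the job its input as a JSON file whose path is given in an environment variable; the name `PAYLOAD_FILE`, the payload fields and the `handle` function are assumptions for this sketch, not a description of the Iron.io API.

```python
import json
import os
import sys


def load_payload(env=os.environ) -> dict:
    """Read the job's input from the file named by PAYLOAD_FILE (assumed variable name)."""
    path = env.get("PAYLOAD_FILE")
    if not path:
        return {}
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def handle(payload: dict) -> dict:
    """The single-purpose job itself: here, a stand-in for sending an email."""
    recipient = payload.get("to", "nobody")
    return {"status": "sent", "to": recipient}


def main() -> None:
    # Write the result to stdout so it can be inspected after the run.
    json.dump(handle(load_payload()), sys.stdout)


if __name__ == "__main__":
    main()
```

Keeping `handle` free of any platform details makes the job easy to test locally before it is built into an image.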

White Paper

For a deep dive into the growing serverless landscape, download Ivan’s white paper: Serverless Computing: Developer Empowerment Reaches New Heights