Security and Responsibility
5 ways we protect your training mission and the data that drives it.
Let's face it: Even the most sophisticated performance analytics solution has little value if you aren't confident in the platform's security. At Metacog, security is foundational. We developed our advanced performance analytics tools and processes with layers of protocols designed to keep your data safe and your mission on track. Read on for the details. If you'd like more information on how we can support your training mission securely, we would be glad to meet.
AWS + 5 Layers of Security
We built Metacog to run within Amazon Web Services and take advantage of AWS’s Shared Responsibility model. In that model, AWS takes care of physical and infrastructure security (security of the Cloud), and Metacog takes care of application security (security in the Cloud). In addition, Metacog builds on this security model in five ways:
1. Services Over Servers
Many companies “stand up” virtual servers within a cloud provider. That’s a nice evolution from the days of managing one’s own server rooms, racks, networks, and power, but it’s still open to attack through misconfigured or neglected server OS updates and insecure application development practices. Such companies then layer security products on top, which paradoxically increases complexity and adds places where determined attackers can penetrate.
Metacog instead uses a set of managed services provided by AWS and secured by AWS’s IAM access model. You have to break AWS, not Metacog, to penetrate them. In particular, we use:
- Kinesis and Kinesis Firehose to ingest streaming data
- S3 for primary data storage
- DynamoDB for indexing and lookup of data in S3
- Lambda for transformation and serving models
- Managed Spark Clusters for batch processing and model building; we fetch data from S3 and return the results
- Secrets Manager to store service keys
- Cognito to manage Metacog licensee access (customer keys, logins, and single-sign-on integration with customer infrastructure)
- CloudWatch to log errors and auditable events of interest
The only server we created exposes the Metacog API itself (more on that below). This server doesn’t store data, and no one has console or root access to it (more on that below as well).
2. API-First Design
Every function in Metacog requires valid access keys; with no keys, nothing happens (other than an error message informing you of the fact). Without keys, you can’t even find (or gain access to) the data ingest endpoint. We build the API server as a minimal-attack-surface container. It’s just enough OS to run the Metacog API. Even we can’t log into the underlying server!
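The "no keys, nothing happens" rule can be illustrated with a minimal sketch. This is not Metacog's actual implementation: the key store `VALID_KEYS`, the `AccessDenied` exception, and the `require_key` and `handle_request` helpers are all hypothetical names chosen for illustration, and a real deployment would check keys against a managed identity service and not an in-memory dictionary.

```python
import hmac
from typing import Callable, Optional

# Hypothetical key store for illustration only. A real service would look
# keys up in a managed identity service, never in source code.
VALID_KEYS = {"demo-tenant": "demo-access-key"}


class AccessDenied(Exception):
    """Raised when a request has no key or a key that does not match."""


def require_key(tenant_id: str, presented_key: Optional[str]) -> None:
    """Reject the request unless the presented key matches the stored one."""
    expected = VALID_KEYS.get(tenant_id)
    if expected is None or presented_key is None:
        raise AccessDenied("valid access key required")
    # compare_digest compares in constant time, so response timing does not
    # reveal how much of the key was correct.
    if not hmac.compare_digest(expected.encode(), presented_key.encode()):
        raise AccessDenied("valid access key required")


def handle_request(tenant_id: str, presented_key: Optional[str],
                   action: Callable[[], str]) -> str:
    """Run the requested action only after the key check has passed."""
    require_key(tenant_id, presented_key)
    return action()
```

Every entry point goes through the same check, so a caller without a key receives only the error and learns nothing else about the service.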
We offer a minimalist User Interface on the Metacog API server (also accessed via API endpoints). This UI coordinates underlying Metacog API endpoints and demonstrates everyday use cases. Use it, or ignore it and build your own, as you see fit. Like everything else, this UI is secured by Cognito.
3. Segregated Semantics
Just as AWS has a shared responsibility model, so does Metacog. Ours is simpler: don’t send us any personally identifiable information (PII). Our license forbids it. We designed our data model around events. Events are grouped into sessions and associated with one or more learners and one or more widgets.
- Learners are people – please identify them with a blind ID (a GUID or other unique ID meaningful to you and stored in your other systems).
- Widgets are things (content, assessments, simulations) that learners interact with – please identify them with a blind ID (as unique as your psychometricians or learning scientists require).
- Sessions are bounded sets of events – use unique IDs (we’ll generate them if you don’t want to). These let you distinguish between learner attempts (or reuse them to join session segments you deliver over extended timeframes).
- Events are the data of interest generated by your content. When you instrument your content with Metacog’s Client Library, you can give events descriptive or coded names to meet your security needs.
The goal: even if you were careless with your keys, or someone broke AWS, the data alone isn’t sufficient to break the security around your training. Attackers would also need your learner rosters (from your external systems) and your training content (from your platform). They would need your dictionary of event names (from your documentation) as well if the terms aren’t self-evident. You see the pattern: shared responsibility.
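The data model described above can be sketched as follows. This is an illustrative assumption, not the schema of Metacog's Client Library: the `Event` class, its field names, the `new_blind_id` helper, the sample roster entry, and the coded event name `E17` are all hypothetical.

```python
import uuid
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


def new_blind_id() -> str:
    """Return a random GUID that carries no information about its owner."""
    return str(uuid.uuid4())


@dataclass
class Event:
    name: str               # descriptive or coded event name
    session_id: str         # bounded set of events, e.g. one learner attempt
    learner_ids: List[str]  # blind IDs only, never names or emails
    widget_ids: List[str]   # blind IDs for content, assessments, simulations
    data: Dict[str, Any] = field(default_factory=dict)


# The roster mapping real people to blind IDs stays in your own systems.
roster = {"Jane Doe": new_blind_id()}

event = Event(
    name="E17",  # a coded name; its meaning lives in your documentation
    session_id=new_blind_id(),
    learner_ids=[roster["Jane Doe"]],
    widget_ids=[new_blind_id()],
    data={"value": 3},
)

# Only this payload would leave your environment.
payload = asdict(event)
```

Nothing in `payload` identifies the learner or the content. Without the roster and the event-name dictionary, both of which remain with you, the payload is a set of opaque IDs and values.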
4. Immutable Infrastructure & Infrastructure-as-code
We mentioned above that we build the Metacog API server with a minimal attack surface. There is no console, no login, and no other access beyond the Metacog API web service itself. How do we maintain or patch it? We don’t! We destroy it and build a new API server from scratch every time we improve Metacog (or respond to AWS security advisories about the runtime or OS).
Consistent with this desire not to introduce security risks through manual fiddling, we avoid manual changes across the entire Metacog infrastructure. We build and deploy everything with automated, scripted pipelines (also provided by AWS services). Automation ensures that we create (or rebuild) Metacog with only approved components and configurations. Our team experiments in “dev” environments when exploring ideas for improving Metacog, but “stage” and “production” environments are controlled and updated systematically.
5. Tenant Isolation
As a multi-tenant SaaS platform, Metacog takes care to isolate tenants (our customers) from one another. We use Cognito and IAM permissions to ensure your neighbors can’t see your content. We also defend against the “noisy neighbor” problem, whereby one tenant’s heavy use degrades another tenant’s performance. See this AWS whitepaper if you’re interested in learning more about the strategies we’ve employed to achieve this.
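One common defense against the noisy-neighbor problem is a per-tenant token bucket, sketched below. This is a generic technique shown for illustration, not a description of how Metacog implements its limits; the `TenantRateLimiter` class and its parameters are hypothetical.

```python
import time
from typing import Callable, Dict


class TenantRateLimiter:
    """Token bucket per tenant: each tenant spends only its own budget."""

    def __init__(self, rate_per_sec: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_sec   # tokens refilled per second
        self.burst = burst         # maximum tokens a tenant can hold
        self.clock = clock         # injectable so the limiter can be tested
        self._tokens: Dict[str, float] = {}
        self._last: Dict[str, float] = {}

    def allow(self, tenant_id: str) -> bool:
        """Return True if this tenant may make a request right now."""
        now = self.clock()
        tokens = self._tokens.get(tenant_id, float(self.burst))
        last = self._last.get(tenant_id, now)
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        self._last[tenant_id] = now
        if tokens >= 1:
            self._tokens[tenant_id] = tokens - 1
            return True
        self._tokens[tenant_id] = tokens
        return False
```

Because each tenant has a separate bucket, a tenant that exhausts its own tokens is throttled while requests from other tenants continue to be accepted.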
Note that if a client has high-security needs (e.g., AWS GovCloud or AWS Secret Region), we can deploy a separate copy of Metacog for that client. Immutable infrastructure and infrastructure-as-code support rapid replication of Metacog when it is essential. Unfortunately, a separate infrastructure loses the cost benefits of a shared SaaS platform.