
Developer Documentation

# Let's Data : Focus on the data - we'll manage the infrastructure!

Cloud infrastructure that simplifies how you process, analyze and transform data.

Security

We've built #Let's Data to adhere to established software security principles. Our code follows the principle of least privilege and is granted access only to the resources it needs. For the purposes of a security discussion, the #Let's Data components fall into three categories:

  1. Data Plane Components: Data plane components are the entities that have access to the Dataset's data. These are currently limited to the DataTask Lambda Function, CreateDataset API and the ViewErrors API.
    • The DataTask function is responsible for processing the Dataset's data and invoking the user's data handlers.
    • The CreateDataset API does deep validation on the dataset's read destination to make sure it has access to the user's data.
    • The ViewErrors API captures the user's data that has errors and stores it as part of the error documents.
  2. Control Plane Components: These are components responsible for the setup, management and deletion of the dataset's execution. For example, control plane components would be responsible for the dataset's initialization, creating the execution environment, scaling resources as needed, emitting logs and metrics, metering usage for billing and deletion of the execution environment.
  3. Let's Data APIs: #Let's Data APIs are the API implementations that our website, our CLI and, possibly in the future, our control plane SDK interface with. Access to them is secured with standard OAuth. In execution, each API has its own security configuration that limits its resource access to what it needs. For example, the ViewLogs API does not have access to the dataset's metrics or errors, and vice versa.

The overall implementation philosophy for #Let's Data's handling of the user's data in the read, write and error destinations is to limit access to the entities that need it and to build in controls that prevent accidental access. Only the data plane components listed above, running in the context of the dataset's user, can elevate and access the user's data. Control plane components and the other #Let's Data APIs are configured without these user data permissions, so they cannot elevate, even accidentally.
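The separation described above can be sketched in Python. This is an illustration of the idea only, not the actual #Let's Data configuration: the component names come from the text, while the permission labels (such as `user_data:read`) and the `can_access_user_data` helper are hypothetical.

```python
# Hypothetical permission sets per component. The labels are illustrative
# only; they are not real IAM actions or the actual #Let's Data setup.
COMPONENT_PERMISSIONS = {
    # Data plane components: may touch the dataset's user data.
    "DataTask": {"user_data:read", "user_data:write", "errors:write"},
    "CreateDataset": {"user_data:read"},
    "ViewErrors": {"errors:read"},
    # Control plane and other APIs: no user data permissions at all.
    "ControlPlane": {"resources:manage", "metrics:write", "logs:write"},
    "ViewLogs": {"logs:read"},
}

# Permissions that expose the user's data (also hypothetical labels).
USER_DATA_PERMISSIONS = {
    "user_data:read",
    "user_data:write",
    "errors:read",
    "errors:write",
}


def can_access_user_data(component: str) -> bool:
    """Return True if the component holds any user data permission."""
    granted = COMPONENT_PERMISSIONS.get(component, set())
    return bool(granted & USER_DATA_PERMISSIONS)
```

Because the control plane entries simply do not contain any user data permission, a check such as `can_access_user_data("ControlPlane")` is false by construction rather than by convention.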

We'd be happy to share additional details about our security to allay any concerns you may have. Contact us at support@letsdata.io.

Backup, Recovery, Security Monitoring and Recommendations

Backups

  • #Let's Data enables automated backups for resources in the LetsData resource location (where applicable) and recommends that customers enable automated backups of resources in customer accounts.
  • #Let's Data DynamoDB Write Connectors (resourceLocation: LetsData) are enabled with Point In Time Restore (PITR) continuous backups, with restore points available from 5 minutes up to 35 days in the past.
  • #Let's Data S3 Write Connectors and S3 Error Connectors (resourceLocation: LetsData) are backed up hourly with AWS Backup, and the backups are retained for 7 days.
  • #Let's Data SQS and Kinesis Write Connectors aren't databases and store data for a limited time, so these are not backed up. (This seems consistent with AWS, where automated backups for these services do not appear to be available.)
  • We recommend that customers enable DynamoDB and S3 backups for write connectors with resourceLocation: Customer.
  • Logs, metrics and trace data are retained for 7 days and are deleted when the dataset is deleted.
  • Control data, system actions, zombie records, and billing and usage data are retained and are not deleted. (However, realistically, retention beyond 35 days should not be expected.)
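As a worked example of the PITR figures above, the following Python sketch computes the restorable window from the two numbers stated in the text (5 minutes and 35 days). The function names are hypothetical, and on a real table the window also depends on when continuous backups were enabled, so treat this as arithmetic under stated assumptions rather than a description of the service's behavior.

```python
from datetime import datetime, timedelta, timezone

# Figures taken from the text above: restore points are available from
# 5 minutes ago back to 35 days ago.
LATEST_OFFSET = timedelta(minutes=5)
EARLIEST_OFFSET = timedelta(days=35)


def pitr_window(now: datetime) -> tuple:
    """Return (earliest, latest) restorable timestamps relative to `now`."""
    return now - EARLIEST_OFFSET, now - LATEST_OFFSET


def is_restorable(target: datetime, now: datetime) -> bool:
    """Check whether `target` falls inside the restorable window."""
    earliest, latest = pitr_window(now)
    return earliest <= target <= latest


# Example with a fixed reference time so the result is reproducible.
now = datetime(2023, 5, 16, 12, 0, tzinfo=timezone.utc)
print(is_restorable(now - timedelta(days=10), now))    # True
print(is_restorable(now - timedelta(days=40), now))    # False
print(is_restorable(now - timedelta(minutes=1), now))  # False
```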

Recovery

  • We run backup and restore tests (and resiliency tests) every 90 days to make sure our backup and restore processes are up to date.
  • While we do not call these SLAs (we do not track them or hold ourselves accountable for breaches), our internal calculations suggest that if an unplanned data loss incident occurs and is detected within the DynamoDB PITR and S3 backup retention windows, the maximum data loss for resourceLocation: LetsData should be up to 4 hours.
  • In case of an unplanned data loss incident, we expect to have the service back up within 24 to 48 hours in the worst case. This depends heavily on the type of incident, but if everything is down and we need to rebuild everything from backups, it might take 24 to 48 hours. Again, this is not an SLA but our internal estimate.
  • Date of Last Backup and Restore Test: 5/16/2023

Security Monitoring

  • We've configured our AWS accounts following AWS best practices and continuously run security monitoring infrastructure such as AWS Security Hub, AWS Trusted Advisor, AWS Config and Amazon GuardDuty to detect and act on high-priority security issues.
  • We follow AWS best practices for IAM permissions: we use IAM roles to grant temporary credentials for tasks, configure each process's security perimeter, and use additional identifiers such as externalId.
  • Our access and IAM model is reviewed at every major release (approximately every 90 days) for security threats and resilience testing.
  • We're working with AWS Solutions Architects on a Foundational Technical Review to make sure our service meets AWS best practices (in progress).
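The externalId practice mentioned above can be illustrated with a short Python sketch that builds an IAM role trust policy requiring an external ID on `sts:AssumeRole`. The policy shape follows the standard AWS pattern; the `build_trust_policy` helper and the example account ID and external ID are assumptions for illustration and do not reflect the actual #Let's Data roles.

```python
import json


def build_trust_policy(trusted_account_id: str, external_id: str) -> dict:
    """Build a role trust policy that only allows sts:AssumeRole calls
    from the trusted account when they present the expected external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


# Placeholder values; substitute the real account ID and external ID.
policy = build_trust_policy("111122223333", "example-external-id")
print(json.dumps(policy, indent=2))
```

A caller that assumes the role without the matching external ID is rejected, which guards against a third party being tricked into using its access on someone else's behalf (the "confused deputy" problem).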

Customer Recommendations

  • Enable DynamoDB PITR for any DynamoDB tables in resourceLocation: Customer
  • Enable AWS Backup for any S3 buckets in resourceLocation: Customer
  • Leverage resourceLocation: LetsData where possible to get automated backups and recovery enabled.
  • Understand the shared responsibility model (https://docs.aws.amazon.com/whitepapers/latest/aws-risk-and-compliance/shared-responsibility-model.html) when it comes to AWS and the #Let's Data managed service built on AWS:
    • See the 'Customer Responsibility in the Cloud' section for any resourceLocation: Customer resources or additional resources in customer AWS accounts.
    • Understand that the customer plays an integral part in the security of their data in #Let's Data resources. We are only as secure as the customer's security practices!
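For the DynamoDB PITR recommendation above, a minimal Python sketch might look like the following. It assumes a boto3 DynamoDB client is passed in by the caller; `update_continuous_backups` and `describe_continuous_backups` are existing boto3 client calls, while the helper names and the table name in the usage comment are placeholders.

```python
def enable_pitr(dynamodb_client, table_name: str) -> dict:
    """Enable point-in-time recovery on one DynamoDB table."""
    return dynamodb_client.update_continuous_backups(
        TableName=table_name,
        PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
    )


def pitr_status(dynamodb_client, table_name: str) -> str:
    """Return the table's PITR status as reported by DynamoDB."""
    response = dynamodb_client.describe_continuous_backups(TableName=table_name)
    description = response["ContinuousBackupsDescription"]
    return description["PointInTimeRecoveryDescription"]["PointInTimeRecoveryStatus"]


# Usage (requires boto3 and AWS credentials; the table name is a placeholder):
#   import boto3
#   client = boto3.client("dynamodb")
#   enable_pitr(client, "my-customer-table")
#   print(pitr_status(client, "my-customer-table"))
```

Passing the client in, rather than creating it inside the helpers, keeps the region and credential choices with the caller and makes the functions easy to exercise against a stub.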