
Data model

Edoardo Rosa edited this page Sep 10, 2022 · 4 revisions

Data Representation

A graph representation of the cloud ecosystem appeared to be a good fit for the project because the data model allows us to create simple entities and focus on the relationships between them: we need to find a path from a resource X to a resource Y.

Without knowing graphs or graph theory, the immediate and intuitive way to represent "how things work" is with circles and arrows (nodes and edges): X is somehow connected to Y. The focus is on how X and Y are connected, because those connections represent the distinctive features of X and Y.

The modelling of cloud resources and services can be simplified using nodes and edges. The model was designed with the following principles in mind:

  • easy to implement
  • easy to understand
  • names must be self-explanatory
  • must allow cycles to perform recursive tasks and deep analysis
  • must allow search by aggregation and keywords
  • must allow non-technical users to understand the relationships

The following schema is the result of the data model design:

Schema creation queries
// Node types
MERGE (u:User{name: "User"})
MERGE (r:Role{name: "Role"})
MERGE (g:Group{name: "Group"})
MERGE (p:Policy{name: "Policy"})
MERGE (a:Action{name: "Action"})
MERGE (s:Service{name: "Service"})
MERGE (v:Vpc{name: "VPC"})

// Relationships; the from/to properties record the cardinality (1 to N, or 1 to 1)
MERGE (u)-[:MEMBER_OF{from: 1, to: "N"}]->(g)
MERGE (u)-[:HAS_POLICY{from: 1, to: "N"}]->(p)
MERGE (r)-[:HAS_POLICY{from: 1, to: "N"}]->(p)
MERGE (g)-[:HAS_POLICY{from: 1, to: "N"}]->(p)
MERGE (p)-[:ALLOWS{from: 1, to: "N"}]->(a)
MERGE (a)-[:ON{from: 1, to: "N"}]->(s)
MERGE (a)-[:ON{from: 1, to: "N"}]->(u)
MERGE (a)-[:ON{from: 1, to: "N"}]->(g)
MERGE (a)-[:ON{from: 1, to: "N"}]->(p)
MERGE (a)-[:ON{from: 1, to: "N"}]->(r)
MERGE (s)-[:NETWORK{from: 1, to: "N"}]->(v)
MERGE (v)-[:LINKED{from: 1, to: "N"}]->(v)
MERGE (s)-[:USES{from: 1, to: 1}]->(r)
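
To illustrate the "find a path from X to Y" goal on the schema above, here is a minimal Python sketch. It is not part of the project: the `SCHEMA_EDGES` list simply mirrors the relationships in the schema queries, and `find_path` is a hypothetical helper that runs a breadth-first search. The `visited` set is what keeps the search from looping forever on cycles such as Action -> User -> Policy -> Action.

```python
from collections import deque

# Relationships copied from the schema creation queries (source, type, target).
SCHEMA_EDGES = [
    ("User", "MEMBER_OF", "Group"),
    ("User", "HAS_POLICY", "Policy"),
    ("Role", "HAS_POLICY", "Policy"),
    ("Group", "HAS_POLICY", "Policy"),
    ("Policy", "ALLOWS", "Action"),
    ("Action", "ON", "Service"),
    ("Action", "ON", "User"),
    ("Action", "ON", "Group"),
    ("Action", "ON", "Policy"),
    ("Action", "ON", "Role"),
    ("Service", "NETWORK", "Vpc"),
    ("Vpc", "LINKED", "Vpc"),
    ("Service", "USES", "Role"),
]


def find_path(start, goal):
    """Return the shortest list of (source, type, target) hops, or None."""
    queue = deque([(start, [])])
    visited = {start}  # prevents endless loops on cyclic relationships
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for source, rel, target in SCHEMA_EDGES:
            if source == node and target not in visited:
                visited.add(target)
                queue.append((target, path + [(source, rel, target)]))
    return None


# Hypothetical usage: how can a User reach a Vpc?
for source, rel, target in find_path("User", "Vpc"):
    print(f"({source})-[:{rel}]->({target})")
```

In a graph database the same question would normally be answered by a path query; the sketch only shows why directed, typed edges make it straightforward.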

A Node represents a single IAM User, Group, Role, Policy, Service instance, VPC or Action in a cloud account (at the moment only AWS is supported). Each node stores the configuration and data of the object it represents: for example, an EC2 instance node also contains information such as the InstanceId, the instance profile in use, the VPC configuration, the creation date, etc.

  • Role: a Node representing an AWS role or instance profile
  • User: a Node representing an AWS user
  • Group: a Node representing an AWS group
  • Policy: a Node representing an AWS Policy, either attached or inline
  • Action: a Node representing a single instance of a permission defined in a Policy
  • Service: a Node representing an instance of a supported AWS service (e.g. EC2, S3 bucket, RDS, etc.); each Service node is identified by the name of the service
  • Vpc: a Node representing an AWS VPC

An Edge represents a relationship between two Nodes:

  • MEMBER_OF: relationship between a User and a Group; the User is a member of one or more Group nodes
  • HAS_POLICY: relationship between a Role, a User or a Group node and a Policy node; the IAM principal or group has one or more inline or attached policies
  • ALLOWS: relationship between a Policy and an Action (e.g. "iam:PassRole"); the Policy defines one or more Actions in its policy document
  • ON: relationship between an Action and one or more Service, Role, User, Group or Policy nodes; the resources the permission applies to inside AWS
  • USES: relationship between a Service and a Role; the role that is used inside that service (e.g. the instance profile of an EC2 instance)
  • NETWORK: relationship between a Service and one or more VPC nodes; the VPCs that the service uses
  • LINKED: relationship between two VPC nodes; the VPC peering configuration between the VPCs
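
As an illustration of how these edges combine, the following Python sketch collects the Actions a User ends up with by following MEMBER_OF, HAS_POLICY and ALLOWS. The node names (`alice`, `admins`, `read-only`, `pass-role`) and the helper functions are made up for the example and do not come from the project.

```python
# Hypothetical edges: (source node, relationship type, target node).
EDGES = [
    ("alice", "MEMBER_OF", "admins"),
    ("alice", "HAS_POLICY", "read-only"),
    ("admins", "HAS_POLICY", "pass-role"),
    ("read-only", "ALLOWS", "s3:GetObject"),
    ("pass-role", "ALLOWS", "iam:PassRole"),
]


def targets(source, rel):
    """Nodes reachable from `source` through one edge of type `rel`."""
    return [dst for src, kind, dst in EDGES if src == source and kind == rel]


def allowed_actions(user):
    """Actions granted to a user directly or through group membership."""
    # The user itself plus every group it is a MEMBER_OF.
    principals = [user] + targets(user, "MEMBER_OF")
    # Every policy attached to any of those principals.
    policies = [p for who in principals for p in targets(who, "HAS_POLICY")]
    # Every action ALLOWED by any of those policies, de-duplicated and sorted.
    return sorted({a for p in policies for a in targets(p, "ALLOWS")})


print(allowed_actions("alice"))
```

The sketch ignores Deny statements, conditions and the ON targets of each Action; it only shows the User -> Group -> Policy -> Action traversal that the edge types make possible.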

Node properties

The following lists show all the properties available for each node created in the database. As a rule of thumb, each property is a flattened version (using the _ character as a separator) of the output of the AWS APIs.
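
The flattening rule can be sketched in Python as follows. This is an assumption about the naming scheme, inferred from property names such as `BlockDeviceMappings_Ebs_VolumeId_0` and `Policy_Statement_Action_0_2` (read here as action 0 of statement 2): dictionary keys are joined with `_` and list indexes are appended at the end, innermost first. The `flatten` function is hypothetical and the project's real implementation may differ.

```python
from typing import Any, Dict, Tuple


def flatten(value: Any,
            keys: Tuple[str, ...] = (),
            indices: Tuple[int, ...] = ()) -> Dict[str, Any]:
    """Flatten nested dicts and lists into a single-level dict."""
    if isinstance(value, dict):
        flat: Dict[str, Any] = {}
        for key, child in value.items():
            flat.update(flatten(child, keys + (str(key),), indices))
        return flat
    if isinstance(value, list):
        flat = {}
        for position, child in enumerate(value):
            # Assumed convention: the innermost list index comes first.
            flat.update(flatten(child, keys, (position,) + indices))
        return flat
    # Scalar: join the key path, then append the collected list indexes.
    name = "_".join(keys + tuple(str(i) for i in indices))
    return {name: value}


# Made-up fragment shaped like an EC2 API response.
instance = {
    "InstanceId": "i-0123456789abcdef0",
    "SecurityGroups": [{"GroupId": "sg-0a1b2c3d", "GroupName": "default"}],
    "Tags": [{"Key": "env", "Value": "test"}],
}
for name, value in flatten(instance).items():
    print(name, "=", value)
```

With this input the sketch yields names such as `SecurityGroups_GroupId_0` and `Tags_Key_0`, matching the style of the property lists on this page.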

Properties by node that can be used to filter/search resources in queries:

Role
  • RoleId
  • RoleName
  • Path
  • Description
  • InstanceProfileArn
  • Arn
  • IamInstanceProfileId
  • AssumableBy
User
  • Arn
  • MFAStatus
  • UserId
  • PasswordLastChanged
  • UserName
  • PasswordEnabled
Group
  • CreateDate
  • GroupId
  • Path
  • GroupName
  • Arn
Policy
  • Type
  • Name
  • Arn
Action
  • Action
  • Service
S3
  • ACL_Grantee_DisplayName_0
  • ACL_Grantee_ID_0
  • ACL_Grantee_Type_0
  • ACL_Grantee_URI_0
  • ACL_Permission_0
  • CreationDate
  • Encrypted
  • Name
  • Policy_Id
  • Policy_Statement_Action_0
  • Policy_Statement_Action_0_0
  • Policy_Statement_Action_0_2
  • Policy_Statement_Action_1
  • Policy_Statement_Action_1_0
  • Policy_Statement_Action_1_2
  • Policy_Statement_Condition_ArnLike_aws:SourceArn_2
  • Policy_Statement_Condition_Bool_aws:SecureTransport_0
  • Policy_Statement_Condition_StringEquals_aws:SourceAccount_0
  • Policy_Statement_Condition_StringEquals_aws:sourceVpce_0
  • Policy_Statement_Condition_StringEquals_s3:x-amz-acl_1
  • Policy_Statement_Condition_StringNotEquals_s3:x-amz-server-side-encryption_2
  • Policy_Statement_Condition_StringNotEquals_s3:x-amz-server-side-encryption-aws-kms-key-id_1
  • Policy_Statement_Effect_0
  • Policy_Statement_Principal_0
  • Policy_Statement_Principal_AWS_0
  • Policy_Statement_Principal_AWS_0_2
  • Policy_Statement_Principal_Service_0
  • Policy_Statement_Resource_0
  • Policy_Statement_Resource_0_0
  • Policy_Statement_Resource_1
  • Policy_Statement_Resource_1_0
  • Policy_Statement_Sid_0
  • Policy_Version
Ec2
  • AmiLaunchIndex
  • Architecture
  • BlockDeviceMappings_DeviceName_0
  • BlockDeviceMappings_Ebs_AttachTime_0
  • BlockDeviceMappings_Ebs_DeleteOnTermination_0
  • BlockDeviceMappings_Ebs_Status_0
  • BlockDeviceMappings_Ebs_VolumeId_0
  • BootMode
  • CapacityReservationSpecification_CapacityReservationPreference
  • ClientToken
  • CpuOptions_CoreCount
  • CpuOptions_ThreadsPerCore
  • EbsOptimized
  • EnaSupport
  • EnclaveOptions_Enabled
  • HibernationOptions_Configured
  • Hypervisor
  • IamInstanceProfile_Arn
  • IamInstanceProfile_Id
  • ImageId
  • InstanceId
  • InstanceLifecycle
  • InstanceState_Code
  • InstanceState_Name
  • InstanceType
  • KeyName
  • LaunchTime
  • MaintenanceOptions_AutoRecovery
  • MetadataOptions_HttpEndpoint
  • MetadataOptions_HttpProtocolIpv6
  • MetadataOptions_HttpPutResponseHopLimit
  • MetadataOptions_HttpTokens
  • MetadataOptions_InstanceMetadataTags
  • MetadataOptions_State
  • Monitoring_State
  • Placement_AvailabilityZone
  • Placement_GroupName
  • Placement_Tenancy
  • Platform
  • PlatformDetails
  • PrivateDnsName
  • PrivateDnsNameOptions_EnableResourceNameDnsAAAARecord
  • PrivateDnsNameOptions_EnableResourceNameDnsARecord
  • PrivateDnsNameOptions_HostnameType
  • PrivateIpAddress
  • PublicDnsName
  • PublicIpAddress
  • RootDeviceName
  • RootDeviceType
  • SecurityGroups_GroupId_0
  • SecurityGroups_GroupName_0
  • SourceDestCheck
  • SpotInstanceRequestId
  • State_Code
  • State_Name
  • StateTransitionReason
  • SubnetId
  • Tags_Key_0
  • Tags_Value_0
  • UsageOperation
  • UsageOperationUpdateTime
  • UserData
  • VirtualizationType
  • VpcId
Vpc
  • CidrBlock
  • CidrBlockAssociationSet_AssociationId_0
  • CidrBlockAssociationSet_CidrBlockState_State_0
  • CidrBlockAssociationSet_CidrBlock_0
  • DhcpOptionsId
  • InstanceTenancy
  • IsDefault
  • OwnerId
  • State
  • Type
  • VpcId
  • Tags_Key_0
  • Tags_Value_0
Lambda
  • Architectures_0
  • CodeSha256
  • CodeSize
  • Description
  • EphemeralStorage_Size
  • FunctionArn
  • FunctionName
  • Handler
  • LastModified
  • LastUpdateStatus
  • LastUpdateStatusReasonCode
  • Layers_Arn_0
  • Layers_CodeSize_0
  • Location
  • MemorySize
  • PackageType
  • Policy_Id
  • Policy_Statement_Action_0
  • Policy_Statement_Condition_ArnLike_AWS:SourceArn_0
  • Policy_Statement_Condition_StringEquals_AWS:SourceAccount_0
  • Policy_Statement_Effect_0
  • Policy_Statement_Principal_Service_0
  • Policy_Statement_Resource_0
  • Policy_Statement_Sid_0
  • Policy_Version
  • RepositoryType
  • RevisionId
  • Role
  • Runtime
  • State
  • StateReasonCode
  • Timeout
  • TracingConfig_Mode
  • Version
  • VpcConfig_SecurityGroupIds_0
  • VpcConfig_SubnetIds_0
  • VpcConfig_VpcId
RDS
  • ActivityStreamMode
  • ActivityStreamPolicyStatus
  • ActivityStreamStatus
  • AllocatedStorage
  • AssociatedRoles_FeatureName_0
  • AssociatedRoles_RoleArn_0
  • AssociatedRoles_Status_0
  • AutomationMode
  • AutoMinorVersionUpgrade
  • AvailabilityZone
  • AvailabilityZones_0
  • BacktrackConsumedChangeRecords
  • BacktrackWindow
  • BackupRetentionPeriod
  • BackupTarget
  • CACertificateIdentifier
  • CloneGroupId
  • ClusterCreateTime
  • CopyTagsToSnapshot
  • CrossAccountClone
  • CustomerOwnedIpEnabled
  • DatabaseName
  • DBClusterArn
  • DBClusterIdentifier
  • DBClusterMembers_DBClusterParameterGroupStatus_0
  • DBClusterMembers_DBInstanceIdentifier_0
  • DBClusterMembers_IsClusterWriter_0
  • DBClusterMembers_PromotionTier_0
  • DBClusterParameterGroup
  • DbClusterResourceId
  • DBInstanceArn
  • DBInstanceClass
  • DBInstanceIdentifier
  • DbInstancePort
  • DBInstanceStatus
  • DbiResourceId
  • DBName
  • DBParameterGroups_DBParameterGroupName_0
  • DBParameterGroups_ParameterApplyStatus_0
  • DBSubnetGroup
  • DBSubnetGroup_DBSubnetGroupDescription
  • DBSubnetGroup_DBSubnetGroupName
  • DBSubnetGroup_SubnetGroupStatus
  • DBSubnetGroup_Subnets_SubnetAvailabilityZone_Name_0
  • DBSubnetGroup_Subnets_SubnetIdentifier_0
  • DBSubnetGroup_Subnets_SubnetStatus_0
  • DBSubnetGroup_VpcId
  • DeletionProtection
  • EarliestBacktrackTime
  • EarliestRestorableTime
  • EnabledCloudwatchLogsExports_0
  • Endpoint
  • Endpoint_Address
  • Endpoint_HostedZoneId
  • Endpoint_Port
  • Engine
  • EngineMode
  • EngineVersion
  • EnhancedMonitoringResourceArn
  • GlobalWriteForwardingStatus
  • HostedZoneId
  • HttpEndpointEnabled
  • IAMDatabaseAuthenticationEnabled
  • InstanceCreateTime
  • KmsKeyId
  • LatestRestorableTime
  • LicenseModel
  • MasterUsername
  • MonitoringInterval
  • MonitoringRoleArn
  • MultiAZ
  • NetworkType
  • OptionGroupMemberships_OptionGroupName_0
  • OptionGroupMemberships_Status_0
  • PendingModifiedValues_AutomationMode
  • PerformanceInsightsEnabled
  • PerformanceInsightsKMSKeyId
  • PerformanceInsightsRetentionPeriod
  • Port
  • PreferredBackupWindow
  • PreferredMaintenanceWindow
  • PromotionTier
  • PubliclyAccessible
  • ReaderEndpoint
  • ReplicaMode
  • Status
  • StorageEncrypted
  • StorageType
  • TagList_Key_0
  • TagList_Value_0
  • VpcSecurityGroups_Status_0
  • VpcSecurityGroups_VpcSecurityGroupId_0
DynamoDB
  • Name
  • Region
Redshift
  • AllowVersionUpgrade
  • AquaConfiguration_AquaConfigurationStatus
  • AquaConfiguration_AquaStatus
  • AutomatedSnapshotRetentionPeriod
  • AvailabilityZone
  • AvailabilityZoneRelocationStatus
  • ClusterAvailabilityStatus
  • ClusterCreateTime
  • ClusterIdentifier
  • ClusterNamespaceArn
  • ClusterNodes_NodeRole_0
  • ClusterNodes_PrivateIPAddress_0
  • ClusterNodes_PublicIPAddress_0
  • ClusterParameterGroups_ParameterApplyStatus_0
  • ClusterParameterGroups_ParameterGroupName_0
  • ClusterPublicKey
  • ClusterRevisionNumber
  • ClusterStatus
  • ClusterSubnetGroupName
  • ClusterVersion
  • DBName
  • ElasticIpStatus_ElasticIp
  • ElasticIpStatus_Status
  • ElasticResizeNumberOfNodeOptions
  • Encrypted
  • Endpoint_Address
  • Endpoint_Port
  • EnhancedVpcRouting
  • IamRoles_ApplyStatus_0
  • IamRoles_IamRoleArn_0
  • KmsKeyId
  • MaintenanceTrackName
  • ManualSnapshotRetentionPeriod
  • MasterUsername
  • NextMaintenanceWindowStartTime
  • NodeType
  • NumberOfNodes
  • PreferredMaintenanceWindow
  • PubliclyAccessible
  • SnapshotScheduleState
  • Tags_Key_0
  • Tags_Value_0
  • TotalStorageCapacityInMegaBytes
  • VpcId
  • VpcSecurityGroups_Status_0
  • VpcSecurityGroups_VpcSecurityGroupId_0
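
Some of the property names listed above contain characters such as `:` or `-` (for example `Policy_Statement_Condition_Bool_aws:SecureTransport_0`), so they have to be wrapped in backticks when used in a Cypher query. The Python sketch below builds such a filter query as a string. The helper names are hypothetical, and the node label passed in is an assumption: check which labels your database actually uses for each service.

```python
def quote_identifier(name: str) -> str:
    """Wrap a Cypher identifier in backticks, doubling any backtick inside."""
    return "`" + name.replace("`", "``") + "`"


def build_filter_query(label: str, prop: str) -> str:
    """Build a parameterised Cypher query matching nodes on one property."""
    return (
        f"MATCH (n:{quote_identifier(label)}) "
        f"WHERE n.{quote_identifier(prop)} = $value "
        "RETURN n"
    )


# User and UserName both appear in the lists above.
print(build_filter_query("User", "UserName"))

# The Service label is an assumption here; the property is from the S3 list.
print(build_filter_query(
    "Service", "Policy_Statement_Condition_Bool_aws:SecureTransport_0"))
```

The value to compare against is left as the `$value` parameter so that it can be passed separately to the database driver instead of being concatenated into the query text.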