crossplane / terrajet
Generate Crossplane Providers from any Terraform Provider
Home Page: https://crossplane.io
License: Apache License 2.0
The common controller works for our prototypes, but we need more experiments to assess its scalability. There are certain issues that need some data before we take action, like #38.
We can build a complex composition with 10+ resources and create 10 XRs using it, then check resource usage and errors across the board to see whether we hit any limit. We can set resource limits on the controller's Deployment and see at what point we start to see context deadline exceeded errors, meaning we can't complete a single reconciliation pass within those limits.
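To make the experiment concrete, here is a minimal sketch of the kind of resource limits we could put on the controller Deployment and then tighten step by step; the names and values are purely illustrative, not recommendations:

```yaml
# Illustrative limits for the provider controller Deployment; lower these
# gradually until reconciliation starts hitting context deadline exceeded.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: provider-controller
spec:
  template:
    spec:
      containers:
        - name: provider
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
```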
I believe if we copy the .github directory over from crossplane/crossplane, we'll automatically get CI/CD, issue and PR templates, etc. for this repo.
Today if you run kubectl apply -f package/crds in provider-tf-aws, your cluster gets really slow. In GKE, the kubectl command simply stops responding after around 50 CRDs.
We have some ideas around sharding the controllers and API types, allowing customers to install only a subset of them. But we haven't identified the actual problem yet, so we need to pin down the root cause before choosing a solution and keep that problem in mind for future designs.
When testing provider-tf-azure's KubernetesCluster MR, we have observed that the provisioned cluster's credentials, e.g. the KubeConfig, are available in the Terraform state as regular fields. An example is:
{
"version": 4,
"terraform_version": "1.0.5",
"serial": 3,
"lineage": "365fcd38-09a3-e30a-cb02-df3117306af0",
"outputs": {},
"resources": [
{
"mode": "managed",
"type": "azurerm_kubernetes_cluster",
"name": "example",
"provider": "provider[\"registry.terraform.io/hashicorp/azurerm\"]",
"instances": [
{
"schema_version": 0,
"attributes": {
...
"kube_config": [
{
"client_certificate":
Because these attributes are resource dependent, we need to be able to treat them as credentials in a resource-specific manner. In the Terraform provider resource schema, such fields are marked as sensitive, like kube_config_raw or kube_config.password.
Fields marked as sensitive are also good candidates for being part of the connection secrets.
Provisioning a KubernetesCluster using https://github.com/crossplane-contrib/provider-tf-azure/blob/258830e0bc0627f2749bcd7e285e3c831849a04a/examples/kubernetes/kubernetescluster.yaml reveals the issue.
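A minimal sketch of the idea, assuming we know which top-level attribute names the Terraform schema marks as sensitive; the function name and map shape are hypothetical, not terrajet's actual API:

```go
package main

import "fmt"

// removeSensitive strips top-level attributes that the Terraform resource
// schema marks as Sensitive, so they can be routed to a connection secret
// instead of being persisted as plain state. (Nested paths such as
// kube_config.password would need path traversal; omitted for brevity.)
func removeSensitive(attrs map[string]interface{}, sensitive []string) map[string]interface{} {
	out := map[string]interface{}{}
	for k, v := range attrs {
		out[k] = v
	}
	for _, name := range sensitive {
		delete(out, name)
	}
	return out
}

func main() {
	attrs := map[string]interface{}{
		"kube_config_raw": "SECRET",
		"fqdn":            "example.azmk8s.io",
	}
	cleaned := removeSensitive(attrs, []string{"kube_config_raw"})
	fmt.Println(len(cleaned), cleaned["fqdn"]) // only the non-sensitive field remains
}
```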
There are some APIs that allow you to manage the same object in two different ways; for example, you can manage a NodeGroup through its own API but also as an array under the Cluster object. Per our Managed Resources API Design, we made the decision that in such cases we'd represent the resource in only one way, and that would be as a separate resource.
It seems the Terraform community also tries to follow this pattern, but there are cases where they didn't do that in the past and now can't change the interface. So, what they do for such resources is show a warning, encouraging users to use the separate API.
Since we don't have a strong contract yet, we can omit those fields from the schema before generation, i.e. remove the nodeGroups field from the cluster object. We can do that easily by manipulating the TF schema we give to the CRD generator, but we need to make sure it works well across the board by testing it manually and adding the instructions to the guide that is to be written.
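A sketch of what "manipulating the TF schema before generation" could look like. The map type is simplified to keep the example self-contained (the real schema would be map[string]*schema.Schema from the Terraform plugin SDK), and the field names are illustrative:

```go
package main

import "fmt"

// omitFields deletes the given top-level field names from a Terraform
// resource schema map before it is handed to the CRD generator, e.g. to
// drop an embedded node_groups array in favor of a separate resource.
func omitFields(tfSchema map[string]interface{}, fields ...string) {
	for _, f := range fields {
		delete(tfSchema, f)
	}
}

func main() {
	// Simplified stand-in for a cluster resource schema.
	cluster := map[string]interface{}{
		"name":        nil,
		"node_groups": nil,
	}
	omitFields(cluster, "node_groups")
	fmt.Println(len(cluster)) // node_groups is gone before generation
}
```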
In PR #76, @muvaf implemented two different modes for Apply and Destroy operations, namely sync and async.
Currently, the operation mode per resource is expected as input from provider developers. However, the Terraform resource schema already has a per-resource ResourceTimeout, set by the provider authors. We can leverage this information to decide which mode to choose by default (while still considering a configuration option to override it).
For example, AWS RDS cluster has a default of 120 mins and we could choose the async mode. For resources that have no timeouts set or timeouts less than a sensible threshold, e.g. 5 mins, we can choose the sync mode.
Leverage ResourceTimeout to decide which operation mode to use by default.
Currently we generate the controllers' setup functions but never call the setup of the individual controllers.
This sounds like a similar problem to #20 and we might consider introducing a similar mechanism to register controllers.
tfcli runtime needs the Terraform provider source and version at runtime for correctly generating the corresponding provider block in per-resource Terraform configuration. We need to capture this information during container image build time and make it available to the tfcli runtime.
The source of truth for the provider source and version constraint could be two env. variables, as both pieces of information are also needed when preparing the local Terraform image mirror. Then, as discussed previously, we could read those env. variables injected into the container's runtime in the main function, and have their values flow into the tfcli runtime via the following path: main -> (generated) controller.Setup -> <(generated) resource>.Setup -> terraform.SetupController, stored in terraform.connector to be used by tfcli.
https://github.com/iancoleman/strcase#custom-acronyms-for-tocamel--tolowercamel
If we need to translate snake case Terraform fields to camel case, it may be nice to teach our translation code a few common acronyms so it can capitalize them correctly. That said, in some cases (e.g. AWS) I recall that they don't capitalize acronyms, so perhaps it's better to just match their API as closely as possible there. 🤷
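The linked README describes strcase.ConfigureAcronym for this; to keep the sketch self-contained, here is a stdlib-only illustration of the same idea, with an assumed acronym table:

```go
package main

import (
	"fmt"
	"strings"
)

// acronyms maps lowercase tokens to their preferred capitalization.
// A real implementation could register these via strcase.ConfigureAcronym
// instead of rolling its own conversion.
var acronyms = map[string]string{
	"id":   "ID",
	"dns":  "DNS",
	"arn":  "ARN",
	"ipv6": "IPv6",
}

// toCamel converts a snake_case Terraform field name to CamelCase,
// capitalizing known acronyms along the way.
func toCamel(s string) string {
	var b strings.Builder
	for _, tok := range strings.Split(s, "_") {
		if tok == "" {
			continue
		}
		if a, ok := acronyms[tok]; ok {
			b.WriteString(a)
			continue
		}
		b.WriteString(strings.ToUpper(tok[:1]) + tok[1:])
	}
	return b.String()
}

func main() {
	fmt.Println(toCamel("vpc_id"))          // VpcID
	fmt.Println(toCamel("ipv6_cidr_block")) // IPv6CidrBlock
}
```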
In the initial implementation, we are planning to store terraform states as annotations for the managed resource.
etcd (i.e. the annotation) will be authoritative in terms of the state, and we avoid making any changes/edits to it except stripping out sensitive attributes. This is the most conservative approach: using Terraform just like an end user would.
However, we want to investigate whether we can safely (re)build the state information from the spec, status, and kind information of the resource and completely remove the annotation.
To better describe the problem, see the following tfstate example:
{
"version": 4,
"terraform_version": "1.0.4",
"serial": 5,
"lineage": "f36b8453-5686-97c3-af3e-6a5c655c5fbf",
"outputs": {},
"resources": [
{
"mode": "managed",
"type": "aws_vpc",
"name": "instance_vpc_hasan",
"provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",
"instances": [
{
"schema_version": 1,
"attributes": {
"arn": "arn:aws:ec2:us-west-2:609897127049:vpc/vpc-0b396ddd9469f29e5",
"assign_generated_ipv6_cidr_block": false,
"cidr_block": "10.0.0.0/16",
"default_network_acl_id": "acl-08e8f435b74af56e8",
"default_route_table_id": "rtb-0eb8f2b9933d31175",
"default_security_group_id": "sg-034d7899a57871f29",
"dhcp_options_id": "dopt-0c388774",
"enable_classiclink": false,
"enable_classiclink_dns_support": false,
"enable_dns_hostnames": false,
"enable_dns_support": true,
"id": "vpc-0b396ddd9469f29e5",
"instance_tenancy": "default",
"ipv6_association_id": "",
"ipv6_cidr_block": "",
"main_route_table_id": "rtb-0eb8f2b9933d31175",
"owner_id": "609897127049",
"tags": {
"Name": "ExampleVPCInstanceByHasan"
},
"tags_all": {
"Name": "ExampleVPCInstanceByHasan"
}
},
"sensitive_attributes": [],
"private": "eyJzY2hlbWFfdmVyc2lvbiI6IjEifQ=="
}
]
}
]
}
The `resources[0].instances[0].attributes` field is a combination of:
- `spec.forProvider`
- `status.atProvider`
- the `external_name` annotation (`id` here)

The `resources[0].instances[0].sensitive_attributes` field would be available in the connection secret (if required during apply). `resources[0].type`, `resources[0].name` and `resources[0].provider` would also be available during reconciliation.

Two caveats here:
- `spec.forProvider` contains the desired state, not the last applied state, which could differ from what tfstate typically contains.
- We cannot rebuild `lineage`, `serial` and `resources[0].instances[0].private` exactly the same as the original, which would require persisting them as well.

When I gave provider-tf-azure's KubernetesCluster a try, the provider could not initially update the observed state at status.atProvider, because in the generated v1alpha1.KubernetesClusterObservation struct all the fields are currently required. However, the values of most fields, like KubeConfig, are not initially available.
Provisioning a KubernetesCluster using https://github.com/crossplane-contrib/provider-tf-azure/blob/258830e0bc0627f2749bcd7e285e3c831849a04a/examples/kubernetes/kubernetescluster.yaml reveals the issue.
While I was testing #76, I used RdsCluster as the async example. After creation, the late initializer always reported true, meaning a spec update was needed; because of that, the resource never got ready, since the spec update erases changes in the status.
I added the YAML I used. You can see it never gets ready, and if you debug, you'll see that the late initializer reports true all the time.
In Crossplane managed resources, not all fields in the spec are required during creation time.
For optional fields, we need to properly parse attributes in Terraform state into generated spec schema.
We use the Terraform schema to generate the type of every field. While Terraform has one big untyped tree of fields to hold all the information, that's not the case in Crossplane. In line with Kubernetes API conventions, we separate fields into spec and status, the status fields being ones that the user cannot ever configure.
In the Terraform schema, fields report whether they are Computed and Optional. Fields that are both Computed and Optional are the ones that have server-side defaults and are user-configurable; these are the ones we late-initialize. So we put those under spec, while fields that are Computed and not Optional go to status.
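The classification rule above can be sketched as a tiny predicate (function and return values are illustrative, not terrajet's actual code):

```go
package main

import "fmt"

// classify applies the rule described above: a field that is Computed and
// not Optional goes to status (observation); everything else, including
// Computed && Optional (late-initialized server-side defaults), goes to
// spec (parameters).
func classify(computed, optional bool) string {
	if computed && !optional {
		return "status"
	}
	return "spec"
}

func main() {
	fmt.Println(classify(true, false)) // status: e.g. an arn field
	fmt.Println(classify(true, true))  // spec: late-initialized default
	fmt.Println(classify(false, true)) // spec: plain optional input
}
```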
The problem is that there can be types that are Computed and not Optional but also have nested fields that should go under spec, and vice versa. Right now, when we compute the type for a field, we return a parameters type as well as an observation type, then decide which one to use depending on the properties of the field. If the field goes to spec but there are nested fields that could go to status, those nested fields are eliminated. Likewise, if a field goes to status, all nested fields go to status as well and the parameter types are ignored.
When I did an analysis of how prevalent this problem is in, for example, AWS, I see the following output:
there are 1 fields that could be observation type of .SyntheticsCanary.VpcConfig but ignored since the field is a parameters field
there are 2 fields that could be observation type of .AmplifyDomainAssociation.SubDomain but ignored since the field is a parameters field
there are 1 fields that could be observation type of .StoragegatewayGateway.SmbActiveDirectorySettings but ignored since the field is a parameters field
there are 1 fields that could be observation type of .LexBotAlias.ConversationLogs.LogSettings but ignored since the field is a parameters field
there are 1 fields that could be observation type of .Route53ResolverEndpoint.IpAddress but ignored since the field is a parameters field
there are 1 fields that could be observation type of .LambdaFunction.VpcConfig but ignored since the field is a parameters field
there are 1 fields that could be observation type of .MwaaEnvironment.LoggingConfiguration.DagProcessingLogs but ignored since the field is a parameters field
there are 1 fields that could be observation type of .MwaaEnvironment.LoggingConfiguration.SchedulerLogs but ignored since the field is a parameters field
there are 1 fields that could be observation type of .MwaaEnvironment.LoggingConfiguration.TaskLogs but ignored since the field is a parameters field
there are 1 fields that could be observation type of .MwaaEnvironment.LoggingConfiguration.WebserverLogs but ignored since the field is a parameters field
there are 1 fields that could be observation type of .MwaaEnvironment.LoggingConfiguration.WorkerLogs but ignored since the field is a parameters field
there are 4 fields that could be parameters type of .ElasticBeanstalkEnvironment.AllSettings but ignored since the field is an observation field
there are 2 fields that could be observation type of .EksCluster.VpcConfig but ignored since the field is a parameters field
there are 2 fields that could be observation type of .DirectoryServiceDirectory.ConnectSettings but ignored since the field is a parameters field
there are 1 fields that could be observation type of .DirectoryServiceDirectory.VpcSettings but ignored since the field is a parameters field
there are 3 fields that could be observation type of .EmrCluster.CoreInstanceFleet but ignored since the field is a parameters field
there are 1 fields that could be observation type of .EmrCluster.CoreInstanceGroup but ignored since the field is a parameters field
there are 3 fields that could be observation type of .EmrCluster.MasterInstanceFleet but ignored since the field is a parameters field
there are 1 fields that could be observation type of .EmrCluster.MasterInstanceGroup but ignored since the field is a parameters field
there are 2 fields that could be observation type of .Kinesisanalyticsv2Application.ApplicationConfiguration.SqlApplicationConfiguration.Input but ignored since the field is a parameters field
there are 1 fields that could be observation type of .Kinesisanalyticsv2Application.ApplicationConfiguration.SqlApplicationConfiguration.Output but ignored since the field is a parameters field
there are 1 fields that could be observation type of .Kinesisanalyticsv2Application.ApplicationConfiguration.SqlApplicationConfiguration.ReferenceDataSource but ignored since the field is a parameters field
there are 2 fields that could be observation type of .Kinesisanalyticsv2Application.ApplicationConfiguration.VpcConfiguration but ignored since the field is a parameters field
there are 1 fields that could be observation type of .Kinesisanalyticsv2Application.CloudwatchLoggingOptions but ignored since the field is a parameters field
there are 3 fields that could be observation type of .SecretsmanagerSecret.Replica but ignored since the field is a parameters field
there are 4 fields that could be parameters type of .SsmDocument.Parameter but ignored since the field is an observation field
there are 1 fields that could be observation type of .Alb.SubnetMapping but ignored since the field is a parameters field
there are 1 fields that could be observation type of .SpotInstanceRequest.EbsBlockDevice but ignored since the field is a parameters field
there are 2 fields that could be observation type of .SpotInstanceRequest.RootBlockDevice but ignored since the field is a parameters field
there are 1 fields that could be observation type of .Lb.SubnetMapping but ignored since the field is a parameters field
there are 2 fields that could be observation type of .ElasticsearchDomain.VpcOptions but ignored since the field is a parameters field
there are 2 fields that could be observation type of .Apigatewayv2DomainName.DomainNameConfiguration but ignored since the field is a parameters field
there are 1 fields that could be observation type of .NetworkInterface.Attachment but ignored since the field is a parameters field
there are 1 fields that could be observation type of .Instance.EbsBlockDevice but ignored since the field is a parameters field
there are 2 fields that could be observation type of .Instance.RootBlockDevice but ignored since the field is a parameters field
there are 1 fields that could be observation type of .CodestarnotificationsNotificationRule.Target but ignored since the field is a parameters field
there are 2 fields that could be observation type of .CodeartifactRepository.ExternalConnections but ignored since the field is a parameters field
there are 8 fields that could be observation type of .AmiCopy.EbsBlockDevice but ignored since the field is a parameters field
there are 2 fields that could be observation type of .AmiCopy.EphemeralBlockDevice but ignored since the field is a parameters field
there are 8 fields that could be observation type of .AmiFromInstance.EbsBlockDevice but ignored since the field is a parameters field
there are 2 fields that could be observation type of .AmiFromInstance.EphemeralBlockDevice but ignored since the field is a parameters field
there are 1 fields that could be observation type of .KinesisAnalyticsApplication.CloudwatchLoggingOptions but ignored since the field is a parameters field
there are 1 fields that could be observation type of .KinesisAnalyticsApplication.Inputs.Schema.RecordFormat but ignored since the field is a parameters field
there are 2 fields that could be observation type of .KinesisAnalyticsApplication.Inputs but ignored since the field is a parameters field
there are 1 fields that could be observation type of .KinesisAnalyticsApplication.Outputs but ignored since the field is a parameters field
there are 1 fields that could be observation type of .KinesisAnalyticsApplication.ReferenceDataSources.Schema.RecordFormat but ignored since the field is a parameters field
there are 1 fields that could be observation type of .KinesisAnalyticsApplication.ReferenceDataSources but ignored since the field is a parameters field
there are 1 fields that could be observation type of .KinesisFirehoseDeliveryStream.ElasticsearchConfiguration.VpcConfig but ignored since the field is a parameters field
there are 1 fields that could be observation type of .GlueCatalogTable.PartitionIndex but ignored since the field is a parameters field
Generated 770 resources!
There are thousands of fields generated for 770 resources, so the problem is not that big, at least for AWS. We see that almost all of the ignored fields are ones that live under a parameters type but should have gone to observation. So the problem does not cost users much configuration ability, but there is information they would not be able to see under status.
There can be several solutions. The most basic one is to include the full schema in both status and spec, but that would mean deviating from the Crossplane Resource Model even if unused fields stay empty, and it's not really worth doing this for all resources just to be correct in a minority of cases.
The best solution would be one that constructs the earlier fields and types under status once we encounter an observation type under a parameters type (and vice versa). However, this wasn't done in the first iteration, since the field path wasn't known by the schema builder. With #24 it will know the path.
So, we could make the type map a more complex tree with some utilities that allow adding a certain field at an arbitrary point and constructing the path from the root to that leaf by building the necessary types and fields. Right now, the map holds the list of types for uniqueness purposes, but the actual information is recursively saved in the Go type objects. So this solution would mean deviating from the Go types package to store the information, and that would require some refactoring.
I'd propose instead a utility that works on Go types library objects to manipulate the existing type trees, adding an arbitrary field at a given field path. There would be a function that knows the root types; when you give it arbitrary field information, it constructs the path in either of the two root structs (parameters and observation). There are some caveats about naming these new fields and types, but it's doable and lets us stay in the same type system instead of building our own.
Repo doesn't have a README where people can read and understand what it's for.
Add a README.
Looking at crossplane-contrib/provider-jet-aws#11 , we have some differences between TF based providers and native ones in terms of how they are packaged or organized structurally.
We can consider a provider-tf-template repo, but since that'd be another repo to maintain, we first need to come up with a list of things that differ from the native providers so we can see if it's worth it. A branch in provider-template could be another option.
PR #3 does not have any automated tests as of now. We would like to get it merged and track test implementation with this issue.
While having two different providers for the same API is doable, the UX is usually not great if you end up using both. It'd be great to have a single, complete one.
TF-based providers could reach full feature parity, and in that case Terraform would really be an implementation detail. So we can consider merging, for example, provider-tf-aws into provider-aws, covering only the missing resources, so that we can generate cross-resource references between hand-written, ACK-generated and TF-generated resources. And we'd have one provider to point to.
One drawback is that the API changes once we'd like to move the underlying resource implementation from TF to ACK or hand-written code, and vice versa.
Though we can discuss this only after the alpha version.
Right now we have a single provider binary, and all terraform invocations use that one, but each of them spins up a new provider server to talk to. We haven't observed noticeable issues, but it's likely that this will cause performance problems.
@ulucinar discovered a way to make Terraform CLI talk to the existing provider server. He didn't feel great about it for the initial pass but we might need to reconsider it after testing the providers with 100+ custom resources using composition.
Generated CRDs don't have some of the necessary tags, like // +kubebuilder:validation:Required. A comment printer functionality will be necessary, and it will help with printing documentation as well.
If there are two fields with the same name in different types of the CRD struct tree, they'd be assumed to have the same schema. It turns out this assumption is wrong; having the same field name does not always mean that the underlying schema is the same.
An example is the timeout field of the aws_appmesh_route resource; see here and here. Our builder assumed that once a type exists in the map there is no need to generate another one, and since it worked directly on a map, the chosen one would change between runs, producing unstable output.
All generator utilities are written with parallelization in mind at the CRD level, i.e. every CRD is generated completely separately. But terrajet doesn't expose an interface to generator implementations for this, such as the allocated core count.
The library we use for camel case conversions allows customization by adding acronyms in init(). We should have a default set of custom acronyms (like DNS, IPv6, ID, etc.) and let providers add their own if need be.
crossplane/crossplane-tools#35 lets us generate reference resolvers, but it still needs the Reference and Selector fields generated beforehand.
Make the schema builder add those two fields each time it sees the comment marker the resolver generator uses.
Terrajet providers have a different Dockerfile and several other differences that require lots of manual plumbing to make provider-template work.
We can have a provider-tf-template that people could just use to bootstrap a Terrajet-based provider.
The current readme doesn't provide any instructions on how to get started.
We should provide an overview of how Terrajet works and a getting-started guide on how to generate a Crossplane provider based on a Terraform provider.
Crossplane provides a way to connect to cloud resources (if applicable) by writing the details required to connect into a Kubernetes secret.
These details should include all fields needed to connect, including sensitive ones like username and password as well as not-necessarily-sensitive ones like the port.
We would need to define/implement a way to generate these connection details with terrajet.
I am assuming this information could be composed using the attributes and sensitive_attributes fields in the Terraform state.
See #36 for details.
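Under that assumption, the composition could look roughly like the following sketch; the function name, key lists, and payload shape are all hypothetical:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// connectionDetails assembles the connection secret payload from Terraform
// state attributes: every sensitive attribute, plus an allow-list of
// non-sensitive ones (like port) that are still needed to connect.
func connectionDetails(attrs map[string]interface{}, sensitiveKeys, extraKeys []string) map[string][]byte {
	out := map[string][]byte{}
	add := func(k string) {
		if v, ok := attrs[k]; ok {
			b, _ := json.Marshal(v) // secrets hold raw bytes
			out[k] = b
		}
	}
	for _, k := range sensitiveKeys {
		add(k)
	}
	for _, k := range extraKeys {
		add(k)
	}
	return out
}

func main() {
	attrs := map[string]interface{}{
		"username": "admin",
		"password": "hunter2",
		"port":     5432,
	}
	d := connectionDetails(attrs, []string{"password"}, []string{"username", "port"})
	fmt.Println(len(d), string(d["port"]))
}
```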
While some of the ideas do not require memory-based bookkeeping, it's a solution to the problem where we need to make the same call to retrieve its results. In other words, we could make it work with filesystem locks but I don't believe we need to bear the cost of that complexity if we can do it in memory in a self-updating way.
The main point of this issue is that the layer the ExternalClient implementation interacts with should look very similar to a contemporary cloud SDK API, so that we don't break the assumptions of the generic reconciler. This includes simplifying the inner workings of that SDK as well, by reducing the number of layers we added in order to be able to work in parallel during the initial phase.
Currently we generate the CRDs, but we also need to add their types to the scheme so that they're registered with the controller manager's cache.
During the initial implementation, in order to move fast and see our limits, we deferred some of the discussion about the implementation, and how we should do the bookkeeping was one of those topics. Currently, we manage the state and bookkeeping using the filesystem. However, we also interact with Terraform's lock and state files, which makes the code complex for readers to understand, and we don't really need to keep that information outside of memory since a crash of the process means a restart of the Pod anyway. Additionally, there are opportunities to merge some of the functionality in the code into a simpler flow that is easier to read and change.
I suggest that we use a global map that does the bookkeeping of the running CLI executions. Roughly, I think we need the following improvements in the initial implementation. Please feel free to add more things that we had deferred to after PoC phase. cc @ulucinar @turkenh
- We can have a `map[string]Workspace` whose key is `metadata.uuid`.
- Every goroutine that works on a `Workspace` should update the map before it starts and after it completes, similar to today.
- We can have a `Describe` method on that `Workspace` object that tells you the current status, i.e. `creating`, `destroying` or `no-op`.
  - Today we expect the `apply` pipeline to be run repeatedly to check the status. But no other provider does that except some of the Azure APIs, and it is not user friendly at all in a controller, so we run the risk of violating the generic reconciler's assumptions.
  - Today we call `Create` repeatedly because we need to make the same `Apply` call to finalize it, but since crossplane/crossplane-runtime#283 this won't be allowed, because the generic reconciler won't call `Create` again in the grace period.
- The `map[string]Workspace` can be treated just like a cloud provider SDK client. We don't need to know Terraform CLI or pipeline/operation details, or propagate status as errors, etc. We can use `map[string]interface{}` to pass down the spec configuration and receive `StateV4` plus information about whether it's `creating` or `destroying`.
- Have the `ExternalClient` implementation directly call a single interface. If you look at it today, it's merely a shim on top of the conversion layer, and since we use similar names it is really hard to follow the layers.
- Reporting the current status should be the job of `Describe` rather than being part of `ApplyResult`; the resulting set of functions is very similar to what we use in native provider `ExternalClient` implementations.

Note that some of the technical debt was accrued on purpose, because we needed to be able to work in parallel during the PoC phase and discover things from scratch. Before declaring terrajet prod-ready and solidifying the implementation with more tests, I think it's time to revisit the parts we completed during that phase and refactor them into a more cohesive structure. Thanks to that initial effort, we know what works, how Terraform behaves, and the rough performance implications of our decisions.
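A minimal sketch of the proposed bookkeeping map and Describe method; the type names, statuses, and locking scheme are assumptions for illustration, not terrajet's actual implementation:

```go
package main

import (
	"fmt"
	"sync"
)

// Status is what Describe reports for an in-flight CLI execution.
type Status string

const (
	Creating   Status = "creating"
	Destroying Status = "destroying"
	NoOp       Status = "no-op"
)

// Workspace tracks a single Terraform CLI execution for one resource.
type Workspace struct {
	mu     sync.Mutex
	status Status
}

// Describe reports the current status of the workspace.
func (w *Workspace) Describe() Status {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.status
}

// Store is the global map keyed by metadata.uuid.
type Store struct {
	mu  sync.Mutex
	wss map[string]*Workspace
}

func NewStore() *Store { return &Store{wss: map[string]*Workspace{}} }

// Begin registers (or fetches) the workspace for a resource and records its
// new status before a goroutine starts working on it.
func (s *Store) Begin(uid string, st Status) *Workspace {
	s.mu.Lock()
	defer s.mu.Unlock()
	w, ok := s.wss[uid]
	if !ok {
		w = &Workspace{}
		s.wss[uid] = w
	}
	w.mu.Lock()
	w.status = st
	w.mu.Unlock()
	return w
}

func main() {
	s := NewStore()
	s.Begin("uuid-123", Creating)
	fmt.Println(s.wss["uuid-123"].Describe())
}
```

The ExternalClient would only ever see Begin/Describe-style calls, which is what makes the map feel like a cloud SDK client.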
Crossplane uses external-name annotation to uniquely identify external resource (typically living in cloud API): https://crossplane.io/docs/v0.11/introduction/managed-resources.html#external-name
Terraform has a similar concept, ID, which is used for the same purpose.
However, Terraform resources do not use a single common field in attributes as the ID; rather, the ID format is documented per resource. See https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/route53_zone#import
From import documentation:
ID is dependent on the resource type being imported. For example, for AWS instances it is the instance ID (i-abcd1234) but for AWS Route53 zones it is the zone ID (Z12ABC4UGMOZ2N). Please reference the provider documentation for details on the ID format. If you’re unsure, feel free to just try an ID. If the ID is invalid, you’ll just receive an error message.
We need to figure out how we can auto-generate this information and use it when setting the external-name annotation.
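One possible shape for this is a per-resource-type registry of ID builders, since the ID format cannot be derived from a common field. Everything here (the function type, map, and composite format) is an illustrative assumption:

```go
package main

import "fmt"

// GetIDFn computes the Terraform import ID for a resource from its
// external-name annotation and spec parameters.
type GetIDFn func(externalName string, parameters map[string]interface{}) string

// idFns maps Terraform resource types to their ID builders, mirroring the
// fact that the ID format is documented per resource.
var idFns = map[string]GetIDFn{
	// For instances, the external name is the ID itself, e.g. i-abcd1234.
	"aws_instance": func(name string, _ map[string]interface{}) string {
		return name
	},
	// Some resources compose the ID from several fields; the colon-separated
	// format here is an assumed example.
	"aws_eks_node_group": func(name string, p map[string]interface{}) string {
		return fmt.Sprintf("%v:%v", p["cluster_name"], name)
	},
}

func main() {
	fmt.Println(idFns["aws_instance"]("i-abcd1234", nil))
	fmt.Println(idFns["aws_eks_node_group"]("ng-1", map[string]interface{}{"cluster_name": "prod"}))
}
```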
Right now, at this scale (700+ resources for AWS, 650+ for Azure, etc.), we cannot write an example YAML for each resource by hand. But we need them for documentation, Crossplane Conformance, and just to make sure resource-specific behavior works, like the external-name -> id field mapping.
We can have a tool that takes an HCL example and prints its YAML version using the same strcase instance that our schema generator uses. The source of input would depend on the provider, but for example AWS stores examples in their repo directly, though it's not a complete list.
In order to implement a code generator pipeline using Terrajet, authors need to wire up a bunch of generators. While this allows flexibility, it's not really necessary in some cases.
We can have a Pipeline struct that accepts options for customization, while keeping the current generators for advanced cases. However, we need to solidify the initial set of providers to have a close-to-complete idea of what kinds of options we'd like to expose. Beyond a certain threshold, the abstraction may not be worth it.
Terrajet is a new project that needs visibility. We should create a blog post announcing this new project alongside the three new providers.
Create a blog post that describes the idea and the status.
Users don't know how to generate a TF-based provider.
We can have a step-by-step guide for generating a new TF-based provider using terrajet.
goimports is run before the file is written to disk, so it's hard to see a problem at a glance. If terrajet allowed configuration of the linter, generators could expose it as a flag for debugging.
I am trying to create a VPC with provider-tf-aws, but it is not getting ready. When I describe it, I see the following output:
Name: sample-vpc
Namespace:
Labels: <none>
Annotations: <none>
API Version: vpc.aws.tf.crossplane.io/v1alpha1
Kind: Vpc
Metadata:
Creation Timestamp: 2021-09-01T09:41:19Z
Finalizers:
finalizer.managedresource.crossplane.io
Generation: 1
# omit ManagedFields
Resource Version: 2817
UID: 578e6a58-636b-4e2f-82bd-5bf1c691e9fc
Spec:
Deletion Policy: Delete
For Provider:
Cidr Block: 10.0.0.0/16
Region: us-east-2
Tags:
Name: DemoVpc
Tags All:
Name: DemoVpc
Provider Config Ref:
Name: example
Status:
At Provider:
Arn:
Default Network Acl Id:
Default Route Table Id:
Default Security Group Id:
Dhcp Options Id:
ipv6AssociationId:
ipv6CidrBlock:
Main Route Table Id:
Owner Id:
Conditions:
Last Transition Time: 2021-09-01T09:41:25Z
Reason: Creating
Status: False
Type: Ready
Last Transition Time: 2021-09-01T09:42:54Z
Reason: ReconcileSuccess
Status: True
Type: Synced
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal CreatedExternalResource 6s (x10 over 95s) managed/vpc.vpc.aws.tf.crossplane.io Successfully requested creation of external resource
Warning CannotCreateExternalResource 6s (x3 over 82s) managed/vpc.vpc.aws.tf.crossplane.io failed to update: cannot update with tf cli: failed to apply Terraform configuration:
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# aws_vpc.sample-vpc will be created
+ resource "aws_vpc" "sample-vpc" {
+ arn = (known after apply)
+ assign_generated_ipv6_cidr_block = false
+ cidr_block = "10.0.0.0/16"
+ default_network_acl_id = (known after apply)
+ default_route_table_id = (known after apply)
+ default_security_group_id = (known after apply)
+ dhcp_options_id = (known after apply)
+ enable_classiclink = (known after apply)
+ enable_classiclink_dns_support = (known after apply)
+ enable_dns_hostnames = (known after apply)
+ enable_dns_support = true
+ id = (known after apply)
+ instance_tenancy = "default"
+ ipv6_association_id = (known after apply)
+ ipv6_cidr_block = (known after apply)
+ main_route_table_id = (known after apply)
+ owner_id = (known after apply)
+ tags = {
+ "Name" = "DemoVpc"
}
+ tags_all = {
+ "Name" = "DemoVpc"
}
}
Plan: 1 to add, 0 to change, 0 to destroy.
aws_vpc.sample-vpc: Creating...
However, when I check the controller logs, I can see the actual problem buried in a long log message:
2021-09-01T12:43:01.760+0300 INFO provider-tf-aws Failed to run Terraform CLI {"tfcli-version": "0.0.0", "args": ["apply", "-auto-approve", "-input=false"], "executable": "terraform", "cwd": "/var/folders/jb/zwwlz42935308fcydsp4h05h0000gn/T/ws-93c57199f76ed373701483fb75b20d8d22840904b04ce4f7a1093882131d1b99", "stderr": "\u001b[31m╷\u001b[0m\u001b[0m\n\u001b[31m│\u001b[0m \u001b[0m\u001b[1m\u001b[31mError: \u001b[0m\u001b[0m\u001b[1mError creating VPC: VpcLimitExceeded: The maximum number of VPCs has been reached.\n\u001b[31m│\u001b[0m \u001b[0m\tstatus code: 400, request id: e26f86cf-8bb1-40f3-a5f8-6c9e36351b52\u001b[0m\n\u001b[31m│\u001b[0m \u001b[0m\n\u001b[31m│\u001b[0m \u001b[0m\u001b[0m with aws_vpc.sample-vpc,\n\u001b[31m│\u001b[0m \u001b[0m on main.tf.json line 20, in resource.aws_vpc.sample-vpc.tags_all:\n\u001b[31m│\u001b[0m \u001b[0m 20: \"sample-vpc\": {\"assign_generated_ipv6_cidr_block\":null,\"cidr_block\":\"10.0.0.0/16\",\"enable_classiclink\":null,\"enable_classiclink_dns_support\":null,\"enable_dns_hostnames\":null,\"enable_dns_support\":null,\"instance_tenancy\":null,\"tags\":{\"Name\":\"DemoVpc\"},\"tags_all\":{\"Name\":\"DemoVpc\"}\u001b[4m}\u001b[0m\u001b[0m\n\u001b[31m│\u001b[0m \u001b[0m\n\u001b[31m╵\u001b[0m\u001b[0m\n", "stdout": "\nTerraform used the selected providers to generate the following execution\nplan. 
Resource actions are indicated with the following symbols:\n \u001b[32m+\u001b[0m create\n\u001b[0m\nTerraform will perform the following actions:\n\n\u001b[1m # aws_vpc.sample-vpc\u001b[0m will be created\u001b[0m\u001b[0m\n\u001b[0m \u001b[32m+\u001b[0m\u001b[0m resource \"aws_vpc\" \"sample-vpc\" {\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0marn\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0massign_generated_ipv6_cidr_block\u001b[0m\u001b[0m = false\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mcidr_block\u001b[0m\u001b[0m = \"10.0.0.0/16\"\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mdefault_network_acl_id\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mdefault_route_table_id\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mdefault_security_group_id\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mdhcp_options_id\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0menable_classiclink\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0menable_classiclink_dns_support\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0menable_dns_hostnames\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0menable_dns_support\u001b[0m\u001b[0m = true\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mid\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0minstance_tenancy\u001b[0m\u001b[0m = \"default\"\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mipv6_association_id\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mipv6_cidr_block\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mmain_route_table_id\u001b[0m\u001b[0m = (known after 
apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mowner_id\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mtags\u001b[0m\u001b[0m = {\n \u001b[32m+\u001b[0m \u001b[0m\"Name\" = \"DemoVpc\"\n }\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mtags_all\u001b[0m\u001b[0m = {\n \u001b[32m+\u001b[0m \u001b[0m\"Name\" = \"DemoVpc\"\n }\n }\n\n\u001b[0m\u001b[1mPlan:\u001b[0m 1 to add, 0 to change, 0 to destroy.\n\u001b[0m\u001b[0m\u001b[1maws_vpc.sample-vpc: Creating...\u001b[0m\u001b[0m\n", "error": "exit status 1"}
Which is indeed:
Error creating VPC: VpcLimitExceeded: The maximum number of VPCs has been reached.
We need to make sure that actual errors are returned as events on the managed resource.
Also, similar to other Crossplane providers, per-resource logs should be hidden at the info level (i.e., emitted only at the debug level).
We used to put every CRD into its own package, hence each had its own zz_terraformed file, but now we don't do that. We can generate a single zz_generated.terraformed.go file for the whole group, similar to zz_managed and zz_deepcopy.
There are some differences between Crossplane and Terraform in which fields are passed as provider configuration versus resource specification. A good example is the "region" field for AWS, which is part of the provider configuration on the Terraform side but part of the resource spec in Crossplane.
We need to discuss how to handle these differences and presumably follow Crossplane's conventions, which would require some changes on the schema generation and conversion side.
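For illustration, following Crossplane's convention would mean that a region taken from the managed resource's spec.forProvider has to be injected into the provider block of the generated Terraform JSON configuration. A hypothetical main.tf.json fragment (values are assumptions, and this is not necessarily the exact shape terrajet emits):

```json
{
  "provider": {
    "aws": {
      "region": "us-east-2"
    }
  },
  "resource": {
    "aws_vpc": {
      "sample-vpc": {
        "cidr_block": "10.0.0.0/16",
        "tags": { "Name": "DemoVpc" }
      }
    }
  }
}
```

The conversion layer would be responsible for moving such fields out of the resource body and into the provider block at generation time.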
Terraform allows users to add custom lifecycle metadata to a resource block. Since we manage a single resource with one Terraform workspace, and that resource shouldn't be deleted by any Apply call, we need to include prevent_destroy and return an error if the user changes an immutable field that requires recreation.
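In the generated Terraform JSON configuration that could look like the following sketch (illustrative fragment; the resource and field values are assumptions):

```json
{
  "resource": {
    "aws_vpc": {
      "sample-vpc": {
        "cidr_block": "10.0.0.0/16",
        "lifecycle": {
          "prevent_destroy": true
        }
      }
    }
  }
}
```

With prevent_destroy set, terraform apply fails instead of destroying and recreating the resource when an immutable field changes, and that failure can be surfaced as an error on the managed resource.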
We use the exec package to call the Terraform CLI, and there isn't a utility like afero to replace it with test structs, so it's really cumbersome to test functions that make CLI calls.
We could implement a utility similar to afero, but for CLI calls.
The common controller is a layer sitting between the generated schema and the async Terraform CLI library, mostly dealing with conversions between Crossplane and Terraform and with a proper implementation of the Crossplane managed reconciler interface functions.
The current implementation has mostly been tested against some early implementations (open PRs, assumptions) and happy-path scenarios (mostly AWS VPC).
We need to revisit the existing implementation with more resources and edge-case scenarios.
Another point: the implementation of this layer could be affected by the outcome of this discussion, so it might make sense to take that into account while working here.
When I delete all generated files from the apis folder, the following side-effect import throws an error:
_ "github.com/crossplane-contrib/provider-tf-aws/apis"
imports github.com/crossplane-contrib/provider-tf-aws/apis: build constraints exclude all Go files in /Users/monus/go/src/github.com/crossplane/provider-tf-aws/apis
I think we should have a non-generated Go file in the apis folder that contains a config.Provider object populated by the init() calls of the custom.go files in the CRD packages, instead of targeting the Provider instance in the Terrajet package.
find apis -iname 'zz_*' | xargs rm -rf && find internal -iname 'zz_*' | xargs rm -rf
go run cmd/generator/main.go