crossplane / terrajet
Generate Crossplane Providers from any Terraform Provider
Home Page: https://crossplane.io
License: Apache License 2.0
The common controller works for our prototypes, but we need more experiments to assess its scalability. There are certain issues that need some data before we take action, like #38.
We can build a complex composition with 10+ resources and create 10 XRs using it, then check resource usage and errors across the board to see whether we hit any limit. We can set resource limits on the controller's Deployment and see at what point we start to see context deadline exceeded errors, meaning we can't complete a single reconciliation pass within those limits.
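To make the experiment concrete, here is a minimal sketch of the kind of resource limits we could put on the controller Deployment and then tighten step by step; the names and values are purely illustrative, not recommendations:

```yaml
# Illustrative limits for the provider controller Deployment; lower these
# gradually until reconciliation starts hitting context deadline exceeded.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: provider-controller
spec:
  template:
    spec:
      containers:
        - name: provider
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
```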
I believe if we copy the .github directory over from crossplane/crossplane, we'll automatically get CI/CD, issue and PR templates, etc. for this repo.
Today if you run kubectl apply -f package/crds in provider-tf-aws, your cluster gets really slow. In GKE, the kubectl command simply stops responding after around 50 CRDs.
We have some ideas around sharding the controllers and API types, allowing customers to install only a subset of them. But we haven't identified the actual problem yet, so we need to pin down the root cause before choosing a solution and keep that problem in mind for future designs.
When testing provider-tf-azure's KubernetesCluster MR, we have observed that the provisioned cluster's credentials, e.g. the KubeConfig, are available in the Terraform state as regular fields. An example is:
{
"version": 4,
"terraform_version": "1.0.5",
"serial": 3,
"lineage": "365fcd38-09a3-e30a-cb02-df3117306af0",
"outputs": {},
"resources": [
{
"mode": "managed",
"type": "azurerm_kubernetes_cluster",
"name": "example",
"provider": "provider[\"registry.terraform.io/hashicorp/azurerm\"]",
"instances": [
{
"schema_version": 0,
"attributes": {
...
"kube_config": [
{
"client_certificate":
Because these attributes are resource dependent, we need to be able to treat them as credentials in a resource-specific manner. In the Terraform provider resource schema, such fields are marked as sensitive, like kube_config_raw or kube_config.password.
Fields marked as sensitive are also good candidates for being part of the connection secrets.
Provisioning a KubernetesCluster using https://github.com/crossplane-contrib/provider-tf-azure/blob/258830e0bc0627f2749bcd7e285e3c831849a04a/examples/kubernetes/kubernetescluster.yaml reveals the issue.
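A minimal sketch of the idea, assuming we know which top-level attribute names the Terraform schema marks as sensitive; the function name and map shape are hypothetical, not terrajet's actual API:

```go
package main

import "fmt"

// removeSensitive strips top-level attributes that the Terraform resource
// schema marks as Sensitive, so they can be routed to a connection secret
// instead of being persisted as plain state. (Nested paths such as
// kube_config.password would need path traversal; omitted for brevity.)
func removeSensitive(attrs map[string]interface{}, sensitive []string) map[string]interface{} {
	out := map[string]interface{}{}
	for k, v := range attrs {
		out[k] = v
	}
	for _, name := range sensitive {
		delete(out, name)
	}
	return out
}

func main() {
	attrs := map[string]interface{}{
		"kube_config_raw": "SECRET",
		"fqdn":            "example.azmk8s.io",
	}
	cleaned := removeSensitive(attrs, []string{"kube_config_raw"})
	fmt.Println(len(cleaned), cleaned["fqdn"]) // only the non-sensitive field remains
}
```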
There are some APIs that allow you to manage the same object in two different ways; for example, you can manage a NodeGroup through its own API but also as an array under the Cluster object. Per our Managed Resources API Design, we made the decision that in such cases we'd represent the resource in only one way, and that would be as a separate resource.
It seems the Terraform community also tries to follow this pattern, but there are cases where they didn't do that in the past and now can't change the interface. So, what they do for such resources is show a warning, encouraging users to use the separate API.
Since we don't have a strong contract yet, we can omit those fields from the schema before generation, i.e. remove the nodeGroups field from the cluster object. We can do that easily by manipulating the TF schema we give to the CRD generator, but we need to make sure it works well across the board by testing it manually and adding the instructions to the guide that is to be written.
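A sketch of what "manipulating the TF schema before generation" could look like. The map type is simplified to keep the example self-contained (the real schema would be map[string]*schema.Schema from the Terraform plugin SDK), and the field names are illustrative:

```go
package main

import "fmt"

// omitFields deletes the given top-level field names from a Terraform
// resource schema map before it is handed to the CRD generator, e.g. to
// drop an embedded node_groups array in favor of a separate resource.
func omitFields(tfSchema map[string]interface{}, fields ...string) {
	for _, f := range fields {
		delete(tfSchema, f)
	}
}

func main() {
	// Simplified stand-in for a cluster resource schema.
	cluster := map[string]interface{}{
		"name":        nil,
		"node_groups": nil,
	}
	omitFields(cluster, "node_groups")
	fmt.Println(len(cluster)) // node_groups is gone before generation
}
```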
In PR #76, @muvaf implemented two different modes for Apply and Destroy operations, namely sync and async.
Currently, the operation mode per resource is expected as input from provider developers. However, the Terraform resource schema already has a per-resource ResourceTimeout, set by the provider authors. We can leverage this information to decide which mode to choose by default (while still considering a configuration option to override it).
For example, AWS RDS cluster has a default of 120 mins and we could choose the async mode. For resources that have no timeouts set or timeouts less than a sensible threshold, e.g. 5 mins, we can choose the sync mode.
Leverage ResourceTimeout to decide which operation mode to use by default.
Currently we generate the controllers' setup functions but never call the setup of the individual controllers.
This sounds like a similar problem to #20 and we might consider introducing a similar mechanism to register controllers.
tfcli runtime needs the Terraform provider source and version at runtime for correctly generating the corresponding provider block in per-resource Terraform configuration. We need to capture this information during container image build time and make it available to the tfcli runtime.
The source of truth for the provider source and version constraint could be two env. variables, as both pieces of information are also needed when preparing the local Terraform image mirror. Then, as discussed previously, we could read those env. variables injected into the container's runtime in the main function, and have their values flow into the tfcli runtime via the following path: main -> (generated) controller.Setup -> <(generated) resource>.Setup -> terraform.SetupController, stored in terraform.connector to be used by tfcli.
https://github.com/iancoleman/strcase#custom-acronyms-for-tocamel--tolowercamel
If we need to translate snake case Terraform fields to camel case, it may be nice to teach our translation code a few common acronyms so it can capitalize them correctly. That said, in some cases (e.g. AWS) I recall that they don't capitalize acronyms, so perhaps it's better to just match their API as closely as possible there. 🤷
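The linked README describes strcase.ConfigureAcronym for this; to keep the sketch self-contained, here is a stdlib-only illustration of the same idea, with an assumed acronym table:

```go
package main

import (
	"fmt"
	"strings"
)

// acronyms maps lowercase tokens to their preferred capitalization.
// A real implementation could register these via strcase.ConfigureAcronym
// instead of rolling its own conversion.
var acronyms = map[string]string{
	"id":   "ID",
	"dns":  "DNS",
	"arn":  "ARN",
	"ipv6": "IPv6",
}

// toCamel converts a snake_case Terraform field name to CamelCase,
// capitalizing known acronyms along the way.
func toCamel(s string) string {
	var b strings.Builder
	for _, tok := range strings.Split(s, "_") {
		if tok == "" {
			continue
		}
		if a, ok := acronyms[tok]; ok {
			b.WriteString(a)
			continue
		}
		b.WriteString(strings.ToUpper(tok[:1]) + tok[1:])
	}
	return b.String()
}

func main() {
	fmt.Println(toCamel("vpc_id"))          // VpcID
	fmt.Println(toCamel("ipv6_cidr_block")) // IPv6CidrBlock
}
```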
In the initial implementation, we are planning to store terraform states as annotations for the managed resource.
etcd (i.e. the annotation) will be authoritative in terms of the state, and we avoid making any changes/edits to it except stripping out sensitive attributes. This is the most conservative approach: using Terraform just like an end user would.
However, we want to investigate whether we can safely (re)build the state information from the spec, status, and kind information of the resource and completely remove the annotation.
To better describe the problem, see the following tfstate example:
{
"version": 4,
"terraform_version": "1.0.4",
"serial": 5,
"lineage": "f36b8453-5686-97c3-af3e-6a5c655c5fbf",
"outputs": {},
"resources": [
{
"mode": "managed",
"type": "aws_vpc",
"name": "instance_vpc_hasan",
"provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",
"instances": [
{
"schema_version": 1,
"attributes": {
"arn": "arn:aws:ec2:us-west-2:609897127049:vpc/vpc-0b396ddd9469f29e5",
"assign_generated_ipv6_cidr_block": false,
"cidr_block": "10.0.0.0/16",
"default_network_acl_id": "acl-08e8f435b74af56e8",
"default_route_table_id": "rtb-0eb8f2b9933d31175",
"default_security_group_id": "sg-034d7899a57871f29",
"dhcp_options_id": "dopt-0c388774",
"enable_classiclink": false,
"enable_classiclink_dns_support": false,
"enable_dns_hostnames": false,
"enable_dns_support": true,
"id": "vpc-0b396ddd9469f29e5",
"instance_tenancy": "default",
"ipv6_association_id": "",
"ipv6_cidr_block": "",
"main_route_table_id": "rtb-0eb8f2b9933d31175",
"owner_id": "609897127049",
"tags": {
"Name": "ExampleVPCInstanceByHasan"
},
"tags_all": {
"Name": "ExampleVPCInstanceByHasan"
}
},
"sensitive_attributes": [],
"private": "eyJzY2hlbWFfdmVyc2lvbiI6IjEifQ=="
}
]
}
]
}
The `resources[0].instances[0].attributes` field is a combination of:
- `spec.forProvider`
- `status.atProvider`
- the `external_name` annotation (`id` here)

The `resources[0].instances[0].sensitive_attributes` field would be available in the connection secret (if required during apply). `resources[0].type`, `resources[0].name` and `resources[0].provider` would also be available during reconciliation.

Two caveats here:
- `spec.forProvider` contains the desired state, not the last applied state, which could differ from what tfstate typically contains.
- We cannot rebuild `lineage`, `serial` and `resources[0].instances[0].private` exactly the same as the original, which would require persisting them as well.

When I gave provider-tf-azure's KubernetesCluster a try, the provider could not initially update the observed state at status.atProvider, because in the generated v1alpha1.KubernetesClusterObservation struct all the fields are currently required. However, the values of most fields, like KubeConfig, are not initially available.
Provisioning a KubernetesCluster using https://github.com/crossplane-contrib/provider-tf-azure/blob/258830e0bc0627f2749bcd7e285e3c831849a04a/examples/kubernetes/kubernetescluster.yaml reveals the issue.
While I was testing #76, I used RdsCluster as the async example. After creation, the late initializer always reported true, meaning a spec update was needed; because of that, the resource never got ready, since the spec update erases changes in the status.
I added the YAML I used. You can see it never gets ready, and if you debug, you'll see that the late initializer reports true all the time.
In Crossplane managed resources, not all fields in the spec are required during creation time.
For optional fields, we need to properly parse attributes in Terraform state into generated spec schema.
We use the Terraform schema to generate the type of every field. While Terraform has one big untyped tree of fields to hold all the information, that's not the case in Crossplane. In line with Kubernetes API conventions, we separate fields into spec and status, the status fields being ones that the user cannot ever configure.
In the Terraform schema, fields report whether they are Computed and Optional. Fields that are both Computed and Optional are the ones that have server-side defaults and are user-configurable; these are the ones we late-initialize. So we put those under spec, while fields that are Computed and not Optional go to status.
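The classification rule above can be sketched as a tiny predicate (function and return values are illustrative, not terrajet's actual code):

```go
package main

import "fmt"

// classify applies the rule described above: a field that is Computed and
// not Optional goes to status (observation); everything else, including
// Computed && Optional (late-initialized server-side defaults), goes to
// spec (parameters).
func classify(computed, optional bool) string {
	if computed && !optional {
		return "status"
	}
	return "spec"
}

func main() {
	fmt.Println(classify(true, false)) // status: e.g. an arn field
	fmt.Println(classify(true, true))  // spec: late-initialized default
	fmt.Println(classify(false, true)) // spec: plain optional input
}
```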
The problem is that there can be types that are Computed and not Optional but also have nested fields that should go under spec, and vice versa. Right now, when we compute the type for a field, we return a parameters type as well as an observation type, then decide which one to use depending on the properties of the field. If the field goes to spec but there are nested fields that could go to status, those nested fields are eliminated. Likewise, if a field goes to status, all nested fields go to status as well and the parameter types are ignored.
When I did an analysis of how prevalent this problem is in, for example, AWS, I see the following output:
there are 1 fields that could be observation type of .SyntheticsCanary.VpcConfig but ignored since the field is a parameters field
there are 2 fields that could be observation type of .AmplifyDomainAssociation.SubDomain but ignored since the field is a parameters field
there are 1 fields that could be observation type of .StoragegatewayGateway.SmbActiveDirectorySettings but ignored since the field is a parameters field
there are 1 fields that could be observation type of .LexBotAlias.ConversationLogs.LogSettings but ignored since the field is a parameters field
there are 1 fields that could be observation type of .Route53ResolverEndpoint.IpAddress but ignored since the field is a parameters field
there are 1 fields that could be observation type of .LambdaFunction.VpcConfig but ignored since the field is a parameters field
there are 1 fields that could be observation type of .MwaaEnvironment.LoggingConfiguration.DagProcessingLogs but ignored since the field is a parameters field
there are 1 fields that could be observation type of .MwaaEnvironment.LoggingConfiguration.SchedulerLogs but ignored since the field is a parameters field
there are 1 fields that could be observation type of .MwaaEnvironment.LoggingConfiguration.TaskLogs but ignored since the field is a parameters field
there are 1 fields that could be observation type of .MwaaEnvironment.LoggingConfiguration.WebserverLogs but ignored since the field is a parameters field
there are 1 fields that could be observation type of .MwaaEnvironment.LoggingConfiguration.WorkerLogs but ignored since the field is a parameters field
there are 4 fields that could be parameters type of .ElasticBeanstalkEnvironment.AllSettings but ignored since the field is an observation field
there are 2 fields that could be observation type of .EksCluster.VpcConfig but ignored since the field is a parameters field
there are 2 fields that could be observation type of .DirectoryServiceDirectory.ConnectSettings but ignored since the field is a parameters field
there are 1 fields that could be observation type of .DirectoryServiceDirectory.VpcSettings but ignored since the field is a parameters field
there are 3 fields that could be observation type of .EmrCluster.CoreInstanceFleet but ignored since the field is a parameters field
there are 1 fields that could be observation type of .EmrCluster.CoreInstanceGroup but ignored since the field is a parameters field
there are 3 fields that could be observation type of .EmrCluster.MasterInstanceFleet but ignored since the field is a parameters field
there are 1 fields that could be observation type of .EmrCluster.MasterInstanceGroup but ignored since the field is a parameters field
there are 2 fields that could be observation type of .Kinesisanalyticsv2Application.ApplicationConfiguration.SqlApplicationConfiguration.Input but ignored since the field is a parameters field
there are 1 fields that could be observation type of .Kinesisanalyticsv2Application.ApplicationConfiguration.SqlApplicationConfiguration.Output but ignored since the field is a parameters field
there are 1 fields that could be observation type of .Kinesisanalyticsv2Application.ApplicationConfiguration.SqlApplicationConfiguration.ReferenceDataSource but ignored since the field is a parameters field
there are 2 fields that could be observation type of .Kinesisanalyticsv2Application.ApplicationConfiguration.VpcConfiguration but ignored since the field is a parameters field
there are 1 fields that could be observation type of .Kinesisanalyticsv2Application.CloudwatchLoggingOptions but ignored since the field is a parameters field
there are 3 fields that could be observation type of .SecretsmanagerSecret.Replica but ignored since the field is a parameters field
there are 4 fields that could be parameters type of .SsmDocument.Parameter but ignored since the field is an observation field
there are 1 fields that could be observation type of .Alb.SubnetMapping but ignored since the field is a parameters field
there are 1 fields that could be observation type of .SpotInstanceRequest.EbsBlockDevice but ignored since the field is a parameters field
there are 2 fields that could be observation type of .SpotInstanceRequest.RootBlockDevice but ignored since the field is a parameters field
there are 1 fields that could be observation type of .Lb.SubnetMapping but ignored since the field is a parameters field
there are 2 fields that could be observation type of .ElasticsearchDomain.VpcOptions but ignored since the field is a parameters field
there are 2 fields that could be observation type of .Apigatewayv2DomainName.DomainNameConfiguration but ignored since the field is a parameters field
there are 1 fields that could be observation type of .NetworkInterface.Attachment but ignored since the field is a parameters field
there are 1 fields that could be observation type of .Instance.EbsBlockDevice but ignored since the field is a parameters field
there are 2 fields that could be observation type of .Instance.RootBlockDevice but ignored since the field is a parameters field
there are 1 fields that could be observation type of .CodestarnotificationsNotificationRule.Target but ignored since the field is a parameters field
there are 2 fields that could be observation type of .CodeartifactRepository.ExternalConnections but ignored since the field is a parameters field
there are 8 fields that could be observation type of .AmiCopy.EbsBlockDevice but ignored since the field is a parameters field
there are 2 fields that could be observation type of .AmiCopy.EphemeralBlockDevice but ignored since the field is a parameters field
there are 8 fields that could be observation type of .AmiFromInstance.EbsBlockDevice but ignored since the field is a parameters field
there are 2 fields that could be observation type of .AmiFromInstance.EphemeralBlockDevice but ignored since the field is a parameters field
there are 1 fields that could be observation type of .KinesisAnalyticsApplication.CloudwatchLoggingOptions but ignored since the field is a parameters field
there are 1 fields that could be observation type of .KinesisAnalyticsApplication.Inputs.Schema.RecordFormat but ignored since the field is a parameters field
there are 2 fields that could be observation type of .KinesisAnalyticsApplication.Inputs but ignored since the field is a parameters field
there are 1 fields that could be observation type of .KinesisAnalyticsApplication.Outputs but ignored since the field is a parameters field
there are 1 fields that could be observation type of .KinesisAnalyticsApplication.ReferenceDataSources.Schema.RecordFormat but ignored since the field is a parameters field
there are 1 fields that could be observation type of .KinesisAnalyticsApplication.ReferenceDataSources but ignored since the field is a parameters field
there are 1 fields that could be observation type of .KinesisFirehoseDeliveryStream.ElasticsearchConfiguration.VpcConfig but ignored since the field is a parameters field
there are 1 fields that could be observation type of .GlueCatalogTable.PartitionIndex but ignored since the field is a parameters field
Generated 770 resources!
There are thousands of fields generated for 770 resources, so the problem is not that big, at least for AWS. We see that almost all of the ignored fields are ones that live under a parameters type but should have gone to observation. So the problem does not cost users much configuration ability, but there is information they would not be able to see under status.
There can be several solutions. The most basic one is to include the full schema in both status and spec, but that would mean deviating from the Crossplane Resource Model even if unused fields stay empty, and it's not really worth doing this for all resources just to be correct in a minority of cases.
The best solution would be one that constructs the earlier fields and types under status once we encounter an observation type under a parameters type (and vice versa). However, this wasn't done in the first iteration, since the field path wasn't known by the schema builder. With #24 it will know the path.
So, we could make the type map a more complex tree with some utilities that allow adding a certain field at an arbitrary point and constructing the path from the root to that leaf by building the necessary types and fields. Right now, the map holds the list of types for uniqueness purposes, but the actual information is recursively saved in the Go type objects. So this solution would mean deviating from the Go types package to store the information, and that would require some refactoring.
I'd propose instead a utility that works on Go types library objects to manipulate the existing type trees, adding an arbitrary field at a given field path. There would be a function that knows the root types; when you give it arbitrary field information, it constructs the path in either of the two root structs (parameters and observation). There are some caveats about naming these new fields and types, but it's doable and lets us stay in the same type system instead of building our own.
Repo doesn't have a README where people can read and understand what it's for.
Add a README.
Looking at crossplane-contrib/provider-jet-aws#11 , we have some differences between TF based providers and native ones in terms of how they are packaged or organized structurally.
We can consider a provider-tf-template repo, but since that'd be another repo to maintain, we first need to come up with a list of things that differ from the native providers so we can see if it's worth it. A branch in provider-template could be another option.
PR #3 does not have any automated tests as of now. We would like to get it merged and track test implementation with this issue.
While having two different providers for the same API is doable, the UX is usually not great if you end up using both. It'd be great to have a single, complete one.
TF-based providers could reach full feature parity, and in that case Terraform would really be an implementation detail. So we can consider merging, for example, provider-tf-aws into provider-aws, covering only the missing resources, so that we can generate cross-resource references between hand-written, ACK-generated and TF-generated resources. And we'd have one provider to point to.
One drawback is that the API changes once we'd like to move the underlying resource implementation from TF to ACK or hand-written code, and vice versa.
Though we can discuss this only after the alpha version.
Right now we have a single provider binary, and all terraform invocations use that one, but each of them spins up a new provider server to talk to. We haven't observed noticeable issues, but it's likely that this will cause performance problems.
@ulucinar discovered a way to make Terraform CLI talk to the existing provider server. He didn't feel great about it for the initial pass but we might need to reconsider it after testing the providers with 100+ custom resources using composition.
Generated CRDs don't have some of the necessary tags, like // +kubebuilder:validation:Required. A comment printer functionality will be necessary, and it will help with printing documentation as well.
If there are two fields with the same name in different types of the CRD struct tree, they'd be assumed to have the same schema. It turns out this assumption is wrong; having the same field name does not always mean that the underlying schema is the same.
An example is the timeout field of the aws_appmesh_route resource; see here and here. Our builder assumed that once a type exists in the map there is no need to generate another one, and since it worked directly on a map, the chosen one would change between runs, producing unstable output.
All generator utilities are written with parallelization in mind at the CRD level, i.e. every CRD is generated completely separately. But terrajet doesn't expose an interface to generator implementations for this, such as the allocated core count.
The library we use for camel case conversions allows customization by adding acronyms in init(). We should have a default set of custom acronyms (like DNS, IPv6, ID, etc.) and let providers add their own if need be.
crossplane/crossplane-tools#35 lets us generate reference resolvers, but it still needs the Reference and Selector fields generated beforehand.
Make the schema builder add those two fields each time it sees the comment marker the resolver generator uses.
Terrajet providers have a different Dockerfile and several other differences that require lots of manual plumbing to make provider-template work.
We can have a provider-tf-template that people could just use to bootstrap a Terrajet-based provider.
The current readme doesn't provide any instructions on how to get started.
We should provide an overview of how Terrajet works and a getting-started guide on how to generate a Crossplane provider based on a Terraform provider.
Crossplane provides a way to connect to cloud resources (if applicable) by writing the details required to connect into a Kubernetes secret.
These details should include all fields needed to connect, including sensitive ones like username and password as well as not-necessarily-sensitive ones like the port.
We would need to define/implement a way to generate these connection details with terrajet.
I am assuming this information could be composed using the attributes and sensitive_attributes fields in the Terraform state.
See #36 for details.
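Under that assumption, the composition could look roughly like the following sketch; the function name, key lists, and payload shape are all hypothetical:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// connectionDetails assembles the connection secret payload from Terraform
// state attributes: every sensitive attribute, plus an allow-list of
// non-sensitive ones (like port) that are still needed to connect.
func connectionDetails(attrs map[string]interface{}, sensitiveKeys, extraKeys []string) map[string][]byte {
	out := map[string][]byte{}
	add := func(k string) {
		if v, ok := attrs[k]; ok {
			b, _ := json.Marshal(v) // secrets hold raw bytes
			out[k] = b
		}
	}
	for _, k := range sensitiveKeys {
		add(k)
	}
	for _, k := range extraKeys {
		add(k)
	}
	return out
}

func main() {
	attrs := map[string]interface{}{
		"username": "admin",
		"password": "hunter2",
		"port":     5432,
	}
	d := connectionDetails(attrs, []string{"password"}, []string{"username", "port"})
	fmt.Println(len(d), string(d["port"]))
}
```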
While some of the ideas do not require memory-based bookkeeping, it's a solution to the problem where we need to make the same call to retrieve its results. In other words, we could make it work with filesystem locks but I don't believe we need to bear the cost of that complexity if we can do it in memory in a self-updating way.
The main point of this issue is that the layer the ExternalClient implementation interacts with should look very similar to a contemporary cloud SDK API, so that we don't break the assumptions of the generic reconciler. This includes simplifying the inner workings of that SDK as well, by reducing the number of layers we added in order to be able to work in parallel during the initial phase.
Currently we generate the CRDs, but we also need to add their types to the scheme so that they're registered with the controller manager's cache.
During the initial implementation, in order to move fast and see our limits, we deferred some of the discussion about the implementation, and how we should do the bookkeeping was one of those topics. Currently, we manage the state and bookkeeping using the filesystem. However, we also interact with Terraform's lock and state files, which makes the code complex for readers to understand, and we don't really need to keep that information outside of memory since a crash of the process means a restart of the Pod anyway. Additionally, there are opportunities to merge some of the functionality in the code into a simpler flow that is easier to read and change.
I suggest that we use a global map that does the bookkeeping of the running CLI executions. Roughly, I think we need the following improvements in the initial implementation. Please feel free to add more things that we had deferred to after PoC phase. cc @ulucinar @turkenh
- We can have a `map[string]Workspace` whose key is `metadata.uuid`.
- Every goroutine that works on a `Workspace` should update the map before it starts and after it completes, similar to today.
- We can have a `Describe` method on that `Workspace` object that tells you the current status, i.e. `creating`, `destroying` or `no-op`.
  - Today we expect the `apply` pipeline to be run repeatedly to check the status. But no other provider does that except some of the Azure APIs, and it is not user friendly at all in a controller, so we run the risk of violating the generic reconciler's assumptions.
  - Today we call `Create` repeatedly because we need to make the same `Apply` call to finalize it, but since crossplane/crossplane-runtime#283 this won't be allowed, because the generic reconciler won't call `Create` again in the grace period.
- The `map[string]Workspace` can be treated just like a cloud provider SDK client. We don't need to know Terraform CLI or pipeline/operation details, or propagate status as errors, etc. We can use `map[string]interface{}` to pass down the spec configuration and receive `StateV4` plus information about whether it's `creating` or `destroying`.
- Have the `ExternalClient` implementation directly call a single interface. If you look at it today, it's merely a shim on top of the conversion layer, and since we use similar names it is really hard to follow the layers.
- Reporting the current status should be the job of `Describe` rather than being part of `ApplyResult`; the resulting set of functions is very similar to what we use in native provider `ExternalClient` implementations.

Note that some of the technical debt was accrued on purpose, because we needed to be able to work in parallel during the PoC phase and discover things from scratch. Before declaring terrajet prod-ready and solidifying the implementation with more tests, I think it's time to revisit the parts we completed during that phase and refactor them into a more cohesive structure. Thanks to that initial effort, we know what works, how Terraform behaves, and the rough performance implications of our decisions.
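A minimal sketch of the proposed bookkeeping map and Describe method; the type names, statuses, and locking scheme are assumptions for illustration, not terrajet's actual implementation:

```go
package main

import (
	"fmt"
	"sync"
)

// Status is what Describe reports for an in-flight CLI execution.
type Status string

const (
	Creating   Status = "creating"
	Destroying Status = "destroying"
	NoOp       Status = "no-op"
)

// Workspace tracks a single Terraform CLI execution for one resource.
type Workspace struct {
	mu     sync.Mutex
	status Status
}

// Describe reports the current status of the workspace.
func (w *Workspace) Describe() Status {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.status
}

// Store is the global map keyed by metadata.uuid.
type Store struct {
	mu  sync.Mutex
	wss map[string]*Workspace
}

func NewStore() *Store { return &Store{wss: map[string]*Workspace{}} }

// Begin registers (or fetches) the workspace for a resource and records its
// new status before a goroutine starts working on it.
func (s *Store) Begin(uid string, st Status) *Workspace {
	s.mu.Lock()
	defer s.mu.Unlock()
	w, ok := s.wss[uid]
	if !ok {
		w = &Workspace{}
		s.wss[uid] = w
	}
	w.mu.Lock()
	w.status = st
	w.mu.Unlock()
	return w
}

func main() {
	s := NewStore()
	s.Begin("uuid-123", Creating)
	fmt.Println(s.wss["uuid-123"].Describe())
}
```

The ExternalClient would only ever see Begin/Describe-style calls, which is what makes the map feel like a cloud SDK client.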
Crossplane uses external-name annotation to uniquely identify external resource (typically living in cloud API): https://crossplane.io/docs/v0.11/introduction/managed-resources.html#external-name
Terraform has a similar concept, ID, which is used for the same purpose.
However, Terraform resources do not use a single common field in attributes as the ID; rather, the ID format is documented per resource. See https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/route53_zone#import
From import documentation:
ID is dependent on the resource type being imported. For example, for AWS instances it is the instance ID (i-abcd1234) but for AWS Route53 zones it is the zone ID (Z12ABC4UGMOZ2N). Please reference the provider documentation for details on the ID format. If you’re unsure, feel free to just try an ID. If the ID is invalid, you’ll just receive an error message.
We need to figure out how we can auto-generate this information and use it when setting the external-name annotation.
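One possible shape for this is a per-resource-type registry of ID builders, since the ID format cannot be derived from a common field. Everything here (the function type, map, and composite format) is an illustrative assumption:

```go
package main

import "fmt"

// GetIDFn computes the Terraform import ID for a resource from its
// external-name annotation and spec parameters.
type GetIDFn func(externalName string, parameters map[string]interface{}) string

// idFns maps Terraform resource types to their ID builders, mirroring the
// fact that the ID format is documented per resource.
var idFns = map[string]GetIDFn{
	// For instances, the external name is the ID itself, e.g. i-abcd1234.
	"aws_instance": func(name string, _ map[string]interface{}) string {
		return name
	},
	// Some resources compose the ID from several fields; the colon-separated
	// format here is an assumed example.
	"aws_eks_node_group": func(name string, p map[string]interface{}) string {
		return fmt.Sprintf("%v:%v", p["cluster_name"], name)
	},
}

func main() {
	fmt.Println(idFns["aws_instance"]("i-abcd1234", nil))
	fmt.Println(idFns["aws_eks_node_group"]("ng-1", map[string]interface{}{"cluster_name": "prod"}))
}
```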
Right now, at this scale (700+ resources for AWS, 650+ for Azure, etc.), we cannot write an example YAML for each resource by hand. But we need them for documentation, Crossplane Conformance, and just to make sure resource-specific behavior works, like the external-name -> id field mapping.
We can have a tool that takes an HCL example and prints its YAML version using the same strcase instance that our schema generator uses. The source of input would depend on the provider, but for example AWS stores examples in their repo directly, though it's not a complete list.
In order to implement a code generator pipeline using Terrajet, authors need to wire up a bunch of generators. While this allows flexibility, it's not really necessary in some cases.
We can have a Pipeline struct that accepts options for customization, while keeping the current generators for advanced cases. However, we need to solidify the initial set of providers to have a close-to-complete idea of what kinds of options we'd like to expose. Beyond a certain threshold, the abstraction may not be worth it.
Terrajet is a new project that needs visibility. We should create a blog post announcing this new project alongside the three new providers.
Create a blog post that describes the idea and the status.
Users don't know how to generate a TF-based provider.
We can have a step-by-step guide for generating a new TF-based provider using terrajet.
goimports is run before the file is written to disk, so it's hard to see a problem at a glance. If terrajet allowed configuration of the linter, generators could expose it as a flag for debugging.
I am trying to create a VPC with provider-tf-aws, but it is not getting ready. When I describe it, I see the following output:
Name: sample-vpc
Namespace:
Labels: <none>
Annotations: <none>
API Version: vpc.aws.tf.crossplane.io/v1alpha1
Kind: Vpc
Metadata:
Creation Timestamp: 2021-09-01T09:41:19Z
Finalizers:
finalizer.managedresource.crossplane.io
Generation: 1
# omit ManagedFields
Resource Version: 2817
UID: 578e6a58-636b-4e2f-82bd-5bf1c691e9fc
Spec:
Deletion Policy: Delete
For Provider:
Cidr Block: 10.0.0.0/16
Region: us-east-2
Tags:
Name: DemoVpc
Tags All:
Name: DemoVpc
Provider Config Ref:
Name: example
Status:
At Provider:
Arn:
Default Network Acl Id:
Default Route Table Id:
Default Security Group Id:
Dhcp Options Id:
ipv6AssociationId:
ipv6CidrBlock:
Main Route Table Id:
Owner Id:
Conditions:
Last Transition Time: 2021-09-01T09:41:25Z
Reason: Creating
Status: False
Type: Ready
Last Transition Time: 2021-09-01T09:42:54Z
Reason: ReconcileSuccess
Status: True
Type: Synced
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal CreatedExternalResource 6s (x10 over 95s) managed/vpc.vpc.aws.tf.crossplane.io Successfully requested creation of external resource
Warning CannotCreateExternalResource 6s (x3 over 82s) managed/vpc.vpc.aws.tf.crossplane.io failed to update: cannot update with tf cli: failed to apply Terraform configuration:
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# aws_vpc.sample-vpc will be created
+ resource "aws_vpc" "sample-vpc" {
+ arn = (known after apply)
+ assign_generated_ipv6_cidr_block = false
+ cidr_block = "10.0.0.0/16"
+ default_network_acl_id = (known after apply)
+ default_route_table_id = (known after apply)
+ default_security_group_id = (known after apply)
+ dhcp_options_id = (known after apply)
+ enable_classiclink = (known after apply)
+ enable_classiclink_dns_support = (known after apply)
+ enable_dns_hostnames = (known after apply)
+ enable_dns_support = true
+ id = (known after apply)
+ instance_tenancy = "default"
+ ipv6_association_id = (known after apply)
+ ipv6_cidr_block = (known after apply)
+ main_route_table_id = (known after apply)
+ owner_id = (known after apply)
+ tags = {
+ "Name" = "DemoVpc"
}
+ tags_all = {
+ "Name" = "DemoVpc"
}
}
Plan: 1 to add, 0 to change, 0 to destroy.
aws_vpc.sample-vpc: Creating...
However, when I check the controller logs, I can see the actual problem buried in a long log message:
2021-09-01T12:43:01.760+0300 INFO provider-tf-aws Failed to run Terraform CLI {"tfcli-version": "0.0.0", "args": ["apply", "-auto-approve", "-input=false"], "executable": "terraform", "cwd": "/var/folders/jb/zwwlz42935308fcydsp4h05h0000gn/T/ws-93c57199f76ed373701483fb75b20d8d22840904b04ce4f7a1093882131d1b99", "stderr": "\u001b[31m╷\u001b[0m\u001b[0m\n\u001b[31m│\u001b[0m \u001b[0m\u001b[1m\u001b[31mError: \u001b[0m\u001b[0m\u001b[1mError creating VPC: VpcLimitExceeded: The maximum number of VPCs has been reached.\n\u001b[31m│\u001b[0m \u001b[0m\tstatus code: 400, request id: e26f86cf-8bb1-40f3-a5f8-6c9e36351b52\u001b[0m\n\u001b[31m│\u001b[0m \u001b[0m\n\u001b[31m│\u001b[0m \u001b[0m\u001b[0m with aws_vpc.sample-vpc,\n\u001b[31m│\u001b[0m \u001b[0m on main.tf.json line 20, in resource.aws_vpc.sample-vpc.tags_all:\n\u001b[31m│\u001b[0m \u001b[0m 20: \"sample-vpc\": {\"assign_generated_ipv6_cidr_block\":null,\"cidr_block\":\"10.0.0.0/16\",\"enable_classiclink\":null,\"enable_classiclink_dns_support\":null,\"enable_dns_hostnames\":null,\"enable_dns_support\":null,\"instance_tenancy\":null,\"tags\":{\"Name\":\"DemoVpc\"},\"tags_all\":{\"Name\":\"DemoVpc\"}\u001b[4m}\u001b[0m\u001b[0m\n\u001b[31m│\u001b[0m \u001b[0m\n\u001b[31m╵\u001b[0m\u001b[0m\n", "stdout": "\nTerraform used the selected providers to generate the following execution\nplan. 
Resource actions are indicated with the following symbols:\n \u001b[32m+\u001b[0m create\n\u001b[0m\nTerraform will perform the following actions:\n\n\u001b[1m # aws_vpc.sample-vpc\u001b[0m will be created\u001b[0m\u001b[0m\n\u001b[0m \u001b[32m+\u001b[0m\u001b[0m resource \"aws_vpc\" \"sample-vpc\" {\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0marn\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0massign_generated_ipv6_cidr_block\u001b[0m\u001b[0m = false\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mcidr_block\u001b[0m\u001b[0m = \"10.0.0.0/16\"\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mdefault_network_acl_id\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mdefault_route_table_id\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mdefault_security_group_id\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mdhcp_options_id\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0menable_classiclink\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0menable_classiclink_dns_support\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0menable_dns_hostnames\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0menable_dns_support\u001b[0m\u001b[0m = true\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mid\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0minstance_tenancy\u001b[0m\u001b[0m = \"default\"\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mipv6_association_id\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mipv6_cidr_block\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mmain_route_table_id\u001b[0m\u001b[0m = (known after 
apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mowner_id\u001b[0m\u001b[0m = (known after apply)\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mtags\u001b[0m\u001b[0m = {\n \u001b[32m+\u001b[0m \u001b[0m\"Name\" = \"DemoVpc\"\n }\n \u001b[32m+\u001b[0m \u001b[0m\u001b[1m\u001b[0mtags_all\u001b[0m\u001b[0m = {\n \u001b[32m+\u001b[0m \u001b[0m\"Name\" = \"DemoVpc\"\n }\n }\n\n\u001b[0m\u001b[1mPlan:\u001b[0m 1 to add, 0 to change, 0 to destroy.\n\u001b[0m\u001b[0m\u001b[1maws_vpc.sample-vpc: Creating...\u001b[0m\u001b[0m\n", "error": "exit status 1"}
Which is indeed:
Error creating VPC: VpcLimitExceeded: The maximum number of VPCs has been reached.
We need to make sure that actual errors are returned as events on the managed resource.
Also, similar to other Crossplane providers, per-resource logs should be hidden at the info level (i.e., emitted only at the debug level).
We used to put every CRD into its own package, hence each had its own zz_terraformed file, but now we don't do that. We can generate a single zz_generated.terraformed.go file for the whole group, similar to zz_managed and zz_deepcopy.
There are some differences between Crossplane and Terraform in which fields are passed as provider configuration versus resource specification. A good example is the "region" field for AWS, which is part of the provider configuration on the Terraform side but part of the resource spec in Crossplane.
We need to discuss how to handle these differences and presumably follow Crossplane's conventions, which would require some changes on the schema generation and conversion side.
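For illustration, following Crossplane's convention would mean that a region taken from the managed resource's spec.forProvider has to be injected into the provider block of the generated Terraform JSON configuration. A hypothetical main.tf.json fragment (values are assumptions, and this is not necessarily the exact shape terrajet emits):

```json
{
  "provider": {
    "aws": {
      "region": "us-east-2"
    }
  },
  "resource": {
    "aws_vpc": {
      "sample-vpc": {
        "cidr_block": "10.0.0.0/16",
        "tags": { "Name": "DemoVpc" }
      }
    }
  }
}
```

The conversion layer would be responsible for moving such fields out of the resource body and into the provider block at generation time.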
Terraform allows users to add custom lifecycle metadata to a resource block. Since we manage a single resource with one Terraform workspace, and that resource shouldn't be deleted by any Apply call, we need to include prevent_destroy and return an error if the user changes an immutable field that requires recreation.
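In the generated Terraform JSON configuration that could look like the following sketch (illustrative fragment; the resource and field values are assumptions):

```json
{
  "resource": {
    "aws_vpc": {
      "sample-vpc": {
        "cidr_block": "10.0.0.0/16",
        "lifecycle": {
          "prevent_destroy": true
        }
      }
    }
  }
}
```

With prevent_destroy set, terraform apply fails instead of destroying and recreating the resource when an immutable field changes, and that failure can be surfaced as an error on the managed resource.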
We use the exec package to call the Terraform CLI, and there isn't a utility like afero to replace it with test structs, so it's really cumbersome to test functions that make CLI calls.
We could implement a utility similar to afero, but for CLI calls.
The common controller is a layer sitting between the generated schema and the async Terraform CLI library, mostly dealing with conversions between Crossplane and Terraform and with a proper implementation of the Crossplane managed reconciler interface functions.
The current implementation has mostly been tested against some early implementations (open PRs, assumptions) and happy-path scenarios (mostly AWS VPC).
We need to revisit the existing implementation with more resources and edge-case scenarios.
Another point: the implementation of this layer could be affected by the outcome of this discussion, so it might make sense to take that into account while working here.
When I delete all generated files from the apis folder, the following side-effect import throws an error:
_ "github.com/crossplane-contrib/provider-tf-aws/apis"
imports github.com/crossplane-contrib/provider-tf-aws/apis: build constraints exclude all Go files in /Users/monus/go/src/github.com/crossplane/provider-tf-aws/apis
I think we should have a non-generated Go file in the apis folder that contains a config.Provider object populated by the init() calls of the custom.go files in the CRD packages, instead of targeting the Provider instance in the Terrajet package.
find apis -iname 'zz_*' | xargs rm -rf && find internal -iname 'zz_*' | xargs rm -rf
go run cmd/generator/main.go