Generate config Go code from schema #10694

Draft
wants to merge 2 commits into
base: main


@ptodev ptodev commented Jul 22, 2024

Description

This is an attempt to generate the config.go files from JSON Schema. The goal of the PR is to delete the config.go file of each component and replace it with a YAML or JSON schema, most likely sourced from each component's metadata.yaml file. This PR uses an earlier PR as a starting point.

There are many unknowns related to this project. I will start threads on various problems via file comments and line comments, so that we can keep the discussion organised.

Link to tracking issue

Fixes #9769

SendBatchMaxSize int `json:"send_batch_max_size,omitempty" yaml:"send_batch_max_size,omitempty" mapstructure:"send_batch_max_size,omitempty"`

// SendBatchSize corresponds to the JSON schema field "send_batch_size".
SendBatchSize int `json:"send_batch_size,omitempty" yaml:"send_batch_size,omitempty" mapstructure:"send_batch_size,omitempty"`
Author

go-jsonschema doesn't seem to use uint. In theory, the minimum: 0 in the schema would be a good enough workaround but it looks like it's not supported right now.

Would you be happy if instances of uint are replaced with int + a validation that the number is >= 0?
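
A minimal sketch of that workaround, assuming the generated struct keeps only the two fields quoted above and that the non-negativity check lives in a hand-written Validate method. The error messages are illustrative, not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// Config mirrors the two generated fields quoted above; all other fields are omitted.
type Config struct {
	SendBatchSize    int
	SendBatchMaxSize int
}

// Validate rejects negative values, standing in for the guarantee
// that an unsigned type would otherwise give for free.
func (c *Config) Validate() error {
	if c.SendBatchSize < 0 {
		return errors.New("send_batch_size must be >= 0")
	}
	if c.SendBatchMaxSize < 0 {
		return errors.New("send_batch_max_size must be >= 0")
	}
	return nil
}

func main() {
	fmt.Println((&Config{SendBatchSize: 8192}).Validate())
	fmt.Println((&Config{SendBatchSize: -1}).Validate())
}
```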

Contributor

github-actions bot commented Aug 7, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Aug 7, 2024
Comment on lines +60 to 66
if v, ok := raw["timeout"]; !ok || v == nil {
	var err error
	plain.Timeout, err = time.ParseDuration("200ms")
	if err != nil {
		return err
	}
}
Author

Here I cheated a little bit. The original code was:

	if v, ok := raw["timeout"]; !ok || v == nil {
		plain.Timeout = "200ms"
	}

I edited it a bit so that it compiles. I will have to update the upstream PR for time.Duration support to generate this kind of code correctly.

Author

I propose that we add files like this whenever we need to augment the auto-generated code with some additional behaviour. For example, in this file we added Validate() and createDefaultConfig() functions.

send_batch_size:
  type: integer
  minimum: 0
  default: 8192
Author

Setting a default value is important, because otherwise go-jsonschema will autogenerate an *int. Similarly, for string properties which don't have a default value it would generate a *string. If we do set a default value, we get a plain int and a plain string, with no pointers.

Timeout time.Duration `mapstructure:"timeout"`
// MetadataCardinalityLimit corresponds to the JSON schema field
// "metadata_cardinality_limit".
MetadataCardinalityLimit int `mapstructure:"metadata_cardinality_limit,omitempty"`
Author

I need to check whether having the omitempty is a problem. OTel doesn't use it much right now, but it looks like go-jsonschema always puts it in.

Contributor

I don't think omitempty is used by mapstructure at all.

- // Default value is 0, that means no maximum size.
- SendBatchMaxSize uint32 `mapstructure:"send_batch_max_size"`
+ // SendBatchMaxSize corresponds to the JSON schema field "send_batch_max_size".
+ SendBatchMaxSize int `mapstructure:"send_batch_max_size,omitempty"`
Author

Do we care much that the uint is now an int?
I suppose we don't, because we can add a check to make sure it's not negative. Also, I doubt that the integers people configure are so large that an int wouldn't fit them.
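
As a sketch of what such a check could look like if the value still has to end up in a uint32, the hypothetical helper below (toUint32 is not part of the PR) rejects both negative values and values above the uint32 range, since a 64-bit int can hold numbers that a uint32 cannot.

```go
package main

import (
	"fmt"
	"math"
)

// toUint32 converts a generated int field to uint32, failing on values
// that do not fit instead of silently wrapping around.
func toUint32(v int) (uint32, error) {
	// Comparing as int64 keeps this compiling on 32-bit platforms too.
	if v < 0 || int64(v) > math.MaxUint32 {
		return 0, fmt.Errorf("value %d is out of range for uint32", v)
	}
	return uint32(v), nil
}

func main() {
	fmt.Println(toUint32(8192))
	fmt.Println(toUint32(-1))
}
```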

Contributor

We do - at the very least we cannot change types like this. We need to deprecate and move over with a set of changes over multiple releases.

Author

I could update my fork of go-jsonschema to use uint whenever there is a minimum: 0 or an exclusiveMinimum: 0? It seems like a reasonable feature - maybe it'll also get accepted upstream.
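
A hypothetical sketch of that type-selection rule follows. The integerSchema type and goIntegerType function are invented for illustration and are not go-jsonschema's API; the sketch also generalises the rule to any non-negative lower bound, which is an assumption beyond the minimum: 0 case mentioned above.

```go
package main

import "fmt"

// integerSchema holds only the JSON Schema keywords relevant to this decision.
// A nil pointer means the keyword is absent from the schema.
type integerSchema struct {
	Minimum          *float64
	ExclusiveMinimum *float64
}

// goIntegerType picks an unsigned Go type when the schema rules out
// negative values, and falls back to int otherwise.
func goIntegerType(s integerSchema) string {
	if s.Minimum != nil && *s.Minimum >= 0 {
		return "uint"
	}
	if s.ExclusiveMinimum != nil && *s.ExclusiveMinimum >= 0 {
		return "uint"
	}
	return "int"
}

func main() {
	zero := 0.0
	fmt.Println(goIntegerType(integerSchema{Minimum: &zero}))
	fmt.Println(goIntegerType(integerSchema{}))
}
```

Note that this yields uint rather than uint32, so on its own it would still change the type of a field that is uint32 today.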

send_batch_max_size:
  type: integer
  default: 0
  minimum: 0
Author

At the moment, go-jsonschema's main branch doesn't do anything when it sees constraints such as minimum and maximum. However, I patched my own fork of go-jsonschema so that it supports it. You can see in the config.go that there are checks to make sure some integers are not negative.

Contributor

Maybe we can keep uint32 and we won't have to set such constraints?

Author

JSON Schema itself doesn't seem to have an unsigned integer type. It only has integer and number (float) types, which can have various constraints.


// Config defines configuration for OTLP/HTTP exporter.
type Config struct {
confighttp.ClientConfig `mapstructure:",squash"` // squash ensures fields are correctly decoded in embedded struct.
Author

Unfortunately, I can't find a good way to schematise squashed fields. This is my main blocker right now.

I was hoping that we can use something like anyOf, but judging by some examples, it also looks awkward.

For now, I just listed all the squashed properties explicitly in metadata.yaml. Does anyone know of a better way? I'm not sure what folks working with Go code normally do in such situations.

Author

Hi, @atoulme! Good news! 🥳 I was able to find a way to create squashed structs with go-jsonschema. I will update this PR next week, once I polish my code a bit. The new generated code will look a lot more like the current OTel code.

Member

🚀

@@ -100,3 +113,5 @@ replace go.opentelemetry.io/collector/internal/globalgates => ../../internal/glo
replace go.opentelemetry.io/collector/consumer/consumerprofiles => ../../consumer/consumerprofiles

replace go.opentelemetry.io/collector/consumer/consumertest => ../../consumer/consumertest

replace github.com/atombender/go-jsonschema => github.com/ptodev/go-jsonschema v0.0.0-20240813163654-5518ba93ee84
Author

I had to fork go-jsonschema, because it didn't contain some of the features which I needed. I hope we can upstream those.

Author

I need to rewrite these shared configs in yaml.

@@ -65,7 +65,7 @@ func (e *baseExporter) start(ctx context.Context, host component.Host) (err erro
e.metricExporter = pmetricotlp.NewGRPCClient(e.clientConn)
e.logExporter = plogotlp.NewGRPCClient(e.clientConn)
headers := map[string]string{}
- for k, v := range e.config.ClientConfig.Headers {
+ for k, v := range e.config.Headers {
Contributor

Please revert this change

if cfg.SendBatchMaxSize > 0 && cfg.SendBatchMaxSize < cfg.SendBatchSize {
return errors.New("send_batch_max_size must be greater or equal to send_batch_size")
// UnmarshalJSON implements json.Unmarshaler.
func (j *Config) UnmarshalJSON(b []byte) error {
Contributor

We don't use UnmarshalJSON at all. We use the Validate function to validate.

}

func (c *Config) sanitizedEndpoint() string {
Contributor

how is this functionality achieved?

Author

At the moment it is not. The migration of otlpexporter and otlphttpexporter to JSON Schema isn't complete, due to a blocker I hit with migrating squashed fields. I've been hoping we can find a solution to that blocker before I fix the functionality of otlpexporter and otlphttpexporter.

But in general, we could put functions like this in .go helper files which are not generated from the JSON schema.

Contributor

atoulme commented Aug 13, 2024

This PR is too big - please start with something much smaller.

Author

@ptodev ptodev left a comment


> This PR is too big - please start with something much smaller.

Hi, @atoulme, thank you for your review! I agree that the PR is very large. The main reason is that it demonstrates how shared configuration could work across components, which @yurishkuro raised in the previous PR. The HTTP server/client settings are the primary example of shared config that I could find in the codebase.

Unfortunately, while working on shared configs, I realised that I don't know how to represent squashed fields cleanly. I've been hoping that some of the OTel maintainers have experience with JSON Schema and that the solution may be obvious to them.

Successfully merging this pull request may close these issues.

Improve otel collector configuration w/ JSON schema
3 participants