
Go module providing a unified interface and efficient clients for working with various object storage providers such as GCS, S3, Azure, SWIFT, COS and more.

License: Apache License 2.0


objstore's Introduction


Thanos Object Storage Client

objstore is a Go module providing a unified interface and efficient clients for working with various object storage providers.

Features:

  • Ability to perform common operations with a clear contract against the most popular object storages.
  • High focus on efficiency and reliability required for distributed databases on object storages.
  • Optional built-in YAML-based configuration definition for consistent configuration.
  • Optional Prometheus metric instrumentation for bucket operations.

This module is battle-tested and used at high scale in production by projects like Thanos, Loki, Cortex, Mimir, Tempo, Parca and more.

Contributing

Contributions are very welcome! See our CONTRIBUTING.md for more information.

Community

Thanos is an open source project and we value and welcome new contributors and members of the community.

Adopters

See the [Adopters List](https://github.com/thanos-io/thanos/blob/main/website/data/adopters.yml).

Background

This library was initially developed as the Thanos objstore package. Thanos uses object storage as the primary storage for metrics and related metadata. This package ended up being used by other projects like Cortex, Loki, Mimir, Tempo, Parca and more.

Given its reusability, the Thanos community promoted this package to a standalone Go module with a smaller set of dependencies.

Maintainers

See MAINTAINERS.md

How to use objstore

The core of this module is the Bucket interface:

// Bucket provides read and write access to an object storage bucket.
// NOTE: We assume strong consistency for write-read flow.
type Bucket interface {
	io.Closer
	BucketReader

	// Upload the contents of the reader as an object into the bucket.
	// Upload should be idempotent.
	Upload(ctx context.Context, name string, r io.Reader) error

	// Delete removes the object with the given name.
	// If the object does not exist at the moment of deletion, Delete should return an error.
	Delete(ctx context.Context, name string) error
}

All provider implementations have to implement the Bucket interface, which exposes the common read and write operations supported by all object providers. If you want to limit code that performs bucket operations to read-only access (a smart idea, since it lets you restrict access permissions), you can use the BucketReader interface:

// BucketReader provides read access to an object storage bucket.
type BucketReader interface {
	// Iter calls f for each entry in the given directory (not recursive). The argument to f is the full
	// object name including the prefix of the inspected directory.
	// Entries are passed to function in sorted order.
	Iter(ctx context.Context, dir string, f func(string) error, options ...IterOption) error

	// Get returns a reader for the given object name.
	Get(ctx context.Context, name string) (io.ReadCloser, error)

	// GetRange returns a new range reader for the given object name and range.
	GetRange(ctx context.Context, name string, off, length int64) (io.ReadCloser, error)

	// Exists checks if the given object exists in the bucket.
	Exists(ctx context.Context, name string) (bool, error)

	// IsObjNotFoundErr returns true if error means that object is not found. Relevant to Get operations.
	IsObjNotFoundErr(err error) bool

	// IsAccessDeniedErr returns true if access to object is denied.
	IsAccessDeniedErr(err error) bool

	// Attributes returns information about the specified object.
	Attributes(ctx context.Context, name string) (ObjectAttributes, error)
}

Those interfaces represent the object storage operations your code can use from objstore clients.
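For example, a minimal read-only sketch could look like this (the function name and the object path are illustrative, not part of the library):

package example

import (
	"context"
	"fmt"
	"io"
	"os"

	"github.com/thanos-io/objstore"
)

// listAndPrint lists the bucket root and streams one object to stdout.
func listAndPrint(ctx context.Context, bkt objstore.BucketReader) error {
	// Iter lists entries non-recursively; "" means the bucket root.
	if err := bkt.Iter(ctx, "", func(name string) error {
		fmt.Println(name)
		return nil
	}); err != nil {
		return err
	}

	// Get returns a reader for a single object; tolerate its absence.
	rc, err := bkt.Get(ctx, "example/meta.json")
	if err != nil {
		if bkt.IsObjNotFoundErr(err) {
			return nil // object does not exist; not a hard failure here
		}
		return err
	}
	defer rc.Close()

	_, err = io.Copy(os.Stdout, rc)
	return err
}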

Factory

Generally, there are two ways of using the objstore module:

The first is to import the provider you want, e.g. github.com/thanos-io/objstore/providers/s3, and instantiate it with its constructor (e.g. NewBucket).

The second option is to use the factory NewBucket(logger log.Logger, confContentYaml []byte, reg prometheus.Registerer, component string), which instantiates the object storage client based on the provided YAML configuration. The YAML file generally has the following format:

type: <PROVIDER_TYPE>
config:
  <PROVIDER_TYPE specific options>

The exact options depend on the provider and are described in the sections below.
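As a rough sketch of the factory path (assuming the signature above and the client subpackage; bucket.yaml is a hypothetical path), wiring it together looks like:

package main

import (
	"context"
	"os"
	"strings"

	"github.com/go-kit/log"
	"github.com/prometheus/client_golang/prometheus"

	"github.com/thanos-io/objstore/client"
)

func main() {
	logger := log.NewLogfmtLogger(os.Stderr)

	// Read a provider config in the type/config format shown above.
	confContentYaml, err := os.ReadFile("bucket.yaml")
	if err != nil {
		panic(err)
	}

	// The factory picks the provider implementation based on the "type" field.
	bkt, err := client.NewBucket(logger, confContentYaml, prometheus.DefaultRegisterer, "example")
	if err != nil {
		panic(err)
	}
	defer bkt.Close()

	// Round-trip a small object.
	if err := bkt.Upload(context.Background(), "hello.txt", strings.NewReader("hello")); err != nil {
		panic(err)
	}
}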

NOTE: All code snippets are auto-generated from code and up-to-date.

Check out the Thanos documentation to see how Thanos uses this module.

Supported Providers (Clients)

Current object storage client implementations:

| Provider | Maturity | Aimed For | Auto-tested on CI | Maintainers |
|----------|----------|-----------|-------------------|-------------|
| Google Cloud Storage | Stable | Production Usage | yes | @bwplotka |
| AWS/S3 (and all S3-compatible storages, e.g. disk-based Minio) | Stable | Production Usage | yes | @bwplotka |
| Azure Storage Account | Stable | Production Usage | no | @vglafirov, @phillebaba |
| OpenStack Swift | Beta (working PoC) | Production Usage | yes | @FUSAKLA |
| Tencent COS | Beta | Production Usage | no | @jojohappy, @hanjm |
| AliYun OSS | Beta | Production Usage | no | @shaulboozhiao, @wujinhu |
| Baidu BOS | Beta | Production Usage | no | @yahaa |
| Local Filesystem | Stable | Testing and Demo only | yes | @bwplotka |
| Oracle Cloud Infrastructure Object Storage | Beta | Production Usage | yes | @aarontams, @gaurav-05, @ericrrath |
| HuaweiCloud OBS | Beta | Production Usage | no | @setoru |

Missing support for some object storage? Check out the How to add a new client section below.

NOTE: Currently Thanos requires strong consistency (write-read) for object store implementation for singleton Compaction purposes.

S3

Thanos uses the minio client library to upload Prometheus data into AWS S3.

NOTE: The S3 client was designed for AWS S3, but it can be configured against other S3-compatible object storages, e.g. Ceph.

The S3 object storage yaml configuration definition:

type: S3
config:
  bucket: ""
  endpoint: ""
  region: ""
  disable_dualstack: false
  aws_sdk_auth: false
  access_key: ""
  insecure: false
  signature_version2: false
  secret_key: ""
  session_token: ""
  put_user_metadata: {}
  http_config:
    idle_conn_timeout: 1m30s
    response_header_timeout: 2m
    insecure_skip_verify: false
    tls_handshake_timeout: 10s
    expect_continue_timeout: 1s
    max_idle_conns: 100
    max_idle_conns_per_host: 100
    max_conns_per_host: 0
    tls_config:
      ca_file: ""
      cert_file: ""
      key_file: ""
      server_name: ""
      insecure_skip_verify: false
    disable_compression: false
  trace:
    enable: false
  list_objects_version: ""
  bucket_lookup_type: auto
  send_content_md5: true
  disable_multipart: false
  part_size: 67108864
  sse_config:
    type: ""
    kms_key_id: ""
    kms_encryption_context: {}
    encryption_key: ""
  sts_endpoint: ""
prefix: ""

At a minimum, you will need to provide a value for the bucket, endpoint, access_key, and secret_key keys. The rest of the keys are optional.
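For reference, a minimal configuration using only those keys might look like this (all values are placeholders):

type: S3
config:
  bucket: "my-bucket"
  endpoint: "s3.us-east-1.amazonaws.com"
  access_key: "<ACCESS_KEY>"
  secret_key: "<SECRET_KEY>"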

However, if you set aws_sdk_auth: true, Thanos will use the default authentication methods of the AWS SDK for Go based on known environment variables (AWS_PROFILE, AWS_WEB_IDENTITY_TOKEN_FILE, etc.) and known AWS config files (~/.aws/config). If you turn this on, then the bucket and endpoint are the required config keys.

The field prefix can be used to transparently use prefixes in your S3 bucket. This allows you to separate blocks coming from different sources into paths with different prefixes, making it easier to understand what's going on (i.e. you don't have to use Thanos tooling to know which blocks came from where).

The AWS region to endpoint mapping can be found in this link.

By default, the library prefers using dual-stack endpoints. You can explicitly disable this behaviour by setting disable_dualstack: true.

Make sure you use a correct signature version. Currently AWS requires signature v4, so it needs signature_version2: false. If you don't specify it, you will get an Access Denied error. On the other hand, several S3 compatible APIs use signature_version2: true.

You can configure the timeout settings for the HTTP client by setting the http_config.idle_conn_timeout and http_config.response_header_timeout keys. As a rule of thumb, if you are seeing errors like timeout awaiting response headers in your logs, you may want to increase the value of http_config.response_header_timeout.

Please refer to the documentation of the Transport type in the net/http package for detailed information on what each option does.

part_size is specified in bytes and refers to the minimum file size used for multipart uploads, as some custom S3 implementations may have different requirements. A value of 0 means to use a default 128 MiB size.

Set list_objects_version: "v1" for S3 compatible APIs that don't support ListObjectsV2 (e.g. some versions of Ceph). Default value ("") is equivalent to "v2".

http_config.tls_config allows configuring TLS connections. Please refer to the document of tls_config for detailed information on what each option does.

bucket_lookup_type can be auto, virtual-hosted or path. Read more about it here.

For debug and testing purposes you can set

  • insecure: true to switch to plain insecure HTTP instead of HTTPS

  • http_config.insecure_skip_verify: true to disable TLS certificate verification (if your S3 based storage is using a self-signed certificate, for example)

  • trace.enable: true to enable the minio client's verbose logging. Each request and response will be logged into the debug logger, so debug level logging must be enabled for this functionality.

S3 Server-Side Encryption

SSE can be configured using the sse_config block. SSE-S3, SSE-KMS, and SSE-C are supported.

  • If type is set to SSE-S3 you do not need to configure other options.

  • If type is set to SSE-KMS you must set kms_key_id. The kms_encryption_context is optional, as AWS provides a default encryption context.

  • If type is set to SSE-C you must provide a path to the encryption key using encryption_key.

If the SSE Config block is set but the type is not one of SSE-S3, SSE-KMS, or SSE-C, an error is raised.

You will also need to apply the following AWS IAM policy for the user to access the KMS key:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "KMSAccess",
            "Effect": "Allow",
            "Action": [
                "kms:GenerateDataKey",
                "kms:Encrypt",
                "kms:Decrypt"
            ],
            "Resource": "arn:aws:kms:<region>:<account>:key/<KMS key id>"
        }
    ]
}
Credentials

By default Thanos will try to retrieve credentials from the following sources:

  1. From config file if BOTH access_key and secret_key are present.
  2. From the standard AWS environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
  3. From ~/.aws/credentials
  4. IAM credentials retrieved from an instance profile.

NOTE: Getting the access key from the config file and the secret key from another method (and vice versa) is not supported.

AWS Policies

Example working AWS IAM policy for user:

  • For deployment (policy for Thanos services):
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::<bucket>/*",
                "arn:aws:s3:::<bucket>"
            ]
        }
    ]
}

(No bucket policy)

To test the policy, set env vars for S3 access to an empty, unused bucket, as well as:

THANOS_TEST_OBJSTORE_SKIP=GCS,AZURE,SWIFT,COS,ALIYUNOSS,OCI,OBS
THANOS_ALLOW_EXISTING_BUCKET_USE=true

And run: GOCACHE=off go test -v -run TestObjStore_AcceptanceTest_e2e ./pkg/...

  • For testing (policy to run e2e tests):

We need access to CreateBucket and DeleteBucket and access to all buckets:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObject",
                "s3:CreateBucket",
                "s3:DeleteBucket"
            ],
            "Resource": [
                "arn:aws:s3:::<bucket>/*",
                "arn:aws:s3:::<bucket>"
            ]
        }
    ]
}

With this policy you should be able to set THANOS_TEST_OBJSTORE_SKIP=GCS,AZURE,SWIFT,COS,ALIYUNOSS,OCI,OBS, unset S3_BUCKET, and run all tests using make test.

Details about AWS policies: https://docs.aws.amazon.com/AmazonS3/latest/dev/using-with-s3-actions.html

STS Endpoint

If you want to use IAM credentials retrieved from an instance profile, Thanos needs to authenticate through AWS STS. For this purpose you can specify your own STS endpoint.

By default Thanos will use the endpoint https://sts.amazonaws.com and the region-specific endpoints.
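For illustration, a configuration pinning a regional STS endpoint might look like this (all values are placeholders):

type: S3
config:
  bucket: "my-bucket"
  endpoint: "s3.eu-west-1.amazonaws.com"
  aws_sdk_auth: true
  sts_endpoint: "https://sts.eu-west-1.amazonaws.com"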

GCS

To configure a Google Cloud Storage bucket as an object store you need to set bucket to the GCS bucket name and configure Google Application credentials.

For example:

type: GCS
config:
  bucket: ""
  service_account: ""
  use_grpc: false
  grpc_conn_pool_size: 0
  http_config:
    idle_conn_timeout: 0s
    response_header_timeout: 0s
    insecure_skip_verify: false
    tls_handshake_timeout: 0s
    expect_continue_timeout: 0s
    max_idle_conns: 0
    max_idle_conns_per_host: 0
    max_conns_per_host: 0
    tls_config:
      ca_file: ""
      cert_file: ""
      key_file: ""
      server_name: ""
      insecure_skip_verify: false
    disable_compression: false
prefix: ""
Using GOOGLE_APPLICATION_CREDENTIALS

Application credentials are configured via a JSON file; only the bucket needs to be specified. The client looks for credentials in the following order:

  1. A JSON file whose path is specified by the GOOGLE_APPLICATION_CREDENTIALS environment variable.
  2. A JSON file in a location known to the gcloud command-line tool. On Windows, this is %APPDATA%/gcloud/application_default_credentials.json. On other systems, $HOME/.config/gcloud/application_default_credentials.json.
  3. On Google App Engine it uses the appengine.AccessToken function.
  4. On Google Compute Engine and Google App Engine Managed VMs, it fetches credentials from the metadata server. (In this final case any provided scopes are ignored.)

You can read more about how to get the application credentials JSON file at https://cloud.google.com/docs/authentication/production.

Using an inline Service Account

Another possibility is to inline the service account into the Thanos configuration and only maintain one file. This feature was added so that the Prometheus Operator only needs to take care of one secret file.

type: GCS
config:
  bucket: "thanos"
  service_account: |-
    {
      "type": "service_account",
      "project_id": "project",
      "private_key_id": "abcdefghijklmnopqrstuvwxyz12345678906666",
      "private_key": "-----BEGIN PRIVATE KEY-----\...\n-----END PRIVATE KEY-----\n",
      "client_email": "[email protected]",
      "client_id": "123456789012345678901",
      "auth_uri": "https://accounts.google.com/o/oauth2/auth",
      "token_uri": "https://oauth2.googleapis.com/token",
      "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
      "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/thanos%40gitpods.iam.gserviceaccount.com"
    }
GCS Policies

Note: GCS Policies should be applied at the project level, not at the bucket level

For deployment:

Storage Object Creator and Storage Object Viewer

For testing:

Storage Object Admin for ability to create and delete temporary buckets.

To test that the policy is working as expected, exec into the sidecar container, e.g.:

kubectl exec -it -n <namespace> <prometheus with sidecar pod name> -c <sidecar container name> -- /bin/sh

Then test that you can at least list objects in the bucket, e.g.:

thanos tools bucket ls --objstore.config="${OBJSTORE_CONFIG}"
Azure

To use Azure Storage as a Thanos object store, you need to pre-create a storage account from the Azure portal or using Azure CLI. Follow the instructions from the Azure Storage documentation: https://docs.microsoft.com/en-us/azure/storage/common/storage-quickstart-create-account

The config file format is the following:

type: AZURE
config:
  storage_account: ""
  storage_account_key: ""
  storage_connection_string: ""
  container: ""
  endpoint: ""
  user_assigned_id: ""
  max_retries: 0
  reader_config:
    max_retry_requests: 0
  pipeline_config:
    max_tries: 0
    try_timeout: 0s
    retry_delay: 0s
    max_retry_delay: 0s
  http_config:
    idle_conn_timeout: 0s
    response_header_timeout: 0s
    insecure_skip_verify: false
    tls_handshake_timeout: 0s
    expect_continue_timeout: 0s
    max_idle_conns: 0
    max_idle_conns_per_host: 0
    max_conns_per_host: 0
    tls_config:
      ca_file: ""
      cert_file: ""
      key_file: ""
      server_name: ""
      insecure_skip_verify: false
    disable_compression: false
  msi_resource: ""
prefix: ""

If msi_resource is used, authentication is done via system-assigned managed identity. The value for Azure should be https://<storage-account-name>.blob.core.windows.net.

If user_assigned_id is used, authentication is done via user-assigned managed identity. When using user_assigned_id, the msi_resource defaults to https://<storage_account>.<endpoint>.

If storage_connection_string is set, the values of storage_account and endpoint will not be used. Use this method instead of storage_account_key if you need to authenticate via a SAS token.

The generic max_retries is used as the value for the pipeline_config's max_tries and the reader_config's max_retry_requests. For more control, max_retries can be left at 0 and specific retry values set instead.
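For example, to bypass the generic max_retries and tune each retry knob individually, one might set (values are illustrative):

max_retries: 0
reader_config:
  max_retry_requests: 3
pipeline_config:
  max_tries: 5
  try_timeout: 1m
  retry_delay: 500ms
  max_retry_delay: 5s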

OpenStack Swift

Thanos uses ncw/swift client to upload Prometheus data into OpenStack Swift.

Below is an example configuration file for thanos to use OpenStack swift container as an object store. Note that if the name of a user, project or tenant is used one must also specify its domain by ID or name. Various examples for OpenStack authentication can be found in the official documentation.

By default, OpenStack Swift has a maximum file size limit of 5 GiB. Thanos index files are often larger than that. To resolve this issue, Thanos uses Static Large Objects (SLO), which are uploaded as segments. These are by default put into the segments directory of the same container. The default limit for using SLO is 1 GiB, which is also the maximum size of a segment. If you don't want to use the same container for the segments (best practice is to use <container_name>_segments to avoid polluting the listing of the container's objects), you can use the large_object_segments_container_name option to override the default and put the segments in another container. In rare cases you can switch to Dynamic Large Objects (DLO) by setting use_dynamic_large_objects to true, but use it with caution since it relies even more on eventual consistency.

type: SWIFT
config:
  auth_version: 0
  auth_url: ""
  username: ""
  user_domain_name: ""
  user_domain_id: ""
  user_id: ""
  password: ""
  domain_id: ""
  domain_name: ""
  application_credential_id: ""
  application_credential_name: ""
  application_credential_secret: ""
  project_id: ""
  project_name: ""
  project_domain_id: ""
  project_domain_name: ""
  region_name: ""
  container_name: ""
  large_object_chunk_size: 1073741824
  large_object_segments_container_name: ""
  retries: 3
  connect_timeout: 10s
  timeout: 5m
  use_dynamic_large_objects: false
  http_config:
    idle_conn_timeout: 1m30s
    response_header_timeout: 2m
    insecure_skip_verify: false
    tls_handshake_timeout: 10s
    expect_continue_timeout: 1s
    max_idle_conns: 100
    max_idle_conns_per_host: 100
    max_conns_per_host: 0
    tls_config:
      ca_file: ""
      cert_file: ""
      key_file: ""
      server_name: ""
      insecure_skip_verify: false
    disable_compression: false
prefix: ""
Tencent COS

To use Tencent COS as an object store, you should first apply for a Tencent account and create an object storage bucket. For details, see the Tencent Cloud documentation: https://cloud.tencent.com/document/product/436

To configure a Tencent account to use COS as an object store you need to set these parameters in YAML format stored in a file:

type: COS
config:
  bucket: ""
  region: ""
  app_id: ""
  endpoint: ""
  secret_key: ""
  secret_id: ""
  http_config:
    idle_conn_timeout: 1m30s
    response_header_timeout: 2m
    insecure_skip_verify: false
    tls_handshake_timeout: 10s
    expect_continue_timeout: 1s
    max_idle_conns: 100
    max_idle_conns_per_host: 100
    max_conns_per_host: 0
    tls_config:
      ca_file: ""
      cert_file: ""
      key_file: ""
      server_name: ""
      insecure_skip_verify: false
    disable_compression: false
prefix: ""

The secret_key and secret_id fields are required. The optional http_config field can be used to tune HTTP transport settings. There are two ways to configure the required bucket information, illustrated after this list:

  1. Provide the values of the bucket, region and app_id keys.
  2. Provide the value of the endpoint key in URL format when you want to specify a VPC internal endpoint. Please refer to the endpoint documentation for more detail.
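For illustration, the two styles might look like this (all values are placeholders):

# Option 1: bucket, region and app_id.
type: COS
config:
  bucket: "mybucket"
  region: "ap-beijing"
  app_id: "1234567890"
  secret_id: "<SECRET_ID>"
  secret_key: "<SECRET_KEY>"

# Option 2: a full endpoint URL, e.g. a VPC internal endpoint.
type: COS
config:
  endpoint: "https://mybucket-1234567890.cos.ap-beijing.myqcloud.com"
  secret_id: "<SECRET_ID>"
  secret_key: "<SECRET_KEY>"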
AliYun OSS

In order to use AliYun OSS object storage, you should first create a bucket with the proper storage class and ACLs, and get the access key on the AliYun cloud. Go to https://www.alibabacloud.com/product/oss for more detail.

The AliYun OSS object storage yaml configuration definition:

type: ALIYUNOSS
config:
  endpoint: ""
  bucket: ""
  access_key_id: ""
  access_key_secret: ""
prefix: ""
Baidu BOS

In order to use Baidu BOS object storage, you should apply for a Baidu Account and create an object storage bucket first. Refer to Baidu Cloud Documents for more details. The Baidu BOS object storage yaml configuration definition:

type: BOS
config:
  bucket: ""
  endpoint: ""
  access_key: ""
  secret_key: ""
prefix: ""
Filesystem

This storage type is used when the user wants to store and access the bucket on the local filesystem. We treat the filesystem the same way we would treat object storage, so all optimizations for remote buckets apply even though the files are local.

NOTE: This storage type is experimental and might be inefficient. It is NOT advised to use it as the main storage for metrics in production environment. Particularly there is no planned support for distributed filesystems like NFS. This is mainly useful for testing and demos.

Filesystem "object storage" yaml configuration definition:

type: FILESYSTEM
config:
  directory: ""
prefix: ""

Oracle Cloud Infrastructure Object Storage

To configure Oracle Cloud Infrastructure (OCI) Object Storage as a Thanos Object Store, you need to provide appropriate authentication credentials to your OCI tenancy. The OCI object storage client implementation for Thanos supports default keypair, instance principal, and OKE workload identity authentication.

API Signing Key

The default API signing key authentication provider leverages the same configuration as the OCI CLI, which is usually stored at $HOME/.oci/config or in environment variables starting with OCI_CLI. You can also use environment variables that start with TF_VAR. If the same configuration is found in multiple places, the provider will prefer the first one.

The following example configures the provider to look for an existing API signing key for authentication:

type: OCI
config:
  provider: "default"
  bucket: ""
  compartment_ocid: ""
  part_size: ""                   # Optional part size to override the OCI default of 128 MiB; value is in bytes.
  max_request_retries: ""         # Optional maximum number of retries for a request.
  request_retry_interval: ""      # Optional sleep duration in seconds between retry requests.
  http_config:
    idle_conn_timeout: 1m30s      # Optional maximum amount of time an idle (keep-alive) connection will remain idle before closing itself. Zero means no limit.
    response_header_timeout: 2m   # Optional amount of time to wait for a server's response headers after fully writing the request.
    tls_handshake_timeout: 10s    # Optional maximum amount of time to wait for a TLS handshake. Zero means no timeout.
    expect_continue_timeout: 1s   # Optional amount of time to wait for a server's first response headers. Zero means no timeout and causes the body to be sent immediately.
    insecure_skip_verify: false   # Optional. If true, crypto/tls accepts any certificate presented by the server and any host name in that certificate.
    max_idle_conns: 100           # Optional maximum number of idle (keep-alive) connections across all hosts. Zero means no limit.
    max_idle_conns_per_host: 100  # Optional maximum idle (keep-alive) connections to keep per host. If zero, DefaultMaxIdleConnsPerHost=2 is used.
    max_conns_per_host: 0         # Optional maximum total number of connections per host.
    disable_compression: false    # Optional. If true, prevents the Transport from requesting compression.
    client_timeout: 90s           # Optional time limit for requests made by the HTTP client.

Instance Principal Provider

For Example:

type: OCI
config:
  provider: "instance-principal"
  bucket: ""
  compartment_ocid: ""

You can also include any of the optional configuration just like the example in Default Provider.

Raw Provider

For Example:

type: OCI
config:
  provider: "raw"
  bucket: ""
  compartment_ocid: ""
  tenancy_ocid: ""
  user_ocid: ""
  region: ""
  fingerprint: ""
  privatekey: ""
  passphrase: ""         // Optional passphrase to encrypt the private API Signing key

You can also include any of the optional configuration just like the example in Default Provider.

OKE Workload Identity Provider

For Example:

type: OCI
config:
  provider: "oke-workload-identity"
  bucket: ""
  region: ""

The bucket and region fields are required. The region field identifies the bucket region.

HuaweiCloud OBS

To use HuaweiCloud OBS as an object store, you should first apply for a HuaweiCloud account and create an object storage bucket. More details: HuaweiCloud OBS

To configure a HuaweiCloud account to use OBS as an object store you need to set these parameters in YAML format stored in a file:

type: OBS
config:
  bucket: ""
  endpoint: ""
  access_key: ""
  secret_key: ""
  http_config:
    idle_conn_timeout: 1m30s
    response_header_timeout: 2m
    insecure_skip_verify: false
    tls_handshake_timeout: 10s
    expect_continue_timeout: 1s
    max_idle_conns: 100
    max_idle_conns_per_host: 100
    max_conns_per_host: 0
    tls_config:
      ca_file: ""
      cert_file: ""
      key_file: ""
      server_name: ""
      insecure_skip_verify: false
    disable_compression: false
prefix: ""

The access_key and secret_key fields are required. The optional http_config field can be used to tune HTTP transport settings.

How to add a new client to Thanos?

The following checklist describes how to add a new Go client to the supported providers:

  1. Create a new directory under ./providers/<provider>
  2. Implement the objstore.Bucket interface (a hypothetical skeleton is sketched below)
  3. Add a NewTestBucket constructor for testing purposes, that creates and deletes a temporary bucket.
  4. Use the created NewTestBucket in the ForeachStore method to ensure we can run tests against the new provider. (In PR)
  5. RUN the TestObjStoreAcceptanceTest against your provider to ensure it fits. Fix any errors found until the test passes. (In PR)
  6. Add the client implementation to the factory in the factory code. (Using as few flags as possible in every command.)
  7. Add the client struct config to cfggen to allow config auto-generation.

At that point, anyone can use your provider via the spec.
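As a hypothetical sketch of step 2, a new provider package could start from a skeleton like this (the myprovider package and all names are illustrative; each method must delegate to the provider's SDK, and a real client must cover the full objstore.Bucket interface, not only the methods shown in this README):

package myprovider

import (
	"context"
	"io"

	"github.com/thanos-io/objstore"
)

// Bucket is a skeleton client; the provider's SDK client would live here.
type Bucket struct {
	name string
}

func (b *Bucket) Close() error { return nil }

// Iter should list entries under dir and invoke f for each name, in sorted order.
func (b *Bucket) Iter(ctx context.Context, dir string, f func(string) error, options ...objstore.IterOption) error {
	return nil
}

func (b *Bucket) Get(ctx context.Context, name string) (io.ReadCloser, error) {
	return nil, nil
}

func (b *Bucket) GetRange(ctx context.Context, name string, off, length int64) (io.ReadCloser, error) {
	return nil, nil
}

func (b *Bucket) Exists(ctx context.Context, name string) (bool, error) { return false, nil }

func (b *Bucket) IsObjNotFoundErr(err error) bool { return false }

func (b *Bucket) IsAccessDeniedErr(err error) bool { return false }

func (b *Bucket) Attributes(ctx context.Context, name string) (objstore.ObjectAttributes, error) {
	return objstore.ObjectAttributes{}, nil
}

func (b *Bucket) Upload(ctx context.Context, name string, r io.Reader) error { return nil }

func (b *Bucket) Delete(ctx context.Context, name string) error { return nil }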

objstore's People

Contributors

alanprot, andyasp, brancz, bwplotka, charleskorn, clyang82, danielblando, deejay1, dependabot[bot], fabxc, fpetkovski, fusakla, giedriuss, gotjosh, hanjm, jojohappy, kakkoyun, kavirajk, metalmatze, michahoffmann, pedro-stanaka, phillebaba, pracucci, pstibrany, ritacanavarro, saswatamcode, simonswine, squat, sylr, yeya24


objstore's Issues

OpenTelemetry wrapper

This project already provides an opentracing wrapper (as a matter of fact it also wraps instances automatically in the factory). It would be nice to also provide an OpenTelemetry wrapper for those that don't use OpenTracing anymore.

Pagination Support

Some providers, such as:

s3

Support parameters such as max-keys or start-after. More in the API reference

gcs

Support parameters such as maxResults, pageToken and startOffset. More in the API reference

Have some support that allows us to paginate results when listing items.

Is there room/need for having the client support these? In our case, we need to allow client-facing APIs to iterate over 150k keys at a time and, as you can expect, the experience of having to iterate through all of these keys at once is not ideal.

In reality, I don't have much data on timings, hence filing #79, but I reckon I'd like to get the conversation started on whether this is something we would like to support.

To begin with, I'm open to the idea of having us support this for the three major cloud providers: GCS, AWS and Azure.

with filesystem backend, please don't create the `directory` if non-existent

Hey.

When one has a bucket config like:

type: FILESYSTEM
config:
  directory: /path/to/dir
prefix: ''

Thanos, as of now, apparently creates any of these directories if they're non-existent.

It would be nice if it could just not do this (but only create directories below directory: as needed).

The reason is that one might e.g. use something like /mountpoint/dir, where dir only exists when some other filesystem is actually mounted on /mountpoint; if not, Thanos should rather fail to write (and e.g. not fill up the system’s root fs).

If it however creates any missing directories, it would simply create /mountpoint/dir on the unmounted mountpoint directory.

Thanks,
Chris.

error uploading prometheus blocks to azure blob store via sidecar

Versions

Thanos: v0.27.0

Prometheus: v2.36.2

Environment

AKS - 1.22.11

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    component: "server"
    app: prometheus
    release: prometheus
    chart: prometheus-15.11.0
    heritage: Helm
  name: prometheus-server
  namespace: monitoring
spec:
  serviceName: prometheus-server-headless
  selector:
    matchLabels:
      component: "server"
      app: prometheus
      release: prometheus
  replicas: 1
  podManagementPolicy: OrderedReady
  template:
    metadata:
      labels:
        component: "server"
        app: prometheus
        release: prometheus
        chart: prometheus-15.11.0
        heritage: Helm
    spec:
      priorityClassName: "system-node-critical"
      enableServiceLinks: true
      serviceAccountName: prometheus-server
      containers:

        - name: prometheus-server
          image: "prometheus/prometheus:v2.36.2"
          imagePullPolicy: "IfNotPresent"
          securityContext:
            {}
          env:
            - name: CLUSTER_NAME
              value: sandbox
          args:
            - --storage.tsdb.retention.time=7d
            - --config.file=/etc/config/prometheus.yml
            - --storage.tsdb.path=/data
            - --web.console.libraries=/etc/prometheus/console_libraries
            - --web.console.templates=/etc/prometheus/consoles
            - --web.enable-lifecycle
            - --web.enable-admin-api
            - --storage.tsdb.max-block-duration=2h
            - --storage.tsdb.min-block-duration=2h
          ports:
            - containerPort: 9090
          readinessProbe:
            httpGet:
              path: /-/ready
              port: 9090
              scheme: HTTP
            initialDelaySeconds: 30
            periodSeconds: 5
            timeoutSeconds: 4
            failureThreshold: 3
            successThreshold: 1
          livenessProbe:
            httpGet:
              path: /-/healthy
              port: 9090
              scheme: HTTP
            initialDelaySeconds: 30
            periodSeconds: 15
            timeoutSeconds: 10
            failureThreshold: 3
            successThreshold: 1
          resources:
            limits:
              memory: 106Gi
            requests:
              cpu: 23
              memory: 106Gi
          volumeMounts:
            - name: config-volume
              mountPath: /etc/config
            - name: storage-volume
              mountPath: /data
              subPath: ""
        - name: thanos-sidecar
          args:
          - sidecar
          - --log.level=debug
          - --tsdb.path=/data/
          - --prometheus.url=http://127.0.0.1:9090
          - --objstore.config-file=/etc/thanos-object-store-config/blobStore.yml
          - --reloader.config-file=/etc/config/prometheus.yml
          - --reloader.config-envsubst-file=/etc/prometheus-shared/prometheus.yml
          - --reloader.rule-dir=/etc/config/rules
          image: quay.io/thanos/thanos:v0.27.0
          ports:
          - containerPort: 10902
            name: sidecar-http
          - containerPort: 10901
            name: grpc
          - containerPort: 10900
            name: cluster
          resources:
            limits:
              cpu: "2"
              memory: 4Gi
            requests:
              cpu: "2"
              memory: 4Gi
          volumeMounts:
          - mountPath: /data
            name: storage-volume
          - mountPath: /etc/config
            name: config-volume
            readOnly: false
          - mountPath: /etc/prometheus-shared/
            name: prometheus-config-shared
            readOnly: false
          - mountPath: /etc/thanos-object-store-config/
            name: thanos-object-store-config
            readOnly: false
      hostNetwork: false
      dnsPolicy: ClusterFirst
      nodeSelector:
        nodeClass: monitoring
      securityContext:
        fsGroup: 65534
        runAsGroup: 65534
        runAsNonRoot: true
        runAsUser: 65534
      tolerations:
        - effect: NoSchedule
          key: nodeClass
          operator: Equal
          value: monitoring
      terminationGracePeriodSeconds: 300
      volumes:
        - name: config-volume
          configMap:
            name: prometheus-server
        - emptyDir: {}
          name: prometheus-config-shared
        - configMap:
            name: thanos-object-store-config-map
          name: thanos-object-store-config
  volumeClaimTemplates:
    - metadata:
        name: storage-volume
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: "1Ti"

Object Store Config

type: AZURE
config:
  storage_account: 'someobjectstore'
  storage_account_key: 'somesupersecretkey'
  container: 'somecontainername'
  endpoint: 'blob.core.windows.net'

Thanos logs:

level=info ts=2022-08-25T20:47:15.050946476Z caller=grpc.go:131 service=gRPC/server component=sidecar msg="listening for serving gRPC" address=0.0.0.0:10901
level=info ts=2022-08-25T20:47:15.051145277Z caller=intrumentation.go:75 msg="changing probe status" status=healthy
level=info ts=2022-08-25T20:47:15.051170677Z caller=http.go:73 service=http/server component=sidecar msg="listening for requests and metrics" address=0.0.0.0:10902
level=info ts=2022-08-25T20:47:15.051218278Z caller=tls_config.go:195 service=http/server component=sidecar msg="TLS is disabled." http2=false
level=debug ts=2022-08-25T20:47:15.052403388Z caller=promclient.go:623 msg="build version" url=http://127.0.0.1:9090/api/v1/status/buildinfo
level=info ts=2022-08-25T20:47:15.052956392Z caller=sidecar.go:179 msg="successfully loaded prometheus version"
level=info ts=2022-08-25T20:47:15.05638812Z caller=reloader.go:375 component=reloader msg="Reload triggered" cfg_in=/etc/config/prometheus.yml cfg_out=/etc/prometheus-shared/prometheus.yml watched_dirs=/etc/config/rules
level=info ts=2022-08-25T20:47:15.056457621Z caller=reloader.go:236 component=reloader msg="started watching config file and directories for changes" cfg=/etc/config/prometheus.yml out=/etc/prometheus-shared/prometheus.yml dirs=/etc/config/rules
level=info ts=2022-08-25T20:47:15.056485521Z caller=sidecar.go:201 msg="successfully loaded prometheus external labels" external_labels="{prometheus_group=\"sandbox\", prometheus_replica=\"$(HOSTNAME)\"}"
level=warn ts=2022-08-25T20:47:17.051936943Z caller=shipper.go:239 msg="reading meta file failed, will override it" err="failed to read /data/thanos.shipper.json: open /data/thanos.shipper.json: no such file or directory"
level=debug ts=2022-08-25T20:47:17.053628057Z caller=azure.go:381 msg="check if blob exists" blob=01GBAR8SGF29B024GFAHH9VA5E/meta.json
level=info ts=2022-08-25T20:47:17.056943484Z caller=shipper.go:334 msg="upload new block" id=01GBAR8SGF29B024GFAHH9VA5E
level=debug ts=2022-08-25T20:47:17.061069018Z caller=azure.go:396 msg="Uploading blob" blob=01GBAR8SGF29B024GFAHH9VA5E/chunks/000001
panic: send on closed channel

goroutine 176 [running]:
github.com/Azure/azure-storage-blob-go/azblob.staticBuffer.Put({0xc000b2b560, 0xc0009c80d0, 0xc000b2b500}, {0xc000e80000, 0xc0005efea8, 0x0})
/go/pkg/mod/github.com/!azure/azure-storage-blob-go@v0.13.0/azblob/highlevel.go:427 +0x3c
github.com/Azure/azure-storage-blob-go/azblob.(*copier).write(0xc0001fad00, {{0xc000e80000, 0x300000, 0x300000}, {0xc0007e8780, 0x58}})
/go/pkg/mod/github.com/!azure/azure-storage-blob-go@v0.13.0/azblob/chunkwriting.go:166 +0x347
github.com/Azure/azure-storage-blob-go/azblob.(*copier).sendChunk.func1()
/go/pkg/mod/github.com/!azure/azure-storage-blob-go@v0.13.0/azblob/chunkwriting.go:136 +0xb3
github.com/Azure/azure-storage-blob-go/azblob.NewStaticBuffer.func1()
/go/pkg/mod/github.com/!azure/azure-storage-blob-go@v0.13.0/azblob/highlevel.go:406 +0x3b
created by github.com/Azure/azure-storage-blob-go/azblob.NewStaticBuffer
/go/pkg/mod/github.com/!azure/azure-storage-blob-go@v0.13.0/azblob/highlevel.go:404 +0xd3

Issue and Analysis:

Thanos sidecar panics when trying to upload data to azure blob store. This is reproducible on 0.27.0, 0.26.0, and 0.25.2 as well. All these releases seem to be using github.com/Azure/azure-storage-blob-go version v0.13.0, which seems to be the source of the issue.

In the recently released v0.28.0-rc.0 I am not sure if the above issue is fixed, but I do see v0.28.0-rc.0 is vendoring github.com/thanos-io/objstore v0.0.0-20220715165016-ce338803bc1e, which in turn uses github.com/Azure/azure-storage-blob-go version 0.14.0. In theory that should have fixed this issue, but in reality I am not seeing any error while also not seeing data being uploaded.

The sidecar process seems to hang, as the log statement from the following line is the last message I am seeing in the logs.

I am trying to run it locally and continue to debug this issue; any help from the community is more than appreciated.

cc: @vglafirov

Tracing doesn't work

It relies on tracerKey which is only used in one place to read from a Context. Since it is never set, the NoopTracer is used.

func tracerFromContext(ctx context.Context) opentracing.Tracer {
	val := ctx.Value(tracerKey)
	if sp, ok := val.(opentracing.Tracer); ok {
		return sp
	}
	return nil
}

In the similar package in thanos-io/thanos there is a function to set the key:

https://github.com/thanos-io/thanos/blob/25d91c103a416df733320563bb108d734f91b8af/pkg/tracing/tracing.go#L37-L39

Note that the tracerKey there is a different value.
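For reference, that Thanos helper looks roughly like this (paraphrased from the linked file; note again that its tracerKey is a different value from the one objstore reads):

func ContextWithTracer(ctx context.Context, tracer opentracing.Tracer) context.Context {
	return context.WithValue(ctx, tracerKey, tracer)
}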

Some of the objstore operation duration metrics are 0

The InstrumentedBucket doesn't emit duration metrics for Iter().

thanos_objstore_bucket_operation_duration_seconds_bucket{bucket="",component="compactor",operation="iter",le="0.001"} 0
thanos_objstore_bucket_operation_duration_seconds_bucket{bucket="",component="compactor",operation="iter",le="0.01"} 0
thanos_objstore_bucket_operation_duration_seconds_bucket{bucket="",component="compactor",operation="iter",le="0.1"} 0
thanos_objstore_bucket_operation_duration_seconds_bucket{bucket="",component="compactor",operation="iter",le="0.3"} 0
thanos_objstore_bucket_operation_duration_seconds_bucket{bucket="",component="compactor",operation="iter",le="0.6"} 0
thanos_objstore_bucket_operation_duration_seconds_bucket{bucket="",component="compactor",operation="iter",le="1"} 0
thanos_objstore_bucket_operation_duration_seconds_bucket{bucket="",component="compactor",operation="iter",le="3"} 0
thanos_objstore_bucket_operation_duration_seconds_bucket{bucket="",component="compactor",operation="iter",le="6"} 0
thanos_objstore_bucket_operation_duration_seconds_bucket{bucket="",component="compactor",operation="iter",le="9"} 0
thanos_objstore_bucket_operation_duration_seconds_bucket{bucket="",component="compactor",operation="iter",le="20"} 0
thanos_objstore_bucket_operation_duration_seconds_bucket{bucket="",component="compactor",operation="iter",le="30"} 0
thanos_objstore_bucket_operation_duration_seconds_bucket{bucket="",component="compactor",operation="iter",le="60"} 0
thanos_objstore_bucket_operation_duration_seconds_bucket{bucket="",component="compactor",operation="iter",le="90"} 0
thanos_objstore_bucket_operation_duration_seconds_bucket{bucket="",component="compactor",operation="iter",le="120"} 0
thanos_objstore_bucket_operation_duration_seconds_bucket{bucket="",component="compactor",operation="iter",le="+Inf"} 0
thanos_objstore_bucket_operation_duration_seconds_sum{bucket="",component="compactor",operation="iter"} 0
thanos_objstore_bucket_operation_duration_seconds_count{bucket="",component="compactor",operation="iter"} 0

https://github.com/thanos-io/objstore/blob/main/objstore.go#L487
Is this by design? I would like to measure the time taken for iterating the bucket.
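Until the wrapper covers Iter, a hedged workaround is to time the call yourself; timeIter below is a hypothetical helper (assumed imports: context, time, github.com/thanos-io/objstore, and client_golang's prometheus package), and hist is any prometheus Observer you register separately:

// timeIter records how long a full Iter over the bucket root takes.
func timeIter(ctx context.Context, bkt objstore.BucketReader, hist prometheus.Observer) error {
	start := time.Now()
	err := bkt.Iter(ctx, "", func(string) error { return nil })
	hist.Observe(time.Since(start).Seconds())
	return err
}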

Support authentication via connection strings for Azure storage

In some circumstances, it may be necessary to authenticate using a SAS token rather than an account key. Adding this should be a non-breaking change.

There are some open requests between this repo and loki which consumes this library related to this:

grafana/loki#3309
#21

I've submitted proposed changes for allowing this type of authentication by exposing an option to use a connection string: #51

Enable store to automatically discover prefixes in bucket

Use Case:

Say I have multiple clusters that use the prefix field so that the sidecar can put metrics in the same bucket, but under different prefixes.

From what I understand, I would have to have a thanos-store for each prefix to start reading those metrics.

Would it be possible to have just one thanos-store, connected to that bucket, that is smart enough to discover all the prefixes that exist in that bucket?

Related Issue(s):

S3 Provider doesn't expose a way to pass in the SessionToken

👋 I'm trying to use the objstore with S3, and need to provide the SDK with a session token.

I'm attempting to use NewBucketWithConfig with dynamic credentials, but the config object only exposes the AccessKey and SecretKey, not the SessionToken. Is this something that could be added?

I'm happy to submit the PR but just wanted to see if there was a reason that was left out of the config?

Is there a plan to remove the thanos- prefix from the user agent?

Some providers are currently adding a thanos- prefix to the user agent.

option.WithUserAgent(fmt.Sprintf("thanos-%s/%s (%s)", component, version.Version, runtime.Version())),

client.Config.UserAgent = fmt.Sprintf("thanos-%s", component)

client.SetAppInfo(fmt.Sprintf("thanos-%s", component), fmt.Sprintf("%s (%s)", version.Version, runtime.Version()))

Is there a plan to remove this?

Related issue: #23

should ListObjects support list pagination?

Hey, we met an error:

"err="new bucket block: list chunk files: too many requests, please retry later"" 

with an old Thanos version (around 2021).
Our bucket server (a self-owned storage service built according to the S3 protocol) responds with an error when the listing exceeds 1000 objects.

And I checked the latest objstore code:

func (b *Bucket) Iter(
	xxx
	for object := range b.client.ListObjects(ctx, b.name, opts) {
		// Catch the error when failed to list objects.
		if object.Err != nil {
			return object.Err
		}

The full call path:

thanos-io:
newBucketBlock
  	// Get object handles for all chunk files from storage.
	if err = bkt.Iter(ctx, path.Join(meta.ULID.String(), block.ChunksDirname), 
           xxx
objstore:
      // Iter calls f for each entry in the given directory. The argument to f is the full
    // object name including the prefix of the inspected directory.
    func (b *Bucket) Iter(ctx context.Context, dir string, f func(string) error, options ...objstore.IterOption) error {

Does this mean bkt.Iter only works reliably if the object list is under 1000 entries? How can we improve our case when there are more than 1000 objects, or how can we delve deeper into this issue?

thanks a lot!!!

Azure Upload Block Size and Concurrency

I have had some time to look deeper into the Azure code as I have spent some time refactoring it. One thing that has bothered me for a while is the hard-coded block size and concurrency. I don't really know where these values come from or why they were chosen. One explanation may be that there were non-optimal default values set in the old Azure SDK which have now changed. Either way I do not think these are optimal values.

Currently these values are set to 3 MB and 4 concurrent threads.

opt := azblob.UploadStreamOptions{
	BufferSize: 3 * 1024 * 1024,
	MaxBuffers: 4,
}

The default value in the current Azure SDK is 1 MB and 1 concurrent thread.
https://github.com/Azure/azure-sdk-for-go/blob/c8c9838f7dc383a0bc2ad7b6cc09d51eb619d8f6/sdk/storage/azblob/blockblob/models.go#L256-L261

To understand why block size matters it is useful to understand how Thanos uses Azure Storage Accounts. Thanos uses Block Blobs when storing metrics in an Azure Storage Account.

Block blobs are optimized for uploading large amounts of data efficiently. Block blobs are composed of blocks, each of which is identified by a block ID. A block blob can include up to 50,000 blocks. Each block in a block blob can be a different size, up to the maximum size permitted for the service version in use.

https://learn.microsoft.com/en-us/rest/api/storageservices/understanding-block-blobs--append-blobs--and-page-blobs#about-block-blobs

This means that all files uploaded by Thanos are split into blocks of the configured block size. When all of the blocks have been uploaded they are committed so that the blob becomes available. During the upload process the block size matters because the maximum memory used is defined by block size * concurrency. This is useful when limiting the memory used by the receiver during upload. Microsoft seems to have a plan to simplify this by automatically calculating an optimal value based on maximum memory usage. It is not clear when or if this will be implemented.
https://github.com/Azure/azure-sdk-for-go/blob/c8c9838f7dc383a0bc2ad7b6cc09d51eb619d8f6/sdk/messaging/azeventhubs/internal/blob/chunkwriting.go#L32-L39
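For example, with an 8 MiB block size and 4 concurrent buffers, an upload holds at most roughly 8 MiB * 4 = 32 MiB of chunk data in memory at once.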

Things get a bit more interesting when you start thinking about how the blocks are stored and fetched during reads. Sadly there is very little documentation from Microsoft about what effect this has. There are however some blog posts and Stack Overflow answers that claim it has an effect, but with no data to back up the claim. There is an issue MicrosoftDocs/azure-docs#100416 requesting more information, but these things can take a while to get a response on.

If there is any correlation between block size and download duration it should be easy to prove: just upload the same amount of data multiple times with different block sizes and measure the download duration. So that is what I have done. I have written phillebaba/azblob-benchmark to test this. It uploads and downloads files multiple times for different block sizes and takes an average duration value for each block size. One assumption I have made, which should be corrected if wrong, is that most of the files that Thanos downloads are 512 MB. This assumption is taken from the fact that the chunk files are all capped at 512 MB. I measured both upload and download speed; concurrency does not matter for download duration, which is why it is set to an arbitrary value. The tests were run from a Standard D2s v5 (2 vcpus, 8 GiB memory) in the same region as the Storage Account.

[chart: average upload and download duration per block size]

The following result shows that there is a correlation between block size and download duration. Diminishing returns seem to eventually kick in as the block size gets bigger. At first I feared that the result I was seeing was somehow linked to slow startup or some sort of caching which eventually kicks in, so I ran the same test in reverse order of block size, decreasing the block size this time, and got very much the same results.

[chart: the same benchmark run in decreasing block-size order]

My conclusion from these results is that the current block size is not optimal for reads. I think there is some more research that needs to be done before a proper conclusion can be made, but I think it is safe to say that increasing the block size from the current 3 MB to somewhere around 8 MB would have a positive impact on download speed. In the end we should find a good middle ground between fast read speeds and memory usage during upload.

azure v0.34.0 - gopanic: send on closed channel

Hello,

With the latest version of Thanos, v0.34.0, running a compactor against an Azure container storage, we are seeing occasional Go panics while uploading blocks. I think this is new to v0.34.0 and was working OK with v0.33.x.

We are running in the official docker image with the following config:

args

      --log.level=debug
      --data-dir=/data
      --objstore.config=
        type: AZURE
        config:
          container: "argos-argos-amadeus-com"
          endpoint: "privatelink.blob.core.windows.net"
          storage_account: "$(OBJSTORE_ACCESS_KEY)"
          storage_account_key: "$(OBJSTORE_SECRET_KEY)"
          max_retries: 0
          http_config:
            insecure_skip_verify: true
      --wait
      --wait-interval=30m
      --compact.cleanup-interval=30m
      --compact.concurrency=8
      --downsample.concurrency=2
      --compact.blocks-fetch-concurrency=1
      --block-files-concurrency=1
      --block-meta-fetch-concurrency=5
      --web.disable
      --compact.progress-interval=10m
      --http-address=0.0.0.0:8080
      --consistency-delay=30m
      --retention.resolution-raw=35d
      --retention.resolution-5m=100d
      --retention.resolution-1h=100d
      --compact.skip-block-with-out-of-order-chunks

Go Panic stack trace

ts=2024-02-26T12:19:24.936631613Z caller=downsample.go:391 level=info msg="downsampled block" from=01HQ3G1QHTRFY7KSAJ2RJRXXT4 to=01HQJNH9DAP3BQCEKK8HT619MG duration=2m36.3914078s duration_ms=156391
ts=2024-02-26T12:19:37.839394217Z caller=azure.go:311 level=debug msg="uploading blob" blob=01HQJNH9DAP3BQCEKK8HT619MG/chunks/000001
panic: send on closed channel

goroutine 1233414 [running]:
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/internal/shared.staticBuffer.Put(...)
/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/storage/[email protected]/internal/shared/transfer_manager.go:83
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/blockblob.(*copier).write(0xc0013c9400, {{0xc01d0a2000, 0x300000, 0x300000}, {0xc004e3c5a0, 0x58}, 0x300000})
/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/storage/[email protected]/blockblob/chunkwriting.go:179 +0x39a
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/blockblob.(*copier).sendChunk.func1()
/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/storage/[email protected]/blockblob/chunkwriting.go:153 +0xb6
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/internal/shared.NewStaticBuffer.func1()
/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/storage/[email protected]/internal/shared/transfer_manager.go:62 +0x35
created by github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/internal/shared.NewStaticBuffer in goroutine 1233411
/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/storage/[email protected]/internal/shared/transfer_manager.go:60 +0xc6
panic: send on closed channel

goroutine 1233413 [running]:
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/internal/shared.staticBuffer.Put(...)
/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/storage/[email protected]/internal/shared/transfer_manager.go:83
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/blockblob.(*copier).write(0xc0013c9400, {{0xc01c5a8000, 0x300000, 0x300000}, {0xc00529a240, 0x58}, 0x300000})
/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/storage/[email protected]/blockblob/chunkwriting.go:191 +0x37c
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/blockblob.(*copier).sendChunk.func1()
/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/storage/[email protected]/blockblob/chunkwriting.go:153 +0xb6
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/internal/shared.NewStaticBuffer.func1()
/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/storage/[email protected]/internal/shared/transfer_manager.go:62 +0x35
created by github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/internal/shared.NewStaticBuffer in goroutine 1233411
/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/storage/[email protected]/internal/shared/transfer_manager.go:60 +0xc6

Remove `thanos_` prefix from metrics

Metric names are currently prefixed with thanos_: thanos_objstore_bucket_operation_failures_total
Since this library isn't necessarily used within a Thanos project anymore, it would make sense to simply omit the prefix.
objstore_bucket_operation_failures_total should totally work as well.

Add support for following redirects

We have a scenario where it would be nice if the Bucket client could follow redirects from object storage backends, specifically, 302 responses. This comes into play if someone is running a reverse proxy that forwards http->https, for example, in front of minio. But it also comes up if you're using GCS, which can return 302 responses.

Would you be open to adding support for following redirects? If so I can submit a PR, but I could use some guidance on where the best place to add this config would be, since it looks like the different providers have their own configs. exthttp.HTTPConfig seems like a good place, but it looks like that's only used by S3 and Azure. I'd also be interested to hear if you have any thoughts on when redirects should be allowed. My thinking is we may only want to follow redirects for GET and HEAD requests.

Feature request: conditions API

Firstly thank you for this library I've just started using it and the experience is really good.

Many object stores now support conditional gets and writes. The former allows better caching, and the latter allows developers to write data to an object if and only if the object hasn't been changed since a previous point in time, which forms the backbone for 'atomic' updates to datasets on object storage (a reference use case would be databricks delta format).

Generally object store providers either provide this functionality using a conditional header or payload containing an etag, a version or generation or even a last modified.

Examples:
GCS: https://pkg.go.dev/cloud.google.com/go/storage#hdr-Conditions
Azure: https://learn.microsoft.com/en-us/rest/api/storageservices/specifying-conditional-headers-for-blob-service-operations

Note: (infamously?) AWS S3 doesn't support this. The standard response seems to be to ask library users to implement their own locking provider rather than addressing it in the library.

Here is an example of another library that has a nice API for this: https://docs.rs/object_store/latest/object_store/struct.UpdateVersion.html

My thought would be that the ObjectAttributes API could include this metadata. I'm not sure how the API could then be changed conservatively to support this for Get or Upload; I quite like the minimal addition of an If(condition) receiver function that returns a wrapped object (similar to the cloud-storage library).
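To make that concrete, a hypothetical sketch of such a minimal addition (none of these names exist in objstore today; the shape is loosely modelled on cloud.google.com/go/storage's Conditions):

package conditions

import (
	"github.com/thanos-io/objstore"
)

// Condition describes a precondition for a single Get or Upload call.
type Condition struct {
	IfMatchETag     string // proceed only if the object's current ETag matches
	IfNoneMatchETag string // e.g. "*" to create only if the object is absent
}

// Conditioner could be an optional interface implemented by providers that
// support preconditions. If returns a Bucket whose Get/Upload honour the
// condition; providers without support could return a sentinel
// "not supported" error at call time.
type Conditioner interface {
	If(cond Condition) objstore.Bucket
}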

If this feature gets the maintainers' blessing, I am happy to contribute pull requests for GCS and a suitable-for-testing version for the local filesystem provider.

Thanks again.

Format option is required when the Swift provider calls ncw/swift's ObjectNames()

Hello.

When using Swift as object storage in Thanos, ncw/swift's ObjectNames() is called to get the entire list of objects:

https://github.com/thanos-io/thanos/blob/main/pkg/block/fetcher.go#L366
https://github.com/thanos-io/objstore/blob/main/providers/swift/swift.go#L221
https://github.com/ncw/swift/blob/master/swift.go#L1057-L1070

At this point, ncw/swift assumes that the response is text/plain and parses it accordingly (this is the default format of OpenStack Swift).

Depending on the OpenStack Swift configuration, however, this response can be in JSON format.

I have requested that ncw/swift also be able to parse JSON when it is returned, but the Thanos side also needs to decide how to handle this:

ncw/swift#181

  1. When the Thanos fetcher calls objstore, the request is executed with a format (plain) parameter or header, or
  2. objstore's Swift provider passes a format (plain) parameter or header when calling ncw/swift (see the sketch below).

Or is there a better way?
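For option 2, a minimal sketch, assuming ncw/swift/v2 and that ObjectsOpts.Headers is forwarded to the listing request (an Accept: text/plain header should pin the response format regardless of the cluster default):

package swiftlist

import (
	"context"

	"github.com/ncw/swift/v2"
)

// listPlain asks Swift explicitly for the plain-text listing format via the
// Accept header, so parsing no longer depends on the cluster's default
// format setting.
func listPlain(ctx context.Context, c *swift.Connection, container string) ([]string, error) {
	return c.ObjectNames(ctx, container, &swift.ObjectsOpts{
		Headers: swift.Headers{"Accept": "text/plain"},
	})
}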

Support Azure AD Workload Identity

Is your proposal related to a problem?

The currently supported Azure AD Pod Identity is deprecated in favour of the new Azure AD Workload Identity.

Describe the solution you'd like

I'd like Thanos to support Azure AD Workload Identity.

Describe alternatives you've considered

n/a

Additional context

The following 2 PRs are for adding this support to other projects and might help with the required changes.
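A minimal sketch of what the provider change could look like, assuming the standard environment variables injected by the Workload Identity webhook (AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_FEDERATED_TOKEN_FILE); newWorkloadIdentityBlobClient is a hypothetical name:

package azurewi

import (
	"fmt"

	"github.com/Azure/azure-sdk-for-go/sdk/azidentity"
	"github.com/Azure/azure-sdk-for-go/sdk/storage/azblob"
)

// newWorkloadIdentityBlobClient builds a blob client from the Workload
// Identity environment instead of the deprecated Pod Identity flow.
// Passing nil options makes azidentity read the webhook-injected env vars.
func newWorkloadIdentityBlobClient(accountName string) (*azblob.Client, error) {
	cred, err := azidentity.NewWorkloadIdentityCredential(nil)
	if err != nil {
		return nil, err
	}
	serviceURL := fmt.Sprintf("https://%s.blob.core.windows.net/", accountName)
	return azblob.NewClient(serviceURL, cred, nil)
}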

Add separate modules for metrics and tracing wrappers

The specific libraries and mechanisms used for metrics and tracing can be quite opinionated, and a library meant as an abstraction over object storage providers should not force a dependency on one system or the other onto downstream users. As such, I think that the:

  • prometheus wrapper (already exists)
  • opentracing wrapper (already exists)
  • opentelemetry wrapper (does not exist yet)

Should be served by separate modules.

Feature request: Read Azure blob storage credentials from existing kubernetes secret

Hello,

The way that credentials for Azure blob storage are stored inside the objstoreConfig secret does not work well for us.
Credentials for our Azure blob storage are provided to us and stored in a secret in the cluster:

apiVersion: v1
kind: Secret
metadata:
  name: azure-private-storage-account-credentials
data:
  accountKey: c3VwZXJzZWNyZXRhenVyZWJsb2JzdG9yYWdla2V5Cg==
  accountName: YXp1cmVibG9ic3RvcmFnZWFjY291bnQK
type: Opaque

The accountKey in the secret is periodically rotated.

This poses a problem with both specifying the credentials and keeping them in sync in the objstoreConfig secret. It would be so much nicer and clean if you could just reference the credentials in the existing secret.

Preferably, we would like to reference the secret above in the objstoreConfig configuration. Something like this:

type: AZURE
config:
  storage_account_secret:
    name: azure-private-storage-account-credentials
    key: accountName
  storage_account_key_secret:
    name: azure-private-storage-account-credentials
    key: accountKey
  container: 'metrics'

Alternatively, specify files from a secret mounted as a volume:

type: AZURE
config:
  storage_account_file: /etc/azure/accountName
  storage_account_key_file: /etc/azure/accountKey
  container: 'metrics'  

Where the volume mount looks something like this in helm values:

          volumes:
            - name: azurecredentials
              secret:
                secretName: azure-private-storage-account-credentials
          volumeMounts:
            - mountPath: /etc/azure/
              name: azurecredentials
              readOnly: true

Add native histogram version of objstore_bucket_operation_duration_seconds

I'd like to propose adding a native histogram version of objstore_bucket_operation_duration_seconds so that dependent projects can take advantage of it and migrate to native histograms. This would require bumping client_golang:

-       github.com/prometheus/client_golang v1.12.2
-       github.com/prometheus/common v0.36.0
+       github.com/prometheus/client_golang v1.17.0
+       github.com/prometheus/common v0.44.0

I can do the PR.

Out of scope: objstore_bucket_operation_transferred_bytes, although that could also be made into a native histogram with factor == 2 (schema == 0).
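A minimal sketch of what the opt-in could look like after the bump (label names and field values here are illustrative, not the library's current definition; the NativeHistogram* fields exist in client_golang v1.17):

package metrics

import (
	"github.com/prometheus/client_golang/prometheus"
)

// Classic buckets are kept so existing dashboards continue to work, while
// native-histogram-capable scrapers additionally get the sparse version.
var bucketOperationDuration = prometheus.NewHistogramVec(prometheus.HistogramOpts{
	Name:    "objstore_bucket_operation_duration_seconds",
	Help:    "Duration of successful operations against the bucket.",
	Buckets: prometheus.DefBuckets,
	// Native histogram settings; ignored by servers that don't support them.
	NativeHistogramBucketFactor:    1.1,
	NativeHistogramMaxBucketNumber: 160,
}, []string{"operation"})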

Support signed URLs

While this is likely not something Thanos itself is going to use, it would be really nice to have an abstraction that supports pre-signed URLs for uploads by clients.

It may be that some providers don't support this, but I think that's OK: in that case applications can implement uploads themselves as a fallback, or simply not use those providers.
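One possible shape for the abstraction is an optional interface, so providers without signing support simply don't implement it (a sketch; all names are illustrative):

package signed

import (
	"context"
	"time"
)

// SignedBucket could be implemented by providers that support pre-signed
// URLs; callers would type-assert and fall back when the assertion fails.
type SignedBucket interface {
	// SignedURL returns a pre-signed URL allowing the given HTTP method
	// (e.g. "PUT" for client uploads) on the named object until expiry.
	SignedURL(ctx context.Context, method, name string, expiry time.Duration) (string, error)
}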

Allow prefixing `BucketReader`

It would be nice if there were an equivalent NewPrefixedBucketReader that conforms to just the BucketReader interface.
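A minimal sketch of such a wrapper (prefixedReader is a hypothetical name; only two of the BucketReader methods are shown):

package prefixedreader

import (
	"context"
	"io"
	"strings"

	"github.com/thanos-io/objstore"
)

// prefixedReader rewrites object names on the way in and strips the prefix
// from names passed back out of Iter.
type prefixedReader struct {
	r      objstore.BucketReader
	prefix string
}

func (p *prefixedReader) Get(ctx context.Context, name string) (io.ReadCloser, error) {
	return p.r.Get(ctx, p.prefix+"/"+name)
}

func (p *prefixedReader) Iter(ctx context.Context, dir string, f func(string) error, opts ...objstore.IterOption) error {
	return p.r.Iter(ctx, p.prefix+"/"+dir, func(name string) error {
		return f(strings.TrimPrefix(name, p.prefix+"/"))
	}, opts...)
}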

Confusing s3 partSize configuration.

issue

When uploading files to an S3 bucket, I want to upload large files of around 128MB in a single HTTP request to reduce S3 API fees.
So I set part_size to 128MB.

But after checking the source code of the S3 upload path, I found that when the uploaded file size is less than the configured part size, the part size is reset to 0, so the minio client falls back to its default 16MB part size for the upload, which is not what I expected.
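For reference, a sketch of how a fixed part size maps onto minio-go directly (not the objstore code path; a PartSize of 0 is what makes minio-go compute its own default):

package s3upload

import (
	"context"
	"io"

	"github.com/minio/minio-go/v7"
)

// uploadWithFixedPartSize pins the multipart part size explicitly, which is
// the behaviour the part_size setting is expected to produce.
func uploadWithFixedPartSize(ctx context.Context, c *minio.Client, bucket, name string, r io.Reader, size int64) error {
	_, err := c.PutObject(ctx, bucket, name, r, size, minio.PutObjectOptions{
		PartSize: 128 << 20, // 128MB
	})
	return err
}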

objstore version

v0.0.0-20230220090313-0796692f1ae5
