crate / crate-howtos

How-to guides for CrateDB.

Home Page: https://cratedb.com/docs/crate/howtos/

License: Apache License 2.0

crate-howtos's Introduction

Help us improve CrateDB by taking our User Survey!

About

CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of data in real-time.

CrateDB offers the benefits of an SQL database and the scalability and flexibility typically associated with NoSQL databases. Modest CrateDB clusters can ingest tens of thousands of records per second without breaking a sweat. You can run ad-hoc queries using standard SQL. CrateDB's blazing-fast distributed query execution engine parallelizes query workloads across the whole cluster.

CrateDB is well suited to containerization and can be scaled horizontally using ephemeral virtual machines (e.g., Kubernetes, AWS, and Azure) with no shared state. You can deploy and run CrateDB on any sort of network, from personal computers to multi-region hybrid clouds and the edge.

Features

Use standard SQL via the PostgreSQL wire protocol or an HTTP API.
Dynamic table schemas and queryable objects provide document-oriented features in addition to the relational features of SQL.
Support for time-series data, real-time full-text search, geospatial data types and search capabilities.
Horizontally scalable, highly available and fault-tolerant clusters that run very well in virtualized and containerized environments.
Extremely fast distributed query execution.
Auto-partitioning, auto-sharding, and auto-replication.
Self-healing and auto-rebalancing.
User-defined functions (UDFs) can be used to extend the functionality of CrateDB.

Screenshots

CrateDB provides an Admin UI:

Try CrateDB

The fastest way to try CrateDB out is by running:

sh$ bash -c "$(curl -L try.crate.io)"

Or spin up the official Docker image:

sh$ docker run --publish 4200:4200 --publish 5432:5432 --env CRATE_HEAP_SIZE=1g crate -Cdiscovery.type=single-node

Visit the installation documentation to see all the available download and install options.

Once you're up and running, head over to the introductory docs. To interact with CrateDB, you can use the Admin UI SQL console or the CrateDB shell CLI tool. Alternatively, review the list of recommended clients and tools that work with CrateDB.
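
As a rough illustration of the HTTP API listed under Features, the sketch below builds a request for CrateDB's `/_sql` endpoint using only the Python standard library. It assumes a node reachable at `http://localhost:4200`, as published by the Docker command above; the helper names `build_sql_request` and `run_sql` are made up for this example and are not part of any CrateDB client.

```python
import json
import urllib.request


def build_sql_request(stmt, args=None, base_url="http://localhost:4200"):
    """Build (but do not send) a POST request for the CrateDB HTTP endpoint."""
    payload = {"stmt": stmt}
    if args is not None:
        # Positional parameters are passed alongside the statement.
        payload["args"] = list(args)
    return urllib.request.Request(
        url=f"{base_url}/_sql",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(stmt, args=None, base_url="http://localhost:4200"):
    """Send the statement to a running node and return the decoded JSON reply."""
    request = build_sql_request(stmt, args, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a node running, calling `run_sql("SELECT name FROM sys.cluster")` should return a JSON document whose `rows` entry holds the cluster name. For anything beyond a quick check, the Admin UI, the CrateDB shell, or one of the recommended clients is likely the better choice.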

For container-specific documentation, check out the CrateDB on Docker how-to guide or the CrateDB on Kubernetes how-to guide.

Contributing

This project is primarily maintained by Crate.io, but we welcome community contributions!

See the developer docs and the contribution docs for more information.

Security

The CrateDB team and community take security bugs seriously. We appreciate your efforts to responsibly disclose your findings, and will make every effort to acknowledge your contributions.

If you think you discovered a security flaw, please follow the guidelines at SECURITY.md.

Help

Looking for more help?

Try one of our beginner tutorials, how-to guides, or consult the reference manual.
Check out our support channels.
Crate.io also offers CrateDB Cloud, a fully-managed CrateDB Database as a Service (DBaaS). The CrateDB Cloud Tutorials will get you started.

crate-howtos's Issues

Getting started with CrateDB Cloud

A step-by-step guide that ends with a 3-node cluster on which we can run the ISS series tutorial if we choose to.

This request comes from a discussion.

Links are broken

Documentation feedback

Page title: Rolling upgrade
Page URL: https://crate.io/docs/crate/howtos/en/latest/admin/rolling-upgrade.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/admin/rolling-upgrade.rst

Links are broken:
checked Release Notes and Versions... => 404

Rename section "Clustering"

Documentation feedback

Page title: Scaling
Page URL: https://crate.io/docs/crate/howtos/en/latest/scaling/index.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/scaling/index.rst

I think it would make more sense to rename this section "Clustering". Scaling up and scaling down a cluster are naturally a part of clustering.

I think that new users are going to want to learn how to "cluster" before they want to learn how to "scale", so the docs should reflect that priority in the terminology they use.

Update node name config section for shared config setups

Documentation feedback

Page title: Going into production
Page URL: https://crate.io/docs/crate/howtos/en/latest/going-into-production.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/going-into-production.rst

per #216, "this document should be updated to account for the fact that some people may want to configure a cluster using hostnames or IP addresses without setting node names. this allows setting up a cluster with the same crate.yaml file shared between all nodes"

Upsert operation on KafkaConnect Sink in CrateDB

Documentation feedback

Page title: Data Ingestion using Kafka and Kafka Connect
Page URL: https://crate.io/docs/crate/howtos/en/latest/integrations/kafka-connect.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/integrations/kafka-connect.rst

I am trying to use the Kafka Connect sink in upsert mode. My sink connector command is below:

CREATE SINK CONNECTOR SINK_TO_CRATE_02_6 WITH (
    'connection.backoff.ms' = 10000,
    'connector.class'       = 'io.confluent.connect.jdbc.JdbcSinkConnector',
    'connection.url'        = 'jdbc:postgresql://10.42.0.84:5432/guestbook?user=crate',
    'topics'                = 'tbl-student',
    'connection.password'   = 'somePassword',
    'tasks.max'             = 1,
    'insert.mode'           = 'upsert',
    'batch.size'            = 5,
    'delete.enable'         = 'true',
    'auto.evolve'           = 'true',
    'pk.mode'               = 'record_key',
    'pk.fields'             = 'message_key',
    'table.name.format'     = 'student'
);

I found that upsert is not working, while the insert operation works fine.
What is the issue in the Kafka Connect sink command above? Please help.

Command in dockerentrypoint.sh may be deprecated...

Documentation feedback

Page title: Run CrateDB on Kubernetes
Page URL: https://crate.io/docs/crate/howtos/en/latest/deployment/containers/kubernetes.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/deployment/containers/kubernetes.rst

The command argument -Cdiscovery.zen.hosts_provider=srv is deprecated, as described here: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-settings.html
It may also cause a CrashLoopBackOff failure when trying to run the deployment in a three-node Kubernetes cluster.
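
A minimal sketch of how the deprecated argument could be rewritten in a container's argument list. It assumes that `discovery.seed_providers` is the replacement setting in the CrateDB version being deployed; the issue itself does not name a replacement, so check the reference manual for your version first. The function name is hypothetical.

```python
OLD_PREFIX = "-Cdiscovery.zen.hosts_provider="
# Assumed replacement setting; verify against the docs for your CrateDB version.
NEW_PREFIX = "-Cdiscovery.seed_providers="


def migrate_discovery_args(args):
    """Return a copy of args with the legacy discovery flag renamed."""
    return [
        NEW_PREFIX + arg[len(OLD_PREFIX):] if arg.startswith(OLD_PREFIX) else arg
        for arg in args
    ]
```

Applied to `["-Ccluster.name=demo", "-Cdiscovery.zen.hosts_provider=srv"]`, this leaves the first argument untouched and renames only the second.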

migrating-from-mongodb.rst appears to be out-of-date

Documentation feedback

Page title: Migrating from MongoDB
Page URL: https://crate.io/docs/crate/howtos/en/latest/best-practices/migrating-from-mongodb.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/best-practices/migrating-from-mongodb.rst

This page appears to be out of date.

Fix "node.name" reference

Documentation feedback

Page title: CrateDB multi-node setup
Page URL: https://crate.io/docs/crate/howtos/en/latest/scaling/multi-node-setup.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/scaling/multi-node-setup.rst

Under the cluster name section, we reference the "node.name" setting. This should reference the "cluster.name" setting.

Meta: address all current `content: correction` issues

This meta issue addresses all current `content: correction` issues.

planned commits:

Correction: Add note about the maintainer status of cr8 (fixes #189)
- #260

Upgrade build system to use docs-utils

Remove old build system

Check for the presence of bootstrap.sh in the top-level directory and git rm bootstrap.sh if the file's only function is to prepare a Python virtual environment for Sphinx.
Check for the presence of docs/requirements.txt.
- Create it if it doesn't exist and run git add docs/requirements.txt.
- Look for files named like requirements.txt or requirements-docs.txt in the top-level directory, and move any Python packages related to Sphinx or the docs to the docs/requirements.txt file.
  
  If you're unsure what a package does, you can look it up on PyPI.
- If any of the requirements files in the top-level directory are now empty, you can git rm the files.
Check for the presence of bin in the top-level directory and git rm -r bin if the directory's only contents are a file named sphinx.
Check for the presence of docs/docutils.conf and git rm docs/docutils.conf if the file is present.
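
The checks above could be scripted before running any `git rm`. The following is only a sketch: the function names are invented for this example, and it reports candidates rather than deleting anything, because deciding whether bootstrap.sh does nothing but prepare a virtual environment still needs a human look.

```python
from pathlib import Path


def find_legacy_build_files(repo_root):
    """List old build-system paths that are candidates for removal."""
    root = Path(repo_root)
    findings = []
    # bootstrap.sh must still be reviewed by hand before it is removed.
    if (root / "bootstrap.sh").is_file():
        findings.append("bootstrap.sh")
    # bin/ qualifies only if its sole content is a file named sphinx.
    bin_dir = root / "bin"
    if bin_dir.is_dir() and [p.name for p in bin_dir.iterdir()] == ["sphinx"]:
        findings.append("bin")
    if (root / "docs" / "docutils.conf").is_file():
        findings.append("docs/docutils.conf")
    return findings


def missing_docs_requirements(repo_root):
    """Return True if docs/requirements.txt still has to be created."""
    return not (Path(repo_root) / "docs" / "requirements.txt").is_file()
```

Running `find_legacy_build_files(".")` from the top-level directory prints nothing by itself; inspect the returned list and remove each entry with the matching `git rm` command from the steps above.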

Add the new build system

Copy the contents of the crate-docs-utils docs/Makefile to a file named docs/Makefile in your current repository and run git add docs/Makefile.
Copy the contents of the crate-docs-utils docs/utils.json to a file named docs/utils.json in your current repository and run git add docs/utils.json.
Open the file docs/conf.py.

Normally, this file should have one line that imports a module with a name matching the name of the docs project.

Remove all other lines, with the exception of config that has been added specifically for novel features that are used in that docs repository.
Open the file docs/requirements.txt.

Normally, this file should have one line (with no versioning):
```
crate-docs-theme
```
Remove all other lines, with the exception of modules that have been added specifically for novel features that are used in that docs repository.

Test the build system

Run cd docs && make dev and fix any issues.
- You may have to rename files so that they use the .rst extension instead of the .txt extension.
Check the docs in your browser look okay and fix any issues if not.
Run cd docs && make check and fix any issues.

Finishing touches

Copy the contents of the crate-docs-utils .gitignore file to a file named .gitignore at the root of your current repository.

If the file already exists:
- If the file is simple, add the content anywhere you like, then sort all lines alphabetically (case-insensitive). Finally, remove duplicate entries and trim the file so it ends with a single empty newline.
- If the file looks complex (e.g., has been generated by an IDE), insert the new content wherever it makes sense and then remove any duplicate entries.
Copy the contents of the crate-docs-utils .readthedocs.yml file to a file named .readthedocs.yml at the root of your current repository.
- If the file already exists, replace it.
Copy the contents of the crate-docs-utils .travis.yml file to a file named .travis.yml at the root of your current repository.

If the file already exists:
- If the rules only test the documentation, replace the whole file.
- If the rules test something other than the documentation, the rules for testing the documentation need to be integrated with the existing rules and any old rules for testing the documentation need to be removed. (See the crate-admin .travis.yml for an example.)
Check the DEVELOP.rst file at the root of your current repository.
- If the file only documents how to build the docs, replace the whole file with the contents of the crate-docs-utils DEVELOP.rst file.
  - Cut the Preparing a release section and any associated link references.
  - Update the URLs used in the link references so that they point to the proper resources (i.e., GitHub, Travis CI, Read The Docs) for the current repository.
- If the file is more complex than that, integrate the Documentation section from the crate-docs-utils DEVELOP.rst file into the document, replacing any previous information about how to build the docs.
  - Copy over the CI/CD badge definitions (found at the bottom of the file) as well as any link references used by the Documentation section.
  - Merge the link references into one list, sort alphabetically (case-insensitive), and remove any duplicates.
  - Update the URLs used in the link references so that they point to the proper resources (i.e., GitHub, Travis CI, Read The Docs) for the current repository.
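
For the simple .gitignore case described above (append the new content, sort case-insensitively, drop duplicates, end with a single newline), a small helper might look like the sketch below. The function name is made up, and the helper also drops blank lines, which is an assumption about what "simple" files need; it is not suitable for the complex, IDE-generated case.

```python
def merge_gitignore(existing, incoming):
    """Merge two .gitignore bodies: dedupe, sort case-insensitively, one trailing newline."""
    seen = set()
    merged = []
    for line in (existing + "\n" + incoming).splitlines():
        entry = line.strip()
        # Skip blank lines and entries that were already collected.
        if entry and entry not in seen:
            seen.add(entry)
            merged.append(entry)
    merged.sort(key=str.casefold)
    return "\n".join(merged) + "\n"
```

The same approach would work for merging and sorting the link references in DEVELOP.rst, provided each reference sits on a single line.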

Wrap up

Add all your changes to Git
Commit your changes to a branch
Create a new pull request

Squirrel screenshots

Please update the information in the section about Squirrel SQL usage.
Three pieces of information regarding the connection string are outdated:

The port number should be 5432 (not 4300).
The connection string must end with /.
A user should be configured (crate in the default setup). The statement that the JDBC driver does not use credentials is outdated.
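
To make the three points concrete, here is a small sketch that assembles a connection string and user property in that shape. The `jdbc:crate://` prefix assumes the CrateDB JDBC driver is in use (the issue does not say which driver Squirrel is configured with), and the function name is hypothetical.

```python
def build_jdbc_settings(host="localhost", port=5432, user="crate", scheme="crate"):
    """Return a JDBC URL ending in '/' plus the user property to configure."""
    # The trailing slash is required according to the report above.
    url = f"jdbc:{scheme}://{host}:{port}/"
    return {"url": url, "user": user}
```

With the defaults this yields the URL `jdbc:crate://localhost:5432/` and the user `crate`; pass `scheme="postgresql"` if the stock PostgreSQL driver is used instead.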

Include SSL/TLS reference in GOING INTO PRODUCTION Guide

Documentation feedback

Page title: Going into production
Page URL: https://crate.io/docs/crate/howtos/en/latest/going-into-production.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/going-into-production.rst

I think the Going into production guide should include at least a reference to the SSL/TLS chapter:
https://crate.io/docs/crate/reference/en/latest/admin/ssl.html
IMHO, you should never run a cluster without encryption.

Improve Readability of Ubuntu Installation Guide

Currently, the installation guides for CrateDB are very hard to read and follow. Also, commands can't be easily copied, which isn't user-friendly at all.

Missing Ubuntu 20 (focal) packages

Documentation feedback

Page title: Run CrateDB on Ubuntu
Page URL: https://crate.io/docs/crate/howtos/en/latest/deployment/linux/ubuntu.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/deployment/linux/ubuntu.rst

The documentation says that packages for focal are available; however, they aren't.

"Master-edible" should be "Master-eligible"

Documentation feedback

Page title: CrateDB multi-node setup
Page URL: https://crate.io/docs/crate/howtos/en/latest/clustering/multi-node-setup.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/clustering/multi-node-setup.rst

"You can define the initial set of master-edible nodes..."

Sounds funny, but I guess it is a typo!

the status of the cr8 tool is not clear

Documentation feedback

Page title: Testing inserts performance
Page URL: https://crate.io/docs/crate/howtos/en/latest/performance/inserts/testing.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/performance/inserts/testing.rst

The status of the cr8 tool is not clear. cr8 is not maintained by Crate.io; it is maintained by @mfussenegger. We should add a note with the appropriate caveat.

Pitch: Scale up/down a cluster!

Pitch

Scale up/down a cluster!

Best Practice
Needed changes in configuration
Consequences/tradeoffs/pitfalls

This request comes from a discussion.

Launch checklist

Follow the launch checklist to get new content ideas from pitch to published. Use the comments on this issue to provide the information requested.

Provide context
- Who is the audience for this content?
- What problem should this content solve for the reader?
- What is the best format for this content (e.g., blog post, docs tutorial, etc.)?
- Why is this content important for the company?
- How urgent is this content?

Identify an author

The ideal content author knows the most about what has to be written about. Typically, this is a domain expert or someone with relevant hands-on experience.

The @crate/tech-writing team can help locate an author.

Alternatively, a @crate/tech-writing team member can author the content. Note, however, this may slow things down if the technical writer has to learn the topic before writing about it.

Create an outline

You have a description of the content idea. Before writing starts, the description should be turned into an outline.

You can think of an outline as a set of bullet-point notes that summarizes the content structure. A good outline is the starting point for a draft.

Produce a first draft

The author should produce a first draft, using the outline as a guide.

Once the author has produced the first draft, editing can begin. If the author is not a technical writer, a member of the @crate/tech-writing team will provide editorial support and help the author bring the piece to fruition.

insert with unnest drops invalid rows but documentation doesn't mention this in the how-to guide

Documentation feedback

Page title: Insert methods
Page URL: https://crate.io/docs/crate/howtos/en/latest/performance/inserts/methods.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/performance/inserts/methods.rst

Inserts done with unnest are one of CrateDB's recommended ways of handling big inserts. Looking only at this page, the biggest drawback of unnest is not mentioned and can come as a surprise.

When doing an insert with unnest, rows that produce errors on insert are dropped, and nothing indicates that an error happened (see the Python example at the end). This is documented here:
https://crate.io/docs/crate/reference/en/4.1/sql/statements/insert.html?highlight=unnest#insert-from-dynamic-queries-constraints

I would expect this to be mentioned (or linked) in the how-to guides as well.

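from crate import client  # CrateDB Python client

conn = client.connect('localhost:4200')
cursor = conn.cursor()

cursor.execute("CREATE TABLE IF NOT EXISTS untest (payload OBJECT(DYNAMIC));")

# Insert all objects in one statement by unnesting a single array parameter.
stmt = "INSERT INTO untest (payload) (SELECT col1 FROM UNNEST(?));"
payload = [
    {"info": "this is a valid object"},
    {"": "this is an invalid object"},  # empty key: this row is silently dropped
    {"info": "this is another valid object"}
]

try:
    cursor.execute(stmt, (payload,))
except Exception as e:
    # Not reached for the dropped row: no exception is raised.
    print(f"encountered exception: {e}")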

Using cursor.rowcount, one could implement error handling oneself, but as this is not mentioned in the how-to guide, I wouldn't expect it to be necessary.
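
The rowcount-based error handling mentioned above could be sketched as follows. This is an illustration only: the exception class and function name are invented here, and it assumes that `cursor.rowcount` reports the number of rows actually written, with a negative value meaning the count is unknown.

```python
class PartialInsertError(Exception):
    """Raised when fewer rows were inserted than were submitted."""


def check_rowcount(expected, rowcount):
    """Compare the submitted row count with the cursor's reported row count."""
    if rowcount < 0:
        # DB-API cursors report -1 when no count is available.
        raise PartialInsertError("row count unavailable, cannot verify insert")
    dropped = expected - rowcount
    if dropped > 0:
        raise PartialInsertError(f"{dropped} of {expected} rows were not inserted")
    return rowcount
```

In the example above, this would be called right after the insert as `check_rowcount(len(payload), cursor.rowcount)`. It reports how many rows were dropped but not which ones; finding the offending rows would require inserting them individually or validating them beforehand.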

How-To Guides Menu is broken

Documentation feedback

Page title: CrateDB How-To Guides
Page URL: https://crate.io/docs/crate/howtos/en/latest/index.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/index.rst

How-To Guides menu doesn't unfold on the left side

Docker compose outdated

Documentation feedback

Page title: Run CrateDB on Docker
Page URL: https://crate.io/docs/crate/howtos/en/latest/deployment/containers/docker.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/deployment/containers/docker.rst

Hello,

The Docker Compose file is outdated. Due to changes in Elasticsearch and in docker-ce, the nodes no longer discover each other. Hence, the networking needs to be reconfigured. Please see here: deviantony/docker-elk#455

It took me quite a while until I found this.

Additionally:

container_name is not supported in stack deployment.
An explanation of the effect of servicename, hostname, and node.name (in args) would be very helpful.
I think you need to set node.name now, which is really a pain since the command args do not allow variable substitution in stack files.

Especially when planning a sophisticated, reusable stack file for multiple deployments.

It took me so long to figure out. I know that there is no support for docker-swarm but you even advertise it on the website.

Cheers

Correct node name section

Documentation feedback

Page title: CrateDB multi-node setup
Page URL: https://crate.io/docs/crate/howtos/en/latest/scaling/multi-node-setup.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/scaling/multi-node-setup.rst

"To do this, you must be able to refer to nodes by name."

This isn't true. You can use the node name, the fully-qualified hostname, or the IP address.

This document should be updated to account for the fact that some people may want to configure a cluster using hostnames or IP addresses without setting node names. This allows setting up a cluster with the same crate.yaml file shared between all nodes.
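
As a sketch of what such a shared configuration might contain, the helper below renders one file body that is identical for every node because it refers to nodes by IP address or hostname only. The setting names `discovery.seed_hosts` and `cluster.initial_master_nodes` are assumed to apply to the CrateDB version in use, port 4300 is assumed to be the transport port, and the function name is hypothetical.

```python
def render_shared_config(cluster_name, hosts, transport_port=4300):
    """Render a node-independent config body that lists nodes by host or IP."""
    lines = [f"cluster.name: {cluster_name}", "discovery.seed_hosts:"]
    lines += [f"  - {host}:{transport_port}" for host in hosts]
    # No node.name is set, so the same file can be copied to every node.
    lines.append("cluster.initial_master_nodes:")
    lines += [f"  - {host}" for host in hosts]
    return "\n".join(lines) + "\n"
```

Calling `render_shared_config("my-cluster", ["10.0.0.1", "10.0.0.2", "10.0.0.3"])` produces text that can be written unchanged to the configuration file on all three machines.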

Missing POD_NAME EnvVar in cluster-deployment example

Documentation feedback

Page title: Run CrateDB on Kubernetes
Page URL: https://crate.io/docs/crate/howtos/en/latest/deployment/containers/kubernetes.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/deployment/containers/kubernetes.rst

If you (like me) try to launch CrateDB in your cluster using the provided guide and you get the error that "Node-Name" mustn't be empty, try adding the following to the deployment YAML:

env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name

This goes into the env-field, analogous to NAMESPACE. Since the arg-field uses ${POD_NAME}, it needs to be defined there.

Unix paths in going into production

Documentation feedback

Page title: Going into production
Page URL: https://crate.io/docs/crate/howtos/en/latest/going-into-production.html
Source: https://github.com/crate/crate-howtos/blob/master/docs/going-into-production.rst

Maybe a bit nitpicky, but isn't /srv also a system-level directory? And shouldn't logs be kept in /var/logs, especially if /srv is mounted from a separate drive (e.g. cloud deployments), so that logging doesn't interfere with database performance?

path.conf: /srv/crate/config
path.data: /srv/crate/data
path.logs: /srv/crate/logs
path.repo: /srv/crate/snapshots
