
crate-howtos's Introduction

CrateDB

Help us improve CrateDB by taking our User Survey!

About

CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of data in real-time.

CrateDB offers the benefits of an SQL database and the scalability and flexibility typically associated with NoSQL databases. Modest CrateDB clusters can ingest tens of thousands of records per second without breaking a sweat. You can run ad-hoc queries using standard SQL. CrateDB's blazing-fast distributed query execution engine parallelizes query workloads across the whole cluster.

CrateDB is well suited to containerization and can be scaled horizontally using ephemeral virtual machines (e.g., Kubernetes, AWS, and Azure) with no shared state. You can deploy and run CrateDB on any sort of network, from personal computers to multi-region hybrid clouds and the edge.

Features

  • Use standard SQL via the PostgreSQL wire protocol or an HTTP API.
  • Dynamic table schemas and queryable objects provide document-oriented features in addition to the relational features of SQL.
  • Support for time-series data, real-time full-text search, geospatial data types and search capabilities.
  • Horizontally scalable, highly available and fault-tolerant clusters that run very well in virtualized and containerized environments.
  • Extremely fast distributed query execution.
  • Auto-partitioning, auto-sharding, and auto-replication.
  • Self-healing and auto-rebalancing.
  • User-defined functions (UDFs) can be used to extend the functionality of CrateDB.

Screenshots

CrateDB provides an Admin UI:

Screenshots of the CrateDB Admin UI

Try CrateDB

The fastest way to try CrateDB out is by running:

sh$ bash -c "$(curl -L try.crate.io)"

Or spin up the official Docker image:

sh$ docker run --publish 4200:4200 --publish 5432:5432 --env CRATE_HEAP_SIZE=1g crate -Cdiscovery.type=single-node

Visit the installation documentation to see all the available download and install options.

Once you're up and running, head over to the introductory docs. To interact with CrateDB, you can use the Admin UI SQL console or the CrateDB shell CLI tool. Alternatively, review the list of recommended clients and tools that work with CrateDB.
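
Besides the Admin UI and the shell, statements can also be sent to the HTTP API mentioned in the feature list. The following is a minimal sketch, assuming a single node listening on localhost:4200 as in the Docker command above. The helper names build_sql_request and run_sql are hypothetical and are not part of any CrateDB client library.

```python
import json
import urllib.request


def build_sql_request(stmt, args=None, base_url="http://localhost:4200"):
    """Build (but do not send) a POST request for the /_sql HTTP endpoint.

    base_url is an assumption that matches the Docker example; adjust it
    for your own deployment.
    """
    body = {"stmt": stmt}
    if args is not None:
        # Positional parameters for the "?" placeholders in the statement.
        body["args"] = list(args)
    return urllib.request.Request(
        url=f"{base_url}/_sql",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(stmt, args=None, base_url="http://localhost:4200"):
    """Send the statement to a running node and return the decoded JSON reply."""
    request = build_sql_request(stmt, args, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

With a node running, run_sql("SELECT name FROM sys.cluster") should return a dictionary whose "cols" and "rows" entries hold the column names and result rows. For anything beyond quick experiments, one of the recommended client libraries is likely the better choice.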

For container-specific documentation, check out the CrateDB on Docker how-to guide or the CrateDB on Kubernetes how-to guide.

Contributing

This project is primarily maintained by Crate.io, but we welcome community contributions!

See the developer docs and the contribution docs for more information.

Security

The CrateDB team and community take security bugs seriously. We appreciate your efforts to responsibly disclose your findings, and will make every effort to acknowledge your contributions.

If you think you discovered a security flaw, please follow the guidelines at SECURITY.md.

Help

Looking for more help?

crate-howtos's People

Contributors

amotl, andreidan, autophagy, baurzhansakhariev, bploetz, chaudum, gruzilla, hammerhead, hlcianfagna, infoverload, jayeff, jodok, karynzv, kevinkq, kovrus, markush, marregui, matkuliak, matriv, mfussenegger, mkleen, msbt, mxm, nomicode, proddata, scjimenez, seut, smakalias


crate-howtos's Issues

Getting started with CrateDB Cloud

A step-by-step guide that ends with a three-node cluster on which we can run the ISS series tutorial if we choose to.

This request comes from a discussion.

Rename section "Clustering"

Documentation feedback


I think it would make more sense to rename this section "Clustering". Scaling a cluster up and down is naturally a part of clustering.

I think that new users are going to want to learn how to "cluster" before they want to learn how to "scale", so the docs should reflect that priority in the terminology they use.

Update node name config section for shared config setups

Documentation feedback


Per #216, "this document should be updated to account for the fact that some people may want to configure a cluster using hostnames or IP addresses without setting node names. this allows setting up a cluster with the same crate.yaml file shared between all nodes".

Upsert operation on KafkaConnect Sink in CrateDB

Documentation feedback


I am trying to use a Kafka Connect sink in upsert mode. My sink connector command is below:

CREATE SINK CONNECTOR SINK_TO_CRATE_02_6 WITH (
    'connection.backoff.ms' = 10000,
    'connector.class'       = 'io.confluent.connect.jdbc.JdbcSinkConnector',
    'connection.url'        = 'jdbc:postgresql://10.42.0.84:5432/guestbook?user=crate',
    'topics'                = 'tbl-student',
    'connection.password'   = 'somePassword',
    'tasks.max'             = 1,
    'insert.mode'           = 'upsert',
    'batch.size'            = 5,
    'delete.enable'         = 'true',
    'auto.evolve'           = 'true',
    'pk.mode'               = 'record_key',
    'pk.fields'             = 'message_key',
    'table.name.format'     = 'student'
);

I found that upsert is not working, while the insert operation works fine.
What is wrong with the Kafka Connect sink command above? Please help.

Command in dockerentrypoint.sh may be deprecated...

Documentation feedback


The command-line option -Cdiscovery.zen.hosts_provider=srv is deprecated, as described here: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-settings.html
It may also cause a CrashLoopBackOff failure when trying to run the deployment in a three-node Kubernetes cluster.
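
One way to catch this kind of problem before deploying is to scan the container arguments for settings known to have been renamed. The sketch below is only an illustration: the function name is hypothetical, and the replacement name discovery.seed_providers is an assumption based on the upstream Elasticsearch rename, so verify it against the release notes for your CrateDB version.

```python
# Assumption: old setting name -> newer name. Only the old name comes from
# the issue above; confirm the new one in your version's release notes.
RENAMED_SETTINGS = {
    "discovery.zen.hosts_provider": "discovery.seed_providers",
}


def find_renamed_settings(args):
    """Return (old_name, new_name) pairs for every -C argument that uses a renamed setting."""
    findings = []
    for arg in args:
        # Settings are passed on the command line as -C<name>=<value>.
        if not arg.startswith("-C"):
            continue
        name, _, _value = arg[2:].partition("=")
        if name in RENAMED_SETTINGS:
            findings.append((name, RENAMED_SETTINGS[name]))
    return findings
```

Calling find_renamed_settings with the argument list from an entrypoint script or a Kubernetes manifest returns an empty list when nothing needs attention.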

Upgrade build system to use docs-utils

Remove old build system

  • Check for the presence of bootstrap.sh in the top-level directory and git rm bootstrap.sh if the file's only function is to prepare a Python virtual environment for Sphinx.

  • Check for the presence of docs/requirements.txt.

    • Create it if it doesn't exist and run git add docs/requirements.txt.

    • Look for files named like requirements.txt or requirements-docs.txt in the top-level directory, and move any Python packages related to Sphinx or the docs to the docs/requirements.txt file.

      If you're unsure what a package does, you can look it up on PyPI.

    • If any of the requirements files in the top-level directory are now empty, you can git rm the files.

  • Check for the presence of bin in the top-level directory and git rm -r bin if the directory's only contents are a file named sphinx.

  • Check for the presence of docs/docutils.conf and git rm docs/docutils.conf if the file is present.

Add the new build system

  • Copy the contents of the crate-docs-utils docs/Makefile to a file named docs/Makefile in your current repository and run git add docs/Makefile.

  • Copy the contents of the crate-docs-utils docs/utils.json to a file named docs/utils.json in your current repository and run git add docs/utils.json.

  • Open the file docs/conf.py.

    Normally, this file should have one line that imports a module with a name matching the name of the docs project.

    Remove all other lines, with the exception of config that has been added specifically for novel features that are used in that docs repository.

  • Open the file docs/requirements.txt.

    Normally, this file should have one line (with no versioning):

    crate-docs-theme
    

    Remove all other lines, with the exception of modules that have been added specifically for novel features that are used in that docs repository.

Test the build system

  • Run cd docs && make dev and fix any issues.

    • You may have to rename files so that they use the .rst extension instead of the .txt extension.
  • Check that the docs look okay in your browser and fix any issues if not.

  • Run cd docs && make check and fix any issues.
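
The renaming step above can be prepared with a small script. This is a rough sketch, assuming that every .txt file under docs is a documentation source except requirements.txt. The function name and the exclusion set are hypothetical, and the script only lists the planned renames rather than performing them.

```python
from pathlib import Path

# Assumption: these files must keep their .txt extension.
KEEP_AS_TXT = {"requirements.txt"}


def plan_rst_renames(docs_dir):
    """Return (source, target) path pairs for .txt files that would become .rst files."""
    docs_dir = Path(docs_dir)
    plan = []
    # rglob also descends into build output and virtual environments,
    # so review the plan before acting on it.
    for source in sorted(docs_dir.rglob("*.txt")):
        if source.name in KEEP_AS_TXT:
            continue
        plan.append((source, source.with_suffix(".rst")))
    return plan
```

Each pair can then be applied with git mv so that the file history is preserved.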

Finishing touches

  • Copy the contents of the crate-docs-utils .gitignore file to a file named .gitignore at the root of your current repository.

    If the file already exists:

    • If the file is simple, add the content anywhere you like, then sort all lines alphabetically (case-insensitive). Finally, remove duplicate entries and trim the file so it ends with a single empty newline.

    • If the file looks complex (e.g., has been generated by an IDE), insert the new content wherever it makes sense and then remove any duplicate entries.

  • Copy the contents of the crate-docs-utils .readthedocs.yml file to a file named .readthedocs.yml at the root of your current repository.

    • If the file already exists, replace it.
  • Copy the contents of the crate-docs-utils .travis.yml file to a file named .travis.yml at the root of your current repository.

    If the file already exists:

    • If the rules only test the documentation, replace the whole file.

    • If the rules test something other than the documentation, the rules for testing the documentation need to be integrated with the existing rules and any old rules for testing the documentation need to be removed. (See the crate-admin .travis.yml for an example.)

  • Check the DEVELOP.rst file at the root of your current repository.

    • If the file only documents how to build the docs, replace the whole file with the contents of the crate-docs-utils DEVELOP.rst file.

      • Cut the Preparing a release section and any associated link references.

      • Update the URLs used in the link references so that they point to the proper resources (i.e., GitHub, Travis CI, Read The Docs) for the current repository.

    • If the file is more complex than that, integrate the Documentation section from the crate-docs-utils DEVELOP.rst file into the document, replacing any previous information about how to build the docs.

      • Copy over the CI/CD badge definitions (found at the bottom of the file) as well as any link references used by the Documentation section.

      • Merge the link references into one list, sort alphabetically (case-insensitive), and remove any duplicates.

      • Update the URLs used in the link references so that they point to the proper resources (i.e., GitHub, Travis CI, Read The Docs) for the current repository.
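
For the simple .gitignore case described above (add the new content, sort case-insensitively, remove duplicates, end with a single newline), the merge can be sketched as follows. The function name is hypothetical. Sorting is only safe for simple files, because it breaks up comment groups and changes the meaning of files that rely on pattern order, such as those using negated patterns.

```python
def merge_ignore_lines(existing, incoming):
    """Merge two .gitignore texts into one sorted, de-duplicated text.

    Blank lines and trailing whitespace are dropped, entries are sorted
    case-insensitively, and the result ends with exactly one newline.
    """
    combined = existing + "\n" + incoming
    entries = {line.rstrip() for line in combined.splitlines()}
    entries.discard("")
    # The second key component makes the order deterministic for entries
    # that differ only in case.
    ordered = sorted(entries, key=lambda entry: (entry.lower(), entry))
    return "\n".join(ordered) + "\n"
```

The same sort-and-deduplicate approach applies to the merged list of link references in DEVELOP.rst.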

Wrap up

  • Add all your changes to Git

  • Commit your changes to a branch

  • Create a new pull request

Squirrel screenshots

Please update information in section about Squirrel SQL usage.
Three pieces of information about the connection string are outdated:

  • The port number should be 5432 (not 4300).
  • The connection string must end with /.
  • A user should be configured (crate in the default setup). The statement that the JDBC driver does not use credentials is outdated.

Include SSL/TLS reference in GOING INTO PRODUCTION Guide

Documentation feedback


I think the going into production guide should include at least a reference to the SSL/TLS chapter:
https://crate.io/docs/crate/reference/en/latest/admin/ssl.html
IMHO, you should never run a cluster without encryption.

Pitch: Scale up/down a cluster!

Pitch

Scale up/down a cluster!

  • Best Practice
  • Needed changes in configuration
  • Consequences/tradeoffs/pitfalls

This request comes from a discussion.

Launch checklist

Follow the launch checklist to get new content ideas from pitch to published. Use the comments on this issue to provide the information requested.

  • Provide context

    • Who is the audience for this content?

    • What problem should this content solve for the reader?

    • What is the best format for this content (e.g., blog post, docs tutorial, etc.)?

    • Why is this content important for the company?

    • How urgent is this content?

  • Identify an author

    The ideal content author knows the most about what has to be written about. Typically, this is a domain expert or someone with relevant hands-on experience.

    The @crate/tech-writing team can help locate an author.

    Alternatively, a @crate/tech-writing team member can author the content. Note, however, this may slow things down if the technical writer has to learn the topic before writing about it.

  • Create an outline

    You have a description of the content idea. Before writing starts, the description should be turned into an outline.

    You can think of an outline as a set of bullet-point notes that summarizes the content structure. A good outline is the starting point for a draft.

  • Produce a first draft

    The author should produce a first draft, using the outline as a guide.

Once the author has produced the first draft, editing can begin. If the author is not a technical writer, a member of the @crate/tech-writing team will provide editorial support and help the author bring the piece to fruition.

insert with unnest drops invalid rows but documentation doesn't mention this in the how-to guide

Documentation feedback


Inserts done with unnest are one of CrateDB's recommended ways of handling big inserts. If you only look at this entry, the biggest drawback of unnest is not mentioned and can be unexpected.

When doing an insert with unnest, rows that produce errors when inserted are dropped, and nothing indicates that an error happened (see the Python example at the end). This is documented here:
https://crate.io/docs/crate/reference/en/4.1/sql/statements/insert.html?highlight=unnest#insert-from-dynamic-queries-constraints

I would expect that this is also mentioned (or linked) in the how-to guides.

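from crate import client

# Connect to a local CrateDB node over HTTP.
conn = client.connect('localhost:4200')
cursor = conn.cursor()

cursor.execute("CREATE TABLE IF NOT EXISTS untest (payload OBJECT(DYNAMIC));")

# The whole list is passed as a single array parameter to UNNEST.
stmt = "INSERT INTO untest (payload) (SELECT col1 FROM UNNEST(?));"
payload = [
    {"info": "this is a valid object"},
    {"": "this is an invalid object"},
    {"info": "this is another valid object"}
]

# As reported above, no exception is raised even though the invalid row is dropped.
try:
    cursor.execute(stmt, (payload,))
except Exception as e:
    print(f"encountered exception: {e}")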

Using cursor.rowcount, one could implement the error handling oneself, but as this is not mentioned in the how-to guide, I wouldn't expect it to be necessary.
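
A rough sketch of such a check, assuming a DB-API style cursor like the one in the example above. The helper name count_dropped_rows is hypothetical and is not part of the crate client.

```python
def count_dropped_rows(cursor, stmt, rows):
    """Run an INSERT ... UNNEST(?) statement and return how many rows were not inserted.

    The statement is expected to take the whole list of rows as its single
    array parameter, as in the example above.
    """
    cursor.execute(stmt, (rows,))
    inserted = cursor.rowcount
    if inserted < 0:
        # DB-API cursors report -1 when the row count is unknown.
        raise RuntimeError("the driver did not report a row count")
    return len(rows) - inserted
```

A non-zero return value signals that some rows were silently dropped. It does not say which ones, so the caller still has to inspect or retry the batch to locate them.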

Docker compose outdated

Documentation feedback


Hello,

The Docker Compose file is outdated. Due to changes in Elasticsearch and in docker-ce, the nodes no longer discover each other. Hence, the networking needs to be reconfigured. Please see here: deviantony/docker-elk#455

It took me quite a while until I found this.

Additionally:

  • container_name is not supported in stack deployments.
  • An explanation of the effect of the service name, hostname, and node.name (in args) would be very helpful.
  • I think you need to set node.name now, which is really a pain since the command args do not allow variable substitution in stack files.

This is especially painful when planning a sophisticated, reusable stack file for multiple deployments.

It took me a long time to figure this out. I know that there is no support for Docker Swarm, but you even advertise it on the website.

Cheers

Correct node name section

Documentation feedback


"To do this, you must be able to refer to nodes by name."

This isn't true. You can use the node name, fully qualified hostname, or IP address.

This document should be updated to account for the fact that some people may want to configure a cluster using hostnames or IP addresses without setting node names. This allows setting up a cluster with the same crate.yaml file shared between all nodes.

Missing POD_NAME EnvVar in cluster-deployment example

Documentation feedback


If you (like me) try to launch CrateDB in your cluster using the provided guide and get the error that "Node-Name" mustn't be empty, try adding the following to the deployment YAML:

env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name

This goes into the env-field, analogous to NAMESPACE. Since the arg-field uses ${POD_NAME}, it needs to be defined there.

Unix paths in going into production

Documentation feedback


Maybe a bit nitpicky, but isn't /srv also a system-level directory? And shouldn't logs be kept in /var/log, especially if /srv is mounted from a separate drive (e.g., cloud deployments), so that logging doesn't interfere with database performance?

path.conf: /srv/crate/config
path.data: /srv/crate/data
path.logs: /srv/crate/logs
path.repo: /srv/crate/snapshots
